San Francisco —
Just blocks from the Presidio of San Francisco, the national park at the base of the Golden Gate Bridge, stands a gleaming white building, its façade adorned with eight striking gothic columns.
But what was once the home of a Christian Science church is now the holy grail of internet history: the Internet Archive, a nonprofit library run by a group of software engineers and librarians who for nearly 30 years have been saving the web one page at a time.
Inside the stained-glass-adorned sanctuary, the sounds of church sermons have been replaced by the hum of servers, where the Internet Archive’s Wayback Machine preserves web pages.
The Wayback Machine, a tool used by millions every day, has proven critical for academics and journalists searching for historical information on what corporations, people and governments have published online in the past, long after their websites have been updated or changed.
For many, the Wayback Machine is like a living history of the internet, and it just logged its trillionth page last month.
Archiving the web is more important and more challenging than ever before. The White House in January ordered vast numbers of government webpages to be taken down. Meanwhile, artificial intelligence is blurring the line between what’s real and what’s artificially generated, in some ways replacing the need to visit websites entirely. And more of the internet is now hidden behind paywalls or tucked away in conversations with AI chatbots.
It’s the Internet Archive’s job to figure out how to preserve it all.
“We are here to try to provide a record of what happened, so that people can learn and build on that to build a better future, or to build new ideas that are worthy of being in the (Internet Archive’s) library,” said Internet Archive founder Brewster Kahle.
Kahle created the archive in 1996, when a year’s worth of saved pages could fit on about 2 terabytes’ worth of hard drives, the amount of storage you can get today in an iPhone. Now, the archive is saving closer to 150 terabytes, or hundreds of millions of web pages, per day.
Kahle is the driving force and personality behind the archive, with the exuberance and energy of your favorite science teacher and the zeal of an evangelist whose religion is libraries and technology. Sitting for an interview on the original wooden pews of the church, Kahle said he was inspired to purchase the building because it resembles the group’s logo. But more importantly, he said, it’s a symbol of permanence and a reference to the Library of Alexandria in Egypt.
“That was the first time somebody tried to go and collect everything ever written by humans,” Kahle said. “Of course, now that place is the internet, and the Internet Archive serves the whole internet as a library.”
The Wayback Machine does more than just screenshot the page. It also saves the technical architecture (the HTML, CSS, JavaScript code and more) so that it can attempt to “replay the page as it existed” even if the server is no longer functioning, said Wayback Machine Director Mark Graham.
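For readers curious what a replayed page looks like in practice, the short Python sketch below builds the kind of address the Wayback Machine serves archived pages from. The details here are not from Graham or the article: the `web.archive.org/web/` prefix and the 14-digit timestamp are simply the pattern visible in the public site's addresses, and the function name `replay_url` and the sample page are made up for illustration.

```python
from datetime import datetime, timezone

# Public replay prefix as seen in Wayback Machine addresses.
# Assumption for illustration; the article does not name it.
WAYBACK_PREFIX = "https://web.archive.org/web"


def replay_url(original_url: str, when: datetime) -> str:
    """Build a replay address for a page as of roughly the given moment.

    `replay_url` is a hypothetical helper, not part of any official library.
    """
    # Timestamps in replay addresses are written as YYYYMMDDhhmmss in UTC.
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{stamp}/{original_url}"


if __name__ == "__main__":
    moment = datetime(2004, 1, 20, 0, 0, 0, tzinfo=timezone.utc)
    print(replay_url("http://example.com/", moment))
```

Opening such an address in a browser asks the archive for the saved copy closest to that moment, which may be from a somewhat different date if no capture exists at the exact time requested.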
The rise of artificial intelligence and AI chatbots means the Internet Archive is changing how it records the history of the internet. In addition to web pages, the Internet Archive now captures AI-generated content, like ChatGPT answers and those summaries that appear at the top of Google search results.
The Internet Archive team, which is made up of librarians and software engineers, is experimenting with ways to preserve how people get their news from chatbots by coming up with hundreds of questions and prompts each day based on the news, and recording both the queries and outputs, Graham said.
The group keeps copies of its archive in locations around the world in the event of a fire or flood that could damage its servers. But there are political considerations behind this approach, as well. The Trump administration has exerted pressure over content it disagrees with by filing lawsuits against media companies or by way of the Federal Communications Commission.
“Libraries are always targeted. The new guys often don’t like the old stuff around. So let’s design for it,” Kahle said. “Let’s go and live up to the moment and make it so that there’s different points of view stored and made permanently accessible in different environments.”
The Trump administration implemented a massive overhaul of government websites that included taking down countless pages on everything from health policies to the achievements of minority members of the military. It was the archive, which has been saving government webpages during presidential transitions since 2004, that enabled journalists to understand what had been altered.
“This change was huge. Whole sections of the web came down,” Kahle said. “(The administration) has a new point of view, and that’s why we have libraries to go and have the record.”
Most of the archive’s servers live in a large warehouse outside San Francisco, although a set of servers has been symbolically placed in the main sanctuary of the former church. That placement is intentional, said Kahle. By displaying the servers, he hopes “that people understand that we’re all part of the collective protection for our knowledge.”
The headquarters is an homage to the work of the Internet Archive’s 200 staff members, who include engineers, librarians and archivists.
Archivists use bespoke machines to digitize books page by page, livestreaming their work on YouTube for all to see (alongside some lo-fi music). Record players churn out vintage tunes from the 1920s and 1940s, and the building houses every type of media console for any type of content imaginable, from microfilm to CDs and satellite television. (The Internet Archive preserves music, television, books and video games, too.)
The former church’s main sanctuary also boasts more than 100 three-foot statues of employees who have been on staff for at least three years – a reference to the famous Chinese terracotta army from thousands of years ago.
In some ways, the space captures the quirkiness, and the community, of the internet itself.
“There are a lot of people that are just passionate about the cause. There’s a cyberpunk atmosphere,” Annie Rauwerda, a Wikipedia editor and social media influencer, said at a party thrown at the Internet Archive’s headquarters to celebrate reaching a trillion pages. “The internet (feels) quite corporate when I use it a lot these days, but you wouldn’t know from the people here.”
The headquarters might feel something like a living history exhibit. But the Internet Archive’s goal, says Kahle, is to preserve the web so that it can continue to have a future, not to be the arbiters of truth.
“It’s not a presentation. It’s not a museum that has a story,” he said. “It’s trying to be a resource to make it so that other people can come up with their own ideas.”
egyptindependent.com (Article Sourced Website)