
Why archiving the internet is important

Feb. 14, 2022

I like proof. Proof of existence in text or multimedia. Not anecdotal evidence, but solid proof that a thing existed, with the complete context building up to the event: a complete collection of community chat logs, interactions, screenshots and so on.

I don't really see how it's acceptable to make a claim and then say "well, I can't find it/the post seems to be deleted, but from what I remember, if I recall correctly, this happened". This ambiguity really pisses me off. So did it or did it not happen? Did this person say that or did he not? Can I even trust that you correctly remember what happened? Everything gets blurry and complicated, which more often than not starts rumours and other toxic behavior.

You can see this in many internet communities. A story of an event spreads by word of mouth throughout the years without any hard proof and slowly morphs into an absurd, overexaggerated story. With the transient and ethereal nature of social media, where posts and videos have a lifespan of a week, this is rather disturbing. Take a look at any Reddit thread older than a month, or at your YouTube playlists. How much of it is left? Check your browser bookmarks from a few years ago. How many are 404s?

The internet is self-destructing, and it's sad because it's not as if we lack the computing power or storage to keep this content, or a natural disaster wiped out our servers. All of this is self-inflicted, and we are literally erasing our history as a species. We could look back 20 years from now and find that none of it exists, which is shocking given that it has never been easier or cheaper to store written text, pictures and videos.

It's interesting because people tend to hold such reverence for the early stages of any medium, like the first movies, the first human paintings and the first photographs, but preservation of the internet very rarely comes into the conversation outside of a few fringe communities. People like to describe it as low-value content, but so were the first paintings and movies, whether in the form of cave paintings or silly home videos. 90% of humanity's early films are lost to time, yet they are still highly sought after by collectors, historians and academics despite being "low quality" and unrefined.

History is bound to repeat itself, and unless changes are made we will fall into the same pitfalls as our ancestors. Luckily, archiving anything is incredibly cheap and easy in the modern era of computing. 8TB HDDs are sold for under $200, and a plethora of tools exist to scrape and archive all sorts of sites. Unlike our predecessors, who were limited by space and cost, we have all the tools and resources to do this. All you need is to take a bit of initiative.