The AI Hard Drive Shortage Is Making It More Expensive and Harder to Archive the Internet
Skyrocketing hard drive and storage costs caused by the AI data center boom are making it more expensive and more difficult for digital archivists, academics, Wikipedia, and hobby data hoarders to save data and archive the internet. Specific drives favored by some high-profile organizations like the Internet Archive have become far more expensive or are difficult to find at all, archivists said.
Over the last several months, prices for both consumer-level and enterprise solid-state drives, hard drives, and other types of storage have skyrocketed. As an example, a 2TB external Samsung SSD I purchased last fall for $159 now costs $575. PC Part Picker, a website that tracks the average price of different types of drives, shows a universal increase in storage prices starting in about October of last year. Prices of many of the drives it tracks have doubled or increased by more than 150 percent, and at some stores SSDs and hard drives are simply sold out. There is now even a secondary market for some SSDs, with people scalping them on eBay and elsewhere.
Brewster Kahle, founder of the Internet Archive and the Wayback Machine, the most important archiving projects in the history of the internet, told 404 Media that the skyrocketing cost of storage is “a very real issue costing us time and money.”
“We have found that the preferred 28-30TB drives are just not available or at very high price,” Kahle said. “We gather over 100 terabytes of new materials each day, and we have over 210 Petabytes of materials already archived on machines that need continuous upgrades and maintenance, so we need to constantly get new hard drives.”
“We are fortunate to have an active community that donates to the Archive, and we are also looking for help from hard drive manufacturers in these difficult times. We are always looking for more help,” he added. “So far we have ways to work around these shortages, but it is a very real issue causing us time and money.”
The Wikimedia Foundation, which runs Wikipedia and various other projects, including Wikimedia Commons, an open repository of royalty-free media, told 404 Media that the cost of storage has become a concern for the foundation’s projects as well.
“With over 65 million articles on Wikipedia alone, access to server and storage capacity is vital to us. We’ve certainly seen price increases since the end of 2025. These price increases are of concern to us, as with every other player in the industry. We see the primary impact in the purchase of memory and hard drives but also in terms of lead times on server deliveries and our capacity to place future orders,” a Wikimedia Foundation spokesperson told us. “The Wikimedia Foundation is a non-profit, and as such how we allocate budget is very carefully considered. We maintain our own data centers to serve our users from all over the world. We’re putting workarounds in place where we can, mainly involving being smart with how we prioritize investment in hardware, building in flexibility as well as extending the life of existing hardware where possible.”
Have you been affected by skyrocketing SSD or RAM prices? I would love to hear from you. Using a non-work device, you can message me securely on Signal at jason.404. Otherwise, send me an email at jason@404media.co.
Western Digital, one of the largest manufacturers of hard drives and other storage systems, said that it has essentially sold out of its 2026 inventory to enterprise clients, many of which run data centers. Micron, which made RAM and SSDs under the brand name Crucial, has exited the consumer market altogether because “AI-driven growth in the data center has led to a surge in demand for memory and storage. Micron has made the difficult decision to exit the Crucial consumer business in order to improve supply and support for our larger, strategic customers in faster-growing segments.”
The AI boom is thus harming critical archiving projects in multiple ways. As a reaction to AI companies indiscriminately scraping the entire internet to train their large language models, website owners have increasingly put up registration walls, changed their robots.txt files to disallow bots, and otherwise attempted to stop bots from accessing their websites. Many of these websites have either accidentally or purposefully ended up blocking bots from the Internet Archive and other archiving projects. The Electronic Frontier Foundation said that “blocking the Internet Archive won’t stop AI, but it will erase the web’s historical record.” Beyond that logistical challenge, archivists now need to make difficult decisions about how and what to archive because they are, in some cases, simply running out of storage.
Mark Phillips, a University of North Texas professor who helps run the End of Term Archive, which archives government websites between changes in presidential administrations, told 404 Media that he has had to consider the price of infrastructure recently: “When we went to refresh some of our servers, the costs of the RAM and SSDs for those machines were a dramatic increase and made us rethink some of the capacity we were hoping to go with,” he said. “We have not had to do any major storage purchases in the past six months, and I hope that by the time we do the market will have leveled out a bit.”
The cost of storage has become a constant topic of discussion on Reddit’s r/DataHoarder community, where digital librarians and hobby archivists discuss different archiving setups. Many posts are from people who say they have stopped buying drives, have put their archiving plans on hold, or simply want to vent about the price of drives. Occasionally, someone posts about finding a large drive for a decent price on clearance or at a thrift store. But many say they have essentially given up on archiving new content until prices go down:
- “I've decided to just call it quits for now. I don't really download much anymore. I just maintain my current data.”
- “Slim pickings currently. Check Facebook marketplace as occasionally a deal can be had there especially from people who accidentally bought a sas drive and can't use it.”
- “I'm looking for efficient ways to use older smaller drives that I have laying around doing nothing, because I need more space for backups. I can't see buying a 28tb drive right now. I've started adjusting my backup retentions to stretch the space I have.”
- “Bust out your wallet is the only way or try to ride this out and hope prices come down.”
- “You don't [buy new drives] right now. Better pray we actually get drives going forward.”
- “Every vendor i worked with offered me a dinner and told me wait when i asked for a rather large quote.”
- “Bwwaahahahahahahahahhahaha.....not until 2029...MAYBE. All the AI/datacenters have prepurchased hard drives.”
The question that seems to be on everyone's mind is how long this shortage will last, and whether the price of storage will ever go down again.
Digital memory at stake: News outlets block Wayback Machine
The "Wayback Machine," custodian of digital memory, is fighting for its survival. An increasing number of media outlets are refusing to allow the Web Archive to archive their content. Martin Muno (dw.com)
The AI Compute Crunch Is Here (and It's Affecting the Entire Economy)
Earlier this week, I wrote an article about startups that are spending money on AI compute (tokens on tools like Claude and OpenAI’s products) rather than hiring human employees. There are all sorts of ways this business strategy could fail, and we are beginning to see signs that one of the most obvious ones could be coming to pass: AI companies can’t endlessly subsidize their AI products by charging users less than it costs to actually run them. This is the AI compute crunch, and the signs are all around us:
- GitHub announced it is pausing new signups for Copilot, tightening usage limits, and removing access to several more expensive AI models.
- Anthropic has tightened access to Claude Code, and tested removing access to Claude Code entirely in its $20 per month plan (keeping access in its $100 per month plan)
- As noted in The Verge, Anthropic restricted Claude access to users of OpenClaw because the heavy usage was unsustainable
- OpenAI’s CFO Sarah Friar has been talking endlessly about how the company does not have enough compute, which has manifested in decisions like shutting down Sora
- Prices for software with embedded AI tools have increased between 20 and 37 percent, according to some analysts; this has included price increases for Microsoft 365, Notion’s Business plan, Salesforce, and Google Workspace
- There is a general rationing of AI products and services
- Meta is laying off 10 percent of its workforce, apparently in part because the company wants to spend some of the savings on AI infrastructure: The layoffs are “to allow us to offset the other investments we’re making,” the company told its remaining employees. Its main recent investments have been data centers and the tech to run data centers.
But it’s not just that AI companies are restricting access to their products, shutting down products altogether, and beginning to increase prices. The broader impact of the current unsustainability of AI can be seen across various sectors of the economy.
- RAM, graphics cards, and hard drive / solid state storage for consumers have skyrocketed in price and are sold out in many stores. The same 2TB external SSD I bought late last year cost me $159 at the time, cost $449 a month ago, and costs $575 today.
- Similarly, the general cost of consumer electronics is increasing as chip manufacturers and production lines shift their focus to building more AI capacity. The largest consumer electronics manufacturer in the world, Apple, says it is having trouble securing chipmaking capacity for upcoming iPhones.
- Home electricity bills have skyrocketed in some states with high concentrations of AI data centers, leading in part to a widespread, concerted effort by some towns and states to reject and restrict new data centers entirely. There is a fear among experts that similar shortages and price increases could come for water supplies as well.
What this means is that the age of cheap, underpriced AI appears to be ending, or at least the compute crunch means the venture capitalists and investment firms funding OpenAI and Anthropic are going to have to be willing to burn even more cash in order to continue subsidizing their products.
On the podcast this week, I compared this situation to Uber (and any number of fast-scaling startups that sought to lock in customers then jack up prices). This comparison is only useful in that, like Uber, what AI companies have done to this point is wildly unsustainable and is being subsidized by investors. For years, Uber’s investors subsidized the cost of individual Uber rides to keep prices for consumers artificially low in order to gain market share, crush competition, and destroy the taxi industry. But Uber and its investors could only lose money on each ride for so long. This eventually led to enshittification for both riders and drivers as Uber suddenly jacked up prices for consumers and sought to find ways to pay drivers less. The difference, as Ed Zitron has pointed out, is that Uber’s costs were extremely low because Uber is essentially an app that owns none of the infrastructure, and so jacking up the cost of its service went quite a bit further toward getting it to break even.
Some version of this is coming for AI companies, but the path toward sustainability is far more complicated because of the enormous infrastructure and societal costs of scaling AI even further. “Make Claude more expensive and limit its services” is a lever Anthropic can pull, but AI companies are also burning money trying to build new data centers, juggling the political backlash to those data centers, fending off various copyright and public safety lawsuits, and spending huge amounts of money trying to train the next frontier versions of their large language models. None of this is remotely sustainable as it currently stands.
This means that the startups that are using AI agents to scale their operations are doing so at a time when AI costs are unsustainably low, and they may wake up one day to find that their compute costs have suddenly doubled or increased tenfold, or that they simply aren’t able to access compute anymore.
The general, long-term hope for the AI industry seems to be one in which multiple things need to happen to avoid a broader AI bubble burst. There needs to be a widespread renewable energy revolution (which society and our environment desperately need), vastly increased chip and component manufacturing, and models need to become more efficient. On top of that, AI needs to be widely adopted and prove to be enduringly useful and reliable across a bunch of different sectors and use cases, something the jury is still very much out on (and some studies have already shown AI use is creating more work for humans, not less). All of this must happen while AI continues to put pressure on these systems in ways that make the problem worse (AI is making energy more expensive in the short term; lots of data centers are powered by fossil fuels; AI is pushing up the costs of components, chips, and gadgets, etc.).
Finally, all of this must happen while society juggles whatever potential mass unemployment / economic fallout comes from AI and the ensuing problems this causes for these employee-less companies that expect to sell their products to a populace that is struggling to find work. As many commenters pointed out in response to my last story: If companies begin replacing their employees with AI agents, who are they going to sell their products to?
Thousands of CEOs admit AI had no impact on employment or productivity—and it has economists resurrecting a paradox from 40 years ago | Fortune
In the 1980s, economist Robert Solow made an observation that reminded economists of today’s AI boom: “You can see the computer age everywhere but in the productivity statistics.” Sasha Rogelberg (Fortune)