Why We Don’t Know AI's True Water Footprint
Miranda Gabbott / Nov 26, 2024
Miranda Gabbott is a British journalist and copywriter who writes about culture and technology.
If the artificial intelligence boom is felt anywhere in Europe, it’s in the flashy coworking spaces of Barcelona. Home to the ‘second office’ of over 9,000 companies, the city is an ideal location for a work trip: year-round sunshine, reasonably-priced restaurants, and plenty of tourism geared towards English speakers. Spain is investing heavily in digital infrastructure and currently has the fastest rate of data center expansion in Europe. One town in the metropolitan area of Barcelona, Cerdanyola del Vallés, will shortly become home to four new data centers, some of which are explicitly designed to be AI-ready.
But Barcelona’s thriving startup scene is only a small part of the picture. The conversation no one wants to have is that it’s still impossible to calculate how many resources the infrastructure of AI needs to run at scale—including water, a resource Spain is rapidly running out of, with 78% of its landmass threatened by desertification. By choosing AI industry-driven growth today, will urban planners in Spain jeopardize their water tomorrow?
When generative AI first started dominating headlines following the release of OpenAI’s ChatGPT in 2022, concerning reports emerged about how much water and energy it takes to train a large language model. Now that the technology is being incorporated into increasing numbers of consumer applications, from fitness apps to Google search, it’s becoming apparent that the problem is not limited to the creation process but extends to its mass daily usage too. This is due to the resource demands of data centers, the warehouses of computers that process every click and scroll users make on cloud-based applications, including online AI tools. Cloud computing takes little energy on the end user’s device not because technology has evolved to a point where processing is ephemeral (though it appears so to the end user) but because the processing is outsourced to a computer in a data center. These “server farms” require power equivalent to heavy industry and run without interruption, 24/7, 365 days per year. They also depend on cooling systems to ensure their rows of servers do not overheat and malfunction, and those systems use electricity and water.
There are different techniques to cool a data center, and while the most environmentally conscious choice will depend on its location, water and energy usage generally sit at opposite ends of a see-saw: if usage of one is decreased, the other must be increased to compensate. If operators use evaporative cooling (whereby warm air from the data center is passed over water, which evaporates in a cooling tower), electricity usage will plummet, but inordinate amounts of water are required. If they use a closed-loop system (where water is cooled with air conditioning and piped to cool down servers, returning to be cooled again), operators will use far less water but an outsized amount of electricity. Most modern data centers combine one of these methods with some degree of free cooling which, as the name suggests, involves using fans to blow fresh outside air into servers. However, except in very rare circumstances, this method is not sufficient on its own. In short, there’s no getting around the fact that data centers consume water.
It makes little sense to talk about the electricity usage or the water usage of data centers without mentioning the other. After all, water is used in the process of converting non-renewable resources like oil and methane into usable energy. Therefore, assessing the water usage of data centers means accounting for both the water used directly to cool servers and the water used indirectly by way of electricity generation.
Data centers that support AI applications take a particularly large amount of energy and water to run due to their specialized processors, graphics processing units (GPUs). Big Tech’s pre-pandemic climate forecasts did not account for the energy needs of runaway AI usage. In May, Brad Smith, Microsoft’s president, stated that the company’s 2020 “carbon moonshot”—its pledge to remove more carbon than it emits by the end of the decade—is now out of reach thanks to the AI boom. “In many ways,” Smith said, “the moon is five times as far away as it was in 2020, if you just think of our own forecast for the expansion of AI and its electrical needs.”
In recent months, the spiraling resource drain of data centers has garnered a consistent string of headlines. A Washington Post investigation conducted in collaboration with researchers at the University of California, Riverside, recently revealed that ChatGPT requires up to three bottles of water to generate a single 100-word email. If global AI demand continues on its current trajectory, best estimates put its water withdrawal at around 4.2 to 6.6 billion cubic meters by 2027, the equivalent of the annual consumption of half the United Kingdom.
Though data centers have been around since the 1990s, the companies that operate them have never faced legal requirements to report their water usage figures. Nonetheless, every year there is a new scandal over how much water a particular data center uses, either due to numbers prised from its operator’s hands or to a sudden and notable lack of water for anything else in the surrounding area. Just last year, in the midst of what UN experts have labeled a worldwide water crisis, only 41% of data center operators reported on any water usage metric at all. The data center industry is notoriously private, with key players like Google publicly lobbying for this information to remain a trade secret. These facilities are perhaps the ultimate physical metaphor for the algorithmic “black box” and have become something of a fetish object for digital humanities researchers. Entering one generally requires a passport or other government identification, and they are often obscured on Google Maps, leading to gonzo attempts to map their presence in cities.
Between the secrecy of the industry and the complexity of making accurate calculations thanks to the water-energy nexus, there are no trustworthy ways to estimate how much water an AI data center will use. Given this, you might think that urban planners would approach them with caution, especially those in water-stressed areas. Spain is among the most water-stressed countries in the industrialized world. Spring 2024 saw the worst drought on record for the northeast region of Catalunya. At the height of the crisis, the reservoir that supplies the metropolitan area of Barcelona, including Cerdanyola, dipped to 15% capacity. A state of emergency was declared along with a slew of temporary laws. Barcelona’s iconic public fountains were switched off, and daily water caps were introduced at a citizen level. The agricultural sector, responsible for just under a fifth of the region’s GDP, was required to reduce its consumption by 80%: a move which spurred tractor blockades of Barcelona’s main streets. (The recent floods in Spain that devastated Valencia are, in fact, a related phenomenon.)
At the time of writing, the region’s drought status has been downgraded from ‘emergency’ to ‘alert.’ However, the problem is not just one of a bad season. Oliver Goshey, a regenerative farming expert in Catalunya who specializes in water management, told me: “Right now, we're paying for decades, if not centuries, of poor water management practices for short-term gains of re-election.”
Barcelona’s water future is precarious, and few of the solutions posed by local politicians are without controversy. There was talk of bringing fresh water in by boat from Marseille or by pipeline from the Ebro, a river further south, but the chosen option, it seems, was to invest €500 million in floating desalination plants.
The desalination plan has been criticized by the 30 campaign groups involved in the discourse around Barcelona’s water management issues, who gather at a Social Water Summit (Cimera Social de l’Aigua). Speaking on behalf of Plataforma Defensa de l'Ebre, a campaign group that is part of the Summit and which is dedicated to preserving the Ebro River, Matilde Font Ten told me:
We are consuming far more water than we have available. Climate change is reducing precipitation levels, decreasing river flows, and causing sea levels to rise. Combined, these factors offer little hope for a sustainable water future in Catalonia. Enjoying a healthy environment is crucial for citizens to have a future with fewer illnesses and better health. This should be the priority of any government rather than placing economic concerns above other sectors.
The Summit calls for the restoration of aquifers to increase the region’s water recharge capacity in the long term, along with transparent reporting on how much water various segments of society use, which could inform reductions. The problem with this suggestion is that, according to regional law, industrial consumers’ water usage data is protected from publication on the grounds of its connection to tax data, which is legally considered sensitive.
The expansion of the data centers in Cerdanyola will make it one of the most important data center hubs in Spain, if not the most important. The two largest among them will be a 42-megawatt data center operated by a US commercial real estate company, Panattoni, and a 60-megawatt one by AQ Compute, a subsidiary of German investment company Aquila Capital. To put their scale into perspective, a single megawatt of capacity can power an average of 173 US homes. Until relatively recently, the only data centers of this great size were ‘hyperscalers,’ built on behalf of a tech behemoth like Google or Meta, but now they are increasingly being built by lesser-known companies for whom data centers are a lucrative sideline, rather than a main offer. Such data centers do not face the same public scrutiny of their sustainability credentials.
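As a rough illustration of that scale, the sketch below applies the 173-homes-per-megawatt average quoted above to the two facilities. The function name is illustrative, and the calculation assumes the average scales linearly with each site's rated capacity, which is a simplification rather than a forecast of actual draw.

```python
# Average quoted in the article: one megawatt can power about 173 US homes.
HOMES_PER_MEGAWATT = 173


def homes_equivalent(capacity_mw: float) -> int:
    """Return the approximate number of US homes a given capacity could power."""
    return round(capacity_mw * HOMES_PER_MEGAWATT)


# Rated capacities reported for the two largest Cerdanyola facilities.
facilities = {"Panattoni": 42, "AQ Compute": 60}

for operator, capacity_mw in facilities.items():
    homes = homes_equivalent(capacity_mw)
    print(f"{operator}: {capacity_mw} MW is roughly {homes:,} homes")
```

On that assumption, the 42-megawatt site corresponds to roughly 7,266 homes and the 60-megawatt site to roughly 10,380.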
Carlos Dapena, the Project Manager of Cerdanyola’s industrial park, Parc de l’Alba, relayed that the town’s data centers will be cooled using “air condensation chillers that avoid the consumption of drinking water.” He explained that none of the data centers coming to the town have sought permission for an abnormal amount of water for a business in the industrial park.
But there is precedent for data center operators to wildly underestimate the amount of water their services will consume. Earlier this year, Meta announced plans to build its headquarters in Southern Europe in the Spanish city of Talavera de la Reina—plans which, the local government claimed, would use “little or no” water. When details were released under pressure from the activist group Tu Nube Seca Mi Río (Your Cloud Dries My River), it came to light that the project would use 665.4 million liters of water every year—despite the mere 40.6 million allocated for it in municipal agreements. This is not an isolated incident; in 2022, a data center in the Netherlands was found to use over four times the amount of water its operator claimed. When Microsoft assessed the water footprint of one of its data centers in Texas, it discovered the true water cost was 11 times more than it was paying.
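To set the two Talavera de la Reina figures side by side, a minimal sketch of the comparison follows. The variable and function names are illustrative; the two volumes are the ones reported above.

```python
# Figures reported for the Talavera de la Reina project,
# in millions of liters of water per year.
PROJECTED_USE_MILLION_L = 665.4
ALLOCATED_MILLION_L = 40.6


def overshoot_ratio(projected: float, allocated: float) -> float:
    """Return how many times larger the projected use is than the allocation."""
    return projected / allocated


ratio = overshoot_ratio(PROJECTED_USE_MILLION_L, ALLOCATED_MILLION_L)
gap = PROJECTED_USE_MILLION_L - ALLOCATED_MILLION_L

print(f"Projected use is about {ratio:.1f} times the allocation")
print(f"Volume beyond the allocation: about {gap:.1f} million liters per year")
```

The projected use works out to roughly 16 times the municipal allocation, a gap of about 625 million liters a year.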
According to the tech press and industry trade publications, the data center water problem is perennially on the cusp of being solved, be that with more efficient chips or the advent of clean fusion energy. There was much fanfare around the discovery that it is now, technically, possible to cool a data center without water. However, the feasibility of running large-scale data centers without considerable water resources is generally overstated. Microsoft’s underwater data center, for instance, an often-cited example of the industry’s cooling advances, closed earlier this year. The hurtling rate of expansion is far outpacing the search for game-changing technical fixes.
A minority of data center industry professionals freely admit this. John Booth is a seasoned data center sustainability consultant and outspoken voice in the industry. Though he believes the sector has improved practices considerably in the last ten years, Booth still sees data centers as “tinkering around the edges” of the sustainability question. On a podcast from industry media outlet DataCentreDynamics, he described a culture of “willful ignorance”:
I think there’s a lot of people in the industry at senior management level who are treading water until they retire. They look at the scale of what we’re going to have to do to make our data centers more sustainable, and they think, ‘I'm going to kick the can down the road until that’s someone else’s problem.’
Booth is a co-author of a new EU Energy Efficiency Directive (EED) that’s set to force operators’ hands on the reporting issue. The legislation, which came into force this September, requires European data centers to self-report 14 items of resource usage for the first time, including their water consumption and how much of it is derived from potable sources. It’s hoped that, over time, this will lead to the first truly representative benchmarks on the resource footprint of data centers.
In terms of local interests, a crucial question was whether the EED would require individual data centers to publish their numbers or whether water usage would only be publicly available as a total per country. Ultimately, the matter was left to the country level to decide. Spain appears to be adopting an interpretation of the EED that ensures companies’ resource usage data remains private, invoking its Trade Secrets Act.
In drought-stricken Catalunya, the secrecy around how much water individual businesses use in general is a hot topic. Under pressure from journalists at CRÍTIC, the Catalan Water Agency (ACA) recently released the names of the corporations that have been granted permission to be “large consumers” of water, though not the amounts of water they use. Following this precedent, I asked the ACA in April whether Panattoni, the company behind one of Cerdanyola’s upcoming data centers, had solicited permission to be a large consumer of water. The ACA replied that it had not. Since plans for the Panattoni data center, which is not yet built, were approved in spring 2023, it appears that the operation was accepted for development before its water consumption needs were officially cleared.
In a place where fears about the future water supply are justifiably widespread, it seems noteworthy that there has not been more visible public opposition to the plans to make Cerdanyola a major data center hub, or at least calls for the water management plans for this development to be released. The town has high levels of civic engagement, and its increasing industrialization has been subject to protests since at least the 1980s. One urban development plan was recently overturned following a citizen campaign. When approached for comment, a representative of the organization behind the campaign, Refem el Centre Direccional, said that the group knew little about the topic of data centers. Arguably, the data center industry’s culture of opacity around its resource usage is a democratic issue as much as a sustainability one.
With no opposition and no plans available for public discussion, the water footprint of the data centers planned for Cerdanyola appears, on the face of it, as a non-problem. After all, to reap the economic benefits of the AI boom, its infrastructure needs to be located somewhere—and Barcelona, with its healthy ecosystem of tech companies, seems to be among the beneficiaries. But with so little clarity over resource usage, how can urban planners weigh the potential benefits of such infrastructure against the effect on water resources, let alone communicate these sufficiently for citizens to have a say?
One first step would be to commission more independent assessments of the effects of data centers on the area they host. All too often, impact reports are sponsored by the industry itself, and the conclusion is drawn ahead of the investigation. In the case of Barcelona, a pamphlet was recently published that explains that thanks to "a virtuous circle between innovation and economic development,” data center investments of 1.04 billion euros will be returned sevenfold. It was co-authored by Digital Realty, a data center company with a presence in the city. The problem is not only that biased assessments weigh an area’s economic gain over questions around its water security; the few non-industry reports conducted on this topic reveal that it’s unclear whether data centers bring significant financial benefit to their host area at all. In 2016, an analysis by the non-profit think tank Good Jobs First claimed that US state subsidies pay $2 million for every data center job created.
Such independent reports could also be instructive to data center operators. Right now, businesses face a catch-22 when deciding where to build their data centers. If they “follow the sun” and locate their warehouses in hot places like Cerdanyola, they’ll need considerable water for cooling but can access abundant solar energy, which means they’ll incur less “scope 2” water usage. Should they “unfollow the sun” and locate them in cooler regions, they will use less water directly but, with diminished access to renewable solar energy, will likely need to increase their fossil fuel consumption, raising the scope 2 water usage. Which option would be less resource-intensive and, therefore, cheaper and potentially easier to wrangle through the EU’s web of sustainability laws? It would be helpful to have hard numbers to answer this question.
Secondly: if, as it appears, data centers represent a considerable burden on the water resources of their host location, it would be wise to ensure that they are not concentrated in one area, perhaps via centralized national planning. Data center hubs tend to mushroom in size since there is a connectivity advantage to placing one near another. This has led to a situation where four European cities have become operators’ preferred building locations—Frankfurt, London, Amsterdam, and Paris, known by the acronym FLAP—almost regardless of regional laws or climate considerations. Whilst Cerdanyola might have the resource capacity to host one or two cautiously planned data centers, allowing a water-stressed area to transform into a large data center hub would clearly be unwise.
However, these are only part solutions. Unfortunately, the only lasting fix to the problem of where to locate water-heavy infrastructure on a drying planet must involve reducing demand. This would require a reassessment of the speed and breadth of AI adoption into every facet of consumer life. From what we know of AI’s water footprint, it seems reckless that Google, operator of the world’s most-used search engine, has integrated the technology into every search by default, not least since the AI summaries are more likely to be incorrect than traditional search results. AI is being rolled out en masse, regardless of whether it is the most suitable tool for a job. From a resource usage standpoint, using ChatGPT to generate a 100-word email is like unblocking your toilet with a stick of dynamite. Yes, it will work, but there were routes you could have tried first that would do the job more reliably, and without the damage to your immediate environment.
Hopes to delay the rising tide of AI embeds might seem fanciful, but public opinion towards the technology is changing, if not the speed of its roll-out. We might look to the parable of cryptocurrency: once hailed as the solution to banking disasters like the 2008 financial crisis, its mining was found to consume massive amounts of energy. Having brought few visible benefits to society, crypto is now often considered wasteful and unnecessary. This has gone some way to initiating a conversation around the moral hierarchy of data processing which commentators on AI might pick up.
At the time of writing, the internal basins of Catalunya are hovering at just over 30% capacity. Yet the region’s water future is uncertain, and it is impossible to assess how much the large data centers in Cerdanyola will affect it. In a climate crisis, where any large consumer of water must be evaluated, it is nonsensical that data centers do not have to declare their numbers. If such reporting were public, the popular imaginary of AI as something which only exists in the immaterial world of cyberspace might change to something coherent with its environmental impacts. With such a shift, it would seem right to reserve the technology for situations that merit a resource-intensive solution. Data center operators might also be held to a level of local accountability commensurate with their local impact. In the absence of hard numbers, we can only hope that operators’ definitions of “virtually no” water usage are made in good faith.