NVIDIA's Vera Rubin systems will demand up to 600 kW per rack. That is the thermal output of a small industrial furnace, running continuously, inside a building full of identical furnaces, all connected to plumbing that cannot leak. Peter de Bock, VP of Data Center Energy and Cooling at Eaton, told Techzine that the industry needs to stop thinking about data centers as telecom facilities and start engineering them like aerospace systems. Given what the thermal loads now look like, he has a point.
The data center industry built its engineering culture around relatively gentle power densities. Ten kilowatts per rack. Twenty. Maybe forty at the high end. Air cooling handled all of it. The mechanical systems were straightforward. CRAC units, raised floors, hot aisle containment if you were being disciplined about it. The failure modes were understood. The workforce knew how to commission and maintain the equipment. That era ended when AI accelerators arrived and NVIDIA's power roadmap started dictating what the cooling industry had to build.
De Bock's framing is precise. A single AI rack at 600 kW produces thermal density comparable to an internal combustion engine running continuously. Except an engine cycles on and off, operates in open air with forced convection from vehicle movement, and has a century of failure-mode engineering behind it. An AI rack runs at full thermal load 24 hours a day, seven days a week, inside an enclosed building, surrounded by other racks producing identical heat flux, all cooled by a shared liquid loop that must deliver precise thermal control to silicon that throttles at 85 to 95 degrees Celsius.
The global AI data center fleet already requires more than 150 gigawatts of power, roughly 15 times the grid capacity of New York City. Every one of those watts becomes a watt of heat. The thermal management infrastructure required to reject that heat is industrial in scale and aerospace in precision. De Bock is arguing that the engineering discipline needs to match.
This is where de Bock's argument gets sharp. He contends that PUE masks actual efficiency by 0.3 points or more. A facility reporting a PUE of 1.1 may be operating closer to 1.4 when you account for the full cooling chain, including pumps, fans, CDU overhead, and heat rejection equipment that does not show up in the simplified PUE calculation. Traditional air-cooled facilities already consume roughly 40 percent of total energy for cooling alone. PUE, as currently measured by most operators, does not capture enough of the parasitic load to be a useful engineering metric.
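The arithmetic is easy to verify. A minimal sketch of the accounting gap de Bock describes, using the standard PUE definition (total facility power over IT power); every load figure below is invented for illustration:

```python
# Hypothetical illustration of the PUE gap: a facility that reports 1.1
# but runs closer to 1.4 once the full cooling chain is counted.
# All load figures are made up; only PUE = total power / IT power is standard.

it_load_mw = 10.0            # IT equipment power
reported_overhead_mw = 1.0   # overhead captured in the simplified PUE report

# Parasitic loads often left out of simplified PUE accounting
pumps_mw = 0.8
cdu_mw = 0.6
dry_cooler_fans_mw = 0.9
heat_rejection_misc_mw = 0.7

reported_pue = (it_load_mw + reported_overhead_mw) / it_load_mw
full_chain_pue = (it_load_mw + reported_overhead_mw + pumps_mw + cdu_mw
                  + dry_cooler_fans_mw + heat_rejection_misc_mw) / it_load_mw

print(f"reported PUE:   {reported_pue:.2f}")    # 1.10
print(f"full-chain PUE: {full_chain_pue:.2f}")  # 1.40
```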
De Bock's proposed replacement: tokens per watt. Measure actual AI computational output against power drawn at the grid connection. This shifts the optimization target from facility-level overhead ratios to workload-level productivity. A data center with a mediocre PUE that runs its GPUs at higher utilization and delivers more inference throughput per watt of grid power may be a better-engineered facility than one with a pristine PUE number and chronic thermal throttling that nobody measures. We have covered this blind spot before. Most operators leave enormous efficiency gains on the table because they optimize for the wrong metric.
The tokens-per-watt framing is useful because it connects cooling performance directly to revenue. Every degree of thermal headroom that prevents GPU throttling translates to more tokens generated per second. Every watt saved on cooling overhead is a watt available for compute. The financial case for better thermal engineering becomes self-evident when the metric is output per grid watt rather than overhead per IT watt.
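As a sketch of what that comparison looks like in practice, here is the metric applied to two hypothetical facilities. Every throughput and power figure is invented; only the division is real:

```python
# Hedged sketch: comparing two hypothetical facilities on tokens per grid watt.
# The numbers are illustrative, not measurements from any real facility.

def tokens_per_grid_watt(tokens_per_second: float, grid_power_w: float) -> float:
    """De Bock's proposed metric: useful AI output per watt of grid draw."""
    return tokens_per_second / grid_power_w

# Facility A: pristine PUE of 1.1, but chronic thermal throttling cuts throughput.
a = tokens_per_grid_watt(tokens_per_second=1.4e6, grid_power_w=22e6)

# Facility B: mediocre PUE of 1.3, but high utilization and no throttling.
b = tokens_per_grid_watt(tokens_per_second=2.0e6, grid_power_w=26e6)

print(f"A: {a:.3f} tokens/s per grid watt")  # ~0.064
print(f"B: {b:.3f} tokens/s per grid watt")  # ~0.077, B wins despite worse PUE
```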
De Bock advocates for hot-water cooling loops running at 45 degrees Celsius in and 60 degrees out. This is the same temperature regime that NVIDIA has been pushing to eliminate chillers from the cooling chain entirely. The physics are compelling and they hinge on the cubic relationship that de Bock emphasizes.
Heat rejection through dry coolers follows a cubic law: for a fixed heat load, fan power scales with the inverse cube of the temperature differential between coolant and ambient air. Double the delta-T, and fan power requirements drop by a factor of eight. A traditional cooling loop running at 20 to 30 degrees Celsius operates with a narrow temperature differential to ambient, especially in warm climates, which means the dry coolers work hard. Push the coolant to 45 to 60 degrees and the differential to ambient opens up dramatically, even on a 35-degree day. The fans slow down. The parasitic power drops. And in many climates, the chiller disappears from the design entirely.
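A first-order model makes the scaling concrete. The sketch below assumes textbook fan affinity laws (fan power proportional to the cube of airflow) and heat rejection proportional to airflow times delta-T; real dry coolers deviate from this, but the cubic story holds at first order:

```python
# First-order model of the cubic law: for a fixed heat load Q,
# required airflow ~ Q / delta_T, and fan power ~ airflow**3,
# so fan power ~ (baseline_delta_T / delta_T)**3.

def relative_fan_power(delta_t_k: float, baseline_delta_t_k: float = 10.0) -> float:
    """Fan power relative to the baseline differential, fixed heat load."""
    return (baseline_delta_t_k / delta_t_k) ** 3

for dt in (10.0, 20.0, 25.0):
    print(f"delta-T {dt:>4.0f} K -> relative fan power {relative_fan_power(dt):.3f}")
# 10 K -> 1.000   baseline: narrow coolant-to-ambient approach
# 20 K -> 0.125   double the delta-T, one-eighth the fan power
# 25 K -> 0.064   a hot-water loop on a 35-degree day
```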
There is a second benefit. Hot water at 60 degrees Celsius leaving the data center is warm enough for district heating networks. European directives are already pushing data center operators toward waste heat recovery, and a 60-degree return loop connects directly to municipal heating infrastructure without a heat pump intermediary. That same cooling loop that rejects heat from GPUs can feed warmth into apartment buildings and hospitals, turning an operational cost into sellable thermal energy.
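The energy balance behind that sellable heat is a one-liner. A back-of-envelope sketch, using the article's 600 kW rack figure and standard water properties; nothing else here is sourced:

```python
# Back-of-envelope: what one 600 kW rack puts into a 45-in / 60-out water loop.
# Energy balance: Q = m_dot * cp * delta_T.

q_rack_w = 600_000.0        # heat to reject per rack (article figure)
t_in_c, t_out_c = 45.0, 60.0
cp_water = 4186.0           # J/(kg*K), specific heat of water

delta_t = t_out_c - t_in_c                    # 15 K temperature rise
flow_kg_s = q_rack_w / (cp_water * delta_t)   # ~9.6 kg/s, roughly 9.6 L/s

print(f"coolant flow per rack: {flow_kg_s:.1f} kg/s")
print(f"heat available to a district loop: {q_rack_w/1000:.0f} kW at {t_out_c:.0f} C")
```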
The tradeoff is silicon temperature margin. Running coolant at 45 degrees instead of 20 degrees raises the cold plate surface temperature, which raises junction temperatures on the GPU die. At 600 kW per rack, the thermal resistance of every layer between junction and coolant matters. Cold plates, thermal interface materials, manifold design, flow balancing across dozens of GPUs in a single rack. The engineering tolerance compresses with every degree. A jet engine operates under the same constraint: extreme temperatures, tight margins, zero room for slop in the thermal chain. De Bock's aerospace analogy holds because the physics are functionally identical.
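The stack-up is straightforward to model at first order: junction temperature is coolant temperature plus chip power times the summed thermal resistances between die and coolant. A hedged sketch with invented per-layer resistances and chip power (no vendor data here, and it ignores coolant temperature rise along the cold plate) shows how quickly the margin to an 85-degree throttle point evaporates:

```python
# Hedged sketch of the junction-temperature stack-up:
# T_junction = T_coolant + P * (R_junction_case + R_TIM + R_cold_plate).
# All resistances and the chip power are hypothetical illustrative values.

p_chip_w = 1200.0        # hypothetical per-GPU power
r_junction_case = 0.010  # K/W, die to package case
r_tim = 0.008            # K/W, thermal interface material
r_cold_plate = 0.012     # K/W, cold plate surface to coolant

rise_k = p_chip_w * (r_junction_case + r_tim + r_cold_plate)  # 36 K rise

for t_coolant in (20.0, 45.0):
    t_junction = t_coolant + rise_k
    margin = 85.0 - t_junction   # margin to an 85 C throttle point
    print(f"coolant {t_coolant:.0f} C -> junction {t_junction:.0f} C "
          f"(margin to throttle: {margin:.0f} K)")
# coolant 20 C -> junction 56 C (margin 29 K)
# coolant 45 C -> junction 81 C (margin  4 K)
```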
De Bock borrows directly from aerospace reliability frameworks. Three dimensions of failure: severity, occurrence, and detectability. Severity asks how bad a failure is when it happens and whether inherent backup systems limit the damage. Occurrence quantifies how often it happens based on measured data, not assumptions. Detectability asks whether predictive monitoring can flag the problem weeks or months before it becomes a failure event.
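This is the classic FMEA triad, where the product of the three scores gives a risk priority number for ranking failure modes. A sketch of how that scoring might look applied to a liquid cooling loop; the failure modes and scores below are invented for illustration:

```python
# Hedged sketch of FMEA-style scoring for cooling failure modes.
# Severity, occurrence, and detectability are each rated 1-10
# (10 = worst: most severe, most frequent, hardest to detect);
# the risk priority number (RPN) is their product. All entries invented.

failure_modes = {
    #  mode:                   (severity, occurrence, detectability)
    "quick-disconnect drip":   (7, 4, 3),
    "CDU pump bearing wear":   (8, 3, 2),
    "manifold flow imbalance": (6, 5, 6),
}

for mode, (sev, occ, det) in sorted(
        failure_modes.items(),
        key=lambda kv: kv[1][0] * kv[1][1] * kv[1][2],
        reverse=True):
    print(f"{mode:25s} RPN = {sev} x {occ} x {det} = {sev * occ * det}")
# Highest RPN gets engineering attention first, even if it is not
# the most severe mode in isolation.
```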
This framework matters at 600 kW per rack because the cost of failure scales with the equipment it destroys. Individual AI servers now exceed $1 million in value. A cooling failure that takes out a rack of NVIDIA Vera Rubin GPUs is a seven-figure loss event before you count the revenue impact of the workload interruption. The data center industry has historically tolerated cooling system failures because the equipment was relatively cheap to replace and the workloads could be migrated. Neither condition holds for AI infrastructure.
Eaton's position is that prefabricated, pre-tested modular compute pods built in factories are the answer to the reliability problem. Build the rack, the cooling, the power distribution, and the plumbing as an integrated unit. Test it before it ships. Commission it as a known-good assembly rather than integrating four different vendor systems on a live data center floor. This echoes what we have tracked in the modular data center movement, where the build model is shifting from construction site to manufacturing line.
Aerospace engineering requires aerospace engineers. The data center industry does not have them. De Bock's call for "system-level thinking" assumes a workforce that can integrate electrical, mechanical, thermal, and controls engineering into a single coherent design. The current reality is that the biggest barrier to liquid cooling adoption has nothing to do with technology. The barrier is people. Operators who have never commissioned a liquid cooling system. Technicians who have never bled a manifold. Facility managers who built their careers around air handling units and now face a thermal architecture that looks more like a chemical plant than a server room.
Eaton can build the hardware. So can Vertiv, Schneider, and the hundred other vendors racing to fill the liquid cooling supply chain. The harder problem is the human capital required to operate what they build. Aerospace-grade systems need aerospace-grade maintenance. The industry is not there yet.
De Bock is right about the engineering gap. The data center industry is running 2026 thermal loads through a 2015 engineering culture. Air cooling tolerated sloppy commissioning, oversized CRAC units, and maintenance schedules that slipped by a quarter without consequence. Liquid cooling at 600 kW per rack punishes all of that. Margins and failure costs now move in opposite directions: the margins compress while the failure costs compound with every additional GPU rack on the loop. The interdependencies between power, cooling, and compute are tighter than anything this industry has managed before.
The tokens-per-watt metric deserves adoption. PUE served the industry well when the optimization question was "how much overhead does this facility generate." That question has changed. What matters now is how much useful AI compute a facility produces per watt of grid power, and PUE cannot answer it. The operators that can answer it, in measured tokens per grid watt, will be the ones that win hyperscaler contracts in 2027 and beyond.
Our position: within three years, the operators who treat cooling as an integrated engineering discipline, with the same rigor applied to the thermal chain as to the compute it supports, will run hotter coolant, reject more heat per dollar, throttle fewer GPUs, and produce more tokens per watt of grid connection. Everyone else will watch their PUE hold steady while their customers migrate to facilities where the cooling architecture was designed around the workload from day one.