Infrastructure · March 30, 2026

Cooling Is the Single Greatest Risk in Commissioning a 100MW AI Data Center

Somewhere in Ellendale, North Dakota, a 100-megawatt data center just came online for CoreWeave. Applied Digital pulled it off in two phases, hitting Ready for Service on the first 50 MW in October 2025 and the second 50 MW a month later. On paper, that reads like a clean execution story. But the people who actually had to commission the thing will tell you the paper version skips the hardest part.

The hardest part is cooling.

In a recent conversation with Philbert Shih of Structure Research, Applied Digital COO Laura Laltrello and VP of Data Center Design Stephen Lattimer laid out the operational realities of commissioning a 100MW AI facility. Their core message lands like a warning for anyone building at this scale: mechanical systems, and cooling above all else, represent the single greatest operational vulnerability in bringing a hyperscale AI data center to life.

That should stop every cooling vendor and operator in this industry cold. Because the gap between design intent and real-world performance is where projects die.

The Scale Problem Nobody Solved in Simulation

Traditional data center builds ran 5 to 10 megawatts. The commissioning playbook for those facilities was well-understood. Validate the equipment. Check the boxes. Move on. But AI factories running at 100MW and above have obliterated that model. Applied Digital's Polaris Forge 1 campus handles power densities 15 to 30 times higher than traditional data centers. You cannot commission that kind of thermal load with a checklist.

Lattimer knows this viscerally. He spent nearly three decades at Sturgeon Electric, starting as an electrician and working his way up to project leadership before moving to Flexential as a Data Center Architect. When he talks about commissioning, he talks about it the way a surgeon talks about operating: sequencing matters, timing matters, and the body on the table does not behave the way the textbook said it would.

Laltrello brings the systems-thinking lens. Before joining Applied Digital as COO in January 2025, she ran Honeywell's Building Automation Services business and led Lenovo's Data Center Services unit. Two decades of managing complex infrastructure at global scale. She has seen what happens when cooling systems that performed perfectly in controlled testing meet the chaos of actual operations.

What happens is recalibration. Constant, ongoing, frustrating recalibration.

Commissioning Starts Before the Concrete Dries

One of the more striking details from the discussion: commissioning validation at Applied Digital begins 30 to 45 days after groundbreaking. Not after construction wraps. After ground breaks. That means the commissioning team is embedded alongside the construction crew from almost day one, running validation processes on systems that are still being built around them.

This is a fundamental departure from how the industry operated five years ago. At 100MW, you cannot afford to discover that your cooling distribution doesn't match your rack layout after the building is finished. The feedback loops have to run in parallel with construction, and they have to involve the people who designed the systems, the people who built them, and the people who will operate them. All in the same room. Laltrello and Lattimer both emphasized that in-person collaboration across these teams is non-negotiable for hitting aggressive timelines.

The commissioning process intensifies as the facility approaches Ready for Service, stretching into a months-long validation campaign. This is where cooling infrastructure faces its harshest test.

Why Cooling Breaks Different

Power systems are binary in useful ways. A breaker works or it doesn't. A transformer steps voltage down or it fails. You can test these things with a high degree of confidence before load hits.

Cooling systems are analog. They depend on fluid dynamics, ambient temperature, humidity, airflow patterns, and the interactions between all of these variables under real thermal load. Every assumption the design engineers made gets stress-tested the moment actual GPUs start generating actual heat. And the assumptions are always at least partially wrong.
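What stress-testing an assumption looks like in practice is simple to sketch: take the design team's thermal model for one cooling segment and reconcile its prediction against what the sensors actually read under GPU load. The model, coefficients, sensor names, and tolerance below are hypothetical stand-ins for illustration, not Applied Digital's tooling.

```python
# Illustrative only: the design model, sensor names, and recalibration threshold
# are assumptions, not Applied Digital's actual commissioning tooling.

def predicted_supply_temp_c(ambient_c: float, humidity_pct: float, load_mw: float) -> float:
    """A stand-in for the design team's thermal model of one cooling segment."""
    # Hypothetical fitted model: warmer ambient, higher humidity, and more IT load
    # all push the predicted supply temperature up.
    return 18.0 + 0.25 * (ambient_c - 10.0) + 0.03 * humidity_pct + 0.8 * load_mw


def needs_recalibration(measured_c: float, ambient_c: float, humidity_pct: float,
                        load_mw: float, tolerance_c: float = 1.5) -> bool:
    """True when the as-built system diverges from the design model beyond tolerance."""
    predicted = predicted_supply_temp_c(ambient_c, humidity_pct, load_mw)
    return abs(measured_c - predicted) > tolerance_c


# Example: under real GPU load the measured supply temperature drifts off the model.
print(needs_recalibration(measured_c=27.4, ambient_c=12.0, humidity_pct=60, load_mw=4.0))
```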

Applied Digital's cooling architecture at Polaris Forge 1 illustrates both the ambition and the complexity. Working with BASX, a subsidiary of AAON, they deployed a custom zero-water chiller system built around three operating modes: full free cooling using only pumps and fans, partial free cooling that supplements with direct expansion when ambient conditions demand it, and full mechanical cooling for peak temperature periods. The system targets a PUE of 1.18 with near-zero water usage.

On paper, elegant. In commissioning, every transition between those three modes is a potential failure point. The control logic that governs when the system shifts from full free cooling to partial, and from partial to full mechanical, has to be tuned against real conditions. North Dakota gives you roughly 220 days of free cooling annually, which sounds generous until you realize it leaves 145 days when the system must execute those transitions flawlessly under load.
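To make the transition problem concrete, here is a minimal sketch of mode-selection logic with hysteresis. The thresholds, hysteresis band, and names are illustrative assumptions rather than the actual BASX control scheme, and a real controller would also weigh humidity, wet-bulb temperature, and IT load. But the shape of the problem is the same: every boundary between modes is a place where tuning against real conditions decides whether a switchover is smooth or a thermal excursion.

```python
# Illustrative sketch only: thresholds, hysteresis band, and mode names are
# assumptions for discussion, not the actual BASX/Applied Digital control logic.
from enum import Enum


class CoolingMode(Enum):
    FREE = "full free cooling"              # pumps and fans only
    PARTIAL = "partial free cooling + DX"   # free cooling supplemented by direct expansion
    MECHANICAL = "full mechanical cooling"  # peak ambient conditions


# Hypothetical switchover points (degrees C) and a hysteresis band that keeps the
# plant from oscillating between modes when ambient hovers near a threshold.
FREE_TO_PARTIAL_C = 18.0
PARTIAL_TO_MECH_C = 27.0
HYSTERESIS_C = 2.0


def next_mode(current: CoolingMode, ambient_c: float) -> CoolingMode:
    """Pick the operating mode for the next control interval.

    Upward transitions trigger at the threshold; downward transitions only
    trigger once ambient drops a full hysteresis band below it. These are the
    setpoints commissioning teams end up retuning against real weather.
    """
    if current == CoolingMode.FREE:
        return CoolingMode.PARTIAL if ambient_c >= FREE_TO_PARTIAL_C else CoolingMode.FREE
    if current == CoolingMode.PARTIAL:
        if ambient_c >= PARTIAL_TO_MECH_C:
            return CoolingMode.MECHANICAL
        if ambient_c < FREE_TO_PARTIAL_C - HYSTERESIS_C:
            return CoolingMode.FREE
        return CoolingMode.PARTIAL
    # current == MECHANICAL
    if ambient_c < PARTIAL_TO_MECH_C - HYSTERESIS_C:
        return CoolingMode.PARTIAL
    return CoolingMode.MECHANICAL


# Example: ambient climbing and falling through a shoulder-season afternoon.
mode = CoolingMode.FREE
for ambient in (12.0, 17.5, 18.5, 26.0, 27.5, 25.8, 24.9):
    mode = next_mode(mode, ambient)
    print(f"{ambient:5.1f} C -> {mode.value}")
```

Commissioning, in effect, means sweeping ambient conditions and load through every one of those boundaries and verifying the plant does what the state machine says it should.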

One documented commissioning failure at a hyperscale facility makes the risk concrete. During testing, an air handling unit displayed the correct status on its interface while the dampers shut completely and the fans refused to increase speed. The monitoring said everything was fine. The physics said the IT equipment was about to overheat. That kind of divergence between what the controls report and what the hardware actually does is exactly what commissioning is supposed to catch. But it only gets caught if the commissioning process is rigorous enough, and starts early enough, to test every mode under every condition.
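That failure pattern suggests the kind of check a commissioning team can automate: never trust reported state alone, reconcile it against an independent physical measurement. A minimal sketch, with invented sensor fields and tolerances:

```python
# Hypothetical sketch: field names, thresholds, and the AHU reading structure are
# invented for illustration; real commissioning tooling would pull these from the
# BMS and from independent airflow instrumentation.
from dataclasses import dataclass


@dataclass
class AhuReading:
    reported_damper_open_pct: float   # what the controls interface claims
    reported_fan_speed_pct: float     # what the controls interface claims
    measured_airflow_cfm: float       # independent airflow measurement
    expected_airflow_cfm: float       # design airflow at the commanded setpoint


def check_reported_vs_actual(r: AhuReading, tolerance: float = 0.15) -> list[str]:
    """Flag divergence between what the unit reports and what the air is doing."""
    faults = []
    airflow_ratio = r.measured_airflow_cfm / max(r.expected_airflow_cfm, 1.0)
    # The unit claims the fans are moving air, but the measured flow says otherwise.
    if r.reported_fan_speed_pct > 50 and airflow_ratio < 1 - tolerance:
        faults.append(
            f"fan reports {r.reported_fan_speed_pct:.0f}% but airflow is only "
            f"{airflow_ratio:.0%} of expected"
        )
    # Dampers report open while essentially no air is moving.
    if r.reported_damper_open_pct > 80 and airflow_ratio < 0.2:
        faults.append("dampers report open but measured airflow is near zero")
    return faults


# Example: the documented failure pattern -- healthy status, dead airflow.
print(check_reported_vs_actual(
    AhuReading(reported_damper_open_pct=100, reported_fan_speed_pct=80,
               measured_airflow_cfm=500, expected_airflow_cfm=20000)))
```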

The Vendor Alignment Problem

Laltrello and Lattimer were emphatic about one thing: early vendor alignment determines whether you hit your timeline or blow past it. At 100MW, the cooling infrastructure involves dozens of vendors. Chiller manufacturers, piping contractors, controls integrators, BMS providers, liquid cooling CDU (coolant distribution unit) suppliers, direct-to-chip loop specialists. Each one operates on its own schedule, its own testing protocols, its own definition of "ready."

Commissioning forces all of those definitions to converge. And they never converge smoothly.

The coordination challenge compounds at AI-scale densities because the cooling system isn't just rejecting heat from servers. It's managing a direct-to-chip liquid cooling loop that enables higher return fluid temperatures, which in turn allows more heat rejection directly to ambient air. That thermodynamic chain only works if every component in the loop performs within spec simultaneously. One vendor's heat exchanger running 3 degrees off design temperature ripples through the entire system.
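The ripple is easy to quantify with the basic heat balance Q = ṁ × cp × ΔT. The flow rate and loop temperatures below are assumed for illustration, not Polaris Forge design figures, but they show why a 3-degree miss is not a rounding error:

```python
# Back-of-the-envelope only: flow rate, supply/return temperatures, and the 3 C
# offset scenario are assumed values, not Polaris Forge design figures.

CP_WATER_KJ_PER_KG_C = 4.18   # specific heat of water (glycol mixes run lower)
FLOW_KG_PER_S = 200.0         # assumed loop mass flow for one cooling segment
SUPPLY_C = 30.0               # assumed facility supply temperature
RETURN_DESIGN_C = 42.0        # assumed design return temperature (direct-to-chip loop)


def heat_rejected_kw(supply_c: float, return_c: float, flow_kg_s: float) -> float:
    """Q = m_dot * c_p * (T_return - T_supply), in kW."""
    return flow_kg_s * CP_WATER_KJ_PER_KG_C * (return_c - supply_c)


design = heat_rejected_kw(SUPPLY_C, RETURN_DESIGN_C, FLOW_KG_PER_S)
# One heat exchanger running 3 C off design drops the loop's delta-T from 12 C to 9 C.
off_spec = heat_rejected_kw(SUPPLY_C, RETURN_DESIGN_C - 3.0, FLOW_KG_PER_S)

print(f"design heat rejection: {design:,.0f} kW")
print(f"3 C off design:        {off_spec:,.0f} kW  ({off_spec / design:.0%} of design)")
```

With those assumed numbers, a 3-degree shortfall on a 12-degree design delta-T erases a quarter of that segment's heat rejection capacity, and the deficit has to be made up somewhere: higher flow, colder supply, or an earlier shift into mechanical cooling.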

The Uptime Institute's 2024 outage analysis found that power and cooling issues drive the majority of data center outages, with 54% of respondents reporting their most recent outage cost more than $100,000 and 16% exceeding $1 million. Those figures reflect traditional facilities. At 100MW AI scale, where a single building can house hundreds of millions of dollars in GPU infrastructure for a tenant like CoreWeave, the cost of a cooling-induced outage doesn't just scale linearly. It compounds.

What This Means for the Supply Chain

Applied Digital's Polaris Forge campus is designed to expand to a gigawatt. The second building, a 150MW facility, is under construction now with a mid-2026 target. Every lesson learned commissioning the first 100MW building feeds directly into the next one. That institutional knowledge is a competitive moat.

But for the broader cooling industry, the implications are stark. If commissioning is a months-long validation process that begins 30 to 45 days after groundbreaking, then cooling vendors who show up with equipment and a spec sheet are already behind. The vendors who win at this scale are the ones who embed with the commissioning team from day one, who design their systems with testability as a first-order requirement, and who staff field engineers capable of real-time recalibration as conditions diverge from models.

Lattimer's career arc tells the story of what the industry needs more of. He went from pulling wire to designing entire data center architectures. That ground-level mechanical intuition, the ability to look at a cooling system and know from experience where it will misbehave, cannot be replicated by simulation software. The industry is building facilities that demand both computational design tools and the kind of hard-won field knowledge that only comes from decades of hands-on work.

The race to build AI infrastructure at hyperscale is accelerating. Applied Digital alone has $5 billion in new lease commitments. But every megawatt of that capacity has to pass through the commissioning bottleneck, and cooling is where that bottleneck is tightest. The companies that figure out how to compress commissioning timelines without cutting corners on thermal validation will define the next era of data center deployment. Everyone else will be explaining to their hyperscaler tenants why Ready for Service slipped another quarter.