Sponsored by Vertiv · May 12, 2026

Data Center Cooling Systems: A Foundational Guide From Vertiv

Most articles in The Cooling Report assume the reader already knows the difference between a CRAH and a CDU, can quote PUE at the rack level, and has opinions about single-phase versus two-phase immersion. The audience that pays for this newsletter has been doing this work for years. But that audience also keeps asking us for a clean primer to send to procurement leads, board members, sustainability teams, and the new hires who need to be operational by Q3.

This piece, sponsored by Vertiv, is that primer. It pulls from Vertiv's foundational guide to data center cooling systems and adapts it to the tighter, more operational frame this publication is known for. If you have ever tried to explain to a non-engineer why a 100-megawatt facility needs more cooling capacity than the IT equipment itself uses to operate, this is the link to send.

What Environmental Control Actually Means

Data center environmental control is the management of temperature, humidity, particulate filtration, and air movement inside a facility, calibrated to keep IT equipment operating within manufacturer-specified tolerances. The most-cited target band is 70 to 75°F (21 to 24°C) at the server intake. Some research suggests legacy facilities run cooler than that, often below 70°F, which costs energy without improving reliability.

The reason the band matters is that every degree below the upper specification limit costs additional cooling power. Every degree above accelerates component degradation and shortens equipment lifetime. The optimization problem the cooling architecture is trying to solve is to hold the intake band as close to the warm end of the spec as possible without sustained excursions above it.
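If it helps to see the band as a check rather than a sentence, here is a minimal sketch in Python. The 70 and 75°F thresholds come from the band above; the sensor readings, and the classify_intake helper itself, are hypothetical stand-ins for what a facility would pull from its monitoring system.

```python
# Flag server-intake temperature readings against the 70-75°F band described above.
# The readings below are hypothetical; a real facility would pull them from telemetry.

INTAKE_LOW_F = 70.0   # below this, cooling energy is being spent for no reliability gain
INTAKE_HIGH_F = 75.0  # above this, sustained operation shortens equipment life

def classify_intake(temp_f: float) -> str:
    """Classify a single intake reading against the recommended band."""
    if temp_f < INTAKE_LOW_F:
        return "overcooled"
    if temp_f > INTAKE_HIGH_F:
        return "excursion"
    return "in band"

readings_f = [68.2, 71.5, 74.9, 76.3, 72.0]  # hypothetical intake sensors
for temp in readings_f:
    print(f"{temp:5.1f} °F -> {classify_intake(temp)}")
```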

The Five Constraints Every Cooling Design Must Solve For

Vertiv's foundational guide names five technical challenges that every cooling design has to address simultaneously. They are worth listing because facility operators tend to optimize for one or two of them and underweight the others, which is how mid-life problems compound.

Adaptability and scalability. A cooling architecture has to flex as compute demand changes. Rack densities have migrated from 5 kW historical norms to 50 kW and beyond in AI-heavy environments. A design that locks in the wrong density assumption locks the operator out of the next workload generation.

Availability. The cooling system has to be more reliable than the workload it supports. Mission-critical data centers run at N+1 or 2N cooling redundancy because a single chiller failure in a non-redundant plant can take a facility down inside 90 seconds.

Life cycle costs. The capex of a cooling system is a fraction of the total cost of ownership. The opex on power, water, maintenance, and refrigerant management over a 15-year operating life will exceed the original purchase price several times over; a rough arithmetic sketch follows this list.

Maintenance and serviceability. Components have to be accessible. Spare parts have to be available. Service technicians have to be qualified on the specific equipment installed. A facility designed around an exotic architecture with no local service depth will pay for that decision through every emergency repair.

Manageability. The control system has to be operable by the team that will own it. Vendor-specific control platforms with proprietary interfaces and limited integration paths create operational debt that compounds over time.
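To put rough numbers on the life-cycle-cost point, here is an arithmetic sketch. Every figure in it is a placeholder assumption rather than vendor pricing; the only point is that annual operating costs multiplied across a 15-year life dwarf the purchase price.

```python
# Rough total-cost-of-ownership arithmetic for a cooling plant over a 15-year life.
# All figures are placeholder assumptions for illustration, not vendor pricing.

capex = 4_000_000             # hypothetical purchase and installation cost, USD
annual_energy = 900_000       # hypothetical electricity for chillers, fans, pumps, USD/yr
annual_water = 150_000        # hypothetical water and treatment, USD/yr
annual_maintenance = 250_000  # hypothetical service contracts, spares, refrigerant, USD/yr
years = 15

opex_total = (annual_energy + annual_water + annual_maintenance) * years
tco = capex + opex_total

print(f"Capex:             ${capex:>12,.0f}")
print(f"Opex over {years} yrs:  ${opex_total:>12,.0f}")
print(f"TCO:               ${tco:>12,.0f}")
print(f"Opex is {opex_total / capex:.1f}x the original purchase price")
```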

Airflow Management Is the Foundation

Before liquid cooling enters the conversation, every facility has to solve airflow management. The principle is simple: hot exhaust air and cold intake air should not mix. The execution is harder than it looks.

Rack hygiene is the term of art. Blanking plates seal gaps in racks where empty rack units would otherwise allow hot exhaust to recirculate to the front of the rack. Hot-aisle and cold-aisle separation arranges rack rows so that all server intakes face each other across shared cold aisles and all exhausts vent into alternating hot aisles. Containment, either hot-aisle or cold-aisle, takes that separation to its logical conclusion by physically barricading the two streams.

Facilities that get airflow management right operate at PUEs below 1.4 with conventional cooling. Facilities that do not can run PUEs above 2.0 even with newer equipment, because the cooling plant is constantly fighting recirculation losses that good airflow design would have eliminated for free.
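PUE itself is simple arithmetic: total facility power divided by IT power. The sketch below uses hypothetical load splits to show how the same IT floor reads as roughly 1.4 or 2.0 depending on how hard the cooling plant has to work against recirculation.

```python
# PUE = total facility power / IT equipment power.
# The two scenarios below use hypothetical loads to contrast a well-contained
# facility with one fighting recirculation losses.

def pue(it_kw: float, cooling_kw: float, other_overhead_kw: float) -> float:
    """Power usage effectiveness for a given split of facility load."""
    return (it_kw + cooling_kw + other_overhead_kw) / it_kw

# Hypothetical 1 MW IT load, good airflow management vs. poor.
print(f"Contained aisles:    PUE = {pue(1000, 300, 80):.2f}")
print(f"Heavy recirculation: PUE = {pue(1000, 850, 150):.2f}")
```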

Where Energy Efficiency Gets Found

The energy efficiency conversation in modern data center cooling has three lever points worth understanding.

The first is measurement. Most facilities do not have accurate metering of non-IT power consumption at the granularity required to find the next efficiency win. Cooling represents 30 to 40% of facility electricity draw at conventional PUE. Without measurement, the optimization opportunity stays invisible.

The second is computational fluid dynamics. CFD analysis models airflow patterns inside the facility before construction. The output reveals dead zones, recirculation paths, and stratification that human intuition misses. Facilities designed with CFD operate closer to theoretical PUE than facilities designed by rule of thumb.

The third is free cooling. When ambient air is colder than chilled water return temperature, the facility can transfer heat to the outside without running compressors. Free cooling availability is a function of geography, season, and the design temperature of the cooling loops. Sites with favorable climates can run free cooling for thousands of hours per year, materially reducing the operating energy of the cooling plant.
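Here is a rough sketch of how free-cooling availability gets estimated, assuming an hourly ambient temperature series and a single changeover threshold. Real economizer logic also weighs humidity, approach temperatures, and partial free cooling; the synthetic weather data below is a placeholder for actual site records.

```python
# Count the hours in a year when ambient air is cold enough to reject heat
# without running compressors. The temperature series and the changeover
# threshold are hypothetical placeholders for real site weather data.

import random

random.seed(0)
# Hypothetical hourly ambient dry-bulb temperatures (°C) for one year.
hourly_ambient_c = [random.gauss(mu=11.0, sigma=9.0) for _ in range(8760)]

CHANGEOVER_C = 14.0  # assumed chilled-water return minus a few degrees of approach

free_cooling_hours = sum(1 for t in hourly_ambient_c if t < CHANGEOVER_C)
print(f"Estimated free-cooling hours: {free_cooling_hours} of 8760 "
      f"({free_cooling_hours / 8760:.0%} of the year)")
```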

What Comes Next

Vertiv's guide identifies four trends shaping the next decade of cooling architecture, and the operational data backs them up.

Operating temperatures are rising. ASHRAE has progressively expanded its recommended envelope, and major hyperscalers now run intake temperatures above 80°F on production workloads without reliability impact. The benefit is reduced cooling energy. The risk is narrower margin for cooling-system failure recovery.

Hybrid cooling is the practical default. Air cooling handles workloads up to roughly 40 kW per rack. Liquid cooling takes over above that. Most facilities will run both architectures concurrently for the rest of the decade, because workload mixes do not migrate uniformly and stranded thermal capacity is expensive. A simple density-split sketch follows this list.

Site selection now centers on thermal access. Proximity to cold-water bodies, high-altitude or high-latitude geography with low ambient temperatures, and access to renewable baseload power are reshaping where new capacity is being announced. Nordic latitudes, the Pacific Northwest, and certain parts of the U.S. Mountain West are absorbing demand that historically would have gone to lower-cost-of-power Southern markets.

Free cooling deployment is expanding. Direct outside air economizers, indirect evaporative systems, and closed-loop chilled water plants with high-temperature setpoints are all extending the operating hours during which mechanical refrigeration is offline. The energy savings compound across the operating life of the facility.
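As a planning-level illustration of the hybrid point, the sketch below splits a hypothetical rack manifest at the rough 40 kW changeover mentioned above. The densities and the threshold constant are assumptions for illustration, not a sizing rule.

```python
# Split a hypothetical rack manifest into air-cooled and liquid-cooled groups
# using the rough 40 kW/rack changeover discussed above. Densities are made up.

AIR_COOLING_LIMIT_KW = 40.0

rack_densities_kw = [8, 12, 15, 30, 45, 60, 75, 10, 22, 55]  # hypothetical racks

air_cooled = [kw for kw in rack_densities_kw if kw <= AIR_COOLING_LIMIT_KW]
liquid_cooled = [kw for kw in rack_densities_kw if kw > AIR_COOLING_LIMIT_KW]

print(f"Air-cooled racks:    {len(air_cooled)} ({sum(air_cooled)} kW total)")
print(f"Liquid-cooled racks: {len(liquid_cooled)} ({sum(liquid_cooled)} kW total)")
```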

Send this primer to someone who needs the fundamentals. We will be back next week with the regular intelligence operators come here for.