
Cooling & Thermal Management


Cooling is the constraint that determines how much compute a data center can physically host. Every watt delivered to a server must leave the building as heat. The engineering discipline of cooling and thermal management designs the path by which heat moves from silicon die through package, chassis, rack, facility plant, and ultimately to atmosphere or to a reuse stream.

The dominant driver reshaping this discipline is accelerator power density. Hyperscale racks drew 8 to 15 kilowatts through the 2010s. Current AI training racks exceed 120 kilowatts, and Rubin-class and post-Rubin reference designs anticipate 250 to 600 kilowatts per rack. Beyond roughly 30 kilowatts per rack, air cooling becomes uneconomic; at AI training densities, liquid cooling is required.
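The statement that every watt delivered must leave as heat can be made concrete with a first-order coolant-flow estimate for a liquid loop, using Q = ṁ · cp · ΔT. A minimal sketch; the coolant properties and the 10 K supply-to-return temperature rise are illustrative assumptions, not figures from this page:

```python
# First-order coolant flow estimate for a liquid-cooled rack.
# Assumptions (illustrative): water coolant, cp = 4186 J/(kg*K),
# density ~997 kg/m^3, 10 K supply-to-return temperature rise.

CP_WATER = 4186.0   # specific heat, J/(kg*K)
RHO_WATER = 997.0   # density, kg/m^3

def coolant_flow_lpm(rack_kw: float, delta_t_k: float = 10.0) -> float:
    """Volumetric flow (liters/minute) needed to carry rack_kw of heat
    out of the rack at the given coolant temperature rise."""
    q_watts = rack_kw * 1000.0
    mass_flow_kg_s = q_watts / (CP_WATER * delta_t_k)
    return mass_flow_kg_s / RHO_WATER * 1000.0 * 60.0  # m^3/s -> L/min

for kw in (30, 120, 600):
    print(f"{kw:>4} kW rack -> {coolant_flow_lpm(kw):6.1f} L/min at dT = 10 K")
```

The scaling is linear in power and inverse in ΔT, which is why dense racks push designs toward larger allowable temperature rises as well as higher flow.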


Cooling modalities by rack density

| Modality | Density Range | Transport Fluid | Primary Use |
|---|---|---|---|
| HVAC and Air Handling | Up to ~30 kW per rack | Air | Enterprise, general-purpose colocation, ancillary loads |
| Liquid Cooling | 30 to 100+ kW per rack | Water or water-glycol, closed loop | AI training, HPC, dense hyperscale |
| Direct-to-Chip Cooling | 50 to 250+ kW per rack | Water through cold plates on CPU and accelerator packages | Current-generation AI accelerator racks |
| Immersion Cooling | Effectively unbounded at rack scale | Dielectric fluid, single-phase or two-phase | Crypto at scale, select HPC, AI pilots |
| Cooling Tower and Heat Rejection | Facility-level, all densities | Water to air, air to air, or hybrid | Terminal heat sink at the mechanical plant |
| Cooling Water Systems | Facility-level, all densities | Purified and treated facility water | Water chemistry, makeup, and blowdown for all liquid loops |
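The density bands in the table above can be read as a simple selection rule. A sketch under the assumption that rack density is the dominant criterion; real modality selection also weighs facility water availability, floor loading, and retrofit constraints:

```python
# Map rack power density to the rack-scale cooling modalities from the
# table above. Bands overlap deliberately: a 60 kW rack may use general
# liquid cooling or direct-to-chip, and immersion works at any density.

def candidate_modalities(rack_kw: float) -> list[str]:
    candidates = []
    if rack_kw <= 30:
        candidates.append("HVAC and Air Handling")
    if rack_kw >= 30:
        candidates.append("Liquid Cooling")
    if rack_kw >= 50:
        candidates.append("Direct-to-Chip Cooling")
    candidates.append("Immersion Cooling")  # effectively unbounded
    return candidates

print(candidate_modalities(15))   # air-coolable density
print(candidate_modalities(120))  # AI training density: liquid required
```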

Cooling across the stack layers

The modality view organizes cooling by heat-transport mechanism. A second view organizes cooling by the layer of the physical stack at which it appears. Every layer from die to campus has its own cooling problem and its own equipment class, and the solutions compose into the end-to-end thermal chain.

| Stack Layer | Cooling Elements | Function |
|---|---|---|
| Chip | Heat spreaders, thermal interface materials, on-package vapor chambers | Extracts heat from die at hundreds to thousands of watts per package |
| Server | Heat sinks, cold plates, chassis fans, immersion-compatible boards | Moves heat from package into the chassis transport fluid |
| Rack | In-rack manifolds, rear-door heat exchangers, immersion tanks, rack CDUs | Aggregates chassis heat and transfers to facility loop |
| Cluster | Sidecar and in-row CDUs, manifold distribution units, secondary loops | Balances flow across racks; isolates technology from facility water |
| Facility | CRACs and CRAHs, chillers, pumps, chilled-water and condenser loops | Delivers cooling to every hall; returns heat to the plant |
| Campus | District plants, thermal storage, heat-reuse interconnects, reclaimed water | Centralizes rejection; enables campus energy and water strategy |
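The layer solutions compose into one thermal chain: heat aggregates upward, and each transfer stage adds its own mechanical work (CDU pumps, facility pumps, chiller compressors, fans) that must also be rejected. A toy roll-up from rack to campus plant; the rack count, rack power, and overhead fractions are illustrative assumptions:

```python
# Roll rack heat up the stack to the campus plant. Overhead fractions
# (pump, fan, and compressor work added at each stage) are assumptions
# chosen for illustration, not measured values.

RACKS_PER_CLUSTER = 8
RACK_KW = 120.0
CLUSTER_CDU_OVERHEAD = 0.02     # sidecar/in-row CDU pump work
FACILITY_PLANT_OVERHEAD = 0.15  # chillers, facility pumps, CRAH fans

cluster_kw = RACKS_PER_CLUSTER * RACK_KW
secondary_loop_kw = cluster_kw * (1 + CLUSTER_CDU_OVERHEAD)
plant_rejection_kw = secondary_loop_kw * (1 + FACILITY_PLANT_OVERHEAD)

print(f"IT load per cluster:      {cluster_kw:7.1f} kW")
print(f"Secondary loop heat:      {secondary_loop_kw:7.1f} kW")
print(f"Rejected at campus plant: {plant_rejection_kw:7.1f} kW")
```

The point of the roll-up is conservation: the campus plant must reject every watt of IT load plus every watt of mechanical work the cooling chain itself adds.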

Where cooling sits in the DataCentersX stack

Cooling and thermal management is a STACK discipline covering the engineering and architecture of how heat moves out of the facility. The operational side (real-time monitoring of supply and return temperatures, leak detection, CDU fleet health) is covered under Facility Ops at Cooling Monitoring. The energy-side view (thermal load as an energy strategy, waste heat reuse, district heating integration) is covered under Energy at Thermal Energy and Waste Heat.


Related coverage

Stack | Rack Layer | Facility Layer | Campus Layer | Power Distribution | Cooling Monitoring | Thermal Energy and Waste Heat | Resource Usage (PUE / WUE) | AI Factory