Cooling & Thermal Management
Cooling is the constraint that determines how much compute a data center can physically host. Every watt delivered to a server must leave the building as heat. The engineering discipline of cooling and thermal management designs the path by which heat moves from silicon die through package, chassis, rack, facility plant, and ultimately to atmosphere or to a reuse stream.
The dominant driver reshaping this discipline is accelerator power density. Hyperscale racks drew 8 to 15 kilowatts through the 2010s; current AI training racks exceed 120 kilowatts, and Rubin-class and post-Rubin reference designs anticipate 250 to 600 kilowatts per rack. Beyond roughly 30 kilowatts per rack, air cooling becomes uneconomic; at AI training densities, liquid cooling is required.
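To see why the roughly 30 kilowatt threshold exists, compare the coolant flow the heat balance Q = m_dot * cp * delta_T demands from air versus water. The following is a minimal back-of-envelope sketch, not from this article: the fluid properties and temperature rises are assumed typical values, and the function names are illustrative.

```python
# Back-of-envelope comparison of air vs. water flow needed to remove a
# rack's heat load, from Q = m_dot * cp * delta_T.
# Property values are assumed typical, not taken from the article.

AIR_CP = 1005.0        # J/(kg*K), specific heat of air
AIR_DENSITY = 1.2      # kg/m^3 at typical data-hall conditions
WATER_CP = 4186.0      # J/(kg*K), specific heat of water
WATER_DENSITY = 998.0  # kg/m^3

def air_flow_m3s(load_w: float, delta_t_k: float = 12.0) -> float:
    """Volumetric airflow (m^3/s) to remove load_w at the given temperature rise."""
    return load_w / (AIR_CP * delta_t_k * AIR_DENSITY)

def water_flow_lps(load_w: float, delta_t_k: float = 10.0) -> float:
    """Water flow (liters/s) to remove the same load."""
    return load_w / (WATER_CP * delta_t_k * WATER_DENSITY) * 1000.0

for kw in (15, 30, 120):
    load = kw * 1000.0
    print(f"{kw:>4} kW rack: {air_flow_m3s(load):6.2f} m^3/s of air "
          f"vs {water_flow_lps(load):5.2f} L/s of water")
```

At 30 kW a rack needs on the order of 2 cubic meters of air per second; at 120 kW it needs over 8, which is impractical to move through a rack, while the equivalent water flow is under 3 liters per second. Water's far higher volumetric heat capacity is the whole argument for liquid at AI densities.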
Cooling modalities by rack density
| Modality | Density Range | Transport Fluid | Primary Use |
|---|---|---|---|
| HVAC and Air Handling | Up to ~30 kW per rack | Air | Enterprise, general-purpose colocation, ancillary loads |
| Liquid Cooling | 30 to 100+ kW per rack | Water or water-glycol, closed loop | AI training, HPC, dense hyperscale |
| Direct-to-Chip Cooling | 50 to 250+ kW per rack | Water through cold plates on CPU and accelerator packages | Current-generation AI accelerator racks |
| Immersion Cooling | Effectively unbounded at rack scale | Dielectric fluid, single-phase or two-phase | Crypto at scale, select HPC, AI pilots |
| Cooling Tower and Heat Rejection | Facility-level, all densities | Water to air, air to air, or hybrid | Terminal heat sink at the mechanical plant |
| Cooling Water Systems | Facility-level, all densities | Purified and treated facility water | Water chemistry, makeup, and blowdown for all liquid loops |
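The density bands in the table above can be read as a rough selection rule. A hedged sketch of that rule follows; the thresholds come from the table (whose ranges deliberately overlap, so the cutoffs chosen here are one simplification among several), and the function name is illustrative, not part of any real API.

```python
# Illustrative modality selector based on the density table above.
# The table's ranges overlap (liquid 30-100+, direct-to-chip 50-250+),
# so these cutoffs are one simplified reading, not a standard.

def cooling_modality(rack_kw: float) -> str:
    """Suggest an in-rack cooling modality for a given rack power density."""
    if rack_kw <= 30:
        return "HVAC and air handling"
    if rack_kw <= 100:
        return "Liquid cooling (water or water-glycol closed loop)"
    if rack_kw <= 250:
        return "Direct-to-chip cold plates"
    return "Immersion cooling (single- or two-phase dielectric)"

print(cooling_modality(15))   # enterprise-density rack
print(cooling_modality(150))  # current AI accelerator rack
```

Note that the last two rows of the table (cooling towers and cooling water systems) are facility-level and apply at every density, so they fall outside a per-rack rule like this.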
Cooling across the stack layers
The modality view organizes cooling by heat-transport mechanism. A second view organizes cooling by the layer of the physical stack at which it appears. Every layer from die to campus has its own cooling problem and its own equipment class, and the solutions compose into the end-to-end thermal chain.
| Stack Layer | Cooling Elements | Function |
|---|---|---|
| Chip | Heat spreaders, thermal interface materials, on-package vapor chambers | Extracts heat from die at hundreds to thousands of watts per package |
| Server | Heat sinks, cold plates, chassis fans, immersion-compatible boards | Moves heat from package into the chassis transport fluid |
| Rack | In-rack manifolds, rear-door heat exchangers, immersion tanks, rack CDUs | Aggregates chassis heat and transfers to facility loop |
| Cluster | Sidecar and in-row CDUs, manifold distribution units, secondary loops | Balances flow across racks; isolates technology from facility water |
| Facility | CRACs and CRAHs, chillers, pumps, chilled-water and condenser loops | Delivers cooling to every hall; returns heat to the plant |
| Campus | District plants, thermal storage, heat-reuse interconnects, reclaimed water | Centralizes rejection; enables campus energy and water strategy |
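One way to see how the layers above compose into an end-to-end thermal chain is as thermal resistances in series: the die temperature is the facility coolant temperature plus the sum of the temperature rises across each layer. The sketch below uses layer names from the table, but the resistance values are invented for illustration and real values vary widely by hardware.

```python
# Hedged sketch: the die-to-facility thermal chain as series resistances.
# Layer names follow the stack table; the K/W values are illustrative
# placeholders, not measured figures.

CHAIN = [
    ("thermal interface + heat spreader", 0.020),  # chip layer
    ("cold plate",                        0.015),  # server layer
    ("rack manifold / CDU approach",      0.010),  # rack + cluster layers
]

def junction_temp(package_watts: float, facility_supply_c: float) -> float:
    """Die temperature after stacking each layer's delta-T onto supply temp."""
    return facility_supply_c + sum(r * package_watts for _, r in CHAIN)

# A 1 kW accelerator package on 30 C facility water: about 75 C at the die.
print(junction_temp(1000.0, 30.0))
```

The composition is the design point: each layer only has to deliver a bounded temperature rise at its interface, and the facility plant only has to guarantee the supply temperature, so the layers can be engineered and procured independently.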
Where cooling sits in the DataCentersX stack
Cooling and thermal management is a Stack discipline covering the engineering and architecture of how heat moves out of the facility. The operational side (real-time monitoring of supply and return temperatures, leak detection, CDU fleet health) is covered under Facility Ops at Cooling Monitoring. The energy-side view (thermal load as an energy strategy, waste heat reuse, district heating integration) is covered under Energy at Thermal Energy and Waste Heat.
Related coverage
Stack | Rack Layer | Facility Layer | Campus Layer | Power Distribution | Cooling Monitoring | Thermal Energy and Waste Heat | Resource Usage (PUE / WUE) | AI Factory