Background
Next-generation AI servers concentrate very high power densities into increasingly compact footprints. As power per rack rises, air cooling alone becomes insufficient or impractical, and direct-to-chip and other liquid-based cooling architectures offer the most consistent path to sustaining performance, reliability, and scalability.
In parallel, AI deployments are increasingly designed around a “single unit of compute” concept that covers not just IT but the full infrastructure stack, often including a hybrid, modular building approach that is manufactured in the factory and assembled on site. This introduces a new requirement: managing the data center as an integrated system, not as separate silos (IT, thermal, building, security).
Challenge
Liquid cooling changes the operational profile of the data center:
- Higher thermal criticality: small anomalies propagate faster when power density is concentrated.
- More complex fluid architectures: secondary and primary networks, multiple heat exchange stages, and potential alternative cooling sources.
- New risk domain: liquid leakage becomes a first-class operational and safety concern, requiring early detection, rapid localization, and automated isolation.
- Need for orchestration: IT monitoring alone is not enough; thermal, power, and safety systems must operate under a unified logic.
Traditional approaches, where each subsystem operates in isolation and responses are manual, do not match AI operational expectations.
Vertiv™ Solution
End-to-end cooling chain, designed as a system
The infrastructure comprises a complete thermal chain:
Secondary fluid network
- Distributes cooling liquid to the data hall
- Delivers liquid directly to rack manifolds
- Feeds the rack-level heat exchange process, which takes place inside the servers themselves
CDUs (Coolant Distribution Units)
- Manage and stabilize secondary loop conditions
- Transfer heat to the primary network
- Enable reliable, safe and controlled operation at the rack / row level
Primary cooling network options
Heat rejected from the CDUs can be routed to:
- Free Cooling Chillers (typical reference architecture)
- Alternative sources when available (e.g., water-to-water heat exchangers using lake or ocean water as the thermal sink)
This makes the cooling design flexible and site-adaptive, while maintaining standardized control logic.
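As an illustration only, the idea of a site-adaptive primary network behind standardized control logic can be sketched in Python. Everything here is hypothetical and not taken from any Vertiv product: the `HeatSink` interface, the two sink classes, the `route_heat` function, and the priority-order allocation rule are assumptions chosen to show that the routing logic does not need to change when the available sinks do.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class HeatSink(Protocol):
    """Hypothetical interface for anything the primary network rejects heat to."""
    name: str

    def capacity_kw(self) -> float: ...


@dataclass
class FreeCoolingChiller:
    name: str
    rated_kw: float

    def capacity_kw(self) -> float:
        return self.rated_kw


@dataclass
class WaterToWaterExchanger:
    name: str
    rated_kw: float
    source_available: bool  # assumption: e.g. lake or ocean intake is in service

    def capacity_kw(self) -> float:
        # An unavailable source contributes no capacity.
        return self.rated_kw if self.source_available else 0.0


def route_heat(load_kw: float, sinks: Sequence[HeatSink]) -> dict[str, float]:
    """Allocate the CDU heat load across sinks in the given priority order."""
    if load_kw < 0:
        raise ValueError("load_kw must be non-negative")
    remaining = load_kw
    plan: dict[str, float] = {}
    for sink in sinks:
        share = min(remaining, sink.capacity_kw())
        if share > 0:
            plan[sink.name] = share
            remaining -= share
    if remaining > 1e-9:
        raise RuntimeError("heat load exceeds available sink capacity")
    return plan
```

With a lake-water exchanger listed ahead of a chiller, the same function serves both a site where the lake source is in service and one where it is not; only the list of sinks differs per site.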
Integrated monitoring & operations
Operations run through Vertiv™ Unify together with Vertiv™ Avocent® ACS consoles, which provide:
- Unified visibility over key operational IT data and infrastructure data across the entire cooling chain
- Centralized monitoring aligned with data center governance and operational workflows
- System-level orchestration that breaks down traditional silos between IT, thermal, power, and safety domains
- The ability to integrate additional sensors and automation logic, especially relevant in liquid-cooled environments
Vertiv™ Unify transforms fragmented subsystem monitoring into true infrastructure orchestration, which is essential for operating AI factories as integrated systems.
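To make the difference between siloed monitoring and unified logic concrete, the following Python sketch normalizes readings from several domains into one record type and evaluates a rule that spans them. It is a minimal illustration under stated assumptions and does not represent the Vertiv™ Unify data model or API: the `Reading` type, the metric names (`power_kw`, `supply_temp_c`, `leak`), and the `correlate` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    domain: str   # assumed labels: "it", "thermal", "power", "safety"
    source: str   # e.g. "rack-07", "cdu-2" (hypothetical identifiers)
    metric: str
    value: float


def latest_by_key(readings):
    """Index values by (source, metric); later list entries overwrite earlier ones."""
    return {(r.source, r.metric): r.value for r in readings}


def correlate(readings, rack, cdu, power_limit_kw, supply_limit_c):
    """Return alerts that only a cross-domain view can raise."""
    state = latest_by_key(readings)
    alerts = []

    # Safety domain: any non-zero leak signal on the rack is reported first.
    if state.get((rack, "leak"), 0.0):
        alerts.append(f"{rack}: leak signal")

    # IT + thermal domains: high rack load combined with warm supply coolant.
    power = state.get((rack, "power_kw"))
    supply = state.get((cdu, "supply_temp_c"))
    if power is not None and supply is not None:
        if power > power_limit_kw and supply > supply_limit_c:
            alerts.append(f"{rack} / {cdu}: high load with elevated supply temperature")
    return alerts
```

Neither condition in the second rule would necessarily trigger an alarm inside its own silo; the alert exists only because IT and thermal data are evaluated together.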
Leak Detection and Automated Isolation
Liquid cooling introduces the need to detect and mitigate leaks before they become downtime events.
The Vertiv™ Access Control System, equipped with sensor reading capability (leak detection, localized sensors) and valve control, provides a dual function:
- Detection and confirmation of leakage signals
- Automated response, including:
  - isolating the impacted section of the secondary network
  - blocking a specific loop, a single affected rack, or a recirculation element
  - segregating the event to prevent propagation and reduce risk
This capability turns leak management from a manual emergency response into an engineered and automated safety mechanism, fully integrated into the AI factory operating model.
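The detect, confirm, and isolate sequence described above can be sketched as a small state machine. This is a hedged illustration, not the actual control logic of any Vertiv system: the `LeakIsolator` class, the sensor-to-valve mapping, and the rule of confirming a leak after a fixed number of consecutive wet samples are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LeakIsolator:
    # Hypothetical topology: sensor id -> valves that isolate the smallest
    # domain around that sensor (a rack, a loop, or a network section).
    valves_for_sensor: dict[str, list[str]]
    confirm_samples: int = 3  # assumption: consecutive wet samples needed
    _wet_counts: dict[str, int] = field(default_factory=dict)
    closed_valves: set[str] = field(default_factory=set)

    def on_sample(self, sensor_id: str, wet: bool) -> list[str]:
        """Process one sensor sample and return the valves newly closed."""
        if not wet:
            # A dry sample resets confirmation, filtering out transient signals.
            self._wet_counts[sensor_id] = 0
            return []

        count = self._wet_counts.get(sensor_id, 0) + 1
        self._wet_counts[sensor_id] = count
        if count < self.confirm_samples:
            return []  # detected but not yet confirmed

        # Confirmed: close only the valves mapped to this sensor's domain.
        to_close = [
            v for v in self.valves_for_sensor.get(sensor_id, [])
            if v not in self.closed_valves
        ]
        self.closed_valves.update(to_close)
        return to_close
```

Mapping each sensor to the valves of its smallest enclosing domain is what keeps the response local: a confirmed leak at one rack closes that rack's supply and return valves and leaves the rest of the secondary network in service.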
Results
This integrated approach delivers tangible benefits:
- Operational continuity: early detection and automated fault containment minimize broad outages
- Faster incident containment: events localize to the smallest possible domain
- Higher system resilience: infrastructure behaves as one coherent system rather than independent components
- Scalable repeatability: modular “single unit of compute” deployments are easier to replicate and to operate consistently
- Improved governance: unified monitoring enables stronger control, auditing, and performance optimization over time
In liquid-cooled AI data centers, competitive advantage now extends beyond deploying high-density compute.
It lies in operating the entire AI factory as an integrated, monitored, and automated system, from secondary fluid network to primary heat rejection, from CDUs to ACS consoles, with leak detection and isolation natively engineered into the control layer.
A system-based reference architecture enables consultants to deliver predictable performance, accelerated deployment and long-term resilience in an AI-driven world.