Some data center operators remain skeptical that liquid cooling will make significant inroads into a data center ecosystem dominated by air-cooling systems. It’s hard to fault them for that. They’ve seen multiple projections over the years for how rack densities would climb to levels unsupportable by air cooling, and most of those projections were premature or never realized.
Rather than packing racks with high-density servers to meet the relentless demand for compute capacity, as predicted, operators chose to spread out their loads, optimize their air-cooling systems, and increase their reliance on cloud computing once that resource became available.
The market simply has not had an appetite for making the changes required to bring liquid to the rack. So it’s fair to ask: what’s different now?
The Link Between Digital Transformation and High-Density Computing
The short answer is the use of high-performance computing (HPC) principles to support applications that a large percentage of businesses are adopting or planning to adopt, most notably artificial intelligence (AI). AI has become integral to digital transformation initiatives for organizations across a range of industries. There’s a good chance your organization is making plans today to leverage the power of AI.
Like current HPC deployments, these business-changing applications require the ability to process massive amounts of data with extremely low latency. But unlike HPC, that data isn’t just in the form of text and numbers. AI applications often process data from heterogeneous sources in multiple forms, including large image, audio, and video files. Entering 2020, 73% of AI applications were working with image, video, audio, or sensor data.
Chip manufacturers have responded to the growing demand for multi-format AI applications with more powerful chips. Thermal power densities for leading central processing units (CPUs) and graphics processing units (GPUs) rose sharply in the last two years, after relatively modest growth in the previous five years. For example, Intel’s upcoming Ponte Vecchio GPUs, as shared by Intel CEO Pat Gelsinger, have thermal power densities that will require liquid cooling.
With more of these high-powered CPUs and GPUs being packed into 1U servers, and equipment racks being filled with those 1U servers, we are seeing a growing number of applications with rack densities of 30 kW or higher, and we are still very early in the adoption of AI.
Liquid Cooling: The Best — and Only — Alternative
Now it’s the data center industry’s turn to respond. To do that, we have to accept the physical limitations of air cooling: it does not have the thermal transfer capacity required to cool high-density racks, no matter how it is optimized. The best-case scenario is that energy costs rise sharply while CPUs and GPUs throttle back their clock speeds to prevent overheating, compromising application performance. The worst case is equipment failure.
In contrast, the liquid cooling technologies available today have the capacity to efficiently and effectively cool racks of 50 kW and higher. It’s been a long time coming, but liquid cooling is finally positioned to penetrate the larger ecosystem the way some predicted it would 15 years ago.
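To put rough numbers on the gap in thermal transfer capacity between air and liquid, the sketch below applies the standard sensible-heat relation (heat load = density × volumetric flow × specific heat × temperature rise) to the 30 kW and 50 kW rack densities mentioned above. The fluid properties are approximate room-temperature values, and the 10 K coolant temperature rise is an assumption chosen only for illustration; actual designs use different temperature rises and often different coolants than plain water.

```python
# Back-of-the-envelope comparison of the coolant flow needed to carry away
# a rack's heat load using air versus water.
# All property values are approximate and the 10 K temperature rise is an
# illustrative assumption, not a design recommendation.

AIR_DENSITY = 1.2             # kg/m^3, approximate at room temperature
AIR_SPECIFIC_HEAT = 1005.0    # J/(kg*K), approximate
WATER_DENSITY = 997.0         # kg/m^3, approximate at room temperature
WATER_SPECIFIC_HEAT = 4180.0  # J/(kg*K), approximate

M3_PER_S_TO_CFM = 2118.88         # cubic feet per minute per (m^3/s)
M3_PER_S_TO_L_PER_MIN = 60_000.0  # liters per minute per (m^3/s)


def volumetric_flow_m3_per_s(heat_load_w, density, specific_heat, delta_t_k):
    """Flow needed to absorb heat_load_w with a coolant temperature rise of delta_t_k.

    Rearranges Q = rho * V * cp * dT into V = Q / (rho * cp * dT).
    """
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_load_w / (density * specific_heat * delta_t_k)


ASSUMED_DELTA_T_K = 10.0  # hypothetical coolant temperature rise

for rack_kw in (30, 50):
    load_w = rack_kw * 1000.0
    air = volumetric_flow_m3_per_s(
        load_w, AIR_DENSITY, AIR_SPECIFIC_HEAT, ASSUMED_DELTA_T_K)
    water = volumetric_flow_m3_per_s(
        load_w, WATER_DENSITY, WATER_SPECIFIC_HEAT, ASSUMED_DELTA_T_K)
    print(f"{rack_kw} kW rack: "
          f"air ~{air * M3_PER_S_TO_CFM:,.0f} CFM, "
          f"water ~{water * M3_PER_S_TO_L_PER_MIN:,.1f} L/min")
```

Under these assumptions, water carries the same heat with a volumetric flow on the order of a few thousand times smaller than air, because its volumetric heat capacity (density times specific heat) is that much higher. The sketch ignores fan and pump power, heat exchanger effectiveness, and how much of the rack's heat a given liquid cooling technology actually captures.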
Introducing Liquid into Air-Cooled Data Centers
While there are some dedicated liquid cooling facilities being developed, most liquid cooling deployments happening today involve transforming existing air-cooled facilities into hybrid air- and liquid-cooled facilities.
This is by no means a simple transformation. Available liquid cooling technologies, the liquid-to-heat ratio, plumbing runs, the liquid distribution loop, and final heat rejection all require careful planning and engineering.
But the expertise, technologies, and best practices are available today to support a successful and minimally disruptive deployment. And the benefits — beyond supporting the applications businesses will become increasingly dependent on — include packing more capacity in the same footprint, dramatic improvements in efficiency, and lower total cost of ownership.