In our recent blog post on introducing liquid cooling into an air-cooled data center, I shared the results from the first major analysis of the impact on energy efficiency and power usage effectiveness. That analysis, conducted by a team of specialists from Vertiv and NVIDIA, documented an 18.1% reduction in facility power and a 10.2% reduction in total data center power by transitioning from 100% air cooling to 75% direct-to-chip liquid cooling in concert with other optimizations to supply air, chilled water, and secondary inlet temperatures.
How PUE Falls Short in Evaluating Liquid Cooling Efficiency
While total data center power was reduced by more than 10%, the PUE for the facility we analyzed dropped by only 3.3% between our baseline and full optimization studies. The reason for this becomes obvious when you review the data collected during the four studies included in our analysis.
In the first study, in which the data center was 100% air cooled, total data center power was 2,562.8 kilowatts (kW) and IT power was 1,860.4 kW, resulting in a PUE of 2,562.8/1,860.4, or 1.38.
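For readers who want to reproduce the arithmetic, here is a minimal Python sketch of the PUE calculation using the Study 1 figures quoted above. The function name `pue` and its parameter names are illustrative choices, not part of the original analysis.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power usage effectiveness: total data center power divided by IT power."""
    if it_kw <= 0:
        raise ValueError("IT power must be positive")
    return total_facility_kw / it_kw


# Study 1 figures from the text (100% air cooled), in kW.
study1_pue = pue(2562.8, 1860.4)
print(f"Study 1 PUE: {study1_pue:.2f}")  # prints "Study 1 PUE: 1.38"
```

Note that the IT power passed in here still includes server fan power, which is the convention PUE follows.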
With the first introduction of liquid cooling (61.4% of the load), data center power was reduced to 2,403.1 kW and IT power to 1,791.1 kW, producing a PUE of 1.35. In Study 3, the percent of liquid cooling was increased to 68.6%, and total power and IT power fell proportionally. As a result, even though data center power was cut by 1.8%, the PUE remained flat at 1.35.
Our methodology provides a little more insight into what was happening and points to a better metric that can be employed for evaluating the efficiency of liquid cooling.
Server fan power consumption decreased by 41% between Study 1 and Study 2, and by 80% between Study 1 and Study 4. In PUE calculations, server fan power is included in IT power, so IT power was reduced 7% between Studies 1 and 4. Although fans are physically integrated with the server, fan power is functionally more infrastructure than IT. Including it in the IT power used to calculate PUE distorts the value of liquid cooling and can influence design in ways that prevent true optimization. That led us to look for a metric that would better reflect the changes we were seeing and be more valuable to data center designers.
Total Usage Effectiveness (TUE): A Better Metric for Evaluating Liquid Cooling Efficiency
Fortunately, this metric had already been defined in a 2013 paper presented at the International Supercomputing Conference, "TUE: A New Energy-Efficiency Metric Applied at ORNL's Jaguar."
TUE is a PUE-like metric that addresses the flaws in PUE when it is used to compare different approaches to cooling. The TUE metric is calculated by replacing the total IT power used in the PUE calculation with the total energy being consumed to directly support IT functions such as central processing units (CPUs), graphics processing units (GPUs), memory, and storage. This separates the power consumption of fans and other auxiliaries that don’t directly support IT processes from the IT power required for storage and compute to provide a truer picture of IT energy usage effectiveness and thus total energy usage effectiveness.
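Written out as formulas, the description above amounts to the following. The symbols are shorthand introduced here for illustration and do not come from the original analysis: P_total is total data center power, P_IT is IT power as PUE counts it (including server fans), P_compute is the power that directly supports IT functions, and P_fans+aux is the server fan and auxiliary power.

```latex
\mathrm{PUE} = \frac{P_{\text{total}}}{P_{\text{IT}}},
\qquad
\mathrm{TUE} = \frac{P_{\text{total}}}{P_{\text{compute}}},
\qquad
P_{\text{IT}} = P_{\text{compute}} + P_{\text{fans+aux}}
```

Because P_compute can never exceed P_IT, TUE is always at least as large as PUE for the same facility. A reduction in fan power lowers the numerator of both ratios but lowers only the PUE denominator, which is why PUE can stay flat while TUE improves.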
For example, the PUE in our third study didn’t change from the PUE in Study 2. A designer evaluating that data might conclude there is no benefit to be gained by increasing liquid cooling from 61.4% to 68.6% as we did between Study 2 and 3.
But if we look at the TUE for these two studies, the benefits become obvious. The TUE for Study 2 can be calculated as 2,403.1 kW (data center power) divided by 1,692.5 kW (the power used to support IT functions), for a TUE of 1.42.
In Study 3, the numerator in the TUE equation is reduced from 2,403.1 to 2,360.1 while the denominator remains constant, and the TUE drops to 1.39, a more accurate reflection of the improvements achieved through the changes implemented in Study 3.
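The comparison between Studies 2 and 3 can be checked with a short Python sketch. The figures are the ones quoted in the text. The function name `tue`, the `STUDIES` dictionary, and its keys are hypothetical names chosen for this illustration.

```python
def tue(total_facility_kw: float, compute_kw: float) -> float:
    """Total usage effectiveness: total data center power divided by the
    power that directly supports IT functions (fans and auxiliaries excluded)."""
    if compute_kw <= 0:
        raise ValueError("compute power must be positive")
    return total_facility_kw / compute_kw


# Figures from the text, in kW. The denominator is the same in both studies.
STUDIES = {
    "Study 2": {"total_kw": 2403.1, "compute_kw": 1692.5},
    "Study 3": {"total_kw": 2360.1, "compute_kw": 1692.5},
}

for name, figures in STUDIES.items():
    value = tue(figures["total_kw"], figures["compute_kw"])
    print(f"{name} TUE: {value:.2f}")
# prints "Study 2 TUE: 1.42" then "Study 3 TUE: 1.39"

# Reduction in total data center power between Study 2 and Study 3.
total_power_cut_pct = (
    (STUDIES["Study 2"]["total_kw"] - STUDIES["Study 3"]["total_kw"])
    / STUDIES["Study 2"]["total_kw"]
    * 100
)
print(f"Total power reduction: {total_power_cut_pct:.1f}%")  # prints "Total power reduction: 1.8%"
```

Because the denominator is held constant, the 1.8% cut in total data center power shows up directly in TUE instead of being masked by a matching drop in the denominator.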
PUE remains a valuable metric for evaluating the effect of some infrastructure changes on data center efficiency and for comparing the infrastructure efficiency of a facility to other facilities of the same size and operating in a similar climate. But it shouldn’t be used to compare the efficiency of liquid and air-cooling systems or evaluate the efficiency of liquid cooling designs. For that, TUE will prove to be a more accurate and valuable metric.
You can learn more about the methodology, results, and takeaways from our analysis in our blog post about introducing liquid cooling into an air-cooled data center.