A data-assisted first-principle approach to modeling server outlet temperature in air free-cooled data centers

Bibliographic Details
Main Authors: Liu, Yingbo; Le, Duc Van; Tan, Rui
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/162447
Institution: Nanyang Technological University
Description
Summary: The server outlet temperature is an important thermal condition for the operation of an air free-cooled data center, which uses fans to continuously pass outside air through the server room to cool the computing devices. However, a server's standard management and monitoring tool cannot read the server's built-in outlet temperature sensors fast enough to keep up with the fast dynamics of the server outlet thermal condition caused by changing server workload. Moreover, many server models do not have built-in sensors that can measure the server outlet temperature. In this paper, we develop a data-assisted first-principle model that leverages available built-in sensors and the server's operating monitoring tools to achieve low-latency estimation of the server outlet temperature. Specifically, the developed model takes the inlet temperature, processor core temperature, server fan speed, and processor utilization, all measured by hardware or software sensors, as inputs to predict the outlet temperature with low latency. Our extensive evaluation, based on data traces collected from a real air free-cooled data center testbed, shows that the model can accurately predict the outlet temperature, with an average root mean squared error ranging from 1.21 °C to 1.46 °C under various cold supply air temperatures and processor utilization levels.
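
The summary does not give the model's equations, so the following is only a minimal sketch of what a data-assisted first-principle estimator could look like, not the authors' formulation. It assumes a steady-state energy balance in which the air temperature rise across the server equals the dissipated heat divided by the air mass flow times specific heat, that air flow is proportional to fan speed, and that dissipated heat is linear in processor utilization and in the core-to-inlet temperature difference. The lumped coefficients are then fitted to measured traces by least squares. All function names, the choice of features, and the linear forms are hypothetical.

```python
import numpy as np


def _features(t_in, t_core, util):
    # Assumed heat model: constant term, utilization term, and a
    # core-to-inlet temperature-difference term (all hypothetical).
    return np.column_stack([np.ones_like(util), util, t_core - t_in])


def fit_outlet_model(t_in, t_core, fan_rpm, util, t_out):
    """Fit theta in (t_out - t_in) * fan_rpm = features @ theta.

    This follows from the assumed energy balance
    t_out - t_in = heat / (k * fan_rpm * c_p), with the unknown
    constants k and c_p absorbed into theta.
    """
    t_in, t_core, fan_rpm, util, t_out = (
        np.asarray(a, dtype=float) for a in (t_in, t_core, fan_rpm, util, t_out)
    )
    X = _features(t_in, t_core, util)
    y = (t_out - t_in) * fan_rpm
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def predict_outlet(theta, t_in, t_core, fan_rpm, util):
    """Predict outlet temperature from the four measured inputs."""
    t_in, t_core, fan_rpm, util = (
        np.asarray(a, dtype=float) for a in (t_in, t_core, fan_rpm, util)
    )
    return t_in + (_features(t_in, t_core, util) @ theta) / fan_rpm


def rmse(pred, actual):
    """Root mean squared error, the accuracy metric named in the summary."""
    pred = np.asarray(pred, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((pred - actual) ** 2)))
```

In use, `fit_outlet_model` would be run once on traces where a reference outlet temperature is available, and `predict_outlet` would then be evaluated at the sampling rate of the faster inlet, core, fan, and utilization readings. The sketch ignores thermal inertia, so it would not capture transient behaviour after a workload change.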