Toward physics-guided safe deep reinforcement learning for green data center cooling control
Deep reinforcement learning (DRL) has shown good performance in tackling Markov decision process (MDP) problems. As DRL optimizes a long-term reward, it is a promising approach to improving the energy efficiency of data center cooling. However, enforcing thermal safety constraints during DRL's state exploration remains a key challenge. The widely adopted reward shaping approach adds a negative reward when an exploratory action results in an unsafe state; the agent therefore must experience many unsafe states before it learns to avoid them. In this paper, we propose a safety-aware DRL framework for single-hall data center cooling control. It applies offline imitation learning and online post-hoc rectification to holistically prevent thermal unsafety during online DRL. In particular, the post-hoc rectification searches for the minimum modification to the DRL-recommended action such that the rectified action will not result in unsafety. The rectification is designed based on a thermal state transition model that is fitted using historical safe operation traces and is able to extrapolate the transitions to unsafe states explored by DRL. Extensive evaluation for chilled-water and direct-expansion-cooled data centers under two climate conditions shows that our approach saves 22.7% to 26.6% of total data center power compared with conventional control and reduces safety violations by 94.5% to 99% compared with reward shaping.
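The abstract describes the online post-hoc rectification step as finding the minimum modification to the DRL-recommended cooling action such that a thermal state transition model, fitted on historical safe operation traces, predicts no safety violation. The sketch below is only a minimal illustration of that idea, not the authors' implementation: the linear transition model, the `rectify_action` function, the safety limit, and all numerical values are assumptions made for this example.

```python
# Minimal sketch of post-hoc action rectification (illustrative only).
# Assumption: a linear thermal transition model T_next = A @ s + B @ a + c
# fitted from historical safe operation traces; the paper's actual model
# and interfaces may differ.
import numpy as np
from scipy.optimize import minimize

def rectify_action(a_drl, s, A, B, c, t_safe):
    """Return the action closest to the DRL-recommended one whose predicted
    next temperatures stay at or below the safety limit t_safe."""

    def distance(a):
        # Minimize the squared modification to the recommended action.
        return np.sum((a - a_drl) ** 2)

    def safety_margin(a):
        # Inequality constraint: t_safe - T_next >= 0 at every sensor.
        t_next = A @ s + B @ a + c
        return t_safe - t_next

    result = minimize(
        distance,
        x0=a_drl,                              # start from the recommended action
        constraints=[{"type": "ineq", "fun": safety_margin}],
        bounds=[(0.0, 1.0)] * len(a_drl),      # normalized cooling setpoints (assumed)
    )
    return result.x if result.success else a_drl  # fall back if no feasible fix is found

# Toy usage: 2 cooling setpoints, 3 rack-inlet temperature sensors.
rng = np.random.default_rng(0)
s = rng.uniform(20.0, 26.0, size=3)        # current temperatures (deg C)
A = 0.8 * np.eye(3)                        # assumed fitted model parameters
B = rng.uniform(-5.0, -3.0, size=(3, 2))   # more cooling -> lower temperature
c = np.full(3, 8.0)
a_drl = np.array([0.1, 0.1])               # aggressive energy-saving action
print(rectify_action(a_drl, s, A, B, c, t_safe=27.0))
```

Casting the rectification as a small constrained least-squares problem keeps the corrected action as close as possible to the energy-saving action proposed by the DRL policy while the fitted model certifies thermal safety.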
Main Authors: | Wang, Ruihang; Zhang, Xinyi; Zhou, Xin; Wen, Yonggang; Tan, Rui |
---|---|
Other Authors: | School of Computer Science and Engineering; Data Management and Analytics Lab |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2022 |
Conference: | 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS) |
Subjects: | Engineering::Computer science and engineering::Computer applications::Physical sciences and engineering; Data Center; Safe Reinforcement Learning; Energy Efficiency; Thermal Safety |
Citation: | Wang, R., Zhang, X., Zhou, X., Wen, Y. & Tan, R. (2022). Toward physics-guided safe deep reinforcement learning for green data center cooling control. 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS), 159-169. https://dx.doi.org/10.1109/ICCPS54341.2022.00021 |
DOI: | 10.1109/ICCPS54341.2022.00021 |
Online Access: | https://hdl.handle.net/10356/157736 |
Version: | Submitted/Accepted version |
Institution: | Nanyang Technological University |
Funding: | National Research Foundation (NRF). This research is supported by the National Research Foundation, Prime Minister's Office, Singapore under its Energy Research Testbed and Industry Partnership Funding Initiative of the Energy Grid (EG) 2.0 programme, its Central Gap Fund (“Central Gap” Award No. NRF2020NRF-CG001-027) and its NTUitive Gap Fund administered by NTUitive Pte Ltd and the Ministry of Education. |
Rights: | © 2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/ICCPS54341.2022.00021. |