Toward efficient compute-intensive job allocation for green data centers : a deep reinforcement learning approach
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2019 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/104419 http://hdl.handle.net/10220/50011 |
Institution: | Nanyang Technological University |
Summary: | Reducing the energy consumption of the servers in a data center via proper job allocation is desirable. Existing advanced job allocation algorithms, based on constrained optimization formulations capturing servers’ complex power consumption and thermal dynamics, often scale poorly with the data center size and optimization horizon. This paper applies deep reinforcement learning to build an allocation algorithm for long-lasting and compute-intensive jobs that are increasingly seen among today’s computation demands. Specifically, a deep Q-network is trained to allocate jobs, aiming to maximize a cumulative reward over long horizons. The training is performed offline using a computational model based on long short-term memory networks that capture the servers’ power and thermal dynamics. This offline training approach avoids slow online convergence, low energy efficiency, and potential server overheating during the agent’s extensive state-action space exploration if it directly interacts with the physical data center in the usually adopted online learning scheme. At run time, the trained Q-network is forward-propagated with little computation to allocate jobs. Evaluation based on eight months’ physical state and job arrival records from a national supercomputing data center hosting 1,152 processors shows that our solution reduces computing power consumption by more than 10% and processor temperature by more than 4°C without sacrificing job processing throughput. |
---|
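
The offline training idea in the summary (learn an allocation policy against a model of the servers instead of the physical data center, then use the learned values cheaply at run time) can be illustrated with a much smaller stand-in. The sketch below is not the thesis's method: it swaps the deep Q-network for tabular Q-learning and the LSTM-based power and thermal model for a hand-written toy rule. The two-server setup, the discretized heat levels, the reward, and every name (`surrogate_step`, `train`, `allocate`, `MAX_HEAT`) are assumptions made only for illustration.

```python
import random

MAX_HEAT = 4    # hypothetical cap on a server's discretized heat level
N_SERVERS = 2   # hypothetical toy data center with two servers


def surrogate_step(state, action):
    """Toy stand-in for a learned power/thermal model (not an LSTM)."""
    heats = list(state)
    heats[action] = min(MAX_HEAT, heats[action] + 2)  # the job heats its server
    reward = -heats[action]  # assumed power proxy: hotter server costs more
    next_state = tuple(max(0, h - 1) for h in heats)  # all servers cool one level
    return next_state, reward


def train(episodes=5000, steps=20, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Offline epsilon-greedy Q-learning against the surrogate model only."""
    rng = random.Random(seed)
    states = [(a, b) for a in range(MAX_HEAT + 1) for b in range(MAX_HEAT + 1)]
    q = {s: [0.0] * N_SERVERS for s in states}
    for _ in range(episodes):
        state = rng.choice(states)  # random start so every state is visited
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(N_SERVERS)  # explore
            else:
                action = max(range(N_SERVERS), key=lambda a: q[state][a])
            next_state, reward = surrogate_step(state, action)
            # Standard Q-learning update toward the bootstrapped target.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


def allocate(q, state):
    """Run-time allocation: a table lookup followed by an argmax."""
    return max(range(N_SERVERS), key=lambda a: q[state][a])


q_table = train()
print(allocate(q_table, (3, 0)), allocate(q_table, (0, 3)))
```

All exploration, including repeatedly loading an already hot server, happens inside `surrogate_step`, so no physical machine is exposed to it. That is the motivation the summary gives for training offline. In this toy setting the learned policy sends the next job to the cooler server.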