Reinforcement learning for strategic airport slot scheduling: analysis of state observations and reward designs

Because the strategic airport slot scheduling problem is NP-hard, sub-optimal approaches such as heuristics and learning-based methods are worth exploring. Moreover, the continuous growth in air traffic demand calls for approaches that perform well in new scenarios. Heuristics rely on a fixed set of rules, which limits their ability to explore new solutions, whereas Reinforcement Learning offers a versatile framework for automating the search and generalizing to unseen scenarios. Designing a suitable state observation and reward structure is essential when applying Reinforcement Learning. In this paper, we investigate the impact of providing the Reinforcement Learning agent with an intermediate positive signal in the reward structure, in combination with either a Full State Observation or a Local State Observation. We train with different combinations of reward structure and state observation using the Deep Q-Network (DQN) algorithm to identify the most training-efficient formulation. We use two types of scenarios, medium-density and high-density, to test the approach's ability to generalize to unseen data. Each scenario type is used to train a separate model, Model 1 and Model 2. Model 1, trained on high-density scenarios, is tested on medium-density scenarios and its results are compared with those of Model 2, and vice versa. We additionally compare the performance of the DQN models with Proximal Policy Optimization (PPO) models. Results suggest that combining the Local State Observation with the intermediate positive signal leads to stable convergence. The resulting DQN models outperform the PPO models, achieving an average displacement per request of 1.44/1.99 with, on average, only 0.00/0.02 unaccommodated requests for medium/high-density scenarios. The t-statistics of 0.0810/-1.0016 and p-values of 0.9356/0.3190 also suggest that the DQN models can generalize to unseen scenarios.
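The abstract refers to a reward structure that adds an intermediate positive signal for accommodated requests alongside a displacement-based penalty, and to Full versus Local State Observations. As a rough, hypothetical illustration only (the paper's actual environment, observation window, reward constants, and class names are not given in this record), a gym-style environment that assigns one slot request per step might combine these ideas as in the following sketch.

```python
# Hypothetical sketch, NOT the authors' implementation: a minimal gym-style
# environment in which one slot request is assigned per step. It illustrates
# (a) a "Local" observation built from a small capacity window around the
# requested slot and (b) a per-step reward that combines an intermediate
# positive signal for an accommodated request with a displacement penalty.
# All names, window sizes, and constants are assumptions for illustration.
import numpy as np

class SlotSchedulingEnvSketch:
    def __init__(self, requested_slots, n_slots, slot_capacity):
        self.requested = list(requested_slots)   # requested slot index per request
        self.n_slots = n_slots
        self.init_capacity = slot_capacity
        self.reset()

    def reset(self):
        self.remaining = np.full(self.n_slots, self.init_capacity)  # capacity left per slot
        self.t = 0                                                   # index of current request
        return self._local_obs()

    def _local_obs(self):
        # Local State Observation (assumed form): requested slot plus remaining
        # capacity in a +/-3 slot window around it, zero-padded at the edges.
        req = self.requested[self.t] if self.t < len(self.requested) else 0
        window = np.zeros(7)
        lo, hi = max(0, req - 3), min(self.n_slots, req + 4)
        window[:hi - lo] = self.remaining[lo:hi]
        return np.concatenate(([req], window)).astype(np.float32)

    def step(self, action):
        # action: slot index chosen for the current request
        req = self.requested[self.t]
        if self.remaining[action] > 0:
            self.remaining[action] -= 1
            displacement = abs(action - req)
            # Intermediate positive signal for accommodating the request,
            # reduced by a penalty proportional to the displacement.
            reward = 1.0 - 0.1 * displacement
        else:
            reward = -1.0   # chosen slot is full: request not accommodated here
        self.t += 1
        done = self.t >= len(self.requested)
        obs = np.zeros(8, dtype=np.float32) if done else self._local_obs()
        return obs, reward, done, {}
```

Under a structure of this kind, the per-request positive signal gives the agent dense feedback within an episode, which is the property the abstract associates with stable convergence; the exact formulation used in the paper may differ.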

Bibliographic Details
Main Authors: Nguyen-Duy, Anh; Pham, Duc-Thinh; Lye, Jian-Yi; Ta, Duong
Other Authors: School of Mechanical and Aerospace Engineering; Air Traffic Management Research Institute
Format: Conference or Workshop Item
Language: English
Published: 2024 (deposited in DR-NTU: 2025-01-22)
Conference: 2024 IEEE Conference on Artificial Intelligence (CAI), pp. 1195-1201
DOI: 10.1109/CAI59869.2024.00213
ISBN: 979-8-3503-5409-6
Subjects: Computer and Information Science; Other; Reinforcement learning; Airport slot scheduling; Strategic
Online Access: https://hdl.handle.net/10356/182321
Institution: Nanyang Technological University
Collection: DR-NTU (NTU Library)
Citation: Nguyen-Duy, A., Pham, D., Lye, J. & Ta, D. (2024). Reinforcement learning for strategic airport slot scheduling: analysis of state observations and reward designs. 2024 IEEE Conference on Artificial Intelligence (CAI), 1195-1201. https://dx.doi.org/10.1109/CAI59869.2024.00213
Version: Submitted/Accepted version
Rights: © 2024 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/CAI59869.2024.00213.