REINFORCEMENT LEARNING-BASED TRAFFIC LIGHT DURATION CONTROL SYSTEM DEVELOPMENT
fuel consumption, exhaust emissions, as well as the mental and physical health of the public. Fixed-duration traffic light control systems are unable to adapt to dynamic traffic conditions. Therefore, an adaptive traffic light duration control system is needed, with the objective of reducing cong...
Main Author: | Pandu Irsyadi, Naufal |
---|---|
Format: | Theses |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/87727 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:87727 |
---|---|
spelling |
id-itb.:87727 2025-02-03T07:52:42Z REINFORCEMENT LEARNING-BASED TRAFFIC LIGHT DURATION CONTROL SYSTEM DEVELOPMENT Pandu Irsyadi, Naufal Indonesia Theses Traffic Congestion, Traffic Light Control, Reinforcement Learning, Q-learning, Policy Gradient, Google Maps. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/87727 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
fuel consumption, exhaust emissions, as well as the mental and physical health of
the public. Fixed-duration traffic light control systems cannot adapt to dynamic
traffic conditions, so an adaptive traffic light duration control system is needed
to reduce congestion.
This research develops a traffic light duration control system based on
Reinforcement Learning (RL) that adjusts traffic light durations in real time. The
research methodology follows the CRISP-DM (Cross-Industry Standard Process for
Data Mining) framework, which consists of six phases: Business Understanding,
Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment.
Traffic data is obtained through the Google Maps API and then synthesized into
vehicle route data using the SUMO simulator to construct a simulated traffic
environment. RL models are trained with eight algorithms: SARSA, Q-Learning,
Deep Q-Network (DQN), Double Deep Q-Network (DDQN), REINFORCE, Deep
Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and
Soft Actor-Critic (SAC). Hyperparameters are tuned by grid search to determine
the best parameters for each algorithm.
The evaluation results show that DDPG, an off-policy policy gradient-based
algorithm, achieves the best congestion reduction among the models compared.
On weekday traffic, it reduces travel time by 30%, queue length by 23%, and the
maximum queue length at intersections by 79%. On weekend traffic, it reduces
travel time by 16%, queue length by 14%, and the maximum queue length at
intersections by 75%. |
format |
Theses |
author |
Pandu Irsyadi, Naufal |
title |
REINFORCEMENT LEARNING-BASED TRAFFIC LIGHT DURATION CONTROL SYSTEM DEVELOPMENT |
url |
https://digilib.itb.ac.id/gdl/view/87727 |
_version_ |
1823658252858032128 |
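
The abstract lists Q-Learning among the algorithms trained to choose traffic light durations. As a rough illustration of that idea only, the sketch below applies the standard tabular Q-learning update to a toy two-approach intersection. Everything specific here is an assumption made for the example and is not taken from the thesis: the candidate durations in `DURATIONS`, the arrival rates, the discharge rate of one vehicle per 2 seconds, the state binning in `bucket`, and the reward (negative total queue). The thesis itself builds its environment in SUMO from Google Maps data, which this sketch does not attempt to reproduce.

```python
import random
from collections import defaultdict

# All constants below are assumptions for this toy example, not thesis values.
DURATIONS = (5, 10, 20)      # candidate green durations in seconds
ARRIVAL_RATE = (0.30, 0.10)  # per-second arrival probability on approaches 0 and 1
MAX_QUEUE = 40               # cap on queue length keeps the state space finite


def bucket(queue):
    """Coarsen a queue length into one of five bins for the Q-table state."""
    return min(queue // 5, 4)


def step(queues, phase, duration, rng):
    """Give approach `phase` a green of `duration` seconds.

    Returns the new queues and a reward equal to the negative total queue.
    """
    queues = list(queues)
    for _ in range(duration):
        for i, rate in enumerate(ARRIVAL_RATE):
            if rng.random() < rate:
                queues[i] = min(queues[i] + 1, MAX_QUEUE)
    # Assumed discharge: one vehicle leaves the served approach every 2 seconds.
    queues[phase] = max(queues[phase] - duration // 2, 0)
    return tuple(queues), -sum(queues)


def q_update(q, state, action, reward, next_state, alpha, gamma):
    """Standard Q-learning update: move Q(s, a) toward r + gamma * max Q(s', .)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def train(episodes=300, steps=50, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Train a Q-table with epsilon-greedy exploration on the toy intersection."""
    rng = random.Random(seed)
    q = defaultdict(lambda: [0.0] * len(DURATIONS))
    for _ in range(episodes):
        queues, phase = (0, 0), 0
        for _ in range(steps):
            state = (phase, bucket(queues[0]), bucket(queues[1]))
            if rng.random() < epsilon:
                action = rng.randrange(len(DURATIONS))
            else:
                action = max(range(len(DURATIONS)), key=q[state].__getitem__)
            queues, reward = step(queues, phase, DURATIONS[action], rng)
            phase = 1 - phase  # the two approaches alternate
            next_state = (phase, bucket(queues[0]), bucket(queues[1]))
            q_update(q, state, action, reward, next_state, alpha, gamma)
    return q
```

Calling `q = train()` returns a table whose keys are `(phase, bin_0, bin_1)` states; the index of the largest value in `q[state]` selects the greedy entry of `DURATIONS` for that state. A tabular method like this only works because the toy state space is tiny, which is one reason deep variants such as DQN or DDPG are used for richer simulated environments.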