REINFORCEMENT LEARNING-BASED TRAFFIC LIGHT DURATION CONTROL SYSTEM DEVELOPMENT

Bibliographic Details
Main Author:	Pandu Irsyadi, Naufal
Format:	Theses
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/87727
Institution:	Institut Teknologi Bandung

id	id-itb.:87727
timestamp	2025-02-03T07:52:42Z
keywords	Traffic Congestion, Traffic Light Control, Reinforcement Learning, Q-learning, Policy Gradient, Google Maps
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesian
description	fuel consumption, exhaust emissions, as well as the mental and physical health of the public. Fixed-duration traffic light control systems cannot adapt to dynamic traffic conditions, so an adaptive traffic light duration control system is needed to reduce congestion. This research develops a traffic light duration control system based on Reinforcement Learning (RL) that adjusts traffic light durations in real time. The methodology follows the CRISP-DM (Cross Industry Standard Process for Data Mining) framework, which consists of six phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment. Traffic data is obtained through the Google Maps API and then synthesized into vehicle route data using the SUMO simulator to construct a simulated traffic environment. The RL model is trained using eight algorithms: SARSA, Q-Learning, Deep Q-Network (DQN), Double Deep Q-Network (DDQN), REINFORCE, Deep Deterministic Policy Gradient (DDPG), Proximal Policy Optimization (PPO), and Soft Actor-Critic (SAC). Hyperparameters are tuned with grid search to determine the best parameters for each algorithm. The evaluation results show that DDPG, an off-policy policy-gradient algorithm, performs best at reducing congestion among the models compared. On weekday traffic, it reduces travel time by 30%, queue length by 23%, and the maximum queue length at intersections by 79%. On weekend traffic, it reduces travel time by 16%, queue length by 14%, and the maximum queue length at intersections by 75%.
format	Theses
author	Pandu Irsyadi, Naufal
title	REINFORCEMENT LEARNING-BASED TRAFFIC LIGHT DURATION CONTROL SYSTEM DEVELOPMENT
url	https://digilib.itb.ac.id/gdl/view/87727
_version_	1823658252858032128
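
The description above lists Q-Learning among the eight algorithms trained. As a rough illustration of what a tabular Q-learning update looks like for a signal-duration choice, here is a minimal sketch. It is not the thesis's implementation: the queue-bucket state, the three candidate green durations, the reward, and the `toy_step` dynamics are all hypothetical stand-ins for the SUMO environment the record describes.

```python
import random

# Hypothetical toy setup (assumption, not taken from the thesis):
# state  = queue-length bucket on one approach, 0 (empty) .. 4 (long)
# action = index into a list of candidate green durations in seconds
GREEN_DURATIONS = [10, 20, 30]
N_STATES = 5


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[next_state])            # greedy value of the next state
    td_target = reward + gamma * best_next    # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def toy_step(state, action, rng):
    """Made-up dynamics: longer greens discharge more vehicles, arrivals are random."""
    discharged = action + 1                   # 1, 2 or 3 bucket units
    arrivals = rng.randint(0, 2)
    next_state = min(N_STATES - 1, max(0, state - discharged + arrivals))
    reward = -next_state                      # penalise the queue that remains
    return next_state, reward


def train(episodes=200, steps=50, epsilon=0.1, seed=0):
    """Train with an epsilon-greedy behaviour policy and return the Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(GREEN_DURATIONS) for _ in range(N_STATES)]
    for _ in range(episodes):
        state = rng.randrange(N_STATES)
        for _ in range(steps):
            if rng.random() < epsilon:
                action = rng.randrange(len(GREEN_DURATIONS))
            else:
                action = max(range(len(GREEN_DURATIONS)), key=lambda a: q[state][a])
            next_state, reward = toy_step(state, action, rng)
            q_learning_update(q, state, action, reward, next_state)
            state = next_state
    return q
```

Calling `train()` returns a 5 x 3 table of action values. In a setup closer to the one described, the state, reward, and transition would come from the simulator instead of `toy_step`.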


Similar Items