STUDY IN IMPLEMENTATION OF REINFORCEMENT LEARNING FOR COORDINATED TRAFFIC LIGHT CONTROL IN MULTI-INTERSECTION ROAD NETWORK MODEL

Bibliographic Details
Main Author: Junaedi, Dandi
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/65288
Institution: Institut Teknologi Bandung
Description
Summary: Traffic congestion is notorious for causing severe losses across many sectors. One of its primary causes is conflicting vehicle flows at intersections, where traffic light control is used to resolve these conflicts. Recent developments in machine learning, especially reinforcement learning (RL), have shown a remarkable ability to learn and solve problems in complex models. This potential can be applied to traffic light control to help manage large-scale traffic in a coordinated manner. This research proposed using mean-field theory in RL to enhance the learning process by sharing parameter information between neighboring agents, improving coordination between intersections with the Cooperative Double Q-Learning (Co-DQL) algorithm. The research was conducted on an area of Central Jakarta by replicating its road network and traffic conditions in the traffic simulators VISSIM and SUMO. The Sydney Coordinated Adaptive Traffic System (SCATS) was implemented in VISSIM to simulate the traffic conditions that the chosen algorithms would face. Co-DQL was implemented in SUMO and compared against other RL algorithms (Deep Deterministic Policy Gradient (DDPG) and Deep Q-Learning (DQN)) and conventional algorithms (Max-pressure (MP), Uniform, and Webster's). Comparing vehicle travel time and vehicle throughput, the simulations show that DDPG, Uniform, and Webster's outperform the remaining algorithms (over 40,000 vehicles of throughput, and only a small share of vehicles with travel times above 25,000 seconds). These better-performing algorithms use the change in green-light duration for the next cycle as the control action, whereas the remaining algorithms adaptively switch traffic light phases, a type of action that is almost unbounded in its phase-change decisions.
Therefore, if a bottleneck in vehicle flow occurs, the traffic light can hold its phase for an indefinite amount of time. This result shows that city-scale traffic control requires further development to achieve better results, especially regarding the network model and the control action type.
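Co-DQL builds on Double Q-learning, which keeps two independent value estimates to reduce the overestimation bias of standard Q-learning; the mean-field extension additionally shares information with neighboring intersections. A minimal tabular sketch of the underlying Double Q-learning update follows; the states, actions ("extend" green vs. "switch" phase), and rewards are illustrative assumptions, not values from the thesis:

```python
def double_q_update(qa, qb, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One Double Q-learning step: table qa selects the greedy action
    at s_next, table qb evaluates it, decoupling selection from
    evaluation to reduce overestimation bias."""
    a_star = max(qa[s_next], key=qa[s_next].get)   # greedy w.r.t. qa
    target = r + gamma * qb[s_next][a_star]        # evaluated by qb
    qa[s][a] += alpha * (target - qa[s][a])

# Toy scenario: two intersection states, two control actions.
qa = {0: {"extend": 0.0, "switch": 0.0}, 1: {"extend": 0.0, "switch": 0.0}}
qb = {0: {"extend": 0.0, "switch": 1.0}, 1: {"extend": 2.0, "switch": 0.0}}

# Reward 1.0 for extending green in state 0, then transition to state 1:
# target = 1.0 + 0.9 * qb[1]["extend"] = 2.8, so qa[0]["extend"] becomes 1.4.
double_q_update(qa, qb, s=0, a="extend", r=1.0, s_next=1)
```

In the full algorithm, which table is updated is chosen at random each step (with qa and qb swapping roles); Co-DQL further replaces the plain state with a local observation combined with the mean field of neighboring agents' actions.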