HARDWARE ARCHITECTURE DESIGN OF DOUBLE Q-LEARNING ALGORITHM FOR SMART TRAFFIC CONTROLLER

Traffic congestion is a serious problem that causes many losses. Its causes include vehicle volumes that exceed road capacity, poor driving culture, and traffic control systems that do not adapt to road conditions. Most of the traffic management systems in Indonesia are...

Bibliographic Details
Main Author: Reswara Wiradjanu, Jalu
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/73890
Institution: Institut Teknologi Bandung
id id-itb.:73890
timestamp 2023-06-24T17:31:50Z
keywords traffic jam, congestion, Q-learning, pre-timed, adaptive traffic control
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Traffic congestion is a serious problem that causes many losses. Its causes include vehicle volumes that exceed road capacity, poor driving culture, and traffic control systems that do not adapt to road conditions. Most traffic management systems in Indonesia are still pre-timed, so they cannot adjust to changing traffic conditions. This increases the risk of congestion from residual queues. An adaptive traffic control system based on Q-learning was developed for two adjacent intersections at the simulation level. The system simulates traffic conditions from traffic counting results and regulates the traffic lights with the Q-learning algorithm. Traffic conditions are also replicated in a miniature traffic model as an approach to real-world implementation. The Q-learning algorithm is implemented in the hardware description language Verilog. This implementation has not yet been successful; in the intended design, the resulting Q-matrix is sent to SUMO to manage the simulated traffic. The resulting Q-matrix has 256 x 4 entries, each a 32-bit signed Q-value. The reward graph shows that the reward becomes increasingly positive. Policy changes alter the reward graph because the reward is determined at each step. No performance comparison between the adaptive settings and manual or pre-timed settings has been made yet.
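The description gives only the Q-matrix dimensions (256 x 4) and the 32-bit signed data width. As a rough illustration of how a tabular Q-learning update over such a table might behave, here is a minimal Python sketch. It is not the author's Verilog design: the learning rate, the discount factor, their shift-based encoding, the saturation step, and all function names are assumptions made for the example, and it shows plain Q-learning rather than the Double Q-learning named in the title.

```python
# Hypothetical sketch of a tabular Q-learning update on a 256 x 4 table
# of 32-bit signed integers. Only the table size and data width come from
# the record; every parameter and name below is an assumption.

N_STATES = 256          # rows of the Q-matrix (from the description)
N_ACTIONS = 4           # columns of the Q-matrix (from the description)
INT32_MIN = -(2 ** 31)
INT32_MAX = 2 ** 31 - 1

# Assumed hyperparameters, written as shifts because shifts are cheap in hardware.
ALPHA_SHIFT = 3         # learning rate alpha = 1/8
GAMMA_NUM = 7           # discount factor gamma = 7/8
GAMMA_SHIFT = 3


def saturate(value):
    """Clamp a value to the signed 32-bit range (assumed overflow policy)."""
    return max(INT32_MIN, min(INT32_MAX, value))


def make_q_table():
    """Return a 256 x 4 table with every Q-value initialised to zero."""
    return [[0] * N_ACTIONS for _ in range(N_STATES)]


def q_update(q, state, action, reward, next_state):
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[next_state])
    # Python's >> on negative ints floors, like an arithmetic right shift.
    target = reward + ((GAMMA_NUM * best_next) >> GAMMA_SHIFT)
    delta = (target - q[state][action]) >> ALPHA_SHIFT
    q[state][action] = saturate(q[state][action] + delta)
    return q[state][action]


def greedy_action(q, state):
    """Return the index of the highest Q-value in the given state's row."""
    row = q[state]
    return row.index(max(row))
```

Calling `q_update(q, state, action, reward, next_state)` once per simulation step and reading actions back with `greedy_action(q, state)` would give the basic control loop. How states are encoded from queue lengths and how the reward is computed are not stated in the record, so they are left out here.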
format Final Project
author Reswara Wiradjanu, Jalu
title HARDWARE ARCHITECTURE DESIGN OF DOUBLE Q-LEARNING ALGORITHM FOR SMART TRAFFIC CONTROLLER
url https://digilib.itb.ac.id/gdl/view/73890