HARDWARE ARCHITECTURE DESIGN OF DOUBLE Q-LEARNING ALGORITHM FOR SMART TRAFFIC CONTROLLER
Traffic congestion is a serious problem that causes significant losses. Its causes range from vehicle volumes that exceed road capacity and poor driving culture to traffic control systems that do not adapt to road conditions. Most of the traffic management systems in Indonesia are...
Main Author: | Reswara Wiradjanu, Jalu |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/73890 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:73890 |
---|---|
spelling |
id-itb.:73890; 2023-06-24T17:31:50Z; HARDWARE ARCHITECTURE DESIGN OF DOUBLE Q-LEARNING ALGORITHM FOR SMART TRAFFIC CONTROLLER; Reswara Wiradjanu, Jalu; Indonesia; Final Project; keywords: traffic jam, congestion, Q-learning, pre-timed, adaptive traffic control; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/73890; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Traffic congestion is a serious problem that causes significant losses. Its causes
range from vehicle volumes that exceed road capacity and poor driving culture to
traffic control systems that do not adapt to road conditions. Most traffic
management systems in Indonesia are still pre-timed, so they cannot adjust to
changing traffic conditions. This increases the risk of congestion from residual
queues. An adaptive traffic control system based on Q-learning was developed for
two adjacent intersections at the simulation level. The system simulates traffic
conditions from traffic counting results and controls the traffic lights with the
Q-learning algorithm. Traffic conditions are also replicated in a traffic miniature
as an approximation of a real-world implementation. The Q-learning algorithm is
implemented in the hardware description language Verilog. The implementation of
Q-learning has not been successful. The Q-matrix results are supposed to be sent
to SUMO to manage the simulated traffic. The resulting Q-matrix is 256 x 4 in size
with a 32-bit signed Q-value data width. The reward graph shows that the reward
becomes increasingly positive. Because the reward is determined at each step,
policy changes alter the reward graph. No performance comparison between
adaptive control and manual or pre-timed control has been made yet. |
format |
Final Project |
author |
Reswara Wiradjanu, Jalu |
title |
HARDWARE ARCHITECTURE DESIGN OF DOUBLE Q-LEARNING ALGORITHM FOR SMART TRAFFIC CONTROLLER |
url |
https://digilib.itb.ac.id/gdl/view/73890 |
_version_ |
1822007240617361408 |
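
The description above mentions a 256 x 4 Q-matrix holding 32-bit signed Q-values. As a rough illustration of what one Q-learning update on such a table could look like in integer arithmetic, here is a minimal Python model. The thesis implements the algorithm in Verilog, and the record does not give its fixed-point format, learning rate, discount factor, or state encoding, so `FRAC_BITS`, `ALPHA_SHIFT`, `GAMMA_SHIFT`, and all function names below are assumptions chosen for the sketch, not the author's design.

```python
# Illustrative sketch only: the fixed-point split, alpha, gamma and all helper
# names are assumptions, not taken from the thesis.
N_STATES, N_ACTIONS = 256, 4   # Q-matrix size stated in the abstract
FRAC_BITS = 16                 # assumed Q16.16 split of the 32-bit signed word
ALPHA_SHIFT = 3                # assumed learning rate alpha = 2**-3
GAMMA_SHIFT = 3                # assumed discount gamma = 1 - 2**-3 = 0.875
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def saturate(value):
    """Clamp an integer to the signed 32-bit range, as a saturating adder would."""
    return max(INT32_MIN, min(INT32_MAX, value))


def to_fixed(x):
    """Convert a float to the assumed Q16.16 fixed-point representation."""
    return saturate(int(round(x * (1 << FRAC_BITS))))


def to_float(q):
    """Convert a Q16.16 fixed-point value back to a float."""
    return q / (1 << FRAC_BITS)


def q_update(q, state, action, reward, next_state):
    """One step of Q[s][a] += alpha * (r + gamma * max(Q[s']) - Q[s][a]).

    Multiplications by alpha and gamma are replaced by arithmetic shifts,
    a common hardware-friendly choice (assumed here).
    """
    max_next = max(q[next_state])
    discounted = max_next - (max_next >> GAMMA_SHIFT)  # gamma * max_next
    td_error = reward + discounted - q[state][action]
    q[state][action] = saturate(q[state][action] + (td_error >> ALPHA_SHIFT))
    return q[state][action]


q_table = [[0] * N_ACTIONS for _ in range(N_STATES)]
q_update(q_table, state=5, action=2, reward=to_fixed(1.0), next_state=6)
print(to_float(q_table[5][2]))  # 0.125 with the assumed alpha = 2**-3
```

Restricting alpha and gamma to powers of two keeps the update free of multipliers, which is one plausible reason to model it this way; a design with arbitrary coefficients would need fixed-point multiplication instead.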