Intelligent buses in a loop service: emergence of no-boarding and holding strategies

Bibliographic Details
Main Authors: Saw, Vee-Liem; Vismara, Luca; Chew, Lock Yue
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/145027
Institution: Nanyang Technological University
Description
Summary: We study how N intelligent buses serving a loop of M bus stops learn a no-boarding strategy and a holding strategy by reinforcement learning. The no-boarding and holding strategies emerge from the actions of stay or leave when a bus is at a bus stop and everyone who wishes to alight has done so. A reward that encourages the buses to strive towards a staggered phase difference amongst them whilst picking up passengers allows the reinforcement learning process to converge to an optimal Q-table within a reasonable amount of simulation time. It is remarkable that this emergent behaviour of intelligent buses turns out to minimise the average waiting time of commuters, in various setups where buses move with the same speed or different speeds, during busy as well as lull periods. Cooperative actions are also observed, e.g., the buses learn to unbunch.
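The summary names three ingredients: two actions (stay or leave), a reward tied to a staggered phase difference, and a Q-table. The Python sketch below illustrates how these could fit together in a generic tabular Q-learning step. It is not the authors' implementation. The reward function, the state labels, and the parameters alpha and gamma are hypothetical, and the passenger pick-up part of the reward mentioned in the summary is left out.

```python
import math
from collections import defaultdict

STAY, LEAVE = 0, 1  # the two actions named in the summary


def phase_gap_reward(phases, i):
    """Hypothetical reward: 0 when bus i is exactly 2*pi/N behind the
    nearest bus ahead of it on the loop, and more negative the further
    the gap is from that staggered ideal. Passenger terms are omitted."""
    n = len(phases)
    if n < 2:
        return 0.0  # a single bus has no phase difference to maintain
    ideal = 2 * math.pi / n
    # Forward angular distance from bus i to every other bus.
    gaps = [(phases[j] - phases[i]) % (2 * math.pi)
            for j in range(n) if j != i]
    return -abs(min(gaps) - ideal)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update (alpha, gamma are assumed values)."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


# Q-table: each (hypothetical) state maps to values for [STAY, LEAVE].
q_table = defaultdict(lambda: [0.0, 0.0])

# Two buses, nearly bunched: bus 0 is only 0.1 rad behind bus 1.
r = phase_gap_reward([0.0, 0.1], 0)
q_update(q_table, "bunched", LEAVE, r, "after_leave")
```

In this toy setup, leaving while bunched receives a negative reward, so repeated updates would lower the value of LEAVE in that state relative to STAY. This loosely mirrors how a holding strategy could emerge from the stay or leave choice.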