Taxi demand hotspots and customers’ travel destinations prediction using data mining approaches

Bibliographic Details
Main Author: Tan, Cheun Pin.
Other Authors: Ng Wee Keong
Format: Final Year Project
Language: English
Published: 2012
Subjects:
Online Access: http://hdl.handle.net/10356/48804
Institution: Nanyang Technological University
Description
Summary: It has been estimated that over 23 thousand licensed taxis in Singapore are unoccupied for around 50 percent of their driving time on average [4, 6]. Knowing taxi demand hotspots and customers’ travel destinations at a given time and location helps taxi drivers with daily planning and scheduling, helps the taxi service provider with dispatching, and increases customer satisfaction with the taxi service. In this project, the author developed a taxi prediction system for Singapore. The system’s main functions are predicting taxi demand hotspots and customers’ travel destinations based on hour of the day, day of the week, and weather condition. Historical data is used to build the inference engines for both types of prediction. The inference engine for travel destination prediction is based on a decision tree classifier, while the inference engine for demand hotspot prediction is based on the algorithm proposed in previous work [10]. Finally, the system predicts potential hotspots and customers’ travel destinations for the benefit of taxi drivers and the taxi service operator. Both predictors were verified with artificial data. In the experiments, both predictors achieved strong detection performance (AUC > 0.96), but their accuracy was unsatisfactory, at around 55 percent, because artificial historical data was used to train them.
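
As a rough illustration of the destination predictor described in the summary, the sketch below builds a small decision tree over the three inputs the record names (hour of the day, day of the week, weather condition). It is a minimal sketch and not the project's implementation: the record does not say which tree algorithm was used, so an entropy-based (ID3-style) split is assumed here, and the trip history, the zone names, and the functions `entropy`, `build_tree`, and `predict` are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy trip history: (hour bucket, day type, weather) -> destination zone.
# Every value here is invented for illustration; none comes from the project.
HISTORY = [
    (("morning", "weekday", "clear"), "CBD"),
    (("morning", "weekday", "rain"), "CBD"),
    (("morning", "weekend", "clear"), "East Coast"),
    (("morning", "weekend", "rain"), "Residential"),
    (("evening", "weekday", "clear"), "Residential"),
    (("evening", "weekday", "rain"), "Residential"),
    (("evening", "weekend", "clear"), "Orchard"),
    (("evening", "weekend", "rain"), "Orchard"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def build_tree(rows, features):
    """Recursively split on the feature index with the lowest weighted entropy."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return majority  # leaf: a single destination label

    def split_entropy(f):
        groups = {}
        for x, y in rows:
            groups.setdefault(x[f], []).append(y)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=split_entropy)
    branches = {}
    for x, y in rows:
        branches.setdefault(x[best], []).append((x, y))
    rest = [f for f in features if f != best]
    return {
        "feature": best,
        "default": majority,  # fallback for feature values not seen in training
        "children": {v: build_tree(sub, rest) for v, sub in branches.items()},
    }


def predict(tree, x):
    """Walk the tree until a leaf (a plain label) is reached."""
    while isinstance(tree, dict):
        tree = tree["children"].get(x[tree["feature"]], tree["default"])
    return tree


tree = build_tree(HISTORY, [0, 1, 2])
print(predict(tree, ("morning", "weekday", "rain")))
```

Because every row in this toy history has a distinct feature combination, the tree reproduces its training labels exactly. A history in which many conditions lead to several different destinations would not allow that, whichever split rule is chosen.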