Accident occurrence predictions with cost effective feature selection methods

Bibliographic Details
Main Author: Garimella Krishna Apoorva
Other Authors: Zhu Feng
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/141918
Institution: Nanyang Technological University
Description
Summary: Traffic accidents impose a large social cost, and predicting them helps officials implement strategies for safer roads. Most research in this field focuses on using powerful classifiers to achieve higher prediction performance. This report instead builds a high-performing model by introducing cost-sensitive feature selection as a preprocessing step, which mitigates the class imbalance and noisy features that affect most accident prediction models when a classifier is used without preprocessing. The study introduces three cost-sensitive feature selection methods that account for imbalanced data and identify important features using various performance metrics. The selected features are then passed to a cost-sensitive Support Vector Machine classifier to evaluate performance. A K-Nearest Neighbor classifier was also used to compare the performance of the three feature selection methods. Overall, the report finds feature selection to be an effective preprocessing step for traffic accident prediction. Of all the features used, weather was the most important, with the highest feature score and AUC score, followed by location, time and speed. These results are comparable to previous research on identifying important features for traffic accident prediction.
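
The summary mentions ranking features by a feature score and an AUC score on imbalanced data. As a rough illustration only, the sketch below scores each feature by its single-feature AUC, a metric that does not depend on the class ratio. This is not one of the report's three methods, which this record does not specify. The function names, the toy data generator, and the assumed effect sizes for "weather" and "speed" are all hypothetical and are not taken from the report's data.

```python
import random


def auc_score(scores, labels):
    """Rank-based AUC: chance that a random positive outscores a random negative.

    Ties count as half a win. Labels are 1 (accident) or 0 (no accident).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(rows, labels, names):
    """Return (name, score) pairs sorted from most to least informative."""
    ranked = []
    for j, name in enumerate(names):
        column = [row[j] for row in rows]
        auc = auc_score(column, labels)
        # Fold around 0.5 so the direction of the relationship does not matter.
        ranked.append((name, max(auc, 1.0 - auc)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


def make_toy_data(n=400, accident_rate=0.1, seed=0):
    """Hypothetical imbalanced data: roughly 10% of rows are accidents."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < accident_rate else 0
        weather = rng.gauss(2.0 * y, 1.0)  # assumed strongly informative
        speed = rng.gauss(0.5 * y, 1.0)    # assumed weakly informative
        noise = rng.gauss(0.0, 1.0)        # unrelated to the label
        rows.append([weather, speed, noise])
        labels.append(y)
    return rows, labels


rows, labels = make_toy_data()
ranking = rank_features(rows, labels, ["weather", "speed", "noise"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In a pipeline like the one the summary describes, the top-ranked features would then be passed to a classifier that weights the minority class more heavily; that step is omitted here to keep the sketch dependency-free.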