Driver state monitoring for intelligent vehicles - part I: in-cabin activity identification

Bibliographic Details
Main Author: Lim, Cai Yin
Other Authors: Lyu Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/167301
Institution: Nanyang Technological University
Description
Summary: The growing worldwide interest in Intelligent Vehicles (IV) has made it possible to operate vehicles with varying degrees of autonomy. Driver Activity Recognition (DAR) is an important area of research that aims to reduce road accidents caused by distracted driving. Past research has placed strong emphasis on using datasets collected under optimal conditions to achieve high accuracy. Models trained on data with so little variation may not perform as well in real time as they do on the data for which they were optimised. Therefore, this report aims to develop a deep learning model that has the lowest computational cost while maintaining high accuracy. In this research, Convolutional Neural Networks (CNN), transfer learning and two-stream neural networks are investigated. The datasets used to train the models are the Kaggle dataset for single-frame models and Ben Khalifa’s dataset for multi-frame models. 2D CNN and 2D CNN with transfer learning were applied to the single-frame dataset, while 3D CNN with transfer learning, Two-Stream transfer learning and Two-Stream CNN were used on the multi-frame dataset. 2D CNN with transfer learning (ResNet18) performs best on the single-frame dataset, with an accuracy of 99.44%. Two-Stream transfer learning with RGB and optical flow attained an accuracy of 97.16% on the multi-frame dataset. The best-performing models of each type were then compared before the final model was selected for testing with real-world data.
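
The summary does not say how the RGB and optical-flow streams are combined. As an illustration only, the sketch below shows one common option, late fusion, in which each stream's class scores are turned into probabilities and then averaged. The function names, the `rgb_weight` parameter and the three-class example logits are assumptions made for this sketch and are not taken from the report.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_stream(rgb_logits, flow_logits, rgb_weight=0.5):
    """Late fusion (assumed, not the report's confirmed method).

    Takes per-class logits from a hypothetical RGB stream and a
    hypothetical optical-flow stream, and returns the weighted average
    of their probabilities together with the predicted class index.
    """
    if len(rgb_logits) != len(flow_logits):
        raise ValueError("both streams must score the same set of classes")
    rgb_probs = softmax(rgb_logits)
    flow_probs = softmax(flow_logits)
    fused = [
        rgb_weight * r + (1.0 - rgb_weight) * f
        for r, f in zip(rgb_probs, flow_probs)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


# Made-up logits for three activity classes, purely for illustration.
fused_probs, predicted_class = fuse_two_stream([2.0, 0.0, 0.0], [0.0, 3.0, 0.0])
```

With equal weights the more confident stream decides the result; raising `rgb_weight` towards 1.0 shifts the decision towards the appearance stream.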