Driver state monitoring for intelligent vehicles - part I: in-cabin activity identification

Bibliographic Details
Main Author: Low, Daniel Teck Fatt
Other Authors: Lyu Chen
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access:https://hdl.handle.net/10356/177419
Description
Summary: The evolution of Intelligent Vehicles (IV) has enabled various degrees of autonomous driving, aiming to enhance road safety through Advanced Driver Assistance Systems (ADAS). Beyond road obstacle detection, IV research extends to driver-state monitoring, specifically driver distraction, to promote safe driving and minimise the likelihood of road accidents caused by human error. Past studies focused on attaining high accuracy in driver activity recognition through deeper convolutional neural networks (CNNs) with more parameters, which demand greater computational power and are therefore less viable for real-time classification. This report presents efficient CNN architectures, MobileNetV3 and MobileVGG, designed for edge and mobile-class systems and applied primarily to driver activity recognition. Using a transfer learning approach, the models were initialised with parameters pretrained on a large dataset, improving generalisation and overall performance. The findings indicate that MobileNetV3 Large is the most effective architecture for driver activity recognition. A dual-stream model with MobileNetV3 Large as its backbone was developed to address occlusion and variations in camera angle by processing images from the driver's front and side views. This model achieved 81% classification accuracy on real-world data with 10.9M parameters, about 50% fewer than state-of-the-art models, and ran at 27 FPS in real time.
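To make the transfer-learning and dual-stream ideas concrete, below is a minimal PyTorch sketch of how such a model could be assembled: two ImageNet-pretrained MobileNetV3 Large feature extractors, one per camera view, with their pooled features concatenated before a shared linear classifier. This is an illustrative assumption of the architecture, not the report's actual implementation; the class count, input size, and concatenation-based fusion are all placeholders.

```python
# Illustrative sketch only: a dual-stream driver-activity classifier built on
# two MobileNetV3 Large backbones (one per camera view). Fusion by feature
# concatenation and num_classes=10 are assumptions, not the report's design.
import torch
import torch.nn as nn
from torchvision import models


class DualStreamMobileNetV3(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Transfer learning: initialise both streams from ImageNet-pretrained
        # weights rather than training from scratch.
        weights = models.MobileNet_V3_Large_Weights.IMAGENET1K_V1
        self.front_stream = models.mobilenet_v3_large(weights=weights).features
        self.side_stream = models.mobilenet_v3_large(weights=weights).features
        self.pool = nn.AdaptiveAvgPool2d(1)
        # MobileNetV3 Large's final feature map has 960 channels per stream,
        # so the fused descriptor is 2 * 960 = 1920-dimensional.
        self.classifier = nn.Linear(2 * 960, num_classes)

    def forward(self, front_img: torch.Tensor, side_img: torch.Tensor) -> torch.Tensor:
        f = self.pool(self.front_stream(front_img)).flatten(1)
        s = self.pool(self.side_stream(side_img)).flatten(1)
        return self.classifier(torch.cat([f, s], dim=1))


# Usage: one front-view and one side-view image per sample.
model = DualStreamMobileNetV3()
logits = model(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
```

Whether the two streams share weights, and at what depth their features are fused, are design choices the report would settle; the sketch only shows how a front/side pair can be combined so that one view can compensate when the other is occluded.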