Deep-learning-based 3D driver pose estimation for autonomous driving


Bibliographic Details
Main Author: Cao, Xiao
Other Authors: Lyu Chen
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2021
Subjects:
Online Access: https://hdl.handle.net/10356/149968
Institution: Nanyang Technological University
Description
Summary: Human-machine interaction is key to the future development of virtual reality, augmented reality, artificial intelligence and smart devices. The application of human-machine interaction technology, especially human body estimation, to autonomous driving is important for helping drivers drive safely and smoothly. Human estimation can detect driver fatigue; it can also support ergonomics research and thereby improve human-machine interface design in automated vehicles. Researchers have made great progress in human state estimation, including body estimation, hand estimation and face estimation. In the past, human estimation technology depended on hardware devices, whereas estimation methods based on machine learning and deep learning have become increasingly popular and show excellent performance compared with traditional approaches in terms of cost and efficiency. However, most estimation models are developed separately: existing models can perform only body estimation or hand estimation individually rather than simultaneously, while a model that can identify different parts of the human body at the same time is more desirable in both research and application. In this dissertation, five deep learning models, including Simple Faster R-CNN, RootNet, PoseNet, YOLOv3 and a hand estimation model, are selected and combined through a cascade method to develop an integrated model that can estimate the human body and human hands simultaneously. The outputs of the individual models are expressed in different coordinate systems, so they cannot be fed directly into the subsequent neural network. Hence, in this project, they are transformed into a common coordinate system by a rotation transformation matrix, which enables the five models to be connected in series. Through a specifically designed experiment, the integrated model is shown to produce 2D and 3D poses of the human body and human hands at the same time. Many problems remain in this project; these will be addressed in future work, and further functions, such as face estimation models, will be added.
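The abstract's coordinate-alignment step — mapping keypoints from one model's coordinate system into another's via a rotation matrix — can be illustrated with a minimal sketch. The thesis does not give its actual matrices or angles, so the rotation axis, angle, and function names below are illustrative assumptions, not the author's implementation.

```python
import math

def rotation_matrix_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z-axis.
    (Illustrative: the thesis's actual rotation axis/angle is not specified.)"""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s,  c, 0.0],
            [0.0, 0.0, 1.0]]

def transform_keypoints(keypoints, R):
    """Apply rotation matrix R to each 3D keypoint (x, y, z),
    mapping them into a common coordinate frame."""
    out = []
    for x, y, z in keypoints:
        out.append((R[0][0] * x + R[0][1] * y + R[0][2] * z,
                    R[1][0] * x + R[1][1] * y + R[1][2] * z,
                    R[2][0] * x + R[2][1] * y + R[2][2] * z))
    return out

# Example: a 90-degree rotation about z maps (1, 0, 0) to approximately (0, 1, 0)
R = rotation_matrix_z(math.pi / 2)
print(transform_keypoints([(1.0, 0.0, 0.0)], R))
```

Once all model outputs share one coordinate frame, later stages of the cascade can consume them directly, which is what allows the five models to be connected in series.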