Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker

Accurate dynamic driver head pose tracking is of great importance for driver–automotive collaboration, intelligent copilot, head-up display (HUD), and other human-centered automated driving applications. To further advance this technology, this article proposes a low-cost and markerless head-tracking system using a deep learning-based dynamic head pose estimation model. The proposed system requires only a red-green-blue (RGB) camera, without other hardware or markers. To enhance the accuracy of the driver's head pose estimation, a spatiotemporal vision transformer (ST-ViT) model, which takes an image pair as input instead of a single frame, is proposed. Compared to a standard transformer, the ST-ViT contains a spatial-convolutional vision transformer and a temporal transformer, which improves model performance. To handle the error fluctuation of the head pose estimation model, this article proposes an adaptive Kalman filter (AKF). Based on an analysis of the estimation model's error distribution and the user experience of the head tracker, the AKF includes an adaptive observation noise coefficient that adaptively moderates the smoothness of the output curve. Comprehensive experiments show that the proposed system is feasible and effective and achieves state-of-the-art performance.
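
The abstract describes the ST-ViT only at a high level: a spatial-convolutional vision transformer encodes each frame of an image pair, and a temporal transformer fuses the two frame embeddings before the head pose is regressed. The PyTorch sketch below illustrates that style of model; the class name STViTHeadPose, the convolutional stem, token grid, layer sizes, and pooling choices are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal sketch of a spatiotemporal vision transformer (ST-ViT) style head-pose
# regressor that takes an image pair as input and outputs Euler angles.
# All dimensions and layer choices here are assumptions for illustration.
import torch
import torch.nn as nn

class STViTHeadPose(nn.Module):
    def __init__(self, dim=128, heads=4, spatial_layers=2, temporal_layers=1):
        super().__init__()
        # Convolutional stem: turns a 3x64x64 face crop into an 8x8 grid of tokens.
        self.stem = nn.Conv2d(3, dim, kernel_size=8, stride=8)
        self.pos = nn.Parameter(torch.zeros(1, 64, dim))  # 8*8 spatial positions
        spatial_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.spatial = nn.TransformerEncoder(spatial_layer, spatial_layers)
        temporal_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.temporal = nn.TransformerEncoder(temporal_layer, temporal_layers)
        self.head = nn.Linear(dim, 3)  # yaw, pitch, roll of the current frame

    def encode_frame(self, img):
        tokens = self.stem(img).flatten(2).transpose(1, 2)  # (B, 64, dim)
        tokens = self.spatial(tokens + self.pos)             # spatial attention
        return tokens.mean(dim=1)                             # per-frame embedding

    def forward(self, prev_img, curr_img):
        # Each frame is encoded spatially, then the pair is fused temporally.
        feats = torch.stack(
            [self.encode_frame(prev_img), self.encode_frame(curr_img)], dim=1
        )                                                     # (B, 2, dim)
        fused = self.temporal(feats)[:, -1]                   # embedding of current frame
        return self.head(fused)                               # (B, 3) Euler angles

# Example: a batch of two consecutive 64x64 face crops.
model = STViTHeadPose()
pose = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))  # -> shape (2, 3)
```

Encoding each frame separately and fusing only the two pooled embeddings keeps the temporal stage cheap, since it attends over two tokens rather than two full token grids; whether the paper fuses at this level or over full token sequences is not stated in this record.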

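Likewise, the adaptive Kalman filter is described only by its key idea: an observation-noise coefficient that adapts so the output is smoothed more when the pose is steady and less when it changes quickly. The sketch below illustrates that idea for a single head-pose angle with a simple innovation-based scaling rule; the actual adaptation law, noise values, and state model used in the paper are not given in this record and are assumed here.

```python
# A minimal sketch of the adaptive Kalman filter (AKF) idea for one head-pose
# angle (e.g., yaw), using a constant-velocity state model. The innovation-based
# scaling of the observation noise is an assumed illustration, not the paper's rule.
import numpy as np

class AdaptiveKalmanFilter1D:
    def __init__(self, dt=1.0 / 30, q=1e-3, r_base=2.0, gamma=0.5):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity dynamics
        self.H = np.array([[1.0, 0.0]])             # only the angle is observed
        self.Q = q * np.eye(2)                      # process noise covariance
        self.r_base = r_base                        # nominal observation noise
        self.gamma = gamma                          # adaptation strength (assumed)
        self.x = np.zeros(2)                        # state: [angle, angular rate]
        self.P = np.eye(2)                          # state covariance

    def update(self, z):
        # Predict with the motion model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # A large innovation suggests genuine fast head motion, so the observation
        # noise is reduced (trust the new measurement more); a small innovation
        # keeps the noise high and the output curve smooth.
        nu = (z - self.H @ self.x).item()
        r = self.r_base / (1.0 + self.gamma * abs(nu))
        S = (self.H @ self.P @ self.H.T).item() + r
        K = self.P @ self.H.T / S                   # (2, 1) Kalman gain
        self.x = self.x + K.ravel() * nu
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]                            # smoothed angle estimate

# One filter instance per Euler angle would be run on the raw estimator outputs,
# e.g. once per camera frame:
yaw_filter = AdaptiveKalmanFilter1D()
smoothed_yaw = yaw_filter.update(5.2)  # 5.2 is a hypothetical raw yaw reading
```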

Bibliographic Details
Main Authors: Hu, Zhongxu; Zhang, Yiran; Xing, Yang; Zhao, Yifan; Cao, Dongpu; Lv, Chen
Other Authors: School of Mechanical and Aerospace Engineering; Continental-NTU Corporate Laboratory
Format: Article
Language:English
Published: 2022
Subjects: Engineering::Mechanical engineering; Magnetic Heads; Pose Estimation
Online Access:https://hdl.handle.net/10356/162995
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-162995
Journal: IEEE Vehicular Technology Magazine
Citation: Hu, Z., Zhang, Y., Xing, Y., Zhao, Y., Cao, D. & Lv, C. (2022). Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker. IEEE Vehicular Technology Magazine, 3140047. https://dx.doi.org/10.1109/MVT.2021.3140047
ISSN: 1556-6072
DOI: 10.1109/MVT.2021.3140047
Scopus ID: 2-s2.0-85123775783
Collection: DR-NTU (NTU Library), Nanyang Technological University, Singapore
Funding: This work was supported in part by the A*STAR National Robotics Program under grant W1925d0046; the Start-Up Grant, Nanyang Assistant Professorship under grant M4082268.050, Nanyang Technological University, Singapore; and the State Key Laboratory of Automotive Safety and Energy under project KF2021.
Funding Agencies: Agency for Science, Technology and Research (A*STAR); Nanyang Technological University
Rights: © 2022 IEEE. All rights reserved.