Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker

Accurate dynamic driver head pose tracking is of great importance for driver–automotive collaboration, intelligent copilot, head-up display (HUD), and other human-centered automated driving applications. To further advance this technology, this article proposes a low-cost and markerless head-tracking system using a deep learning-based dynamic head pose estimation model. The proposed system requires only a red-green-blue (RGB) camera, without other hardware or markers. To enhance the accuracy of the driver's head pose estimation, a spatiotemporal vision transformer (ST-ViT) model, which takes an image pair as input instead of a single frame, is proposed. Compared to a standard transformer, the ST-ViT contains a spatial-convolutional vision transformer and a temporal transformer, which improves model performance. To handle the error fluctuation of the head pose estimation model, this article proposes an adaptive Kalman filter (AKF). Based on an analysis of the estimation model's error distribution and the user experience of the head tracker, the AKF includes an adaptive observation noise coefficient that adaptively moderates the smoothness of the output curve. Comprehensive experiments show that the proposed system is feasible and effective and achieves state-of-the-art performance.
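
The abstract describes the ST-ViT only at a high level: a spatial-convolutional vision transformer encodes each frame of an image pair, and a temporal transformer fuses the two frame embeddings before the head pose is regressed. The PyTorch sketch below illustrates that style of model; the class name STViTHeadPose, the convolutional stem, token grid, layer sizes, and pooling choices are illustrative assumptions, not the architecture reported in the paper.

```python
# Minimal sketch of a spatiotemporal vision transformer (ST-ViT) style head-pose
# regressor that takes an image pair as input and outputs Euler angles.
# All dimensions and layer choices here are assumptions for illustration.
import torch
import torch.nn as nn

class STViTHeadPose(nn.Module):
    def __init__(self, dim=128, heads=4, spatial_layers=2, temporal_layers=1):
        super().__init__()
        # Convolutional stem: turns a 3x64x64 face crop into an 8x8 grid of tokens.
        self.stem = nn.Conv2d(3, dim, kernel_size=8, stride=8)
        self.pos = nn.Parameter(torch.zeros(1, 64, dim))  # 8*8 spatial positions
        spatial_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.spatial = nn.TransformerEncoder(spatial_layer, spatial_layers)
        temporal_layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.temporal = nn.TransformerEncoder(temporal_layer, temporal_layers)
        self.head = nn.Linear(dim, 3)  # yaw, pitch, roll of the current frame

    def encode_frame(self, img):
        tokens = self.stem(img).flatten(2).transpose(1, 2)  # (B, 64, dim)
        tokens = self.spatial(tokens + self.pos)             # spatial attention
        return tokens.mean(dim=1)                             # per-frame embedding

    def forward(self, prev_img, curr_img):
        # Each frame is encoded spatially, then the pair is fused temporally.
        feats = torch.stack(
            [self.encode_frame(prev_img), self.encode_frame(curr_img)], dim=1
        )                                                     # (B, 2, dim)
        fused = self.temporal(feats)[:, -1]                   # embedding of current frame
        return self.head(fused)                               # (B, 3) Euler angles

# Example: a batch of two consecutive 64x64 face crops.
model = STViTHeadPose()
pose = model(torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64))  # -> shape (2, 3)
```

Encoding each frame separately and fusing only the two pooled embeddings keeps the temporal stage cheap, since it attends over two tokens rather than two full token grids; whether the paper fuses at this level or over full token sequences is not stated in this record.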

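Likewise, the adaptive Kalman filter is described only by its key idea: an observation-noise coefficient that adapts so the output is smoothed more when the pose is steady and less when it changes quickly. The sketch below illustrates that idea for a single head-pose angle with a simple innovation-based scaling rule; the actual adaptation law, noise values, and state model used in the paper are not given in this record and are assumed here.

```python
# A minimal sketch of the adaptive Kalman filter (AKF) idea for one head-pose
# angle (e.g., yaw), using a constant-velocity state model. The innovation-based
# scaling of the observation noise is an assumed illustration, not the paper's rule.
import numpy as np

class AdaptiveKalmanFilter1D:
    def __init__(self, dt=1.0 / 30, q=1e-3, r_base=2.0, gamma=0.5):
        self.F = np.array([[1.0, dt], [0.0, 1.0]])  # constant-velocity dynamics
        self.H = np.array([[1.0, 0.0]])             # only the angle is observed
        self.Q = q * np.eye(2)                      # process noise covariance
        self.r_base = r_base                        # nominal observation noise
        self.gamma = gamma                          # adaptation strength (assumed)
        self.x = np.zeros(2)                        # state: [angle, angular rate]
        self.P = np.eye(2)                          # state covariance

    def update(self, z):
        # Predict with the motion model.
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # A large innovation suggests genuine fast head motion, so the observation
        # noise is reduced (trust the new measurement more); a small innovation
        # keeps the noise high and the output curve smooth.
        nu = (z - self.H @ self.x).item()
        r = self.r_base / (1.0 + self.gamma * abs(nu))
        S = (self.H @ self.P @ self.H.T).item() + r
        K = self.P @ self.H.T / S                   # (2, 1) Kalman gain
        self.x = self.x + K.ravel() * nu
        self.P = (np.eye(2) - K @ self.H) @ self.P
        return self.x[0]                            # smoothed angle estimate

# One filter instance per Euler angle would be run on the raw estimator outputs,
# e.g. once per camera frame:
yaw_filter = AdaptiveKalmanFilter1D()
smoothed_yaw = yaw_filter.update(5.2)  # 5.2 is a hypothetical raw yaw reading
```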

Bibliographic Details
Main Authors: Hu, Zhongxu; Zhang, Yiran; Xing, Yang; Zhao, Yifan; Cao, Dongpu; Lv, Chen
Other Authors: School of Mechanical and Aerospace Engineering; Continental-NTU Corporate Laboratory
Format: Article
Language:English
Published: 2022
Subjects: Engineering::Mechanical engineering; Magnetic Heads; Pose Estimation
Online Access:https://hdl.handle.net/10356/162995
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-162995
Journal: IEEE Vehicular Technology Magazine
Citation: Hu, Z., Zhang, Y., Xing, Y., Zhao, Y., Cao, D. & Lv, C. (2022). Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker. IEEE Vehicular Technology Magazine, 3140047. https://dx.doi.org/10.1109/MVT.2021.3140047
ISSN: 1556-6072
DOI: 10.1109/MVT.2021.3140047
Scopus ID: 2-s2.0-85123775783
Collection: DR-NTU (NTU Library), Nanyang Technological University, Singapore
Funding: This work was supported in part by the A*STAR National Robotics Program under grant W1925d0046; the Start-Up Grant, Nanyang Assistant Professorship under grant M4082268.050, Nanyang Technological University, Singapore; and the State Key Laboratory of Automotive Safety and Energy under project KF2021.
Funding Agencies: Agency for Science, Technology and Research (A*STAR); Nanyang Technological University
Rights: © 2022 IEEE. All rights reserved.