Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker
Abstract: Accurate dynamic driver head pose tracking is of great importance for driver–automotive collaboration, intelligent copilot, head-up display (HUD), and other human-centered automated driving applications. To further advance this technology, this article proposes a low-cost and markerless head-tracking system using a deep learning-based dynamic head pose estimation model. The proposed system requires only a red, green, blue (RGB) camera without other hardware or markers. To enhance the accuracy of the driver's head pose estimation, a spatiotemporal vision transformer (ST-ViT) model, which takes an image pair as the input instead of a single frame, is proposed. Compared to a standard transformer, the ST-ViT contains a spatial–convolutional vision transformer and a temporal transformer, which improve the model's performance. To handle the error fluctuation of the head pose estimation model, this article proposes an adaptive Kalman filter (AKF). By analyzing the error distribution of the estimation model and the user experience of the head tracker, the proposed AKF includes an adaptive observation noise coefficient that adaptively moderates the smoothness of the estimated pose curve. Comprehensive experiments show that the proposed system is feasible and effective, and it achieves state-of-the-art performance.

Main Authors: Hu, Zhongxu; Zhang, Yiran; Xing, Yang; Zhao, Yifan; Cao, Dongpu; Lv, Chen
Format: Article
Language: English
Published: 2022
Subjects: Engineering::Mechanical engineering; Magnetic Heads; Pose Estimation
Online Access: https://hdl.handle.net/10356/162995
Institution: Nanyang Technological University
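The abstract states that the proposed AKF uses an adaptive observation-noise coefficient to moderate the smoothness of the estimated pose curve, but it does not give the adaptation rule. The sketch below is a minimal illustrative 1-D filter for a single pose angle, assuming an innovation-based rule (large innovations, i.e., fast head motion, lower the observation noise so the filter tracks quickly; small innovations raise it so jitter is smoothed). The class name, parameters, and the specific rule are assumptions for illustration, not the authors' formulation.

```python
class AdaptiveKalman1D:
    """Illustrative adaptive Kalman filter for one head-pose angle (e.g., yaw).

    Uses a constant-position process model; the observation noise R shrinks
    as the innovation grows (hypothetical adaptation rule, not from the paper).
    """

    def __init__(self, q=1e-3, r_base=0.5, k_adapt=2.0):
        self.x = 0.0            # state estimate (angle, degrees)
        self.p = 1.0            # estimate covariance
        self.q = q              # process-noise variance
        self.r_base = r_base    # baseline observation-noise variance
        self.k_adapt = k_adapt  # sensitivity of the noise adaptation
        self.initialized = False

    def update(self, z):
        """Fuse one noisy pose measurement z; return the smoothed estimate."""
        if not self.initialized:
            self.x, self.initialized = z, True
            return self.x
        # Predict step (constant-position model).
        p_pred = self.p + self.q
        # Adapt the observation noise: small innovation -> large R (trust the
        # smooth state); large innovation -> small R (trust the measurement).
        innovation = z - self.x
        r = self.r_base / (1.0 + self.k_adapt * abs(innovation))
        # Standard Kalman update.
        k = p_pred / (p_pred + r)
        self.x = self.x + k * innovation
        self.p = (1.0 - k) * p_pred
        return self.x
```

Fed a stream of per-frame pose estimates, a filter like this damps small estimation jitter while letting large genuine head motions pass through with little lag, which matches the smoothness-versus-responsiveness trade-off the abstract describes.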
Record ID: sg-ntu-dr.10356-162995
Author Affiliations: School of Mechanical and Aerospace Engineering; Continental-NTU Corporate Laboratory
Funding: This work was supported in part by the A*STAR National Robotics Program under grant W1925d0046; in part by the Start-Up Grant, Nanyang Assistant Professorship, under grant M4082268.050, Nanyang Technological University, Singapore; and in part by the State Key Laboratory of Automotive Safety and Energy under project KF2021.
Citation: Hu, Z., Zhang, Y., Xing, Y., Zhao, Y., Cao, D. & Lv, C. (2022). Toward human-centered automated driving: a novel spatiotemporal vision transformer-enabled head tracker. IEEE Vehicular Technology Magazine. https://dx.doi.org/10.1109/MVT.2021.3140047
ISSN: 1556-6072
DOI: 10.1109/MVT.2021.3140047
Rights: © 2022 IEEE. All rights reserved.