Weakly-supervised 3D hand pose estimation from monocular RGB images
Compared with depth-based 3D hand pose estimation, inferring 3D hand pose from monocular RGB images is more challenging, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data. Unlike existing learning-based monocular RGB-input approaches that require accurate 3D annotations for training, we propose to leverage depth images, which can be easily obtained from commodity RGB-D cameras, during training, while at test time we take only RGB inputs for 3D joint prediction. In this way, we alleviate the burden of costly 3D annotations on real-world datasets. In particular, we propose a weakly-supervised method that adapts from a fully-annotated synthetic dataset to a weakly-labeled real-world dataset with the aid of a depth regularizer, which generates depth maps from the predicted 3D pose and serves as weak supervision for 3D pose regression. Extensive experiments on benchmark datasets validate the effectiveness of the proposed depth regularizer in both weakly-supervised and fully-supervised settings.
Saved in: DR-NTU
Main Authors: Cai, Yujun; Ge, Liuhao; Cai, Jianfei; Yuan, Junsong
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Computer Vision; 3D Hand Pose Estimation
Online Access: https://hdl.handle.net/10356/140530
Institution: Nanyang Technological University
id |
sg-ntu-dr.10356-140530 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-140530 2020-09-26T21:53:02Z. Weakly-supervised 3D hand pose estimation from monocular RGB images. Cai, Yujun; Ge, Liuhao; Cai, Jianfei; Yuan, Junsong. School of Computer Science and Engineering; Interdisciplinary Graduate School (IGS); Department of Computer Science and Engineering, State University of New York at Buffalo; University Institute for Media Innovation (IMI). European Conference on Computer Vision (ECCV) 2018. Subjects: Engineering::Computer science and engineering; Computer Vision; 3D Hand Pose Estimation. Abstract: Compared with depth-based 3D hand pose estimation, inferring 3D hand pose from monocular RGB images is more challenging, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data. Unlike existing learning-based monocular RGB-input approaches that require accurate 3D annotations for training, we propose to leverage depth images, which can be easily obtained from commodity RGB-D cameras, during training, while at test time we take only RGB inputs for 3D joint prediction. In this way, we alleviate the burden of costly 3D annotations on real-world datasets. In particular, we propose a weakly-supervised method that adapts from a fully-annotated synthetic dataset to a weakly-labeled real-world dataset with the aid of a depth regularizer, which generates depth maps from the predicted 3D pose and serves as weak supervision for 3D pose regression. Extensive experiments on benchmark datasets validate the effectiveness of the proposed depth regularizer in both weakly-supervised and fully-supervised settings. Funding: NRF (Natl Research Foundation, S'pore); MOE (Min. of Education, S'pore). Accepted version. Deposited 2020-05-30T07:09:34Z. Conference Paper. Citation: Cai, Y., Ge, L., Cai, J., & Yuan, J. (2018). Weakly-supervised 3D hand pose estimation from monocular RGB images. European Conference on Computer Vision (ECCV) 2018, 678-694. doi:10.1007/978-3-030-01231-1_41. https://hdl.handle.net/10356/140530. Language: en. Grant: MOE2016-T2-2-065. © 2018 Springer Nature Switzerland AG. All rights reserved. This paper was published in European Conference on Computer Vision (ECCV) 2018 and is made available with permission of Springer Nature Switzerland AG. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering; Computer Vision; 3D Hand Pose Estimation |
spellingShingle |
Engineering::Computer science and engineering; Computer Vision; 3D Hand Pose Estimation; Cai, Yujun; Ge, Liuhao; Cai, Jianfei; Yuan, Junsong; Weakly-supervised 3D hand pose estimation from monocular RGB images |
description |
Compared with depth-based 3D hand pose estimation, inferring 3D hand pose from monocular RGB images is more challenging, due to substantial depth ambiguity and the difficulty of obtaining fully-annotated training data. Unlike existing learning-based monocular RGB-input approaches that require accurate 3D annotations for training, we propose to leverage depth images, which can be easily obtained from commodity RGB-D cameras, during training, while at test time we take only RGB inputs for 3D joint prediction. In this way, we alleviate the burden of costly 3D annotations on real-world datasets. In particular, we propose a weakly-supervised method that adapts from a fully-annotated synthetic dataset to a weakly-labeled real-world dataset with the aid of a depth regularizer, which generates depth maps from the predicted 3D pose and serves as weak supervision for 3D pose regression. Extensive experiments on benchmark datasets validate the effectiveness of the proposed depth regularizer in both weakly-supervised and fully-supervised settings. |
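The core idea in the abstract — rendering a depth map from a predicted 3D pose and comparing it against a captured depth image as weak supervision — can be illustrated with a minimal sketch. This is not the paper's implementation (which uses a learned depth regularizer network); here a toy renderer simply splats each joint's depth at its projected pixel under an assumed pinhole camera, and the loss is a mean squared error between rendered and reference depth maps. All function names and camera parameters are assumptions for illustration.

```python
import numpy as np

def render_joint_depth(joints_3d, fx=1.0, fy=1.0, cx=16.0, cy=16.0, size=32):
    """Project 3D joints (N, 3) with a pinhole camera and splat each
    joint's depth onto a size x size map (background depth = 0)."""
    depth = np.zeros((size, size), dtype=np.float64)
    for x, y, z in joints_3d:
        if z <= 0:
            continue  # joint behind the camera: nothing to render
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if 0 <= u < size and 0 <= v < size:
            # keep the nearest (smallest) depth if two joints hit one pixel
            depth[v, u] = z if depth[v, u] == 0 else min(depth[v, u], z)
    return depth

def depth_loss(pred_joints, ref_depth, **cam):
    """MSE between the rendered and reference depth maps; this plays the
    role of a weak-supervision signal on the predicted 3D pose."""
    rendered = render_joint_depth(pred_joints, **cam)
    return float(np.mean((rendered - ref_depth) ** 2))

joints = np.array([[0.0, 0.0, 2.0], [1.0, 0.5, 2.5]])
ref = render_joint_depth(joints)   # stand-in for a depth map from an RGB-D camera
print(depth_loss(joints, ref))     # 0.0: prediction agrees with the depth map
print(depth_loss(joints + np.array([0.0, 0.0, 0.5]), ref) > 0)  # drifted pose is penalized
```

The key property, as in the paper, is that this loss needs only a depth image as the label — no per-joint 3D annotation — which is what makes the real-world training data "weakly labeled".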
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering; Cai, Yujun; Ge, Liuhao; Cai, Jianfei; Yuan, Junsong |
format |
Conference or Workshop Item |
author |
Cai, Yujun; Ge, Liuhao; Cai, Jianfei; Yuan, Junsong |
author_sort |
Cai, Yujun |
title |
Weakly-supervised 3D hand pose estimation from monocular RGB images |
title_short |
Weakly-supervised 3D hand pose estimation from monocular RGB images |
title_full |
Weakly-supervised 3D hand pose estimation from monocular RGB images |
title_fullStr |
Weakly-supervised 3D hand pose estimation from monocular RGB images |
title_full_unstemmed |
Weakly-supervised 3D hand pose estimation from monocular RGB images |
title_sort |
weakly-supervised 3d hand pose estimation from monocular rgb images |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/140530 |
_version_ |
1681057978606682112 |