Mining actionlet ensemble for action recognition with depth cameras

Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracke...

Bibliographic Details
Main Authors:	Wang, Jiang, Liu, Zicheng, Wu, Ying, Yuan, Junsong
Other Authors:	School of Electrical and Electronic Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/100602 http://hdl.handle.net/10220/17897
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-100602
record_format	dspace
spelling	sg-ntu-dr.10356-1006022020-03-07T13:24:50Z Mining actionlet ensemble for action recognition with depth cameras Wang, Jiang Liu, Zicheng Wu, Ying Yuan, Junsong School of Electrical and Electronic Engineering IEEE Conference on Computer Vision and Pattern Recognition (2012 : Providence, Rhode Island, US) DRNTU::Engineering::Electrical and electronic engineering Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracked joints may be completely wrong if serious occlusions occur, which increases the intra-class variations in the actions. In this paper, an actionlet ensemble model is learnt to represent each action and to capture the intra-class variance. In addition, novel features that are suitable for depth data are proposed. They are robust to noise, invariant to translational and temporal misalignments, and capable of characterizing both the human motion and the human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and another dataset captured by a MoCap system. The experimental evaluations show that the proposed approach achieves superior performance to the state of the art algorithms. Accepted version 2013-11-29T03:20:19Z 2019-12-06T20:25:13Z 2013-11-29T03:20:19Z 2019-12-06T20:25:13Z 2012 2012 Conference Paper Wang, J., Liu, Z., Wu, Y., & Yuan, J. (2012). Mining actionlet ensemble for action recognition with depth cameras. 2012 IEEE Conference on Computer Vision and Pattern Recognition, 1290-1297. https://hdl.handle.net/10356/100602 http://hdl.handle.net/10220/17897 10.1109/CVPR.2012.6247813 en © 2012 IEEE. Personal use of this material is permitted. 
Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/CVPR.2012.6247813]. This work was supported in part by National Science Foundation grants IIS-0347877 and IIS-0916607, the US Army Research Laboratory and the US Army Research Office under grant ARO W911NF-08-1-0504, and DARPA Award FA 8650-11-1-7149. This work is partially supported by Microsoft Research. application/pdf
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Electrical and electronic engineering
spellingShingle	DRNTU::Engineering::Electrical and electronic engineering Wang, Jiang Liu, Zicheng Wu, Ying Yuan, Junsong Mining actionlet ensemble for action recognition with depth cameras
description	Human action recognition is an important yet challenging task. The recently developed commodity depth sensors open up new possibilities of dealing with this problem but also present some unique challenges. The depth maps captured by the depth cameras are very noisy and the 3D positions of the tracked joints may be completely wrong if serious occlusions occur, which increases the intra-class variations in the actions. In this paper, an actionlet ensemble model is learnt to represent each action and to capture the intra-class variance. In addition, novel features that are suitable for depth data are proposed. They are robust to noise, invariant to translational and temporal misalignments, and capable of characterizing both the human motion and the human-object interactions. The proposed approach is evaluated on two challenging action recognition datasets captured by commodity depth cameras, and another dataset captured by a MoCap system. The experimental evaluations show that the proposed approach achieves superior performance to the state of the art algorithms.
author2	School of Electrical and Electronic Engineering
author_facet	School of Electrical and Electronic Engineering Wang, Jiang Liu, Zicheng Wu, Ying Yuan, Junsong
format	Conference or Workshop Item
author	Wang, Jiang Liu, Zicheng Wu, Ying Yuan, Junsong
author_sort	Wang, Jiang
title	Mining actionlet ensemble for action recognition with depth cameras
title_short	Mining actionlet ensemble for action recognition with depth cameras
title_full	Mining actionlet ensemble for action recognition with depth cameras
title_fullStr	Mining actionlet ensemble for action recognition with depth cameras
title_full_unstemmed	Mining actionlet ensemble for action recognition with depth cameras
title_sort	mining actionlet ensemble for action recognition with depth cameras
publishDate	2013
url	https://hdl.handle.net/10356/100602 http://hdl.handle.net/10220/17897
_version_	1681035543885905920
