Multimodal multipart learning for action recognition in depth videos

The articulated and complex nature of human actions makes action recognition difficult. One approach to handling this complexity is to divide it into the kinetics of individual body parts and to analyze actions based on these partial descriptors. We propose a joint sparse regression based learning method that uses structured sparsity to model each action as a combination of multimodal features drawn from a sparse set of body parts. To represent the dynamics and appearance of the parts, we employ a heterogeneous set of depth- and skeleton-based features. The structure of the multimodal multipart features is built into the learning framework via the proposed hierarchical mixed norm, which regularizes the structured features within each part and applies sparsity between parts, in favor of group feature selection. Our experimental results demonstrate the effectiveness of the proposed learning method: it outperforms competing methods on all three tested datasets and saturates one of them by achieving perfect accuracy.
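The core technical idea above is a hierarchical mixed norm: features are grouped by body part and, within each part, by modality, and the regularizer enforces sparsity across parts while jointly regularizing each part's feature blocks. As a minimal, generic sketch of a group-sparse regression objective of this flavor (the loss \(\mathcal{L}\), weight blocks \(w_{p,m}\), and trade-off parameter \(\lambda\) are illustrative assumptions, not the paper's exact formulation):

\[
\min_{W} \;\; \mathcal{L}(Y, XW) \;+\; \lambda \sum_{p=1}^{P} \sum_{m=1}^{M} \lVert w_{p,m} \rVert_2
\]

where \(p\) indexes the \(P\) body parts and \(m\) the \(M\) feature modalities. Because the inner \(\ell_2\) norms are not squared, the outer sums act like an \(\ell_1\) penalty over whole part-modality blocks, driving many \(w_{p,m}\) exactly to zero; this is the group feature selection the abstract refers to, with each action explained by a sparse subset of body parts.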

Bibliographic Details
Main Authors: Shahroudy, Amir, Ng, Tian-Tsong, Yang, Qingxiong, Wang, Gang
Other Authors: School of Electrical and Electronic Engineering
Format: Journal Article
Language: English
Published: 2016 (deposited in repository 2018)
Subjects: Action Recognition; Kinect
Online Access: https://hdl.handle.net/10356/87094
http://hdl.handle.net/10220/45224
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-87094
Citation: Shahroudy, A., Ng, T.-T., Yang, Q., & Wang, G. (2016). Multimodal multipart learning for action recognition in depth videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(10), 2123-2129.
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
ISSN: 0162-8828
DOI: 10.1109/TPAMI.2015.2505295
Rights: © 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at http://dx.doi.org/10.1109/TPAMI.2015.2505295.
Funding: NRF (National Research Foundation, Singapore); A*STAR (Agency for Science, Technology and Research, Singapore); MOE (Ministry of Education, Singapore)
Version: Accepted version (8 p., application/pdf)
Collection: DR-NTU, NTU Library, Nanyang Technological University, Singapore