Combining pose-invariant kinematic features and object context features for RGB-D action recognition

Action recognition using RGB-D cameras is a popular research topic. Recognising actions in a pose-invariant manner is very challenging due to view changes, posture changes and huge intra-class variations. This study aims to propose a novel pose-invariant action recognition framework based on kinemat...

Full description

Saved in:
Bibliographic Details
Main Authors: Ramanathan, Manoj, Kochanowicz, Jaroslaw, Thalmann, Nadia Magnenat
Other Authors: Institute for Media Innovation (IMI)
Format: Article
Language:English
Published: 2019
Subjects:
Online Access:https://hdl.handle.net/10356/102468
http://hdl.handle.net/10220/49519
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Action recognition using RGB-D cameras is a popular research topic. Recognising actions in a pose-invariant manner is very challenging due to view changes, posture changes and huge intra-class variations. This study aims to propose a novel pose-invariant action recognition framework based on kinematic features and object context features. Using RGB, depth and skeletal joints, the proposed framework extracts a novel set of pose-invariant motion kinematic features based on 3D scene flow and captures the motion of body parts with respect to the body. The obtained features are converted to a human body centric space that allows partial viewinvariant recognition of actions. The proposed pose-invariant kinematic features are extracted for both foreground (RGB and depth) and skeleton joints and separate classifiers are trained. Bordacount based classifier decision fusion is employed to obtain an action recognition result. For capturing object context features, a convolutional neural network (CNN) classifier is proposed to identify the involved objects. The proposed context features also include temporal information on object interaction and help in obtaining a final action recognition. The proposed framework works even with non-upright human postures and allows simultaneous action recognition for multiple people, which are topics that remain comparatively unresearched. The performance and robustness of the proposed pose-invariant action recognition framework are tested on several benchmark datasets. We also show that the proposed method works in real-time.