Deformable pose traversal convolution for 3D action and gesture recognition

Bibliographic Details
Main Authors: Weng, Junwu, Liu, Mengyuan, Jiang, Xudong, Yuan, Junsong
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/140842
Institution: Nanyang Technological University
Description
Summary: The representation of 3D pose plays a critical role in 3D action and gesture recognition. Rather than representing a 3D pose directly by its joint locations, in this paper, we propose a Deformable Pose Traversal Convolution Network that applies one-dimensional convolution to traverse the 3D pose for its representation. Instead of fixing the receptive field when performing traversal convolution, it optimizes the convolution kernel for each joint by considering contextual joints with various weights. This deformable convolution better utilizes the contextual joints for action and gesture recognition and is more robust to noisy joints. Moreover, by feeding the learned pose feature to an LSTM, we perform end-to-end training that jointly optimizes 3D pose representation and temporal sequence recognition. Experiments on three benchmark datasets validate the competitive performance of our proposed method, as well as its efficiency and robustness to noisy pose joints.
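
The idea of a one-dimensional convolution whose sampling positions shift per joint can be illustrated with a small sketch. This is not the authors' implementation: the record does not state the kernel size, how offsets are learned, or how features are sampled. The sketch assumes that joints are already arranged in a traversal order, that per-joint fractional offsets are supplied by the caller (in a trained network they would be predicted), that sampling at fractional positions uses linear interpolation, and that each kernel tap has a single scalar weight. The names `sample_linear` and `deformable_traversal_conv` are hypothetical.

```python
import math


def sample_linear(seq, pos):
    """Linearly interpolate a joint at fractional index `pos`.

    Positions outside the sequence contribute zeros (zero padding).
    """
    dim = len(seq[0])
    lo = math.floor(pos)
    frac = pos - lo
    out = [0.0] * dim
    for idx, w in ((lo, 1.0 - frac), (lo + 1, frac)):
        if 0 <= idx < len(seq) and w > 0.0:
            for d in range(dim):
                out[d] += w * seq[idx][d]
    return out


def deformable_traversal_conv(joints, weights, offsets):
    """Apply a 1D convolution along a joint traversal with per-joint offsets.

    joints  : list of (x, y, z) tuples in traversal order (assumed given)
    weights : list of K scalar tap weights (assumption: one weight per tap)
    offsets : offsets[t][k] is the fractional shift of tap k at joint t
              (assumption: supplied by the caller, not learned here)
    """
    k_half = len(weights) // 2
    features = []
    for t in range(len(joints)):
        acc = [0.0] * len(joints[0])
        for k, w in enumerate(weights):
            # Regular grid position plus the deformation for this joint/tap.
            pos = t + (k - k_half) + offsets[t][k]
            sampled = sample_linear(joints, pos)
            for d in range(len(acc)):
                acc[d] += w * sampled[d]
        features.append(acc)
    return features


# Toy traversal of three joints along the x-axis.
joints = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
zero_offsets = [[0.0, 0.0, 0.0] for _ in joints]
identity = deformable_traversal_conv(joints, [0.0, 1.0, 0.0], zero_offsets)

# Shift the centre tap of the middle joint half a step forward.
shifted_offsets = [[0.0, 0.0, 0.0], [0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
shifted = deformable_traversal_conv(joints, [0.0, 1.0, 0.0], shifted_offsets)
```

With zero offsets the operation reduces to an ordinary fixed-grid 1D convolution. With nonzero offsets each joint draws on a different neighbourhood of the traversal, which is the sense in which the receptive field is not fixed. In the full model described above, the resulting per-frame pose features would then be passed to an LSTM for sequence recognition.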