Graph convolution network based skeleton action recognition with DCT features
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/172751 |
Institution: | Nanyang Technological University |
Summary: | Human Action Recognition (HAR), which aims to decipher human movements from video, has been an important research topic in computer vision for many years, as it serves as the foundation for many innovative technologies and applications. While most recent HAR research has focused on applying Graph Convolutional Networks (GCNs) to the skeleton modality, little attention has been paid to the frequency representation of skeleton data. In this project, our objective is to study the effect of using frequency-domain skeleton features to perform HAR with GCNs. To this end, we first conduct a thorough review of current approaches to HAR and frequency analysis. Inspired by research on attention mechanisms, we propose combining channel attention with the 2-D Discrete Cosine Transform (DCT) in a general-purpose network layer that exploits the frequency information in skeleton data and can be inserted into existing GCNs to improve classification accuracy. Using the NTU-RGBD dataset, we conducted experiments with three advanced GCN-based models as baselines. The experimental results show that adding the proposed layer improved the action classification accuracy of all three baseline models. The improved performance indicates the effectiveness of frequency information for skeleton-based action recognition, as well as the potential of attention mechanisms for exploiting it. |
---|
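
The summary describes a layer that combines channel attention with a 2-D DCT over skeleton features. The record does not give the layer's actual design, so the following is only a minimal NumPy sketch of one way such a layer could work, under stated assumptions: features are shaped (channels, frames, joints), the 2-D DCT is taken over the frame and joint axes, and the channel descriptor is the mean magnitude of a low-frequency block of coefficients. The function names, the `keep` parameter, and the bottleneck weights `w1` and `w2` are hypothetical and are not taken from the project.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis of size n x n (row k is the k-th cosine basis vector)."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    basis = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    basis[0] /= np.sqrt(2.0)  # the DC row uses scale sqrt(1/n)
    return basis

def dct2(x):
    """2-D DCT over the last two axes (assumed here to be frames and joints)."""
    d_t = dct_matrix(x.shape[-2])
    d_v = dct_matrix(x.shape[-1])
    return d_t @ x @ d_v.T

def dct_channel_attention(x, w1, w2, keep=(4, 4)):
    """Reweight the channels of x (C, T, V) using low-frequency DCT magnitudes.

    Assumptions (hypothetical, for illustration only):
      w1: (C_hidden, C) and w2: (C, C_hidden) are bottleneck weights.
      keep: size of the low-frequency coefficient block that is pooled.
    """
    coeffs = dct2(x)                                 # (C, T, V)
    low = coeffs[:, :keep[0], :keep[1]]              # low-frequency block
    desc = np.abs(low).mean(axis=(1, 2))             # one descriptor per channel
    hidden = np.maximum(w1 @ desc, 0.0)              # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))      # sigmoid gate in (0, 1)
    return x * gate[:, None, None], gate

# Illustrative input: 8 channels, 16 frames, 25 joints (sizes chosen arbitrarily).
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16, 25))
w1 = rng.standard_normal((2, 8)) * 0.1
w2 = rng.standard_normal((8, 2)) * 0.1
y, gate = dct_channel_attention(x, w1, w2)
print(y.shape, gate.shape)
```

Because the output has the same shape as the input, a layer of this kind could in principle be placed between existing blocks of a GCN without changing the surrounding architecture. In a trained model the weights would be learned rather than drawn at random as they are here.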