SkeFi: cross-modal knowledge transfer for wireless skeleton-based action recognition
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/174832 |
Institution: | Nanyang Technological University |
Summary: | Skeleton-based action recognition mitigates the loss of classification accuracy caused by background clutter. However, prevalent skeleton datasets rely predominantly on cameras to capture RGB frames and annotate skeletal keypoints, which makes them susceptible to ambient lighting fluctuations and raises potential privacy concerns. To mitigate these challenges, non-invasive sensors such as LiDAR and mmWave offer a viable alternative for wireless human sensing. Nevertheless, the small size of the datasets available for these sensing methods makes direct application of RGB-based skeleton action classification models suboptimal. Moreover, keypoints extracted from non-invasive data are less precise than those from RGB, so the resulting skeleton data suffers from noise and information loss. To address these issues, our framework, named SkeFi, applies cross-modal knowledge transfer from the data-rich RGB modality to our classification task. To handle specific instances of information loss, we integrate the enhanced Temporal Correlation Adaptive Graph Convolution (TC-AGC). Additionally, we show that multiscale temporal modeling can be strengthened by integrating ESPNet modules. By combining TC-AGC with this improved temporal modeling and applying transfer learning, our framework achieves superior performance across three modalities (RGB, mmWave, and LiDAR). |
---|