Learning temporal dynamics in videos with image transformer
Temporal dynamics represent the evolution of video content over time and are critical for action recognition. In this paper, we ask the question: can the off-the-shelf image transformer architecture learn temporal dynamics in videos? To this end, we propose Multidimensional Stacked Image (MSImage)...
Main Authors: | SHU, Yan, QIU, Z, LONG, Fuchen, YAO, Ting, NGO, Chong-wah, MEI, Tao |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2024 |
Online Access: | https://ink.library.smu.edu.sg/sis_research/9860 |
Institution: | Singapore Management University |
Similar Items
- TRANSFORMER TECHNIQUES FOR HUMAN ACTION RECOGNITION AND LOCALIZATION
  by: CHANG SHUNING
  Published: (2024)
- Real-time human action recognition by luminance field trajectory analysis
  by: Li, Z., et al.
  Published: (2014)
- Architecture for 3D Convolutional Neural Networks Based on Temporal Similarity Removal
  by: WATHUTHANTHRIGE UDARI CHARITHA DE ALWIS, et al.
  Published: (2023)
- A distribution based video representation for human action recognition
  by: Song, Y., et al.
  Published: (2013)
- Exploring probabilistic localized video representation for human action recognition
  by: Song, Y., et al.
  Published: (2014)