DualFormer: Local-global stratified transformer for efficient video recognition
While transformers have shown great potential for video recognition thanks to their strong capability of capturing long-range dependencies, they often suffer from high computational costs induced by self-attention over the huge number of 3D tokens. In this paper, we present a new transformer architecture ter...
Main Authors: | LIANG, Yuxuan, ZHOU, Pan, ZIMMERMANN, Roger, YAN, Shuicheng |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2022 |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8980 https://ink.library.smu.edu.sg/context/sis_research/article/9983/viewcontent/2022_ECCV_DualFormer.pdf |
Institution: | Singapore Management University |
Similar Items
- Video graph transformer for video question answering
  by: XIAO, Junbin, et al.
  Published: (2022)
- Long-term leap attention, short-term periodic shift for video classification
  by: ZHANG, Hao, et al.
  Published: (2022)
- MetaFormer baselines for vision
  by: YU, Weihao, et al.
  Published: (2023)
- Contrastive video question answering via video graph transformer
  by: XIAO, Junbin, et al.
  Published: (2023)
- MetaFormer is actually what you need for vision
  by: YU, Weihao, et al.
  Published: (2022)