MLP-3D: A MLP-like 3D architecture with grouped time mixing
Convolutional Neural Networks (CNNs) have been regarded as the go-to models for visual recognition. More recently, convolution-free networks, based on multi-head self-attention (MSA) or multi-layer perceptrons (MLPs), have become increasingly popular. Nevertheless, it is not trivial when utilizing the...
Main Authors: | QIU, Zhaofan, YAO, Ting, NGO, Chong-wah, MEI, Tao |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2022 |
Online Access: | https://ink.library.smu.edu.sg/sis_research/7505 https://ink.library.smu.edu.sg/context/sis_research/article/8508/viewcontent/Qiu_MLP_3D_A_MLP_Like_3D_Architecture_With_Grouped_Time_Mixing_CVPR_2022_paper.pdf |
Institution: | Singapore Management University |
Similar Items
- PosMLP-Video: Spatial and temporal relative position encoding for efficient video recognition
  by: HAO, Yanbin, et al.
  Published: (2023)
- Architecture analysis of MLP by geometrical interpretation
  by: Xiang, C., et al.
  Published: (2014)
- Geometrical interpretation and architecture selection of MLP
  by: Xiang, C., et al.
  Published: (2014)
- Dynamic temporal filtering in video models
  by: LONG, Fuchen, et al.
  Published: (2022)
- Condensing a sequence to one informative frame for video recognition
  by: QIU, Zhaofan, et al.
  Published: (2021)