MetaFormer is actually what you need for vision

Transformers have shown great potential in computer vision tasks. A common belief is that their attention-based token mixer module contributes most to their competence. However, recent works show that the attention-based module in transformers can be replaced by spatial MLPs and the resulting models still perf...

Bibliographic Details
Main Authors:	YU, Weihao, LUO, Mi, ZHOU, Pan, SI, Chenyang, ZHOU, Yichen, WANG, Xinchao, FENG, Jiashi, YAN, Shuicheng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Computer vision; Computational modeling; Focusing; Computer architecture; Transformers; Pattern recognition; Task analysis; Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/8983 https://ink.library.smu.edu.sg/context/sis_research/article/9986/viewcontent/2022_CVPR_MetaFormer.pdf
Institution:	Singapore Management University

Similar Items

MetaFormer baselines for vision
by: YU, Weihao, et al.
Published: (2023)

InceptionNeXt: When Inception meets ConvNeXt
by: YU, Weihao, et al.
Published: (2024)

DualFormer: Local-global stratified transformer for efficient video recognition
by: LIANG, Yuxuan, et al.
Published: (2022)

STPrivacy: Spatio-temporal privacy-preserving action recognition
by: LI, Ming, et al.
Published: (2023)

Efficient meta learning via minibatch proximal update
by: ZHOU, Pan, et al.
Published: (2019)

Low-shot Object Detection via Classification Refinement.
by: Li, Yiting, et al.
Published: (2020)

Hough-based model for recognizing bar charts in document images
by: Yan Ping Zhou, et al.
Published: (2013)

Sparse representation for computer vision and pattern recognition
by: Wright, J., et al.
Published: (2014)

The spatially-correlative loss for various image translation tasks
by: Zheng, Chuanxia, et al.
Published: (2021)

Deep adversarial subspace clustering
by: ZHOU, Pan, et al.
Published: (2018)

Investigating biological feature detectors in simple pattern recognition towards complex saliency prediction tasks
by: Cordel, Macario O., II
Published: (2018)

Texture aware image segmentation using graph cuts and active contours
by: Zhou, Hailing, et al.
Published: (2013)

Position-guided text prompt for vision-language pre-training
by: WANG, Alex Jinpeng, et al.
Published: (2023)

Computer Vision and Computer Graphics. Theory and Applications
Published: (2017)

Development and implementation of vision-based techniques for target identification
by: Thenmugilan Gandhy
Published: (2015)

Vision-based hand pose estimation and gesture recognition
by: Liang, Hui
Published: (2015)

Texture search engine
by: Low, Wei Lian.
Published: (2011)

3D content recovery and complexity reduction
by: Tan, Cheen Hau
Published: (2015)

Faster first-order methods for stochastic non-convex optimization on Riemannian manifolds
by: ZHOU, Pan, et al.
Published: (2019)

Vision system for hand gesture recognition
by: Ilao, Joel P., et al.
Published: (2007)

Neural networks as applied to vision systems in recognizing 3-D objects in different orientations
by: Choi, Tsun Kit, et al.
Published: (1993)

Towards understanding why mask reconstruction pretraining helps in downstream tasks
by: PAN, Jiachun, et al.
Published: (2023)

Video graph transformer for video question answering
by: XIAO, Junbin, et al.
Published: (2022)

Perspective on and re-orientation of physical proxies in object-focused remote collaboration
by: FEICK, Martin, et al.
Published: (2018)

Deep learning for snake pattern detection
by: Ching, Jia Chin
Published: (2020)

Wireless and mobile localization and tracking
by: Lim, Teng Chuen
Published: (2020)

Fraction-score: a new support measure for co-location pattern mining
by: Koh, Wei Hao
Published: (2020)

Brain-computer interface for mental attention
by: Fatemeh Fahimi
Published: (2020)

Aircraft engine turbine RUL prediction using NADINE
by: Tsang, Aloysius Jin Hou
Published: (2020)

Cluster analysis on dynamic graphs
by: Nguyen, Ngoc Khanh
Published: (2020)

Hand gesture recognition using RF-sensing
by: Tan, Sheng Rong
Published: (2021)

Behavioural-based malware detection on android phones
by: Kyran Ming Kuttan
Published: (2021)

Pattern recognition and forecasting from multiple financial time series data and news
by: Yee Aung, Su Wai
Published: (2021)

Thumbmark recognition system
by: Antonio, Percival S., et al.
Published: (1993)

An adaptive dropout based deep metric learning algorithm
by: Tan, Ronald Tay Siang
Published: (2022)

Software development of pattern recognition and forecasting on financial time series data and news
by: Huang, YuHang
Published: (2022)

Neural networks based pattern classification system for information extraction on disaster news
by: Li, Qi
Published: (2022)

Weather prediction with machine learning
by: Liew, Hon Weng
Published: (2023)

Multi-level probabilistic uniqueness reasoning of autonomous robots based on spatial-semantic fusion
by: Yang, Chule
Published: (2019)

Boosting knowledge distillation and interpretability
by: Song, Huan
Published: (2021)