AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
Saved in:

Main Author: | Song, Shangye
---|---
Other Authors: | Jiang Xudong
Format: | Thesis-Master by Coursework
Language: | English
Published: | Nanyang Technological University, 2025
Subjects: | Computer and Information Science; Deep learning
Online Access: | https://hdl.handle.net/10356/182039
Institution: | Nanyang Technological University
id | sg-ntu-dr.10356-182039
record_format | dspace
spelling |
AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
Author: Song, Shangye
Supervisor: Jiang Xudong (EXDJiang@ntu.edu.sg), School of Electrical and Electronic Engineering
Subjects: Computer and Information Science; Deep learning
Degree: Master's degree (Thesis-Master by Coursework)
Dates: deposited 2025-01-06; issued 2024; last indexed 2025-01-10
Citation: Song, S. (2024). AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182039 |
institution | Nanyang Technological University
building | NTU Library
continent | Asia
country | Singapore
content_provider | NTU Library
collection | DR-NTU
language | English
topic | Computer and Information Science; Deep learning
description |
Mamba has recently emerged as a promising architecture for handling long sequences in vision tasks. However, pruning methods designed specifically for vision Mamba remain unexplored. To address this gap, we introduce token pruning for vision Mamba to enhance efficiency, which faces several challenges. First, without an attention mechanism, selecting tokens for pruning is less straightforward than in vision transformers. Second, Mamba's sequential processing requires re-adjusting the token sequence after pruning to keep the computation valid. To address these challenges, we propose to Adaptively Prune Tokens for Mamba (AptMamba). Specifically, we design an Adaptive Token Prediction (ATP) module to adaptively assess token importance, generating binary keep/prune decisions through Gumbel-Softmax. Pruned tokens are set to zero to minimize interference, and a multi-stage, progressive pruning strategy is applied. To account for Mamba's sequential scanning, we reorder tokens after pruning. Moreover, we present an auxiliary Pruning Reconstruction Decoder (PR-Decoder) during training to adaptively reconstruct pruned tokens, enhancing the representation capacity of the pruned Mamba without introducing extra inference cost. Additionally, we develop a Multi-Stage Pruning Loss (MSP-Loss) to adaptively ensure that pruned tokens contain minimal information, reducing the impact of token pruning. Extensive experiments show that our method reduces computation by about 11.8%-37.5% with 66% progressive token pruning, while keeping the accuracy drop within 1.1%-3.5%. |
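Two mechanical steps in the abstract are concrete enough to illustrate: sampling hard keep/prune decisions with Gumbel-Softmax, and zeroing pruned tokens before reordering the kept ones to the front so a sequential scan sees a contiguous run. The following is a minimal NumPy sketch under stated assumptions, not the thesis's implementation: the function names, the two-column keep/prune logit layout, and the front-packing reorder are all illustrative choices.

```python
import numpy as np

def gumbel_softmax_hard(logits, tau=1.0, rng=None):
    """Straight-through-style Gumbel-Softmax sampling (hard decisions only).

    logits: (N, 2) per-token scores, column 0 = "keep", column 1 = "prune"
    (an assumed layout). Returns a boolean keep-mask of length N.
    """
    rng = rng or np.random.default_rng(0)
    # Gumbel(0, 1) noise: -log(-log(U)) with U ~ Uniform(0, 1)
    g = -np.log(-np.log(rng.uniform(1e-9, 1.0, logits.shape)))
    y = (logits + g) / tau
    # softmax over the keep/prune axis (the soft values would carry
    # gradients in a real framework; here we only take the hard argmax)
    y = np.exp(y - y.max(axis=-1, keepdims=True))
    y /= y.sum(axis=-1, keepdims=True)
    return y.argmax(axis=-1) == 0  # True where "keep" wins

def prune_and_reorder(tokens, keep_mask):
    """Zero out pruned tokens, then pack kept tokens to the front of the
    sequence (preserving their original order) so a sequential scan
    processes them contiguously."""
    pruned = np.where(keep_mask[:, None], tokens, 0.0)
    # argsort on the inverted mask: kept (False) sorts before pruned (True);
    # kind="stable" preserves the original token order within each group
    order = np.argsort(~keep_mask, kind="stable")
    return pruned[order]

# toy usage: 6 tokens with 4-dim features
tokens = np.arange(24, dtype=float).reshape(6, 4)
keep = np.array([True, False, True, True, False, True])
out = prune_and_reorder(tokens, keep)
# kept tokens 0, 2, 3, 5 move to the front; two all-zero rows trail
```

In a trainable version the argmax would be replaced by a straight-through estimator so gradients flow through the soft probabilities, which is the usual role of Gumbel-Softmax in differentiable token selection.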
author | Song, Shangye
author2 | Jiang Xudong
format | Thesis-Master by Coursework
title | AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
publisher | Nanyang Technological University
publishDate | 2025
url | https://hdl.handle.net/10356/182039
_version_ | 1821237151532056576