AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens

Mamba has recently emerged as a promising architecture for handling long sequences in vision tasks. However, research on pruning methods specifically for vision Mamba remains unexplored. To address this gap, we aim to introduce token pruning for vision Mamba to enhance efficiency, which faces...

Bibliographic Details
Main Author: Song, Shangye
Other Authors: Jiang Xudong
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2025
Subjects: Computer and Information Science; Deep learning
Online Access:https://hdl.handle.net/10356/182039
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-182039
record_format dspace
spelling sg-ntu-dr.10356-1820392025-01-10T15:48:40Z AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens Song, Shangye Jiang Xudong School of Electrical and Electronic Engineering EXDJiang@ntu.edu.sg Computer and Information Science Deep learning Mamba has recently emerged as a promising architecture for handling long sequences in vision tasks. However, research on pruning methods specifically for vision Mamba remains unexplored. To address this gap, we aim to introduce token pruning for vision Mamba to enhance efficiency, which faces several challenges. First, without an attention mechanism, selecting tokens for pruning is less straightforward compared with vision transformers. Second, Mamba’s sequential processing requires re-adjusting the token sequence after pruning to maintain calculation validity. To address these challenges, we propose to Adaptively Prune Tokens for Mamba (AptMamba). Specifically, we design an Adaptive Token Prediction (ATP) module to adaptively assess token importance, generating binary decisions through Gumbel-Softmax. Pruned tokens are set to zero to minimize interference, and a multi-stage, progressive pruning strategy is applied. To account for Mamba’s sequential scanning, we reorder tokens after pruning. Moreover, we present an auxiliary Pruning Reconstruction Decoder (PR-Decoder) during training to adaptively reconstruct pruned tokens, enhancing the representation capacity of the pruned Mamba without introducing extra inference costs. Additionally, we develop a Multi-Stage Pruning Loss (MSP-Loss) to adaptively confirm that pruned tokens contain minimal information, reducing the impact of token pruning. Extensive experiments show that our method can reduce the amount of computation by about 11.8% - 37.5% with 66% progressive token pruning, while maintaining accuracy impact within 1.1% - 3.5%. Master's degree 2025-01-06T05:09:24Z 2025-01-06T05:09:24Z 2024 Thesis-Master by Coursework Song, S. (2024). AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182039 https://hdl.handle.net/10356/182039 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Deep learning
spellingShingle Computer and Information Science
Deep learning
Song, Shangye
AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
description Mamba has recently emerged as a promising architecture for handling long sequences in vision tasks. However, research on pruning methods specifically for vision Mamba remains unexplored. To address this gap, we aim to introduce token pruning for vision Mamba to enhance efficiency, which faces several challenges. First, without an attention mechanism, selecting tokens for pruning is less straightforward compared with vision transformers. Second, Mamba’s sequential processing requires re-adjusting the token sequence after pruning to maintain calculation validity. To address these challenges, we propose to Adaptively Prune Tokens for Mamba (AptMamba). Specifically, we design an Adaptive Token Prediction (ATP) module to adaptively assess token importance, generating binary decisions through Gumbel-Softmax. Pruned tokens are set to zero to minimize interference, and a multi-stage, progressive pruning strategy is applied. To account for Mamba’s sequential scanning, we reorder tokens after pruning. Moreover, we present an auxiliary Pruning Reconstruction Decoder (PR-Decoder) during training to adaptively reconstruct pruned tokens, enhancing the representation capacity of the pruned Mamba without introducing extra inference costs. Additionally, we develop a Multi-Stage Pruning Loss (MSP-Loss) to adaptively confirm that pruned tokens contain minimal information, reducing the impact of token pruning. Extensive experiments show that our method can reduce the amount of computation by about 11.8% - 37.5% with 66% progressive token pruning, while maintaining accuracy impact within 1.1% - 3.5%.
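The pruning pipeline outlined in the abstract (score tokens, make hard keep/prune decisions via Gumbel-Softmax, zero the pruned tokens, then reorder so kept tokens stay contiguous for Mamba's sequential scan) can be sketched roughly as follows. This is a minimal PyTorch illustration under stated assumptions: the scorer, module names, shapes, and the reordering rule are hypothetical stand-ins for readability, not the thesis implementation.

```python
# Hypothetical sketch of the Adaptive Token Prediction (ATP) step described in the
# abstract. All names and shapes here are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ATPModule(nn.Module):
    def __init__(self, dim: int, tau: float = 1.0):
        super().__init__()
        # Small per-token scorer producing logits for {prune, keep}.
        self.scorer = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 2))
        self.tau = tau

    def forward(self, x: torch.Tensor):
        # x: (batch, seq_len, dim) token sequence entering a pruning stage.
        logits = self.scorer(x)                                    # (B, L, 2)
        # Hard binary keep/prune decisions via straight-through Gumbel-Softmax,
        # keeping the selection differentiable during training.
        decision = F.gumbel_softmax(logits, tau=self.tau, hard=True)
        keep_mask = decision[..., 1]                               # (B, L), 1 = keep

        # Zero pruned tokens to minimize their interference with the scan.
        x = x * keep_mask.unsqueeze(-1)

        # Reorder: kept tokens first, pruned (zeroed) tokens last, preserving the
        # relative order of kept tokens so the sequential scan stays valid.
        order = torch.sort(keep_mask, dim=1, descending=True, stable=True).indices
        x = torch.gather(x, 1, order.unsqueeze(-1).expand_as(x))
        return x, keep_mask


# Usage sketch: prune patch tokens before a (hypothetical) Mamba block.
if __name__ == "__main__":
    atp = ATPModule(dim=192)
    tokens = torch.randn(2, 196, 192)          # e.g. 14x14 patch tokens
    pruned_tokens, keep_mask = atp(tokens)
    print(pruned_tokens.shape, keep_mask.sum(dim=1))
```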
author2 Jiang Xudong
author_facet Jiang Xudong
Song, Shangye
format Thesis-Master by Coursework
author Song, Shangye
author_sort Song, Shangye
title AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
title_short AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
title_full AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
title_fullStr AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
title_full_unstemmed AptMamba: enhancing vision Mamba efficiency by adaptively pruning tokens
title_sort aptmamba: enhancing vision mamba efficiency by adaptively pruning tokens
publisher Nanyang Technological University
publishDate 2025
url https://hdl.handle.net/10356/182039
_version_ 1821237151532056576