Progressive channel-shrinking network
Salience-based channel pruning has recently made continuous breakthroughs in network compression. In these methods, a salience mechanism serves as a metric of channel importance to guide pruning, so the channel width can be adjusted dynamically at run-time, which provides a flexible pruning scheme. However, two problems emerge: a gating function is often needed to truncate specific salience entries to zero, which destabilizes forward propagation; and the dynamic architecture incurs extra indexing cost at inference, which bottlenecks inference speed. In this paper, we propose a Progressive Channel-Shrinking (PCS) method that compresses the selected salience entries at run-time instead of roughly approximating them to zero. We also propose a Running Shrinking Policy that yields a testing-static pruning scheme, reducing the memory access cost of filter indexing. We evaluate our method on the ImageNet and CIFAR10 datasets over two prevalent networks, ResNet and VGG, and demonstrate that PCS outperforms all baselines and achieves a state-of-the-art compression-performance tradeoff. Moreover, we observe a significant and practical acceleration of inference. The code will be released upon acceptance.
Main Authors: Pan, Jianhong; Yang, Siyuan; Foo, Lin Geng; Ke, Qiuhong; Rahmani, Hossein; Fan, Zhipeng; Liu, Jun
Other Authors: Interdisciplinary Graduate School (IGS)
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Progressive Network Shrinking
Online Access: https://hdl.handle.net/10356/171831
Institution: Nanyang Technological University
Citation: Pan, J., Yang, S., Foo, L. G., Ke, Q., Rahmani, H., Fan, Z. & Liu, J. (2023). Progressive channel-shrinking network. IEEE Transactions on Multimedia. https://dx.doi.org/10.1109/TMM.2023.3291197
Journal: IEEE Transactions on Multimedia
ISSN: 1520-9210
DOI: 10.1109/TMM.2023.3291197
Scopus: 2-s2.0-85163437770
Funding: This work is supported by MOE AcRF Tier 2 (Proposal ID: T2EP20222-0035), the National Research Foundation Singapore under its AI Singapore Programme (AISG-100E-2020-065), and the SUTD SKI Project (SKI 2021 02 06). It is also supported by TAILOR, a project funded by the EU Horizon 2020 research and innovation programme under GA No 952215.
Rights: © 2023 IEEE. All rights reserved.
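Since the paper's code has not been released, the following is only a minimal sketch of the core idea as the abstract describes it: rather than hard-gating low-salience channels to zero, the pruned entries are multiplied by a shrink factor that anneals toward zero over training. The class name, the learned per-channel salience parameterization, the keep ratio, and the linear decay schedule are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ProgressiveChannelShrink(nn.Module):
    """Illustrative channel-salience module (assumptions, not the paper's code).

    Hard gating would multiply pruned channels by a 0/1 mask, which abruptly
    zeroes activations and can destabilize forward propagation. Here, pruned
    channels are instead scaled by a shrink factor annealed from 1 to 0, so
    they are compressed progressively rather than truncated at once.
    """

    def __init__(self, num_channels: int, keep_ratio: float = 0.5):
        super().__init__()
        # Assumption: one learnable salience scalar per channel.
        self.salience = nn.Parameter(torch.ones(num_channels))
        self.keep_ratio = keep_ratio
        # Shrink factor applied to pruned channels; annealed 1 -> 0 in training.
        self.register_buffer("shrink", torch.tensor(1.0))

    def step_schedule(self, progress: float) -> None:
        """progress in [0, 1]; linearly anneal the shrink factor to zero."""
        self.shrink.fill_(max(0.0, 1.0 - progress))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W). Keep the top-k most salient channels.
        k = max(1, int(self.keep_ratio * x.size(1)))
        topk = torch.topk(self.salience, k).indices
        keep = torch.zeros_like(self.salience, dtype=torch.bool)
        keep[topk] = True
        # Hard gating would use: scale = keep.float() * self.salience.
        # Progressive shrinking compresses pruned entries gradually instead:
        scale = torch.where(keep, self.salience, self.salience * self.shrink)
        return x * scale.view(1, -1, 1, 1)

# Usage (illustrative): anneal the shrink factor over training, e.g.
#   layer = ProgressiveChannelShrink(64)
#   layer.step_schedule(epoch / num_epochs)
```

Once the shrink factor reaches zero, the pruned channels are exactly zero and the keep mask no longer varies, which loosely mirrors the testing-static behavior the paper's Running Shrinking Policy targets: a channel index set that is fixed at inference avoids per-input filter indexing and its associated memory access cost. Note that in the dynamic-pruning setting the paper addresses, salience is typically predicted per input rather than learned as a static parameter as in this sketch.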