EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI

Convolutional neural networks (CNNs) have demonstrated encouraging results in image classification tasks. However, the prohibitive computational cost of CNNs hinders their deployment onto resource-constrained embedded devices. To address this issue, we propose EdgeCompress, a comprehensive compression fram...

Bibliographic Details
Main Authors:	Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2023
Subjects:	Engineering::Computer science and engineering; Embedded Systems; Neural Network Compression
Online Access:	https://hdl.handle.net/10356/171623
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-171623
record_format	dspace
spelling	sg-ntu-dr.10356-1716232023-12-15T03:20:53Z EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen School of Computer Science and Engineering; HP-NTU Digital Manufacturing Corporate Lab Engineering::Computer science and engineering; Embedded Systems; Neural Network Compression Convolutional neural networks (CNNs) have demonstrated encouraging results in image classification tasks. However, the prohibitive computational cost of CNNs hinders their deployment onto resource-constrained embedded devices. To address this issue, we propose EdgeCompress, a comprehensive compression framework to reduce the computational overhead of CNNs. In EdgeCompress, we first introduce dynamic image cropping, where we design a lightweight foreground predictor to accurately crop the most informative foreground object of input images for inference, which avoids redundant computation on background regions. Subsequently, we present compound shrinking to collaboratively compress the three dimensions (depth, width, and resolution) of CNNs according to their contribution to accuracy and model computation. Dynamic image cropping and compound shrinking together constitute a multi-dimensional CNN compression framework, which is able to comprehensively reduce the computational redundancy in both input images and neural network architectures, thereby improving the inference efficiency of CNNs. Further, we present a dynamic inference framework to efficiently process input images with different recognition difficulties, where we cascade multiple models with different complexities from our compression framework and dynamically adopt different models for different input images, which further reduces the computational redundancy and improves the inference efficiency of CNNs, facilitating the deployment of advanced CNNs onto embedded hardware. Experiments on ImageNet-1K demonstrate that EdgeCompress reduces the computation of ResNet-50 by 48.8% while improving the top-1 accuracy by 0.8%. Meanwhile, we improve the accuracy by 4.1% with similar computation compared to HRank, the state-of-the-art compression framework. The source code and models are available at Ministry of Education (MOE); Nanyang Technological University Submitted/Accepted version This study is partially supported under the RIE2020 Industry Alignment Fund – Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner, HP Inc., through the HP-NTU Digital Manufacturing Corporate Lab (I1801E0028). This work is also partially supported by the Ministry of Education, Singapore, under its Academic Research Fund Tier 2 (MOE2019-T2-1-071), and Nanyang Technological University, Singapore, under its NAP (M4082282). 2023-11-01T07:20:46Z 2023-11-01T07:20:46Z 2023 Journal Article Kong, H., Liu, D., Huai, S., Luo, X., Subramaniam, R., Makaya, C., Lin, Q. & Liu, W. (2023). EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. https://dx.doi.org/10.1109/TCAD.2023.3276938 0278-0070 https://hdl.handle.net/10356/171623 10.1109/TCAD.2023.3276938 2-s2.0-85162902626 en IAF-ICP I1801E0028 MOE2019-T2-1-071 NAP (M4082282) IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 10.21979/N9/GCAMZH © 2023 IEEE. All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. The Version of Record is available online at http://doi.org/10.1109/TCAD.2023.3276938. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering; Embedded Systems; Neural Network Compression
spellingShingle	Engineering::Computer science and engineering; Embedded Systems; Neural Network Compression; Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen; EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI
description	Convolutional neural networks (CNNs) have demonstrated encouraging results in image classification tasks. However, the prohibitive computational cost of CNNs hinders their deployment onto resource-constrained embedded devices. To address this issue, we propose EdgeCompress, a comprehensive compression framework to reduce the computational overhead of CNNs. In EdgeCompress, we first introduce dynamic image cropping, where we design a lightweight foreground predictor to accurately crop the most informative foreground object of input images for inference, which avoids redundant computation on background regions. Subsequently, we present compound shrinking to collaboratively compress the three dimensions (depth, width, and resolution) of CNNs according to their contribution to accuracy and model computation. Dynamic image cropping and compound shrinking together constitute a multi-dimensional CNN compression framework, which is able to comprehensively reduce the computational redundancy in both input images and neural network architectures, thereby improving the inference efficiency of CNNs. Further, we present a dynamic inference framework to efficiently process input images with different recognition difficulties, where we cascade multiple models with different complexities from our compression framework and dynamically adopt different models for different input images, which further reduces the computational redundancy and improves the inference efficiency of CNNs, facilitating the deployment of advanced CNNs onto embedded hardware. Experiments on ImageNet-1K demonstrate that EdgeCompress reduces the computation of ResNet-50 by 48.8% while improving the top-1 accuracy by 0.8%. Meanwhile, we improve the accuracy by 4.1% with similar computation compared to HRank, the state-of-the-art compression framework. The source code and models are available at
author2	School of Computer Science and Engineering
author_facet	School of Computer Science and Engineering; Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen
format	Article
author	Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen
author_sort	Kong, Hao
title	EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI
title_sort	edgecompress: coupling multi-dimensional model compression and dynamic inference for edgeai
publishDate	2023
url	https://hdl.handle.net/10356/171623
_version_	1787136556246499328
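
The dynamic inference idea in the description (cascading models of different complexity and choosing one per input) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not state how a model is chosen for an image, so the confidence-threshold exit rule, the function name `cascade_predict`, the `threshold` parameter, and the two toy models below are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: a "model" maps an input to a list of class probabilities.
Model = Callable[[Sequence[float]], List[float]]


def cascade_predict(
    models: Sequence[Model], x: Sequence[float], threshold: float = 0.8
) -> Tuple[int, int]:
    """Try models from cheapest to most expensive.

    Assumed exit rule (not taken from the record): accept the first model
    whose top class probability reaches `threshold`; the last model in the
    cascade always answers. Returns (predicted_class, index_of_model_used).
    """
    if not models:
        raise ValueError("the cascade needs at least one model")
    last = len(models) - 1
    for i, model in enumerate(models):
        probs = model(x)
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold or i == last:
            return best, i
    raise AssertionError("unreachable: the last model always answers")


# Toy stand-ins for a small and a large compressed model (illustrative only).
def small_model(x: Sequence[float]) -> List[float]:
    # Confident only on "easy" inputs, here those with a large first feature.
    return [0.9, 0.1] if x[0] > 0.5 else [0.55, 0.45]


def large_model(x: Sequence[float]) -> List[float]:
    return [0.95, 0.05] if x[0] > 0.5 else [0.2, 0.8]


easy = cascade_predict([small_model, large_model], [0.9])
hard = cascade_predict([small_model, large_model], [0.1])
```

With these toy models, the easy input is answered by the small model alone, while the hard input falls through to the large one, so the extra computation is spent only where the cheap model is unsure.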

