EdgeCompress: coupling multi-dimensional model compression and dynamic inference for EdgeAI

Bibliographic Details
Main Authors:	Kong, Hao; Liu, Di; Huai, Shuo; Luo, Xiangzhong; Subramaniam, Ravi; Makaya, Christian; Lin, Qian; Liu, Weichen
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2023
Subjects:	Engineering::Computer science and engineering; Embedded Systems; Neural Network Compression
Online Access:	https://hdl.handle.net/10356/171623

Description
Abstract:	Convolutional neural networks (CNNs) have demonstrated encouraging results in image classification tasks. However, the prohibitive computational cost of CNNs hinders their deployment onto resource-constrained embedded devices. To address this issue, we propose EdgeCompress, a comprehensive compression framework to reduce the computational overhead of CNNs. In EdgeCompress, we first introduce dynamic image cropping, where we design a lightweight foreground predictor to accurately crop the most informative foreground object of input images for inference, which avoids redundant computation on background regions. Subsequently, we present compound shrinking to collaboratively compress the three dimensions (depth, width, and resolution) of CNNs according to their contribution to accuracy and model computation. Dynamic image cropping and compound shrinking together constitute a multi-dimensional CNN compression framework, which is able to comprehensively reduce the computational redundancy in both input images and neural network architectures, thereby improving the inference efficiency of CNNs. Further, we present a dynamic inference framework to efficiently process input images with different recognition difficulties, where we cascade multiple models with different complexities from our compression framework and dynamically adopt different models for different input images, which further compresses the computational redundancy and improves the inference efficiency of CNNs, facilitating the deployment of advanced CNNs onto embedded hardware. Experiments on ImageNet-1K demonstrate that EdgeCompress reduces the computation of ResNet-50 by 48.8% while improving the top-1 accuracy by 0.8%. Meanwhile, we improve the accuracy by 4.1% with similar computation compared to HRank, the state-of-the-art compression framework. The source code and models are available at
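
The dynamic inference idea in the abstract (cascading models of different complexity and choosing one per input image) can be illustrated with a minimal sketch. The abstract does not say how recognition difficulty is judged, so the exit rule here is an assumption: a stage's prediction is accepted once its top-1 softmax probability reaches a threshold. The names `cascade_predict`, `small_model`, `large_model`, and the threshold value 0.8 are hypothetical and are not taken from the EdgeCompress source code.

```python
import math
from typing import Callable, List, Sequence, Tuple

# A "model" here is any callable mapping an input to a list of class logits.
Model = Callable[[object], Sequence[float]]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cascade_predict(
    image: object, models: Sequence[Model], threshold: float = 0.8
) -> Tuple[int, float, int]:
    """Run models from cheapest to most expensive.

    Assumed exit rule (not confirmed by the abstract): stop at the first
    stage whose top-1 probability is at least `threshold`; the last stage
    always answers. Returns (predicted class, confidence, stage index).
    """
    if not models:
        raise ValueError("cascade needs at least one model")
    last = len(models) - 1
    for stage, model in enumerate(models):
        probs = softmax(model(image))
        confidence = max(probs)
        label = probs.index(confidence)
        if confidence >= threshold or stage == last:
            return label, confidence, stage
    raise AssertionError("unreachable: the last stage always returns")


# Hypothetical stand-ins for a compressed model and a larger model.
def small_model(image: object) -> Sequence[float]:
    # Confident on "easy" inputs, nearly uniform on anything else.
    return [5.0, 0.0, 0.0] if image == "easy" else [1.0, 0.9, 0.8]


def large_model(image: object) -> Sequence[float]:
    return [0.0, 6.0, 0.0]


easy_result = cascade_predict("easy", [small_model, large_model])
hard_result = cascade_predict("hard", [small_model, large_model])
```

In this sketch the "easy" input exits at the small model, while the "hard" input falls through to the large one, so the expensive model runs only when the cheap one is unsure. The threshold trades accuracy against average computation and would need tuning on held-out data.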
