Design of energy-efficient convolution neural network accelerator
Main Author: Shao, Yuhan
Other Authors: Kim Tae Hyoung; School of Electrical and Electronic Engineering
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2025
Subjects: Engineering; Internet of Things; Convolutional neural networks; Very large scale integration; Eyeriss architecture; Hardware accelerator; Energy-efficient memory hierarchies; Parallel processing unit
Online Access: https://hdl.handle.net/10356/182747
Institution: Nanyang Technological University
Description:
The rapid growth of Internet of Things (IoT) devices has created a need for efficient, low-power computing solutions that can handle tasks such as image and speech recognition. Convolutional Neural Networks (CNNs) are central to these intelligent tasks because they perform well across many machine learning applications. However, CNNs demand substantial computing power and energy, which makes them difficult to deploy on IoT devices with limited resources. This dissertation addresses these issues by proposing a new low-power CNN hardware accelerator design targeted at IoT applications and implemented with FPGA hardware acceleration.
The motivation for this work comes from the need to run advanced machine learning algorithms directly on edge devices, where power efficiency and speed are crucial. Traditional approaches that rely on cloud-based processing suffer from latency, higher power consumption due to data transfer, and privacy concerns. There is therefore a strong need for on-device, real-time processing that maintains high performance and energy efficiency. The main goal of this dissertation is to design and build a hardware accelerator that significantly reduces the power consumption of CNNs without lowering their accuracy or speed. This involves optimizing both the CNN architecture and the hardware. Techniques such as weight quantization, pruning, and specialized low-power circuits are explored to achieve these goals. Additionally, the design takes advantage of the FPGA's flexibility and parallel processing capabilities to create a compact and efficient accelerator.
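The abstract does not give implementation details for these compression steps, so the following is a rough illustration only: a minimal NumPy sketch of magnitude pruning followed by symmetric INT8 weight quantization, two common techniques of the kind named above. The function `prune_and_quantize` and its parameters are hypothetical and are not taken from the thesis.

```python
# Hypothetical illustration (not from the thesis): magnitude pruning followed by
# symmetric INT8 weight quantization, two common CNN compression techniques.
import numpy as np

def prune_and_quantize(weights: np.ndarray, sparsity: float = 0.5, bits: int = 8):
    """Zero out the smallest-magnitude weights, then quantize to signed integers."""
    # Magnitude pruning: drop the fraction `sparsity` of weights closest to zero.
    threshold = np.quantile(np.abs(weights), sparsity)
    pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

    # Symmetric uniform quantization: map [-max|w|, +max|w|] onto the integer grid.
    qmax = 2 ** (bits - 1) - 1                        # e.g. 127 for INT8
    scale = max(np.max(np.abs(pruned)) / qmax, 1e-12) # guard against all-zero input
    q = np.clip(np.round(pruned / scale), -qmax, qmax).astype(np.int8)
    return q, scale                                   # dequantize with q * scale

# Toy usage: compress a random 3x3 convolution kernel.
kernel = np.random.randn(3, 3).astype(np.float32)
q_kernel, scale = prune_and_quantize(kernel)
```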
A thorough review of existing CNN accelerators and their limitations sets the foundation for the proposed design. This dissertation introduces several new ideas, including energy-efficient memory systems, parallel processing units, and custom dataflow architectures. By combining these features with FPGA hardware acceleration, the proposed accelerator improves both power efficiency and computational performance.
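As a loose illustration of why custom dataflows and memory hierarchies matter, the sketch below shows a tiled 2-D convolution in which an explicit local buffer stands in for on-chip SRAM. It is a generic example written for this record, not the accelerator's actual dataflow; the thesis's subject terms point to an Eyeriss-style design, which is considerably more elaborate.

```python
# Hypothetical sketch (not the thesis's dataflow): a tiled 2-D convolution loop nest.
# Tiling lets a small on-chip buffer be reused across many multiply-accumulates,
# which is the basic idea behind custom dataflows and energy-efficient memory
# hierarchies in CNN accelerators.
import numpy as np

def conv2d_tiled(ifmap: np.ndarray, kernel: np.ndarray, tile: int = 8) -> np.ndarray:
    H, W = ifmap.shape
    K = kernel.shape[0]
    OH, OW = H - K + 1, W - K + 1
    ofmap = np.zeros((OH, OW), dtype=ifmap.dtype)

    # Outer loops walk over output tiles ("off-chip" traffic happens here).
    for ty in range(0, OH, tile):
        for tx in range(0, OW, tile):
            th, tw = min(tile, OH - ty), min(tile, OW - tx)
            # Copy the input patch this tile needs into a local buffer
            # (a stand-in for an on-chip SRAM or register file).
            buf = ifmap[ty:ty + th + K - 1, tx:tx + tw + K - 1].copy()
            # Inner loops reuse the buffered data for every output in the tile.
            for oy in range(th):
                for ox in range(tw):
                    ofmap[ty + oy, tx + ox] = np.sum(buf[oy:oy + K, ox:ox + K] * kernel)
    return ofmap
```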
Exploiting the FPGA's flexibility, the design incorporates dynamic voltage-frequency scaling (DVFS) to lower power consumption to 1.2 W at 200 MHz. This approach achieves energy-efficiency improvements of 3.5× over GPU-based solutions and 1.8× over ASIC alternatives. Additionally, time-multiplexed DSP blocks reduce LUT usage by 38% without impacting throughput.
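The reported 1.2 W at 200 MHz can be put in context with the standard first-order dynamic-power model P ≈ α·C·V²·f that motivates DVFS. The sketch below is a back-of-the-envelope calculation under an assumed 1.0 V nominal operating voltage and an assumed scaled operating point; these numbers are illustrative and do not come from the thesis.

```python
# Hypothetical back-of-the-envelope illustration of why DVFS saves power; the
# voltages and the scaled operating point below are assumptions, not figures
# from the thesis. First-order model: dynamic power P ~ alpha * C * V^2 * f.
def dynamic_power(v: float, f_hz: float, alpha_c: float) -> float:
    """alpha_c lumps activity factor and switched capacitance (alpha * C)."""
    return alpha_c * v ** 2 * f_hz

# Calibrate the lumped constant so the nominal point matches the reported
# 1.2 W at 200 MHz, assuming a 1.0 V supply purely for illustration.
alpha_c = 1.2 / dynamic_power(1.0, 200e6, 1.0)

# Scaling to 100 MHz at 0.85 V would then give roughly
# 1.2 * (0.85**2) * (100/200) ~= 0.43 W under this simple model.
print(dynamic_power(0.85, 100e6, alpha_c))
```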
In summary, this dissertation offers a complete solution to the challenges of deploying CNNs on IoT devices. By focusing on power efficiency and performance with FPGA-based hardware acceleration, the proposed CNN accelerator provides a feasible approach for integrating advanced machine learning capabilities into the next generation of IoT devices. The innovations and findings in this work contribute to the field of low-power hardware design and lay a foundation for future research in energy-efficient computing.