Design of energy-efficient convolution neural network accelerator
Main Author: Shao, Yuhan
Other Authors: Kim Tae Hyoung; School of Electrical and Electronic Engineering
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2025
Subjects: Engineering; Internet of Things; Convolutional neural networks; Very large scale integration; Eyeriss architecture; Hardware accelerator; Energy-efficient memory hierarchies; Parallel processing unit
Online Access: https://hdl.handle.net/10356/182747
Institution: Nanyang Technological University
Description:
The rapid growth of Internet of Things (IoT) devices has created a need for efficient, low-power computing solutions that can handle tasks such as image and speech recognition. Convolutional Neural Networks (CNNs) are central to these intelligent tasks because they perform well across many machine learning applications. However, CNNs demand substantial computing power and energy, which makes them difficult to deploy on IoT devices with limited resources. This dissertation addresses these issues by proposing a new low-power CNN hardware accelerator design targeted at IoT applications and implemented with FPGA hardware acceleration.
The motivation for this work comes from the need to run advanced machine learning algorithms directly on edge devices, where power efficiency and speed are crucial. Traditional approaches that rely on cloud-based processing suffer from latency, higher power consumption due to data transfer, and privacy concerns. There is therefore a strong need for on-device, real-time processing that maintains high performance and energy efficiency. The main goal of this dissertation is to design and build a hardware accelerator that significantly reduces the power consumption of CNNs without lowering their accuracy or speed. This involves optimizing both the CNN architecture and the hardware. Techniques such as weight quantization, pruning, and specialized low-power circuits are explored to achieve these goals. Additionally, the design takes advantage of the FPGA's flexibility and parallel processing capabilities to create a compact and efficient accelerator.
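The abstract does not give implementation details for these compression steps, so the following is a rough illustration only: a minimal NumPy sketch of magnitude pruning followed by symmetric INT8 weight quantization, two common techniques of the kind named above. The function `prune_and_quantize` and its parameters are hypothetical and are not taken from the thesis.

```python
# Hypothetical illustration (not from the thesis): magnitude pruning followed by
# symmetric INT8 weight quantization, two common CNN compression techniques.
import numpy as np

def prune_and_quantize(weights: np.ndarray, sparsity: float = 0.5, bits: int = 8):
    """Zero out the smallest-magnitude weights, then quantize to signed integers."""
    # Magnitude pruning: drop the fraction `sparsity` of weights closest to zero.
    threshold = np.quantile(np.abs(weights), sparsity)
    pruned = np.where(np.abs(weights) < threshold, 0.0, weights)

    # Symmetric uniform quantization: map [-max|w|, +max|w|] onto the integer grid.
    qmax = 2 ** (bits - 1) - 1                        # e.g. 127 for INT8
    scale = max(np.max(np.abs(pruned)) / qmax, 1e-12) # guard against all-zero input
    q = np.clip(np.round(pruned / scale), -qmax, qmax).astype(np.int8)
    return q, scale                                   # dequantize with q * scale

# Toy usage: compress a random 3x3 convolution kernel.
kernel = np.random.randn(3, 3).astype(np.float32)
q_kernel, scale = prune_and_quantize(kernel)
```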
A thorough review of existing CNN accelerators and their limitations sets the foundation for the proposed design. This dissertation introduces several new ideas, including energy-efficient memory systems, parallel processing units, and custom dataflow architectures. By combining these features with FPGA hardware acceleration, the proposed accelerator improves both power efficiency and computational performance.
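As a loose illustration of why custom dataflows and memory hierarchies matter, the sketch below shows a tiled 2-D convolution in which an explicit local buffer stands in for on-chip SRAM. It is a generic example written for this record, not the accelerator's actual dataflow; the thesis's subject terms point to an Eyeriss-style design, which is considerably more elaborate.

```python
# Hypothetical sketch (not the thesis's dataflow): a tiled 2-D convolution loop nest.
# Tiling lets a small on-chip buffer be reused across many multiply-accumulates,
# which is the basic idea behind custom dataflows and energy-efficient memory
# hierarchies in CNN accelerators.
import numpy as np

def conv2d_tiled(ifmap: np.ndarray, kernel: np.ndarray, tile: int = 8) -> np.ndarray:
    H, W = ifmap.shape
    K = kernel.shape[0]
    OH, OW = H - K + 1, W - K + 1
    ofmap = np.zeros((OH, OW), dtype=ifmap.dtype)

    # Outer loops walk over output tiles ("off-chip" traffic happens here).
    for ty in range(0, OH, tile):
        for tx in range(0, OW, tile):
            th, tw = min(tile, OH - ty), min(tile, OW - tx)
            # Copy the input patch this tile needs into a local buffer
            # (a stand-in for an on-chip SRAM or register file).
            buf = ifmap[ty:ty + th + K - 1, tx:tx + tw + K - 1].copy()
            # Inner loops reuse the buffered data for every output in the tile.
            for oy in range(th):
                for ox in range(tw):
                    ofmap[ty + oy, tx + ox] = np.sum(buf[oy:oy + K, ox:ox + K] * kernel)
    return ofmap
```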
Exploiting the FPGA's flexibility, the design incorporates dynamic voltage-frequency scaling (DVFS) to lower power consumption to 1.2 W at 200 MHz. This approach achieves energy-efficiency improvements of 3.5× over GPU-based solutions and 1.8× over ASIC alternatives. Additionally, time-multiplexed DSP blocks reduce LUT usage by 38% without impacting throughput.
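The reported 1.2 W at 200 MHz can be put in context with the standard first-order dynamic-power model P ≈ α·C·V²·f that motivates DVFS. The sketch below is a back-of-the-envelope calculation under an assumed 1.0 V nominal operating voltage and an assumed scaled operating point; these numbers are illustrative and do not come from the thesis.

```python
# Hypothetical back-of-the-envelope illustration of why DVFS saves power; the
# voltages and the scaled operating point below are assumptions, not figures
# from the thesis. First-order model: dynamic power P ~ alpha * C * V^2 * f.
def dynamic_power(v: float, f_hz: float, alpha_c: float) -> float:
    """alpha_c lumps activity factor and switched capacitance (alpha * C)."""
    return alpha_c * v ** 2 * f_hz

# Calibrate the lumped constant so the nominal point matches the reported
# 1.2 W at 200 MHz, assuming a 1.0 V supply purely for illustration.
alpha_c = 1.2 / dynamic_power(1.0, 200e6, 1.0)

# Scaling to 100 MHz at 0.85 V would then give roughly
# 1.2 * (0.85**2) * (100/200) ~= 0.43 W under this simple model.
print(dynamic_power(0.85, 100e6, alpha_c))
```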
In summary, this dissertation offers a complete solution to the challenges of deploying CNNs on IoT devices. By focusing on power efficiency and performance with FPGA-based hardware acceleration, the proposed CNN accelerator provides a feasible approach for integrating advanced machine learning capabilities into the next generation of IoT devices. The innovations and findings in this work contribute to the field of low-power hardware design and lay a foundation for future research in energy-efficient computing.