Design of energy-efficient convolution neural network accelerator

The rapid growth of Internet of Things (IoT) devices has created a need for efficient, low-power computing solutions that can handle tasks such as image and speech recognition. Convolutional Neural Networks (CNNs) are central to these intelligent tasks because they perform well across many machine learning applications, but they demand substantial computing power and energy, which makes them difficult to deploy on resource-constrained IoT devices. This dissertation addresses these issues by proposing a low-power CNN hardware accelerator designed specifically for IoT applications and built on FPGA hardware acceleration.

The motivation for this work stems from the need to run advanced machine learning algorithms directly on edge devices, where power efficiency and speed are crucial. Traditional cloud-based processing suffers from latency, extra power consumption due to data transfer, and privacy concerns, so there is a strong need for on-device, real-time processing that maintains high performance and energy efficiency.

The main goal of this dissertation is to design and build a hardware accelerator that significantly reduces the power consumption of CNNs without lowering their accuracy or speed. This involves improving both the CNN architecture and the hardware: techniques such as weight quantization, pruning, and specialized low-power circuits are explored, and the design exploits the FPGA's flexibility and parallel processing capabilities to create a compact, efficient accelerator. A thorough review of existing CNN accelerators and their limitations sets the foundation for the proposed design.

The dissertation introduces several contributions, including energy-efficient memory hierarchies, parallel processing units, and a custom dataflow architecture. Combined with FPGA hardware acceleration, these features improve both power efficiency and computational performance. Exploiting the FPGA's flexibility, the design incorporates dynamic voltage-frequency scaling (DVFS) to lower power consumption to 1.2 W at 200 MHz, achieving energy-efficiency improvements of 3.5× over GPU-based solutions and 1.8× over ASIC alternatives. In addition, time-multiplexed DSP blocks reduce LUT usage by 38% without affecting throughput.

In summary, this dissertation offers a complete solution to the challenges of deploying CNNs on IoT devices. By focusing on power efficiency and performance with FPGA-based hardware acceleration, the proposed CNN accelerator provides a feasible path for integrating advanced machine learning capabilities into the next generation of IoT devices, and its findings contribute to low-power hardware design and future research in energy-efficient computing.
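For context on what the accelerator's parallel processing units and custom dataflow actually compute, a convolution layer reduces to a multiply-accumulate loop nest. The sketch below is a generic reference implementation (stride 1, no padding), not the dissertation's dataflow or tiling; all names and shapes are illustrative.

```python
def conv2d_naive(ifmap, weights, bias):
    """Reference 2D convolution (stride 1, no padding) as a plain loop nest.

    ifmap:   [C][H][W] input feature map
    weights: [M][C][R][S] filters
    bias:    [M] per-output-channel bias
    Returns: [M][H-R+1][W-S+1] output feature map.
    An accelerator's dataflow decides how these loops are tiled and parallelized
    across processing elements; this sketch only defines the arithmetic.
    """
    C, H, W = len(ifmap), len(ifmap[0]), len(ifmap[0][0])
    M, R, S = len(weights), len(weights[0][0]), len(weights[0][0][0])
    out = [[[bias[m] for _ in range(W - S + 1)] for _ in range(H - R + 1)] for m in range(M)]
    for m in range(M):                      # output channels
        for y in range(H - R + 1):          # output rows
            for x in range(W - S + 1):      # output columns
                for c in range(C):          # input channels
                    for r in range(R):      # kernel rows
                        for s in range(S):  # kernel columns
                            out[m][y][x] += weights[m][c][r][s] * ifmap[c][y + r][x + s]
    return out

# Tiny smoke test: one input channel, one 2x2 filter of ones acts as a sliding-window sum.
ifmap = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]
weights = [[[[1, 1], [1, 1]]]]
print(conv2d_naive(ifmap, weights, bias=[0]))  # [[[12, 16], [24, 28]]]
```

A dataflow architecture (for example, the row-stationary scheme of Eyeriss-style designs listed in the subjects) is essentially a choice of how to order and tile these loops so that weights and partial sums are reused from small on-chip buffers rather than off-chip memory.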

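The abstract lists weight quantization among the techniques explored. As an illustration of the general idea only (the thesis's actual quantization scheme and bit width are not given in this record), here is a minimal symmetric 8-bit post-training quantization sketch; the function names and per-tensor scaling are assumptions.

```python
import numpy as np

def quantize_weights_symmetric(weights: np.ndarray, n_bits: int = 8):
    """Illustrative symmetric post-training quantization of a CNN weight tensor.

    Maps float weights to signed integers in [-(2**(n_bits-1) - 1), 2**(n_bits-1) - 1]
    using a single per-tensor scale. Generic sketch, not the dissertation's scheme.
    """
    qmax = 2 ** (n_bits - 1) - 1            # e.g. 127 for 8-bit
    scale = np.max(np.abs(weights)) / qmax  # per-tensor scale factor
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights, e.g. for accuracy checks."""
    return q.astype(np.float32) * scale

# Example: quantize a random 3x3x16x32 convolution kernel and measure the error.
w = np.random.randn(3, 3, 16, 32).astype(np.float32)
q, s = quantize_weights_symmetric(w)
print("max abs quantization error:", np.max(np.abs(w - dequantize(q, s))))
```

Per-channel scales and quantization-aware training are common refinements when the accuracy loss of simple per-tensor 8-bit quantization is too large.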
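The reported 3.5× and 1.8× gains are performance-per-watt comparisons. The snippet below shows only that bookkeeping; the throughput and GPU power values are placeholders, not results from the thesis (the abstract gives only the 1.2 W at 200 MHz figure for the accelerator).

```python
def gops_per_watt(throughput_gops: float, power_w: float) -> float:
    """Energy efficiency expressed as throughput per unit power (GOPS/W)."""
    return throughput_gops / power_w

# Placeholder numbers for illustration; only the 1.2 W figure comes from the abstract.
fpga = gops_per_watt(throughput_gops=100.0, power_w=1.2)    # hypothetical FPGA throughput
gpu = gops_per_watt(throughput_gops=1800.0, power_w=75.0)   # hypothetical GPU baseline
print(f"FPGA: {fpga:.1f} GOPS/W, GPU: {gpu:.1f} GOPS/W, ratio: {fpga / gpu:.1f}x")
```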
Bibliographic Details
Main Author: Shao, Yuhan
Other Authors: Kim Tae Hyoung
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2025
Subjects: Engineering; Internet of Things; Convolutional neural networks; Very large scale integration; Eyeriss architecture; Hardware accelerator; Energy-efficient memory hierarchies; Parallel processing unit
Online Access: https://hdl.handle.net/10356/182747
Citation: Shao, Y. (2025). Design of energy-efficient convolution neural network accelerator. Master's thesis, Nanyang Technological University, Singapore.
Institution: Nanyang Technological University