Design of energy-efficient convolution neural network accelerator
Main Author:
Other Authors:
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University, 2025
Subjects:
Online Access: https://hdl.handle.net/10356/182747
Institution: Nanyang Technological University
Summary: The rapid growth of Internet of Things (IoT) devices has created a need for efficient, low-power computing solutions that can handle tasks like image and speech recognition. Convolutional Neural Networks (CNNs) are key for these intelligent tasks because they perform well in many machine learning applications. However, CNNs require substantial computing power and energy, making them difficult to deploy on IoT devices with limited resources. This dissertation addresses these issues by proposing a new design for a low-power CNN hardware accelerator for IoT applications, using FPGA hardware acceleration.
The motivation for this work comes from the need to run advanced machine learning algorithms directly on edge devices, where power efficiency and speed are crucial. Traditional methods that rely on cloud-based processing suffer from latency, higher power consumption due to data transfer, and privacy concerns. There is therefore a strong need for on-device, real-time processing that maintains high performance and energy efficiency. The main goal of this dissertation is to design and build a hardware accelerator that significantly reduces the power consumption of CNNs without lowering their accuracy or speed. This involves improving both the CNN architecture and the hardware. Techniques such as weight quantization, pruning, and specialized low-power circuits are explored to achieve these goals. Additionally, the design takes advantage of the FPGA's flexibility and parallel processing capabilities to create a compact and efficient accelerator.
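The abstract does not describe the specific quantization scheme explored; as a rough, hypothetical illustration of the general technique, the sketch below applies symmetric uniform 8-bit weight quantization, one common way to shrink CNN weights for low-power hardware. The function name, bit width, and tensor shape are illustrative assumptions, not details taken from the dissertation.

```python
import numpy as np

def quantize_weights(weights: np.ndarray, num_bits: int = 8):
    """Symmetric uniform quantization of a weight tensor to signed 8-bit integers.

    Generic illustration only: the dissertation's actual scheme (bit width,
    symmetric vs. asymmetric, per-channel scaling, etc.) is not given in the abstract.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for 8-bit signed
    scale = np.max(np.abs(weights)) / qmax    # map the largest-magnitude weight to qmax
    q = np.clip(np.round(weights / scale), -qmax, qmax).astype(np.int8)
    return q, scale                           # integer weights + dequantization scale

# Example: quantize a hypothetical 3x3 convolution kernel and check the error.
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
w_q, s = quantize_weights(w)
w_hat = w_q.astype(np.float32) * s
print("max abs quantization error:", np.max(np.abs(w - w_hat)))
```

Pruning would complement such quantization by zeroing small-magnitude weights so the hardware can skip the corresponding multiplications.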
A thorough review of existing CNN accelerators and their limitations sets the foundation for the proposed design. This dissertation introduces several new ideas, including energy-efficient memory systems, parallel processing units, and custom dataflow architectures. By combining these features with FPGA hardware acceleration, the proposed accelerator improves both power efficiency and computational performance.
Exploiting FPGA flexibility, the design incorporates dynamic voltage-frequency scaling
(DVFS) to lower power consumption to 1.2 W at 200 MHz. This approach achieves
energy efficiency improvements of 3.5× over GPU-based solutions and 1.8× over ASIC
alternatives. Additionally, time-multiplexed DSP blocks reduce LUT usage by 38%
without impacting throughput.
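For context, accelerator energy efficiency is commonly reported as throughput per watt (GOPS/W). The sketch below shows that arithmetic using the 1.2 W figure quoted above; the throughput value is a purely hypothetical placeholder, since the abstract does not state one, and the baselines are implied only by the quoted 3.5x and 1.8x ratios.

```python
# Illustration of the GOPS/W metric used to compare CNN accelerators.
# Only the 1.2 W power figure comes from the abstract; the throughput below
# is an assumed placeholder, not a reported result of the dissertation.

POWER_W = 1.2            # reported accelerator power at 200 MHz
THROUGHPUT_GOPS = 100.0  # assumed throughput in giga-operations/second (illustrative only)

efficiency = THROUGHPUT_GOPS / POWER_W          # accelerator energy efficiency

# A 3.5x (1.8x) improvement implies the GPU (ASIC) baseline achieves the
# accelerator's efficiency divided by that ratio under the same workload.
gpu_baseline = efficiency / 3.5
asic_baseline = efficiency / 1.8

print(f"Accelerator:          {efficiency:.1f} GOPS/W")
print(f"Implied GPU baseline:  {gpu_baseline:.1f} GOPS/W")
print(f"Implied ASIC baseline: {asic_baseline:.1f} GOPS/W")
```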
In summary, this dissertation offers a complete solution to the challenges of deploying CNNs on IoT devices. By focusing on power efficiency and performance with FPGA-based hardware acceleration, the proposed CNN accelerator provides a feasible approach for integrating advanced machine learning capabilities into the next generation of IoT devices. The innovations and findings in this work contribute to the field of low-power hardware design and lay the groundwork for future research in energy-efficient computing.