Running CNNs efficiently on an FPGA
| Main Author: | |
|---|---|
| Other Authors: | |
| Format: | Final Year Project |
| Language: | English |
| Published: | Nanyang Technological University, 2022 |
| Subjects: | |
| Online Access: | https://hdl.handle.net/10356/156579 |
| Institution: | Nanyang Technological University |
Summary: With increased demand for AI at the edge, there is a pressing need to adapt ever more computationally demanding deep learning models for deployment on embedded devices. As accelerators for these networks, FPGAs have become preferred for their energy efficiency and adaptability, but models also need to be pre-processed before effective FPGA-based hardware accelerators can be designed. In this project, the author investigates the performance of Block-Balanced Sparsity, a model compression approach that prunes parameter matrices in deep learning networks in a structured manner that allows for efficient FPGA accelerator implementations. Testing this approach across different pruning strategies, the author found that fine-tuning led to the highest model accuracy, gradual pruning allowed the fastest model development, and learning-rate rewinding provided the greatest ease of use.
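To illustrate the kind of structured pruning the abstract describes, the sketch below applies a block-balanced scheme to a weight matrix: each row is split into fixed-size blocks, and the same fraction of smallest-magnitude weights is zeroed within every block, so all blocks end up with an identical non-zero count (which is what makes the sparsity pattern hardware-friendly). This is a minimal, assumed interpretation for illustration; the project's actual block shape and pruning schedule may differ.

```python
def block_balanced_prune(matrix, block_size=4, sparsity=0.5):
    """Zero out the `sparsity` fraction of smallest-magnitude weights
    inside every `block_size`-wide block of each row, so every block
    keeps an identical number of non-zeros.

    Simplified sketch of block-balanced sparsity, not the FYP's
    exact implementation.
    """
    drop = int(block_size * sparsity)  # weights zeroed per block
    pruned = []
    for row in matrix:
        assert len(row) % block_size == 0, "row must divide into blocks"
        new_row = []
        for i in range(0, len(row), block_size):
            block = list(row[i:i + block_size])
            # Indices of the `drop` smallest-magnitude weights in this block.
            smallest = sorted(range(block_size), key=lambda j: abs(block[j]))[:drop]
            for j in smallest:
                block[j] = 0.0
            new_row.extend(block)
        pruned.append(new_row)
    return pruned

# Example: 2x8 matrix, blocks of 4, 50% sparsity -> 2 non-zeros per block.
w = [[1, 2, 3, 4, 5, 6, 7, 8],
     [9, 10, 11, 12, 13, 14, 15, 16]]
print(block_balanced_prune(w))
```

Because every block carries exactly the same number of non-zero weights, an FPGA accelerator can assign one processing lane per block and keep all lanes equally loaded, which is the practical advantage of balanced over unstructured sparsity.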