Accelerating binary-matrix multiplication on FPGA
Matrix multiplication is required for a wide variety of applications, including data mining, linear algebra, graph transformations, etc. Most of the existing works to accelerate matrix multiplication have focused on matrices with integer and floating point elements. In this work, we proposed for the...
Saved in:
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: |
2017
|
Subjects: | |
Online Access: | http://hdl.handle.net/10356/72881 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
Summary: | Matrix multiplication is required for a wide variety of applications, including data mining, linear algebra, graph transformations, etc. Most of the existing works to accelerate matrix multiplication have focused on matrices with integer and floating point elements. In this work, we proposed for the first time an FPGA-based accelerator architecture for binary matrix multiplication. It consists of processing elements laid out in regular tiled manner. The communication structure used is a torus. We undertook detailed experimental study of the proposed architecture. The architecture shows excellent scalability with increase in number of processing elements, with minimal drop in operating frequency. The proposed system achieves maximum throughput of 1084.37 Gops for 4x4 network size with 2048x2048 matrix size. The performance achieved by the system is considerably higher than existing works of integer and floating point matrix multiplications on FPGAs, due to optimized PE design for binary matrix multiplication. We also studied the impact of deploying efficient overlay Network-on-Chip (NoC) infrastructure to different aspects of our accelerator system. |
---|