Custom FPGA-based soft-processors for sparse graph acceleration
FPGA-based soft processors customized for operations on sparse graphs can deliver significant performance improvements over conventional organizations (ARMv7 CPUs) for bulk synchronous sparse graph algorithms. We develop a stripped-down soft processor ISA to implement specific repetitive operations...
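As a rough illustration of the "bulk synchronous sparse graph algorithms" the summary refers to, the following minimal Python sketch runs one superstep over a small weighted graph: every edge sends a message, then every node reduces what it received. It is not the paper's ISA, RTL, or compilation flow; the function name `bsp_step`, the edge-list layout, and the sum reduction are assumptions chosen for illustration only.

```python
from typing import List, Tuple

# Hypothetical edge representation: (source node, destination node, edge weight).
Edge = Tuple[int, int, float]


def bsp_step(num_nodes: int, edges: List[Edge], state: List[float]) -> List[float]:
    """Run one bulk synchronous superstep: send along edges, then update nodes."""
    # Send phase: each edge carries a weighted copy of its source node's state.
    inbox: List[List[float]] = [[] for _ in range(num_nodes)]
    for src, dst, weight in edges:
        inbox[dst].append(weight * state[src])
    # Implicit barrier: all messages are delivered before any node updates.
    # Receive phase: each node reduces its inbox (here, a sum) into its new state.
    return [sum(messages) for messages in inbox]


# Illustrative data only; not taken from the paper's matrix datasets.
edges = [(0, 1, 2.0), (0, 2, 1.0), (1, 2, 3.0), (2, 0, 0.5)]
state = [1.0, 2.0, 4.0]
new_state = bsp_step(3, edges, state)
```

With a sum reduction this superstep is equivalent to one sparse matrix-vector multiply, which is one common instance of the node-and-edge operations the summary describes.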
Main Author: | Kapre, Nachiket |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2015 |
Subjects: | Computer Science and Engineering |
Online Access: | https://hdl.handle.net/10356/81243 http://hdl.handle.net/10220/39162 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-81243 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-81243 2020-05-28T07:17:27Z Custom FPGA-based soft-processors for sparse graph acceleration Kapre, Nachiket School of Computer Engineering 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP) Computer Science and Engineering FPGA-based soft processors customized for operations on sparse graphs can deliver significant performance improvements over conventional organizations (ARMv7 CPUs) for bulk synchronous sparse graph algorithms. We develop a stripped-down soft processor ISA to implement specific repetitive operations on graph nodes and edges that are commonly observed in sparse graph computations. In the processing core, we provide hardware support for rapidly fetching and processing state of local graph nodes and edges through spatial address generators and zero-overhead loop iterators. We interconnect a 2D array of these lightweight processors with a packet-switched network-on-chip to enable fine-grained operand routing along the graph edges and provide custom send/receive instructions in the soft processor. We develop the processor RTL using Vivado High-Level Synthesis and also provide an assembler and compilation flow to configure the processor instruction and data memories. We outperform a Microblaze (100MHz on Zedboard) and an NIOS-II/f (100MHz on DE2-115) by 6× (single processor design) as well as the ARMv7 dual-core CPU on the Zynq SoCs by as much as 10× on the Xilinx ZC706 board (100 processor design) across a range of matrix datasets. Accepted version 2015-12-18T06:15:20Z 2019-12-06T14:26:23Z 2015-12-18T06:15:20Z 2019-12-06T14:26:23Z 2015 Conference Paper Kapre, N. (2015). Custom FPGA-based soft-processors for sparse graph acceleration. 2015 IEEE 26th International Conference on Application-specific Systems, Architectures and Processors (ASAP), 9-16. https://hdl.handle.net/10356/81243 http://hdl.handle.net/10220/39162 10.1109/ASAP.2015.7245698 en © 2015 IEEE.
Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/ASAP.2015.7245698]. 8 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Computer Science and Engineering |
spellingShingle |
Computer Science and Engineering Kapre, Nachiket Custom FPGA-based soft-processors for sparse graph acceleration |
description |
FPGA-based soft processors customized for operations on sparse graphs can deliver significant performance improvements over conventional organizations (ARMv7 CPUs) for bulk synchronous sparse graph algorithms. We develop a stripped-down soft processor ISA to implement specific repetitive operations on graph nodes and edges that are commonly observed in sparse graph computations. In the processing core, we provide hardware support for rapidly fetching and processing state of local graph nodes and edges through spatial address generators and zero-overhead loop iterators. We interconnect a 2D array of these lightweight processors with a packet-switched network-on-chip to enable fine-grained operand routing along the graph edges and provide custom send/receive instructions in the soft processor. We develop the processor RTL using Vivado High-Level Synthesis and also provide an assembler and compilation flow to configure the processor instruction and data memories. We outperform a Microblaze (100MHz on Zedboard) and an NIOS-II/f (100MHz on DE2-115) by 6× (single processor design) as well as the ARMv7 dual-core CPU on the Zynq SoCs by as much as 10× on the Xilinx ZC706 board (100 processor design) across a range of matrix datasets. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Kapre, Nachiket |
format |
Conference or Workshop Item |
author |
Kapre, Nachiket |
title |
Custom FPGA-based soft-processors for sparse graph acceleration |
publishDate |
2015 |
url |
https://hdl.handle.net/10356/81243 http://hdl.handle.net/10220/39162 |