Hoplite: Building austere overlay NoCs for FPGAs

Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) f...

Bibliographic Details
Main Authors: Kapre, Nachiket; Gray, Jan
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2015
Subjects: Computer Science and Engineering
Online Access: https://hdl.handle.net/10356/81222
http://hdl.handle.net/10220/39180
Institution: Nanyang Technological University
id sg-ntu-dr.10356-81222
record_format dspace
spelling sg-ntu-dr.10356-81222 2020-05-28T07:17:53Z
Hoplite: Building austere overlay NoCs for FPGAs
Kapre, Nachiket; Gray, Jan
School of Computer Engineering
2015 25th International Conference on Field Programmable Logic and Applications (FPL)
Computer Science and Engineering
Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) for uniform random traffic. We present Hoplite, an efficient, lightweight, fast FPGA overlay NoC that is designed to be small and compact by (1) eliminating input buffers and (2) reducing the cost of the switch crossbars that have traditionally limited speeds and imposed heavy resource costs in conventional FPGA overlay NoCs. We implement bufferless deflection routing cheaply, requiring the generation of only output multiplexer controls and no backpressure handshakes. Additionally, we use directional channels that help reduce crossbar cost by restricting the number of inputs to the crossbar to three instead of four. When compared to buffered mesh switches, FPGA-based deflection routers are ≈3.5× smaller (HLS-generated switch) and 2.5× faster (clock period) for 32-bit payloads. In a separate experiment, we hand-crafted a prototype RTL version of our switch with RLOCs that requires only 60 LUTs and 100 FFs per router and runs at a clock period of 2.9 ns.
Accepted version
2015-12-18T08:54:43Z 2019-12-06T14:25:53Z 2015-12-18T08:54:43Z 2019-12-06T14:25:53Z
2015
Conference Paper
Kapre, N., & Gray, J. (2015). Hoplite: Building austere overlay NoCs for FPGAs. 2015 25th International Conference on Field Programmable Logic and Applications (FPL), 1-8.
https://hdl.handle.net/10356/81222
http://hdl.handle.net/10220/39180
10.1109/FPL.2015.7293956
en
© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPL.2015.7293956].
8 p.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Computer Science and Engineering
description Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) for uniform random traffic. We present Hoplite, an efficient, lightweight, fast FPGA overlay NoC that is designed to be small and compact by (1) eliminating input buffers and (2) reducing the cost of the switch crossbars that have traditionally limited speeds and imposed heavy resource costs in conventional FPGA overlay NoCs. We implement bufferless deflection routing cheaply, requiring the generation of only output multiplexer controls and no backpressure handshakes. Additionally, we use directional channels that help reduce crossbar cost by restricting the number of inputs to the crossbar to three instead of four. When compared to buffered mesh switches, FPGA-based deflection routers are ≈3.5× smaller (HLS-generated switch) and 2.5× faster (clock period) for 32-bit payloads. In a separate experiment, we hand-crafted a prototype RTL version of our switch with RLOCs that requires only 60 LUTs and 100 FFs per router and runs at a clock period of 2.9 ns.
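The routing idea in the description (three crossbar inputs, no input buffers, deflection instead of backpressure) can be illustrated with a small behavioral model. This is only a sketch under stated assumptions and is not taken from the paper: the port names (west, north, pe), the X-then-Y routing order, the priority of north over west over the local client, and the sharing of the south port for local delivery are all hypothetical choices made for illustration, as are the names Packet, Outputs and route.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    dst_x: int      # destination column (assumed addressing scheme)
    dst_y: int      # destination row
    payload: int


class Outputs(NamedTuple):
    east: Optional[Packet]       # packet leaving on the east link
    south: Optional[Packet]      # packet leaving on the south link
    delivered: Optional[Packet]  # packet handed to the local client
    pe_accepted: bool            # True if the client's packet was injected


def route(x: int, y: int,
          west: Optional[Packet],
          north: Optional[Packet],
          pe: Optional[Packet]) -> Outputs:
    """One cycle of a hypothetical 3-input deflection switch at node (x, y)."""
    east = south = delivered = None
    south_busy = False  # the south port is assumed to be shared with delivery

    def take_south(p: Packet) -> None:
        nonlocal south, delivered, south_busy
        if (p.dst_x, p.dst_y) == (x, y):
            delivered = p       # arrived: hand to the local client
        else:
            south = p           # keep travelling down the column
        south_busy = True

    # Assumed priority 1: a packet from the north can only continue south.
    if north is not None:
        take_south(north)

    # Assumed priority 2: a packet from the west turns south when its column
    # matches and the port is free; otherwise it continues (or is deflected)
    # east and comes back around the ring. Nothing is ever buffered.
    if west is not None:
        if west.dst_x == x and not south_busy:
            take_south(west)
        else:
            east = west

    # Lowest priority: the local client injects only into a free output.
    pe_accepted = False
    if pe is not None:
        if pe.dst_x == x:
            if not south_busy:
                take_south(pe)
                pe_accepted = True
        elif east is None:
            east = pe
            pe_accepted = True

    return Outputs(east, south, delivered, pe_accepted)


if __name__ == "__main__":
    # West wants to turn south at column 0 but north already holds the port,
    # so the west packet is deflected east instead of being stored.
    print(route(0, 0, Packet(0, 3, 1), Packet(0, 5, 2), None))
```

In this model the only decisions are which input drives each output, which mirrors the description's point that only output multiplexer controls need to be generated; a real switch would make the same choices in combinational logic rather than in software.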
author2 School of Computer Engineering
format Conference or Workshop Item
author Kapre, Nachiket
Gray, Jan
title Hoplite: Building austere overlay NoCs for FPGAs
publishDate 2015
url https://hdl.handle.net/10356/81222
http://hdl.handle.net/10220/39180
_version_ 1681058234629095424