Hoplite: Building austere overlay NoCs for FPGAs
Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) f...
Main Authors: | Kapre, Nachiket; Gray, Jan |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2015 |
Subjects: | Computer Science and Engineering |
Online Access: | https://hdl.handle.net/10356/81222 http://hdl.handle.net/10220/39180 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-81222 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-81222 2020-05-28T07:17:53Z Hoplite: Building austere overlay NoCs for FPGAs Kapre, Nachiket Gray, Jan School of Computer Engineering 2015 25th International Conference on Field Programmable Logic and Applications (FPL) Computer Science and Engineering Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) for uniform random traffic. We present Hoplite, an efficient, lightweight, fast FPGA overlay NoC that is designed to be small and compact by (1) eliminating input buffers and (2) reducing the cost of the switch crossbar, both of which have traditionally limited speeds and imposed heavy resource costs in conventional FPGA overlay NoCs. We implement bufferless deflection routing cheaply, requiring the generation of only output multiplexer controls and no backpressure handshakes. Additionally, we use directional channels that help reduce crossbar cost by restricting the number of inputs to the crossbar to three instead of four. When compared to buffered mesh switches, FPGA-based deflection routers are ≈3.5× smaller (HLS-generated switch) and 2.5× faster (clock period) for 32-bit payloads. In a separate experiment, we hand-crafted a prototype RTL version of our switch with RLOCs that requires only 60 LUTs and 100 FFs per router and runs with a 2.9 ns clock period. Accepted version 2015-12-18T08:54:43Z 2019-12-06T14:25:53Z 2015-12-18T08:54:43Z 2019-12-06T14:25:53Z 2015 Conference Paper Kapre, N., & Gray, J. (2015). Hoplite: Building austere overlay NoCs for FPGAs. 2015 25th International Conference on Field Programmable Logic and Applications (FPL), 1-8. https://hdl.handle.net/10356/81222 http://hdl.handle.net/10220/39180 10.1109/FPL.2015.7293956 en © 2015 IEEE. Personal use of this material is permitted. 
Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPL.2015.7293956]. 8 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Computer Science and Engineering |
spellingShingle |
Computer Science and Engineering Kapre, Nachiket Gray, Jan Hoplite: Building austere overlay NoCs for FPGAs |
description |
Customized unidirectional, bufferless, deflection-routed torus networks can outperform classic, bidirectional, buffered mesh networks for single-flit-oriented FPGA applications by as much as 1.5× (best achievable throughputs for a 10×10 system) or 2.5× (allocating the same FPGA resources to both NoCs) for uniform random traffic. We present Hoplite, an efficient, lightweight, fast FPGA overlay NoC that is designed to be small and compact by (1) eliminating input buffers and (2) reducing the cost of the switch crossbar, both of which have traditionally limited speeds and imposed heavy resource costs in conventional FPGA overlay NoCs. We implement bufferless deflection routing cheaply, requiring the generation of only output multiplexer controls and no backpressure handshakes. Additionally, we use directional channels that help reduce crossbar cost by restricting the number of inputs to the crossbar to three instead of four. When compared to buffered mesh switches, FPGA-based deflection routers are ≈3.5× smaller (HLS-generated switch) and 2.5× faster (clock period) for 32-bit payloads. In a separate experiment, we hand-crafted a prototype RTL version of our switch with RLOCs that requires only 60 LUTs and 100 FFs per router and runs with a 2.9 ns clock period. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Kapre, Nachiket Gray, Jan |
format |
Conference or Workshop Item |
author |
Kapre, Nachiket Gray, Jan |
author_sort |
Kapre, Nachiket |
title |
Hoplite: Building austere overlay NoCs for FPGAs |
publishDate |
2015 |
url |
https://hdl.handle.net/10356/81222 http://hdl.handle.net/10220/39180 |
_version_ |
1681058234629095424 |
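
The bufferless deflection routing summarized in the description field can be illustrated with a small behavioral model. The sketch below is not the authors' RTL or HLS code. It assumes a unidirectional 2D torus in which each router has three inputs (west link, north link, local client) and two outputs (east, south), dimension-ordered routing (east first, then south), a local exit that shares the south port, and a fixed priority of north over west over local injection. These policy details, and the names `Packet`, `Outputs`, `route`, and `simulate`, are assumptions made for illustration only.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

Coord = Tuple[int, int]

class Packet(NamedTuple):
    dest_x: int
    dest_y: int
    payload: int

class Outputs(NamedTuple):
    east: Optional[Packet]       # sent to router (x+1, y)
    south: Optional[Packet]      # sent to router (x, y+1)
    delivered: Optional[Packet]  # handed to the local client
    pe_accepted: bool            # True if the local injection was taken

def route(x: int, y: int, west: Optional[Packet], north: Optional[Packet],
          pe: Optional[Packet]) -> Outputs:
    """One cycle of a hypothetical bufferless deflection router at (x, y)."""
    east = south = delivered = None
    s_busy = False  # assumption: the local exit shares the south port

    # North input always wins the south port, so it is never deflected.
    if north is not None:
        s_busy = True
        if north.dest_y == y:
            delivered = north
        else:
            south = north

    # West input turns south in its destination column if the port is free;
    # otherwise it continues east (a deflection when it wanted to turn).
    if west is not None:
        if west.dest_x == x and not s_busy:
            s_busy = True
            if west.dest_y == y:
                delivered = west
            else:
                south = west
        else:
            east = west

    # The local client injects only when its desired output is idle.
    pe_accepted = False
    if pe is not None:
        if pe.dest_x == x:
            if not s_busy:
                pe_accepted = True
                if pe.dest_y == y:
                    delivered = pe
                else:
                    south = pe
        elif east is None:
            east, pe_accepted = pe, True
    return Outputs(east, south, delivered, pe_accepted)

def simulate(n: int, injections: Dict[Coord, List[Packet]],
             max_cycles: int = 100) -> List[Tuple[int, Packet]]:
    """Run an n-by-n torus; return (cycle, packet) for each delivery."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    west_in: Dict[Coord, Optional[Packet]] = dict.fromkeys(nodes)
    north_in: Dict[Coord, Optional[Packet]] = dict.fromkeys(nodes)
    queues = {node: list(pkts) for node, pkts in injections.items()}
    deliveries: List[Tuple[int, Packet]] = []
    for cycle in range(max_cycles):
        next_west = dict.fromkeys(nodes)
        next_north = dict.fromkeys(nodes)
        for (x, y) in nodes:
            queue = queues.get((x, y), [])
            pe = queue[0] if queue else None
            out = route(x, y, west_in[(x, y)], north_in[(x, y)], pe)
            if out.pe_accepted:
                queue.pop(0)
            if out.delivered is not None:
                deliveries.append((cycle, out.delivered))
            next_west[((x + 1) % n, y)] = out.east     # torus wrap-around
            next_north[(x, (y + 1) % n)] = out.south
        west_in, north_in = next_west, next_north
        in_flight = any(p is not None for p in west_in.values()) or \
                    any(p is not None for p in north_in.values())
        if not in_flight and not any(queues.values()):
            break
    return deliveries
```

In this model, `simulate(3, {(0, 0): [Packet(2, 1, 42)]})` reports a single delivery at cycle 3: two hops east, one hop south. Because at most two packets arrive on the links each cycle and there are two outgoing links, every arriving packet can always be forwarded, which is why no input buffers or backpressure signals are needed in the sketch.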