VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration
Many stand-alone, FPGA-based accelerators separate the implementation of a computation into two components - (1) a large parallel component that is realized as hardware on spatial FPGA fabric and (2) a small control and co-ordination component that is realized as software on embedded soft-core proce...
Main Authors: | Kapre, Nachiket; DeHon, Andre |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2015 |
Subjects: | Computer Science and Engineering |
Online Access: | https://hdl.handle.net/10356/81249 http://hdl.handle.net/10220/39204 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-81249 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-81249 2020-05-28T07:41:41Z VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration Kapre, Nachiket DeHon, Andre School of Computer Engineering 2011 International Conference on Field-Programmable Technology (FPT) Computer Science and Engineering Many stand-alone, FPGA-based accelerators separate the implementation of a computation into two components - (1) a large parallel component that is realized as hardware on spatial FPGA fabric and (2) a small control and co-ordination component that is realized as software on embedded soft-core processors like an off-the-shelf Xilinx Microblaze (or host offchip CPU). While this hardware-software partitioning methodology allows the designer to lower design effort when composing the accelerator system, it introduces unnecessary Amdahl's Law bottlenecks and limits scalability. In this paper, we show how to avoid these limitations with VLIW-SCORE: a combination of a high-level parallel programming framework called SCORE and a custom, hybrid VLIW hardware organization. We demonstrate the benefits of this methodology for the SPICE circuit simulator when implementing the simulation control algorithms. With our spatial mapping flow we are able to improve performance by ≈30% (mean across circuit benchmarks) when compared to the Microblaze implementation for the Xilinx Virtex-6 LX760 FPGA. For complete application acceleration, we see an improved speedup from 1.9× for the Microblaze-based design to 2.6× for the hybrid, custom VLIW implementation when comparing a Xilinx Virtex-6 LX760 FPGA (40nm) with an Intel Core i7 965 CPU (45nm). Accepted version 2015-12-22T09:08:02Z 2019-12-06T14:26:31Z 2015-12-22T09:08:02Z 2019-12-06T14:26:31Z 2011 Conference Paper Kapre, N., & DeHon, A. (2011). VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration. 2011 International Conference on Field-Programmable Technology, 1-9.
https://hdl.handle.net/10356/81249 http://hdl.handle.net/10220/39204 10.1109/FPT.2011.6132678 en © 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPT.2011.6132678]. 9 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Computer Science and Engineering |
spellingShingle |
Computer Science and Engineering Kapre, Nachiket DeHon, Andre VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration |
description |
Many stand-alone, FPGA-based accelerators separate the implementation of a computation into two components - (1) a large parallel component that is realized as hardware on spatial FPGA fabric and (2) a small control and co-ordination component that is realized as software on embedded soft-core processors like an off-the-shelf Xilinx Microblaze (or host offchip CPU). While this hardware-software partitioning methodology allows the designer to lower design effort when composing the accelerator system, it introduces unnecessary Amdahl's Law bottlenecks and limits scalability. In this paper, we show how to avoid these limitations with VLIW-SCORE: a combination of a high-level parallel programming framework called SCORE and a custom, hybrid VLIW hardware organization. We demonstrate the benefits of this methodology for the SPICE circuit simulator when implementing the simulation control algorithms. With our spatial mapping flow we are able to improve performance by ≈30% (mean across circuit benchmarks) when compared to the Microblaze implementation for the Xilinx Virtex-6 LX760 FPGA. For complete application acceleration, we see an improved speedup from 1.9× for the Microblaze-based design to 2.6× for the hybrid, custom VLIW implementation when comparing a Xilinx Virtex-6 LX760 FPGA (40nm) with an Intel Core i7 965 CPU (45nm). |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Kapre, Nachiket DeHon, Andre |
format |
Conference or Workshop Item |
author |
Kapre, Nachiket DeHon, Andre |
author_sort |
Kapre, Nachiket |
title |
VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration |
publishDate |
2015 |
url |
https://hdl.handle.net/10356/81249 http://hdl.handle.net/10220/39204 |
_version_ |
1681058463748194304 |