VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration


Bibliographic Details
Main Authors: Kapre, Nachiket, DeHon, Andre
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2015
Subjects: Computer Science and Engineering
Online Access: https://hdl.handle.net/10356/81249
http://hdl.handle.net/10220/39204
Institution: Nanyang Technological University
id sg-ntu-dr.10356-81249
record_format dspace
spelling sg-ntu-dr.10356-81249
2020-05-28T07:41:41Z
VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration
Kapre, Nachiket
DeHon, Andre
School of Computer Engineering
2011 International Conference on Field-Programmable Technology (FPT)
Computer Science and Engineering
Accepted version
2015-12-22T09:08:02Z 2019-12-06T14:26:31Z 2015-12-22T09:08:02Z 2019-12-06T14:26:31Z
2011
Conference Paper
Kapre, N., & DeHon, A. (2011). VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration. 2011 International Conference on Field-Programmable Technology, 1-9.
https://hdl.handle.net/10356/81249
http://hdl.handle.net/10220/39204
10.1109/FPT.2011.6132678
en
© 2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPT.2011.6132678].
9 p.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Computer Science and Engineering
description Many stand-alone, FPGA-based accelerators separate the implementation of a computation into two components - (1) a large parallel component that is realized as hardware on spatial FPGA fabric and (2) a small control and co-ordination component that is realized as software on embedded soft-core processors like an off-the-shelf Xilinx Microblaze (or host offchip CPU). While this hardware-software partitioning methodology allows the designer to lower design effort when composing the accelerator system, it introduces unnecessary Amdahl's Law bottlenecks and limits scalability. In this paper, we show how to avoid these limitations with VLIW-SCORE: a combination of a high-level parallel programming framework called SCORE and a custom, hybrid VLIW hardware organization. We demonstrate the benefits of this methodology for the SPICE circuit simulator when implementing the simulation control algorithms. With our spatial mapping flow we are able to improve performance by ≈30% (mean across circuit benchmarks) when compared to the Microblaze implementation for the Xilinx Virtex-6 LX760 FPGA. For complete application acceleration, we see an improved speedup from 1.9× for the Microblaze-based design to 2.6× for the hybrid, custom VLIW implementation when comparing a Xilinx Virtex-6 LX760 FPGA (40nm) with an Intel Core i7 965 CPU (45nm).
author2 School of Computer Engineering
author_facet School of Computer Engineering
Kapre, Nachiket
DeHon, Andre
format Conference or Workshop Item
author Kapre, Nachiket
DeHon, Andre
author_sort Kapre, Nachiket
title VLIW-SCORE: Beyond C for sequential control of SPICE FPGA acceleration
publishDate 2015
url https://hdl.handle.net/10356/81249
http://hdl.handle.net/10220/39204