Rapid memory-aware selection of hardware accelerators in programmable SoC design

Programmable Systems-on-Chips (SoCs) are expected to incorporate a larger number of application-specific hardware accelerators with tightly integrated memories in order to meet stringent performance-power requirements of embedded systems. As data sharing between the accelerator memories and the proc...

Bibliographic Details
Main Authors: Prakash, Alok, Clarke, Christopher T., Lam, Siew-Kei, Srikanthan, Thambipillai
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2019
Subjects: Accelerator Architectures; DRNTU::Engineering::Computer science and engineering; Memory Management
Online Access: https://hdl.handle.net/10356/99051
http://hdl.handle.net/10220/48550
Institution: Nanyang Technological University
id sg-ntu-dr.10356-99051
record_format dspace
spelling sg-ntu-dr.10356-99051 2020-03-07T11:50:46Z Rapid memory-aware selection of hardware accelerators in programmable SoC design Prakash, Alok Clarke, Christopher T. Lam, Siew-Kei Srikanthan, Thambipillai School of Computer Science and Engineering Accelerator Architectures DRNTU::Engineering::Computer science and engineering Memory Management Programmable Systems-on-Chips (SoCs) are expected to incorporate a larger number of application-specific hardware accelerators with tightly integrated memories in order to meet the stringent performance-power requirements of embedded systems. As data sharing between the accelerator memories and the processor is inevitable, it is of paramount importance that application segments are selected for hardware acceleration such that the communication overhead of data transfers does not impede the advantages of the accelerators. In this paper, we propose a novel memory-aware selection algorithm that is based on an iterative approach to rapidly recommend a set of hardware accelerators that will provide high performance gain under varying area constraints. To significantly reduce the algorithm runtime while still guaranteeing near-optimal solutions, we propose a heuristic to estimate the penalties incurred when the processor accesses the accelerator memories. In each iteration of the proposed algorithm, a two-pass method is employed where a set of good hardware accelerator candidates is selected using a greedy approach in the first pass, and a “sliding window” approach is used in the second pass to refine the solution. The two-pass method is iteratively performed on a bounded set of candidate hardware accelerators to limit the search space and to avoid local maxima. To validate the benefits of the proposed selection algorithm, an exhaustive search algorithm is also developed. Experimental results using the popular CHStone benchmark suite show that the performance achieved by the accelerators recommended by the proposed algorithm closely matches the performance of the exhaustive algorithm, with close to 99% accuracy, while being orders of magnitude faster. Accepted version 2019-06-04T09:08:16Z 2019-12-06T20:02:42Z 2019-06-04T09:08:16Z 2019-12-06T20:02:42Z 2017 Journal Article Prakash, A., Clarke, C. T., Lam, S.-K., & Srikanthan, T. (2018). Rapid memory-aware selection of hardware accelerators in programmable SoC design. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 26(3), 445-456. doi:10.1109/TVLSI.2017.2769125 1063-8210 https://hdl.handle.net/10356/99051 http://hdl.handle.net/10220/48550 10.1109/TVLSI.2017.2769125 en IEEE Transactions on Very Large Scale Integration (VLSI) Systems © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/TVLSI.2017.2769125 11 p. application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic Accelerator Architectures
DRNTU::Engineering::Computer science and engineering
Memory Management
spellingShingle Accelerator Architectures
DRNTU::Engineering::Computer science and engineering
Memory Management
Prakash, Alok
Clarke, Christopher T.
Lam, Siew-Kei
Srikanthan, Thambipillai
Rapid memory-aware selection of hardware accelerators in programmable SoC design
description Programmable Systems-on-Chips (SoCs) are expected to incorporate a larger number of application-specific hardware accelerators with tightly integrated memories in order to meet the stringent performance-power requirements of embedded systems. As data sharing between the accelerator memories and the processor is inevitable, it is of paramount importance that application segments are selected for hardware acceleration such that the communication overhead of data transfers does not impede the advantages of the accelerators. In this paper, we propose a novel memory-aware selection algorithm that is based on an iterative approach to rapidly recommend a set of hardware accelerators that will provide high performance gain under varying area constraints. To significantly reduce the algorithm runtime while still guaranteeing near-optimal solutions, we propose a heuristic to estimate the penalties incurred when the processor accesses the accelerator memories. In each iteration of the proposed algorithm, a two-pass method is employed where a set of good hardware accelerator candidates is selected using a greedy approach in the first pass, and a “sliding window” approach is used in the second pass to refine the solution. The two-pass method is iteratively performed on a bounded set of candidate hardware accelerators to limit the search space and to avoid local maxima. To validate the benefits of the proposed selection algorithm, an exhaustive search algorithm is also developed. Experimental results using the popular CHStone benchmark suite show that the performance achieved by the accelerators recommended by the proposed algorithm closely matches the performance of the exhaustive algorithm, with close to 99% accuracy, while being orders of magnitude faster.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Prakash, Alok
Clarke, Christopher T.
Lam, Siew-Kei
Srikanthan, Thambipillai
format Article
author Prakash, Alok
Clarke, Christopher T.
Lam, Siew-Kei
Srikanthan, Thambipillai
author_sort Prakash, Alok
title Rapid memory-aware selection of hardware accelerators in programmable SoC design
title_short Rapid memory-aware selection of hardware accelerators in programmable SoC design
title_full Rapid memory-aware selection of hardware accelerators in programmable SoC design
title_fullStr Rapid memory-aware selection of hardware accelerators in programmable SoC design
title_full_unstemmed Rapid memory-aware selection of hardware accelerators in programmable SoC design
title_sort rapid memory-aware selection of hardware accelerators in programmable soc design
publishDate 2019
url https://hdl.handle.net/10356/99051
http://hdl.handle.net/10220/48550
_version_ 1681039378675138560
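
The abstract in this record describes a greedy first pass, a "sliding window" refinement pass, and an exhaustive search used for validation. The following Python sketch illustrates that general idea only; it is not the authors' algorithm. The `Candidate` fields, the net-gain formula (gain minus an estimated memory-access penalty), the gain-per-area ranking, the window width, and all numbers are assumptions made for illustration, and the iteration over a bounded candidate set mentioned in the abstract is omitted.

```python
from itertools import combinations
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical accelerator candidate; every field is an illustrative assumption."""
    name: str
    area: int      # area cost (must be > 0)
    gain: int      # cycles saved by hardware acceleration
    penalty: int   # estimated cost of processor accesses to accelerator memory


def net_gain(c):
    # Memory-aware gain: raw speed-up minus the estimated access penalty.
    return c.gain - c.penalty


def total(selection):
    return sum(net_gain(c) for c in selection)


def rank(cands):
    # Assumed ranking: net gain per unit area, best first.
    return sorted(cands, key=lambda c: net_gain(c) / c.area, reverse=True)


def greedy_pass(cands, budget):
    """First pass: take candidates in rank order while they fit the area budget."""
    chosen, used = [], 0
    for c in rank(cands):
        if net_gain(c) > 0 and used + c.area <= budget:
            chosen.append(c)
            used += c.area
    return chosen


def sliding_window_pass(cands, chosen, budget, width=3):
    """Second pass: slide a window over the ranked list and re-decide only the
    candidates inside the window, keeping the choices outside it fixed."""
    ranked = rank(cands)
    best = list(chosen)
    for start in range(max(1, len(ranked) - width + 1)):
        window = ranked[start:start + width]
        fixed = [c for c in best if c not in window]
        room = budget - sum(c.area for c in fixed)
        for r in range(len(window) + 1):
            for combo in combinations(window, r):
                fits = sum(c.area for c in combo) <= room
                if fits and total(fixed) + total(combo) > total(best):
                    best = fixed + list(combo)
    return best


def exhaustive(cands, budget):
    """Reference solution: try every subset (exponential, small inputs only)."""
    best = []
    for r in range(len(cands) + 1):
        for combo in combinations(cands, r):
            if sum(c.area for c in combo) <= budget and total(combo) > total(best):
                best = list(combo)
    return best


# Made-up candidates and area budget, chosen so the greedy pass is suboptimal.
cands = [
    Candidate("a", area=5, gain=60, penalty=10),
    Candidate("b", area=4, gain=40, penalty=4),
    Candidate("c", area=6, gain=60, penalty=5),
    Candidate("d", area=3, gain=20, penalty=5),
]
budget = 10
first = greedy_pass(cands, budget)
refined = sliding_window_pass(cands, first, budget)
optimal = exhaustive(cands, budget)
print(total(first), total(refined), total(optimal))
```

On this toy input the refinement pass recovers the selection that the greedy pass misses, because the window re-decides a small group of neighbouring candidates together instead of one at a time. That cost grows with the window width rather than with the total number of candidates, which is why a bounded window is much cheaper than the exhaustive reference.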