Intelligent high level synthesis for customization on reconfigurable platforms

High level synthesis (HLS) using C/C++ has increasingly become a critical step in the realization of complex digital systems. One of the major research focus areas in this space has been to realize efficient synthesis of complex systems without violating stringent time-to-market constraints. Most of...

Bibliographic Details
Main Author: Sharad Sinha
Other Authors: Thambipillai Srikanthan
Format: Theses and Dissertations
Language: English
Published: 2014
Subjects:
Online Access: https://hdl.handle.net/10356/61691
Institution: Nanyang Technological University
id sg-ntu-dr.10356-61691
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Hardware::Arithmetic and logic structures
DRNTU::Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation
description High level synthesis (HLS) using C/C++ has increasingly become a critical step in the realization of complex digital systems. One of the major research focus areas in this space has been the efficient synthesis of complex systems without violating stringent time-to-market constraints. Most of the related work in the literature has been confined to supporting C/C++ language constructs, scheduling of operations, and hardware binding for area reduction. The main motivation of the research presented in this thesis is to develop novel algorithms for intelligent high level synthesis without designer intervention. Using the timing constraint as a knob, an algorithm called Extended Compatibility Path Based Binding (ECPB) has been proposed for resource sharing during hardware binding to minimize area utilization. The proposed method can be automated and has been shown to yield, on average, 12.49% and 29.21% lower area-delay product than compatibility path based (CPB) binding and weighted bipartite matching (WBM) based binding, respectively. Building upon ECPB, a latency-preserving algorithm for area-delay optimization has been proposed to reduce area without violating the timing constraint. A data-initiation-interval-aware graph partitioning algorithm has been proposed to partition an application’s dataflow graph. This has also paved the way for further area reduction by invoking resource sharing without violating the initiation interval constraints. In addition, a systematic technique for area-delay trade-off analysis has been developed to establish multiple design points. The proposed method for initiation-interval-aware area optimization can be deployed to analyze large dataflow graphs efficiently, making it highly scalable. A technique for the efficient utilization of DSP resources available in FPGA platforms has been proposed to maximize application performance.
It relies on a systematic search for multiplication and allied operations within the frequently executed code blocks of an application. Model-based inferences for different types of multiplication were developed to facilitate the rapid identification of profitable regions for maximizing performance. Investigations confirm that the operating clock frequency can be increased by up to 3 times compared with a commercial tool (Vivado-HLS). To combine the strengths of IP-core based design and high level synthesis, concepts of program recognition and automatic algorithm replacement are used to develop lexical and pattern based analysis. This has led to the automatic identification of in-built arithmetic functions in a C/C++ application and of compiler-specific patterns. A novel IP-core selection algorithm has been proposed to facilitate binding to available IP cores. Our investigations confirm that it lends itself well to notable area reduction compared with what is possible using a commercial HLS tool (Vivado-HLS). In addition, multiple design points can be generated to facilitate area-delay trade-off analysis by associating a combination of IP cores at a time. Our investigations show that the look-up table (LUT) reduction ranges from 60% to 75%, while the clock period reduction ranges from 16% to 40%, for the benchmarks investigated. While the proposed methods for an intelligent high-level synthesis flow are applicable across all application domains, digital signal and information processing applications benefit greatly because they contain operations such as multiplication and transcendental functions. Additionally, since these methods look for application characteristics and exploit architecture specifics, they lead to customized synthesis solutions.
Finally, the proposed methods have contributed to the realization of an intelligent high-level synthesis framework, which paves the way for less reliance on hand-crafted designs and skilled hardware designers.
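The abstract reports its results as relative reductions in area-delay product, LUT count, and clock period. The Python sketch below only illustrates how such figures relate to each other arithmetically. It assumes the usual definition of area-delay product as area multiplied by delay. The function names and the sample LUT and clock-period values are hypothetical and are not taken from the thesis.

```python
def area_delay_product(area_luts: float, clock_period_ns: float) -> float:
    """Area-delay product, assumed here to be area multiplied by delay."""
    return area_luts * clock_period_ns


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


def frequency_gain_percent(period_reduction_percent: float) -> float:
    """Clock-frequency increase implied by a clock-period reduction (f = 1/T)."""
    remaining_fraction = 1.0 - period_reduction_percent / 100.0
    return 100.0 * (1.0 / remaining_fraction - 1.0)


if __name__ == "__main__":
    # Hypothetical design points, chosen only for illustration.
    baseline_adp = area_delay_product(area_luts=1000, clock_period_ns=10.0)
    improved_adp = area_delay_product(area_luts=300, clock_period_ns=8.0)
    print(f"ADP reduction: {percent_reduction(baseline_adp, improved_adp):.2f}%")

    # The reported clock-period reductions of 16% and 40% expressed as frequency gains.
    for reduction in (16.0, 40.0):
        print(f"{reduction:.0f}% shorter period -> "
              f"{frequency_gain_percent(reduction):.2f}% higher frequency")
```

Because frequency is the reciprocal of the period, a given percentage reduction in clock period always corresponds to a larger percentage increase in clock frequency.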
author2 Thambipillai Srikanthan
author_facet Thambipillai Srikanthan
Sharad Sinha
format Theses and Dissertations
author Sharad Sinha
author_sort Sharad Sinha
title Intelligent high level synthesis for customization on reconfigurable platforms
publishDate 2014
url https://hdl.handle.net/10356/61691
_version_ 1759855490981953536
spelling sg-ntu-dr.10356-61691 2023-03-04T00:45:58Z Intelligent high level synthesis for customization on reconfigurable platforms Sharad Sinha Thambipillai Srikanthan School of Computer Engineering Centre for High Performance Embedded Systems DRNTU::Engineering::Computer science and engineering::Hardware::Arithmetic and logic structures DRNTU::Engineering::Computer science and engineering::Hardware::Register-transfer-level implementation DOCTOR OF PHILOSOPHY (SCE) 2014-08-13T06:56:06Z 2014-08-13T06:56:06Z 2014 2014 Thesis Sharad Sinha. (2014). Intelligent high level synthesis for customization on reconfigurable platforms. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/61691 10.32657/10356/61691 en 179 p. application/pdf