A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA

As the computational demands of artificial intelligence (AI) escalate, edge computing has grown in significance. Edge computing emphasizes local computation on devices, particularly on Software-Defined Chips (SDCs) such as the Coarse-Grained Reconfigurable Architecture (CGRA...

Bibliographic Details
Main Author: Li, Jiaxu
Other Authors: Goh Wang Ling
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2023
Online Access: https://hdl.handle.net/10356/171880
Institution: Nanyang Technological University
id sg-ntu-dr.10356-171880
record_format dspace
spelling sg-ntu-dr.10356-171880 2024-02-23T04:26:49Z
A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA
Li, Jiaxu
Goh Wang Ling
School of Electrical and Electronic Engineering
Institute of Microelectronics, A*STAR
EWLGOH@ntu.edu.sg
Master's degree
2023-11-15T00:15:19Z
2023-11-15T00:15:19Z
2023
Thesis-Master by Coursework
Li, J. (2023). A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171880
https://hdl.handle.net/10356/171880
en
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
description As the computational demands of artificial intelligence (AI) escalate, edge computing has grown in significance. Edge computing emphasizes local computation on devices, particularly on Software-Defined Chips (SDCs) such as the Coarse-Grained Reconfigurable Architecture (CGRA), which is known for its high energy efficiency and flexible reconfiguration. On the algorithmic side, edge computing's tolerance for reduced computational precision has led to the widespread use of quantized AI models on CGRAs, both to raise computing performance further and to explore new ways around the computing-power bottleneck imposed by the limits of Moore's Law. This research aims to increase the throughput of the CGRA chip. It introduces a power-efficient reconfigurable multiplier that can perform multiple multiplications in parallel and replaces the conventional multiplier inside the Processing Element (PE). Comprehensive simulations in 40nm CMOS technology show that, with a workload of continuous short-bit multiplications as the benchmark, the throughput of the PE rises notably compared with the PCAE chip using the original Cadence IP multiplier. After an array-multiplication application is implemented on the optimized CGRA, throughput improves by 98% for 4-bit multiplications and by 296% for 2-bit multiplications. These results highlight the substantial gain in PE performance from integrating the proposed reconfigurable multiplier into the CGRA design and suggest greater potential for handling quantized AI models.
author2 Goh Wang Ling
author_facet Goh Wang Ling
Li, Jiaxu
format Thesis-Master by Coursework
author Li, Jiaxu
spellingShingle Li, Jiaxu
A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA
author_sort Li, Jiaxu
title A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA
title_sort novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the cgra
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/171880
_version_ 1794549466818150400