A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA
In the fast-moving field of artificial intelligence (AI), computational demands keep escalating, and edge computing has grown in significance. Edge computing emphasizes local computation on devices, particularly using Software-Defined Chips (SDCs) such as the Coarse-Grained Reconfigurable Architecture (CGRA...
Main Author: | Li, Jiaxu |
---|---|
Other Authors: | Goh Wang Ling |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Online Access: | https://hdl.handle.net/10356/171880 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-171880 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-171880; 2024-02-23T04:26:49Z; A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA; Li, Jiaxu; Goh Wang Ling; School of Electrical and Electronic Engineering; Institute of Microelectronics, A*STAR; EWLGOH@ntu.edu.sg; Master's degree; 2023-11-15T00:15:19Z; 2023; Thesis-Master by Coursework; Li, J. (2023). A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/171880; en; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
description |
In the fast-moving field of artificial intelligence (AI), computational demands keep escalating, and edge computing has grown in significance. Edge computing emphasizes local computation on devices, particularly using Software-Defined Chips (SDCs) such as the Coarse-Grained Reconfigurable Architecture (CGRA), known for its high energy efficiency and flexible reconfiguration. On the algorithmic side, edge computing's tolerance for reduced computational precision has led to the widespread use of quantized AI models on CGRAs, with the aim of further boosting computational capability and exploring new ways around the computing-power bottleneck imposed by the limits of Moore's Law. This research aims to increase the throughput of the CGRA chip. It introduces a strategy to improve the power efficiency of the conventional multiplier: a reconfigurable multiplier capable of performing parallel multiplications, which replaces the conventional multiplier inside the Processing Element (PE). Simulations in 40nm CMOS technology showed that, with a workload of continuous short-bit multiplications as the benchmark, the throughput of the PE rose notably compared with the PCAE chip using the original Cadence IP multiplier. After an array-multiplication application was implemented on the optimized CGRA, throughput improved by 98% for 4-bit multiplications and by 296% for 2-bit multiplications. These results highlight the substantial gain in PE performance from integrating the proposed reconfigurable multiplier into the CGRA design, and suggest greater potential for handling quantized AI models. |
author2 |
Goh Wang Ling |
author_facet |
Goh Wang Ling Li, Jiaxu |
format |
Thesis-Master by Coursework |
author |
Li, Jiaxu |
spellingShingle |
Li, Jiaxu A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA |
author_sort |
Li, Jiaxu |
title |
A novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the CGRA |
title_sort |
novel design of power-efficient reconfigurable multiplier targeting at increasing the throughput of the cgra |
publisher |
Nanyang Technological University |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/171880 |
_version_ |
1794549466818150400 |