ASIC implementation of a high speed and low power scalar product computation unit
This project involves the design, synthesis, and placement and routing of an improved 16-bit, 15-element unsigned inner product architecture. Improvements to the design were made in the carry-free addition stage, also known as the column compression stage or reduction stage, whereby counters are in...
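The idea of counter-based column compression can be illustrated with a small software model. The sketch below is not the architecture from the report, whose counter sizes and wiring are not given in this record. It assumes a generic scheme in which every column taller than two bits is fed to a counter that emits the number of ones as a binary word. The function names (`partial_product_columns`, `compress_once`, `inner_product`) are hypothetical.

```python
from typing import List

WIDTH = 16  # operand width in bits, as stated in the abstract


def partial_product_columns(a: List[int], b: List[int]) -> List[List[int]]:
    """Merge the AND-array bits of every a[i]*b[i] into one column matrix.

    columns[w] holds all partial-product bits of weight 2**w.
    """
    columns: List[List[int]] = [[] for _ in range(2 * WIDTH - 1)]
    for x, y in zip(a, b):
        for i in range(WIDTH):
            for j in range(WIDTH):
                columns[i + j].append((x >> i) & (y >> j) & 1)
    return columns


def compress_once(columns: List[List[int]]) -> List[List[int]]:
    """One reduction pass: each column taller than 2 goes through a counter.

    The counter outputs the number of ones in the column as a binary word;
    bit k of that word is placed in the column of weight 2**(w + k).
    """
    out: List[List[int]] = [[] for _ in columns]
    for w, bits in enumerate(columns):
        if len(bits) <= 2:
            out[w].extend(bits)  # short columns pass through unchanged
            continue
        ones = sum(bits)
        for k in range(len(bits).bit_length()):
            if w + k >= len(out):
                out.append([])  # grow the matrix for high-order carries
            out[w + k].append((ones >> k) & 1)
    return out


def inner_product(a: List[int], b: List[int]) -> int:
    """Unsigned inner product via column compression and one final addition."""
    columns = partial_product_columns(a, b)
    while max(len(c) for c in columns) > 2:
        columns = compress_once(columns)
    # At most two rows remain; a single carry-propagate addition finishes.
    row0 = sum(c[0] << w for w, c in enumerate(columns) if len(c) > 0)
    row1 = sum(c[1] << w for w, c in enumerate(columns) if len(c) > 1)
    return row0 + row1


# Usage: 15 elements of 16 bits each, checked against plain multiplication.
a = [(i * 4099 + 17) & 0xFFFF for i in range(15)]
b = [(i * 9973 + 255) & 0xFFFF for i in range(15)]
assert inner_product(a, b) == sum(x * y for x, y in zip(a, b))
```

Each pass preserves the total value because a column of weight 2^w containing n ones contributes n·2^w, which is exactly what the counter's binary output encodes across the higher columns.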
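Main Author: | Low, Jeremy Yung Shern |
---|---|
Other Authors: | Chan Pak Kwong |
Format: | Final Year Project |
Language: | English |
Published: | 2009 |
Subjects: | DRNTU::Engineering::Electrical and electronic engineering::Integrated circuits |
Online Access: | http://hdl.handle.net/10356/16733 |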
id |
sg-ntu-dr.10356-16733 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-16733 2023-07-07T16:43:30Z ASIC implementation of a high speed and low power scalar product computation unit Low, Jeremy Yung Shern. Chan Pak Kwong Chang Chip Hong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Integrated circuits This project involves the design, synthesis, and placement and routing of an improved 16-bit, 15-element unsigned inner product architecture. Improvements to the design were made in the carry-free addition stage, also known as the column compression stage or reduction stage, where counters are incorporated to perform the preliminary partial product bit accumulation before summation using adders. This report discusses the entire application-specific integrated circuit (ASIC) implementation process, from RTL coding and functional simulation of the proposed architecture, through synthesis and timing verification of the design, to placement and routing of the synthesized design. The proposed inner product architecture can reduce the resultant height of the partial product tree by a factor of up to 4 compared with an inner product using the conventional merged arithmetic approach. This drastic decrease in resultant height leads to a significant reduction in the total number of adders, and hence in the total area. The design is estimated to save approximately 45.5% in area compared with the latest inner product architecture. The design was functionally verified using several different input test patterns. The proposed design was then synthesized using STM 90 nm technology. The synthesized design has a latency of two clock cycles with a minimum clock period of 5.25 ns, giving a total delay of 10.5 ns. Because the design is pipelined, it has a throughput of one result per clock cycle (5.25 ns). The proposed design was then placed and routed.
Preliminary timing analysis showed that the placed and routed design met the timing constraint and the design constraints, except for the maximum fanout requirement. The die size of the routed design is 1.56 mm², which includes the area of the IO pads and the special hard macro required for IO pad stability. The project shows that the proposed inner product architecture achieves a remarkable area reduction and commendable speed performance. Completing the placement and routing process with detailed timing calculations and power analysis will ensure the reliability of the 16-bit, 15-element counter-based unsigned inner product processor chip. Bachelor of Engineering 2009-05-28T03:02:13Z 2009-05-28T03:02:13Z 2009 2009 Final Year Project (FYP) http://hdl.handle.net/10356/16733 en Nanyang Technological University 127 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering::Integrated circuits |
spellingShingle |
DRNTU::Engineering::Electrical and electronic engineering::Integrated circuits Low, Jeremy Yung Shern. ASIC implementation of a high speed and low power scalar product computation unit |
description |
This project involves the design, synthesis, and placement and routing of an improved 16-bit, 15-element unsigned inner product architecture. Improvements to the design were made in the carry-free addition stage, also known as the column compression stage or reduction stage, where counters are incorporated to perform the preliminary partial product bit accumulation before summation using adders. This report discusses the entire application-specific integrated circuit (ASIC) implementation process, from RTL coding and functional simulation of the proposed architecture, through synthesis and timing verification of the design, to placement and routing of the synthesized design.
The proposed inner product architecture can reduce the resultant height of the partial product tree by a factor of up to 4 compared with an inner product using the conventional merged arithmetic approach. This drastic decrease in resultant height leads to a significant reduction in the total number of adders, and hence in the total area. The design is estimated to save approximately 45.5% in area compared with the latest inner product architecture. The design was functionally verified using several different input test patterns.
The proposed design was then synthesized using STM 90 nm technology. The synthesized design has a latency of two clock cycles with a minimum clock period of 5.25 ns, giving a total delay of 10.5 ns. Because the design is pipelined, it has a throughput of one result per clock cycle (5.25 ns). The proposed design was then placed and routed. Preliminary timing analysis showed that the placed and routed design met the timing constraint and the design constraints, except for the maximum fanout requirement. The die size of the routed design is 1.56 mm², which includes the area of the IO pads and the special hard macro required for IO pad stability.
The project shows that the proposed inner product architecture achieves a remarkable area reduction and commendable speed performance. Completing the placement and routing process with detailed timing calculations and power analysis will ensure the reliability of the 16-bit, 15-element counter-based unsigned inner product processor chip. |
author2 |
Chan Pak Kwong |
author_facet |
Chan Pak Kwong Low, Jeremy Yung Shern. |
format |
Final Year Project |
author |
Low, Jeremy Yung Shern. |
author_sort |
Low, Jeremy Yung Shern. |
title |
ASIC implementation of a high speed and low power scalar product computation unit |
title_short |
ASIC implementation of a high speed and low power scalar product computation unit |
title_full |
ASIC implementation of a high speed and low power scalar product computation unit |
title_fullStr |
ASIC implementation of a high speed and low power scalar product computation unit |
title_full_unstemmed |
ASIC implementation of a high speed and low power scalar product computation unit |
title_sort |
asic implementation of a high speed and low power scalar product computation unit |
publishDate |
2009 |
url |
http://hdl.handle.net/10356/16733 |
_version_ |
1772829143194402816 |
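
The timing figures quoted in the description (a latency of two clock cycles at a 5.25 ns minimum clock period, with one result per cycle) can be cross-checked with a few lines of arithmetic. This is a minimal sketch that assumes an ideal pipeline accepting a new set of operands every cycle with no stalls. The constant and function names are hypothetical, and the derived clock frequency is a consequence of the quoted period, not a figure taken from the report.

```python
CLOCK_PERIOD_NS = 5.25  # minimum clock period quoted in the description
LATENCY_CYCLES = 2      # pipeline latency quoted in the description


def latency_ns() -> float:
    """Delay from applying one set of operands to seeing its result."""
    return LATENCY_CYCLES * CLOCK_PERIOD_NS


def max_clock_mhz() -> float:
    """Clock frequency implied by the minimum clock period."""
    return 1000.0 / CLOCK_PERIOD_NS


def time_for_results_ns(n: int) -> float:
    """Time until the n-th result appears.

    Assumes an ideal pipeline: the first result takes the full latency,
    and every later result follows one clock period after the previous one.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return (LATENCY_CYCLES + n - 1) * CLOCK_PERIOD_NS


print(f"latency: {latency_ns()} ns")
print(f"implied max clock: {max_clock_mhz():.2f} MHz")
print(f"time for 100 results: {time_for_results_ns(100)} ns")
```

Under these assumptions the first result takes the full 10.5 ns, while each further result costs only one additional clock period, which is the benefit of pipelining that the description refers to.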