ASIC implementation of counter-based 1-D DCT processor

Bibliographic Details
Main Author: Zhang, Li
Other Authors: Chang Chip Hong
Format: Final Year Project
Language: English
Published: 2010
Online Access: http://hdl.handle.net/10356/40744
Institution: Nanyang Technological University
Description
Summary: This project proposes a new design for an 8x8 1-D Discrete Cosine Transform (DCT) unit. The design avoids multipliers for the inner products by employing a counter-based inner product architecture. Because the inner products in the DCT are constant-variable multiplications, redundancies are exploited to optimize the architecture further. This report covers the entire ASIC implementation flow, from HDL coding and functional simulation to synthesis and placement and routing of the design. Timing analysis is also performed for design verification. The counter-based inner product architecture is the major part of the design and can be viewed as a serial architecture for multiplication. It uses counters to store the values of the partial product matrix of each inner product, transforming L vertical bits into ⌊log2 L⌋ + 1 horizontal bits of the accumulated matrix and thus reducing the matrix height drastically. As a result, far fewer adders are needed in the reduction stage and less hardware is required. Moreover, because the counters can operate at a frequency of a few GHz, the accumulation can be performed quickly, and the throughput of the design is comparable with that of many existing parallel inner-product computation architectures. The 1-D DCT unit contains 8 such counter-based units, each handling one of the inner products required in the forward DCT computation. After placement and routing, the reported minimum clock period for the design is 0.5 ns. The pipelined design has an initial latency of 18 clock cycles, and the pipelining used inside the counter-based architecture enables the proposed 1-D DCT architecture to output the eight transformed values every 9 clock cycles (4.5 ns).
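The column-counting idea in the summary can be illustrated with a small behavioral model. The following Python sketch is not the report's HDL design. It assumes unsigned 8-bit inputs, 8 fractional bits for the cosine coefficients, and separate counter banks for positive and negative coefficients; none of these choices is stated in the summary, and all function names are hypothetical. It counts the ones in each column of the partial product matrix, weights each count by its column position, and shows that a column of height L needs a counter of ⌊log2 L⌋ + 1 bits.

```python
import math

N = 8           # transform length, from the summary
X_BITS = 8      # assumed input width (e.g. 8-bit samples); not stated in the summary
FRAC_BITS = 8   # assumed coefficient precision; not stated in the summary


def dct_row(k):
    """Row k of the 8-point DCT-II matrix as scaled integers (hypothetical quantization)."""
    scale = math.sqrt((1 if k == 0 else 2) / N)
    return [round(scale * math.cos((2 * n + 1) * k * math.pi / (2 * N)) * (1 << FRAC_BITS))
            for n in range(N)]


def counter_width(height):
    """Bits needed to count up to `height` ones: floor(log2(height)) + 1."""
    return height.bit_length()


def column_counts(terms):
    """Count the ones in every column of the partial product matrix.

    terms: list of (x, c) pairs with 0 <= x < 2**X_BITS and c > 0.
    Returns {column_weight: (ones_counted, column_height)}.
    """
    cols = {}
    for x, c in terms:
        for j in range(c.bit_length()):
            if not (c >> j) & 1:
                continue  # a zero bit in the constant contributes no row
            for b in range(X_BITS):
                ones, height = cols.get(j + b, (0, 0))
                cols[j + b] = (ones + ((x >> b) & 1), height + 1)
    return cols


def counter_inner_product(xs, cs):
    """Compute sum(c * x) using per-column counts instead of multipliers."""
    total = 0
    for sign in (1, -1):
        # Assumption: positive and negative coefficients use separate counter banks.
        terms = [(x, sign * c) for x, c in zip(xs, cs) if sign * c > 0]
        cols = column_counts(terms)
        total += sign * sum(ones << w for w, (ones, _) in cols.items())
    return total


def forward_dct(xs):
    """Eight inner products, one per output, mirroring the eight units in the summary."""
    return [counter_inner_product(xs, dct_row(k)) for k in range(N)]


if __name__ == "__main__":
    samples = [52, 55, 61, 66, 70, 61, 64, 73]  # arbitrary test inputs
    cols = column_counts(list(zip(samples, dct_row(0))))
    tallest = max(height for _, height in cols.values())
    print("tallest column:", tallest, "bits -> counter width:", counter_width(tallest))
    print("scaled DCT outputs:", forward_dct(samples))
```

Each output of `forward_dct` is scaled by 2^FRAC_BITS because the coefficients are stored as integers. The sketch models only the arithmetic; it says nothing about the serial timing, the pipelining, or the redundancy optimizations described in the report.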