VLSI Implementation of a Cost-Efficient Loeffler-DCT Algorithm with Recursive CORDIC for DCT-Based Encoder

Bibliographic Details
Main Authors: Chung, Rih-Lung; Chen, Chen-Wei; Chen, Chiung-An; Abu, Patricia Angela R; Chen, Shih-Lun
Format: text
Published: Archīum Ateneo 2021
Online Access: https://archium.ateneo.edu/discs-faculty-pubs/217
https://archium.ateneo.edu/cgi/viewcontent.cgi?article=1217&context=discs-faculty-pubs
Institution: Ateneo De Manila University
Description
Summary: This paper presents a low-cost, high-quality, hardware-oriented two-dimensional discrete cosine transform (2-D DCT) signal analyzer for image and video encoders. To reduce the memory requirement and improve image quality, a novel Loeffler DCT based on the coordinate rotation digital computer (CORDIC) technique is proposed. In addition, the proposed algorithm is realized by a recursive CORDIC architecture instead of an unfolded CORDIC architecture with approximated scale factors. In the proposed design, a fully pipelined architecture is developed to increase operating frequency and throughput, and the scale factors are implemented using four hardware-sharing machines to reduce complexity. Thus, the computational complexity is decreased significantly, with a loss of only 0.01 dB relative to the optimal image quality of the Loeffler DCT. Experimental results show that the proposed 2-D DCT spectral analyzer not only achieves a higher average peak signal-to-noise ratio (PSNR) than previous CORDIC-DCT algorithms but also yields a cost-efficient architecture for very-large-scale integration (VLSI) implementation. The proposed design was realized using a UMC 0.18-μm CMOS process with a synthesized gate count of 8.04 k and a core area of 75,100 μm². Its operating frequency was 100 MHz and its power consumption was 4.17 mW. Moreover, this work achieved at least a 64.1% gate count reduction and at least a 22.5% saving in power consumption compared to previous designs.
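
For readers unfamiliar with the CORDIC technique named in the summary, the following is a minimal floating-point sketch of a generic rotation-mode CORDIC, in which one shift-and-add step is reused on every iteration (the idea behind a recursive, as opposed to unfolded, structure). It is an illustration only and not the architecture of the paper: the function name `cordic_rotate`, the iteration count, the exact (non-approximated) scale factor, and the example angle 3π/16 are all assumptions made here.

```python
import math

def cordic_rotate(x, y, angle, iterations=16):
    """Rotate (x, y) by `angle` radians with generic rotation-mode CORDIC.

    Hypothetical helper for illustration; converges for |angle| up to
    roughly 1.74 rad. Hardware would use fixed-point shifts instead of
    the floating-point multiplications by 2**-i used here.
    """
    # Elementary rotation angles atan(2^-i), normally stored in a small table.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]

    # Each micro-rotation stretches the vector by sqrt(1 + 2^-2i);
    # the reciprocal of the accumulated gain is the CORDIC scale factor.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))

    z = angle  # residual angle still to be rotated
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0  # rotate toward zero residual
        # The same shift-and-add step is reused for every iteration.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]

    # Compensate the accumulated gain (computed exactly in this sketch).
    return x / gain, y / gain

# Example: rotate the unit vector (1, 0) by 3*pi/16.
theta = 3 * math.pi / 16
xr, yr = cordic_rotate(1.0, 0.0, theta)
print(xr, yr)  # close to cos(theta), sin(theta)
```

In this sketch the scale-factor compensation is a single exact division; a hardware design would typically realize it with shifts and additions, which is where approximation and hardware sharing of the kind mentioned in the summary come into play.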