16-bit high-speed CMOS multiplier IC design

Bibliographic Details
Main Author: Feng, Haotian
Other Authors: Gwee Bah Hwee
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2025
Online Access: https://hdl.handle.net/10356/182449
Institution: Nanyang Technological University
Description
Summary: This dissertation focuses on the design of a high-speed 16-bit CMOS multiplier. Multiplication is a basic operation in digital signal processing, cryptography, and arithmetic units, and it must be implemented efficiently in hardware to meet the growing demands of contemporary computing systems. The goal of the dissertation is to determine which design offers the best speed and efficiency by investigating several multiplication algorithms, including Vedic, Booth, and Wallace-tree. Owing to the parallelism and simplicity of the Vedic algorithm, the Vedic multiplier has the lowest latency in 16-bit unsigned multiplication, making it well suited to applications that require fast computation. In contrast, the Booth algorithm with Wallace-tree optimizations achieves better partial-product compression but introduces more complexity, which limits its performance in 16-bit systems. The dissertation also examines how three adder architectures, the Ripple Carry Adder (RCA), the Carry Lookahead Adder (CLA), and the Kogge-Stone Adder (KSA), affect overall multiplier performance. Relative to the RCA, the CLA and the KSA improve the computational speed of the final 32-bit addition by 93% and 51%, respectively. Simulations show that the CLA-based Wallace-Booth multiplier outperforms the KSA-based one in smaller bit-width applications because of its lower wiring overhead and complexity. The designs were described in Verilog and synthesized in Design Vision with default timing constraints, using the ST 65 nm process library. The Vedic multiplier with cascaded CLA adders has the shortest worst-case computation time, about 1280 ps when multiplying 16-bit unsigned numbers, which is about 3.75 times faster than a conventional RCA multiplier (4800 ps). It is also faster by a factor of about 2 than the multiplier unit produced by Design Compiler (DC) with its own library and synthesis logic (2800 ps).
The findings offer guidance on the trade-offs between speed, complexity, and hardware area in multiplier design, and point to further research on higher-bit-width and pipeline-optimized systems.
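
The summary does not describe the internal structure of the Vedic multiplier. As a rough behavioural illustration of the parallelism it attributes to the Vedic approach, the Python sketch below models one common way such multipliers are organised: an N-bit product is assembled from four independent N/2-bit products that are then combined with shifted additions. The function name `vedic_mul`, the `width` parameter, and the recursive decomposition itself are assumptions made for illustration, not the dissertation's Verilog design.

```python
def vedic_mul(a: int, b: int, width: int = 16) -> int:
    """Behavioural model of an unsigned width-bit multiply built from
    four half-width multiplies (hypothetical sketch, not the thesis RTL)."""
    # width must be a power of two so it can be halved down to 1 bit
    assert width >= 1 and width & (width - 1) == 0
    assert 0 <= a < (1 << width) and 0 <= b < (1 << width)

    if width == 1:
        # A 1-bit by 1-bit product is a single AND gate.
        return a & b

    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half

    # The four sub-products do not depend on one another, so in hardware
    # they can be computed by four half-width blocks in parallel.
    ll = vedic_mul(a_lo, b_lo, half)
    lh = vedic_mul(a_lo, b_hi, half)
    hl = vedic_mul(a_hi, b_lo, half)
    hh = vedic_mul(a_hi, b_hi, half)

    # Combine with shifted additions; in a hardware design this is where
    # the choice of adder (RCA, CLA, KSA) affects the critical path.
    return ll + ((lh + hl) << half) + (hh << width)


if __name__ == "__main__":
    x, y = 0xBEEF, 0x1234
    print(vedic_mul(x, y) == x * y)
```

In this model the only sequential dependency is the final combination step, which is consistent with the summary's observation that the adder architecture has a strong effect on overall multiplier delay.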