16-bit high-speed CMOS multiplier IC design
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2025 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/182449 |
Institution: | Nanyang Technological University |
Summary: | This dissertation presents the design of a high-speed 16-bit CMOS multiplier.
Multiplication, a basic operation in digital signal processing, cryptography, and
arithmetic units, must be implemented efficiently in hardware to meet the growing demands
of contemporary computing systems. The goal of the dissertation is to determine which
design offers the best speed and efficiency by investigating several multiplication
algorithms: Vedic, Booth, and Wallace-tree.
Owing to the parallelism and simplicity of the Vedic algorithm, the Vedic multiplier has
the lowest latency for 16-bit unsigned multiplication, making it well suited to
applications that require fast computation. In contrast, the Booth algorithm with
Wallace-tree optimizations compresses partial products more effectively but introduces
more complexity, which limits its performance for 16-bit operation. This dissertation
also examines how three adder architectures, the Ripple Carry Adder (RCA), Carry
Lookahead Adder (CLA), and Kogge-Stone Adder (KSA), affect overall multiplier
performance. Relative to the RCA, the CLA and the KSA improve the computational speed of
the final 32-bit addition by 93% and 51%, respectively. Simulations show that the
CLA-based Wallace-Booth multiplier outperforms the KSA-based one at smaller bit widths
because of its lower wiring overhead and complexity.
The designs were described in Verilog and evaluated post-synthesis in Design Vision,
using the ST 65 nm process library with default timing constraints. The Vedic multiplier
with cascaded CLA adders has the shortest worst-case computation time, about 1280 ps for
16-bit unsigned multiplication. This is about 3.75 times faster than a conventional RCA
multiplier (4800 ps) and roughly a factor of 2 faster than the 2800 ps of the multiplier
unit that DC generates from its own library and synthesis logic. The findings offer
guidance on the trade-offs between speed, complexity, and hardware area in multiplier
design, and point toward further research on higher-bit-width and pipeline-optimized
systems. |
---|
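The hierarchical structure behind a Vedic (Urdhva Tiryagbhyam, "vertical and crosswise") multiplier can be illustrated with a small behavioural model. The sketch below is not taken from the dissertation: it assumes one common formulation in which a 2x2 block is built from AND gates and two half adders, and wider multipliers are composed from four half-width blocks whose outputs are summed. The function names `vedic_2x2` and `vedic_multiply` are hypothetical, and Python's `+` stands in for the adder stage, where a hardware design would place RCA, CLA, or KSA adders.

```python
def vedic_2x2(a: int, b: int) -> int:
    """2x2 vertical-and-crosswise block: four AND gates and two half adders."""
    a0, a1 = a & 1, (a >> 1) & 1
    b0, b1 = b & 1, (b >> 1) & 1
    p0 = a0 & b0                  # vertical product of the low bits
    x, y = a1 & b0, a0 & b1       # the two crosswise products
    p1, c1 = x ^ y, x & y         # half adder 1: sum and carry
    z = a1 & b1                   # vertical product of the high bits
    p2, p3 = z ^ c1, z & c1       # half adder 2: sum and carry
    return (p3 << 3) | (p2 << 2) | (p1 << 1) | p0


def vedic_multiply(a: int, b: int, width: int = 16) -> int:
    """Unsigned width-bit multiply composed recursively from 2x2 blocks.

    Assumes width is a power of two (2, 4, 8, 16, ...).
    """
    if width < 2 or width & (width - 1):
        raise ValueError("width must be a power of two, at least 2")
    if not (0 <= a < (1 << width) and 0 <= b < (1 << width)):
        raise ValueError("operands must fit in the given width")
    if width == 2:
        return vedic_2x2(a, b)
    half = width // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    # The four half-width products are independent, so hardware can
    # compute them in parallel.
    ll = vedic_multiply(a_lo, b_lo, half)
    lh = vedic_multiply(a_lo, b_hi, half)
    hl = vedic_multiply(a_hi, b_lo, half)
    hh = vedic_multiply(a_hi, b_hi, half)
    # Adder stage: '+' is a stand-in for the cascaded hardware adders.
    return ll + ((lh + hl) << half) + (hh << width)
```

For any pair of 16-bit unsigned operands, `vedic_multiply(a, b)` should agree with `a * b`. In this model the adder stage is the only sequential dependency between the parallel sub-products, which is consistent with the abstract's emphasis on the choice of adder architecture.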