Efficient polynomial evaluation algorithm and implementation on FPGA

In this thesis, an optimized polynomial evaluation algorithm is presented. Compared to Horner's Rule which has the least number of computation steps but longest latency, or parallel evaluation methods like Estrin's method which are fast but with large hardware overhead, the proposed algori...

Full description

Saved in:
Bibliographic Details
Main Author: Xu, Simin
Other Authors: Ian Vince McLoughlin
Format: Theses and Dissertations
Language:English
Published: 2013
Subjects:
Online Access:https://hdl.handle.net/10356/54869
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:In this thesis, an optimized polynomial evaluation algorithm is presented. Compared to Horner's Rule which has the least number of computation steps but longest latency, or parallel evaluation methods like Estrin's method which are fast but with large hardware overhead, the proposed algorithm could achieve high level of parallelism with smallest area, by means of replacing multiplication with sqaure. To enable the performance gain for the proposed algorithm, an efficient integer squarer is proposed and implemented in FPGA with fewer DSP blocks. Previous work has presented tiling method for a double precision squarer which uses the least amount of DSP blocks so far. However it incurs a large LUT overhead and has a complex and irregular structure that it is not expandable for higher word size. The circuit proposed in this thesis can reduce the DSP block usage by an equivalent amount compared to the tiling method while incurring a much lower LUT overhead: 21.8\% fewer LUTs for a 53-bit squarer. The circuit is mapped to Xilinx Virtex 6 FPGA and evaluated for a wide range of operand word sizes, demonstrating its scalability and efficiency. With the novel squarer, the proposed polynomial algorithm exhibits 41\% latency reduction over conventional Horner's Rule for a $5^{th}$ degree polynomial with 11.9\% less area and 44.8\% latency reduction in a $4^{th}$ degree polynomial with 5\% less area on FPGA. In contrast, Estrin's method occupies 26\% and 16.5\% more area compared to Horner's Rule to achieve same level of speed improvement for the same $5^{th}$ and $4^{th}$ degree polynomial respectively.