Efficient polynomial evaluation algorithm and implementation on FPGA

Bibliographic Details
Main Author: Xu, Simin
Other Authors: Ian Vince McLoughlin
Format: Theses and Dissertations
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/54869
Institution: Nanyang Technological University
Description
Summary: This thesis presents an optimized polynomial evaluation algorithm. Horner's Rule requires the fewest computation steps but has the longest latency, while parallel evaluation methods such as Estrin's method are fast but incur a large hardware overhead. The proposed algorithm achieves a high level of parallelism with the smallest area by replacing multiplications with squarings. To enable this performance gain, an efficient integer squarer is proposed and implemented on an FPGA using fewer DSP blocks. Previous work presented a tiling method for a double-precision squarer that uses the fewest DSP blocks reported so far; however, it incurs a large LUT overhead and has a complex, irregular structure that does not extend to larger word sizes. The circuit proposed in this thesis reduces DSP block usage by an amount equivalent to the tiling method while incurring a much lower LUT overhead: 21.8% fewer LUTs for a 53-bit squarer. The circuit is mapped to a Xilinx Virtex 6 FPGA and evaluated over a wide range of operand word sizes, demonstrating its scalability and efficiency. With the novel squarer, the proposed polynomial algorithm achieves a 41% latency reduction over conventional Horner's Rule for a 5th-degree polynomial with 11.9% less area, and a 44.8% latency reduction for a 4th-degree polynomial with 5% less area on FPGA. In contrast, Estrin's method occupies 26% and 16.5% more area than Horner's Rule to achieve the same level of speed improvement for the same 5th- and 4th-degree polynomials, respectively.
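
The trade-off between the two reference methods named in the summary can be illustrated with a minimal Python sketch. It is not the thesis's proposed algorithm or its FPGA implementation, which this record does not detail. The function names `horner` and `estrin` and the lowest-degree-first coefficient order are assumptions made for illustration only.

```python
def horner(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's Rule.

    coeffs are ordered from the constant term upward (an assumption of
    this sketch). Every step needs the previous result, so the
    multiply-adds form one serial chain: few operations, long latency.
    """
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def estrin(coeffs, x):
    """Evaluate the same polynomial with Estrin's method.

    Neighbouring terms are paired as (t0 + t1 * p), (t2 + t3 * p), ...
    with p = x, then x**2, then x**4, and so on. The pairs within one
    pass do not depend on each other, so hardware could compute them in
    parallel, at the cost of more multipliers.
    """
    terms = list(coeffs)
    if not terms:
        return 0
    power = x
    while len(terms) > 1:
        if len(terms) % 2:
            terms.append(0)  # pad so every term has a partner
        terms = [terms[i] + terms[i + 1] * power
                 for i in range(0, len(terms), 2)]
        power = power * power  # next power comes from a squaring
    return terms[0]
```

For example, `horner([1, 2, 3, 4, 5, 6], 2)` and `estrin([1, 2, 3, 4, 5, 6], 2)` both return 321 for the 5th-degree polynomial 1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 at x = 2. In the Estrin sketch the powers x^2 and x^4 are produced by squaring, which is the kind of operation a dedicated squarer would serve.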