Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors

Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor organizations for data-parallel, floating-point computation in SPICE model-evaluation. Our Verilog AMS compiler produces...

Bibliographic Details
Main Authors:	Kapre, Nachiket, DeHon, Andre
Other Authors:	School of Computer Engineering
Format:	Conference or Workshop Item
Language:	English
Published:	2015
Subjects:	Computer Science and Engineering
Online Access:	https://hdl.handle.net/10356/81245 http://hdl.handle.net/10220/39197

id	sg-ntu-dr.10356-81245
record_format	dspace
spelling	sg-ntu-dr.10356-81245 2020-05-28T07:18:29Z Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors Kapre, Nachiket DeHon, Andre School of Computer Engineering 2009 International Conference on Field Programmable Logic and Applications (FPL) Computer Science and Engineering Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor organizations for data-parallel, floating-point computation in SPICE model-evaluation. Our Verilog-AMS compiler produces code for parallel evaluation of non-linear circuit models suitable for use in SPICE simulations where the same model is evaluated several times for all the devices in the circuit. Our compiler uses architecture-specific parallelization strategies (OpenMP for multi-core, PThreads for Cell, CUDA for GPU, statically scheduled VLIW for FPGA) when producing code for these different architectures. We automatically explore different implementation configurations (e.g. unroll factor, vector length) using our performance-tuner to identify the best possible configuration for each architecture. We demonstrate speedups of 3-182× for a Xilinx Virtex5 LX 330T, 1.3-33× for an IBM Cell, and 3-131× for an NVIDIA 9600 GT GPU over a 3 GHz Intel Xeon 5160 implementation for a variety of single-precision device models. Accepted version 2015-12-21T07:52:15Z 2019-12-06T14:26:25Z 2015-12-21T07:52:15Z 2019-12-06T14:26:25Z 2009 Conference Paper Kapre, N., & DeHon, A. (2009). Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors. 2009 International Conference on Field Programmable Logic and Applications. https://hdl.handle.net/10356/81245 http://hdl.handle.net/10220/39197 10.1109/FPL.2009.5272548 en © 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FPL.2009.5272548]. application/pdf
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	Computer Science and Engineering
description	Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor organizations for data-parallel, floating-point computation in SPICE model-evaluation. Our Verilog-AMS compiler produces code for parallel evaluation of non-linear circuit models suitable for use in SPICE simulations where the same model is evaluated several times for all the devices in the circuit. Our compiler uses architecture-specific parallelization strategies (OpenMP for multi-core, PThreads for Cell, CUDA for GPU, statically scheduled VLIW for FPGA) when producing code for these different architectures. We automatically explore different implementation configurations (e.g. unroll factor, vector length) using our performance-tuner to identify the best possible configuration for each architecture. We demonstrate speedups of 3-182× for a Xilinx Virtex5 LX 330T, 1.3-33× for an IBM Cell, and 3-131× for an NVIDIA 9600 GT GPU over a 3 GHz Intel Xeon 5160 implementation for a variety of single-precision device models.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Kapre, Nachiket DeHon, Andre
format	Conference or Workshop Item
author	Kapre, Nachiket DeHon, Andre
author_sort	Kapre, Nachiket
title	Performance comparison of single-precision SPICE Model-Evaluation on FPGA, GPU, Cell, and multi-core processors
publishDate	2015
url	https://hdl.handle.net/10356/81245 http://hdl.handle.net/10220/39197
_version_	1681056934061408256
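
The description above mentions two ideas that a short example may make concrete: the same device model is evaluated independently for every device in the circuit, and a tuner tries several implementation configurations and keeps the fastest. The following is only an illustrative sketch under stated assumptions, not the authors' compiler or tuner. The Shockley diode equation, the parameter values, and the names `diode_current`, `evaluate_all`, and `tune` are all hypothetical, and the `chunk` parameter merely stands in for a configuration knob such as vector length.

```python
import math
import time

# Hypothetical diode parameters (assumptions, not taken from the paper).
I_S = 1e-14      # saturation current, amperes
V_T = 0.025852   # thermal voltage near 300 K, volts


def diode_current(v):
    """Shockley diode equation: one independent model evaluation."""
    return I_S * (math.exp(v / V_T) - 1.0)


def evaluate_all(voltages, chunk):
    """Evaluate the same model for every device, `chunk` devices at a time.

    Each device depends only on its own voltage, so the chunks could be
    handed to separate threads, SIMD lanes, or hardware pipelines.
    """
    out = []
    for start in range(0, len(voltages), chunk):
        out.extend(diode_current(v) for v in voltages[start:start + chunk])
    return out


def tune(voltages, candidates):
    """Time each candidate chunk size and return the fastest one."""
    best, best_time = None, float("inf")
    for chunk in candidates:
        t0 = time.perf_counter()
        evaluate_all(voltages, chunk)
        elapsed = time.perf_counter() - t0
        if elapsed < best_time:
            best, best_time = chunk, elapsed
    return best


# One bias voltage per device (made-up input data).
voltages = [0.3 + 0.0001 * i for i in range(4000)]
currents = evaluate_all(voltages, chunk=64)
best_chunk = tune(voltages, candidates=[1, 8, 64, 512])
```

The result of `tune` depends on the machine it runs on, which is the point of empirical tuning: the configuration is chosen by measurement rather than fixed in advance.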
