Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization

Substitution, and reassociation of irregular sparse LU factorization can deliver up to 31% additional speedup over an existing state-of-the-art parallel FPGA implementation where further parallelization was deemed virtually impossible. The state-of-the-art implementation is already capable of delive...

全面介紹

Saved in:

書目詳細資料
Main Authors:	Siddhartha, Kapre, Nachiket
其他作者:	School of Computer Engineering
格式:	Conference or Workshop Item
語言:	English
出版:	2015
主題:	Computer Science and Engineering
在線閱讀:	https://hdl.handle.net/10356/81075 http://hdl.handle.net/10220/39139
標簽:	添加標簽沒有標簽, 成為第一個標記此記錄!
機構:	Nanyang Technological University
語言:	English

id	sg-ntu-dr.10356-81075
record_format	dspace
spelling	sg-ntu-dr.10356-810752020-05-28T07:17:20Z Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization Siddhartha Kapre, Nachiket School of Computer Engineering 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM) Computer Science and Engineering Substitution, and reassociation of irregular sparse LU factorization can deliver up to 31% additional speedup over an existing state-of-the-art parallel FPGA implementation where further parallelization was deemed virtually impossible. The state-of-the-art implementation is already capable of delivering 3× acceleration over CPU-based sparse LU solvers. Sparse LU factorization is a well-known computational bottleneck in many existing scientific and engineering applications and is notoriously hard to parallelize due to inherent sequential dependencies in the computation graph. In this paper, we show how to break these alleged inherent dependencies using depth-limited substitution, and reassociation of the resulting computation. This is a work-parallelism tradeoff that is well-suited for implementation on FPGA-based token dataflow architectures. Such compute organizations are capable of fast parallel processing of large irregular graphs extracted from the sparse LU computation. We manage and control the growth in additional work due to substitution through careful selection of substitution depth. We exploit associativity in the generated graphs to restructure long compute chains into reduction trees. Accepted version 2015-12-17T07:46:40Z 2019-12-06T14:20:54Z 2015-12-17T07:46:40Z 2019-12-06T14:20:54Z 2014 Conference Paper Siddhartha, & Kapre, N. (2014). Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization. 2014 IEEE 22nd Annual International Symposium on Field-Programmable Custom Computing Machines, 60-63. https://hdl.handle.net/10356/81075 http://hdl.handle.net/10220/39139 10.1109/FCCM.2014.26 en © 2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: [http://dx.doi.org/10.1109/FCCM.2014.26]. 4 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
country	Singapore
collection	DR-NTU
language	English
topic	Computer Science and Engineering
spellingShingle	Computer Science and Engineering Siddhartha Kapre, Nachiket Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
description	Substitution, and reassociation of irregular sparse LU factorization can deliver up to 31% additional speedup over an existing state-of-the-art parallel FPGA implementation where further parallelization was deemed virtually impossible. The state-of-the-art implementation is already capable of delivering 3× acceleration over CPU-based sparse LU solvers. Sparse LU factorization is a well-known computational bottleneck in many existing scientific and engineering applications and is notoriously hard to parallelize due to inherent sequential dependencies in the computation graph. In this paper, we show how to break these alleged inherent dependencies using depth-limited substitution, and reassociation of the resulting computation. This is a work-parallelism tradeoff that is well-suited for implementation on FPGA-based token dataflow architectures. Such compute organizations are capable of fast parallel processing of large irregular graphs extracted from the sparse LU computation. We manage and control the growth in additional work due to substitution through careful selection of substitution depth. We exploit associativity in the generated graphs to restructure long compute chains into reduction trees.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Siddhartha Kapre, Nachiket
format	Conference or Workshop Item
author	Siddhartha Kapre, Nachiket
author_sort	Siddhartha
title	Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
title_short	Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
title_full	Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
title_fullStr	Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
title_full_unstemmed	Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization
title_sort	breaking sequential dependencies in fpga-based sparse lu factorization
publishDate	2015
url	https://hdl.handle.net/10356/81075 http://hdl.handle.net/10220/39139
_version_	1681058425876774912

Breaking Sequential Dependencies in FPGA-Based Sparse LU Factorization

相似書籍