Heterogeneous dataflow architectures for FPGA-based sparse LU factorization


Bibliographic Details
Main Authors: Siddhartha; Kapre, Nachiket
Other Authors: School of Computer Engineering
Format: Conference or Workshop Item
Language: English
Published: 2015
Online Access: https://hdl.handle.net/10356/81188
http://hdl.handle.net/10220/39175
Description
Summary: FPGA-based token dataflow architectures with heterogeneous computation and communication subsystems can accelerate hard-to-parallelize, irregular computations in sparse LU factorization. We combine software pre-processing and architecture customization to fully expose and exploit the underlying heterogeneity in the factorization algorithm. We perform a one-time pre-processing of the sparse matrices in software to generate dataflow graphs that capture raw parallelism in the computation through substitution and reassociation transformations. We customize the dataflow architecture by picking the right mixture of addition and multiplication processing elements to match the observed balance in the dataflow graphs. Additionally, we modify the network-on-chip to route certain critical dependencies on a separate, faster communication channel while relegating less-critical traffic to the existing channels. Using our techniques, we show how to achieve speedups of up to 37% over existing state-of-the-art FPGA-based sparse LU factorization systems that can already run 3-4× faster than CPU-based sparse LU solvers using the same hardware constraints.
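
Illustrative sketch (not taken from the paper): the summary mentions two ideas that a small example may make concrete. Reassociation can turn a serial chain of additions into a balanced tree with a shorter critical path, and the split between addition and multiplication processing elements can follow the operation counts seen in the dataflow graph. The function names, the proportional-split rule, and the sample numbers below are all assumptions chosen for illustration; the record does not describe the authors' actual heuristics.

```python
import math


def chain_depth(n_terms):
    """Critical-path length of a left-to-right chain of additions."""
    return max(n_terms - 1, 0)


def tree_depth(n_terms):
    """Critical-path length after reassociating the sum into a balanced tree."""
    return math.ceil(math.log2(n_terms)) if n_terms > 1 else 0


def pe_mix(n_add, n_mul, total_pes):
    """Hypothetical rule: split PEs in proportion to add/multiply op counts.

    Keeps at least one PE of each kind. This is an assumed heuristic,
    not the method used in the paper.
    """
    add_pes = round(total_pes * n_add / (n_add + n_mul))
    add_pes = min(max(add_pes, 1), total_pes - 1)
    return add_pes, total_pes - add_pes


if __name__ == "__main__":
    # Summing 8 terms: serial chain versus balanced tree.
    print(chain_depth(8), tree_depth(8))
    # A made-up graph with 30 additions and 10 multiplications on 16 PEs.
    print(pe_mix(30, 10, 16))
```

Under these assumptions, an 8-term sum shrinks from a depth of 7 to a depth of 3, and a graph with three times as many additions as multiplications would receive three times as many adder PEs.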