Accelerating BLAS and LAPACK via efficient floating point architecture design
Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package (LAPACK) form basic building blocks for several High Performance Computing (HPC) applications and hence dictate performance of the HPC applications. Performance in such tuned packages is attained through tuning of several algorithmic...
Main Authors: | Merchant, Farhad, Chattopadhyay, Anupam, Raha, Soumyendu, Nandy, S. K., Narayan, Ranjani |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/141525 |
Institution: | Nanyang Technological University |
Similar Items
- Efficient realization of householder transform through algorithm-architecture co-design for acceleration of QR factorization
  by: Merchant, Farhad, et al.
  Published: (2020)
- Bahurupi: A polymorphic heterogeneous multi-core architecture
  by: Pricopi, M., et al.
  Published: (2013)
- SAFA: Stack and frame architecture
  by: SOO YUEN JIEN
  Published: (2010)
- Achieving efficient realization of Kalman Filter on CGRA through algorithm-architecture co-design
  by: Merchant, Farhad, et al.
  Published: (2020)
- ACCELERATING REAL-TIME COMPUTER VISION ALGORITHMS ON PARALLEL HARDWARE ARCHITECTURES.
  by: ANG ZHI PING
  Published: (2014)