Accelerating BLAS and LAPACK via efficient floating point architecture design
Basic Linear Algebra Subprograms (BLAS) and Linear Algebra Package (LAPACK) form basic building blocks for several High Performance Computing (HPC) applications and hence dictate performance of the HPC applications. Performance in such tuned packages is attained through tuning of several algorithmic...
Main Authors: | Merchant, Farhad, Chattopadhyay, Anupam, Raha, Soumyendu, Nandy, S. K., Narayan, Ranjani |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2020 |
Online Access: | https://hdl.handle.net/10356/141525 |
Institution: | Nanyang Technological University |
Similar Items
- Efficient realization of householder transform through algorithm-architecture co-design for acceleration of QR factorization
  by: Merchant, Farhad, et al.
  Published: (2020)
- Bahurupi: A polymorphic heterogeneous multi-core architecture
  by: Pricopi, M., et al.
  Published: (2013)
- Achieving efficient realization of Kalman Filter on CGRA through algorithm-architecture co-design
  by: Merchant, Farhad, et al.
  Published: (2020)
- SAFA: Stack and frame architecture
  by: SOO YUEN JIEN
  Published: (2010)
- ACCELERATING REAL-TIME COMPUTER VISION ALGORITHMS ON PARALLEL HARDWARE ARCHITECTURES.
  by: ANG ZHI PING
  Published: (2014)