Hardware acceleration for non-linear layers of transformer networks on RISC-V CPU

Bibliographic Details
Main Author: Seenivasagan Haresh
Other Authors: Chang Chip Hong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/177093
Institution: Nanyang Technological University
Description
Summary: This paper explores the utilization of hardware acceleration techniques for the non-linear layers in Transformer networks, specifically within the context of RISC-V CPU architectures. The growing complexity of Transformer-based models, highlighted by their significant computational demands, underscores the need for optimized computing solutions. Despite the widespread application of these models in generating human-like text and other multi-modal AI tasks, their deployment is often hampered by the high volume of Floating Point Operations (FLOPs) required, particularly for activation functions like GELU, Softmax, and SiLU. RISC-V, an open Instruction Set Architecture (ISA), offers a promising avenue for addressing these challenges due to its customizable and royalty-free nature. This paper investigates the potential of RISC-V CPUs to provide efficient hardware acceleration for the computationally intensive layers of Transformer networks. By focusing on non-linear layers, we aim to enhance the overall execution speed and energy efficiency of these models.
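
For context, the following is a minimal scalar C sketch of the three non-linear functions the abstract identifies as FLOP-heavy (GELU, Softmax, SiLU). The function names, the tanh-based GELU approximation, and the demo values are illustrative assumptions; this is not the project's actual RISC-V implementation, only a baseline showing the transcendental operations (expf, tanhf) that make these layers costly on a general-purpose CPU.

/*
 * Reference (non-accelerated) scalar implementations of GELU, SiLU, and
 * a numerically stable Softmax. Illustrative sketch only.
 */
#include <math.h>
#include <stddef.h>
#include <stdio.h>

/* GELU, tanh approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3))) */
static float gelu(float x)
{
    const float k = 0.7978845608f; /* sqrt(2/pi) */
    return 0.5f * x * (1.0f + tanhf(k * (x + 0.044715f * x * x * x)));
}

/* SiLU (swish): x * sigmoid(x) */
static float silu(float x)
{
    return x / (1.0f + expf(-x));
}

/* Softmax over a vector of length n, shifted by the max for stability */
static void softmax(const float *in, float *out, size_t n)
{
    float max = in[0];
    for (size_t i = 1; i < n; i++)
        if (in[i] > max)
            max = in[i];

    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        out[i] = expf(in[i] - max);
        sum += out[i];
    }
    for (size_t i = 0; i < n; i++)
        out[i] /= sum;
}

int main(void)
{
    /* Hypothetical demo inputs */
    float logits[4] = { 1.0f, 2.0f, 0.5f, -1.0f };
    float probs[4];

    softmax(logits, probs, 4);
    printf("gelu(1.0)  = %f\n", gelu(1.0f));
    printf("silu(1.0)  = %f\n", silu(1.0f));
    printf("softmax[1] = %f\n", probs[1]);
    return 0;
}

Each call to these functions involves at least one expf or tanhf evaluation per element, which is the per-token cost that a dedicated hardware unit or custom RISC-V instruction would aim to reduce.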