Towards faster inference of transformers: Strategies for accelerating decoding processes

This thesis delves into the acceleration and optimization of Transformer inference, a subject of increasing importance with the emergence of Large Language Models (LLMs). The study primarily addresses the challenges posed by two inherent properties of Transformers during inference: the quadratic com...

Bibliographic Details
Main Author: DU, Cunxiao
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access: https://ink.library.smu.edu.sg/etd_coll/613
https://ink.library.smu.edu.sg/context/etd_coll/article/1611/viewcontent/GPIS_AY2019_PhD_CunxiaoDu.pdf
Institution: Singapore Management University