Inference acceleration of large language models

This dissertation delves into the challenges and bottlenecks faced by current large language models during inference from three core perspectives: data, model, and system. Through meticulous research, key factors impacting inference speed are identified, encompassing data processing efficiency, m...

Bibliographic Details
Main Author: Zhang, Boyu
Other Authors: Mao Kezhi
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181660