Inference acceleration of large language models
This dissertation examines the challenges and bottlenecks that current large language models face during inference from three core perspectives: data, model, and system. It identifies the key factors that affect inference speed, including data processing efficiency, m...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Master by Coursework |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181660 |