OPTIMIZING INFERENCE PERFORMANCE OF BERT ON CPUS USING APACHE TVM
BERT, a model composed of Transformer layers, has been a game changer for the field of natural language processing (NLP). There has been much research on speeding up training of the model, but relatively little effort has gone into improving its inference performance. Also, not all machine learning...
| Field | Value |
|---|---|
| Main Author | |
| Format | Theses |
| Language | Indonesian |
| Online Access | https://digilib.itb.ac.id/gdl/view/56144 |
| Institution | Institut Teknologi Bandung |