OPTIMIZING INFERENCE PERFORMANCE OF BERT ON CPUS USING APACHE TVM

BERT, a model composed of Transformer layers, has been a game changer for the field of natural language processing (NLP). Many studies have focused on speeding up training of the model, but comparatively little effort has gone into improving its inference performance. Also, not all machine learning...
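The abstract points to compiling BERT with Apache TVM for CPU inference. As an illustration only (not code from the thesis), the sketch below shows a common TVM workflow for this: tracing a Hugging Face BERT model with TorchScript, importing it into Relay, and building it for an x86 CPU target. The model name, sequence length, and -mcpu flag are assumptions and would need to match the actual hardware studied.

    # Illustrative sketch (assumptions noted above), not the thesis's code.
    import torch
    import tvm
    from tvm import relay
    from tvm.contrib import graph_executor
    from transformers import BertModel, BertTokenizer

    # Assumed configuration: bert-base-uncased, batch size 1, sequence length 128.
    seq_len = 128
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    model = BertModel.from_pretrained("bert-base-uncased", torchscript=True)
    model.eval()

    enc = tokenizer("Sample input for tracing.", padding="max_length",
                    max_length=seq_len, return_tensors="pt")
    input_ids, attention_mask = enc["input_ids"], enc["attention_mask"]

    # Trace the model so TVM's PyTorch frontend can import it.
    with torch.no_grad():
        scripted = torch.jit.trace(model, (input_ids, attention_mask))

    # Import into Relay; token ids and mask are int64 tensors.
    shape_list = [("input_ids", (tuple(input_ids.shape), "int64")),
                  ("attention_mask", (tuple(attention_mask.shape), "int64"))]
    mod, params = relay.frontend.from_pytorch(scripted, shape_list)

    # Build for an x86 CPU; the -mcpu value is an assumption, adjust per machine.
    target = "llvm -mcpu=skylake-avx512"
    with tvm.transform.PassContext(opt_level=3):
        lib = relay.build(mod, target=target, params=params)

    # Run the compiled module on the CPU and read the last hidden states.
    dev = tvm.cpu(0)
    rt = graph_executor.GraphModule(lib["default"](dev))
    rt.set_input("input_ids", tvm.nd.array(input_ids.numpy()))
    rt.set_input("attention_mask", tvm.nd.array(attention_mask.numpy()))
    rt.run()
    last_hidden_state = rt.get_output(0).numpy()

Timing rt.run() over repeated calls (after a warm-up run) is the usual way to compare such a compiled module against the original framework's CPU inference.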


Bibliographic Details
Main Author: Legowo, Setyo
Format: Theses
Language: Indonesian
Online Access:https://digilib.itb.ac.id/gdl/view/56144
Institution: Institut Teknologi Bandung