Distillation and self-training in lane detection

Bibliographic Details
Main Author: Ngo, Jia Wei
Other Authors: Chen Change Loy
Format: Final Year Project
Language:English
Published: Nanyang Technological University 2020
Online Access:https://hdl.handle.net/10356/144600
Institution: Nanyang Technological University
Description
Summary: Techniques such as knowledge distillation and self-training have seen much research in recent years. These techniques are generalisable and provide performance improvements when applied to most models. Distillation allows a student network, usually one with a smaller capacity, to perform similarly to its larger teacher network while retaining its lightweight, fast properties. Self-training allows us to utilize unlabeled images at scale to improve a network's performance. Existing research has experimented mainly on classification tasks, with some recent papers exploring distillation and self-training in the semantic segmentation domain, but to the best of our knowledge, never both simultaneously. In this paper, we set out to explore the performance gains that these techniques can achieve in the domain of lane detection for self-driving cars. Our results show that knowledge distillation with dark knowledge from an ensemble of same-architecture models can provide performance gains similar to those of ensembling, while retaining the low evaluation time of a single model (an important factor for lane detection in self-driving cars). Preliminary results from self-training, which has shown positive results when used in conjunction with pre-training, suggest that large amounts of unlabeled data may provide additional performance gains on top of ensemble distillation for lane detection.
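
For context, the following is a minimal sketch of an ensemble-distillation loss in the style the summary describes: the "dark knowledge" soft targets of several same-architecture teachers are averaged, and the student is trained against them with a temperature-softened KL term plus the usual supervised loss. This is a generic Hinton-style formulation, not the project's actual implementation; the function name, temperature, and mixing weight alpha are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def ensemble_distillation_loss(student_logits, teacher_logits_list, labels,
                                   temperature=4.0, alpha=0.5):
        # Average the teachers' temperature-softened predictions over the
        # class dimension (dim=1) to form the ensemble's soft targets
        # ("dark knowledge"). Works for (B, C) or per-pixel (B, C, H, W) logits.
        soft_targets = torch.stack(
            [F.softmax(t / temperature, dim=1) for t in teacher_logits_list]
        ).mean(dim=0)

        # KL divergence between the student's softened log-probabilities and
        # the ensemble soft targets, scaled by T^2 to keep gradient
        # magnitudes comparable to the hard loss.
        soft_loss = F.kl_div(
            F.log_softmax(student_logits / temperature, dim=1),
            soft_targets,
            reduction="batchmean",
        ) * (temperature ** 2)

        # Standard supervised loss on the ground-truth labels.
        hard_loss = F.cross_entropy(student_logits, labels)

        return alpha * soft_loss + (1 - alpha) * hard_loss

Under this framing, the self-training stage the summary mentions would then use the distilled model to pseudo-label unlabeled road images and retrain on them, which is where the additional gains on top of ensemble distillation would come from.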