Long-tailed image recognition


Full description

Bibliographic Details
Main Author: Li, Zhaochen
Other Authors: Chen Change Loy
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2021
Subjects:
Online Access: https://hdl.handle.net/10356/148026
Institution: Nanyang Technological University
Description
Summary: The long-tailed distribution problem often poses great challenges to deep-learning-based computer vision tasks, causing models to perform poorly on a balanced test set, particularly on the less frequent classes. With the rapid increase in large-scale deployment of AI solutions in industry and the long-tailed nature of many real-world datasets, it becomes critical to closely examine and address this problem. Traditional class-imbalance solutions in machine learning typically adopt data re-sampling or loss re-weighting, while recent research in deep learning focuses on more sophisticated re-balancing strategies and model architecture modifications. In this paper, we set out to explore the long-tailed problem via two approaches: 1) how Mixup, a commonly used data augmentation technique, affects model performance and how it could be modified; 2) how to construct a memory module and incorporate an NCM classifier to help with prediction. We first provide a detailed analysis of Mixup under two sampling strategies: Class-Balanced Sampling and Instance-Based Sampling. We find that the original Mixup approach is hyperparameter-sensitive and fails to improve model performance. We then propose Prior-Aware Mixup, which makes use of the label distribution to govern the pair-selection process. For the second approach, we make use of knowledge distillation with a momentum-updated memory module and propose fusion and assignment techniques that outperform several state-of-the-art results on long-tailed benchmark datasets.
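As background for the first approach, a minimal sketch of standard Mixup (the hyperparameter-sensitive baseline the summary analyzes) might look like the following. The function name, the NumPy implementation, and the `alpha=0.2` default are illustrative assumptions, not the project's actual code; the real work further modifies the pair-selection step (Prior-Aware Mixup) using the label distribution.

```python
import numpy as np

def mixup_batch(x, y_onehot, alpha=0.2, rng=None):
    """Standard Mixup: convexly combine a batch with a shuffled copy of itself.

    x        : array of shape (batch, ...) of inputs
    y_onehot : array of shape (batch, classes) of one-hot labels
    alpha    : Beta(alpha, alpha) concentration; the summary notes that
               results are sensitive to this hyperparameter
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)          # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))        # random pairing within the batch
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam
```

In Prior-Aware Mixup, the uniform `rng.permutation` pairing above would be replaced by a selection rule informed by the label distribution, so that rare-class samples are paired more deliberately.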