Contrastive Self-Supervised Learning for Image Classification
Main Author:
Format: Final Year Project / Dissertation / Thesis
Published: 2021
Subjects:
Online Access: http://eprints.utar.edu.my/4189/1/17ACB01800_FYP.pdf http://eprints.utar.edu.my/4189/
Institution: Universiti Tunku Abdul Rahman
Summary: In computer vision, most state-of-the-art results are achieved by models trained with supervised learning, which requires an abundance of labelled data. However, labelling data is costly, and in some fields labelled data is scarce. This has motivated a new paradigm that falls under unsupervised learning: self-supervised learning. In self-supervised learning, the model is pretrained without any human-labelled data and learns from the data itself. The model first pretrains on a pretext task, which is designed so that the model learns representations useful for downstream tasks (e.g., classification and object localization).
One of the top performers in the self-supervised learning paradigm is SimCLR by Chen et al. (2020), which achieved 76.5% top-1 accuracy on the ImageNet dataset. Chen et al. (2020) proposed a contrastive self-supervised learning approach in which a pair of samples is produced from one image through different data augmentations, and the model learns by identifying which samples in a training batch form a pair. However, their data augmentations include random cropping, which may retain as little as 8% of the original image. With such aggressive cropping, the model may learn nothing useful about the object, as the cropped region can be pure background or contain too little detail of the object.
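To make the pairing objective concrete, below is a minimal sketch (not this project's code) of SimCLR-style contrastive pretraining in PyTorch: each image yields two augmented views, and an NT-Xent loss trains the encoder to match each view with its partner within the batch. The augmentation pipeline, the function names, and the 0.08 lower bound on the crop scale are illustrative assumptions.

```python
# Minimal sketch of SimCLR-style contrastive learning (illustrative only).
# Two augmented views per image; NT-Xent loss pulls each view towards its
# partner and away from the other samples in the batch.
import torch
import torch.nn.functional as F
from torchvision import transforms

# Assumed augmentation pipeline; scale=(0.08, 1.0) reflects the random-crop
# setting criticised in the summary (crops as small as 8% of the image).
simclr_augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.08, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
    transforms.RandomGrayscale(p=0.2),
    transforms.ToTensor(),
])

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent loss for paired projections z1, z2 of shape (N, D)."""
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)   # (2N, D)
    sim = z @ z.t() / temperature                         # cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))            # drop self-similarity
    # The positive for view i is its partner view, offset by N in the batch.
    targets = torch.cat([torch.arange(n) + n, torch.arange(n)]).to(z.device)
    return F.cross_entropy(sim, targets)
```

Here z1 and z2 would be the projection-head outputs for the two augmented views of the same batch of images; the loss is minimised when each view is most similar to its own partner among all 2N samples.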
This project therefore proposes a novel approach that replaces random cropping with a region proposal algorithm, which proposes regions based on low-level features such as colour and edges. The proposed regions are more likely to contain part of an object, which promotes better learning. As a result, the pretrained model outperforms the model pretrained with the SimCLR approach on downstream tasks.
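As an illustration of what region-proposal-based cropping could look like, here is a minimal sketch. The summary does not name the specific algorithm used, so selective search (an OpenCV implementation driven by low-level cues such as colour, texture, and edges) is assumed, and the function name propose_crop and the 8% minimum-area threshold are illustrative choices.

```python
# Minimal sketch of replacing random cropping with region-proposal-based
# cropping. Selective search is assumed as the region proposal algorithm;
# the project's actual algorithm is not specified in the summary.
import random
import cv2  # requires opencv-contrib-python for ximgproc

def propose_crop(image_bgr, min_area_frac=0.08):
    """Crop one proposed region instead of a uniformly random region."""
    ss = cv2.ximgproc.segmentation.createSelectiveSearchSegmentation()
    ss.setBaseImage(image_bgr)
    ss.switchToSelectiveSearchFast()
    boxes = ss.process()  # each box is (x, y, w, h)

    h, w = image_bgr.shape[:2]
    min_area = min_area_frac * h * w
    # Keep proposals large enough to plausibly contain part of an object.
    candidates = [b for b in boxes if b[2] * b[3] >= min_area]
    if not candidates:
        return image_bgr  # fall back to the whole image
    x, y, bw, bh = random.choice(candidates)
    return image_bgr[y:y + bh, x:x + bw]
```

Because the proposals follow colour and edge boundaries, the resulting crops are biased towards object parts rather than arbitrary background regions, which is the intuition behind replacing random cropping.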