A theory-driven self-labeling refinement method for contrastive representation learning

For an image query, unsupervised contrastive learning labels crops of the same image as positives, and crops of other images as negatives. Although intuitive, such a native label assignment strategy cannot reveal the underlying semantic similarity between a query and its positives and negatives, and impairs performance, since some negatives are semantically similar to the query or even share the same semantic class as the query. In this work, we first prove that for contrastive learning, inaccurate label assignment heavily impairs its generalization for semantic instance discrimination, while accurate labels benefit its generalization. Inspired by this theory, we propose a novel self-labeling refinement approach for contrastive learning. It improves label quality via two complementary modules: (i) a self-labeling refinery (SLR) to generate accurate labels, and (ii) momentum mixup (MM) to enhance the similarity between a query and its positive. SLR uses a positive of a query to estimate the semantic similarity between the query and its positive and negatives, and combines this estimated similarity with the vanilla label assignment in contrastive learning to iteratively generate more accurate and informative soft labels. We theoretically show that SLR can exactly recover the true semantic labels of label-corrupted data and can supervise networks to achieve zero prediction error on classification tasks. MM randomly combines queries and positives to increase the semantic similarity between the generated virtual queries and their positives, so as to improve label accuracy. Experimental results on CIFAR-10, ImageNet, VOC and COCO show the effectiveness of our method.
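The two modules described in the abstract can be illustrated roughly as follows. This is a minimal sketch, not the paper's exact formulation: the function names, the blending weight `alpha`, the temperature `tau`, and the mixing coefficient `lam` are all illustrative assumptions. SLR is sketched as blending the vanilla one-hot instance label with a similarity distribution computed from the query's positive; MM is sketched as a linear interpolation of a query with its positive.

```python
import numpy as np

def softmax(x, tau=0.2):
    # Temperature-scaled softmax over the last axis (numerically stable).
    z = x / tau
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def refine_labels(pos_emb, key_embs, one_hot, alpha=0.5, tau=0.2):
    """Hypothetical reading of the SLR module: use the query's positive
    to estimate semantic similarity to all keys, then blend that
    distribution with the vanilla one-hot instance label.
    `alpha` and `tau` are illustrative hyper-parameters."""
    sims = pos_emb @ key_embs.T       # cosine similarities if inputs are L2-normalised
    est = softmax(sims, tau)          # estimated semantic similarity distribution
    return (1.0 - alpha) * one_hot + alpha * est  # refined soft label

def momentum_mixup(query, positive, lam):
    # MM module sketch: a virtual query interpolating a query and its
    # positive, which raises the similarity between the pair.
    return lam * query + (1.0 - lam) * positive
```

Because both the one-hot label and the estimated distribution sum to one, the refined soft label remains a valid probability distribution for any `alpha` in [0, 1].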

Bibliographic Details
Main Authors: ZHOU, Pan; XIONG, Caiming; YUAN, Xiao-Tong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2021
Subjects: Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/8989
https://ink.library.smu.edu.sg/context/sis_research/article/9992/viewcontent/2021_NeurIPS_SANE.pdf
Institution: Singapore Management University
Collection: Research Collection School Of Computing and Information Systems
License: http://creativecommons.org/licenses/by-nc-nd/4.0/