S2match: self-paced sampling for data-limited semi-supervised learning

Data-limited semi-supervised learning tends to be severely degraded by miscalibration (i.e., misalignment between the confidence and correctness of predicted pseudo labels) and to become stuck at poor local minima while learning repeatedly from the same set of over-confident yet incorrect pseudo labels. We design a simple and effective self-paced sampling technique that greatly alleviates the impact of miscalibration and learns more accurate semi-supervised models from limited training data. Instead of employing static or dynamic confidence thresholds, which are sensitive to miscalibration, the proposed self-paced sampling follows a simple linear policy to select pseudo labels. This eases repeated learning from the same set of falsely predicted pseudo labels at the early training stage and effectively lowers the chance of becoming stuck at local minima. Despite its simplicity, extensive evaluations over multiple data-limited semi-supervised tasks show that the proposed self-paced sampling consistently outperforms the state of the art by large margins.
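The "simple linear policy" in the abstract can be illustrated with a minimal sketch: instead of thresholding on (possibly miscalibrated) confidence, keep only a fraction of the most confident pseudo labels, and grow that fraction linearly over training. The top-k formulation, the `start_frac`/`end_frac` hyperparameters, and the function name below are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def self_paced_select(confidences, step, total_steps,
                      start_frac=0.2, end_frac=1.0):
    """Sketch of a linear self-paced pseudo-label selection schedule.

    Keeps a fraction of pseudo labels that grows linearly with training
    progress, rather than applying a fixed confidence threshold.
    All parameter values here are illustrative assumptions.
    """
    # Linearly interpolate the kept fraction from start_frac to end_frac.
    frac = start_frac + (end_frac - start_frac) * min(step / total_steps, 1.0)
    k = max(1, int(round(frac * len(confidences))))
    # Return indices of the k most confident pseudo labels at this step.
    return np.argsort(confidences)[::-1][:k]
```

Early in training, only the few most confident pseudo labels are used, which limits repeated exposure to over-confident errors; by the end of the schedule all pseudo labels participate.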

Bibliographic Details
Main Authors: Guan, Dayan, Xing, Yun, Huang, Jiaxing, Xiao, Aoran, El Saddik, Abdulmotaleb, Lu, Shijian
Other Authors: College of Computing and Data Science
Format: Article
Language: English
Published: 2025
Subjects: Computer and Information Science; Semi-supervised learning; Self-paced learning
Online Access: https://hdl.handle.net/10356/182563
Institution: Nanyang Technological University
Published in: Pattern Recognition, vol. 159, article 111121 (2025)
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.111121
Citation: Guan, D., Xing, Y., Huang, J., Xiao, A., El Saddik, A. & Lu, S. (2025). S2match: self-paced sampling for data-limited semi-supervised learning. Pattern Recognition, 159, 111121. https://dx.doi.org/10.1016/j.patcog.2024.111121
Funding: Talent Scientific Research Start-up Project of Harbin Institute of Technology
Rights: © 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.
Collection: DR-NTU, NTU Library, Nanyang Technological University (Singapore)