S2match: self-paced sampling for data-limited semi-supervised learning
Data-limited semi-supervised learning tends to be severely degraded by miscalibration (i.e., misalignment between the confidence and correctness of predicted pseudo labels) and stuck at poor local minima while learning repeatedly from the same set of over-confident yet incorrect pseudo labels. We design a simple and effective self-paced sampling technique that can greatly alleviate the impact of miscalibration and learn more accurate semi-supervised models from limited training data. Instead of employing static or dynamic confidence thresholds, which are sensitive to miscalibration, the proposed self-paced sampling follows a simple linear policy to select pseudo labels, which eases repeated learning from the same set of falsely predicted pseudo labels at the early training stage and effectively lowers the chance of being stuck at local minima. Despite its simplicity, extensive evaluations over multiple data-limited semi-supervised tasks show that the proposed self-paced sampling consistently outperforms the state of the art by large margins.
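The "simple linear policy" the abstract describes can be illustrated with a minimal sketch. This is an assumption-laden reconstruction, not the paper's exact formulation: the function name `self_paced_sample`, the starting fraction `init_frac`, and the top-k selection rule are all illustrative choices; the key idea shown is that the *quantity* of selected pseudo labels grows linearly with training progress, rather than being gated by an absolute confidence threshold.

```python
import numpy as np

def self_paced_sample(confidences, step, total_steps, init_frac=0.2):
    """Select pseudo-labelled samples with a linear self-paced schedule.

    Instead of keeping every prediction above a fixed confidence
    threshold (sensitive to miscalibration), keep only the top-q
    fraction of predictions, where q grows linearly from `init_frac`
    to 1.0 over training. Early on, few pseudo labels are reused,
    which reduces repeated learning from confidently wrong ones.
    """
    progress = min(step / total_steps, 1.0)
    frac = init_frac + (1.0 - init_frac) * progress  # linear policy
    k = max(1, int(round(frac * len(confidences))))
    # Indices of the k most confident predictions, in ascending order.
    order = np.argsort(confidences)[::-1]
    return np.sort(order[:k])

conf = np.array([0.99, 0.40, 0.80, 0.95, 0.60])
# Early in training only the most confident fifth is kept ...
early = self_paced_sample(conf, step=0, total_steps=100)    # -> [0]
# ... while by the end every pseudo label participates.
late = self_paced_sample(conf, step=100, total_steps=100)   # -> [0, 1, 2, 3, 4]
```

Under this reading, a miscalibrated model that assigns high confidence to wrong labels still contributes only a small, slowly growing subset early on, which is what limits repeated learning from the same mistakes.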
Saved in: DR-NTU (Nanyang Technological University)
Main Authors: Guan, Dayan; Xing, Yun; Huang, Jiaxing; Xiao, Aoran; El Saddik, Abdulmotaleb; Lu, Shijian
Other Authors: College of Computing and Data Science
Format: Article
Language: English
Published: 2025
Journal: Pattern Recognition, vol. 159, article 111121
Subjects: Computer and Information Science; Semi-supervised learning; Self-paced learning
Online Access: https://hdl.handle.net/10356/182563
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-182563
Citation: Guan, D., Xing, Y., Huang, J., Xiao, A., El Saddik, A. & Lu, S. (2025). S2match: self-paced sampling for data-limited semi-supervised learning. Pattern Recognition, 159, 111121. https://dx.doi.org/10.1016/j.patcog.2024.111121
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2024.111121
Scopus ID: 2-s2.0-85208241871
Funding: This research was funded by the Talent Scientific Research Start-up Project of Harbin Institute of Technology.
Rights: © 2024 Elsevier Ltd. All rights are reserved, including those for text and data mining, AI training, and similar technologies.