Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring. (1) We first reveal that frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL) which continually sets the classifiers of PTM to prototype features can beat state-of-the-art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of PTM and adapted models for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, which holds the advantages of PTM’s generalizability and adapted model’s adaptivity. (3) Additionally, considering previous ImageNet-based benchmarks are unsuitable in the era of PTM due to data overlapping, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
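The SimpleCIL baseline described in the abstract reduces to a nearest-class-prototype classifier built on frozen pre-trained features. The sketch below is a minimal illustration of that idea, not the authors' implementation; the `backbone` feature extractor, the per-task data loaders, and the cosine-similarity scoring are assumptions made for this example (the official code is at https://github.com/zhoudw-zdw/RevisitingCIL).

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def add_task_prototypes(backbone, loader, prototypes):
    """Append one prototype (mean embedding) per new class in the current task.

    backbone:   a frozen pre-trained model mapping inputs -> d-dim features (assumed).
    loader:     yields (inputs, labels) for the classes of the current task (assumed).
    prototypes: dict {class_id: d-dim tensor}, updated in place.
    """
    backbone.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(backbone(x))   # frozen embeddings, no gradient
        labels.append(y)
    feats, labels = torch.cat(feats), torch.cat(labels)
    for c in labels.unique().tolist():
        prototypes[c] = feats[labels == c].mean(dim=0)
    return prototypes


@torch.no_grad()
def predict(backbone, x, prototypes):
    """Classify by cosine similarity to the stored class prototypes."""
    z = F.normalize(backbone(x), dim=-1)                                     # (B, d)
    classes = sorted(prototypes)
    W = F.normalize(torch.stack([prototypes[c] for c in classes]), dim=-1)   # (C, d)
    scores = z @ W.t()                                                       # (B, C)
    return torch.tensor(classes)[scores.argmax(dim=-1)]
```

In a class-incremental run, `add_task_prototypes` would be called once per incoming task, which matches the abstract's point that the classifier can be constructed without any training on the downstream data.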
Main Authors: Zhou, Da-Wei; Cai, Zi-Wen; Ye, Han-Jia; Zhan, De-Chuan; Liu, Ziwei
Other Authors: College of Computing and Data Science; S-Lab
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Catastrophic forgetting; Class-incremental learning
Online Access: https://hdl.handle.net/10356/181218
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-181218
Collection: DR-NTU (NTU Library)
Journal: International Journal of Computer Vision
Citation: Zhou, D., Cai, Z., Ye, H., Zhan, D. & Liu, Z. (2024). Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02218-0
ISSN: 0920-5691
DOI: 10.1007/s11263-024-02218-0
Scopus ID: 2-s2.0-85202713390
Funding: Agency for Science, Technology and Research (A*STAR); Ministry of Education (MOE); Nanyang Technological University. This work is partially supported by National Science and Technology Major Project (2022ZD0114805), Fundamental Research Funds for the Central Universities (2024300373), NSFC (62376118, 62006112, 62250069, 61921006), Collaborative Innovation Center of Novel Software Technology and Industrialization, China Scholarship Council, Ministry of Education, Singapore, under its MOE AcRF Tier 2 (MOET2EP20221-0012), NTU NAP, and under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative.
Grant Numbers: MOET2EP20221-0012; NTU NAP; IAF-ICP
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.
Date Available: 2024-11-18