Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring. (1) We first reveal that frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL) which continually sets the classifiers of PTM to prototype features can beat state-of-the-art even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of PTM and adapted models for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, which holds the advantages of PTM’s generalizability and adapted model’s adaptivity. (3) Additionally, considering previous ImageNet-based benchmarks are unsuitable in the era of PTM due to data overlapping, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper with a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
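The SimpleCIL baseline described in the abstract reduces to a nearest-class-prototype classifier built on frozen pre-trained features. The sketch below is a minimal illustration of that idea, not the authors' implementation; the `backbone` feature extractor, the per-task data loaders, and the cosine-similarity scoring are assumptions made for this example (the official code is at https://github.com/zhoudw-zdw/RevisitingCIL).

```python
import torch
import torch.nn.functional as F


@torch.no_grad()
def add_task_prototypes(backbone, loader, prototypes):
    """Append one prototype (mean embedding) per new class in the current task.

    backbone:   a frozen pre-trained model mapping inputs -> d-dim features (assumed).
    loader:     yields (inputs, labels) for the classes of the current task (assumed).
    prototypes: dict {class_id: d-dim tensor}, updated in place.
    """
    backbone.eval()
    feats, labels = [], []
    for x, y in loader:
        feats.append(backbone(x))   # frozen embeddings, no gradient
        labels.append(y)
    feats, labels = torch.cat(feats), torch.cat(labels)
    for c in labels.unique().tolist():
        prototypes[c] = feats[labels == c].mean(dim=0)
    return prototypes


@torch.no_grad()
def predict(backbone, x, prototypes):
    """Classify by cosine similarity to the stored class prototypes."""
    z = F.normalize(backbone(x), dim=-1)                                     # (B, d)
    classes = sorted(prototypes)
    W = F.normalize(torch.stack([prototypes[c] for c in classes]), dim=-1)   # (C, d)
    scores = z @ W.t()                                                       # (B, C)
    return torch.tensor(classes)[scores.argmax(dim=-1)]
```

In a class-incremental run, `add_task_prototypes` would be called once per incoming task, which matches the abstract's point that the classifier can be constructed without any training on the downstream data.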
Main Authors: Zhou, Da-Wei; Cai, Zi-Wen; Ye, Han-Jia; Zhan, De-Chuan; Liu, Ziwei
Other Authors: College of Computing and Data Science; S-Lab
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Catastrophic forgetting; Class-incremental learning
Online Access: https://hdl.handle.net/10356/181218
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-181218
Collection: DR-NTU (NTU Library)
Journal: International Journal of Computer Vision
Citation: Zhou, D., Cai, Z., Ye, H., Zhan, D. & Liu, Z. (2024). Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02218-0
ISSN: 0920-5691
DOI: 10.1007/s11263-024-02218-0
Scopus ID: 2-s2.0-85202713390
Funding: Agency for Science, Technology and Research (A*STAR); Ministry of Education (MOE); Nanyang Technological University. This work is partially supported by National Science and Technology Major Project (2022ZD0114805), Fundamental Research Funds for the Central Universities (2024300373), NSFC (62376118, 62006112, 62250069, 61921006), Collaborative Innovation Center of Novel Software Technology and Industrialization, China Scholarship Council, Ministry of Education, Singapore, under its MOE AcRF Tier 2 (MOET2EP20221-0012), NTU NAP, and under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative.
Grant Numbers: MOET2EP20221-0012; NTU NAP; IAF-ICP
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.
Date Available: 2024-11-18