Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need

Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting old ones. Traditional CIL models are trained from scratch to continually acquire knowledge as data evolves. Recently, pre-training has achieved substantial progress, making vast pre-trained models (PTMs) accessible for CIL. Contrary to traditional methods, PTMs possess generalizable embeddings, which can be easily transferred for CIL. In this work, we revisit CIL with PTMs and argue that the core factors in CIL are adaptivity for model updating and generalizability for knowledge transferring. (1) We first reveal that a frozen PTM can already provide generalizable embeddings for CIL. Surprisingly, a simple baseline (SimpleCIL), which continually sets the classifiers of the PTM to prototype features, can beat state-of-the-art methods even without training on the downstream task. (2) Due to the distribution gap between pre-trained and downstream datasets, the PTM can be further cultivated with adaptivity via model adaptation. We propose AdaPt and mERge (Aper), which aggregates the embeddings of the PTM and the adapted model for classifier construction. Aper is a general framework that can be orthogonally combined with any parameter-efficient tuning method, retaining the advantages of the PTM's generalizability and the adapted model's adaptivity. (3) Additionally, considering that previous ImageNet-based benchmarks are unsuitable in the era of PTMs due to data overlap, we propose four new benchmarks for assessment, namely ImageNet-A, ObjectNet, OmniBenchmark, and VTAB. Extensive experiments validate the effectiveness of Aper within a unified and concise framework. Code is available at https://github.com/zhoudw-zdw/RevisitingCIL.
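The SimpleCIL baseline described above amounts to a nearest-prototype classifier built on frozen pre-trained embeddings: each new class is represented by the mean of its embeddings, and queries are matched to the closest prototype. The sketch below illustrates only that idea; it is not the authors' released code (see the linked GitHub repository for that), and the random embeddings, class labels, and cosine-similarity scoring here are assumptions for illustration.

```python
# Minimal sketch of a SimpleCIL-style prototype classifier. Assumes the
# embeddings come from a frozen pre-trained backbone (any extractor works);
# here random arrays stand in for real features.
import numpy as np


class PrototypeClassifier:
    """Keeps one prototype (mean embedding) per class seen so far."""

    def __init__(self):
        self.prototypes = {}  # class id -> mean embedding

    def add_classes(self, embeddings, labels):
        """Incrementally register new classes from their frozen-PTM embeddings."""
        for c in np.unique(labels):
            self.prototypes[int(c)] = embeddings[labels == c].mean(axis=0)

    def predict(self, embeddings):
        """Nearest-prototype prediction by cosine similarity."""
        classes = sorted(self.prototypes)
        protos = np.stack([self.prototypes[c] for c in classes])
        protos = protos / np.linalg.norm(protos, axis=1, keepdims=True)
        feats = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        return np.array(classes)[(feats @ protos.T).argmax(axis=1)]


# Toy usage: two incremental tasks, 512-d stand-in embeddings, no retraining.
rng = np.random.default_rng(0)
clf = PrototypeClassifier()
task1_x, task1_y = rng.normal(size=(20, 512)), np.repeat([0, 1], 10)
clf.add_classes(task1_x, task1_y)   # task 1 introduces classes 0 and 1
task2_x, task2_y = rng.normal(size=(20, 512)), np.repeat([2, 3], 10)
clf.add_classes(task2_x, task2_y)   # task 2 introduces classes 2 and 3
print(clf.predict(task1_x[:3]))     # old-class queries still map to their prototypes
```

Because only class means are stored and the backbone is never updated, old prototypes are never overwritten, which is why such a baseline sidesteps catastrophic forgetting at the classifier level.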

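Aper, as described in the abstract, additionally adapts a copy of the PTM (e.g. with a parameter-efficient tuning method) and then aggregates frozen and adapted embeddings before constructing the prototype classifier. A hedged sketch of that aggregation step, under the assumption that the two feature arrays stand in for real extractor outputs:

```python
# Illustrative sketch of the aggregation step attributed to Aper in the
# abstract: concatenate frozen-PTM and adapted-model embeddings, then build
# one prototype per class on the combined representation. The feature arrays
# here are placeholders, not outputs of the actual models.
import numpy as np


def aggregate(frozen_feats, adapted_feats):
    """Concatenate the two embedding spaces feature-wise."""
    return np.concatenate([frozen_feats, adapted_feats], axis=1)


def class_prototypes(frozen_feats, adapted_feats, labels):
    """Mean aggregated embedding per class; these act as classifier weights."""
    feats = aggregate(frozen_feats, adapted_feats)
    return {int(c): feats[labels == c].mean(axis=0) for c in np.unique(labels)}


# Toy usage: 768-d frozen and 768-d adapted features for three classes.
rng = np.random.default_rng(1)
frozen = rng.normal(size=(30, 768))
adapted = rng.normal(size=(30, 768))
labels = np.repeat([0, 1, 2], 10)
protos = class_prototypes(frozen, adapted, labels)
print({c: p.shape for c, p in protos.items()})  # each prototype is 1536-d
```

At test time the same concatenation is applied to query embeddings and classes are scored against these prototypes, e.g. by cosine similarity as in the SimpleCIL sketch; the frozen half preserves the PTM's generalizability while the adapted half contributes task-specific adaptivity.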

Bibliographic Details
Main Authors: Zhou, Da-Wei; Cai, Zi-Wen; Ye, Han-Jia; Zhan, De-Chuan; Liu, Ziwei
Other Authors: College of Computing and Data Science; S-Lab
Format: Journal Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Catastrophic forgetting; Class-incremental learning
Online Access:https://hdl.handle.net/10356/181218
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-181218
Citation: Zhou, D., Cai, Z., Ye, H., Zhan, D. & Liu, Z. (2024). Revisiting class-incremental learning with pre-trained models: generalizability and adaptivity are all you need. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02218-0
Journal: International Journal of Computer Vision
ISSN: 0920-5691
DOI: 10.1007/s11263-024-02218-0
Scopus ID: 2-s2.0-85202713390
Date Deposited: 2024-11-18
Collection: DR-NTU (NTU Library), Nanyang Technological University, Singapore
Funding: Agency for Science, Technology and Research (A*STAR); Ministry of Education (MOE); Nanyang Technological University. This work is partially supported by National Science and Technology Major Project (2022ZD0114805), Fundamental Research Funds for the Central Universities (2024300373), NSFC (62376118, 62006112, 62250069, 61921006), Collaborative Innovation Center of Novel Software Technology and Industrialization, China Scholarship Council, Ministry of Education, Singapore, under its MOE AcRF Tier 2 (MOET2EP20221-0012), NTU NAP, and under the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative.
Grants: MOET2EP20221-0012; NTU NAP; IAF-ICP
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.