Finding meta winning ticket to train your MAML
The lottery ticket hypothesis (LTH) states that a randomly initialized dense network contains sub-networks that can be trained in isolation to the performance of the dense network. In this paper, to achieve rapid learning with less computational cost, we explore LTH in the context of meta learning....
Saved in:
Main Authors: | , , , , , |
---|---|
Format: | text |
Language: | English |
Published: |
Institutional Knowledge at Singapore Management University
2022
|
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/7256 https://ink.library.smu.edu.sg/context/sis_research/article/8259/viewcontent/kdd22_gao.pdf |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Singapore Management University |
Language: | English |
Summary: | The lottery ticket hypothesis (LTH) states that a randomly initialized dense network contains sub-networks that can be trained in isolation to the performance of the dense network. In this paper, to achieve rapid learning with less computational cost, we explore LTH in the context of meta learning. First, we experimentally show that there are sparse sub-networks, known as meta winning tickets, which can be meta-trained to few-shot classification accuracy to the original backbone. The application of LTH in meta learning enables the adaptation of meta-trained networks on various IoT devices with fewer computation. However, the status quo to identify winning tickets requires iterative training and pruning, which is particularly expensive for finding meta winning tickets. To this end, then we investigate the inter- and intra-layer patterns among different meta winning tickets, and propose a scheme for early detection of a meta winning ticket. The proposed scheme enables efficient training in resource-limited devices. Besides, it also designs a lightweight solution to search the meta winning ticket. Evaluations on standard few-shot classification benchmarks show that we can find competitive meta winning tickets with 20% weights of the original backbone, while incurring only 8%-14% (Conv-4) and 19%-29% (ResNet-12) computation overhead (measured by FLOPs) of the standard winning ticket finding scheme. |
---|