Theory-inspired path-regularized differential network architecture search

Despite its high search efficiency, differential architecture search (DARTS) often selects network architectures dominated by skip connections, which leads to performance degradation. However, a theoretical understanding of this issue is still lacking, hindering the development of more advanced methods...

Bibliographic Details
Main Authors: ZHOU, Pan, XIONG, Caiming, SOCHER, Richard, HOI, Steven C. H.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: OS and Networks; Systems Architecture
Online Access: https://ink.library.smu.edu.sg/sis_research/8998
https://ink.library.smu.edu.sg/context/sis_research/article/10001/viewcontent/2020_NeurIPS_NAS__1_.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10001
record_format dspace
spelling sg-smu-ink.sis_research-10001 2024-07-25T08:19:47Z 2020-12-01T08:00:00Z text application/pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic OS and Networks
Systems Architecture
description Despite its high search efficiency, differential architecture search (DARTS) often selects network architectures dominated by skip connections, which leads to performance degradation. However, a theoretical understanding of this issue is still lacking, which hinders the principled development of more advanced methods. In this work, we solve this problem by theoretically analyzing the effects of various types of operations, e.g. convolution, skip connection and zero operation, on network optimization. We prove that architectures with more skip connections can converge faster than the other candidates, and thus are selected by DARTS. This result, for the first time, theoretically and explicitly reveals the impact of skip connections on fast network optimization and their competitive advantage over other types of operations in DARTS. Then we propose a theory-inspired path-regularized DARTS that consists of two key modules: (i) a differential group-structured sparse binary gate introduced for each operation to avoid unfair competition among operations, and (ii) a path-depth-wise regularization used to encourage search exploration of deep architectures, which often converge more slowly than shallow ones, as shown in our theory, and are therefore not well explored during search. Experimental results on image classification tasks validate its advantages.
format text
author ZHOU, Pan
XIONG, Caiming
SOCHER, Richard
HOI, Steven C. H.
author_facet ZHOU, Pan
XIONG, Caiming
SOCHER, Richard
HOI, Steven C. H.
author_sort ZHOU, Pan
title Theory-inspired path-regularized differential network architecture search
title_short Theory-inspired path-regularized differential network architecture search
title_full Theory-inspired path-regularized differential network architecture search
title_fullStr Theory-inspired path-regularized differential network architecture search
title_full_unstemmed Theory-inspired path-regularized differential network architecture search
title_sort theory-inspired path-regularized differential network architecture search
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/8998
https://ink.library.smu.edu.sg/context/sis_research/article/10001/viewcontent/2020_NeurIPS_NAS__1_.pdf
_version_ 1814047687858716672