Empirical risk landscape analysis for understanding deep neural networks

This work aims to provide a comprehensive landscape analysis of the empirical risk in deep neural networks (DNNs), including the convergence behavior of its gradient, its stationary points, and the empirical risk itself to their corresponding population counterparts, which reveals how various network parameters determine the convergence performance. In particular, for an l-layer linear neural network with d_i neurons in the i-th layer, we prove that the gradient of its empirical risk uniformly converges to that of its population risk at the rate of O(r^{2l} √(l · max_i d_i · s · log(d/l) / n)). Here d is the total weight dimension, s is the number of nonzero entries of all the weights, n is the sample size, and the magnitude of the weights in each layer is upper bounded by r. Moreover, we prove a one-to-one correspondence between the non-degenerate stationary points of the empirical and population risks and provide a convergence guarantee for each pair. We also establish the uniform convergence of the empirical risk to its population counterpart and further derive stability and generalization bounds for the empirical risk. In addition, we analyze these properties for deep nonlinear neural networks with sigmoid activation functions and prove similar results for the convergence behavior of their empirical risk gradients, non-degenerate stationary points, and the empirical risk itself. To the best of our knowledge, this work is the first to theoretically characterize the uniform convergence of the gradient and stationary points of the empirical risk of DNN models, which benefits the theoretical understanding of how the network depth l, the layer widths d_i, the network size d, the weight sparsity s, and the parameter magnitude r determine the neural network landscape.
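The paper's headline result, that empirical-risk gradients converge uniformly to their population counterparts as the sample size n grows, can be illustrated numerically. The sketch below is not the paper's construction: it uses a hypothetical one-layer linear model with squared loss, where the population gradient has a closed form, and compares it against the empirical gradient at a fixed weight vector as n increases.

```python
import numpy as np

def empirical_gradient(w, X, y):
    """Gradient of the empirical squared-error risk (1/n) * sum_i (w^T x_i - y_i)^2."""
    n = X.shape[0]
    residual = X @ w - y
    return (2.0 / n) * (X.T @ residual)

def population_gradient(w, w_star, cov):
    """Closed-form population gradient E[2 (w^T x - y) x] = 2 * cov @ (w - w_star)
    for y = w_star^T x + noise, with zero-mean noise independent of x."""
    return 2.0 * cov @ (w - w_star)

rng = np.random.default_rng(0)
d = 5
w_star = rng.normal(size=d)   # ground-truth weights generating the labels
w = rng.normal(size=d)        # fixed point at which the two gradients are compared
cov = np.eye(d)               # x ~ N(0, I), so E[x x^T] = I

gaps = {}
for n in [100, 100_000]:
    X = rng.normal(size=(n, d))
    y = X @ w_star + 0.1 * rng.normal(size=n)
    gaps[n] = np.linalg.norm(
        empirical_gradient(w, X, y) - population_gradient(w, w_star, cov)
    )

# The gradient gap shrinks roughly like 1/sqrt(n), mirroring the sqrt(.../n)
# dependence on the sample size in the rate stated in the abstract.
assert gaps[100_000] < gaps[100]
```

The 1/√n behavior visible here is only the sample-size axis of the paper's bound; the depth, width, sparsity, and magnitude factors (l, d_i, s, r) would require the multi-layer analysis developed in the paper itself.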

Bibliographic Details
Main Authors: ZHOU, Pan, FENG, Jiashi
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects: OS and Networks; Theory and Algorithms
Online Access:https://ink.library.smu.edu.sg/sis_research/9023
https://ink.library.smu.edu.sg/context/sis_research/article/10026/viewcontent/2018_ICLR_DNN_Theory.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-10026
record_format dspace
spelling sg-smu-ink.sis_research-10026 2024-07-25T08:04:46Z Empirical risk landscape analysis for understanding deep neural networks ZHOU, Pan; FENG, Jiashi. This work aims to provide a comprehensive landscape analysis of the empirical risk in deep neural networks (DNNs), including the convergence behavior of its gradient, its stationary points, and the empirical risk itself to their corresponding population counterparts, which reveals how various network parameters determine the convergence performance. In particular, for an l-layer linear neural network with d_i neurons in the i-th layer, we prove that the gradient of its empirical risk uniformly converges to that of its population risk at the rate of O(r^{2l} √(l · max_i d_i · s · log(d/l) / n)). Here d is the total weight dimension, s is the number of nonzero entries of all the weights, n is the sample size, and the magnitude of the weights in each layer is upper bounded by r. Moreover, we prove a one-to-one correspondence between the non-degenerate stationary points of the empirical and population risks and provide a convergence guarantee for each pair. We also establish the uniform convergence of the empirical risk to its population counterpart and further derive stability and generalization bounds for the empirical risk. In addition, we analyze these properties for deep nonlinear neural networks with sigmoid activation functions and prove similar results for the convergence behavior of their empirical risk gradients, non-degenerate stationary points, and the empirical risk itself. To the best of our knowledge, this work is the first to theoretically characterize the uniform convergence of the gradient and stationary points of the empirical risk of DNN models, which benefits the theoretical understanding of how the network depth l, the layer widths d_i, the network size d, the weight sparsity s, and the parameter magnitude r determine the neural network landscape.
2018-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9023 https://ink.library.smu.edu.sg/context/sis_research/article/10026/viewcontent/2018_ICLR_DNN_Theory.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University OS and Networks; Theory and Algorithms
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic OS and Networks
Theory and Algorithms
spellingShingle OS and Networks
Theory and Algorithms
ZHOU, Pan
FENG, Jiashi
Empirical risk landscape analysis for understanding deep neural networks
format text
author ZHOU, Pan
FENG, Jiashi
author_facet ZHOU, Pan
FENG, Jiashi
author_sort ZHOU, Pan
title Empirical risk landscape analysis for understanding deep neural networks
title_short Empirical risk landscape analysis for understanding deep neural networks
title_full Empirical risk landscape analysis for understanding deep neural networks
title_fullStr Empirical risk landscape analysis for understanding deep neural networks
title_full_unstemmed Empirical risk landscape analysis for understanding deep neural networks
title_sort empirical risk landscape analysis for understanding deep neural networks
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/9023
https://ink.library.smu.edu.sg/context/sis_research/article/10026/viewcontent/2018_ICLR_DNN_Theory.pdf
_version_ 1814047710673633280