A simple proximal stochastic gradient method for nonsmooth nonconvex optimization
We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the objective function is the sum of a differentiable (possibly nonconvex) component and a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our main contribution lies in the analysis of ProxSVRG+: it recovers several existing convergence results and improves/generalizes them (in terms of the number of stochastic gradient oracle calls and proximal oracle calls). In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., NIPS'17] for the smooth nonconvex case. ProxSVRG+ is also more straightforward than SCSG and yields a simpler analysis. Moreover, ProxSVRG+ outperforms deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem posed in [Reddi et al., NIPS'16], and it uses far fewer proximal oracle calls than ProxSVRG [Reddi et al., NIPS'16]. Furthermore, for nonconvex functions satisfying the Polyak-Łojasiewicz (PL) condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restarts, unlike ProxSVRG; thus it can automatically switch to the faster linear convergence in regions where the objective satisfies the PL condition locally. Finally, we conduct several experiments, and the results are consistent with the theory.
Main Authors: LI, Zhize; LI, Jian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2018
Subjects: Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/8674 https://ink.library.smu.edu.sg/context/sis_research/article/9677/viewcontent/NeurIPS_2018_a_simple_proximal_stochastic_gradient_method_for_nonsmooth_nonconvex_optimization_Paper.pdf
Institution: Singapore Management University
Record ID: sg-smu-ink.sis_research-9677
Published Date: 2018-12-01
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems
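
The abstract above concerns composite finite-sum problems of the form $\min_x F(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x) + h(x)$, where each $f_i$ is smooth (possibly nonconvex) and $h$ is convex but possibly nonsmooth. As a rough illustration of the variance-reduced proximal template that ProxSVRG+ builds on, here is a minimal sketch of a proximal SVRG-style loop with subsampled snapshot gradients; the names (grad_fi, prox_l1), the choice of an l1 regularizer, and all step-size/batch parameters are illustrative assumptions, not the paper's exact ProxSVRG+ specification.

```python
# Illustrative sketch (not the authors' reference code) of a proximal
# SVRG-style method for min_x (1/n) * sum_i f_i(x) + h(x), where each
# f_i is smooth (possibly nonconvex) and h is convex but possibly
# nonsmooth. Here h(x) = lam * ||x||_1, whose proximal operator is
# soft-thresholding; grad_fi(x, i) is assumed to return the gradient
# of the i-th component at x.
import numpy as np

def prox_l1(x, step, lam):
    """Proximal operator of step * lam * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

def prox_svrg_plus_sketch(x0, grad_fi, n, lam, eta=0.01, epochs=20,
                          batch_B=64, minibatch_b=8, rng=None):
    """Variance-reduced proximal stochastic gradient sketch.

    Each epoch takes a snapshot gradient over a subsampled batch of
    size batch_B, then runs inner steps whose stochastic gradients are
    corrected by the snapshot, each followed by a proximal step.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        batch = rng.choice(n, size=min(batch_B, n), replace=False)
        g_snap = np.mean([grad_fi(snapshot, i) for i in batch], axis=0)
        for _ in range(max(1, n // minibatch_b)):
            idx = rng.choice(n, size=min(minibatch_b, n), replace=False)
            # Variance-reduced gradient estimator: minibatch gradient
            # at x, corrected by the snapshot gradient.
            v = np.mean([grad_fi(x, i) - grad_fi(snapshot, i)
                         for i in idx], axis=0) + g_snap
            x = prox_l1(x - eta * v, eta, lam)
    return x
```

For example, with least-squares components $f_i(x) = \frac{1}{2}(a_i^\top x - b_i)^2$ one could pass grad_fi = lambda x, i: (A[i] @ x - b[i]) * A[i]. The subsampled snapshot (batch_B < n) is what distinguishes this template from plain ProxSVRG, which computes the full gradient at every snapshot.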