A simple proximal stochastic gradient method for nonsmooth nonconvex optimization
We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems, where the objective function is the sum of a differentiable (possibly nonconvex) component and a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. Our main contribution lies in the analysis of ProxSVRG+: it recovers several existing convergence results and improves/generalizes them (in terms of the number of stochastic gradient oracle calls and proximal oracle calls). In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., NIPS'17] for the smooth nonconvex case. ProxSVRG+ is also more straightforward than SCSG and yields a simpler analysis. Moreover, ProxSVRG+ outperforms deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem posed in [Reddi et al., NIPS'16], and it uses far fewer proximal oracle calls than ProxSVRG [Reddi et al., NIPS'16]. Furthermore, for nonconvex functions satisfying the Polyak-Łojasiewicz (PL) condition, we prove that ProxSVRG+ achieves a global linear convergence rate without restarts, unlike ProxSVRG; thus it can automatically switch to the faster linear convergence in regions where the objective satisfies the PL condition locally. Finally, we conduct several experiments, and the results are consistent with the theory.
Main Authors: LI, Zhize; LI, Jian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2018
Subjects: Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/8674 https://ink.library.smu.edu.sg/context/sis_research/article/9677/viewcontent/NeurIPS_2018_a_simple_proximal_stochastic_gradient_method_for_nonsmooth_nonconvex_optimization_Paper.pdf
Institution: Singapore Management University
Record ID: sg-smu-ink.sis_research-9677
Published Date: 2018-12-01
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems
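
The abstract above concerns composite finite-sum problems of the form $\min_x F(x) = \frac{1}{n}\sum_{i=1}^{n} f_i(x) + h(x)$, where each $f_i$ is smooth (possibly nonconvex) and $h$ is convex but possibly nonsmooth. As a rough illustration of the variance-reduced proximal template that ProxSVRG+ builds on, here is a minimal sketch of a proximal SVRG-style loop with subsampled snapshot gradients; the names (grad_fi, prox_l1), the choice of an l1 regularizer, and all step-size/batch parameters are illustrative assumptions, not the paper's exact ProxSVRG+ specification.

```python
# Illustrative sketch (not the authors' reference code) of a proximal
# SVRG-style method for min_x (1/n) * sum_i f_i(x) + h(x), where each
# f_i is smooth (possibly nonconvex) and h is convex but possibly
# nonsmooth. Here h(x) = lam * ||x||_1, whose proximal operator is
# soft-thresholding; grad_fi(x, i) is assumed to return the gradient
# of the i-th component at x.
import numpy as np

def prox_l1(x, step, lam):
    """Proximal operator of step * lam * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - step * lam, 0.0)

def prox_svrg_plus_sketch(x0, grad_fi, n, lam, eta=0.01, epochs=20,
                          batch_B=64, minibatch_b=8, rng=None):
    """Variance-reduced proximal stochastic gradient sketch.

    Each epoch takes a snapshot gradient over a subsampled batch of
    size batch_B, then runs inner steps whose stochastic gradients are
    corrected by the snapshot, each followed by a proximal step.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        batch = rng.choice(n, size=min(batch_B, n), replace=False)
        g_snap = np.mean([grad_fi(snapshot, i) for i in batch], axis=0)
        for _ in range(max(1, n // minibatch_b)):
            idx = rng.choice(n, size=min(minibatch_b, n), replace=False)
            # Variance-reduced gradient estimator: minibatch gradient
            # at x, corrected by the snapshot gradient.
            v = np.mean([grad_fi(x, i) - grad_fi(snapshot, i)
                         for i in idx], axis=0) + g_snap
            x = prox_l1(x - eta * v, eta, lam)
    return x
```

For example, with least-squares components $f_i(x) = \frac{1}{2}(a_i^\top x - b_i)^2$ one could pass grad_fi = lambda x, i: (A[i] @ x - b[i]) * A[i]. The subsampled snapshot (batch_B < n) is what distinguishes this template from plain ProxSVRG, which computes the full gradient at every snapshot.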