New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity
As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys the merits of both stochastic and full gradient methods for finite-sum problem optimization. However, the existing rate-of-convergence analysis for HSGD has been conducted under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that, under WoRS and for both convex and non-convex problems, HSGD with a constant step-size can still match full gradient descent in rate of convergence while maintaining a sample-size-independent incremental first-order oracle complexity comparable to that of stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results support our theoretical findings and demonstrate the favorable efficiency of WoRS-based HSGD.
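To make the method concrete, the sketch below illustrates the kind of procedure the abstract describes: a hybrid stochastic gradient descent loop that draws growing mini-batches without replacement and uses a constant step-size, applied to a toy least-squares finite-sum objective. The objective, the geometric batch-growth schedule, the step size, and the function name `hsgd_least_squares` are illustrative assumptions for exposition, not the exact algorithm or schedule analyzed in the paper.

```python
import numpy as np


def hsgd_least_squares(A, b, step_size=0.1, n_epochs=20, growth=1.3, seed=0):
    """Hedged sketch of hybrid SGD with without-replacement sampling (WoRS).

    Minimizes the finite-sum least-squares objective
        f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2.
    The mini-batch size grows geometrically across epochs (the growth factor
    and schedule are illustrative assumptions, not the paper's exact choice),
    and each epoch's batches are drawn without replacement via a permutation.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    batch_size = 1.0
    for _ in range(n_epochs):
        # WoRS: shuffling once per epoch and slicing yields batches
        # drawn without replacement within the epoch.
        perm = rng.permutation(n)
        start = 0
        while start < n:
            idx = perm[start:start + max(1, int(batch_size))]
            residual = A[idx] @ x - b[idx]
            grad = A[idx].T @ residual / len(idx)  # mini-batch gradient estimate
            x -= step_size * grad                  # constant step-size update
            start += len(idx)
        # Grow the batch toward the full-gradient regime (illustrative schedule).
        batch_size = min(float(n), batch_size * growth)
    return x


if __name__ == "__main__":
    # Toy usage on synthetic data.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 5))
    x_true = rng.standard_normal(5)
    b = A @ x_true
    x_hat = hsgd_least_squares(A, b)
    print("parameter error:", np.linalg.norm(x_hat - x_true))
```

The hybrid behavior the abstract refers to is visible in this sketch: early epochs behave like SGD (cheap, noisy small-batch steps), while later epochs approach full gradient descent as the batch grows toward the whole sample.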
Main Authors: | ZHOU, Pan; YUAN, Xiao-Tong; FENG, Jiashi |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2018 |
Subjects: | OS and Networks |
Online Access: | https://ink.library.smu.edu.sg/sis_research/9007 https://ink.library.smu.edu.sg/context/sis_research/article/10010/viewcontent/NeurIPS_2018_new_insight_into_hybrid_stochastic_gradient_descent_beyond_with_replacement_sampling_and_convexity_Paper.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-10010 |
---|---|
record_format | dspace |
spelling | sg-smu-ink.sis_research-10010 2024-07-25T08:15:18Z New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity ZHOU, Pan; YUAN, Xiao-Tong; FENG, Jiashi. As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys the merits of both stochastic and full gradient methods for finite-sum problem optimization. However, the existing rate-of-convergence analysis for HSGD has been conducted under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that, under WoRS and for both convex and non-convex problems, HSGD with a constant step-size can still match full gradient descent in rate of convergence while maintaining a sample-size-independent incremental first-order oracle complexity comparable to that of stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results support our theoretical findings and demonstrate the favorable efficiency of WoRS-based HSGD. 2018-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9007 https://ink.library.smu.edu.sg/context/sis_research/article/10010/viewcontent/NeurIPS_2018_new_insight_into_hybrid_stochastic_gradient_descent_beyond_with_replacement_sampling_and_convexity_Paper.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University OS and Networks |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | OS and Networks |
spellingShingle | OS and Networks; ZHOU, Pan; YUAN, Xiao-Tong; FENG, Jiashi; New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
description | As an incremental-gradient algorithm, the hybrid stochastic gradient descent (HSGD) enjoys the merits of both stochastic and full gradient methods for finite-sum problem optimization. However, the existing rate-of-convergence analysis for HSGD has been conducted under with-replacement sampling (WRS) and is restricted to convex problems. It is not clear whether HSGD still carries these advantages under the common practice of without-replacement sampling (WoRS) for non-convex problems. In this paper, we affirmatively answer this open question by showing that, under WoRS and for both convex and non-convex problems, HSGD with a constant step-size can still match full gradient descent in rate of convergence while maintaining a sample-size-independent incremental first-order oracle complexity comparable to that of stochastic gradient descent. For a special class of finite-sum problems with linear prediction models, our convergence results can be further improved in some cases. Extensive numerical results support our theoretical findings and demonstrate the favorable efficiency of WoRS-based HSGD. |
format | text |
author | ZHOU, Pan; YUAN, Xiao-Tong; FENG, Jiashi |
author_facet | ZHOU, Pan; YUAN, Xiao-Tong; FENG, Jiashi |
author_sort | ZHOU, Pan |
title | New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
title_short | New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
title_full | New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
title_fullStr | New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
title_full_unstemmed | New insight into hybrid stochastic gradient descent: Beyond with-replacement sampling and convexity |
title_sort | new insight into hybrid stochastic gradient descent: beyond with-replacement sampling and convexity |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2018 |
url | https://ink.library.smu.edu.sg/sis_research/9007 https://ink.library.smu.edu.sg/context/sis_research/article/10010/viewcontent/NeurIPS_2018_new_insight_into_hybrid_stochastic_gradient_descent_beyond_with_replacement_sampling_and_convexity_Paper.pdf |
_version_ | 1814047690500079616 |