A fast Anderson-Chebyshev acceleration for nonlinear optimization

Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed-point iterations $x_{t+1}=G(x_t)$; e.g., gradient descent can be viewed as iteratively applying the operator $G(x) \triangleq x-\alpha\nabla f(x)$. Anderson acceleration is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomials achieves the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves on the previous $O(\kappa\ln\frac{1}{\epsilon})$ result of Toth and Kelley (2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. In addition, if the hyperparameters (e.g., the Lipschitz smoothness parameter $L$) are not available, we propose an algorithm that guesses them dynamically and prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD) and Nesterov's accelerated GD. Furthermore, these algorithms combined with the proposed guessing algorithm (which guesses the hyperparameters dynamically) achieve much better performance.
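To make the fixed-point view concrete, here is a minimal sketch (not the paper's implementation) of Anderson acceleration applied to the gradient-descent map $G(x)=x-\alpha\nabla f(x)$, using the standard least-squares mixing of the most recent residuals. The function name `anderson_gd`, the step size `alpha`, the memory `m`, and the small regularizer `reg` are illustrative assumptions, not values from the paper.

```python
import numpy as np

def anderson_gd(grad_f, x0, alpha=0.1, m=5, iters=100, reg=1e-10):
    """Anderson acceleration of the fixed-point map G(x) = x - alpha * grad_f(x).

    Minimal type-II Anderson mixing sketch: keep the last m+1 iterates and
    their residuals r_i = G(x_i) - x_i, solve a small regularized
    least-squares problem for mixing weights that sum to 1, and combine the
    corresponding G(x_i). Hyperparameters are illustrative only.
    """
    G = lambda x: x - alpha * grad_f(x)
    xs, gs = [x0], [G(x0)]
    x = gs[-1]
    for _ in range(iters):
        xs.append(x)
        gs.append(G(x))
        xs, gs = xs[-(m + 1):], gs[-(m + 1):]                    # bounded memory
        R = np.stack([g - y for g, y in zip(gs, xs)], axis=1)    # residual matrix, shape (d, k)
        k = R.shape[1]
        # Minimize ||R c||^2 subject to sum(c) = 1: c is proportional to
        # (R^T R + reg I)^{-1} * ones, then normalized.
        c = np.linalg.solve(R.T @ R + reg * np.eye(k), np.ones(k))
        c /= c.sum()
        x = sum(ci * gi for ci, gi in zip(c, gs))                # mixed update
    return x

# Toy usage on a strongly convex quadratic f(x) = 0.5 * x^T Q x (hypothetical example):
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 20))
Q = M.T @ M + np.eye(20)                      # positive definite Hessian
grad_f = lambda x: Q @ x                      # gradient; minimizer is x* = 0
x = anderson_gd(grad_f, rng.standard_normal(20), alpha=1.0 / np.linalg.norm(Q, 2))
print("||x - x*|| =", float(np.linalg.norm(x)))
```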

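The $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$ rate quoted in the abstract is the accelerated rate attainable with Chebyshev polynomials on quadratics. For reference only, the sketch below shows the textbook Chebyshev step-size schedule for gradient descent on a quadratic with Hessian spectrum in $[\mu, L]$ (this is not the paper's combined Anderson-Chebyshev method, and `chebyshev_gd` is a hypothetical helper name).

```python
import numpy as np

def chebyshev_gd(grad_f, x0, mu, L, T):
    """Gradient descent with the classical Chebyshev step-size schedule.

    For a quadratic whose Hessian eigenvalues lie in [mu, L], taking the T
    step sizes to be the reciprocals of the roots of the degree-T Chebyshev
    polynomial mapped to [mu, L] yields the accelerated
    O(sqrt(L/mu) * log(1/eps)) behaviour. Sketch only.
    """
    ks = np.arange(T)
    # Chebyshev roots cos((2k+1)pi/(2T)) on [-1, 1], mapped affinely to [mu, L].
    roots = (L + mu) / 2 + (L - mu) / 2 * np.cos((2 * ks + 1) * np.pi / (2 * T))
    x = x0
    for lam in roots:
        x = x - (1.0 / lam) * grad_f(x)  # one gradient step with step size 1/lam
    return x
```

In exact arithmetic the ordering of the roots does not matter; in floating point, larger $T$ typically calls for a stability-aware ordering or the equivalent two-term Chebyshev recurrence.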

Bibliographic Details
Main Authors: LI, Zhize, LI, Jian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: Databases and Information Systems
Online Access:https://ink.library.smu.edu.sg/sis_research/8680
https://ink.library.smu.edu.sg/context/sis_research/article/9683/viewcontent/AISTATS20_full_anderson.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-9683
record_format dspace
spelling sg-smu-ink.sis_research-9683 2024-03-28T09:05:37Z A fast Anderson-Chebyshev acceleration for nonlinear optimization LI, Zhize LI, Jian Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed-point iterations $x_{t+1}=G(x_t)$; e.g., gradient descent can be viewed as iteratively applying the operator $G(x) \triangleq x-\alpha\nabla f(x)$. Anderson acceleration is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomials achieves the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves on the previous $O(\kappa\ln\frac{1}{\epsilon})$ result of Toth and Kelley (2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. In addition, if the hyperparameters (e.g., the Lipschitz smoothness parameter $L$) are not available, we propose an algorithm that guesses them dynamically and prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD) and Nesterov's accelerated GD. Furthermore, these algorithms combined with the proposed guessing algorithm (which guesses the hyperparameters dynamically) achieve much better performance. 2020-08-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/8680 https://ink.library.smu.edu.sg/context/sis_research/article/9683/viewcontent/AISTATS20_full_anderson.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Databases and Information Systems
spellingShingle Databases and Information Systems
LI, Zhize
LI, Jian
A fast Anderson-Chebyshev acceleration for nonlinear optimization
description Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed-point iterations $x_{t+1}=G(x_t)$; e.g., gradient descent can be viewed as iteratively applying the operator $G(x) \triangleq x-\alpha\nabla f(x)$. Anderson acceleration is known to be quite efficient in practice and can be viewed as an extension of Krylov subspace methods to nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomials achieves the optimal convergence rate $O(\sqrt{\kappa}\ln\frac{1}{\epsilon})$, which improves on the previous $O(\kappa\ln\frac{1}{\epsilon})$ result of Toth and Kelley (2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. In addition, if the hyperparameters (e.g., the Lipschitz smoothness parameter $L$) are not available, we propose an algorithm that guesses them dynamically and prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD) and Nesterov's accelerated GD. Furthermore, these algorithms combined with the proposed guessing algorithm (which guesses the hyperparameters dynamically) achieve much better performance.
format text
author LI, Zhize
LI, Jian
author_facet LI, Zhize
LI, Jian
author_sort LI, Zhize
title A fast Anderson-Chebyshev acceleration for nonlinear optimization
title_short A fast Anderson-Chebyshev acceleration for nonlinear optimization
title_full A fast Anderson-Chebyshev acceleration for nonlinear optimization
title_fullStr A fast Anderson-Chebyshev acceleration for nonlinear optimization
title_full_unstemmed A fast Anderson-Chebyshev acceleration for nonlinear optimization
title_sort fast anderson-chebyshev acceleration for nonlinear optimization
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/8680
https://ink.library.smu.edu.sg/context/sis_research/article/9683/viewcontent/AISTATS20_full_anderson.pdf
_version_ 1795302170417954816