Adan: Adaptive Nesterov Momentum Algorithm for faster optimizing deep models

In deep learning, different kinds of deep networks typically need different optimizers, which have to be chosen after multiple trials, making the training process inefficient. To relieve this issue and consistently improve the model training speed across deep networks, we propose the ADAptive Nester...

Bibliographic Details
Main Authors: XIE, Xingyu, ZHOU, Pan, LI, Huan, LIN, Zhouchen, YAN, Shuicheng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access: https://ink.library.smu.edu.sg/sis_research/9037
https://ink.library.smu.edu.sg/context/sis_research/article/10040/viewcontent/ADAN_sv.pdf
Institution: Singapore Management University