Win: Weight-decay-integrated Nesterov acceleration for faster network training

Training deep networks on large-scale datasets is computationally challenging. This work explores the problem of "how to accelerate adaptive gradient algorithms in a general manner", and proposes an effective Weight-decay-Integrated Nesterov acceleration (Win) to accelerate adaptive algorithms....

Bibliographic Details
Main Authors:	ZHOU, Pan; XIE, Xingyu; LIN, Zhouchen; TOH, Kim-Chuan; YAN, Shuicheng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Accelerated Adaptive Gradient Algorithms; Deep Learning Optimizer; Network Optimization; Nesterov Acceleration in Deep Learning; OS and Networks; Theory and Algorithms
Online Access:	https://ink.library.smu.edu.sg/sis_research/8969 https://ink.library.smu.edu.sg/context/sis_research/article/9972/viewcontent/2024JMLR.pdf
Institution:	Singapore Management University

Similar Items

Win: Weight-decay-integrated Nesterov acceleration for adaptive gradient algorithms
by: ZHOU, Pan, et al.
Published: (2023)

Adan: Adaptive Nesterov Momentum Algorithm for faster optimizing deep models
by: XIE, Xingyu, et al.
Published: (2024)

Stitching weight-shared deep neural networks for efficient multitask inference on GPU
by: WANG, Zeyu, et al.
Published: (2022)

MACHINE LEARNING ACCELERATION FOR EDGE COMPUTING IN DISTRIBUTED SENSOR NETWORKS
by: DE ALWIS WATHUTHANTHRIGE UDARI CHARITHA
Published: (2023)

Network traffic classification based on deep learning
by: Cheng, Li
Published: (2023)

EDLAB: a benchmark for edge deep learning accelerators
by: Kong, Hao, et al.
Published: (2022)

Planning multiple levels constant stress accelerated life tests
by: Tang, L.-C., et al.
Published: (2014)

Audee: Automated testing for deep learning frameworks
by: GUO, Qianyu, et al.
Published: (2020)

Recent advances in deep learning for object detection
by: WU, Xiongwei, et al.
Published: (2020)

DNN model theft through trojan side-channel on edge FPGA accelerator
by: Chandrasekar, Srivatsan, et al.
Published: (2024)

TOWARDS SCALABLE GRADIENT-BASED HYPERPARAMETER OPTIMIZATION IN DEEP NEURAL NETWORKS
by: FU JIE
Published: (2017)

Pruning meta-trained networks for on-device adaptation
by: GAO, Dawei, et al.
Published: (2021)

An inexact accelerated proximal gradient method for large scale linearly constrained convex SDP
by: Jiang, K., et al.
Published: (2014)

Deep reinforcement learning for dynamic algorithm selection: A proof-of-principle study on differential evolution
by: GUO, Hongshu, et al.
Published: (2024)

A multiple objective framework for planning accelerated life tests
by: Tang, L.-C., et al.
Published: (2014)

Multi-target deep neural networks: Theoretical analysis and implementation
by: ZENG, Zeng, et al.
Published: (2018)

Deep reinforcement learning guided improvement heuristic for job shop scheduling
by: ZHANG, Cong, et al.
Published: (2024)

Faster first-order methods for stochastic non-convex optimization on Riemannian manifolds
by: ZHOU, Pan, et al.
Published: (2019)

Deep Reinforcement Learning With Explicit Context Representation
by: Munguia-Galeano, Francisco, et al.
Published: (2023)

Pruning Blocks for CNN Compression and Acceleration via Online Ensemble Distillation
by: Wang, Z., et al.
Published: (2022)

Miniaturizing the large hadron collider: review and analysis of particle accelerator technologies
by: Xu, Wenqin
Published: (2024)

Motions with minimal joint speeds and joint accelerations for redundant manipulators
by: Lee, H.P.
Published: (2014)

Norm-based generalisation bounds for deep multi-class convolutional neural networks
by: LEDENT, Antoine, et al.
Published: (2021)

EXPLOITING GRADIENT INFORMATION FOR MODERN MACHINE LEARNING PROBLEMS
by: CHEN YIZHOU
Published: (2022)

Lifetime prediction using accelerated test data and neural networks
by: Freitag, S., et al.
Published: (2014)

Planning for accelerated life tests
by: Tang, L.-C.
Published: (2014)

Planning accelerated life tests under scheduled inspections for log-location-scale distributions
by: Liu, X., et al.
Published: (2014)

An empirical study of GUI widget detection for industrial mobile games
by: YE, Jiaming, et al.
Published: (2021)

Which neural network makes more explainable decisions? An approach towards measuring explainability
by: ZHANG, Mengdi, et al.
Published: (2022)

Intent recognition in smart living through deep recurrent neural networks
by: ZHANG, Xiang, et al.
Published: (2017)

On-device deep multi-task inference via multi-task zipping
by: HE, Xiaoxi, et al.
Published: (2023)

Planning of step-stress accelerated degradation test
by: Tang, L.C., et al.
Published: (2014)

Action selection for composable modular deep reinforcement learning
by: GUPTA, Vaibhav, et al.
Published: (2021)

LOW-POWER MANY-CORE ARCHITECTURES FOR THE NEXT GENERATION WEARABLES
by: TAN CHENG
Published: (2019)

Cross-domain retinopathy classification with optical coherence tomography images via a novel deep domain adaptation method
by: Luo, Yuemei, et al.
Published: (2023)

Planning sequential constant-stress accelerated life tests with stepwise loaded auxiliary acceleration factor
by: Liu, X., et al.
Published: (2014)

Deep anomaly detection with deviation networks
by: PANG, Guansong, et al.
Published: (2019)

Empirical risk landscape analysis for understanding deep neural networks
by: ZHOU, Pan, et al.
Published: (2018)

Efficient stochastic gradient hard thresholding
by: ZHOU, Pan, et al.
Published: (2018)