Convergence of non-convex non-concave GANs using Sinkhorn divergence

Bibliographic Details
Main Authors: Adnan, Risman, Saputra, Muchlisin Adi, Fadlil, Junaidillah, Ezerman, Martianus Frederic, Iqbal, Muhamad, Basaruddin, Tjan
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/154075
Institution: Nanyang Technological University
Description
Summary: Sinkhorn divergence is a symmetric normalization of entropic regularized optimal transport. It is smooth and continuous, metrizes weak convergence, and has excellent geometric properties. We use it as an alternative to the minimax objective function in formulating generative adversarial networks. The optimization is defined with Sinkhorn divergence as the objective, under the non-convex and non-concave condition. This work focuses on the convergence and stability of that optimization. We propose a first-order sequential stochastic gradient descent ascent (SeqSGDA) algorithm. Under some mild approximations, the learning converges to local minimax points. Using the structural similarity index measure (SSIM), we supply a non-asymptotic analysis of the algorithm's convergence rate. Empirical evidence shows a convergence rate that is inversely proportional to the number of iterations when tested on the tiny colour datasets Cats and CelebA, using deep convolutional generative adversarial network (DCGAN) and ResNet architectures. The entropy regularization parameter $\varepsilon$ is approximated by the SSIM tolerance $\epsilon$. We determine the iteration complexity of reaching an $\epsilon$-stationary point to be $\mathcal{O}\left(\kappa \log(\epsilon^{-1})\right)$, where $\kappa$ is a value that depends on the Sinkhorn divergence's convexity and the minimax step ratio in the SeqSGDA algorithm.
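
To illustrate the "symmetric normalization" mentioned in the summary, the sketch below computes a Sinkhorn divergence between two point clouds using the commonly used form $S_\varepsilon(\alpha,\beta) = \mathrm{OT}_\varepsilon(\alpha,\beta) - \tfrac{1}{2}\mathrm{OT}_\varepsilon(\alpha,\alpha) - \tfrac{1}{2}\mathrm{OT}_\varepsilon(\beta,\beta)$. It is not taken from the article: the function names, the uniform sample weights, the squared Euclidean cost, the fixed iteration count, and the default value of `eps` are all assumptions made for illustration, and the record does not say how the authors compute the divergence.

```python
import numpy as np


def _logsumexp(z, axis):
    """Numerically stable log(sum(exp(z))) along one axis."""
    z_max = z.max(axis=axis, keepdims=True)
    return np.squeeze(z_max, axis=axis) + np.log(np.exp(z - z_max).sum(axis=axis))


def entropic_ot(x, y, eps=1.0, n_iters=200):
    """Approximate entropic OT cost between uniform empirical measures.

    Hypothetical helper: runs log-domain Sinkhorn iterations on the dual
    potentials (f, g) and returns the dual value <a, f> + <b, g>.
    """
    n, m = len(x), len(y)
    log_a = np.full(n, -np.log(n))  # uniform weights on x (assumption)
    log_b = np.full(m, -np.log(m))  # uniform weights on y (assumption)
    # Cost matrix: half the squared Euclidean distance (assumption).
    cost = 0.5 * ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternating soft-min updates of the two dual potentials.
        f = -eps * _logsumexp(log_b[None, :] + (g[None, :] - cost) / eps, axis=1)
        g = -eps * _logsumexp(log_a[:, None] + (f[:, None] - cost) / eps, axis=0)
    return float(np.exp(log_a) @ f + np.exp(log_b) @ g)


def sinkhorn_divergence(x, y, eps=1.0, n_iters=200):
    """Debiased divergence: OT(x, y) - 0.5 * OT(x, x) - 0.5 * OT(y, y)."""
    return (
        entropic_ot(x, y, eps, n_iters)
        - 0.5 * entropic_ot(x, x, eps, n_iters)
        - 0.5 * entropic_ot(y, y, eps, n_iters)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(30, 2))   # stand-in for real samples
    fake = real + 2.0                 # stand-in for shifted generator output
    print("S(real, real) =", sinkhorn_divergence(real, real))
    print("S(real, fake) =", sinkhorn_divergence(real, fake))
```

In this sketch the two self-transport terms cancel the entropic bias, so the divergence of a sample with itself evaluates to zero, while a shifted copy should give a positive value. In a GAN setting a differentiable version of such a quantity would serve as the training objective; the SeqSGDA update rule itself is not specified in this record and is not reproduced here.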