Distributed bandit online convex optimization with time-varying coupled inequality constraints

Distributed bandit online convex optimization with time-varying coupled inequality constraints is considered, motivated by a repeated game between a group of learners and an adversary. The learners attempt to minimize a sequence of global loss functions and at the same time satisfy a sequence of cou...

Full description

Saved in:
Bibliographic Details
Main Authors: Yi, Xinlei, Li, Xiuxian, Yang, Tao, Xie, Lihua, Chai, Tianyou, Johansson, Karl Henrik
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language:English
Published: 2022
Subjects:
Online Access:https://hdl.handle.net/10356/159488
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-159488
record_format dspace
spelling sg-ntu-dr.10356-1594882022-06-21T08:48:13Z Distributed bandit online convex optimization with time-varying coupled inequality constraints Yi, Xinlei Li, Xiuxian Yang, Tao Xie, Lihua Chai, Tianyou Johansson, Karl Henrik School of Electrical and Electronic Engineering Engineering::Electrical and electronic engineering Bandit Convex Optimization Distributed Optimization Distributed bandit online convex optimization with time-varying coupled inequality constraints is considered, motivated by a repeated game between a group of learners and an adversary. The learners attempt to minimize a sequence of global loss functions and at the same time satisfy a sequence of coupled constraint functions, where the constraints are coupled across the distributed learners at each round. The global loss and the coupled constraint functions are the sum of local convex loss and constraint functions, respectively, which are adaptively generated by the adversary. The local loss and constraint functions are revealed in a bandit manner, i.e., only the values of loss and constraint functions are revealed to the learners at the sampling instance, and the revealed function values are held privately by each learner. Both one-and two-point bandit feedback are studied with the two corresponding distributed bandit online algorithms used by the learners. We show that sublinear expected regret and constraint violation are achieved by these two algorithms, if the accumulated variation of the comparator sequence also grows sublinearly. In particular, we show that O(Tθ) expected static regret and O(T7/4-θ) constraint violation are achieved in the one-point bandit feedback setting, and O(Tmax {κ,1-κ}) expected static regret and O(T1-κ/2) constraint violation in the two-point bandit feedback setting, where θ ∈ (3/4,5/6] and κ ∈ (0,1) are user-defined tradeoff parameters. Finally, the tightness of the theoretical results is illustrated by numerical simulations of a simple power grid example, which also compares the proposed algorithms to algorithms existing in the literature. Ministry of Education (MOE) This work was supported in part by the Knut and Alice Wallenberg Foundation, the Swedish Foundation for Strategic Research, the Swedish Research Council, Ministry of Education of Republic of Singapore under Grant MoE Tier 1 RG72/19 and in part by the National Natural Science Foundation of China under Grant 61991403, Grant 61991404, and Grant 61991400, and 2020 Science and Technology Major Project of Liaoning Province under Grant 2020JH1/10100008. 2022-06-21T08:48:13Z 2022-06-21T08:48:13Z 2020 Journal Article Yi, X., Li, X., Yang, T., Xie, L., Chai, T. & Johansson, K. H. (2020). Distributed bandit online convex optimization with time-varying coupled inequality constraints. IEEE Transactions On Automatic Control, 66(10), 4620-4635. https://dx.doi.org/10.1109/TAC.2020.3030883 0018-9286 https://hdl.handle.net/10356/159488 10.1109/TAC.2020.3030883 2-s2.0-85104227016 10 66 4620 4635 en RG72/19 IEEE Transactions on Automatic Control © 2020 IEEE. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
Bandit Convex Optimization
Distributed Optimization
spellingShingle Engineering::Electrical and electronic engineering
Bandit Convex Optimization
Distributed Optimization
Yi, Xinlei
Li, Xiuxian
Yang, Tao
Xie, Lihua
Chai, Tianyou
Johansson, Karl Henrik
Distributed bandit online convex optimization with time-varying coupled inequality constraints
description Distributed bandit online convex optimization with time-varying coupled inequality constraints is considered, motivated by a repeated game between a group of learners and an adversary. The learners attempt to minimize a sequence of global loss functions and at the same time satisfy a sequence of coupled constraint functions, where the constraints are coupled across the distributed learners at each round. The global loss and the coupled constraint functions are the sum of local convex loss and constraint functions, respectively, which are adaptively generated by the adversary. The local loss and constraint functions are revealed in a bandit manner, i.e., only the values of loss and constraint functions are revealed to the learners at the sampling instance, and the revealed function values are held privately by each learner. Both one-and two-point bandit feedback are studied with the two corresponding distributed bandit online algorithms used by the learners. We show that sublinear expected regret and constraint violation are achieved by these two algorithms, if the accumulated variation of the comparator sequence also grows sublinearly. In particular, we show that O(Tθ) expected static regret and O(T7/4-θ) constraint violation are achieved in the one-point bandit feedback setting, and O(Tmax {κ,1-κ}) expected static regret and O(T1-κ/2) constraint violation in the two-point bandit feedback setting, where θ ∈ (3/4,5/6] and κ ∈ (0,1) are user-defined tradeoff parameters. Finally, the tightness of the theoretical results is illustrated by numerical simulations of a simple power grid example, which also compares the proposed algorithms to algorithms existing in the literature.
author2 School of Electrical and Electronic Engineering
author_facet School of Electrical and Electronic Engineering
Yi, Xinlei
Li, Xiuxian
Yang, Tao
Xie, Lihua
Chai, Tianyou
Johansson, Karl Henrik
format Article
author Yi, Xinlei
Li, Xiuxian
Yang, Tao
Xie, Lihua
Chai, Tianyou
Johansson, Karl Henrik
author_sort Yi, Xinlei
title Distributed bandit online convex optimization with time-varying coupled inequality constraints
title_short Distributed bandit online convex optimization with time-varying coupled inequality constraints
title_full Distributed bandit online convex optimization with time-varying coupled inequality constraints
title_fullStr Distributed bandit online convex optimization with time-varying coupled inequality constraints
title_full_unstemmed Distributed bandit online convex optimization with time-varying coupled inequality constraints
title_sort distributed bandit online convex optimization with time-varying coupled inequality constraints
publishDate 2022
url https://hdl.handle.net/10356/159488
_version_ 1736856397786644480