Rollout approach to sensor scheduling for remote state estimation under integrity attack

Bibliographic Details
Main Authors: Liu, Hanxiao, Li, Yuchao, Johansson, Karl Henrik, Mårtensson, Jonas, Xie, Lihua
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/163294
Institution: Nanyang Technological University
Description
Summary: We consider the sensor scheduling problem for remote state estimation under integrity attacks. We seek to optimize a trade-off between the energy consumption of communications and the state estimation error covariance when the acknowledgment (ACK) information, sent by the remote estimator to the local sensor, is compromised. The sensor scheduling problem is formulated as an infinite-horizon discounted optimal control problem with an infinite state space. We first analyze the underlying Markov decision process (MDP) and show that the optimal scheduling without ACK attack is of the threshold type. This allows us to simplify the problem by replacing the original state space with a finite one. For the simplified MDP, when the ACK is under attack, the problem is modeled as a partially observable Markov decision process (POMDP). We analyze the MDP induced by the POMDP, which uses a belief vector as its state. We investigate the properties of the exact optimal solution via contractive models and show that a threshold-type solution for the POMDP cannot be readily obtained. A suboptimal solution is then obtained via a rollout approach, a prominent class of reinforcement learning (RL) methods based on approximation in value space. We present two variants of rollout and provide performance bounds for them. Finally, numerical examples demonstrate the effectiveness of the proposed rollout methods by comparing them with a finite-history-window approach that is widely used in RL for POMDPs.
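
Illustrative sketch (not part of the record, and not the authors' implementation): the pipeline the summary describes, a belief-state MDP derived from the POMDP combined with one-step rollout using a threshold base policy, can be conveyed in a short Python toy model. Everything below is a hypothetical stand-in assumed for illustration only: a truncated holding-time state s in {0, ..., N} as a proxy for the error covariance index, a Bernoulli packet-arrival channel with success probability p, an ACK bit flipped by the attacker with probability q, and arbitrary numeric parameters.

import numpy as np

rng = np.random.default_rng(0)

N = 10          # truncated holding-time state space {0, ..., N} (assumed)
p = 0.8         # packet arrival probability when the sensor transmits (assumed)
q = 0.3         # probability the attacker flips the ACK bit (assumed)
c = 1.0         # energy cost per transmission (assumed)
gamma = 0.9     # discount factor
tau = 3         # threshold of the base policy (assumed)

def stage_cost(s, a):
    # s proxies the estimation error covariance index; a in {0, 1}.
    return float(s) + c * a

def transition(s, a):
    # Sample the next state and the true ACK bit.
    if a == 1 and rng.random() < p:
        return 0, 1            # successful reception resets the error
    return min(s + 1, N), 0    # otherwise the error index grows

def observe(ack):
    # The ACK seen by the sensor is flipped with probability q by the attack.
    return 1 - ack if rng.random() < q else ack

def belief_update(b, a, o):
    # Bayes update of the belief over {0, ..., N} given action a, observed ACK o.
    b_next = np.zeros(N + 1)
    for s in range(N + 1):
        if b[s] == 0.0:
            continue
        if a == 1:
            like_succ = q if o == 0 else 1 - q   # P(o | true ACK = 1)
            like_fail = q if o == 1 else 1 - q   # P(o | true ACK = 0)
            b_next[0] += b[s] * p * like_succ
            b_next[min(s + 1, N)] += b[s] * (1 - p) * like_fail
        else:
            like = q if o == 1 else 1 - q        # no transmission: true ACK = 0
            b_next[min(s + 1, N)] += b[s] * like
    z = b_next.sum()
    return b_next / z if z > 0 else b

def base_policy(b):
    # Threshold-type heuristic mirroring the no-attack optimal structure:
    # transmit when the expected error index exceeds the threshold.
    return 1 if b @ np.arange(N + 1) >= tau else 0

def simulate_base(b, horizon=30, n_traj=20):
    # Monte Carlo estimate of the base policy's discounted cost from belief b.
    total = 0.0
    for _ in range(n_traj):
        bb = b.copy()
        s = rng.choice(N + 1, p=bb)
        disc = 1.0
        for _ in range(horizon):
            a = base_policy(bb)
            total += disc * stage_cost(s, a)
            s, ack = transition(s, a)
            bb = belief_update(bb, a, observe(ack))
            disc *= gamma
    return total / n_traj

def rollout_action(b, n_obs=10):
    # One-step lookahead with the base policy supplying the cost-to-go.
    best_a, best_q = 0, float("inf")
    costs = np.arange(N + 1, dtype=float)
    for a in (0, 1):
        exp_cost = float(b @ (costs + c * a))    # expected stage cost
        tail = 0.0
        for _ in range(n_obs):                   # average over sampled ACKs
            s = rng.choice(N + 1, p=b)
            _, ack = transition(s, a)
            tail += simulate_base(belief_update(b, a, observe(ack)))
        q_val = exp_cost + gamma * tail / n_obs
        if q_val < best_q:
            best_a, best_q = a, q_val
    return best_a

b0 = np.full(N + 1, 1.0 / (N + 1))   # uninformed initial belief
print("rollout action at uniform belief:", rollout_action(b0))

The base policy above mirrors the threshold structure the summary establishes for the no-attack case; rollout then performs one-step lookahead with Monte Carlo estimates of that policy's cost-to-go, which is the approximation-in-value-space idea the summary refers to.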