Adaptive duty cycling in sensor networks with energy harvesting using continuous-time Markov chain and fluid models

Bibliographic Details
Main Authors: Chan, Ronald Wai Hong; Zhang, Pengfei; Nevat, Ido; Nagarajan, Sai Ganesh; Valera, Alvin Cerdena; Tan, Hwee Xian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2015
Online Access: https://ink.library.smu.edu.sg/sis_research/3808
https://ink.library.smu.edu.sg/context/sis_research/article/4810/viewcontent/Adaptive_duty_cycling_in_sensor_networks_with_energy_harvesting_using_continuous_time_markov_chain_and_fluid_models.pdf
Institution: Singapore Management University
Description
Summary: The dynamic and unpredictable nature of energy harvesting sources available for wireless sensor networks, and the time variation in network statistics like packet transmission rates and link qualities, necessitate the use of adaptive duty cycling techniques. Such adaptive control allows sensor nodes to achieve long-run energy neutrality, where energy supply and demand are balanced in a dynamic environment such that the nodes function continuously. In this paper, we develop a new framework enabling an adaptive duty cycling scheme for sensor networks that takes into account the node battery level, ambient energy that can be harvested, and application-level QoS requirements. We model the system as a Markov decision process (MDP) that modifies its state transition policy using reinforcement learning. The MDP uses continuous-time Markov chains (CTMCs) to model the network state of a node to obtain key QoS metrics like latency, loss probability, and power consumption, as well as to model the node battery level taking into account physically feasible rates of change. We show that with an appropriate choice of the reward function for the MDP, as well as a suitable learning rate, exploitation probability, and discount factor, the need to maintain minimum QoS levels for optimal network performance can be balanced with the need to promote the maintenance of a finite battery level to ensure node operability. Extensive simulation results show the benefit of our algorithm for different reward functions and parameters.
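
The summary names a learning rate, an exploitation probability, and a discount factor but does not give the update rule. As a rough illustration only, the sketch below shows how those three parameters could enter a tabular Q-learning loop that picks a duty cycle from a discretized battery level. Everything in it is an assumption and not taken from the paper: the action set `DUTY_CYCLES`, the number of battery levels, the toy `step` dynamics (which stand in for the CTMC and fluid models), and the reward shape.

```python
import random

# All values below are illustrative assumptions, not the paper's settings.
DUTY_CYCLES = [0.2, 0.5, 0.8]   # hypothetical action set (fraction of time awake)
N_LEVELS = 11                   # battery discretized into levels 0..10


def step(level, action, rng):
    """Toy environment standing in for the paper's CTMC/fluid models.

    Returns (next_level, reward). Harvest and drain amounts are made up.
    """
    harvest = rng.choice([0, 1, 2])   # hypothetical harvested energy units
    drain = 1 + action                # higher duty cycle drains more units
    next_level = max(0, min(N_LEVELS - 1, level + harvest - drain))
    reward = DUTY_CYCLES[action]      # crude QoS proxy: more awake time is better
    if next_level == 0:
        reward -= 5.0                 # penalty for a depleted battery
    return next_level, reward


def choose_action(q, state, exploit_prob, rng):
    """Take the greedy action with probability exploit_prob, else explore."""
    if rng.random() < exploit_prob:
        return max(range(len(DUTY_CYCLES)), key=lambda a: q[state][a])
    return rng.randrange(len(DUTY_CYCLES))


def q_update(q, s, a, r, s_next, alpha, gamma):
    """Standard Q-learning update with learning rate alpha and discount gamma."""
    best_next = max(q[s_next])
    q[s][a] += alpha * (r + gamma * best_next - q[s][a])


def train(steps=20000, alpha=0.1, gamma=0.9, exploit_prob=0.9, seed=0):
    """Run the toy loop and return the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * len(DUTY_CYCLES) for _ in range(N_LEVELS)]
    state = N_LEVELS // 2
    for _ in range(steps):
        action = choose_action(q, state, exploit_prob, rng)
        next_state, reward = step(state, action, rng)
        q_update(q, state, action, reward, next_state, alpha, gamma)
        state = next_state
    return q
```

Calling `q = train()` and then taking `max(range(len(DUTY_CYCLES)), key=lambda a: q[s][a])` for each level `s` gives the greedy duty cycle per battery level under these toy dynamics. Changing the penalty in `step` is a simple way to see how the reward function shifts the balance between QoS and battery preservation.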