Contextual trust aided enhancement of data availability in peer-to-peer backup storage systems

Bibliographic Details
Main Authors: Liu, Xin, Datta, Anwitaman
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/100246
http://hdl.handle.net/10220/17749
Institution: Nanyang Technological University
Description
Summary: Peer-to-peer storage services are a cost-effective alternative for data backup. A basic question that arises in the design of such systems is: in which peers do we store redundant data? Choosing appropriate peers for data backup is important at a microscopic level, from an end user's perspective, to guarantee good performance (e.g., quick access and high availability), as well as at a macroscopic level (e.g., for system optimization and fairness). Existing systems apply different techniques, including random selection, selection based on a distributed hash table (DHT), and selection based on the peers' past availability patterns. In this paper, we propose an alternative: a contextual-trust-based data placement scheme to select suitable data holders. It was originally designed for, and is applicable to, scenarios where there is inadequate historical information about peers, a common situation in large-scale systems. Specifically, our scheme estimates the trustworthiness of a peer based on stereotypes, formed by aggregating information from interactions with other (similar) peers. Simulation experiments show that our placement scheme outperforms not only random selection but also schemes using historical information, in terms of both achieved data availability and the bandwidth overhead needed to sustain the system.
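
The summary describes stereotype-based trust only at a high level, so the following Python sketch is an illustration under stated assumptions and not the authors' algorithm. It assumes that a stereotype is an exact match on a tuple of observable peer features, that each past interaction is recorded as a success or a failure, and that trust is the mean of a Beta(1, 1) prior updated with those counts. The function names, the feature tuples, and the scoring rule are all hypothetical.

```python
from collections import defaultdict


def build_stereotypes(interactions):
    """Aggregate (features, success) observations into per-stereotype counts.

    Assumption: peers sharing the same feature tuple form one stereotype.
    """
    counts = defaultdict(lambda: [0, 0])  # features -> [successes, failures]
    for features, success in interactions:
        counts[features][0 if success else 1] += 1
    return counts


def estimate_trust(counts, features):
    """Trust estimate for a peer with no history of its own.

    Assumption: mean of a Beta(1, 1) prior updated with the stereotype's
    counts; a stereotype never seen before scores 0.5.
    """
    successes, failures = counts.get(features, (0, 0))
    return (successes + 1) / (successes + failures + 2)


def select_holders(counts, candidates, k):
    """Return the ids of the k candidates with the highest estimated trust."""
    ranked = sorted(
        candidates,
        key=lambda candidate: estimate_trust(counts, candidate[1]),
        reverse=True,
    )
    return [peer_id for peer_id, _ in ranked[:k]]


# Hypothetical interaction history with previously encountered peers.
history = [
    (("broadband", "desktop"), True),
    (("broadband", "desktop"), True),
    (("broadband", "desktop"), True),
    (("broadband", "desktop"), False),
    (("mobile", "laptop"), False),
    (("mobile", "laptop"), False),
]
stereotypes = build_stereotypes(history)

# New peers with no individual history, described only by their features.
newcomers = [
    ("p1", ("mobile", "laptop")),
    ("p2", ("broadband", "desktop")),
    ("p3", ("satellite", "server")),
]
holders = select_holders(stereotypes, newcomers, k=2)
```

In this sketch a newcomer inherits the track record of similar peers seen earlier, which is the property the summary highlights for settings with inadequate historical information. A practical system would likely need a softer notion of similarity than exact feature matching.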