Traffic-optimized data placement for social media

Bibliographic Details
Main Authors: Tang, Jing; Tang, Xueyan; Yuan, Junsong
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/140032
Institution: Nanyang Technological University
Description
Summary: Social media users are generating data on an unprecedented scale. Distributed storage systems are often used to cope with explosive data growth. Data partitioning and replication are two interrelated data placement issues affecting the interserver traffic caused by user-initiated read and write operations in distributed storage systems. This paper investigates how to minimize the interserver traffic among a cluster of social media servers through joint data partitioning and replication optimization. We formally define the problem and study its hardness. We then propose a traffic-optimized partitioning and replication (TOPR) method to continuously adapt data placement according to various dynamics. Evaluations with real Twitter and LiveJournal social graphs show that TOPR not only reduces the interserver traffic significantly but also substantially lowers the storage cost of replication compared to state-of-the-art methods. We also benchmark TOPR against the offline optimum computed by a binary linear program.