GTG-shapley: efficient and accurate participant contribution evaluation in federated learning

Federated Learning (FL) bridges the gap between collaborative machine learning and preserving data privacy. To sustain the long-term operation of an FL ecosystem, it is important to attract high-quality data owners with appropriate incentive schemes. As an important building block of such incentive...

Full description

Saved in:
Bibliographic Details
Main Authors: Liu, Zelei, Chen, Yuanyuan, Yu, Han, Liu, Yang, Cui, Lizhen
Other Authors: College of Computing and Data Science
Format: Article
Language:English
Published: 2024
Subjects:
Online Access:https://hdl.handle.net/10356/179060
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
Description
Summary:Federated Learning (FL) bridges the gap between collaborative machine learning and preserving data privacy. To sustain the long-term operation of an FL ecosystem, it is important to attract high-quality data owners with appropriate incentive schemes. As an important building block of such incentive schemes, it is essential to fairly evaluate participants’ contribution to the performance of the final FL model without exposing their private data. Shapley Value (SV)–based techniques have been widely adopted to provide a fair evaluation of FL participant contributions. However, existing approaches incur significant computation costs, making them difficult to apply in practice. In this article, we propose the Guided Truncation Gradient Shapley (GTG-Shapley) approach to address this challenge. It reconstructs FL models from gradient updates for SV calculation instead of repeatedly training with different combinations of FL participants. In addition, we design a guided Monte Carlo sampling approach combined with within-round and between-round truncation to further reduce the number of model reconstructions and evaluations required. This is accomplished through extensive experiments under diverse realistic data distribution settings. The results demonstrate that GTG-Shapley can closely approximate actual Shapley values while significantly increasing computational efficiency compared with the state-of-the-art, especially under non-i.i.d. settings.