BADFL: Backdoor attack defense in federated learning from local model perspective
Main Authors:
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/9535
https://ink.library.smu.edu.sg/context/sis_research/article/10535/viewcontent/BADFL_av.pdf
Institution: Singapore Management University
Summary: Federated learning has attracted substantial attention for its ability to train a powerful global model collaboratively while protecting data privacy. Despite its many advantages, federated learning is vulnerable to backdoor attacks, in which an adversary injects malicious weights into the global model so that the model's predictions on targeted inputs become incorrect. Existing defenses that identify and eliminate malicious weights overlook two issues: during malicious model detection, they ignore how the similarity of local weights varies across iterations, and during malicious local weight elimination, they ignore the benign weights still present in the malicious model. Both shortcomings weaken the defense and degrade global model accuracy. In this paper, we defend against backdoor attacks from the perspective of local models. First, we propose a malicious model detection method based on interpretability techniques, which appends a sampling check after clustering to identify malicious models accurately. We further design a malicious local weight elimination method based on local weight contributions, which preserves the benign weights in a malicious model so that they continue to contribute to the global model. Finally, we analyze the security of the proposed method in terms of model closeness and verify its effectiveness through experiments. Compared with existing defenses, the results show that BADFL improves global model accuracy by 23.14% while reducing the attack success rate to 0.04% in the best case.
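The abstract only outlines the two components (interpretability-guided detection with clustering plus a sampling check, and contribution-aware elimination of malicious local weights). The sketch below is a simplified illustration of that general pipeline, not the paper's actual BADFL algorithm: the use of k-means on flattened updates, the minority-cluster heuristic, the sample_check hook, and the distance-to-benign-mean masking rule are all assumptions made for the example.

```python
# Illustrative two-stage defense sketch (NOT the paper's BADFL algorithm):
# 1) cluster flattened local updates and flag the minority cluster,
#    optionally re-checking flagged clients with a sampling check;
# 2) aggregate, keeping only the benign-looking coordinates of flagged updates.
import numpy as np
from sklearn.cluster import KMeans

def flatten(update):
    """Flatten one client's update (a dict of layer name -> ndarray) into a vector."""
    return np.concatenate([w.ravel() for w in update.values()])

def detect_suspects(updates, sample_check=None):
    """Cluster clients into two groups and treat the smaller group as suspicious.
    sample_check(update) -> bool can clear clients that pass a clean-data check."""
    vectors = np.stack([flatten(u) for u in updates])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
    suspect_label = int(np.argmin(np.bincount(labels)))
    suspects = {i for i, lab in enumerate(labels) if lab == suspect_label}
    if sample_check is not None:
        suspects = {i for i in suspects if not sample_check(updates[i])}
    return suspects

def aggregate(updates, suspects, tol=0.01):
    """Average updates; for suspected clients, keep only coordinates that stay
    within tol of the benign mean and replace the rest with that mean."""
    benign = [u for i, u in enumerate(updates) if i not in suspects]
    aggregated = {}
    for layer in updates[0]:
        benign_mean = np.mean([u[layer] for u in benign], axis=0)
        contributions = [u[layer] for u in benign]
        for i in suspects:
            w = updates[i][layer]
            mask = np.abs(w - benign_mean) < tol  # "benign-looking" coordinates
            contributions.append(np.where(mask, w, benign_mean))
        aggregated[layer] = np.mean(contributions, axis=0)
    return aggregated

# Toy usage: ten honest clients plus one client with a shifted "backdoor" layer.
rng = np.random.default_rng(0)
updates = [{"fc": rng.normal(0.0, 0.01, size=(4, 4))} for _ in range(10)]
updates.append({"fc": rng.normal(0.5, 0.01, size=(4, 4))})
suspects = detect_suspects(updates)
global_update = aggregate(updates, suspects)
print("suspected clients:", suspects)
```

The masking step reflects the abstract's stated goal of retaining benign weights inside a flagged model rather than discarding the whole update; the specific thresholding rule here is only a placeholder for the paper's contribution-based elimination.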