Privacy-preserving graph-based machine learning with fully homomorphic encryption for collaborative anti-money laundering

Bibliographic Details
Main Author: Effendi, Fabrianne
Other Authors: Anupam Chattopadhyay
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175347
Institution: Nanyang Technological University
Description
Summary: As financial transactions become increasingly digital and cybercrime rises, combating money laundering has grown more complex. Graph-based machine learning techniques have emerged as promising tools for Anti-Money Laundering (AML) detection because they can capture the intricate relationships within money laundering networks. However, the effectiveness of AML solutions is hindered by data silos within financial institutions, which limit collaboration and reduce overall efficacy. To address these challenges, this research presents a novel privacy-preserving approach to collaborative AML machine learning that facilitates secure data sharing across institutions and borders while preserving data privacy and regulatory compliance. With Fully Homomorphic Encryption (FHE), computations can be performed on encrypted data without decryption, so sensitive financial data remains confidential. Notably, this research explores the integration of Fully Homomorphic Encryption over the Torus (TFHE) with graph-based machine learning techniques, marking a pioneering effort in this field. The research contributes to the development of an extensible Graph Neural Network (GNN) pipeline that integrates TFHE using Concrete ML. Although progress has been made in implementing techniques such as quantization and pruning to render the GNN FHE-compatible, challenges persist in compiling the pipeline owing to the complexity of integrating GNNs with Concrete ML. Nonetheless, the insights gained from this development process lay the groundwork for future research in this area. In parallel, a privacy-preserving graph-based gradient boosting pipeline was successfully developed, using the Graph Feature Preprocessor (GFP) to enhance XGBoost model performance on AML datasets. A series of experiments evaluated the trade-offs between model performance and privacy, highlighting the pipeline's potential to balance the two.
This work lays the foundation for innovative approaches to safeguarding financial systems against illicit activities, paving the way for future work on privacy-preserving machine learning for AML detection.