Federated learning study

With the rise of data-driven applications and services, concerns about data privacy, especially for sensitive information such as personal opinions and sentiments in textual data, have become increasingly prevalent. Traditional machine learning methods often require centralising data from many sources for model training, which poses significant privacy risks because raw data must be shared or pooled into a single repository. Federated Learning (FL) is a promising solution to this challenge: it enables multiple parties to collaboratively train a shared machine learning model without exchanging raw data, preserving privacy while harnessing the collective intelligence of diverse datasets. This decentralised approach not only enhances privacy but also improves scalability and robustness by distributing the computation and storage burden.

This project examines Federated Learning as a means of addressing these privacy concerns, demonstrating its efficacy on three datasets sourced from Kaggle: Amazon reviews, IMDB reviews, and Spotify reviews. First, the three datasets are aggregated into a unified corpus on which a text sentiment classification model is trained and evaluated as a centralised baseline. The datasets are then distributed across separate clients and the model is trained with a Federated Learning approach; several FL algorithms are evaluated for how well they preserve privacy while maintaining model performance. Comparing models trained on centralised versus decentralised data yields insights into the potential of Federated Learning to preserve privacy and achieve robust performance across heterogeneous datasets.
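The record does not name the FL algorithms that were evaluated, but federated averaging (FedAvg) is the canonical baseline for this kind of setup, so a minimal sketch may help make the three-client arrangement concrete. Everything below is an illustrative assumption rather than the project's code: the PyTorch training setup and the local_train and fed_avg helpers are hypothetical, with one data loader per client (Amazon, IMDB, and Spotify reviews).

    import copy
    import torch

    def local_train(model, loader, epochs=1, lr=1e-3):
        # Each client trains a private copy; its raw reviews never leave the client.
        model = copy.deepcopy(model)
        opt = torch.optim.SGD(model.parameters(), lr=lr)
        loss_fn = torch.nn.CrossEntropyLoss()
        model.train()
        for _ in range(epochs):
            for x, y in loader:  # batches of tokenised review texts and labels
                opt.zero_grad()
                loss_fn(model(x), y).backward()
                opt.step()
        return model.state_dict(), len(loader.dataset)

    def fed_avg(global_model, client_loaders, rounds=10):
        # Each round: every client trains locally, then the server averages
        # the returned weights, weighted by each client's share of the data.
        for _ in range(rounds):
            updates = [local_train(global_model, dl) for dl in client_loaders]
            total = sum(n for _, n in updates)
            avg = {k: sum(sd[k].float() * (n / total) for sd, n in updates)
                   for k in updates[0][0]}
            global_model.load_state_dict(avg)
        return global_model

Only model weights cross the network in this sketch; the centralised baseline described above corresponds to training the same model directly on the concatenated datasets.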

Bibliographic Details
Main Author: Tan, Jun Wei
Other Authors: Jun Zhao (School of Computer Science and Engineering, junzhao@ntu.edu.sg)
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University, 2024
Degree: Bachelor's degree
Subjects: Computer and Information Science
Online Access: https://hdl.handle.net/10356/175325
Citation: Tan, J. W. (2024). Federated learning study. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175325
Institution: Nanyang Technological University