Federated learning for natural language processing in medical domain

Bibliographic Details
Main Author: Saraf, Ishita
Other Authors: Anupam Chattopadhyay
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175336
Institution: Nanyang Technological University
Description
Summary: Recent years have witnessed an unprecedented surge in interest and innovation in the field of Natural Language Processing (NLP), largely due to ground-breaking developments such as the creation of ChatGPT and other Large Language Model (LLM) applications. Data-driven deep learning algorithms have become the norm in NLP. One of the biggest challenges faced by such data-dependent NLP applications is data scarcity and privacy. Federated Learning (FL) is a compelling way to address both the confidentiality of training data and the constraints imposed by inadequate data in specific domains. The main objective of this study is to build a medical assistance chatbot using federated learning to overcome the limitations posed by the private nature of data in the medical domain. The study compares the performance of centralized training and federated learning for fine-tuning a BERT-based conversational agent on a medical conversation dataset, serving as a stepping stone for future research into federated learning for LLM-led NLP applications.