Knowledge and data integration for deep learning under small data

Bibliographic Details
Main Author: Teo, Hazel Kai Xin
Other Authors: Mao Kezhi
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/176309
Description
Summary: This research addresses the problem of limited training data in deep learning, where data volume, quality, and diversity significantly influence model performance. The availability of diverse and abundant data is crucial for training effective models. However, in many real-world scenarios, obtaining such varied data can be challenging, potentially leading to biased models that disproportionately affect minority classes. Recent literature emphasizes data augmentation as a promising way to mitigate data scarcity and improve model accuracy without exhaustive labeling effort. This study explores the potential of data augmentation, particularly text augmentation, in reducing the dependency on extensive training data. The aim is to enhance the effectiveness and accuracy of deep learning models, especially in natural language processing (NLP). We investigate synonym replacement as a primary text augmentation technique, assessing its ability to generate supplementary data and improve model performance.
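The record itself does not include code; as a rough illustration of the synonym-replacement idea described in the summary, the sketch below generates augmented variants of a sentence using NLTK's WordNet. The function name, parameters, and word-selection strategy are assumptions for illustration, not the author's implementation.

```python
import random
from nltk.corpus import wordnet


def synonym_replacement(sentence, n=2, seed=None):
    """Replace up to n words in the sentence with a randomly chosen WordNet synonym."""
    rng = random.Random(seed)
    words = sentence.split()
    indices = list(range(len(words)))
    rng.shuffle(indices)  # pick replacement positions in random order
    replaced = 0
    for idx in indices:
        word = words[idx]
        # Collect candidate synonyms that differ from the original word.
        synonyms = {
            lemma.name().replace("_", " ")
            for syn in wordnet.synsets(word)
            for lemma in syn.lemmas()
            if lemma.name().lower() != word.lower()
        }
        if synonyms:
            words[idx] = rng.choice(sorted(synonyms))
            replaced += 1
        if replaced >= n:
            break
    return " ".join(words)


if __name__ == "__main__":
    import nltk
    nltk.download("wordnet", quiet=True)
    # Produce a few augmented copies of one labelled training sentence.
    text = "the small dataset limits model accuracy"
    for i in range(3):
        print(synonym_replacement(text, n=2, seed=i))
```

Each augmented sentence keeps the original label, so a small labelled set can be expanded without additional annotation effort, which is the motivation the summary describes.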