Towards robust natural language and image processing in low-resource scenarios

Bibliographic Details
Main Author: Liu, Linlin
Other Authors: He Ying; Joty Shafiq Rayhan; Interdisciplinary Graduate School (IGS); Alibaba-NTU Singapore Joint Research Institute
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2023
Subjects: Engineering::Computer science and engineering; Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision; Humanities::Linguistics::Sociolinguistics::Computational linguistics
Online Access: https://hdl.handle.net/10356/166293
DOI: 10.32657/10356/166293
Citation: Liu, L. (2023). Towards robust natural language and image processing in low-resource scenarios. Doctoral thesis, Nanyang Technological University, Singapore.
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Institution: Nanyang Technological University
Description

Deep learning has achieved state-of-the-art performance on a wide range of tasks in natural language processing (NLP), computer vision (CV), speech, and beyond. Compared with traditional statistical machine learning methods, it eliminates the reliance on tedious feature engineering and leverages neural models for automatic feature extraction. Neural models are data-hungry, however: they usually require a large amount of data to achieve the desired performance, and they are prone to overfitting when training data is limited. Since annotating large amounts of training data is often expensive and time-consuming, improving the robustness of neural models under limited data has become an increasingly important area of deep learning. In this thesis, we present our research on robust natural language and image processing methods in low-resource scenarios.

Techniques for preventing overfitting have attracted considerable attention from the research community. Existing methods can be roughly grouped into five main categories: data augmentation, model parameter regularization, hidden representation regularization, label smoothing, and hybrid methods. In addition, semi-supervised learning and transfer learning have proven successful at improving model performance when training data is limited: semi-supervised learning leverages unlabeled data to improve target task performance, while transfer learning makes better use of relevant labeled data from other domains, languages, or tasks. Most of the methods studied in our research projects fall into these categories as well.

We make several contributions to improving the robustness of deep learning in low-resource scenarios. Firstly, we propose generation-based data augmentation methods that add diversity to the training data for NLP sequence tagging tasks, including named entity recognition, part-of-speech tagging, and end-to-end target-based sentiment analysis. Secondly, we analyse the properties of adapters and find that adapter-based tuning can be viewed as a parameter regularization method, since it freezes the parameters of the pretrained model. Through extensive experiments and analysis, we observe its robustness to overfitting, its encouraging performance in low-resource and cross-lingual settings, and its stability across hyper-parameter choices. Thirdly, we propose a novel method that leverages stochastic autoencoders to augment the hidden representations of neural models. Experimental results demonstrate its effectiveness on both sequence- and token-level NLP tasks, and its task-agnostic architecture makes it potentially useful for CV tasks as well. Fourthly, we explore methods to enhance multilingual language models. In the first project, we design a novel framework for multi-sense cross-lingual contextual embedding alignment; in the second, we pretrain multilingual language models on massive multilingual knowledge graph triples to learn factual knowledge and enhance logical reasoning ability. These projects are part of the effort to break the language barrier and enable a broader population to benefit from advances in artificial intelligence. Fifthly, we explore methods to improve the robustness of GANs for portrait editing, proposing multiple data augmentation methods that make models robust to noise in hand-drawn inputs.
Exploiting the generator-discriminator architecture of GANs, we also design a novel asymmetric conditional GAN that helps the discriminator reduce the negative impact of noisy inputs. Finally, we devise a novel hierarchical vectorization method for portrait image editing, combining neural models with graphics-based methods to reduce the reliance on training data. In addition to the research projects above, we identify several promising future research directions, which are discussed at the end of this thesis.
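
To make some of the techniques above concrete, the following minimal Python sketches illustrate, under stated assumptions, how they might be implemented; none of them is the thesis's actual code. This first sketch shows one common generation-based augmentation recipe for sequence tagging: linearize each (token, tag) pair into a plain string, train any autoregressive language model on such strings, then sample from it and de-linearize the output into new, automatically labeled sentences. The tag names and helper functions are illustrative.

    def linearize(tokens, tags):
        """Interleave non-O tags before their tokens:
        'B-PER John lives in B-LOC Paris'."""
        out = []
        for tok, tag in zip(tokens, tags):
            if tag != "O":
                out.append(tag)
            out.append(tok)
        return " ".join(out)

    def delinearize(text, tagset):
        """Invert linearize(): a word preceded by a tag token receives that tag."""
        tokens, tags, pending = [], [], "O"
        for piece in text.split():
            if piece in tagset:
                pending = piece
            else:
                tokens.append(piece)
                tags.append(pending)
                pending = "O"
        return tokens, tags

    tagset = {"B-PER", "I-PER", "B-LOC", "I-LOC"}
    s = linearize(["John", "lives", "in", "Paris"], ["B-PER", "O", "O", "B-LOC"])
    assert delinearize(s, tagset) == (["John", "lives", "in", "Paris"],
                                      ["B-PER", "O", "O", "B-LOC"])
    # A language model trained on many such linearized strings can be sampled
    # to produce novel sentences that arrive with their own tags, adding
    # diversity to a low-resource training set.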
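
The second sketch illustrates adapter-based tuning as the abstract characterizes it: every pretrained parameter is frozen (the implicit parameter regularization), and only small bottleneck adapters plus the task head are trained. The toy backbone of linear layers is an assumed stand-in for a pretrained transformer.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter: down-project, nonlinearity, up-project, residual."""
        def __init__(self, hidden, bottleneck=64):
            super().__init__()
            self.down = nn.Linear(hidden, bottleneck)
            self.up = nn.Linear(bottleneck, hidden)
            self.act = nn.GELU()

        def forward(self, h):
            # The residual keeps outputs close to the frozen pretrained
            # representation, one source of the stability the abstract reports.
            return h + self.up(self.act(self.down(h)))

    class AdapterModel(nn.Module):
        def __init__(self, backbone, hidden, num_labels):
            super().__init__()
            self.backbone = backbone  # pretrained layers, frozen below
            self.adapters = nn.ModuleList(Adapter(hidden) for _ in backbone)
            self.head = nn.Linear(hidden, num_labels)
            for p in self.backbone.parameters():
                p.requires_grad = False  # freezing acts as regularization

        def forward(self, x):
            for layer, adapter in zip(self.backbone, self.adapters):
                x = adapter(layer(x))
            return self.head(x)

    # Toy usage: a stack of linear layers stands in for a pretrained encoder.
    hidden, num_labels = 128, 5
    backbone = nn.ModuleList(nn.Linear(hidden, hidden) for _ in range(4))
    model = AdapterModel(backbone, hidden, num_labels)
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=1e-4)  # adapters + head only
    logits = model(torch.randn(2, hidden))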
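
The third sketch shows hidden representation augmentation with a stochastic, VAE-style bottleneck: during training, hidden states are perturbed with learned, sampled noise so that downstream layers see augmented representations, while a KL term keeps the noise distribution close to a standard normal prior. This is a generic stand-in for the stochastic autoencoder approach, not the thesis's exact architecture.

    import torch
    import torch.nn as nn

    class StochasticBottleneck(nn.Module):
        """Perturb hidden states with reparameterized Gaussian noise."""
        def __init__(self, hidden):
            super().__init__()
            self.mu = nn.Linear(hidden, hidden)
            self.logvar = nn.Linear(hidden, hidden)

        def forward(self, h):
            mu, logvar = self.mu(h), self.logvar(h)
            if self.training:
                # Sample an augmented view: mu + sigma * eps.
                h_aug = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            else:
                h_aug = mu  # deterministic at inference time
            # KL divergence to a standard normal prior regularizes the noise.
            kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
            return h_aug, kl

    bottleneck = StochasticBottleneck(hidden=128)
    h = torch.randn(2, 10, 128)    # (batch, seq_len, hidden) encoder output
    h_aug, kl = bottleneck(h)      # add a scaled `kl` term to the task loss

Because the module operates on individual hidden vectors, the same component applies to sequence-level, token-level, and in principle vision features, which is consistent with the task-agnostic claim in the abstract.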
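
The last sketch illustrates input-noise augmentation for a sketch-conditioned portrait GAN: stroke maps are randomly thinned, shifted, and thickened during training so the generator learns to tolerate imprecise hand-drawn inputs. The specific transforms are assumptions for illustration; the asymmetric conditional GAN described in the abstract additionally changes how the discriminator is conditioned, which is not shown here.

    import torch
    import torch.nn.functional as F

    def augment_sketch(sketch, drop_p=0.1, jitter=2):
        """sketch: (B, 1, H, W) binary stroke map with values in {0, 1}."""
        # Randomly drop stroke pixels to mimic broken or faint lines.
        keep = (torch.rand_like(sketch) > drop_p).float()
        out = sketch * keep
        # Randomly translate the whole sketch to mimic imprecise placement.
        dx = int(torch.randint(-jitter, jitter + 1, (1,)))
        dy = int(torch.randint(-jitter, jitter + 1, (1,)))
        out = torch.roll(out, shifts=(dy, dx), dims=(2, 3))
        # Occasionally dilate strokes to mimic variable line thickness.
        if torch.rand(()) < 0.5:
            out = F.max_pool2d(out, kernel_size=3, stride=1, padding=1)
        return out

    # Schematic use inside a conditional-GAN training step:
    #   fake = generator(z, augment_sketch(real_sketch))
    # so the generator never sees perfectly clean conditions during training.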