Towards robust natural language and image processing in low-resource scenarios

Bibliographic Details
Main Author: Liu, Linlin
Other Authors: He Ying
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access: https://hdl.handle.net/10356/166293
Institution: Nanyang Technological University
Description
Summary: Deep learning has achieved state-of-the-art performance on a wide range of tasks, including natural language processing (NLP), computer vision (CV), and speech processing. Compared with traditional statistical machine learning methods, it eliminates the reliance on tedious feature engineering and leverages neural models for automatic feature extraction. However, neural models are data-hungry and usually require a large amount of data to achieve the desired performance; when training data is limited, they are prone to overfitting. Since annotating a large amount of training data is often expensive and time-consuming, improving the robustness of neural models with limited data has become an increasingly important area of deep learning research. In this thesis, we present our research on robust natural language and image processing methods in low-resource scenarios.

Techniques to prevent overfitting have attracted considerable attention from the research community. Existing methods can be roughly grouped into five main categories: data augmentation, model parameter regularization, hidden representation regularization, label smoothing, and hybrid methods. In addition, semi-supervised learning and transfer learning have also proven successful at improving model performance when training data is limited: semi-supervised learning leverages unlabeled data to improve performance on the target task, while transfer learning makes better use of relevant labeled data from other domains, languages, or tasks for knowledge transfer. Most of the methods studied in our research projects also fall into these categories.

We make several contributions to improving the robustness of deep learning in low-resource scenarios. Firstly, we propose generation-based data augmentation methods that add more diversity to the training data for NLP sequence tagging tasks, including named entity recognition, part-of-speech tagging, and end-to-end target-based sentiment analysis. Secondly, we analyse the properties of adapters and find that adapter-based tuning can be viewed as a parameter regularization method, since it freezes the parameters of the pretrained model. Through extensive experiments and analysis, we observe its robustness to overfitting, encouraging performance in low-resource and cross-lingual settings, and stability across hyperparameter choices. Thirdly, we propose a novel method that leverages stochastic autoencoders to augment the hidden representations of neural models. Experimental results demonstrate its effectiveness on both sequence- and token-level NLP tasks, and its task-agnostic architecture makes it potentially useful for CV tasks as well. Fourthly, we explore methods to enhance multilingual language models: in the first project, we design a novel framework for multi-sense cross-lingual contextual embedding alignment; in the second, we pretrain a multilingual language model with massive multilingual knowledge graph triples to learn factual knowledge and enhance logical reasoning ability. These projects are part of the effort to break the language barrier and enable a broader population to benefit from advances in artificial intelligence. Fifthly, we explore methods to improve the robustness of GANs for portrait editing. We propose multiple data augmentation methods to improve model robustness to noise in hand-drawn inputs.
Building on the generator-discriminator architecture, we also design a novel asymmetric conditional GAN that helps the discriminator reduce the negative impact of noisy inputs. Finally, we devise a novel hierarchical vectorization method for portrait image editing, combining neural models with graphics-based methods to reduce the reliance on training data. In addition to the research projects above, we also identify several promising future research directions, which are discussed at the end of this thesis.
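
To illustrate the adapter-based tuning idea mentioned in the summary, the following minimal PyTorch-style sketch (not code from the thesis) shows how small bottleneck adapters can be trained while the pretrained backbone stays frozen, which is what gives the parameter-regularization effect described above. The module names, layer interface, and bottleneck size are assumptions made for illustration only.

```python
# Minimal sketch (not from the thesis): adapter-based tuning as parameter
# regularization -- the pretrained backbone is frozen and only small
# bottleneck adapters receive gradient updates. Names/sizes are assumptions.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual."""
    def __init__(self, hidden_size: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.ReLU()

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return h + self.up(self.act(self.down(h)))  # residual connection

class AdapterTunedEncoder(nn.Module):
    """Wraps a stack of pretrained layers; only the adapters are trainable."""
    def __init__(self, pretrained_layers: nn.ModuleList, hidden_size: int):
        super().__init__()
        self.layers = pretrained_layers
        for p in self.layers.parameters():
            p.requires_grad = False          # freeze pretrained parameters
        self.adapters = nn.ModuleList(
            [Adapter(hidden_size) for _ in self.layers]
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        for layer, adapter in zip(self.layers, self.adapters):
            h = adapter(layer(h))            # adapter after each frozen layer
        return h
```

In training, only the parameters that still require gradients (the adapters and any task head) would be passed to the optimizer, e.g. torch.optim.AdamW(p for p in model.parameters() if p.requires_grad), so the tuned model stays close to the pretrained solution.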