Feature-based transfer learning in natural language processing

Bibliographic Details
Main Author: YU, Jianfei
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Online Access:https://ink.library.smu.edu.sg/etd_coll/159
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1159&context=etd_coll
Institution: Singapore Management University
Description
Summary: In the past few decades, supervised machine learning has been one of the most important methodologies in the Natural Language Processing (NLP) community. Although various supervised learning methods have been proposed to achieve state-of-the-art performance across most NLP tasks, their bottleneck lies in a heavy reliance on large amounts of manually annotated data, which is not always available in the desired target domain/task. To alleviate this data sparsity issue, an attractive solution is to find sufficient labeled data in a related source domain/task. However, for most NLP applications, the distributions of the two domains/tasks differ, so directly training a supervised model on labeled source data alone usually results in poor performance in the target domain/task. It is therefore necessary to develop effective transfer learning techniques that leverage the rich annotations in the source domain/task to improve model performance in the target domain/task.

There are generally two settings of transfer learning. We use supervised transfer learning to refer to the setting in which a small amount of labeled target data is available during training; when no such data is available, we call it unsupervised transfer learning. In this thesis, we propose novel transfer learning methods for different NLP tasks in both settings, with the goal of inducing an invariant latent feature space across domains or tasks, in which the knowledge gained from the source domain/task can be easily adapted to the target domain/task.

In the unsupervised transfer learning setting, we first propose a simple yet effective domain adaptation method that derives shared representations from instance similarity features. The method applies generally to different NLP tasks, and empirical evaluation on several of them shows that it performs comparably to, and sometimes better than, a widely used domain adaptation method. Furthermore, we target a specific NLP task, sentiment classification, and propose a neural domain adaptation framework that jointly learns the actual sentiment classification task and several manually designed, domain-independent auxiliary tasks to produce shared representations across domains. Extensive experiments on both sentence-level and document-level sentiment classification demonstrate that the proposed framework achieves promising results.

In the supervised transfer learning setting, we first propose a neural domain adaptation approach for retrieval-based question answering systems that simultaneously learns shared feature representations and models inter-domain and intra-domain relationships in a unified model, and we conduct both intrinsic and extrinsic evaluations to demonstrate its efficiency and effectiveness. Moreover, we improve multi-label emotion classification with the help of sentiment classification by proposing a dual attention transfer network, in which a shared feature space captures general sentiment words and a task-specific space captures emotion-specific words. Experimental results show that our method outperforms several highly competitive transfer learning methods.
Although the transfer learning methods proposed in this thesis were originally designed for natural language processing tasks, most of them can potentially be applied to classification tasks in other research communities, such as computer vision and speech processing. Minimal code sketches of the four methods summarized above follow.
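The first unsupervised method derives shared representations from instance similarity features. Below is a minimal sketch of that idea; everything in it is an illustrative assumption rather than the thesis recipe: exemplars are taken as k-means centroids of the source data, the similarity measure is cosine, and the function name similarity_features and its parameters are hypothetical.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity

def similarity_features(X_source, X, n_exemplars=10, seed=0):
    """Append similarities to source exemplars as extra features.

    The extra columns form a shared representation: instances from either
    domain are described by how close they sit to the same exemplars.
    """
    exemplars = KMeans(n_clusters=n_exemplars, n_init=10,
                       random_state=seed).fit(X_source).cluster_centers_
    sims = cosine_similarity(X, exemplars)       # (n, n_exemplars)
    return np.hstack([X, sims])

# Usage: train any off-the-shelf classifier on the augmented source data,
# then apply it directly to the augmented target data.
rng = np.random.default_rng(0)
X_src = rng.normal(size=(100, 50))               # labeled source features
X_tgt = rng.normal(size=(20, 50))                # unlabeled target features
X_src_aug = similarity_features(X_src, X_src)
X_tgt_aug = similarity_features(X_src, X_tgt)

Because the similarity columns are comparable across domains, a classifier trained on the source side can transfer without any target labels, which is what makes the method broadly applicable across NLP tasks.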
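The joint learning of sentiment classification with domain-independent auxiliary tasks can be pictured as a multi-task network. The PyTorch sketch below is assumption-laden: the bag-of-words encoder, layer sizes, loss weight, and single auxiliary head are illustrative stand-ins; the thesis designs several auxiliary tasks whose labels can be produced without manual annotation, which is what lets unlabeled target data shape the shared layers.

import torch
import torch.nn as nn

class JointModel(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=100, hidden=64,
                 n_classes=2, n_aux=2):
        super().__init__()
        # Bag-of-words encoder and shared layer used by both tasks.
        self.embed = nn.EmbeddingBag(vocab_size, emb_dim)
        self.shared = nn.Sequential(nn.Linear(emb_dim, hidden), nn.ReLU())
        self.sentiment_head = nn.Linear(hidden, n_classes)  # labeled source only
        self.aux_head = nn.Linear(hidden, n_aux)            # both domains

    def forward(self, token_ids, offsets):
        h = self.shared(self.embed(token_ids, offsets))
        return self.sentiment_head(h), self.aux_head(h)

model = JointModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

# One joint step: sentiment loss on a labeled source batch plus auxiliary
# loss on an unlabeled target batch; the shared layers receive gradients
# from both, pushing them toward a domain-invariant space.
src_ids = torch.randint(0, 5000, (30,))
tgt_ids = torch.randint(0, 5000, (30,))
offsets = torch.tensor([0, 10, 20])          # three 10-token documents
y_sent = torch.tensor([0, 1, 1])             # source sentiment labels
y_aux = torch.tensor([0, 1, 0])              # auxiliary labels need no annotator
sent_logits, _ = model(src_ids, offsets)
_, aux_logits = model(tgt_ids, offsets)
loss = ce(sent_logits, y_sent) + 0.5 * ce(aux_logits, y_aux)
opt.zero_grad()
loss.backward()
opt.step()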
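For the retrieval-based question answering method, one way to picture "shared feature representations plus inter- and intra-domain relationships" is a shared matching space combined with per-domain matching spaces. The sketch below is a heavy simplification under that assumption, not the thesis architecture; the class name SharedPrivateMatcher and the additive scoring are hypothetical.

import torch
import torch.nn as nn

class SharedPrivateMatcher(nn.Module):
    def __init__(self, dim=64, n_domains=2):
        super().__init__()
        self.shared = nn.Linear(dim, dim)     # one space all domains share
        self.private = nn.ModuleList(
            [nn.Linear(dim, dim) for _ in range(n_domains)])  # one per domain

    def score(self, q, a, domain):
        # Inter-domain signal: match question and answer in the shared space.
        shared_sim = torch.cosine_similarity(
            self.shared(q), self.shared(a), dim=-1)
        # Intra-domain signal: match them in the domain's own space.
        proj = self.private[domain]
        private_sim = torch.cosine_similarity(proj(q), proj(a), dim=-1)
        return shared_sim + private_sim

model = SharedPrivateMatcher()
q = torch.randn(4, 64)                        # encoded questions
a = torch.randn(4, 64)                        # encoded candidate answers
print(model.score(q, a, domain=1))            # relevance score per pair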
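Finally, the dual attention transfer network can be sketched as two attention-pooling layers over the same word states: a shared one updated by both the sentiment and the emotion supervision, and a task-specific one updated only by the emotion task. The additive attention form, dimensions, and head layout below are assumptions for illustration.

import torch
import torch.nn as nn

class Attention(nn.Module):
    """Additive attention pooling over a sequence of word states."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(),
                                   nn.Linear(dim, 1))

    def forward(self, states):                # states: (batch, seq, dim)
        weights = torch.softmax(self.score(states), dim=1)
        return (weights * states).sum(dim=1)  # (batch, dim)

class DualAttentionTransfer(nn.Module):
    def __init__(self, dim=64, n_emotions=8, n_sentiments=2):
        super().__init__()
        self.shared_attn = Attention(dim)     # attends to general sentiment words
        self.emotion_attn = Attention(dim)    # attends to emotion-specific words
        self.emotion_head = nn.Linear(2 * dim, n_emotions)
        self.sentiment_head = nn.Linear(dim, n_sentiments)

    def forward(self, states):
        shared = self.shared_attn(states)     # trained by both tasks
        specific = self.emotion_attn(states)  # trained by the emotion task only
        emotion_logits = self.emotion_head(
            torch.cat([shared, specific], dim=-1))
        sentiment_logits = self.sentiment_head(shared)
        return emotion_logits, sentiment_logits

model = DualAttentionTransfer()
states = torch.randn(4, 12, 64)               # e.g., BiLSTM outputs
emo_logits, sent_logits = model(states)
# Multi-label emotion training would use nn.BCEWithLogitsLoss on emo_logits;
# the auxiliary sentiment task uses nn.CrossEntropyLoss on sent_logits.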