Incorporating external knowledge into machine learning algorithms for NLP applications

Bibliographic Details
Main Author: Li, Pengfei
Other Authors: Mao Kezhi
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/144577
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-144577
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Li, Pengfei
Incorporating external knowledge into machine learning algorithms for NLP applications
description Natural Language Processing (NLP) is a sub-field of Artificial Intelligence (AI) that mainly uses machine learning algorithms to process and analyze large amounts of text data. It gives machines the ability to read, understand, and derive meaning from human languages, and potentially to generate human language. A key issue in modern statistical NLP is text representation learning, which transforms unstructured text data into structured numerical representations. A good text representation should capture the lexical, syntactic, and semantic information that matters for a given NLP task, such as keywords and cue phrases, conceptual information, and long-distance dependencies. The traditional Bag-of-Words (BoW) model represents text as a fixed-length vector over the vocabulary, where each dimension holds a numerical value such as a word frequency or a tf-idf weight (sketched below). However, BoW considers only the surface form of words and suffers from high dimensionality and sparsity. Deep neural networks have been shown to be more effective, since they can exploit word-order information and capture richer semantic features; the commonly adopted architectures include the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Transformer. However, deep neural networks normally require large amounts of training data, heavy computation, and sufficient CPU/GPU memory. A lack of high-quality training data easily leads to under-fitting or over-fitting, especially for data-driven deep neural networks. In addition, hardware constraints and poor interpretability often prevent deep neural networks from being deployed in real-world NLP applications. External knowledge has proven beneficial for machine learning algorithms, reducing their reliance on training data and providing additional useful information. For natural language, abundant publicly available knowledge bases such as WordNet, FrameNet, and Wikipedia can be exploited for various NLP tasks. However, different tasks require different knowledge, and different machine learning models have different architectures and operations, so how to effectively incorporate useful external knowledge into machine learning algorithms remains an open research question. This thesis focuses on incorporating existing knowledge from external knowledge bases into machine learning algorithms as prior knowledge for NLP applications. By utilizing external knowledge, we aim to obtain better text representations, reduce the model's reliance on training data, and improve model interpretability. We demonstrate the advantages of leveraging both data and knowledge in machine learning systems, and we provide general frameworks for incorporating external knowledge into different machine learning algorithms to improve their performance on various NLP tasks. Specifically: 1. For the BoW model, we show how to utilize conceptual knowledge from a probabilistic knowledge base (Probase) to construct a Bag-of-Concepts (BoC) representation that provides more semantic and conceptual information about the text, as well as better interpretability, for document classification. 2. For CNN, we demonstrate how to automatically generate convolutional filters from lexical knowledge bases such as WordNet and FrameNet to improve its ability to capture keywords and cue phrases for causal relation extraction (sketched below).
3. For the Transformer, we propose a complementary knowledge-attention encoder that incorporates prior knowledge from lexical knowledge bases to better capture important linguistic clues, together with three effective ways of integrating knowledge-attention with the Transformer's self-attention, so as to maximize the utilization of both knowledge and data for relation extraction. 4. For neural networks that use the attention mechanism, we show how to incorporate word sentiment intensity information from SentiWordNet into the attention mechanism for sentiment analysis (sketched below). In addition, we propose two novel neural architectures, the Convolutional Transformer (ConvTransformer) and the Attentive Convolutional Transformer (ACT), which combine the advantages of CNN and Transformer for efficient text representation.
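
The BoW and BoC representations contrasted in the abstract can be made concrete in a few lines. Below is a minimal, self-contained sketch: the documents, vocabulary, and word-to-concept probabilities standing in for Probase are invented for illustration; a real BoC system would query Probase for P(concept | word).

```python
# Toy contrast of tf-idf Bag-of-Words vs. a Bag-of-Concepts built from a
# hypothetical word->concept table (a stand-in for Probase).
import math
from collections import Counter

docs = [
    "apple banana fruit market",
    "apple microsoft software market",
]

# --- Bag-of-Words with tf-idf weights: one dimension per vocabulary word ---
vocab = sorted({w for d in docs for w in d.split()})
df = Counter(w for d in docs for w in set(d.split()))  # document frequency
N = len(docs)

def bow_tfidf(doc):
    tf = Counter(doc.split())
    return [tf[w] * math.log((1 + N) / (1 + df[w])) for w in vocab]

# --- Bag-of-Concepts: one dimension per concept --------------------------
# Hypothetical P(concept | word); real BoC would query Probase instead.
word2concepts = {
    "apple":     {"fruit": 0.6, "company": 0.4},
    "banana":    {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}
concepts = sorted({c for cs in word2concepts.values() for c in cs})

def boc(doc):
    scores = Counter()
    for w in doc.split():
        for c, p in word2concepts.get(w, {}).items():
            scores[c] += p  # accumulate conceptual evidence per concept
    return [scores[c] for c in concepts]

for d in docs:
    print(d)
    print("  BoW tf-idf:", [round(x, 2) for x in bow_tfidf(d)])
    print("  BoC       :", [round(x, 2) for x in boc(d)])
```

Note how the ambiguous word "apple" spreads its mass over both "fruit" and "company": that conceptual signal, absent from the surface-form BoW vector, is what BoC adds, and the concept dimensions are also far fewer and more interpretable than a full vocabulary.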
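Contribution 2 replaces randomly initialized convolutional filters with filters generated from lexical knowledge. The sketch below is one plausible toy reading of that idea: each causal cue phrase from a hypothetical lexicon becomes a filter built by stacking its word embeddings, so the convolution responds most strongly wherever a cue phrase, or something distributionally similar, occurs. The cue list and embeddings are assumptions, not the thesis's actual resources.

```python
# Toy knowledge-derived convolutional filters for causal cue detection.
import numpy as np

rng = np.random.default_rng(1)
d = 8                                    # embedding width
emb = {w: rng.normal(size=d) for w in
       ["because", "cause", "lead", "to", "result", "in"]}

# Each causal cue phrase becomes one filter: its stacked word vectors.
cues = [["because"], ["lead", "to"], ["result", "in"]]
width = max(len(c) for c in cues)        # filter width = longest cue
filters = np.zeros((len(cues), width, d))
for i, cue in enumerate(cues):
    for j, w in enumerate(cue):
        filters[i, j] = emb[w]           # shorter cues are zero-padded

def conv_scores(tokens):
    """Slide each knowledge-derived filter over the token sequence."""
    X = np.stack([emb.get(t, np.zeros(d)) for t in tokens])
    T = len(tokens)
    out = np.zeros((len(cues), T - width + 1))
    for i in range(len(cues)):
        for t in range(T - width + 1):
            out[i, t] = np.sum(filters[i] * X[t:t + width])
    return out   # high response where a cue phrase (or similar) occurs

print(conv_scores(["smoking", "lead", "to", "cancer"]).round(2))
```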
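Contributions 3 and 4 both bias attention with prior knowledge. The numpy sketch below shows one simple way such a bias could enter scaled dot-product attention: a per-token prior score from a lexicon (for example, cue-phrase membership or SentiWordNet sentiment intensity) is added to the attention logits as a log-prior, and the knowledge-biased stream is mixed with ordinary self-attention through a gate. The prior scores, the log-prior bias, and the fixed gate value are illustrative assumptions rather than the thesis's actual formulations.

```python
# Toy knowledge-biased attention mixed with standard self-attention.
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8                        # sequence length, model width
X = rng.normal(size=(T, d))        # token representations

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Standard scaled dot-product self-attention (single head).
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv
self_attn = softmax(Q @ K.T / np.sqrt(d)) @ V

# Prior relevance of each token from an external lexicon, e.g. cue-phrase
# membership in WordNet/FrameNet, or SentiWordNet intensity (hypothetical).
prior = np.array([0.0, 0.9, 0.1, 0.0, 0.7])

# Knowledge attention: add the log-prior to every row of attention logits,
# biasing all queries toward knowledge-flagged key positions.
know_logits = Q @ K.T / np.sqrt(d) + np.log(prior + 1e-6)
know_attn = softmax(know_logits) @ V

# One simple integration: a scalar gate mixing the two streams
# (fixed here; it would be learned in practice).
gate = 0.5
out = gate * self_attn + (1 - gate) * know_attn
print(out.shape)                   # (5, 8)
```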
author2 Mao Kezhi
author_facet Mao Kezhi
Li, Pengfei
format Thesis-Doctor of Philosophy
author Li, Pengfei
author_sort Li, Pengfei
title Incorporating external knowledge into machine learning algorithms for NLP applications
title_short Incorporating external knowledge into machine learning algorithms for NLP applications
title_full Incorporating external knowledge into machine learning algorithms for NLP applications
title_fullStr Incorporating external knowledge into machine learning algorithms for NLP applications
title_full_unstemmed Incorporating external knowledge into machine learning algorithms for NLP applications
title_sort incorporating external knowledge into machine learning algorithms for nlp applications
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/144577
_version_ 1772828362170957824
spelling sg-ntu-dr.10356-144577 2023-07-04T16:26:37Z Incorporating external knowledge into machine learning algorithms for NLP applications Li, Pengfei Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2020-11-13T02:42:41Z 2020-11-13T02:42:41Z 2020 Thesis-Doctor of Philosophy Li, P. (2020). Incorporating external knowledge into machine learning algorithms for NLP applications. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/144577 10.32657/10356/144577 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University