Incorporating external knowledge into machine learning algorithms for NLP applications

Bibliographic Details
Main Author: Li, Pengfei
Other Authors: Mao Kezhi
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2020
Subjects: Engineering::Computer science and engineering
Online Access:https://hdl.handle.net/10356/144577
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-144577
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
spellingShingle Engineering::Computer science and engineering
Li, Pengfei
Incorporating external knowledge into machine learning algorithms for NLP applications
description Natural Language Processing (NLP) is a sub-field of Artificial Intelligence (AI) that mainly uses machine learning algorithms to process and analyze large amounts of text data. It gives machines the ability to read, understand, and derive meaning from human languages, and potentially to generate human language. A key issue in modern statistical NLP is text representation learning, which transforms unstructured text data into structured numerical representations. A good text representation should capture the lexical, syntactic, and semantic information that matters for a given NLP task, such as keywords and cue phrases, conceptual information, and long-distance dependencies. The traditional Bag-of-Words (BoW) model represents text as a fixed-length vector over the vocabulary, where each dimension holds a numerical value such as a word frequency or a tf-idf weight (sketched below). However, BoW considers only the surface form of words and suffers from high dimensionality and sparsity. Deep neural networks have been shown to be more effective, since they can exploit word-order information and capture richer semantic features; the commonly adopted architectures include the Convolutional Neural Network (CNN), the Recurrent Neural Network (RNN), and the Transformer. However, deep neural networks normally require large amounts of training data, heavy computation, and sufficient CPU/GPU memory. A lack of high-quality training data easily leads to under-fitting or over-fitting, especially for data-driven deep neural networks. In addition, hardware constraints and poor interpretability often prevent deep neural networks from being deployed in real-world NLP applications. External knowledge has proven beneficial for machine learning algorithms, reducing their reliance on training data and providing additional useful information. For natural language, abundant publicly available knowledge bases such as WordNet, FrameNet, and Wikipedia can be exploited for various NLP tasks. However, different tasks require different knowledge, and different machine learning models have different architectures and operations, so how to effectively incorporate useful external knowledge into machine learning algorithms remains an open research question. This thesis focuses on incorporating existing knowledge from external knowledge bases into machine learning algorithms as prior knowledge for NLP applications. By utilizing external knowledge, we aim to obtain better text representations, reduce the model's reliance on training data, and improve model interpretability. We demonstrate the advantages of leveraging both data and knowledge in machine learning systems, and we provide general frameworks for incorporating external knowledge into different machine learning algorithms to improve their performance on various NLP tasks. Specifically: 1. For the BoW model, we show how to utilize conceptual knowledge from a probabilistic knowledge base (Probase) to construct a Bag-of-Concepts (BoC) representation that provides more semantic and conceptual information about the text, as well as better interpretability, for document classification. 2. For CNN, we demonstrate how to automatically generate convolutional filters from lexical knowledge bases such as WordNet and FrameNet to improve its ability to capture keywords and cue phrases for causal relation extraction (sketched below).
3. For the Transformer, we propose a complementary knowledge-attention encoder that incorporates prior knowledge from lexical knowledge bases to better capture important linguistic clues, together with three effective ways of integrating knowledge-attention with the Transformer's self-attention, so as to maximize the utilization of both knowledge and data for relation extraction. 4. For neural networks that use the attention mechanism, we show how to incorporate word sentiment intensity information from SentiWordNet into the attention mechanism for sentiment analysis (sketched below). In addition, we propose two novel neural architectures, the Convolutional Transformer (ConvTransformer) and the Attentive Convolutional Transformer (ACT), which combine the advantages of CNN and Transformer for efficient text representation.
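
The BoW and BoC representations contrasted in the abstract can be made concrete in a few lines. Below is a minimal, self-contained sketch: the documents, vocabulary, and word-to-concept probabilities standing in for Probase are invented for illustration; a real BoC system would query Probase for P(concept | word).

```python
# Toy contrast of tf-idf Bag-of-Words vs. a Bag-of-Concepts built from a
# hypothetical word->concept table (a stand-in for Probase).
import math
from collections import Counter

docs = [
    "apple banana fruit market",
    "apple microsoft software market",
]

# --- Bag-of-Words with tf-idf weights: one dimension per vocabulary word ---
vocab = sorted({w for d in docs for w in d.split()})
df = Counter(w for d in docs for w in set(d.split()))  # document frequency
N = len(docs)

def bow_tfidf(doc):
    tf = Counter(doc.split())
    return [tf[w] * math.log((1 + N) / (1 + df[w])) for w in vocab]

# --- Bag-of-Concepts: one dimension per concept --------------------------
# Hypothetical P(concept | word); real BoC would query Probase instead.
word2concepts = {
    "apple":     {"fruit": 0.6, "company": 0.4},
    "banana":    {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}
concepts = sorted({c for cs in word2concepts.values() for c in cs})

def boc(doc):
    scores = Counter()
    for w in doc.split():
        for c, p in word2concepts.get(w, {}).items():
            scores[c] += p  # accumulate conceptual evidence per concept
    return [scores[c] for c in concepts]

for d in docs:
    print(d)
    print("  BoW tf-idf:", [round(x, 2) for x in bow_tfidf(d)])
    print("  BoC       :", [round(x, 2) for x in boc(d)])
```

Note how the ambiguous word "apple" spreads its mass over both "fruit" and "company": that conceptual signal, absent from the surface-form BoW vector, is what BoC adds, and the concept dimensions are also far fewer and more interpretable than a full vocabulary.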
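Contribution 2 replaces randomly initialized convolutional filters with filters generated from lexical knowledge. The sketch below is one plausible toy reading of that idea: each causal cue phrase from a hypothetical lexicon becomes a filter built by stacking its word embeddings, so the convolution responds most strongly wherever a cue phrase, or something distributionally similar, occurs. The cue list and embeddings are assumptions, not the thesis's actual resources.

```python
# Toy knowledge-derived convolutional filters for causal cue detection.
import numpy as np

rng = np.random.default_rng(1)
d = 8                                    # embedding width
emb = {w: rng.normal(size=d) for w in
       ["because", "cause", "lead", "to", "result", "in"]}

# Each causal cue phrase becomes one filter: its stacked word vectors.
cues = [["because"], ["lead", "to"], ["result", "in"]]
width = max(len(c) for c in cues)        # filter width = longest cue
filters = np.zeros((len(cues), width, d))
for i, cue in enumerate(cues):
    for j, w in enumerate(cue):
        filters[i, j] = emb[w]           # shorter cues are zero-padded

def conv_scores(tokens):
    """Slide each knowledge-derived filter over the token sequence."""
    X = np.stack([emb.get(t, np.zeros(d)) for t in tokens])
    T = len(tokens)
    out = np.zeros((len(cues), T - width + 1))
    for i in range(len(cues)):
        for t in range(T - width + 1):
            out[i, t] = np.sum(filters[i] * X[t:t + width])
    return out   # high response where a cue phrase (or similar) occurs

print(conv_scores(["smoking", "lead", "to", "cancer"]).round(2))
```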
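Contributions 3 and 4 both bias attention with prior knowledge. The numpy sketch below shows one simple way such a bias could enter scaled dot-product attention: a per-token prior score from a lexicon (for example, cue-phrase membership or SentiWordNet sentiment intensity) is added to the attention logits as a log-prior, and the knowledge-biased stream is mixed with ordinary self-attention through a gate. The prior scores, the log-prior bias, and the fixed gate value are illustrative assumptions rather than the thesis's actual formulations.

```python
# Toy knowledge-biased attention mixed with standard self-attention.
import numpy as np

rng = np.random.default_rng(0)
T, d = 5, 8                        # sequence length, model width
X = rng.normal(size=(T, d))        # token representations

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

# Standard scaled dot-product self-attention (single head).
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv
self_attn = softmax(Q @ K.T / np.sqrt(d)) @ V

# Prior relevance of each token from an external lexicon, e.g. cue-phrase
# membership in WordNet/FrameNet, or SentiWordNet intensity (hypothetical).
prior = np.array([0.0, 0.9, 0.1, 0.0, 0.7])

# Knowledge attention: add the log-prior to every row of attention logits,
# biasing all queries toward knowledge-flagged key positions.
know_logits = Q @ K.T / np.sqrt(d) + np.log(prior + 1e-6)
know_attn = softmax(know_logits) @ V

# One simple integration: a scalar gate mixing the two streams
# (fixed here; it would be learned in practice).
gate = 0.5
out = gate * self_attn + (1 - gate) * know_attn
print(out.shape)                   # (5, 8)
```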
author2 Mao Kezhi
author_facet Mao Kezhi
Li, Pengfei
format Thesis-Doctor of Philosophy
author Li, Pengfei
author_sort Li, Pengfei
title Incorporating external knowledge into machine learning algorithms for NLP applications
title_short Incorporating external knowledge into machine learning algorithms for NLP applications
title_full Incorporating external knowledge into machine learning algorithms for NLP applications
title_fullStr Incorporating external knowledge into machine learning algorithms for NLP applications
title_full_unstemmed Incorporating external knowledge into machine learning algorithms for NLP applications
title_sort incorporating external knowledge into machine learning algorithms for nlp applications
publisher Nanyang Technological University
publishDate 2020
url https://hdl.handle.net/10356/144577
_version_ 1772828362170957824
spelling sg-ntu-dr.10356-144577 2023-07-04T16:26:37Z Incorporating external knowledge into machine learning algorithms for NLP applications Li, Pengfei Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Engineering::Computer science and engineering Doctor of Philosophy 2020-11-13T02:42:41Z 2020-11-13T02:42:41Z 2020 Thesis-Doctor of Philosophy Li, P. (2020). Incorporating external knowledge into machine learning algorithms for NLP applications. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/144577 10.32657/10356/144577 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University