Question answering with textual sequence matching
| Field | Value |
|---|---|
| Main Author | |
| Format | text |
| Language | English |
| Published | Institutional Knowledge at Singapore Management University, 2019 |
| Subjects | |
| Online Access | https://ink.library.smu.edu.sg/etd_coll/197 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=1197&context=etd_coll |
| Institution | Singapore Management University |
Summary:

Question answering (QA) is one of the most important applications in natural language processing. With the explosive growth of text data on the Internet, automatically answering questions helps people collect useful information more efficiently. My research in this thesis mainly focuses on solving the question answering problem with textual sequence matching models, which build vectorized representations of text sequence pairs to enable better reasoning. The thesis consists of three major parts.
In Part I, we propose two general models for building vectorized representations over a pair of sentences, which can be directly applied to tasks such as answer selection and natural language inference. In Chapter 3, we propose a model named "match-LSTM", which performs word-by-word matching followed by an LSTM to place more emphasis on important word-level matching representations. On the Stanford Natural Language Inference (SNLI) corpus, our model achieved state-of-the-art performance. In Chapter 4, we present a general "compare-aggregate" framework that performs word-level matching followed by aggregation using convolutional neural networks. We explore six different comparison functions for word-level matching, and find that simple comparison functions based on element-wise operations work better than comparison functions based on standard neural networks and neural tensor networks.
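To make the element-wise comparison idea concrete, the following is a minimal PyTorch sketch of a subtract-and-multiply style comparison function (element-wise difference and product followed by a small feed-forward projection). The class name, shapes, and interface here are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn

class SubMultCompare(nn.Module):
    """Illustrative element-wise comparison function (a sketch, not the
    thesis code). Each word vector in `a` is compared with its aligned
    counterpart in `b` via squared difference and product; a downstream
    aggregation layer (e.g., a CNN) would consume the result."""

    def __init__(self, dim: int, hidden: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU())

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        # a, b: (batch, seq_len, dim); b holds the attention-aligned vectors
        sub = (a - b) * (a - b)   # element-wise squared difference
        mult = a * b              # element-wise product
        return self.proj(torch.cat([sub, mult], dim=-1))
```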
In Part II, we apply the sequence matching model to machine reading comprehension, where a model must answer a question based on a specific passage. In Chapter 5, we explore the power of word-level matching for locating the answer span in the given passage. We propose an end-to-end neural architecture based on match-LSTM and Pointer Net, which constrains the output tokens to come from the given passage, and we propose two ways of using Pointer Net for this task. Our experiments show that both models substantially outperform the best previous result, which used logistic regression and manually crafted features; our boundary model also achieved the best performance on the SQuAD and MS MARCO datasets. In Chapter 6, we explore another challenging task, multiple-choice reading comprehension, where several candidate answers are given in addition to the question-related passage. We propose a new co-matching approach that jointly models whether a passage can match both a question and a candidate answer.
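The boundary model's core idea, predicting a start and an end position over the passage, can be sketched as follows. The actual architecture scores positions with an attention-based Pointer Net conditioned on a decoder state; this simplified version, with assumed names and shapes, uses independent linear scorers only to show the span-prediction interface.

```python
import torch
import torch.nn as nn

class BoundaryPointerSketch(nn.Module):
    """Simplified boundary-style answer pointer (not the actual Pointer
    Net formulation): score each passage position once as a potential
    answer start and once as a potential answer end."""

    def __init__(self, dim: int):
        super().__init__()
        self.start_scorer = nn.Linear(dim, 1)
        self.end_scorer = nn.Linear(dim, 1)

    def forward(self, h: torch.Tensor):
        # h: (batch, passage_len, dim) matching representations of the passage
        log_p_start = torch.log_softmax(self.start_scorer(h).squeeze(-1), dim=-1)
        log_p_end = torch.log_softmax(self.end_scorer(h).squeeze(-1), dim=-1)
        # At prediction time, choose the pair (start, end) with start <= end
        # that maximizes log_p_start[start] + log_p_end[end].
        return log_p_start, log_p_end
```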
In Part III, we focus on open-domain question answering, where, unlike the reading comprehension task, no specific passage is given. Our models for this problem still rely on textual sequence matching to build the ranking and reading comprehension components. In Chapter 7, we present a novel open-domain QA system called Reinforced Ranker-Reader (R3), which jointly trains a passage Ranker with an answer-extraction Reader using reinforcement learning. We report extensive experimental results showing that our method significantly improves on the state of the art on multiple open-domain QA datasets. Because this system can use only a single retrieved passage to answer a question, in Chapter 8 we propose two models, strength-based re-ranking and coverage-based re-ranking, that make use of multiple passages to generate answers. Our models achieved state-of-the-art results on three public open-domain QA datasets, Quasar-T, SearchQA, and the open-domain version of TriviaQA, with improvements of about 8 percentage points on the first two.
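As a rough illustration of strength-based re-ranking, the sketch below merges reader scores for identical answer strings extracted from different passages, so an answer supported by many passages accumulates more evidence. The (answer, score) input format and the string normalization are assumptions made for illustration, not the interface used in the thesis.

```python
from collections import Counter

def strength_rerank(candidates):
    """Strength-based re-ranking sketch. `candidates` is an assumed list
    of (answer_string, reader_score) pairs, one per retrieved passage;
    scores for the same (normalized) answer are summed across passages."""
    strength = Counter()
    for answer, score in candidates:
        strength[answer.strip().lower()] += score  # merge duplicate answers
    return max(strength, key=strength.get)

# Example: an answer found in three passages outweighs a single
# high-scoring candidate from one passage.
print(strength_rerank([("Barack Obama", 0.4), ("barack obama", 0.5),
                       ("Barack Obama ", 0.45), ("George Bush", 0.9)]))
```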