Explainable Q&A system based on domain-specific knowledge graph
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2021 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/146724 |
Institution: | Nanyang Technological University |
Summary: | The rapid development of the Internet has made an enormous amount of information available, which has turned information fragmentation into a serious problem. Traditional information retrieval (IR) systems return a list of web sites in which the needed information may be found, but users must spend considerable time digesting many web pages and summarizing the information they want, especially for complex search tasks. To alleviate information fragmentation and accelerate IR, much research on Question and Answering (Q&A) systems attempts to assist the search engine by directly providing simple, accurate and understandable answers to natural language queries. However, without the original semantic context, these answers lack explainability, which makes them difficult for users to trust and adopt.
In recent years, the knowledge graph has been widely used to make explainable artificial intelligence (XAI) possible in many fields (e.g. recommendation systems), since it is a large-scale semantic network that represents knowledge as concepts and their relations, which is similar to the human cognitive process. Encouraged by the promising results in these fields, this thesis investigates whether and how the knowledge graph and its explainability can be leveraged in Q&A systems to enhance the performance of IR. Firstly, existing Q&A systems lack a framework of the Q&A cognitive process based on the knowledge graph. In order to provide a human-centred explanation, an artificial intelligence (AI) system should align with the human cognitive model and explain within the basic framework of human cognition. To equip the Q&A system with human-like cognitive capabilities, this thesis presents a brain-inspired cognitive framework of the Q&A process named “XBot”. XBot proposes five modules corresponding to the human cognitive process: perception, planning, reasoning, response and learning. This design is largely inspired by the cognitive science literature. It can be used as a basis for designing a knowledge graph based Q&A system that can understand and answer questions and provide a human-centred explanation.
Secondly, existing query representation and knowledge graph search methods are insufficient to represent and solve complex multi-condition queries or to generate explanations, and features such as topological structure and indirect relations are not fully utilized in answer reasoning. This thesis presents a search engine assistant for developers named “DeveloperBot”, based on a knowledge graph of the software engineering domain. DeveloperBot contains a query graph construction algorithm that splits a multi-condition query into several simple constraints and determines the order in which they are solved. A fast graph cyclic pruning reasoning algorithm, inspired by the spreading activation model of cognitive science, is then introduced. This algorithm models constraint solving as a subgraph search and decision-making process driven by a deep neural network. Finally, the corresponding reasoning subgraph and confidence are derived following the cognitive process and serve as the qualitative and quantitative explanations of the final answers. These algorithms implement the BotPerception, BotPlanning and BotReasoning modules of the XBot framework, respectively.
Thirdly, existing knowledge graph extraction methods fall short in the precision and completeness of textual knowledge extraction, and they cannot extract the knowledge graph of a specified domain from text materials. As a result, the extracted knowledge graph is very large and contains a lot of redundant information. In order to limit the scale of the knowledge graph and accelerate graph search, this thesis elaborates a knowledge graph extraction algorithm named “HDSKG” (Harvesting Domain-specific Knowledge Graph), which combines a dependency parser with a rule-based method to chunk candidate relation triples (the basic unit of a knowledge graph) with high precision and completeness. It then extracts novel features of these candidates to estimate their domain relevance with a self-training SVM (Support Vector Machine) classifier. HDSKG is an implementation of the BotLearning module of the XBot framework. Finally, to evaluate the performance and practical value of the proposed models, HDSKG was applied to construct a high-quality knowledge graph of the software engineering domain for DeveloperBot. A prototype of DeveloperBot was then implemented and a user study involving 24 participants was conducted. The results of the user study show that, compared with using Google alone, users assisted by DeveloperBot not only find answers faster and more accurately but also understand the answers more deeply. At the same time, the explanations of the answers significantly improve the users’ trust in and adoption of the answers. Furthermore, the more complex the question, the more effective DeveloperBot is. |
---|
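
The summary describes splitting a multi-condition query into simple constraints, choosing the order in which they are solved, and returning the supporting subgraph as an explanation. The sketch below illustrates that general idea only; it is not the thesis's algorithm. The toy graph, the `Triple` and `Constraint` types, the `answer` function and the "fewest matches first" ordering heuristic are all assumptions made for illustration, and the neural decision-making and confidence scoring mentioned in the summary are not modelled.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One knowledge graph edge: (subject, relation, object)."""
    subject: str
    relation: str
    obj: str


class Constraint(NamedTuple):
    """One simple condition on the unknown answer: ?x --relation--> obj."""
    relation: str
    obj: str


# Hypothetical toy graph for the software engineering domain.
GRAPH = [
    Triple("Jest", "is_a", "testing framework"),
    Triple("Jest", "written_in", "JavaScript"),
    Triple("Jest", "developed_by", "Facebook"),
    Triple("Mocha", "is_a", "testing framework"),
    Triple("Mocha", "written_in", "JavaScript"),
    Triple("JUnit", "is_a", "testing framework"),
    Triple("JUnit", "written_in", "Java"),
    Triple("React", "developed_by", "Facebook"),
    Triple("React", "written_in", "JavaScript"),
]

# "Which JavaScript testing framework is developed by Facebook?"
# expressed as three simple constraints (assumed decomposition).
QUERY = [
    Constraint("is_a", "testing framework"),
    Constraint("written_in", "JavaScript"),
    Constraint("developed_by", "Facebook"),
]


def matches(graph, constraint):
    """Return every triple in the graph that satisfies one constraint."""
    return [t for t in graph
            if t.relation == constraint.relation and t.obj == constraint.obj]


def answer(graph, constraints):
    """Solve constraints one by one, pruning candidates after each step.

    Returns a dict mapping each surviving answer to the triples that
    support it, which acts as a small explanation subgraph.
    """
    # Assumed heuristic: solve the most selective constraint first.
    ordered = sorted(constraints, key=lambda c: len(matches(graph, c)))
    candidates = None
    evidence = {}
    for constraint in ordered:
        hits = {t.subject: t for t in matches(graph, constraint)}
        candidates = set(hits) if candidates is None else candidates & set(hits)
        # Record the supporting triple for each candidate still alive.
        for subject in candidates:
            evidence.setdefault(subject, []).append(hits[subject])
        # Drop evidence for candidates that were pruned in this step.
        evidence = {s: ev for s, ev in evidence.items() if s in candidates}
    return {s: evidence[s] for s in sorted(candidates or [])}


result = answer(GRAPH, QUERY)
for subject, support in result.items():
    print(subject, [tuple(t) for t in support])
```

Returning the supporting triples alongside each answer is what makes the result inspectable: a reader can check every edge that led to the answer rather than trusting a bare string.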