Adaptive posterior knowledge selection for improving knowledge-grounded dialogue generation


Bibliographic Details
Main Authors: WANG, Weichao, GAO, Wei, FENG, Shi, CHEN, Ling, WANG, Daling
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2021
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/6678
https://ink.library.smu.edu.sg/context/sis_research/article/7681/viewcontent/3459637.3482314.pdf
Institution: Singapore Management University
Description
Summary: In open-domain dialogue systems, knowledge information such as unstructured persona profiles, text descriptions and structured knowledge graphs can help incorporate abundant background facts for delivering more engaging and informative responses. Existing studies have attempted to model a general posterior distribution over candidate knowledge by considering the entire response utterance as a whole at the beginning of the decoding process for knowledge selection. However, a single smooth distribution can fail to model the variability of knowledge selection patterns across different decoding steps, which makes the knowledge expression less consistent. To remedy this issue, we propose an adaptive posterior knowledge selection framework, which sequentially introduces a series of discriminative distributions to dynamically control when and what knowledge should be used at specific decoding steps. The adaptive distributions can also capture knowledge-relevant semantic dependencies between adjacent words to refine response generation. In particular, for knowledge graph-grounded dialogue generation, we further incorporate the adaptive distributions into the generative word distributions to help express knowledge entity words. Experimental results show that our methods outperform strong baseline systems by large margins.
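To make the described mechanism more concrete, the following is a minimal, hypothetical PyTorch sketch of one decoding step with per-step (adaptive) knowledge selection mixed into the generative word distribution through a copy-style gate. It is not the authors' implementation; the module names, tensor shapes, and the entity_ids mapping are illustrative assumptions based only on the abstract.

```python
# Illustrative sketch only (not the paper's code): one decoding step that
# re-selects knowledge before emitting a word. All names and shapes are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaptiveKnowledgeStep(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int):
        super().__init__()
        self.cell = nn.GRUCell(hidden_size, hidden_size)
        self.know_attn = nn.Linear(hidden_size, hidden_size, bias=False)
        self.vocab_proj = nn.Linear(2 * hidden_size, vocab_size)
        self.gate = nn.Linear(2 * hidden_size, 1)  # "when" to rely on knowledge

    def forward(self, word_emb, state, knowledge, entity_ids):
        # word_emb:   (batch, hidden)      embedding of the previous word
        # knowledge:  (batch, K, hidden)   encoded candidate facts / KG entities
        # entity_ids: (batch, K), long     vocab id of each candidate's entity word
        state = self.cell(word_emb, state)

        # Adaptive per-step distribution over candidate knowledge ("what" to use
        # now), conditioned on the current decoder state rather than being fixed
        # once before decoding starts.
        scores = torch.einsum("bh,bkh->bk", self.know_attn(state), knowledge)
        know_dist = F.softmax(scores, dim=-1)
        know_ctx = torch.einsum("bk,bkh->bh", know_dist, knowledge)

        # Generative word distribution conditioned on the state plus the
        # currently selected knowledge context.
        fused = torch.cat([state, know_ctx], dim=-1)
        word_dist = F.softmax(self.vocab_proj(fused), dim=-1)

        # Copy-style mixture: a gate decides how strongly this step draws on
        # knowledge, and the knowledge distribution is scattered onto the vocab
        # ids of the entity words so they become easier to express.
        g = torch.sigmoid(self.gate(fused))                     # (batch, 1)
        mixed = (1.0 - g) * word_dist
        mixed = mixed.scatter_add(1, entity_ids, g * know_dist)
        return state, know_dist, mixed
```

In a setup like this, the per-step know_dist could additionally be supervised with a posterior computed from the gold response during training, in line with the posterior-selection idea in the abstract; that training objective is omitted from the sketch.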