Adaptive posterior knowledge selection for improving knowledge-grounded dialogue generation
Main Authors: | , , , , |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University 2021 |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6678 https://ink.library.smu.edu.sg/context/sis_research/article/7681/viewcontent/3459637.3482314.pdf |
Institution: | Singapore Management University |
Summary: | In open-domain dialogue systems, knowledge information such as unstructured persona profiles, text descriptions and structured knowledge graphs can help incorporate abundant background facts for delivering more engaging and informative responses. Existing studies attempted to model a general posterior distribution over candidate knowledge by considering the entire response utterance as a whole at the beginning of the decoding process for knowledge selection. However, a single smooth distribution could fail to model the variability of knowledge selection patterns over different decoding steps, and could make knowledge expression less consistent. To remedy this issue, we propose an adaptive posterior knowledge selection framework, which sequentially introduces a series of discriminative distributions to dynamically control when and what knowledge should be used in specific decoding steps. The adaptive distributions can also capture knowledge-relevant semantic dependencies between adjacent words to refine response generation. In particular, for knowledge graph-grounded dialogue generation, we further incorporate the adaptive distributions into generative word distributions to help express the knowledge entity words. The experimental results show that our developed methods outperform strong baseline systems by large margins. |
---|
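
The contrast drawn in the summary, between one selection distribution computed before decoding and a series of per-step distributions, can be illustrated with a toy sketch. This is not the authors' model: the record gives no equations, so the dot-product scoring, the fixed mixing gate, and every name below (`global_selection`, `stepwise_selection`, `mix_word_distribution`, the example vectors and words) are assumptions made only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def global_selection(knowledge_vecs, response_vec):
    """Assumed baseline: one distribution over knowledge items,
    computed once from a summary of the whole response."""
    return softmax([dot(k, response_vec) for k in knowledge_vecs])


def stepwise_selection(knowledge_vecs, step_states):
    """Assumed adaptive variant: one distribution per decoding step,
    scored against that step's decoder state."""
    return [softmax([dot(k, s) for k in knowledge_vecs]) for s in step_states]


def mix_word_distribution(vocab_probs, knowledge_probs, entity_words, gate):
    """Blend the generator's word probabilities with knowledge probabilities
    routed to entity words. `gate` in [0, 1] is a hypothetical fixed weight;
    a real model would presumably learn it."""
    mixed = {w: (1.0 - gate) * p for w, p in vocab_probs.items()}
    for word, p in zip(entity_words, knowledge_probs):
        mixed[word] = mixed.get(word, 0.0) + gate * p
    return mixed


# Hypothetical toy data: two knowledge items, two decoding steps.
knowledge_vecs = [[1.0, 0.0], [0.0, 1.0]]
step_states = [[2.0, 0.0], [0.0, 2.0]]
response_vec = [1.0, 1.0]  # whole-response summary; scores both items equally

single = global_selection(knowledge_vecs, response_vec)
per_step = stepwise_selection(knowledge_vecs, step_states)

# Step 0 favours the first knowledge item; mix it into the word distribution.
mixed = mix_word_distribution(
    vocab_probs={"the": 0.6, "film": 0.4},
    knowledge_probs=per_step[0],
    entity_words=["Inception", "Nolan"],
    gate=0.5,
)
```

In this toy setup the single distribution is flat across both knowledge items, while the per-step distributions each concentrate on a different item, which is the kind of variability across decoding steps the summary refers to. The mixed distribution still sums to one because both inputs are normalised and the gate weights sum to one.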