Neural multimodal belief tracker with adaptive attention for dialogue systems

Bibliographic Details
Main Authors:	ZHANG, Zheng; LIAO, Lizi; HUANG, Minlie; ZHU, Xiaoyan; CHUA, Tat-Seng
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2019
Subjects:	Artificial Intelligence and Robotics; Databases and Information Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/7676 https://ink.library.smu.edu.sg/context/sis_research/article/8679/viewcontent/Neural_Multimodal_Belief_Tracker_with_Adaptive_Attention_for_Dialogue_Systems.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-8679
record_format	dspace
spelling	sg-smu-ink.sis_research-8679 2023-01-10T03:37:54Z Neural multimodal belief tracker with adaptive attention for dialogue systems ZHANG, Zheng LIAO, Lizi HUANG, Minlie ZHU, Xiaoyan CHUA, Tat-Seng
2019-05-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7676 info:doi/10.1145/3308558.3313598 https://ink.library.smu.edu.sg/context/sis_research/article/8679/viewcontent/Neural_Multimodal_Belief_Tracker_with_Adaptive_Attention_for_Dialogue_Systems.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Artificial Intelligence and Robotics Databases and Information Systems
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Artificial Intelligence and Robotics; Databases and Information Systems
spellingShingle	Artificial Intelligence and Robotics Databases and Information Systems ZHANG, Zheng LIAO, Lizi HUANG, Minlie ZHU, Xiaoyan CHUA, Tat-Seng Neural multimodal belief tracker with adaptive attention for dialogue systems
description	Multimodal dialogue systems are attracting increasing attention because they offer a more natural and informative way of human-computer interaction. As one of their core components, the belief tracker estimates the user's goal at each step of the dialogue and provides a direct way to validate the system's ability to understand dialogue. However, existing studies on belief trackers are largely limited to the textual modality and cannot be easily extended to capture the rich semantics in multimodal systems, such as those with product images. For example, in the fashion domain, the visual appearance of clothes plays a crucial role in understanding the user's intention. In this case, existing belief trackers may fail to generate accurate belief states for a multimodal dialogue system. In this paper, we present the first neural multimodal belief tracker (NMBT) to demonstrate how multimodal evidence can facilitate semantic understanding and dialogue state tracking. Given the multimodal inputs, the model applies a textual encoder to represent textual utterances and gives special consideration to the semantics revealed in the visual modality. It learns concept-level fashion semantics by examining image sub-regions and integrating concept probabilities via multiple instance learning. Then, in each turn, an adaptive attention mechanism learns to automatically emphasize different evidence sources from both the visual and textual modalities for more accurate dialogue state prediction. We perform an extensive evaluation on a multi-turn task-oriented dialogue dataset in the fashion domain, and the results show that our method outperforms a wide range of baselines.
format	text
author	ZHANG, Zheng; LIAO, Lizi; HUANG, Minlie; ZHU, Xiaoyan; CHUA, Tat-Seng
author_facet	ZHANG, Zheng; LIAO, Lizi; HUANG, Minlie; ZHU, Xiaoyan; CHUA, Tat-Seng
author_sort	ZHANG, Zheng
title	Neural multimodal belief tracker with adaptive attention for dialogue systems
title_short	Neural multimodal belief tracker with adaptive attention for dialogue systems
title_full	Neural multimodal belief tracker with adaptive attention for dialogue systems
title_fullStr	Neural multimodal belief tracker with adaptive attention for dialogue systems
title_full_unstemmed	Neural multimodal belief tracker with adaptive attention for dialogue systems
title_sort	neural multimodal belief tracker with adaptive attention for dialogue systems
publisher	Institutional Knowledge at Singapore Management University
publishDate	2019
url	https://ink.library.smu.edu.sg/sis_research/7676 https://ink.library.smu.edu.sg/context/sis_research/article/8679/viewcontent/Neural_Multimodal_Belief_Tracker_with_Adaptive_Attention_for_Dialogue_Systems.pdf
_version_	1770576412222160896
