Dialogue systems with audio context
Research on building dialogue systems that converse with humans naturally has recently attracted a lot of attention. Most work in this area assumes text-based conversation, where the user message is modeled as a sequence of words in a vocabulary. Real-world human conversation, in contrast, involves...
Main Authors: | Young, Tom; Pandelea, Vlad; Poria, Soujanya; Cambria, Erik |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | Engineering::Computer science and engineering; Dialogue Systems; Audio Features |
Online Access: | https://hdl.handle.net/10356/160968 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-160968 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1609682022-08-10T01:14:21Z Dialogue systems with audio context Young, Tom Pandelea, Vlad Poria, Soujanya Cambria, Erik School of Computer Science and Engineering Engineering::Computer science and engineering Dialogue Systems Audio Features Research on building dialogue systems that converse with humans naturally has recently attracted a lot of attention. Most work in this area assumes text-based conversation, where the user message is modeled as a sequence of words in a vocabulary. Real-world human conversation, in contrast, involves other modalities, such as voice, facial expression and body language, which can influence the conversation significantly in certain scenarios. In this work, we explore the impact of incorporating the audio features of the user message into generative dialogue systems. Specifically, we first design an auxiliary response retrieval task for audio representation learning. Then, we use word-level modality fusion to incorporate the audio features as additional context to our main generative model. Experiments show that our audio-augmented model outperforms the audio-free counterpart on perplexity, response diversity and human evaluation. Agency for Science, Technology and Research (A*STAR) This research is supported by the Agency for Science, Technology and Research (A*STAR) under its AME Programmatic Funding Scheme (Project #A18A2b0046). 2022-08-10T01:14:21Z 2022-08-10T01:14:21Z 2020 Journal Article Young, T., Pandelea, V., Poria, S. & Cambria, E. (2020). Dialogue systems with audio context. Neurocomputing, 388, 102-109. https://dx.doi.org/10.1016/j.neucom.2019.12.126 0925-2312 https://hdl.handle.net/10356/160968 10.1016/j.neucom.2019.12.126 2-s2.0-85078219602 388 102 109 en A18A2b0046 Neurocomputing © 2020 Elsevier B.V. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Dialogue Systems Audio Features |
spellingShingle |
Engineering::Computer science and engineering Dialogue Systems Audio Features Young, Tom Pandelea, Vlad Poria, Soujanya Cambria, Erik Dialogue systems with audio context |
description |
Research on building dialogue systems that converse with humans naturally has recently attracted a lot of attention. Most work in this area assumes text-based conversation, where the user message is modeled as a sequence of words in a vocabulary. Real-world human conversation, in contrast, involves other modalities, such as voice, facial expression and body language, which can influence the conversation significantly in certain scenarios. In this work, we explore the impact of incorporating the audio features of the user message into generative dialogue systems. Specifically, we first design an auxiliary response retrieval task for audio representation learning. Then, we use word-level modality fusion to incorporate the audio features as additional context to our main generative model. Experiments show that our audio-augmented model outperforms the audio-free counterpart on perplexity, response diversity and human evaluation. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Young, Tom Pandelea, Vlad Poria, Soujanya Cambria, Erik |
format |
Article |
author |
Young, Tom Pandelea, Vlad Poria, Soujanya Cambria, Erik |
author_sort |
Young, Tom |
title |
Dialogue systems with audio context |
title_short |
Dialogue systems with audio context |
title_full |
Dialogue systems with audio context |
title_fullStr |
Dialogue systems with audio context |
title_full_unstemmed |
Dialogue systems with audio context |
title_sort |
dialogue systems with audio context |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/160968 |
_version_ |
1743119467020288000 |