A unified dialogue user simulator for few-shot data augmentation

Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue cor...

Bibliographic Details
Main Authors: WAN, Dazhen, ZHANG, Zheng, ZHU, Qi, LIAO, Lizi, HUANG, Minlie
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects: Artificial Intelligence and Robotics, Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/7578
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=8581&context=sis_research
Institution: Singapore Management University
id sg-smu-ink.sis_research-8581
record_format dspace
spelling sg-smu-ink.sis_research-8581 2022-12-12T08:08:46Z A unified dialogue user simulator for few-shot data augmentation WAN, Dazhen ZHANG, Zheng ZHU, Qi LIAO, Lizi HUANG, Minlie Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. Experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation. 2022-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7578 https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=8581&context=sis_research http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Artificial Intelligence and Robotics Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Artificial Intelligence and Robotics
Databases and Information Systems
description Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. Experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation.
format text
author WAN, Dazhen
ZHANG, Zheng
ZHU, Qi
LIAO, Lizi
HUANG, Minlie
author_sort WAN, Dazhen
title A unified dialogue user simulator for few-shot data augmentation
title_sort unified dialogue user simulator for few-shot data augmentation
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/7578
https://ink.library.smu.edu.sg/cgi/viewcontent.cgi?article=8581&context=sis_research
_version_ 1753801200972595200