A unified dialogue user simulator for few-shot data augmentation

Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue cor...

Bibliographic Details
Main Authors: WAN, Dazhen, ZHANG, Zheng, ZHU, Qi, LIAO, Lizi, HUANG, Minlie
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2022
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/7719
https://ink.library.smu.edu.sg/context/sis_research/article/8722/viewcontent/A_unified_dialogue_user_simulator_for_few_shot_data_augmentation.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-8722
record_format dspace
spelling sg-smu-ink.sis_research-8722 2023-03-23T00:33:33Z 2022-12-01T08:00:00Z text application/pdf http://creativecommons.org/licenses/by/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Computational linguistics
Augmentation methods
Data augmentation; Pre-training
Artificial Intelligence and Robotics
Databases and Information Systems
Numerical Analysis and Scientific Computing
description Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora, but they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. Experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation.
format text
author WAN, Dazhen
ZHANG, Zheng
ZHU, Qi
LIAO, Lizi
HUANG, Minlie
author_sort WAN, Dazhen
title A unified dialogue user simulator for few-shot data augmentation
title_sort unified dialogue user simulator for few-shot data augmentation
publisher Institutional Knowledge at Singapore Management University
publishDate 2022
url https://ink.library.smu.edu.sg/sis_research/7719
https://ink.library.smu.edu.sg/context/sis_research/article/8722/viewcontent/A_unified_dialogue_user_simulator_for_few_shot_data_augmentation.pdf
_version_ 1770576420783783936