A unified dialogue user simulator for few-shot data augmentation
Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue cor...
Main Authors: | WAN, Dazhen; ZHANG, Zheng; ZHU, Qi; LIAO, Lizi; HUANG, Minlie |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2022 |
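Subjects: | Computational linguistics; Augmentation methods; Data augmentation; Pre-training; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing |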
Online Access: | https://ink.library.smu.edu.sg/sis_research/7719 https://ink.library.smu.edu.sg/context/sis_research/article/8722/viewcontent/A_unified_dialogue_user_simulator_for_few_shot_data_augmentation.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-8722 |
---|---|
record_format | dspace |
spelling | sg-smu-ink.sis_research-8722 2023-03-23T00:33:33Z A unified dialogue user simulator for few-shot data augmentation WAN, Dazhen; ZHANG, Zheng; ZHU, Qi; LIAO, Lizi; HUANG, Minlie Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. The experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation. 2022-12-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7719 https://ink.library.smu.edu.sg/context/sis_research/article/8722/viewcontent/A_unified_dialogue_user_simulator_for_few_shot_data_augmentation.pdf http://creativecommons.org/licenses/by/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Computational linguistics; Augmentation methods; Data augmentation; Pre-training; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | Computational linguistics; Augmentation methods; Data augmentation; Pre-training; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing |
spellingShingle | Computational linguistics; Augmentation methods; Data augmentation; Pre-training; Artificial Intelligence and Robotics; Databases and Information Systems; Numerical Analysis and Scientific Computing WAN, Dazhen; ZHANG, Zheng; ZHU, Qi; LIAO, Lizi; HUANG, Minlie A unified dialogue user simulator for few-shot data augmentation |
description | Pre-trained language models have shown superior performance in task-oriented dialogues. However, existing datasets are limited in scale and cannot support large-scale pre-training. Fortunately, various data augmentation methods have been developed to augment large-scale task-oriented dialogue corpora. However, they rely heavily on annotated data in the target domain, which requires a tremendous amount of data collection and human labeling work. In this paper, we build a unified dialogue user simulation model by pre-training on several publicly available datasets. The model can then be tuned on a target domain with few-shot data. The experiments on a target dataset across multiple domains show that our proposed model brings remarkable performance increases through data augmentation. |
format | text |
author | WAN, Dazhen; ZHANG, Zheng; ZHU, Qi; LIAO, Lizi; HUANG, Minlie |
author_facet | WAN, Dazhen; ZHANG, Zheng; ZHU, Qi; LIAO, Lizi; HUANG, Minlie |
author_sort | WAN, Dazhen |
title | A unified dialogue user simulator for few-shot data augmentation |
title_short | A unified dialogue user simulator for few-shot data augmentation |
title_full | A unified dialogue user simulator for few-shot data augmentation |
title_fullStr | A unified dialogue user simulator for few-shot data augmentation |
title_full_unstemmed | A unified dialogue user simulator for few-shot data augmentation |
title_sort | unified dialogue user simulator for few-shot data augmentation |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2022 |
url | https://ink.library.smu.edu.sg/sis_research/7719 https://ink.library.smu.edu.sg/context/sis_research/article/8722/viewcontent/A_unified_dialogue_user_simulator_for_few_shot_data_augmentation.pdf |
_version_ | 1770576420783783936 |