Partially randomizing transformer weights for dialogue response diversity
Despite recent progress in generative open-domain dialogue, the issue of low response diversity persists. Prior works have addressed this issue via novel objective functions, alternative learning approaches such as variational frameworks, or architectural extensions such as the Randomized Link (RL) Transformer. However, these approaches typically entail either additional difficulty during training and inference or a significant increase in model size and complexity. Hence, we propose the Partially Randomized transFormer (PaRaFormer), a simple extension of the transformer that freezes the weights of selected layers after random initialization. Experimental results reveal that the performance of the PaRaFormer is comparable to that of the aforementioned approaches, despite not entailing any additional training difficulty or increase in model complexity.
Main Authors: Lee, Jing Yang; Lee, Kong Aik; Gan, Woon-Seng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language: English
Published: 2023
Subjects: Engineering::Electrical and electronic engineering; Transformer; Response
Online Access: https://hdl.handle.net/10356/172416 https://paclic2023.github.io/
Institution: Nanyang Technological University
Conference: 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37)
Citation: Lee, J. Y., Lee, K. A. & Gan, W. (2023). Partially randomizing transformer weights for dialogue response diversity. 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37). https://hdl.handle.net/10356/172416
Rights: © 2023 The Author(s). All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder.
Version: Published version
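The abstract outlines the core mechanism: the weights of selected transformer layers are frozen immediately after random initialization, so they never receive gradient updates. The following is a minimal PyTorch sketch of that general idea; the model size, the layer indices chosen for freezing, and the optimizer settings are illustrative assumptions for the example, not details taken from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical configuration: a small 6-layer encoder. The paper's
# actual architecture and choice of frozen layers are not given in
# this record.
encoder_layer = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
model = nn.TransformerEncoder(encoder_layer, num_layers=6)

frozen_layers = {1, 3}  # assumed: indices of the layers left frozen

for idx, layer in enumerate(model.layers):
    if idx in frozen_layers:
        # These layers keep their random initialization: no gradients
        # are computed for them, so training never updates them.
        for param in layer.parameters():
            param.requires_grad = False

# Only the remaining (trainable) parameters are handed to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4
)
```

Because the frozen layers act as fixed random transformations rather than extra modules, the model adds no parameters or training machinery beyond a standard transformer, consistent with the abstract's claim of no added training difficulty or model complexity.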