Partially randomizing transformer weights for dialogue response diversity

Despite recent progress in generative open-domain dialogue, the issue of low response diversity persists. Prior works have addressed this issue via novel objective functions, alternative learning approaches such as variational frameworks, or architectural extensions such as the Randomized Link (RL) Transformer. However, these approaches typically entail either additional difficulties during training/inference or a significant increase in model size and complexity. Hence, we propose the Partially Randomized transFormer (PaRaFormer), a simple extension of the transformer which involves freezing the weights of selected layers after random initialization. Experimental results reveal that the performance of the PaRaFormer is comparable to that of the aforementioned approaches, despite not entailing any additional training difficulty or increase in model complexity.


Saved in:
Bibliographic Details
Main Authors: Lee, Jing Yang, Lee, Kong Aik, Gan, Woon-Seng
Other Authors: School of Electrical and Electronic Engineering
Format: Conference or Workshop Item
Language:English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/172416
https://paclic2023.github.io/
Institution: Nanyang Technological University
id sg-ntu-dr.10356-172416
record_format dspace
spelling sg-ntu-dr.10356-172416 2023-12-29T15:40:07Z Partially randomizing transformer weights for dialogue response diversity Lee, Jing Yang Lee, Kong Aik Gan, Woon-Seng School of Electrical and Electronic Engineering 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37) Engineering::Electrical and electronic engineering Transformer Response Despite recent progress in generative open-domain dialogue, the issue of low response diversity persists. Prior works have addressed this issue via novel objective functions, alternative learning approaches such as variational frameworks, or architectural extensions such as the Randomized Link (RL) Transformer. However, these approaches typically entail either additional difficulties during training/inference or a significant increase in model size and complexity. Hence, we propose the Partially Randomized transFormer (PaRaFormer), a simple extension of the transformer which involves freezing the weights of selected layers after random initialization. Experimental results reveal that the performance of the PaRaFormer is comparable to that of the aforementioned approaches, despite not entailing any additional training difficulty or increase in model complexity. Published version 2023-12-29T00:34:00Z 2023-12-29T00:34:00Z 2023 Conference Paper Lee, J. Y., Lee, K. A. & Gan, W. (2023). Partially randomizing transformer weights for dialogue response diversity. 37th Pacific Asia Conference on Language, Information and Computation (PACLIC 37). https://hdl.handle.net/10356/172416 https://paclic2023.github.io/ en © 2023 The Author(s). All rights reserved. This article may be downloaded for personal use only. Any other use requires prior permission of the copyright holder. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
Transformer
Response
description Despite recent progress in generative open-domain dialogue, the issue of low response diversity persists. Prior works have addressed this issue via novel objective functions, alternative learning approaches such as variational frameworks, or architectural extensions such as the Randomized Link (RL) Transformer. However, these approaches typically entail either additional difficulties during training/inference or a significant increase in model size and complexity. Hence, we propose the Partially Randomized transFormer (PaRaFormer), a simple extension of the transformer which involves freezing the weights of selected layers after random initialization. Experimental results reveal that the performance of the PaRaFormer is comparable to that of the aforementioned approaches, despite not entailing any additional training difficulty or increase in model complexity.
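The core idea in the description above (freezing the weights of selected layers at their random initialization, so only the remaining layers are trained) can be sketched in a few lines. This is a minimal pure-Python illustration under stated assumptions: the four-layer stack, the choice of which layers to freeze, and the plain SGD update are all illustrative, not details taken from the paper.

```python
import random

random.seed(0)

def make_layer(size):
    # Random initialization, as in a standard transformer layer.
    return [random.gauss(0.0, 0.02) for _ in range(size)]

# Hypothetical 4-layer stack; which layers to freeze is a design
# choice of the method, chosen arbitrarily here for illustration.
layers = [make_layer(8) for _ in range(4)]
frozen = {1, 3}  # indices of layers kept at their random initialization

def sgd_step(layers, grads, lr=0.1):
    """Update only trainable layers; frozen ones keep their random weights."""
    for i, (w, g) in enumerate(zip(layers, grads)):
        if i in frozen:
            continue  # skip: weights stay exactly as randomly initialized
        for j in range(len(w)):
            w[j] -= lr * g[j]

# Dummy gradients for one illustrative training step.
grads = [[1.0] * 8 for _ in range(4)]
before = [list(w) for w in layers]
sgd_step(layers, grads)

for i in range(len(layers)):
    status = "updated" if layers[i] != before[i] else "frozen at random init"
    print(f"layer {i}: {status}")
```

Since the frozen layers are never touched by the optimizer, the model adds no parameters and no extra training machinery, which is the complexity argument made in the abstract.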
author2 School of Electrical and Electronic Engineering
format Conference or Workshop Item
title Partially randomizing transformer weights for dialogue response diversity
publishDate 2023
url https://hdl.handle.net/10356/172416
https://paclic2023.github.io/