Discriminator-enhanced knowledge-distillation networks

Query auto-completion (QAC) serves as a critical functionality in contemporary textual search systems by generating real-time query completion suggestions based on a user’s input prefix. Despite the prevalent use of language models (LMs) for QAC candidate generation, LM-based approaches frequently suffer from overcorrection during pair-wise loss training and from efficiency deficiencies. To address these challenges, this paper presents a novel framework, discriminator-enhanced knowledge distillation (Dis-KD), for the QAC task. The framework combines three core components: a large-scale pre-trained teacher model, a lightweight student model, and a discriminator for adversarial learning. Specifically, the discriminator helps discern generation-level differences between the teacher and the student models. An additional discriminator score loss is combined with the traditional knowledge-distillation loss, improving the performance of the student model. Rather than evaluating each generated word stepwise, the approach assesses the entire generated sequence, which alleviates the prevalent overcorrection issue in the generation process. Consequently, the proposed framework improves model accuracy while reducing parameter size. Empirical results demonstrate the superiority of Dis-KD over established baselines, with the student model surpassing the teacher model on QAC tasks for sub-word languages.
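The record does not include the paper's implementation details, so the following PyTorch sketch is only one plausible reading of the abstract: a discriminator scores the student's complete generated completion, and that sequence-level score is added, as an adversarial term, to a conventional knowledge-distillation loss. Every name, architecture choice, and weighting below (SequenceDiscriminator, kd_loss, dis_kd_loss, alpha, the GRU encoder, the REINFORCE-style reward) is an illustrative assumption, not the authors' code.

# Minimal sketch (not the authors' implementation) of combining a standard
# knowledge-distillation term with a discriminator score computed over the
# student's entire generated sequence, as the abstract describes.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequenceDiscriminator(nn.Module):
    """Scores a whole query completion (not individual tokens) as
    teacher-like (close to 1) or student-like (close to 0)."""
    def __init__(self, vocab_size, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.score = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        emb = self.embed(token_ids)
        _, h = self.encoder(emb)                   # h: (1, batch, hidden_dim)
        return torch.sigmoid(self.score(h[-1])).squeeze(-1)   # (batch,)

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Conventional distillation term: KL divergence between the softened
    teacher and student token distributions (assumed formulation)."""
    t = temperature
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t * t)

def dis_kd_loss(student_logits, teacher_logits, seq_log_prob, disc_score, alpha=0.5):
    """Combined student objective (illustrative).

    seq_log_prob: log-probability the student assigns to its own sampled
                  completion, summed over the sequence, shape (batch,).
    disc_score:   discriminator score of that complete sequence, shape (batch,).
    The discriminator score acts as a sequence-level reward (REINFORCE-style),
    so the whole completion is judged at once rather than word by word.
    """
    distill = kd_loss(student_logits, teacher_logits)
    # detach(): the discriminator is not updated during the student step.
    adversarial = -(disc_score.detach() * seq_log_prob).mean()
    return distill + alpha * adversarial

In this reading, the discriminator itself would be trained separately with an ordinary binary objective to distinguish teacher completions from student completions; only its score on the student's full output enters the student's loss, which is what allows the entire sequence to be judged at once instead of stepwise.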


Bibliographic Details
Main Authors: Li, Zhenping; Cao, Zhen; Li, Pengfei; Zhong, Yong; Li, Shaobo
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Electrical and electronic engineering; Knowledge Distillation; Reinforcement Learning
Online Access: https://hdl.handle.net/10356/171765
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-171765
Citation: Li, Z., Cao, Z., Li, P., Zhong, Y. & Li, S. (2023). Discriminator-enhanced knowledge-distillation networks. Applied Sciences, 13(14), 8041. https://dx.doi.org/10.3390/app13148041
ISSN: 2076-3417
DOI: 10.3390/app13148041
Scopus ID: 2-s2.0-85166261907
Version: Published version
Date Deposited: 2023-11-07
Funding: This work was supported by the AI industrial technology innovation platform of Sichuan Province, grant number 2020ZHCG0002.
Rights: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
File Format: application/pdf
Content Provider: NTU Library
Collection: DR-NTU