Discriminator-enhanced knowledge-distillation networks
Query auto-completion (QAC) serves as a critical functionality in contemporary textual search systems by generating real-time query completion suggestions based on a user’s input prefix. Despite the prevalent use of language models (LMs) in QAC candidate generation, LM-based approaches frequently suffer from overcorrection issues during pair-wise loss training and efficiency deficiencies. To address these challenges, this paper presents a novel framework—discriminator-enhanced knowledge distillation (Dis-KD)—for the QAC task. This framework combines three core components: a large-scale pre-trained teacher model, a lightweight student model, and a discriminator for adversarial learning. Specifically, the discriminator aids in discerning generative-level differences between the teacher and the student models. An additional discriminator score loss is amalgamated with the traditional knowledge-distillation loss, resulting in enhanced performance of the student model. Contrary to the stepwise evaluation of each generated word, our approach assesses the entire generation sequence. This method alleviates the prevalent overcorrection issue in the generation process. Consequently, our proposed framework boasts improvements in model accuracy and a reduction in parameter size. Empirical results highlight the superiority of Dis-KD over established baseline methods, with the student model surpassing the teacher model in QAC tasks for sub-word languages.
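The record contains no implementation details, but the objective described in the abstract (a conventional soft-target distillation term combined with a sequence-level discriminator score) can be sketched roughly as follows. This is an illustrative PyTorch-style sketch under stated assumptions, not the authors' code: the function name `dis_kd_loss`, the weighting factor `alpha`, the softmax `temperature`, and the assumption that the discriminator returns a per-sequence probability of the output coming from the teacher are all hypothetical.

```python
import torch
import torch.nn.functional as F

def dis_kd_loss(student_logits, teacher_logits, disc_score_student,
                temperature=2.0, alpha=0.5):
    """Sketch of a combined objective: soft-target knowledge distillation
    plus a sequence-level discriminator term. Shapes: logits are
    (batch, seq_len, vocab); disc_score_student is (batch,) probabilities."""
    # Token-level distillation: match the student's distribution to the
    # teacher's temperature-softened distribution (standard KD term).
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # Sequence-level adversarial term: the discriminator scores the whole
    # generated completion, and the student is pushed to raise that score,
    # so individual word-level mismatches are not penalised step by step.
    adv = -torch.log(disc_score_student + 1e-8).mean()

    return alpha * kd + (1.0 - alpha) * adv
```

Scoring the completed sequence as a whole rather than each generated word is what the abstract credits with alleviating the overcorrection issue; the weighting between the two terms above is purely an assumption for illustration.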
Main Authors: Li, Zhenping; Cao, Zhen; Li, Pengfei; Zhong, Yong; Li, Shaobo
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Electrical and electronic engineering; Knowledge Distillation; Reinforcement Learning
Online Access: https://hdl.handle.net/10356/171765
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-171765
record_format: dspace
Citation: Li, Z., Cao, Z., Li, P., Zhong, Y. & Li, S. (2023). Discriminator-enhanced knowledge-distillation networks. Applied Sciences, 13(14), 8041. https://dx.doi.org/10.3390/app13148041
Journal: Applied Sciences, volume 13, issue 14, article 8041
ISSN: 2076-3417
DOI: 10.3390/app13148041
Scopus ID: 2-s2.0-85166261907
Type: Journal Article (published version)
File format: application/pdf
Deposited: 2023-11-07; record last updated 2023-11-10
Funding: This work was supported by the AI industrial technology innovation platform of Sichuan Province, grant number 2020ZHCG0002.
Rights: © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
topic |
Engineering::Electrical and electronic engineering Knowledge Distillation Reinforcement Learning |
spellingShingle |
Engineering::Electrical and electronic engineering Knowledge Distillation Reinforcement Learning Li, Zhenping Cao, Zhen Li, Pengfei Zhong, Yong Li, Shaobo Discriminator-enhanced knowledge-distillation networks |
description |
Query auto-completion (QAC) serves as a critical functionality in contemporary textual search systems by generating real-time query completion suggestions based on a user’s input prefix. Despite the prevalent use of language models (LMs) in QAC candidate generation, LM-based approaches frequently suffer from overcorrection issues during pair-wise loss training and efficiency deficiencies. To address these challenges, this paper presents a novel framework—discriminator-enhanced knowledge distillation (Dis-KD)—for the QAC task. This framework combines three core components: a large-scale pre-trained teacher model, a lightweight student model, and a discriminator for adversarial learning. Specifically, the discriminator aids in discerning generative-level differences between the teacher and the student models. An additional discriminator score loss is amalgamated with the traditional knowledge-distillation loss, resulting in enhanced performance of the student model. Contrary to the stepwise evaluation of each generated word, our approach assesses the entire generation sequence. This method alleviates the prevalent overcorrection issue in the generation process. Consequently, our proposed framework boasts improvements in model accuracy and a reduction in parameter size. Empirical results highlight the superiority of Dis-KD over established baseline methods, with the student model surpassing the teacher model in QAC tasks for sub-word languages. |
author2 |
School of Electrical and Electronic Engineering |
author_facet |
School of Electrical and Electronic Engineering Li, Zhenping Cao, Zhen Li, Pengfei Zhong, Yong Li, Shaobo |
format |
Article |
author |
Li, Zhenping Cao, Zhen Li, Pengfei Zhong, Yong Li, Shaobo |
author_sort |
Li, Zhenping |
title |
Discriminator-enhanced knowledge-distillation networks |
title_short |
Discriminator-enhanced knowledge-distillation networks |
title_full |
Discriminator-enhanced knowledge-distillation networks |
title_fullStr |
Discriminator-enhanced knowledge-distillation networks |
title_full_unstemmed |
Discriminator-enhanced knowledge-distillation networks |
title_sort |
discriminator-enhanced knowledge-distillation networks |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/171765 |