Injecting descriptive meta-information into pre-trained language models with hypernetworks
Pre-trained language models have been widely adopted as backbones in various natural language processing tasks. However, existing pre-trained language models ignore the descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted atten...
Main Authors: | DUAN, Wenying; HE, Xiaoxi; ZHOU, Zimu; RAO, Hong; THIELE, Lothar |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2021 |
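Subjects: | descriptive meta-information; hypernetworks; pre-trained language model; Programming Languages and Compilers; Software Engineering |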
Online Access: | https://ink.library.smu.edu.sg/sis_research/6237 https://ink.library.smu.edu.sg/context/sis_research/article/7240/viewcontent/Injecting_Descriptive_Meta_Information_into_Pre_Trained_Language_Models_with_Hypernetworks.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-7240 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-7240 2021-11-05T05:16:07Z Injecting descriptive meta-information into pre-trained language models with hypernetworks DUAN, Wenying; HE, Xiaoxi; ZHOU, Zimu; RAO, Hong; THIELE, Lothar Pre-trained language models have been widely adopted as backbones in various natural language processing tasks. However, existing pre-trained language models ignore the descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention to insignificant text. In this paper, we propose a hypernetwork-based architecture to model the descriptive meta-information and integrate it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction. 2021-09-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6237 info:doi/10.21437/Interspeech.2021-229 https://ink.library.smu.edu.sg/context/sis_research/article/7240/viewcontent/Injecting_Descriptive_Meta_Information_into_Pre_Trained_Language_Models_with_Hypernetworks.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University descriptive meta-information; hypernetworks; pre-trained language model; Programming Languages and Compilers; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
descriptive meta-information; hypernetworks; pre-trained language model; Programming Languages and Compilers; Software Engineering |
description |
Pre-trained language models have been widely adopted as backbones in various natural language processing tasks. However, existing pre-trained language models ignore the descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention to insignificant text. In this paper, we propose a hypernetwork-based architecture to model the descriptive meta-information and integrate it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction. |
format |
text |
author |
DUAN, Wenying; HE, Xiaoxi; ZHOU, Zimu; RAO, Hong; THIELE, Lothar |
author_sort |
DUAN, Wenying |
title |
Injecting descriptive meta-information into pre-trained language models with hypernetworks |
title_sort |
injecting descriptive meta-information into pre-trained language models with hypernetworks |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2021 |
url |
https://ink.library.smu.edu.sg/sis_research/6237 https://ink.library.smu.edu.sg/context/sis_research/article/7240/viewcontent/Injecting_Descriptive_Meta_Information_into_Pre_Trained_Language_Models_with_Hypernetworks.pdf |