Do pre-trained models benefit knowledge graph completion? A reliable evaluation and a reasonable approach

In recent years, pre-trained language models (PLMs) have been shown to capture factual knowledge from massive texts, which has encouraged the proposal of PLM-based knowledge graph completion (KGC) models. However, these models still lag far behind the state-of-the-art (SOTA) KGC models in terms of performance. In this work, we identify two main reasons for the weak performance: (1) Inaccurate evaluation setting. The evaluation setting under the closed-world assumption (CWA) may underestimate PLM-based KGC models, since they introduce more external knowledge; (2) Inappropriate utilization of PLMs. Most PLM-based KGC models simply splice the labels of entities and relations as inputs, leading to incoherent sentences that do not take full advantage of the implicit knowledge in PLMs. To alleviate these problems, we highlight a more accurate evaluation setting under the open-world assumption (OWA), which manually checks the correctness of knowledge that is not in the KG. Moreover, motivated by prompt tuning, we propose a novel PLM-based KGC model named PKGC. The basic idea is to convert each triple and its support information into natural prompt sentences, which are then fed into PLMs for classification. Experimental results on two KGC datasets demonstrate that OWA is more reliable for evaluating KGC, especially on link prediction, and that our PKGC model is effective under both the CWA and OWA settings.
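As an illustration of the prompt-construction idea described in the abstract, the minimal Python sketch below fills a relation template with entity labels to produce a natural sentence that a PLM could classify as true or false. The template, relation name, and example triple are hypothetical placeholders for illustration only; they are not the templates or support information actually used by PKGC.

# Minimal sketch (not the authors' code): turn a KG triple into a
# natural-language prompt sentence that a PLM could score as true/false.

RELATION_TEMPLATES = {
    # Hypothetical template; PKGC's actual templates are defined in the paper.
    "place_of_birth": "[X] was born in [Y].",
}

def triple_to_prompt(head_label: str, relation: str, tail_label: str) -> str:
    """Fill the relation template with entity labels to get a coherent sentence."""
    template = RELATION_TEMPLATES[relation]
    return template.replace("[X]", head_label).replace("[Y]", tail_label)

if __name__ == "__main__":
    # Example triple (hypothetical): (Lionel Messi, place_of_birth, Rosario)
    print(triple_to_prompt("Lionel Messi", "place_of_birth", "Rosario"))
    # -> "Lionel Messi was born in Rosario."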

Bibliographic Details
Main Authors: LV, Xin, LIN, Yankai, CAO, Yixin, HOU, Lei, LI, Juanzi, LIU, Zhiyuan, LI, Peng, ZHOU, Jie
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2022
Subjects: Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/7446
https://ink.library.smu.edu.sg/context/sis_research/article/8449/viewcontent/2022.findings_acl.282.pdf
Institution: Singapore Management University
Record ID: sg-smu-ink.sis_research-8449
Date: 2022-05-01
DOI: 10.18653/v1/2022.findings-acl.282
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems (InK@SMU)