An efficient approach for measuring semantic similarity combining WordNet and Wikipedia

The measurement of semantic similarity between concepts is an important research topic in natural language processing. In the past, several approaches for measuring the semantic similarity between concepts have been proposed based on WordNet or Wikipedia. However, improvements in the measurement acc...

Bibliographic Details
Main Authors: Li, Fei, Liao, Lejian, Zhang, Lanfang, Zhu, Xinhua, Zhang, Bo, Wang, Zheng
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2021
Subjects: Engineering::Electrical and electronic engineering; Semantic Similarity, Edge Weight Model
Online Access: https://hdl.handle.net/10356/145817
Institution: Nanyang Technological University
id sg-ntu-dr.10356-145817
record_format dspace
spelling sg-ntu-dr.10356-145817
timestamp 2021-01-08T08:42:16Z
version Published version
type Journal Article
citation Li, F., Liao, L., Zhang, L., Zhu, X., Zhang, B., & Wang, Z. (2020). An efficient approach for measuring semantic similarity combining WordNet and Wikipedia. IEEE Access, 8, 184318-184338. doi:10.1109/ACCESS.2020.3025611
issn 2169-3536
doi 10.1109/ACCESS.2020.3025611
journal IEEE Access, volume 8, pages 184318-184338
rights © 2020 IEEE. This journal is 100% open access, which means that all content is freely available without charge to users or their institutions. All articles accepted after 12 June 2019 are published under a CC BY 4.0 license, and the author retains copyright. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, as long as proper attribution is given.
file_format application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Electrical and electronic engineering
Semantic Similarity, Edge Weight Model
description The measurement of semantic similarity between concepts is an important research topic in natural language processing. Several approaches for measuring the semantic similarity between concepts have been proposed based on WordNet or Wikipedia. However, improvements in the measurement accuracy of most methods have come with a dramatic increase in time complexity, and the existing methods do not effectively integrate WordNet and Wikipedia. In this paper, we focus on designing an efficient semantic similarity method based on WordNet and Wikipedia. First, to improve the accuracy of WordNet edge-based measures, we propose an edge weight model that combines edge and density information by adaptively assigning a weight to each edge based on the number of direct hyponyms of the subsumer. Second, to improve the computational efficiency of the existing Wikipedia link vector-based measures, we propose a new Wikipedia link feature-based semantic similarity method that converts Wikipedia links into semantic knowledge and replaces the TF-IDF statistical weight model used in the existing measures. In addition, we propose two new word disambiguation strategies to further improve the accuracy of Wikipedia link-based measures. Finally, to fully exploit the advantages of WordNet and Wikipedia, we propose two new aggregation schemas that combine WordNet “is-a” semantics with Wikipedia link semantics, replacing the current aggregation schemas that combine WordNet “is-a” semantics with category semantics in Wikipedia. The experimental results show that our aggregation models outperform state-of-the-art similarity measures in terms of accuracy, efficiency and word coverage.
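The edge weight idea in the description can be illustrated on a toy "is-a" taxonomy. The sketch below is not the authors' formula: the record only states that each edge receives a weight based on the number of direct hyponyms of the subsumer. The weighting function 1 / (1 + ln n), the reading that denser regions should give shorter edges, the toy taxonomy, and all function names are assumptions made for this example.

```python
import math

# Hypothetical toy "is-a" taxonomy (child -> parent); not taken from WordNet.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "bird": "animal",
    "oak": "plant",
}


def direct_hyponym_counts(parent):
    """Count the direct hyponyms (children) of every subsumer."""
    counts = {}
    for par in parent.values():
        counts[par] = counts.get(par, 0) + 1
    return counts


def edge_weight(n_hyponyms):
    """Assumed weighting: an edge under a subsumer with many direct
    hyponyms (a dense region) is shorter than one under a sparse subsumer."""
    return 1.0 / (1.0 + math.log(n_hyponyms))


def ancestors(node):
    """Return the path from node up to the root, node included."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def weighted_distance(a, b):
    """Sum the edge weights on the path a -> lowest common subsumer -> b."""
    counts = direct_hyponym_counts(PARENT)
    path_a, path_b = ancestors(a), ancestors(b)
    on_path_b = set(path_b)
    lcs = next(n for n in path_a if n in on_path_b)
    dist = 0.0
    for path in (path_a, path_b):
        for child in path[: path.index(lcs)]:
            dist += edge_weight(counts[PARENT[child]])
    return dist


def similarity(a, b):
    """Map the weighted distance into (0, 1]; identical concepts score 1."""
    return 1.0 / (1.0 + weighted_distance(a, b))


if __name__ == "__main__":
    print(similarity("dog", "cat"))  # siblings under the dense "animal" node
    print(similarity("dog", "oak"))  # related only through the root
```

With this weighting, two siblings under a subsumer with many direct hyponyms end up closer than two concepts joined only through the root, which is the qualitative behaviour a density-aware edge measure aims for. A real implementation would read the hierarchy from WordNet rather than a hand-written dictionary.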
author2 School of Computer Science and Engineering
format Article
author Li, Fei
Liao, Lejian
Zhang, Lanfang
Zhu, Xinhua
Zhang, Bo
Wang, Zheng
author_sort Li, Fei
title An efficient approach for measuring semantic similarity combining WordNet and Wikipedia
publishDate 2021
url https://hdl.handle.net/10356/145817