KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data

Automatic key concept identification from text is a major challenge in information extraction, information retrieval, digital libraries, ontology learning, and text analysis. The main difficulty lies in the text data itself, such as noise in text, diversity, scale of data, co...

Bibliographic Details
Main Authors:	Aman, M., Abdulkadir, S.J., Aziz, I.A., Alhussian, H., Ullah, I.
Format:	Article
Published:	Springer 2021
Online Access:	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85099375213&doi=10.1007%2fs11042-020-10215-x&partnerID=40&md5=9d0ccf1ff2f914e59a57acc66aadf9d6 http://eprints.utp.edu.my/23821/
Institution:	Universiti Teknologi Petronas

id	my.utp.eprints.23821
record_format	eprints
spelling	my.utp.eprints.23821 2021-08-19T13:09:25Z KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data Aman, M. Abdulkadir, S.J. Aziz, I.A. Alhussian, H. Ullah, I. Automatic key concept identification from text is a major challenge in information extraction, information retrieval, digital libraries, ontology learning, and text analysis. The main difficulty lies in the text data itself: noise, diversity, scale, context dependency, and word sense ambiguity. To cope with this challenge, numerous supervised and unsupervised approaches have been devised. Existing topical clustering-based approaches for keyphrase extraction are domain dependent and overlook semantic similarity between candidate features while extracting topical phrases. In this paper, a semantic-based unsupervised approach (KP-Rank) is proposed for keyphrase extraction. The proposed approach exploits Latent Semantic Analysis (LSA) and clustering techniques, and introduces a novel frequency-based algorithm for candidate ranking that considers locality-based sentence, paragraph, and section frequencies. To evaluate the performance of the proposed method, three benchmark datasets from different domains (Inspec, 500N-KPCrowd, and SemEval-2010) are used. The experimental results show that, overall, KP-Rank achieved significant improvements over existing approaches on the selected performance measures. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. Springer 2021 Article NonPeerReviewed https://www.scopus.com/inward/record.uri?eid=2-s2.0-85099375213&doi=10.1007%2fs11042-020-10215-x&partnerID=40&md5=9d0ccf1ff2f914e59a57acc66aadf9d6 Aman, M. and Abdulkadir, S.J. and Aziz, I.A. and Alhussian, H. and Ullah, I. (2021) KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data. Multimedia Tools and Applications, 80 (8). pp. 12469-12506. http://eprints.utp.edu.my/23821/
institution	Universiti Teknologi Petronas
building	UTP Resource Centre
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Teknologi Petronas
content_source	UTP Institutional Repository
url_provider	http://eprints.utp.edu.my/
description	Automatic key concept identification from text is a major challenge in information extraction, information retrieval, digital libraries, ontology learning, and text analysis. The main difficulty lies in the text data itself: noise, diversity, scale, context dependency, and word sense ambiguity. To cope with this challenge, numerous supervised and unsupervised approaches have been devised. Existing topical clustering-based approaches for keyphrase extraction are domain dependent and overlook semantic similarity between candidate features while extracting topical phrases. In this paper, a semantic-based unsupervised approach (KP-Rank) is proposed for keyphrase extraction. The proposed approach exploits Latent Semantic Analysis (LSA) and clustering techniques, and introduces a novel frequency-based algorithm for candidate ranking that considers locality-based sentence, paragraph, and section frequencies. To evaluate the performance of the proposed method, three benchmark datasets from different domains (Inspec, 500N-KPCrowd, and SemEval-2010) are used. The experimental results show that, overall, KP-Rank achieved significant improvements over existing approaches on the selected performance measures. © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
format	Article
author	Aman, M. Abdulkadir, S.J. Aziz, I.A. Alhussian, H. Ullah, I.
spellingShingle	Aman, M. Abdulkadir, S.J. Aziz, I.A. Alhussian, H. Ullah, I. KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
author_facet	Aman, M. Abdulkadir, S.J. Aziz, I.A. Alhussian, H. Ullah, I.
author_sort	Aman, M.
title	KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
title_short	KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
title_full	KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
title_fullStr	KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
title_full_unstemmed	KP-Rank: a semantic-based unsupervised approach for keyphrase extraction from text data
title_sort	kp-rank: a semantic-based unsupervised approach for keyphrase extraction from text data
publisher	Springer
publishDate	2021
url	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85099375213&doi=10.1007%2fs11042-020-10215-x&partnerID=40&md5=9d0ccf1ff2f914e59a57acc66aadf9d6 http://eprints.utp.edu.my/23821/
_version_	1738656526530248704