Keyphrases Concentrated Area Identification from Academic Articles as Feature of Keyphrase Extraction: A New Unsupervised Approach

The extraction of high-quality keywords and summarising documents at a high level has become more difficult in current research due to technological advancements and the exponential expansion of textual data and digital sources. Extracting high-quality keywords and summarising the documents at a h...

Bibliographic Details
Main Authors: Miah, Mohammad Badrul Alam, Suryanti, Awang, Azad, Md Saiful, Rahman, Md Mustafizur
Format: Article
Language: English
Published: The Science and Information (SAI) Organization Limited 2022
Subjects: QA75 Electronic computers. Computer science
Online Access: http://umpir.ump.edu.my/id/eprint/33320/1/Keyphrases%20Concentrated%20Area%20Identification.pdf
http://umpir.ump.edu.my/id/eprint/33320/
https://thesai.org/Downloads/Volume13No1/Paper_92-Keyphrases_Concentrated_Area_Identification_from_Academic_Articles.pdf
Institution: Universiti Malaysia Pahang
id my.ump.umpir.33320
record_format eprints
spelling my.ump.umpir.33320 2022-09-06T07:46:59Z http://umpir.ump.edu.my/id/eprint/33320/
Article PeerReviewed pdf en cc_by_4
Miah, Mohammad Badrul Alam and Suryanti, Awang and Azad, Md Saiful and Rahman, Md Mustafizur (2022) Keyphrases Concentrated Area Identification from Academic Articles as Feature of Keyphrase Extraction: A New Unsupervised Approach. International Journal of Advanced Computer Science and Applications (IJACSA), 13 (1). pp. 788-796. ISSN 2156-5570 (Online)
institution Universiti Malaysia Pahang
building UMP Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang
content_source UMP Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA75 Electronic computers. Computer science
description The extraction of high-quality keywords and summarising documents at a high level has become more difficult in current research due to technological advancements and the exponential expansion of textual data and digital sources. Extracting high-quality keywords and summarising documents at a high level requires the use of features for keyphrase extraction, which is becoming more popular. This study proposes a new unsupervised keyphrase concentrated area (KCA) identification approach as a feature for keyphrase extraction. The approach is corpus-, domain-, and language-independent, does not depend on document length, and can be utilized by both supervised and unsupervised techniques. The proposed system has three phases: data pre-processing, data processing, and KCA identification. The system applies various text pre-processing methods to the acquired datasets, and the pre-processed data is then used in the data processing step. Statistical approaches, curve plotting, and a curve fitting technique are applied in the KCA identification step. The proposed system is then tested and evaluated using benchmark datasets collected from various sources. To demonstrate the effectiveness, merits, and significance of the proposed approach, we compared it with other proposed techniques. The experimental results on eleven (11) datasets show that the proposed approach effectively recognizes the KCA in articles and significantly enhances current keyphrase extraction methods across various text sizes, languages, and domains.
format Article
author Miah, Mohammad Badrul Alam
Suryanti, Awang
Azad, Md Saiful
Rahman, Md Mustafizur
title Keyphrases Concentrated Area Identification from Academic Articles as Feature of Keyphrase Extraction: A New Unsupervised Approach
publisher The Science and Information (SAI) Organization Limited
publishDate 2022
url http://umpir.ump.edu.my/id/eprint/33320/1/Keyphrases%20Concentrated%20Area%20Identification.pdf
http://umpir.ump.edu.my/id/eprint/33320/
https://thesai.org/Downloads/Volume13No1/Paper_92-Keyphrases_Concentrated_Area_Identification_from_Academic_Articles.pdf
_version_ 1744353870006452224