Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach

Advances in technology and the exponential growth of digital sources and textual data have made it increasingly difficult to extract high-quality keyphrases and to summarize content to a high standard. Extracting high-quality keyphrases and summin...

Bibliographic Details
Main Authors: Miah, Mohammad Badrul Alam; Suryanti, Awang; Rahman, Md Mustafizur; A. S. M., Sanwar Hosen; Ra, In-Ho
Format: Article
Language: English
Published: IEEE 2022
Subjects: QA76 Computer software; TA Engineering (General). Civil engineering (General)
Online Access: http://umpir.ump.edu.my/id/eprint/35106/1/Keyphrases%20Frequency%20Analysis.pdf
http://umpir.ump.edu.my/id/eprint/35106/
https://doi.org/10.1109/ACCESS.2022.3198959
Institution: Universiti Malaysia Pahang Al-Sultan Abdullah
id my.ump.umpir.35106
record_format eprints
spelling my.ump.umpir.35106 2022-09-06T08:27:37Z http://umpir.ump.edu.my/id/eprint/35106/ Miah, Mohammad Badrul Alam and Suryanti, Awang and Rahman, Md Mustafizur and A. S. M., Sanwar Hosen and Ra, In-Ho (2022) Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach. IEEE Access. pp. 1-12. ISSN 2169-3536. (Published) Article PeerReviewed pdf en cc_by_4 https://doi.org/10.1109/ACCESS.2022.3198959
institution Universiti Malaysia Pahang Al-Sultan Abdullah
building UMPSA Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaysia Pahang Al-Sultan Abdullah
content_source UMPSA Institutional Repository
url_provider http://umpir.ump.edu.my/
language English
topic QA76 Computer software
TA Engineering (General). Civil engineering (General)
description Advances in technology and the exponential growth of digital sources and textual data have made it increasingly difficult to extract high-quality keyphrases and to summarize content to a high standard. Extracting high-quality keyphrases and summarizing texts at a high level calls for keyphrase frequency as a feature for keyword extraction, a practice that is becoming more popular. This article proposes a novel unsupervised keyphrase frequency analysis (KFA) technique for extracting keyphrase features. The technique is corpus-independent, domain-independent, language-agnostic, and independent of document length, and it can be used by both supervised and unsupervised algorithms. It has five essential phases: data acquisition, data pre-processing, statistical methodologies, curve plotting analysis, and curve fitting. First, five different datasets are collected from various sources and fed into the pre-processing phase, which applies text pre-processing techniques. The pre-processed data is then passed to the region-based statistical process, followed by the curve plotting phase and, finally, the curve fitting phase. The technique is tested and assessed on five standard datasets and compared with our recommended systems to demonstrate its efficacy, benefits, and significance. The experimental findings indicate that the technique effectively analyses keyphrase frequency in articles: 70.63% of the total present keyphrase frequency falls in the 1st region and 10.74% in the 2nd region.
format Article
author Miah, Mohammad Badrul Alam
Suryanti, Awang
Rahman, Md Mustafizur
A. S. M., Sanwar Hosen
Ra, In-Ho
author_sort Miah, Mohammad Badrul Alam
title Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach
title_sort keyphrases frequency analysis from research articles: a region-based unsupervised novel approach
publisher IEEE
publishDate 2022
url http://umpir.ump.edu.my/id/eprint/35106/1/Keyphrases%20Frequency%20Analysis.pdf
http://umpir.ump.edu.my/id/eprint/35106/
https://doi.org/10.1109/ACCESS.2022.3198959
_version_ 1822922867151470592
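
The region-based statistical step named in the description can be illustrated with a minimal Python sketch. This is not the authors' implementation: the record does not say how regions are defined, so the sketch assumes equal-width slices of the token sequence, a simple lowercase alphanumeric tokenizer, and exact phrase matching. The names tokenize, region_shares, and n_regions are hypothetical.

```python
import re
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def region_shares(text: str, keyphrases: List[str], n_regions: int = 5) -> List[float]:
    """Return the fraction of keyphrase occurrences that start in each region.

    Assumption: a region is an equal-width slice of the token sequence.
    Overlapping phrases are counted independently, so a single token can
    contribute to more than one keyphrase.
    """
    tokens = tokenize(text)
    counts = [0] * n_regions
    if not tokens:
        return [0.0] * n_regions
    for phrase in keyphrases:
        phrase_tokens = tokenize(phrase)
        if not phrase_tokens:
            continue
        width = len(phrase_tokens)
        for i in range(len(tokens) - width + 1):
            if tokens[i:i + width] == phrase_tokens:
                # Map the start position of the match to a region index.
                region = min(i * n_regions // len(tokens), n_regions - 1)
                counts[region] += 1
    total = sum(counts)
    return [c / total if total else 0.0 for c in counts]


if __name__ == "__main__":
    sample = ("keyphrase extraction uses keyphrase frequency "
              "while other work ignores frequency")
    gold = ["keyphrase extraction", "keyphrase frequency", "frequency"]
    print(region_shares(sample, gold, n_regions=2))
```

Each value in the returned list plays the role of a per-region share of the total present keyphrase frequency, which is the kind of figure the description reports for the 1st and 2nd regions. The curve plotting and curve fitting phases are not covered by this sketch.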