Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach
Due to the advancement of technology and the exponential proliferation of digital sources and textual data, the extraction of high-quality keyphrases and the summarizing of content at a high standard has become increasingly difficult in current research. Extracting high-quality keyphrases and summin...
Main Authors: | Miah, Mohammad Badrul Alam; Suryanti, Awang; Rahman, Md Mustafizur; A. S. M., Sanwar Hosen; Ra, In-Ho |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022 |
Subjects: | QA76 Computer software; TA Engineering (General). Civil engineering (General) |
Online Access: | http://umpir.ump.edu.my/id/eprint/35106/1/Keyphrases%20Frequency%20Analysis.pdf http://umpir.ump.edu.my/id/eprint/35106/ https://doi.org/10.1109/ACCESS.2022.3198959 |
Institution: | Universiti Malaysia Pahang Al-Sultan Abdullah |
id |
my.ump.umpir.35106 |
---|---|
record_format |
eprints |
spelling |
my.ump.umpir.35106 2022-09-06T08:27:37Z http://umpir.ump.edu.my/id/eprint/35106/ IEEE 2022 Article PeerReviewed pdf en cc_by_4 http://umpir.ump.edu.my/id/eprint/35106/1/Keyphrases%20Frequency%20Analysis.pdf Miah, Mohammad Badrul Alam and Suryanti, Awang and Rahman, Md Mustafizur and A. S. M., Sanwar Hosen and Ra, In-Ho (2022) Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach. IEEE Access. pp. 1-12. ISSN 2169-3536. (Published) https://doi.org/10.1109/ACCESS.2022.3198959 |
institution |
Universiti Malaysia Pahang Al-Sultan Abdullah |
building |
UMPSA Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Pahang Al-Sultan Abdullah |
content_source |
UMPSA Institutional Repository |
url_provider |
http://umpir.ump.edu.my/ |
language |
English |
topic |
QA76 Computer software; TA Engineering (General). Civil engineering (General) |
description |
The growth of technology and the exponential proliferation of digital sources and textual data have made it increasingly difficult to extract high-quality keyphrases and to summarize content to a high standard. Both tasks call for keyphrase frequency as a feature for keyword extraction, an approach that is becoming more popular. This article proposes a novel unsupervised keyphrase frequency analysis (KFA) technique for keyphrase feature extraction that is corpus-independent, domain-independent, language-agnostic, and independent of document length, and that can be used by both supervised and unsupervised algorithms. The technique has five phases: data acquisition, data pre-processing, statistical methodologies, curve plotting analysis, and curve fitting. First, five datasets are collected from various sources and passed through text pre-processing. The pre-processed data then goes to the region-based statistical process, followed by curve plotting and, finally, curve fitting. The technique is tested and assessed on five standard datasets and compared with the authors' recommended systems to demonstrate its efficacy, benefits, and significance. The experimental findings indicate that the technique effectively analyses keyphrase frequency in articles: 70.63% of the total present keyphrase frequency falls in the 1st region and 10.74% in the 2nd region. |
format |
Article |
author |
Miah, Mohammad Badrul Alam; Suryanti, Awang; Rahman, Md Mustafizur; A. S. M., Sanwar Hosen; Ra, In-Ho |
author_sort |
Miah, Mohammad Badrul Alam |
title |
Keyphrases Frequency Analysis from Research Articles: A Region-Based Unsupervised Novel Approach |
publisher |
IEEE |
publishDate |
2022 |
url |
http://umpir.ump.edu.my/id/eprint/35106/1/Keyphrases%20Frequency%20Analysis.pdf http://umpir.ump.edu.my/id/eprint/35106/ https://doi.org/10.1109/ACCESS.2022.3198959 |
_version_ |
1822922867151470592 |
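
The description above reports keyphrase frequency per region (70.63% in the 1st region, 10.74% in the 2nd) but this record does not say how a region is defined. The following is only a minimal sketch of the region-based counting idea, under the assumption that a region is one of several equal-width slices of the document's token sequence. The function names `tokenize` and `region_shares`, the parameter `n_regions`, and the sample text are all hypothetical and are not taken from the article; the curve plotting and curve fitting phases are not shown.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def region_shares(text, keyphrases, n_regions=5):
    """Return the percentage of keyphrase occurrences that start in each region.

    Assumption (not from the article): a region is one of n_regions
    equal-width slices of the token sequence.
    """
    tokens = tokenize(text)
    counts = [0] * n_regions
    if not tokens:
        return [0.0] * n_regions
    for phrase in keyphrases:
        words = tokenize(phrase)
        if not words:
            continue
        span = len(words)
        for start in range(len(tokens) - span + 1):
            if tokens[start:start + span] == words:
                # Map the starting token position to a region index.
                region = start * n_regions // len(tokens)
                counts[region] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * n_regions
    return [100.0 * c / total for c in counts]


# Hypothetical sample document and keyphrases, for illustration only.
sample = (
    "Keyphrase extraction matters. Keyphrase extraction uses frequency. "
    "Many other words follow here and carry little weight overall. "
    "Later text rarely repeats the topic, though frequency appears again."
)
shares = region_shares(sample, ["keyphrase extraction", "frequency"], n_regions=2)
print(shares)
```

With two regions, three of the four keyphrase occurrences in the sample start in the first half of the text, so the shares are 75% and 25%. Multi-word phrases are matched on exact token sequences, which is a simplification; stemming or other pre-processing would change the counts.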