AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS
Text classification (TC) provides a better way to organize information since it allows better understanding and interpretation of the content. It deals with the assignment of labels to groups of similar textual documents. However, TC research for Asian language documents is relatively limited com...
Main Author: | ZUL INDRA |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2016 |
Subjects: | QA75 Electronic computers. Computer science |
Online Access: | http://utpedia.utp.edu.my/id/eprint/21420/1/2015%20-IT%20-%20AN%20INTEGRATED%20GENERIC%20TEXT%20CLASSIFICATION%20ALGORITHM%20FOR%20INDONESIAN%20AND%20MALAY%20NEWS%20DOCUMENT%20-%20ZUL%20INDRA%20-%20MASTER.pdf http://utpedia.utp.edu.my/id/eprint/21420/ |
Institution: | Universiti Teknologi Petronas |
id | oai:utpedia.utp.edu.my:21420 |
---|---|
record_format | eprints |
spelling | oai:utpedia.utp.edu.my:21420 2024-07-24T07:16:27Z 2016-07 Thesis NonPeerReviewed application/pdf en ZUL INDRA (2016) AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS. Masters thesis, Universiti Teknologi PETRONAS. |
institution | Universiti Teknologi Petronas |
building | UTP Resource Centre |
collection | Institutional Repository |
continent | Asia |
country | Malaysia |
content_provider | Universiti Teknologi Petronas |
content_source | UTP Electronic and Digitized Intellectual Asset |
url_provider | http://utpedia.utp.edu.my/ |
language | English |
topic | QA75 Electronic computers. Computer science |
description | Text classification (TC) provides a better way to organize information since it allows better understanding and interpretation of the content. It deals with the assignment of labels to groups of similar textual documents. However, TC research on Asian-language documents is relatively limited compared to English documents, and is scarcer still for news articles. Apart from that, TC research on classifying textual documents in languages with similar morphology, such as Indonesian and Malay, is still scarce. Hence, the aim of this study is to develop an integrated generic TC algorithm that is able to identify the language and then classify the category of the identified news documents. Furthermore, a top-ra feature selection method is utilised to improve TC performance and to overcome the challenges of classifying online news corpora: the rapid data growth of online news documents and the high computational time. |
format | Thesis |
author | ZUL INDRA |
title | AN INTEGRATED GENERIC TEXT CLASSIFICATION ALGORITHM FOR INDONESIAN AND MALAY NEWS DOCUMENTS |
publishDate | 2016 |
_version_ | 1805891031303979008 |