QURAN VERSE SEARCH RELATED TO INDONESIAN DOCUMENT
The Quran is the main source of knowledge for Muslims in practicing their religion. However, research on the Quran in Indonesia is still scarce, particularly research that uses its Indonesian translation. Yet Muslims in Indonesia need to find connections between daily activities and the Islamic...
Main Author: | Ghifari Haznitrama, Faiz |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/43425 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:43425 |
---|---|
spelling |
id-itb.:43425; 2019-09-27T08:49:58Z; QURAN VERSE SEARCH RELATED TO INDONESIAN DOCUMENT; Ghifari Haznitrama, Faiz; Indonesia; Final Project; Alquran, multilabel classification, information retrieval, Indonesian document; INSTITUT TEKNOLOGI BANDUNG; https://digilib.itb.ac.id/gdl/view/43425; text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
The Quran is the main source of knowledge for Muslims in practicing their religion. However, research on the Quran in Indonesia is still scarce, particularly research that uses its Indonesian translation. Yet Muslims in Indonesia need to find connections between daily activities and the Islamic values contained in the Quran. In this final project, the author designs a system that searches for Quran verses related to Indonesian documents. At least two methods can be used: the first uses information retrieval alone, and the second combines text categorization or keyword extraction with information retrieval. In this work, both methods are implemented and their performance is compared. For the second method, text categorization is performed with multilabel classification of Indonesian documents to obtain the corresponding Quran topics. Multilabel classification follows the problem transformation approach, using the SVM, Random Forest, Decision Tree, and Naïve Bayes algorithms. The feature extraction techniques are bag of words and TF-IDF. The resulting set of Quran topic labels is then used as the query to the information retrieval module, which returns Quran verses. The information retrieval technique is a vector space model with the cosine similarity function. In the experiments conducted, the first method, information retrieval alone, gives the best performance, with a micro-average precision of 23.557%. This indicates that reducing the text to a collection of keywords does not improve performance. The use of a less commonly used knowledge base also affects the performance of multilabel classification, so evaluators judged the results of the information retrieval module to be less relevant to the query. |
format |
Final Project |
author |
Ghifari Haznitrama, Faiz |
title |
QURAN VERSE SEARCH RELATED TO INDONESIAN DOCUMENT |
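
The abstract describes retrieval with a vector space model, TF-IDF weighting, and cosine similarity, evaluated by micro-average precision. The following is a minimal Python sketch of those ideas, not the author's implementation. The whitespace tokenizer, the smoothed IDF formula, the function names, and the placeholder passages (which are not Quran translations) are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical placeholder passages; these are NOT actual Quran translations.
PLACEHOLDER_PASSAGES = [
    "patience prayer help",
    "charity wealth poor",
    "knowledge learning reading",
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real Indonesian preprocessing)."""
    return text.lower().split()


def to_vector(tokens, idf):
    """Sparse TF-IDF vector over the tokens that appear in the vocabulary."""
    tf = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in tf.items()}


def build_index(docs):
    """Return (idf, vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: the number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (an assumed variant) keeps every weight positive.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    return idf, [to_vector(tokens, idf) for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, top_k=3):
    """Rank docs by cosine similarity to the query; return (index, score) pairs."""
    idf, vectors = build_index(docs)
    q = to_vector(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best score first
    return [(i, score) for score, i in scored[:top_k]]


def micro_precision(retrieved, relevant):
    """Micro-average precision: pool hits and retrieved items over all queries."""
    hits = sum(len(r & rel) for r, rel in zip(retrieved, relevant))
    total = sum(len(r) for r in retrieved)
    return hits / total if total else 0.0
```

Calling `search("giving charity to the poor", PLACEHOLDER_PASSAGES)` ranks the second placeholder passage first, because it is the only one that shares terms with the query. In the second method described in the abstract, the query string would instead be built from the topic labels predicted by the multilabel classifier.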