Query cost-reduction for Quranic-Arabic information retrieval using hexadecimal conversion algorithm
Main Authors: | , , , , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2017 |
Subjects: | |
Online Access: | http://repo.uum.edu.my/22804/1/ICOCI%202017%2091-98.pdf http://repo.uum.edu.my/22804/ http://icoci.cms.net.my/PROCEEDINGS/2017/Pdf_Version_Chap02e/PID79-91-98e.pdf |
Institution: | Universiti Utara Malaysia |
Summary: | The digital Quran is a natural language document that uses either an Arabic font or images of the verses. The Al-Quran contains 18994 unique words, so the image approach uses a significant amount of memory space. However, little work has been done using machine translation (MT) techniques for Quranic representation. This paper proposes Arabic information retrieval based on keyword search in a hexadecimal representation, using Al-Quran verses as the test case. All Quranic words are transliterated into machine language in binary format after diacritics and duplicates are removed. This machine language approach to representing the digital Quran reduces storage size by around 47-54% and retrieval time by up to 20%, hence reducing the query cost for Arabic information retrieval in general. |
---|
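The pipeline described in the summary (strip diacritics, drop duplicate words, convert each word to a hexadecimal key, then search by keyword) can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the record does not specify the conversion, so the sketch assumes the hexadecimal form is simply the hex string of each word's UTF-8 bytes. The function names, the diacritic range, and the two sample verse fragments are likewise assumptions chosen for illustration, and the sketch makes no attempt to reproduce the reported storage or retrieval-time figures.

```python
import re

# Assumption: "diacritics" means the Arabic harakat (U+064B-U+0652),
# the superscript alef (U+0670), and the tatweel (U+0640).
_DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")


def strip_diacritics(text):
    """Remove the diacritic code points listed above from the text."""
    return _DIACRITICS.sub("", text)


def to_hex(word):
    """Hypothetical conversion: hex string of the word's UTF-8 bytes."""
    return word.encode("utf-8").hex()


def build_index(verses):
    """Map each unique hex-encoded word to the set of verse ids containing it.

    Duplicate words collapse naturally because they share one dictionary key.
    """
    index = {}
    for verse_id, text in verses.items():
        for word in strip_diacritics(text).split():
            index.setdefault(to_hex(word), set()).add(verse_id)
    return index


def search(index, keyword):
    """Return the sorted verse ids whose text contains the keyword."""
    key = to_hex(strip_diacritics(keyword))
    return sorted(index.get(key, set()))


# Sample data: the opening words of verses 1:1 and 1:2, written with
# escapes so the diacritic code points are explicit.
SAMPLE_VERSES = {
    # "bismi allahi"
    "1:1": "\u0628\u0650\u0633\u0652\u0645\u0650 "
           "\u0627\u0644\u0644\u0651\u064E\u0647\u0650",
    # "al-hamdu lillahi"
    "1:2": "\u0627\u0644\u0652\u062D\u064E\u0645\u0652\u062F\u064F "
           "\u0644\u0650\u0644\u0651\u064E\u0647\u0650",
}

if __name__ == "__main__":
    index = build_index(SAMPLE_VERSES)
    # Query with the undiacritized keyword "bsm".
    print(search(index, "\u0628\u0633\u0645"))
```

Because the query goes through the same normalization and conversion as the indexed words, a keyword typed with or without diacritics resolves to the same hexadecimal key, and lookup is a single dictionary access.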