Query cost-reduction for Quranic-Arabic information retrieval using hexadecimal conversion algorithm

Bibliographic Details
Main Authors: Mazlan, Ahmad Akmaluddin; Md Norwawi, Norita; Abdul Wahid, Fauziah; Ismail, Roesnita; Omoush, Ashraf Al
Format: Conference or Workshop Item
Language: English
Published: 2017
Online Access: http://repo.uum.edu.my/22804/1/ICOCI%202017%2091-98.pdf
http://repo.uum.edu.my/22804/
http://icoci.cms.net.my/PROCEEDINGS/2017/Pdf_Version_Chap02e/PID79-91-98e.pdf
Institution: Universiti Utara Malaysia
Description
Summary: The digital Quran is a natural-language document that uses either Arabic fonts or images of the verses. The Al-Quran contains 18994 unique words, so the image approach uses a significant amount of memory space. However, not much work has been done using machine translation (MT) techniques for Quranic representation. This paper proposes Arabic information retrieval based on keyword search in hexadecimal representation, using Al-Quran verses as the test case. All Quranic words are transliterated into machine language in binary format after removing diacritics and duplicates. This machine-language approach to representing the digital Quran reduces storage size by around 47-54% and retrieval time by up to 20%, and hence reduces the query cost for Arabic information retrieval in general.
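
The pipeline outlined in the summary (remove diacritics, remove duplicates, convert words to hexadecimal, search by keyword) can be illustrated with a minimal Python sketch. The record does not give the paper's actual encoding, so everything here is an assumption made for illustration: the UTF-8 hex encoding, the function names `strip_diacritics`, `to_hex`, `build_index` and `search`, and the verse identifiers in the toy sample. The sketch does not attempt to reproduce the storage or retrieval figures reported above.

```python
import unicodedata


def strip_diacritics(word: str) -> str:
    """Drop Unicode combining marks, which include the Arabic harakat."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


def to_hex(word: str) -> str:
    """Hex string of the word's UTF-8 bytes (an assumed encoding, not the paper's)."""
    return word.encode("utf-8").hex()


def build_index(verses: dict[str, str]) -> dict[str, list[str]]:
    """Map each unique hex-encoded word to the verse IDs that contain it."""
    index: dict[str, list[str]] = {}
    for verse_id, text in verses.items():
        for word in text.split():
            key = to_hex(strip_diacritics(word))
            ids = index.setdefault(key, [])  # one entry per unique word
            if verse_id not in ids:          # skip duplicate postings
                ids.append(verse_id)
    return index


def search(index: dict[str, list[str]], keyword: str) -> list[str]:
    """Normalise the keyword the same way as the index, then look it up."""
    return index.get(to_hex(strip_diacritics(keyword)), [])


# Toy sample (hypothetical verse IDs and short phrases, for illustration only).
sample_verses = {
    "v1": "بِسْمِ اللَّهِ",
    "v2": "الْحَمْدُ لِلَّهِ",
}
sample_index = build_index(sample_verses)
```

Because the keyword and the indexed words pass through the same normalisation, a query typed without diacritics, such as `search(sample_index, "بسم")`, matches the diacritised form stored in the sample text.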