Lucene search engine development: a beginner’s experience / Azilawati Azizan ... [et al.]

Bibliographic Details
Main Authors: Azizan, Azilawati; Mohd Sanusi, Najwa Izzah Najihah; Khairuddin, Nurkhairizan; Shafie, Ana Salwa
Format: Article
Language: English
Published: Universiti Teknologi MARA, Perak 2022
Online Access: https://ir.uitm.edu.my/id/eprint/74926/2/74926.pdf
https://ir.uitm.edu.my/id/eprint/74926/
https://mijuitm.com.my/view-articles/
Institution: Universiti Teknologi MARA
Description
Summary: Lucene provides a basic library package for building a complete text-based search engine. It can be used in various ways to benefit both researchers and users. However, for a beginner, creating a search engine with Lucene requires a thorough understanding of the procedures and library packages. Therefore, this project seeks to explore and demonstrate the development of a search engine, using the Malay Quran translation text as the test dataset. This project applied the fundamental Information Retrieval (IR) model as the main methodology for developing the search engine. The Apache Lucene framework, a full-text search engine library written in Java, was used to construct all of the search engine components, namely the indexer, searcher, query processor, and ranker. The developed search engine was then evaluated using standard IR measures, achieving 67% precision and 32% recall. This paper provides a basic approach to developing a text-based search engine that can be used for any IR testing purposes. The results of this project may also benefit the IR community in comparing retrieval performance.
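For readers unfamiliar with the two evaluation measures named in the summary, the following is a minimal Java sketch of set-based precision and recall for a single query. It is not the authors' evaluation code: the class name `PrecisionRecall`, its method names, and the verse identifiers in `main` are all hypothetical, and the sketch assumes that each retrieved document is judged simply as relevant or not relevant.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not taken from the paper: set-based precision and
// recall for one query, given binary relevance judgments.
final class PrecisionRecall {

    // Number of distinct retrieved documents that are also judged relevant.
    static int hits(List<String> retrieved, Set<String> relevant) {
        Set<String> unique = new LinkedHashSet<>(retrieved);
        unique.retainAll(relevant);
        return unique.size();
    }

    // Precision = relevant retrieved / total distinct retrieved.
    // Returns 0.0 when nothing was retrieved.
    static double precision(List<String> retrieved, Set<String> relevant) {
        int distinct = new LinkedHashSet<>(retrieved).size();
        return distinct == 0 ? 0.0 : (double) hits(retrieved, relevant) / distinct;
    }

    // Recall = relevant retrieved / total relevant in the collection.
    // Returns 0.0 when no document is judged relevant.
    static double recall(List<String> retrieved, Set<String> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) hits(retrieved, relevant) / relevant.size();
    }

    public static void main(String[] args) {
        // Made-up identifiers purely for illustration.
        List<String> retrieved = List.of("v1", "v2", "v3", "v4");
        Set<String> relevant = Set.of("v1", "v3", "v5", "v6", "v7", "v8");
        System.out.println("precision = " + precision(retrieved, relevant));
        System.out.println("recall    = " + recall(retrieved, relevant));
    }
}
```

In this toy case two of the four retrieved items are relevant, and two of the six relevant items are retrieved. A result pattern like the one reported in the summary, with precision well above recall, indicates that most returned items are relevant while many relevant items are not returned.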