INDEXING COMPONENTS ON INTELLIGENT REPOSITORY SYSTEM (IRYS) APPLICATION SEARCH ENGINE

Bibliographic Details
Main Author: Andhika Putra, Reihan
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/73917
Institution: Institut Teknologi Bandung
Description
Summary: Document storage mechanisms matter because they affect the ease and accuracy of document retrieval. Retrieval works best when it uses the important information contained in each document. Document storage and retrieval are generally handled by search engines. There is therefore a need for Intelligent Repository System (IRyS) software that stores electronic documents and can handle and extract information from documents in specific domains. This Final Project designs and develops the indexing component, the main component of the search engine, which speeds up the search process and enables searches that use the information in documents. The indexing component consists of three stages: text acquisition, text transformation, and index creation. In the text acquisition stage, documents are converted to text, and image-based PDF documents are handled using OCR. In the text transformation stage, text is converted into other forms, namely terms/features, and document domains are classified using SVM. In the index creation stage, weighting is performed and the index data structure is built; weighting uses the BERT model, and index creation is done with the help of Elasticsearch. Evaluation of the application shows that the indexing component in the IRyS application search engine performs well and meets all application needs. Based on the analysis of the evaluation results, it is concluded that the indexing component fulfills the needs of IRyS and provides optimal results.
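
The three stages named in the summary can be illustrated with a minimal Python sketch. It is not the IRyS implementation: it uses only the standard library, replaces the BERT weighting with plain TF-IDF, replaces Elasticsearch with an in-memory dictionary, and omits OCR and SVM domain classification. All function names, the stopword list, and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "for"}


def acquire_text(raw):
    """Stage 1, text acquisition: here only whitespace is normalized.
    A real system would convert files to text and run OCR at this point."""
    return " ".join(raw.split())


def transform_text(text):
    """Stage 2, text transformation: lowercase, tokenize, drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_index(docs):
    """Stage 3, index creation: an inverted index mapping
    term -> {doc_id: weight}. TF-IDF stands in for the BERT weighting."""
    term_counts = {
        doc_id: Counter(transform_text(acquire_text(raw)))
        for doc_id, raw in docs.items()
    }
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    index = defaultdict(dict)
    for doc_id, counts in term_counts.items():
        for term, tf in counts.items():
            idf = math.log(n_docs / doc_freq[term]) + 1.0
            index[term][doc_id] = tf * idf
    return dict(index)


def search(index, query):
    """Rank documents by the summed weights of the matching query terms."""
    scores = Counter()
    for term in transform_text(query):
        for doc_id, weight in index.get(term, {}).items():
            scores[doc_id] += weight
    return [doc_id for doc_id, _ in scores.most_common()]


if __name__ == "__main__":
    docs = {
        "d1": "Indexing   speeds up search",
        "d2": "OCR handles image-based PDF documents",
        "d3": "Search engines store documents",
    }
    index = build_index(docs)
    print(search(index, "OCR documents"))
```

The inverted index is what makes lookup fast: a query touches only the posting lists of its own terms instead of scanning every document.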