INDEXING COMPONENTS ON INTELLIGENT REPOSITORY SYSTEM (IRYS) APPLICATION SEARCH ENGINE
Document storage mechanisms are very important because they affect the ease and accuracy of the document retrieval process. In document retrieval, it is best to utilize the important information contained in the document. Generally, document storage and retrieval are done by search engines. There...
Main Author: | Andhika Putra, Reihan |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/73917 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:73917 |
---|---|
spelling |
id-itb.:73917 2023-06-25T09:17:17Z INDEXING COMPONENTS ON INTELLIGENT REPOSITORY SYSTEM (IRYS) APPLICATION SEARCH ENGINE Andhika Putra, Reihan Indonesia Final Project search engine, indexing, ocr, svm, elasticsearch INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/73917 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Document storage mechanisms are very important because they affect the ease and
accuracy of document retrieval. Retrieval works best when it uses the important
information contained in the documents themselves. Document storage and retrieval
are generally handled by search engines. There is therefore a need for the
Intelligent Repository System (IRyS), software that stores electronic documents and
can handle and extract information from documents in particular domains. This
Final Project designs and develops the indexing component, the main component of
the search engine, which speeds up the search process and enables searches that
use the information in documents.
The indexing component consists of three stages: text acquisition, text
transformation, and index creation. In the text acquisition stage, documents are
converted to text, and image-based PDF documents are handled using OCR. In the
text transformation stage, the text is converted and its information extracted into
other forms, namely terms/features, and document domains are classified using
SVM. In the index creation stage, weighting is performed and the index data
structure is built. Weighting is done using the BERT model, and the index is
created with the help of Elasticsearch.
Evaluation of the application shows that the indexing component in the IRyS
application search engine performs well and meets all application needs.
Based on the analysis of the evaluation results, it is concluded that the indexing
component fulfills the needs of IRyS and provides optimal results. |
format |
Final Project |
author |
Andhika Putra, Reihan |
title |
INDEXING COMPONENTS ON INTELLIGENT REPOSITORY SYSTEM (IRYS) APPLICATION SEARCH ENGINE |
url |
https://digilib.itb.ac.id/gdl/view/73917 |
_version_ |
1822007247863021568 |
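
The three indexing stages named in the description (text acquisition, text transformation, index creation) can be illustrated with a minimal Python sketch. It is not the thesis implementation, and every name in it is hypothetical. Several techniques are swapped for simple stand-ins so that the example runs with the standard library alone: keyword overlap replaces the SVM domain classifier, relative term frequency replaces BERT-based weighting, an in-memory dictionary replaces Elasticsearch, and OCR for image-based PDFs is omitted.

```python
import re
from collections import Counter, defaultdict

# Hypothetical keyword lists per domain; a stand-in for a trained SVM classifier.
DOMAIN_KEYWORDS = {
    "health": {"patient", "clinic", "disease"},
    "computing": {"index", "search", "engine"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def acquire_text(raw):
    """Stage 1, text acquisition: decode a plain-text document.

    An image-based PDF would need an OCR step here; this sketch omits it.
    """
    return raw.decode("utf-8", errors="replace")


def transform_text(text):
    """Stage 2, text transformation: extract terms and guess a domain."""
    terms = tokenize(text)
    term_set = set(terms)
    # Choose the domain whose keywords overlap most with the document terms.
    best = max(DOMAIN_KEYWORDS, key=lambda d: len(DOMAIN_KEYWORDS[d] & term_set))
    if not DOMAIN_KEYWORDS[best] & term_set:
        best = "unknown"  # no keyword matched at all
    return terms, best


class InvertedIndex:
    """Stage 3, index creation: term -> {doc_id: weight} postings."""

    def __init__(self):
        self.postings = defaultdict(dict)
        self.domains = {}

    def add(self, doc_id, terms, domain):
        counts = Counter(terms)
        total = len(terms) or 1
        for term, n in counts.items():
            # Relative term frequency; a stand-in for BERT-based weighting.
            self.postings[term][doc_id] = n / total
        self.domains[doc_id] = domain

    def search(self, query):
        """Return doc ids ranked by the summed weights of the query terms."""
        scores = Counter()
        for term in tokenize(query):
            for doc_id, weight in self.postings.get(term, {}).items():
                scores[doc_id] += weight
        return [doc_id for doc_id, _ in scores.most_common()]


def index_document(index, doc_id, raw):
    """Run one document through all three stages."""
    text = acquire_text(raw)
    terms, domain = transform_text(text)
    index.add(doc_id, terms, domain)
```

Calling `index_document(index, doc_id, raw_bytes)` for each document and then `index.search("some query")` returns matching document ids in ranked order. Keeping the stages as separate functions means each stand-in could be replaced independently, for example by an OCR library in `acquire_text` or a search server in place of `InvertedIndex`.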