Keyword indexing using inverted file on hansard documents / Rosnawati Abdul Kudus

Information retrieval is the first step in developing retrieval systems for collections of text documents. The inverted file is the most popular and effective index structure for searching and retrieval (Zobel and Moffat, 2006). This project explores the potential and limitations of a prototype text search engine...

Bibliographic Details
Main Author: Abdul Kudus, Rosnawati
Format: Monograph
Language: English
Published: 2008
Subjects: Q Science (General)
Online Access: https://ir.uitm.edu.my/id/eprint/98228/1/98228.PDF
https://ir.uitm.edu.my/id/eprint/98228/
Institution: Universiti Teknologi Mara
Status: Unpublished; not peer reviewed
institution Universiti Teknologi Mara
building Tun Abdul Razak Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Mara
content_source UiTM Institutional Repository
url_provider http://ir.uitm.edu.my/
topic Q Science (General)
description Information retrieval is the first step in developing retrieval systems for collections of text documents. The inverted file is the most popular and effective index structure for searching and retrieval (Zobel and Moffat, 2006). This project explores the potential and limitations of a prototype text search engine that uses inverted files on Malaysia Hansard documents. The Malaysia Hansard is the official verbatim report of proceedings and debates in parliament; it is recorded in the Malay language and maintained by the House of Parliament. The documents are categorized into House of Commons and House of Lords. Currently, searching and retrieving information from Hansard documents is done manually, a process that is tedious, time-consuming and inefficient. A text search engine prototype built on an inverted file can speed up this process. The objectives of this study are to develop a text search engine prototype for Malaysia Hansard documents and to evaluate the prototype on the speech text of seven speakers. The scope of the research is limited to searching and retrieving documents with queries of up to two words in the Malay language. The methodology comprises a preliminary study of text search engine models and of similar studies; analysis of indexing techniques; definition of the data structures and algorithms for the inverted file, including hash tables, linked lists, vectors, arrays and quicksort; collection and preprocessing of Hansard documents; design and development of the prototype on the Java platform; testing to evaluate the accuracy of the prototype; and analysis of the findings. In the experiment conducted, the keyword search results returned by the prototype matched a manual check with 100 percent accuracy.
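The indexing approach summarised in the description can be illustrated with a minimal Java sketch. It is not the author's prototype: the class name `HansardIndexSketch`, its methods, and the sample Malay sentences are all hypothetical. It also swaps in a `HashMap` of `TreeSet` postings for the hash table, linked lists and quicksort listed in the abstract, and it assumes that a two-word query means "documents containing both words", which the record does not state.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch of an in-memory inverted file; not the thesis prototype. */
class HansardIndexSketch {
    // term -> sorted set of ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    /** Splits text into lowercase terms on anything that is not a letter or digit. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Adds one document to the index under the given id. */
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    /** Returns ids of documents containing every query term (assumed AND semantics). */
    List<Integer> search(String query) {
        List<String> terms = tokenize(query);
        if (terms.isEmpty() || terms.size() > 2) {
            throw new IllegalArgumentException("query must contain one or two words");
        }
        TreeSet<Integer> result = null;
        for (String term : terms) {
            TreeSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy so the index is not modified
            } else {
                result.retainAll(docs);         // intersect the two postings lists
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Made-up sample sentences, not taken from the Hansard collection.
        HansardIndexSketch index = new HansardIndexSketch();
        index.add(1, "Rang undang-undang pendidikan dibentangkan");
        index.add(2, "Perbahasan mengenai belanjawan negara");
        index.add(3, "Belanjawan pendidikan negara diluluskan");
        System.out.println(index.search("belanjawan"));        // prints [2, 3]
        System.out.println(index.search("pendidikan negara")); // prints [3]
    }
}
```

Keeping each postings list sorted makes the two-word case a simple set intersection. A prototype that stores postings in linked lists, as the abstract suggests, would instead sort them once and then merge the two lists.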