Keyword indexing using inverted file on hansard documents / Rosnawati Abdul Kudus
Information retrieval is the first step in developing retrieval systems for text documents in collections. The inverted file is the most popular and effective structure for searching and retrieval (Zobel and Moffat, 2006). This project explores the potential and limitations of a prototype text search engine...
Main Author: | Abdul Kudus, Rosnawati |
---|---|
Format: | Monograph |
Language: | English |
Published: | 2008 |
Subjects: | Q Science (General) |
Online Access: | https://ir.uitm.edu.my/id/eprint/98228/1/98228.PDF https://ir.uitm.edu.my/id/eprint/98228/ |
Institution: | Universiti Teknologi Mara |
id |
my.uitm.ir.98228 |
---|---|
record_format |
eprints |
spelling |
my.uitm.ir.98228 2024-07-31T08:28:56Z https://ir.uitm.edu.my/id/eprint/98228/ Keyword indexing using inverted file on hansard documents / Rosnawati Abdul Kudus Abdul Kudus, Rosnawati Q Science (General)
2008 Monograph NonPeerReviewed text en https://ir.uitm.edu.my/id/eprint/98228/1/98228.PDF Keyword indexing using inverted file on hansard documents / Rosnawati Abdul Kudus. (2008) UNSPECIFIED. UNSPECIFIED. (Unpublished) |
institution |
Universiti Teknologi Mara |
building |
Tun Abdul Razak Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Teknologi Mara |
content_source |
UiTM Institutional Repository |
url_provider |
http://ir.uitm.edu.my/ |
language |
English |
topic |
Q Science (General) |
description |
Information retrieval is the first step in developing retrieval systems for text documents in collections. The inverted file is the most popular and effective structure for searching and retrieval (Zobel and Moffat, 2006). This project explores the potential and limitations of a prototype text search engine that uses inverted files on Malaysia Hansard documents. The Malaysia Hansard is the official verbatim report of proceedings and debates in Parliament; it is documented in the Malay language and maintained by the House of Parliament. These documents are categorized into House of Commons and House of Lords. Currently, searching and retrieving information from Hansard documents is done manually, a process that is tedious, time-consuming and inefficient. A text search engine prototype using an inverted file can speed up searching and retrieving information from Hansard documents. The objectives of this study are to develop a text search engine prototype for Malaysia Hansard documents and to evaluate the prototype on the speech text of seven speakers. The scope of the research is limited to searching and retrieving documents with queries of up to two words in the Malay language. The methodology comprises: a preliminary study of text search engine models and identification of similar studies; analysis of indexing techniques; definition of the data structures for the inverted file, which include a hash table, linked lists, a vector, an array and quicksort; collection and preprocessing of Hansard documents; design and development of the prototype on the Java platform; testing to evaluate the accuracy of the prototype; and analysis of the findings. In the experiment conducted, keyword searches through the prototype agreed with manual checks in every case, giving an accuracy of 100 percent. |
format |
Monograph |
author |
Abdul Kudus, Rosnawati |
title |
Keyword indexing using inverted file on hansard documents / Rosnawati Abdul Kudus |
publishDate |
2008 |
url |
https://ir.uitm.edu.my/id/eprint/98228/1/98228.PDF https://ir.uitm.edu.my/id/eprint/98228/ |
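The description above names an inverted file backed by a hash table, built in Java, with queries of up to two words. The following is a minimal sketch of that idea, not the monograph's actual implementation. The class name `InvertedIndexSketch`, its methods, the tokenization rule, and the sample sentences are all hypothetical. It also assumes that a two-word query means "documents containing both words"; the record does not say whether the prototype treats two words as a conjunction or as a phrase.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical sketch: names and behavior are illustrative assumptions,
// not taken from the monograph.
final class InvertedIndexSketch {
    // Hash table mapping each term to the sorted ids of documents containing it.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Lowercase the text and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Index one document: record its id under every term it contains.
    void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Return ids of documents containing every query term, in ascending order.
    // Queries are limited to one or two words, mirroring the stated scope.
    List<Integer> search(String query) {
        List<String> terms = tokenize(query);
        if (terms.isEmpty() || terms.size() > 2) {
            throw new IllegalArgumentException("query must contain one or two words");
        }
        TreeSet<Integer> result = null;
        for (String term : terms) {
            TreeSet<Integer> docs = postings.getOrDefault(term, new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(docs);   // copy so the index is not mutated
            } else {
                result.retainAll(docs);         // intersect with the next posting list
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        // Invented sample sentences, not text from the Hansard.
        index.add(1, "Yang Berhormat menteri kewangan");
        index.add(2, "Soalan kepada menteri pendidikan");
        index.add(3, "Belanjawan kewangan negara");
        System.out.println(index.search("menteri"));
        System.out.println(index.search("menteri kewangan"));
    }
}
```

A sorted set per term keeps each posting list ordered, so intersecting two lists for a two-word query stays simple. The record also lists linked lists, a vector, an array and quicksort among the prototype's data structures, so the original may well have stored and sorted postings differently.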