Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes

Many retrieval models and techniques can be applied to retrieve theses that are most relevant to certain queries or concepts. It has been found that different retrieval methods often retrieve different sets of relevant documents. It is therefore anticipated that a particular retrieval method will us...

Bibliographic Details
Main Author: Wahlan, Mohammed Salem Farag
Format: Thesis
Language: English
Published: 2006
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf
http://eprints.utm.my/id/eprint/4067/
Institution: Universiti Teknologi Malaysia
id my.utm.4067
record_format eprints
spelling my.utm.4067 2018-01-15T04:24:11Z http://eprints.utm.my/id/eprint/4067/ Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes Wahlan, Mohammed Salem Farag QA75 Electronic computers. Computer science 2006-03 Thesis NonPeerReviewed application/pdf en http://eprints.utm.my/id/eprint/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf Wahlan, Mohammed Salem Farag (2006) Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes. Masters thesis, Universiti Teknologi Malaysia, Faculty of Computer Science and Information System.
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Many retrieval models and techniques can be applied to retrieve the theses that are most relevant to given queries or concepts. Different retrieval methods have been found to retrieve different sets of relevant documents, so a particular retrieval method can be expected to retrieve some relevant theses that other methods miss. This study therefore applies several methods to thesis retrieval, based on different thesis structures, different similarity measures and different weighting schemes. The theses used in this study were collected from the FSKSM postgraduate library. The collected theses were digitized, stop words were removed, the remaining terms were stemmed, and an index was built; the results of these operations are stored in a database. In total, 85 theses and 30 queries are used. Queries and theses were compared using five similarity measures with seven weighting schemes on different thesis structures. The results show that using the bibliography gives poorer results than using the title and abstract alone. When weighting schemes were combined, the schemes used with the Cosine and Tanimoto similarity measures performed well individually but poorly in combination, whereas the schemes used with the Forbes and Russell similarity measures performed poorly individually but well in combination. When similarity measures were combined, the best combination on the title structure was Cosine with the LTU weighting scheme together with Russell with the LOGG weighting scheme. On the abstract structure, the best combination was Cosine with the TFIDF weighting scheme together with Forbes with the ATFA weighting scheme, although it performed worse than the best title-based combination. Overall, the best thesis structure is the title, and the best similarity measure is Cosine with the LTU weighting scheme.
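The description above names the ingredients (a weighting scheme, a similarity measure, and fusion of several runs) but this record does not give their formulas. The sketch below is only an illustration under stated assumptions: it uses the common textbook tf-idf weight (tf times log(N/df)), the standard Cosine and Tanimoto definitions for weighted vectors, and CombSUM over min-max normalised scores as the fusion rule. None of these choices is confirmed as the thesis's own method, and the toy documents, the query and all function names are hypothetical.

```python
import math
from collections import Counter


def idf_table(docs):
    """Map each term to log(N / df), computed over the tokenised documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) for term, count in df.items()}


def weight(tokens, idf):
    """tf-idf weighted sparse vector; terms unseen in the collection are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(tokens).items() if t in idf}


def dot(x, y):
    return sum(w * y[t] for t, w in x.items() if t in y)


def cosine(x, y):
    nx, ny = math.sqrt(dot(x, x)), math.sqrt(dot(y, y))
    return dot(x, y) / (nx * ny) if nx and ny else 0.0


def tanimoto(x, y):
    xy = dot(x, y)
    denom = dot(x, x) + dot(y, y) - xy
    return xy / denom if denom else 0.0


def min_max(scores):
    """Rescale one run to [0, 1] so runs from different measures are comparable."""
    lo, hi = min(scores), max(scores)
    return [(s - lo) / (hi - lo) if hi > lo else 0.0 for s in scores]


def comb_sum(*runs):
    """CombSUM fusion (an assumed rule): add the normalised scores per document."""
    return [sum(vals) for vals in zip(*(min_max(r) for r in runs))]


# Hypothetical toy collection: each "thesis" is a list of already stemmed title terms.
docs = [["fusion", "retrieval", "scheme"], ["image", "retrieval"], ["neural", "network"]]
query = ["retrieval", "fusion"]

idf = idf_table(docs)
doc_vecs = [weight(d, idf) for d in docs]
q_vec = weight(query, idf)

cos_run = [cosine(q_vec, d) for d in doc_vecs]
tan_run = [tanimoto(q_vec, d) for d in doc_vecs]
fused = comb_sum(cos_run, tan_run)

# Document indices ordered from most to least relevant under the fused score.
ranking = sorted(range(len(docs)), key=lambda i: fused[i], reverse=True)
print(ranking)
```

Swapping `weight` for another weighting scheme, or passing runs from other similarity measures to `comb_sum`, gives the kind of individual-versus-combined comparison the description reports.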
format Thesis
author Wahlan, Mohammed Salem Farag
title Comparison and fusion of retrieval schemes based on different structures, similarity measures and weighting schemes
publishDate 2006
url http://eprints.utm.my/id/eprint/4067/1/MohammedSalemFaragWahlanMFSKSM2006.pdf
http://eprints.utm.my/id/eprint/4067/
_version_ 1643643958493970432