An analysis of fuzzy clustering algorithms for suggestion of supervisor and examiner of thesis title
Main Author: | |
---|---|
Format: | Thesis |
Language: | English |
Published: | 2005 |
Subjects: | |
Online Access: | http://eprints.utm.my/id/eprint/2709/1/AzrinaSuhaimiMFC2005.pdf http://eprints.utm.my/id/eprint/2709/ |
Institution: | Universiti Teknologi Malaysia |
Summary: | Document clustering has been investigated for use in a number of different areas of information retrieval. This project studies the use of fuzzy clustering techniques for suggesting thesis supervisors and examiners in the School of Postgraduate Studies at the Faculty of Computer Science and Information Technology. The aim is to assist the administration in assigning supervisors and examiners to each postgraduate student's project. The preprocessing tasks applied before clustering are those commonly used in the information retrieval field: stemming, stopword removal, and indexing. Documents are represented using the Vector Space Model. The index terms are then clustered by similarity using fuzzy clustering algorithms. The selected algorithms are Fuzzy C-means and Gustafson-Kessel. The clustering results are evaluated in terms of classification accuracy in predicting the thesis supervisor(s) or examiner(s). Experiments show that Fuzzy C-means gives better results than Gustafson-Kessel. However, neither technique performs at a high level. Hence, these techniques are not suitable for suggesting supervisors and examiners. Nevertheless, achieving better performance would require a larger dataset, more thorough experiments, and a more detailed evaluation, all of which would take more time. |
---|
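
As an illustration of the clustering step named in the summary, the following is a minimal sketch of standard Fuzzy C-means on document vectors. It is not the thesis's implementation: the function name `fuzzy_c_means`, the fuzzifier `m=2.0`, the Euclidean distance, and the four toy term-weight vectors are all assumptions made for this example, and the record does not state which similarity measure or parameters the author used.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (hypothetical helper); returns (centers, memberships)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by membership ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update: proportional to distance ** (-2 / (m - 1)).
        new_u = []
        for i in range(n):
            dist = [math.dist(points[i], centers[j]) for j in range(c)]
            if any(d == 0.0 for d in dist):
                # A point sitting exactly on a center belongs fully to it.
                hits = [1.0 if d == 0.0 else 0.0 for d in dist]
                total = sum(hits)
                new_u.append([h / total for h in hits])
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dist]
                total = sum(inv)
                new_u.append([v / total for v in inv])

        # Stop once no membership changed by more than tol.
        shift = max(abs(new_u[i][j] - u[i][j])
                    for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:
            break
    return centers, u


# Toy term-weight vectors (assumed data): two documents per topic.
docs = [
    [1.0, 0.9, 0.0, 0.1],
    [0.9, 1.0, 0.1, 0.0],
    [0.0, 0.1, 1.0, 0.9],
    [0.1, 0.0, 0.9, 1.0],
]
centers, u = fuzzy_c_means(docs, c=2)

# Harden the fuzzy memberships into one label per document.
labels = [max(range(2), key=lambda j: row[j]) for row in u]
print(labels)
```

In a setting like the one the summary describes, each cluster could then be associated with the supervisors or examiners of the theses it contains. Gustafson-Kessel differs mainly in replacing the Euclidean distance with a per-cluster adaptive distance derived from a fuzzy covariance matrix, which this sketch does not attempt.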