Clustering techniques for web mining
Format: | Final Year Project |
---|---|
Language: | English |
Published: | 2009 |
Online Access: | http://hdl.handle.net/10356/18909 |
Institution: | Nanyang Technological University |
Summary: | To keep pace with the rapid growth of the World Wide Web, many Web document clustering techniques have been designed. These techniques can help users find relevant information on the Internet more accurately and efficiently. In this dissertation, a Web document clustering approach based on phrase-based document indexing has been implemented, built on three concepts. The first is a new document representation called the Document Index Graph (DIG). The second is a new similarity measure between documents, based on matching phrases and their weights. The third is an incremental document clustering method. The objective of this dissertation is to design and implement a clustering system based on these concepts. The implementation details, experimental results and performance evaluation are reported. |
---|
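The summary names three ideas without giving their details: a phrase-based document representation, a similarity measure over matching phrases and their weights, and incremental clustering. The sketch below is only a loose illustration of how such pieces might fit together. It is not the dissertation's method and does not build a Document Index Graph. As a stand-in, it represents each document as a set of contiguous word sequences, weights shared phrases by their length, and assigns each incoming document to the most similar existing cluster. The function names, the length-based weighting, the `max_len` limit and the `threshold` value are all assumptions made for this example.

```python
from typing import List, Set, Tuple

Phrase = Tuple[str, ...]


def phrases(text: str, max_len: int = 3) -> Set[Phrase]:
    """Return all contiguous word sequences of length 2..max_len.

    Assumption: a flat set of n-grams stands in for a real phrase index.
    """
    words = text.lower().split()
    return {
        tuple(words[i:i + n])
        for n in range(2, max_len + 1)
        for i in range(len(words) - n + 1)
    }


def phrase_similarity(a: Set[Phrase], b: Set[Phrase]) -> float:
    """Length-weighted overlap of two phrase sets, in the range [0, 1].

    Assumption: a phrase's weight is simply its word count, so longer
    shared phrases contribute more than shorter ones.
    """
    if not a or not b:
        return 0.0
    shared = sum(len(p) for p in a & b)
    total = sum(len(p) for p in a | b)
    return shared / total


def incremental_cluster(docs: List[str], threshold: float = 0.2) -> List[List[int]]:
    """Process documents one at a time and return clusters of document indices.

    Each new document joins the cluster with the highest average similarity
    to its members, provided that average reaches `threshold` (a hypothetical
    value); otherwise the document starts a new cluster.
    """
    clusters: List[List[int]] = []
    doc_phrases: List[Set[Phrase]] = []
    for idx, text in enumerate(docs):
        current = phrases(text)
        doc_phrases.append(current)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = sum(phrase_similarity(current, doc_phrases[j]) for j in cluster) / len(cluster)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best.append(idx)
        else:
            clusters.append([idx])
    return clusters


if __name__ == "__main__":
    sample = [
        "web document clustering with phrase indexing",
        "phrase indexing for web document clustering",
        "river rafting trips in canada",
    ]
    print(incremental_cluster(sample))
```

With the three sample strings, the first two documents share several phrases and are grouped together, while the third shares none and forms its own cluster. Because documents are processed in arrival order and never reassigned, the result can depend on that order, which is a general property of this kind of single-pass scheme.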