Semi-supervised clustering algorithms for web documents

Bibliographic Details
Main Author: Bian, Zhiwei.
Other Authors: Chen Lihui
Format: Final Year Project
Language: English
Published: 2011
Online Access: http://hdl.handle.net/10356/45760
Institution: Nanyang Technological University
Description
Summary: Data mining has been a significant tool for extracting hidden and useful information from large databases in various scientific and practical applications. One such technique is semi-supervised clustering. Semi-supervised algorithms often demonstrate surprisingly impressive performance improvements over traditional one-sided row clustering techniques by attempting to partition both the rows and the columns simultaneously. In many applications, partial supervision in the form of labels for a few rows, as well as for a few columns, may be available and can potentially improve the performance of semi-supervised clustering. Sindhwani's paper proposed two novel semi-supervised clustering algorithms, motivated respectively by spectral bipartite graph partitioning and by matrix approximation formulations for co-clustering.