Robust models and novel similarity measures for high-dimensional data clustering

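This thesis presents our research on some of the fundamental issues encountered in high-dimensional data clustering. From our study of the current literature, we identify a few important problems that remain open in the field and propose solutions to them. We investigate how statistical, machine learning, and meta-heuristic techniques can be used to improve existing methods or develop novel models for unsupervised learning of high-dimensional data. Our goals are to develop efficient clustering algorithms that reflect the natural properties of high-dimensional data, are robust to outliers, and are less sensitive to initialization; algorithms that are simple, fast, and easily applicable while still producing good clustering quality. The main contributions of this thesis include a robust model-based clustering algorithm capable of handling noisy data, a novel similarity measure and the resulting algorithms for clustering text documents, and other related studies that help improve existing clustering algorithms.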

Bibliographic Details
Main Author: Nguyen, Duc Thang
Other Authors: Chan Chee Keong
Format: Theses and Dissertations
Language: English
Published: 2012
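Subjects: DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications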
Online Access: https://hdl.handle.net/10356/48657
Institution: Nanyang Technological University
id sg-ntu-dr.10356-48657
record_format dspace
spelling sg-ntu-dr.10356-48657; 2023-07-04T16:13:19Z; Robust models and novel similarity measures for high-dimensional data clustering; Nguyen, Duc Thang; Chan Chee Keong; Chen Lihui; School of Electrical and Electronic Engineering; DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications; DOCTOR OF PHILOSOPHY (EEE); 2012-05-04T08:55:10Z; 2012-05-04T08:55:10Z; 2012; 2012; Thesis; Nguyen, D. T. (2012). Robust models and novel similarity measures for high-dimensional data clustering. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/48657; 10.32657/10356/48657; en; 169 p.; application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
spellingShingle DRNTU::Engineering::Computer science and engineering::Information systems::Information systems applications
Nguyen, Duc Thang
Robust models and novel similarity measures for high-dimensional data clustering
description This thesis presents our research on some of the fundamental issues encountered in high-dimensional data clustering. From our study of the current literature, we identify a few important problems that remain open in the field and propose solutions to them. We investigate how statistical, machine learning, and meta-heuristic techniques can be used to improve existing methods or develop novel models for unsupervised learning of high-dimensional data. Our goals are to develop efficient clustering algorithms that reflect the natural properties of high-dimensional data, are robust to outliers, and are less sensitive to initialization; algorithms that are simple, fast, and easily applicable while still producing good clustering quality. The main contributions of this thesis include a robust model-based clustering algorithm capable of handling noisy data, a novel similarity measure and the resulting algorithms for clustering text documents, and other related studies that help improve existing clustering algorithms.
author2 Chan Chee Keong
author_facet Chan Chee Keong
Nguyen, Duc Thang
format Theses and Dissertations
author Nguyen, Duc Thang
author_sort Nguyen, Duc Thang
title Robust models and novel similarity measures for high-dimensional data clustering
publishDate 2012
url https://hdl.handle.net/10356/48657
_version_ 1772827355189870592