Unsupervised feature selection based on principal components analysis
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Published: | 2008 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/4238 |
Institution: | Nanyang Technological University |
Summary: | An important issue in mining data sets that are large in both dimension and size is selecting a subset of the original features. In this thesis, we describe an unsupervised feature selection algorithm suitable for such data sets. The algorithm consists of two steps: pre-selection and selection. Pre-selection is based on Procrustes analysis and preserves as much of the original data's characteristics as possible. The second step is based on a feature similarity measure, with the aim of reducing feature redundancy. |
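The two steps named in the summary can be illustrated with a minimal sketch. This record does not give the thesis's actual procedure, so everything below is an assumption made for illustration: the function names `pc_scores`, `procrustes_loss`, and `drop_redundant` are hypothetical, the Procrustes criterion compares principal component scores of the full data with those of a candidate subset, and absolute Pearson correlation with an arbitrary threshold stands in for the unspecified feature similarity measure.

```python
import numpy as np


def pc_scores(X, k):
    """Scores of the first k principal components of column-centered X."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]


def procrustes_loss(X, subset, k):
    """Procrustes sum of squares between the PC scores of all features
    and the PC scores of a candidate subset (smaller means the subset
    keeps more of the original structure). Assumed criterion, not
    necessarily the one used in the thesis."""
    Z = pc_scores(X, k)
    Zs = pc_scores(X[:, subset], k)
    # min over orthogonal Q of ||Z - Zs Q||^2
    #   = tr(Z'Z) + tr(Zs'Zs) - 2 * (sum of singular values of Z'Zs)
    sv = np.linalg.svd(Z.T @ Zs, compute_uv=False)
    return np.sum(Z**2) + np.sum(Zs**2) - 2.0 * sv.sum()


def drop_redundant(X, candidates, threshold=0.95):
    """Greedy redundancy filter: keep a feature only if its absolute
    correlation with every feature kept so far is at most `threshold`.
    Absolute correlation is an assumed stand-in for the similarity measure."""
    corr = np.abs(np.corrcoef(X[:, candidates], rowvar=False))
    kept = []
    for i in range(len(candidates)):
        if all(corr[i, j] <= threshold for j in kept):
            kept.append(i)
    return [candidates[i] for i in kept]
```

Under these assumptions, a pre-selection step would favor subsets with a low `procrustes_loss`, and `drop_redundant` would then thin the pre-selected features by removing near-duplicates.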