Graph embedding based feature selection

Bibliographic Details
Main Authors: Wei, Dan; Li, Shutao; Tan, Mingkui
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Online Access: https://hdl.handle.net/10356/84508
http://hdl.handle.net/10220/13651
Institution: Nanyang Technological University
Description
Summary: Real datasets in pattern recognition applications often contain many noisy and redundant features that are irrelevant to the intrinsic characteristics of the data, and these irrelevant features can seriously degrade learning performance. Feature selection, which aims to select the most informative features from the original dataset, therefore plays an important role in data mining, image recognition, and microarray data analysis. In this paper, we develop a new feature selection technique based on the graph embedding framework for manifold learning. We first show that recently developed feature scores, such as the Linear Discriminant Analysis score and the Marginal Fisher Analysis score, can be seen as direct applications of the graph-preserving criterion. We then investigate the negative influence of large noise features and propose two recursive feature elimination (RFE) methods, based on a feature-level score and a subset-level score respectively, for identifying the optimal feature subset. Experimental results on both toy and real-world datasets verify the effectiveness and efficiency of the proposed methods.
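
The idea of scoring features with a graph-preserving criterion and then eliminating them recursively can be illustrated with a small sketch. This is not the authors' algorithm, which the record does not spell out. It assumes a Marginal-Fisher-style pair of k-nearest-neighbour graphs (an intrinsic graph within classes, a penalty graph across classes) and scores each feature by the ratio of its penalty-graph to intrinsic-graph Laplacian quadratic forms. The function names (`knn_graph`, `graph_scores`, `rfe`), the parameters `k1` and `k2`, and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np

def knn_graph(X, y, k, same_class):
    """Symmetric 0/1 adjacency: each sample is linked to its k nearest
    neighbours inside its own class (same_class=True) or among the
    other classes (same_class=False). Assumed graph construction."""
    n = X.shape[0]
    dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    if same_class:
        allowed = y[:, None] == y[None, :]
    else:
        allowed = y[:, None] != y[None, :]
    np.fill_diagonal(allowed, False)          # no self-loops
    dist = np.where(allowed, dist, np.inf)    # forbid disallowed pairs
    W = np.zeros((n, n))
    for i in range(n):
        nearest = np.argsort(dist[i])[:k]
        nearest = nearest[np.isfinite(dist[i, nearest])]
        W[i, nearest] = 1.0
    return np.maximum(W, W.T)                 # symmetrise

def laplacian(W):
    """Graph Laplacian L = D - W."""
    return np.diag(W.sum(axis=1)) - W

def graph_scores(X, L_int, L_pen, eps=1e-12):
    """Per-feature score f^T L_pen f / f^T L_int f (larger is better)."""
    num = np.sum(X * (L_pen @ X), axis=0)
    den = np.sum(X * (L_int @ X), axis=0)
    return num / (den + eps)

def rfe(X, y, n_keep, k1=3, k2=3):
    """Recursive feature elimination: rebuild both graphs on the
    surviving features, drop the lowest-scoring one, and repeat."""
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        Xa = X[:, active]
        L_int = laplacian(knn_graph(Xa, y, k1, same_class=True))
        L_pen = laplacian(knn_graph(Xa, y, k2, same_class=False))
        scores = graph_scores(Xa, L_int, L_pen)
        active.pop(int(np.argmin(scores)))
    return active

# Toy data (hypothetical): two informative features plus four
# high-variance noise features.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 30)
signal = np.where(y[:, None] == 0, -3.0, 3.0) + 0.5 * rng.standard_normal((60, 2))
noise = 10.0 * rng.standard_normal((60, 4))
X = np.hstack([signal, noise])

selected = rfe(X, y, n_keep=2)
print(sorted(selected))
```

Rebuilding the graphs at every step is what makes the recursion differ from a one-shot ranking: high-variance noise features dominate the neighbour distances at first, and removing them changes which samples count as neighbours in later rounds.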