Semi-supervised SVM-based feature selection for cancer classification using microarray gene expression data

Bibliographic Details
Main Authors: Ang, Jun Chin, Haron, Habibollah, Abdull Hamed, Haza Nuzly
Format: Conference or Workshop Item
Published: 2015
Online Access: http://eprints.utm.my/id/eprint/59470/
http://dx.doi.org/10.1007/978-3-319-19066-2_45
Institution: Universiti Teknologi Malaysia
Description
Summary: Gene expression data suffer from high dimensionality, so feature selection is a fundamental tool in the analysis of cancer classification. Unlabelled data can be collected easily and are quite useful in improving classification accuracy, whereas label information is usually difficult to obtain because the labelling process is tedious, costly and error-prone. Previous studies of gene selection are mostly dedicated to supervised and unsupervised approaches. Support vector machine (SVM) is a common supervised technique for gene selection and cancer classification problems. Hence, this paper proposes a semi-supervised SVM-based feature selection method (S3VM-FS), which simultaneously exploits the knowledge from unlabelled and labelled data. Experimental results on lung cancer gene expression data show that S3VM-FS achieves higher accuracy yet requires shorter processing time compared with the well-known supervised method, SVM-based recursive feature elimination (SVM-RFE), and the improved method, S3VM-RFE.