Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection

Recently, feature selection and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as gene expression microarray data. Gene expression microarray data comprises up to hundreds of thousands of features with relativel...

Bibliographic Details
Main Authors: Ang, J. C.; Mirzal, A.; Haron, H.; Hamed, H. N. A.
Format: Article
Published: Institute of Electrical and Electronics Engineers Inc. 2016
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/72142/
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84990888711&doi=10.1109%2fTCBB.2015.2478454&partnerID=40&md5=362030937aa305290de4691d6cc15903
Institution: Universiti Teknologi Malaysia
id my.utm.72142
record_format eprints
citation Ang, J. C., Mirzal, A., Haron, H. and Hamed, H. N. A. (2016) Supervised, unsupervised, and semi-supervised feature selection: a review on gene selection. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 13 (5). pp. 971-989. ISSN 1545-5963 (peer reviewed)
timestamp 2017-11-23T06:19:24Z
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
topic QA75 Electronic computers. Computer science
description Recently, feature selection and dimensionality reduction have become fundamental tools for many data mining tasks, especially for processing high-dimensional data such as gene expression microarray data. Gene expression microarray data comprise up to hundreds of thousands of features with a relatively small sample size. Because learning algorithms usually do not work well with this kind of data, the challenge of reducing the data dimensionality arises. A large number of gene selection methods are applied to select a subset of relevant features for model construction and to seek better cancer classification performance. This paper presents the basic taxonomy of feature selection and reviews the state-of-the-art gene selection methods by grouping the literature into three categories: supervised, unsupervised, and semi-supervised. The comparison of experimental results on the top 5 representative gene expression datasets indicates that the classification accuracy of unsupervised and semi-supervised feature selection is competitive with that of supervised feature selection.
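As a rough illustration of the supervised/unsupervised distinction drawn in the description, the sketch below ranks genes with two simple filter scores on synthetic data. It is not taken from the reviewed paper: the toy data generator, the function names (`variance_scores`, `mean_diff_scores`, `top_k`, `make_toy_data`), and the choice of scores are all assumptions made for illustration only.

```python
import random
import statistics


def variance_scores(X):
    """Unsupervised filter: score each gene (column) by its sample variance.

    Class labels are not used at all.
    """
    n_genes = len(X[0])
    return [statistics.variance([row[j] for row in X]) for j in range(n_genes)]


def mean_diff_scores(X, y):
    """Supervised filter: absolute difference of the two class means,
    scaled by the square root of the summed within-class variances."""
    n_genes = len(X[0])
    scores = []
    for j in range(n_genes):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        spread = (statistics.variance(a) + statistics.variance(b)) ** 0.5
        diff = abs(statistics.mean(a) - statistics.mean(b))
        scores.append(diff / (spread + 1e-12))  # small constant avoids 0/0
    return scores


def top_k(scores, k):
    """Indices of the k highest-scoring genes."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


def make_toy_data(n_samples=40, n_genes=200, informative=(0, 1, 2),
                  shift=8.0, seed=0):
    """Hypothetical stand-in for a microarray matrix: few samples, many genes.

    Only the genes listed in `informative` differ between the two classes.
    """
    rng = random.Random(seed)
    y = [i % 2 for i in range(n_samples)]
    X = [
        [rng.gauss(0.0, 1.0) + (shift if (j in informative and label == 1) else 0.0)
         for j in range(n_genes)]
        for label in y
    ]
    return X, y


X, y = make_toy_data()
supervised_genes = top_k(mean_diff_scores(X, y), 3)
unsupervised_genes = top_k(variance_scores(X), 3)
print("supervised filter picks:  ", sorted(supervised_genes))
print("unsupervised filter picks:", sorted(unsupervised_genes))
```

In this toy setup the class shift also inflates the variance of the informative genes, so both filters would be expected to rank the same three genes highest. Real expression data are noisier, and the two kinds of score can disagree there.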