Gene selection for high dimensional data using k-means clustering algorithm and statistical approach

Microarray technology can measure thousands of genes, which helps biologists study and classify cancer cells. However, this high-dimensional data contains a large number of genes to be examined relative to a small sample size. Thus, selection of relevant genes is a challenging issue in...

Bibliographic Details
Main Authors: Ahmad, Farzana Kabir, Yusof, Yuhanis, Othman, Nor Hayati
Format: Conference or Workshop Item
Language: English
Published: 2014
Subjects: QA75 Electronic computers. Computer science
Online Access: http://repo.uum.edu.my/16491/1/IEEE1.pdf
http://repo.uum.edu.my/16491/
http://doi.org/10.1109/ICCST.2014.7045188
Institution: Universiti Utara Malaysia
id my.uum.repo.16491
record_format eprints
spelling my.uum.repo.16491 2016-04-27T07:19:08Z Ahmad, Farzana Kabir and Yusof, Yuhanis and Othman, Nor Hayati (2014) Gene selection for high dimensional data using k-means clustering algorithm and statistical approach. In: International Conference on Computational Science and Technology (ICCST), 27-28 Aug. 2014, Kota Kinabalu. doi:10.1109/ICCST.2014.7045188 2014-08-27 Conference or Workshop Item PeerReviewed application/pdf en
institution Universiti Utara Malaysia
building UUM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Utara Malaysia
content_source UUM Institutional Repository
url_provider http://repo.uum.edu.my/
language English
topic QA75 Electronic computers. Computer science
description Microarray technology can measure thousands of genes, which helps biologists study and classify cancer cells. However, this high-dimensional data contains a large number of genes to be examined relative to a small sample size. Thus, selection of relevant genes is a challenging issue in microarray data analysis and has been a central research focus. This study proposes the k-means clustering algorithm to group the relevant genes. Several statistical techniques, such as the Fisher criterion, Golub signal-to-noise ratio, Mann-Whitney rank test and t-test, are used to decide whether the clusters are well separated from one another. Genes with high discriminative scores are then used to train a k-NN classifier. The experimental results show that the proposed gene selection method is able to identify differentially expressed genes with a ROC score of 0.86.
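The steps named in the description can be sketched roughly as follows. This is a minimal illustration on made-up toy data, not the authors' implementation: the expression matrix, the function names, the seeding of k-means with the first k genes, and the rule of keeping the gene cluster with the highest mean Fisher score are all assumptions introduced here. Only the Fisher criterion and the Golub signal-to-noise ratio are shown; the Mann-Whitney and t-test scores and the ROC evaluation are left out.

```python
import math
from statistics import mean, stdev, variance

def fisher_score(a, b):
    """Fisher criterion: squared mean difference over summed class variances."""
    return (mean(a) - mean(b)) ** 2 / (variance(a) + variance(b))

def golub_s2n(a, b):
    """Golub signal-to-noise ratio: mean difference over summed standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points seed the centroids (arbitrary choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Recompute each centroid as the column-wise mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [mean(col) for col in zip(*members)]
    return labels

def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training samples closest to x."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Hypothetical toy data: rows are genes, columns are samples.
expr = [
    [1.0, 1.2, 0.9, 3.0, 3.1, 2.9],  # gene 0: differs between classes
    [5.0, 5.1, 4.9, 5.0, 5.1, 4.9],  # gene 1: flat
    [0.8, 1.1, 1.0, 3.2, 2.8, 3.0],  # gene 2: differs between classes
    [5.0, 4.0, 6.0, 5.0, 6.0, 4.0],  # gene 3: noisy, uninformative
]
classes = [0, 0, 0, 1, 1, 1]  # class label of each sample (column)

def split(row):
    """Split one gene's expression values by sample class."""
    a = [v for v, c in zip(row, classes) if c == 0]
    b = [v for v, c in zip(row, classes) if c == 1]
    return a, b

# Score every gene (assumes no gene is constant within both classes).
scores = [fisher_score(*split(row)) for row in expr]
s2n = [golub_s2n(*split(row)) for row in expr]

# Group genes with similar expression profiles.
gene_cluster = kmeans(expr, k=2)

# Assumed selection rule: keep the cluster with the highest mean Fisher score.
cluster_score = {c: mean(s for s, g in zip(scores, gene_cluster) if g == c)
                 for c in set(gene_cluster)}
best = max(cluster_score, key=cluster_score.get)
selected = [i for i, g in enumerate(gene_cluster) if g == best]

# Train k-NN on the selected genes only and classify two new samples.
train_x = [[expr[g][j] for g in selected] for j in range(len(classes))]
pred_a = knn_predict(train_x, classes, [1.1, 0.9])
pred_b = knn_predict(train_x, classes, [3.0, 3.1])
print(selected, pred_a, pred_b)
```

Clustering the genes before scoring is one way to reduce redundancy among highly correlated genes; a real analysis would evaluate the classifier on held-out samples rather than on vectors chosen by hand.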
format Conference or Workshop Item
author Ahmad, Farzana Kabir
Yusof, Yuhanis
Othman, Nor Hayati
title Gene selection for high dimensional data using k-means clustering algorithm and statistical approach
publishDate 2014
url http://repo.uum.edu.my/16491/1/IEEE1.pdf
http://repo.uum.edu.my/16491/
http://doi.org/10.1109/ICCST.2014.7045188
_version_ 1644281983308660736