Gene selection for high dimensional data using k-means clustering algorithm and statistical approach

Bibliographic Details
Main Authors: Ahmad, Farzana Kabir; Yusof, Yuhanis; Othman, Nor Hayati
Format: Conference or Workshop Item
Language: English
Published: 2014
Online Access: http://repo.uum.edu.my/16491/1/IEEE1.pdf
http://repo.uum.edu.my/16491/
http://doi.org/10.1109/ICCST.2014.7045188
Description
Summary: Microarray technology can measure the expression of thousands of genes, which helps biologists study and classify cancer cells. However, this high-dimensional data contains a large number of genes to be examined relative to a small sample size. Thus, selecting relevant genes is a challenging issue in microarray data analysis and has been a central research focus. This study proposes the k-means clustering algorithm to group the relevant genes. Several statistical techniques, namely the Fisher criterion, Golub signal-to-noise, the Mann-Whitney rank test, and the t-test, are used to decide whether the clusters are well separated from one another. Genes with high discriminative scores are then used to train a k-NN classifier. The experimental results show that the proposed gene selection methods are able to identify differentially expressed genes with a ROC score of 0.86.
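As a rough illustration of two of the scoring criteria named in the summary, the sketch below ranks genes by the Fisher criterion and the Golub signal-to-noise ratio in their commonly used two-class forms. The record does not state how the authors apply these scores to k-means clusters, so the per-gene formulation, the function names (`fisher_score`, `golub_s2n`, `rank_genes`), and the toy expression values are all assumptions made for this example, not the paper's implementation.

```python
from statistics import mean, stdev

def fisher_score(a, b):
    """Fisher criterion (common two-class form):
    squared difference of class means over the sum of class variances."""
    return (mean(a) - mean(b)) ** 2 / (stdev(a) ** 2 + stdev(b) ** 2)

def golub_s2n(a, b):
    """Golub signal-to-noise ratio (common form):
    difference of class means over the sum of class standard deviations."""
    return (mean(a) - mean(b)) / (stdev(a) + stdev(b))

def rank_genes(expression, labels, score=fisher_score):
    """Score each gene on how well it separates class 1 from class 0,
    then return (gene, score) pairs sorted from most to least discriminative."""
    ranked = []
    for gene, values in expression.items():
        pos = [v for v, y in zip(values, labels) if y == 1]
        neg = [v for v, y in zip(values, labels) if y == 0]
        ranked.append((gene, score(pos, neg)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)

# Hypothetical toy data: one list of expression values per gene,
# with one value per sample; labels mark each sample's class.
expression = {
    "gene_A": [5.1, 4.9, 5.3, 1.0, 1.2, 0.9],  # differs strongly between classes
    "gene_B": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # nearly identical across classes
}
labels = [1, 1, 1, 0, 0, 0]

ranking = rank_genes(expression, labels)
for gene, value in ranking:
    print(gene, round(value, 3))
```

In a pipeline like the one the summary describes, only the top-ranked genes would be passed on to the k-NN classifier; passing `score=golub_s2n` to `rank_genes` swaps in the other criterion.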