Feature selection for micro-array data classification
DNA microarray technology can identify thousands of genes at the same time, which has broad applications in the study of biological processes and in biomedical research. Knowledge of microarray data analysis is growing, and it is important and useful for phenotype classifica...
Main Author: | Yu, Yaping |
---|---|
Other Authors: | Wang Lipo |
Format: | Final Year Project |
Language: | English |
Published: | 2017 |
Subjects: | DRNTU::Engineering::Electrical and electronic engineering |
Online Access: | http://hdl.handle.net/10356/73007 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-73007 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-73007 2023-07-07T17:05:53Z Feature selection for micro-array data classification Yu, Yaping Wang Lipo School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering Bachelor of Engineering 2017-12-19T04:19:31Z 2017-12-19T04:19:31Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/73007 en Nanyang Technological University 71 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Electrical and electronic engineering |
spellingShingle |
DRNTU::Engineering::Electrical and electronic engineering Yu, Yaping Feature selection for micro-array data classification |
description |
DNA microarray technology can identify thousands of genes at the same time, which has broad applications in the study of biological processes and in biomedical research. Knowledge of microarray data analysis is growing, and it is important and useful for the phenotype classification of diseases. Classification techniques are applied to identify and interpret microarray gene expression data. From a machine learning perspective, gene selection is regarded as feature selection. Microarray classification operates on data made up of many thousands of features. A feature selection algorithm is used to select the most significant features, because a large number of features can lead to low prediction accuracy and very high computational complexity. This project explores various feature selection algorithms to determine the smallest set of genes responsible for identifying a disease. Microarray gene expression data play a very important role in disease diagnosis and prognosis and help in choosing an appropriate treatment plan for patients. Two feature selection algorithms are presented in this report: we implemented one feature selection method and compared it with another by Loris Nanni and Alessandra Lumini [12]. Using Matlab for the experiments, we aimed to find the smallest gene subsets that achieve high accuracy. Finding the smallest gene subsets is significant because it reduces the computational burden and allows an accurate diagnosis from a minimal number of genes. It can also greatly decrease the cost of cancer testing and shorten the time to treatment. In simple terms, this project is divided into two steps. First, gene importance ranking identifies informative and important genes. Second, all possible combinations of the important genes are tested with a support vector machine to measure accuracy.
All in all, our project can reduce the number of genes required, leading to a faster route to treatment with high accuracy. |
author2 |
Wang Lipo |
author_facet |
Wang Lipo Yu, Yaping |
format |
Final Year Project |
author |
Yu, Yaping |
author_sort |
Yu, Yaping |
title |
Feature selection for micro-array data classification |
title_short |
Feature selection for micro-array data classification |
title_full |
Feature selection for micro-array data classification |
title_fullStr |
Feature selection for micro-array data classification |
title_full_unstemmed |
Feature selection for micro-array data classification |
title_sort |
feature selection for micro-array data classification |
publishDate |
2017 |
url |
http://hdl.handle.net/10356/73007 |
_version_ |
1772827646328045568 |
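
The two-step procedure described in the abstract (gene importance ranking, then testing combinations of the top-ranked genes) can be illustrated with a minimal sketch. This is not the project's implementation: the record says the experiments were done in Matlab with a support vector machine and does not state the ranking criterion. The sketch below is in Python, assumes a simple signal-to-noise score for ranking, uses a nearest-centroid classifier as a stand-in for the SVM, and assumes leave-one-out cross-validation for measuring accuracy. All function names, parameters, and the toy data are hypothetical.

```python
from itertools import combinations
from statistics import mean, pstdev


def rank_genes(X, y):
    """Step 1: rank genes by a signal-to-noise score (assumed criterion)."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        spread = pstdev(a) + pstdev(b)
        if spread == 0:
            spread = 1e-12  # avoid division by zero for constant genes
        scores.append(abs(mean(a) - mean(b)) / spread)
    # Gene indices, most discriminative first.
    return sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy of a nearest-centroid classifier.

    The classifier is a stand-in for the SVM named in the abstract.
    Each class needs at least two samples so a centroid always exists.
    """
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for label in sorted(set(y)):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroids[label] = [mean(r[g] for r in rows) for g in genes]

        def dist(label):
            return sum((X[i][g] - c) ** 2
                       for g, c in zip(genes, centroids[label]))

        correct += min(centroids, key=dist) == y[i]
    return correct / len(X)


def smallest_accurate_subset(X, y, top_k=4, max_size=3):
    """Step 2: test all combinations of the top-ranked genes.

    Subsets are tried in order of increasing size, and only a strictly
    better accuracy replaces the current best, so ties favour fewer genes.
    """
    top = rank_genes(X, y)[:top_k]
    best_subset, best_acc = (), 0.0
    for size in range(1, max_size + 1):
        for subset in combinations(top, size):
            acc = loo_accuracy(X, y, subset)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
        if best_acc == 1.0:
            break  # no larger subset can improve on perfect accuracy
    return best_subset, best_acc


# Made-up toy data: 6 samples, 5 genes; only gene 2 separates the classes.
X = [
    [5.1, 0.2, 1.0, 3.3, 7.0],
    [4.9, 0.4, 1.2, 3.1, 6.8],
    [5.0, 0.3, 0.9, 3.4, 7.1],
    [5.2, 0.3, 4.0, 3.2, 6.9],
    [4.8, 0.2, 4.2, 3.3, 7.2],
    [5.0, 0.4, 3.9, 3.0, 7.0],
]
y = [0, 0, 0, 1, 1, 1]

subset, acc = smallest_accurate_subset(X, y)
print(subset, acc)  # prints: (2,) 1.0
```

The number of combinations grows quickly with `top_k` and `max_size`, which is why the exhaustive test is applied only to the genes that survive the ranking step rather than to all of the thousands of genes on the array.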