Feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder

Genomic psychiatry is a recently expanding field which holds much promise in biomarker discovery for psychiatric disorders. However, the high dimensionality of genomic data and the relatively small cohort sizes at the psychiatric outpatient clinic impose a significant challenge for clinically significant a...

Bibliographic Details
Main Author:	Zeng, Yanxi
Other Authors:	Jagath C Rajapakse
Format:	Final Year Project
Language:	English
Published:	Nanyang Technological University 2021
Subjects:	Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/149051
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-149051
record_format	dspace
spelling	sg-ntu-dr.10356-1490512021-05-25T02:23:21Z Feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder Zeng, Yanxi Jagath C Rajapakse School of Computer Science and Engineering ASJagath@ntu.edu.sg Engineering::Computer science and engineering Genomic psychiatry is a recently expanding field which holds much promise in biomarker discovery for psychiatric disorders. However, the high dimensionality of genomic data and the relatively small cohort sizes at the psychiatric outpatient clinic impose a significant challenge for clinically significant analysis of transcriptomic data. We approach this problem using state-of-the-art machine-learning methods to extract the salient features of genomic data for potential use as biomarkers. To simulate application to psychiatric outpatient clinics, we investigate the use of these methods on transcriptomic data of lithium-treated bipolar patients (n=240) and healthy controls (n=240). After a gamut of preliminary univariate feature selection methods, we apply multivariate methods, such as recursive feature elimination with various machine learning models, to the transcriptomic data with nested cross-validation to select the set of genes giving the best predictive accuracy of diagnosis. Our results indicate that the genes selected by this process achieve higher classification accuracies for the clinical outcomes and for the use of lithium treatment. Furthermore, gene set enrichment analysis and gene ontology analysis were carried out on the candidate biomarkers to investigate underlying biological and pathogenic processes. We conclude that a feature selection pipeline combining univariate filtering and machine-learning-based feature selection methods is capable of overcoming the challenges of high dimensionality in genomic data and of extracting salient features that highlight related biological pathways for downstream analysis.
Bachelor of Engineering (Computer Science) 2021-05-25T02:23:21Z 2021-05-25T02:23:21Z 2021 Final Year Project (FYP) Zeng, Y. (2021). Feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/149051 https://hdl.handle.net/10356/149051 en SCSE20-0258 application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering
spellingShingle	Engineering::Computer science and engineering Zeng, Yanxi Feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder
description	Genomic psychiatry is a recently expanding field which holds much promise in biomarker discovery for psychiatric disorders. However, the high dimensionality of genomic data and the relatively small cohort sizes at the psychiatric outpatient clinic impose a significant challenge for clinically significant analysis of transcriptomic data. We approach this problem using state-of-the-art machine-learning methods to extract the salient features of genomic data for potential use as biomarkers. To simulate application to psychiatric outpatient clinics, we investigate the use of these methods on transcriptomic data of lithium-treated bipolar patients (n=240) and healthy controls (n=240). After a gamut of preliminary univariate feature selection methods, we apply multivariate methods, such as recursive feature elimination with various machine learning models, to the transcriptomic data with nested cross-validation to select the set of genes giving the best predictive accuracy of diagnosis. Our results indicate that the genes selected by this process achieve higher classification accuracies for the clinical outcomes and for the use of lithium treatment. Furthermore, gene set enrichment analysis and gene ontology analysis were carried out on the candidate biomarkers to investigate underlying biological and pathogenic processes. We conclude that a feature selection pipeline combining univariate filtering and machine-learning-based feature selection methods is capable of overcoming the challenges of high dimensionality in genomic data and of extracting salient features that highlight related biological pathways for downstream analysis.
author2	Jagath C Rajapakse
author_facet	Jagath C Rajapakse Zeng, Yanxi
format	Final Year Project
author	Zeng, Yanxi
author_sort	Zeng, Yanxi
title	Feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder
title_sort	feature selection on transcriptome data for identification of novel biomarkers for bipolar disorder
publisher	Nanyang Technological University
publishDate	2021
url	https://hdl.handle.net/10356/149051
_version_	1701270492465332224
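
Illustrative sketch (not taken from the thesis): the description above outlines a pipeline of univariate filtering followed by recursive feature elimination, evaluated with nested cross-validation. The record does not state which library, filter statistic, models, or fold counts were used, so everything below is an assumption: scikit-learn as the toolkit, an ANOVA F-test as the univariate filter, logistic regression as the model inside RFE, 3 inner and 5 outer folds, and synthetic data standing in for the expression matrix. The function names build_search, nested_cv_scores, and select_genes are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def build_search(n_filtered=50, sizes=(5, 10, 20), seed=0):
    """Univariate filter -> RFE -> classifier; gene-set size tuned by inner CV."""
    pipe = Pipeline([
        ("scale", StandardScaler()),
        # Univariate step (assumed): keep the n_filtered genes with the best F-score.
        ("filter", SelectKBest(score_func=f_classif, k=n_filtered)),
        # Multivariate step: RFE drops 20% of the filtered genes per iteration.
        ("rfe", RFE(estimator=LogisticRegression(max_iter=1000), step=0.2)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])
    inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=seed)
    grid = {"rfe__n_features_to_select": list(sizes)}
    return GridSearchCV(pipe, grid, cv=inner, scoring="accuracy")


def nested_cv_scores(X, y, seed=0, **kwargs):
    """Outer-loop accuracy; all selection is refit inside each outer training fold."""
    outer = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed + 1)
    search = build_search(seed=seed, **kwargs)
    return cross_val_score(search, X, y, cv=outer, scoring="accuracy")


def select_genes(X, y, seed=0, **kwargs):
    """Fit on all samples and return column indices of the genes that survive both steps."""
    best = build_search(seed=seed, **kwargs).fit(X, y).best_estimator_
    kept_by_filter = np.flatnonzero(best.named_steps["filter"].get_support())
    return kept_by_filter[best.named_steps["rfe"].get_support()]


if __name__ == "__main__":
    # Synthetic stand-in for an expression matrix (samples x genes), not real data.
    X, y = make_classification(n_samples=120, n_features=300, n_informative=10,
                               n_redundant=0, random_state=0)
    print("outer-fold accuracy:", nested_cv_scores(X, y).round(2))
    print("selected gene indices:", select_genes(X, y))
```

Placing the filter and RFE inside the pipeline means feature selection is repeated within every training fold, so the outer-fold accuracy is not inflated by genes chosen with knowledge of the held-out samples. This matters most when genes far outnumber samples, as the description notes.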

Similar Items