The discovery of novel biomarkers using multivariate modelling in major depressive disorder
Psychiatric disorders (PD) have a profound impact on individuals and society. Although a growing consensus points to a mix of genetic and environmental factors, the aetiology of PD is not fully understood. With the increase in application of data science in the field of medicine, using machi...
Main Author: | Simrita Janakiraman |
---|---|
Other Authors: | Jagath C Rajapakse |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2022 |
Subjects: | Engineering::Computer science and engineering; Science::Medicine::Biomedical engineering |
Online Access: | https://hdl.handle.net/10356/156475 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-156475 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-156475; 2022-04-17T11:53:19Z; The discovery of novel biomarkers using multivariate modelling in major depressive disorder; Simrita Janakiraman; Jagath C Rajapakse; School of Computer Science and Engineering; ASJagath@ntu.edu.sg; Engineering::Computer science and engineering; Science::Medicine::Biomedical engineering; Bachelor of Science in Data Science and Artificial Intelligence; 2022; Final Year Project (FYP); Simrita Janakiraman (2022). The discovery of novel biomarkers using multivariate modelling in major depressive disorder. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/156475; en; SCSE21-0415; application/pdf; Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering; Science::Medicine::Biomedical engineering |
description |
Psychiatric disorders (PD) have a profound impact on individuals and society. Although a growing consensus points to a mix of genetic and environmental factors, the aetiology of PD is not fully understood. With the growing application of data science in medicine, using machine learning (ML) techniques to predict diagnosis, severity and treatment response is a promising opportunity. Using machine learning techniques, we aim to narrow down the transcriptomic biomarkers that have a significant impact in identifying major depressive disorder (MDD). High-dimensional transcriptomic data with an insufficient number of data points are harder to analyse, and the cohorts are harder to classify accurately. In this project, we compare techniques and pipelines for pre-processing, filtering and classification to identify the one that best predicts the response to MDD treatment.
In this study, to better understand the molecular pathways involved in the response to the treatment drug (GPR56), we examine the gene expression data of N=203 patients at week 0 and week 8 of the treatment. The study is divided into segments, each using a different pipeline aimed at one or both of the following objectives using the gene expression data: A) to predict response to MDD treatment; B) to predict whether the patient was on placebo or the treatment drug. After completing the different data science pipelines, we conclude that combining feature selection techniques with machine-learning-based classification techniques helps us better understand the differentially expressed genes and their respective molecular pathways, in order to predict MDD. Furthermore, the topological analysis performed at the end of the project gives preliminary evidence that an appropriate clustering technique can reveal pure clusters, whose data points share the same non-gene-expression attributes (age, gender and treatment) and a similar range of gene expression values, paving the way for future work combining the fields of bioinformatics and topological analysis. |
author2 |
Jagath C Rajapakse |
format |
Final Year Project |
author |
Simrita Janakiraman |
title |
The discovery of novel biomarkers using multivariate modelling in major depressive disorder |
publisher |
Nanyang Technological University |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/156475 |