Integration of heterogeneous data sources for identification of disease genes using computational techniques

Genes that cause a disease are called disease-causing genes or disease genes. In wet-lab experiments, disease genes are identified by mutation analysis, which is expensive and labor-intensive. In this thesis, we propose novel computational techniques to predict disease genes. In the first...

Bibliographic Details
Main Author:	Li, Yongjin
Other Authors:	Jagdish Chandra Patra
Format:	Theses and Dissertations
Language:	English
Published:	2011
Subjects:	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences DRNTU::Science::Medicine::Computer applications
Online Access:	https://hdl.handle.net/10356/43993
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-43993
record_format	dspace
spelling	sg-ntu-dr.10356-43993 2023-03-04T00:36:42Z Integration of heterogeneous data sources for identification of disease genes using computational techniques Li, Yongjin Jagdish Chandra Patra School of Computer Engineering BioSciences Research Centre DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences DRNTU::Science::Medicine::Computer applications Genes that cause a disease are called disease-causing genes or disease genes. In wet-lab experiments, disease genes are identified by mutation analysis, which is expensive and labor-intensive. In this thesis, we propose novel computational techniques to predict disease genes. In the first part of this thesis, we propose five novel topological features obtained from the Protein-Protein Interaction (PPI) network. We applied Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) networks to predict new cancer genes using these features. We found that SVM performed slightly better than MLP. We also found that the feature named 2N-index is the most discriminative feature between cancer genes and other genes. With the availability of various data sources related to genes and disease phenotypes, accurate prediction of disease genes is possible by integrating the information available from multiple data sources. We propose several novel computational models to integrate multiple data sources for the identification of disease genes. These models prioritize a set of candidate disease genes based on their functional similarity to known disease genes. DOCTOR OF PHILOSOPHY (SCE) 2011-05-18T06:13:31Z 2011-05-18T06:13:31Z 2011 2011 Thesis Li, Y. J. (2011). Integration of heterogeneous data sources for identification of disease genes using computational techniques. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/43993 10.32657/10356/43993 en 187 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences DRNTU::Science::Medicine::Computer applications
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences DRNTU::Science::Medicine::Computer applications Li, Yongjin Integration of heterogeneous data sources for identification of disease genes using computational techniques
description	Genes that cause a disease are called disease-causing genes or disease genes. In wet-lab experiments, disease genes are identified by mutation analysis, which is expensive and labor-intensive. In this thesis, we propose novel computational techniques to predict disease genes. In the first part of this thesis, we propose five novel topological features obtained from the Protein-Protein Interaction (PPI) network. We applied Support Vector Machine (SVM) and Multi-Layer Perceptron (MLP) networks to predict new cancer genes using these features. We found that SVM performed slightly better than MLP. We also found that the feature named 2N-index is the most discriminative feature between cancer genes and other genes. With the availability of various data sources related to genes and disease phenotypes, accurate prediction of disease genes is possible by integrating the information available from multiple data sources. We propose several novel computational models to integrate multiple data sources for the identification of disease genes. These models prioritize a set of candidate disease genes based on their functional similarity to known disease genes.
author2	Jagdish Chandra Patra
author_facet	Jagdish Chandra Patra Li, Yongjin
format	Theses and Dissertations
author	Li, Yongjin
author_sort	Li, Yongjin
title	Integration of heterogeneous data sources for identification of disease genes using computational techniques
title_short	Integration of heterogeneous data sources for identification of disease genes using computational techniques
title_full	Integration of heterogeneous data sources for identification of disease genes using computational techniques
title_fullStr	Integration of heterogeneous data sources for identification of disease genes using computational techniques
title_full_unstemmed	Integration of heterogeneous data sources for identification of disease genes using computational techniques
title_sort	integration of heterogeneous data sources for identification of disease genes using computational techniques
publishDate	2011
url	https://hdl.handle.net/10356/43993
_version_	1759854871565041664
