Integration of multiple data sources to prioritize candidate genes using discounted rating system

Identifying disease genes from a list of candidate genes is an important task in bioinformatics. The main strategy is to prioritize candidate genes based on their similarity to known disease genes. Most existing gene prioritization methods access only one genomic data source, which is noisy and in...

Bibliographic Details
Main Authors:	Li, Yongjin; Patra, Jagdish Chandra
Other Authors:	School of Computer Engineering
Format:	Article
Language:	English
Published:	2011
Subjects:	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
Online Access:	https://hdl.handle.net/10356/104604 http://hdl.handle.net/10220/7087
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-104604
record_format	dspace
spelling	sg-ntu-dr.10356-104604 2022-02-16T16:28:09Z Published version 2011-09-21T03:55:50Z 2019-12-06T21:36:06Z 2010 Journal Article Li, Y., & Patra, J. C. (2010). Integration of multiple data sources to prioritize candidate genes using discounted rating system. BMC Bioinformatics, 11(Suppl 1), S20. 1471-2105 https://hdl.handle.net/10356/104604 http://hdl.handle.net/10220/7087 10.1186/1471-2105-11-S1-S20 20122192 152418 en BMC bioinformatics © 2010 Li and Patra. 10 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computer applications::Life and medical sciences
description	Identifying disease genes from a list of candidate genes is an important task in bioinformatics. The main strategy is to prioritize candidate genes based on their similarity to known disease genes. Most existing gene prioritization methods access only one genomic data source, which is noisy and incomplete. Thus, there is a need to integrate multiple data sources containing different information. In this paper, we propose a combination strategy called the discounted rating system (DRS). We performed leave-one-out cross-validation to compare it with the N-dimensional order statistics (NDOS) used in Endeavour. Results showed that the AUC (Area Under the Curve) values achieved by DRS were comparable with those of NDOS on most of the disease families. However, DRS ran much faster than NDOS, especially as the number of data sources increased. With 100 candidate genes and 20 data sources, DRS ran more than 180 times faster than NDOS. Within the DRS framework, we assign different weights to different data sources. The weighted DRS achieved significantly higher AUC values than NDOS. The proposed DRS algorithm is a powerful and effective framework for candidate gene prioritization. If the weights of the different data sources are chosen properly, the DRS algorithm performs better.
author2	School of Computer Engineering
author_facet	School of Computer Engineering Li, Yongjin Patra, Jagdish Chandra
format	Article
author	Li, Yongjin Patra, Jagdish Chandra
author_sort	Li, Yongjin
title	Integration of multiple data sources to prioritize candidate genes using discounted rating system
publishDate	2011
url	https://hdl.handle.net/10356/104604 http://hdl.handle.net/10220/7087
_version_	1725985708507660288
