Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities

Some prediction methods use probability to rank their predictions, while some other prediction methods do not rank their predictions and instead use p-values to support their predictions. This disparity renders direct cross-comparison of these two kinds of methods difficult. In par...

Bibliographic Details
Main Authors:	Goh, Wilson Wen Bin; Kong, Weijia; Wong, Limsoon
Other Authors:	School of Biological Sciences
Format:	Article
Language:	English
Published:	2023
Subjects:	Science::Biological sciences; Machine Learning; Missing Proteins
Online Access:	https://hdl.handle.net/10356/169216

id	sg-ntu-dr.10356-169216
record_format	dspace
affiliation	School of Biological Sciences; Lee Kong Chian School of Medicine (LKCMedicine); Center for Biomedical Informatics
funding	Ministry of Education (MOE). WWBG and LW gratefully acknowledge support from a Singapore Ministry of Education Academic Research Fund Tier 2 grant (Grant No. MOE2019-T2-1-042). WWBG also gratefully acknowledges support from a Singapore Ministry of Education Academic Research Fund Tier 1 grant (Grant No. RS08/21 RG35/20).
citation	Goh, W. W. B., Kong, W. & Wong, L. (2023). Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities. Journal of Bioinformatics and Computational Biology, 21(1), 2350005-. https://dx.doi.org/10.1142/S0219720023500051
issn	0219-7200
doi	10.1142/S0219720023500051
identifiers	36891972; 2-s2.0-85150720652
rights	© 2023 World Scientific Publishing Europe Ltd. All rights reserved.
last_modified	2023-07-07T08:17:53Z
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Science::Biological sciences Machine Learning Missing Proteins
description	Some prediction methods use probability to rank their predictions, while other prediction methods do not rank their predictions and instead use p-values to support them. This disparity makes direct cross-comparison of these two kinds of methods difficult. In particular, approaches such as the Bayes Factor upper Bound (BFB) for p-value conversion may not make correct assumptions for this kind of cross-comparison. Here, using a well-established case study on renal cancer proteomics and in the context of missing protein prediction, we demonstrate how to compare these two kinds of prediction methods using two different strategies. The first strategy is based on false discovery rate (FDR) estimation, which does not make the same naïve assumptions as BFB conversions. The second strategy is a powerful approach which we colloquially call "home ground testing". Both strategies perform better than BFB conversions. Thus, we recommend comparing prediction methods by standardization to a common performance benchmark such as a global FDR. Where this is not possible, we recommend reciprocal "home ground testing".
author2	School of Biological Sciences
author_facet	School of Biological Sciences Goh, Wilson Wen Bin Kong, Weijia Wong, Limsoon
format	Article
author	Goh, Wilson Wen Bin Kong, Weijia Wong, Limsoon
author_sort	Goh, Wilson Wen Bin
title	Evaluating network-based missing protein prediction using p-values, Bayes Factors, and probabilities
publishDate	2023
url	https://hdl.handle.net/10356/169216
_version_	1772827926181445632
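
Note on the description: the abstract contrasts BFB conversion of p-values with FDR-based comparison. The sketch below is only an illustration of those two ideas, not the authors' code. It assumes the commonly cited form of the Bayes Factor upper bound, BFB = 1 / (-e p ln p) for p < 1/e, and it uses the Benjamini-Hochberg procedure as a stand-in for FDR estimation, since the record does not say which FDR estimator the paper uses. All function names are hypothetical.

```python
import math


def bayes_factor_upper_bound(p):
    """Upper bound on the Bayes factor in favour of the alternative for a p-value.

    Assumed form: 1 / (-e * p * ln p) when p < 1/e, and 1 otherwise.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if p >= 1.0 / math.e:
        return 1.0
    return 1.0 / (-math.e * p * math.log(p))


def posterior_upper_bound(p, prior_odds=1.0):
    """Convert the bound into an upper bound on the posterior probability.

    prior_odds=1.0 (equal prior weight on both hypotheses) is an assumption.
    """
    odds = prior_odds * bayes_factor_upper_bound(p)
    return odds / (1.0 + odds)


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    p_vals = [0.01, 0.04, 0.03, 0.20]  # made-up values for illustration
    for p, q in zip(p_vals, benjamini_hochberg(p_vals)):
        print(p, round(posterior_upper_bound(p), 3), round(q, 4))
```

Under these assumptions, a p-value of 0.05 maps to a bound of roughly 2.46, which is a ceiling on the evidence rather than a calibrated probability. That gap is one reason a converted p-value and a method's native probability score may not be directly comparable.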
