Dealing with missing values in proteomics data

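Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data. These MVI methods perform their tasks based on different prior assumptions (e.g., data is normally or independently distributed) and operating principles (e.g., the algorithm is built to address random missingness only), resulting in varying levels of performance even when dealing with the same dataset. Thus, to achieve a satisfactory outcome, a suitable MVI method must be selected. To guide decision making on suitable MVI method, we provide a decision chart which facilitates strategic considerations on datasets presenting different characteristics. We also bring attention to other issues that can impact proper MVI such as the presence of confounders (e.g., batch effects) which can influence MVI performance. Thus, these too, should be considered during or before MVI.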

Bibliographic Details
Main Authors: Kong, Weijia, Hui, Harvard Wai Hann, Peng, Hui, Goh, Wilson Wen Bin
Other Authors: Lee Kong Chian School of Medicine (LKCMedicine)
Format: Article
Language: English
Published: 2023
Subjects: Science::Biological sciences; Bioinformatics; Computational Biology
Online Access: https://hdl.handle.net/10356/170551
Institution: Nanyang Technological University
id sg-ntu-dr.10356-170551
record_format dspace
spelling sg-ntu-dr.10356-170551 2023-09-19T03:31:58Z
Dealing with missing values in proteomics data
Kong, Weijia; Hui, Harvard Wai Hann; Peng, Hui; Goh, Wilson Wen Bin
Lee Kong Chian School of Medicine (LKCMedicine); School of Biological Sciences; Centre for biomedical informatics
Science::Biological sciences; Bioinformatics; Computational Biology
Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data. These MVI methods perform their tasks based on different prior assumptions (e.g., data is normally or independently distributed) and operating principles (e.g., the algorithm is built to address random missingness only), resulting in varying levels of performance even when dealing with the same dataset. Thus, to achieve a satisfactory outcome, a suitable MVI method must be selected. To guide decision making on suitable MVI method, we provide a decision chart which facilitates strategic considerations on datasets presenting different characteristics. We also bring attention to other issues that can impact proper MVI such as the presence of confounders (e.g., batch effects) which can influence MVI performance. Thus, these too, should be considered during or before MVI.
Ministry of Education (MOE)
WWBG acknowledges support from a Ministry of Education (MOE), Singapore Tier 1 grant (Grant No. RG35/20).
2023-09-19T03:31:58Z 2023-09-19T03:31:58Z 2022 Journal Article
Kong, W., Hui, H. W. H., Peng, H. & Goh, W. W. B. (2022). Dealing with missing values in proteomics data. Proteomics, 22(23-24), e2200092-.
https://dx.doi.org/10.1002/pmic.202200092 1615-9853 https://hdl.handle.net/10356/170551 10.1002/pmic.202200092 36349819 2-s2.0-85142235275 23-24 22 e2200092 en RG35/20 Proteomics
© 2022 Wiley-VCH GmbH. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Biological sciences
Bioinformatics
Computational Biology
description Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data. These MVI methods perform their tasks based on different prior assumptions (e.g., data is normally or independently distributed) and operating principles (e.g., the algorithm is built to address random missingness only), resulting in varying levels of performance even when dealing with the same dataset. Thus, to achieve a satisfactory outcome, a suitable MVI method must be selected. To guide decision making on suitable MVI method, we provide a decision chart which facilitates strategic considerations on datasets presenting different characteristics. We also bring attention to other issues that can impact proper MVI such as the presence of confounders (e.g., batch effects) which can influence MVI performance. Thus, these too, should be considered during or before MVI.
author2 Lee Kong Chian School of Medicine (LKCMedicine)
author_facet Lee Kong Chian School of Medicine (LKCMedicine)
Kong, Weijia
Hui, Harvard Wai Hann
Peng, Hui
Goh, Wilson Wen Bin
format Article
author Kong, Weijia
Hui, Harvard Wai Hann
Peng, Hui
Goh, Wilson Wen Bin
author_sort Kong, Weijia
title Dealing with missing values in proteomics data
title_sort dealing with missing values in proteomics data
publishDate 2023
url https://hdl.handle.net/10356/170551
_version_ 1779156458449928192