Dealing with missing values in proteomics data
Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imput...
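Main Authors: | Kong, Weijia; Hui, Harvard Wai Hann; Peng, Hui; Goh, Wilson Wen Bin |
---|---|
Other Authors: | Lee Kong Chian School of Medicine (LKCMedicine) |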
Format: | Article |
Language: | English |
Published: | 2023 |
Subjects: | Science::Biological sciences Bioinformatics Computational Biology |
Online Access: | https://hdl.handle.net/10356/170551 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-170551 |
---|---|
record_format |
dspace |
spelling |
Record: sg-ntu-dr.10356-170551 (2023-09-19T03:31:58Z)
Title: Dealing with missing values in proteomics data
Authors: Kong, Weijia; Hui, Harvard Wai Hann; Peng, Hui; Goh, Wilson Wen Bin
Affiliations: Lee Kong Chian School of Medicine (LKCMedicine); School of Biological Sciences; Centre for biomedical informatics
Subjects: Science::Biological sciences Bioinformatics Computational Biology
Abstract: Proteomics data are often plagued with missingness issues. These missing values (MVs) threaten the integrity of subsequent statistical analyses by reduction of statistical power, introduction of bias, and failure to represent the true sample. Over the years, several categories of missing value imputation (MVI) methods have been developed and adapted for proteomics data. These MVI methods perform their tasks based on different prior assumptions (e.g., data is normally or independently distributed) and operating principles (e.g., the algorithm is built to address random missingness only), resulting in varying levels of performance even when dealing with the same dataset. Thus, to achieve a satisfactory outcome, a suitable MVI method must be selected. To guide decision making on suitable MVI method, we provide a decision chart which facilitates strategic considerations on datasets presenting different characteristics. We also bring attention to other issues that can impact proper MVI such as the presence of confounders (e.g., batch effects) which can influence MVI performance. Thus, these too, should be considered during or before MVI.
Funding: Ministry of Education (MOE). WWBG acknowledges support from a Ministry of Education (MOE), Singapore Tier 1 grant (Grant No. RG35/20).
Dates: 2023-09-19T03:31:58Z; 2023-09-19T03:31:58Z; 2022
Type: Journal Article
Citation: Kong, W., Hui, H. W. H., Peng, H. & Goh, W. W. B. (2022). Dealing with missing values in proteomics data. Proteomics, 22(23-24), e2200092-.
DOI: https://dx.doi.org/10.1002/pmic.202200092 (10.1002/pmic.202200092)
ISSN: 1615-9853
Handle: https://hdl.handle.net/10356/170551
PMID: 36349819
Scopus EID: 2-s2.0-85142235275
Journal: Proteomics, volume 22, issue 23-24, article e2200092
Language: en
Grant: RG35/20
Rights: © 2022 Wiley-VCH GmbH. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
topic |
Science::Biological sciences Bioinformatics Computational Biology |
author2 |
Lee Kong Chian School of Medicine (LKCMedicine) |
author_facet |
Lee Kong Chian School of Medicine (LKCMedicine) Kong, Weijia Hui, Harvard Wai Hann Peng, Hui Goh, Wilson Wen Bin |
format |
Article |
author |
Kong, Weijia Hui, Harvard Wai Hann Peng, Hui Goh, Wilson Wen Bin |
author_sort |
Kong, Weijia |
title |
Dealing with missing values in proteomics data |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/170551 |