Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging due to the small number of sam...
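Main Authors: | Budhraja, Sugam; Doborjeh, Maryam; Singh, Balkaran; Tan, Samuel Ming Xuan; Doborjeh, Zohreh; Lai, Edmund; Merkin, Alexander; Lee, Jimmy Chee Keong; Goh, Wilson; Kasabov, Nikola |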
---|---|
Other Authors: | Lee Kong Chian School of Medicine (LKCMedicine) |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Medicine, Health and Life Sciences; Biomarker discovery; Proteomics |
Online Access: | https://hdl.handle.net/10356/174293 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-174293 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-174293 2024-03-31T15:40:23Z Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data Budhraja, Sugam Doborjeh, Maryam Singh, Balkaran Tan, Samuel Ming Xuan Doborjeh, Zohreh Lai, Edmund Merkin, Alexander Lee, Jimmy Chee Keong Goh, Wilson Kasabov, Nikola Lee Kong Chian School of Medicine (LKCMedicine) School of Biological Sciences Medicine, Health and Life Sciences Biomarker discovery Proteomics Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features and identifying potential biomarkers is challenging due to small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than the ones obtained by existing methods while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics. National Medical Research Council (NMRC) National Research Foundation (NRF) Published version This research is supported by the MBIE Catalyst: Strategic New Zealand-Singapore Data Science Research Program and the National Research Foundation, Singapore, under its Industry Alignment Fund–Pre-positioning (IAF-PP) Funding Initiative.
The LYRIKS study was supported by the National Research Foundation Singapore under the National Medical Research Council Translational and Clinical Research Flagship Program (NMRC/TCR/003/2008). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore. 2024-03-25T08:15:56Z 2024-03-25T08:15:56Z 2023 Journal Article Budhraja, S., Doborjeh, M., Singh, B., Tan, S. M. X., Doborjeh, Z., Lai, E., Merkin, A., Lee, J. C. K., Goh, W. & Kasabov, N. (2023). Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data. Briefings in Bioinformatics, 24(6), bbad382-. https://dx.doi.org/10.1093/bib/bbad382 1467-5463 https://hdl.handle.net/10356/174293 10.1093/bib/bbad382 37889118 2-s2.0-85175273783 6 24 bbad382 en NMRC/TCR/003/2008 Briefings in Bioinformatics © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Medicine, Health and Life Sciences; Biomarker discovery; Proteomics |
spellingShingle |
Medicine, Health and Life Sciences; Biomarker discovery; Proteomics; Budhraja, Sugam; Doborjeh, Maryam; Singh, Balkaran; Tan, Samuel Ming Xuan; Doborjeh, Zohreh; Lai, Edmund; Merkin, Alexander; Lee, Jimmy Chee Keong; Goh, Wilson; Kasabov, Nikola; Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
description |
Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging due to the small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than those obtained by existing methods, while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics. |
author2 |
Lee Kong Chian School of Medicine (LKCMedicine) |
author_facet |
Lee Kong Chian School of Medicine (LKCMedicine); Budhraja, Sugam; Doborjeh, Maryam; Singh, Balkaran; Tan, Samuel Ming Xuan; Doborjeh, Zohreh; Lai, Edmund; Merkin, Alexander; Lee, Jimmy Chee Keong; Goh, Wilson; Kasabov, Nikola |
format |
Article |
author |
Budhraja, Sugam; Doborjeh, Maryam; Singh, Balkaran; Tan, Samuel Ming Xuan; Doborjeh, Zohreh; Lai, Edmund; Merkin, Alexander; Lee, Jimmy Chee Keong; Goh, Wilson; Kasabov, Nikola |
author_sort |
Budhraja, Sugam |
title |
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
title_short |
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
title_full |
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
title_fullStr |
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
title_full_unstemmed |
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data |
title_sort |
filter and wrapper stacking ensemble (fwse): a robust approach for reliable biomarker discovery in high-dimensional omics data |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/174293 |
_version_ |
1795375072531185664 |
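
Illustrative sketch (not part of the catalogued record). The description field above outlines a two-stage idea: filter methods are run on many subsets of the data to discard irrelevant features, and wrapper methods then rank the survivors. The Python below is a minimal, standard-library-only sketch of that general pattern and should not be read as the authors' FWSE implementation. Everything specific in it is an assumption made for illustration: the filter score (a standardized difference of class means), the wrapper (greedy forward selection around a nearest-centroid classifier scored by leave-one-out accuracy), the function names, the subset count, the vote threshold, and the synthetic data.

```python
import random
from statistics import mean, pstdev


def filter_score(X, y, j):
    """Hypothetical filter: absolute standardized difference of class means for feature j."""
    a = [row[j] for row, label in zip(X, y) if label == 0]
    b = [row[j] for row, label in zip(X, y) if label == 1]
    spread = pstdev(a) + pstdev(b) + 1e-9  # small constant avoids division by zero
    return abs(mean(a) - mean(b)) / spread


def filter_stage(X, y, n_subsets=30, frac=0.8, top_k=5, min_votes=0.6, seed=0):
    """Score features on random subsamples; keep those that land in the top_k often enough."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    votes = [0] * p
    for _ in range(n_subsets):
        idx = rng.sample(range(n), int(frac * n))
        Xs, ys = [X[i] for i in idx], [y[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # a subsample needs both classes to be scored
        ranked = sorted(range(p), key=lambda j: filter_score(Xs, ys, j), reverse=True)
        for j in ranked[:top_k]:
            votes[j] += 1
    return [j for j in range(p) if votes[j] >= min_votes * n_subsets]


def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier restricted to feats."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for c in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == c]
            centroid = [mean(r[j] for r in rows) for j in feats]
            dist[c] = sum((X[i][j] - m) ** 2 for j, m in zip(feats, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)


def wrapper_rank(X, y, candidates):
    """Greedy forward selection: features are ranked in the order they are added."""
    ranked, remaining = [], list(candidates)
    while remaining:
        best = max(remaining, key=lambda j: loo_accuracy(X, y, ranked + [j]))
        ranked.append(best)
        remaining.remove(best)
    return ranked


def make_toy_data(n=40, p=50, informative=(0, 1, 2), seed=1):
    """Synthetic two-class data: only the informative columns shift with the label."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(3.0 * label if j in informative else 0.0, 1.0) for j in range(p)]
         for label in y]
    return X, y


X, y = make_toy_data()
kept = filter_stage(X, y)          # stage 1: stability-based filtering
ranking = wrapper_rank(X, y, kept)  # stage 2: wrapper ranking of the survivors
print("kept by filter stage:", kept)
print("wrapper ranking:", ranking)
```

The vote threshold in the filter stage is what makes the first stage favour features that recur across subsamples, which is the sketch's stand-in for the reproducibility goal named in the description; a real analysis would substitute established filter and wrapper methods and a proper validation scheme.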