Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data

Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features and identifying potential biomarkers is challenging due to small number of sam...

Bibliographic Details
Main Authors: Budhraja, Sugam, Doborjeh, Maryam, Singh, Balkaran, Tan, Samuel Ming Xuan, Doborjeh, Zohreh, Lai, Edmund, Merkin, Alexander, Lee, Jimmy Chee Keong, Goh, Wilson, Kasabov, Nikola
Other Authors: Lee Kong Chian School of Medicine (LKCMedicine)
Format: Article
Language: English
Published: 2024
Subjects: Medicine, Health and Life Sciences; Biomarker discovery; Proteomics
Online Access: https://hdl.handle.net/10356/174293
Institution: Nanyang Technological University
id sg-ntu-dr.10356-174293
record_format dspace
spelling sg-ntu-dr.10356-174293 2024-03-31T15:40:23Z Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data Budhraja, Sugam Doborjeh, Maryam Singh, Balkaran Tan, Samuel Ming Xuan Doborjeh, Zohreh Lai, Edmund Merkin, Alexander Lee, Jimmy Chee Keong Goh, Wilson Kasabov, Nikola Lee Kong Chian School of Medicine (LKCMedicine) School of Biological Sciences Medicine, Health and Life Sciences Biomarker discovery Proteomics Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features and identifying potential biomarkers is challenging due to small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than the ones obtained by existing methods while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics. National Medical Research Council (NMRC) National Research Foundation (NRF) Published version This research is supported by the MBIE Catalyst: Strategic New Zealand-Singapore Data Science Research Program and the National Research Foundation, Singapore, under its Industry Alignment Fund–Pre-positioning (IAF-PP) Funding Initiative.
The LYRIKS study was supported by the National Research Foundation Singapore under the National Medical Research Council Translational and Clinical Research Flagship Program (NMRC/TCR/003/2008). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of National Research Foundation, Singapore. 2024-03-25T08:15:56Z 2024-03-25T08:15:56Z 2023 Journal Article Budhraja, S., Doborjeh, M., Singh, B., Tan, S. M. X., Doborjeh, Z., Lai, E., Merkin, A., Lee, J. C. K., Goh, W. & Kasabov, N. (2023). Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data. Briefings in Bioinformatics, 24(6), bbad382-. https://dx.doi.org/10.1093/bib/bbad382 1467-5463 https://hdl.handle.net/10356/174293 10.1093/bib/bbad382 37889118 2-s2.0-85175273783 6 24 bbad382 en NMRC/TCR/003/2008 Briefings in Bioinformatics © The Author(s) 2023. Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Medicine, Health and Life Sciences
Biomarker discovery
Proteomics
spellingShingle Medicine, Health and Life Sciences
Biomarker discovery
Proteomics
Budhraja, Sugam
Doborjeh, Maryam
Singh, Balkaran
Tan, Samuel Ming Xuan
Doborjeh, Zohreh
Lai, Edmund
Merkin, Alexander
Lee, Jimmy Chee Keong
Goh, Wilson
Kasabov, Nikola
Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
description Selecting informative features, such as accurate biomarkers for disease diagnosis, prognosis and response to treatment, is an essential task in the field of bioinformatics. Medical data often contain thousands of features, and identifying potential biomarkers is challenging due to the small number of samples in the data, method dependence and non-reproducibility. This paper proposes a novel ensemble feature selection method, named Filter and Wrapper Stacking Ensemble (FWSE), to identify reproducible biomarkers from high-dimensional omics data. In FWSE, filter feature selection methods are run on numerous subsets of the data to eliminate irrelevant features, and then wrapper feature selection methods are applied to rank the top features. The method was validated on four high-dimensional medical datasets related to mental illnesses and cancer. The results indicate that the features selected by FWSE are stable and statistically more significant than the ones obtained by existing methods while also demonstrating biological relevance. Furthermore, FWSE is a generic method, applicable to various high-dimensional datasets in the fields of machine intelligence and bioinformatics.
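The two-stage idea in the description (filters run on many data subsets to discard irrelevant features, then a wrapper to rank the survivors) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say which filters, wrappers, subset sizes or thresholds FWSE uses. Everything below is an assumption made for illustration, including the t-like filter score, the nearest-centroid wrapper with greedy forward selection, the function names, the parameters `n_subsets`, `frac`, `top_k` and `min_freq`, and the synthetic data. It also assumes binary labels 0/1 with both classes present in every random subset.

```python
import random
import statistics

def t_score(xs, ys):
    """Illustrative filter: absolute class-mean difference over a pooled spread."""
    a = [x for x, label in zip(xs, ys) if label == 0]
    b = [x for x, label in zip(xs, ys) if label == 1]
    spread = (statistics.pvariance(a) / len(a) + statistics.pvariance(b) / len(b)) ** 0.5
    return abs(statistics.fmean(a) - statistics.fmean(b)) / (spread + 1e-12)

def filter_stage(X, y, n_subsets=30, frac=0.8, top_k=5, min_freq=0.6, seed=0):
    """Keep features that reach the top_k filter scores in at least
    min_freq of the random sample subsets (all values are assumed defaults)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_subsets):
        idx = rng.sample(range(n), int(frac * n))
        ys = [y[i] for i in idx]
        scores = [t_score([X[i][j] for i in idx], ys) for j in range(p)]
        top = sorted(range(p), key=scores.__getitem__, reverse=True)[:top_k]
        for j in top:
            counts[j] += 1
    return [j for j in range(p) if counts[j] / n_subsets >= min_freq]

def loo_accuracy(X, y, feats):
    """Leave-one-out accuracy of a nearest-centroid classifier on feats."""
    correct = 0
    for i in range(len(X)):
        dist = {}
        for c in (0, 1):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == c]
            centroid = [statistics.fmean([r[j] for r in rows]) for j in feats]
            dist[c] = sum((X[i][j] - m) ** 2 for j, m in zip(feats, centroid))
        correct += min(dist, key=dist.get) == y[i]
    return correct / len(X)

def wrapper_stage(X, y, candidates):
    """Illustrative wrapper: greedy forward selection, where the order
    in which features are added serves as their ranking."""
    ranked, remaining = [], list(candidates)
    while remaining:
        best = max(remaining, key=lambda j: loo_accuracy(X, y, ranked + [j]))
        ranked.append(best)
        remaining.remove(best)
    return ranked

def make_toy_data(n=40, p=50, seed=1):
    """Synthetic data: only features 0 and 1 differ between the classes."""
    rng = random.Random(seed)
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(4.0 * y[i] if j < 2 else 0.0, 1.0) for j in range(p)]
         for i in range(n)]
    return X, y

X, y = make_toy_data()
kept = filter_stage(X, y)            # stage 1: stable filter survivors
ranking = wrapper_stage(X, y, kept)  # stage 2: wrapper ranking of survivors
print("kept after filtering:", kept)
print("wrapper ranking:", ranking)
```

Counting how often a feature survives across subsets is one simple way to favour reproducible selections over ones that depend on a particular sample split; a real analysis would likely combine several filter and wrapper methods rather than the single pair assumed here.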
author2 Lee Kong Chian School of Medicine (LKCMedicine)
author_facet Lee Kong Chian School of Medicine (LKCMedicine)
Budhraja, Sugam
Doborjeh, Maryam
Singh, Balkaran
Tan, Samuel Ming Xuan
Doborjeh, Zohreh
Lai, Edmund
Merkin, Alexander
Lee, Jimmy Chee Keong
Goh, Wilson
Kasabov, Nikola
format Article
author Budhraja, Sugam
Doborjeh, Maryam
Singh, Balkaran
Tan, Samuel Ming Xuan
Doborjeh, Zohreh
Lai, Edmund
Merkin, Alexander
Lee, Jimmy Chee Keong
Goh, Wilson
Kasabov, Nikola
author_sort Budhraja, Sugam
title Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
title_short Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
title_full Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
title_fullStr Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
title_full_unstemmed Filter and wrapper stacking ensemble (FWSE): a robust approach for reliable biomarker discovery in high-dimensional omics data
title_sort filter and wrapper stacking ensemble (fwse): a robust approach for reliable biomarker discovery in high-dimensional omics data
publishDate 2024
url https://hdl.handle.net/10356/174293
_version_ 1795375072531185664