Learning feature dependencies for noise correction in biomedical prediction

The presence of noise or errors in the stated feature values of biomedical data can lead to incorrect prediction. We introduce a Bayesian Network-based Noise Correction framework named BN-NC. After data preprocessing, a Bayesian Network (BN) is learned to capture the feature dependencies. Using the...

Bibliographic Details
Main Authors: YAP, Ghim-Eng, TAN, Ah-Hwee, PANG, Hwee Hwa
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2011
Subjects: Biomedical Engineering and Bioengineering; Databases and Information Systems
Online Access: https://ink.library.smu.edu.sg/sis_research/3661
https://ink.library.smu.edu.sg/context/sis_research/article/4663/viewcontent/YapTanPangHH_2011_LearningFeatureDependNoiseCorrectBiomedical_afv.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-4663
record_format dspace
2011-04-01T07:00:00Z; text; application/pdf; https://ink.library.smu.edu.sg/sis_research/3661; info:doi/10.1137/1.9781611972818.7; https://ink.library.smu.edu.sg/context/sis_research/article/4663/viewcontent/YapTanPangHH_2011_LearningFeatureDependNoiseCorrectBiomedical_afv.pdf; http://creativecommons.org/licenses/by-nc-nd/4.0/; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Biomedical Engineering and Bioengineering; Databases and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Biomedical Engineering and Bioengineering
Databases and Information Systems
description The presence of noise or errors in the stated feature values of biomedical data can lead to incorrect predictions. We introduce a Bayesian Network-based Noise Correction framework named BN-NC. After data preprocessing, a Bayesian Network (BN) is learned to capture the feature dependencies. Using the BN to predict each feature in turn, BN-NC estimates a feature's error rate as the deviation between its predicted and stated values in the training data, and allocates the appropriate uncertainty to its subsequent findings during prediction. BN-NC automatically generates a probabilistic rule to explain the BN prediction on the class variable using the feature values in its Markov blanket, and this rule generation is reapplied as necessary to explain the noise correction on those features. Using three real-life benchmark biomedical data sets (on HIV-1 drug resistance prediction and leukemia subtype classification), we demonstrate that BN-NC (1) accurately detects the errors in biomedical feature values, (2) automatically corrects for the errors to maintain higher prediction accuracy than competing methods including Decision Trees, Naive Bayes and Support Vector Machines, and (3) generates probabilistic rules that concisely explain the prediction and noise correction decisions. In addition to achieving more robust biomedical prediction in the presence of feature noise, BN-NC highlights erroneous features and explains their corrections, providing medical researchers with high-utility insights into biomedical data that other methods do not offer.
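The error-rate step in the description can be illustrated with a minimal Python sketch. It is not the authors' implementation. As assumptions for illustration only, a single-parent majority-vote predictor stands in for inference over the learned BN, the error rate is taken as the fraction of training records whose stated value differs from the predicted one, and the uncertainty is spread evenly over the non-stated states. The feature names `mutation_a` and `mutation_b` and the toy records are invented.

```python
from collections import Counter, defaultdict


def majority_predictor(rows, target, parent):
    """Stand-in for BN inference (an assumption): predict `target` as the
    most common value observed with each value of a single `parent`."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[parent]][row[target]] += 1
    table = {p: c.most_common(1)[0][0] for p, c in counts.items()}
    return lambda row: table[row[parent]]


def estimate_error_rate(rows, target, predict):
    """Fraction of training rows whose stated value differs from the prediction."""
    mismatches = sum(1 for row in rows if predict(row) != row[target])
    return mismatches / len(rows)


def soft_evidence(stated, states, error_rate):
    """Likelihood weights for a stated finding: 1 - e on the stated value,
    with e split evenly over the remaining states (needs >= 2 states)."""
    others = len(states) - 1
    return {
        s: (1.0 - error_rate) if s == stated else error_rate / others
        for s in states
    }


# Toy training records, invented for illustration.
rows = [
    {"mutation_a": 1, "mutation_b": 1},
    {"mutation_a": 1, "mutation_b": 1},
    {"mutation_a": 1, "mutation_b": 0},  # stated value disagrees with the pattern
    {"mutation_a": 0, "mutation_b": 0},
]

predict_b = majority_predictor(rows, target="mutation_b", parent="mutation_a")
rate_b = estimate_error_rate(rows, "mutation_b", predict_b)
evidence = soft_evidence(stated=1, states=[0, 1], error_rate=rate_b)
```

In a full pipeline, weights like `evidence` would be entered into the network as uncertain findings rather than hard observations, so a feature with a high estimated error rate influences the class prediction less.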
format text
author YAP, Ghim-Eng
TAN, Ah-Hwee
PANG, Hwee Hwa
title Learning feature dependencies for noise correction in biomedical prediction
publisher Institutional Knowledge at Singapore Management University
publishDate 2011
url https://ink.library.smu.edu.sg/sis_research/3661
https://ink.library.smu.edu.sg/context/sis_research/article/4663/viewcontent/YapTanPangHH_2011_LearningFeatureDependNoiseCorrectBiomedical_afv.pdf
_version_ 1770573473099284480