Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence

This study aims to investigate efficient missing data techniques for prediction of nasopharyngeal carcinoma (NPC) recurrence. Initially, clinical data of patients with NPC who received treatment at Ramathibodi hospital, Thailand, were collected. In total, 495 records were employed for the cancer rec...

Bibliographic Details
Main Authors: Panrasee Ritthipravat, Orrawan Kumdee, Thongchai Bhongmakapat
Other Authors: Mahidol University
Format: Article
Published: 2018
Subjects: Computer Science
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/31646
Institution: Mahidol University
id th-mahidol.31646
record_format dspace
spelling th-mahidol.31646 2018-10-19T11:52:05Z
Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence
Panrasee Ritthipravat
Orrawan Kumdee
Thongchai Bhongmakapat
Mahidol University
Faculty of Medicine, Ramathibodi Hospital, Mahidol University
Computer Science
This study investigates efficient missing data techniques for predicting nasopharyngeal carcinoma (NPC) recurrence. Clinical data were first collected from patients with NPC who received treatment at Ramathibodi Hospital, Thailand. In total, 495 records were used for cancer recurrence prediction. Because these data contain various missing values, appropriate missing data techniques (MDTs) must be examined. This study focuses on complete-case analysis, mean imputation, k-nearest neighbor imputation and Expectation Maximization (EM) imputation. The completed data are then used to develop three predictive models: a single-point model, a multiple-point model and a sequential neural network. The experimental results showed that EM imputation outperformed the other missing data techniques, providing the highest predictive performance in all models, with an average area under the receiver operating characteristic curve (AUC) of 0.72. The Hosmer and Lemeshow goodness-of-fit test was used to evaluate the fit of each model. The results confirmed that EM imputation was the best missing data technique. The sequential neural network outperformed the other models, giving the best performance in terms of the average AUC (0.73) and the Chi-square statistic (4.30). In addition, survival curves generated from these predictive models were compared with the Kaplan-Meier survival curve. The curves based on EM imputation were closest to the Kaplan-Meier model. According to the log-rank test, however, these curves were significantly different (p-value < 0.05).
© 2013 Asian Network for Scientific Information.
2018-10-19T04:52:05Z 2018-10-19T04:52:05Z 2013-04-24
Article
Information Technology Journal. Vol.12, No.6 (2013), 1125-1133
DOI: 10.3923/itj.2013.1125.1133
ISSN: 18125646, 18125638
Scopus EID: 2-s2.0-84876357650
https://repository.li.mahidol.ac.th/handle/123456789/31646
Mahidol University
SCOPUS https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84876357650&origin=inward
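The four missing data techniques named in the abstract can be illustrated with a minimal NumPy sketch. This record does not describe the authors' implementation, so everything below is an assumption made for illustration: the function names are hypothetical, missing values are assumed to be encoded as NaN in a numeric array with at least two columns, the k-nearest neighbor distance is taken over features observed in both rows, and the EM variant assumes a multivariate normal model.

```python
import numpy as np

def complete_case(X):
    """Complete-case analysis: drop every row that contains a NaN."""
    return X[~np.isnan(X).any(axis=1)]

def mean_impute(X):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(X, axis=0)
    return np.where(np.isnan(X), col_means, X)

def knn_impute(X, k=3):
    """Fill each NaN with the mean of that feature over the k nearest rows.

    Distance (an assumed choice) is the root mean squared difference over
    features observed in both rows; only rows with the target feature
    observed are candidates.
    """
    X_out = X.copy()
    miss = np.isnan(X)
    for i in np.where(miss.any(axis=1))[0]:
        for j in np.where(miss[i])[0]:
            candidates = []
            for r in range(X.shape[0]):
                if r == i or miss[r, j]:
                    continue
                shared = ~miss[i] & ~miss[r]
                if shared.any():
                    d = np.sqrt(np.mean((X[i, shared] - X[r, shared]) ** 2))
                    candidates.append((d, X[r, j]))
            if candidates:
                candidates.sort(key=lambda t: t[0])
                X_out[i, j] = np.mean([v for _, v in candidates[:k]])
    return X_out

def em_impute(X, n_iter=100, tol=1e-6):
    """EM imputation under an assumed multivariate normal model.

    E-step: replace missing entries with their conditional mean given the
    observed entries. M-step: re-estimate the mean and covariance, adding
    the conditional covariance of the imputed entries.
    """
    miss = np.isnan(X)
    X_hat = mean_impute(X)  # starting values
    n, p = X.shape
    mu = X_hat.mean(axis=0)
    sigma = np.cov(X_hat, rowvar=False, bias=True) + 1e-6 * np.eye(p)
    for _ in range(n_iter):
        X_old = X_hat.copy()
        correction = np.zeros((p, p))
        for i in np.where(miss.any(axis=1))[0]:
            m, o = miss[i], ~miss[i]
            if not o.any():  # nothing observed: fall back to the mean
                X_hat[i] = mu
                correction += sigma
                continue
            # Regression coefficients of the missing block on the observed block.
            B = sigma[np.ix_(m, o)] @ np.linalg.pinv(sigma[np.ix_(o, o)])
            X_hat[i, m] = mu[m] + B @ (X[i, o] - mu[o])
            correction[np.ix_(m, m)] += sigma[np.ix_(m, m)] - B @ sigma[np.ix_(o, m)]
        mu = X_hat.mean(axis=0)
        sigma = np.cov(X_hat, rowvar=False, bias=True) + correction / n
        if np.max(np.abs(X_hat - X_old)) < tol:
            break
    return X_hat
```

Each function takes a two-dimensional float array and returns a new array; the completed output could then be passed to any downstream predictive model. Categorical clinical variables would need their own encoding or a different imputation model, which this sketch does not attempt.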
institution Mahidol University
building Mahidol University Library
continent Asia
country Thailand
content_provider Mahidol University Library
collection Mahidol University Institutional Repository
topic Computer Science
author2 Mahidol University
format Article
author Panrasee Ritthipravat
Orrawan Kumdee
Thongchai Bhongmakapat
author_sort Panrasee Ritthipravat
title Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence
publishDate 2018
url https://repository.li.mahidol.ac.th/handle/123456789/31646