Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence
This study aims to investigate efficient missing data techniques for prediction of nasopharyngeal carcinoma (NPC) recurrence. Initially, clinical data of patients with NPC who received treatment at Ramathibodi hospital, Thailand, were collected. In total, 495 records were employed for the cancer rec...
Main Authors: | Panrasee Ritthipravat, Orrawan Kumdee, Thongchai Bhongmakapat |
---|---|
Other Authors: | Mahidol University |
Format: | Article |
Published: | 2018 |
Subjects: | Computer Science |
Online Access: | https://repository.li.mahidol.ac.th/handle/123456789/31646 |
Institution: | Mahidol University |
id |
th-mahidol.31646 |
---|---|
record_format |
dspace |
spelling |
th-mahidol.31646 2018-10-19T11:52:05Z Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence Panrasee Ritthipravat Orrawan Kumdee Thongchai Bhongmakapat Mahidol University Faculty of Medicine, Ramathibodi Hospital, Mahidol University Computer Science This study aims to investigate efficient missing data techniques for prediction of nasopharyngeal carcinoma (NPC) recurrence. Initially, clinical data of patients with NPC who received treatment at Ramathibodi Hospital, Thailand, were collected. In total, 495 records were employed for the cancer recurrence prediction. Because these data contain various missing values, appropriate missing data techniques (MDTs) must be examined. This study focuses on complete-case analysis, mean imputation, k-nearest neighbor imputation and Expectation Maximization (EM) imputation. The completed data are then used for developing three different predictive models, i.e., a single-point model, a multiple-point model and a sequential neural network. The experimental results showed that EM imputation was superior to the other missing data techniques, providing the highest predictive performance in all models. It achieved an average area under the receiver operating characteristic curve (AUC) of 0.72. The Hosmer-Lemeshow goodness-of-fit test was used to evaluate the fit of each model. The results confirmed that EM imputation was the best missing data technique. The sequential neural network outperformed the other models. It provided the highest predictive performance in terms of the average AUC (0.73) and the Chi-square statistic (4.30). In addition, survival curves generated from these predictive models were compared with the Kaplan-Meier survival curve. The curves based on EM imputation were closest to the Kaplan-Meier model. According to the log-rank test, however, these curves were significantly different (p-value < 0.05). 
© 2013 Asian Network for Scientific Information. 2018-10-19T04:52:05Z 2018-10-19T04:52:05Z 2013-04-24 Article Information Technology Journal. Vol.12, No.6 (2013), 1125-1133 10.3923/itj.2013.1125.1133 18125646 18125638 2-s2.0-84876357650 https://repository.li.mahidol.ac.th/handle/123456789/31646 Mahidol University SCOPUS https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84876357650&origin=inward |
institution |
Mahidol University |
building |
Mahidol University Library |
continent |
Asia |
country |
Thailand |
content_provider |
Mahidol University Library |
collection |
Mahidol University Institutional Repository |
topic |
Computer Science |
spellingShingle |
Computer Science Panrasee Ritthipravat Orrawan Kumdee Thongchai Bhongmakapat Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence |
description |
This study aims to investigate efficient missing data techniques for prediction of nasopharyngeal carcinoma (NPC) recurrence. Initially, clinical data of patients with NPC who received treatment at Ramathibodi Hospital, Thailand, were collected. In total, 495 records were employed for the cancer recurrence prediction. Because these data contain various missing values, appropriate missing data techniques (MDTs) must be examined. This study focuses on complete-case analysis, mean imputation, k-nearest neighbor imputation and Expectation Maximization (EM) imputation. The completed data are then used for developing three different predictive models, i.e., a single-point model, a multiple-point model and a sequential neural network. The experimental results showed that EM imputation was superior to the other missing data techniques, providing the highest predictive performance in all models. It achieved an average area under the receiver operating characteristic curve (AUC) of 0.72. The Hosmer-Lemeshow goodness-of-fit test was used to evaluate the fit of each model. The results confirmed that EM imputation was the best missing data technique. The sequential neural network outperformed the other models. It provided the highest predictive performance in terms of the average AUC (0.73) and the Chi-square statistic (4.30). In addition, survival curves generated from these predictive models were compared with the Kaplan-Meier survival curve. The curves based on EM imputation were closest to the Kaplan-Meier model. According to the log-rank test, however, these curves were significantly different (p-value < 0.05). © 2013 Asian Network for Scientific Information. |
author2 |
Mahidol University |
author_facet |
Mahidol University Panrasee Ritthipravat Orrawan Kumdee Thongchai Bhongmakapat |
format |
Article |
author |
Panrasee Ritthipravat Orrawan Kumdee Thongchai Bhongmakapat |
author_sort |
Panrasee Ritthipravat |
title |
Efficient missing data technique for prediction of nasopharyngeal carcinoma recurrence |
publishDate |
2018 |
url |
https://repository.li.mahidol.ac.th/handle/123456789/31646 |
_version_ |
1763494496826294272 |
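
Two of the missing data techniques named in the abstract, mean imputation and k-nearest neighbor imputation, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy `rows` data, the choice of k, and the distance measure (root mean squared difference over the columns both rows have observed) are all assumptions made for illustration. EM imputation, which the abstract reports as the best performer, is not shown here.

```python
import math


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # A column with no observed values cannot be imputed; keep None.
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def knn_impute(rows, k=2):
    """Fill each None with the column mean over the k nearest donor rows.

    Assumption: distance is the root mean squared difference over the
    columns observed in both rows. Rows that share no observed column,
    or that are missing the target column themselves, are not donors.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return None
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = distance(row, other)
                if d is not None:
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:  # leave the value missing if no donor exists
                new_row[j] = sum(nearest) / len(nearest)
        completed.append(new_row)
    return completed


# Hypothetical toy records (not from the study): two numeric features,
# with the second feature missing in the second record.
rows = [
    [1.0, 2.0],
    [2.0, None],
    [3.0, 6.0],
    [10.0, 20.0],
]

mean_filled = mean_impute(rows)
knn_filled = knn_impute(rows, k=2)
```

On this toy data the two techniques fill the gap differently: mean imputation uses all three observed values in the column, including the outlying record, while the k-nearest neighbor variant averages only the two records closest on the first feature.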