A deep neural network approach to predicting clinical outcomes of neuroblastoma patients

Bibliographic Details
Main Authors: Tranchevent, Léon-Charles, Azuaje, Francisco, Rajapakse, Jagath Chandana
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2021
Online Access: https://hdl.handle.net/10356/146977
Institution: Nanyang Technological University
Description
Summary: Background: The availability of high-throughput omics datasets from large patient cohorts has allowed the development of methods that aim to predict patient clinical outcomes, such as survival and disease recurrence. Such methods are also important to better understand the biological mechanisms underlying disease etiology and development, as well as treatment responses. Recently, different predictive models, relying on distinct algorithms (including Support Vector Machines and Random Forests), have been investigated. In this context, deep learning strategies are of special interest due to their demonstrated superior performance over a wide range of problems and datasets. One of the main challenges of such strategies is the "small n large p" problem. Indeed, omics datasets typically consist of small numbers of samples and large numbers of features relative to typical deep learning datasets. Neural networks usually tackle this problem through feature selection or by including additional constraints during the learning process.
Methods: We propose to tackle this problem with a novel strategy that relies on a graph-based method for feature extraction, coupled with a deep neural network for clinical outcome prediction. The omics data are first represented as graphs whose nodes represent patients and whose edges represent correlations between the patients’ omics profiles. Topological features, such as centralities, are then extracted from these graphs for every node. Lastly, these features are used as input to train and test various classifiers.
Results: We apply this strategy to four neuroblastoma datasets and observe that models based on neural networks are more accurate than state-of-the-art models (DNN: 85%-87%, SVM/RF: 75%-82%). We explore how different parameters and configurations are selected in order to overcome the effects of the small data problem as well as the curse of dimensionality.
Conclusions: Our results indicate that deep neural networks capture complex features in the data that help predict patient clinical outcomes.
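The graph-based feature extraction step described under Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `patient_graph_features`, the correlation threshold of 0.5, the use of absolute Pearson correlation, the three centralities chosen, and the synthetic data are all assumptions made for illustration. The abstract does not specify how the graphs are built or which centralities are used.

```python
import numpy as np

def patient_graph_features(omics, threshold=0.5, n_iter=200):
    """Build a patient graph from omics profiles and return per-node features.

    omics: array of shape (n_patients, n_features).
    Returns an (n_patients, 3) array: degree centrality, weighted strength,
    and eigenvector centrality. The threshold and feature set are
    illustrative assumptions, not values taken from the paper.
    """
    n = omics.shape[0]
    # Pearson correlation between patients (rows), not between features.
    corr = np.corrcoef(omics)
    np.fill_diagonal(corr, 0.0)  # no self-loops
    # Keep an edge only where the absolute correlation passes the threshold.
    adj = np.where(np.abs(corr) >= threshold, np.abs(corr), 0.0)

    degree = (adj > 0).sum(axis=1) / (n - 1)  # fraction of possible neighbours
    strength = adj.sum(axis=1)                # sum of incident edge weights

    # Eigenvector centrality by power iteration on (A + I); the identity
    # shift keeps the iteration from oscillating and the norm from vanishing.
    vec = np.ones(n) / np.sqrt(n)
    for _ in range(n_iter):
        nxt = adj @ vec + vec
        nxt /= np.linalg.norm(nxt)
        converged = np.allclose(nxt, vec, atol=1e-10)
        vec = nxt
        if converged:
            break
    return np.column_stack([degree, strength, vec])

# Synthetic stand-in for an omics matrix: two groups of five patients,
# each group sharing an underlying 50-feature profile plus small noise.
rng = np.random.default_rng(0)
profile_a = rng.normal(size=50)
profile_b = rng.normal(size=50)
omics = np.vstack(
    [profile_a + 0.1 * rng.normal(size=50) for _ in range(5)]
    + [profile_b + 0.1 * rng.normal(size=50) for _ in range(5)]
)
features = patient_graph_features(omics)
```

Each row of `features` describes one patient by that patient's position in the graph. This replaces thousands of omics variables with a handful of topological ones, which is how such a strategy sidesteps the "small n large p" problem. In a full pipeline these rows would be passed to a classifier such as a deep neural network, an SVM, or a Random Forest.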