An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study
Aims/hypothesis: People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well s...
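Main Authors: | Slieker, Roderick C. Münch, Magnus Donnelly, Louise A. Bouland, Gerard A. Dragan, Iulian Kuznetsov, Dmitry Elders, Petra J. M. Rutter, Guy A. Ibberson, Mark Pearson, Ewan R. Hart, Leen M. 't van de Wiel, Mark A. Beulens, Joline W. J. |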
---|---|
Other Authors: | Lee Kong Chian School of Medicine (LKCMedicine) |
Format: | Article |
Language: | English |
Published: | 2024 |
Subjects: | Medicine, Health and Life Sciences; Machine learning; Prediction model |
Online Access: | https://hdl.handle.net/10356/178674 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-178674 |
---|---|
record_format |
dspace |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Medicine, Health and Life Sciences Machine learning Prediction model |
spellingShingle |
Medicine, Health and Life Sciences Machine learning Prediction model Slieker, Roderick C. Münch, Magnus Donnelly, Louise A. Bouland, Gerard A. Dragan, Iulian Kuznetsov, Dmitry Elders, Petra J. M. Rutter, Guy A. Ibberson, Mark Pearson, Ewan R. Hart, Leen M. 't van de Wiel, Mark A. Beulens, Joline W. J. An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
description |
Aims/hypothesis: People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value. Methods: In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights) or penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell’s C statistic. Results: Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0–11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3–11.8 years). Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). 
A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance. Conclusions/interpretation: Using machine learning approaches, we show that insulin requirement risk can be modestly well predicted by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification. Data availability: Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch. |
author2 |
Lee Kong Chian School of Medicine (LKCMedicine) |
author_facet |
Lee Kong Chian School of Medicine (LKCMedicine) Slieker, Roderick C. Münch, Magnus Donnelly, Louise A. Bouland, Gerard A. Dragan, Iulian Kuznetsov, Dmitry Elders, Petra J. M. Rutter, Guy A. Ibberson, Mark Pearson, Ewan R. Hart, Leen M. 't van de Wiel, Mark A. Beulens, Joline W. J. |
format |
Article |
author |
Slieker, Roderick C. Münch, Magnus Donnelly, Louise A. Bouland, Gerard A. Dragan, Iulian Kuznetsov, Dmitry Elders, Petra J. M. Rutter, Guy A. Ibberson, Mark Pearson, Ewan R. Hart, Leen M. 't van de Wiel, Mark A. Beulens, Joline W. J. |
author_sort |
Slieker, Roderick C. |
title |
An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
title_short |
An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
title_full |
An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
title_fullStr |
An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
title_full_unstemmed |
An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study |
title_sort |
omics-based machine learning approach to predict diabetes progression: a rhapsody study |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/178674 |
_version_ |
1814047200097861632 |
spelling |
sg-ntu-dr.10356-1786742024-07-07T15:37:43Z An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study Slieker, Roderick C. Münch, Magnus Donnelly, Louise A. Bouland, Gerard A. Dragan, Iulian Kuznetsov, Dmitry Elders, Petra J. M. Rutter, Guy A. Ibberson, Mark Pearson, Ewan R. Hart, Leen M. 't van de Wiel, Mark A. Beulens, Joline W. J. Lee Kong Chian School of Medicine (LKCMedicine) Medicine, Health and Life Sciences Machine learning Prediction model Aims/hypothesis: People with type 2 diabetes are heterogeneous in their disease trajectory, with some progressing more quickly to insulin initiation than others. Although classical biomarkers such as age, HbA1c and diabetes duration are associated with glycaemic progression, it is unclear how well such variables predict insulin initiation or requirement and whether newly identified markers have added predictive value. Methods: In two prospective cohort studies as part of IMI-RHAPSODY, we investigated whether clinical variables and three types of molecular markers (metabolites, lipids, proteins) can predict time to insulin requirement using different machine learning approaches (lasso, ridge, GRridge, random forest). Clinical variables included age, sex, HbA1c, HDL-cholesterol and C-peptide. Models were run with unpenalised clinical variables (i.e. always included in the model without weights) or penalised clinical variables, or without clinical variables. Model development was performed in one cohort and the model was applied in a second cohort. Model performance was evaluated using Harrell’s C statistic. Results: Of the 585 individuals from the Hoorn Diabetes Care System (DCS) cohort, 69 required insulin during follow-up (1.0–11.4 years); of the 571 individuals in the Genetics of Diabetes Audit and Research in Tayside Scotland (GoDARTS) cohort, 175 required insulin during follow-up (0.3–11.8 years). 
Overall, the clinical variables and proteins were selected in the different models most often, followed by the metabolites. The most frequently selected clinical variables were HbA1c (18 of the 36 models, 50%), age (15 models, 41.2%) and C-peptide (15 models, 41.2%). Base models (age, sex, BMI, HbA1c) including only clinical variables performed moderately in both the DCS discovery cohort (C statistic 0.71 [95% CI 0.64, 0.79]) and the GoDARTS replication cohort (C 0.71 [95% CI 0.69, 0.75]). A more extensive model including HDL-cholesterol and C-peptide performed better in both cohorts (DCS, C 0.74 [95% CI 0.67, 0.81]; GoDARTS, C 0.73 [95% CI 0.69, 0.77]). Two proteins, lactadherin and proto-oncogene tyrosine-protein kinase receptor, were most consistently selected and slightly improved model performance. Conclusions/interpretation: Using machine learning approaches, we show that insulin requirement risk can be modestly well predicted by predominantly clinical variables. Inclusion of molecular markers improves the prognostic performance beyond that of clinical variables by up to 5%. Such prognostic models could be useful for identifying people with diabetes at high risk of progressing quickly to treatment intensification. Data availability: Summary statistics of lipidomic, proteomic and metabolomic data are available from a Shiny dashboard at https://rhapdata-app.vital-it.ch. Published version This project received funding from the Innovative Medicines Initiative 2 Joint Undertaking, under grant agreement no. 115881 (RHAPSODY). This Joint Undertaking receives support from the European Union’s Horizon 2020 research and innovation programme and EFPIA. This work is supported by the Swiss State Secretariat for Education, Research and Innovation (SERI), under contract no. 16.0097. The DCS cohort was supported by grants from the Netherlands Organisation for Health Research and Development (113102006, 459001015). 
GAR was funded by a Wellcome Trust Investigator Award (212625/Z/18/Z), UK MRC Programme grant (MR/R022259/1), CRCHUM start-up funds, an Innovation Canada John R. Evans Leader Award (CFI 42649), JDRF (JDRF 4-SRA-2023-1182-S-N) and CIHR (CIHR-IRSC:0682002550) project grants. 2024-07-02T05:46:38Z 2024-07-02T05:46:38Z 2024 Journal Article Slieker, R. C., Münch, M., Donnelly, L. A., Bouland, G. A., Dragan, I., Kuznetsov, D., Elders, P. J. M., Rutter, G. A., Ibberson, M., Pearson, E. R., Hart, L. M. 't, van de Wiel, M. A. & Beulens, J. W. J. (2024). An omics-based machine learning approach to predict diabetes progression: a RHAPSODY study. Diabetologia, 67(5), 885-894. https://dx.doi.org/10.1007/s00125-024-06105-8 0012-186X https://hdl.handle.net/10356/178674 10.1007/s00125-024-06105-8 38374450 2-s2.0-85185302533 5 67 885 894 en Diabetologia © The Author(s) 2024. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. application/pdf |
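
The abstract reports model performance as a C statistic for time to insulin requirement. As a rough illustration of what that number measures, the sketch below computes a concordance index on right-censored data: among pairs of people whose ordering in time is known, it counts how often the person with the shorter time received the higher predicted risk. This is not the study's code. The function name `harrell_c`, the toy data, and the choice to skip pairs with tied times are assumptions made for illustration only.

```python
from itertools import combinations


def harrell_c(times, events, risk_scores):
    """Concordance index for right-censored data (illustrative sketch).

    times:       observed follow-up time per person
    events:      1 if the event (e.g. insulin requirement) was observed, 0 if censored
    risk_scores: model output, higher meaning higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification (assumption): ignore pairs with tied times
        # Let a be the person with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true order is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four people, the third is censored at time 6.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
print(harrell_c(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.71 to 0.74 should be read.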