HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization

BACKGROUND: Due to many substances in the human exposome, there is a dearth of exposure and toxicity information available to assess potential health risks. Quantification of all trace organics in the biological fluids seems impossible and costly, regardless of the high individual exposure variabili...

Bibliographic Details
Main Authors: Zhao, Fanrong, Li, Li, Lin, Penghui, Chen, Yue, Xing, Shipei, Du, Huili, Wang, Zheng, Yang, Junjie, Huan, Tao, Long, Cheng, Zhang, Limao, Wang, Bin, Fang, Mingliang
Other Authors: Lee Kong Chian School of Medicine (LKCMedicine)
Format: Article
Language: English
Published: 2023
Online Access: https://hdl.handle.net/10356/169955
Institution: Nanyang Technological University
id sg-ntu-dr.10356-169955
record_format dspace
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Medicine
Engineering::Computer science and engineering
Artificial Neural Network
Bioassay
spellingShingle Science::Medicine
Engineering::Computer science and engineering
Artificial Neural Network
Bioassay
Zhao, Fanrong
Li, Li
Lin, Penghui
Chen, Yue
Xing, Shipei
Du, Huili
Wang, Zheng
Yang, Junjie
Huan, Tao
Long, Cheng
Zhang, Limao
Wang, Bin
Fang, Mingliang
HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
description BACKGROUND: Due to many substances in the human exposome, there is a dearth of exposure and toxicity information available to assess potential health risks. Quantification of all trace organics in the biological fluids seems impossible and costly, regardless of the high individual exposure variability. We hypothesized that the blood concentration (CB) of organic pollutants could be predicted via their exposure and chemical properties. Developing a prediction model on the annotation of chemicals in human blood can provide new insight into the distribution and extent of exposures to a wide range of chemicals in humans. OBJECTIVES: Our objective was to develop a machine learning (ML) model to predict blood concentrations (CBs) of chemicals and prioritize chemicals of health concern. METHODS: We curated the CBs of compounds mostly measured at population levels and developed an ML model for chemical CB predictions by considering chemical daily exposure (DE) and exposure pathway indicators (dij), half-lives (t1/2), and volume of distribution (Vd). Three ML models, including random forest (RF), artificial neural network (ANN) and support vector regression (SVR) were compared. The toxicity potential or prioritization of each chemical was represented as a bioanalytical equivalency (BEQ) and its percentage (BEQ%) estimated based on the predicted CB and ToxCast bioactivity data. We also retrieved the top 25 most active chemicals in each assay to further observe changes in the BEQ% after the exclusion of the drugs and endogenous substances. RESULTS: We curated the CBs of 216 compounds primarily measured at population levels. RF outperformed the ANN and SVR models with the root mean square error (RMSE) of 1.66 and 2.07 μM, the mean absolute error (MAE) values of 1.28 and 1.56 μM, the mean absolute percentage error (MAPE) of 0.29 and 0.23, and R² of 0.80 and 0.72 across test and testing sets. 
Subsequently, the human CBs of 7,858 ToxCast chemicals were successfully predicted, ranging from 1.29 × 10⁻⁶ to 1.79 × 10⁻² μM. The predicted CBs were then combined with ToxCast in vitro bioassays to prioritize the ToxCast chemicals across 12 in vitro assays with important toxicological end points. It is interesting that we found the most active compounds to be food additives and pesticides rather than widely monitored environmental pollutants. DISCUSSION: We have shown that the accurate prediction of “internal exposure” from “external exposure” is possible, and this result can be quite useful in the risk prioritization.
author2 Lee Kong Chian School of Medicine (LKCMedicine)
author_facet Lee Kong Chian School of Medicine (LKCMedicine)
Zhao, Fanrong
Li, Li
Lin, Penghui
Chen, Yue
Xing, Shipei
Du, Huili
Wang, Zheng
Yang, Junjie
Huan, Tao
Long, Cheng
Zhang, Limao
Wang, Bin
Fang, Mingliang
format Article
author Zhao, Fanrong
Li, Li
Lin, Penghui
Chen, Yue
Xing, Shipei
Du, Huili
Wang, Zheng
Yang, Junjie
Huan, Tao
Long, Cheng
Zhang, Limao
Wang, Bin
Fang, Mingliang
author_sort Zhao, Fanrong
title HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
title_short HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
title_full HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
title_fullStr HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
title_full_unstemmed HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
title_sort hexppredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization
publishDate 2023
url https://hdl.handle.net/10356/169955
_version_ 1779156706454929408
spelling sg-ntu-dr.10356-169955 2023-08-20T15:37:29Z HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization Zhao, Fanrong Li, Li Lin, Penghui Chen, Yue Xing, Shipei Du, Huili Wang, Zheng Yang, Junjie Huan, Tao Long, Cheng Zhang, Limao Wang, Bin Fang, Mingliang Lee Kong Chian School of Medicine (LKCMedicine) School of Civil and Environmental Engineering School of Computer Science and Engineering Science::Medicine Engineering::Computer science and engineering Artificial Neural Network Bioassay BACKGROUND: Due to many substances in the human exposome, there is a dearth of exposure and toxicity information available to assess potential health risks. Quantification of all trace organics in the biological fluids seems impossible and costly, regardless of the high individual exposure variability. We hypothesized that the blood concentration (CB) of organic pollutants could be predicted via their exposure and chemical properties. Developing a prediction model on the annotation of chemicals in human blood can provide new insight into the distribution and extent of exposures to a wide range of chemicals in humans. OBJECTIVES: Our objective was to develop a machine learning (ML) model to predict blood concentrations (CBs) of chemicals and prioritize chemicals of health concern. METHODS: We curated the CBs of compounds mostly measured at population levels and developed an ML model for chemical CB predictions by considering chemical daily exposure (DE) and exposure pathway indicators (dij), half-lives (t1/2), and volume of distribution (Vd). Three ML models, including random forest (RF), artificial neural network (ANN) and support vector regression (SVR) were compared. The toxicity potential or prioritization of each chemical was represented as a bioanalytical equivalency (BEQ) and its percentage (BEQ%) estimated based on the predicted CB and ToxCast bioactivity data. 
We also retrieved the top 25 most active chemicals in each assay to further observe changes in the BEQ% after the exclusion of the drugs and endogenous substances. RESULTS: We curated the CBs of 216 compounds primarily measured at population levels. RF outperformed the ANN and SVR models with the root mean square error (RMSE) of 1.66 and 2.07 μM, the mean absolute error (MAE) values of 1.28 and 1.56 μM, the mean absolute percentage error (MAPE) of 0.29 and 0.23, and R² of 0.80 and 0.72 across test and testing sets. Subsequently, the human CBs of 7,858 ToxCast chemicals were successfully predicted, ranging from 1.29 × 10⁻⁶ to 1.79 × 10⁻² μM. The predicted CBs were then combined with ToxCast in vitro bioassays to prioritize the ToxCast chemicals across 12 in vitro assays with important toxicological end points. It is interesting that we found the most active compounds to be food additives and pesticides rather than widely monitored environmental pollutants. DISCUSSION: We have shown that the accurate prediction of “internal exposure” from “external exposure” is possible, and this result can be quite useful in the risk prioritization. Ministry of Education (MOE) Published version This work was funded by the National Key R&D Program (No. 2022YFC3702600 and 2022YFC3702601), the Singapore Ministry of Education Academic Research Fund Tier 1 (04MNP000567C120), and the Startup Grant of Fudan University (No. JIH 1829010Y). 2023-08-16T02:11:27Z 2023-08-16T02:11:27Z 2023 Journal Article Zhao, F., Li, L., Lin, P., Chen, Y., Xing, S., Du, H., Wang, Z., Yang, J., Huan, T., Long, C., Zhang, L., Wang, B. & Fang, M. (2023). HExpPredict: in vivo exposure prediction of human blood exposome using a random forest model and its application in chemical risk prioritization. Environmental Health Perspectives, 131(3), 037009-1-037009-10. 
https://dx.doi.org/10.1289/EHP11305 0091-6765 https://hdl.handle.net/10356/169955 10.1289/EHP11305 36913238 2-s2.0-85150116608 3 131 037009-1 037009-10 en 04MNP000567C120 Environmental Health Perspectives © 2023 Public Health Services, US Dept of Health and Human Services. All rights reserved. application/pdf