XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers

Glass transition temperature (Tg) is the temperature at which a polymer changes from a glassy state to a rubbery state. The change in properties below and above Tg is very important in the food science and pharmaceutical industries. In recent decades, there has been a growth in using machine learni...

Bibliographic Details
Main Author: Goh, Kai Leong
Other Authors: Lu Yunpeng
Format: Student Research Paper
Language: English
Published: Nanyang Technological University 2022
Subjects: Science::Chemistry
Online Access: https://hdl.handle.net/10356/155298
Institution: Nanyang Technological University
id sg-ntu-dr.10356-155298
record_format dspace
spelling sg-ntu-dr.10356-155298 2022-06-29T00:54:35Z XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers Goh, Kai Leong Lu Yunpeng Xia Kelin School of Physical and Mathematical Sciences Wee Jun Jie YPLu@ntu.edu.sg, xiakelin@ntu.edu.sg Science::Chemistry Glass transition temperature (Tg) is the temperature at which a polymer changes from a glassy state to a rubbery state. The change in properties below and above Tg is very important in the food science and pharmaceutical industries. In recent decades, there has been a growth in using machine learning (ML) to develop quantitative structure–property relationship (QSPR) models. QSPR uses molecular descriptors and molecular fingerprints as features to predict the properties of chemical compounds. As a result, numerous studies have been dedicated to creating a good QSPR model to predict Tg. However, to the best of our knowledge, no previous work has used the Mordred molecular descriptor library or the Extreme Gradient Boosting (XGBoost) regression algorithm to predict Tg. Therefore, this project employed Mordred and XGBoost, together with the RDKit cheminformatics library, to predict the Tg of 640 polymers. A total of 12 feature sets were generated by RDKit and Mordred as inputs for XGBoost. The scoring metrics from the scikit-learn and NumPy libraries showed that the 2D molecular descriptors of Mordred (Mordred-2D) and the Extended-Connectivity Fingerprint with a diameter of 4 bonds (ECFP4) performed best. The results improved further when Mordred-2D and ECFP4 were combined into a new feature set. Future work aims to increase the number of polymer data points and to explore better ways of representing the polymer repeating units for the calculation of descriptors and fingerprints. 2022-02-16T05:40:46Z 2022-02-16T05:40:46Z 2021 Student Research Paper Goh, K. L. (2021). XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers. Student Research Paper, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/155298 https://hdl.handle.net/10356/155298 en SPMS20062 © 2021 The Author(s). application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Chemistry
spellingShingle Science::Chemistry
Goh, Kai Leong
XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
description Glass transition temperature (Tg) is the temperature at which a polymer changes from a glassy state to a rubbery state. The change in properties below and above Tg is very important in the food science and pharmaceutical industries. In recent decades, there has been a growth in using machine learning (ML) to develop quantitative structure–property relationship (QSPR) models. QSPR uses molecular descriptors and molecular fingerprints as features to predict the properties of chemical compounds. As a result, numerous studies have been dedicated to creating a good QSPR model to predict Tg. However, to the best of our knowledge, no previous work has used the Mordred molecular descriptor library or the Extreme Gradient Boosting (XGBoost) regression algorithm to predict Tg. Therefore, this project employed Mordred and XGBoost, together with the RDKit cheminformatics library, to predict the Tg of 640 polymers. A total of 12 feature sets were generated by RDKit and Mordred as inputs for XGBoost. The scoring metrics from the scikit-learn and NumPy libraries showed that the 2D molecular descriptors of Mordred (Mordred-2D) and the Extended-Connectivity Fingerprint with a diameter of 4 bonds (ECFP4) performed best. The results improved further when Mordred-2D and ECFP4 were combined into a new feature set. Future work aims to increase the number of polymer data points and to explore better ways of representing the polymer repeating units for the calculation of descriptors and fingerprints.
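As a hedged illustration of the feature-combination and scoring steps described above, the following minimal sketch uses only the Python standard library. The record does not say which scoring metrics were used, so RMSE and R² are assumptions here. The helper names (`combine_features`, `rmse`, `r2`) and all numbers are hypothetical toy values, not data or results from the paper. In practice the descriptor vector would come from Mordred, the bit vector from RDKit, and the predictions from an XGBoost regressor.

```python
import math


def combine_features(descriptors, fingerprint_bits):
    """Concatenate a descriptor vector and a fingerprint bit vector into one feature row."""
    return [float(x) for x in descriptors] + [float(b) for b in fingerprint_bits]


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy inputs: two descriptor values and a 4-bit fingerprint for one polymer.
feature_row = combine_features([1.5, 0.2], [0, 1, 1, 0])

# Hypothetical measured and predicted Tg values (in kelvin) for three polymers.
tg_true = [350.0, 400.0, 450.0]
tg_pred = [360.0, 390.0, 450.0]

score_rmse = rmse(tg_true, tg_pred)
score_r2 = r2(tg_true, tg_pred)
print(len(feature_row), round(score_rmse, 3), round(score_r2, 3))
```

Concatenating the two vectors is the simplest way to form a combined feature set; it keeps every descriptor and every fingerprint bit as a separate input column and leaves feature selection to the regressor.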
author2 Lu Yunpeng
author_facet Lu Yunpeng
Goh, Kai Leong
format Student Research Paper
author Goh, Kai Leong
author_sort Goh, Kai Leong
title XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
title_short XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
title_full XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
title_fullStr XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
title_full_unstemmed XGBoost, mordred and RDKit for the prediction of glass transition temperature of polymers
title_sort xgboost, mordred and rdkit for the prediction of glass transition temperature of polymers
publisher Nanyang Technological University
publishDate 2022
url https://hdl.handle.net/10356/155298
_version_ 1738844801216806912