Using small database and energy descriptors to predict molecular thermodynamic energies through mediated learning models

Delta machine learning (DML) models have opened a new route to high-fidelity ab initio simulation results for materials by learning from quantities that are cheaper to compute. However, low out-of-sample extrapolative ability and the need for large training sets have...

Bibliographic Details
Main Authors: Chen, Chao; Deng, Siyan; Li, Shuzhou
Other Authors: School of Materials Science and Engineering
Format: Article
Language: English
Published: 2024
Subjects: Engineering; Mediated machine learning; Thermodynamic quantities
Online Access: https://hdl.handle.net/10356/179404
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-179404
record_format dspace
spelling sg-ntu-dr.10356-179404 2024-07-30T02:05:10Z Journal Article Chen, C., Deng, S. & Li, S. (2024). Using small database and energy descriptors to predict molecular thermodynamic energies through mediated learning models. Chemical Engineering Journal, 488, 150607. https://dx.doi.org/10.1016/j.cej.2024.150607 ISSN 1385-8947 https://hdl.handle.net/10356/179404 2-s2.0-85189672657 en © 2024 Elsevier B.V. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering
Mediated machine learning
Thermodynamic quantities
spellingShingle Engineering
Mediated machine learning
Thermodynamic quantities
Chen, Chao
Deng, Siyan
Li, Shuzhou
Using small database and energy descriptors to predict molecular thermodynamic energies through mediated learning models
description Delta machine learning (DML) models have opened a new route to high-fidelity ab initio simulation results for materials by learning from quantities that are cheaper to compute. However, low out-of-sample extrapolative ability and the need for large training sets have limited broader application of conventional DML models. In this work, we propose the concept of non-trivial electron energy, an intermediary energy quantity decoded from the electron total energy that exhibits high Pearson correlation with various thermodynamic energies, and use it to build mediated machine learning (MML) models. By hybridizing the intermediary non-trivial electron energy (N) with a bond descriptor (B) and a spatial matrix (S) of organic molecules, our integrated NBS descriptor predicts thermodynamic energies with errors close to 1 kcal/mol for MML models trained on a database of 100 entries and tested on a database of 500 entries. Moreover, adding supplemental sets of 10 to 20 entries to the original training set can greatly improve the out-of-sample extensibility of NBS MML models, for example to molecules of markedly larger size, with disparate bond types, or even with different elemental compositions. Mediated learning offers an alternative way to break through the limitations of traditional DML models and can be conveniently applied to study formation enthalpies, thermodynamic energy barriers, multi-dimensional Gibbs free energy surfaces, and other quantum chemical quantities related to the internal energy, enthalpy, and free energy of materials under various conditions, with tunable training cost, prediction efficiency, and accuracy.
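The abstract does not specify how the NBS descriptor or the MML models are constructed, so the following is only a minimal sketch of the general delta-learning idea it refers to: check the Pearson correlation between a cheap quantity and the target, then learn only the residual between them. All data here are synthetic, and the names `make_synthetic`, `fit_ridge`, and `predict`, the four-column feature matrix, and the ridge regressor are assumptions made for illustration; none of them come from the paper. Only the 100-entry training and 500-entry test sizes mirror the abstract.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two 1-D arrays."""
    return float(np.corrcoef(x, y)[0, 1])


def fit_ridge(X, y, alpha=1e-3):
    """Closed-form ridge regression; an intercept column is appended to X."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ y)


def predict(X, w):
    """Apply ridge weights (including the intercept) to new rows of X."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ w


def make_synthetic(n, rng):
    """Synthetic stand-in data (assumption): NOT real molecular energies."""
    features = rng.normal(size=(n, 4))  # hypothetical structural features
    cheap = features @ np.array([3.0, -2.0, 1.0, 0.5]) + rng.normal(scale=0.1, size=n)
    # Target = cheap quantity plus a small structure-dependent correction.
    correction = 0.3 * features[:, 0] - 0.2 * features[:, 1]
    target = cheap + correction + rng.normal(scale=0.01, size=n)
    return features, cheap, target


rng = np.random.default_rng(0)
X_train, cheap_train, y_train = make_synthetic(100, rng)  # 100 training entries
X_test, cheap_test, y_test = make_synthetic(500, rng)     # 500 test entries

# Step 1: confirm the cheap quantity tracks the target.
r = pearson_r(cheap_train, y_train)

# Step 2: learn only the residual (target minus cheap quantity).
w = fit_ridge(X_train, y_train - cheap_train)
y_pred = cheap_test + predict(X_test, w)

# Step 3: compare errors with and without the learned correction.
mae_baseline = float(np.mean(np.abs(y_test - cheap_test)))
mae_corrected = float(np.mean(np.abs(y_test - y_pred)))

print(f"Pearson r (cheap vs. target): {r:.3f}")
print(f"MAE using cheap quantity alone: {mae_baseline:.4f}")
print(f"MAE after residual correction: {mae_corrected:.4f}")
```

Learning the residual rather than the target itself is what lets a small training set suffice in this toy setting, because the model only has to capture the part the cheap quantity misses. The numbers it prints describe the synthetic data only and say nothing about the accuracy reported in the abstract.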
author2 School of Materials Science and Engineering
author_facet School of Materials Science and Engineering
Chen, Chao
Deng, Siyan
Li, Shuzhou
format Article
author Chen, Chao
Deng, Siyan
Li, Shuzhou
author_sort Chen, Chao
title Using small database and energy descriptors to predict molecular thermodynamic energies through mediated learning models
title_sort using small database and energy descriptors to predict molecular thermodynamic energies through mediated learning models
publishDate 2024
url https://hdl.handle.net/10356/179404
_version_ 1806059931731755008