Persistent Dirac for molecular representation

Molecular representations are of fundamental importance for modeling and analysing molecular systems. Molecular representation models have contributed greatly to the successes in drug design and materials discovery. In this paper, we present a computational framework for molecular represent...

Bibliographic Details
Main Authors: Wee, Junjie; Bianconi, Ginestra; Xia, Kelin
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2023
Subjects: Science::Mathematics; Persistent Dirac; Molecular Representation
Online Access: https://hdl.handle.net/10356/171550
Institution: Nanyang Technological University
id sg-ntu-dr.10356-171550
record_format dspace
spelling sg-ntu-dr.10356-171550 2023-10-30T15:34:41Z
Persistent Dirac for molecular representation
Wee, Junjie; Bianconi, Ginestra; Xia, Kelin
School of Physical and Mathematical Sciences
Science::Mathematics; Persistent Dirac; Molecular Representation
Ministry of Education (MOE); Nanyang Technological University
Published version
This work was supported in part by Nanyang Technological University Startup Grant M4081842 and Singapore Ministry of Education Academic Research Fund Tier 1 RG109/19 and Tier 2 MOE2018-T2-1-033.
2023-10-30T06:45:56Z 2023 Journal Article
Wee, J., Bianconi, G. & Xia, K. (2023). Persistent Dirac for molecular representation. Scientific Reports, 13(1), 11183. https://dx.doi.org/10.1038/s41598-023-37853-z
2045-2322 https://hdl.handle.net/10356/171550 10.1038/s41598-023-37853-z 37433870 2-s2.0-85164383375 1 13 11183 en
Scientific Reports
© The Author(s) 2023. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Science::Mathematics
Persistent Dirac
Molecular Representation
description Molecular representations are of fundamental importance for modeling and analysing molecular systems. Molecular representation models have contributed greatly to the successes in drug design and materials discovery. In this paper, we present a computational framework for molecular representation that is mathematically rigorous and based on the persistent Dirac operator. The properties of the discrete weighted and unweighted Dirac matrices are systematically discussed, and the biological meanings of both homological and non-homological eigenvectors are studied. We also evaluate the impact of various weighting schemes on the weighted Dirac matrix. Additionally, a set of physical persistent attributes, which characterize the persistence and variation of the spectral properties of Dirac matrices during a filtration process, is proposed as molecular fingerprints. Our persistent attributes are used to classify molecular configurations of nine different types of organic-inorganic halide perovskites. The combination of persistent attributes with a gradient boosting tree model has achieved great success in molecular solvation free energy prediction. The results show that our model is effective in characterizing molecular structures, demonstrating the power of our molecular representation and featurization approach.
author2 School of Physical and Mathematical Sciences
format Article
author Wee, Junjie
Bianconi, Ginestra
Xia, Kelin
title Persistent Dirac for molecular representation
publishDate 2023
url https://hdl.handle.net/10356/171550