Multi-cover persistence (MCP)-based machine learning for polymer property prediction

Accurate and efficient prediction of polymer properties is crucial for polymer design. Recently, data-driven artificial intelligence (AI) models have demonstrated great promise in polymer property analysis. Despite this progress, a pivotal challenge in all AI-driven models remains to...

Bibliographic Details
Main Authors: Zhang, Yipeng; Shen, Cong; Xia, Kelin
Other Authors: School of Physical and Mathematical Sciences
Format: Article
Language: English
Published: 2024
Subjects: Mathematical Sciences; Molecular representation; Multi-cover persistence
Online Access: https://hdl.handle.net/10356/181350
Institution: Nanyang Technological University
id sg-ntu-dr.10356-181350
record_format dspace
spelling sg-ntu-dr.10356-181350 2024-12-02T15:35:49Z
Ministry of Education (MOE) Nanyang Technological University Published version This work was supported in part by Nanyang Technological University SPMS Collaborative Research Award 2022, Singapore Ministry of Education Academic Research Fund (Tier 2 grants MOE-T2EP20220-0010 and MOE-T2EP20221-0003). 2024-11-26T04:58:43Z 2024-11-26T04:58:43Z 2024 Journal Article Zhang, Y., Shen, C. & Xia, K. (2024). Multi-cover persistence (MCP)-based machine learning for polymer property prediction. Briefings in Bioinformatics, 25(6). https://dx.doi.org/10.1093/bib/bbae465 1467-5463 https://hdl.handle.net/10356/181350 10.1093/bib/bbae465 39323091 2-s2.0-85204940933 6 25 en MOE-T2EP20220-0010 MOE-T2EP20221-0003 Briefings in Bioinformatics © 2024 The Author(s). Published by Oxford University Press. This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com application/pdf
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
topic Mathematical Sciences
Molecular representation
Multi-cover persistence
description Accurate and efficient prediction of polymer properties is crucial for polymer design. Recently, data-driven artificial intelligence (AI) models have demonstrated great promise in polymer property analysis. Despite this progress, a pivotal challenge for all AI-driven models remains the effective representation of molecules. Here we introduce Multi-Cover Persistence (MCP)-based molecular representation and featurization for the first time. Our MCP-based polymer descriptors are combined with machine learning models, in particular Gradient Boosting Tree (GBT) models, for polymer property prediction. Unlike previous molecular representations, polymer molecular structure and interactions are represented as MCP, which uses Delaunay slices at different dimensions and rhomboid tiling to characterize the complicated geometric and topological information within the data. Statistical features from the generated persistent barcodes are used as polymer descriptors and combined with the GBT model. Our model has been extensively validated on polymer benchmark datasets. Our models can outperform traditional fingerprint-based models and achieve accuracy similar to that of geometric deep learning models. In particular, our model tends to be more effective on large monomer structures, demonstrating the great potential of MCP in characterizing more complicated polymer data. This work underscores the potential of MCP in polymer informatics, presenting a novel perspective on molecular representation and its application in polymer science.
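The description states that statistical features of persistent barcodes serve as polymer descriptors. The sketch below illustrates that general idea only, and it is not the authors' implementation: the choice of statistics (count, minimum, maximum, mean, standard deviation, sum of bar lengths), the function name `barcode_features`, and the toy barcodes are all assumptions made for illustration. The record does not say which statistics the paper uses, nor how barcodes are indexed across covers and dimensions.

```python
from statistics import mean, pstdev


def barcode_features(bars):
    """Summarize one persistence barcode given as (birth, death) pairs.

    Assumption: these six statistics are illustrative, not the paper's list.
    """
    # Bar length (death - birth) measures how long a topological feature persists.
    lengths = [death - birth for birth, death in bars]
    if not lengths:
        # An empty barcode maps to zeros so every sample has the same vector width.
        return [0.0] * 6
    return [
        float(len(lengths)),  # number of bars
        min(lengths),
        max(lengths),
        mean(lengths),
        pstdev(lengths),      # population standard deviation
        sum(lengths),
    ]


# Hypothetical toy barcodes, keyed by homology dimension (not real polymer data).
toy_barcodes = {
    0: [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)],
    1: [(1.0, 2.0)],
}

# Concatenate the per-dimension statistics into one fixed-length descriptor.
descriptor = [
    value
    for dim in sorted(toy_barcodes)
    for value in barcode_features(toy_barcodes[dim])
]
print(len(descriptor), descriptor)
```

A fixed-length vector of this kind is the sort of input a gradient boosting model expects. One row per polymer could, for example, be passed to scikit-learn's `GradientBoostingRegressor`, which is named here only as a possible choice and is not confirmed by the record.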