Theory-guided machine learning to predict configurational energies of high distortion alloy systems
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2023 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/165985 |
Institution: | Nanyang Technological University |
Summary: | Cluster expansion (CE) is a popular surrogate model to density functional theory (DFT) for
modeling the stability of alloy systems through configurational energies. However, since CE is a
lattice-based model, its accuracy is often poor when applied to high-entropy alloys (HEAs) with
significant structural distortion. State-of-the-art attempts at using CE with machine learning (ML)
models such as Lasso and Bayesian methods for selecting meaningful clusters show high prediction errors for
these high-distortion alloy systems, where the contributions of long-range effective cluster
interactions (ECIs) to configurational energetics remain significant. Adopting only clusters as
descriptors has proven insufficient for accurate and robust predictions. This paper presents the
novel integration of features generated from clusters in CE with over 3000 Matminer material
descriptors to comprehensively capture the behavior of complex high-distortion systems.
Matminer features have previously proved effective for predicting material properties such as bandgap, elastic
constants, formation energies, adsorption energies, and ferromagnetic properties. Using
recursive feature elimination, optimized based on stable weight assignment of ridge regularization,
we retained only the important descriptors in a high-dimensional framework where configurational
energy labels vastly exceed the number of descriptors. The pipeline is applied to the ten constituent
binary alloys of HEA Mo-Nb-V-Ti-Zr, which is known to have large structural distortions, and we
discovered that the prediction accuracy significantly improved by an average of 56%, consistent
across all ten binary alloy systems. More importantly, we found four important classes of
features (coordination number, XRD, dihedral-angle distribution function, and clusters) that our
model consistently selects across all ten binaries. Our results are robust, showing that the additional
descriptors from Matminer can better capture the behavior of high-distortion alloy systems. These
important classes of descriptors are also transferable to other complex systems, such as HEAs, that
are currently poorly understood, and can give robust predictions of their properties, accelerating the
discovery of these high-performance alloys. |
---|
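
The descriptor-selection step named in the summary (recursive feature elimination driven by ridge-regularized weights) can be illustrated with a minimal sketch. This is not the author's pipeline: it assumes scikit-learn's generic `RFE` and `Ridge`, uses synthetic random data in place of cluster and Matminer descriptors, and the function name `select_descriptors`, the regularization strength, and the step size are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import Ridge
from sklearn.preprocessing import StandardScaler


def select_descriptors(X, y, n_keep, alpha=1.0, step=5):
    """Return a boolean mask over the columns of X.

    A ridge model is fitted, the `step` descriptors with the smallest
    coefficient magnitudes are dropped, and the fit is repeated until
    only `n_keep` descriptors remain (hypothetical helper).
    """
    # Standardize so coefficient magnitudes are comparable across descriptors.
    X_scaled = StandardScaler().fit_transform(X)
    selector = RFE(Ridge(alpha=alpha), n_features_to_select=n_keep, step=step)
    selector.fit(X_scaled, y)
    return selector.support_


# Synthetic stand-in data (assumption): 200 "structures", 50 "descriptors",
# of which only columns 3, 17 and 42 actually determine the "energy".
rng = np.random.default_rng(0)
n_structures, n_descriptors = 200, 50
X = rng.normal(size=(n_structures, n_descriptors))
y = (2.0 * X[:, 3] - 1.5 * X[:, 17] + 1.0 * X[:, 42]
     + 0.01 * rng.normal(size=n_structures))

mask = select_descriptors(X, y, n_keep=3)
selected = np.flatnonzero(mask)
print("selected descriptor columns:", selected.tolist())
```

In this toy setting the three informative columns carry much larger ridge weights than the rest, so elimination by weight magnitude recovers them. Real descriptor sets are typically correlated and noisy, which is presumably why the summary stresses stable weight assignment when tuning the regularization.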