Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations

In this study, we propose a new framework for Data Assimilation (DA) named Probabilistic Optimal Interpolation (POI) to combine the predictions from Machine Learning (ML) models trained with historical data and real-time observations, with the key objective to improve the estimate on the state of sy...

Bibliographic Details
Main Authors: Wei, Yuying, Law, Adrian Wing-Keung, Yang, Chun
Other Authors: School of Civil and Environmental Engineering
Format: Article
Language: English
Published: 2023
Subjects: Engineering::Computer science and engineering; Data Assimilation; Uncertainty
Online Access: https://hdl.handle.net/10356/170149
Institution: Nanyang Technological University
id sg-ntu-dr.10356-170149
record_format dspace
spelling sg-ntu-dr.10356-170149
2023-08-30T01:18:00Z
Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
Wei, Yuying
Law, Adrian Wing-Keung
Yang, Chun
School of Civil and Environmental Engineering
School of Mechanical and Aerospace Engineering
Nanyang Environment and Water Research Institute
Engineering::Computer science and engineering
Data Assimilation
Uncertainty
In this study, we propose a new framework for Data Assimilation (DA) named Probabilistic Optimal Interpolation (POI) to combine the predictions from Machine Learning (ML) models trained with historical data and real-time observations, with the key objective of improving the estimate of the state of the system. The framework utilizes the heteroscedastic uncertainty of the ML predictions as well as the residual-based uncertainty of the observations and integrates the two through the technique of optimal interpolation. The quantification of the respective uncertainties is included directly within the framework itself. As an application example, we test the performance of POI using a multi-scale Lorenz 96 chaotic system with various added noise levels. The ML model is based on a Long Short-Term Memory (LSTM) neural network, and the technique of Monte Carlo (MC) dropout is adopted for the uncertainty quantification. The computational results show that the POI implementation can lead to improved predictions of the state of the system with less uncertainty, and that it can also filter the added noise effectively when the historical data are reasonably accurate. However, if the noise level is high, using the updated POI predictions as sequential inputs for the next time step does not guarantee better performance than using the real-time observations directly. Furthermore, under very noisy conditions, the average ML predictions after the MC dropout can already reduce the noise substantially, and these predictions might even be better than the POI updates. Therefore, the POI implementation (or data assimilation in general) is not recommended with an ML-based surrogate model in a noisy environment.
National Research Foundation (NRF)
Public Utilities Board (PUB)
This research is supported by the National Research Foundation, Singapore, and PUB, Singapore’s National Water Agency under its RIE2025 Urban Solutions and Sustainability (USS) (Water) Centre of Excellence (CoE) Programme, awarded to Nanyang Environment & Water Research Institute (NEWRI), Nanyang Technological University, Singapore (NTU).
2023-08-30T01:18:00Z
2023-08-30T01:18:00Z
2023
Journal Article
Wei, Y., Law, A. W. & Yang, C. (2023). Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations. Journal of Computational Science, 67, 101977. https://dx.doi.org/10.1016/j.jocs.2023.101977
1877-7503
https://hdl.handle.net/10356/170149
10.1016/j.jocs.2023.101977
2-s2.0-85149174975
67
101977
en
Journal of Computational Science
© 2023 Elsevier B.V. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Data Assimilation
Uncertainty
spellingShingle Engineering::Computer science and engineering
Data Assimilation
Uncertainty
Wei, Yuying
Law, Adrian Wing-Keung
Yang, Chun
Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
description In this study, we propose a new framework for Data Assimilation (DA) named Probabilistic Optimal Interpolation (POI) to combine the predictions from Machine Learning (ML) models trained with historical data and real-time observations, with the key objective of improving the estimate of the state of the system. The framework utilizes the heteroscedastic uncertainty of the ML predictions as well as the residual-based uncertainty of the observations and integrates the two through the technique of optimal interpolation. The quantification of the respective uncertainties is included directly within the framework itself. As an application example, we test the performance of POI using a multi-scale Lorenz 96 chaotic system with various added noise levels. The ML model is based on a Long Short-Term Memory (LSTM) neural network, and the technique of Monte Carlo (MC) dropout is adopted for the uncertainty quantification. The computational results show that the POI implementation can lead to improved predictions of the state of the system with less uncertainty, and that it can also filter the added noise effectively when the historical data are reasonably accurate. However, if the noise level is high, using the updated POI predictions as sequential inputs for the next time step does not guarantee better performance than using the real-time observations directly. Furthermore, under very noisy conditions, the average ML predictions after the MC dropout can already reduce the noise substantially, and these predictions might even be better than the POI updates. Therefore, the POI implementation (or data assimilation in general) is not recommended with an ML-based surrogate model in a noisy environment.
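The combination step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a diagonal (per-variable) error covariance, an identity observation operator, and an observation variance that is simply supplied as an input rather than derived from residuals. The function names `mc_dropout_stats` and `oi_update` are hypothetical, and the random samples below only stand in for repeated stochastic forward passes of an LSTM with dropout left active.

```python
import numpy as np

def mc_dropout_stats(samples):
    """Mean and sample variance over repeated stochastic forward passes.

    `samples` has shape (n_passes, n_state). In a real setup each row would
    come from one forward pass of the network with dropout enabled.
    """
    samples = np.asarray(samples, dtype=float)
    return samples.mean(axis=0), samples.var(axis=0, ddof=1)

def oi_update(x_ml, var_ml, y_obs, var_obs):
    """Elementwise optimal interpolation of a prediction and an observation.

    Assumes independent errors per state variable and var_ml + var_obs > 0.
    """
    x_ml = np.asarray(x_ml, dtype=float)
    var_ml = np.asarray(var_ml, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    var_obs = np.asarray(var_obs, dtype=float)

    gain = var_ml / (var_ml + var_obs)       # weight given to the observation
    x_a = x_ml + gain * (y_obs - x_ml)       # updated (analysis) state
    var_a = (1.0 - gain) * var_ml            # updated variance
    return x_a, var_a

# Illustrative usage with synthetic numbers (assumed values, not paper data).
rng = np.random.default_rng(0)
fake_passes = 1.5 + 0.3 * rng.standard_normal((50, 4))   # stand-in for MC dropout output
x_ml, var_ml = mc_dropout_stats(fake_passes)
y_obs = np.array([1.2, 1.8, 1.4, 1.6])                   # hypothetical noisy observations
x_a, var_a = oi_update(x_ml, var_ml, y_obs, var_obs=0.2 ** 2)
print("updated state:", x_a)
print("updated variance:", var_a)
```

Under these assumptions the updated variance never exceeds the smaller of the two input variances, which matches the abstract's statement that the update can reduce uncertainty. It also shows why the update adds little when one source is far noisier than the other: the gain then approaches 0 or 1 and the result collapses to the less noisy input.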
author2 School of Civil and Environmental Engineering
author_facet School of Civil and Environmental Engineering
Wei, Yuying
Law, Adrian Wing-Keung
Yang, Chun
format Article
author Wei, Yuying
Law, Adrian Wing-Keung
Yang, Chun
author_sort Wei, Yuying
title Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
title_short Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
title_full Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
title_fullStr Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
title_full_unstemmed Probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
title_sort probabilistic optimal interpolation for data assimilation between machine learning model predictions and real time observations
publishDate 2023
url https://hdl.handle.net/10356/170149
_version_ 1779156294572179456