Data-driven estimation of building energy consumption with multi-source heterogeneous data

For better energy evaluation and management, a categorical boosting (CatBoost)-based predictive method is presented to accurately estimate building energy consumption by learning large volumes of multi-source heterogeneous data collected from buildings. To be specific, the newly-developed CatBoost m...

Bibliographic Details
Main Authors: Pan, Yue, Zhang, Limao
Other Authors: School of Civil and Environmental Engineering
Format: Article
Language: English
Published: 2022
Subjects: Engineering::Civil engineering; Data Mining; Feature Importance
Online Access: https://hdl.handle.net/10356/155499
Institution: Nanyang Technological University
id sg-ntu-dr.10356-155499
record_format dspace
spelling sg-ntu-dr.10356-155499 2022-03-02T08:40:53Z
funding Ministry of Education (MOE); Nanyang Technological University
acknowledgement The Ministry of Education Tier 1 Grant, Singapore (No. M4011971.030) and the Start-Up Grant at Nanyang Technological University, Singapore (No. M4082160.030) are acknowledged for their financial support of this research.
type Journal Article
citation Pan, Y. & Zhang, L. (2020). Data-driven estimation of building energy consumption with multi-source heterogeneous data. Applied Energy, 268, 114965. https://dx.doi.org/10.1016/j.apenergy.2020.114965
issn 0306-2619
handle https://hdl.handle.net/10356/155499
doi 10.1016/j.apenergy.2020.114965
scopus 2-s2.0-85083337705
journal Applied Energy, volume 268, article 114965
grant M4011971.030; M4082160.030
rights © 2020 Elsevier Ltd. All rights reserved.
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Civil engineering
Data Mining
Feature Importance
description For better energy evaluation and management, this study presents a categorical boosting (CatBoost)-based predictive method that estimates building energy consumption by learning from large volumes of multi-source heterogeneous data collected from buildings. CatBoost, a recently developed ensemble learning model, is well suited to handling categorical variables and producing reliable results. As a case study, the method is validated on a multi-dimensional dataset on Seattle's building energy performance provided by the city government, with the aim of estimating the weather-normalized site energy use intensity of buildings and characterizing its non-linear relationship with 12 other potentially influential features. Results from 5-fold cross-validation show that the model predicts energy intensity values accurately, with an R² of 0.897, and outperforms popular machine learning algorithms including random forest and gradient boosting decision tree. Based on a defined threshold, the predicted values can be classified as normal or abnormal energy consumption, reaching an accuracy of 99.32% for outlier detection, which helps flag potential risks at an early stage and develop strategies to improve energy efficiency. Moreover, results from the established model can be interpreted objectively, suggesting that features describing physical and energy characteristics contribute more to energy estimation than environmental features. Because these results explain building energy consumption and efficiency in a data-driven manner, they can serve as guidance for building owners and designers in designing and renovating buildings to achieve better energy-conserving performance.
author2 School of Civil and Environmental Engineering
format Article
author Pan, Yue
Zhang, Limao
title Data-driven estimation of building energy consumption with multi-source heterogeneous data
publishDate 2022
url https://hdl.handle.net/10356/155499
_version_ 1726885520395468800
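
The evaluation protocol summarized in the description (5-fold cross-validation scored by R², followed by threshold-based labelling of predictions as normal or abnormal) can be illustrated with a minimal sketch. This is not the authors' code. To stay dependency-free it swaps in a one-variable least-squares line for the CatBoost model, and the synthetic data, the threshold of 100.0, the "above the threshold means abnormal" rule, and all function names are assumptions made only for illustration.

```python
import random
from statistics import mean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and split the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Least-squares line y = a*x + b.

    Stand-in model only: the record describes a CatBoost regressor,
    which is NOT what this function implements.
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b = y_bar - a * x_bar
    return lambda x: a * x + b


def cross_validate(xs, ys, k=5, threshold=100.0):
    """Return (mean R², mean normal/abnormal accuracy) over k folds."""
    r2s, accs = [], []
    for held_out in k_fold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        y_true = [ys[i] for i in held_out]
        y_pred = [model(xs[i]) for i in held_out]
        r2s.append(r2_score(y_true, y_pred))
        # Assumed rule: a value above the threshold counts as "abnormal".
        # A hit means the prediction and the true value get the same label.
        hits = sum((t > threshold) == (p > threshold)
                   for t, p in zip(y_true, y_pred))
        accs.append(hits / len(y_true))
    return mean(r2s), mean(accs)


# Synthetic illustration data (hypothetical, not the Seattle dataset).
rng = random.Random(42)
xs = [rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [12.0 * x + 20.0 + rng.gauss(0.0, 5.0) for x in xs]

mean_r2, mean_acc = cross_validate(xs, ys, k=5, threshold=100.0)
print(f"mean R2 = {mean_r2:.3f}, mean threshold accuracy = {mean_acc:.3f}")
```

To get closer to the setup the record describes, `fit_line` could be replaced by any regressor with fit and predict steps, such as a gradient-boosted tree model; the fold splitting and scoring would stay the same.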