Data-driven estimation of building energy consumption with multi-source heterogeneous data
For better energy evaluation and management, a categorical boosting (CatBoost)-based predictive method is presented to accurately estimate building energy consumption by learning large volumes of multi-source heterogeneous data collected from buildings. To be specific, the newly-developed CatBoost m...
Main Authors: | Pan, Yue; Zhang, Limao |
---|---|
Other Authors: | School of Civil and Environmental Engineering |
Format: | Article |
Language: | English |
Published: | 2022 |
Subjects: | Engineering::Civil engineering; Data Mining; Feature Importance |
Online Access: | https://hdl.handle.net/10356/155499 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-155499 |
---|---|
record_format |
dspace |
spelling |
Record: sg-ntu-dr.10356-155499 (2022-03-02T08:40:53Z). Title: Data-driven estimation of building energy consumption with multi-source heterogeneous data. Authors: Pan, Yue; Zhang, Limao. Affiliation: School of Civil and Environmental Engineering. Subjects: Engineering::Civil engineering; Data Mining; Feature Importance. Funding: Ministry of Education (MOE); Nanyang Technological University. The Ministry of Education Tier 1 Grant, Singapore (No. M4011971.030) and the Start-Up Grant at Nanyang Technological University, Singapore (No. M4082160.030) are acknowledged for their financial support of this research. Record dates: 2022-03-02T08:40:53Z. Issued: 2020. Type: Journal Article. Citation: Pan, Y. & Zhang, L. (2020). Data-driven estimation of building energy consumption with multi-source heterogeneous data. Applied Energy, 268, 114965. DOI: https://dx.doi.org/10.1016/j.apenergy.2020.114965 (10.1016/j.apenergy.2020.114965). ISSN: 0306-2619. Handle: https://hdl.handle.net/10356/155499. Scopus: 2-s2.0-85083337705. Language: en. Rights: © 2020 Elsevier Ltd. All rights reserved. |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Civil engineering; Data Mining; Feature Importance |
description |
For better energy evaluation and management, a categorical boosting (CatBoost)-based predictive method is presented to accurately estimate building energy consumption by learning from large volumes of multi-source heterogeneous data collected from buildings. Specifically, the recently developed CatBoost model, an ensemble learning method, is well suited to handling categorical variables and producing reliable results. As a case study, the proposed method is validated on a multi-dimensional dataset on Seattle's building energy performance provided by the city government, with the aim of estimating the weather-normalized site energy use intensity of buildings and characterizing its non-linear relationship with 12 other potentially influential features. Results from 5-fold cross-validation demonstrate that the model predicts the value of energy intensity precisely, reaching an R² of 0.897 and outperforming popular machine learning algorithms including random forest and gradient boosting decision tree. Based on a defined threshold, the predicted values can be classified as normal or abnormal energy consumption, reaching an accuracy of 99.32% for outlier detection, which helps flag potential risks at an early stage and develop strategies to enhance energy efficiency. Moreover, results from the established model can be interpreted objectively, suggesting that features describing physical and energy characteristics contribute more to energy estimation than environmental features. Because these results explain building energy consumption and efficiency in a data-driven manner, they can eventually serve as guidance for building owners and designers in designing and renovating buildings to achieve better energy-conserving performance. |
author2 |
School of Civil and Environmental Engineering |
author_facet |
School of Civil and Environmental Engineering; Pan, Yue; Zhang, Limao |
format |
Article |
author |
Pan, Yue; Zhang, Limao |
author_sort |
Pan, Yue |
title |
Data-driven estimation of building energy consumption with multi-source heterogeneous data |
publishDate |
2022 |
url |
https://hdl.handle.net/10356/155499 |
_version_ |
1726885520395468800 |
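
The evaluation steps named in the description (5-fold cross-validation, an R² score, and a threshold that separates normal from abnormal consumption) can be illustrated with a minimal sketch. This is not the authors' code: the per-category mean model is only a stand-in where a CatBoost regressor would go, and the synthetic data, the `building_type` field, and the threshold of 150 are all hypothetical values chosen for illustration.

```python
import random
from statistics import mean


def k_fold_indices(n, k=5, seed=0):
    """Shuffle range(n) and split it into k nearly equal, disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_group_mean(rows, targets):
    """Stand-in model (NOT CatBoost): predict the mean target per category."""
    groups = {}
    for row, t in zip(rows, targets):
        groups.setdefault(row["building_type"], []).append(t)
    overall = mean(targets)
    table = {key: mean(vals) for key, vals in groups.items()}
    # Unseen categories fall back to the overall training mean.
    return lambda row: table.get(row["building_type"], overall)


def cross_validate(rows, targets, fit, k=5, seed=0):
    """Return out-of-fold predictions aligned with `targets`."""
    preds = [None] * len(rows)
    for held_out in k_fold_indices(len(rows), k, seed):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = fit([rows[i] for i in train], [targets[i] for i in train])
        for i in held_out:
            preds[i] = model(rows[i])
    return preds


def label_abnormal(values, threshold):
    """Flag every value above the threshold as abnormal (True)."""
    return [v > threshold for v in values]


def accuracy(labels_a, labels_b):
    """Fraction of positions where the two label lists agree."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


def demo():
    """Run the pipeline on synthetic data (hypothetical values throughout)."""
    rng = random.Random(1)
    base = {"office": 60.0, "hospital": 200.0, "warehouse": 30.0}
    rows, targets = [], []
    for kind, intensity in base.items():
        for _ in range(30):
            rows.append({"building_type": kind})
            targets.append(intensity + rng.uniform(-5.0, 5.0))
    preds = cross_validate(rows, targets, fit_group_mean, k=5)
    threshold = 150.0  # hypothetical cut-off, not taken from the paper
    r2 = r2_score(targets, preds)
    acc = accuracy(label_abnormal(targets, threshold),
                   label_abnormal(preds, threshold))
    return r2, acc


if __name__ == "__main__":
    print(demo())
```

Because `cross_validate` accepts any `fit` callable that returns a predictor, a gradient-boosting model could be swapped in without changing the scoring or thresholding code.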