Applying data augmentation technique on blast-induced overbreak prediction: Resolving the problem of data shortage and data imbalance
Blast-induced overbreak in tunnels can cause severe damage and has therefore been a main concern in tunnel blasting. Researchers have developed many machine learning-based models to predict overbreak. Collecting overbreak data manually, however, can be challenging and might obtain insufficient or po...
Main Authors: | He, Biao; Armaghani, Danial Jahed; Lai, Sai Hin; Samui, Pijush; Mohamad, Edy Tonnizam |
---|---|
Format: | Article |
Published: | Elsevier 2024 |
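Subjects: | QA75 Electronic computers. Computer science; TA Engineering (General). Civil engineering (General); TD Environmental technology. Sanitary engineering |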
Online Access: | http://eprints.um.edu.my/44315/ https://doi.org/10.1016/j.eswa.2023.121616 |
Institution: | Universiti Malaya |
id |
my.um.eprints.44315 |
---|---|
record_format |
eprints |
spelling |
my.um.eprints.44315 2024-06-14T02:49:40Z http://eprints.um.edu.my/44315/ Elsevier 2024 Article PeerReviewed He, Biao and Armaghani, Danial Jahed and Lai, Sai Hin and Samui, Pijush and Mohamad, Edy Tonnizam (2024) Applying data augmentation technique on blast-induced overbreak prediction: Resolving the problem of data shortage and data imbalance. Expert Systems with Applications, 237 (C). ISSN 0957-4174, DOI https://doi.org/10.1016/j.eswa.2023.121616 |
institution |
Universiti Malaya |
building |
UM Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaya |
content_source |
UM Research Repository |
url_provider |
http://eprints.um.edu.my/ |
topic |
QA75 Electronic computers. Computer science; TA Engineering (General). Civil engineering (General); TD Environmental technology. Sanitary engineering |
description |
Blast-induced overbreak in tunnels can cause severe damage and has therefore been a main concern in tunnel blasting. Researchers have developed many machine learning-based models to predict overbreak. Collecting overbreak data manually, however, can be challenging and may yield insufficient or poorly structured data. This study therefore uses a deep generative model, the Conditional Tabular Generative Adversarial Network (CTGAN), to establish an acceptable dataset for overbreak prediction. The CTGAN model was applied to overbreak data collected from paired tunnels: a left-line tunnel and a right-line tunnel. The overbreak dataset collected from the left-line tunnel, designated as the true dataset, served to train the CTGAN model. The trained CTGAN model then generated a synthetic overbreak dataset. Statistical approaches verified the similarity between the true and synthetic datasets; machine learning-based approaches verified the feasibility of using the synthetic dataset to train overbreak prediction models. Lastly, this study clarified how to resolve the problems of data shortage and data imbalance by leveraging the CTGAN model. The results show that the CTGAN model can effectively generate a high-quality synthetic overbreak dataset. The synthetic overbreak dataset not only largely retains the properties of the true dataset but also effectively enhances its diversity. Integrating the true and synthetic overbreak datasets can largely resolve the problems of data shortage and data imbalance in overbreak prediction. The findings of this study therefore highlight this approach as a promising way to address this kind of engineering problem. |
format |
Article |
author |
He, Biao; Armaghani, Danial Jahed; Lai, Sai Hin; Samui, Pijush; Mohamad, Edy Tonnizam |
author_facet |
He, Biao; Armaghani, Danial Jahed; Lai, Sai Hin; Samui, Pijush; Mohamad, Edy Tonnizam |
author_sort |
He, Biao |
title |
Applying data augmentation technique on blast-induced overbreak prediction: Resolving the problem of data shortage and data imbalance |
publisher |
Elsevier |
publishDate |
2024 |
url |
http://eprints.um.edu.my/44315/ https://doi.org/10.1016/j.eswa.2023.121616 |
_version_ |
1805881158019317760 |