Multi-stage generative adversarial networks for generating pavement crack images

Bibliographic Details
Main Authors: Han, Chengjia, Ma, Tao, Huyan, Ju, Tong, Zheng, Yang, Handuo, Yang, Yaowen
Other Authors: School of Civil and Environmental Engineering
Format: Article
Language: English
Published: 2024
Online Access: https://hdl.handle.net/10356/180178
Institution: Nanyang Technological University
Description
Summary: Machine learning techniques for computer-vision-based pavement health monitoring have greatly improved the accuracy and efficiency of detecting pavement distress levels and categories. A persistent challenge in this field, however, is sample imbalance, which arises primarily from the scarcity of cracked pavement images and limits the effectiveness of these techniques in road maintenance engineering. To address this issue and enable fast, stable generation of high-quality crack images for engineering purposes, this study proposes two frameworks based on Generative Adversarial Networks (GANs): Multi-Stage GAN-v1 and Multi-Stage GAN-v2. Both frameworks break the complex task of directly generating high-quality images into a series of incremental steps, gradually increasing the image resolution from initially generated low-precision images. Each version consists of multiple sequentially connected generation units, and each unit uses a Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP). In addition, v2 can generate pavement crack images of specified types and simultaneously provide crack segmentation labels, which significantly enhances the practical applicability of the generated data in engineering contexts. In a comprehensive case study, the evaluation results show that both proposed frameworks achieve superior image generation quality. Moreover, ablation experiments in which nine state-of-the-art crack semantic segmentation and object detection networks were trained on both generated and real images demonstrate that the generated images are useful for training pavement distress detection networks.
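
For readers unfamiliar with WGAN-GP, which the summary names as the basis of each generation unit, the critic loss in its general published form is sketched below. This is the standard formulation, not one taken from this record: the symbols D (critic), G (generator), z (latent noise), x (real image), and the penalty weight \lambda are conventional notation assumed here, and the record does not state the hyperparameters or any per-stage modifications used in the paper.

```latex
% General WGAN-GP critic loss (standard form; notation assumed, not from this record)
\begin{align}
  \tilde{x} &= G(z), \qquad
  \hat{x} = \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad
  \epsilon \sim U[0, 1], \\
  L_D &= \mathbb{E}_{\tilde{x}}\big[ D(\tilde{x}) \big]
       - \mathbb{E}_{x}\big[ D(x) \big]
       + \lambda\, \mathbb{E}_{\hat{x}}\Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big], \\
  L_G &= -\, \mathbb{E}_{\tilde{x}}\big[ D(\tilde{x}) \big].
\end{align}
```

The gradient penalty term pushes the norm of the critic's gradient toward 1 at points interpolated between real and generated images, which is the property usually credited for the training stability of WGAN-GP.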