Modeling flood susceptibility using data-driven approaches of naïve Bayes tree, alternating decision tree, and random forest methods

Bibliographic Details
Main Authors: Chen, W., Li, Y., Xue, W., Shahabi, H., Li, S., Hong, H., Wang, X., Bian, H., Zhang, S., Pradhan, B., Ahmad, B. B.
Format: Article
Published: Elsevier B. V. 2020
Online Access: http://eprints.utm.my/id/eprint/86440/
https://dx.doi.org/10.1016/j.scitotenv.2019.134979
Institution: Universiti Teknologi Malaysia
Description
Summary: Floods are one of the most devastating types of disasters that cause loss of lives and property worldwide each year. This study aimed to evaluate and compare the prediction capability of the naïve Bayes tree (NBTree), alternating decision tree (ADTree), and random forest (RF) methods for the spatial prediction of flood occurrence in the Quannan area, China. A flood inventory map with 363 flood locations was produced and partitioned into training and validation datasets through random selection with a ratio of 70/30. The spatial flood database was constructed using thirteen flood explanatory factors. The probability certainty factor (PCF) method was used to analyze the correlation between the factors and flood occurrences. Consequently, three flood susceptibility maps were produced using the NBTree, ADTree, and RF methods. Finally, the area under the curve (AUC) and statistical measures were used to validate the flood susceptibility models. The results indicated that the RF method is an efficient and reliable model in flood susceptibility assessment, with the highest AUC values, positive predictive rate, negative predictive rate, sensitivity, specificity, and accuracy for the training (0.951, 0.892, 0.941, 0.945, 0.886, and 0.915, respectively) and validation (0.925, 0.851, 0.938, 0.945, 0.835, and 0.890, respectively) datasets.
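Note: The five statistical measures named in the summary are conventionally derived from a binary confusion matrix (flood versus non-flood). The sketch below is a minimal illustration that assumes the standard textbook definitions; the record does not state the exact formulas the authors used. The function name `classification_measures` and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_measures(tp, fp, tn, fn):
    """Return five confusion-matrix measures using standard definitions.

    tp: flood locations predicted as flood
    fp: non-flood locations predicted as flood
    tn: non-flood locations predicted as non-flood
    fn: flood locations predicted as non-flood
    """
    return {
        "positive_predictive_rate": tp / (tp + fp),
        "negative_predictive_rate": tn / (tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts chosen only for illustration (not from the study).
measures = classification_measures(tp=90, fp=10, tn=85, fn=15)
for name, value in measures.items():
    print(f"{name}: {value:.3f}")
```

AUC is not covered here because it is computed from ranked susceptibility scores across all thresholds rather than from a single confusion matrix.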