A novel ensemble decision tree-based CHi-squared Automatic Interaction Detection (CHAID) and multivariate logistic regression models in landslide susceptibility mapping

An ensemble algorithm of data mining decision tree (DT)-based CHi-squared Automatic Interaction Detection (CHAID) is widely used for prediction analysis in a variety of applications. CHAID as a multivariate method has an automatic classification capacity to analyze large numbers of landslide condition...

Bibliographic Details
Main Authors: Althuwaynee, Omar F., Pradhan, Biswajeet, Park, Hyuck Jin, Lee, Jung Hyun
Format: Article
Language: English
Published: Springer 2014
Online Access: http://psasir.upm.edu.my/id/eprint/36217/1/A%20novel%20ensemble%20decision%20tree.pdf
http://psasir.upm.edu.my/id/eprint/36217/
Institution: Universiti Putra Malaysia
id my.upm.eprints.36217
record_format eprints
spelling my.upm.eprints.36217 2015-09-01T04:42:54Z http://psasir.upm.edu.my/id/eprint/36217/ Article PeerReviewed application/pdf en Althuwaynee, Omar F. and Pradhan, Biswajeet and Park, Hyuck Jin and Lee, Jung Hyun (2014) A novel ensemble decision tree-based CHi-squared Automatic Interaction Detection (CHAID) and multivariate logistic regression models in landslide susceptibility mapping. Landslides, 11 (6). pp. 1063-1078. ISSN 1612-510X; ESSN: 1612-5118. DOI: 10.1007/s10346-014-0466-0
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
language English
description An ensemble algorithm based on the data mining decision tree (DT) method CHi-squared Automatic Interaction Detection (CHAID) is widely used for prediction analysis in a variety of applications. As a multivariate method, CHAID has an automatic classification capacity to analyze large numbers of landslide conditioning factors. Moreover, it produces two or more nodes for each independent variable, where every node contains the number of landslide presences or absences (the dependent variable). Other DT methods such as Quick, Unbiased, Efficient Statistical Tree (QUEST) and Classification and Regression Trees (CRT) are not able to produce multi-branch trees. Thus, the main objective of this paper is to use the CHAID method to obtain the best classification fit for each conditioning factor, and then to combine it with logistic regression (LR) to find the corresponding coefficients of the best-fitting function that assesses the optimal terminal nodes. In the first step, a landslide inventory map with 296 landslide locations was extracted from various sources over the Pohang-Kyeong Joo catchment (South Korea). The inventory was then randomly split into two datasets: 70 % was used for training the models, and the remaining 30 % was used for validation. Thirteen landslide conditioning factors were used for the susceptibility modeling. CHAID was then applied and revealed that some conditioning factors, such as altitude, soil drain, soil texture and TWI, appeared as terminal nodes and reflected the best classification fit. Next, the proposed ensemble technique was applied, and the interpretation of the coefficients showed that the relationship between the decision tree branch nodes distance from drain, soil drain, and TWI, respectively, leads to a better assessment of landslide consequences in the current study area. The validation results showed success and prediction rates of 75 and 79 %, respectively. This study demonstrated the efficiency and reliability of the ensemble DT and LR model in landslide susceptibility mapping.
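The two mechanical steps named in the description, choosing a split factor by a chi-squared test and dividing the inventory 70/30, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy records, the factor names `soil_drain` and `aspect`, and all function names are hypothetical. Full CHAID also merges similar categories and compares Bonferroni-adjusted p-values; here both toy factors have two categories, so ranking by the raw Pearson statistic gives the same order as ranking by p-value.

```python
import random


def chi_squared(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def contingency(records, factor, target="landslide"):
    """Count absence (0) and presence (1) of landslides per factor category."""
    categories = sorted({r[factor] for r in records})
    return [
        [sum(1 for r in records if r[factor] == c and r[target] == t)
         for t in (0, 1)]
        for c in categories
    ]


def best_split_factor(records, factors):
    """Return the factor most strongly associated with the target."""
    scores = {f: chi_squared(contingency(records, f)) for f in factors}
    return max(scores, key=scores.get), scores


def split_70_30(items, seed=0):
    """Randomly split items into 70 % training and 30 % validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * 0.7)
    return shuffled[:cut], shuffled[cut:]


def make(drain, aspect, landslide, n):
    """Build n identical toy records (hypothetical data, for illustration)."""
    return [{"soil_drain": drain, "aspect": aspect, "landslide": landslide}
            for _ in range(n)]


# Toy inventory: soil drainage is associated with landslides, aspect is not.
records = (
    make("poor", "N", 1, 25) + make("poor", "S", 1, 25)
    + make("poor", "N", 0, 5) + make("poor", "S", 0, 5)
    + make("good", "N", 0, 25) + make("good", "S", 0, 25)
    + make("good", "N", 1, 5) + make("good", "S", 1, 5)
)

best, scores = best_split_factor(records, ["soil_drain", "aspect"])
train, valid = split_70_30(range(296))  # 296 locations, as in the abstract
print(best, scores, len(train), len(valid))
```

In the ensemble described above, the terminal nodes of the fitted tree would then serve as categorical inputs to a logistic regression; that step is left out here because the record does not give its details.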
format Article
author Althuwaynee, Omar F.
Pradhan, Biswajeet
Park, Hyuck Jin
Lee, Jung Hyun
title A novel ensemble decision tree-based CHi-squared Automatic Interaction Detection (CHAID) and multivariate logistic regression models in landslide susceptibility mapping
publisher Springer
publishDate 2014
url http://psasir.upm.edu.my/id/eprint/36217/1/A%20novel%20ensemble%20decision%20tree.pdf
http://psasir.upm.edu.my/id/eprint/36217/