Heterogeneous oblique random forest

Decision trees in random forests use a single feature in non-leaf nodes to split the data. Such splitting results in axis-parallel decision boundaries which may fail to exploit the geometric structure in the data. In oblique decision trees, an oblique hyperplane is employed instead of an axis-parallel hyperplane. Trees with such hyperplanes can better exploit the geometric structure to increase the accuracy of the trees and reduce their depth. The present realizations of oblique decision trees do not evaluate many promising oblique splits to select the best. In this paper, we propose a random forest of heterogeneous oblique decision trees that employ several linear classifiers at each non-leaf node on some top-ranked partitions, which are obtained via one-vs-all and two-hyperclasses-based approaches and ranked based on ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper are the top three ranked classifiers, with the heterogeneous oblique random forest being statistically better than all 189 other classifiers in the literature.
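As a rough illustration of the node-splitting idea described in the abstract, the Python sketch below fits a linear classifier on a one-vs-all hyperclass partition and uses its decision boundary as the oblique splitting hyperplane, scored with the Gini impurity. This is only a minimal sketch, not the authors' implementation: the choice of RidgeClassifier, the helper names, and the restriction to one-vs-all partitions are assumptions made for illustration.

# Minimal sketch of an oblique split at a single tree node (not the
# authors' code). A linear classifier is fit on a one-vs-all hyperclass
# partition; its decision boundary serves as the oblique hyperplane and
# the resulting split is scored with the Gini impurity.
import numpy as np
from sklearn.linear_model import RidgeClassifier  # one possible linear base learner

def gini(y):
    # Gini impurity of a label vector; an empty child contributes 0.
    if len(y) == 0:
        return 0.0
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def oblique_split(X, y, target_class):
    # Binary hyperclass labels for the candidate partition: target_class vs. the rest.
    y_bin = (y == target_class).astype(int)
    clf = RidgeClassifier().fit(X, y_bin)
    # Samples on the positive side of the learned hyperplane go to the left child.
    side = clf.decision_function(X) >= 0
    n = len(y)
    impurity = (side.sum() / n) * gini(y[side]) + ((~side).sum() / n) * gini(y[~side])
    return clf, impurity

# Usage: evaluate each candidate partition and keep the hyperplane with the
# lowest impurity (the paper additionally ranks partitions and mixes several
# classifier types per node).
# best_clf, best_impurity = min((oblique_split(X, y, c) for c in np.unique(y)),
#                               key=lambda t: t[1])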

Bibliographic Details
Main Authors: Katuwal, Rakesh; Suganthan, Ponnuthurai Nagaratnam; Zhang, Le
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Electrical and electronic engineering; Benchmarking; Classifiers
Online Access: https://hdl.handle.net/10356/138843
Institution: Nanyang Technological University
Citation: Katuwal, R., Suganthan, P. N., & Zhang, L. (2019). Heterogeneous oblique random forest. Pattern Recognition, 99, 107078. doi:10.1016/j.patcog.2019.107078
Journal: Pattern Recognition
ISSN: 0031-3203
DOI: 10.1016/j.patcog.2019.107078
Version: Accepted version
Rights: © 2019 Elsevier Ltd. All rights reserved. This paper was published in Pattern Recognition and is made available with permission of Elsevier Ltd.