Heterogeneous oblique random forest
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/138843
Institution: Nanyang Technological University
Summary: Decision trees in random forests use a single feature in each non-leaf node to split the data. Such splitting yields axis-parallel decision boundaries, which may fail to exploit the geometric structure of the data. Oblique decision trees employ an oblique hyperplane instead of an axis-parallel one; trees with such hyperplanes can better exploit that structure, increasing accuracy and reducing tree depth. Existing realizations of oblique decision trees do not evaluate many promising oblique splits when selecting the best one. In this paper, we propose a random forest of heterogeneous oblique decision trees that trains several linear classifiers at each non-leaf node on the top-ranked class partitions, which are obtained via one-vs-all and two-hyperclasses-based approaches and ranked by ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper are the three top-ranked classifiers, with the heterogeneous oblique random forest being statistically better than the other 189 classifiers in the literature.
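To make the core idea concrete, the sketch below illustrates one oblique split at a single tree node. This is not the authors' implementation: it covers only the one-vs-all partitions (not the two-hyperclasses partitions or the ranking step), and it uses a least-squares linear separator as a stand-in for the several linear classifiers the paper evaluates. The candidate hyperplane with the lowest weighted Gini impurity is kept, mirroring the abstract's selection rule.

```python
import numpy as np

def gini(labels):
    # Gini impurity of a label array: 1 - sum of squared class proportions.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def oblique_split(X, y):
    """Pick an oblique split at one node (illustrative sketch only).

    For each one-vs-all grouping of the classes, fit a linear separator
    by least squares on +/-1 targets, split the samples by the side of
    the resulting hyperplane, and score the split by weighted Gini
    impurity. Returns (impurity, hyperplane weights, left-side mask)
    for the best candidate, or None if every candidate is degenerate.
    """
    n, d = X.shape
    Xb = np.hstack([X, np.ones((n, 1))])  # append a bias column
    best = None
    for c in np.unique(y):
        t = np.where(y == c, 1.0, -1.0)             # one-vs-all targets
        w, *_ = np.linalg.lstsq(Xb, t, rcond=None)  # linear separator
        left = Xb @ w >= 0.0                        # side of the hyperplane
        if left.all() or not left.any():
            continue                                # degenerate: no split
        score = (left.mean() * gini(y[left])
                 + (~left).mean() * gini(y[~left]))
        if best is None or score < best[0]:
            best = (score, w, left)
    return best
```

On linearly separable toy data, e.g. `oblique_split(np.array([[0.], [1.], [4.], [5.]]), np.array([0, 0, 1, 1]))`, the returned impurity is 0 and the mask cleanly separates the two classes. In the paper's forest, a node would instead compare hyperplanes from several different classifier families on the top-ranked partitions before committing to one.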