Heterogeneous oblique random forest
Decision trees in random forests use a single feature in non-leaf nodes to split the data. Such splitting results in axis-parallel decision boundaries which may fail to exploit the geometric structure in the data. In oblique decision trees, an oblique hyperplane is employed instead of an axis-parallel hyperplane. Trees with such hyperplanes can better exploit the geometric structure to increase the accuracy of the trees and reduce the depth. The present realizations of oblique decision trees do not evaluate many promising oblique splits to select the best. In this paper, we propose a random forest of heterogeneous oblique decision trees that employ several linear classifiers at each non-leaf node on some top ranked partitions which are obtained via one-vs-all and two-hyperclasses based approaches and ranked based on ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper are the top 3 ranked classifiers with the heterogeneous oblique random forest being statistically better than all 189 classifiers in the literature.
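The node-splitting idea in the abstract can be sketched in a few lines: at a non-leaf node, form a one-vs-all hyperclass partition, fit a linear classifier to it, and keep the oblique hyperplane whose induced split has the lowest weighted Gini impurity. The sketch below is illustrative only, not the authors' implementation: it uses a plain least-squares fit as a stand-in for the several linear classifiers the paper evaluates, and runs on synthetic two-cluster data.

```python
import numpy as np

def gini(labels):
    """Gini impurity of a label array."""
    if len(labels) == 0:
        return 0.0
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def oblique_split(X, y, target_class):
    """One candidate oblique split: a one-vs-all hyperclass partition for
    `target_class`, separated by a least-squares linear classifier
    (a simple stand-in for the linear classifiers used in the paper)."""
    # Hyperclass labels: +1 for the target class, -1 for the rest.
    t = np.where(y == target_class, 1.0, -1.0)
    # Fit hyperplane weights (with a bias column) by least squares.
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(Xb, t, rcond=None)
    # The oblique decision: which side of the hyperplane each sample falls on.
    side = Xb @ w >= 0
    left, right = y[side], y[~side]
    score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
    return w, side, score

# Synthetic data: two well-separated Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)

# Evaluate every one-vs-all candidate and keep the lowest weighted Gini.
best = min((oblique_split(X, y, c) for c in np.unique(y)), key=lambda r: r[2])
print(round(best[2], 3))
```

In the paper the candidate partitions are first ranked (by ideal Gini score and cluster separability) so that only top-ranked ones are evaluated; the sketch skips that ranking step and evaluates all one-vs-all candidates directly.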
Saved in:
Main Authors: Katuwal, Rakesh; Suganthan, Ponnuthurai Nagaratnam; Zhang, Le
Other Authors: School of Electrical and Electronic Engineering
Format: Article
Language: English
Published: 2020
Subjects: Engineering::Electrical and electronic engineering; Benchmarking; Classifiers
Online Access: https://hdl.handle.net/10356/138843
Institution: Nanyang Technological University
Language: English
id: sg-ntu-dr.10356-138843
record_format: dspace
spelling: sg-ntu-dr.10356-1388432021-02-04T07:07:07Z Heterogeneous oblique random forest Katuwal, Rakesh Suganthan, Ponnuthurai Nagaratnam Zhang, Le School of Electrical and Electronic Engineering Engineering::Electrical and electronic engineering Benchmarking Classifiers Decision trees in random forests use a single feature in non-leaf nodes to split the data. Such splitting results in axis-parallel decision boundaries which may fail to exploit the geometric structure in the data. In oblique decision trees, an oblique hyperplane is employed instead of an axis-parallel hyperplane. Trees with such hyperplanes can better exploit the geometric structure to increase the accuracy of the trees and reduce the depth. The present realizations of oblique decision trees do not evaluate many promising oblique splits to select the best. In this paper, we propose a random forest of heterogeneous oblique decision trees that employ several linear classifiers at each non-leaf node on some top ranked partitions which are obtained via one-vs-all and two-hyperclasses based approaches and ranked based on ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper are the top 3 ranked classifiers with the heterogeneous oblique random forest being statistically better than all 189 classifiers in the literature. Accepted version 2020-05-13T05:46:39Z 2020-05-13T05:46:39Z 2019 Journal Article Katuwal, R., Suganthan, P. N., & Zhang, L. (2019). Heterogeneous oblique random forest. Pattern Recognition, 99, 107078-. doi:10.1016/j.patcog.2019.107078 0031-3203 https://hdl.handle.net/10356/138843 10.1016/j.patcog.2019.107078 2-s2.0-85073635825 99 1 14 en Pattern Recognition © 2019 Elsevier Ltd. All rights reserved. This paper was published in Pattern Recognition and is made available with permission of Elsevier Ltd. application/pdf
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Engineering::Electrical and electronic engineering Benchmarking Classifiers
spellingShingle: Engineering::Electrical and electronic engineering Benchmarking Classifiers Katuwal, Rakesh Suganthan, Ponnuthurai Nagaratnam Zhang, Le Heterogeneous oblique random forest
description: Decision trees in random forests use a single feature in non-leaf nodes to split the data. Such splitting results in axis-parallel decision boundaries which may fail to exploit the geometric structure in the data. In oblique decision trees, an oblique hyperplane is employed instead of an axis-parallel hyperplane. Trees with such hyperplanes can better exploit the geometric structure to increase the accuracy of the trees and reduce the depth. The present realizations of oblique decision trees do not evaluate many promising oblique splits to select the best. In this paper, we propose a random forest of heterogeneous oblique decision trees that employ several linear classifiers at each non-leaf node on some top ranked partitions which are obtained via one-vs-all and two-hyperclasses based approaches and ranked based on ideal Gini scores and cluster separability. The oblique hyperplane that optimizes the impurity criterion is then selected as the splitting hyperplane for that node. We benchmark 190 classifiers on 121 UCI datasets. The results show that the oblique random forests proposed in this paper are the top 3 ranked classifiers with the heterogeneous oblique random forest being statistically better than all 189 classifiers in the literature.
author2: School of Electrical and Electronic Engineering
author_facet: School of Electrical and Electronic Engineering Katuwal, Rakesh Suganthan, Ponnuthurai Nagaratnam Zhang, Le
format: Article
author: Katuwal, Rakesh Suganthan, Ponnuthurai Nagaratnam Zhang, Le
author_sort: Katuwal, Rakesh
title: Heterogeneous oblique random forest
title_short: Heterogeneous oblique random forest
title_full: Heterogeneous oblique random forest
title_fullStr: Heterogeneous oblique random forest
title_full_unstemmed: Heterogeneous oblique random forest
title_sort: heterogeneous oblique random forest
publishDate: 2020
url: https://hdl.handle.net/10356/138843
_version_: 1692012903471775744