Ensemble learning for multidimensional poverty classification

The poverty rate in Malaysia is determined through financial or income indices and measurements. Periodic measurements are conducted through the Household Expenditure and Income Survey (HEIS) twice every five years, and the results are subsequently used to generate a Poverty Line Income (PLI) to determine pove...

Bibliographic Details
Main Authors: Azuraliza Abu Bakar, Rusnita Hamdan, Nor Samsiah Sani
Format: Article
Language: English
Published: Penerbit Universiti Kebangsaan Malaysia 2020
Online Access: http://journalarticle.ukm.my/14778/1/ARTIKEL%2024.pdf
http://journalarticle.ukm.my/14778/
http://www.ukm.my/jsm/malay_journals/jilid49bil2_2020/KandunganJilid49Bil2_2020.html
Institution: Universiti Kebangsaan Malaysia
id my-ukm.journal.14778
record_format eprints
spelling my-ukm.journal.14778 2020-06-23T01:15:29Z http://journalarticle.ukm.my/14778/ Ensemble learning for multidimensional poverty classification Azuraliza Abu Bakar, Rusnita Hamdan, Nor Samsiah Sani Penerbit Universiti Kebangsaan Malaysia 2020-02 Article PeerReviewed application/pdf en http://journalarticle.ukm.my/14778/1/ARTIKEL%2024.pdf Azuraliza Abu Bakar, Rusnita Hamdan and Nor Samsiah Sani (2020) Ensemble learning for multidimensional poverty classification. Sains Malaysiana, 49 (2). pp. 447-459. ISSN 0126-6039 http://www.ukm.my/jsm/malay_journals/jilid49bil2_2020/KandunganJilid49Bil2_2020.html
institution Universiti Kebangsaan Malaysia
building Tun Sri Lanang Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Kebangsaan Malaysia
content_source UKM Journal Article Repository
url_provider http://journalarticle.ukm.my/
language English
description The poverty rate in Malaysia is determined through financial or income indices and measurements. Periodic measurements are conducted through the Household Expenditure and Income Survey (HEIS) twice every five years, and the results are subsequently used to generate a Poverty Line Income (PLI) that determines poverty levels through statistical methods. Such a uni-dimensional measurement, however, is unable to portray overall deprivation conditions, especially those experienced by the urban population. In addition, the United Nations Development Programme (UNDP) has introduced a set of multidimensional poverty measurements, but these have yet to be applied in the case of Malaysia. Machine Learning (ML) approaches that can produce new poverty measurement methods are therefore of interest, and their use depends on the existence of a rich database on poverty, such as the eKasih database maintained by the Malaysian Government. The goal of this study was to determine whether an ensemble learning method (random forest) can classify poverty, and hence produce a multidimensional poverty indicator, when compared with base learner methods on the eKasih dataset. The Cross-Industry Standard Process for Data Mining (CRISP-DM) was used to ensure that the data mining and ML processes were conducted properly. Besides random forest, we also examined decision tree and general linear methods to benchmark their performance and determine the method with the highest accuracy. Fifteen variables were then ranked using the varImp method to identify the important variables. The analysis showed that Per Capita Income, State, Ethnic, Strata, Religion, Occupation and Education were the most important variables in the classification of poverty, at a reported accuracy of 99% using the random forest algorithm.
format Article
author Azuraliza Abu Bakar, Rusnita Hamdan, Nor Samsiah Sani
title Ensemble learning for multidimensional poverty classification
publisher Penerbit Universiti Kebangsaan Malaysia
publishDate 2020
url http://journalarticle.ukm.my/14778/1/ARTIKEL%2024.pdf
http://journalarticle.ukm.my/14778/
http://www.ukm.my/jsm/malay_journals/jilid49bil2_2020/KandunganJilid49Bil2_2020.html
_version_ 1671340603521105920
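
The workflow summarized in the description (benchmarking a random forest against a decision tree and a linear model, then ranking the variables by importance) can be sketched roughly as follows. This is a minimal illustration in Python with scikit-learn, not the authors' implementation. Several things here are assumptions: the eKasih data are replaced by synthetic data from `make_classification`, logistic regression stands in for the "general linear methods", the impurity-based `feature_importances_` of the random forest is swapped in for the varImp method named in the abstract, and all variable names (`var_0`, `var_1`, ...) are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the eKasih data (assumption): 15 anonymous
# variables and a binary label such as poor / not poor.
X, y = make_classification(
    n_samples=1000,
    n_features=15,
    n_informative=7,
    n_redundant=0,
    random_state=0,
)
feature_names = [f"var_{i}" for i in range(X.shape[1])]  # hypothetical names

# Hold out 30% of the rows for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Ensemble learner plus two single-model baselines for comparison.
models = {
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

# Fit each model and record its accuracy on the held-out rows.
accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    accuracy[name] = model.score(X_test, y_test)

# Rank the 15 variables by the random forest's impurity-based importance
# (a stand-in for varImp, not the same computation).
importances = models["random_forest"].feature_importances_
ranking = sorted(
    zip(feature_names, importances), key=lambda pair: pair[1], reverse=True
)

for name, acc in accuracy.items():
    print(f"{name}: accuracy = {acc:.3f}")
for name, score in ranking[:7]:
    print(f"{name}: importance = {score:.3f}")
```

On real survey data, categorical variables such as State or Occupation would need encoding before fitting, and the accuracies printed here say nothing about the figures reported in the article.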