Utilization of filter feature selection with support vector machine for tumours classification

Owing to rapid advances in technology, machine learning is now widely used to solve cancer classification problems. Classification performance depends heavily on the quality of the input features. With the explosive increase in the number of features in high-dimensional data, the occurrence of ambiguous s...

Bibliographic Details
Main Authors: Tengku Mazlin, T. A. H., Sallehuddin, R., Zuriahati, M. Y.
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.utm.my/id/eprint/89601/1/RoselinaSallehuddin2019_UtilizationofFilterFeatureSelection.pdf
http://eprints.utm.my/id/eprint/89601/
http://www.dx.doi.org/10.1088/1757-899X/551/1/012062
Institution: Universiti Teknologi Malaysia
id my.utm.89601
record_format eprints
spelling my.utm.89601 2021-02-09T05:01:30Z http://eprints.utm.my/id/eprint/89601/ Conference or Workshop Item PeerReviewed application/pdf en Tengku Mazlin, T. A. H. and Sallehuddin, R. and Zuriahati, M. Y. (2020) Utilization of filter feature selection with support vector machine for tumours classification. In: International Conference on Green Engineering Technology and Applied Computing 2019 (IConGETech 2019) and International Conference on Applied Computing 2019 (ICAC 2019), 4-5 Feb 2019, Eastin Hotel Makkasan Bangkok, Thailand. http://www.dx.doi.org/10.1088/1757-899X/551/1/012062
institution Universiti Teknologi Malaysia
building UTM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Teknologi Malaysia
content_source UTM Institutional Repository
url_provider http://eprints.utm.my/
language English
topic QA75 Electronic computers. Computer science
description Owing to rapid advances in technology, machine learning is now widely used to solve cancer classification problems. Classification performance depends heavily on the quality of the input features. With the explosive increase in the number of features in high-dimensional data, ambiguous samples and data redundancy lead directly to poor classification accuracy. This paper therefore applies filter feature selection using four filter methods (Information Gain, Gain Ratio, Chi-Squared and Relief-F), which rank attributes in order to remove irrelevant and redundant features and to evaluate the significance and correlation of the input data. Classification is then performed with a Support Vector Machine (SVM) to measure accuracy as a function of the number of selected features. Performance is validated on the standard Breast Cancer dataset from the UCI repository, which consists of 286 instances. Accuracy, sensitivity, specificity and the area under the receiver operating characteristic curve (AUC) are used to assess the SVM classifier under the four filter methods. The experimental results show that Gain Ratio improves the accuracy of SVM classification compared with Information Gain, Chi-Squared and Relief-F when classifying breast cancer data with only a small number of selected features.
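The attribute ranking described in the abstract can be illustrated with a minimal Python sketch of two of the four filter scores, Information Gain and Gain Ratio, for categorical attributes. This is not the authors' implementation: the record does not say which software was used, the function names (entropy, information_gain, gain_ratio, rank_features) are hypothetical, and the toy columns below are invented for illustration rather than taken from the UCI Breast Cancer data. The SVM classification step and the Chi-Squared and Relief-F scores are not shown.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def information_gain(feature, labels):
    """Reduction in class entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature):
        # Class labels of the samples that take this feature value.
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the entropy of the feature (split information)."""
    split_info = entropy(feature)
    if split_info == 0.0:
        # A constant feature carries no information; avoid dividing by zero.
        return 0.0
    return information_gain(feature, labels) / split_info


def rank_features(columns, labels, score=gain_ratio):
    """Return (name, score) pairs sorted from most to least relevant."""
    scores = {name: score(values, labels) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data invented for illustration only (assumption, not from the paper).
toy_columns = {
    "feature_a": ["x", "x", "y", "y", "x", "y"],  # matches the class exactly
    "feature_b": ["p", "q", "p", "q", "p", "q"],  # weakly related to the class
    "feature_c": ["k", "k", "k", "k", "k", "k"],  # constant, hence irrelevant
}
toy_labels = ["pos", "pos", "neg", "neg", "pos", "neg"]

ranking = rank_features(toy_columns, toy_labels)
for name, value in ranking:
    print(f"{name}: {value:.4f}")
```

In a filter approach such as the one the abstract describes, only the top-ranked attributes from a list like this would be passed to the classifier, so the ranking is computed once and independently of the SVM.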
format Conference or Workshop Item
author Tengku Mazlin, T. A. H.
Sallehuddin, R.
Zuriahati, M. Y.
title Utilization of filter feature selection with support vector machine for tumours classification
publishDate 2020
url http://eprints.utm.my/id/eprint/89601/1/RoselinaSallehuddin2019_UtilizationofFilterFeatureSelection.pdf
http://eprints.utm.my/id/eprint/89601/
http://www.dx.doi.org/10.1088/1757-899X/551/1/012062