Performance comparison of feature selection methods for prediction in medical data
Main Authors: | , , , |
---|---|
Format: | Conference or Workshop Item |
Language: | English |
Published: | Springer Nature, 2023 |
Online Access: | http://irep.iium.edu.my/105807/6/105807_Performance%20comparison%20of%20feature%20selection.PDF http://irep.iium.edu.my/105807/7/105807_Performance%20comparison%20of%20feature%20selection_Scopus.pdf http://irep.iium.edu.my/105807/ https://link.springer.com/chapter/10.1007/978-981-99-0405-1_7 https://doi.org/10.1007/978-981-99-0405-1_7 |
Institution: | Universiti Islam Antarabangsa Malaysia |
Summary: | Along with technological advancement, the application of machine learning algorithms in industry, notably in the medical field, has grown and progressed quickly. Medical databases commonly contain a great deal of information about patients' medical histories and conditions, and it is challenging to identify and extract the information that is relevant and meaningful for machine learning modelling. Moreover, the efficacy of a predictive machine learning algorithm can be enhanced by using only useful and pertinent information. Hence, feature selection is proposed to determine the significant features, and it should be fully utilized when building machine learning models. This study analyzes filter, wrapper, and embedded feature selection methods for medical data with two predictive machine learning algorithms, Random Forest and CatBoost. The experiment evaluates the performance of the models with and without feature selection. According to the results, CatBoost with recursive feature elimination (RFE) shows the best performance in comparison to Random Forest with other feature selection methods. |
---|