LDSVM: Leukemia Cancer Classification Using Machine Learning
Main Authors: | , , , , |
---|---|
Format: | Other NonPeerReviewed |
Language: | English |
Published: |
Computers, Materials and Continua
2022
|
Subjects: | |
Online Access: | https://repository.ugm.ac.id/283929/1/103.LDSVM-Leukemia-cancer-classification-using-machine-learningComputers-Materials-and-Continua.pdf https://repository.ugm.ac.id/283929/ https://www.techscience.com/cmc/v71n2/45786 |
Institution: | Universitas Gadjah Mada |
Summary: | Leukemia is a blood cancer that affects the bone marrow and lymphatic
tissues and typically involves white blood cells. Leukemia produces an abnormal
number of white blood cells compared with normal blood. Deoxyribonucleic
acid (DNA) microarrays provide reliable medical diagnostic services to help
more patients find the proposed treatment for infections. DNA microarrays
are also known as biochips that consist of microscopic DNA spots attached
to a solid glass surface. Currently, it is difficult to classify cancers using
microarray data. Many data mining techniques have failed because
of the small sample size, which has become more critical for organizations.
Although doctors frequently employ these techniques for cancer diagnosis,
they are not highly effective in improving results. This study proposes a novel
method using machine learning algorithms based on microarrays of leukemia
GSE9476 cells. The main aim was to predict leukemia at its initial stage.
The machine learning algorithms evaluated were decision tree (DT), naive Bayes (NB),
random forest (RF), gradient boosting machine (GBM), linear regression
(LinR), support vector machine (SVM), and a novel ensemble, named LDSVM,
that combines logistic regression (LR), DT, and SVM.
The k-fold cross-validation and grid search optimization
methods were used with the LDSVM model to classify leukemia in patients
and comparatively analyze their impacts. The proposed approach achieved
better accuracy, precision, recall, and F1 scores than the other algorithms.
Furthermore, a comparative assessment of the results demonstrated the
performance of LDSVM. This study aims to successfully predict leukemia in patients
and enhance prediction accuracy in minimum time. Moreover, the synthetic
minority oversampling technique (SMOTE) and principal component analysis
(PCA) were implemented. This makes the records more general and allows the
outcomes to be evaluated well. SMOTE deals with class-imbalanced datasets,
while PCA reduces the feature count with little loss of information, which gives
faster model execution at a lower computation cost. In this study, a novel process
was used to reduce the number of columns so that experiments execute
faster. |
---|
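The summary names SMOTE as the oversampling step but does not describe how it was configured. As an illustration only, the core idea of SMOTE (creating synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours) can be sketched in plain Python as follows. The function name `smote`, its parameters, and the toy data are assumptions made for this sketch and are not taken from the paper.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    minority: list of equal-length feature lists, all from the minority class.
    k: number of nearest minority neighbours considered for each base sample.
    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class with two made-up features (not data from the study).
minority_samples = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
new_samples = smote(minority_samples, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region already covered by the minority class. In practice, oversampling of this kind is normally applied to the training folds only, so that synthetic points do not leak into the validation data during k-fold cross-validation.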