THE DEVELOPMENT OF A SYNTHETIC DATASET USING DEEP GENERATIVE MODEL TO IMPROVE THE PERFORMANCE OF DEEP LEARNING-BASED BRAIN TUMOR MRI IMAGE CLASSIFICATION

Bibliographic Details
Main Author: Fajriati MS. Musa, Angky
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/86560
Institution: Institut Teknologi Bandung
Description
Summary: Medical image classification based on deep learning models shows excellent results. However, public medical image datasets remain scarce, including for brain tumors, and the data that are available suffer from class imbalance. One way to address both problems is to increase the amount of data using machine learning. This study therefore developed a synthetic dataset to enlarge the training set used when training deep learning-based classification models. The research uses a brain tumor MRI dataset of 3064 images covering three tumor types. We chose the GAN model to produce the synthetic data and developed nine GAN models, one per combination of tumor type and imaging plane, so that each GAN focuses on learning a single tumor type in a single imaging plane. We evaluated the synthetic data from the GAN models both quantitatively and qualitatively. The results show that a modified architecture using a Residual Network for the discriminator and generator produces a synthetic dataset that captures the tumor, one of the most important image features, better than a Deep Convolutional Network does. To improve the training process of the GAN models, we then added the gradient-penalty method. Adding the gradient penalty reduced the corrupted features in the synthetic dataset; the drawback is that the synthetic dataset has a higher FID score and a lower Inception Score. The next step was to add the synthetic data to the training set of the deep learning classification models. The best results were obtained with synthetic data produced by the ResWGAN-GP model: accuracy improved from 95% to 96% with InceptionV3, from 96% to 97% with DenseNet121, and from 92% to 93% with MobileNetV2. Accuracy improved because the models trained with the additional synthetic data predicted the meningioma and glioma classes better.
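The step of adding synthetic data to the training set can be illustrated with a minimal Python sketch. The abstract does not say how many synthetic images were added per class, so the rule used here (top up each class to the size of the largest class) is an assumption, as are the function names `synthetic_quota` and `augment_training_set`, the file-name style identifiers, and the toy class sizes, none of which come from the thesis.

```python
import random


def synthetic_quota(class_counts):
    """Synthetic images needed per class to match the largest class (assumed rule)."""
    target = max(class_counts.values())
    return {label: target - n for label, n in class_counts.items()}


def augment_training_set(real, synthetic, seed=0):
    """Top up each class in `real` with samples drawn from `synthetic`.

    real, synthetic: dict mapping class label -> list of image identifiers.
    Returns a new dict; the inputs are not modified.
    """
    rng = random.Random(seed)
    quota = synthetic_quota({label: len(items) for label, items in real.items()})
    augmented = {}
    for label, items in real.items():
        pool = synthetic.get(label, [])
        # Never request more synthetic images than the GAN pool provides.
        k = min(quota[label], len(pool))
        augmented[label] = list(items) + rng.sample(pool, k)
    return augmented


# Toy class sizes for illustration only; these are NOT the thesis's counts.
real = {
    "meningioma": [f"real_meningioma_{i}" for i in range(7)],
    "glioma": [f"real_glioma_{i}" for i in range(14)],
    "pituitary": [f"real_pituitary_{i}" for i in range(9)],
}
# Hypothetical pool of GAN outputs, ten per class.
synthetic = {label: [f"gan_{label}_{i}" for i in range(10)] for label in real}

augmented = augment_training_set(real, synthetic)
print({label: len(items) for label, items in augmented.items()})
# {'meningioma': 14, 'glioma': 14, 'pituitary': 14}
```

Only the training split is augmented in this sketch; the evaluation data would stay purely real so that accuracy figures such as those reported above remain comparable before and after augmentation.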