SWIN transformer for diabetic retinopathy detection


Bibliographic Details
Main Author: Ang, Elroy Wei Yong
Other Authors: Jagath C Rajapakse
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Subjects:
Online Access:https://hdl.handle.net/10356/165923
Institution: Nanyang Technological University
Description
Summary: In the field of Machine Learning, Convolutional Neural Networks (CNNs) have been dominant in image classification tasks. Transformer models were first introduced in 2017 for Natural Language Processing, and further development led to Vision Transformers (ViTs) for image classification. In the medical field, one of the many use cases of Artificial Intelligence is disease detection. Among eye diseases, Diabetic Retinopathy (DR) is a common condition for which CNNs have been applied to aid detection and classification. While recent comparisons have shown that ViTs outperform CNNs on the ImageNet benchmark, no such comparison has been performed on a DR dataset. In this paper, we compare the performance of ViTs and their recent variants against CNNs on detecting DR using a single standardized dataset. The training dataset is obtained from Kaggle, and two separate external validation datasets are used. We demonstrate that the SWIN Transformer outperforms the other architectures on this problem.