IMPLEMENTATION OF TOPOLOGICAL DATA ANALYSIS AND SUPPORT VECTOR MACHINE FOR MNIST DATASET CLASSIFICATION

Bibliographic Details
Main Author: Nilam Sari, Nur
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/83398
Institution: Institut Teknologi Bandung
id id-itb.:83398
timestamp 2024-08-09T10:09:03Z
keywords MNIST dataset, persistence barcode, feature extraction, Support Vector Machine
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Advances in information technology and artificial intelligence have fostered innovation in pattern recognition, particularly on the MNIST dataset, a classic collection of handwritten digits. MNIST comprises two main components: image data X and labels y. This research explores the application of topological data analysis, specifically persistence barcode analysis, to this dataset. Classification is performed with a support vector machine using a Radial Basis Function (RBF) kernel. Each digit in the MNIST dataset is represented as a 28x28 matrix whose elements range from 0 to 255. The preprocessing steps are: converting the grayscale matrices to binary, skeletonization using the Zhang-Suen thinning method, forming embedded graphs, determining filtration values, and constructing persistence barcodes. Features are extracted from the persistence barcodes using the Adcock-Carlsson coordinates method. To enhance accuracy, each image is processed in four orientations (north, south, west, east), yielding 32 extracted features per image. These features serve as inputs to the classification algorithm. The MNIST dataset is divided into training data (80%, 56,000 samples) and test data (20%, 14,000 samples). The chosen parameters are a gamma value of 0.006551285568595509 and a C value of 138.94954943731375. With this pipeline, the accuracy achieved on the test data is 70%.
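The feature extraction step can be illustrated with a minimal sketch. The record does not say which Adcock-Carlsson coordinates the thesis uses, so the four polynomial features below are an assumption (a commonly used choice), as are the function names and the split of the 32 features into 4 orientations x 2 homology dimensions x 4 coordinates. Bars are assumed to be finite (birth, death) pairs, with any infinite bar already capped at the largest filtration value.

```python
from typing import Dict, List, Tuple

Bar = Tuple[float, float]  # (birth, death), both finite


def adcock_carlsson_features(bars: List[Bar]) -> List[float]:
    """Four polynomial features of a barcode (assumed choice of coordinates)."""
    if not bars:
        return [0.0, 0.0, 0.0, 0.0]
    d_max = max(d for _, d in bars)  # largest death value in this barcode
    f1 = sum(b * (d - b) for b, d in bars)
    f2 = sum((d_max - d) * (d - b) for b, d in bars)
    f3 = sum(b ** 2 * (d - b) ** 4 for b, d in bars)
    f4 = sum((d_max - d) ** 2 * (d - b) ** 4 for b, d in bars)
    return [f1, f2, f3, f4]


def image_feature_vector(
    barcodes: Dict[str, Tuple[List[Bar], List[Bar]]]
) -> List[float]:
    """Concatenate features over orientations and homology dimensions.

    `barcodes` maps an orientation name to a pair (H0 bars, H1 bars).
    This layout is hypothetical; it is one way to arrive at 32 features.
    """
    features: List[float] = []
    for direction in ("north", "south", "west", "east"):
        h0_bars, h1_bars = barcodes[direction]
        features.extend(adcock_carlsson_features(h0_bars))
        features.extend(adcock_carlsson_features(h1_bars))
    return features


# Toy barcodes, not taken from MNIST.
example_bars = [(0.0, 2.0), (1.0, 3.0)]
example_image = {
    name: (example_bars, [])
    for name in ("north", "south", "west", "east")
}
vector = image_feature_vector(example_image)
print(len(vector))
```

A vector built this way would be the per-image input to the RBF-kernel support vector machine described in the abstract.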