Machine learning methods for classification of COVID-19 exploiting infrared spectroscopy

Bibliographic Details
Main Author: Li, Yina
Other Authors: Lin Zhiping
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/173801
Institution: Nanyang Technological University
Description
Summary: COVID-19 spreads through diverse transmission routes, has a long incubation period, and can reach a large area in a short time, so rapid testing is crucial. In this dissertation, we develop machine learning methods for classifying the infrared spectra of COVID-19 pharyngeal swab samples. Because infrared spectral data are high-dimensional, key features are difficult to extract and the computational cost is high. Reducing the dimensionality of the original data through feature selection and feature transformation is therefore a key step. Dimensionality reduction methods were selected and compared on two batches of data, and COVID-19 detection models were built with machine learning methods. For the first batch, the competitive adaptive reweighted sampling-principal component analysis-support vector machine (CARS-PCA-SVM) model reduces the original dataset to 74 dimensions and achieves the best classification performance, with an accuracy of 83.33%, a sensitivity of 86.75%, and a specificity of 82.29%. In contrast, the genetic algorithm-support vector machine (GA-SVM) model achieves an accuracy of only 71.93%. For the second batch, the PCA-SVM model reduces the original spectral data to 36 dimensions and achieves the best classification performance, with an accuracy of 96.68%, a sensitivity of 95.61%, and a specificity of 97.69%. In contrast, the successive projections algorithm-support vector machine (SPA-SVM) model achieves an accuracy of only 88.44%.
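
The summary reports accuracy, sensitivity, and specificity for each model. As a reading aid, the sketch below shows how these three figures are conventionally derived from binary predictions, treating COVID-19 positive as the positive class. It is not the dissertation's code: the function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are illustrative assumptions and do not come from the thesis data.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Hypothetical helper for illustration; `positive` marks the label
    treated as the positive (here: COVID-19 positive) class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive sample correctly detected
            else:
                fn += 1  # positive sample missed
        else:
            if pred == positive:
                fp += 1  # negative sample flagged as positive
            else:
                tn += 1  # negative sample correctly rejected

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    return BinaryMetrics(
        accuracy=(tp + tn) / (tp + tn + fp + fn),
        sensitivity=tp / (tp + fn),  # true positive rate
        specificity=tn / (tn + fp),  # true negative rate
    )


# Toy labels (made up for illustration, not thesis data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Sensitivity depends only on the positive samples and specificity only on the negative ones, so on a class-imbalanced test set the overall accuracy can fall outside the range spanned by the other two.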