CANCER DETECTION USING PRINCIPAL COMPONENT ANALYSIS AND LONG-SHORT TERM MEMORY

Cancer is one of the most dangerous diseases worldwide. Abnormal cells grow out of control and can invade other tissues, and harmful cancer cells can spread to other parts of the body through the blood. According to WHO (World Health Organization), the biggest cause of death globally that take...

Bibliographic Details
Main Author: T Christopher Sirait, Daniel
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/78057
Institution: Institut Teknologi Bandung
id id-itb.:78057
spelling id-itb.:78057 2023-09-17T07:35:58Z CANCER DETECTION USING PRINCIPAL COMPONENT ANALYSIS AND LONG-SHORT TERM MEMORY T Christopher Sirait, Daniel Indonesia Theses microarray, principal component analysis, deep learning, long short-term memory, CRISP-DM INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/78057 text
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description Cancer is one of the most dangerous diseases worldwide. Abnormal cells grow out of control and can invade other tissues, and harmful cancer cells can spread to other parts of the body through the blood. According to the World Health Organization (WHO), cancer is a leading cause of death globally, taking 10 million lives. Without early diagnosis, the mortality rate will continue to rise every year. One way to detect cancer is microarray technology, which monitors the expression of a very large number of genes simultaneously. The data used in this research cover colon, ovarian, and lung cancer. However, the main obstacle with microarray data is its high dimensionality, which degrades both classification accuracy and processing time. Therefore, an approach is needed that first reduces the dimensionality and then applies a classification technique, so that the microarray classification scheme achieves good accuracy. In this study, the CRISP-DM methodology is used to build an effective predictive model and to handle the data analysis problem. Principal Component Analysis (PCA) serves as a feature extraction technique to reduce the dimensionality of the microarray data, and the Long Short-Term Memory (LSTM) deep learning technique is applied for classification. The experiments show that LSTM alone achieves much higher accuracy and a shorter processing time than LSTM combined with PCA, which lowers the accuracy. With the best model, LSTM achieves an accuracy and F1 score of 100% for lung cancer in 4164 seconds, while the best LSTM+PCA model achieves an accuracy and F1 score of 100% for lung cancer in 4.6 seconds.
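The PCA-then-LSTM flow described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the thesis's implementation: the abstract does not give the number of principal components, the network architecture, or how the reduced features are presented to the LSTM. The sketch uses synthetic data in place of the microarray sets, random untrained weights, and treats each sample's components as a short sequence; the names `pca_reduce` and `lstm_forward` are hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_forward(seq, W, U, b):
    """Run one LSTM layer over seq (timesteps x input_dim); return the last hidden state."""
    hidden = U.shape[1]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in seq:
        z = W @ x_t + U @ h + b            # stacked gate pre-activations, shape (4 * hidden,)
        i, f, o, g = np.split(z, 4)        # input, forget, output gates and candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                  # update the cell state
        h = o * np.tanh(c)                 # update the hidden state
    return h


rng = np.random.default_rng(0)

# Synthetic stand-in for microarray data: 20 samples x 2000 gene expression values.
X = rng.normal(size=(20, 2000))

# Assumption: keep 10 components (the abstract does not state the number used).
Z = pca_reduce(X, n_components=10)

# Random, untrained weights; a real model would learn these from labelled samples.
hidden = 8
W = rng.normal(scale=0.1, size=(4 * hidden, 1))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)
w_out = rng.normal(scale=0.1, size=hidden)

# Assumption: each sample's 10 components form a length-10 sequence with 1 feature per step.
probs = np.array([sigmoid(w_out @ lstm_forward(z[:, None], W, U, b)) for z in Z])

print(Z.shape, probs.shape)
```

The point of the sketch is the change in shape: each sample shrinks from 2000 values to 10 before it reaches the recurrent layer, which is where a reduction in processing time would come from. In practice the weights would be trained with a deep learning framework, and accuracy and F1 would be computed on held-out samples.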