AUTOMATIC REMOVAL OF SPEECH ARTIFACT IN ELECTROENCEPHALOGRAM DATA USING MACHINE LEARNING
Electroencephalography (EEG) records the electrical activity originating from the brain. Unfortunately, EEG data are often contaminated by artifacts, defined as electrical activity that is not generated by the brain, which prevents the data from being processed further. Th...
Main Author: | Lovenia, Holy |
---|---|
Format: | Final Project |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/39286 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:39286 |
---|---|
keywords |
automatic speech artifact removal, EEG, machine learning |
timestamp |
2019-06-25T11:36:44Z |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
description |
Electroencephalography (EEG) records the electrical activity originating from the brain. Unfortunately, EEG data are often contaminated by artifacts, defined as electrical activity that is not generated by the brain, which prevents the data from being processed further. This noise causes many problems and limitations in EEG data processing, especially in the case of speech artifacts, which often appear in communication-related studies. In addition, no previous research has studied the characteristics of speech artifacts, which makes them very difficult to detect. The present study therefore aims to: 1) construct a speech artifact removal system, 2) find the best classification and clustering machine learning models for detection, 3) explore the prospects of using deep neural networks, and 4) identify the important features.
Before the machine learning experiment, EEG-specific preprocessing and decomposition steps were applied to the signals. Each independent component was then labelled according to its correlation with lip EMG, and features were extracted. The model-building experiment consisted of several scenarios covering the construction of a baseline model, imbalanced data handling, and feature selection/extraction techniques. Random Forest with upsampling and its best parameter configuration (F1-score on the test set: 0.97) emerged as the best classification model, while agglomerative clustering with SMOTE, Select K Best feature selection, and its best parameter configuration (purity on the test set: 0.63) performed best among the clustering models. The important features were determined from the feature importances of the best classification model. A feedforward neural network (F1-score on the test set: 0.74) indicated that deep neural networks are a promising direction for speech artifact detection. The speech artifact removal system was built using the best models established by the machine learning experiment. |
author |
Lovenia, Holy |
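Two steps mentioned in the abstract, labelling independent components by their correlation with lip EMG and scoring a clustering by purity, can be illustrated with a small standard-library sketch. This is not the thesis implementation: the use of the absolute Pearson correlation, the 0.5 threshold, the function names, and the toy signals are all assumptions made for illustration, and purity is taken in its usual sense (the fraction of samples that belong to the majority class of their cluster).

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def label_components(components, lip_emg, threshold=0.5):
    """Label each component 1 (speech artifact) or 0 (not an artifact).

    A component is flagged when the absolute correlation between its time
    course and the lip EMG reaches `threshold`. The threshold value is a
    placeholder, not a value reported in the abstract.
    """
    return [int(abs(pearson(ic, lip_emg)) >= threshold) for ic in components]


def purity(cluster_ids, true_labels):
    """Fraction of samples that fall in the majority class of their cluster."""
    counts = {}
    for cluster, label in zip(cluster_ids, true_labels):
        counts.setdefault(cluster, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in counts.values())
    return majority_total / len(true_labels)


# Toy data: three "components" compared against a short lip EMG trace.
lip_emg = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
components = [
    [0.1, 0.9, 0.0, -1.1, 0.1, 1.0, -0.1, -0.9],  # closely tracks the EMG
    [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0],   # uncorrelated with the EMG
    [0.0, -1.0, 0.0, 1.0, 0.0, -1.0, 0.0, 1.0],   # inverted copy of the EMG
]
labels = label_components(components, lip_emg)

# Purity of a hypothetical two-cluster assignment against those labels.
print(labels, purity([0, 1, 0], labels))
```

Taking the absolute value means an inverted component is still flagged, which fits decomposition outputs whose sign is arbitrary. Whether the thesis made the same choice is not stated in the abstract.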