INDONESIAN SPONTANEOUS SPEECH RECOGNITION SYSTEM USING DEEP NEURAL NETWORKS

The existing Indonesian speech recognition system, trained with an HMM-GMM acoustic model, is still not accurate enough for spontaneous speech. In this study, 14 hours of spontaneous Indonesian speech were collected, and speech recognition syst...

Bibliographic Details
Main Author:	Arif Rahman, Dandy
Format:	Final Project
Language:	Indonesian
Online Access:	https://digilib.itb.ac.id/gdl/view/48149
Institution:	Institut Teknologi Bandung

id	id-itb.:48149
spelling	id-itb.:48149; 2020-06-26T22:34:24Z; keywords: neural network, CNN, DNN, TDNN, acoustic model, WER
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesian
description	The existing Indonesian speech recognition system, trained with an HMM-GMM acoustic model, is still not accurate enough for spontaneous speech. In this study, 14 hours of spontaneous Indonesian speech were collected, and recognition performance was improved by replacing the acoustic model with a neural network-based model. Three network topologies were used: Deep Neural Network (DNN), Convolutional Neural Network (CNN), and Time Delay Neural Network (TDNN). The baseline, an HMM-GMM acoustic model trained on dictated speech, obtained a word error rate (WER) of 73.87%. Training the model on noise-augmented data lowered the WER to 71.15%. Applying an adaptation technique to the model lowered the WER to 62.75%, and combining adaptation with noise augmentation lowered it to 62.16%. Training on a mix of dictated and spontaneous speech lowered the WER to 57.59%. The acoustic model was then replaced with a neural network-based model: the DNN model reached a WER of 50.02% and the CNN model 47.58%. The lowest WER, 40.63%, was obtained with the TDNN topology.
format	Final Project
author	Arif Rahman, Dandy
title	INDONESIAN SPONTANEOUS SPEECH RECOGNITION SYSTEM USING DEEP NEURAL NETWORKS
url	https://digilib.itb.ac.id/gdl/view/48149
_version_	1822927838791073792
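
The description above reports every result as a word error rate (WER). As a rough illustration of what that metric measures, the sketch below computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, and also computes the relative reduction between two WER values. This is a minimal sketch under stated assumptions, not the scoring tool used in the thesis: the function names and the Indonesian example sentences are hypothetical, and only the WER percentages come from the record.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match (cost 0) or substitution (cost 1)
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one of five reference words is dropped by the recognizer.
example_wer = word_error_rate("saya mau pergi ke pasar", "saya mau pergi pasar")

# WER values taken from the description: HMM-GMM baseline vs. TDNN.
tdnn_gain = relative_reduction(73.87, 40.63)
```

With the figures from the description, going from the 73.87% baseline to the 40.63% TDNN result is an absolute drop of 33.24 points, or a relative reduction of about 45%.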

Similar Items