DEVELOPMENT OF ACOUSTIC MODEL USING DNN BASED ON CHAIN MODEL IN SPEECH RECOGNITION SYSTEM (CASE STUDY: DENTAL CONVERSATION)
The problem of incomplete dental medical records caused by time constraints can be addressed by a speech recognition system for the dentistry domain that transcribes the dentist's records. Such a system has been developed by researchers Islami & Lestari (20...
Main Author: | Yora Islami, Dinda |
---|---|
Format: | Theses |
Language: | Indonesian |
Online Access: | https://digilib.itb.ac.id/gdl/view/58052 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:58052 |
---|---|
spelling |
id-itb.:58052 2021-08-30T12:45:37Z Yora Islami, Dinda Indonesia Theses acoustic model, chain model, TDNN, LSTM INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/58052 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesian |
description |
The problem of incomplete dental medical records caused by time constraints can
be addressed by a speech recognition system for the dentistry domain that
transcribes the dentist's records. Such a system was developed by Islami &
Lestari (2020), using a Convolutional Neural Network (CNN) for the acoustic
model and n-grams for the language model. Its performance, measured by word
error rate (WER), was 14.47%. This performance can be improved by building the
acoustic and/or language models with other techniques and by handling
out-of-vocabulary (OOV) words.
This study develops the acoustic model with other techniques that may improve
the performance of the speech recognition system in the dentistry domain. In
several studies, acoustic models built with a Time Delay Neural Network (TDNN)
or a Long Short-Term Memory (LSTM) network performed better than CNN-based
models. Training the acoustic model as a chain model can also improve
recognition performance. Accordingly, this study develops acoustic models using
CNN, TDNN, and LSTM, and applies the chain model to each of the three. The
language model uses the n-gram technique, and OOV words are handled by adding a
dental-domain text corpus.
The results show that adding the dental-domain text corpus did not
significantly reduce the OOV rate. The TDNN and LSTM acoustic models did not
outperform the CNN baseline. Applying the chain model to the TDNN performed
better than the other techniques and the baseline: the TDNN chain model reached
a WER of 10.85%, about 3.6 percentage points below the baseline.
|
format |
Theses |
author |
Yora Islami, Dinda |
title |
DEVELOPMENT OF ACOUSTIC MODEL USING DNN BASED ON CHAIN MODEL IN SPEECH RECOGNITION SYSTEM (CASE STUDY: DENTAL CONVERSATION) |
url |
https://digilib.itb.ac.id/gdl/view/58052 |
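
The abstract reports system performance as word error rate (WER). As a reading aid, the following is a minimal sketch of how WER is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. It is not the thesis's scoring tool; the function names and the example sentences are hypothetical. The second helper shows the arithmetic behind the reported improvement, assuming the baseline is the 14.47% WER of the CNN system.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Return the relative reduction of `improved` with respect to `baseline`."""
    return (baseline - improved) / baseline


# Hypothetical example sentences (not taken from the thesis data).
reference = "the upper left molar has a deep cavity"
hypothesis = "the upper left molar has deep cavities"
example_wer = word_error_rate(reference, hypothesis)

# Figures from the abstract, assuming 14.47% is the baseline WER.
absolute_drop = 14.47 - 10.85
relative_drop = relative_reduction(14.47, 10.85)
```

Under that assumption, the drop from 14.47% to 10.85% is about 3.6 percentage points in absolute terms, or roughly a quarter of the baseline error in relative terms.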