HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION
Many applications have been developed to help users interact with the Quran. Some of them require Automatic Speech Recognition (ASR) that can recognize the user's Quran recitation. Currently, acoustic modeling for the Quran speech recogn...
Main Author: | THIRAFI (NIM :13514033), FAZA |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/27244 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:27244 |
---|---|
spelling |
id-itb.:27244 2018-07-02T11:34:38Z HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION THIRAFI (NIM :13514033), FAZA Indonesia Final Project INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/27244 text |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Many applications have been developed to help users interact with the Quran. Some of them require Automatic Speech Recognition (ASR) that can recognize the user's Quran recitation. Currently, acoustic modeling for Quran speech recognition uses HMM-GMM. HMM-GMM is an acoustic modeling technique that takes a probabilistic approach, calculating the acoustic likelihood from the means and covariances of the data, so data that is widely scattered or noisy can produce a suboptimal model. This problem can be mitigated by using deep learning for acoustic modeling. This final project focuses on ASR development for Quran recitation using one deep learning technique, Bidirectional Long Short-Term Memory (BLSTM), combined with an HMM as a hybrid system to improve acoustic model performance in recognizing Quran recitation.

In this research, HMM-GMM-based acoustic modeling, the baseline, is compared with experimental models that use hybrid HMM-BLSTM-based acoustic modeling. In testing, the hybrid HMM-BLSTM model obtains an average Word Error Rate (WER) of 4.63%, while the HMM-GMM model obtains an average WER of 18.39% under the same testing scenarios. This shows that hybrid HMM-BLSTM is recommended for the acoustic model in Quran speech recognition. This research also analyzes the effect of Quran reading style by building models that depend on the reading style (maqam). In one case, the closed maqam model tested with a different speaker gives the lowest WER compared with testing on another speaker with a different maqam. Another case shows the opposite: the closed maqam model gives the highest WER when tested with the same maqam and a different speaker. This indicates that whether maqam should be considered when building the acoustic model depends on the maqam itself. |
format |
Final Project |
author |
THIRAFI (NIM :13514033), FAZA |
title |
HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION |
url |
https://digilib.itb.ac.id/gdl/view/27244 |
_version_ |
1821934319254372352 |
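The WER figures quoted in the description can be illustrated with a minimal sketch. It assumes WER is computed in the conventional way, as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words; the record does not say which scoring tool the author used. The function names and the toy tokens `w1`, `w2`, ... are hypothetical and only stand in for recited words.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new model."""
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Toy example (hypothetical tokens): one reference word is dropped.
    print(word_error_rate("w1 w2 w3 w4", "w1 w2 w4"))

    # Averages reported in the description: 18.39% (HMM-GMM) vs 4.63% (hybrid HMM-BLSTM).
    print(relative_wer_reduction(18.39, 4.63))
```

Under these assumptions, the reported averages correspond to an absolute drop of 13.76 percentage points, which is roughly a three-quarters relative reduction in WER from the HMM-GMM baseline.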