HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION

Nowadays, many applications have been developed to help users to interact with the Quran. Some of them require an Automatic Speech Recognition (ASR) that can recognize the user's Quran recitation. Currently, acoustic modeling for the Qur'an speech recogn...

Bibliographic Details
Main Author: THIRAFI (NIM :13514033), FAZA
Format: Final Project
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/27244
Institution: Institut Teknologi Bandung
id id-itb.:27244
spelling id-itb.:27244 2018-07-02T11:34:38Z HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION THIRAFI (NIM :13514033), FAZA Indonesia Final Project INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/27244
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Many applications have been developed to help users interact with the Quran. Some of them require Automatic Speech Recognition (ASR) that can recognize the user's Quran recitation. Currently, acoustic modeling for Quran speech recognition uses HMM-GMM. HMM-GMM is an acoustic modeling technique that takes a probabilistic approach, calculating the acoustic likelihood from the means and covariances of the data, so if the data is widely scattered and noisy, the resulting model can be suboptimal. This problem can be mitigated by using deep learning for acoustic modeling. This final project focuses on ASR development for Quran recitation using a deep learning technique, Bidirectional Long Short-Term Memory (BLSTM), combined with HMM as a hybrid system to improve acoustic model performance in recognizing Quran recitation.

In this research, HMM-GMM-based acoustic modeling, the baseline, is compared with experimental models that use Hybrid HMM-BLSTM-based acoustic modeling. In testing, the Hybrid HMM-BLSTM model achieves an average Word Error Rate (WER) of 4.63%, while the HMM-GMM model achieves an average WER of 18.39% under the same testing scenarios. This shows that Hybrid HMM-BLSTM is recommended for acoustic modeling in Quran speech recognition. This research also analyzes the effect of Quran reading style by building models that depend on the reading style (maqam). In one case, the closed maqam model gives the lowest WER when tested on a different speaker, compared with testing on another speaker with a different maqam. Another case shows the opposite: the closed maqam model gives the highest WER when tested on a different speaker with the same maqam. This indicates that whether maqam should be considered when building the acoustic model depends on the maqam itself.
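For readers unfamiliar with the metric, the following is a minimal sketch of how a Word Error Rate is conventionally computed (word-level edit distance divided by the number of reference words), together with the relative reduction implied by the two reported averages. The record does not say which scoring tool the thesis used, so the function names `word_error_rate` and `relative_reduction` and the placeholder word sequences are assumptions made only for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers the `baseline` error rate."""
    return (baseline - improved) / baseline


# Placeholder tokens, not real transcripts: one substitution and one deletion
# against a four-word reference.
toy_wer = word_error_rate("a b c d", "a x c")

# Average WERs as reported in the abstract (HMM-GMM vs. Hybrid HMM-BLSTM).
reduction = relative_reduction(18.39, 4.63)
print(f"toy WER = {toy_wer:.2f}, relative WER reduction = {reduction:.1%}")
```

Under this definition, the reported averages correspond to a relative WER reduction of roughly three quarters with respect to the HMM-GMM baseline.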
format Final Project
author THIRAFI (NIM :13514033), FAZA
title HYBRID HMM-BLSTM-BASED ACOUSTIC MODELING FOR AUTOMATIC SPEECH RECOGNITION ON QURAN RECITATION