FAR DISTANCE AUTOMATIC SPEECH RECOGNITION IN INDONESIAN LANGUAGE

Automatic speech recognition (ASR) is a system which is capable of translating speech into the corresponding text. Development of current ASR is focused on the case of close-range speech, in which the distance between the speaker and the microphone is less than 30 cm. There has been no research c...

Bibliographic Details
Main Author:	Agus Haryono, Stefanus
Format:	Theses
Language:	Indonesian
Subjects:	Teknik (Rekayasa, enjinering dan kegiatan berkaitan)
Online Access:	https://digilib.itb.ac.id/gdl/view/39800
Institution:	Institut Teknologi Bandung

id	id-itb.:39800
spelling	id-itb.:39800 2019-06-27T16:07:01Z FAR DISTANCE AUTOMATIC SPEECH RECOGNITION IN INDONESIAN LANGUAGE Agus Haryono, Stefanus Teknik (Rekayasa, enjinering dan kegiatan berkaitan) Indonesia Theses ASR, acoustic model, phoneme. INSTITUT TEKNOLOGI BANDUNG https://digilib.itb.ac.id/gdl/view/39800 text
institution	Institut Teknologi Bandung
building	Institut Teknologi Bandung Library
continent	Asia
country	Indonesia
content_provider	Institut Teknologi Bandung
collection	Digital ITB
language	Indonesia
topic	Teknik (Rekayasa, enjinering dan kegiatan berkaitan)
description	Automatic speech recognition (ASR) is a system that converts speech into the corresponding text. Current ASR development focuses on close-range speech, in which the distance between the speaker and the microphone is less than 30 cm. No research has been conducted on far-distance ASR for the Indonesian language. This research conducts experiments on far-distance Indonesian ASR. Two approaches are used to build the ASR: a speech-processing front-end and a more robust acoustic model. The tested front-ends consist of spectral subtraction, Wiener filtering, volume normalization, and dynamic range compression. More robust acoustic models are obtained by adding long-distance speech to the training data and by applying volume perturbation. Experiments are conducted on speech recorded at multiple distances (0 m, 0.5 m, 1 m, and 2 m), with 96 data samples for each distance. The results show that applying spectral subtraction to the baseline model reduces the average word error rate (WER) by 0.59%. Adding long-distance speech to the acoustic model's training data also reduces the average WER, by 2.31%. Combining the new acoustic model with spectral subtraction yields a WER reduction of 2.19%, which is smaller than the reduction obtained with the new acoustic model alone, without the front-end.
format	Theses
author	Agus Haryono, Stefanus
title	FAR DISTANCE AUTOMATIC SPEECH RECOGNITION IN INDONESIAN LANGUAGE
title_sort	far distance automatic speech recognition in indonesian language
url	https://digilib.itb.ac.id/gdl/view/39800
_version_	1822925406361092096
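
The abstract reports its results as changes in average word error rate (WER). As a reading aid, here is a minimal Python sketch of how WER is commonly computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; the record does not say which scoring tool the thesis used, or whether its reported reductions are absolute or relative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Hypothetical helper for illustration only; not the thesis's scoring tool.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of four reference words is dropped by the recognizer.
baseline_wer = word_error_rate("saya makan nasi goreng", "saya makan goreng")
improved_wer = word_error_rate("saya makan nasi goreng", "saya makan nasi goreng")
absolute_reduction = baseline_wer - improved_wer
```

In this sketch a "reduction" is the difference between two WER values measured on the same test utterances, which is one possible reading of the figures quoted in the abstract.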

Similar Items