EFFECTIVENESS ANALYSIS OF THE DEVELOPMENT OF AN INDONESIAN SPEECH RECOGNITION CORPUS USING AN ITERATIVE APPROACH

Bibliographic Details
Main Author: FIARNI (NIM 23205009), CUT
Format: Theses
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/10148
Institution: Institut Teknologi Bandung
id id-itb.:10148
spelling id-itb.:10148 2017-09-27T15:37:36Z
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
description A high-quality speech recognition (SR) system must be trained on a corpus containing a hundred or more utterance samples from a hundred or more speakers. Building such a corpus requires segmentation: the speech waveform in every training data file must be marked manually with the time boundaries of each linguistic unit. Developing a high-quality corpus is therefore resource-intensive and time-consuming.

One alternative way to accelerate the development of a high-quality corpus is an iterative approach. In this method, a small corpus is first developed manually. That small corpus is then used to automatically recognize and tag some of the sentences or words that will form the content of the next corpus. The result is edited manually and bundled with the first small corpus. This bundle is then used to recognize and tag the content of the next corpus, so that the corpus grows with each round. In this research, an Indonesian-language corpus consisting of 10860 files is developed with the iterative approach.

From analyses and measurements, the system reaches an accuracy of about 95.28%. From this result, we conclude that a corpus developed with the iterative approach can produce good accuracy and is more efficient than manual labeling.
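The iterative procedure described in the abstract can be sketched roughly as the loop below. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function name `build_corpus_iteratively` and the callables `train`, `auto_label` and `correct` are hypothetical, and the toy "recognizer" is only a word lookup table standing in for a real acoustic model and manual editor.

```python
def build_corpus_iteratively(seed, batches, train, auto_label, correct):
    """Grow a labeled corpus from a small, manually labeled seed.

    seed       -- dict mapping utterance -> label (labeled by hand)
    batches    -- sequence of batches of unlabeled utterances
    train      -- hypothetical: corpus -> model
    auto_label -- hypothetical: (model, utterance) -> draft label
    correct    -- hypothetical: (utterance, draft) -> manually edited label
    """
    corpus = dict(seed)                        # start from the manual seed
    for batch in batches:
        model = train(corpus)                  # retrain on everything labeled so far
        for utt in batch:
            draft = auto_label(model, utt)     # automatic recognition and tagging
            corpus[utt] = correct(utt, draft)  # manual editing, then bundle
    return corpus


# Toy stand-ins (all assumptions): an utterance is a string of words, its
# label is one tag per word, and the "model" is a word -> tag lexicon.
def train(corpus):
    lexicon = {}
    for utt, tags in corpus.items():
        lexicon.update(zip(utt.split(), tags))
    return lexicon


def auto_label(model, utt):
    return [model.get(word, "?") for word in utt.split()]


fixes_per_utterance = []  # how many tags the simulated editor had to change


def correct(utt, draft):
    gold = [word.upper() for word in utt.split()]  # simulated human reference
    fixes_per_utterance.append(sum(d != g for d, g in zip(draft, gold)))
    return gold


seed = {"satu dua": ["SATU", "DUA"]}
batches = [["dua tiga"], ["tiga satu"]]
corpus = build_corpus_iteratively(seed, batches, train, auto_label, correct)
```

In this toy run the simulated editor fixes one tag in the first round and none in the second, which mirrors the intended benefit: as the bundled corpus grows, less of each new batch needs manual correction.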