Thai speech recognition using double filter banks for basic voice commanding

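This paper describes a methodology for recognizing spoken Thai words that combines two techniques: double filter banks for feature extraction and Euclidean distance for recognition. First, the speech signal is transformed into a three-dimensional representation, the spectrogram, which displays energy along both the time and frequency axes. Second, the frequencies that fall within each bin's spread are correlated with the corresponding triangular filter, so that each bin holds a weighted sum representing the spectral magnitude in that filter bank channel. Finally, the filter bank outputs are normalized so that the input word can be compared with the words in the dictionary; the Euclidean distance measures the similarity between them. The system was evaluated for accuracy and stability under various conditions. Accuracy was tested with 9,000 utterances from several volunteers, and the average accuracy rate is about 96.3%. The results were more than satisfactory in every respect. © 2010 IEEE.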
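
The pipeline the abstract outlines (spectrogram, triangular filter banks, normalization, Euclidean matching) can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the "double" filter bank design, so a single bank of linearly spaced triangular filters stands in for it. The frame length, hop size, filter count, averaging over time, and all helper names (spectrogram, triangular_filters, features, recognize, tone) are hypothetical, and synthetic tones take the place of recorded Thai words.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: rows are time frames, columns are frequency bins."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def triangular_filters(n_filters, n_bins):
    """Overlapping triangular filters, linearly spaced over the FFT bins (assumed layout)."""
    edges = np.linspace(0, n_bins - 1, n_filters + 2)
    bins = np.arange(n_bins)
    bank = np.zeros((n_filters, n_bins))
    for k in range(n_filters):
        left, center, right = edges[k], edges[k + 1], edges[k + 2]
        rising = (bins - left) / (center - left)
        falling = (right - bins) / (right - center)
        # The triangle is the lower of the two slopes, floored at zero.
        bank[k] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def features(signal, n_filters=20):
    """Unit-length vector of filter bank magnitudes, averaged over time (a simplification)."""
    spec = spectrogram(signal)
    bank = triangular_filters(n_filters, spec.shape[1])
    energies = spec @ bank.T  # each channel is a weighted sum of frequency bins
    vec = energies.mean(axis=0)
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def recognize(signal, dictionary):
    """Return the dictionary entry whose template is nearest in Euclidean distance."""
    query = features(signal)
    return min(dictionary, key=lambda word: np.linalg.norm(query - dictionary[word]))

def tone(freq, sr=8000, dur=0.5):
    """Synthetic stand-in for a recorded word: a pure sine tone."""
    t = np.arange(int(sr * dur)) / sr
    return np.sin(2 * np.pi * freq * t)

# Hypothetical two-word dictionary built from synthetic tones.
dictionary = {"low": features(tone(400.0)), "high": features(tone(2500.0))}
print(recognize(tone(420.0), dictionary))
```

Averaging the filter bank outputs over time discards the temporal structure of a word, which a real recognizer would likely need to keep; it is done here only so that each word reduces to one vector for the distance comparison.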

Bibliographic Details
Main Authors: Pisit Phokharatkul, Kriengkrai Nantanitikorn, Supachai Phaiboon
Other Authors: Mahidol University
Format: Conference or Workshop Item
Published: 2010 (repository record dated 2018)
Subjects: Computer Science, Engineering
Online Access: https://repository.li.mahidol.ac.th/handle/123456789/28963
Institution: Mahidol University
id th-mahidol.28963
record_format dspace
spelling th-mahidol.28963
2018-09-24T15:59:09Z
2018-09-24T08:56:18Z
2018-09-24T08:56:18Z
2010-12-16
Conference Paper
2010 International Conference on Computer, Mechatronics, Control and Electronic Engineering, CMCE 2010. Vol.6, (2010), 33-36
doi:10.1109/CMCE.2010.5609930
2-s2.0-78650012404
https://repository.li.mahidol.ac.th/handle/123456789/28963
Mahidol University
SCOPUS
https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=78650012404&origin=inward
institution Mahidol University
building Mahidol University Library
continent Asia
country Thailand
content_provider Mahidol University Library
collection Mahidol University Institutional Repository
topic Computer Science
Engineering
description This paper describes a methodology for recognizing spoken Thai words that combines two techniques: double filter banks for feature extraction and Euclidean distance for recognition. First, the speech signal is transformed into a three-dimensional representation, the spectrogram, which displays energy along both the time and frequency axes. Second, the frequencies that fall within each bin's spread are correlated with the corresponding triangular filter, so that each bin holds a weighted sum representing the spectral magnitude in that filter bank channel. Finally, the filter bank outputs are normalized so that the input word can be compared with the words in the dictionary; the Euclidean distance measures the similarity between them. The system was evaluated for accuracy and stability under various conditions. Accuracy was tested with 9,000 utterances from several volunteers, and the average accuracy rate is about 96.3%. The results were more than satisfactory in every respect. © 2010 IEEE.
title Thai speech recognition using double filter banks for basic voice commanding