Human robot interaction : speech recognition

The aim of this project is to develop a speech recognition system that can then be used for human-robot interaction. The system receives speech input from users, analyzes the input by extracting speech features, searches for and matches those features against the pre-recorded feature templates stored in a trained database (codebook), and returns the best-matching result to the user.
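The pipeline the abstract describes (extract features from the input speech, then match them against templates stored in a trained codebook) can be sketched in miniature as follows. This is an illustrative assumption, not the project's actual code: per-frame log energy is used as a toy stand-in for real MFCC vectors, and matching is a simple nearest-neighbour search over the codebook.

```python
import math

def frame_signal(signal, frame_len=256, hop=128):
    # Split the waveform into overlapping fixed-length frames.
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def log_energy(frame):
    # Log of the frame's energy; the small constant avoids log(0).
    return math.log(sum(s * s for s in frame) + 1e-10)

def extract_features(signal):
    # Toy feature vector: one log-energy value per frame
    # (a real system would compute MFCCs here).
    return [log_energy(f) for f in frame_signal(signal)]

def match(features, codebook):
    # Return the codebook label whose stored template is nearest
    # to the input features (squared Euclidean distance).
    def dist(a, b):
        n = min(len(a), len(b))
        return sum((a[i] - b[i]) ** 2 for i in range(n))
    return min(codebook, key=lambda label: dist(features, codebook[label]))
```

In the real system each codebook entry would hold MFCC templates for one trained command; here a label is matched purely on how close the energy profiles are.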


Bibliographic Details
Main Author: Tan, Roland Rustan.
Other Authors: Lau Wai Shing, Michael
Format: Final Year Project
Language:English
Published: 2011
Subjects:
Online Access:http://hdl.handle.net/10356/42865
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-42865
record_format dspace
spelling sg-ntu-dr.10356-428652023-03-04T19:01:57Z Human robot interaction : speech recognition Tan, Roland Rustan. Lau Wai Shing, Michael School of Mechanical and Aerospace Engineering Robotics Research Centre DRNTU::Engineering::Mechanical engineering::Robots The aim of this project is to develop a speech recognition system that can then be used for human-robot interaction. The system receives speech input from users, analyzes the input by extracting speech features, searches for and matches those features against the pre-recorded feature templates stored in a trained database (codebook), and returns the best-matching result to the user. The system is intended to provide an alternative, natural, social-style way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots, which are used to help elderly people, and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, including playing music. For a speech recognition system to work, it needs acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, were used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after experiments that showed the superiority of Julius over CMU-Sphinx 4 in terms of accuracy, with an average accuracy of up to 84.865% for Julius compared with 79.855% for CMU-Sphinx 4. 
Bachelor of Engineering (Mechanical Engineering) 2011-01-25T05:00:48Z 2011-01-25T05:00:48Z 2011 2011 Final Year Project (FYP) http://hdl.handle.net/10356/42865 en Nanyang Technological University 114 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Mechanical engineering::Robots
spellingShingle DRNTU::Engineering::Mechanical engineering::Robots
Tan, Roland Rustan.
Human robot interaction : speech recognition
description The aim of this project is to develop a speech recognition system that can then be used for human-robot interaction. The system receives speech input from users, analyzes the input by extracting speech features, searches for and matches those features against the pre-recorded feature templates stored in a trained database (codebook), and returns the best-matching result to the user. The system is intended to provide an alternative, natural, social-style way of interacting with a robot. Verbal interaction is popular in robotics, especially in personal assistive robots, which are used to help elderly people, and in entertainment robots. This project is limited to soccer-related commands and some entertainment functions, including playing music. For a speech recognition system to work, it needs acoustic models and language models. The acoustic model is a collection of features extracted from pre-recorded speech; the Mel-Frequency Cepstral Coefficients (MFCC) algorithm was applied to extract these features from the speech signals. The language model is a large list of words and their probabilities of occurrence in a given sequence. For the purposes of this project, grammars, a special type of language model that defines constraints on the words expected as input, were used. Julius, an open-source speech recognition engine, was used in this project to enable human-robot verbal interaction. It was chosen after experiments that showed the superiority of Julius over CMU-Sphinx 4 in terms of accuracy, with an average accuracy of up to 84.865% for Julius compared with 79.855% for CMU-Sphinx 4.
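The grammar idea in the description (a language model that constrains which word sequences are valid input) can be illustrated with a toy recogniser-side check. The actual project used Julius grammar files; the grammar below, covering a couple of soccer and entertainment commands, is purely an assumed example.

```python
# Toy command grammar: nonterminals map to lists of possible expansions;
# any symbol not in the table is a terminal word.
GRAMMAR = {
    "COMMAND": [["ACTION", "OBJECT"]],
    "ACTION":  [["kick"], ["pass"], ["play"]],
    "OBJECT":  [["the", "ball"], ["music"]],
}

def parse(symbols, words):
    # True if the word list can be fully derived from the symbol sequence.
    if not symbols:
        return not words
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:
        # Try each expansion of the nonterminal in place of `head`.
        return any(parse(expansion + rest, words)
                   for expansion in GRAMMAR[head])
    # Terminal: it must match the next input word exactly.
    return bool(words) and words[0] == head and parse(rest, words[1:])

def accepts(sentence):
    return parse(["COMMAND"], sentence.split())
```

Constraining recognition to such a grammar is what lets a command-driven system like this reject word sequences that no valid command could produce, rather than scoring every possible sentence.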
author2 Lau Wai Shing, Michael
author_facet Lau Wai Shing, Michael
Tan, Roland Rustan.
format Final Year Project
author Tan, Roland Rustan.
author_sort Tan, Roland Rustan.
title Human robot interaction : speech recognition
title_short Human robot interaction : speech recognition
title_full Human robot interaction : speech recognition
title_fullStr Human robot interaction : speech recognition
title_full_unstemmed Human robot interaction : speech recognition
title_sort human robot interaction : speech recognition
publishDate 2011
url http://hdl.handle.net/10356/42865
_version_ 1759854066013306880