A review of audio-visual speech recognition
Speech is the most important tool of interaction among human beings. This has inspired researchers to study speech recognition further and to develop computer systems able to integrate and understand human speech. However, acoustically noisy environments can heavily contaminate audio speech and affect...
Main Authors: | Thum, Wei Seong; M. Z., Ibrahim |
---|---|
Format: | Article |
Language: | English |
Published: | UTeM, 2018 |
Subjects: | TK Electrical engineering. Electronics Nuclear engineering |
Online Access: | http://umpir.ump.edu.my/id/eprint/21637/1/A%20review%20of%20audio-visual%20speech%20recognition.pdf http://umpir.ump.edu.my/id/eprint/21637/ http://journal.utem.edu.my/index.php/jtec/article/view/3573 |
Institution: | Universiti Malaysia Pahang |
id |
my.ump.umpir.21637 |
---|---|
record_format |
eprints |
citation |
Thum, Wei Seong and M. Z., Ibrahim (2018) A review of audio-visual speech recognition. Journal of Telecommunication, Electronic and Computer Engineering, 10 (1-4). pp. 35-40. ISSN 2289-8131. Peer reviewed; PDF; English; licence: cc_by. Datestamp: 2018-09-14T07:20:30Z |
institution |
Universiti Malaysia Pahang |
building |
UMP Library |
collection |
Institutional Repository |
continent |
Asia |
country |
Malaysia |
content_provider |
Universiti Malaysia Pahang |
content_source |
UMP Institutional Repository |
url_provider |
http://umpir.ump.edu.my/ |
language |
English |
topic |
TK Electrical engineering. Electronics Nuclear engineering |
description |
Speech is the most important tool of interaction among human beings. This has inspired researchers to study speech recognition further and to develop computer systems able to integrate and understand human speech. However, acoustically noisy environments can heavily contaminate audio speech and degrade overall recognition performance. Audio-Visual Speech Recognition (AVSR) is designed to overcome this problem by utilising visual images, which are unaffected by acoustic noise. The aim of this paper is to discuss the structure of AVSR systems, including the front-end processes, the audio-visual data corpora used, recent works and accuracy estimation methods. |
format |
Article |
author |
Thum, Wei Seong; M. Z., Ibrahim |
title |
A review of audio-visual speech recognition |
publisher |
UTeM |
publishDate |
2018 |
url |
http://umpir.ump.edu.my/id/eprint/21637/1/A%20review%20of%20audio-visual%20speech%20recognition.pdf http://umpir.ump.edu.my/id/eprint/21637/ http://journal.utem.edu.my/index.php/jtec/article/view/3573 |