WADA-W: A modified WADA SNR estimator for audio-visual speech recognition

One of the main challenges in speech recognition is developing systems that are robust to contamination by intrusive background noise. In audio-visual speech recognition (AVSR), audio information is augmented by visual information in order to help improve the performance of speech recognition, parti...

Bibliographic Details
Main Authors:	Thum, Wei Seong; M. Z., Ibrahim; Mulvaney, D. J.
Format:	Article
Language:	English
Published:	International Association of Computer Science and Information Technology 2019
Subjects:	TK Electrical engineering. Electronics. Nuclear engineering
Online Access:	http://umpir.ump.edu.my/id/eprint/22251/13/WADA-W_A%20Modified%20WADA%20SNR.pdf http://umpir.ump.edu.my/id/eprint/22251/ https://doi.org/10.18178/ijmlc.2019.9.4.824
Institution:	Universiti Malaysia Pahang Al-Sultan Abdullah

id	my.ump.umpir.22251
record_format	eprints
spelling	my.ump.umpir.22251 2020-01-28T06:58:01Z http://umpir.ump.edu.my/id/eprint/22251/ Article PeerReviewed pdf en cc_by_4 Thum, Wei Seong and M. Z., Ibrahim and Mulvaney, D. J. (2019) WADA-W: A modified WADA SNR estimator for audio-visual speech recognition. International Journal of Machine Learning and Computing, 9 (4). pp. 446-451. ISSN 2010-3700. (Published) https://doi.org/10.18178/ijmlc.2019.9.4.824
institution	Universiti Malaysia Pahang Al-Sultan Abdullah
building	UMPSA Library
collection	Institutional Repository
continent	Asia
country	Malaysia
content_provider	Universiti Malaysia Pahang Al-Sultan Abdullah
content_source	UMPSA Institutional Repository
url_provider	http://umpir.ump.edu.my/
language	English
topic	TK Electrical engineering. Electronics. Nuclear engineering
description	One of the main challenges in speech recognition is developing systems that are robust to contamination by intrusive background noise. In audio-visual speech recognition (AVSR), audio information is augmented by visual information to help improve recognition performance, particularly when the audio modality is so heavily corrupted by background noise that it becomes hard to differentiate the original speech signal from the noise. The signal-to-noise ratio (SNR) can be used to identify the level of noise in the original speech signal, and one widely used method for SNR estimation is waveform amplitude distribution analysis (WADA), which is based on the assumption that the speech and noise signals have Gamma and Gaussian amplitude distributions, respectively. Following previous approaches, this work uses a precomputed look-up table as a reference for SNR estimation. In this study, WADA-white (WADA-W) has been developed, which rebuilds the precomputed look-up table using a white noise profile in combination with our own AVSR database. This new data corpus, the Loughborough University Audio-Visual (LUNA-V) dataset, contains recordings of 10 speakers with five sets of samples uttered by each speaker and is used for this experimental work. We evaluate the performance of WADA-W on this database when it is corrupted by noise generated from three profiles obtained from the NOISEX-92 database, added at varying SNR values. Evaluation on the LUNA-V database shows that WADA-W performs better than the original WADA in terms of SNR estimation.
format	Article
author	Thum, Wei Seong; M. Z., Ibrahim; Mulvaney, D. J.
author_sort	Thum, Wei Seong
title	WADA-W: A modified WADA SNR estimator for audio-visual speech recognition
publisher	International Association of Computer Science and Information Technology
publishDate	2019
url	http://umpir.ump.edu.my/id/eprint/22251/13/WADA-W_A%20Modified%20WADA%20SNR.pdf http://umpir.ump.edu.my/id/eprint/22251/ https://doi.org/10.18178/ijmlc.2019.9.4.824
_version_	1822920342826385408
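
Illustrative sketch (not part of the record): the description field explains that WADA-style estimators compare a statistic of the noisy waveform against a precomputed look-up table, and that WADA-W rebuilds that table with a white noise profile. The Python below is a minimal sketch of that idea under stated assumptions, not the authors' implementation. The statistic G = ln(E|x|) - E[ln|x|] and the Gamma shape value 0.4 follow our understanding of the original WADA formulation and are not given in this record. Synthetic Gamma-distributed samples stand in for the LUNA-V speech, generated Gaussian samples stand in for the white noise profile, and all function names are hypothetical.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + noise has the requested SNR (in dB)."""
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    return speech + noise * np.sqrt(target_noise_power / noise_power)


def wada_statistic(x, eps=1e-12):
    """G = ln(E|x|) - E[ln|x|] (assumed form of the WADA statistic)."""
    a = np.abs(x) + eps  # eps guards against log(0)
    return np.log(np.mean(a)) - np.mean(np.log(a))


def build_table(snr_grid_db, rng, n=200_000, shape=0.4):
    """Tabulate G over a grid of SNRs using synthetic speech plus white noise.

    Assumption: speech magnitudes are Gamma(shape) with a random sign;
    a real table would be built from recorded speech instead.
    """
    speech = rng.gamma(shape, 1.0, n) * rng.choice([-1.0, 1.0], n)
    noise = rng.standard_normal(n)  # white Gaussian noise
    return np.array(
        [wada_statistic(mix_at_snr(speech, noise, s)) for s in snr_grid_db]
    )


def estimate_snr(x, snr_grid_db, table):
    """Return the grid SNR whose tabulated G is closest to G of `x`."""
    g = wada_statistic(x)
    return float(snr_grid_db[np.argmin(np.abs(table - g))])


if __name__ == "__main__":
    grid = np.arange(-10.0, 21.0, 1.0)  # -10 dB .. 20 dB in 1 dB steps
    table = build_table(grid, np.random.default_rng(0))

    # Independent synthetic test signal corrupted at a known SNR.
    rng = np.random.default_rng(1)
    clean = rng.gamma(0.4, 1.0, 200_000) * rng.choice([-1.0, 1.0], 200_000)
    noisy = mix_at_snr(clean, rng.standard_normal(200_000), 5.0)
    print("estimated SNR (dB):", estimate_snr(noisy, grid, table))
```

A nearest-entry look-up is used instead of interpolation so the sketch does not depend on the table being strictly monotonic; the resolution is therefore limited to the grid spacing.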

Similar Items