Multichannel equalization applied to speech dereverberation

Speech signals acquired by a distant microphone inside an enclosed space are often degraded by reverberation. Reverberation results from the multipath propagation of a sound wave from its source to receivers. Reverberation can cause a detrimental effect on the perceived quality as well as the intelli...

Bibliographic Details
Main Author:	Rajan Sobhana Rashobh
Other Authors:	Andy Khong Wai Hoong
Format:	Theses and Dissertations
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Electrical and electronic engineering
Online Access:	https://hdl.handle.net/10356/62174
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-62174
record_format	dspace
spelling	sg-ntu-dr.10356-62174 2023-07-04T17:15:36Z Multichannel equalization applied to speech dereverberation Rajan Sobhana Rashobh Andy Khong Wai Hoong School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering DOCTOR OF PHILOSOPHY (EEE) 2015-02-25T01:35:50Z 2015-02-25T01:35:50Z 2015 2015 Thesis Rajan Sobhana Rashobh. (2015). Multichannel equalization applied to speech dereverberation. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/62174 10.32657/10356/62174 en 211 p. application/pdf
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Electrical and electronic engineering
spellingShingle	DRNTU::Engineering::Electrical and electronic engineering Rajan Sobhana Rashobh Multichannel equalization applied to speech dereverberation
description	Speech signals acquired by a distant microphone inside an enclosed space are often degraded by reverberation. Reverberation results from the multipath propagation of a sound wave from its source to receivers. Reverberation can cause a detrimental effect on the perceived quality as well as the intelligibility of the speech signals. This results in performance degradation of systems such as hands-free telephony, hearing aids, and automatic speech/speaker recognition systems. One of the popular approaches to mitigate the effects of reverberation is to achieve channel equalization via a two-stage process: acoustic impulse responses (AIRs) are first estimated using blind channel identification (BCI) techniques, after which the received signals are filtered using inverse filters computed from the estimated AIRs. This thesis focuses on speech dereverberation employing BCI and inverse filtering. A typical AIR is often non-minimum phase, and its direct inversion will result in an unstable inverse filter. Multichannel equalization (MCEQ) algorithms developed for use with a microphone array are employed for the equalization of such non-minimum phase AIRs. Existing MCEQ algorithms achieve equalization in the time domain; in this thesis, a generalized framework that allows one to achieve equalization in different transform domains is proposed first. This is motivated by the fact that when equalization is performed in different domains, the inherent properties of the transforms can be exploited to achieve better equalization performance. Noting that the computational complexity of the non-adaptive MCEQ algorithm is proportional to the AIR order, a set of adaptive time-domain MCEQ algorithms is proposed to achieve equalization of high-order AIRs with reduced complexity. These algorithms iteratively estimate the inverse filters by minimizing a cost function. To improve the convergence as well as the equalization performance, the sparsity of the desired equalized response is taken into account in the cost function and update equation. Although the time-domain adaptive algorithms reduce complexity, they suffer from slow convergence. To overcome this limitation, complexity reduction in the frequency domain is exploited. The proposed algorithm, which achieves equalization in each frequency bin, is derived from the proposed generalized framework for MCEQ. It is shown that the proposed algorithm significantly reduces the complexity involved in MCEQ and exhibits higher robustness to channel estimation errors. To further reduce the processing time of the proposed frequency-domain MCEQ algorithm, adaptive filtering techniques are introduced. To achieve convergence in a single step, an optimal step size is derived for the proposed adaptive algorithm. Finally, a frequency-domain adaptive BCI algorithm is proposed for the estimation of unknown channels. The proposed algorithm exploits the spatial diversity of a multichannel system and estimates the AIRs based on the cross-relation among the channels. To gain more insight into its performance, the misconvergence problem is analyzed and, based on this analysis, a penalty term derived from a sparseness constraint is introduced to the cost function for noise robustness.
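The abstract notes that a single non-minimum phase AIR cannot be inverted stably, while several channels can be equalized jointly. As a rough illustration only, the sketch below solves the classical non-adaptive time-domain problem in the least-squares sense (the multiple-input/output inverse theorem, MINT, formulation). It is not the thesis's algorithm: the function names, the random stand-in channels, and the choice of filter length are all assumptions made for this example.

```python
import numpy as np

def convolution_matrix(h, n_cols):
    """Build the (len(h) + n_cols - 1) x n_cols matrix C with C @ g == np.convolve(h, g)."""
    n_rows = len(h) + n_cols - 1
    C = np.zeros((n_rows, n_cols))
    for j in range(n_cols):
        C[j:j + len(h), j] = h
    return C

def mint_inverse_filters(airs, filter_len, delay=0):
    """Least-squares inverse filters for equal-length channels (hypothetical helper).

    Solves [H_1 ... H_M] g = d, where d is a unit impulse at index `delay`.
    """
    H = np.hstack([convolution_matrix(h, filter_len) for h in airs])
    d = np.zeros(H.shape[0])
    d[delay] = 1.0
    g, *_ = np.linalg.lstsq(H, d, rcond=None)
    return np.split(g, len(airs))

# Assumption: short random channels stand in for measured or estimated AIRs.
rng = np.random.default_rng(0)
L = 8
airs = [rng.standard_normal(L) for _ in range(2)]

# With two channels and no common zeros, filters of length L - 1 make the system square.
filters = mint_inverse_filters(airs, filter_len=L - 1)

# The equalized response is the sum of each channel convolved with its inverse filter.
equalized = sum(np.convolve(h, g) for h, g in zip(airs, filters))
```

The matrix being solved grows with the AIR length, which is the cost that the adaptive and frequency-domain approaches described in the abstract aim to reduce.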
author2	Andy Khong Wai Hoong
author_facet	Andy Khong Wai Hoong Rajan Sobhana Rashobh
format	Theses and Dissertations
author	Rajan Sobhana Rashobh
author_sort	Rajan Sobhana Rashobh
title	Multichannel equalization applied to speech dereverberation
title_short	Multichannel equalization applied to speech dereverberation
title_full	Multichannel equalization applied to speech dereverberation
title_fullStr	Multichannel equalization applied to speech dereverberation
title_full_unstemmed	Multichannel equalization applied to speech dereverberation
title_sort	multichannel equalization applied to speech dereverberation
publishDate	2015
url	https://hdl.handle.net/10356/62174
_version_	1772827258889699328
