Separation of underdetermined speech mixture based on sparse Bayesian recovery
Main Author: | |
---|---|
Other Authors: | |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2017 |
Subjects: | |
Online Access: | http://hdl.handle.net/10356/72445 |
Institution: | Nanyang Technological University |
Summary: | This thesis addresses the problem of separating underdetermined speech
mixtures using sparse Bayesian recovery techniques.
Firstly, this thesis describes a novel algorithm to improve the performance of
sparsity-based single-channel speech separation. The conventional approach assumes
that the mixing conditions and source signals are stationary. In practical applications
of speech source separation, however, the mixing conditions are non-stationary
because the sources vary or the speakers move. The proposed
algorithm handles this non-stationary situation in single-channel source separation,
recovering the speech signals with a sparse Bayesian learning
algorithm.
Secondly, an algorithm for the underdetermined instantaneous speech separation
problem is described, based on a hierarchical sparse Bayesian technique for efficient
data reconstruction. The proposed algorithm consists of three steps. The unknown
mixing matrix is first estimated from the speech mixtures in the transform domain.
Then, a permutation issue is solved based on the results of the first step to obtain
the correct order of the dictionary. Finally, the speech sources are recovered using the
hierarchical sparse structure of the mixed speech signals. Numerical experiments,
including a comparison with other sparse representation approaches, are provided to
show that the proposed method can reduce the interference effectively and achieve
a desirable performance improvement.
Thirdly, we address the problem of speech source separation from underdetermined
convolutive mixtures with channel identification and recovery. Our proposed
method does not require prior knowledge of the source geometry.
The first step of the proposed algorithm is to estimate the convolutive channel from
the speech mixtures after a clustering procedure that selects single-source time intervals.
The next step is to recover the speech signals based on a compressed sensing concept
in the short-time Fourier transform (STFT) domain. Compared to conventional methods, the
separation performance is greatly improved when the mixing channel is known.
Finally, a noise-robust algorithm for separating speech sources from their underdetermined
convolutive mixtures is proposed. Unlike previously reported methods,
our proposed algorithm can work in a noisy environment. In our method, the
recovery of the speech signals makes use of the sparse structure of the speech signals
together with a calibration of the estimated channel. The proposed method operates in a
statistical manner in the time-frequency (TF) domain to achieve desirable separation results without selecting
regularization parameters. Numerical experiments, including a comparison
with other separation approaches for convolutive speech mixtures, are provided to
show that our algorithm achieves a performance improvement. |
---|
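
The sparse recovery step that the summary refers to can be illustrated with a small example. The following is a minimal sketch, not the algorithm developed in the thesis: it assumes an instantaneous model x = A s + noise in which the mixing matrix A (fewer mixtures than sources) is already known, a fixed noise variance, and the basic EM form of sparse Bayesian learning, where each source coefficient has a zero-mean Gaussian prior with its own variance. The function name `sbl_recover` and its parameters (`noise_var`, `n_iter`, `floor`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def sbl_recover(A, x, noise_var=1e-6, n_iter=200, floor=1e-12):
    """Basic EM sparse Bayesian learning for x = A @ s + noise.

    Hypothetical helper for illustration. A has shape (m, n) with m < n,
    x has shape (m,). Returns the posterior mean of s and the learned
    prior variances gamma (small gamma means the coefficient is pruned).
    """
    A = np.asarray(A, dtype=float)
    x = np.asarray(x, dtype=float)
    m, n = A.shape
    gamma = np.ones(n)  # prior variance of each coefficient
    mu = np.zeros(n)
    for _ in range(n_iter):
        AG = A * gamma  # equals A @ diag(gamma), shape (m, n)
        cov_x = noise_var * np.eye(m) + AG @ A.T  # marginal covariance of x
        # K = diag(gamma) @ A.T @ inv(cov_x), shape (n, m); cov_x is symmetric
        K = np.linalg.solve(cov_x, AG).T
        mu = K @ x  # posterior mean of s
        # diagonal of the posterior covariance: gamma - diag(K @ AG)
        post_var = gamma - np.einsum("ij,ji->i", K, AG)
        # EM update of the prior variances, floored to stay positive
        gamma = np.maximum(mu**2 + post_var, floor)
    return mu, gamma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((10, 20))  # 10 mixtures, 20 candidate atoms
    s_true = np.zeros(20)
    s_true[[3, 11]] = [1.5, -2.0]  # two active coefficients
    x = A @ s_true
    s_hat, gamma = sbl_recover(A, x)
    print(np.round(s_hat, 2))
```

Because the prior variances are learned from the data, this formulation needs no hand-tuned sparsity weight, which is consistent with the summary's remark about avoiding the selection of regularization parameters. The noise variance is held fixed here purely to keep the sketch short.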