Blind separation of speech mixtures

This thesis addresses three well-known problems in blind source separation (BSS) of speech signals, namely the permutation problem in frequency-domain BSS, underdetermined instantaneous BSS, and underdetermined convolutive BSS. For solving the permutation problem in the frequen...

Bibliographic Details
Main Author: Vaninirappuputhenpurayil Gopalan Reju
Other Authors: Koh Soo Ngee
Format: Theses and Dissertations
Language: English
Published: 2010
Subjects:
Online Access: https://hdl.handle.net/10356/20784
Institution: Nanyang Technological University
id sg-ntu-dr.10356-20784
record_format dspace
spelling sg-ntu-dr.10356-20784 2023-07-04T16:52:23Z Blind separation of speech mixtures Vaninirappuputhenpurayil Gopalan Reju Koh Soo Ngee School of Electrical and Electronic Engineering DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing This thesis addresses three well-known problems in blind source separation (BSS) of speech signals, namely the permutation problem in frequency-domain BSS, underdetermined instantaneous BSS, and underdetermined convolutive BSS. To solve the permutation problem in the frequency domain for determined mixtures, an algorithm named the partial separation method is proposed. The algorithm uses a multistage approach. In the first stage, the mixed signals are partially (roughly) separated using a computationally efficient time-domain method. In the second stage, the output of the time-domain stage is further separated using a frequency-domain BSS algorithm. In the frequency-domain stage, the permutation problem is solved using the correlations between the magnitude envelopes of the DFT coefficients of the partially separated signals and those of the fully separated signals from the frequency-domain stage. To solve the permutation problem for underdetermined BSS, a k-means clustering approach is used: the masks estimated for separating the sources by time-frequency (TF) masking are clustered by applying k-means to small, overlapping groups of nearby masks. DOCTOR OF PHILOSOPHY (EEE) 2010-01-08T06:23:34Z 2010-01-08T06:23:34Z 2009 2009 Thesis Vaninirappuputhenpurayil, G. R. (2009). Blind separation of speech mixtures. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/20784 10.32657/10356/20784 en 211 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
spellingShingle DRNTU::Engineering::Electrical and electronic engineering::Electronic systems::Signal processing
Vaninirappuputhenpurayil Gopalan Reju
Blind separation of speech mixtures
description This thesis addresses three well-known problems in blind source separation (BSS) of speech signals, namely the permutation problem in frequency-domain BSS, underdetermined instantaneous BSS, and underdetermined convolutive BSS. To solve the permutation problem in the frequency domain for determined mixtures, an algorithm named the partial separation method is proposed. The algorithm uses a multistage approach. In the first stage, the mixed signals are partially (roughly) separated using a computationally efficient time-domain method. In the second stage, the output of the time-domain stage is further separated using a frequency-domain BSS algorithm. In the frequency-domain stage, the permutation problem is solved using the correlations between the magnitude envelopes of the DFT coefficients of the partially separated signals and those of the fully separated signals from the frequency-domain stage. To solve the permutation problem for underdetermined BSS, a k-means clustering approach is used: the masks estimated for separating the sources by time-frequency (TF) masking are clustered by applying k-means to small, overlapping groups of nearby masks.
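The envelope-correlation idea in the description can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `align_permutations`, the array layout `(n_sources, n_bins, n_frames)`, and the exhaustive search over permutations are all assumptions made for illustration. For each frequency bin, the outputs of the frequency-domain stage are reordered so that their magnitude envelopes correlate best with those of the partially separated (reference) signals.

```python
from itertools import permutations

import numpy as np


def align_permutations(ref_env, sep_env):
    """Reorder per-bin outputs to match reference envelopes (illustrative).

    ref_env, sep_env: magnitude envelopes of shape
    (n_sources, n_bins, n_frames). ref_env plays the role of the partially
    separated signals, sep_env that of the frequency-domain outputs.
    Returns (perms, aligned), where perms[f][i] is the index of the output
    assigned to reference source i in bin f.
    """
    n_src, n_bins, _ = sep_env.shape
    aligned = np.empty_like(sep_env)
    perms = []
    for f in range(n_bins):
        # Cross-correlation block: row i = reference i, column j = output j.
        corr = np.corrcoef(ref_env[:, f, :], sep_env[:, f, :])[:n_src, n_src:]
        # Exhaustive search; only practical for a small number of sources.
        best = max(
            permutations(range(n_src)),
            key=lambda p: sum(corr[i, p[i]] for i in range(n_src)),
        )
        perms.append(best)
        aligned[:, f, :] = sep_env[list(best), f, :]
    return perms, aligned
```

The exhaustive search costs n! evaluations per bin, so a sketch like this is only reasonable for two or three sources; an assignment solver would be a natural substitute for larger cases.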
author2 Koh Soo Ngee
author_facet Koh Soo Ngee
Vaninirappuputhenpurayil Gopalan Reju
format Theses and Dissertations
author Vaninirappuputhenpurayil Gopalan Reju
author_sort Vaninirappuputhenpurayil Gopalan Reju
title Blind separation of speech mixtures
title_sort blind separation of speech mixtures
publishDate 2010
url https://hdl.handle.net/10356/20784
_version_ 1772827600679337984