Blind separation of speech mixtures

Bibliographic Details
Main Author: Vaninirappuputhenpurayil Gopalan Reju
Other Authors: Koh Soo Ngee
Format: Theses and Dissertations
Language: English
Published: 2010
Online Access: https://hdl.handle.net/10356/20784
Institution: Nanyang Technological University
Description
Summary: This thesis addresses three well-known problems in blind source separation (BSS) of speech signals, namely the permutation problem in frequency-domain BSS, underdetermined instantaneous BSS, and underdetermined convolutive BSS. To solve the permutation problem in the frequency domain for determined mixtures, an algorithm named the partial separation method is proposed. The algorithm uses a multistage approach. In the first stage, the mixed signals are partially (roughly) separated using a computationally efficient time-domain method. In the second stage, the output of the time-domain stage is further separated using a frequency-domain BSS algorithm. For the frequency-domain stage, the permutation problem is solved using the correlations between the magnitude envelopes of the DFT coefficients of the partially separated signals and those of the fully separated signals from the frequency-domain stage. To solve the permutation problem in the underdetermined case, a k-means clustering approach is used. In this approach, the masks estimated for separating the sources by time-frequency (TF) masking are clustered using k-means clustering of small groups of nearby masks with overlap.
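The envelope-correlation idea described in the summary can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names (`envelope_correlation`, `align_bin`), the use of Pearson correlation, and the exhaustive search over permutations are all assumptions made for illustration. For one frequency bin, it takes the magnitude envelopes of the partially separated signals as a reference and reorders the frequency-domain outputs so that each one is paired with the reference it correlates with best.

```python
import itertools

import numpy as np


def envelope_correlation(ref_env, sep_env):
    """Pearson correlation between every reference and output envelope.

    Both inputs have shape (n_sources, n_frames). Entry [i, j] of the
    result is the correlation of reference i with output j.
    """
    a = ref_env - ref_env.mean(axis=1, keepdims=True)
    b = sep_env - sep_env.mean(axis=1, keepdims=True)
    # Small constant avoids division by zero for a constant envelope.
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return a @ b.T


def align_bin(ref_env, sep_env):
    """Return perm such that sep_env[perm[i]] best matches ref_env[i].

    Assumption: the number of sources is small, so an exhaustive search
    over all permutations is affordable.
    """
    corr = envelope_correlation(ref_env, sep_env)
    n = corr.shape[0]
    best = max(
        itertools.permutations(range(n)),
        key=lambda p: sum(corr[i, p[i]] for i in range(n)),
    )
    return np.array(best)
```

Applying `sep_env[align_bin(ref_env, sep_env)]` independently in each frequency bin would put the outputs of every bin in the order of the reference signals. Exhaustive search is chosen here only for clarity; its cost grows factorially with the number of sources.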