Real-time sociofeedback from audio and video signals

Bibliographic Details
Main Author: Zhao, Xiaozhi.
Other Authors: Justin Dauwels
Format: Final Year Project
Language: English
Published: 2013
Online Access: http://hdl.handle.net/10356/54391
Institution: Nanyang Technological University
Description
Summary: This paper concentrates on speaker diarization of meeting recordings. Speaker diarization answers the question of “Who and When”: it finds who is speaking in the audio and when each person speaks. It has two main steps, speaker segmentation and clustering. Segmentation finds the speaker change points in the audio; clustering determines the number of speakers and when each of them speaks. We adopt the Bayesian Information Criterion (BIC) algorithm and three typical Independent Component Analysis (ICA) algorithms as the experimental methods. BIC is used only for speaker segmentation, so its output is not labeled with speaker identities. In our experiments, ICA is combined with speaker activity detection to implement speaker diarization. We compare the methods' performance in speaker segmentation. BIC performs slightly better than the ICA algorithms: its accuracy reaches 84.45%, whereas the error rates of the ICA algorithms AMUSE, JADE and FOBI are 27.4%, 18.5% and 19.6%, respectively.
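
For readers unfamiliar with BIC-based segmentation, the following is a minimal sketch of how a speaker change point could be detected with the Delta-BIC criterion. It is not taken from the project itself. The function names, the full-covariance Gaussian model, the penalty weight `lam`, the `margin` parameter and the feature matrix `X` (one row per frame, for example MFCC vectors) are all assumptions made for illustration.

```python
import numpy as np

def delta_bic(X, i, lam=1.0):
    """Delta-BIC for splitting the frames X (N x d) at index i.

    Assumes each segment is modeled by one full-covariance Gaussian.
    A positive value favors a speaker change at i.
    """
    N, d = X.shape

    def logdet(Z):
        # Maximum-likelihood covariance; a small ridge (an assumption)
        # keeps the determinant finite for short segments.
        cov = np.cov(Z, rowvar=False, bias=True).reshape(d, d)
        cov = cov + 1e-6 * np.eye(d)
        return np.linalg.slogdet(cov)[1]

    # Number of extra free parameters of the two-model hypothesis.
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(N)
    gain = 0.5 * (N * logdet(X)
                  - i * logdet(X[:i])
                  - (N - i) * logdet(X[i:]))
    return gain - lam * penalty

def find_change_point(X, margin=20, lam=1.0):
    """Return the frame index with the largest positive Delta-BIC, or None."""
    N = X.shape[0]
    candidates = list(range(margin, N - margin))
    if not candidates:
        return None
    scores = [delta_bic(X, i, lam) for i in candidates]
    best = int(np.argmax(scores))
    return candidates[best] if scores[best] > 0 else None
```

In this sketch, `lam` trades off missed change points against false alarms, and `margin` keeps both candidate segments long enough to estimate a covariance. A function like this only marks where the speaker changes; it does not say who is speaking, which matches the summary's remark that the BIC output is not labeled.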