Real-time sociofeedback from audio and video signals

The paper concentrates on speaker diarization of meeting recordings. Speaker diarization answers the question of “Who and When”: it determines who is speaking in the audio and when each person speaks. It has two main steps, speaker segmentation and clusteri...

Bibliographic Details
Main Author: Zhao, Xiaozhi.
Other Authors: Justin Dauwels
Format: Final Year Project
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/54391
Institution: Nanyang Technological University
id sg-ntu-dr.10356-54391
record_format dspace
spelling sg-ntu-dr.10356-54391 2023-07-07T17:39:35Z Real-time sociofeedback from audio and video signals Zhao, Xiaozhi. Justin Dauwels School of Electrical and Electronic Engineering DRNTU::Engineering::Computer science and engineering Bachelor of Engineering 2013-06-19T09:18:50Z 2013-06-19T09:18:50Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/54391 en Nanyang Technological University 70 p. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
spellingShingle DRNTU::Engineering::Computer science and engineering
Zhao, Xiaozhi.
Real-time sociofeedback from audio and video signals
description The paper concentrates on speaker diarization of meeting recordings. Speaker diarization answers the question of “Who and When”: it determines who is speaking in the audio and when each person speaks. It has two main steps, speaker segmentation and clustering. Segmentation finds the speaker change points in the audio; clustering then determines the number of speakers and when each of them speaks. We adopt the Bayesian information criterion (BIC) algorithm and three typical independent component analysis (ICA) algorithms as the experimental methods. We use BIC only for speaker segmentation, so its output is not labeled with speaker identities. In our experiments, ICA is combined with speaker activity detection to perform speaker diarization. We compare their performance in speaker segmentation: BIC performs slightly better than the ICA algorithms, reaching an accuracy of 84.45%, whereas the ICA algorithms AMUSE, JADE and FOBI have error rates of 27.4%, 18.5% and 19.6%, respectively.
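The BIC-based segmentation step mentioned in the description can be illustrated with a minimal sketch. This is not the project's code. It assumes a generic formulation in which each segment is modeled by a single full-covariance Gaussian and a change point is accepted when the penalized likelihood gain (delta BIC) is positive. The function names `delta_bic` and `find_change_point`, the penalty weight `lam`, the `margin` parameter, and the idea that `features` holds one acoustic feature vector per frame are all assumptions made for illustration.

```python
import numpy as np


def delta_bic(features, i, lam=1.0):
    """Delta BIC for splitting `features` (frames x dims) at frame index i.

    Positive values favor two Gaussians (a speaker change at i) over one.
    """
    n, d = features.shape

    def logdet_cov(x):
        # Maximum-likelihood covariance, lightly regularized (assumed constant)
        cov = np.cov(x, rowvar=False, bias=True).reshape(d, d)
        cov = cov + 1e-6 * np.eye(d)
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    # Model-complexity penalty: d mean terms plus d(d+1)/2 covariance terms
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    gain = 0.5 * (
        n * logdet_cov(features)
        - i * logdet_cov(features[:i])
        - (n - i) * logdet_cov(features[i:])
    )
    return gain - penalty


def find_change_point(features, margin=20, lam=1.0):
    """Return (index, score) of the best change point, or (None, score)."""
    n = features.shape[0]
    candidates = list(range(margin, n - margin))
    scores = [delta_bic(features, i, lam) for i in candidates]
    best = int(np.argmax(scores))
    if scores[best] > 0:
        return candidates[best], scores[best]
    return None, scores[best]


if __name__ == "__main__":
    # Synthetic stand-in for two speakers: two Gaussians with different means
    rng = np.random.default_rng(0)
    x = np.vstack([rng.normal(0.0, 1.0, (200, 4)),
                   rng.normal(5.0, 1.0, (200, 4))])
    print(find_change_point(x))
```

In practice the search would be run over a sliding or growing window of the recording rather than one fixed block, and `lam` would need tuning on held-out data; both choices are left open here.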
author2 Justin Dauwels
author_facet Justin Dauwels
Zhao, Xiaozhi.
format Final Year Project
author Zhao, Xiaozhi.
author_sort Zhao, Xiaozhi.
title Real-time sociofeedback from audio and video signals
publishDate 2013
url http://hdl.handle.net/10356/54391
_version_ 1772828365838876672