Real-time sociofeedback from audio and video signals
This paper concentrates on speaker diarization of meeting recordings. The task of speaker diarization is to answer the question of “who and when”: who is speaking in the audio, and when each person speaks. Speaker diarization has two main steps, speaker segmentation and clusteri...
Main Author: | Zhao, Xiaozhi. |
---|---|
Other Authors: | Justin Dauwels |
Format: | Final Year Project |
Language: | English |
Published: | 2013 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | http://hdl.handle.net/10356/54391 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-54391 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-54391 2023-07-07T17:39:35Z Real-time sociofeedback from audio and video signals Zhao, Xiaozhi. Justin Dauwels School of Electrical and Electronic Engineering DRNTU::Engineering::Computer science and engineering This paper concentrates on speaker diarization of meeting recordings. The task of speaker diarization is to answer the question of “who and when”: who is speaking in the audio, and when each person speaks. Speaker diarization has two main steps, speaker segmentation and clustering. Speaker segmentation finds the speaker change points in the audio; the clustering step determines the number of speakers and when each of them speaks. We adopt the Bayesian information criterion (BIC) algorithm and three typical independent component analysis (ICA) algorithms as the experimental methods. We use BIC only for speaker segmentation, so the BIC output is not labeled. In our experiments, ICA is combined with speaker activity detection to implement speaker diarization. We compare their performance in speaker segmentation. BIC performs slightly better than the ICA algorithms: the accuracy of BIC reaches 84.45%, while the error rates of the ICA algorithms AMUSE, JADE and FOBI are 27.4%, 18.5% and 19.6%, respectively. Bachelor of Engineering 2013-06-19T09:18:50Z 2013-06-19T09:18:50Z 2013 2013 Final Year Project (FYP) http://hdl.handle.net/10356/54391 en Nanyang Technological University 70 p. application/pdf |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | DRNTU::Engineering::Computer science and engineering |
spellingShingle | DRNTU::Engineering::Computer science and engineering Zhao, Xiaozhi. Real-time sociofeedback from audio and video signals |
description | This paper concentrates on speaker diarization of meeting recordings. The task of speaker diarization is to answer the question of “who and when”: who is speaking in the audio, and when each person speaks. Speaker diarization has two main steps, speaker segmentation and clustering. Speaker segmentation finds the speaker change points in the audio; the clustering step determines the number of speakers and when each of them speaks. We adopt the Bayesian information criterion (BIC) algorithm and three typical independent component analysis (ICA) algorithms as the experimental methods. We use BIC only for speaker segmentation, so the BIC output is not labeled. In our experiments, ICA is combined with speaker activity detection to implement speaker diarization. We compare their performance in speaker segmentation. BIC performs slightly better than the ICA algorithms: the accuracy of BIC reaches 84.45%, while the error rates of the ICA algorithms AMUSE, JADE and FOBI are 27.4%, 18.5% and 19.6%, respectively. |
author2 | Justin Dauwels |
author_facet | Justin Dauwels Zhao, Xiaozhi. |
format | Final Year Project |
author | Zhao, Xiaozhi. |
author_sort | Zhao, Xiaozhi. |
title | Real-time sociofeedback from audio and video signals |
title_short | Real-time sociofeedback from audio and video signals |
title_full | Real-time sociofeedback from audio and video signals |
title_fullStr | Real-time sociofeedback from audio and video signals |
title_full_unstemmed | Real-time sociofeedback from audio and video signals |
title_sort | real-time sociofeedback from audio and video signals |
publishDate | 2013 |
url | http://hdl.handle.net/10356/54391 |
_version_ | 1772828365838876672 |