Broadcast news story segmentation using conditional random fields and multimodal features
In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall...
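As a rough illustration of the lexical similarity cue named in the abstract, the sketch below scores a boundary candidate by the cosine similarity between bag-of-words counts of the words just before and just after it. This is an assumption made for illustration: the record does not state how the paper defines lexical similarity, and the function names, window size and toy transcript are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(left_words, right_words):
    """Cosine similarity between the bag-of-words counts of two word lists."""
    left, right = Counter(left_words), Counter(right_words)
    dot = sum(left[w] * right[w] for w in left.keys() & right.keys())
    norm = math.sqrt(sum(c * c for c in left.values())) * math.sqrt(
        sum(c * c for c in right.values())
    )
    return dot / norm if norm else 0.0


def lexical_similarity_at(words, position, window=4):
    """Similarity of the `window` words before and after a candidate position.

    Hypothetical helper: a low value suggests a topic shift at `position`.
    """
    before = words[max(0, position - window):position]
    after = words[position:position + window]
    return cosine_similarity(before, after)


# Toy transcript (made up): a weather story followed by a markets story.
transcript = (
    "the storm hit the coast the storm moved north "
    "markets fell today as stocks fell"
).split()

inside_story = lexical_similarity_at(transcript, 4)   # within the weather story
at_story_change = lexical_similarity_at(transcript, 9)  # between the two stories
```

In this toy case the windows around position 9 share no words, so the score drops to zero, which is the kind of dip a segmenter could treat as boundary evidence.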
Main Authors: | Wang, Xiaoxuan; Xie, Lei; Lu, Mimi; Ma, Bin; Chng, Eng Siong; Li, Haizhou |
---|---|
Other Authors: | School of Computer Engineering |
Format: | Article |
Language: | English |
Published: | 2013 |
Subjects: | DRNTU::Engineering::Computer science and engineering |
Online Access: | https://hdl.handle.net/10356/105431 http://hdl.handle.net/10220/16587 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-105431 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1054312020-05-28T07:17:20Z Broadcast news story segmentation using conditional random fields and multimodal features Wang, Xiaoxuan Xie, Lei Lu, Mimi Ma, Bin Chng, Eng Siong Li, Haizhou School of Computer Engineering DRNTU::Engineering::Computer science and engineering In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers. Published Version 2013-10-18T03:04:02Z 2019-12-06T21:51:08Z 2013-10-18T03:04:02Z 2019-12-06T21:51:08Z 2012 2012 Journal Article Wang, X., Xie, L., Lu, M., Ma, B., Chng, E. S., & Li, H. (2012). Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features. IEICE Transactions on Information and Systems, E95-D(5), 1206-1215. https://hdl.handle.net/10356/105431 http://hdl.handle.net/10220/16587 10.1587/transinf.E95.D.1206 en IEICE transactions on information and systems © 2012 Institute of Electronics, Information and Communication Engineers. 
This paper was published in IEICE transactions on information and systems and is made available as an electronic reprint (preprint) with permission of Institute of Electronics, Information and Communication Engineers. The paper can be found at the following official DOI: [http://dx.doi.org/10.1587/transinf.E95.D.1206]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering |
description |
In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted in a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers. |
author2 |
School of Computer Engineering |
author_facet |
School of Computer Engineering Wang, Xiaoxuan Xie, Lei Lu, Mimi Ma, Bin Chng, Eng Siong Li, Haizhou |
format |
Article |
author |
Wang, Xiaoxuan Xie, Lei Lu, Mimi Ma, Bin Chng, Eng Siong Li, Haizhou |
author_sort |
Wang, Xiaoxuan |
title |
Broadcast news story segmentation using conditional random fields and multimodal features |
publishDate |
2013 |
url |
https://hdl.handle.net/10356/105431 http://hdl.handle.net/10220/16587 |
_version_ |
1681058824218214400 |
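
The description above says a linear-chain CRF tags each boundary candidate while capturing interlabel relations. The sketch below illustrates only the decoding step (Viterbi search over boundary/non-boundary tags), under loud assumptions: the feature names, weights, transition penalty and candidate values are all hypothetical and hand-set, whereas the paper learns its parameters from data. It is not the authors' implementation.

```python
LABELS = ("non-boundary", "boundary")

# Hypothetical, hand-set weights for the "boundary" label (not learned).
WEIGHTS = {"pause": 2.0, "shot_boundary": 1.5, "lexical_similarity": -3.0}
BIAS = -2.0

# TRANSITIONS[prev][cur]: assumed penalty for two adjacent boundaries.
TRANSITIONS = [[0.0, 0.0], [0.0, -4.0]]


def emission(features):
    """Per-candidate scores [non-boundary, boundary] from multimodal features."""
    boundary = BIAS + sum(WEIGHTS[name] * value for name, value in features.items())
    return [0.0, boundary]


def viterbi(emission_scores, transition_scores):
    """Return the highest-scoring label index sequence for a linear chain."""
    n_labels = len(transition_scores)
    best = [list(emission_scores[0])]  # best[t][y]: best score ending in y at t
    back = []                          # back[t-1][y]: best previous label
    for t in range(1, len(emission_scores)):
        row, ptr = [], []
        for y in range(n_labels):
            prev = max(
                range(n_labels),
                key=lambda p: best[-1][p] + transition_scores[p][y],
            )
            row.append(best[-1][prev] + transition_scores[prev][y] + emission_scores[t][y])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    y = max(range(n_labels), key=lambda k: best[-1][k])
    path = [y]
    for ptr in reversed(back):  # follow back-pointers from the last candidate
        y = ptr[y]
        path.append(y)
    return path[::-1]


# Four made-up candidates: pause in seconds, shot change flag, lexical similarity.
candidates = [
    {"pause": 0.1, "shot_boundary": 0, "lexical_similarity": 0.6},
    {"pause": 1.2, "shot_boundary": 1, "lexical_similarity": 0.1},
    {"pause": 0.9, "shot_boundary": 1, "lexical_similarity": 0.3},
    {"pause": 0.2, "shot_boundary": 0, "lexical_similarity": 0.7},
]
emissions = [emission(c) for c in candidates]
labels = viterbi(emissions, TRANSITIONS)
independent = [max((0, 1), key=lambda y: e[y]) for e in emissions]
```

With these toy numbers, scoring each candidate on its own marks the second and third candidates as boundaries, while the chain-level decoding keeps only the stronger second one. That difference is a small example of the interlabel dependence that a per-candidate classifier cannot express.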