Broadcast news story segmentation using conditional random fields and multimodal features

In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall...

Bibliographic Details
Main Authors: Wang, Xiaoxuan; Xie, Lei; Lu, Mimi; Ma, Bin; Chng, Eng Siong; Li, Haizhou
Other Authors: School of Computer Engineering
Format: Article
Language: English
Published: 2013
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/105431
http://hdl.handle.net/10220/16587
Institution: Nanyang Technological University
id sg-ntu-dr.10356-105431
record_format dspace
spelling sg-ntu-dr.10356-105431 2020-05-28T07:17:20Z
Broadcast news story segmentation using conditional random fields and multimodal features
Wang, Xiaoxuan; Xie, Lei; Lu, Mimi; Ma, Bin; Chng, Eng Siong; Li, Haizhou
School of Computer Engineering
DRNTU::Engineering::Computer science and engineering
In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted at a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
Published Version
2013-10-18T03:04:02Z 2019-12-06T21:51:08Z 2013-10-18T03:04:02Z 2019-12-06T21:51:08Z 2012 2012
Journal Article
Wang, X., Xie, L., Lu, M., Ma, B., Chng, E. S., & Li, H. (2012). Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features. IEICE Transactions on Information and Systems, E95-D(5), 1206-1215.
https://hdl.handle.net/10356/105431
http://hdl.handle.net/10220/16587
10.1587/transinf.E95.D.1206
en
IEICE transactions on information and systems
© 2012 Institute of Electronics, Information and Communication Engineers. This paper was published in IEICE transactions on information and systems and is made available as an electronic reprint (preprint) with permission of Institute of Electronics, Information and Communication Engineers. The paper can be found at the following official DOI: [http://dx.doi.org/10.1587/transinf.E95.D.1206]. One print or electronic copy may be made for personal use only. Systematic or multiple reproduction, distribution to multiple locations via electronic or other means, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper is prohibited and is subject to penalties under law.
application/pdf
institution Nanyang Technological University
building NTU Library
country Singapore
collection DR-NTU
language English
topic DRNTU::Engineering::Computer science and engineering
description In this paper, we propose integration of multimodal features using conditional random fields (CRFs) for the segmentation of broadcast news stories. We study story boundary cues from lexical, audio and video modalities, where lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features involve pause duration, pitch, speaker change and audio event type; and visual features contain shot boundaries, anchor faces and news title captions. These features are extracted at a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF is used to label each candidate as boundary or non-boundary based on the multimodal features. Important interlabel relations and contextual feature information are effectively captured by the sequential learning framework of CRFs. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision trees (DTs), Bayesian networks (BNs), naive Bayesian classifiers (NBs), multilayer perceptron (MLP), support vector machines (SVMs) and maximum entropy (ME) classifiers.
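The description says a linear-chain CRF labels each boundary candidate using multimodal cues and interlabel relations. A minimal sketch of that decoding step is given here, under loud assumptions: the feature names, the weights and the functions `state_score` and `viterbi` are hypothetical illustrations, not the authors' feature set, trained parameters or code. It only shows how a transition weight between adjacent labels can change a decision that an independent per-candidate classifier would make.

```python
# Minimal sketch of linear-chain CRF decoding (Viterbi) over boundary candidates.
# All feature names and weights are hypothetical, not values from the paper.

LABELS = ("non-boundary", "boundary")

# Hypothetical per-label weights for a few binary multimodal cues.
STATE_WEIGHTS = {
    "boundary": {"bias": -2.0, "long_pause": 1.5, "speaker_change": 1.0,
                 "shot_boundary": 0.8, "anchor_face": 1.2},
    "non-boundary": {"bias": 0.0},
}

# Hypothetical transition weights: two adjacent boundaries are penalized.
TRANSITION_WEIGHTS = {
    ("non-boundary", "non-boundary"): 0.0,
    ("non-boundary", "boundary"): 0.0,
    ("boundary", "non-boundary"): 0.0,
    ("boundary", "boundary"): -3.0,
}


def state_score(label, features):
    """Sum the bias and the weights of the active features for one label."""
    weights = STATE_WEIGHTS[label]
    return weights.get("bias", 0.0) + sum(weights.get(f, 0.0) for f in features)


def viterbi(candidates):
    """Return the highest-scoring label sequence for a list of feature sets."""
    if not candidates:
        return []
    # best[t][label] = (score of the best path ending in label at t, previous label)
    best = [{lab: (state_score(lab, candidates[0]), None) for lab in LABELS}]
    for feats in candidates[1:]:
        column = {}
        for lab in LABELS:
            prev = max(LABELS,
                       key=lambda p: best[-1][p][0] + TRANSITION_WEIGHTS[(p, lab)])
            score = (best[-1][prev][0] + TRANSITION_WEIGHTS[(prev, lab)]
                     + state_score(lab, feats))
            column[lab] = (score, prev)
        best.append(column)
    # Backtrack from the best final label.
    label = max(LABELS, key=lambda lab: best[-1][lab][0])
    path = [label]
    for column in reversed(best[1:]):
        label = column[label][1]
        path.append(label)
    return path[::-1]


# Four hypothetical candidate positions with their active cues.
candidates = [
    {"long_pause"},
    {"long_pause", "speaker_change", "anchor_face"},
    {"long_pause", "speaker_change", "shot_boundary"},
    set(),
]
crf_labels = viterbi(candidates)
# Per-candidate decision that ignores neighbouring labels, for comparison.
independent_labels = [max(LABELS, key=lambda lab: state_score(lab, c))
                      for c in candidates]
```

With these made-up weights, the independent decision marks the second and third candidates as boundaries, while the sequence decoder keeps only the second because of the penalty on adjacent boundaries. In a real system the weights would be learned from annotated broadcast news rather than set by hand.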
author2 School of Computer Engineering
format Article
author Wang, Xiaoxuan
Xie, Lei
Lu, Mimi
Ma, Bin
Chng, Eng Siong
Li, Haizhou
title Broadcast news story segmentation using conditional random fields and multimodal features
publishDate 2013
url https://hdl.handle.net/10356/105431
http://hdl.handle.net/10220/16587