Measuring novelty and redundancy with multiple modalities in cross-lingual broadcast news

Bibliographic Details
Main Authors: WU, Xiao, HAUPTMANN, Alexander G., NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2008
Online Access: https://ink.library.smu.edu.sg/sis_research/6327
https://ink.library.smu.edu.sg/context/sis_research/article/7330/viewcontent/Measuring_novelty_and_redundancy_with_mu.pdf
Institution: Singapore Management University
Description
Summary: News videos from different channels and in different languages are broadcast every day, providing abundant information for users. To search, retrieve, browse and track news stories effectively, story similarity plays a critical role in assessing novelty and redundancy among stories. In this paper, we explore different measures for novelty and redundancy detection in cross-lingual news stories. A news story is represented by multimodal features: a sequence of keyframes from the visual track, and a set of words and named entities extracted from the speech transcript of the audio track. Vector space models and language models on individual features (text, named entities and keyframes) are constructed to compare the similarity among stories. Multiple modalities are then fused to improve performance. Experiments on the TRECVID-2005 cross-lingual news video corpus show that performance for novelty and redundancy detection varies across modalities and measures. Language models on text are appropriate for detecting completely redundant stories, while Cosine Distance on keyframes is suitable for detecting somewhat redundant stories. Performance on monolingual topics is better than on multilingual topics. Textual and visual features complement each other, and fusing text, named entities and keyframes substantially improves performance, outperforming approaches that use individual features alone. (C) 2007 Elsevier Inc. All rights reserved.
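
Illustrative sketch (not taken from the paper): the summary mentions a vector space model with a cosine measure and the fusion of several modalities. The Python below shows one common way these two ideas can be expressed. The function names, the example stories, the per-modality scores and the weighted-average fusion rule are all assumptions made for illustration; the record does not say how the authors weight terms or combine modalities.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty story shares nothing with any other story
    return dot / (norm_a * norm_b)


def fuse(scores, weights):
    """Weighted average of per-modality scores (assumed fusion rule)."""
    total = sum(weights[m] for m in scores)
    return sum(weights[m] * scores[m] for m in scores) / total


# Hypothetical transcripts of two stories, reduced to raw term counts.
story_a = Counter("flood hits city flood rescue".split())
story_b = Counter("flood rescue teams reach city".split())
text_sim = cosine_similarity(story_a, story_b)

# Hypothetical scores for the other two modalities and hypothetical weights.
scores = {"text": text_sim, "named_entities": 0.5, "keyframes": 0.25}
weights = {"text": 1.0, "named_entities": 1.0, "keyframes": 2.0}
fused_sim = fuse(scores, weights)
```

A high fused score would mark the newer story as redundant with respect to the older one, and a low score would mark it as novel. Any decision threshold would have to be tuned on labelled data.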