Towards textually describing complex video contents with audio-visual concept classifiers

Automatically generating compact textual descriptions of complex video contents has wide applications. With the recent advancements in automatic audio-visual content recognition, in this paper we explore the technical feasibility of the challenging issue of precisely recounting video contents. Based...

Bibliographic Details
Main Authors: TAN, Chun Chet, JIANG, Yu-Gang, NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2011
Subjects:
Online Access: https://ink.library.smu.edu.sg/sis_research/6489
https://ink.library.smu.edu.sg/context/sis_research/article/7492/viewcontent/2072298.2072411.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7492
record_format dspace
spelling sg-smu-ink.sis_research-7492
2022-01-10T05:05:50Z
Towards textually describing complex video contents with audio-visual concept classifiers
TAN, Chun Chet
JIANG, Yu-Gang
NGO, Chong-wah
2011-12-01T08:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/6489
info:doi/10.1145/2072298.2072411
https://ink.library.smu.edu.sg/context/sis_research/article/7492/viewcontent/2072298.2072411.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Audio-visual concept classification
Textual descriptions of video content
Artificial Intelligence and Robotics
Graphics and Human Computer Interfaces
description Automatically generating compact textual descriptions of complex video contents has wide applications. With the recent advancements in automatic audio-visual content recognition, in this paper we explore the technical feasibility of the challenging issue of precisely recounting video contents. Based on cutting-edge automatic recognition techniques, we start from classifying a variety of visual and audio concepts in video contents. According to the classification results, we apply simple rule-based methods to generate textual descriptions of video contents. Results are evaluated by conducting carefully designed user studies. We find that the state-of-the-art visual and audio concept classification, although far from perfect, is able to provide very useful clues indicating what is happening in the videos. Most users involved in the evaluation confirmed the informativeness of our machine-generated descriptions.
format text
author TAN, Chun Chet
JIANG, Yu-Gang
NGO, Chong-wah
title Towards textually describing complex video contents with audio-visual concept classifiers
publisher Institutional Knowledge at Singapore Management University
publishDate 2011