Lecture video enhancement and editing by integrating posture, gesture, and text

Bibliographic Details
Main Authors: WANG, Feng, NGO, Chong-wah, PONG, Ting-Chuen
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2007
Online Access: https://ink.library.smu.edu.sg/sis_research/6325
https://ink.library.smu.edu.sg/context/sis_research/article/7328/viewcontent/10.1.1.501.9513.pdf
Institution: Singapore Management University
Description
Summary: This paper describes a novel framework for automatic lecture video editing based on gesture, posture, and video text recognition. In content analysis, the trajectory of hand movement is tracked, and intentional gestures are automatically extracted for recognition. In addition, head pose is estimated in a way that overcomes the difficulties posed by the complex lighting conditions in classrooms. The aim of recognition is to characterize the flow of lecturing as a series of regional focuses depicted by human postures and gestures. The regions of interest (ROIs) in videos are semantically structured using text recognition and the aid of external documents. By tracing the flow of lecturing, a finite state machine (FSM) that incorporates gestures, postures, ROIs, and general editing rules and constraints is proposed to edit videos with novel views. The FSM is designed to generate simulated camera motion and cutting effects that suit the pace of a presenter's gestures and postures. To remedy the undesirable visual effects of poor lighting conditions, the authors also propose approaches that automatically enhance the visibility and readability of slides and whiteboard images in the edited videos.