Optimal region selection for stereoscopic video subtitle insertion

Bibliographic Details
Main Authors: Yue, Guanghui; Hou, Chunping; Lei, Jianjun; Fang, Yuming; Lin, Weisi
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2020
Online Access: https://hdl.handle.net/10356/140153
Institution: Nanyang Technological University
Description
Summary: Stereoscopic subtitle insertion is an essential element of the stereoscopic film and TV industry. However, little work has been dedicated to optimal region selection for stereoscopic subtitle insertion, and no public database has been reported for evaluating its performance. In this paper, we build the first large-scale video database (TJU3D) for stereoscopic video subtitle insertion, which includes 50 video sequences with rich screen scenes. Compared with 2D subtitle region selection, stereoscopic subtitle region selection raises several additional problems: 1) the subtitle should avoid depth cue collision with, and occlusion by, objects in the stereoscopic video sequence; 2) the disparity value of the subtitle must be minimized to reduce visual discomfort; and 3) a temporal coherence constraint must be respected when selecting subtitle regions across a video sequence. Taking these constraints into account, we propose an optimal region selection algorithm for stereoscopic subtitle insertion. First, we compute the disparity map of each frame in the video sequence. For each frame, the optimal position and disparity value of the subtitle are determined by a subtitle region selection algorithm consisting of two parts: coarse selection and fine selection. Then, by considering the temporal consistency between adjacent frames, the position and disparity value of each frame are further classified and processed to avoid subtitle jitter. We evaluate the proposed method on the TJU3D video database using two visual discomfort prediction metrics and one subjective experiment. To further verify its effectiveness, we also validate the proposed method on a video comfort assessment database, the IEEE-SA Stereo Database. Experimental results demonstrate that visual discomfort is greatly reduced when using the proposed method compared with the basic method.
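The three constraints listed in the summary can be illustrated with a minimal Python sketch. This is not the authors' coarse/fine selection algorithm, whose details the record does not give; it is a simplified stand-in under stated assumptions. The function names (`select_region`, `smooth_track`), the sliding-window search, the `margin` parameter, and the convention that a larger disparity means "closer to the viewer" are all hypothetical choices made for illustration.

```python
import numpy as np


def select_region(disparity, box_h, box_w, stride=8, margin=1.0):
    """Pick a subtitle window for one frame (illustrative only).

    Assumed convention: larger disparity = closer to the viewer.
    Returns (row, col, subtitle_disparity) of the chosen window.
    """
    height, width = disparity.shape
    best = None
    for y in range(0, height - box_h + 1, stride):
        for x in range(0, width - box_w + 1, stride):
            window = disparity[y:y + box_h, x:x + box_w]
            # Constraint 1: place the subtitle in front of everything it
            # overlaps, so no object collides with or occludes it.
            needed = float(window.max()) + margin
            # Constraint 2: prefer the window that needs the smallest
            # disparity magnitude.
            if best is None or abs(needed) < abs(best[2]):
                best = (y, x, needed)
    return best


def smooth_track(values, radius=2):
    """Constraint 3: sliding median over frames to suppress jitter."""
    values = np.asarray(values, dtype=float)
    out = np.empty_like(values)
    for i in range(len(values)):
        lo, hi = max(0, i - radius), min(len(values), i + radius + 1)
        out[i] = np.median(values[lo:hi])
    return out
```

In use, `select_region` would be called on each frame's disparity map, and the resulting per-frame rows, columns, and disparities would each be passed through `smooth_track`. A sliding median is only one simple way to enforce temporal coherence; the paper's own classification step may differ.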