A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video

Text detection and tracking in video is challenging due to contrast, resolution and background variations, and different orientations and text movements. In addition, the presence of both caption and scene texts in video aggravates the problem because these two text types differ in characteristics s...

Bibliographic Details
Main Authors: Wu, L., Shivakumara, P., Lu, T., Tan, C.L.
Format: Article
Published: Institute of Electrical and Electronics Engineers (IEEE) 2015
Subjects: QA75 Electronic computers. Computer science
Online Access: http://eprints.um.edu.my/19428/
http://dx.doi.org/10.1109/TMM.2015.2443556
Institution: Universiti Malaya
id my.um.eprints.19428
record_format eprints
spelling my.um.eprints.19428 2018-09-26T04:51:00Z http://eprints.um.edu.my/19428/ Wu, L. and Shivakumara, P. and Lu, T. and Tan, C.L. (2015) A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video. IEEE Transactions on Multimedia, 17 (8). pp. 1137-1152. ISSN 1520-9210 http://dx.doi.org/10.1109/TMM.2015.2443556 doi:10.1109/TMM.2015.2443556 Article PeerReviewed
institution Universiti Malaya
building UM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Malaya
content_source UM Research Repository
url_provider http://eprints.um.edu.my/
topic QA75 Electronic computers. Computer science
description Text detection and tracking in video is challenging because of variations in contrast, resolution, and background, as well as differing text orientations and movements. The presence of both caption and scene text in video aggravates the problem because the two text types differ significantly in their characteristics. This paper proposes a new technique for detecting and tracking video text of any orientation, using spatial information for detection and temporal information for tracking. The technique explores gradient directional symmetry at the component level to smooth edge components before text detection. Spatial information is preserved at this level by forming a Delaunay triangulation in a novel way, which yields text candidates. Text characteristics are then proposed in a different way to eliminate false text candidates, which yields potential text candidates. A grouping step then combines potential text candidates, regardless of orientation, based on the nearest neighbor criterion. To handle multi-font and multi-sized text, we propose multi-scale integration by a pyramid structure, which helps in extracting full text lines. The detected text lines are then tracked in video by matching the subgraphs of the triangulation. Experimental results for text detection and tracking on our video dataset, the benchmark video datasets, and the natural scene image benchmark datasets show that the proposed method is superior to the state-of-the-art methods in terms of recall, precision, and F-measure.
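The abstract mentions grouping potential text candidates, regardless of orientation, by a nearest neighbor criterion, but this record does not give the details. The following is only a minimal sketch of one way such grouping could work, not the authors' implementation. It assumes each candidate is reduced to a 2-D centroid, links each centroid to its nearest neighbor when that neighbor lies within a distance threshold, and reports the connected groups. The function name `group_by_nearest_neighbor`, the `max_dist` parameter, and the sample coordinates are all hypothetical.

```python
import math


def group_by_nearest_neighbor(points, max_dist):
    """Group 2-D points by linking each point to its nearest neighbor.

    A link is kept only when the nearest neighbor is within max_dist
    (an assumed threshold, not taken from the paper). Groups are the
    connected components of the kept links, so the result does not
    depend on the direction in which the points are laid out.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, p in enumerate(points):
        best_j, best_d = None, math.inf
        for j, q in enumerate(points):
            if i == j:
                continue
            d = math.dist(p, q)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None and best_d <= max_dist:
            parent[find(i)] = find(best_j)  # merge the two groups

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical centroids: one diagonal text line and one horizontal text line.
centroids = [(0, 0), (10, 10), (20, 20), (100, 0), (110, 0), (120, 0)]
groups = group_by_nearest_neighbor(centroids, max_dist=20.0)
print(groups)
```

Because only Euclidean distance is used, the diagonal and horizontal lines in the sample data are handled identically, which is the sense in which the grouping is independent of orientation. The brute-force neighbor search is quadratic in the number of candidates; a spatial index would be a natural substitute for larger inputs.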
format Article
author Wu, L.
Shivakumara, P.
Lu, T.
Tan, C.L.
title A New Technique for Multi-Oriented Scene Text Line Detection and Tracking in Video
publisher Institute of Electrical and Electronics Engineers (IEEE)
publishDate 2015
url http://eprints.um.edu.my/19428/
http://dx.doi.org/10.1109/TMM.2015.2443556
_version_ 1643690985692069888