Sampling-based image and video matting without compositing equation
Main Author: Johnson, Jubin
Other Authors: Deepu Rajan
Format: Theses and Dissertations (Doctoral thesis)
Degree: Doctor of Philosophy (SCE)
Language: English
Published: 2017
Institution: Nanyang Technological University, School of Computer Science and Engineering, Centre for Multimedia and Network Technology
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/70341
DOI: 10.32657/10356/70341
Physical Description: 143 p., application/pdf
Citation: Johnson, J. (2017). Sampling-based image and video matting without compositing equation. Doctoral thesis, Nanyang Technological University, Singapore.
Description:
Image and video matting play a fundamental role in image and video editing applications. Matting methods are generally classified into α-propagation based approaches and color sampling based approaches. α-propagation methods leverage the correlation between neighboring pixels with respect to local image statistics to interpolate the known alpha values into the unknown regions. Color sampling methods estimate alpha from foreground (F) and background (B) samples taken from known regions that represent the true colors of the unknown pixels. Complex color distributions in the foreground and background regions, highly textured edges, and the unavailability of true F and B samples are some of the main challenges faced by current works. In addition, sampling methods have traditionally followed the compositing equation, using (F, B) pairs for alpha estimation. When matting is extended to videos, the unavailability of user-defined trimaps in each frame and the additional requirement of temporal coherency across the sequence make matte extraction highly challenging. We aim to develop novel natural matting algorithms for both images and videos that alleviate the drawbacks current methods face in generating a good-quality matte. We achieve these objectives through the following contributions.
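For context, the compositing equation that traditional sampling methods rely on models each observed pixel color I as a convex combination of a foreground color F and a background color B; given a candidate (F, B) pair, such methods commonly recover alpha by projecting I onto the line joining the two colors:

$$
I = \alpha F + (1 - \alpha)\,B, \qquad
\hat{\alpha} = \frac{(I - B)\cdot(F - B)}{\lVert F - B\rVert^{2}}, \qquad
\hat{\alpha} \in [0, 1].
$$

It is this per-pair estimation step that the title refers to dispensing with.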
First, a sampling-based image matting algorithm is proposed that utilizes sparse coding in the image domain to extract the alpha matte. Multiple F and B samples, as opposed to a single (F, B) pair, are used to describe the color at a blended pixel. A carefully chosen dictionary, made up of feature vectors from the F and B regions and refined through a foreground probability map, ensures that the constrained sparse-code coefficients approximate the alpha value. Experimental evaluations on a public benchmark database show that our method achieves state-of-the-art results.
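A minimal sketch of this idea, under simplified assumptions (raw RGB colors stand in for the thesis's feature vectors, and non-negative least squares stands in for its constrained sparse solver and probability-map refinement): stack F and B samples as dictionary atoms, solve for non-negative codes that sum to one, and read alpha off as the total weight on the foreground atoms.

```python
import numpy as np
from scipy.optimize import nnls

def estimate_alpha_sparse(pixel, fg_samples, bg_samples):
    """Estimate alpha at a blended pixel from multiple F and B samples.

    pixel      : (3,) RGB color of the unknown pixel
    fg_samples : (nF, 3) foreground colors sampled from the known F region
    bg_samples : (nB, 3) background colors sampled from the known B region

    Simplified assumption: NNLS replaces the constrained sparse coder;
    codes are renormalized to sum to one so the foreground mass reads
    directly as alpha.
    """
    D = np.vstack([fg_samples, bg_samples]).T   # 3 x (nF + nB) dictionary
    codes, _ = nnls(D, pixel)                   # non-negative codes
    total = codes.sum()
    if total < 1e-8:                            # degenerate: no support
        return 0.0
    codes /= total                              # enforce sum-to-one
    alpha = codes[: len(fg_samples)].sum()      # weight on F atoms
    return float(np.clip(alpha, 0.0, 1.0))

# Example: a 50/50 blend of a red foreground and a blue background
fg = np.array([[1.0, 0.0, 0.0]])
bg = np.array([[0.0, 0.0, 1.0]])
blended = 0.5 * fg[0] + 0.5 * bg[0]
print(estimate_alpha_sparse(blended, fg, bg))   # ~0.5
```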
Second, a new video matting algorithm is proposed that uses a multi-frame graphical model to ensure temporal coherency in the extracted matte. For good temporal coherence, the alpha value at a pixel needs to be consistent and smooth across the video sequence. This is accomplished by simultaneously solving for the alpha mattes of multiple consecutive frames. An objective function is proposed that can be solved in closed form as a sparse linear system. An adaptive temporal trimap propagation based on motion-assisted shape blending propagates the trimaps automatically between the key-frames. Experimental evaluations on an exclusive video matting dataset validate the effectiveness of the method.
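A minimal sketch of such a closed-form solve, under assumed notation (L is a hypothetical sparse smoothness matrix coupling pixels within and across the stacked consecutive frames, standing in for the thesis's multi-frame graphical model; lam weights the trimap constraints):

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_mattes(L, constrained, targets, lam=100.0):
    """Solve (L + lam*C) alpha = lam * C * targets in closed form.

    L           : (N, N) sparse smoothness matrix over all pixels of the
                  stacked frames (stand-in for the multi-frame model)
    constrained : (N,) bool mask, True where the trimap fixes alpha
    targets     : (N,) desired alpha at constrained pixels (1 in F, 0 in B)
    """
    C = sp.diags(constrained.astype(float))   # trimap data-term selector
    A = (L + lam * C).tocsr()                 # one sparse linear system
    b = lam * (C @ targets)
    return np.clip(spsolve(A, b), 0.0, 1.0)

# Toy example: 5 pixels on a chain, ends constrained to alpha = 1 and 0;
# the solve yields a smooth ramp between the two constraints.
adj = sp.diags([np.ones(4), np.ones(4)], offsets=[1, -1])
deg = sp.diags(np.asarray(adj.sum(axis=1)).ravel())
L = (deg - adj).tocsr()                       # chain graph Laplacian
mask = np.array([True, False, False, False, True])
targets = np.array([1.0, 0.0, 0.0, 0.0, 0.0])
print(solve_mattes(L, mask, targets))         # ~[1.0, 0.75, 0.5, 0.25, 0.0]
```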
Third, a new sampling-based video matting algorithm is proposed that reinterprets the matting problem from the perspective of the sparse reconstruction error of F and B samples. Sampling methods generally select the (F, B) pair that produces the least reconstruction error, but the significance of this error has been left unexamined. Two patch-based frameworks are used to ensure temporal coherency in the video mattes: a multi-frame non-local means framework using coherency-sensitive hashing, and a patch-based multi-frame graph model using motion. Qualitative and quantitative evaluations indicate the performance of the method in reducing temporal jitter and maintaining spatial accuracy in the video mattes.
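The reconstruction error referred to here is, in the standard sampling formulation, the residual of the compositing fit for a candidate pair,

$$
e(F, B) = \big\lVert I - \hat{\alpha} F - (1 - \hat{\alpha})\,B \big\rVert,
$$

with the projection estimate of alpha given earlier; the third contribution examines what the magnitude of this error itself says about sample reliability, rather than using it only to rank pairs.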