Video frame synthesis via plug-and-play deep locally temporal embedding
We propose a generative framework that tackles video frame interpolation. Conventionally, optical flow methods can solve the problem, but their perceptual quality depends on the accuracy of flow estimation. Nevertheless, a merit of traditional methods is their remarkable generalization ability. Recently, deep convolutional neural networks (CNNs) have achieved good performance at the price of computation. However, to deploy a CNN, it is necessary to train it on a large-scale dataset beforehand, not to mention the subsequent fine-tuning and adaptation. Also, despite producing sharp motion, their perceptual quality does not correlate well with their performance on pixel-to-pixel difference metrics due to various artifacts created by erroneous warping. In this paper, we take advantage of both conventional and deep-learning models and tackle the problem from a different perspective. The framework, which we call deep locally temporal embedding (DeepLTE), is powered by a deep CNN and can be used instantly, like conventional models. DeepLTE fits an auto-encoding CNN to several consecutive frames and imposes constraints on the latent representations so that new frames can be generated by interpolating new latent codes. Unlike the current deep learning paradigm, which requires training on large datasets, DeepLTE works in a plug-and-play, unsupervised manner and can generate an arbitrary number of frames from multiple given consecutive frames. We demonstrate that, without bells and whistles, DeepLTE outperforms existing state-of-the-art models in terms of perceptual quality.
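The workflow the abstract describes (fit a small auto-encoder to the given frames only, constrain the latent codes, then decode interpolated codes) can be sketched in a few lines. The following Python/PyTorch snippet is a hypothetical illustration, not the paper's actual DeepLTE: the tiny architecture, the L1 reconstruction loss, the midpoint-linearity constraint on the latent codes, and every hyperparameter are assumptions made for readability.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyAutoEncoder(nn.Module):
    """Deliberately small stand-in for the paper's auto-encoding CNN."""
    def __init__(self, latent_ch=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, latent_ch, 3, stride=2, padding=1),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(latent_ch, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

def synthesize(frames, t=0.5, steps=300, lam=0.1):
    """frames: (3, 3, H, W) tensor of three consecutive RGB frames in [0, 1].
    Fits the auto-encoder to these frames alone (the plug-and-play part:
    no pretraining, no external dataset), then decodes an interpolated
    latent code for time t between frame 0 and frame 2."""
    model = TinyAutoEncoder()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):
        opt.zero_grad()
        recon, z = model(frames)
        rec_loss = F.l1_loss(recon, frames)
        # Assumed temporal-embedding constraint: pull the middle latent
        # onto the segment between its neighbours, so that *new* codes on
        # that segment also decode to plausible frames.
        emb_loss = F.mse_loss(z[1], 0.5 * (z[0] + z[2]))
        (rec_loss + lam * emb_loss).backward()
        opt.step()
    with torch.no_grad():
        _, z = model(frames)
        # Any t in [0, 1] works, so an arbitrary number of in-between
        # frames can be decoded from one fitted model.
        z_new = (1.0 - t) * z[0] + t * z[2]
        return model.decoder(z_new.unsqueeze(0)).squeeze(0)

# Example: synthesize the temporal midpoint of a (random) frame triplet.
triplet = torch.rand(3, 3, 64, 64)
middle = synthesize(triplet, t=0.5)  # (3, 64, 64) tensor
```

The paper's DeepLTE uses a deeper network and a more elaborate locally temporal embedding, but the pattern above, optimizing on the given frames at inference time and then decoding interpolated codes, is the plug-and-play, unsupervised idea the abstract refers to.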
Saved in:
Main Authors: Nguyen, Anh-Duc; Kim, Woojae; Kim, Jongyoo; Lin, Weisi; Lee, Sanghoon
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2021
Subjects: Engineering::Computer science and engineering; Frame Synthesis; Video Processing
Online Access: https://hdl.handle.net/10356/145916
Institution: Nanyang Technological University
Citation: Nguyen, A.-D., Kim, W., Kim, J., Lin, W., & Lee, S. (2019). Video frame synthesis via plug-and-play deep locally temporal embedding. IEEE Access, 7, 179304-179319. doi:10.1109/ACCESS.2019.2959019
ISSN: 2169-3536
Scopus ID: 2-s2.0-85077230233
ORCID iDs: 0000-0001-9895-5347; 0000-0002-8312-9736; 0000-0002-2435-9195; 0000-0001-9866-1947
Version: Published version
Deposited: 2021-01-14
Collection: DR-NTU (NTU Library, Nanyang Technological University)
Rights: © 2019 IEEE. This journal is 100% open access, which means that all content is freely available without charge to users or their institutions. All articles accepted after 12 June 2019 are published under a CC BY 4.0 license, and the author retains copyright. Users are allowed to read, download, copy, distribute, print, search, or link to the full texts of the articles, or use them for any other lawful purpose, as long as proper attribution is given.