AIM 2020 challenge on video extreme super-resolution: methods and results

Bibliographic Details
Main Authors: FUOLI, D., HUANG, Zhiwu, GU, S., TIMOFTE, R., RAVENTOS, A., ESFANDIARI, A., KAROUT, S., XU, X., LI, X., XIONG, X., WANG, J., NAVARRETE, Michelini P., ZHANG, W., ZHANG, D., ZHU, H., XIA, D., CHEN, H., GU, J., ZHANG, Z., ZHAO, T.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Online Access: https://ink.library.smu.edu.sg/sis_research/6549
https://ink.library.smu.edu.sg/context/sis_research/article/7552/viewcontent/AIM_2020_Challenge_on_Video_Extreme_Super_Resoluti.pdf
Institution: Singapore Management University
Description
Summary: This paper reviews the video extreme super-resolution challenge associated with the AIM 2020 workshop at ECCV 2020. Common scaling factors for learned video super-resolution (VSR) do not go beyond a factor of 4. Missing information can be restored well in this range, especially in high-resolution (HR) videos, where the high-frequency content mostly consists of texture details. The task in this challenge is to upscale videos by an extreme factor of 16, which results in more serious degradations that also affect the structural integrity of the videos. A single pixel in the low-resolution (LR) domain corresponds to 256 pixels in the HR domain. Due to this massive information loss, it is hard to accurately restore the missing information. Track 1 is set up to gauge the state of the art for such a demanding task, where fidelity to the ground truth is measured by PSNR and SSIM. Perceptually higher quality can be achieved at the cost of fidelity by generating plausible high-frequency content. Track 2 therefore aims at generating visually pleasing results, which are ranked according to human perception, as evaluated by a user study. In contrast to single image super-resolution (SISR), VSR can benefit from additional information in the temporal domain. However, this also imposes an additional requirement, as the generated frames need to be consistent over time.
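The arithmetic behind the summary can be illustrated with a minimal sketch. It shows why a per-axis factor of 16 maps one LR pixel to 256 HR pixels, and how PSNR, one of the two Track 1 fidelity metrics, is conventionally computed. The function names `hr_pixels_per_lr_pixel` and `psnr` are hypothetical, and the code uses the textbook PSNR definition on 8-bit values. The challenge's exact evaluation protocol (color space, border handling, averaging over frames) is not stated in this record and is not reproduced here.

```python
import math


def hr_pixels_per_lr_pixel(scale: int) -> int:
    """HR pixels covered by one LR pixel when both axes are scaled by `scale`."""
    return scale * scale


def psnr(reference, estimate, max_value=255.0):
    """Textbook PSNR in dB between two equally sized frames.

    Frames are nested lists of pixel intensities (rows of values).
    This is an illustrative definition, not the challenge's official scorer.
    """
    ref = [p for row in reference for p in row]
    est = [p for row in estimate for p in row]
    if len(ref) != len(est):
        raise ValueError("frames must have the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(ref, est)) / len(ref)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    print(hr_pixels_per_lr_pixel(4))   # common VSR factor
    print(hr_pixels_per_lr_pixel(16))  # factor used in this challenge
```

Higher PSNR means the estimate is closer to the ground truth in a mean-squared-error sense. As the summary notes, this does not necessarily track perceived quality, which is why Track 2 relies on a user study instead.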