Exploiting the image prior in CLIP for super-resolution


Bibliographic Details
Main Author: Chen, Xingyu
Other Authors: Chen Change Loy
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/175133
Institution: Nanyang Technological University
Description
Summary: Super-resolution (SR) is a fundamental task in computer vision that aims to enhance the resolution and quality of low-resolution images. A persistent challenge is the inherent ambiguity of the problem: a single low-resolution image may correspond to multiple high-resolution images. Additional priors are essential to address this problem, especially when the degradation is complex. The recent emergence of large vision-language models such as CLIP offers the potential to enhance SR generation by providing extra contextual information from the image. In this project, we therefore investigate the efficacy of integrating CLIP priors into image super-resolution. Through a series of experiments, we explore both blind and non-blind SR problems and evaluate the impact of CLIP priors on model performance. We also analyze the limitations and challenges associated with CLIP integration, particularly in handling low-resolution and incomplete images. Our findings demonstrate that while CLIP priors hold promise for enhancing SR results, careful fine-tuning is required to optimize their use in image generation tasks.