Exploiting diffusion prior for real-world image super-resolution
We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution. Specifically, by employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model, thereby preserving the generative prior and minimizing training cost. To remedy the loss of fidelity caused by the inherent stochasticity of diffusion models, we employ a controllable feature wrapping module that allows users to balance quality and fidelity by simply adjusting a scalar value during the inference process. Moreover, we develop a progressive aggregation sampling strategy to overcome the fixed-size constraints of pre-trained diffusion models, enabling adaptation to resolutions of any size. A comprehensive evaluation of our method using both synthetic and real-world benchmarks demonstrates its superiority over current state-of-the-art approaches. Code and models are available at https://github.com/IceClear/StableSR.
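The abstract names two concrete mechanisms: a scalar-controlled trade-off between fidelity and generative quality, and a patch-based sampling strategy that aggregates overlapping fixed-size outputs into an image of arbitrary resolution. The sketch below is a simplified, hypothetical illustration of both ideas, not the paper's actual implementation: `controllable_fusion` stands in for the controllable feature wrapping module as a plain linear blend, and `aggregate_overlapping_patches` shows Gaussian-weighted aggregation of overlapping patches, which is one common way to realize such a strategy.

```python
import numpy as np

def controllable_fusion(enc_feat: np.ndarray, gen_feat: np.ndarray, w: float) -> np.ndarray:
    # Hypothetical linear blend standing in for the controllable feature
    # wrapping module: w = 0 trusts the generative prior (quality),
    # w = 1 trusts the encoder features from the low-resolution input (fidelity).
    return (1.0 - w) * gen_feat + w * enc_feat

def aggregate_overlapping_patches(patches, top_lefts, out_hw, patch_hw):
    # Sketch of progressive aggregation: overlapping fixed-size patches are
    # blended with Gaussian weights so seams average out, letting a model
    # with a fixed working resolution cover an arbitrarily large image.
    H, W = out_hw
    ph, pw = patch_hw
    out = np.zeros((H, W), dtype=np.float64)
    weight = np.zeros((H, W), dtype=np.float64)
    # Gaussian window peaked at the patch center, strictly positive everywhere.
    yy, xx = np.mgrid[0:ph, 0:pw]
    g = np.exp(-((yy - (ph - 1) / 2) ** 2 + (xx - (pw - 1) / 2) ** 2)
               / (2 * (0.35 * ph) ** 2))
    for patch, (y, x) in zip(patches, top_lefts):
        out[y:y + ph, x:x + pw] += patch * g
        weight[y:y + ph, x:x + pw] += g
    # Normalize by accumulated weight; uncovered pixels remain zero.
    return out / np.maximum(weight, 1e-8)
```

Because the weights are normalized out, two overlapping patches that agree on their overlap reconstruct that region exactly, while disagreements are smoothly averaged rather than producing visible seams.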
Saved in:
Main Authors: Wang, Jianyi; Yue, Zongsheng; Zhou, Shangchen; Chan, Kelvin C. K.; Loy, Chen Change
Other Authors: College of Computing and Data Science; S-Lab
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Image restoration; Diffusion models
Online Access: https://hdl.handle.net/10356/180685
Institution: Nanyang Technological University
Funding: National Research Foundation (NRF). This study is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG2-PhD-2022-01-033[T]) and the RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s). We sincerely thank Yi Li for providing valuable advice and building the WebUI implementation (https://github.com/pkuliyi2015/sd-webui-stablesr) of our work. We also thank the community for its continued interest and contributions.
Citation: Wang, J., Yue, Z., Zhou, S., Chan, K. C. K. & Loy, C. C. (2024). Exploiting diffusion prior for real-world image super-resolution. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02168-7
ISSN: 0920-5691
DOI: 10.1007/s11263-024-02168-7
© 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved.
Content Provider: NTU Library (DR-NTU collection, Singapore)