Pluralistic image completion
Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion - the task of generating multiple and diverse plausible solutions for image completion. A major cha...
Main Authors: | Zheng, Chuanxia; Cham, Tat-Jen; Cai, Jianfei |
---|---|
Other Authors: | School of Computer Science and Engineering |
Format: | Conference or Workshop Item |
Language: | English |
Published: | 2020 |
Subjects: | Engineering::Computer science and engineering; Computer Vision; Image Reconstruction |
Online Access: | https://hdl.handle.net/10356/138481 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-138481 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-138481; 2020-09-26T21:52:39Z; Pluralistic image completion; Zheng, Chuanxia; Cham, Tat-Jen; Cai, Jianfei; School of Computer Science and Engineering; 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); Institute for Media Innovation (IMI); Engineering::Computer science and engineering; Computer Vision; Image Reconstruction; Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion - the task of generating multiple and diverse plausible solutions for image completion. A major challenge faced by learning-based approaches is that there is usually only one ground truth training instance per label. As such, sampling from conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that uses the single given ground truth to obtain a prior distribution of the missing parts and rebuilds the original image from this distribution. The other is a generative path for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by GANs. We also introduce a new short+long term attention layer that exploits distant relations among decoder and encoder features, improving appearance consistency. When tested on datasets with buildings (Paris), faces (CelebA-HQ), and natural images (ImageNet), our method not only generated higher-quality completion results, but also produced multiple and diverse plausible outputs.; NRF (Natl Research Foundation, S’pore); Accepted version; 2020-05-06T10:08:29Z; 2020-05-06T10:08:29Z; 2019; Conference Paper; Zheng, C., Cham, T.-J., & Cai, J. (2019). Pluralistic image completion. 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 1438-1447. doi:10.1109/CVPR.2019.00153; 9781728132938; https://hdl.handle.net/10356/138481 ; 10.1109/CVPR.2019.00153; 2-s2.0-85078702220; 1438; 1447; en; © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The published version is available at: https://doi.org/10.1109/CVPR.2019.00153.; application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
country |
Singapore |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Computer Vision Image Reconstruction |
spellingShingle |
Engineering::Computer science and engineering Computer Vision Image Reconstruction Zheng, Chuanxia Cham, Tat-Jen Cai, Jianfei Pluralistic image completion |
description |
Most image completion methods produce only one result for each masked input, although there may be many reasonable possibilities. In this paper, we present an approach for pluralistic image completion - the task of generating multiple and diverse plausible solutions for image completion. A major challenge faced by learning-based approaches is that there is usually only one ground truth training instance per label. As such, sampling from conditional VAEs still leads to minimal diversity. To overcome this, we propose a novel and probabilistically principled framework with two parallel paths. One is a reconstructive path that uses the single given ground truth to obtain a prior distribution of the missing parts and rebuilds the original image from this distribution. The other is a generative path for which the conditional prior is coupled to the distribution obtained in the reconstructive path. Both are supported by GANs. We also introduce a new short+long term attention layer that exploits distant relations among decoder and encoder features, improving appearance consistency. When tested on datasets with buildings (Paris), faces (CelebA-HQ), and natural images (ImageNet), our method not only generated higher-quality completion results, but also produced multiple and diverse plausible outputs. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Zheng, Chuanxia Cham, Tat-Jen Cai, Jianfei |
format |
Conference or Workshop Item |
author |
Zheng, Chuanxia Cham, Tat-Jen Cai, Jianfei |
author_sort |
Zheng, Chuanxia |
title |
Pluralistic image completion |
title_sort |
pluralistic image completion |
publishDate |
2020 |
url |
https://hdl.handle.net/10356/138481 |
_version_ |
1681056433731272704 |
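
The description field above says that the conditional prior of the generative path is "coupled" to the distribution obtained in the reconstructive path, and that multiple outputs come from sampling. The record does not state how the coupling is computed, so the following is only a minimal sketch under an assumption: both distributions are treated as diagonal Gaussians and the coupling is expressed as a KL divergence (with the direction chosen arbitrarily here). The function names and the numeric values are hypothetical and are not taken from the paper or its code.

```python
import math
import random


def kl_diag_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over dimensions.

    Assumption for illustration: a term like this could pull the two
    paths' latent distributions toward each other.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def sample_latents(mu, logvar, n_samples, rng):
    """Draw n_samples latent codes z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [
        [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]
        for _ in range(n_samples)
    ]


# Hypothetical 2-D latent statistics for the two paths.
mu_rec, logvar_rec = [0.5, -0.2], [-1.0, -0.5]   # reconstructive path
mu_gen, logvar_gen = [0.0, 0.0], [0.0, 0.0]      # generative path (conditional prior)

coupling_loss = kl_diag_gaussians(mu_rec, logvar_rec, mu_gen, logvar_gen)

# Several latent codes from the conditional prior; in a full model each code
# would be decoded into a different candidate completion.
rng = random.Random(0)
latents = sample_latents(mu_gen, logvar_gen, 4, rng)
```

Each sampled latent code stands in for one candidate completion; decoding, the GAN losses, and the attention layer mentioned in the description are outside the scope of this sketch.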