T2Net: synthetic-to-realistic translation for solving single-image depth estimation tasks
Current methods for single-image depth estimation use training datasets with real image-depth pairs or stereo pairs, which are not easy to acquire. We propose a framework, trained on synthetic image-depth pairs and unpaired real images, that comprises an image translation network for enhancing realism of input images, followed by a depth prediction network. A key idea is having the first network act as a wide-spectrum input translator, taking in either synthetic or real images, and ideally producing minimally modified realistic images. This is done via a reconstruction loss when the training input is real, and GAN loss when synthetic, removing the need for heuristic self-regularization. The second network is trained on a task loss for synthetic image-depth pairs, with extra GAN loss to unify real and synthetic feature distributions. Importantly, the framework can be trained end-to-end, leading to good results, even surpassing early deep-learning methods that use real paired data.
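
The training objective sketched in this abstract can be written out explicitly. The following LaTeX fragment is a hedged reconstruction from the abstract alone: the symbols (G for the translation network, f for the depth prediction network, D and D_f for discriminators) and the lambda weights are illustrative assumptions, not the paper's own notation.

% Hedged sketch of the objective the abstract describes (expectations over the
% data are omitted for brevity; all symbols and weights are assumptions).
% G : translation network          D, D_f : discriminators
% f : depth prediction network     \phi : intermediate features of f
% x_s : synthetic image with ground-truth depth y_s;  x_r : unpaired real image
\[
\begin{aligned}
\mathcal{L}_{\mathrm{rec}}  &= \lVert G(x_r) - x_r \rVert_1
  && \text{real input: reconstruct, i.e. modify minimally} \\
\mathcal{L}_{\mathrm{GAN}}  &= \log D(x_r) + \log\bigl(1 - D(G(x_s))\bigr)
  && \text{synthetic input: push translations toward realism} \\
\mathcal{L}_{\mathrm{task}} &= \lVert f(G(x_s)) - y_s \rVert_1
  && \text{depth supervision on synthetic image-depth pairs} \\
\mathcal{L}_{\mathrm{feat}} &= \log D_{\mathrm{f}}\bigl(\phi(G(x_r))\bigr)
  + \log\bigl(1 - D_{\mathrm{f}}(\phi(G(x_s)))\bigr)
  && \text{unify real and synthetic feature distributions} \\
\mathcal{L} &= \mathcal{L}_{\mathrm{task}}
  + \lambda_1 \mathcal{L}_{\mathrm{GAN}}
  + \lambda_2 \mathcal{L}_{\mathrm{rec}}
  + \lambda_3 \mathcal{L}_{\mathrm{feat}}
\end{aligned}
\]

Because the whole pipeline is trained end-to-end, gradients from the task loss also shape the translator G; per the abstract, it is the reconstruction loss on real inputs that removes the need for heuristic self-regularization.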
Main Authors: Zheng, Chuanxia; Cham, Tat-Jen; Cai, Jianfei
Other Authors: School of Computer Science and Engineering; Institute for Media Innovation (IMI)
Format: Conference or Workshop Item
Language: English
Published: 2020
Subjects: Engineering::Computer science and engineering; Single-image Depth Estimation; Unpaired Images
Online Access: https://hdl.handle.net/10356/138497
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-138497
Repository: DR-NTU, NTU Library, Nanyang Technological University, Singapore
Conference: European Conference on Computer Vision (ECCV)
Type: Conference Paper (accepted version, application/pdf)
Citation: Zheng, C., Cham, T.-J., & Cai, J. (2018). T2Net: synthetic-to-realistic translation for solving single-image depth estimation tasks. Computer Vision – ECCV 2018, 798-814. doi:10.1007/978-3-030-01234-2_47
ISBN: 9783030012335
DOI: 10.1007/978-3-030-01234-2_47
Scopus ID: 2-s2.0-85055108906
Pages: 798-814
Funding: NRF (National Research Foundation, Singapore)
Deposited: 2020-05-06
Rights: © 2018 Springer Nature Switzerland AG. All rights reserved. This paper was published in Computer Vision – ECCV 2018 and is made available with permission of Springer Nature Switzerland AG.