SceneDreamer: unbounded 3D scene generation from 2D image collections
In this work, we present SceneDreamer, an unconditional generative model for unbounded 3D scenes, which synthesizes large-scale 3D landscapes from random noise. Our framework is learned from in-the-wild 2D image collections only, without any 3D annotations. At the core of SceneDreamer is a principled learning paradigm comprising: 1) an efficient yet expressive 3D scene representation, 2) a generative scene parameterization, and 3) an effective renderer that can leverage the knowledge from 2D images. Our approach begins with an efficient bird's-eye-view (BEV) representation generated from simplex noise, which includes a height field for surface elevation and a semantic field for detailed scene semantics. This BEV scene representation enables: 1) representing a 3D scene with quadratic complexity, 2) disentangled geometry and semantics, and 3) efficient training. Moreover, we propose a novel generative neural hash grid to parameterize the latent space based on 3D positions and scene semantics, aiming to encode generalizable features across various scenes. Lastly, a neural volumetric renderer, learned from 2D image collections through adversarial training, is employed to produce photorealistic images. Extensive experiments demonstrate the effectiveness of SceneDreamer and its superiority over state-of-the-art methods in generating vivid yet diverse unbounded 3D worlds.
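As a rough illustration of the first component, the minimal sketch below builds a BEV scene from noise: a height field plus a per-cell semantic field, so an entire 3D scene is stored as two n × n maps (quadratic rather than cubic in the resolution). This is not the authors' code: the fractal value noise stands in for the simplex noise used in the paper, and the class count and quantile thresholds are illustrative assumptions.

```python
import numpy as np
from scipy.ndimage import zoom

def fractal_noise(size=256, octaves=4, seed=0):
    """Cheap numpy stand-in for simplex noise: a sum of bilinearly
    upsampled random grids at doubling resolutions."""
    rng = np.random.default_rng(seed)
    field = np.zeros((size, size))
    amp = 1.0
    for o in range(octaves):
        res = 4 * 2 ** o                          # 4, 8, 16, 32 control points
        coarse = rng.standard_normal((res, res))
        field += amp * zoom(coarse, size / res, order=1)
        amp *= 0.5                                # halve amplitude per octave
    return field

def bev_scene(size=256, num_classes=6, seed=0):
    """Height field (surface elevation) + semantic field (per-cell label).
    Two 2D maps describe the whole 3D landscape: O(n^2), not O(n^3)."""
    height = fractal_noise(size, seed=seed)
    # A second, independent noise channel is thresholded into semantic
    # labels (e.g. water / sand / grass / forest / rock / snow).
    sem = fractal_noise(size, seed=seed + 1)
    bins = np.quantile(sem, np.linspace(0, 1, num_classes + 1)[1:-1])
    semantics = np.digitize(sem, bins)
    return height, semantics

height, semantics = bev_scene()
print(height.shape, semantics.shape)  # (256, 256) (256, 256)
```

Because geometry lives only in the height field and appearance categories only in the semantic field, the two can be edited or resampled independently, which is the disentanglement the abstract refers to.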
Main Authors: Chen, Zhaoxi; Wang, Guangcong; Liu, Ziwei
Other Authors: School of Computer Science and Engineering; S-Lab
Format: Article
Language: English
Published: 2024
Subjects: Computer and Information Science; Neural Rendering; Unbounded Scene Generation
Online Access: https://hdl.handle.net/10356/173443
Institution: Nanyang Technological University
Record details

id: sg-ntu-dr.10356-173443
record_format: dspace
collection: DR-NTU (NTU Library, Nanyang Technological University, Singapore)
authors: Chen, Zhaoxi; Wang, Guangcong; Liu, Ziwei (School of Computer Science and Engineering; S-Lab)
subjects: Computer and Information Science; Neural Rendering; Unbounded Scene Generation
citation: Chen, Z., Wang, G. & Liu, Z. (2023). SceneDreamer: unbounded 3D scene generation from 2D image collections. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(12), 15562-15576. https://dx.doi.org/10.1109/TPAMI.2023.3321857
journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
year of publication: 2023
ISSN: 0162-8828
DOI: 10.1109/TPAMI.2023.3321857
PMID: 37788193
Scopus: 2-s2.0-85174809772
handle: https://hdl.handle.net/10356/173443
date deposited: 2024-02-05
last indexed: 2024-02-06
funding: Ministry of Education (MOE); Nanyang Technological University; National Research Foundation (NRF). This work was supported in part by the National Research Foundation, Singapore under its AI Singapore Programme (AISG) under Grant AISG2-PhD-2021-08-019, in part by NTU NAP, in part by MOE AcRF Tier 2 under Grant T2EP20221-0012, and in part by the RIE2020 Industry Alignment Fund - Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contributions from the industry partner(s).
grants: AISG2-PhD-2021-08-019; NTU NAP; T2EP20221-0012
rights: © 2023 IEEE. All rights reserved.
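The abstract's second component, the generative neural hash grid, indexes learned features by 3D position and scene semantics. The sketch below is a hedged, single-level illustration in the spirit of Instant-NGP-style spatial hashing, not the authors' implementation: the paper generates the table parameters from a scene latent, whereas here a fixed random table stands in, and the semantic mixing constant is an arbitrary illustrative choice.

```python
import numpy as np

# Per-axis spatial-hashing primes as used in Instant-NGP-style hash grids.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_lookup(table, xyz, semantic_id, resolution=64):
    """Single-level lookup: (3D position, semantic label) -> feature vector."""
    cell = np.floor(np.asarray(xyz) * resolution).astype(np.uint64)  # voxel coords
    h = cell * PRIMES                           # hash each axis separately
    # XOR-fold the axes and the semantic label into one table index;
    # 19349663 is an illustrative mixing constant, not from the paper.
    idx = h[0] ^ h[1] ^ h[2] ^ np.uint64(semantic_id * 19349663)
    return table[idx % np.uint64(len(table))]

# Fixed random feature table standing in for the generated hash grid.
table = np.random.default_rng(0).standard_normal((2**16, 8)).astype(np.float32)
feat = hash_lookup(table, [0.3, 0.7, 0.1], semantic_id=2)
print(feat.shape)  # (8,) -- a feature the volumetric renderer could decode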