Diffusion time-step curriculum for one image to 3D generation
Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as a teacher to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123
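The coarse-to-fine idea in the abstract can be illustrated with a small sketch: early in 3D optimization, large (noisy) diffusion time-steps are sampled so the teacher supervises coarse geometry, and the sampled range is gradually annealed toward small time-steps for fine texture. The linear annealing schedule, the function names, and the bounds `t_min`/`t_max` below are illustrative assumptions, not the authors' implementation.

```python
import random

def timestep_bounds(step, total_steps, t_min=20, t_max=980):
    """Illustrative coarse-to-fine curriculum (assumed schedule, not DTC123's):
    the upper bound of the sampled diffusion time-step shrinks linearly from
    t_max (coarse structure) toward t_min (fine detail) over optimization."""
    frac = step / max(total_steps - 1, 1)
    hi = int(round(t_max - frac * (t_max - t_min)))
    return t_min, max(hi, t_min)

def sample_timestep(step, total_steps, rng=None):
    """Draw a diffusion time-step from the current curriculum range."""
    rng = rng or random.Random(0)
    lo, hi = timestep_bounds(step, total_steps)
    return rng.randint(lo, hi)
```

Under this schedule, an SDS update at optimization step 0 would distill through very noisy time-steps (coarse shape), while the final steps distill through nearly clean ones (texture detail), which is the separation of coarse- and fine-grained modeling the abstract argues for.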
Main Authors: YI, Xuanyu; WU, Zike; XU, Qingshan; ZHOU, Pan; LIM, Joo Hwee; ZHANG, Hanwang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024
Subjects: Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
Institution: Singapore Management University
id |
sg-smu-ink.sis_research-10023 |
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-100232024-07-25T08:07:06Z Diffusion time-step curriculum for one image to 3D generation YI, Xuanyu WU, Zike XU, Qingshan ZHOU, Pan LIM, Joo Hwee ZHANG, Hanwang Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as a teacher to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123 2024-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9020 https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Graphics and Human Computer Interfaces |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Graphics and Human Computer Interfaces |
spellingShingle |
Graphics and Human Computer Interfaces YI, Xuanyu WU, Zike XU, Qingshan ZHOU, Pan LIM, Joo Hwee ZHANG, Hanwang Diffusion time-step curriculum for one image to 3D generation |
description |
Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as a teacher to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123 |
format |
text |
author |
YI, Xuanyu WU, Zike XU, Qingshan ZHOU, Pan LIM, Joo Hwee ZHANG, Hanwang |
author_facet |
YI, Xuanyu WU, Zike XU, Qingshan ZHOU, Pan LIM, Joo Hwee ZHANG, Hanwang |
author_sort |
YI, Xuanyu |
title |
Diffusion time-step curriculum for one image to 3D generation |
title_short |
Diffusion time-step curriculum for one image to 3D generation |
title_full |
Diffusion time-step curriculum for one image to 3D generation |
title_fullStr |
Diffusion time-step curriculum for one image to 3D generation |
title_full_unstemmed |
Diffusion time-step curriculum for one image to 3D generation |
title_sort |
diffusion time-step curriculum for one image to 3d generation |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2024 |
url |
https://ink.library.smu.edu.sg/sis_research/9020 https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf |
_version_ |
1814047694410219520 |