Diffusion time-step curriculum for one image to 3D generation

Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.

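The abstract's central mechanism, distilling from the 2D teacher at large diffusion time-steps early in optimization (coarse geometry) and at small time-steps later (fine texture), can be made concrete with a short sketch. The sketch below is an illustrative assumption, not the authors' released DTC123 code: the function names (sample_timestep, sds_grad), the linear annealing window, and the (1 - alpha_t) weighting are hypothetical choices standing in for whatever schedule the paper actually uses.

import torch

def sample_timestep(step: int, total_steps: int,
                    t_max: int = 980, t_min: int = 20) -> int:
    # Coarse-to-fine curriculum: early optimization steps draw large
    # time-steps (heavy noise -> global shape), later steps draw small
    # ones (light noise -> local texture). The linear schedule and the
    # 200-step sliding window are illustrative, not from the paper.
    frac = step / max(total_steps - 1, 1)
    hi = int(t_max - frac * (t_max - t_min))
    lo = max(t_min, hi - 200)
    return int(torch.randint(lo, hi + 1, (1,)).item())

def sds_grad(teacher_eps, rendered_latent, t, alphas_cumprod,
             guidance_weight=1.0):
    # One SDS step: noise the student 3D model's rendering to level t,
    # let the frozen 2D teacher predict that noise, and return the
    # gradient that pulls the rendering toward the teacher's score.
    a_t = alphas_cumprod[t]
    noise = torch.randn_like(rendered_latent)
    noisy = a_t.sqrt() * rendered_latent + (1 - a_t).sqrt() * noise
    with torch.no_grad():
        eps_pred = teacher_eps(noisy, t)  # frozen teacher's noise estimate
    w_t = 1 - a_t                         # a common SDS weighting choice
    return guidance_weight * w_t * (eps_pred - noise)

In a training loop, each iteration would sample t = sample_timestep(step, total_steps), render the student model, and apply the result via rendered_latent.backward(gradient=sds_grad(teacher_eps, rendered_latent, t, alphas_cumprod)).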

Bibliographic Details
Main Authors: YI, Xuanyu, WU, Zike, XU, Qingshan, ZHOU, Pan, LIM, Joo Hwee, ZHANG, Hanwang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-10023
record_format dspace
spelling sg-smu-ink.sis_research-10023 2024-07-25T08:07:06Z Diffusion time-step curriculum for one image to 3D generation
YI, Xuanyu; WU, Zike; XU, Qingshan; ZHOU, Pan; LIM, Joo Hwee; ZHANG, Hanwang
Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.
2024-06-01T07:00:00Z text application/pdf
https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Graphics and Human Computer Interfaces
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Graphics and Human Computer Interfaces
spellingShingle Graphics and Human Computer Interfaces
YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
Diffusion time-step curriculum for one image to 3D generation
description Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.
format text
author YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
author_facet YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
author_sort YI, Xuanyu
title Diffusion time-step curriculum for one image to 3D generation
title_short Diffusion time-step curriculum for one image to 3D generation
title_full Diffusion time-step curriculum for one image to 3D generation
title_fullStr Diffusion time-step curriculum for one image to 3D generation
title_full_unstemmed Diffusion time-step curriculum for one image to 3D generation
title_sort diffusion time-step curriculum for one image to 3d generation
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
_version_ 1814047694410219520