Diffusion time-step curriculum for one image to 3D generation

Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.

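The abstract's central mechanism, distilling from the 2D teacher at large diffusion time-steps early in optimization (coarse geometry) and at small time-steps later (fine texture), can be made concrete with a short sketch. The sketch below is an illustrative assumption, not the authors' released DTC123 code: the function names (sample_timestep, sds_grad), the linear annealing window, and the (1 - alpha_t) weighting are hypothetical choices standing in for whatever schedule the paper actually uses.

import torch

def sample_timestep(step: int, total_steps: int,
                    t_max: int = 980, t_min: int = 20) -> int:
    # Coarse-to-fine curriculum: early optimization steps draw large
    # time-steps (heavy noise -> global shape), later steps draw small
    # ones (light noise -> local texture). The linear schedule and the
    # 200-step sliding window are illustrative, not from the paper.
    frac = step / max(total_steps - 1, 1)
    hi = int(t_max - frac * (t_max - t_min))
    lo = max(t_min, hi - 200)
    return int(torch.randint(lo, hi + 1, (1,)).item())

def sds_grad(teacher_eps, rendered_latent, t, alphas_cumprod,
             guidance_weight=1.0):
    # One SDS step: noise the student 3D model's rendering to level t,
    # let the frozen 2D teacher predict that noise, and return the
    # gradient that pulls the rendering toward the teacher's score.
    a_t = alphas_cumprod[t]
    noise = torch.randn_like(rendered_latent)
    noisy = a_t.sqrt() * rendered_latent + (1 - a_t).sqrt() * noise
    with torch.no_grad():
        eps_pred = teacher_eps(noisy, t)  # frozen teacher's noise estimate
    w_t = 1 - a_t                         # a common SDS weighting choice
    return guidance_weight * w_t * (eps_pred - noise)

In a training loop, each iteration would sample t = sample_timestep(step, total_steps), render the student model, and apply the result via rendered_latent.backward(gradient=sds_grad(teacher_eps, rendered_latent, t, alphas_cumprod)).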

Bibliographic Details
Main Authors: YI, Xuanyu, WU, Zike, XU, Qingshan, ZHOU, Pan, LIM, Joo Hwee, ZHANG, Hanwang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Subjects:
Online Access:https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
Institution: Singapore Management University
Language: English
id sg-smu-ink.sis_research-10023
record_format dspace
spelling sg-smu-ink.sis_research-10023 2024-07-25T08:07:06Z Diffusion time-step curriculum for one image to 3D generation
YI, Xuanyu; WU, Zike; XU, Qingshan; ZHOU, Pan; LIM, Joo Hwee; ZHANG, Hanwang
Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.
2024-06-01T07:00:00Z text application/pdf
https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Graphics and Human Computer Interfaces
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Graphics and Human Computer Interfaces
spellingShingle Graphics and Human Computer Interfaces
YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
Diffusion time-step curriculum for one image to 3D generation
description Score distillation sampling (SDS) has been widely adopted to overcome the absence of unseen views in reconstructing 3D objects from a single image. It leverages pretrained 2D diffusion models as teachers to guide the reconstruction of student 3D models. Despite their remarkable success, SDS-based methods often encounter geometric artifacts and texture saturation. We find that the crux is the overlooked, indiscriminate treatment of diffusion time-steps during optimization: it unreasonably treats student-teacher knowledge distillation as equal at all time-steps, and thus entangles coarse-grained and fine-grained modeling. Therefore, we propose the Diffusion Time-step Curriculum one-image-to-3D pipeline (DTC123), in which the teacher and student models collaborate with the time-step curriculum in a coarse-to-fine manner. Extensive experiments on the NeRF4, RealFusion15, GSO, and Level50 benchmarks demonstrate that DTC123 can produce multi-view consistent, high-quality, and diverse 3D assets. Code and more generation demos will be released at https://github.com/yxymessi/DTC123.
format text
author YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
author_facet YI, Xuanyu
WU, Zike
XU, Qingshan
ZHOU, Pan
LIM, Joo Hwee
ZHANG, Hanwang
author_sort YI, Xuanyu
title Diffusion time-step curriculum for one image to 3D generation
title_short Diffusion time-step curriculum for one image to 3D generation
title_full Diffusion time-step curriculum for one image to 3D generation
title_fullStr Diffusion time-step curriculum for one image to 3D generation
title_full_unstemmed Diffusion time-step curriculum for one image to 3D generation
title_sort diffusion time-step curriculum for one image to 3d generation
publisher Institutional Knowledge at Singapore Management University
publishDate 2024
url https://ink.library.smu.edu.sg/sis_research/9020
https://ink.library.smu.edu.sg/context/sis_research/article/10023/viewcontent/2024_CVPR_Image_3D.pdf
_version_ 1814047694410219520