3D reconstruction from single images
3D reconstruction from single images is a fundamental task in computer vision, and it has a wide range of applications, including anime films, robot object interaction, AR, VR and 3D games. Due to the task's complexity and significant information loss from 3D to 2D, traditional methods are inef...
Main Author: | Ping, Guiju |
---|---|
Other Authors: | Mao Kezhi |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University 2024 |
Subjects: | Computer and Information Science; 3D Reconstruction; Point Cloud |
Online Access: | https://hdl.handle.net/10356/174108 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-174108 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-174108 2024-04-09T03:58:58Z 3D reconstruction from single images Ping, Guiju Mao Kezhi School of Electrical and Electronic Engineering EKZMao@ntu.edu.sg Computer and Information Science 3D Reconstruction Point Cloud 3D reconstruction from single images is a fundamental task in computer vision, and it has a wide range of applications, including anime films, robot object interaction, AR, VR and 3D games. Due to the task's complexity and significant information loss from 3D to 2D, traditional methods are ineffective. The use of deep learning and large-scale datasets to learn prior knowledge is a promising direction and has achieved varying degrees of success. However, 3D deep learning requires a large number of annotated 3D objects. Producing these annotations is usually tedious and time-consuming. In view of this fact, this thesis presents a method to generate annotated 3D datasets automatically, and experiments demonstrate the effectiveness of the generated datasets for 3D deep learning tasks. However, the generated 3D datasets can only imitate objects with basic topology. For 3D reconstruction tasks, objects with higher precision are required to capture the widely varying structures of objects in daily life. This thesis provides three methods to enhance the quality of single-view 3D reconstruction using publicly accessible 3D datasets. - Most existing reconstruction methods focus too heavily on reconstruction metrics, such as Chamfer Distance (CD), and neglect the visual consistency between the reconstructed 3D objects and the objects in the given image. In our first framework, we enhance the visual quality of the reconstructed shapes by emphasising the consistency between the reconstructed 3D shape and the object's boundaries and corner points in the given image. - Earlier point cloud-based reconstruction techniques could only generate point clouds with preset resolutions.
Moreover, to obtain dense point clouds, previous research needed to employ multistage training. We propose PushNet, which can produce point clouds with arbitrary resolutions, including very dense resolutions, in an end-to-end manner and requires only sparse point clouds during training. - To improve the reconstruction quality in local areas, a two-stage reconstruction approach is proposed. We overcome the shortcomings of previous pixel-aligned reconstruction methods and produce reliable results without predicting camera parameters. Our method incorporates local information about each pixel in the first stage and focuses on global information in the second stage. Doctor of Philosophy 2024-03-18T01:51:23Z 2024-03-18T01:51:23Z 2022 Thesis-Doctor of Philosophy Ping, G. (2022). 3D reconstruction from single images. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/174108 https://hdl.handle.net/10356/174108 10.32657/10356/174108 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science 3D Reconstruction Point Cloud |
spellingShingle |
Computer and Information Science 3D Reconstruction Point Cloud Ping, Guiju 3D reconstruction from single images |
description |
3D reconstruction from single images is a fundamental task in computer vision, and it has a wide range of applications, including anime films, robot object interaction, AR, VR and 3D games. Due to the task's complexity and significant information loss from 3D to 2D, traditional methods are ineffective.
The use of deep learning and large-scale datasets to learn prior knowledge is a promising direction and has achieved varying degrees of success. However, 3D deep learning requires a large number of annotated 3D objects. Producing these annotations is usually tedious and time-consuming. In view of this fact, this thesis presents a method to generate annotated 3D datasets automatically, and experiments demonstrate the effectiveness of the generated datasets for 3D deep learning tasks.
However, the generated 3D datasets can only imitate objects with basic topology. For 3D reconstruction tasks, objects with higher precision are required to capture the widely varying structures of objects in daily life. This thesis provides three methods to enhance the quality of single-view 3D reconstruction using publicly accessible 3D datasets.
- Most existing reconstruction methods focus too heavily on reconstruction metrics, such as Chamfer Distance (CD), and neglect the visual consistency between the reconstructed 3D objects and the objects in the given image. In our first framework, we enhance the visual quality of the reconstructed shapes by emphasising the consistency between the reconstructed 3D shape and the object's boundaries and corner points in the given image.
- Earlier point cloud-based reconstruction techniques could only generate point clouds with preset resolutions. Moreover, to obtain dense point clouds, previous research needed to employ multistage training. We propose PushNet, which can produce point clouds with arbitrary resolutions, including very dense resolutions, in an end-to-end manner and requires only sparse point clouds during training.
- To improve the reconstruction quality in local areas, a two-stage reconstruction approach is proposed. We overcome the shortcomings of previous pixel-aligned reconstruction methods and produce reliable results without predicting camera parameters. Our method incorporates local information about each pixel in the first stage and focuses on global information in the second stage. |
author2 |
Mao Kezhi |
author_facet |
Mao Kezhi Ping, Guiju |
format |
Thesis-Doctor of Philosophy |
author |
Ping, Guiju |
author_sort |
Ping, Guiju |
title |
3D reconstruction from single images |
title_short |
3D reconstruction from single images |
title_full |
3D reconstruction from single images |
title_fullStr |
3D reconstruction from single images |
title_full_unstemmed |
3D reconstruction from single images |
title_sort |
3d reconstruction from single images |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/174108 |
_version_ |
1814047429533630464 |