Advancing 3D scene understanding through discriminative and generative learning approaches

This thesis explores the crucial role of Computer Vision in endowing computers with general intelligence, focusing on developing algorithms that enable machines to perceive and understand their three-dimensional surroundings. The research is divided into two parts: discriminative and generative lear...

Bibliographic Details
Main Author:	Tang, Zhe Jun
Other Authors:	Cham Tat Jen
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2025
Subjects:	Computer and Information Science; Computer vision
Online Access:	https://hdl.handle.net/10356/182916
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-182916
record_format	dspace
spelling	sg-ntu-dr.10356-182916 2025-03-10T02:00:08Z Advancing 3D scene understanding through discriminative and generative learning approaches Tang, Zhe Jun Cham Tat Jen College of Computing and Data Science ASTJCham@ntu.edu.sg Computer and Information Science Computer vision This thesis explores the crucial role of Computer Vision in endowing computers with general intelligence, focusing on developing algorithms that enable machines to perceive and understand their three-dimensional surroundings. The research is divided into two parts: discriminative and generative learning approaches, with three core chapters formulating 3D scene understanding. From a discriminative learning perspective, a novel approach to point cloud segmentation is devised, which is crucial for road scene perception. The proposed method processes point clouds as a whole while retaining local information, achieving high accuracy in segmenting objects from scenes despite the computational challenges of processing large input data. The generative learning approach focuses on generating entire 3D scenes from 2D images. Prior art methods in rendering 3D scenes via volumetric rendering are studied, and an end-to-end learning approach with transformers is proposed as an alternative to physics-based approaches. Novel methods to capture lighting information of scenes, inspired by modern game engines, are devised to improve rendering quality. Further investigation into new rendering methods with rasterisation of 3D Gaussian spheres is conducted, along with a different method for capturing lighting information to enhance rendering quality. The research contributes to the overarching goal of helping computers perceive and interact with the 3D world, offering numerous advantages for downstream applications such as autonomous vehicles, augmented reality, and virtual collaboration. Doctor of Philosophy 2025-03-10T02:00:08Z 2025-03-10T02:00:08Z 2025 Thesis-Doctor of Philosophy Tang, Z. J. (2025). Advancing 3D scene understanding through discriminative and generative learning approaches. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182916 https://hdl.handle.net/10356/182916 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Computer and Information Science; Computer vision
spellingShingle	Computer and Information Science Computer vision Tang, Zhe Jun Advancing 3D scene understanding through discriminative and generative learning approaches
description	This thesis explores the crucial role of Computer Vision in endowing computers with general intelligence, focusing on developing algorithms that enable machines to perceive and understand their three-dimensional surroundings. The research is divided into two parts: discriminative and generative learning approaches, with three core chapters formulating 3D scene understanding. From a discriminative learning perspective, a novel approach to point cloud segmentation is devised, which is crucial for road scene perception. The proposed method processes point clouds as a whole while retaining local information, achieving high accuracy in segmenting objects from scenes despite the computational challenges of processing large input data. The generative learning approach focuses on generating entire 3D scenes from 2D images. Prior art methods in rendering 3D scenes via volumetric rendering are studied, and an end-to-end learning approach with transformers is proposed as an alternative to physics-based approaches. Novel methods to capture lighting information of scenes, inspired by modern game engines, are devised to improve rendering quality. Further investigation into new rendering methods with rasterisation of 3D Gaussian spheres is conducted, along with a different method for capturing lighting information to enhance rendering quality. The research contributes to the overarching goal of helping computers perceive and interact with the 3D world, offering numerous advantages for downstream applications such as autonomous vehicles, augmented reality, and virtual collaboration.
author2	Cham Tat Jen
author_facet	Cham Tat Jen; Tang, Zhe Jun
format	Thesis-Doctor of Philosophy
author	Tang, Zhe Jun
author_sort	Tang, Zhe Jun
title	Advancing 3D scene understanding through discriminative and generative learning approaches
title_short	Advancing 3D scene understanding through discriminative and generative learning approaches
title_full	Advancing 3D scene understanding through discriminative and generative learning approaches
title_fullStr	Advancing 3D scene understanding through discriminative and generative learning approaches
title_full_unstemmed	Advancing 3D scene understanding through discriminative and generative learning approaches
title_sort	advancing 3d scene understanding through discriminative and generative learning approaches
publisher	Nanyang Technological University
publishDate	2025
url	https://hdl.handle.net/10356/182916
_version_	1826362243096248320

