Towards high-quality 3D telepresence with commodity RGBD camera

3D telepresence aims to give remote participants the perception of being present in the same physical space, which no 2D teleconferencing system can achieve. Successful 3D telepresence would greatly enhance communication and the user experience, and could stimulate many applications, including teleconferencing, telesurgery, and remote education. Despite years of study, 3D telepresence research still faces many challenges: system cost is high; real-time performance is hard to achieve on consumer-level hardware because of the heavy computation involved; depth data is costly to obtain; extracting 3D people in real time at high quality is difficult; and 3D scene replacement and composition are difficult as well. The emergence of consumer-grade range cameras such as the Microsoft Kinect, which provide convenient, low-cost acquisition of 3D depth in real time, has accelerated many multimedia applications. In this thesis, we make several attempts to improve the quality of 3D telepresence with a commodity RGBD camera.

First, since the raw depth data from commodity depth cameras is highly noisy and error-prone, we carefully study the error patterns of the Kinect and propose a multi-scale direction-aware filtering method to combat Kinect noise. We have also implemented the proposed method in CUDA to achieve real-time performance. Experimental results show that our method outperforms the popular bilateral filter.

Second, we consider the problem of extracting a dynamic foreground person from RGB-D video in real time, a common task in 3D telepresence. Existing methods struggle to ensure real-time performance, high quality, and temporal coherence all at once. We propose a foreground extraction framework that cleanly integrates several existing techniques, including background subtraction, depth hole filling, and 3D matting. We also exploit various CUDA strategies and spatial data structures to improve speed. Experimental results show that, compared with state-of-the-art methods, our method extracts stable foreground objects with higher visual quality and better temporal coherence while still running in real time.

Third, we address another challenging problem in 3D telepresence: given an RGBD video, replacing the local 3D background scene with a target 3D scene. This raises many issues, such as the mismatch between the local and target scenes, the differing ranges of motion in the two scenes, and collisions. We propose a novel scene replacement system consisting of multiple processing stages: foreground extraction, scene adjustment, scene analysis, scene suggestion, scene matching, and scene rendering. We develop the system entirely on the GPU, parallelizing most of the computation with CUDA strategies, achieving not only visually convincing scene replacement but also real-time performance.
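To make the first contribution concrete: the record does not reproduce the thesis source, and the exact multi-scale direction-aware weighting is thesis-specific, but a minimal single-scale, bilateral-style CUDA depth filter of the kind the thesis improves upon might look as follows. All names here (filterDepth, the radius and sigma constants) are illustrative assumptions, not the thesis code.

    // Minimal single-scale bilateral-style depth filter (illustrative
    // sketch only; the thesis method is multi-scale and direction-aware).
    #include <cuda_runtime.h>

    #define RADIUS  3        // half-width of the filter window (assumed)
    #define SIGMA_S 2.0f     // spatial falloff, in pixels (assumed)
    #define SIGMA_R 30.0f    // range falloff, in depth units (assumed)

    __global__ void filterDepth(const unsigned short* in, unsigned short* out,
                                int width, int height)
    {
        int x = blockIdx.x * blockDim.x + threadIdx.x;
        int y = blockIdx.y * blockDim.y + threadIdx.y;
        if (x >= width || y >= height) return;

        unsigned short center = in[y * width + x];
        if (center == 0) { out[y * width + x] = 0; return; }  // invalid depth

        float sum = 0.0f, wsum = 0.0f;
        for (int dy = -RADIUS; dy <= RADIUS; ++dy) {
            for (int dx = -RADIUS; dx <= RADIUS; ++dx) {
                int nx = x + dx, ny = y + dy;
                if (nx < 0 || nx >= width || ny < 0 || ny >= height) continue;
                unsigned short d = in[ny * width + nx];
                if (d == 0) continue;                  // skip depth holes
                float ds2 = float(dx * dx + dy * dy);  // spatial distance^2
                float dr  = float(d) - float(center);  // depth difference
                float w = expf(-ds2 / (2.0f * SIGMA_S * SIGMA_S)
                               - dr * dr / (2.0f * SIGMA_R * SIGMA_R));
                sum  += w * d;
                wsum += w;
            }
        }
        out[y * width + x] = (unsigned short)(sum / wsum + 0.5f);
    }

A 2D launch such as dim3 block(16, 16) with a grid covering the frame runs one thread per pixel; a multi-scale variant would presumably repeat this at several window radii and combine the results.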

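For the second contribution, the first stage of such an extraction pipeline, depth-based background subtraction against a pre-captured static background, could be sketched as below; pixels with no depth reading are labeled UNKNOWN for the later hole-filling and matting stages. The kernel, labels, and threshold are assumptions for illustration, not the thesis implementation.

    // Depth-based background subtraction: label each pixel against a
    // static background depth map captured once at startup (sketch only).
    #include <cuda_runtime.h>

    enum Label : unsigned char { BG = 0, UNKNOWN = 128, FG = 255 };

    __global__ void segmentDepth(const unsigned short* depth,      // current frame
                                 const unsigned short* background, // static capture
                                 unsigned char* mask, int n,
                                 unsigned short threshold)         // e.g. 100 mm
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;

        unsigned short d = depth[i];
        if (d == 0)
            mask[i] = UNKNOWN;          // depth hole: resolve later by
                                        // hole filling / 3D matting
        else if (background[i] != 0 && d + threshold < background[i])
            mask[i] = FG;               // clearly in front of the background
        else
            mask[i] = BG;
    }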
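Finally, the third contribution is described as a fixed sequence of GPU stages. A minimal host-side skeleton of that per-frame dataflow is sketched below; the stage names follow the abstract, but all types and bodies are hypothetical placeholders rather than the thesis implementation.

    // Host-side skeleton of the per-frame scene-replacement pipeline.
    #include <vector>

    struct Frame { /* device-resident RGB-D buffers for one frame */ };
    struct Scene { /* device-resident target 3D scene */ };

    // One pipeline stage; run() is expected to launch CUDA kernels.
    struct Stage {
        virtual void run(Frame&, const Scene&) = 0;
        virtual ~Stage() {}
    };

    // Stage list mirrors the abstract: extraction, adjustment, analysis,
    // suggestion, matching, rendering (bodies omitted).
    struct ForegroundExtraction : Stage { void run(Frame&, const Scene&) override {} };
    struct SceneAdjustment      : Stage { void run(Frame&, const Scene&) override {} };
    struct SceneAnalysis        : Stage { void run(Frame&, const Scene&) override {} };
    struct SceneSuggestion      : Stage { void run(Frame&, const Scene&) override {} };
    struct SceneMatching        : Stage { void run(Frame&, const Scene&) override {} };
    struct SceneRendering       : Stage { void run(Frame&, const Scene&) override {} };

    // Every frame flows through all stages in order on the GPU.
    void processFrame(Frame& f, const Scene& target,
                      const std::vector<Stage*>& stages)
    {
        for (Stage* s : stages) s->run(f, target);
    }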

Bibliographic Details
Main Author: Zhao, Mengyao
Other Authors: Cai, Jianfei
Format: Theses and Dissertations
Language: English
Published: 2018
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/73161
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-73161
Record format: dspace
Last modified: 2023-03-04
School: School of Computer Science and Engineering
Degree: Doctor of Philosophy (SCE)
Thesis type: Doctoral thesis
Deposited: 2018-01-08
Citation: Zhao, M. (2018). Towards high-quality 3D telepresence with commodity RGBD camera. Doctoral thesis, Nanyang Technological University, Singapore. http://hdl.handle.net/10356/73161
DOI: 10.32657/10356/73161
Extent: 112 p.
Media type: application/pdf
Building: NTU Library
Continent: Asia
Country: Singapore
Content provider: NTU Library
Collection: DR-NTU