Towards high-quality 3D telepresence with commodity RGBD camera

Bibliographic Details
Main Author:	Zhao, Mengyao
Other Authors:	Cai Jianfei
Format:	Theses and Dissertations
Language:	English
Published:	2018
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	http://hdl.handle.net/10356/73161
Institution:	Nanyang Technological University

Description
Summary:	3D telepresence aims to give remote participants the perception of being present in the same physical space, which no 2D teleconferencing system can achieve. Successful 3D telepresence would greatly enhance communication and user experience, and could stimulate many applications, including teleconferencing, telesurgery, and remote education. Despite years of study, 3D telepresence research still faces many challenges: high system cost, high computational requirements that make real-time performance hard to achieve on consumer-level hardware, the cost of obtaining depth data, the difficulty of extracting 3D people in real time with high quality, and the difficulty of 3D scene replacement and composition. The emergence of consumer-grade range cameras such as Microsoft Kinect, which provide convenient, low-cost acquisition of 3D depth in real time, has accelerated many multimedia applications. In this thesis, we make several attempts at improving the quality of 3D telepresence with a commodity RGBD camera. First, considering that the raw depth data from commodity depth cameras is highly noisy and error-prone, we carefully study the error patterns of Kinect and propose a multi-scale direction-aware filtering method to combat Kinect noise. We have also implemented the proposed method in CUDA to achieve real-time performance. Experimental results show that our method outperforms the popular bilateral filter. Second, we consider the problem of extracting a dynamic foreground person from RGBD video in real time, a common task in 3D telepresence. Existing methods struggle to ensure real-time performance, high quality, and temporal coherence at the same time. We propose a foreground extraction framework that integrates several existing techniques, including background subtraction, depth hole filling, and 3D matting. We also take advantage of various CUDA strategies and spatial data structures to improve speed. Experimental results show that, compared with state-of-the-art methods, our proposed method extracts stable foreground objects with higher visual quality and better temporal coherence while still achieving real-time performance. Third, we consider another challenging problem in 3D telepresence: given an RGBD video, we want to replace the local 3D background scene with a target 3D scene. This raises many issues, such as the mismatch between the local scene and the target scene, the range of motion in different scenes, and the collision problem. We propose a novel scene replacement system that consists of multiple processing stages, including foreground extraction, scene adjustment, scene analysis, scene suggestion, scene matching, and scene rendering. We also develop our system entirely on the GPU by parallelizing most of the computation with CUDA, which lets us achieve both good visual quality in scene replacement and real-time performance.
