Model-based markerless human motion capture from multiple camera sequences

The tracking of 3D articulated body motion from video sequences, or markerless motion capture, plays an important role in a wide variety of potential applications: human-computer interaction, biomechanics, computer animation, surveillance and sports analysis. Though there have been remarkable advance...

Bibliographic Details
Main Author:	Zhang, Zheng.
Other Authors:	Seah Hock Soon
Format:	Theses and Dissertations
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	http://hdl.handle.net/10356/52724
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-52724
record_format	dspace
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
spellingShingle	DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Zhang, Zheng. Model-based markerless human motion capture from multiple camera sequences
description	The tracking of 3D articulated body motion from video sequences, or markerless motion capture, plays an important role in a wide variety of potential applications: human-computer interaction, biomechanics, computer animation, surveillance and sports analysis. Though there have been remarkable advances in vision-based motion capture, pose tracking from multiple images has not been extensively studied: no existing work produces a solution comparable to existing marker-based motion capture methods, which can generally recover accurate 3D full-body motion in real time. In this thesis, we develop new methods for human body motion tracking, focusing on scenarios where multiple cameras are assumed to be available. Our research follows a 3D data-based tracking framework, in which 3D data, e.g., colored volume and scene flow (i.e., 3D optical flow), is first reconstructed and the optimal human posture is then recovered from the 3D data at every instant in time. A multiple-camera system with eight cameras is first assembled to capture synchronized multi-view video streams. We present and implement efficient methods for synthesizing and rendering 3D reconstructions of real-world dynamic scenes. Model-based pose estimation is the mainstream research direction, as it takes into account the underlying structure and exploits prior shape information, which helps to resolve occlusions and ambiguities. Our methods belong to this category. For a complete model-based pose tracking approach, body model initialization is one key problem. At initialization, the body model must be adapted to fit the shape and size of the subject to be tracked, and must be initialized with the pose at the first frame, where no temporal or strong prior information is available.
For this, we present a robust solution in which pose estimation is performed hierarchically, with space constraints enforced on each PSO (particle swarm optimization) based sub-optimization step. The combination of hierarchical estimation and stochastic particle-based search, which has strong global search ability, makes our approach capable of recovering the body pose even when the initial pose is very far from the correct solution. To improve the accuracy and robustness of estimation and tracking, we present a method to acquire a subject-specific body model that closely fits the subject's body shape, and we exploit it for pose tracking. With the voxel-based subject-specific body model, a new model-based pose search method is proposed. The tracking is performed in 3D space using 3D data, including colored volume and 3D scene flow, reconstructed at every frame. We introduce strategies to compute view-independent scene flow in combination with volumetric reconstruction, and have attained efficient scene flow computation. Our body pose estimation starts with a prediction using scene flow and is then reformulated as a lower-dimensional global optimization problem. Our method exploits multiple 3D cues and incorporates physical constraints into a stochastic particle-based search initialized from the deterministic prediction and stochastic sampling. Continuing with the voxel-based body model, we propose a multi-layer search method. The first layer, the niching swarm filter (NSF), is a stochastic sampling algorithm, and the second layer performs pose refinement using local optimization. In order to generalize well to general human motions, our approach does not use strong or specific motion models. We introduce a stochastic niching search into a particle filter to move particles to significant peaks of the likelihood.
The local optimization of the second layer not only reduces the time cost, but also increases the accuracy of the sampling estimation, which is required for NSF to attain higher precision. The requirement of real-time processing motivates us to accelerate the tracking by implementing the time-consuming steps on the GPU using CUDA. Benefiting from the massive parallelism of the GPU, our method is capable of tracking full-body movements robustly and efficiently.
author2	Seah Hock Soon
author_facet	Seah Hock Soon Zhang, Zheng.
format	Theses and Dissertations
author	Zhang, Zheng.
author_sort	Zhang, Zheng.
title	Model-based markerless human motion capture from multiple camera sequences
publishDate	2013
url	http://hdl.handle.net/10356/52724
_version_	1759854908809412608
spelling	sg-ntu-dr.10356-52724 2023-03-04T00:35:34Z Model-based markerless human motion capture from multiple camera sequences Zhang, Zheng. Seah Hock Soon School of Computer Engineering Game Lab DRNTU::Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Doctor of Philosophy (SCE) 2013-05-23T03:41:34Z 2013-05-23T03:41:34Z 2013 2013 Thesis http://hdl.handle.net/10356/52724 en 194 p. application/pdf

Similar Items