Geometry estimation by deep neural network

Geometry estimation predicts geometric information in a vision coordinate system. With the rise of deep learning, data-driven, learning-based geometry estimation has received much attention in recent decades. Following the development of geometry estimation, traditional methods compute...

Bibliographic Details
Main Author:	Mei, Jianhan
Other Authors:	Jiang Xudong
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2022
Subjects:	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
Online Access:	https://hdl.handle.net/10356/155046
Institution:	Nanyang Technological University

id	sg-ntu-dr.10356-155046
record_format	dspace
spelling	sg-ntu-dr.10356-155046 2023-07-04T16:40:18Z Geometry estimation by deep neural network Mei, Jianhan Jiang Xudong School of Electrical and Electronic Engineering Rapid-Rich Object Search (ROSE) Lab EXDJiang@ntu.edu.sg Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Doctor of Philosophy 2022-02-27T23:41:54Z 2022-02-27T23:41:54Z 2022 Thesis-Doctor of Philosophy Mei, J. (2022). Geometry estimation by deep neural network. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/155046 10.32657/10356/155046 en This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0). application/pdf Nanyang Technological University
institution	Nanyang Technological University
building	NTU Library
continent	Asia
country	Singapore
content_provider	NTU Library
collection	DR-NTU
language	English
topic	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision
spellingShingle	Engineering::Computer science and engineering::Computing methodologies::Image processing and computer vision Mei, Jianhan Geometry estimation by deep neural network
description	Geometry estimation predicts geometric information in a vision coordinate system. With the rise of deep learning, data-driven, learning-based geometry estimation has received much attention in recent decades. Traditional methods compute geometry from object correspondences, whereas deep learning-based algorithms train a "black box" to predict the parameters directly. In this thesis, we aim to construct deep neural networks for 6-DoF (6D) pose estimation. We also explore learning local image features, which is one of the fundamental stages of geometry estimation. Finally, we build a deep learning-based 6D pose estimation system that incorporates traditional keypoint estimation modules. For learning local image region representations with deep neural networks, existing works mainly learn from matched corresponding image patches; the learned features are therefore too sensitive to individual patch matching results and cannot handle aggregation-based tasks such as image-level retrieval. We thus propose to use both the matched corresponding image patches and the clustering results as labels for network training. To resolve the inconsistency between the matched correspondences and the clustering results, we propose a semi-supervised iterative training scheme together with a dual-margins loss. Moreover, a jointly learned spatial transform prediction network is used to improve the spatial transform invariance of the learned local features. Using SIFT as the label initializer, experiments show performance comparable to or even better than the hand-crafted feature, which sheds light on learning local feature representations in an unsupervised or weakly supervised manner. For 6D object pose estimation, we focus on two challenges: rotation ambiguity and object occlusion. To cope with strong occlusion and background noise, we propose to exploit spatial structure. Observing that a 3D mesh can be naturally abstracted as a graph, we build the graph using 3D points as vertices and mesh connections as edges. We construct the mapping from 2D image features to 3D points to fill the graph and fuse the 2D and 3D features. A Graph Convolutional Network (GCN) is then applied to exchange features among the objects' points in 3D space. To address rotation symmetry ambiguity, a spherical convolution is used, and the spherical feature is combined with the convolutional feature that is mapped to the graph. Predefined 3D keypoints are voted for, and the 6-DoF pose is obtained via optimization-based fitting. Inference both with and without depth information is discussed. Experiments on the YCB-Video and LINEMOD datasets demonstrate the effectiveness of the proposed method.
author2	Jiang Xudong
author_facet	Jiang Xudong Mei, Jianhan
format	Thesis-Doctor of Philosophy
author	Mei, Jianhan
author_sort	Mei, Jianhan
title	Geometry estimation by deep neural network
title_sort	geometry estimation by deep neural network
publisher	Nanyang Technological University
publishDate	2022
url	https://hdl.handle.net/10356/155046
_version_	1772826393236733952
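
The description above mentions building a graph with 3D points as vertices and mesh connections as edges, then applying a GCN to exchange features among points. The following is a minimal sketch of that idea, not the thesis implementation: the toy tetrahedron, the random per-vertex features standing in for mapped 2D image features, the layer sizes, and the function names `mesh_to_adjacency` and `gcn_layer` are all assumptions for illustration, and the propagation rule is the common symmetric-normalized form, which may differ from what the thesis uses.

```python
import numpy as np


def mesh_to_adjacency(num_vertices, faces):
    """Build a symmetric 0/1 adjacency matrix from triangle faces.

    Each triangle (a, b, c) contributes the undirected edges a-b, b-c, c-a.
    """
    adj = np.zeros((num_vertices, num_vertices), dtype=np.float64)
    for a, b, c in faces:
        for i, j in ((a, b), (b, c), (c, a)):
            adj[i, j] = 1.0
            adj[j, i] = 1.0
    return adj


def gcn_layer(adj, features, weight):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # degrees, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ features @ weight, 0.0)


# Toy mesh (assumption): a tetrahedron with 4 vertices and 4 triangular faces.
vertices = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
adj = mesh_to_adjacency(len(vertices), faces)

# Hypothetical 2D image features already mapped to each 3D vertex.
rng = np.random.default_rng(0)
image_features = rng.normal(size=(len(vertices), 8))

# Fuse 3D coordinates with the mapped 2D features, then propagate once.
node_features = np.concatenate([vertices, image_features], axis=1)
weight = rng.normal(size=(node_features.shape[1], 16))
out = gcn_layer(adj, node_features, weight)
```

Here each row of `out` is a per-vertex feature that mixes information from the vertex and its mesh neighbours; stacking several such layers would let features travel further across the mesh.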

Similar Items