Advanced topics in learning to hash for large-scale retrieval
Main Author:
Other Authors:
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University, 2020
Subjects:
Online Access: https://hdl.handle.net/10356/139943
Institution: Nanyang Technological University
Summary: Learning to hash has become a crucial technique for analyzing the dramatically increasing volume of data involved in machine learning and computer vision applications. As hashing methods are applied at large scale, their training often faces assorted challenges arising from the training data, such as its sheer size or its structure. This motivates our research on training-efficient hashing methods.
In the training of shallow-model-based supervised hashing, the supervision is often encoded as an instance-pairwise similarity matrix in the objective function. However, the size of this matrix is quadratic in the number of labeled training instances, which is prohibitively expensive in space for large-scale learning problems. This limitation hinders the full utilization of labeled instances for learning a more precise hashing model. To overcome it, we propose our first model, a class-wise supervised hashing method. The hash model is trained on a class-pairwise similarity matrix, which in many applications is much smaller than its instance-pairwise counterpart. Besides a set of hash functions, the proposed method also learns a set of class-wise code prototypes with active bits for different classes. These code prototypes help to learn more precise compact codes for semantic information retrieval.
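The space saving can be made concrete with a minimal sketch. The toy sizes and the ±1 similarity encoding below are illustrative assumptions, not the exact construction used in the thesis:

```python
import numpy as np

# With n labeled instances spread over c classes and c << n, a class-pairwise
# similarity matrix is far cheaper to store than an instance-pairwise one.
n, c = 100_000, 10
labels = np.random.default_rng(0).integers(0, c, size=n)

# c x c class-pairwise similarity: +1 for the same class, -1 otherwise.
S_class = 2 * np.eye(c) - 1

# The instance-pairwise counterpart would be n x n:
#   S_inst = 2 * (labels[:, None] == labels[None, :]) - 1
# which at n = 100,000 needs 10^10 entries, versus c^2 = 100 for S_class.
print(S_class.shape)  # (10, 10)
```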
As deep learning has shown superior performance on many computer vision applications, recent learning-based hashing models have been moving from shallow architectures to deep ones. However, the gradient-descent-based optimization used in deep hashing can cause the hash codes of a pair of training instances to be updated towards each other simultaneously. In the worst case, the paired hash codes swap their directions after the update, so their distance in Hamming space remains unchanged, making the overall learning process highly inefficient. To address this issue, we propose a new deep hashing model integrated with a novel gradient attention mechanism. The mechanism assigns weights to gradients during back-propagation, reducing the probability that the paired hash codes change simultaneously and thereby accelerating learning.
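A minimal sketch makes the idea concrete. The relaxed inner-product pairwise loss and the norm-based softmax weighting below are illustrative assumptions, not the thesis's exact loss or attention mechanism:

```python
import numpy as np

def pairwise_grads(u, v, s):
    # Relaxed pairwise loss l = (u.v / b - s)^2, a common surrogate in deep
    # hashing (an assumption here). Note dl/du points along v and vice versa,
    # so a naive update moves both codes toward each other at once.
    b = len(u)
    diff = u @ v / b - s
    return 2 * diff * v / b, 2 * diff * u / b  # dl/du, dl/dv

def attention_weights(gu, gv):
    # Hypothetical weighting: softmax over gradient norms, so most of the
    # update goes to one code of the pair, lowering the chance that both
    # codes flip direction in the same step.
    norms = np.array([np.linalg.norm(gu), np.linalg.norm(gv)])
    e = np.exp(norms - norms.max())
    return e / e.sum()

rng = np.random.default_rng(0)
u, v = rng.standard_normal(16), rng.standard_normal(16)
gu, gv = pairwise_grads(u, v, s=1.0)
wu, wv = attention_weights(gu, gv)
lr = 0.1
u -= lr * wu * gu  # attended update: the pair no longer moves symmetrically
v -= lr * wv * gv
```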
In large-scale applications, the training data may arrive sequentially, so existing off-line hash models are unsuitable for processing it. This gives rise to online learning to hash. To address this problem, we propose our third model, an online hashing method. Specifically, a new loss function is proposed to measure the similarity loss between a pair of data samples in Hamming space. A structured hash model is then derived and optimized in a passive-aggressive way, and a theoretical upper bound on the cumulative loss of the proposed online hash model is provided. Furthermore, we extend the method from single-model to multi-model online hashing, which trains and retains multiple diverse models in order to avoid biased updates.
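The following sketch shows the general shape of one passive-aggressive step for a linear hash model. The hinge-style loss on a relaxed Hamming similarity, the margin, and the aggressiveness cap C are illustrative assumptions in the style of standard passive-aggressive learning, not the thesis's exact formulation:

```python
import numpy as np

def pa_hash_update(W, x1, x2, s, margin=0.5, C=1.0):
    # One passive-aggressive step for a linear hash model h(x) = sign(W @ x),
    # given a pair (x1, x2) with similarity label s in {+1, -1}.
    h1, h2 = np.sign(W @ x1), np.sign(W @ x2)
    sim = h1 @ h2 / len(h1)          # relaxed Hamming similarity in [-1, 1]
    loss = max(0.0, margin - s * sim)
    if loss == 0.0:                  # passive: the pair is already consistent
        return W
    # Aggressive: step along a (sub)gradient of the loss w.r.t. W, with the
    # step size capped by C as in standard passive-aggressive learning.
    G = -s * (np.outer(h2, x1) + np.outer(h1, x2)) / len(h1)
    tau = min(C, loss / (np.linalg.norm(G) ** 2 + 1e-12))
    return W - tau * G

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 32))    # 16 bits, 32-d features
W = pa_hash_update(W, rng.standard_normal(32), rng.standard_normal(32), s=1)
```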
Moreover, in city-scale or country-scale information retrieval applications, data is often collected and stored on geographically distributed machines. Given the extremely high data transmission cost, it is impractical to centralize all the data on a single machine to train the hash model, which gives rise to distributed-optimization-based hashing methods. Most existing work on distributed hashing is based on the Alternating Direction Method of Multipliers (ADMM). However, ADMM-based algorithms require intensive communication to update variables, which becomes the most significant bottleneck in a geo-distributed computing environment. To address this issue, we propose a communication-efficient distributed hashing algorithm in which each local machine approximates the global objective using local information together with periodically updated global information, so communication is required only at periodic synchronization rounds. The local approximate objective is solved by Riemannian optimization, and its convergence is analyzed using properties of the objective function on Riemannian manifolds.
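A rough sketch of the two ingredients follows. The surrogate-gradient correction is written in the spirit of communication-efficient methods such as DANE, and the Stiefel-manifold constraint with a QR retraction is a common choice in hashing; both are assumptions for illustration, not the thesis's exact construction:

```python
import numpy as np

def surrogate_grad(g_local, g_local_sync, g_global_sync):
    # Local approximation of the global gradient: correct the current local
    # gradient with global information exchanged only at sync rounds, so no
    # per-iteration communication is needed between rounds.
    return g_local - g_local_sync + g_global_sync

def riemannian_step(W, G, lr=0.1):
    # One descent step on the Stiefel manifold {W : W^T W = I}: project the
    # Euclidean gradient G onto the tangent space at W, then retract back to
    # the manifold via a QR factorization.
    tangent = G - W @ (W.T @ G + G.T @ W) / 2
    Q, R = np.linalg.qr(W - lr * tangent)
    return Q * np.sign(np.diag(R))  # sign fix keeps the retraction unique
```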
To sum up, in this dissertation we propose a series of learning-to-hash methods that address the challenges of training hash models in large-scale applications, and we verify the state-of-the-art performance of all the proposed models through extensive experiments on benchmark datasets.