Deep robust multilevel semantic hashing for multi-label cross-modal retrieval

Hashing based cross-modal retrieval has recently made significant progress. But straightforward embedding data from different modalities involving rich semantics into a joint Hamming space will inevitably produce false codes due to the intrinsic modality discrepancy and noises. We present a novel d...

وصف كامل

محفوظ في:
التفاصيل البيبلوغرافية
المؤلفون الرئيسيون: Song, Ge, Tan, Xiaoyang, Zhao, Jun, Yang, Ming
مؤلفون آخرون: School of Computer Science and Engineering
التنسيق: مقال
اللغة:English
منشور في: 2023
الموضوعات:
الوصول للمادة أونلاين:https://hdl.handle.net/10356/164098
الوسوم: إضافة وسم
لا توجد وسوم, كن أول من يضع وسما على هذه التسجيلة!
المؤسسة: Nanyang Technological University
اللغة: English
الوصف
الملخص:Hashing based cross-modal retrieval has recently made significant progress. But straightforward embedding data from different modalities involving rich semantics into a joint Hamming space will inevitably produce false codes due to the intrinsic modality discrepancy and noises. We present a novel deep Robust Multilevel Semantic Hashing (RMSH) for more accurate multi-label cross-modal retrieval. It seeks to preserve fine-grained similarity among data with rich semantics,i.e., multi-label, while explicitly require distances between dissimilar points to be larger than a specific value for strong robustness. For this, we give an effective bound of this value based on the information coding-theoretic analysis, and the above goals are embodied into a margin-adaptive triplet loss. Furthermore, we introduce pseudo-codes via fusing multiple hash codes to explore seldom-seen semantics, alleviating the sparsity problem of similarity information. Experiments on three benchmarks show the validity of the derived bounds, and our method achieves state-of-the-art performance.