Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation

Unsupervised video hashing typically aims to learn a compact binary vector to represent complex video content without using manual annotations. Existing unsupervised hashing methods generally suffer from incomplete exploration of various perspective dependencies (e.g., long-range and short-range) an...

Bibliographic Details
Main Authors:	HAO, Yanbin, DUAN, Jingru, ZHANG, Hao, ZHU, Bin, ZHOU, Pengyuan, HE, Xiangnan
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Hashing feature contextualization unsupervised learning video retrieval Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/9014 https://ink.library.smu.edu.sg/context/sis_research/article/10017/viewcontent/mm22_video_hashing.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-10017
record_format	dspace
spelling	sg-smu-ink.sis_research-10017 2024-07-25T08:11:52Z Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation HAO, Yanbin DUAN, Jingru ZHANG, Hao ZHU, Bin ZHOU, Pengyuan HE, Xiangnan Unsupervised video hashing typically aims to learn a compact binary vector to represent complex video content without using manual annotations. Existing unsupervised hashing methods generally suffer from incomplete exploration of various perspective dependencies (e.g., long-range and short-range) and data structures that exist in visual contents, resulting in less discriminative hash codes. In this paper, we propose a Multi-granularity Contextualized and Multi-Structure preserved Hashing (MCMSH) method, exploring multiple axial contexts for discriminative video representation generation and various structural information for unsupervised learning simultaneously. Specifically, we delicately design three self-gating modules to separately model three granularities of dependencies (i.e., long/middle/short-range dependencies) and densely integrate them into MLP-Mixer for feature contextualization, leading to a novel model MC-MLP. To facilitate unsupervised learning, we investigate three kinds of data structures, including clusters, local neighborhood similarity structure, and inter/intra-class variations, and design a multi-objective task to train MC-MLP. These data structures show high complementarities in hash code learning. We conduct extensive experiments using three video retrieval benchmark datasets, demonstrating that our MCMSH not only boosts the performance of the backbone MLP-Mixer significantly but also outperforms the competing methods notably. Code is available at: https://github.com/haoyanbin918/MCMSH.
2022-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/9014 info:doi/10.1145/3503161.3547836 https://ink.library.smu.edu.sg/context/sis_research/article/10017/viewcontent/mm22_video_hashing.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Hashing feature contextualization unsupervised learning video retrieval Graphics and Human Computer Interfaces
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Hashing feature contextualization unsupervised learning video retrieval Graphics and Human Computer Interfaces
spellingShingle	Hashing feature contextualization unsupervised learning video retrieval Graphics and Human Computer Interfaces HAO, Yanbin DUAN, Jingru ZHANG, Hao ZHU, Bin ZHOU, Pengyuan HE, Xiangnan Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
description	Unsupervised video hashing typically aims to learn a compact binary vector to represent complex video content without using manual annotations. Existing unsupervised hashing methods generally suffer from incomplete exploration of various perspective dependencies (e.g., long-range and short-range) and data structures that exist in visual contents, resulting in less discriminative hash codes. In this paper, we propose a Multi-granularity Contextualized and Multi-Structure preserved Hashing (MCMSH) method, exploring multiple axial contexts for discriminative video representation generation and various structural information for unsupervised learning simultaneously. Specifically, we delicately design three self-gating modules to separately model three granularities of dependencies (i.e., long/middle/short-range dependencies) and densely integrate them into MLP-Mixer for feature contextualization, leading to a novel model MC-MLP. To facilitate unsupervised learning, we investigate three kinds of data structures, including clusters, local neighborhood similarity structure, and inter/intra-class variations, and design a multi-objective task to train MC-MLP. These data structures show high complementarities in hash code learning. We conduct extensive experiments using three video retrieval benchmark datasets, demonstrating that our MCMSH not only boosts the performance of the backbone MLP-Mixer significantly but also outperforms the competing methods notably. Code is available at: https://github.com/haoyanbin918/MCMSH.
format	text
author	HAO, Yanbin DUAN, Jingru ZHANG, Hao ZHU, Bin ZHOU, Pengyuan HE, Xiangnan
author_facet	HAO, Yanbin DUAN, Jingru ZHANG, Hao ZHU, Bin ZHOU, Pengyuan HE, Xiangnan
author_sort	HAO, Yanbin
title	Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
title_short	Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
title_full	Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
title_fullStr	Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
title_full_unstemmed	Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
title_sort	unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
publisher	Institutional Knowledge at Singapore Management University
publishDate	2022
url	https://ink.library.smu.edu.sg/sis_research/9014 https://ink.library.smu.edu.sg/context/sis_research/article/10017/viewcontent/mm22_video_hashing.pdf
_version_	1814047692713623552

Similar Items