Neighbourhood structure preserving cross-modal embedding for video hyperlinking
Video hyperlinking is a task aiming to enhance the accessibility of large archives by establishing links between fragments of videos. The links model the aboutness between fragments for efficient traversal of video content. This paper addresses the problem of link construction from the perspective of cross-modal embedding. To this end, a generalized multi-modal auto-encoder is proposed. The encoder learns two embeddings from the visual and speech modalities, respectively, and each embedding performs self-modal and cross-modal translation. Furthermore, to preserve the neighbourhood structure of fragments, which is important for video hyperlinking, the auto-encoder is devised to model the data distribution of fragments in a dataset. Experiments are conducted on the Blip10000 dataset using the anchor fragments provided by the TRECVid Video Hyperlinking (LNK) task in 2016 and 2017. This paper shares empirical insights on a number of issues in cross-modal learning for video hyperlinking, including the preservation of neighbourhood structure in embedding, model fine-tuning, and the issue of missing modality.
Main Authors: | HAO, Yanbin; NGO, Chong-wah; HUET, Benoit |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2020 |
Subjects: | Task analysis; Visualization; Joining processes; Gallium nitride; Benchmark testing; Feature extraction; Neural networks; Video hyperlinking; cross-modal translation; structure-preserving learning; Graphics and Human Computer Interfaces; OS and Networks |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6305 https://ink.library.smu.edu.sg/context/sis_research/article/7308/viewcontent/08736841.pdf |
Institution: | Singapore Management University |
Language: | English |
id |
sg-smu-ink.sis_research-7308 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-73082021-11-23T06:59:30Z Neighbourhood structure preserving cross-modal embedding for video hyperlinking HAO, Yanbin NGO, Chong-wah HUET, Benoit Video hyperlinking is a task aiming to enhance the accessibility of large archives by establishing links between fragments of videos. The links model the aboutness between fragments for efficient traversal of video content. This paper addresses the problem of link construction from the perspective of cross-modal embedding. To this end, a generalized multi-modal auto-encoder is proposed. The encoder learns two embeddings from the visual and speech modalities, respectively, and each embedding performs self-modal and cross-modal translation. Furthermore, to preserve the neighbourhood structure of fragments, which is important for video hyperlinking, the auto-encoder is devised to model the data distribution of fragments in a dataset. Experiments are conducted on the Blip10000 dataset using the anchor fragments provided by the TRECVid Video Hyperlinking (LNK) task in 2016 and 2017. This paper shares empirical insights on a number of issues in cross-modal learning for video hyperlinking, including the preservation of neighbourhood structure in embedding, model fine-tuning, and the issue of missing modality. 2020-01-01T08:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6305 info:doi/10.1109/TMM.2019.2923121 https://ink.library.smu.edu.sg/context/sis_research/article/7308/viewcontent/08736841.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Task analysis Visualization Joining processes Gallium nitride Benchmark testing Feature extraction Neural networks Video hyperlinking cross-modal translation structure-preserving learning Graphics and Human Computer Interfaces OS and Networks |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Task analysis Visualization Joining processes Gallium nitride Benchmark testing Feature extraction Neural networks Video hyperlinking cross-modal translation structure-preserving learning Graphics and Human Computer Interfaces OS and Networks |
spellingShingle |
Task analysis Visualization Joining processes Gallium nitride Benchmark testing Feature extraction Neural networks Video hyperlinking cross-modal translation structure-preserving learning Graphics and Human Computer Interfaces OS and Networks HAO, Yanbin NGO, Chong-wah HUET, Benoit Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
description |
Video hyperlinking is a task aiming to enhance the accessibility of large archives by establishing links between fragments of videos. The links model the aboutness between fragments for efficient traversal of video content. This paper addresses the problem of link construction from the perspective of cross-modal embedding. To this end, a generalized multi-modal auto-encoder is proposed. The encoder learns two embeddings from the visual and speech modalities, respectively, and each embedding performs self-modal and cross-modal translation. Furthermore, to preserve the neighbourhood structure of fragments, which is important for video hyperlinking, the auto-encoder is devised to model the data distribution of fragments in a dataset. Experiments are conducted on the Blip10000 dataset using the anchor fragments provided by the TRECVid Video Hyperlinking (LNK) task in 2016 and 2017. This paper shares empirical insights on a number of issues in cross-modal learning for video hyperlinking, including the preservation of neighbourhood structure in embedding, model fine-tuning, and the issue of missing modality. |
format |
text |
author |
HAO, Yanbin NGO, Chong-wah HUET, Benoit |
author_facet |
HAO, Yanbin NGO, Chong-wah HUET, Benoit |
author_sort |
HAO, Yanbin |
title |
Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
title_short |
Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
title_full |
Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
title_fullStr |
Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
title_full_unstemmed |
Neighbourhood structure preserving cross-modal embedding for video hyperlinking |
title_sort |
neighbourhood structure preserving cross-modal embedding for video hyperlinking |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2020 |
url |
https://ink.library.smu.edu.sg/sis_research/6305 https://ink.library.smu.edu.sg/context/sis_research/article/7308/viewcontent/08736841.pdf |
_version_ |
1770575931136540672 |
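The abstract describes a generalized multi-modal auto-encoder whose two embeddings (visual and speech) each translate back into both modalities. The following is a minimal NumPy sketch of that self-/cross-modal translation scheme, not the authors' implementation: the dimensions, the linear encoder/decoder weights, and the squared-error reconstruction loss are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature and embedding sizes (the paper's are not given here).
D_V, D_S, D_E = 8, 6, 4  # visual dim, speech dim, embedding dim

# One linear encoder and one linear decoder per modality, for brevity.
W_enc = {"v": rng.normal(size=(D_V, D_E)), "s": rng.normal(size=(D_S, D_E))}
W_dec = {"v": rng.normal(size=(D_E, D_V)), "s": rng.normal(size=(D_E, D_S))}

def encode(x, m):
    """Embed a batch of modality-m features (m in {'v', 's'})."""
    return np.tanh(x @ W_enc[m])

def decode(z, m):
    """Translate an embedding back into modality m."""
    return z @ W_dec[m]

def translation_losses(x_v, x_s):
    """Mean-squared error over the four paths: v->v, v->s, s->v, s->s."""
    z_v, z_s = encode(x_v, "v"), encode(x_s, "s")
    diffs = {
        "v->v": decode(z_v, "v") - x_v,   # self-modal reconstruction
        "v->s": decode(z_v, "s") - x_s,   # cross-modal translation
        "s->v": decode(z_s, "v") - x_v,   # cross-modal translation
        "s->s": decode(z_s, "s") - x_s,   # self-modal reconstruction
    }
    return {k: float(np.mean(d ** 2)) for k, d in diffs.items()}

# Toy features for 5 video fragments.
x_v = rng.normal(size=(5, D_V))
x_s = rng.normal(size=(5, D_S))
losses = translation_losses(x_v, x_s)
print(sorted(losses))
```

Training would minimise the sum of the four losses; the neighbourhood-structure-preserving term described in the abstract (modelling the data distribution of fragments) would be an additional objective on the embeddings and is omitted from this sketch.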