Fusion of multimodal embeddings for ad-hoc video search

The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query descriptions. Bridging the semantic gap between AVS queries and videos becomes highly difficult, as evidenced by the low retrieval accuracy of AVS bench...

Bibliographic Details
Main Authors: FRANCIS, Danny; NGUYEN, Phuong Anh; HUET, Benoit; NGO, Chong-wah
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2019
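Subjects: Deep learning; Multimedia; Multimodal embeddings; Multimodal fusion; Video search; Databases and Information Systems; Graphics and Human Computer Interfaces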
Online Access: https://ink.library.smu.edu.sg/sis_research/6462
https://ink.library.smu.edu.sg/context/sis_research/article/7465/viewcontent/Francis_Fusion_of_Multimodal_Embeddings_for_Ad_Hoc_Video_Search_ICCVW_2019_paper.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7465
record_format dspace
spelling sg-smu-ink.sis_research-7465
2022-01-10T06:07:50Z
2019-10-01T07:00:00Z
application/pdf
info:doi/10.1109/ICCVW.2019.00233
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Deep learning
Multimedia
Multimodal embeddings
Multimodal fusion
Video search
Databases and Information Systems
Graphics and Human Computer Interfaces
description The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query descriptions. Bridging the semantic gap between AVS queries and videos becomes highly difficult, as evidenced by the low retrieval accuracy of AVS benchmarking in TRECVID. In this paper, we study a new method to fuse multimodal embeddings that have been derived from completely disjoint datasets. This method is tested on two datasets for two distinct tasks: on MSR-VTT for unique video retrieval and on V3C1 for multiple-video retrieval.
format text
author FRANCIS, Danny
NGUYEN, Phuong Anh
HUET, Benoit
NGO, Chong-wah
title Fusion of multimodal embeddings for ad-hoc video search
publisher Institutional Knowledge at Singapore Management University
publishDate 2019
url https://ink.library.smu.edu.sg/sis_research/6462
https://ink.library.smu.edu.sg/context/sis_research/article/7465/viewcontent/Francis_Fusion_of_Multimodal_Embeddings_for_Ad_Hoc_Video_Search_ICCVW_2019_paper.pdf
_version_ 1770575967201263616