Fusion of multimodal embeddings for ad-hoc video search
The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query description. Bridging the semantic gap between AVS queries and videos becomes highly difficult as evidenced from the low retrieval accuracy of AVS bench...
Main Authors: | FRANCIS, Danny; NGUYEN, Phuong Anh; HUET, Benoit; NGO, Chong-wah |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2019 |
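Subjects: | Deep learning; Multimedia; Multimodal embeddings; Multimodal fusion; Video search; Databases and Information Systems; Graphics and Human Computer Interfaces |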
Online Access: | https://ink.library.smu.edu.sg/sis_research/6462 https://ink.library.smu.edu.sg/context/sis_research/article/7465/viewcontent/Francis_Fusion_of_Multimodal_Embeddings_for_Ad_Hoc_Video_Search_ICCVW_2019_paper.pdf |
Institution: | Singapore Management University |
id | sg-smu-ink.sis_research-7465 |
---|---|
record_format | dspace |
spelling | sg-smu-ink.sis_research-7465 2022-01-10T06:07:50Z Fusion of multimodal embeddings for ad-hoc video search FRANCIS, Danny NGUYEN, Phuong Anh HUET, Benoit NGO, Chong-wah The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query description. Bridging the semantic gap between AVS queries and videos becomes highly difficult as evidenced from the low retrieval accuracy of AVS benchmarking in TRECVID. In this paper, we study a new method to fuse multimodal embeddings which have been derived based on completely disjoint datasets. This method is tested on two datasets for two distinct tasks: on MSR-VTT for unique video retrieval and on V3C1 for multiple videos retrieval. 2019-10-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6462 info:doi/10.1109/ICCVW.2019.00233 https://ink.library.smu.edu.sg/context/sis_research/article/7465/viewcontent/Francis_Fusion_of_Multimodal_Embeddings_for_Ad_Hoc_Video_Search_ICCVW_2019_paper.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Deep learning Multimedia Multimodal embeddings Multimodal fusion Video search Databases and Information Systems Graphics and Human Computer Interfaces |
institution | Singapore Management University |
building | SMU Libraries |
continent | Asia |
country | Singapore |
content_provider | SMU Libraries |
collection | InK@SMU |
language | English |
topic | Deep learning Multimedia Multimodal embeddings Multimodal fusion Video search Databases and Information Systems Graphics and Human Computer Interfaces |
description | The challenge of Ad-Hoc Video Search (AVS) originates from free-form (i.e., no pre-defined vocabulary) and freestyle (i.e., natural language) query description. Bridging the semantic gap between AVS queries and videos becomes highly difficult as evidenced from the low retrieval accuracy of AVS benchmarking in TRECVID. In this paper, we study a new method to fuse multimodal embeddings which have been derived based on completely disjoint datasets. This method is tested on two datasets for two distinct tasks: on MSR-VTT for unique video retrieval and on V3C1 for multiple videos retrieval. |
format | text |
author | FRANCIS, Danny NGUYEN, Phuong Anh HUET, Benoit NGO, Chong-wah |
title | Fusion of multimodal embeddings for ad-hoc video search |
publisher | Institutional Knowledge at Singapore Management University |
publishDate | 2019 |
url | https://ink.library.smu.edu.sg/sis_research/6462 https://ink.library.smu.edu.sg/context/sis_research/article/7465/viewcontent/Francis_Fusion_of_Multimodal_Embeddings_for_Ad_Hoc_Video_Search_ICCVW_2019_paper.pdf |