Improving interpretable embeddings for ad-hoc video search with generative captions and multi-word concept bank
Aligning a user query with video clips in a cross-modal latent space, and aligning them with semantic concepts, are two mainstream approaches to ad-hoc video search (AVS). However, the effectiveness of existing approaches is bottlenecked by the small sizes of available video-text datasets and the low quality of...
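The cross-modal alignment the abstract refers to is typically realized by embedding the query text and the video clips into a shared vector space and ranking clips by similarity. The sketch below is purely illustrative (not the paper's implementation): it assumes a text encoder and a video encoder that map into the same d-dimensional space, with random vectors standing in for real encoder outputs.

```python
# Illustrative sketch of cross-modal retrieval for AVS (not the authors' code).
# Assumption: both encoders produce embeddings in a shared d-dimensional space;
# random vectors below stand in for real text/video encoder outputs.
import numpy as np

def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Normalize rows to unit length so a dot product equals cosine similarity."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(0)
d = 512                                                # assumed embedding dimension
query_emb = l2_normalize(rng.normal(size=(1, d)))      # stand-in for text-encoder output
video_embs = l2_normalize(rng.normal(size=(1000, d)))  # stand-ins for clip embeddings

# Rank all video clips by cosine similarity to the query embedding.
scores = (query_emb @ video_embs.T).ravel()
top_k = np.argsort(-scores)[:10]
print("Top-10 clip indices:", top_k)
```

Concept-based approaches, by contrast, score clips against a bank of detected semantic concepts matched to query terms; both scoring schemes can be fused at ranking time.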
Main Authors: WU, Jiaxin; NGO, Chong-wah; CHAN, Wing-Kwong
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2024
Online Access: https://ink.library.smu.edu.sg/sis_research/9288
https://ink.library.smu.edu.sg/context/sis_research/article/10288/viewcontent/2404.06173v1.pdf
Institution: Singapore Management University
Similar Items
- Interpretable embedding for ad-hoc video search
  by: WU, Jiaxin, et al.
  Published: (2020)
- SQL-like interpretable interactive video search
  by: WU, Jiaxin, et al.
  Published: (2021)
- Building descriptive and discriminative visual codebook for large-scale image applications
  by: Tian, Q., et al.
  Published: (2016)
- Morphologically-aware vocabulary reduction of word embeddings
  by: CHIA, Chong Cher, et al.
  Published: (2023)
- Fusion of multimodal embeddings for ad-hoc video search
  by: FRANCIS, Danny, et al.
  Published: (2019)