(Un)likelihood training for interpretable embedding
Cross-modal representation learning has become a new normal for bridging the semantic gap between text and visual data. Learning modality-agnostic representations in a continuous latent space, however, is often treated as a black-box, data-driven training process. It is well known that the effectiven...
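The abstract above stops before the method details, so as a rough illustration only: a "modality-agnostic representation in a continuous latent space" is typically obtained with a dual-encoder that projects text and video features into one shared space and ranks by cosine similarity. The sketch below follows that generic setup; the encoder shapes, random weights, and names are illustrative assumptions, not the paper's actual model.

```python
# Minimal sketch (not from the paper): a generic dual-encoder that maps text and
# video features into a shared latent space and ranks videos by cosine similarity.
# Feature dimensions and random projection weights are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

TEXT_DIM, VIDEO_DIM, LATENT_DIM = 300, 2048, 256

# Hypothetical linear encoders standing in for learned text/video networks.
W_text = rng.normal(size=(TEXT_DIM, LATENT_DIM))
W_video = rng.normal(size=(VIDEO_DIM, LATENT_DIM))

def encode(features: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Project modality-specific features into the shared space and L2-normalize."""
    z = features @ weights
    return z / np.linalg.norm(z, axis=-1, keepdims=True)

# One text query and a small collection of candidate video feature vectors.
query_feat = rng.normal(size=(1, TEXT_DIM))
video_feats = rng.normal(size=(5, VIDEO_DIM))

q = encode(query_feat, W_text)    # shape (1, LATENT_DIM)
v = encode(video_feats, W_video)  # shape (5, LATENT_DIM)

# Cosine similarity in the shared space; higher score means a better match.
scores = (q @ v.T).ravel()
ranking = np.argsort(-scores)
print("similarity scores:", np.round(scores, 3))
print("ranked video indices:", ranking)
```

Because each latent dimension in such a space carries no fixed meaning, retrieval behaves as a black box; the paper's interpretable-embedding angle is aimed at exactly that limitation.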
Saved in:
Main Authors: WU, Jiaxin; NGO, Chong-wah; CHAN, Wing-Kwong; HOU, Zhijian
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2023
Online Access: https://ink.library.smu.edu.sg/sis_research/9819
https://ink.library.smu.edu.sg/context/sis_research/article/10819/viewcontent/2207.00282v3.pdf
Institution: Singapore Management University
Similar Items
- Interpretable embedding for ad-hoc video search
  by: WU, Jiaxin, et al.
  Published: (2020)
- Improving interpretable embeddings for ad-hoc video search with generative captions and multi-word concept bank
  by: WU, Jiaxin, et al.
  Published: (2024)
- Neural network representation similarity revisited
  by: WANG, Yuhui
  Published: (2024)
- SQL-like interpretable interactive video search
  by: WU, Jiaxin, et al.
  Published: (2021)
- Cross-modal recipe retrieval with stacked attention model
  by: CHEN, Jing-Jing, et al.
  Published: (2018)