Towards robust and efficient multimodal representation learning and fusion

In the past few years, multimodal learning has made significant progress. The goal of multimodal learning is to create models that can relate and process data from various modalities. One of the challenges is to learn useful representations efficiently given the heterogeneity of the data. Another is...

Bibliographic Details
Main Author:	Guo, Xiaobao
Other Authors:	Kong Wai-Kin Adams
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2025
Subjects:	Computer and Information Science Multimodal learning Multimodal fusion
Online Access:	https://hdl.handle.net/10356/182226
Institution:	Nanyang Technological University

Similar Items

Multimodal fusion for multimedia analysis: A survey
by: Atrey, P.K., et al.
Published: (2013)

Fusion of multimodal embeddings for ad-hoc video search
by: FRANCIS, Danny, et al.
Published: (2019)

Query-document-dependent fusion: A case study of multimodal music retrieval
by: Li, Z., et al.
Published: (2014)

Multimodal sentiment analysis using hierarchical fusion with context modeling
by: Majumder, Navonil, et al.
Published: (2020)

Adaptive multimodal fusion based similarity measures in music information retrieval
by: ZHANG BINGJUN
Published: (2011)

Document dependent fusion in multimodal music retrieval
by: Li, Z., et al.
Published: (2013)

Comprehensive query-dependent fusion using regression-on-folksonomies: A case study of multimodal music search
by: Zhang, B., et al.
Published: (2013)

MultiFusion: A boosting approach for multimedia fusion
by: Wang, X., et al.
Published: (2013)

Data efficient deep multimodal learning
by: Shen, Meng
Published: (2025)

Multimodal Music Information Retrieval: From Content Analysis to Multimodal Fusion
by: LI ZHONGHUA
Published: (2013)

Query-document-dependent fusion: A case study of multimodal music retrieval
by: LI, Zhonghua, et al.
Published: (2013)

Exploring a multimodal fusion-based deep learning network for detecting facial palsy
by: OO, Heng Yim Nicole, et al.
Published: (2024)

Sentic maxine: Multimodal affective fusion and emotional paths
by: Hupont, I., et al.
Published: (2014)

Fusing pairwise modalities for emotion recognition in conversations
by: Fan, Chunxiao, et al.
Published: (2024)

KnowleNet: knowledge fusion network for multimodal sarcasm detection
by: Yue, Tan, et al.
Published: (2023)

Large multimodal models for visual reasoning
by: Duong, Ngoc Yen
Published: (2024)

CONTEXT-AWARE FUSION FOR MULTI-MODAL BIOMETRICS: WHOM DO I LISTEN TO AND WHEN?
by: SIVASANKARAN DIVYA
Published: (2018)

Jointly optimizing sensing pipelines for multimodal mixed reality interaction
by: KANATTA GAMAGE, Ramesh Darshana Rathnayake, et al.
Published: (2020)

Knowledge-based multimodal information fusion for role recognition and situation assessment by using mobile robot
by: Yang, Chule, et al.
Published: (2020)

Structure-aware multimodal feature fusion for RGB-D scene classification and beyond
by: Wang, Anran, et al.
Published: (2020)

Multimodal few-shot classification without attribute embedding
by: Chang, Jun Qing, et al.
Published: (2024)

Revisiting disentanglement and fusion on modality and context in conversational multimodal emotion recognition
by: LI, Bobo, et al.
Published: (2023)

M3SA: Multimodal Sentiment Analysis based on multi-scale feature extraction and multi-task learning
by: LIN, Changkai, et al.
Published: (2024)

Comprehensive Query-Dependent Fusion Using Regression-on-Folksonomies: A Case Study of Multimodal Music Search
by: ZHANG, Bingjun, et al.
Published: (2009)

Jointly optimizing sensing pipelines for multimodal mixed reality interaction
by: RATHNAYAKE, Darshana, et al.
Published: (2020)

Sentic blending: Scalable multimodal fusion for the continuous interpretation of semantics and sentics
by: Cambria, E., et al.
Published: (2014)

Look, read and feel : benchmarking ads understanding with multimodal multitask learning
by: Zhang, Huaizheng, et al.
Published: (2021)

DIALOG SYSTEMS GO MULTIMODAL
by: LIAO LIZI
Published: (2019)

Multimode process monitoring based on robust dictionary learning with application to aluminium electrolysis process
by: Yang, Chunhua, et al.
Published: (2020)

A Multimodal Virtual Reality Inventory System
by: Ko, Kenneth King L., et al.
Published: (2023)

Analysing multimodality in an interactive digital environment: Software as a meta-semiotic tool
by: Smith, B.A., et al.
Published: (2014)

A multimodal and multilevel ranking framework for content-based video retrieval
by: HOI, Steven C. H., et al.
Published: (2007)

Multimodal digital semiotics: The interaction of language with other resources
by: O'Halloran, K.L., et al.
Published: (2016)

Towards IMACA: Intelligent multimodal affective conversational agent
by: Hussain, A., et al.
Published: (2014)

Autonomous soundscape augmentation with multimodal fusion of visual and participant-linked inputs
by: Ooi, Kenneth, et al.
Published: (2023)

A multi-image dataset based on social media
by: Gan, Junjie
Published: (2025)

REPRESENTATION LEARNING IN MULTIMODAL SPATIOTEMPORAL IMAGE-GUIDED MEDICAL PROCEDURES
by: MOBARAKOL ISLAM
Published: (2019)

REPRESENTATION LEARNING OF DATA WITH MULTIPLE MODALITIES WITH APPLICATIONS TO VISUAL QUESTION ANSWERING
by: ILIEVSKI ILIJA
Published: (2018)

MULTIMODAL INSTRUCTION IN INITIAL TEACHER TRAINING: PROSPECTS AND CHALLENGES
by: Trần, Thị Hiếu Thủy
Published: (2019)