Unifying text, tables, and images for multimodal question answering
Multimodal question answering (MMQA), which aims to derive answers from multiple knowledge modalities (e.g., text, tables, and images), has received increasing attention due to its broad applications. Current approaches to MMQA often rely on single-modal or bi-modal QA models, which limits their...
Main Authors: LUO, Haohao; SHEN, Ying; DENG, Yang
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2023
Online Access: https://ink.library.smu.edu.sg/sis_research/9120
https://ink.library.smu.edu.sg/context/sis_research/article/10123/viewcontent/2023.findings_emnlp.626.pdf
Institution: Singapore Management University
Similar Items
- The subjectivity and subjectification of modal auxiliaries in Chinese (汉语情态助动词的主观性和主观化)
  by: 杨黎黎, et al.
  Published: (2015)
- Cross-modal recipe retrieval with stacked attention model
  by: CHEN, Jing-Jing, et al.
  Published: (2018)
- Alleviating the inconsistency of multimodal data in cross-modal retrieval
  by: Li, Tieying, et al.
  Published: (2024)
- Cross-modal recipe retrieval: How to cook this dish?
  by: CHEN, Jingjing, et al.
  Published: (2017)
- Epistemic modality in TED talks on education
  by: Ton Nu, My Nhat, et al.
  Published: (2019)