Learning to collocate Visual-Linguistic Neural Modules for image captioning

Humans tend to decompose a sentence into different parts, such as "something does something at someplace", and then fill each part with certain content. Inspired by this, we follow the principle of modular design to propose a novel image captioner: learning to Collocate Visual-Linguistic Neural Modules (CVLNM). Unlike t...

Bibliographic Details
Main Authors:	Yang, Xu; Zhang, Hanwang; Gao, Chongyang; Cai, Jianfei
Other Authors:	School of Computer Science and Engineering
Format:	Article
Language:	English
Published:	2023
Subjects:	Engineering::Computer science and engineering; Image Captioning; Distinguishable Neural Modules
Online Access:	https://hdl.handle.net/10356/170425
Institution:	Nanyang Technological University

Similar Items