Cross-modal graph with meta concepts for video captioning

Video captioning aims to interpret complex visual content as text descriptions, which requires the model to fully understand video scenes, including objects and their interactions. Prevailing methods adopt off-the-shelf object detection networks to generate object proposals and use the attention...

Bibliographic Details
Main Authors:	WANG, Hao, LIN, Guosheng, HOI, Steven C. H., MIAO, Chunyan
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2022
Subjects:	Semantics; Visualization; Feature extraction; Predictive models; Task analysis; Computational modeling; Location awareness; Video captioning; vision-and-language; Databases and Information Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/7245
Institution:	Singapore Management University

Similar Items