Context-aware visual policy network for fine-grained image captioning

With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning. In particular, we are interested in generating longer, richer and more fine-grained sentences and paragr...

Bibliographic Details
Main Authors: Zha, Zheng-Jun, Liu, Daqing, Zhang, Hanwang, Zhang, Yongdong, Wu, Feng
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/162628
Institution: Nanyang Technological University