Keyword-driven image captioning via Context-dependent Bilateral LSTM

Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information, which typically generate one sentence to describe each image with only a single contextual emphasis. In this paper, we address this limitation fro...

Bibliographic Details
Main Authors:	ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; LAU, Rynson W. H.
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2017
Subjects:	Image captioning Keyword-driven LSTM Artificial Intelligence and Robotics
Online Access:	https://ink.library.smu.edu.sg/sis_research/8497
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-9500
record_format	dspace
spelling	sg-smu-ink.sis_research-9500 2024-01-04T04:18:03Z Keyword-driven image captioning via Context-dependent Bilateral LSTM ZHANG, Xiaodan HE, Shengfeng SONG, Xinhang WEI, Pengxu JIANG, Shuqiang YE, Qixiang JIAO, Jianbin LAU, Rynson W. H. Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information, which typically generate one sentence to describe each image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method would generate various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model to predict a keyword-driven sentence by considering the word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords. 2017-07-14T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/8497 info:doi/10.1109/icme.2017.8019525 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Image captioning Keyword-driven LSTM Artificial Intelligence and Robotics
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Image captioning Keyword-driven LSTM Artificial Intelligence and Robotics
description	Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information, which typically generate one sentence to describe each image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method would generate various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model to predict a keyword-driven sentence by considering the word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords.
format	text
author	ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; LAU, Rynson W. H.
author_sort	ZHANG, Xiaodan
title	Keyword-driven image captioning via Context-dependent Bilateral LSTM
title_sort	keyword-driven image captioning via context-dependent bilateral lstm
publisher	Institutional Knowledge at Singapore Management University
publishDate	2017
url	https://ink.library.smu.edu.sg/sis_research/8497
_version_	1787590780556148736
