Keyword-driven image captioning via Context-dependent Bilateral LSTM
Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information: they typically generate one sentence per image with only a single contextual emphasis. In this paper, we address this limitation fro...
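Main Authors: | ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; LAU, Rynson W. H. |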
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2017 |
Subjects: | Image captioning; Keyword-driven; LSTM; Artificial Intelligence and Robotics |
Online Access: | https://ink.library.smu.edu.sg/sis_research/8497 |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-9500 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-9500 2024-01-04T04:18:03Z Keyword-driven image captioning via Context-dependent Bilateral LSTM ZHANG, Xiaodan HE, Shengfeng SONG, Xinhang WEI, Pengxu JIANG, Shuqiang YE, Qixiang JIAO, Jianbin LAU, Rynson W. H. Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information: they typically generate one sentence per image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method generates various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model that predicts a keyword-driven sentence by considering the word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords. 2017-07-14T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/8497 info:doi/10.1109/icme.2017.8019525 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Image captioning Keyword-driven LSTM Artificial Intelligence and Robotics |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Image captioning; Keyword-driven; LSTM; Artificial Intelligence and Robotics |
spellingShingle |
Image captioning Keyword-driven LSTM Artificial Intelligence and Robotics ZHANG, Xiaodan HE, Shengfeng SONG, Xinhang WEI, Pengxu JIANG, Shuqiang YE, Qixiang JIAO, Jianbin LAU, Rynson W. H. Keyword-driven image captioning via Context-dependent Bilateral LSTM |
description |
Image captioning has recently received much attention. Existing approaches, however, are limited to describing images with simple contextual information: they typically generate one sentence per image with only a single contextual emphasis. In this paper, we address this limitation from a user perspective with a novel approach. Given some keywords as additional inputs, the proposed method generates various descriptions according to the provided guidance. Hence, descriptions with different focuses can be generated for the same image. Our method is based on a new Context-dependent Bilateral Long Short-Term Memory (CDB-LSTM) model that predicts a keyword-driven sentence by considering the word dependence. The word dependence is explored externally with a bilateral pipeline, and internally with a unified and joint training process. Experiments on the MS COCO dataset demonstrate that the proposed approach not only significantly outperforms the baseline method but also shows good adaptation and consistency with various keywords. |
format |
text |
author |
ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; LAU, Rynson W. H. |
author_facet |
ZHANG, Xiaodan; HE, Shengfeng; SONG, Xinhang; WEI, Pengxu; JIANG, Shuqiang; YE, Qixiang; JIAO, Jianbin; LAU, Rynson W. H. |
author_sort |
ZHANG, Xiaodan |
title |
Keyword-driven image captioning via Context-dependent Bilateral LSTM |
title_short |
Keyword-driven image captioning via Context-dependent Bilateral LSTM |
title_full |
Keyword-driven image captioning via Context-dependent Bilateral LSTM |
title_fullStr |
Keyword-driven image captioning via Context-dependent Bilateral LSTM |
title_full_unstemmed |
Keyword-driven image captioning via Context-dependent Bilateral LSTM |
title_sort |
keyword-driven image captioning via context-dependent bilateral lstm |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2017 |
url |
https://ink.library.smu.edu.sg/sis_research/8497 |
_version_ |
1787590780556148736 |