Learning generalized video memory for automatic video captioning
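Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and generalize the temporal structure in video. In this paper, we propose a new method, namely Generalized Video Memory (GVM), utilizing a memory model for enhancing video description generation. Based on a class of self-organizing neural networks, GVM’s model is able to learn new video features incrementally. The learned generalized memory is further exploited to decode the associated sentences using an RNN. We evaluate our method on the YouTube2Text data set using BLEU and METEOR scores as a standard benchmark. Our results are shown to be competitive against other state-of-the-art methods.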
Main Authors: | CHANG, Poo-Hee; TAN, Ah-hwee |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2018 |
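Subjects: | Memory model; Video captioning; Deep learning; Adaptive Resonance Theory; LSTM; CNN; Databases and Information Systems; Software Engineering |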
Online Access: | https://ink.library.smu.edu.sg/sis_research/6076 https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-7079 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-7079 2023-08-03T14:49:05Z Learning generalized video memory for automatic video captioning CHANG, Poo-Hee; TAN, Ah-hwee Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and generalize the temporal structure in video. In this paper, we propose a new method, namely Generalized Video Memory (GVM), utilizing a memory model for enhancing video description generation. Based on a class of self-organizing neural networks, GVM’s model is able to learn new video features incrementally. The learned generalized memory is further exploited to decode the associated sentences using an RNN. We evaluate our method on the YouTube2Text data set using BLEU and METEOR scores as a standard benchmark. Our results are shown to be competitive against other state-of-the-art methods. 2018-11-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6076 info:doi/10.1007/978-3-030-03014-8_16 https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Memory model; Video captioning; Deep learning; Adaptive Resonance Theory; LSTM; CNN; Databases and Information Systems; Software Engineering |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Memory model; Video captioning; Deep learning; Adaptive Resonance Theory; LSTM; CNN; Databases and Information Systems; Software Engineering |
description |
Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and generalize the temporal structure in video. In this paper, we propose a new method, namely Generalized Video Memory (GVM), utilizing a memory model for enhancing video description generation. Based on a class of self-organizing neural networks, GVM’s model is able to learn new video features incrementally. The learned generalized memory is further exploited to decode the associated sentences using an RNN. We evaluate our method on the YouTube2Text data set using BLEU and METEOR scores as a standard benchmark. Our results are shown to be competitive against other state-of-the-art methods. |
format |
text |
author |
CHANG, Poo-Hee; TAN, Ah-hwee |
author_facet |
CHANG, Poo-Hee; TAN, Ah-hwee |
author_sort |
CHANG, Poo-Hee |
title |
Learning generalized video memory for automatic video captioning |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2018 |
url |
https://ink.library.smu.edu.sg/sis_research/6076 https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf |
_version_ |
1773551429145853952 |