Learning generalized video memory for automatic video captioning

Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and ge...

Bibliographic Details
Main Authors: CHANG, Poo-Hee, TAN, Ah-hwee
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects:
CNN
Online Access: https://ink.library.smu.edu.sg/sis_research/6076
https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7079
record_format dspace
spelling sg-smu-ink.sis_research-7079
2023-08-03T14:49:05Z
2018-11-01T07:00:00Z
text
application/pdf
info:doi/10.1007/978-3-030-03014-8_16
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Memory model
Video captioning
Deep learning
Adaptive Resonance Theory
LSTM
CNN
Databases and Information Systems
Software Engineering
description Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and generalize the temporal structure in video. In this paper, we propose a new method, namely Generalized Video Memory (GVM), which utilizes a memory model to enhance video description generation. Based on a class of self-organizing neural networks, GVM's model is able to learn new video features incrementally. The learned generalized memory is further exploited to decode the associated sentences using an RNN. We evaluate our method on the YouTube2Text data set using BLEU and METEOR scores as a standard benchmark. Our results are shown to be competitive against other state-of-the-art methods.
format text
author CHANG, Poo-Hee
TAN, Ah-hwee
title Learning generalized video memory for automatic video captioning
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/6076
https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf
_version_ 1773551429145853952