Learning generalized video memory for automatic video captioning

Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and ge...


Bibliographic Details
Main Authors: CHANG, Poo-Hee, TAN, Ah-hwee
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2018
Subjects:
CNN
Online Access: https://ink.library.smu.edu.sg/sis_research/6076
https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7079
record_format dspace
spelling sg-smu-ink.sis_research-7079 2023-08-03T14:49:05Z 2018-11-01T07:00:00Z text application/pdf info:doi/10.1007/978-3-030-03014-8_16 http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Memory model
Video captioning
Deep learning
Adaptive Resonance Theory
LSTM
CNN
Databases and Information Systems
Software Engineering
description Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNN) and recurrent neural networks (RNN). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and generalize the temporal structure in video. In this paper, we propose a new method, Generalized Video Memory (GVM), which uses a memory model to enhance video description generation. Based on a class of self-organizing neural networks, GVM’s model is able to learn new video features incrementally. The learned generalized memory is then used to decode the associated sentences with an RNN. We evaluate our method on the YouTube2Text data set using BLEU and METEOR scores as a standard benchmark. Our results are competitive with other state-of-the-art methods.
format text
author CHANG, Poo-Hee
TAN, Ah-hwee
title Learning generalized video memory for automatic video captioning
publisher Institutional Knowledge at Singapore Management University
publishDate 2018
url https://ink.library.smu.edu.sg/sis_research/6076
https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf