Learning generalized video memory for automatic video captioning

Recent video captioning methods have made great progress through deep learning approaches with convolutional neural networks (CNNs) and recurrent neural networks (RNNs). While there are techniques that use memory networks for sentence decoding, few works have leveraged the memory component to learn and ge...

Bibliographic Details
Main Authors:	CHANG, Poo-Hee; TAN, Ah-hwee
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2018
Subjects:	Memory model; Video captioning; Deep learning; Adaptive Resonance Theory; LSTM; CNN; Databases and Information Systems; Software Engineering
Online Access:	https://ink.library.smu.edu.sg/sis_research/6076 https://ink.library.smu.edu.sg/context/sis_research/article/7079/viewcontent/Multi_disciplinary_Trends_in_Artificial_Intelligence.pdf

Similar Items

Image captioning via semantic element embedding
by: ZHANG, Xiaodan, et al.
Published: (2020)

PERSONALIZED VISUAL INFORMATION CAPTIONING
by: WU SHUANG
Published: (2023)

Semantic-filtered Soft-Split-Aware video captioning with audio-augmented feature
by: Xu, Yuecong, et al.
Published: (2021)

Interactive video search with multi-modal LLM video captioning
by: CHENG, Yu-Tong, et al.
Published: (2025)

Cross-modal graph with meta concepts for video captioning
by: Wang, Hao, et al.
Published: (2022)

A Fine-Grained Spatial-Temporal Attention Model for Video Captioning
by: Liu, A.-A., et al.
Published: (2021)

Deep learning-based quality of experience (QoE) classification for optimal resource allocation in manufacturing networks
by: Lim, Sean Kuan Hwee
Published: (2025)

Dynamic captioning: Video accessibility enhancement for hearing impairment
by: Hong, R., et al.
Published: (2013)

Cross-modal graph with meta concepts for video captioning
by: WANG, Hao, et al.
Published: (2022)

Play and Rewind: Optimizing Binary Representations of Videos by Self-Supervised Temporal Hashing
by: Hanwang Zhang, et al.
Published: (2020)

An Investigation of Deep Learning Models for EEG-Based Emotion Recognition
by: Zhang, Y., et al.
Published: (2021)

Semantic memory modeling and memory interaction in learning agents
by: WANG, Wenwen, et al.
Published: (2016)

Learning transferable perturbations for image captioning
by: WU, Hanjie, et al.
Published: (2022)

EFFICIENT DEEP CONVOLUTIONAL ARCHITECTURES FOR IMAGE AND VIDEO REPRESENTATION LEARNING
by: CHEN YUNPENG
Published: (2019)

Self-organizing neural networks for universal learning and multimodal memory encoding
by: TAN, Ah-hwee, et al.
Published: (2019)

A Qualitative Study of Closed Captions in English Language Teaching (ELT) YouTube Videos
by: Hernandez, Queenie Mae G., et al.
Published: (2024)

Context-aware visual policy network for fine-grained image captioning
by: Zha, Zheng-Jun, et al.
Published: (2022)

Automatic parsing and indexing of news video
by: Zhang, H., et al.
Published: (2014)

Memory formation, consolidation, and forgetting in learning agents
by: SUBAGDJA, Budhitama, et al.
Published: (2012)

Memory formation, consolidation, and forgetting in learning agents
by: SUBAGDJA, Budhitama, et al.
Published: (2012)

Deconfounded image captioning: a causal retrospect
by: Yang, Xu, et al.
Published: (2022)

Video accessibility enhancement for hearing-impaired users
by: Hong, R., et al.
Published: (2013)

Deep learning-based classification and segmentation of single-molecule fluorescence time traces
by: Cheng, Hongjing
Published: (2025)

Deep learning algorithm to detect tree defects via circular scans
by: Nguyen, Thanh Tin
Published: (2024)

Critical video quality for distributed automated video surveillance
by: Korshunov, P., et al.
Published: (2014)

CgT-GAN: CLIP-guided text GAN for image captioning
by: YU, Jiarui, et al.
Published: (2023)

Neural modeling of sequential inferences and learning over episodic memory
by: SUBAGDJA, Budhitama, et al.
Published: (2015)

Deep learning-based construction activity classification
by: Lian, Si Hui
Published: (2024)

Learning to collocate Visual-Linguistic Neural Modules for image captioning
by: Yang, Xu, et al.
Published: (2023)

Stack-VS : stacked visual-semantic attention for image caption generation
by: Cheng, Ling, et al.
Published: (2021)

Keyword-driven image captioning via Context-dependent Bilateral LSTM
by: ZHANG, Xiaodan, et al.
Published: (2017)

Who You Are Decides How You Tell
by: WU SHUANG, et al.
Published: (2020)

AmpSum: adaptive multiple-product summarization towards improving recommendation captions
by: TRUONG, Quoc Tuan, et al.
Published: (2022)

Interactive change-aware transformer network for remote sensing image change captioning
by: Cai, Chen, et al.
Published: (2024)

Automatic summarization of music videos
by: Shao, X., et al.
Published: (2013)

Affect-based adaptive presentation of home videos
by: Xiang, X., et al.
Published: (2013)

Fusion of multimodal embeddings for ad-hoc video search
by: FRANCIS, Danny, et al.
Published: (2019)

DeepQoE : a multimodal learning framework for video quality of experience (QoE) prediction
by: Zhang, Huaizheng, et al.
Published: (2021)

Automatic video logo detection and removal
by: Yan, W.-Q., et al.
Published: (2013)

A REVIEW ON THE APPLICATION OF VIDEO ANALYTICS USING 5G FOR FACILITIES MANAGEMENT
by: LAW KHAI MIN JASMINE
Published: (2024)