Context-aware visual policy network for fine-grained image captioning
As visual detection techniques mature, we grow more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning. In particular, we are interested in generating longer, richer and more fine-grained sentences and paragr...
Main Authors: Zha, Zheng-Jun; Liu, Daqing; Zhang, Hanwang; Zhang, Yongdong; Wu, Feng
Other Authors: School of Computer Science and Engineering
Format: Article
Language: English
Published: 2022
Online Access: https://hdl.handle.net/10356/162628
Institution: Nanyang Technological University
Similar Items
- Personalized Visual Information Captioning
  by: Wu Shuang
  Published: (2023)
- Deconfounded image captioning: a causal retrospect
  by: Yang, Xu, et al.
  Published: (2022)
- A Fine-Grained Spatial-Temporal Attention Model for Video Captioning
  by: Liu, A.-A., et al.
  Published: (2021)
- Learning to collocate Visual-Linguistic Neural Modules for image captioning
  by: Yang, Xu, et al.
  Published: (2023)
- Interactive change-aware transformer network for remote sensing image change captioning
  by: Cai, Chen, et al.
  Published: (2024)