Aligning vision and language for image captioning using deep learning

A longstanding objective in the field of multi-modal research uniting computer vision and natural language processing is to develop models that can comprehend the intricate relationship between vision and language. In recent years, we have witnessed notable developments directed towards this objecti...

Full description

Saved in:
Bibliographic Details
Main Author: Cai, Chen
Other Authors: Yap Kim Hui
Format: Thesis-Doctor of Philosophy
Language:English
Published: Nanyang Technological University 2024
Subjects:
Online Access:https://hdl.handle.net/10356/181511
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English