Aligning vision and language for image captioning using deep learning
A longstanding objective in the field of multi-modal research uniting computer vision and natural language processing is to develop models that can comprehend the intricate relationship between vision and language. In recent years, we have witnessed notable developments directed towards this objecti...
Main Author: | |
---|---|
Other Authors: | |
Format: | Thesis-Doctor of Philosophy |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181511 |
Institution: | Nanyang Technological University |