Aligning vision and language for image captioning using deep learning

A longstanding objective in the field of multi-modal research uniting computer vision and natural language processing is to develop models that can comprehend the intricate relationship between vision and language. In recent years, we have witnessed notable developments directed towards this objecti...

Bibliographic Details
Main Author: Cai, Chen
Other Authors: Yap Kim Hui
Format: Thesis-Doctor of Philosophy
Language: English
Published: Nanyang Technological University 2024
Subjects:
Online Access: https://hdl.handle.net/10356/181511
Institution: Nanyang Technological University