Aligning vision and language for image captioning using deep learning

A longstanding objective in the field of multi-modal research uniting computer vision and natural language processing is to develop models that can comprehend the intricate relationship between vision and language. In recent years, we have witnessed notable developments directed towards this objecti...

Bibliographic Details
Main Author:	Cai, Chen
Other Authors:	Yap Kim Hui
Format:	Thesis-Doctor of Philosophy
Language:	English
Published:	Nanyang Technological University 2024
Subjects:	Computer and Information Science; Computer vision; Natural language processing
Online Access:	https://hdl.handle.net/10356/181511
Institution:	Nanyang Technological University

Similar Items

Bridging images and natural language with deep learning
by: Gu, Jiuxiang
Published: (2019)

Deep learning for x-ray vision
by: Ng, Kenneth Chen Ee
Published: (2021)

Diffusion models for natural language processing
by: Hoang, Minh Nhat
Published: (2024)

Mitigating fine-grained hallucination by fine-tuning large vision-language models with caption rewrites
by: WANG, Lei, et al.
Published: (2024)

Generative image captioning in Urdu using deep learning
by: Afzal M.K.
Published: (2023)

Image artefact removal using deep learning
by: Sanchari, Das
Published: (2022)

Deep learning-based image captioning
by: Chong, Kaydon
Published: (2019)

Highly controllable human motion generation model
by: Huang, Jingfang
Published: (2024)

Empowering natural language processing in low-resource regimes
by: Feng, Zijian
Published: (2025)

Deep learning for medical image analysis
by: Yang, Ivan Sze Yuan
Published: (2020)

Semantic, syntactic and joint deep learning of event extraction
by: Hao, Anran
Published: (2025)

Natural language generator for SUMO
by: Ureta, Danielle Erika Y.
Published: (2012)

Image and video generation via deep learning
by: Jiang, Liming
Published: (2023)

Cross-modal graph with meta concepts for video captioning
by: Wang, Hao, et al.
Published: (2022)

Image preprocessing using quick color averaging approach for color machine vision (CMV) systems
by: Luta, Raphael Benedict G., et al.
Published: (2017)

Punctuation restoration for speech transcripts using large language models
by: Liu, Changsong
Published: (2024)

System reliability enhancement via deep-driven computer vision
by: Ding, Shuya
Published: (2021)

Deep disentangling learning for real-world image enlightening and restoration
by: Chan, Yi Xuan
Published: (2022)

Enhancing contextual understanding in NLP: adapting state-of-the-art models for improved sentiment analysis of informal language
by: Sneha Ravisankar
Published: (2024)

Benchmarking embedded deep learning hardware for computer vision
by: Ching, Amos Li En
Published: (2020)

Neural image and video captioning
by: Lam, Ting En
Published: (2024)

Image quality assessment based label smoothing in deep neural network learning
by: Chen, Zhou
Published: (2018)

Deep learning for human motion generation
by: Gu, Chenyang
Published: (2024)

Tracking human mobility using Twitter through natural language processing techniques
by: Ver, Andrea Nicole O.
Published: (2018)

Image processing algorithms for dynamic vision sensors
by: Wang, Lun
Published: (2023)

Federated learning for natural language processing in medical domain
by: Saraf, Ishita
Published: (2024)

INDONESIAN IMAGE CAPTIONING USING VISION-LANGUAGE MODEL
by: Astrada Fathurrahman, Raihan

Use of word and character N-grams for low-resourced local languages
by: Regalado, Ralph Vincent, et al.
Published: (2019)

Crowd monitoring using deep learning
by: Tan, Raymond Rui Ming
Published: (2021)

Image recognition based on deep learning of convolutional neural networks
by: Xie, Cong
Published: (2019)

Deep image enhancement
by: Han, Jun
Published: (2021)

Benchmarking neuromorphic vision: Lessons learnt from computer vision
by: Tan, C, et al.
Published: (2020)

Facial expression recognition using deep learning
by: Wang, Xiao Yi
Published: (2024)

Sentiment analysis of the Burmese language using the distributed representation of n-gram-based words
by: Myat lay phyu
Published: (2023)

Fine-grained image classification using deep learning
by: Sun, Deguang
Published: (2022)

Communicating effectively with the hearing impaired
by: Cheng, Eddy Kuan Quan
Published: (2024)

Automated image quality assessment and its applications in computer vision
by: Zhou, Phoebe Huixin
Published: (2022)

Identification of foreign materials in food using passive terahertz imaging and deep learning
by: Ong, Eng Zia
Published: (2022)

Emergent semantic segmentation: training-free dense-label-free extraction from vision-language models
by: Luo, Jiayun
Published: (2024)

Classification of white blood cells using deep learning
by: Zhang, Mengxin
Published: (2022)