Mitigating fine-grained hallucination by fine-tuning large vision-language models with caption rewrites

Large language models (LLMs) have shown remarkable performance in natural language processing (NLP) tasks. To comprehend and execute diverse human instructions over image data, instruction-tuned large vision-language models (LVLMs) have been introduced. However, LVLMs may suffer from different types...

Bibliographic Details
Main Authors: WANG, Lei, HE, Jiabang, LI, Shenshen, LIU, Ning, LIM, Ee-peng
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2024
Online Access: https://ink.library.smu.edu.sg/sis_research/8750
https://ink.library.smu.edu.sg/context/sis_research/article/9753/viewcontent/MitigatingFine_GrainedHallucination_av.pdf
Institution: Singapore Management University