Mitigating style-image hallucination in large vision language models


Bibliographic Details
Main Author: He, Guoshun
Other Authors: Alex Chichung Kot
Format: Thesis-Master by Coursework
Language: English
Published: Nanyang Technological University 2025
Subjects:
Online Access:https://hdl.handle.net/10356/182918
Institution: Nanyang Technological University
Description
Summary: Large vision-language models are widely applied across various domains, yet a significant challenge remains: their performance deteriorates sharply in out-of-domain scenarios, often leading to increased hallucinations. Despite its importance, this phenomenon has received limited attention in academic research. To address this, we first construct a benchmark dataset using style transfer techniques and use it to evaluate the out-of-domain performance of several popular large-scale models. Building on these findings, we introduce CopeCap, a lightweight image captioning model that leverages collaborative prompting to achieve strong out-of-domain performance without requiring additional training.