Mitigating style-image hallucination in large vision language models
LLMs are widely applied across various domains, yet a significant challenge remains: their performance deteriorates sharply in out-of-domain scenarios, often leading to increased hallucinations. Despite its importance, this phenomenon has received limited attention in academic research. To address this, we first construct a benchmark dataset using style transfer techniques and employ it to evaluate the out-of-domain performance of several popular large-scale models. Building upon these findings, we introduce CopeCap, a lightweight image captioning model that leverages collaborative prompting to achieve strong out-of-domain performance without requiring additional training.
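The abstract does not specify how the style-transfer benchmark is built. As a rough illustration only, the sketch below assumes a Stable Diffusion img2img pipeline from the diffusers library, with hypothetical folders `source_photos/` and `style_benchmark/`; the thesis's actual construction method may differ.

```python
# Hypothetical sketch: derive a style-shifted benchmark from natural photos.
# The style transfer method is an assumption; the thesis does not state it.
from pathlib import Path

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Target styles whose statistics differ from the photographic training domain.
STYLES = ["an oil painting", "a pencil sketch", "a cartoon illustration"]

out_dir = Path("style_benchmark")
out_dir.mkdir(exist_ok=True)

for photo_path in Path("source_photos").glob("*.jpg"):  # hypothetical input folder
    photo = Image.open(photo_path).convert("RGB").resize((512, 512))
    for style in STYLES:
        # strength < 1 preserves the original scene content while shifting
        # style, so the photo's ground-truth captions stay valid for scoring.
        styled = pipe(
            prompt=f"{style} of this scene",
            image=photo,
            strength=0.55,
            guidance_scale=7.5,
        ).images[0]
        styled.save(out_dir / f"{photo_path.stem}_{style.split()[-1]}.png")
```

Pairing each styled image with its source photo's caption is what would let the benchmark attribute any drop in caption quality to the style shift alone.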
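CopeCap's collaborative-prompting mechanism is likewise not detailed in this record. One plausible training-free reading, sketched below, is to let two frozen models collaborate at inference time: a zero-shot CLIP probe detects the rendering style, and that style is handed to a BLIP captioner as a prompt prefix. The model choices, the `STYLE_PROMPTS` table, and the `caption` helper are illustrative assumptions, not the thesis's actual design.

```python
# Hypothetical sketch of "collaborative prompting": a CLIP style probe and a
# frozen BLIP captioner cooperate at inference time, with no extra training.
import torch
from PIL import Image
from transformers import (
    BlipForConditionalGeneration,
    BlipProcessor,
    CLIPModel,
    CLIPProcessor,
)

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
clip_proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
blip = BlipForConditionalGeneration.from_pretrained(
    "Salesforce/blip-image-captioning-base"
)
blip_proc = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")

STYLE_PROMPTS = {
    "a photo": "a photo of",
    "an oil painting": "an oil painting of",
    "a pencil sketch": "a pencil sketch of",
    "a cartoon illustration": "a cartoon of",
}

def caption(image_path: str) -> str:
    image = Image.open(image_path).convert("RGB")

    # Step 1: zero-shot style detection with CLIP.
    labels = list(STYLE_PROMPTS)
    inputs = clip_proc(text=labels, images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        probs = clip(**inputs).logits_per_image.softmax(dim=-1)
    style = labels[probs.argmax().item()]

    # Step 2: condition the captioner on the detected style, so it describes
    # the scene content rather than hallucinating photo-domain details.
    blip_inputs = blip_proc(image, text=STYLE_PROMPTS[style], return_tensors="pt")
    with torch.no_grad():
        out = blip.generate(**blip_inputs, max_new_tokens=30)
    return blip_proc.decode(out[0], skip_special_tokens=True)

print(caption("style_benchmark/000001_painting.png"))  # hypothetical file
```

Because both components stay frozen, this kind of pipeline matches the abstract's claim of needing no additional training, though the real CopeCap may combine its prompts differently.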
Saved in: DR-NTU (Nanyang Technological University)

Main Author: He, Guoshun
Other Authors: Alex Chichung Kot (supervisor)
School: School of Electrical and Electronic Engineering
Format: Thesis-Master by Coursework
Degree: Master's degree
Language: English
Published: Nanyang Technological University, 2025
Subjects: Engineering; Out-of-domain; Hallucination; Lightweight model
Online Access: https://hdl.handle.net/10356/182918
Institution: Nanyang Technological University
Record ID: sg-ntu-dr.10356-182918
Citation: He, G. (2025). Mitigating style-image hallucination in large vision language models. Master's thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/182918