Towards a more trustworthy generative artificial intelligence
Main Author: Cheong, Ben Wee Joon
Other Authors: Alex Chichung Kot
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Subjects: Computer and Information Science; GAN; Hallucinations
Online Access: https://hdl.handle.net/10356/181657
Institution: Nanyang Technological University
id: sg-ntu-dr.10356-181657
record_format: dspace
school: School of Electrical and Electronic Engineering
research centre: Rapid-Rich Object Search (ROSE) Lab
contact: EACKOT@ntu.edu.sg
degree: Bachelor's degree
type: Final Year Project (FYP)
date deposited: 2024-12-11
date issued: 2024
citation: Cheong, B. W. J. (2024). Towards a more trustworthy generative artificial intelligence. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181657
file format: application/pdf
institution: Nanyang Technological University
building: NTU Library
continent: Asia
country: Singapore
content_provider: NTU Library
collection: DR-NTU
language: English
topic: Computer and Information Science; GAN; Hallucinations
description: In recent years, the rapid growth of available Generative Artificial Intelligence (AI) models has revolutionized various domains, from AI image generation to natural language reasoning systems such as OpenAI’s ChatGPT and Google Gemini. These models, fuelled by advances in deep learning, have demonstrated unprecedented capabilities in generating life-like content. However, with these advances come new challenges, especially in ensuring the robustness and reliability of such models when faced with adversarial attacks.

Adversarial attacks exploit the vulnerabilities of Generative AI models by subtly altering the input data, through so-called perturbations, to deceive the models into producing hallucinated, distorted, or incorrect predictions.

Vision Language Models (VLMs) are an example of Generative AI that leverage complex neural architectures, combining the capabilities of Computer Vision (CV) and Natural Language Processing (NLP) to generate captions or descriptions of images, or to produce corresponding visual content from textual prompts. In the context of VLMs, adversarial attacks can take the form of perturbed image inputs, ranging from subtle to severe, produced with techniques such as Gaussian blur, dithering, and contrast manipulation, and leading to hallucinated image captions or descriptions. Hallucinations refer to erroneous or implausible outputs generated by AI models, particularly in generative tasks; in this context, they manifest as generated images or captions that deviate significantly from the intended input.

Motivated by the need to address this vulnerability, this project systematically evaluates VLMs’ robustness under a variety of adversarial perturbations, with a focus on benchmarking hallucinations. Existing hallucination benchmarks face challenges such as limited task specificity and a lack of depth in assessing VLM responses across diverse perturbations. By addressing these gaps, this study aims to enhance our understanding of VLM limitations and contribute to the development of more robust and reliable vision-language models, laying a foundation for improving AI models’ resilience to real-world distortions.
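The abstract above names Gaussian blur, dithering, and contrast manipulation as example input perturbations for probing VLM robustness. As a rough illustration only, and not the project’s actual benchmarking code, the following Python sketch applies those three perturbations with the Pillow library at a few placeholder severity levels; the input file name photo.jpg, the severity scale, and the perturb helper are assumptions introduced here for illustration.

```python
# Illustrative only: simple image perturbations of the kinds named in the abstract.
from PIL import Image, ImageEnhance, ImageFilter


def perturb(image: Image.Image, kind: str, severity: float) -> Image.Image:
    """Return a perturbed copy of `image`; `severity` runs roughly from 0.0 (mild) to 1.0 (severe)."""
    if kind == "gaussian_blur":
        # Blur radius grows with severity.
        return image.filter(ImageFilter.GaussianBlur(radius=1 + 4 * severity))
    if kind == "contrast":
        # An enhancement factor below 1.0 flattens contrast; higher severity flattens it more.
        return ImageEnhance.Contrast(image).enhance(1.0 - 0.8 * severity)
    if kind == "dither":
        # Shrink the colour palette and apply Floyd-Steinberg dithering (Pillow >= 9.1 enum).
        colors = max(2, int(64 * (1.0 - severity)))
        return image.quantize(colors=colors, dither=Image.Dither.FLOYDSTEINBERG).convert("RGB")
    raise ValueError(f"unknown perturbation kind: {kind}")


if __name__ == "__main__":
    # "photo.jpg" is a placeholder; any RGB image will do.
    clean = Image.open("photo.jpg").convert("RGB")
    for kind in ("gaussian_blur", "contrast", "dither"):
        for severity in (0.2, 0.5, 0.9):
            perturb(clean, kind, severity).save(f"{kind}_{severity}.jpg")
```

In a benchmarking setup along the lines the abstract describes, each perturbed image would be captioned by the VLM under test and the caption compared against the clean-image caption or ground-truth annotations to flag hallucinations.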
author2: Alex Chichung Kot
format: Final Year Project
author: Cheong, Ben Wee Joon
title: Towards a more trustworthy generative artificial intelligence
publisher: Nanyang Technological University
publishDate: 2024
url: https://hdl.handle.net/10356/181657