Generative AI art - hypernetworks

Generative Artificial Intelligence (Gen AI) has made significant advancements, notably in Stable Diffusion, a text-to-image generation model released in August 2022. Despite its popularity, research exploring Stable Diffusion’s application in niche domains such as landscape photography remains lim...

Bibliographic Details
Main Author: Chee, Mei Qi
Other Authors: Chia Liang Tien, Clement
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Subjects: Computer and Information Science; Stable diffusion; Generative AI art; Hypernetworks
Online Access: https://hdl.handle.net/10356/175304
Institution: Nanyang Technological University
id sg-ntu-dr.10356-175304
record_format dspace
spelling sg-ntu-dr.10356-175304
2024-04-26T15:43:38Z
Generative AI art - hypernetworks
Chee, Mei Qi
Chia Liang Tien, Clement
School of Computer Science and Engineering
ASLTCHIA@ntu.edu.sg
Computer and Information Science
Stable diffusion
Generative AI art
Hypernetworks
Bachelor's degree
2024-04-22T08:58:46Z
2024-04-22T08:58:46Z
2024
Final Year Project (FYP)
Chee, M. Q. (2024). Generative AI art - hypernetworks. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175304
https://hdl.handle.net/10356/175304
en
SCSE23-0529
application/pdf
Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Computer and Information Science
Stable diffusion
Generative AI art
Hypernetworks
spellingShingle Computer and Information Science
Stable diffusion
Generative AI art
Hypernetworks
Chee, Mei Qi
Generative AI art - hypernetworks
description Generative Artificial Intelligence (Gen AI) has made significant advancements, notably in Stable Diffusion, a text-to-image generation model released in August 2022. Despite its popularity, research exploring Stable Diffusion’s application in niche domains such as landscape photography remains limited. This report addresses this research gap by investigating Stable Diffusion’s potential for refinement and artistic exploration of landscape photography. The study aims to develop a user-friendly web application designed specifically for landscape photographers. Effective text prompt engineering and the optimisation of Stable Diffusion parameters were explored comprehensively. Comparative analysis identified the optimal base models for fine-tuning (Stable Diffusion v1.5, RealisticVision, and AbsoluteReality), and new models were trained using four fine-tuning methods: DreamBooth, Textual Inversion, LoRA, and Hypernetworks. Results indicated that hypernetworks were superior at generating realistic and accurate landscape images, while LoRA was chosen for the web application implementation because it is lightweight and computationally efficient. The web application offers text-to-image generation with assisted text prompt engineering, as well as LoRA training. These features allow for pre-visualisation, creative exploration, and the generation of new images in the photographer’s original style. Additionally, a preliminary exploration of the Canny and IP2P models in ControlNet reveals potential for enhanced image editing. Future work involves further exploration of ControlNet capabilities and user studies to evaluate the web application's effectiveness. Overall, this report demonstrates the potential of Stable Diffusion to revolutionise the artistic realm of landscape photography, empowering photographers with unprecedented creative freedom and seamless integration into their existing workflow.
author2 Chia Liang Tien, Clement
author_facet Chia Liang Tien, Clement
Chee, Mei Qi
format Final Year Project
author Chee, Mei Qi
author_sort Chee, Mei Qi
title Generative AI art - hypernetworks
title_sort generative ai art - hypernetworks
publisher Nanyang Technological University
publishDate 2024
url https://hdl.handle.net/10356/175304
_version_ 1800916291154870272