Generative AI art - hypernetworks

Bibliographic Details
Main Author: Chee, Mei Qi
Other Authors: Chia Liang Tien, Clement
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/175304
Institution: Nanyang Technological University
Description
Summary: Generative Artificial Intelligence (Gen AI) has advanced significantly, notably with Stable Diffusion, a text-to-image generation model released in August 2022. Despite its popularity, research on Stable Diffusion's application in niche domains such as landscape photography remains limited. This report addresses that gap by investigating Stable Diffusion's potential for refining and artistically exploring landscape photography, with the aim of developing a user-friendly web application designed specifically for landscape photographers. The study explored effective text prompt engineering and the optimisation of Stable Diffusion parameters. Comparative analysis identified the optimal base models for fine-tuning (Stable Diffusion v1.5, RealisticVision, and AbsoluteReality), and new models were trained with four fine-tuning methods: DreamBooth, Textual Inversion, LoRA, and Hypernetworks. Results indicated that hypernetworks were superior at generating realistic and accurate landscape images; LoRA was nevertheless chosen for the web application implementation because it is lightweight and computationally efficient. The web application offers text-to-image generation with assisted text prompt engineering and LoRA training. These features allow pre-visualisation, creative exploration, and the generation of new images in the photographer's original style. Additionally, preliminary exploration of the Canny and IP2P (InstructPix2Pix) models in ControlNet reveals potential for enhanced image editing. Future work involves further exploration of ControlNet capabilities and user studies to evaluate the web application's effectiveness. Overall, this report demonstrates the potential of Stable Diffusion to revolutionise landscape photography, giving photographers greater creative freedom while integrating into their existing workflow.