Generative AI art - hypernetworks
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/175304 |
Institution: | Nanyang Technological University |
Summary: | Generative Artificial Intelligence (Gen AI) has made significant advancements, notably in
Stable Diffusion, a text-to-image generation model released in August 2022. Despite its
popularity, research exploring Stable Diffusion’s application in niche domains such as
landscape photography remains limited. This report addresses this research gap by investigating
Stable Diffusion’s potential for refinement and artistic exploration of landscape photography.
This study develops a user-friendly web application designed specifically for landscape
photographers. It first explores effective text prompt engineering and the optimisation of
Stable Diffusion parameters. A comparative analysis identified the optimal base models for
fine-tuning (Stable Diffusion v1.5, RealisticVision, and AbsoluteReality), and new models were
trained using four fine-tuning methods: DreamBooth, Textual Inversion, LoRA, and
hypernetworks. Results indicated that hypernetworks were superior at generating realistic and
accurate landscape images, but LoRA was chosen for the web application implementation
because it is lightweight and computationally efficient. The web application offers
text-to-image generation with assisted text prompt engineering, as well as LoRA training.
These features support pre-visualisation, creative exploration, and the generation of new
images in the photographer’s own style. Additionally, a preliminary exploration of the Canny
and InstructPix2Pix (IP2P) models in ControlNet reveals
potential for enhanced image editing. Future work involves further exploration of ControlNet
capabilities and user studies to evaluate the web application's effectiveness. Overall, this report
demonstrates the potential of Stable Diffusion to revolutionise the artistic realm of landscape
photography for photographers, empowering them with unprecedented creative freedom and
seamless integration into their existing workflow. |