Generative AI art - hypernetworks
Generative Artificial Intelligence (Gen AI) has made significant advancements, notably in Stable Diffusion, a text-to-image generation model released in August 2022. Despite its popularity, research exploring Stable Diffusion’s application in niche domains such as landscape photography remains lim...
Main Author: | Chee, Mei Qi |
---|---|
Other Authors: | Chia Liang Tien, Clement |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; Stable diffusion; Generative AI art; Hypernetworks |
Online Access: | https://hdl.handle.net/10356/175304 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-175304 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-175304 2024-04-26T15:43:38Z Generative AI art - hypernetworks Chee, Mei Qi Chia Liang Tien, Clement School of Computer Science and Engineering ASLTCHIA@ntu.edu.sg Computer and Information Science Stable diffusion Generative AI art Hypernetworks Bachelor's degree 2024-04-22T08:58:46Z 2024-04-22T08:58:46Z 2024 Final Year Project (FYP) Chee, M. Q. (2024). Generative AI art - hypernetworks. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/175304 https://hdl.handle.net/10356/175304 en SCSE23-0529 application/pdf Nanyang Technological University |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Computer and Information Science; Stable diffusion; Generative AI art; Hypernetworks |
description |
Generative Artificial Intelligence (Gen AI) has made significant advancements, notably in
Stable Diffusion, a text-to-image generation model released in August 2022. Despite its
popularity, research exploring Stable Diffusion’s application in niche domains such as
landscape photography remains limited. This report addresses this research gap by investigating
Stable Diffusion’s potential for refinement and artistic exploration of landscape photography.
This study aims to develop a user-friendly web application designed specifically for landscape
photographers. Effective text prompt engineering and the optimisation of Stable Diffusion
parameters were explored comprehensively. Comparative analysis identified the optimal base
models for fine-tuning (Stable Diffusion v1.5, RealisticVision, and AbsoluteReality), and new
models were trained with four fine-tuning methods: DreamBooth, Textual Inversion, LoRA, and
Hypernetworks. Results indicated that hypernetworks were superior in generating realistic and
accurate landscape images, while LoRA was chosen for the web application implementation
because it is lightweight and computationally efficient. The web application offers text-to-image
generation with assisted text prompt engineering and LoRA training. These features allow
for pre-visualisation, creative exploration, and the generation of new images in the photographer’s
original style. Additionally, preliminary exploration of Canny and IP2P in ControlNet reveals
potential for enhanced image editing. Future work involves further exploration of ControlNet
capabilities and user studies to evaluate the web application's effectiveness. Overall, this report
demonstrates the potential of Stable Diffusion to revolutionise landscape photography,
giving photographers greater creative freedom and integrating smoothly into their
existing workflow. |
author2 |
Chia Liang Tien, Clement |
author_facet |
Chia Liang Tien, Clement; Chee, Mei Qi |
format |
Final Year Project |
author |
Chee, Mei Qi |
author_sort |
Chee, Mei Qi |
title |
Generative AI art - hypernetworks |
title_short |
Generative AI art - hypernetworks |
title_full |
Generative AI art - hypernetworks |
title_fullStr |
Generative AI art - hypernetworks |
title_full_unstemmed |
Generative AI art - hypernetworks |
title_sort |
generative ai art - hypernetworks |
publisher |
Nanyang Technological University |
publishDate |
2024 |
url |
https://hdl.handle.net/10356/175304 |
_version_ |
1800916291154870272 |