Customized image synthesis using diffusion models
Format: Final Year Project
Language: English
Published: Nanyang Technological University, 2024
Online Access: https://hdl.handle.net/10356/175199
Institution: Nanyang Technological University
Summary: Recently, diffusion models have become a powerful mainstream method for image
generation. Text-to-image diffusion models, in particular, have been widely used to
convert a natural language description (e.g., ‘an orange cat’) into photorealistic images
(e.g., a photo of an orange cat). These pre-trained diffusion models have enabled various
downstream applications, including customized image synthesis. For instance, a
pre-trained text-to-image diffusion model can be leveraged to capture the appearance
of a specific cat from multiple images, and subsequently generate images of this cat in
diverse scenarios. In this final year project, we introduce an integration pipeline for
storyboard generation. We begin by using large language models to assist in the creation
of storylines, followed by the application of existing customization methods to visually
render each scene. The pipeline is carefully designed to leverage both language
models and customization methods for efficient and effective storyboard generation.
We demonstrate the usefulness of our proposed pipeline both qualitatively and
quantitatively. Additionally, we conduct a comprehensive study of several diffusion
models related to the latest advancements in customized image synthesis,
experimentally comparing and analyzing them. We believe this project can enable and
inspire subsequent explorations into applying customized image synthesis methods for
automatic storyboard generation.
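The two-stage pipeline the summary describes (a language model drafts an ordered storyline, then a customized text-to-image model renders each scene with a fixed subject) could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `generate_storyline`, `render_scene`, and the `subject_token` placeholder are all hypothetical stand-ins for the LLM call and the customized diffusion call (e.g., a DreamBooth- or Textual-Inversion-style model bound to a learned token).

```python
from dataclasses import dataclass


@dataclass
class Scene:
    index: int
    description: str


def generate_storyline(premise: str, num_scenes: int) -> list[Scene]:
    # Hypothetical stand-in for an LLM call that expands a short
    # premise into an ordered list of scene descriptions.
    return [Scene(i, f"{premise}, scene {i + 1}") for i in range(num_scenes)]


def render_scene(scene: Scene, subject_token: str) -> str:
    # Hypothetical stand-in for a customized text-to-image diffusion
    # call; a real implementation would return an image, but here we
    # just return the prompt the customized model would receive.
    return f"a photo of {subject_token}, {scene.description}"


def build_storyboard(premise: str, subject_token: str, num_scenes: int = 4) -> list[str]:
    # Stage 1: LLM drafts the storyline; Stage 2: each scene is
    # rendered with the customized subject.
    scenes = generate_storyline(premise, num_scenes)
    return [render_scene(s, subject_token) for s in scenes]
```

For example, `build_storyboard("an orange cat explores a city", "<cat>", num_scenes=3)` yields one prompt per scene, each anchored to the same learned subject token so the character stays consistent across panels.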