Layout and context understanding for image synthesis with scene graphs
Advancements in text-to-image synthesis generate remarkable images from textual descriptions. However, these methods are designed to generate only one object with varying attributes. They face difficulties with complex descriptions containing multiple arbitrary objects, since these require information on the placement and size of each object in the image. Recently, a method that infers object layouts from scene graphs has been proposed as a solution to this problem. However, their method uses only object labels to describe the layout, which fails to capture the appearance of some objects. Moreover, their model is biased towards generating rectangular-shaped objects in the absence of ground-truth masks. In this paper, we propose an object encoding module to capture object features and use it as additional information for the image generation network. We also introduce a graph-cuts based segmentation method that can infer the masks of objects from bounding boxes to better model object shapes. Our method produces more discernible images with more realistic shapes compared to the images generated by the current state-of-the-art method.
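The abstract mentions a graph-cuts based segmentation step that infers object masks from bounding boxes. The record does not give the thesis's exact formulation, so the snippet below is only a minimal sketch of the general idea, using OpenCV's GrabCut (an iterative graph-cut segmenter) as a stand-in; the function name `box_to_mask` and the choice of GrabCut are illustrative assumptions, not the author's implementation.

```python
# Minimal sketch: derive a rough object mask from a bounding box with
# graph-cut segmentation (OpenCV GrabCut). Illustrative only; the thesis's
# actual graph-cuts formulation may differ.
import cv2
import numpy as np

def box_to_mask(image_bgr, box, iterations=5):
    """Return a binary mask (H, W) for the object inside `box` = (x, y, w, h)."""
    mask = np.zeros(image_bgr.shape[:2], dtype=np.uint8)
    # GrabCut keeps its foreground/background colour models in these arrays.
    bgd_model = np.zeros((1, 65), dtype=np.float64)
    fgd_model = np.zeros((1, 65), dtype=np.float64)
    cv2.grabCut(image_bgr, mask, box, bgd_model, fgd_model,
                iterations, cv2.GC_INIT_WITH_RECT)
    # Pixels labelled definite or probable foreground become 1, the rest 0.
    return np.where((mask == cv2.GC_FGD) | (mask == cv2.GC_PR_FGD), 1, 0).astype(np.uint8)

# Example usage (hypothetical file and box):
# img = cv2.imread("scene.jpg")
# obj_mask = box_to_mask(img, (40, 60, 120, 180))
```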
Saved in:
Main Author: Talavera, Arces A.
Format: text
Language: English
Published: Animo Repository, 2019
Subjects: Deep learning (Machine learning); Text data mining; Imaging systems; Computer Sciences
Online Access: https://animorepository.dlsu.edu.ph/etd_masteral/6525
https://animorepository.dlsu.edu.ph/context/etd_masteral/article/13523/viewcontent/Talavera__Arces_Adlique___11791381___Thesis_Document_Redacted.pdf
Institution: De La Salle University
id
oai:animorepository.dlsu.edu.ph:etd_masteral-13523
record_format
eprints
spelling
oai:animorepository.dlsu.edu.ph:etd_masteral-13523 2022-12-02T07:03:25Z Layout and context understanding for image synthesis with scene graphs Talavera, Arces A. Advancements in text-to-image synthesis generate remarkable images from textual descriptions. However, these methods are designed to generate only one object with varying attributes. They face difficulties with complex descriptions containing multiple arbitrary objects, since these require information on the placement and size of each object in the image. Recently, a method that infers object layouts from scene graphs has been proposed as a solution to this problem. However, their method uses only object labels to describe the layout, which fails to capture the appearance of some objects. Moreover, their model is biased towards generating rectangular-shaped objects in the absence of ground-truth masks. In this paper, we propose an object encoding module to capture object features and use it as additional information for the image generation network. We also introduce a graph-cuts based segmentation method that can infer the masks of objects from bounding boxes to better model object shapes. Our method produces more discernible images with more realistic shapes compared to the images generated by the current state-of-the-art method. 2019-03-01T08:00:00Z text application/pdf https://animorepository.dlsu.edu.ph/etd_masteral/6525 https://animorepository.dlsu.edu.ph/context/etd_masteral/article/13523/viewcontent/Talavera__Arces_Adlique___11791381___Thesis_Document_Redacted.pdf Master's Theses English Animo Repository Deep learning (Machine learning); Text data mining; Imaging systems; Computer Sciences
institution
De La Salle University
building
De La Salle University Library
continent
Asia
country
Philippines
content_provider
De La Salle University Library
collection
DLSU Institutional Repository
language
English
topic
Deep learning (Machine learning); Text data mining; Imaging systems; Computer Sciences
description
Advancements in text-to-image synthesis generate remarkable images from textual descriptions. However, these methods are designed to generate only one object with varying attributes. They face difficulties with complex descriptions containing multiple arbitrary objects, since these require information on the placement and size of each object in the image. Recently, a method that infers object layouts from scene graphs has been proposed as a solution to this problem. However, their method uses only object labels to describe the layout, which fails to capture the appearance of some objects. Moreover, their model is biased towards generating rectangular-shaped objects in the absence of ground-truth masks. In this paper, we propose an object encoding module to capture object features and use it as additional information for the image generation network. We also introduce a graph-cuts based segmentation method that can infer the masks of objects from bounding boxes to better model object shapes. Our method produces more discernible images with more realistic shapes compared to the images generated by the current state-of-the-art method.
format
text
author
Talavera, Arces A.
title
Layout and context understanding for image synthesis with scene graphs
publisher
Animo Repository
publishDate
2019
url
https://animorepository.dlsu.edu.ph/etd_masteral/6525
https://animorepository.dlsu.edu.ph/context/etd_masteral/article/13523/viewcontent/Talavera__Arces_Adlique___11791381___Thesis_Document_Redacted.pdf