R2GAN: Cross-modal recipe retrieval with generative adversarial network
Representing procedural text such as recipes for cross-modal retrieval is inherently difficult, let alone generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of...
Main Authors: | ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2019 |
Subjects: | |
Online Access: | https://ink.library.smu.edu.sg/sis_research/6456 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-7459 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-7459 2022-01-10T06:12:07Z R2GAN: Cross-modal recipe retrieval with generative adversarial network ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin Representing procedural text such as recipes for cross-modal retrieval is inherently difficult, let alone generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of generating images from procedural text for the retrieval problem. The motivation for using a GAN is twofold: learning compatible cross-modal features in an adversarial way, and explaining search results by showing the images generated from recipes. The novelty of R2GAN comes from its architecture design: a GAN with one generator and dual discriminators, which makes generating images from recipes feasible. Furthermore, empowered by the generated images, a two-level ranking loss in both the embedding and image spaces is considered. These add-ons not only result in excellent retrieval performance, but also generate close-to-realistic food images useful for explaining the ranking of recipes. On the Recipe1M dataset, R2GAN demonstrates high scalability to data size, outperforms all existing approaches, and generates images that are intuitive for humans to use in interpreting the search results. 
2019-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6456 info:doi/10.1109/CVPR.2019.01174 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Categorization Image and Video Synthesis Recognition: Detection Representation Learning Retrieval; Vision + Language Data Storage Systems Graphics and Human Computer Interfaces OS and Networks |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Categorization Image and Video Synthesis Recognition: Detection Representation Learning Retrieval; Vision + Language Data Storage Systems Graphics and Human Computer Interfaces OS and Networks |
spellingShingle |
Categorization Image and Video Synthesis Recognition: Detection Representation Learning Retrieval; Vision + Language Data Storage Systems Graphics and Human Computer Interfaces OS and Networks ZHU, Bin NGO, Chong-wah CHEN, Jingjing HAO, Yanbin R2GAN: Cross-modal recipe retrieval with generative adversarial network |
description |
Representing procedural text such as recipes for cross-modal retrieval is inherently difficult, let alone generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of generating images from procedural text for the retrieval problem. The motivation for using a GAN is twofold: learning compatible cross-modal features in an adversarial way, and explaining search results by showing the images generated from recipes. The novelty of R2GAN comes from its architecture design: a GAN with one generator and dual discriminators, which makes generating images from recipes feasible. Furthermore, empowered by the generated images, a two-level ranking loss in both the embedding and image spaces is considered. These add-ons not only result in excellent retrieval performance, but also generate close-to-realistic food images useful for explaining the ranking of recipes. On the Recipe1M dataset, R2GAN demonstrates high scalability to data size, outperforms all existing approaches, and generates images that are intuitive for humans to use in interpreting the search results. |
format |
text |
author |
ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin |
author_facet |
ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin |
author_sort |
ZHU, Bin |
title |
R2GAN: Cross-modal recipe retrieval with generative adversarial network |
title_short |
R2GAN: Cross-modal recipe retrieval with generative adversarial network |
title_full |
R2GAN: Cross-modal recipe retrieval with generative adversarial network |
title_fullStr |
R2GAN: Cross-modal recipe retrieval with generative adversarial network |
title_full_unstemmed |
R2GAN: Cross-modal recipe retrieval with generative adversarial network |
title_sort |
r2gan: cross-modal recipe retrieval with generative adversarial network |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2019 |
url |
https://ink.library.smu.edu.sg/sis_research/6456 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf |
_version_ |
1770575963858403328 |