R2GAN: Cross-modal recipe retrieval with generative adversarial network

Representing procedural text such as recipes for cross-modal retrieval is inherently difficult, let alone generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of...

Bibliographic Details
Main Authors:	ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2019
Subjects:	Categorization; Image and Video Synthesis; Recognition: Detection; Representation Learning; Retrieval; Vision + Language; Data Storage Systems; Graphics and Human Computer Interfaces; OS and Networks
Online Access:	https://ink.library.smu.edu.sg/sis_research/6456 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-7459
record_format	dspace
spelling	sg-smu-ink.sis_research-7459 2022-01-10T06:12:07Z 2019-06-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/6456 info:doi/10.1109/CVPR.2019.01174 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Categorization; Image and Video Synthesis; Recognition: Detection; Representation Learning; Retrieval; Vision + Language; Data Storage Systems; Graphics and Human Computer Interfaces; OS and Networks
description	Representing procedural text such as recipes for cross-modal retrieval is inherently difficult, let alone generating images from recipes for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of generating images from procedural text for the retrieval problem. The motivation for using a GAN is twofold: learning compatible cross-modal features in an adversarial way, and explaining search results by showing the images generated from recipes. The novelty of R2GAN lies in its architecture design: a GAN with one generator and dual discriminators is used, which makes generating images from recipes feasible. Furthermore, empowered by the generated images, a two-level ranking loss in both the embedding and image spaces is considered. These add-ons not only result in excellent retrieval performance but also generate close-to-realistic food images useful for explaining the ranking of recipes. On the Recipe1M dataset, R2GAN demonstrates high scalability to data size, outperforms all existing approaches, and generates images that are intuitive for humans to use in interpreting the search results.
format	text
author	ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin
author_facet	ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing; HAO, Yanbin
author_sort	ZHU, Bin
title	R2GAN: Cross-modal recipe retrieval with generative adversarial network
title_sort	r2gan: cross-modal recipe retrieval with generative adversarial network
publisher	Institutional Knowledge at Singapore Management University
publishDate	2019
url	https://ink.library.smu.edu.sg/sis_research/6456 https://ink.library.smu.edu.sg/context/sis_research/article/7459/viewcontent/Zhu_R2GAN_Cross_Modal_Recipe_Retrieval_With_Generative_Adversarial_Network_CVPR_2019_paper.pdf
_version_	1770575963858403328
