Cross-domain cross-modal food transfer

Recent work in cross-modal image-to-recipe retrieval paves a new way to scale up food recognition. By learning a joint embedding space between food images and recipes, food recognition boils down to a retrieval problem: recognizing a dish amounts to evaluating the similarity of embedded features. The major drawback, however, is the difficulty of applying an already-trained model to recognize dishes from cuisines unknown to the model. In general, updating the model with new training examples, in the form of image-recipe pairs, is required to adapt it to the new cooking styles of a cuisine. In practice, however, acquiring a sufficient number of image-recipe pairs for model transfer can be time-consuming. This paper addresses the challenge of resource scarcity in the scenario where only partial data, rather than a complete view of the data, is accessible for model transfer. Partial data refers to image-recipe pairs with missing information, such as the absence of the image modality or of the cooking instructions. To cope with partial data, a novel generic model, equipped with loss functions including cross-modal metric learning, a recipe residual loss, semantic regularization, and adversarial learning, is proposed for cross-domain transfer learning. Experiments are conducted on three different cuisines (Chuan, Yue, and Washoku) to provide insights into scaling up food recognition across domains with limited training resources.
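The retrieval formulation described above — recognizing a dish by finding the nearest recipe in a learned joint embedding space — can be illustrated with a minimal sketch. The functions, toy vectors, and margin value below are illustrative placeholders, not the paper's actual architecture, embeddings, or hyperparameters; the triplet loss is one common choice of cross-modal metric-learning objective, used here only as an example.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve_recipe(image_emb, recipe_embs):
    # Food "recognition" as retrieval: return the index of the recipe
    # embedding most similar to the query image embedding.
    sims = [cosine_similarity(image_emb, r) for r in recipe_embs]
    return sims.index(max(sims))

def triplet_loss(anchor, positive, negative, margin=0.3):
    # Example cross-modal metric-learning objective: push the paired
    # recipe at least `margin` closer to the image than a non-paired one.
    pos = cosine_similarity(anchor, positive)
    neg = cosine_similarity(anchor, negative)
    return max(0.0, margin - pos + neg)
```

In this toy setup, a query image embedding retrieves whichever recipe embedding it is most similar to, and the loss is zero once the paired recipe is sufficiently closer than the non-paired one.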


Bibliographic Details
Main Authors: ZHU, Bin, NGO, Chong-wah, CHEN, Jingjing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
Subjects: Cross-domain transfer; Cross-modal food retrieval; Food recognition; Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/6497
https://ink.library.smu.edu.sg/context/sis_research/article/7500/viewcontent/3394171.3413809.pdf
DOI: 10.1145/3394171.3413809
License: CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Collection: Research Collection School Of Computing and Information Systems (InK@SMU)
Published online: 2020-10-01