Cross-domain cross-modal food transfer
Recent works in cross-modal image-to-recipe retrieval pave a new way to scale up food recognition. By learning a joint space between food images and recipes, food recognition boils down to a retrieval problem of evaluating the similarity of embedded features. The major drawback, nevertheless, is the difficulty of applying an already-trained model to recognize dishes from cuisines unknown to the model. In general, model updating with new training examples, in the form of image-recipe pairs, is required to adapt a model to the new cooking styles of a cuisine. In practice, however, acquiring a sufficient number of image-recipe pairs for model transfer can be time-consuming. This paper addresses the challenge of resource scarcity in the scenario where only partial data, rather than a complete view of the data, is accessible for model transfer. Partial data refers to missing information, such as the absence of the image modality or of the cooking instructions from an image-recipe pair. To cope with partial data, a novel generic model, equipped with various loss functions including cross-modal metric learning, recipe residual loss, semantic regularization and adversarial learning, is proposed for cross-domain transfer learning. Experiments are conducted on three different cuisines (Chuan, Yue and Washoku) to provide insights into scaling up food recognition across domains with limited training resources.
Main Authors: ZHU, Bin; NGO, Chong-wah; CHEN, Jingjing
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2020
DOI: 10.1145/3394171.3413809
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Collection: Research Collection School Of Computing and Information Systems
Record ID: sg-smu-ink.sis_research-7500
Subjects: cross-domain transfer; cross-modal food retrieval; food recognition; Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access: https://ink.library.smu.edu.sg/sis_research/6497
https://ink.library.smu.edu.sg/context/sis_research/article/7500/viewcontent/3394171.3413809.pdf
Institution: Singapore Management University
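
The abstract above describes learning a joint embedding space in which image-to-recipe retrieval is carried out by comparing embedded features, with cross-modal metric learning among the training losses. The snippet below is a minimal, illustrative sketch of that general idea in PyTorch, assuming pre-extracted image and recipe feature vectors and a standard hard-negative triplet objective; the encoder sizes, margin value and mining strategy are assumptions for illustration, not the authors' actual implementation.

```python
# Illustrative sketch only: cross-modal metric learning over a joint
# image-recipe embedding space with a hard-negative triplet loss.
# Dimensions, margin and encoders are hypothetical, not from the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImageEncoder(nn.Module):
    """Projects a pre-extracted image feature vector into the joint space."""
    def __init__(self, in_dim=2048, embed_dim=512):
        super().__init__()
        self.fc = nn.Linear(in_dim, embed_dim)

    def forward(self, x):
        return F.normalize(self.fc(x), dim=-1)  # unit-norm embedding

class RecipeEncoder(nn.Module):
    """Projects a pre-extracted recipe (text) feature vector into the joint space."""
    def __init__(self, in_dim=1024, embed_dim=512):
        super().__init__()
        self.fc = nn.Linear(in_dim, embed_dim)

    def forward(self, x):
        return F.normalize(self.fc(x), dim=-1)

def cross_modal_triplet_loss(img_emb, rec_emb, margin=0.3):
    """Pull matching image-recipe pairs together and push the hardest
    non-matching recipe in the batch at least `margin` further away."""
    sim = img_emb @ rec_emb.t()                  # cosine similarities (B x B)
    pos = sim.diag().unsqueeze(1)                # similarity of true pairs
    mask = torch.eye(sim.size(0), dtype=torch.bool, device=sim.device)
    neg = sim.masked_fill(mask, float('-inf'))   # exclude the positive pair
    hardest_neg = neg.max(dim=1).values.unsqueeze(1)
    return F.relu(margin + hardest_neg - pos).mean()

# Toy usage with random features standing in for CNN / text-encoder outputs.
img_enc, rec_enc = ImageEncoder(), RecipeEncoder()
img_emb = img_enc(torch.randn(8, 2048))
rec_emb = rec_enc(torch.randn(8, 1024))
loss = cross_modal_triplet_loss(img_emb, rec_emb)
loss.backward()
print(f"triplet loss: {loss.item():.4f}")
```

In the paper's setting, a metric-learning term of this kind would be combined with the recipe residual loss, semantic regularization and adversarial learning mentioned in the abstract; the sketch shows only the retrieval-by-similarity component.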