Enhancing recipe retrieval with foundation models: A data augmentation perspective

Learning recipe and food image representations in a common embedding space is non-trivial but crucial for cross-modal recipe retrieval. In this paper, we propose a new perspective on this problem by using foundation models for data augmentation. Leveraging the remarkable capabilities of foundat...

Bibliographic Details
Main Authors:	SONG, Fangzhou; ZHU, Bin; HAO, Yanbin; WANG, Shuo
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2024
Subjects:	Recipe retrieval; Data augmentation; Foundation models; Databases and Information Systems; Graphics and Human Computer Interfaces
Online Access:	https://ink.library.smu.edu.sg/sis_research/9726 https://ink.library.smu.edu.sg/context/sis_research/article/10726/viewcontent/06751.pdf
Institution:	Singapore Management University

Similar Items

Retrieval augmented recipe generation
by: LIU, Guoshan, et al.
Published: (2025)

Cross-modal recipe retrieval: How to cook this dish?
by: CHEN, Jingjing, et al.
Published: (2017)

Cross-lingual adaptation for recipe retrieval with mixup
by: ZHU, Bin, et al.
Published: (2022)

Deep understanding of cooking procedure for cross-modal recipe retrieval
by: CHEN, Jingjing, et al.
Published: (2018)

Cross-modal recipe retrieval with stacked attention model
by: CHEN, Jing-Jing, et al.
Published: (2018)

Cross-modal recipe retrieval with rich food attributes
by: CHEN, Jingjing, et al.
Published: (2017)

Deep Understanding of Cooking Procedure for Cross-modal Recipe Retrieval
by: Jing-Jing Chen, et al.
Published: (2020)

Cross-modal Recipe Retrieval with Rich Food Attributes
by: Jingjing Chen, et al.
Published: (2020)

INMU-RecipeDatabase
by: อรพินท์ บรรจง
Published: (2020)

R2GAN: Cross-modal recipe retrieval with generative adversarial network
by: ZHU, Bin, et al.
Published: (2019)

Deep-based ingredient recognition for cooking recipe retrieval
by: CHEN, Jingjing, et al.
Published: (2016)

Learning from web recipe-image pairs for food recognition: Problem, baselines and performance
by: ZHU, Bin, et al.
Published: (2022)

Secrets from Tita Cely's kitchen: Recipes and stories of Cecilia Kalaw
by: Mercado, Alessandra Marie C., et al.
Published: (2010)

Estimating glycemic impact of cooking recipes via online crowdsourcing and machine learning
by: LEE, Helena, et al.
Published: (2019)

Multi-modal cooking workflow construction for food recipes
by: PAN, Liangming, et al.
Published: (2020)

INMU – RecipeCal Program
by: อุไรพร จิตต์แจ้ง
Published: (2020)

Standardization vs. multiplicity of medical recipe compositions – a computational reconstruction of the genealogy of TCM recipes
by: Prackwieser, Joachim
Published: (2023)

Tree-augmented cross-modal encoding for complex-query video retrieval
by: YANG, Xun, et al.
Published: (2020)

Cross-modal food retrieval: Learning a joint embedding of food images and recipes with semantic consistency and attention mechanism
by: WANG, Hao, et al.
Published: (2022)

PIC2DISH: A customized cooking assistant system
by: AN, Yongsheng, et al.
Published: (2017)

Assistance for target selection in mobile augmented reality
by: ASOKAN, Vinod, et al.
Published: (2020)

Cross-modal food retrieval: Learning a joint embedding of food images and recipes with semantic consistency and attention mechanism
by: WANG, Hao, et al.
Published: (2021)

An affordable augmented reality based rehabilitation system for hand motions
by: Zhang, D., et al.
Published: (2014)

RecipeGPT: Generative pre-training based cooking recipe generation and evaluation system
by: LEE, Helena Huey Chong, et al.
Published: (2020)

Blind late fusion in multimedia event retrieval
by: DE BOER, Maaike H. T., et al.
Published: (2016)

A recipe for success? A nutrient analysis of recipes promoted by supermarkets
by: Wademan, J., et al.
Published: (2021)

Data augmentation using rotation and shifting
by: Muhammad Haziq Bin Mornin
Published: (2024)

On clustering and retrieval of video shots
by: NGO, Chong-wah, et al.
Published: (2001)

Efficient cross-modal video retrieval with meta-optimized frames
by: HAN, Ning, et al.
Published: (2024)

Clip-based similarity measure for hierarchical video retrieval
by: PENG, Yuxin, et al.
Published: (2004)

Generative adversarial networks-based data augmentation for brain-computer interface
by: Fahimi, Fatemeh, et al.
Published: (2022)

A visual interaction cue framework from video game environments for augmented reality
by: DILLMAN, Kody R., et al.
Published: (2018)

Unsupervised video hashing with multi-granularity contextualization and multi-structure preservation
by: HAO, Yanbin, et al.
Published: (2022)

Motion retrieval by temporal slices analysis
by: NGO, Chong-Wah, et al.
Published: (2002)

Distribution-based concept selection for concept-based video retrieval
by: CAO, Juan, et al.
Published: (2009)

Recipe determination and scheduling of gasoline blending operations
by: Li, J., et al.
Published: (2014)

Key technologies for content-based video retrieval
by: PENG, Y., et al.
Published: (2004)

Effect of augmented reality on consumer behavior in e-commerce
by: UZOECHINA, Chibuke, et al.
Published: (2021)

Large language model (LLM) with retrieve-augmented generation (RAG) for legal case research
by: Liu, Zihao
Published: (2024)

OM-based video shot retrieval by one-to-one matching
by: PENG, Yuxin, et al.
Published: (2007)