Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation

Fashion recommendation has attracted increasing attention from both industry and academic communities. This paper proposes a novel neural architecture for fashion recommendation based on both image region-level features and user review information. Our basic intuition is that: for a fashion image, n...

Bibliographic Details
Main Authors:	CHEN, Xu; CHEN, Hanxiong; XU, Hongteng; ZHANG, Yongfeng; CAO, Yixin; QIN, Zheng; ZHA, Hongyuan
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University 2019
Subjects:	Databases and Information Systems; Graphics and Human Computer Interfaces; OS and Networks
Online Access:	https://ink.library.smu.edu.sg/sis_research/7463 https://ink.library.smu.edu.sg/context/sis_research/article/8466/viewcontent/3331184.3331254.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-8466
record_format	dspace
spelling	sg-smu-ink.sis_research-84662022-10-20T07:10:43Z Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation CHEN, Xu CHEN, Hanxiong XU, Hongteng ZHANG, Yongfeng CAO, Yixin QIN, Zheng ZHA, Hongyuan Fashion recommendation has attracted increasing attention from both industry and academic communities. This paper proposes a novel neural architecture for fashion recommendation based on both image region-level features and user review information. Our basic intuition is that: for a fashion image, not all the regions are equally important for the users, i.e., people usually care about a few parts of the fashion image. To model such human sense, we learn an attention model over many pre-segmented image regions, based on which we can understand where a user is really interested in on the image, and correspondingly, represent the image in a more accurate manner. In addition, by discovering such fine-grained visual preference, we can visually explain a recommendation by highlighting some regions of its image. For better learning the attention model, we also introduce user review information as a weak supervision signal to collect more comprehensive user preference. In our final framework, the visual and textual features are seamlessly coupled by a multimodal attention network. Based on this architecture, we can not only provide accurate recommendation, but also can accompany each recommended item with novel visual explanations. We conduct extensive experiments to demonstrate the superiority of our proposed model in terms of Top-N recommendation, and also we build a collectively labeled dataset for evaluating our provided visual explanations in a quantitative manner. 
2019-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/7463 info:doi/10.1145/3331184.3331254 https://ink.library.smu.edu.sg/context/sis_research/article/8466/viewcontent/3331184.3331254.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Databases and Information Systems Graphics and Human Computer Interfaces OS and Networks
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Databases and Information Systems; Graphics and Human Computer Interfaces; OS and Networks
spellingShingle	Databases and Information Systems Graphics and Human Computer Interfaces OS and Networks CHEN, Xu CHEN, Hanxiong XU, Hongteng ZHANG, Yongfeng CAO, Yixin QIN, Zheng ZHA, Hongyuan Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
description	Fashion recommendation has attracted increasing attention from both industry and academic communities. This paper proposes a novel neural architecture for fashion recommendation based on both image region-level features and user review information. Our basic intuition is that: for a fashion image, not all the regions are equally important for the users, i.e., people usually care about a few parts of the fashion image. To model such human sense, we learn an attention model over many pre-segmented image regions, based on which we can understand where a user is really interested in on the image, and correspondingly, represent the image in a more accurate manner. In addition, by discovering such fine-grained visual preference, we can visually explain a recommendation by highlighting some regions of its image. For better learning the attention model, we also introduce user review information as a weak supervision signal to collect more comprehensive user preference. In our final framework, the visual and textual features are seamlessly coupled by a multimodal attention network. Based on this architecture, we can not only provide accurate recommendation, but also can accompany each recommended item with novel visual explanations. We conduct extensive experiments to demonstrate the superiority of our proposed model in terms of Top-N recommendation, and also we build a collectively labeled dataset for evaluating our provided visual explanations in a quantitative manner.
format	text
author	CHEN, Xu; CHEN, Hanxiong; XU, Hongteng; ZHANG, Yongfeng; CAO, Yixin; QIN, Zheng; ZHA, Hongyuan
author_facet	CHEN, Xu; CHEN, Hanxiong; XU, Hongteng; ZHANG, Yongfeng; CAO, Yixin; QIN, Zheng; ZHA, Hongyuan
author_sort	CHEN, Xu
title	Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
title_short	Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
title_full	Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
title_fullStr	Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
title_full_unstemmed	Personalized fashion recommendation with visual explanations based on multimodal attention network: Towards visually explainable recommendation
title_sort	personalized fashion recommendation with visual explanations based on multimodal attention network: towards visually explainable recommendation
publisher	Institutional Knowledge at Singapore Management University
publishDate	2019
url	https://ink.library.smu.edu.sg/sis_research/7463 https://ink.library.smu.edu.sg/context/sis_research/article/8466/viewcontent/3331184.3331254.pdf
_version_	1770576342894510080

Similar Items