Pre-training graph transformer with multimodal side information for recommendation
Side information of items, e.g., images and text descriptions, has been shown to be effective in contributing to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and their relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that the proposed PMGT model effectively exploits the multimodal side information to achieve better accuracies in downstream tasks, including item recommendation and click-through rate prediction. In addition, we also report a case study of testing PMGT in an online setting with 600 thousand users.
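The abstract relates items through common user activities (e.g., co-purchase) into a homogeneous item graph and selects contextual neighbors with an algorithm called MCNSampling, which the record does not spell out. The following is only a minimal illustrative sketch under that description: a co-purchase graph plus weighted random walks standing in for contextual-neighbor selection. The function names (`build_item_graph`, `sample_contextual_neighbors`) and the random-walk strategy are assumptions, not the authors' method.

```python
import random
from collections import Counter, defaultdict

# Hypothetical sketch: build a homogeneous item graph from co-purchase baskets
# and pick contextual neighbors for an item. This is NOT the paper's MCNSampling
# algorithm, only an illustration of the general idea it describes.

def build_item_graph(baskets):
    """Connect items bought together; edge weight = co-purchase count."""
    graph = defaultdict(Counter)
    for basket in baskets:
        for i, a in enumerate(basket):
            for b in basket[i + 1:]:
                graph[a][b] += 1
                graph[b][a] += 1
    return graph

def sample_contextual_neighbors(graph, item, num_neighbors=3,
                                walk_len=2, num_walks=20, seed=0):
    """Run short weighted random walks from `item`; the most-visited
    nodes act as its contextual neighbors."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = item
        for _ in range(walk_len):
            nbrs = graph.get(node)
            if not nbrs:
                break
            nodes, weights = zip(*nbrs.items())
            node = rng.choices(nodes, weights=weights, k=1)[0]
            if node != item:
                visits[node] += 1
    return [n for n, _ in visits.most_common(num_neighbors)]

if __name__ == "__main__":
    baskets = [["i1", "i2", "i3"], ["i2", "i3"], ["i1", "i4"], ["i3", "i4", "i2"]]
    g = build_item_graph(baskets)
    print(sample_contextual_neighbors(g, "i2"))
```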
Main Authors: Liu, Yong; Yang, Susen; Lei, Chenyi; Wang, Guoxin; Tang, Haihong; Zhang, Juyong; Sun, Aixin; Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2022
Subjects: Engineering::Computer science and engineering; Recommendation Systems; Pre-Training Model
Online Access: https://hdl.handle.net/10356/156088
Institution: Nanyang Technological University
id
sg-ntu-dr.10356-156088
record_format
dspace
spelling
sg-ntu-dr.10356-156088 2022-04-07T00:47:35Z
Pre-training graph transformer with multimodal side information for recommendation
Liu, Yong; Yang, Susen; Lei, Chenyi; Wang, Guoxin; Tang, Haihong; Zhang, Juyong; Sun, Aixin; Miao, Chunyan
School of Computer Science and Engineering
29th ACM International Conference on Multimedia (MM '21)
Alibaba-NTU Singapore Joint Research Institute & LILY Research Centre
Engineering::Computer science and engineering; Recommendation Systems; Pre-Training Model
Side information of items, e.g., images and text descriptions, has been shown to be effective in contributing to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and their relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that the proposed PMGT model effectively exploits the multimodal side information to achieve better accuracies in downstream tasks, including item recommendation and click-through rate prediction. In addition, we also report a case study of testing PMGT in an online setting with 600 thousand users.
AI Singapore; National Research Foundation (NRF)
Submitted/Accepted version
This research is supported, in part, by Alibaba Group through Alibaba Innovative Research (AIR) Program and Alibaba-NTU Singapore Joint Research Institute (JRI), Nanyang Technological University, Singapore. This research is also supported, in part, by the National Research Foundation, Prime Minister’s Office, Singapore under its AI Singapore Programme (AISG Award No: AISG-GC2019-003) and under its NRF Investigatorship Programme (NRFI Award No. NRF-NRFI05-2019-0002).
2022-04-07T00:47:34Z 2022-04-07T00:47:34Z 2021
Conference Paper
Liu, Y., Yang, S., Lei, C., Wang, G., Tang, H., Zhang, J., Sun, A. & Miao, C. (2021). Pre-training graph transformer with multimodal side information for recommendation. 29th ACM International Conference on Multimedia (MM '21), 2853-2861. https://dx.doi.org/10.1145/3474085.3475709
9781450386517
https://hdl.handle.net/10356/156088
10.1145/3474085.3475709
2-s2.0-85118999559
2853-2861
en
AISG-GC2019-003; NRF-NRFI05-2019-0002
© 2021 Association for Computing Machinery. All rights reserved. This paper was published in Proceedings of the 29th ACM International Conference on Multimedia (MM '21) and is made available with permission of Association for Computing Machinery.
application/pdf
institution
Nanyang Technological University
building
NTU Library
continent
Asia
country
Singapore
content_provider
NTU Library
collection
DR-NTU
language
English
topic
Engineering::Computer science and engineering; Recommendation Systems; Pre-Training Model
description
Side information of items, e.g., images and text descriptions, has been shown to be effective in contributing to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and their relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm named MCNSampling to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that the proposed PMGT model effectively exploits the multimodal side information to achieve better accuracies in downstream tasks, including item recommendation and click-through rate prediction. In addition, we also report a case study of testing PMGT in an online setting with 600 thousand users.
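The description names two pre-training objectives, graph structure reconstruction and masked node feature reconstruction, without giving their exact form. Below is a hedged sketch of how such a combined objective could look; the specific loss terms (MSE for feature reconstruction, binary cross-entropy on dot-product edge scores), the `alpha` weighting, and all tensor names are illustrative assumptions rather than PMGT's actual formulation.

```python
import torch
import torch.nn.functional as F

# Hypothetical sketch of the two pre-training objectives named in the abstract:
# (1) masked node feature reconstruction and (2) graph structure reconstruction.
# The loss forms and weighting are assumptions, not the paper's exact formulation.

def pretraining_loss(node_emb, node_feat, masked_idx, feat_decoder,
                     pos_edges, neg_edges, alpha=0.5):
    """node_emb: [N, d] encoder outputs; node_feat: [N, f] original multimodal features;
    masked_idx: nodes whose input features were masked; feat_decoder: maps d -> f;
    pos_edges / neg_edges: [E, 2] index pairs of observed edges and sampled non-edges."""
    # 1) Masked node feature reconstruction: recover original features of masked nodes.
    recon = feat_decoder(node_emb[masked_idx])
    feat_loss = F.mse_loss(recon, node_feat[masked_idx])

    # 2) Graph structure reconstruction: score node pairs by dot product and
    #    push observed edges to score higher than sampled non-edges.
    def edge_scores(edges):
        return (node_emb[edges[:, 0]] * node_emb[edges[:, 1]]).sum(dim=-1)

    scores = torch.cat([edge_scores(pos_edges), edge_scores(neg_edges)])
    labels = torch.cat([torch.ones(len(pos_edges)), torch.zeros(len(neg_edges))])
    struct_loss = F.binary_cross_entropy_with_logits(scores, labels)

    return alpha * feat_loss + (1 - alpha) * struct_loss

if __name__ == "__main__":
    # Toy shapes only: 6 nodes, 8-dim embeddings, 16-dim "multimodal" features.
    emb = torch.randn(6, 8)
    feat = torch.randn(6, 16)
    decoder = torch.nn.Linear(8, 16)
    masked = torch.tensor([1, 4])
    pos = torch.tensor([[0, 1], [1, 2], [3, 4]])
    neg = torch.tensor([[0, 5], [2, 5], [1, 3]])
    print(pretraining_loss(emb, feat, masked, decoder, pos, neg).item())
```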
author2
School of Computer Science and Engineering
author_facet
School of Computer Science and Engineering; Liu, Yong; Yang, Susen; Lei, Chenyi; Wang, Guoxin; Tang, Haihong; Zhang, Juyong; Sun, Aixin; Miao, Chunyan
format
Conference or Workshop Item
author
Liu, Yong; Yang, Susen; Lei, Chenyi; Wang, Guoxin; Tang, Haihong; Zhang, Juyong; Sun, Aixin; Miao, Chunyan
author_sort
Liu, Yong
title
Pre-training graph transformer with multimodal side information for recommendation
publishDate
2022
url
https://hdl.handle.net/10356/156088
_version_
1729789484339298304