Pre-training graph transformer with multimodal side information for recommendation

Side information of items, e.g., images and text descriptions, has been shown to be effective in contributing to accurate recommendations. Inspired by the recent success of pre-training models on natural language and images, we propose a pre-training strategy to learn item representations by considering both item side information and item relationships. We relate items by common user activities, e.g., co-purchase, and construct a homogeneous item graph. This graph provides a unified view of item relations and their associated multimodal side information. We develop a novel sampling algorithm, named MCNSampling, to select contextual neighbors for each item. The proposed Pre-trained Multimodal Graph Transformer (PMGT) learns item representations with two objectives: 1) graph structure reconstruction, and 2) masked node feature reconstruction. Experimental results on real datasets demonstrate that PMGT effectively exploits multimodal side information to achieve better accuracy in downstream tasks, including item recommendation and click-through rate (CTR) prediction. We also report a case study of testing PMGT in an online setting with 600 thousand users.
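
Illustrative sketch (not part of the record or the paper): the abstract names two pre-training objectives, graph structure reconstruction and masked node feature reconstruction. The minimal PyTorch snippet below shows one plausible way such a joint loss could be written; all identifiers (item_emb, feat_proj, the one-positive/one-negative neighbor sampling) are assumptions made for illustration, not the authors' implementation.

    # Hypothetical sketch of PMGT-style pre-training objectives; names and
    # shapes are illustrative assumptions, not the authors' code.
    import torch
    import torch.nn.functional as F

    num_items, dim, feat_dim = 100, 32, 64
    item_emb = torch.nn.Embedding(num_items, dim)   # learned item representations
    feat_proj = torch.nn.Linear(dim, feat_dim)      # maps representations back to side-information features

    # Toy batch: each item has one sampled contextual neighbor (positive) and one random negative.
    items = torch.randint(0, num_items, (16,))
    pos_neighbors = torch.randint(0, num_items, (16,))
    neg_neighbors = torch.randint(0, num_items, (16,))
    node_feats = torch.randn(16, feat_dim)          # e.g., pooled image/text features per item
    mask = torch.rand(16) < 0.15                    # a subset of nodes has masked features

    z = item_emb(items)

    # 1) Graph structure reconstruction: score observed (item, neighbor) pairs above random pairs.
    pos_score = (z * item_emb(pos_neighbors)).sum(-1)
    neg_score = (z * item_emb(neg_neighbors)).sum(-1)
    structure_loss = F.binary_cross_entropy_with_logits(
        torch.cat([pos_score, neg_score]),
        torch.cat([torch.ones(16), torch.zeros(16)]),
    )

    # 2) Masked node feature reconstruction: recover side-information features of masked nodes.
    feature_loss = (
        F.mse_loss(feat_proj(z[mask]), node_feats[mask]) if mask.any() else torch.zeros(())
    )

    (structure_loss + feature_loss).backward()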


Bibliographic Details
Main Authors: Liu, Yong, Yang, Susen, Lei, Chenyi, Wang, Guoxin, Tang, Haihong, Zhang, Juyong, Sun, Aixin, Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Conference or Workshop Item
Language: English
Published: 2022
Subjects: Engineering::Computer science and engineering; Recommendation Systems; Pre-Training Model
Online Access:https://hdl.handle.net/10356/156088
Institution: Nanyang Technological University
Conference: 29th ACM International Conference on Multimedia (MM '21)
Research Centres: Alibaba-NTU Singapore Joint Research Institute; LILY Research Centre
Type: Conference Paper
Version: Submitted/Accepted version
Funding Agencies: AI Singapore; National Research Foundation (NRF)
Funding: This research is supported, in part, by Alibaba Group through the Alibaba Innovative Research (AIR) Program and the Alibaba-NTU Singapore Joint Research Institute (JRI), Nanyang Technological University, Singapore. This research is also supported, in part, by the National Research Foundation, Prime Minister's Office, Singapore, under its AI Singapore Programme (AISG Award No: AISG-GC2019-003) and under its NRF Investigatorship Programme (NRFI Award No: NRF-NRFI05-2019-0002).
Grant Numbers: AISG-GC2019-003; NRF-NRFI05-2019-0002
Citation: Liu, Y., Yang, S., Lei, C., Wang, G., Tang, H., Zhang, J., Sun, A. & Miao, C. (2021). Pre-training graph transformer with multimodal side information for recommendation. 29th ACM International Conference on Multimedia (MM '21), 2853-2861. https://dx.doi.org/10.1145/3474085.3475709
DOI: 10.1145/3474085.3475709
ISBN: 9781450386517
Scopus ID: 2-s2.0-85118999559
Pages: 2853-2861
Handle: https://hdl.handle.net/10356/156088
File Format: application/pdf
Rights: © 2021 Association for Computing Machinery. All rights reserved. This paper was published in Proceedings of the 29th ACM International Conference on Multimedia (MM '21) and is made available with permission of the Association for Computing Machinery.
Collection: DR-NTU (NTU Library), Nanyang Technological University, Singapore
Record Deposited: 2022-04-07