Summarizing user-item matrix by group utility maximization

A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summari...

Full description

Saved in:
Bibliographic Details
Main Authors: Wang, Yongjie, Wang, Ke, Long, Cheng, Miao, Chunyan
Other Authors: School of Computer Science and Engineering
Format: Article
Language:English
Published: 2023
Subjects:
Online Access:https://hdl.handle.net/10356/166437
Tags: Add Tag
No Tags, Be the first to tag this record!
Institution: Nanyang Technological University
Language: English
id sg-ntu-dr.10356-166437
record_format dspace
spelling sg-ntu-dr.10356-1664372023-04-28T15:37:38Z Summarizing user-item matrix by group utility maximization Wang, Yongjie Wang, Ke Long, Cheng Miao, Chunyan School of Computer Science and Engineering Alibaba-NTU Singapore Joint Research Institute (JRI) Engineering::Computer science and engineering Data Summarization Greedy Algorithm A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summarizing strong association for a large user-item matrix using a small summary size. Traditional techniques fail to distinguish user groups associated with different items (such as top-$l$ item selection) or fail to focus on high utility (such as similarity based subspace clustering and biclustering). We formulate a new problem, called Group Utility Maximization, to summarize the entire user population through $k$ user groups and $l$ items for each group; the goal is to maximize the total utility of selected items over all groups collectively. We show this problem is NP-hard even for $l=1$. We present two algorithms. One greedily finds the next group, called Greedy algorithm, and the other iteratively refines existing $k$ groups, called $k$-max algorithm. Greedy algorithm provides the $(1-\frac{1}{e})$ approximation guarantee for a nonnegative utility matrix, whereas $k$-max algorithm is more efficient for large datasets. We evaluate these algorithms on real-life datasets. National Research Foundation (NRF) Submitted/Accepted version 2023-04-28T07:30:33Z 2023-04-28T07:30:33Z 2023 Journal Article Wang, Y., Wang, K., Long, C. & Miao, C. (2023). Summarizing user-item matrix by group utility maximization. ACM Transactions On Knowledge Discovery From Data, 17(6), 86-. https://dx.doi.org/10.1145/3578586 1556-4681 https://hdl.handle.net/10356/166437 10.1145/3578586 6 17 86 en AISG-GC-2019-003 NRF-NRFI05- 2019-0002 ACM Transactions on Knowledge Discovery from Data © 2023 Association for Computing Machinery. All rights reserved. This paper was published in ACM Transactions on Knowledge Discovery from Data and is made available with permission of Association for Computing Machinery. application/pdf
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
Data Summarization
Greedy Algorithm
spellingShingle Engineering::Computer science and engineering
Data Summarization
Greedy Algorithm
Wang, Yongjie
Wang, Ke
Long, Cheng
Miao, Chunyan
Summarizing user-item matrix by group utility maximization
description A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summarizing strong association for a large user-item matrix using a small summary size. Traditional techniques fail to distinguish user groups associated with different items (such as top-$l$ item selection) or fail to focus on high utility (such as similarity based subspace clustering and biclustering). We formulate a new problem, called Group Utility Maximization, to summarize the entire user population through $k$ user groups and $l$ items for each group; the goal is to maximize the total utility of selected items over all groups collectively. We show this problem is NP-hard even for $l=1$. We present two algorithms. One greedily finds the next group, called Greedy algorithm, and the other iteratively refines existing $k$ groups, called $k$-max algorithm. Greedy algorithm provides the $(1-\frac{1}{e})$ approximation guarantee for a nonnegative utility matrix, whereas $k$-max algorithm is more efficient for large datasets. We evaluate these algorithms on real-life datasets.
author2 School of Computer Science and Engineering
author_facet School of Computer Science and Engineering
Wang, Yongjie
Wang, Ke
Long, Cheng
Miao, Chunyan
format Article
author Wang, Yongjie
Wang, Ke
Long, Cheng
Miao, Chunyan
author_sort Wang, Yongjie
title Summarizing user-item matrix by group utility maximization
title_short Summarizing user-item matrix by group utility maximization
title_full Summarizing user-item matrix by group utility maximization
title_fullStr Summarizing user-item matrix by group utility maximization
title_full_unstemmed Summarizing user-item matrix by group utility maximization
title_sort summarizing user-item matrix by group utility maximization
publishDate 2023
url https://hdl.handle.net/10356/166437
_version_ 1765213843229769728