Summarizing user-item matrix by group utility maximization
A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summari...
Saved in:
Main Authors: | , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: |
2023
|
Subjects: | |
Online Access: | https://hdl.handle.net/10356/166437 |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Institution: | Nanyang Technological University |
Language: | English |
id |
sg-ntu-dr.10356-166437 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-1664372023-04-28T15:37:38Z Summarizing user-item matrix by group utility maximization Wang, Yongjie Wang, Ke Long, Cheng Miao, Chunyan School of Computer Science and Engineering Alibaba-NTU Singapore Joint Research Institute (JRI) Engineering::Computer science and engineering Data Summarization Greedy Algorithm A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summarizing strong association for a large user-item matrix using a small summary size. Traditional techniques fail to distinguish user groups associated with different items (such as top-$l$ item selection) or fail to focus on high utility (such as similarity based subspace clustering and biclustering). We formulate a new problem, called Group Utility Maximization, to summarize the entire user population through $k$ user groups and $l$ items for each group; the goal is to maximize the total utility of selected items over all groups collectively. We show this problem is NP-hard even for $l=1$. We present two algorithms. One greedily finds the next group, called Greedy algorithm, and the other iteratively refines existing $k$ groups, called $k$-max algorithm. Greedy algorithm provides the $(1-\frac{1}{e})$ approximation guarantee for a nonnegative utility matrix, whereas $k$-max algorithm is more efficient for large datasets. We evaluate these algorithms on real-life datasets. National Research Foundation (NRF) Submitted/Accepted version 2023-04-28T07:30:33Z 2023-04-28T07:30:33Z 2023 Journal Article Wang, Y., Wang, K., Long, C. & Miao, C. (2023). Summarizing user-item matrix by group utility maximization. ACM Transactions On Knowledge Discovery From Data, 17(6), 86-. https://dx.doi.org/10.1145/3578586 1556-4681 https://hdl.handle.net/10356/166437 10.1145/3578586 6 17 86 en AISG-GC-2019-003 NRF-NRFI05- 2019-0002 ACM Transactions on Knowledge Discovery from Data © 2023 Association for Computing Machinery. All rights reserved. This paper was published in ACM Transactions on Knowledge Discovery from Data and is made available with permission of Association for Computing Machinery. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
Engineering::Computer science and engineering Data Summarization Greedy Algorithm |
spellingShingle |
Engineering::Computer science and engineering Data Summarization Greedy Algorithm Wang, Yongjie Wang, Ke Long, Cheng Miao, Chunyan Summarizing user-item matrix by group utility maximization |
description |
A user-item utility matrix represents the utility (or preference) associated with each (user, item) pair, such as citation counts, rating/vote on items or locations, and clicks on items. A high utility value indicates a strong association of the pair. In this work, we consider the problem of summarizing strong association for a large user-item matrix using a small summary size. Traditional techniques
fail to distinguish user groups associated with different items (such as top-$l$ item selection) or fail to focus on high utility (such as similarity based subspace clustering and biclustering). We formulate a new problem, called Group Utility Maximization, to summarize the entire user population through $k$ user groups and $l$ items for each group; the goal is to maximize the total utility of selected items over all groups collectively. We show this problem is NP-hard even for $l=1$. We present
two algorithms. One greedily finds the next group, called Greedy algorithm, and the other iteratively refines existing $k$ groups, called $k$-max algorithm. Greedy algorithm provides the $(1-\frac{1}{e})$ approximation guarantee for a nonnegative utility matrix, whereas $k$-max algorithm is more efficient for large datasets. We evaluate these algorithms on real-life datasets. |
author2 |
School of Computer Science and Engineering |
author_facet |
School of Computer Science and Engineering Wang, Yongjie Wang, Ke Long, Cheng Miao, Chunyan |
format |
Article |
author |
Wang, Yongjie Wang, Ke Long, Cheng Miao, Chunyan |
author_sort |
Wang, Yongjie |
title |
Summarizing user-item matrix by group utility maximization |
title_short |
Summarizing user-item matrix by group utility maximization |
title_full |
Summarizing user-item matrix by group utility maximization |
title_fullStr |
Summarizing user-item matrix by group utility maximization |
title_full_unstemmed |
Summarizing user-item matrix by group utility maximization |
title_sort |
summarizing user-item matrix by group utility maximization |
publishDate |
2023 |
url |
https://hdl.handle.net/10356/166437 |
_version_ |
1765213843229769728 |