Federated topic discovery: A semantic consistent approach

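General-purpose topic models have widespread industrial applications. Yet high-quality topic modeling is becoming increasingly challenging because accurate models require large amounts of training data typically owned by multiple parties, who are often unwilling to share their sensitive data for collaborative training without guarantees on their data privacy. To enable effective privacy-preserving multiparty topic modeling, we propose a novel federated general-purpose topic model named private and consistent topic discovery (PC-TD). On the one hand, PC-TD seamlessly integrates differential privacy in topic modeling to provide privacy guarantees on sensitive data of different parties. On the other hand, PC-TD exploits multiple sources of semantic consistency information to retain the accuracy of topic modeling while protecting data privacy. We verify the effectiveness of PC-TD on real-life datasets. Experimental results demonstrate its superiority over the state-of-the-art general-purpose topic models.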

Bibliographic Details
Main Authors: SHI, Yexuan, TONG, Yongxin, SU, Zhiyang, JIANG, Di, ZHOU, Zimu, ZHANG, Wenbin
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2020
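Subjects: Topic discovery; topic models; private data; Databases and Information Systems; Numerical Analysis and Scientific Computing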
Online Access: https://ink.library.smu.edu.sg/sis_research/6406
https://ink.library.smu.edu.sg/context/sis_research/article/7409/viewcontent/is21_shi_av.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-7409
record_format dspace
spelling sg-smu-ink.sis_research-7409
2021-11-23T02:07:58Z
Federated topic discovery: A semantic consistent approach
SHI, Yexuan
TONG, Yongxin
SU, Zhiyang
JIANG, Di
ZHOU, Zimu
ZHANG, Wenbin
2020-10-01T07:00:00Z
text
application/pdf
https://ink.library.smu.edu.sg/sis_research/6406
info:doi/10.1109/MIS.2020.3033459
https://ink.library.smu.edu.sg/context/sis_research/article/7409/viewcontent/is21_shi_av.pdf
http://creativecommons.org/licenses/by-nc-nd/4.0/
Research Collection School Of Computing and Information Systems
eng
Institutional Knowledge at Singapore Management University
Topic discovery
topic models
private data
Databases and Information Systems
Numerical Analysis and Scientific Computing
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Topic discovery
topic models
private data
Databases and Information Systems
Numerical Analysis and Scientific Computing
description General-purpose topic models have widespread industrial applications. Yet high-quality topic modeling is becoming increasingly challenging because accurate models require large amounts of training data typically owned by multiple parties, who are often unwilling to share their sensitive data for collaborative training without guarantees on their data privacy. To enable effective privacy-preserving multiparty topic modeling, we propose a novel federated general-purpose topic model named private and consistent topic discovery (PC-TD). On the one hand, PC-TD seamlessly integrates differential privacy in topic modeling to provide privacy guarantees on sensitive data of different parties. On the other hand, PC-TD exploits multiple sources of semantic consistency information to retain the accuracy of topic modeling while protecting data privacy. We verify the effectiveness of PC-TD on real-life datasets. Experimental results demonstrate its superiority over the state-of-the-art general-purpose topic models.
format text
author SHI, Yexuan
TONG, Yongxin
SU, Zhiyang
JIANG, Di
ZHOU, Zimu
ZHANG, Wenbin
title Federated topic discovery: A semantic consistent approach
publisher Institutional Knowledge at Singapore Management University
publishDate 2020
url https://ink.library.smu.edu.sg/sis_research/6406
https://ink.library.smu.edu.sg/context/sis_research/article/7409/viewcontent/is21_shi_av.pdf