Discovering newsworthy themes from sequenced data: A step towards computational journalism

Automatic discovery of newsworthy themes from sequenced data can relieve journalists from manually poring over a large amount of data in order to find interesting news. In this paper, we propose a novel k-Sketch query that aims to find k striking streaks to best summarize a subject. Our scoring fun...

Bibliographic Details
Main Authors:	FAN, Qi; LI, Yuchen; ZHANG, Dongxiang; TAN, Kian-Lee
Format:	text
Language:	English
Published:	Institutional Knowledge at Singapore Management University, 2017
Subjects:	Computational journalism; news theme discovery; sequenced data; approximate algorithms; Databases and Information Systems; Data Storage Systems
Online Access:	https://ink.library.smu.edu.sg/sis_research/3996 https://ink.library.smu.edu.sg/context/sis_research/article/4998/viewcontent/07883865__1_.pdf
Institution:	Singapore Management University

id	sg-smu-ink.sis_research-4998
record_format	dspace
spelling	sg-smu-ink.sis_research-4998 2018-05-28T08:57:09Z Discovering newsworthy themes from sequenced data: A step towards computational journalism FAN, Qi LI, Yuchen ZHANG, Dongxiang TAN, Kian-Lee Automatic discovery of newsworthy themes from sequenced data can relieve journalists from manually poring over a large amount of data in order to find interesting news. In this paper, we propose a novel k-Sketch query that aims to find k striking streaks to best summarize a subject. Our scoring function takes into account streak strikingness and streak coverage at the same time. We study k-Sketch query processing in both offline and online scenarios, and propose various streak-level pruning techniques to find striking candidates. Among those candidates, we then develop approximate methods to discover the k most representative streaks with theoretical bounds. We conduct experiments on four real datasets, and the results demonstrate the efficiency and effectiveness of our proposed algorithms: the running time achieves a speedup of up to 500 times, and the quality of the generated summaries is endorsed by anonymous users from Amazon Mechanical Turk. 2017-07-01T07:00:00Z text application/pdf https://ink.library.smu.edu.sg/sis_research/3996 info:doi/10.1109/TKDE.2017.2685587 https://ink.library.smu.edu.sg/context/sis_research/article/4998/viewcontent/07883865__1_.pdf http://creativecommons.org/licenses/by-nc-nd/4.0/ Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University Computational journalism news theme discovery sequenced data approximate algorithms Databases and Information Systems Data Storage Systems
institution	Singapore Management University
building	SMU Libraries
continent	Asia
country	Singapore
content_provider	SMU Libraries
collection	InK@SMU
language	English
topic	Computational journalism; news theme discovery; sequenced data; approximate algorithms; Databases and Information Systems; Data Storage Systems
description	Automatic discovery of newsworthy themes from sequenced data can relieve journalists from manually poring over a large amount of data in order to find interesting news. In this paper, we propose a novel k-Sketch query that aims to find k striking streaks to best summarize a subject. Our scoring function takes into account streak strikingness and streak coverage at the same time. We study k-Sketch query processing in both offline and online scenarios, and propose various streak-level pruning techniques to find striking candidates. Among those candidates, we then develop approximate methods to discover the k most representative streaks with theoretical bounds. We conduct experiments on four real datasets, and the results demonstrate the efficiency and effectiveness of our proposed algorithms: the running time achieves a speedup of up to 500 times, and the quality of the generated summaries is endorsed by anonymous users from Amazon Mechanical Turk.
format	text
author	FAN, Qi; LI, Yuchen; ZHANG, Dongxiang; TAN, Kian-Lee
author_sort	FAN, Qi
title	Discovering newsworthy themes from sequenced data: A step towards computational journalism
title_sort	discovering newsworthy themes from sequenced data: a step towards computational journalism
publisher	Institutional Knowledge at Singapore Management University
publishDate	2017
url	https://ink.library.smu.edu.sg/sis_research/3996 https://ink.library.smu.edu.sg/context/sis_research/article/4998/viewcontent/07883865__1_.pdf
_version_	1770574114917974016

Similar Items