Advances and Challenges for Scalable Provenance in Stream Processing Systems

While data provenance is a well-studied topic in both database and workflow systems, its support within stream processing systems presents a new set of challenges. Part of the challenge is the high stream event rate and the low processing latency requirements imposed by many streaming applications....

Bibliographic Details
Main Authors: MISRA, Archan, BLOUNT, Marion, KEMENTSIETSIDIS, Anastasios, SOW, Daby, WANG, Min
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2008
Subjects: Software Engineering
Online Access: https://ink.library.smu.edu.sg/sis_research/678
https://ink.library.smu.edu.sg/context/sis_research/article/1677/viewcontent/Misra2008_Chapter_AdvancesAndChallengesForScalab.pdf
Institution: Singapore Management University
id sg-smu-ink.sis_research-1677
record_format dspace
date 2008-06-01T07:00:00Z
doi 10.1007/978-3-540-89965-5_26
file_format application/pdf
rights http://creativecommons.org/licenses/by-nc-nd/4.0/
series Research Collection School Of Computing and Information Systems
building SMU Libraries
continent Asia
country Singapore
content_provider SMU Libraries
collection InK@SMU
topic Software Engineering
description While data provenance is a well-studied topic in both database and workflow systems, its support within stream processing systems presents a new set of challenges. Part of the challenge is the high stream event rate and the low processing latency requirements imposed by many streaming applications. For example, emerging streaming applications in healthcare or finance call for data provenance, as illustrated in the Century stream processing infrastructure that we are building to support online healthcare analytics. At any time, given an output data element (e.g., a medical alert) generated by Century, the system must be able to retrieve the input and intermediate data elements that led to its generation. In this paper, we describe the requirements behind our initial implementation of Century’s provenance subsystem. We then analyze its strengths and limitations and propose a new provenance architecture to address some of these limitations. The paper also includes a discussion of the open challenges in this area.