Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis

Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases, to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and s...

Bibliographic Details
Main Authors: CHEN, Chen, YAN, Xifeng, ZHU, Feida, HAN, Jiawei, YU, Philip S.
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University 2009
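Subjects: Multi-dimensional model; Graph OLAP; Efficient computation; Discovery-driven analysis; Databases and Information Systems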
Online Access: https://ink.library.smu.edu.sg/sis_research/1537
http://dx.doi.org/10.1007/s10115-009-0228-9
Institution: Singapore Management University
id sg-smu-ink.sis_research-2536
record_format dspace
spelling sg-smu-ink.sis_research-2536 2012-08-08T09:12:06Z 2009-10-01T07:00:00Z text https://ink.library.smu.edu.sg/sis_research/1537 info:doi/10.1007/s10115-009-0228-9 http://dx.doi.org/10.1007/s10115-009-0228-9 Research Collection School Of Computing and Information Systems eng Institutional Knowledge at Singapore Management University
institution Singapore Management University
building SMU Libraries
continent Asia
country Singapore
Singapore
content_provider SMU Libraries
collection InK@SMU
language English
topic Multi-dimensional model
Graph OLAP
Efficient computation
Discovery-driven analysis
Databases and Information Systems
description Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and showing their critical importance in many applications, such as the analysis of XML, social networks, the Web, biological data, multimedia data and spatiotemporal data. Can we extend useful functions of databases and data warehouse systems to handle graph-structured data? In particular, OLAP (On-Line Analytical Processing) has been a popular tool for fast and user-friendly multi-dimensional analysis of data warehouses. Can we OLAP graphs? Unfortunately, to the best of our knowledge, there are no OLAP tools available that can interactively view and analyze graph data from different perspectives and with multiple granularities. In this paper, we argue that it is critically important to OLAP graph-structured data and propose a novel Graph OLAP framework. According to this framework, given a graph dataset with its nodes and edges associated with respective attributes, a multi-dimensional model can be built to enable efficient on-line analytical processing so that any portion of the graphs can be generalized/specialized dynamically, offering multiple, versatile views of the data. The contributions of this work are three-fold. First, starting from basic definitions, i.e., what dimensions and measures are in the Graph OLAP scenario, we develop a conceptual framework for data cubes on graphs. We also look into different semantics of OLAP operations, and classify the framework into two major subcases: informational OLAP and topological OLAP. Second, we show how a graph cube can be materialized by calculating a special kind of measure called an aggregated graph, and how to implement it efficiently. This includes both full materialization and partial materialization, where constraints are enforced to obtain an iceberg cube. Due to the increased structural complexity of the data, aggregated graphs that depend on the underlying “network” properties of the graph dataset are much harder to compute than their traditional OLAP counterparts. Third, to provide more flexible, interesting and informative OLAP of graphs, we further propose a discovery-driven multi-dimensional analysis model to ensure that OLAP is performed in an intelligent manner, guided by expert rules and knowledge discovery processes. We outline such a framework and discuss some challenging research issues for discovery-driven Graph OLAP.
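The idea of generalizing a portion of a graph into an aggregated graph, and then keeping only an iceberg subset of it, might be illustrated with the following minimal sketch. It is not the paper's algorithm: it assumes that a roll-up merges all nodes sharing one attribute value into a single generalized node and sums the weights of the edges between the resulting groups, and that the iceberg constraint is a simple minimum-weight threshold. The function names `roll_up` and `iceberg`, the attribute name `group`, and the toy data are all hypothetical.

```python
from collections import defaultdict


def roll_up(nodes, edges, attr):
    """Merge nodes that share the same value of `attr` (assumed semantics).

    nodes: {node_id: {attribute_name: value}}
    edges: {(node_u, node_v): weight}, treated as undirected
    Returns (group_sizes, aggregated_edges).
    """
    group_of = {node: attrs[attr] for node, attrs in nodes.items()}

    # Number of original nodes folded into each generalized node.
    sizes = defaultdict(int)
    for group in group_of.values():
        sizes[group] += 1

    # Sum edge weights between groups; sorting the pair makes the
    # key independent of edge direction.
    aggregated = defaultdict(int)
    for (u, v), weight in edges.items():
        key = tuple(sorted((group_of[u], group_of[v])))
        aggregated[key] += weight

    return dict(sizes), dict(aggregated)


def iceberg(aggregated_edges, min_weight):
    """Keep only aggregated edges whose weight meets the threshold."""
    return {k: w for k, w in aggregated_edges.items() if w >= min_weight}


# Hypothetical toy graph: four nodes, one attribute, weighted edges.
nodes = {
    "ann": {"group": "g1"},
    "bob": {"group": "g1"},
    "cat": {"group": "g2"},
    "dan": {"group": "g3"},
}
edges = {
    ("ann", "bob"): 3,
    ("ann", "cat"): 2,
    ("bob", "cat"): 1,
    ("cat", "dan"): 1,
}

sizes, agg = roll_up(nodes, edges, "group")
heavy = iceberg(agg, min_weight=2)
```

In this toy case the edge inside `g1` becomes a self-loop on the generalized node, and the two edges from `g1` to `g2` collapse into one edge of weight 3. Other aggregate measures (counts, averages, or structural properties) would need different logic, which is where the abstract's remark about network-dependent measures being harder to compute applies.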
format text
author CHEN, Chen
YAN, Xifeng
ZHU, Feida
HAN, Jiawei
YU, Philip S.
title Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis
publisher Institutional Knowledge at Singapore Management University
publishDate 2009
url https://ink.library.smu.edu.sg/sis_research/1537
http://dx.doi.org/10.1007/s10115-009-0228-9