Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis
Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases, to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and s...
Main Authors: | CHEN, Chen; YAN, Xifeng; ZHU, Feida; HAN, Jiawei; YU, Philip S. |
---|---|
Format: | text |
Language: | English |
Published: | Institutional Knowledge at Singapore Management University, 2009 |
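Subjects: | Multi-dimensional model; Graph OLAP; Efficient computation; Discovery-driven analysis; Databases and Information Systems |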
Online Access: | https://ink.library.smu.edu.sg/sis_research/1537 http://dx.doi.org/10.1007/s10115-009-0228-9 |
Institution: | Singapore Management University |
id |
sg-smu-ink.sis_research-2536 |
---|---|
record_format |
dspace |
spelling |
sg-smu-ink.sis_research-2536; 2012-08-08T09:12:06Z; Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis; CHEN, Chen; YAN, Xifeng; ZHU, Feida; HAN, Jiawei; YU, Philip S.; Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases, to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and showing their critical importance in many applications, such as the analysis of XML, social networks, Web, biological data, multimedia data and spatiotemporal data. Can we extend useful functions of databases and data warehouse systems to handle graph structured data? In particular, OLAP (On-Line Analytical Processing) has been a popular tool for fast and user-friendly multi-dimensional analysis of data warehouses. Can we OLAP graphs? Unfortunately, to our best knowledge, there are no OLAP tools available that can interactively view and analyze graph data from different perspectives and with multiple granularities. In this paper, we argue that it is critically important to OLAP graph structured data and propose a novel Graph OLAP framework. According to this framework, given a graph dataset with its nodes and edges associated with respective attributes, a multi-dimensional model can be built to enable efficient on-line analytical processing so that any portions of the graphs can be generalized/specialized dynamically, offering multiple, versatile views of the data. The contributions of this work are three-fold. First, starting from basic definitions, i.e., what are dimensions and measures in the Graph OLAP scenario, we develop a conceptual framework for data cubes on graphs. We also look into different semantics of OLAP operations, and classify the framework into two major subcases: informational OLAP and topological OLAP. 
Second, we show how a graph cube can be materialized by calculating a special kind of measure called aggregated graph and how to implement it efficiently. This includes both full materialization and partial materialization where constraints are enforced to obtain an iceberg cube. As we can see, due to the increased structural complexity of data, aggregated graphs that depend on the underlying “network” properties of the graph dataset are much harder to compute than their traditional OLAP counterparts. Third, to provide more flexible, interesting and informative OLAP of graphs, we further propose a discovery-driven multi-dimensional analysis model to ensure that OLAP is performed in an intelligent manner, guided by expert rules and knowledge discovery processes. We outline such a framework and discuss some challenging research issues for discovery-driven Graph OLAP.; 2009-10-01T07:00:00Z; text; https://ink.library.smu.edu.sg/sis_research/1537; info:doi/10.1007/s10115-009-0228-9; http://dx.doi.org/10.1007/s10115-009-0228-9; Research Collection School Of Computing and Information Systems; eng; Institutional Knowledge at Singapore Management University; Multi-dimensional model; Graph OLAP; Efficient computation; Discovery-driven analysis; Databases and Information Systems |
institution |
Singapore Management University |
building |
SMU Libraries |
continent |
Asia |
country |
Singapore |
content_provider |
SMU Libraries |
collection |
InK@SMU |
language |
English |
topic |
Multi-dimensional model; Graph OLAP; Efficient computation; Discovery-driven analysis; Databases and Information Systems |
spellingShingle |
Multi-dimensional model; Graph OLAP; Efficient computation; Discovery-driven analysis; Databases and Information Systems; CHEN, Chen; YAN, Xifeng; ZHU, Feida; HAN, Jiawei; YU, Philip S.; Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
description |
Databases and data warehouse systems have been evolving from handling normalized spreadsheets stored in relational databases, to managing and analyzing diverse application-oriented data with complex interconnecting structures. Responding to this emerging trend, graphs have been growing rapidly and showing their critical importance in many applications, such as the analysis of XML, social networks, Web, biological data, multimedia data and spatiotemporal data. Can we extend useful functions of databases and data warehouse systems to handle graph structured data? In particular, OLAP (On-Line Analytical Processing) has been a popular tool for fast and user-friendly multi-dimensional analysis of data warehouses. Can we OLAP graphs? Unfortunately, to our best knowledge, there are no OLAP tools available that can interactively view and analyze graph data from different perspectives and with multiple granularities. In this paper, we argue that it is critically important to OLAP graph structured data and propose a novel Graph OLAP framework. According to this framework, given a graph dataset with its nodes and edges associated with respective attributes, a multi-dimensional model can be built to enable efficient on-line analytical processing so that any portions of the graphs can be generalized/specialized dynamically, offering multiple, versatile views of the data. The contributions of this work are three-fold. First, starting from basic definitions, i.e., what are dimensions and measures in the Graph OLAP scenario, we develop a conceptual framework for data cubes on graphs. We also look into different semantics of OLAP operations, and classify the framework into two major subcases: informational OLAP and topological OLAP. Second, we show how a graph cube can be materialized by calculating a special kind of measure called aggregated graph and how to implement it efficiently. 
This includes both full materialization and partial materialization where constraints are enforced to obtain an iceberg cube. As we can see, due to the increased structural complexity of data, aggregated graphs that depend on the underlying “network” properties of the graph dataset are much harder to compute than their traditional OLAP counterparts. Third, to provide more flexible, interesting and informative OLAP of graphs, we further propose a discovery-driven multi-dimensional analysis model to ensure that OLAP is performed in an intelligent manner, guided by expert rules and knowledge discovery processes. We outline such a framework and discuss some challenging research issues for discovery-driven Graph OLAP. |
format |
text |
author |
CHEN, Chen; YAN, Xifeng; ZHU, Feida; HAN, Jiawei; YU, Philip S. |
author_facet |
CHEN, Chen; YAN, Xifeng; ZHU, Feida; HAN, Jiawei; YU, Philip S. |
author_sort |
CHEN, Chen |
title |
Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
title_short |
Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
title_full |
Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
title_fullStr |
Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
title_full_unstemmed |
Graph OLAP: A Multi-dimensional Framework for Graph Data Analysis |
title_sort |
graph olap: a multi-dimensional framework for graph data analysis |
publisher |
Institutional Knowledge at Singapore Management University |
publishDate |
2009 |
url |
https://ink.library.smu.edu.sg/sis_research/1537 http://dx.doi.org/10.1007/s10115-009-0228-9 |
_version_ |
1770571261032792064 |