Performance analysis of data replication and scheduling in data grid
The Grid is an infrastructure that enables dynamic sharing and coordinated access of resources among different organizations. As a specialization and extension of the Grid, Data Grid emphasizes the sharing of large-scale data sets and data storage resources. It has evolved to be the solution for...
Main Author: | Zhang, Junwei |
---|---|
Other Authors: | Lee Bu Sung, Francis |
Format: | Theses and Dissertations |
Language: | English |
Published: | 2010 |
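Subjects: | DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems; DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks |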
Online Access: | https://hdl.handle.net/10356/38584 |
Institution: | Nanyang Technological University |
id |
sg-ntu-dr.10356-38584 |
---|---|
record_format |
dspace |
spelling |
sg-ntu-dr.10356-38584 2023-03-04T00:40:24Z Performance analysis of data replication and scheduling in data grid Zhang, Junwei Lee Bu Sung, Francis School of Computer Engineering Parallel and Distributed Computing Centre DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks The Grid is an infrastructure that enables dynamic sharing and coordinated access of resources among different organizations. As a specialization and extension of the Grid, Data Grid emphasizes the sharing of large-scale data sets and data storage resources. It has evolved to be the solution for data-intensive applications in fields such as global climate change, High Energy Physics (HEP), astrophysics, and computational genomics. In these research domains, the size of scientific data is measured in terabytes (1024 gigabytes) or even petabytes (1024 terabytes). Such scientific data are stored as large files and replicated across the Data Grid. Scientists located all over the world are able to download these datasets and analyze them for various purposes. Hierarchical Data Grid is a class of Data Grid that has been adopted by the European Organization for Nuclear Research (CERN) to support the distribution of large experimental datasets across the globe. Much research has been done on replication algorithms for the Hierarchical Data Grid. I have developed a probabilistic model of data replication in a Hierarchical Data Grid environment. The model enables us to evaluate the optimality of the replication algorithm in terms of average response time and average bandwidth cost. The accuracy of the model is verified through simulation. DOCTOR OF PHILOSOPHY (SCE) 2010-05-12T04:35:18Z 2010-05-12T04:35:18Z 2009 2009 Thesis Zhang, J. W. (2009). Performance analysis of data replication and scheduling in data grid. Doctoral thesis, Nanyang Technological University, Singapore. https://hdl.handle.net/10356/38584 10.32657/10356/38584 en 156 p. application/pdf |
institution |
Nanyang Technological University |
building |
NTU Library |
continent |
Asia |
country |
Singapore |
content_provider |
NTU Library |
collection |
DR-NTU |
language |
English |
topic |
DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks |
spellingShingle |
DRNTU::Engineering::Computer science and engineering::Computer systems organization::Performance of systems DRNTU::Engineering::Computer science and engineering::Computer systems organization::Computer-communication networks Zhang, Junwei Performance analysis of data replication and scheduling in data grid |
description |
The Grid is an infrastructure that enables dynamic sharing and coordinated access of resources among different organizations. As a specialization and extension of the Grid, Data Grid emphasizes the sharing of large-scale data sets and data storage resources. It has evolved to be the solution for data-intensive applications in fields such as global climate change, High Energy Physics (HEP), astrophysics, and computational genomics. In these research domains, the size of scientific data is measured in terabytes (1024 gigabytes) or even petabytes (1024 terabytes). Such scientific data are stored as large files and replicated across the Data Grid. Scientists located all over the world are able to download these datasets and analyze them for various purposes. Hierarchical Data Grid is a class of Data Grid that has been adopted by the European Organization for Nuclear Research (CERN) to support the distribution of large experimental datasets across the globe. Much research has been done on replication algorithms for the Hierarchical Data Grid. I have developed a probabilistic model of data replication in a Hierarchical Data Grid environment. The model enables us to evaluate the optimality of the replication algorithm in terms of average response time and average bandwidth cost. The accuracy of the model is verified through simulation. |
author2 |
Lee Bu Sung, Francis |
author_facet |
Lee Bu Sung, Francis Zhang, Junwei |
format |
Theses and Dissertations |
author |
Zhang, Junwei |
author_sort |
Zhang, Junwei |
title |
Performance analysis of data replication and scheduling in data grid |
title_short |
Performance analysis of data replication and scheduling in data grid |
title_full |
Performance analysis of data replication and scheduling in data grid |
title_fullStr |
Performance analysis of data replication and scheduling in data grid |
title_full_unstemmed |
Performance analysis of data replication and scheduling in data grid |
title_sort |
performance analysis of data replication and scheduling in data grid |
publishDate |
2010 |
url |
https://hdl.handle.net/10356/38584 |
_version_ |
1759854082326003712 |