Measuring the Performance of An Object-Based Multi-Cloud Data Lake
As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential of ve...
Main Authors: | Saavedra, Miguel Zenon Nicanor L; Yu, William Emmanuel S |
---|---|
Format: | text |
Published: | Archīum Ateneo 2023 |
Subjects: | |
Online Access: | https://archium.ateneo.edu/discs-faculty-pubs/388 https://doi.org/10.1007/978-981-99-3243-6_4 |
Institution: | Ateneo De Manila University |
id |
ph-ateneo-arc.discs-faculty-pubs-1388 |
---|---|
record_format |
eprints |
spelling |
ph-ateneo-arc.discs-faculty-pubs-13882024-02-21T02:49:28Z Measuring the Performance of An Object-Based Multi-Cloud Data Lake Saavedra, Miguel Zenon Nicanor L Yu, William Emmanuel S As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential for vendor lock-in lead more users to adopt the multi-cloud approach. This study investigates the viability of this approach in data lake use cases. Results show that a multi-cloud data lake can potentially be implemented with less than 1% performance impact on query run times at the cost of a 300% increase in one-time loading. This opens the door for future work on more algorithms and implementations that leverage multi-cloud deployments to enhance availability, scalability, and cost optimization. 2023-01-01T08:00:00Z text https://archium.ateneo.edu/discs-faculty-pubs/388 https://doi.org/10.1007/978-981-99-3243-6_4 Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo Big data Cloud Data analytics Data lake Computer Engineering Data Storage Systems Engineering |
institution |
Ateneo De Manila University |
building |
Ateneo De Manila University Library |
continent |
Asia |
country |
Philippines |
content_provider |
Ateneo De Manila University Library |
collection |
archium.Ateneo Institutional Repository |
topic |
Big data Cloud Data analytics Data lake Computer Engineering Data Storage Systems Engineering |
spellingShingle |
Big data Cloud Data analytics Data lake Computer Engineering Data Storage Systems Engineering Saavedra, Miguel Zenon Nicanor L Yu, William Emmanuel S Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
description |
As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential for vendor lock-in lead more users to adopt the multi-cloud approach. This study investigates the viability of this approach in data lake use cases. Results show that a multi-cloud data lake can potentially be implemented with less than 1% performance impact on query run times at the cost of a 300% increase in one-time loading. This opens the door for future work on more algorithms and implementations that leverage multi-cloud deployments to enhance availability, scalability, and cost optimization. |
format |
text |
author |
Saavedra, Miguel Zenon Nicanor L Yu, William Emmanuel S |
author_facet |
Saavedra, Miguel Zenon Nicanor L Yu, William Emmanuel S |
author_sort |
Saavedra, Miguel Zenon Nicanor L |
title |
Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
title_short |
Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
title_full |
Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
title_fullStr |
Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
title_full_unstemmed |
Measuring the Performance of An Object-Based Multi-Cloud Data Lake |
title_sort |
measuring the performance of an object-based multi-cloud data lake |
publisher |
Archīum Ateneo |
publishDate |
2023 |
url |
https://archium.ateneo.edu/discs-faculty-pubs/388 https://doi.org/10.1007/978-981-99-3243-6_4 |
_version_ |
1792202616487280640 |