Measuring the Performance of An Object-Based Multi-Cloud Data Lake

As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential of ve...

Bibliographic Details
Main Authors: Saavedra, Miguel Zenon Nicanor L, Yu, William Emmanuel S
Format: text
Published: Archīum Ateneo 2023
Online Access: https://archium.ateneo.edu/discs-faculty-pubs/388
https://doi.org/10.1007/978-981-99-3243-6_4
Institution: Ateneo De Manila University
id ph-ateneo-arc.discs-faculty-pubs-1388
record_format eprints
spelling ph-ateneo-arc.discs-faculty-pubs-1388 2024-02-21T02:49:28Z Measuring the Performance of An Object-Based Multi-Cloud Data Lake Saavedra, Miguel Zenon Nicanor L Yu, William Emmanuel S As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential of vendor lock-in lead more users to adopt the multi-cloud approach. This study investigates the viability of this approach in data lake use cases. Results show that a multi-cloud data lake can potentially be implemented with less than 1% performance impact to query run times at the cost of a 300% increase in one-time loading. This opens the door for future work on more algorithms and implementations that leverage multi-cloud deployments to enhance availability, scalability, and cost optimization. 2023-01-01T08:00:00Z text https://archium.ateneo.edu/discs-faculty-pubs/388 https://doi.org/10.1007/978-981-99-3243-6_4 Department of Information Systems & Computer Science Faculty Publications Archīum Ateneo Big data Cloud Data analytics Data lake Computer Engineering Data Storage Systems Engineering
institution Ateneo De Manila University
building Ateneo De Manila University Library
continent Asia
country Philippines
content_provider Ateneo De Manila University Library
collection archium.Ateneo Institutional Repository
topic Big data
Cloud
Data analytics
Data lake
Computer Engineering
Data Storage Systems
Engineering
description As the amount of data generated by society continues to become less structured and larger in size, more and more organizations are implementing data lakes in the public cloud to store, process, and analyze this data. However, concerns over the availability of this data as well as the potential of vendor lock-in lead more users to adopt the multi-cloud approach. This study investigates the viability of this approach in data lake use cases. Results show that a multi-cloud data lake can potentially be implemented with less than 1% performance impact to query run times at the cost of a 300% increase in one-time loading. This opens the door for future work on more algorithms and implementations that leverage multi-cloud deployments to enhance availability, scalability, and cost optimization.
format text
author Saavedra, Miguel Zenon Nicanor L
Yu, William Emmanuel S
title Measuring the Performance of An Object-Based Multi-Cloud Data Lake
title_sort measuring the performance of an object-based multi-cloud data lake
publisher Archīum Ateneo
publishDate 2023
url https://archium.ateneo.edu/discs-faculty-pubs/388
https://doi.org/10.1007/978-981-99-3243-6_4
_version_ 1792202616487280640