Comparative study for load management of HBase and Cassandra distributed databases in big data

The advancement in cloud computing, the increasing size of databases and the emergence of big data have made traditional data management systems an insufficient solution for storing and managing such large-scale data. Therefore, there has been an emergence of new mechan...

Bibliographic Details
Main Authors: Al-Dailamy, Ali Y., Muhammed, Abdullah, Ismail, Waidah, Radman, Abduljalil
Format: Article
Language: English
Published: Science Publishing Corporation 2018
Online Access: http://psasir.upm.edu.my/id/eprint/73457/1/DATA.pdf
http://psasir.upm.edu.my/id/eprint/73457/
https://www.sciencepubco.com/index.php/ijet/article/view/23715
Institution: Universiti Putra Malaysia
id my.upm.eprints.73457
record_format eprints
citation Al-Dailamy, Ali Y. and Muhammed, Abdullah and Ismail, Waidah and Radman, Abduljalil (2018) Comparative study for load management of HBase and Cassandra distributed databases in big data. International Journal of Engineering and Technology (UAE), 7 (4 spec. 31). art. no. 23715. pp. 375-380. ISSN 2227-524X. https://www.sciencepubco.com/index.php/ijet/article/view/23715 doi:10.14419/ijet.v7i4.31.23715
institution Universiti Putra Malaysia
building UPM Library
collection Institutional Repository
continent Asia
country Malaysia
content_provider Universiti Putra Malaysia
content_source UPM Institutional Repository
url_provider http://psasir.upm.edu.my/
description The advancement in cloud computing, the increasing size of databases and the emergence of big data have made traditional data management systems an insufficient solution for storing and managing such large-scale data. Therefore, new data storage mechanisms that can handle large-scale data have emerged. NoSQL databases are used to store and manage large amounts of data. They are intended to be open source, distributed and horizontally scalable in order to provide high performance. Scalability is one of the important features of such systems: by increasing the number of nodes, more requests can be served per unit of time. Distribution and scalability are always accompanied by load management, which balances the work among multiple nodes. Load management efficiency varies from one system to another according to the load balancing technique used. In this study, the load management of HBase and Cassandra under scaling is evaluated, as they are the most popular NoSQL databases modeled on Bigtable. In particular, this paper compares and analyzes the load management for the distributed performance of HBase and Cassandra using a standard benchmark tool, the Yahoo! Cloud Serving Benchmark (YCSB). The experiments measure the performance of database operations with different numbers of connections, using different numbers of operations, database sizes, and processing nodes. The experimental results showed that HBase can provide better performance as the number of connections increases in the presence of horizontal scalability.