THE DEVELOPMENT OF LIBRARY FOR JOIN OPERATION IN CASSANDRA DATABASE MANAGEMENT SYSTEM
Cassandra is a column-family NoSQL database management system (DBMS) that stores data by rows, similar to a relational database. In relational databases, data is commonly processed using join operations. Cassandra, on the other hand, is not designed for join operations. If there is a need...
Main Author: | Rafi Adyatma, Mohammad |
---|---|
Format: | Final Project |
Language: | Indonesia |
Online Access: | https://digilib.itb.ac.id/gdl/view/69201 |
Institution: | Institut Teknologi Bandung |
id |
id-itb.:69201 |
---|---|
spelling |
id-itb.:69201 2022-09-21T04:41:28Z; keywords: join operation, Cassandra, hybrid hash join, block nested loop join |
institution |
Institut Teknologi Bandung |
building |
Institut Teknologi Bandung Library |
continent |
Asia |
country |
Indonesia |
content_provider |
Institut Teknologi Bandung |
collection |
Digital ITB |
language |
Indonesia |
description |
Cassandra is a column-family NoSQL database management system (DBMS) that stores data by
rows, similar to a relational database. In relational databases, data is commonly processed
using join operations. Cassandra, however, is not designed for join operations. When an
operation requires a join, the usual solution is to denormalize the tables involved, but this
approach cannot be applied once the database is already operational. A flexible join operator
is still required in that case. This research therefore aims to develop a library for
join operations in Cassandra.
To work out how to perform join operations in Cassandra, we begin by studying Cassandra's
internal workings, how data is retrieved from Cassandra, and which join algorithms to
develop. Several join algorithms are considered as alternatives, and those with the best
estimated performance are chosen: the hybrid hash join and the block nested loop join.
A library implementing these algorithms is then developed from a set of functional
requirements and from analysis models in the form of use case diagrams, class
diagrams, sequence diagrams, and so on. The resulting library supports both
inner and outer joins as well as equi-joins and non-equi joins. The technology and
technical matters related to the construction of the join library are also described.
Testing of the library shows that all implemented join algorithms work as intended. The hybrid
hash join algorithm performs well on both small and large amounts of data. The nested loop join
algorithm, on the other hand, performs poorly, especially on large amounts of data. |
format |
Final Project |
author |
Rafi Adyatma, Mohammad |
title |
THE DEVELOPMENT OF LIBRARY FOR JOIN OPERATION IN CASSANDRA DATABASE MANAGEMENT SYSTEM |
url |
https://digilib.itb.ac.id/gdl/view/69201 |
_version_ |
1822005975099375616 |
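
The two join algorithms named in the description can be illustrated with a minimal Python sketch. This is not the thesis library's code or API: the function names, the dictionary-based row format, and the parameters `block_size` and `partitions` are all assumptions made for illustration. Rows are plain in-memory lists rather than results fetched from Cassandra, only inner joins are shown, and the "spilled" partitions of the hybrid hash join are ordinary lists standing in for files on disk.

```python
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Sequence, Tuple

Row = Dict[str, object]   # hypothetical row format: column name -> value
Pair = Tuple[Row, Row]


def block_nested_loop_join(outer: Iterable[Row], inner: Sequence[Row],
                           predicate: Callable[[Row, Row], bool],
                           block_size: int = 2) -> Iterator[Pair]:
    """Inner join with an arbitrary predicate, so non-equi joins also work.

    The outer input is read one block at a time; the inner input is scanned
    once per block, which is why `inner` must be re-iterable (e.g. a list).
    """
    outer_iter = iter(outer)
    while True:
        block = list(islice(outer_iter, block_size))
        if not block:
            return
        for inner_row in inner:
            for outer_row in block:
                if predicate(outer_row, inner_row):
                    yield outer_row, inner_row


def hybrid_hash_join(build: Iterable[Row], probe: Iterable[Row],
                     build_key: str, probe_key: str,
                     partitions: int = 4) -> Iterator[Pair]:
    """Inner equi-join: partition 0 is joined in memory right away,
    the other partitions are set aside and joined pair by pair afterwards."""
    in_memory: Dict[object, List[Row]] = {}
    # Lists stand in for spill files; index 0 is never used.
    spilled_build: List[List[Row]] = [[] for _ in range(partitions)]
    spilled_probe: List[List[Row]] = [[] for _ in range(partitions)]

    for row in build:
        p = hash(row[build_key]) % partitions
        if p == 0:
            in_memory.setdefault(row[build_key], []).append(row)
        else:
            spilled_build[p].append(row)

    for row in probe:
        p = hash(row[probe_key]) % partitions
        if p == 0:
            # Probe the in-memory hash table immediately.
            for match in in_memory.get(row[probe_key], []):
                yield match, row
        else:
            spilled_probe[p].append(row)

    # Join each set-aside partition pair with its own hash table.
    for p in range(1, partitions):
        table: Dict[object, List[Row]] = {}
        for row in spilled_build[p]:
            table.setdefault(row[build_key], []).append(row)
        for row in spilled_probe[p]:
            for match in table.get(row[probe_key], []):
                yield match, row
```

For an equi-join, both functions return the same set of row pairs, for example `hybrid_hash_join(users, orders, "id", "user_id")` and `block_nested_loop_join(users, orders, lambda u, o: u["id"] == o["user_id"])` on hypothetical `users` and `orders` lists. The nested loop variant compares every outer row with every inner row, which is consistent with the description's remark that it performs poorly on large amounts of data, while the hash-based variant touches each row a small, fixed number of times.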