THE DEVELOPMENT OF LIBRARY FOR JOIN OPERATION IN CASSANDRA DATABASE MANAGEMENT SYSTEM

Cassandra is a column-family NoSQL database management system (DBMS) that stores data in rows, much as a relational database does. In relational databases, it is common to process data using join operations. Cassandra, on the other hand, is not designed for join operations. If there is a need...

Bibliographic Details
Main Author: Rafi Adyatma, Mohammad
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/69201
Institution: Institut Teknologi Bandung
id id-itb.:69201
spelling id-itb.:69201 2022-09-21T04:41:28Z
keywords join operation, Cassandra, hybrid hash join, block nested loop join
institution Institut Teknologi Bandung
building Institut Teknologi Bandung Library
continent Asia
country Indonesia
content_provider Institut Teknologi Bandung
collection Digital ITB
language Indonesia
description Cassandra is a column-family NoSQL database management system (DBMS) that stores data in rows, much as a relational database does. In relational databases, it is common to process data using join operations. Cassandra, on the other hand, is not designed for join operations. When an operation requires a join, the usual solution is to denormalize the tables involved, but this approach cannot be applied once the database is already operational. A flexible join operator is still required in that case. This research therefore aims to develop a library for join operations in Cassandra. To work out how to perform a join in Cassandra, we begin by studying how Cassandra works internally and how data is retrieved from it, and then determine which join algorithms to develop. Various join algorithms can be considered as alternatives; from these, the algorithms with the best estimated performance are chosen, namely the hybrid hash join and the block nested loop join. A library that implements these algorithms is then developed from a set of functional requirements and from designs expressed as analysis models, including use case diagrams, class diagrams, and sequence diagrams. The resulting library supports both inner and outer joins, as well as equi-joins and non-equi joins. The technology and technical matters related to the construction of the join library are also described. Testing of the library shows that all implemented join algorithms work correctly. The hybrid hash join algorithm works properly on both small and large amounts of data. The block nested loop join algorithm, on the other hand, performs poorly, especially when handling large amounts of data.
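To make the two algorithm families named in the description more concrete, the following is a minimal Python sketch, not the thesis library itself. It assumes rows have already been fetched from Cassandra into plain dictionaries; the function names, the `users` and `orders` sample rows, and the `block_size` parameter are all hypothetical. The first function is a plain in-memory hash join, not the hybrid variant, which would additionally partition both inputs and spill partitions that do not fit in memory.

```python
from collections import defaultdict
from itertools import islice


def hash_join(left, right, left_key, right_key):
    """Inner equi-join: build a hash table on `left`, then probe it with `right`."""
    table = defaultdict(list)
    for row in left:
        table[row[left_key]].append(row)
    for row in right:
        # .get() avoids creating empty buckets for keys that have no match.
        for match in table.get(row[right_key], []):
            yield {**match, **row}


def blocks(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while block := list(islice(it, size)):
        yield block


def block_nested_loop_join(left, right, predicate, block_size=2):
    """Inner join on an arbitrary predicate, so non-equi joins work too.

    `right` must be re-iterable (a list here): it is scanned once per block
    of `left`, rather than once per row of `left`.
    """
    for block in blocks(left, block_size):
        for r in right:
            for l in block:
                if predicate(l, r):
                    yield {**l, **r}


# Hypothetical sample rows standing in for two Cassandra tables.
users = [
    {"user_id": 1, "name": "Ana"},
    {"user_id": 2, "name": "Budi"},
    {"user_id": 3, "name": "Citra"},
]
orders = [
    {"order_id": 10, "buyer_id": 1},
    {"order_id": 11, "buyer_id": 3},
    {"order_id": 12, "buyer_id": 1},
]

equi = list(hash_join(users, orders, "user_id", "buyer_id"))
non_equi = list(
    block_nested_loop_join(users, orders, lambda u, o: u["user_id"] < o["buyer_id"])
)
```

The hash join touches each row once on build and once on probe, while the block nested loop join compares every pair of rows. That difference is consistent with the description's observation that the nested loop approach performs poorly on large amounts of data, although the sketch says nothing about the actual measurements.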