PERFORMANCE ANALYSIS OF COCKROACHDB AND CITUSDB DISTRIBUTED DATABASE SYSTEM

Bibliographic Details
Main Author: Axel Candiasa, Denilsen
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/73560
Institution: Institut Teknologi Bandung
Description
Summary: Distributed processing is an effective way to improve system performance, and distributed relational database management systems (DBMSs) are one application of it. CockroachDB and CitusDB are two such systems, and they differ in architecture. CockroachDB is designed to prioritize data consistency and availability, whereas CitusDB is an extension that adds distributed processing to PostgreSQL. They also use different storage structures: CockroachDB uses an LSM tree, while CitusDB uses a B-tree. These structures differ in read and write performance; the B-tree generally reads faster and the LSM tree generally writes faster. This research benchmarked both systems and analyzed their architectures to determine how architectural implementation and storage structure choice affect the performance of distributed relational DBMSs. The benchmark scenarios fell into three categories: an OLTP workload, a read-intensive workload, and a write-intensive workload. The results indicate that storage structure influences performance, particularly for reads and writes, but that architecture has the greater influence. The architecture analysis reveals several overheads, including trade-offs between consistency and throughput and between availability and throughput. The analysis of the storage structure implementations shows that the read/write trade-off between the B-tree and the LSM tree can be mitigated by optimizing the implementation of the structure in question. The choice of storage structure also influences how the architecture is implemented in a distributed relational DBMS.
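
The read/write trade-off described in the summary can be illustrated with a toy LSM tree. The sketch below is an assumption-laden illustration only: the class `ToyLSM`, its `memtable_limit` parameter, and the probe counter are hypothetical names introduced here, and the code does not reflect the storage engine of CockroachDB, CitusDB, or the benchmarks in the thesis. It shows why writes are cheap (they only touch an in-memory table that is occasionally flushed as a sorted run) and why reads may be more expensive (a lookup may have to probe several runs until compaction merges them). A B-tree, by contrast, updates one sorted structure in place, so a lookup searches a single structure.

```python
import bisect


class ToyLSM:
    """Toy LSM tree (hypothetical, for illustration only).

    Writes go to an in-memory memtable. When the memtable is full it is
    flushed as an immutable sorted run. Reads check the memtable first,
    then the runs from newest to oldest.
    """

    def __init__(self, memtable_limit=4):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # oldest run first; each run is a sorted list of (key, value)

    def put(self, key, value):
        # Write path: a single in-memory update, plus an occasional flush.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.runs.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        """Return (value, probes), where probes counts structures searched."""
        probes = 1
        if key in self.memtable:
            return self.memtable[key], probes
        # Read path: the newest run wins, so search runs in reverse order.
        for run in reversed(self.runs):
            probes += 1
            keys = [k for k, _ in run]  # rebuilt per lookup; acceptable in a toy
            i = bisect.bisect_left(keys, key)
            if i < len(run) and run[i][0] == key:
                return run[i][1], probes
        return None, probes

    def compact(self):
        # Merge all runs into one; newer runs overwrite older values.
        merged = {}
        for run in self.runs:
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []


lsm = ToyLSM(memtable_limit=2)
for i in range(6):
    lsm.put(i, f"v{i}")

value_before, probes_before = lsm.get(0)  # oldest key: several runs are probed
lsm.compact()
value_after, probes_after = lsm.get(0)    # after compaction: a single run remains
```

Comparing `probes_before` with `probes_after` gives a rough sense of how optimizing the implementation (here, compaction) can narrow the read penalty of an LSM tree, in line with the summary's claim that the trade-off can be addressed at the implementation level.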