THE STUDY OF DATA MODELING METHODOLOGIES FOR COLUMN-ORIENTED DATABASES

Bibliographic Details
Main Author: Haryo Pandu Prakoso, Raden
Format: Theses
Language: Indonesia
Online Access: https://digilib.itb.ac.id/gdl/view/76858
Institution: Institut Teknologi Bandung
Description
Summary: Over time, research has suggested that applying data modeling to NoSQL databases, as is done for SQL databases, benefits performance. This has led to many studies on data modeling methodologies for NoSQL, including for column-oriented databases. The two column-oriented data modeling methodologies discussed in this research both use EER diagrams as their conceptual models: the Chebotko methodology, which also considers the application queries for the database, and the Poffo methodology, which depends only on the EER diagrams. The purpose of this research is therefore to analyze the differences between the two data modeling methodologies in a column-oriented database.
After both methodologies were studied, a case study was chosen. Logical and physical data models were then built with each methodology; the models built with the Chebotko methodology also took several application queries into account. Finally, both physical data models were implemented on two devices, on which two types of queries were each run five times and their read performance was recorded using a benchmark tool.
The test results show that, for the chosen case study, queries against the physical data models built with the Chebotko methodology retrieve data 2.24 times slower than queries against those built with the Poffo methodology. Further analysis shows that the physical data models built with the Chebotko methodology are prone to producing column families that grow very large compared to those built with the Poffo methodology, because they depend on the application queries defined for the data models.
In conclusion, modeling data in a column-oriented database using the Poffo methodology is generally safer, because the resulting column families retain their intended size and therefore stay small. With the Chebotko methodology, one has to be cautious: it is easy to fall into the trap of making column families too large by requesting too much information in a single query, which degrades performance.