SQL INTERFACE DEVELOPMENT FOR SPATIAL DATA RETRIEVAL ON CASSANDRA

Bibliographic Details
Main Author: Fu, William
Format: Final Project
Language: Indonesian
Online Access: https://digilib.itb.ac.id/gdl/view/65819
Institution: Institut Teknologi Bandung
Description
Summary: Spatial data is becoming increasingly important in information technology, and NoSQL databases are increasingly used as an alternative to relational databases for storing data records. Cassandra is a NoSQL database that can store spatial data natively; however, it lacks spatial operations, and its query language offers limited data retrieval features compared to most relational databases. Pradipta (2020) developed an SQL interface for retrieving spatial data stored in a MongoDB database with the help of the PostGIS extension. In this final project, an SQL interface system is developed, based on the solution by Pradipta (2020), to perform spatial data retrieval on the Cassandra column-oriented database. The system supplies the spatial operations that Cassandra lacks and compensates for the limitations of CQL queries in data retrieval. The SQL interface works by transforming an input SQL query into one or more CQL queries that retrieve data from Cassandra; the retrieved data is then used to fill in the table section of the input SQL query. To take advantage of Cassandra's strength in storing data per column in a block of memory, each constructed CQL query involves only the columns that are needed from a table. To handle large amounts of data, three measures are taken: data is retrieved from Cassandra by streaming; the selection operations in the WHERE clause are analyzed to determine which expressions can be executed on Cassandra; and a PostgreSQL database is used as temporary storage when a CQL query returns a large number of data records. To test its performance, the SQL interface is executed on input queries that vary in their spatial functions, subquery expressions, and number of tables involved, and the execution time of each query is measured for further analysis.
Based on the evaluation results, it is concluded that the SQL interface successfully retrieves spatial data records from the Cassandra database.
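
The query construction described in the summary (projecting only the needed columns and deciding which WHERE expressions can run on Cassandra) can be illustrated with a minimal Python sketch. It is not the implementation from the final project. The names `Predicate`, `split_predicates`, and `build_cql`, the example table and columns, and the `%s` placeholder style are all hypothetical. The rule that only equality tests on partition key columns are pushed down is a simplifying assumption made for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Predicate:
    """One selection expression from the WHERE clause (hypothetical model)."""
    column: str
    op: str      # e.g. "=", "<", or a spatial function name such as "ST_Within"
    value: Any


def split_predicates(
    predicates: Iterable[Predicate], partition_keys: Set[str]
) -> Tuple[List[Predicate], List[Predicate]]:
    """Separate predicates that Cassandra can evaluate from the rest.

    Assumption for this sketch: only equality on a partition key column is
    pushed down; everything else (including spatial functions) is left for
    the SQL side to evaluate after the data has been retrieved.
    """
    pushable: List[Predicate] = []
    residual: List[Predicate] = []
    for p in predicates:
        if p.column in partition_keys and p.op == "=":
            pushable.append(p)
        else:
            residual.append(p)
    return pushable, residual


def build_cql(
    table: str, columns: Sequence[str], pushable: Sequence[Predicate]
) -> Tuple[str, Tuple[Any, ...]]:
    """Build a CQL SELECT that reads only the needed columns."""
    # Drop duplicate column names while keeping their first-seen order.
    needed = list(dict.fromkeys(columns))
    cql = f"SELECT {', '.join(needed)} FROM {table}"
    if pushable:
        cql += " WHERE " + " AND ".join(f"{p.column} = %s" for p in pushable)
    params = tuple(p.value for p in pushable)
    return cql, params


# Hypothetical example: a "places" table partitioned by city_id.
preds = [
    Predicate("city_id", "=", 7),
    Predicate("geom", "ST_Within", "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))"),
]
pushable, residual = split_predicates(preds, {"city_id"})
cql, params = build_cql("places", ["name", "geom", "name"], pushable)
```

In this sketch the spatial predicate stays in `residual`, which mirrors the idea that spatial operations are evaluated outside Cassandra once the rows have been fetched.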