Benchmarking spatial-vector queries

Bibliographic Details
Main Author: Wong, Scott Wen Jie
Other Authors: Gao Cong
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/181533
Institution: Nanyang Technological University
Description
Summary: Data volumes have grown exponentially, spurred by technological advancements such as the ready availability of smartphones, which has increased global connectivity and access to digital applications. This connectivity has in turn increased the creation of spatial data: data that provides geospatial information that can be used to improve our lives. At the same time, new methods transform unstructured data such as text, images and audio into structured data in the form of vectors. These vector embeddings carry semantic meaning, capturing the relationships and context of the data. Storing such high-dimensional vectors requires a database designed for them, something traditional relational databases are not well suited for. We therefore need to analyse how vector databases work, in order to understand how traditional databases can be improved to be on par with vector databases in storing and managing such data. In this report, we provide an overview of how vector databases work, focusing on their indexing and querying techniques. We also design and execute queries that use different data modalities, evaluating the performance of both vector databases and traditional relational database systems that have been enhanced for vector processing. By comparing the systems, we can understand why some perform better than others on specific queries and identify their strengths and limitations. Finally, we conclude on the effectiveness of each database system against the challenges posed by modern data requirements.