When learned indexes meet LSM-tree based systems: an empirical evaluation

Bibliographic Details
Main Author: Chen, Mengshi
Other Authors: Luo, Siqiang
Format: Thesis-Master by Research
Language: English
Published: Nanyang Technological University 2024
Online Access: https://hdl.handle.net/10356/178492
Institution: Nanyang Technological University
Description
Summary: Learned indexes have demonstrated significant performance gains by locating keys through computation rather than through the comparisons that traditional indexes rely on. Recent studies have shed light on the benefits of embedding learned indexes into LSM-tree-based storage systems, using a case study of a simple baseline learned index. Nevertheless, it remains uncertain whether such systems can capitalize on recent advances in learned index research. In this work, we comprehensively explore how integrating advanced learned indexes affects the performance of LSM-tree-based storage systems (LSM systems for short). We evaluate nine representative learned indexes on a full key-value system. To our surprise, we find that index structures that have received significant attention in learned index research may have critical limitations in LSM systems. By contrast, those with lightweight structures and simpler training processes show significant strengths. Through our empirical evaluation, we aim to pinpoint the most effective and practical learned index models for LSM systems and to explain the reasons behind their effectiveness. Our findings offer insights into the potential of learned indexes to enhance LSM systems and can guide future research on optimizing database systems through informed choices of learned index implementations.