Query cost estimation in DBMS with deep learning

Cost and cardinality estimation is considered the Achilles' heel of modern query optimizers. Poor cardinality estimates lead to bad cost estimates, which cause sub-optimal query execution plans to be selected and degrade the performance of query optimizers. With the recent rise of ML for DB, the d...

Bibliographic Details
Main Author: Acharya, Atul
Other Authors: Luo Siqiang
Format: Final Year Project
Language: English
Published: Nanyang Technological University 2023
Subjects: Engineering::Computer science and engineering
Online Access: https://hdl.handle.net/10356/166095
Institution: Nanyang Technological University
id sg-ntu-dr.10356-166095
record_format dspace
spelling sg-ntu-dr.10356-166095 2023-04-21T15:38:33Z Query cost estimation in DBMS with deep learning Acharya, Atul Luo Siqiang School of Computer Science and Engineering siqiang.luo@ntu.edu.sg Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2023-04-21T06:12:10Z 2023-04-21T06:12:10Z 2023 Final Year Project (FYP) Acharya, A. (2023). Query cost estimation in DBMS with deep learning. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/166095 https://hdl.handle.net/10356/166095 en application/pdf Nanyang Technological University
institution Nanyang Technological University
building NTU Library
continent Asia
country Singapore
content_provider NTU Library
collection DR-NTU
language English
topic Engineering::Computer science and engineering
description Cost and cardinality estimation is considered the Achilles' heel of modern query optimizers. Poor cardinality estimates lead to bad cost estimates, which cause sub-optimal query execution plans to be selected and degrade the performance of query optimizers. With the recent rise of ML for DB, the database community has explored the use of learned methods in cost and cardinality estimation. However, none of the methods to date can achieve the prediction speeds required for modern database systems. In this project, we introduce a novel algorithm (TreeGBM) that uses Gradient Boosting Trees to solve both cost estimation and cardinality estimation on numeric JOB workloads based on the IMDB dataset. We conducted multiple experiments to improve prediction scores and inference times. Our experiments showed that TreeGBM was ∼120 times faster than state-of-the-art learned methods while maintaining good prediction scores. We also state possible improvements to our method that could help improve prediction scores and inference times. Future work can build on the algorithm by using a new predicate embedding algorithm that does not incur much latency and by using prefix tries to encode string values.
author2 Luo Siqiang
format Final Year Project
author Acharya, Atul
title Query cost estimation in DBMS with deep learning
publisher Nanyang Technological University
publishDate 2023
url https://hdl.handle.net/10356/166095