LLM-based column lineage for relational databases
This research paper explores a novel approach to deriving column lineage of relational databases, by making use of large language models (LLMs). Column lineage, or column-level lineage, tracks the flow of data for each column across tables, from ingestion to visualization. Traditional methods for de...
Main Author: | Tan, Yu Ling |
---|---|
Other Authors: | Long Cheng |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | Computer and Information Science; LLM; Database; Column lineage |
Online Access: | https://hdl.handle.net/10356/181146 |
Institution: | Nanyang Technological University |
id | sg-ntu-dr.10356-181146 |
---|---|
record_format | dspace |
spelling | sg-ntu-dr.10356-181146; 2024-11-18T00:19:23Z; LLM-based column lineage for relational databases; Tan, Yu Ling; Long Cheng; College of Computing and Data Science; c.long@ntu.edu.sg; Computer and Information Science; LLM; Database; Column lineage; This research paper explores a novel approach to deriving column lineage of relational databases, by making use of large language models (LLMs). Column lineage, or column-level lineage, tracks the flow of data for each column across tables, from ingestion to visualization. Traditional methods for determining column lineage rely heavily on SQL parsers, which are often rigid and inflexible. Consequently, existing tools for column lineage are difficult to generalize and expensive to maintain. This project seeks to overcome these limitations by investigating the potential of using LLMs as an alternative to conventional SQL parsers.; Bachelor's degree; 2024-11-18T00:19:23Z; 2024-11-18T00:19:23Z; 2024; Final Year Project (FYP); Tan, Y. L. (2024). LLM-based column lineage for relational databases. Final Year Project (FYP), Nanyang Technological University, Singapore. https://hdl.handle.net/10356/181146; https://hdl.handle.net/10356/181146; en; SCSE23-0652; application/pdf; Nanyang Technological University |
institution | Nanyang Technological University |
building | NTU Library |
continent | Asia |
country | Singapore |
content_provider | NTU Library |
collection | DR-NTU |
language | English |
topic | Computer and Information Science; LLM; Database; Column lineage |
description | This research paper explores a novel approach to deriving column lineage of relational databases, by making use of large language models (LLMs). Column lineage, or column-level lineage, tracks the flow of data for each column across tables, from ingestion to visualization. Traditional methods for determining column lineage rely heavily on SQL parsers, which are often rigid and inflexible. Consequently, existing tools for column lineage are difficult to generalize and expensive to maintain. This project seeks to overcome these limitations by investigating the potential of using LLMs as an alternative to conventional SQL parsers. |
author2 | Long Cheng |
format | Final Year Project |
author | Tan, Yu Ling |
title | LLM-based column lineage for relational databases |
publisher | Nanyang Technological University |
publishDate | 2024 |
url | https://hdl.handle.net/10356/181146 |