LLM-based column lineage for relational databases
Main Author: | |
---|---|
Other Authors: | |
Format: | Final Year Project |
Language: | English |
Published: | Nanyang Technological University, 2024 |
Subjects: | |
Online Access: | https://hdl.handle.net/10356/181146 |
Institution: | Nanyang Technological University |
Summary: | This research paper explores a novel approach to deriving column lineage in relational databases using large language models (LLMs). Column lineage, or column-level lineage, tracks the flow of data for each column across tables, from ingestion to visualization. Traditional methods for determining column lineage rely heavily on SQL parsers, which are often inflexible. Consequently, existing column lineage tools are difficult to generalize and expensive to maintain. This project seeks to overcome these limitations by investigating the potential of LLMs as an alternative to conventional SQL parsers. |