Skip to content

SQL DDL is under-modeled: CREATE TABLE/VIEW become generic Variable nodes with no lineage #574

Description

@alexisperinger-ux

What problem does this solve?

SQL DDL is under-modeled. CREATE TABLE / CREATE VIEW are extracted but labeled generic Variable, CREATE MATERIALIZED VIEW is dropped, and there is no table-to-table lineage. Every CREATE collapses to an untyped Variable node with zero lineage edges, so the graph cannot answer basic questions:

  • "What tables/views does this view read from?"
  • "What breaks downstream if I change this table?"

Public test bed: lerocha/chinook-database (pure CREATE TABLE / CREATE VIEW DDL).

Proposed solution

  • Label create_table as Table, and create_view / create_materialized_view as View.
  • Emit FROM/JOIN relation references as USAGE lineage edges, resolved through the existing definition registry, so a view links to the tables/views it selects from.
  • Zero schema change: freeform node labels + the existing USAGE edge type. Only relations that resolve against the registry emit an edge, so DML-only files fabricate nothing.

Scope: DDL labeling + FROM/JOIN lineage only.

Alternatives considered

  • A new REFERENCES edge type instead of reusing USAGE: avoided to add no edge type; USAGE already means "X refers to Y".
  • Modeling columns as Field nodes: deferred to keep this small.

Confirmations

  • I searched existing issues and this is not a duplicate.

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions