What problem does this solve?
dbt .sql models are Jinja-templated ({{ ref() }}, {{ source() }}, {% macro %}), which the SQL grammar cannot parse, and dbt's manifest only exists after dbt compile. So in a raw checkout, models and their dependencies are invisible: a model file is only a generic Module, and a referenced model like stg_users is not even a node, so ref() lineage cannot form.
Public test bed: dbt-labs/jaffle_shop (index the raw models/ without compiling).
Proposed solution
Run an additive tree-sitter-jinja2 pass on dbt-templated .sql (files containing {{ / {%):
- {{ ref('m') }} / {{ source('s','t') }} become USAGE lineage edges.
- A dbt model (a .sql file with no macro defs) becomes a Model node keyed by file stem, so cross-file {{ ref('that_model') }} resolves into model-to-model lineage.
- {% macro name(...) %} becomes a Macro.
Zero schema change (freeform labels + existing USAGE edge). Model is emitted only on the .sql path, so a plain .jinja / .j2 template is not treated as a model. These source-level Model nodes coexist with the manifest path (#576) without conflict.
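A minimal sketch of the per-file mapping described above, for illustration only. It uses regular expressions as a stand-in for the tree-sitter-jinja2 pass, so it is not the proposed parser. The names is_dbt_templated and index_sql_file, the tuple shapes for nodes and edges, and the "source.table" key for source() targets are all hypothetical. It handles only the single-argument form of ref().

```python
import re
from pathlib import PurePosixPath

# Regex stand-ins for the AST pass (assumption: the real pass reads these
# calls from the tree-sitter-jinja2 parse tree, not from regexes).
REF_RE = re.compile(r"\{\{-?\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*-?\}\}")
SOURCE_RE = re.compile(
    r"\{\{-?\s*source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)\s*-?\}\}"
)
MACRO_DEF_RE = re.compile(r"\{%-?\s*macro\s+\w+")


def is_dbt_templated(path: str, text: str) -> bool:
    """Only .sql files that contain Jinja delimiters get the extra pass."""
    return path.endswith(".sql") and ("{{" in text or "{%" in text)


def index_sql_file(path: str, text: str):
    """Return (nodes, edges) for one file; both shapes are hypothetical."""
    nodes, edges = [], []
    if not is_dbt_templated(path, text):
        # Plain .jinja / .j2 templates never produce a Model node.
        return nodes, edges
    stem = PurePosixPath(path).stem
    if not MACRO_DEF_RE.search(text):
        # A .sql file with no macro definitions is a model keyed by file stem.
        nodes.append(("Model", stem))
    for m in REF_RE.finditer(text):
        edges.append((stem, "USAGE", m.group(1)))
    for m in SOURCE_RE.finditer(text):
        # Hypothetical target key: "<source>.<table>".
        edges.append((stem, "USAGE", f"{m.group(1)}.{m.group(2)}"))
    return nodes, edges
```

Because the Model node is keyed by file stem, a USAGE edge whose target is stg_users lines up with the node emitted for models/stg_users.sql once both files have been indexed.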
Caveat: the vendored tree-sitter-jinja2 grammar models only {{ }} expressions (so ref() / source() are parsed from the AST). It has no rule for {% %} statements, so {% macro %} names are recovered with a small text scan until a statement-aware grammar is vendored.
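The text scan for macro names could look roughly like the sketch below. The pattern and the function name scan_macro_names are assumptions, not the actual implementation. A scan like this cannot tell a real definition from one that appears inside a Jinja comment or a SQL string, which is one reason a statement-aware grammar is the better long-term fix.

```python
import re

# Matches "{% macro name(" and the whitespace-control form "{%- macro name(".
# "{% endmacro %}" does not match because "macro" must directly follow the
# opening delimiter and optional whitespace.
MACRO_NAME_RE = re.compile(r"\{%-?\s*macro\s+([A-Za-z_][A-Za-z0-9_]*)\s*\(")


def scan_macro_names(text: str) -> list[str]:
    """Return macro names in the order they are defined in the file."""
    return [m.group(1) for m in MACRO_NAME_RE.finditer(text)]
```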
Alternatives considered
- Reusing the embedded-language (<script>-in-HTML) re-parse path: rejected because dbt Jinja is interleaved throughout the file, not a delimited sub-region; a full-file second parse is the right shape.
- Extending the grammar to parse {% %} statements: larger and touches grammar vendoring; deferred.
Confirmations