databricks-skills: tombstone redirects to databricks-agent-skills #546

Open
jamesbroadhead wants to merge 2 commits into databricks-solutions:main from jamesbroadhead:jb/tombstones-for-d-a-s-migration

@jamesbroadhead jamesbroadhead commented May 24, 2026

Summary

Replaces all databricks-skills/<name>/ directories with tombstone redirects to databricks/databricks-agent-skills (d-a-s), which is now the source of truth for Databricks skills. Skills install via the Databricks CLI:

databricks aitools install                  # all stable skills
databricks aitools install --experimental   # stable + experimental
databricks aitools install <name> [--experimental]

This PR also rewires the in-repo install paths so users actually reach the CLI instead of getting tombstones copied into their per-agent skill directories — see "Install-path rewire" below.

Background

d-a-s PR #73 merged on May 24 and imported 18 of these skills into experimental/. The remaining 6 were merged into d-a-s's stable skills/ (1:1 or under a renamed home), and databricks-genie was deferred. With d-a-s acting as the source of truth, a-d-k stops shipping skill content and just redirects users.

This supersedes the subtree-sync proposal in a-d-k RFC #530 (now closed) — tombstones avoid the publish/consume workflows on both sides.

Changes

Tombstones

Per-skill tombstones: each databricks-skills/<name>/ becomes a single SKILL.md pointing at the equivalent install command. Directory contents otherwise removed. See the redirect table in databricks-skills/README.md for the mapping.

Mapping summary:

  • 18 → d-a-s/experimental (kept name): agent-bricks, ai-functions, aibi-dashboards, apps-python, dbsql, docs, execution-compute, iceberg, metric-views, mlflow-evaluation, python-sdk, spark-structured-streaming, synthetic-data-gen, unity-catalog, unstructured-pdf-generation, vector-search, zerobus-ingest, spark-python-data-source.
  • 6 d-a-s/skills (stable) targets, covering 7 a-d-k skills (both lakebase skills land in databricks-lakebase): bundles → databricks-dabs; config → databricks-core; jobs → databricks-jobs (merged); lakebase-autoscale → databricks-lakebase; lakebase-provisioned → databricks-lakebase; model-serving → databricks-model-serving (dev-side content port in progress via d-a-s PR #84); spark-declarative-pipelines → databricks-pipelines (content port pending; owners @lennartkats-db / @camielstee-db).
  • 1 deferred: databricks-genie — tombstone now points at d-a-s PR #73 where the skill was excluded, with "will be available in a future d-a-s release" wording.
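
The mapping summary can be read as a lookup from an old a-d-k skill name to the command a tombstone should show. The following is a minimal sketch, not code from either repo: `install_command` and the three containers are hypothetical names, the experimental set lists only a few of the 18 skills, and it assumes those skills keep the `databricks-` prefix (as the `databricks-iceberg` spot-check in the test plan suggests).

```python
from typing import Optional

# Stable renames as listed in the mapping summary (old a-d-k name -> d-a-s name).
STABLE_RENAMES = {
    "databricks-bundles": "databricks-dabs",
    "databricks-config": "databricks-core",
    "databricks-jobs": "databricks-jobs",
    "databricks-lakebase-autoscale": "databricks-lakebase",
    "databricks-lakebase-provisioned": "databricks-lakebase",
    "databricks-model-serving": "databricks-model-serving",
    "databricks-spark-declarative-pipelines": "databricks-pipelines",
}

# Illustrative subset of the experimental skills (assumed prefixed names).
EXPERIMENTAL = {"databricks-iceberg", "databricks-dbsql", "databricks-vector-search"}

# Deferred skills have no install command yet.
DEFERRED = {"databricks-genie"}


def install_command(old_name: str) -> Optional[str]:
    """Return the redirect command for an old skill name, or None if deferred."""
    if old_name in STABLE_RENAMES:
        return f"databricks aitools install {STABLE_RENAMES[old_name]}"
    if old_name in EXPERIMENTAL:
        return f"databricks aitools install {old_name} --experimental"
    if old_name in DEFERRED:
        return None
    raise KeyError(f"unknown skill: {old_name}")
```

Returning `None` for the deferred skill keeps "no command yet" distinct from "unknown name", which raises instead.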

Install-path rewire

The first revision of this PR only tombstoned databricks-skills/<name>/. That left several install paths still pointing customers at those directories (or at the bash installer that copied them), which after the migration would have meant installing empty tombstones over working setups. This commit fixes that.

install.sh / install.ps1 — skill installation is fully delegated to databricks aitools install. The installer no longer copies skill content:

  • On run, it cleans up any .claude/skills/databricks-*/ (etc.) left by previous installs using the per-skill manifest, then invokes the CLI with the user's selected profile.
  • MIN_CLI_VERSION bumped to 1.0.0 (the release that ships top-level databricks aitools); the CLI is now mandatory — the installer dies with an upgrade message if missing or older.
  • MLflow + APX bundled-skill fetching dropped; those are out of scope now that distribution lives in the CLI.
  • The per-user-type bundles (data-engineer / analyst / ai-ml-engineer / app-developer) are preserved and now map to post-migration d-a-s skill names (bundles → databricks-dabs, config → databricks-core, lakebase-* → databricks-lakebase, spark-declarative-pipelines → databricks-pipelines).
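
The installer flow described above (version gate, manifest-driven cleanup, then delegation to the CLI) can be sketched in Python for illustration. This is not the logic of `install.sh` itself: the function names are hypothetical, the manifest is assumed to hold one skill directory name per line, and the version string is assumed to be a plain `MAJOR.MINOR.PATCH` token.

```python
import shutil
from pathlib import Path

MIN_CLI_VERSION = (1, 0, 0)  # release that ships top-level `databricks aitools`


def parse_version(text):
    """Turn '1.2.3' or 'v1.2.3' into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split(".")[:3])


def cli_is_new_enough(version_text):
    """True when the reported CLI version meets the minimum."""
    return parse_version(version_text) >= MIN_CLI_VERSION


def cleanup_old_skills(skills_root, manifest):
    """Remove only the directories the old installer recorded, then the manifest."""
    removed = []
    if not manifest.exists():
        return removed
    for name in manifest.read_text().split():
        target = skills_root / name
        # Skip entries that are missing or that point outside skills_root.
        if target.is_dir() and target.parent == skills_root:
            shutil.rmtree(target)
            removed.append(name)
    manifest.unlink()
    return removed


def install_commands(skill_names, experimental=False):
    """Build one `databricks aitools install <name>` argv per skill."""
    flag = ["--experimental"] if experimental else []
    return [["databricks", "aitools", "install", name, *flag] for name in skill_names]
```

Driving the cleanup from the manifest, rather than a glob over `databricks-*`, leaves any skill directories the user created by hand untouched.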

.claude-plugin/plugin.json — removed the "skills": "./databricks-skills/" field. Plugin-marketplace users get the MCP server only; they install skills via the CLI like everyone else (otherwise they'd auto-load the tombstones).
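
The test-plan check for this file (valid JSON, no `skills` field) is easy to automate. A minimal sketch follows; `plugin_manifest_problems` is a hypothetical helper rather than part of this PR, and it checks only those two properties.

```python
import json


def plugin_manifest_problems(text):
    """Return a list of problems found in a plugin.json payload (empty if fine)."""
    try:
        manifest = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(manifest, dict):
        return ["top-level value is not an object"]
    problems = []
    if "skills" in manifest:
        # A leftover field would make the plugin auto-load the tombstones.
        problems.append("unexpected 'skills' field")
    return problems
```

It could be run as `plugin_manifest_problems(Path(".claude-plugin/plugin.json").read_text())`, with an empty list meaning the manifest passes.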

.claude-plugin/check_update.sh — when the session-start update banner fires for users on a pre-1.0.0 install, it now includes a migration callout: skills move to the CLI, CLI v1.0.0+ required, running the upgrade command cleans up the old per-agent skill directories.
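
The banner decision can be illustrated as follows. This is a sketch of the described behaviour, not a port of `check_update.sh`: the function name and banner wording are hypothetical, and versions are assumed to be plain `MAJOR.MINOR.PATCH` strings.

```python
from typing import Optional

MIGRATION_CUTOFF = (1, 0, 0)


def _as_tuple(version: str):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def update_banner(local: str, remote: str) -> Optional[str]:
    """Return banner text when an update exists, else None (script stays silent)."""
    local_v, remote_v = _as_tuple(local), _as_tuple(remote)
    if remote_v <= local_v:
        return None
    lines = [f"Update available: {local} -> {remote}"]
    if local_v < MIGRATION_CUTOFF:
        # Migration callout shown only to pre-1.0.0 installs.
        lines.append(
            "Skills now install via `databricks aitools install` "
            "(Databricks CLI v1.0.0+ required)."
        )
    return "\n".join(lines)
```

The callout is keyed on the local version, so installs already at 1.0.0 or later get a plain update notice without the migration text.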

README.md — reordered so the "Start here!" CTA is databricks aitools install; the install.sh path is presented as the skills+MCP all-in-one. The Prerequisites section calls out the CLI v1.0.0+ requirement; the misleading "What's Included → databricks-skills/ (20 markdown skills)" row now points at the CLI.

Deprecated entry points (already in the first revision)

  • install_skills.sh → prints a deprecation/redirect message and exits.
  • install_genie_code_skills.py → same, raises SystemExit if run as a notebook.
  • databricks-skills/README.md rewritten as a redirect README with the per-skill mapping table.
  • databricks-skills/TEMPLATE/ preserved (starter template).
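
A deprecated entry point of this kind can be very small. The sketch below shows the general shape only; the message text is hypothetical and this is not the actual content of `install_genie_code_skills.py`.

```python
REDIRECT_MESSAGE = (
    "This installer is deprecated. Skills are now distributed via the "
    "Databricks CLI: run `databricks aitools install`."
)


def main():
    # SystemExit with a string argument prints it to stderr and exits non-zero
    # when run as a script; in a notebook it stops the cell with the message.
    raise SystemExit(REDIRECT_MESSAGE)


if __name__ == "__main__":
    main()
```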

Test plan

Tombstones (from the first revision):

  • databricks-skills/install_skills.sh runs and prints the deprecation message.
  • python3 databricks-skills/install_genie_code_skills.py exits with the deprecation message (when run outside a notebook context — it raises SystemExit).
  • Spot-check a tombstone: databricks-skills/databricks-iceberg/SKILL.md shows the redirect to databricks aitools install databricks-iceberg --experimental.

Install-path rewire (this revision):

  • bash install.sh --help and bash install.sh --list-skills produce sensible output; no references to the old databricks-skills/ copy path or to MLflow/APX skills.
  • bash -n install.sh passes (sanity-only; CI does the real validation).
  • Fresh-machine flow: with no CLI installed, bash install.sh dies with a clear "install Databricks CLI v1.0.0+" message instead of silently producing tombstone skills.
  • Upgrade flow: with a pre-migration .claude/skills/databricks-iceberg/ directory present and a .installed-skills manifest, bash install.sh removes that directory, invokes databricks aitools install --experimental, and the CLI writes its own canonical skill location.
  • check_update.sh: stage local VERSION=0.x.y and remote VERSION=1.0.0 so the script fires; confirm the migration block appears in the banner. With VERSION=1.0.1 on both sides, confirm the script is silent.
  • .claude-plugin/plugin.json is valid JSON and has no skills field.
  • Spot-check databricks-skills/databricks-genie/SKILL.md: tombstone links to d-a-s PR #73 and says the skill will be available once a future PR lands.
  • CI green on this branch.

Follow-up

After this merges, delete the d-a-s experimental-only-preview branch (it was a one-shot for the now-closed RFC #530).

This PR was AI-assisted by Isaac.

Skills are now distributed via the Databricks CLI from
databricks/databricks-agent-skills. Replace each per-skill directory
with a single SKILL.md redirect pointing at
`databricks aitools install <name> [--experimental]`. Update install
scripts (install_skills.sh, install_genie_code_skills.py) to print a
deprecation/redirect message and exit. Update databricks-skills/README.md
with a per-skill redirect table, and the root README to point at the new
upstream.

Mapping notes:
- 18 experimental skills redirect to d-a-s/experimental/.
- 6 a-d-k skills were merged into d-a-s stable: databricks-bundles →
  databricks-dabs, databricks-config → databricks-core, databricks-jobs
  (kept name), databricks-lakebase-autoscale → databricks-lakebase,
  databricks-lakebase-provisioned → databricks-lakebase,
  databricks-model-serving (kept name; dev-side content port in progress
  via d-a-s PR #84), databricks-spark-declarative-pipelines →
  databricks-pipelines (content port pending, owners @lennartkats-db /
  @camielstee-db).
- 1 a-d-k skill (databricks-genie) is not currently published in d-a-s;
  its tombstone says "may return in a future release".
- TEMPLATE/ is preserved as a starter template.

Supersedes the subtree-sync approach in a-d-k RFC #530 (now closed).
The d-a-s `experimental-only-preview` branch can be deleted once this
PR merges.

Co-authored-by: Isaac

The previous tombstone-only PR left install.sh, install.ps1, the Claude
Code plugin manifest, and the check_update.sh upgrade prompt all driving
users into per-agent skill directories sourced from databricks-skills/ —
which under the tombstone migration would have meant copying empty "moved"
files over working installs.

This commit closes the loop:

- install.sh / install.ps1
  - Skill installation is fully delegated to `databricks aitools install`.
    The installer no longer copies skill content from this repo.
  - On run, the installer first cleans up any leftover .claude/skills/* (etc.)
    written by the old installer using the manifest it kept, then invokes
    the CLI with the user's selected profile names.
  - MIN_CLI_VERSION bumped to 1.0.0 (the release that ships top-level
    `databricks aitools`); CLI is now mandatory (die instead of warn).
  - MLflow + APX bundled-skill fetching dropped — out of scope for this
    installer. The per-user-type bundles (data-engineer / analyst /
    ai-ml-engineer / app-developer) are preserved and now map to
    post-migration d-a-s skill names.

- .claude-plugin/plugin.json: removed the `skills` field that pointed at
  ./databricks-skills/. Marketplace users get MCP only; they install
  skills via the CLI.

- .claude-plugin/check_update.sh: when the update prompt fires for users
  on a pre-1.0.0 install, the banner now includes a one-shot migration
  note explaining the CLI-based distribution and that running install.sh
  will clean up old per-agent skill directories.

- README.md: reordered so the Skills CTA is `databricks aitools install`
  first; install.sh is presented as the skills+MCP all-in-one path. CLI
  v1.0.0+ requirement called out in Prerequisites; the misleading
  databricks-skills/ row in What's Included now points at the CLI.

- databricks-skills/databricks-genie/SKILL.md: tombstone now points at
  d-a-s PR #73 (which deferred Genie during the migration) and notes
  that the skill will return in a future d-a-s release.

Co-authored-by: Isaac

jamesbroadhead added a commit to databricks/databricks-agent-skills that referenced this pull request May 26, 2026
…-d-k (#85)

## Summary

Ports the `databricks-spark-declarative-pipelines` skill from
[`databricks-solutions/ai-dev-kit`](https://github.com/databricks-solutions/ai-dev-kit/tree/experimental/databricks-skills/databricks-spark-declarative-pipelines)
into stable `skills/databricks-pipelines/`. Source:
`databricks-solutions/ai-dev-kit:experimental`.

Completes d-a-s [PR #73](#73) TODO #5. Pairs with a-d-k [PR
#546](databricks-solutions/ai-dev-kit#546),
which tombstones the a-d-k skill once this lands.

Stable's `databricks-pipelines` already covered the per-feature ×
per-language API/options surface (decision tree, common traps, format
options, dataset/flow/quality references). a-d-k's version covered
scaffolding/workflows, configuration, performance tuning, DLT migration,
and several streaming patterns + Kafka ingestion + SCD-2 query patterns
that stable lacked. This PR adds a-d-k's net-new content as new
`references/` files; the per-feature reference structure is preserved.

## Changes

### New `references/`

- `dlt-migration.md` — both migration paths (DLT Python → SDP Python via
`pyspark.pipelines`, DLT Python → SDP SQL) with side-by-side conversions
for decorators, reads, expectations, CDC/SCD, and partitioning → liquid
clustering.
- `workflows.md` — Workflow A/B/C chooser (standalone bundle via
`databricks pipelines init`, pipeline-in-existing-bundle, rapid CLI
iteration with no bundle); language-selection rules; start-update +
poll-the-update pattern (with the "never poll top-level pipeline state
because RETRY_ON_FAILURE flips it back to RUNNING" rationale);
edit/re-upload/restart flow; Python SDK alternative.
- `pipeline-configuration.md` — Full JSON config reference for
`pipelines create|update` (top-level fields, `clusters`, `event_log`,
`notifications`, `configuration`, `run_as`, `restart_window`,
`environment`, `deployment`); variant snippets (dev mode,
non-serverless, continuous, notifications, autoscaling, custom event
log, serverless Python deps); multi-schema patterns; platform
constraints.
- `performance.md` — Liquid Clustering with per-layer key guidance
(bronze/silver/gold); cluster-key type rules; table properties;
state-management strategies for streaming; join optimization
(stream-to-static, stream-to-stream with time bounds); query
optimization; pre-aggregation; compute config; monitoring.
- `streaming-patterns.md` — Deduplication (by key, with time window,
composite); windowed aggregations (tumbling, multi-size, session
windows); event-time vs processing-time; rescue-data quarantine (Auto
Loader `_rescued_data` → bronze_quarantine + silver_clean fanout);
stream-to-stream join as a pattern; running totals; anomaly detection
(rolling z-score outlier flag); end-to-end lag monitoring.
- `kafka.md` — Basic Kafka read (Python + SQL); JSON payload parsing
with explicit schemas; Databricks Secrets SASL/PLAIN auth; mTLS notes;
Event Hubs via the Kafka protocol; pipeline-config plumbing for
brokers/topics; pointer to `sink.md` for writing back to Kafka. Fills a
full gap: stable's SKILL.md API table listed `read_kafka` and
`format("kafka")` with no linked skill.
- `scd-2-querying.md` — `__START_AT` / `__END_AT` temporal semantics;
current-state materialized views; point-in-time queries with the
inclusive-lower / exclusive-upper boundary; per-entity history;
period-bounded change analysis; joining facts with historical dimensions
(as-of-transaction-time and current-dim variants); pre-filter MV
optimization; clustering on `(entity_key, __START_AT)`.
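
As an illustration of the inclusive-lower / exclusive-upper boundary noted for `scd-2-querying.md`, here is a minimal plain-Python sketch of a point-in-time lookup. It is not taken from the reference file: `as_of` and the row layout are hypothetical, and it assumes the current version of an entity carries a null `__END_AT`.

```python
from datetime import datetime


def as_of(rows, entity_key, ts):
    """Return the version of entity_key that was valid at ts, or None."""
    for row in rows:
        if row["key"] != entity_key:
            continue
        start, end = row["__START_AT"], row["__END_AT"]
        # Valid when __START_AT <= ts < __END_AT; a null __END_AT means "current".
        if start <= ts and (end is None or ts < end):
            return row
    return None


history = [
    {"key": 1, "city": "Oslo",
     "__START_AT": datetime(2024, 1, 1), "__END_AT": datetime(2024, 6, 1)},
    {"key": 1, "city": "Bergen",
     "__START_AT": datetime(2024, 6, 1), "__END_AT": None},
]

# A timestamp exactly on the boundary belongs to the newer version.
boundary_row = as_of(history, 1, datetime(2024, 6, 1))
```

Because the upper bound is exclusive, adjacent versions never both match the same timestamp, so the lookup returns at most one row per entity.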

### `SKILL.md`

- New "Choose Your Workflow" and "Language Selection" sections near
scaffolding.
- Scaffolding section documents both `databricks pipelines init` (newer,
focused) and `databricks bundle init lakeflow-pipelines`
(template-based).
- Pipeline API Reference list reorganized: **Project & Lifecycle**
(workflows, configuration, performance, DLT migration) and **Datasets,
Flows & Quality** (the existing per-feature refs + new kafka,
scd-2-querying, streaming-patterns).
- Version bumped to `0.3.0`.

### Cross-references in existing references

- `auto-loader.md` → `streaming-patterns.md` (quarantine), `kafka.md`,
lag monitoring.
- `auto-cdc.md` → `scd-2-querying.md` for reading SCD-2 history tables.

## Deliberately dropped from a-d-k

| a-d-k file | Why dropped |
|------------|-------------|
| `references/2-mcp-approach.md` | a-d-k experimental already renamed
this to `2-cli-approach.md`; MCP tool refs stripped per d-a-s PR #73
policy. CLI flow now lives in `workflows.md` as Workflow C. |
| `references/python/1-syntax-basics.md`,
`references/sql/1-syntax-basics.md` | Covered by stable's
`python-basics.md`, `sql-basics.md`, and the per-feature references
(streaming-table, materialized-view, temporary-view, view-sql). |
| `references/python/{2,3,4}-*.md`, `references/sql/{2,3,4}-*.md` |
Pattern content ported into `streaming-patterns.md`, `kafka.md`,
`scd-2-querying.md` (this PR); API/options content already covered by
stable's per-feature × per-language references. |
| `scripts/exploration_notebook.py` | Stable convention has no
`scripts/` directory under a skill. `databricks pipelines init`
generates an `explorations/` folder; users use the CLI or the generated
notebook directly. |

## Test plan

- [x] `python3 scripts/skills.py generate` clean.
- [x] `python3 scripts/skills.py validate` passes.
- [x] Merged `origin/main` mid-port (resolved version conflict — kept
`0.3.0`; took main's CLI install command + compatibility bump).
- [ ] CI green on this branch.
- [ ] Owner review (`@lennartkats-db` / `@camielstee-db` per
CODEOWNERS).

This pull request and its description were written by Claude.

simonfaltum pushed a commit to databricks/databricks-agent-skills that referenced this pull request May 27, 2026
)

## Summary

- Adds `.claude-plugin/marketplace.json` so users can install the d-a-s
skills plugin directly from inside Claude Code:

      /plugin marketplace add databricks/databricks-agent-skills
      /plugin install databricks-skills

Without this file the existing `.claude-plugin/plugin.json` is reachable
only after cloning the repo.

- `README.md`: documents both install paths (CLI canonical, plugin
marketplace alternative for stable skills) and adds a short comparison
table covering experimental skills, per-skill selection, and
outside-agent prerequisites. The two paths install to *different*
locations (CLI writes into `~/.claude/skills/`; plugin caches under
`~/.claude/plugins/cache/<marketplace>/<plugin>/`) — the README now
spells that out instead of pretending they share a target.
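
To make the two install locations concrete, here is a small sketch that builds both paths. The helper names are hypothetical; the plugin layout follows the path quoted in the test plan below, and the `<skill>/SKILL.md` layout under `~/.claude/skills/` is an assumption about how the CLI arranges files.

```python
from pathlib import PurePosixPath


def cli_skill_path(home, skill):
    """Where a CLI-installed skill is assumed to live."""
    return home / ".claude" / "skills" / skill / "SKILL.md"


def plugin_skill_path(home, marketplace, plugin, version, skill):
    """Where a marketplace-installed plugin caches a skill."""
    return (home / ".claude" / "plugins" / "cache" / marketplace / plugin
            / version / "skills" / skill / "SKILL.md")


home = PurePosixPath("/home/user")
cli_path = cli_skill_path(home, "databricks-jobs")
plugin_path = plugin_skill_path(
    home, "databricks-skills", "databricks-skills", "0.1.0", "databricks-jobs"
)
```

The two results never coincide, so a skill installed through both routes exists on disk twice.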

## Background

`databricks-solutions/ai-dev-kit` is in the process of being retired as
a skills-distribution mechanism (see a-d-k PRs
[#546](databricks-solutions/ai-dev-kit#546),
[#547](databricks-solutions/ai-dev-kit#547),
[#548](databricks-solutions/ai-dev-kit#548) and
the team's `DECOMMISSION_PLAN.md` on the experimental branch). Users
will be redirected here for skills.

The migration banner that a-d-k will start showing pre-1.0.0 users tells
them to run `/plugin marketplace add
databricks/databricks-agent-skills`. Without this PR landing, that
command fails to resolve.

The plugin marketplace path is intentionally scoped to stable skills
(matches the existing `"skills": "./skills/"` in `plugin.json`).
Experimental skills stay CLI-only — the README's comparison table calls
that out so users know which knob to reach for.

This supersedes #92 (which was opened from a personal fork).

## Test plan

- [x] `python3 -m json.tool .claude-plugin/marketplace.json` — valid
JSON.
- [x] `claude plugin validate --strict .` on the branch — marketplace
manifest passes strict validation.
- [x] From a clean Claude Code session, `claude plugin marketplace add
<path>` + `claude plugin install databricks-skills` succeeds; `claude
plugin details databricks-skills` registers exactly the 8 stable skills
(`databricks-apps`, `-core`, `-dabs`, `-jobs`, `-lakebase`,
`-model-serving`, `-pipelines`, `-serverless-migration`); experimental
skills correctly excluded from the registered set even though they're
present in the plugin cache.
- [x] Plugin files land at
`~/.claude/plugins/cache/databricks-skills/databricks-skills/0.1.0/skills/<skill>/SKILL.md`
(the plugin path; not `~/.claude/skills/`, which is the CLI path).
- [ ] README renders cleanly on GitHub; the comparison table is legible.

This PR was AI-assisted by Isaac.