4 changes: 2 additions & 2 deletions contrib/job_with_ai_parse_document/README.md
@@ -63,7 +63,7 @@ Source Documents (UC Volume)

5. **Upload documents** to your source volume

-6. **Run job** from the Databricks UI (Workflows)
+6. **Run job** from the Databricks UI (Jobs & Pipelines)

## Configuration

@@ -172,7 +172,7 @@ The included notebook visualizes parsing results with interactive bounding boxes
## Resources

- [Declarative Automation Bundles](https://docs.databricks.com/dev-tools/bundles/)
-- [Databricks Workflows](https://docs.databricks.com/workflows/)
+- [Lakeflow Jobs](https://docs.databricks.com/aws/en/jobs/)
- [Structured Streaming](https://docs.databricks.com/structured-streaming/)
- [`ai_parse_document` Function](https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_parse_document)
- [`ai_query` Function](https://docs.databricks.com/aws/en/sql/language-manual/functions/ai_query)
Original file line number Diff line number Diff line change
@@ -21,7 +21,7 @@ The '{{.project_name}}' project was generated by using the default-scala templat
This deploys everything that's defined for this project.
For example, the default template would deploy a job called
`[dev yourname] {{.project_name}}_job` to your workspace.
-You can find that job by opening your workspace and clicking on **Workflows**.
+You can find that job by opening your workspace and clicking on **Jobs & Pipelines**.

4. Similarly, to deploy a production copy, type:
```
2 changes: 1 addition & 1 deletion knowledge_base/job_backfill_data/README.md
@@ -60,7 +60,7 @@ with this project. You can also use the CLI:
(Note: "dev" is the default target, so `--target` is optional.)

This deploys everything defined for this project, including the job
-`[dev yourname] sql_backfill_example`. You can find it under **Workflows** (or **Jobs & Pipelines**) in your workspace.
+`[dev yourname] sql_backfill_example`. You can find it under **Jobs & Pipelines** in your workspace.

3. To run the job with the default `run_date`:
```
2 changes: 1 addition & 1 deletion knowledge_base/pipeline_with_schema/README.md
@@ -1,6 +1,6 @@
# Pipeline with a dedicated Unity Catalog schema

-This example demonstrates how to define a Unity Catalog schema and a Delta Live Tables pipeline that uses it.
+This example demonstrates how to define a Unity Catalog schema and a [Lakeflow Spark Declarative Pipelines](https://docs.databricks.com/aws/en/dlt/) pipeline that uses it.

## Prerequisites

2 changes: 1 addition & 1 deletion knowledge_base/serverless_job/README.md
@@ -2,7 +2,7 @@

This Declarative Automation Bundles example demonstrates how to define a job that runs on serverless compute.

-For more information, please refer to the [documentation](https://docs.databricks.com/en/workflows/jobs/how-to/use-bundles-with-jobs.html#configure-a-job-that-uses-serverless-compute).
+For more information, please refer to the [documentation](https://docs.databricks.com/aws/en/dev-tools/bundles/jobs-tutorial).

## Prerequisites

20 changes: 12 additions & 8 deletions knowledge_base/vector_search_product_discovery/README.md
@@ -1,8 +1,9 @@
-# Vector Search: Semantic Product Discovery
+# AI Search: Semantic Product Discovery

A Declarative Automation Bundle demonstrating semantic product search using
-[Databricks Vector Search](https://docs.databricks.com/en/generative-ai/vector-search.html).
-It automates the full setup — the Unity Catalog schema, the Vector Search endpoint and
+[Databricks AI Search](https://docs.databricks.com/aws/en/ai-search/ai-search) (formerly
+Vector Search).
+It automates the full setup — the Unity Catalog schema, the AI Search endpoint and
index, and the jobs that load and query the catalog — so a single `databricks bundle deploy`
gives you a working semantic-search example to explore and adapt.

@@ -22,7 +23,7 @@ products in vector space.
```
data/products.json (synced to workspace by bundle deploy)
↓ embed descriptions → upsert_data()
-product_index (Direct Access Vector Search index)
+product_index (Direct Access AI Search index)
↓ embed query → similarity_search(query_vector=...)
ranked results
```
@@ -36,7 +37,7 @@ ranked results
│ └── products.json # Product catalog — synced to the workspace on deploy
├── resources/
│ ├── schema.yml # Unity Catalog schema that namespaces the index
-│ ├── vector-search-endpoint.yml # Vector Search endpoint (managed ANN serving)
+│ ├── vector-search-endpoint.yml # AI Search endpoint (managed ANN serving)
│ ├── vector-search-index.yml # Direct Access index — schema defined inline
│ ├── setup-job.yml # Job: embed product descriptions and upsert them
│ └── query-job.yml # Job: embed a query and return ranked results
@@ -45,6 +46,9 @@ ranked results
└── 02_query_demo.py # Semantic search — runs as a job or interactively
```

+Bundle resource types are unchanged by the rename to AI Search: the endpoint and index
+are still declared as `vector_search_endpoints` and `vector_search_indexes`.

## Prerequisites

- Databricks workspace with Unity Catalog enabled
@@ -69,7 +73,7 @@ ranked results
you — and several people can deploy into the same workspace without colliding. Use
`databricks bundle deploy --target prod` for the shared production copy.

-> Vector Search endpoint creation takes a few minutes to reach ONLINE status.
+> AI Search endpoint creation takes a few minutes to reach ONLINE status.

4. Load the catalog by running the bundle. This embeds all product descriptions and upserts them into the index.
```bash
@@ -103,7 +107,7 @@ databricks bundle deploy \
|---|---|---|
| `catalog` | `main` | Existing Unity Catalog catalog |
| `schema` | `product_search` | Schema created by the bundle |
-| `endpoint_name` | `product-search-endpoint` | Vector Search endpoint name. Shared in prod; the `dev` target overrides it per user. |
+| `endpoint_name` | `product-search-endpoint` | AI Search endpoint name. Shared in prod; the `dev` target overrides it per user. |
| `embedding_model` | `databricks-gte-large-en` | Foundation model used for embeddings |
| `embedding_dimension` | `1024` | Vector dimension. Drives both the index and the embedding requests; immutable after the index is created. |

@@ -150,6 +154,6 @@ table and it keeps itself up to date. Replace `index_type: DIRECT_ACCESS` and

## Resources

-- [Databricks Vector Search](https://docs.databricks.com/en/generative-ai/vector-search.html)
+- [Databricks AI Search](https://docs.databricks.com/aws/en/ai-search/ai-search)
- [Declarative Automation Bundles](https://docs.databricks.com/dev-tools/bundles/)
- [Foundation Models — GTE Large](https://docs.databricks.com/en/machine-learning/foundation-models/supported-models.html)
Original file line number Diff line number Diff line change
@@ -12,7 +12,7 @@ variables:
description: Unity Catalog schema name for the product search use case
default: product_search
endpoint_name:
-description: Name of the Vector Search endpoint
+description: Name of the AI Search endpoint
default: product-search-endpoint
embedding_model:
description: Model serving endpoint used to embed product descriptions
Original file line number Diff line number Diff line change
@@ -24,7 +24,7 @@ resources:

tasks:
- task_key: upsert_products
-description: Load products from JSON, embed descriptions, and upsert into the Vector Search index
+description: Load products from JSON, embed descriptions, and upsert into the AI Search index
environment_key: serverless_env
notebook_task:
notebook_path: ../src/01_upsert_products.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
# Databricks notebook source
# MAGIC %md
-# MAGIC # Upsert Products into Vector Search Index
+# MAGIC # Upsert Products into AI Search Index
# MAGIC
# MAGIC Reads the product catalog from the JSON file deployed with the bundle,
-# MAGIC embeds each product description, then upserts all records into the Vector
-# MAGIC Search index. Re-running is safe — upsert is idempotent on `product_id`.
+# MAGIC embeds each product description, then upserts all records into the AI Search
+# MAGIC index. Re-running is safe — upsert is idempotent on `product_id`.

# COMMAND ----------

Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
# MAGIC %md
# MAGIC # Semantic Product Search Demo
# MAGIC
-# MAGIC Queries the Vector Search index to find products that match a natural-language
+# MAGIC Queries the AI Search index to find products that match a natural-language
# MAGIC description. Try queries that would fail keyword search — e.g. *"something to
# MAGIC keep my coffee hot all day"* or *"gear for sleeping outside in freezing weather"*.
