175 changes: 25 additions & 150 deletions preview-features/on-demand-sandboxes/README.md
@@ -6,10 +6,21 @@

To gain access to the private preview, email [dts-team@microsoft.com](mailto:dts-team@microsoft.com).

You'll need a Durable Task Scheduler in one of the supported preview regions. You can use
an existing scheduler or create a new one in any of these regions:

- East US 2 (`eastus2`)
- West US 3 (`westus3`)
- North Europe (`northeurope`)
- Australia East (`australiaeast`)

Reply to us with your scheduler name and the region it's in, and we'll enable On-demand
Sandboxes on it.

## Overview

A *sandbox* is an isolated, microVM-backed container that runs a single piece of your
workflow with its own runtime, dependencies, and security boundary, separate from your
orchestrator's process.

On-demand Sandboxes let you move individual workflow steps (activities) out of your
@@ -19,7 +30,7 @@ in isolation and provide a container image with that activity code; DTS handles
provisioning, scaling, and teardown.

Most activities belong in-process: they're fast, simple, and co-located with your
orchestrator. But some steps don't fit that model—they need a native binary, a
orchestrator. But some steps don't fit that model. They need a native binary, a
different language runtime, per-invocation isolation, or bursty compute you don't want
to keep warm. On-demand Sandboxes handle those exceptions without dedicated
infrastructure or custom scaling policies.
@@ -29,7 +40,7 @@ infrastructure or custom scaling policies.
- **Activity-level granularity.** Move individual steps to managed compute, not your
whole app.
- **Per-activity or per-invocation isolation.** Each execution runs in a clean,
microVM-backed sandbox, ideal for untrusted code, customer plugins, or LLM-generated
logic.
- **Cross-runtime flexibility.** Run a Python inference step from a .NET orchestrator,
with no compromise on either side.
@@ -38,30 +49,6 @@ infrastructure or custom scaling policies.
- **No orchestrator changes.** Your orchestration code and hosting model don't change
at all.

## Prerequisites

Before you begin, make sure you have:

- **Private preview access.** On-demand Sandboxes is in private preview.
[Sign up here](https://techcommunity.microsoft.com/blog/AppsonAzureBlog/introducing-on-demand-sandboxes-for-azure-durable-task-scheduler-private-preview/4522333)
to have the feature enabled on your scheduler.
- **An app using a supported standalone Durable Task SDK.** On-demand Sandboxes target
the standalone Durable Task SDKs used *outside* the Azure Functions host—apps running
on Azure Container Apps, Azure Kubernetes Service, App Service, or anywhere else you
self-host. The private preview supports the **.NET** and **Python** SDKs; additional
language SDKs and Azure Functions support are coming soon.
- **A provisioned Durable Task Scheduler** configured as the durable backend for your app,
in one of the supported preview regions.
- **A container registry** (for example, Azure Container Registry) where you can push
the worker image that contains your sandboxed activity code.
- **User-assigned managed identities** that DTS uses to pull your worker image from your
registry and start the sandbox. The scheduler must have the identity attached, and the
image-pull identity needs the **AcrPull** role on your registry. You provide the client
IDs on the worker profile (the image-pull identity via `Image.ManagedIdentityClientId` /
`image.managed_identity_client_id`, and the worker/scheduler identity via
`SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id`). See
[Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull).

## How it works

On-demand Sandboxes use a two-part model:
@@ -78,7 +65,7 @@ an activity in a sandbox lives entirely in the profile configuration.
Imagine an orchestrator that does two things: format some text in-process, then run a
piece of customer-supplied Python in isolation. Only the second activity is declared in a
sandbox worker profile, so DTS runs it in a managed sandbox started from your worker
image, while the first activity stays in-process. The result flows back to the
orchestrator as if nothing special happened.

```mermaid
@@ -102,139 +89,27 @@ flowchart LR
```

1. The orchestrator runs `FormatText` in-process, like any normal activity.
2. When it calls `RunPython` (an activity declared in a sandbox worker profile), DTS starts a
sandbox from your worker image and dispatches the activity to it.
3. The activity runs in the isolated sandbox, and its result flows back through DTS to the
orchestrator. When the work is done, DTS tears the sandbox down.
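
The routing in the steps above can be modeled in a few lines of plain Python. This is a conceptual sketch only, not the Durable Task SDK: `ACTIVITIES`, `SANDBOX_ACTIVITIES`, `dispatch`, and `orchestrate` are hypothetical names, and the sandbox is simulated with a label rather than a real microVM.

```python
# Conceptual model of activity routing. None of these names come from the
# Durable Task SDK; they are assumptions made for illustration.
from typing import Callable, Dict, List, Set, Tuple


def format_text(text: str) -> str:
    """An ordinary activity that stays in-process."""
    return text.strip().title()


def run_python(source: str) -> str:
    """Stand-in for customer-supplied code; a real sandbox would run it in isolation."""
    return f"ran {len(source)} characters of code"


# Every activity the orchestrator can call, keyed by activity name.
ACTIVITIES: Dict[str, Callable[[str], str]] = {
    "FormatText": format_text,
    "RunPython": run_python,
}

# Activity names declared in a (hypothetical) sandbox worker profile.
SANDBOX_ACTIVITIES: Set[str] = {"RunPython"}


def dispatch(name: str, payload: str) -> Tuple[str, str]:
    """Return (where the activity ran, its result)."""
    where = "sandbox" if name in SANDBOX_ACTIVITIES else "in-process"
    return where, ACTIVITIES[name](payload)


def orchestrate(text: str, source: str) -> List[Tuple[str, str]]:
    """Call both activities the same way; only the profile decides placement."""
    return [dispatch("FormatText", text), dispatch("RunPython", source)]
```

The orchestrator code is identical for both calls. Moving `FormatText` into the sandbox in this model would only mean adding its name to `SANDBOX_ACTIVITIES`, which mirrors the idea that placement lives in the profile configuration.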

## Configure the scheduler identity for image pull

To start a sandbox, DTS pulls your worker image from your container registry on your
behalf. It does this using a **user-assigned managed identity** attached to the scheduler.
That identity must be granted the **AcrPull** role on the Azure Container Registry that
hosts your worker image, and the scheduler must have the identity attached.

> [!IMPORTANT]
> Only **user-assigned** managed identities are supported. System-assigned managed
> identities are not supported at this time.

The worker profile distinguishes two identities, and you can use the same identity for
both or split them:

- **Image-pull identity** (`Image.ManagedIdentityClientId` /
`image.managed_identity_client_id`) — the identity DTS uses to **pull the worker image**
from your registry. This identity needs the **AcrPull** role on the registry.
- **Worker/scheduler identity** (`SchedulerManagedIdentityClientId` /
`scheduler_managed_identity_client_id`) — the identity the **sandbox worker uses to
connect back to Durable Task Scheduler**, and the identity your activity code runs as
when it calls other services (for example, Storage, Key Vault, or a database). Grant
this identity whatever roles your activity code needs on those downstream services.

Both identities must be attached to the scheduler. Using two separate identities lets you
scope image-pull permissions narrowly while granting your activity code only the
downstream permissions it needs.

### 1. Grant the identity the AcrPull role on your registry

Assign the **AcrPull** role to the **image-pull** user-assigned managed identity, scoped
to your registry:

```bash
az role assignment create \
--assignee "<image-pull-identity-principal-id>" \
--role "AcrPull" \
--scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.ContainerRegistry/registries/<registry-name>"
```

Without this role assignment, DTS cannot pull the worker image and the sandbox will fail
to start. If your activity code calls other Azure services, grant the **worker/scheduler**
identity the roles it needs on those services as well.

### 2. Attach the identity to the scheduler

The scheduler must have the user-assigned identity attached. You can attach it when you
create the scheduler, or update an existing scheduler.

> [!IMPORTANT]
> Managing scheduler identities requires API version **2026-05-01-preview** or later. See
> the [Schedulers - Create Or Update](https://learn.microsoft.com/rest/api/durabletask/schedulers/create-or-update?view=rest-durabletask-2026-05-01-preview&tabs=HTTP#managedserviceidentity)
> REST API reference.

**For an existing scheduler**, send a PATCH to the scheduler resource URI. You can attach
multiple identities:

```bash
az rest --method patch \
--uri "https://management.azure.com/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DurableTask/schedulers/<scheduler-name>?api-version=2026-05-01-preview" \
--body '{
"identity": {
"type": "UserAssigned",
"userAssignedIdentities": {
"/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.ManagedIdentity/userAssignedIdentities/<identity-name>": {}
}
}
}'
```

You can also include the same `identity` block directly in the body when **creating** a
scheduler.
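
If you attach more than one identity, it can be easier to generate the request body than to edit the JSON by hand. The following is a minimal sketch using only the Python standard library. The helper names `identity_resource_id` and `build_identity_body` are hypothetical, and the output follows the `identity` shape shown in the PATCH example above.

```python
import json
from typing import Iterable


def identity_resource_id(subscription_id: str, resource_group: str, identity_name: str) -> str:
    """Build the ARM resource ID of a user-assigned managed identity."""
    return (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ManagedIdentity/userAssignedIdentities/{identity_name}"
    )


def build_identity_body(resource_ids: Iterable[str]) -> str:
    """Return a JSON body that attaches the given user-assigned identities."""
    ids = list(resource_ids)
    if not ids:
        raise ValueError("at least one user-assigned identity is required")
    body = {
        "identity": {
            "type": "UserAssigned",
            # Each identity resource ID maps to an empty object, as in the PATCH example.
            "userAssignedIdentities": {rid: {} for rid in ids},
        }
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Placeholder values; substitute your own subscription, group, and identity names.
    pull_id = identity_resource_id("<subscription-id>", "<resource-group>", "<image-pull-identity>")
    worker_id = identity_resource_id("<subscription-id>", "<resource-group>", "<worker-identity>")
    print(build_identity_body([pull_id, worker_id]))
```

The printed JSON can be passed as the `--body` value of the `az rest` call above.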

Once the identities are attached to the scheduler—the image-pull identity with the
**AcrPull** role on your registry—reference their client IDs on the worker profile
(`Image.ManagedIdentityClientId` / `image.managed_identity_client_id` and
`SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id`) so DTS uses
the image-pull identity to pull the image and the worker/scheduler identity for the
sandbox worker to connect back to DTS and call downstream services.
## Get started

## Choose your language
On-demand Sandboxes is in private preview. To get access, email
[dts-team@microsoft.com](mailto:dts-team@microsoft.com). You'll need a scheduler in one of
the [supported preview regions](#get-private-preview-access).

Follow the step-by-step guide for your SDK:
Once you're in, follow the step-by-step guide for your SDK:

- **[.NET guide](./docs/dotnet.md)** — declare a sandbox worker profile and build the worker
- **[.NET guide](./docs/dotnet.md):** declare a sandbox worker profile and build the worker
image with the .NET Durable Task SDK.
- **[Python guide](./docs/python.md)** — declare a sandbox worker profile and build the worker
- **[Python guide](./docs/python.md):** declare a sandbox worker profile and build the worker
image with the Python Durable Task SDK.

Both guides follow the same shape: declare a sandbox worker profile in your orchestrator
app, build and push a worker image, then view execution logs in the DTS dashboard.

## Worker profile configuration reference

Both languages configure the same worker profile settings. The table below lists each
setting, what it controls, its accepted values, and its default. The setting names differ
slightly between .NET (`PascalCase`) and Python (`snake_case`) but map one to one.

| Setting (.NET / Python) | What it controls | Accepted values | Default |
| --- | --- | --- | --- |
| `Image.ImageRef` / `image.image_ref` | The container image that holds your activity implementations. | A full OCI image reference, by tag (`myregistry.azurecr.io/workers/hello:1.0`) or digest (`myregistry.azurecr.io/workers/hello@sha256:...`). | *Required* |
| `Image.ManagedIdentityClientId` / `image.managed_identity_client_id` | The client ID of the user-assigned managed identity DTS uses to **pull the worker image** from your registry. This identity needs the **AcrPull** role on the registry. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. | *Required* |
| `SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id` | The client ID of the user-assigned managed identity the **sandbox worker uses to connect back to DTS**, and that the activity code runs as when calling other services. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. Can be the same identity as the image-pull identity or a different one. | *Required* |
| `Cpu` / `cpu` | CPU quantity declared for each sandbox. | A positive CPU quantity, expressed in millicores (`500m`, `1000m`) or whole/fractional cores (`2`, `0.5`). | `1000m` (1 vCPU) |
| `Memory` / `memory` | Memory quantity declared for each sandbox. | A positive memory quantity, such as `256Mi`, `1Gi`, or a bare number interpreted as MiB (`2048`). | `2048Mi` |
| `MaxConcurrentActivities` / `max_concurrent_activities` | How many activities a single sandbox worker instance processes concurrently. | An integer greater than `0`. There is no enforced upper bound; size it to what your activity and resource shape can handle. | `100` |
| `EnvironmentVariables` / `environment_variables` | Customer environment variables injected into the sandbox at runtime. | A map of string keys to string values. | Empty |
| *(profile id)* | Friendly profile id that groups the image, resources, and activities for monitoring and reuse. | A non-empty string, unique across your declared profiles. | `default` |
| `AddActivity` / `add_activity` | The activity names this profile offloads to the sandbox. | One or more activity names. At least one is required; an activity can belong to only one profile. | *Required* |

> [!NOTE]
> CPU and memory must be positive resource quantities. The platform may apply additional
> per-preview ceilings on the total CPU and memory a sandbox can request—check your
> private preview onboarding details for the current limits.
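
To make the accepted `Cpu` and `Memory` formats concrete, here is a rough sketch that normalizes the example values from the table into millicores and MiB. It is not the SDK's or the platform's parser: the function names are hypothetical, and it handles only the forms listed in the table (`m` for millicores, `Mi` and `Gi` for memory, and bare numbers). Other suffixes the platform might accept are out of scope.

```python
import math

# Memory suffixes covered by this sketch, expressed as a multiple of MiB.
_MEMORY_UNITS_MIB = {"Mi": 1, "Gi": 1024}


def _require_positive(amount: float, original: str) -> float:
    """Reject zero, negative, and non-finite quantities."""
    if not math.isfinite(amount) or amount <= 0:
        raise ValueError(f"quantity must be positive: {original!r}")
    return amount


def parse_cpu_millicores(value: str) -> int:
    """Convert '500m', '2', or '0.5' into millicores."""
    text = value.strip()
    if text.endswith("m"):
        millicores = float(text[:-1])
    else:
        # A bare number is read as whole or fractional cores.
        millicores = float(text) * 1000
    return round(_require_positive(millicores, value))


def parse_memory_mib(value: str) -> int:
    """Convert '256Mi', '1Gi', or a bare number (read as MiB) into MiB."""
    text = value.strip()
    for suffix, factor in _MEMORY_UNITS_MIB.items():
        if text.endswith(suffix):
            amount = float(text[: -len(suffix)]) * factor
            break
    else:
        # No recognized suffix: treat the value as MiB, per the table.
        amount = float(text)
    return round(_require_positive(amount, value))
```

Under these assumptions, the defaults `1000m` and `2048Mi` normalize to 1000 millicores and 2048 MiB, and a zero or negative value raises `ValueError`, matching the "positive quantity" requirement in the note.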

## View logs in the DTS dashboard

Once your sandbox activities are running, you can view their execution logs directly in
the Durable Task Scheduler dashboard. The dashboard shows real-time output from your
managed workers, including stdout, stderr, and activity lifecycle events—giving you full
visibility into what's happening inside the sandbox without configuring external log
sinks or building your own observability pipeline.

## Get started

On-demand Sandboxes is in private preview. To get access,
[sign up here](https://techcommunity.microsoft.com/blog/AppsonAzureBlog/introducing-on-demand-sandboxes-for-azure-durable-task-scheduler-private-preview/4522333).
Once you're in, the workflow is straightforward: declare a sandbox worker profile in
your orchestrator app, build and push a worker image, and DTS takes care of the rest.
app, build and push a worker image, then view execution logs in the DTS dashboard. Each
guide also includes a worker profile configuration reference for its SDK.

## Related resources
