diff --git a/preview-features/on-demand-sandboxes/README.md b/preview-features/on-demand-sandboxes/README.md index 1fe1630a..8feded4b 100644 --- a/preview-features/on-demand-sandboxes/README.md +++ b/preview-features/on-demand-sandboxes/README.md @@ -6,10 +6,21 @@ To gain access to the private preview, email [dts-team@microsoft.com](mailto:dts-team@microsoft.com). +You'll need a Durable Task Scheduler in one of the supported preview regions. You can use +an existing scheduler or create a new one in any of these regions: + +- East US 2 (`eastus2`) +- West US 3 (`westus3`) +- North Europe (`northeurope`) +- Australia East (`australiaeast`) + +Reply to us with your scheduler name and the region it's in, and we'll enable On-demand +Sandboxes on it. + ## Overview A *sandbox* is an isolated, microVM-backed container that runs a single piece of your -workflow with its own runtime, dependencies, and security boundary—separate from your +workflow with its own runtime, dependencies, and security boundary, separate from your orchestrator's process. On-demand Sandboxes let you move individual workflow steps (activities) out of your @@ -19,7 +30,7 @@ in isolation and provide a container image with that activity code; DTS handles provisioning, scaling, and teardown. Most activities belong in-process: they're fast, simple, and co-located with your -orchestrator. But some steps don't fit that model—they need a native binary, a +orchestrator. But some steps don't fit that model. They need a native binary, a different language runtime, per-invocation isolation, or bursty compute you don't want to keep warm. On-demand Sandboxes handle those exceptions without dedicated infrastructure or custom scaling policies. @@ -29,7 +40,7 @@ infrastructure or custom scaling policies. - **Activity-level granularity.** Move individual steps to managed compute, not your whole app. 
- **Per-activity or per-invocation isolation.** Each execution runs in a clean, - microVM-backed sandbox—ideal for untrusted code, customer plugins, or LLM-generated + microVM-backed sandbox, ideal for untrusted code, customer plugins, or LLM-generated logic. - **Cross-runtime flexibility.** Run a Python inference step from a .NET orchestrator, with no compromise on either side. @@ -38,30 +49,6 @@ infrastructure or custom scaling policies. - **No orchestrator changes.** Your orchestration code and hosting model don't change at all. -## Prerequisites - -Before you begin, make sure you have: - -- **Private preview access.** On-demand Sandboxes is in private preview. - [Sign up here](https://techcommunity.microsoft.com/blog/AppsonAzureBlog/introducing-on-demand-sandboxes-for-azure-durable-task-scheduler-private-preview/4522333) - to have the feature enabled on your scheduler. -- **An app using a supported standalone Durable Task SDK.** On-demand Sandboxes target - the standalone Durable Task SDKs used *outside* the Azure Functions host—apps running - on Azure Container Apps, Azure Kubernetes Service, App Service, or anywhere else you - self-host. The private preview supports the **.NET** and **Python** SDKs; additional - language SDKs and Azure Functions support are coming soon. -- **A provisioned Durable Task Scheduler** configured as the durable backend for your app, - in one of the supported preview regions. -- **A container registry** (for example, Azure Container Registry) where you can push - the worker image that contains your sandboxed activity code. -- **User-assigned managed identities** that DTS uses to pull your worker image from your - registry and start the sandbox. The scheduler must have the identity attached, and the - image-pull identity needs the **AcrPull** role on your registry. 
You provide the client - IDs on the worker profile (the image-pull identity via `Image.ManagedIdentityClientId` / - `image.managed_identity_client_id`, and the worker/scheduler identity via - `SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id`). See - [Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull). - ## How it works On-demand Sandboxes use a two-part model: @@ -78,7 +65,7 @@ an activity in a sandbox lives entirely in the profile configuration. Imagine an orchestrator that does two things: format some text in-process, then run a piece of customer-supplied Python in isolation. Only the second activity is declared in a sandbox worker profile, so DTS runs it in a managed sandbox started from your worker -image—while the first activity stays in-process. The result flows back to the +image, while the first activity stays in-process. The result flows back to the orchestrator as if nothing special happened. ```mermaid @@ -102,139 +89,27 @@ flowchart LR ``` 1. The orchestrator runs `FormatText` in-process, like any normal activity. -2. When it calls `RunPython`—an activity declared in a sandbox worker profile—DTS starts a +2. When it calls `RunPython` (an activity declared in a sandbox worker profile), DTS starts a sandbox from your worker image and dispatches the activity to it. 3. The activity runs in the isolated sandbox, and its result flows back through DTS to the orchestrator. When the work is done, DTS tears the sandbox down. -## Configure the scheduler identity for image pull - -To start a sandbox, DTS pulls your worker image from your container registry on your -behalf. It does this using a **user-assigned managed identity** attached to the scheduler. -That identity must be granted the **AcrPull** role on the Azure Container Registry that -hosts your worker image, and the scheduler must have the identity attached. 
- -> [!IMPORTANT] -> Only **user-assigned** managed identities are supported. System-assigned managed -> identities are not supported at this time. - -The worker profile distinguishes two identities, and you can use the same identity for -both or split them: - -- **Image-pull identity** (`Image.ManagedIdentityClientId` / - `image.managed_identity_client_id`) — the identity DTS uses to **pull the worker image** - from your registry. This identity needs the **AcrPull** role on the registry. -- **Worker/scheduler identity** (`SchedulerManagedIdentityClientId` / - `scheduler_managed_identity_client_id`) — the identity the **sandbox worker uses to - connect back to Durable Task Scheduler**, and the identity your activity code runs as - when it calls other services (for example, Storage, Key Vault, or a database). Grant - this identity whatever roles your activity code needs on those downstream services. - -Both identities must be attached to the scheduler. Using two separate identities lets you -scope image-pull permissions narrowly while granting your activity code only the -downstream permissions it needs. - -### 1. Grant the identity the AcrPull role on your registry - -Assign the **AcrPull** role to the **image-pull** user-assigned managed identity, scoped -to your registry: - -```bash -az role assignment create \ - --assignee "" \ - --role "AcrPull" \ - --scope "/subscriptions//resourceGroups//providers/Microsoft.ContainerRegistry/registries/" -``` - -Without this role assignment, DTS cannot pull the worker image and the sandbox will fail -to start. If your activity code calls other Azure services, grant the **worker/scheduler** -identity the roles it needs on those services as well. - -### 2. Attach the identity to the scheduler - -The scheduler must have the user-assigned identity attached. You can attach it when you -create the scheduler, or update an existing scheduler. 
- -> [!IMPORTANT] -> Managing scheduler identities requires API version **2026-05-01-preview** or later. See -> the [Schedulers - Create Or Update](https://learn.microsoft.com/rest/api/durabletask/schedulers/create-or-update?view=rest-durabletask-2026-05-01-preview&tabs=HTTP#managedserviceidentity) -> REST API reference. - -**For an existing scheduler**, send a PATCH to the scheduler resource URI. You can attach -multiple identities: - -```bash -az rest --method patch \ - --uri "https://management.azure.com/subscriptions//resourceGroups//providers/Microsoft.DurableTask/schedulers/?api-version=2026-05-01-preview" \ - --body '{ - "identity": { - "type": "UserAssigned", - "userAssignedIdentities": { - "/subscriptions//resourceGroups//providers/Microsoft.ManagedIdentity/userAssignedIdentities/": {} - } - } - }' -``` - -You can also include the same `identity` block directly in the body when **creating** a -scheduler. - -Once the identities are attached to the scheduler—the image-pull identity with the -**AcrPull** role on your registry—reference their client IDs on the worker profile -(`Image.ManagedIdentityClientId` / `image.managed_identity_client_id` and -`SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id`) so DTS uses -the image-pull identity to pull the image and the worker/scheduler identity for the -sandbox worker to connect back to DTS and call downstream services. +## Get started -## Choose your language +On-demand Sandboxes is in private preview. To get access, email +[dts-team@microsoft.com](mailto:dts-team@microsoft.com). You'll need a scheduler in one of +the [supported preview regions](#get-private-preview-access). 
-Follow the step-by-step guide for your SDK: +Once you're in, follow the step-by-step guide for your SDK: -- **[.NET guide](./docs/dotnet.md)** — declare a sandbox worker profile and build the worker +- **[.NET guide](./docs/dotnet.md):** declare a sandbox worker profile and build the worker image with the .NET Durable Task SDK. -- **[Python guide](./docs/python.md)** — declare a sandbox worker profile and build the worker +- **[Python guide](./docs/python.md):** declare a sandbox worker profile and build the worker image with the Python Durable Task SDK. Both guides follow the same shape: declare a sandbox worker profile in your orchestrator -app, build and push a worker image, then view execution logs in the DTS dashboard. - -## Worker profile configuration reference - -Both languages configure the same worker profile settings. The table below lists each -setting, what it controls, its accepted values, and its default. The setting names differ -slightly between .NET (`PascalCase`) and Python (`snake_case`) but map one to one. - -| Setting (.NET / Python) | What it controls | Accepted values | Default | -| --- | --- | --- | --- | -| `Image.ImageRef` / `image.image_ref` | The container image that holds your activity implementations. | A full OCI image reference, by tag (`myregistry.azurecr.io/workers/hello:1.0`) or digest (`myregistry.azurecr.io/workers/hello@sha256:...`). | *Required* | -| `Image.ManagedIdentityClientId` / `image.managed_identity_client_id` | The client ID of the user-assigned managed identity DTS uses to **pull the worker image** from your registry. This identity needs the **AcrPull** role on the registry. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. 
| *Required* | -| `SchedulerManagedIdentityClientId` / `scheduler_managed_identity_client_id` | The client ID of the user-assigned managed identity the **sandbox worker uses to connect back to DTS**, and that the activity code runs as when calling other services. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. Can be the same identity as the image-pull identity or a different one. | *Required* | -| `Cpu` / `cpu` | CPU quantity declared for each sandbox. | A positive CPU quantity, expressed in millicores (`500m`, `1000m`) or whole/fractional cores (`2`, `0.5`). | `1000m` (1 vCPU) | -| `Memory` / `memory` | Memory quantity declared for each sandbox. | A positive memory quantity, such as `256Mi`, `1Gi`, or a bare number interpreted as MiB (`2048`). | `2048Mi` | -| `MaxConcurrentActivities` / `max_concurrent_activities` | How many activities a single sandbox worker instance processes concurrently. | An integer greater than `0`. There is no enforced upper bound; size it to what your activity and resource shape can handle. | `100` | -| `EnvironmentVariables` / `environment_variables` | Customer environment variables injected into the sandbox at runtime. | A map of string keys to string values. | Empty | -| *(profile id)* | Friendly profile id that groups the image, resources, and activities for monitoring and reuse. | A non-empty string, unique across your declared profiles. | `default` | -| `AddActivity` / `add_activity` | The activity names this profile offloads to the sandbox. | One or more activity names. At least one is required; an activity can belong to only one profile. | *Required* | - -> [!NOTE] -> CPU and memory must be positive resource quantities. The platform may apply additional -> per-preview ceilings on the total CPU and memory a sandbox can request—check your -> private preview onboarding details for the current limits. 
- -## View logs in the DTS dashboard - -Once your sandbox activities are running, you can view their execution logs directly in -the Durable Task Scheduler dashboard. The dashboard shows real-time output from your -managed workers, including stdout, stderr, and activity lifecycle events—giving you full -visibility into what's happening inside the sandbox without configuring external log -sinks or building your own observability pipeline. - -## Get started - -On-demand Sandboxes is in private preview. To get access, -[sign up here](https://techcommunity.microsoft.com/blog/AppsonAzureBlog/introducing-on-demand-sandboxes-for-azure-durable-task-scheduler-private-preview/4522333). -Once you're in, the workflow is straightforward: declare a sandbox worker profile in -your orchestrator app, build and push a worker image, and DTS takes care of the rest. +app, build and push a worker image, then view execution logs in the DTS dashboard. Each +guide also includes a worker profile configuration reference for its SDK. ## Related resources diff --git a/preview-features/on-demand-sandboxes/docs/dotnet.md b/preview-features/on-demand-sandboxes/docs/dotnet.md index 76e3229c..34db4596 100644 --- a/preview-features/on-demand-sandboxes/docs/dotnet.md +++ b/preview-features/on-demand-sandboxes/docs/dotnet.md @@ -1,14 +1,14 @@ -# On-demand Sandboxes — .NET guide +# On-demand Sandboxes: .NET guide > **Status:** Private preview · [Back to overview](./README.md) This guide walks through using On-demand Sandboxes with the **.NET** Durable Task SDK. -Make sure you've reviewed the [prerequisites](./README.md#prerequisites) first. +Make sure you've read the [overview](./README.md) first. On-demand Sandboxes use a two-part model: a **sandbox worker profile** in your orchestrator app that tells DTS which activities to offload, and a **worker image** that contains those activity implementations. 
Your orchestrator still calls activities the -same way it always has—the decision to run one in a sandbox lives entirely in the profile +same way it always has. The decision to run one in a sandbox lives entirely in the profile configuration. ## Install the preview packages @@ -16,9 +16,9 @@ configuration. The on-demand sandbox APIs ship in two opt-in preview packages that layer on top of the Azure-managed client and worker packages: -- `Microsoft.DurableTask.Client.AzureManaged.Sandboxes` — declarer-app side +- `Microsoft.DurableTask.Client.AzureManaged.Sandboxes`, the declarer-app side (`[SandboxWorkerProfile]`, `SandboxWorkerProfileOptions`, `SandboxActivitiesClient`). -- `Microsoft.DurableTask.Worker.AzureManaged.Sandboxes` — sandbox-worker side +- `Microsoft.DurableTask.Worker.AzureManaged.Sandboxes`, the sandbox-worker side (`UseSandboxWorker()`). Add the client and worker packages to your orchestrator app, and the worker package to @@ -33,7 +33,7 @@ dotnet add package Microsoft.DurableTask.Worker.AzureManaged.Sandboxes --version dotnet add package Microsoft.DurableTask.Worker.AzureManaged.Sandboxes --version 1.25.0-preview.2 ``` -## Step 1 — Declare a sandbox worker profile +## Step 1: Declare a sandbox worker profile In the app that hosts your orchestrator, define a sandbox worker profile. The profile gives DTS the container image of your activity code, the managed identities DTS uses to @@ -68,7 +68,7 @@ internal sealed class CodeSandboxWorkerProfile : ISandboxWorkerProfile > image-pull identity must have the **AcrPull** role on your container registry, and the > worker/scheduler identity must have whatever roles your activity code needs on the > downstream services it calls. You can use the same identity for both or split them. See -> [Configure the scheduler identity for image pull](./README.md#configure-the-scheduler-identity-for-image-pull). +> [Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull). 
Then, in the main app, enable work-item filters, register the sandbox activities client, and declare the profiles with DTS: @@ -92,7 +92,7 @@ builder.Services.AddDurableTaskSchedulerSandboxActivitiesClient(); ``` `UseWorkItemFilters()` is required: without it, DTS can dispatch a sandbox activity to -your in-process worker—which doesn't implement it—and the orchestration gets stuck +your in-process worker, which doesn't implement it, and the orchestration gets stuck retrying the wrong worker. Once the host is running, publish the declared profiles to DTS so it can route their @@ -104,22 +104,10 @@ SandboxActivitiesClient sandboxActivitiesClient = await sandboxActivitiesClient.EnableSandboxActivitiesAsync(); ``` -`EnableSandboxActivitiesAsync()` is the key call—it registers your sandbox worker profiles +`EnableSandboxActivitiesAsync()` is the key call. It registers your sandbox worker profiles with DTS so it picks them up and routes their declared activities to managed compute. Without it, those activities won't be offloaded. -For the meaning, accepted values, and defaults of each `SandboxWorkerProfileOptions` -setting, see the -[worker profile configuration reference](./README.md#worker-profile-configuration-reference). -In short: `Image.ImageRef` is the image with your activity implementations; -`Image.ManagedIdentityClientId` is the managed identity DTS uses to **pull the worker -image** from your registry (needs **AcrPull**), while `SchedulerManagedIdentityClientId` -is the managed identity the **sandbox worker uses to connect back to DTS** and that the -activity code runs as when it calls other services; `Cpu` / `Memory` set the per-sandbox -resource shape; `MaxConcurrentActivities` sets concurrency; and `AddActivity` selects the -activities to offload (only added activities run in DTS-managed isolated compute; -everything else stays in-process). 
- The orchestrator call site doesn't change: ```csharp @@ -131,7 +119,98 @@ ExecuteCodeOutput execution = await context.CallActivityAsync Because `ExecuteCode` is not registered in the main app's in-process activity list, DTS uses the profile to route the work to the sandbox image when the orchestrator calls it. -## Step 2 — Build the worker image +### Worker profile configuration reference + +The table below lists each `SandboxWorkerProfileOptions` setting, what it controls, its +accepted values, and its default. + +| Setting | What it controls | Accepted values | Default | +| --- | --- | --- | --- | +| `Image.ImageRef` | The container image that holds your activity implementations. | A full OCI image reference, by tag (`myregistry.azurecr.io/workers/hello:1.0`) or digest (`myregistry.azurecr.io/workers/hello@sha256:...`). | *Required* | +| `Image.ManagedIdentityClientId` | The client ID of the user-assigned managed identity DTS uses to **pull the worker image** from your registry. This identity needs the **AcrPull** role on the registry. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. | *Required* | +| `SchedulerManagedIdentityClientId` | The client ID of the user-assigned managed identity the **sandbox worker uses to connect back to DTS**, and that the activity code runs as when calling other services. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. Can be the same identity as the image-pull identity or a different one. | *Required* | +| `Cpu` | CPU quantity declared for each sandbox. | A positive CPU quantity, expressed in millicores (`500m`, `1000m`) or whole/fractional cores (`2`, `0.5`). | `1000m` (1 vCPU) | +| `Memory` | Memory quantity declared for each sandbox. | A positive memory quantity, such as `256Mi`, `1Gi`, or a bare number interpreted as MiB (`2048`). 
| `2048Mi` | +| `MaxConcurrentActivities` | How many activities a single sandbox worker instance processes concurrently. | An integer greater than `0`. There is no enforced upper bound; size it to what your activity and resource shape can handle. | `100` | +| `EnvironmentVariables` | Customer environment variables injected into the sandbox at runtime. | A map of string keys to string values. | Empty | +| *(profile id)* | Friendly profile id that groups the image, resources, and activities for monitoring and reuse. | A non-empty string, unique across your declared profiles. | `default` | +| `AddActivity` | The activity names this profile offloads to the sandbox. | One or more activity names. At least one is required; an activity can belong to only one profile. | *Required* | + +> [!NOTE] +> CPU and memory must be positive resource quantities. The platform may apply additional +> per-preview ceilings on the total CPU and memory a sandbox can request. Check your +> private preview onboarding details for the current limits. + +## Configure the scheduler identity for image pull + +To start a sandbox, DTS pulls your worker image from your container registry on your +behalf. It does this using a **user-assigned managed identity** attached to the scheduler. +That identity must be granted the **AcrPull** role on the Azure Container Registry that +hosts your worker image, and the scheduler must have the identity attached. + +> [!IMPORTANT] +> Only **user-assigned** managed identities are supported. System-assigned managed +> identities are not supported at this time. + +The worker profile distinguishes two identities, and you can use the same identity for +both or split them: + +- **Image-pull identity** (`options.Image.ManagedIdentityClientId`): the identity DTS + uses to **pull the worker image** from your registry. This identity needs the **AcrPull** + role on the registry. 
+- **Worker/scheduler identity** (`options.SchedulerManagedIdentityClientId`): the + identity the **sandbox worker uses to connect back to Durable Task Scheduler**, and the + identity your activity code runs as when it calls other services (for example, Storage, + Key Vault, or a database). Grant this identity whatever roles your activity code needs on + those downstream services. + +Both identities must be attached to the scheduler. Using two separate identities lets you +scope image-pull permissions narrowly while granting your activity code only the +downstream permissions it needs. + +### 1. Grant the identity the AcrPull role on your registry + +Assign the **AcrPull** role to the **image-pull** user-assigned managed identity, scoped +to your registry: + +```bash +az role assignment create \ + --assignee "" \ + --role "AcrPull" \ + --scope "/subscriptions//resourceGroups//providers/Microsoft.ContainerRegistry/registries/" +``` + +Without this role assignment, DTS cannot pull the worker image and the sandbox will fail +to start. If your activity code calls other Azure services, grant the **worker/scheduler** +identity the roles it needs on those services as well. + +### 2. Attach the identity to the scheduler + +The scheduler must have the user-assigned identity attached. The +[`durabletask` Azure CLI extension](https://learn.microsoft.com/cli/azure/durabletask) +provides identity commands that handle this for you. Install it with +`az extension add --name durabletask` if you haven't already. + +Attach the identity to an existing scheduler: + +```bash +az durabletask scheduler identity assign \ + --resource-group "" \ + --name "" \ + --user-assigned "/subscriptions//resourceGroups//providers/Microsoft.ManagedIdentity/userAssignedIdentities/" +``` + +To attach multiple identities (for example, separate image-pull and worker/scheduler +identities), pass several space-separated resource IDs to `--user-assigned`. 
Verify what's +attached with `az durabletask scheduler identity show --resource-group "" --name ""`. + +Once the identities are attached to the scheduler (the image-pull identity with the +**AcrPull** role on your registry), reference their client IDs on the worker profile +(`options.Image.ManagedIdentityClientId` and `options.SchedulerManagedIdentityClientId`) +so DTS uses the image-pull identity to pull the image and the worker/scheduler identity for +the sandbox worker to connect back to DTS and call downstream services. + +## Step 2: Build the worker image The worker image is a container you own. In most apps, this worker lives in a separate project from the orchestrator host so it can have its own entry point, dependencies, and @@ -150,13 +229,13 @@ builder.Services.AddDurableTaskWorker(workerBuilder => }); ``` -`UseSandboxWorker()` is the key line—it signals that this worker runs in DTS-managed +`UseSandboxWorker()` is the key line. It signals that this worker runs in DTS-managed compute. The sandbox worker does **not** need to configure the DTS endpoint, task hub, profile id, or credentials; DTS injects the runtime settings when it starts the container. The activity implementations themselves are standard Durable Task activities. There's -nothing special about the activity code—it can call a runtime with different +nothing special about the activity code. It can call a runtime with different dependencies (for example, Python and pandas) while running in an isolated container instead of in your main app's process. @@ -166,17 +245,17 @@ Container Registry) and reference the image in the worker profile's `Image.Image option. The image-pull identity you set in `Image.ManagedIdentityClientId` must have the **AcrPull** role on that registry. -## Step 3 — View logs in the DTS dashboard +## Step 3: View logs in the DTS dashboard Once your sandbox activities are running, you can view their execution logs directly in -the Durable Task Scheduler dashboard. 
See -[View logs in the DTS dashboard](./README.md#view-logs-in-the-dts-dashboard) in the -overview for details. +the Durable Task Scheduler dashboard. The dashboard shows real-time output from your +managed workers, including stdout, stderr, and activity lifecycle events, giving you full +visibility into what's happening inside the sandbox without configuring external log +sinks or building your own observability pipeline. ## Next steps -- [Worker profile configuration reference](./README.md#worker-profile-configuration-reference) -- [Configure the scheduler identity for image pull](./README.md#configure-the-scheduler-identity-for-image-pull) +- [Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull) - [End-to-end .NET sample](../samples/dotnet) - [Python guide](./python.md) - [Back to overview](./README.md) diff --git a/preview-features/on-demand-sandboxes/docs/python.md b/preview-features/on-demand-sandboxes/docs/python.md index ae168d84..07714d87 100644 --- a/preview-features/on-demand-sandboxes/docs/python.md +++ b/preview-features/on-demand-sandboxes/docs/python.md @@ -1,26 +1,27 @@ -# On-demand Sandboxes — Python guide +# On-demand Sandboxes: Python guide > **Status:** Private preview · [Back to overview](./README.md) This guide walks through using On-demand Sandboxes with the **Python** Durable Task SDK. -Make sure you've reviewed the [prerequisites](./README.md#prerequisites) first. +Make sure you've read the [overview](./README.md) first. On-demand Sandboxes use a two-part model: a **sandbox worker profile** (the *declarer app*) that tells DTS which activities to offload, and a **worker image** that contains those activity implementations. Your orchestrator still calls activities the same way it -always has—the decision to run one in a sandbox lives entirely in the profile +always has. The decision to run one in a sandbox lives entirely in the profile configuration. 
## Install the SDK The on-demand sandbox APIs ship under the `durabletask.azuremanaged.preview.sandboxes` -namespace. Install the Durable Task packages: +namespace. Install the Azure-managed Durable Task package (it pulls in the core +`durabletask` SDK): ```bash -pip install durabletask==1.6.0 durabletask-azuremanaged==1.6.0 +pip install durabletask-azuremanaged==1.6.0 ``` -## Step 1 — Declare a sandbox worker profile +## Step 1: Declare a sandbox worker profile The declarer app uses a decorated profile class to declare the remote worker image and activity ownership, then enables sandbox activities on the DTS client. The profile sets @@ -100,7 +101,7 @@ with DurableTaskSchedulerWorker( print(state.serialized_output if state else "no result") ``` -`enable_sandbox_activities()` is the key call—it registers the declared profiles with DTS +`enable_sandbox_activities()` is the key call. It registers the declared profiles with DTS so it can route those activities to the sandbox image. `use_work_item_filters()` keeps sandbox activities from being dispatched to this in-process worker. @@ -110,28 +111,108 @@ sandbox activities from being dispatched to this in-process worker. > The image-pull identity must have the **AcrPull** role on your container registry, and > the worker/scheduler identity must have whatever roles your activity code needs on the > downstream services it calls (you can use the same identity for both or split them). See -> [Configure the scheduler identity for image pull](./README.md#configure-the-scheduler-identity-for-image-pull). - -For the meaning, accepted values, and defaults of each profile option, see the -[worker profile configuration reference](./README.md#worker-profile-configuration-reference). 
-In short: `image.image_ref` is the image with your activity implementations; -`image.managed_identity_client_id` is the managed identity DTS uses to **pull the worker -image** from your registry (needs **AcrPull**), while -`scheduler_managed_identity_client_id` is the managed identity the **sandbox worker uses -to connect back to DTS** and that the activity code runs as when it calls other services; -`cpu` / `memory` set the per-sandbox resource shape; `max_concurrent_activities` sets -concurrency; `environment_variables` injects customer environment variables; and -`add_activity(...)` selects the activities to offload (only added activities run in -DTS-managed isolated compute; everything else stays in-process). - -The orchestrator call site doesn't change—it calls `REMOTE_HELLO` the same way it would +> [Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull). + +The orchestrator call site doesn't change. It calls `REMOTE_HELLO` the same way it would call any activity, and DTS routes it to the sandbox. -## Step 2 — Build the worker image +### Worker profile configuration reference + +The table below lists each profile option, what it controls, its accepted values, and its +default. + +| Setting | What it controls | Accepted values | Default | +| --- | --- | --- | --- | +| `image.image_ref` | The container image that holds your activity implementations. | A full OCI image reference, by tag (`myregistry.azurecr.io/workers/hello:1.0`) or digest (`myregistry.azurecr.io/workers/hello@sha256:...`). | *Required* | +| `image.managed_identity_client_id` | The client ID of the user-assigned managed identity DTS uses to **pull the worker image** from your registry. This identity needs the **AcrPull** role on the registry. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. 
| *Required* | +| `scheduler_managed_identity_client_id` | The client ID of the user-assigned managed identity the **sandbox worker uses to connect back to DTS**, and that the activity code runs as when calling other services. | A user-assigned managed identity client ID (GUID). Must be attached to the scheduler. Can be the same identity as the image-pull identity or a different one. | *Required* | +| `cpu` | CPU quantity declared for each sandbox. | A positive CPU quantity, expressed in millicores (`500m`, `1000m`) or whole/fractional cores (`2`, `0.5`). | `1000m` (1 vCPU) | +| `memory` | Memory quantity declared for each sandbox. | A positive memory quantity, such as `256Mi`, `1Gi`, or a bare number interpreted as MiB (`2048`). | `2048Mi` | +| `max_concurrent_activities` | How many activities a single sandbox worker instance processes concurrently. | An integer greater than `0`. There is no enforced upper bound; size it to what your activity and resource shape can handle. | `100` | +| `environment_variables` | Customer environment variables injected into the sandbox at runtime. | A map of string keys to string values. | Empty | +| *(profile id)* | Friendly profile id that groups the image, resources, and activities for monitoring and reuse. | A non-empty string, unique across your declared profiles. | `default` | +| `add_activity(...)` | The activity names this profile offloads to the sandbox. | One or more activity names. At least one is required; an activity can belong to only one profile. | *Required* | + +> [!NOTE] +> CPU and memory must be positive resource quantities. The platform may apply additional +> per-preview ceilings on the total CPU and memory a sandbox can request. Check your +> private preview onboarding details for the current limits. + +## Configure the scheduler identity for image pull + +To start a sandbox, DTS pulls your worker image from your container registry on your +behalf. 
It does this using a **user-assigned managed identity** attached to the scheduler. +That identity must be granted the **AcrPull** role on the Azure Container Registry that +hosts your worker image, and the scheduler must have the identity attached. + +> [!IMPORTANT] +> Only **user-assigned** managed identities are supported. System-assigned managed +> identities are not supported at this time. + +The worker profile distinguishes two identities, and you can use the same identity for +both or split them: + +- **Image-pull identity** (`options.image.managed_identity_client_id`): the identity DTS + uses to **pull the worker image** from your registry. This identity needs the **AcrPull** + role on the registry. +- **Worker/scheduler identity** (`options.scheduler_managed_identity_client_id`): the + identity the **sandbox worker uses to connect back to Durable Task Scheduler**, and the + identity your activity code runs as when it calls other services (for example, Storage, + Key Vault, or a database). Grant this identity whatever roles your activity code needs on + those downstream services. + +Both identities must be attached to the scheduler. Using two separate identities lets you +scope image-pull permissions narrowly while granting your activity code only the +downstream permissions it needs. + +### 1. Grant the identity the AcrPull role on your registry + +Assign the **AcrPull** role to the **image-pull** user-assigned managed identity, scoped +to your registry: + +```bash +az role assignment create \ + --assignee "" \ + --role "AcrPull" \ + --scope "/subscriptions//resourceGroups//providers/Microsoft.ContainerRegistry/registries/" +``` + +Without this role assignment, DTS cannot pull the worker image and the sandbox will fail +to start. If your activity code calls other Azure services, grant the **worker/scheduler** +identity the roles it needs on those services as well. + +### 2. 
Attach the identity to the scheduler + +The scheduler must have the user-assigned identity attached. The +[`durabletask` Azure CLI extension](https://learn.microsoft.com/cli/azure/durabletask) +provides identity commands that handle this for you. Install it with +`az extension add --name durabletask` if you haven't already. + +Attach the identity to an existing scheduler: + +```bash +az durabletask scheduler identity assign \ + --resource-group "" \ + --name "" \ + --user-assigned "/subscriptions//resourceGroups//providers/Microsoft.ManagedIdentity/userAssignedIdentities/" +``` + +To attach multiple identities (for example, separate image-pull and worker/scheduler +identities), pass several space-separated resource IDs to `--user-assigned`. Verify what's +attached with `az durabletask scheduler identity show --resource-group "" --name ""`. + +Once the identities are attached to the scheduler (the image-pull identity with the +**AcrPull** role on your registry), reference their client IDs on the worker profile +(`options.image.managed_identity_client_id` and +`options.scheduler_managed_identity_client_id`) so DTS uses the image-pull identity to pull +the image and the worker/scheduler identity for the sandbox worker to connect back to DTS +and call downstream services. + +## Step 2: Build the worker image The worker image runs `SandboxWorker()`, registers the activity implementations it owns, and starts. The sandbox worker does **not** configure the DTS endpoint, task hub, profile -id, or credentials—`SandboxWorker()` reads the runtime settings (such as `DTS_ENDPOINT`, +id, or credentials. `SandboxWorker()` reads the runtime settings (such as `DTS_ENDPOINT`, `DTS_TASK_HUB`, `DTS_WORKER_PROFILE_ID`, and `DTS_SANDBOX_ID`) from environment variables that DTS injects when it starts the container. 
@@ -184,7 +265,7 @@ ENV GRPC_DEFAULT_SSL_ROOTS_FILE_PATH=/etc/ssl/certs/ca-certificates.crt # Install the Durable Task SDK (with the sandboxes extension), plus your # activity dependencies. -RUN pip install --no-cache-dir durabletask==1.6.0 durabletask-azuremanaged==1.6.0 +RUN pip install --no-cache-dir durabletask-azuremanaged==1.6.0 COPY remote_worker.py /app/remote_worker.py COPY activities.py /app/activities.py @@ -202,17 +283,17 @@ Then set the image reference on the declarer profile (for example, via the image-pull managed identity you configured on the profile, which must have the **AcrPull** role on your registry. -## Step 3 — View logs in the DTS dashboard +## Step 3: View logs in the DTS dashboard Once your sandbox activities are running, you can view their execution logs directly in -the Durable Task Scheduler dashboard. See -[View logs in the DTS dashboard](./README.md#view-logs-in-the-dts-dashboard) in the -overview for details. +the Durable Task Scheduler dashboard. The dashboard shows real-time output from your +managed workers, including stdout, stderr, and activity lifecycle events, giving you full +visibility into what's happening inside the sandbox without configuring external log +sinks or building your own observability pipeline. 
## Next steps -- [Worker profile configuration reference](./README.md#worker-profile-configuration-reference) -- [Configure the scheduler identity for image pull](./README.md#configure-the-scheduler-identity-for-image-pull) +- [Configure the scheduler identity for image pull](#configure-the-scheduler-identity-for-image-pull) - [End-to-end Python sample](../samples/python) - [.NET guide](./dotnet.md) - [Back to overview](./README.md) diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/README.md b/preview-features/on-demand-sandboxes/samples/dotnet/README.md index 58dfaf0f..fd2db14d 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/README.md +++ b/preview-features/on-demand-sandboxes/samples/dotnet/README.md @@ -3,14 +3,6 @@ A three-step Durable Task workflow that demonstrates the **On-demand Sandboxes** preview of Azure Durable Task Scheduler (DTS). -``` - ┌─────────────────────────┐ ┌─────────────────────────┐ ┌─────────────────────────┐ - │ GenerateCode │ │ ExecuteCode │ │ FormatAnswer │ - │ (in-process .NET) │ -> │ (on-demand sandbox) │ -> │ (in-process .NET) │ - │ Azure OpenAI -> Python │ │ python3 + pandas │ │ Pretty-print answer │ - └─────────────────────────┘ └─────────────────────────┘ └─────────────────────────┘ -``` - The orchestrator asks a natural-language question over `data/sales_q1.csv`. The LLM returns a self-contained pandas script. That script is **untrusted** code, so it runs in a DTS-managed on-demand sandbox - not in the orchestrator's process. The first and last @@ -58,7 +50,7 @@ flowchart LR - The orchestrator and its in-process activities (`GenerateCode`, `FormatAnswer`) run in the always-on `main-app` process and exchange work items with the DTS task hub. - `ExecuteCode` is declared as an on-demand sandbox activity by the `code-executor` worker profile (see `main-app/WorkerProfiles.cs`). The activity is never registered in the main app. 
- When the orchestrator calls `ExecuteCode`, the DTS on-demand sandbox runtime provisions a sandbox container from the profile's image. The sandbox picks up the work item, runs it, returns the result, and is scaled back to zero when idle. -- The orchestrator's call site (`CallActivityAsync(TaskNames.ExecuteCode, ...)`) is identical to any other activity call — the "this runs in a sandbox" decision lives entirely in the worker profile declaration. +- The orchestrator's call site (`CallActivityAsync(TaskNames.ExecuteCode, ...)`) is identical to any other activity call. The "this runs in a sandbox" decision lives entirely in the worker profile declaration. ## Layout @@ -84,13 +76,20 @@ dts-ondemand-sandbox-codegen-demo/ └── Containerfile ``` -## Prerequisites +## Prerequisites (local development) + +These prerequisites are for running the sample **locally** (building the sandbox image +and running the orchestrator on your machine). To deploy to Azure instead, skip to +[Deploy to Azure (AKS) with `azd`](#deploy-to-azure-aks-with-azd), which has its own +prerequisites. - .NET 10 SDK - Docker (for building the sandbox image) +- Azure CLI (`az`), signed in with access to the scheduler, ACR, and Azure OpenAI - A DTS scheduler + task hub you can hit -- An Azure Container Registry with anonymous pull enabled (so DTS can fetch the sandbox image) -- An Azure OpenAI deployment of a chat model (GPT-4o, GPT-4.1, etc.) +- An Azure Container Registry the image-pull identity can pull from (granted AcrPull) +- Two user-assigned managed identities (image pull + scheduler connect) +- An Azure OpenAI deployment of a chat model (GPT-5.1, GPT-5, GPT-4.1, etc.) - The Durable Task on-demand sandbox preview packages (`1.25.0-preview.2`) available on a NuGet feed you can restore from @@ -108,11 +107,16 @@ docker build \ -t $IMAGE \ . 
-# Enable anonymous pull so DTS can fetch the sandbox image without credentials -az acr update --name $ACR --anonymous-pull-enabled true - az acr login --name $ACR docker push $IMAGE + +# DTS pulls the sandbox image using the image-pull managed identity (not anonymous +# pull). Grant that identity AcrPull on the registry -- use the same UMI you pass as +# DTS_SANDBOX_IMAGE_PULL_UMI_CLIENT_ID when running the orchestrator. +az role assignment create \ + --assignee "" \ + --role AcrPull \ + --scope "$(az acr show --name $ACR --query id -o tsv)" ``` > **Note on `--platform linux/amd64`:** Required on Apple Silicon. The `Grpc.Tools` @@ -142,7 +146,7 @@ The orchestrator prints the question, the orchestration id, and the final answer The main-app console shows the AOAI-generated Python (prefixed `[generate]`) before it's handed off to the sandbox. The sandbox container logs (prefixed `[sandbox]`) stream through the DTS dashboard's **On-demand Sandboxes** tab while `ExecuteCode` -runs — that's where you see the code, dataset load, execution timing, and script output. +runs. That's where you see the code, dataset load, execution timing, and script output. ## Deploy to Azure (AKS) with `azd` @@ -151,10 +155,11 @@ Kubernetes Service** with [`azd`](https://learn.microsoft.com/azure/developer/az The sandbox worker image is built and pushed to ACR; DTS starts it on demand, so it is never deployed to the cluster. -> The Durable Task Scheduler is **not created** by this template — you pass in an +> The Durable Task Scheduler is **not created** by this template. You pass in an > existing one. On-demand Sandboxes is a private-preview feature that must be enabled on > the scheduler out of band, so the scheduler is patched separately and supplied here by -> name. +> name. The scheduler must be in a supported preview region: East US 2, West US 3, North +> Europe, or Australia East. ### What gets provisioned @@ -163,19 +168,19 @@ never deployed to the cluster. 
| **AKS cluster** | Hosts the `main-app` orchestrator pod (workload identity enabled) | | **Azure Container Registry** | Stores the main-app and sandbox-worker images (built server-side via ACR Tasks) | | **User-assigned managed identity** + federated credential | Pod auth to DTS/Azure OpenAI, ACR pull for the sandbox, and the sandbox's connection back to DTS | -| **Azure OpenAI** + `gpt-4o` deployment | Backs the in-process `GenerateCode` activity | -| **VNet** | Network isolation for AKS | +| **Azure OpenAI** + `gpt-5.1` deployment | Backs the in-process `GenerateCode` activity | The deployment also **ensures the task hub** exists, grants the identity the roles it needs (AcrPull, Durable Task data access, Cognitive Services OpenAI User), and a `postprovision` hook **attaches the identity to your scheduler** (a merge-safe PATCH). -### Prerequisites +### Prerequisites (Azure deployment) - An existing **DTS scheduler** with the On-demand Sandboxes preview enabled, and its resource group name. - [Azure Developer CLI (`azd`)](https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd), [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli), and [kubectl](https://kubernetes.io/docs/tasks/tools/). -- Azure OpenAI quota for `gpt-4o` in your target region. +- Azure OpenAI quota for `gpt-5.1` (`GlobalStandard`) in your target region (default + `eastus`; override with `AZURE_OPENAI_LOCATION`). 
### Deploy diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/azure.yaml b/preview-features/on-demand-sandboxes/samples/dotnet/azure.yaml index c2200786..24fa4e59 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/azure.yaml +++ b/preview-features/on-demand-sandboxes/samples/dotnet/azure.yaml @@ -12,11 +12,19 @@ metadata: name: dts-ondemand-sandboxes-dotnet hooks: predeploy: - shell: bash - run: ./scripts/acr-build.sh + posix: + shell: sh + run: ./scripts/acr-build.sh + windows: + shell: pwsh + run: ./scripts/acr-build.ps1 postprovision: - shell: bash - run: ./scripts/attach-scheduler-identity.sh + posix: + shell: sh + run: ./scripts/attach-scheduler-identity.sh + windows: + shell: pwsh + run: ./scripts/attach-scheduler-identity.ps1 services: mainapp: project: ./main-app diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/host/aks-cluster.bicep b/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/host/aks-cluster.bicep index d2a767e3..9830654b 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/host/aks-cluster.bicep +++ b/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/host/aks-cluster.bicep @@ -9,8 +9,8 @@ param location string = resourceGroup().location @description('Tags to apply to the AKS cluster') param tags object = {} -@description('The Kubernetes version for the AKS cluster') -param kubernetesVersion string = '1.32' +@description('The Kubernetes version for the AKS cluster. Leave empty to use the AKS default supported version.') +param kubernetesVersion string = '' @description('The VM size for the default node pool') param agentVMSize string = 'standard_d4s_v5' @@ -45,7 +45,7 @@ resource aksCluster 'Microsoft.ContainerService/managedClusters@2024-09-01' = { type: 'SystemAssigned' } properties: { - kubernetesVersion: kubernetesVersion + kubernetesVersion: !empty(kubernetesVersion) ? 
kubernetesVersion : null dnsPrefix: name enableRBAC: true agentPoolProfiles: [ diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/networking/vnet.bicep b/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/networking/vnet.bicep deleted file mode 100644 index 46764a8e..00000000 --- a/preview-features/on-demand-sandboxes/samples/dotnet/infra/core/networking/vnet.bicep +++ /dev/null @@ -1,40 +0,0 @@ -@description('The name of the Virtual Network') -param name string - -@description('The Azure region where the Virtual Network should exist') -param location string = resourceGroup().location - -@description('Optional tags for the resources') -param tags object = {} - -@description('The address prefixes of the Virtual Network') -param addressPrefixes array = ['10.0.0.0/16'] - -@description('The subnets to create in the Virtual Network') -param subnets array = [ - { - name: 'aks-subnet' - properties: { - addressPrefix: '10.0.0.0/21' - delegations: [] - privateEndpointNetworkPolicies: 'Disabled' - privateLinkServiceNetworkPolicies: 'Enabled' - } - } -] - -resource vnet 'Microsoft.Network/virtualNetworks@2023-11-01' = { - name: name - location: location - tags: tags - properties: { - addressSpace: { - addressPrefixes: addressPrefixes - } - subnets: subnets - } -} - -output id string = vnet.id -output name string = vnet.name -output aksSubnetId string = resourceId('Microsoft.Network/virtualNetworks/subnets', name, 'aks-subnet') diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.bicep b/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.bicep index 66ee57f2..7d1b31ca 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.bicep +++ b/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.bicep @@ -20,7 +20,7 @@ param principalId string = '' // AKS parameters param aksClusterName string = '' -param kubernetesVersion string = '1.32' +param kubernetesVersion string = '' param 
aksVmSize string = 'standard_d4s_v5' param aksNodeCount int = 2 @@ -83,19 +83,6 @@ module identity './app/user-assigned-identity.bicep' = { } } -// ============================ -// Networking -// ============================ - -module vnet './core/networking/vnet.bicep' = { - scope: rg - params: { - name: '${abbrs.networkVirtualNetworks}${resourceToken}' - location: location - tags: tags - } -} - // ============================ // Container Registry // ============================ @@ -138,7 +125,6 @@ module aksCluster './core/host/aks-cluster.bicep' = { kubernetesVersion: kubernetesVersion agentVMSize: aksVmSize agentCount: aksNodeCount - subnetId: vnet.outputs.aksSubnetId containerRegistryName: containerRegistry.outputs.name } } diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.parameters.json b/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.parameters.json index f98d2108..a2748fe0 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.parameters.json +++ b/preview-features/on-demand-sandboxes/samples/dotnet/infra/main.parameters.json @@ -19,6 +19,9 @@ }, "taskHubName": { "value": "${DTS_TASK_HUB=default}" + }, + "openAiLocation": { + "value": "${AZURE_OPENAI_LOCATION=eastus}" } } } diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Containerfile b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Containerfile index c0a2daf7..6e6f97f8 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Containerfile +++ b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Containerfile @@ -9,7 +9,7 @@ # under Docker's arm64 emulation on Apple Silicon. amd64 works under Rosetta and matches # what the cluster runs. 
-FROM --platform=$TARGETPLATFORM mcr.microsoft.com/dotnet/sdk:10.0 AS build +FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build ARG TARGETARCH WORKDIR /src/dts-ondemand-sandbox-codegen-demo diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Program.cs b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Program.cs index 1f36d793..cb405210 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Program.cs +++ b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/Program.cs @@ -115,6 +115,12 @@ Console.WriteLine(result?.ReadOutputAs() ?? ""); } -await host.StopAsync(); +// The app runs a single orchestration above. When deployed as an always-on +// Deployment, we keep the process alive afterwards so the pod stays Running +// instead of exiting (which would make Kubernetes restart it and schedule a +// new orchestration on every restart). Block until the host receives SIGTERM. +Console.WriteLine(); +Console.WriteLine("[demo] Orchestration complete. 
Idling; press Ctrl+C or send SIGTERM to exit."); +await host.WaitForShutdownAsync(); return 0; diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/main-app.csproj b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/main-app.csproj index 103b60dd..9dd2302a 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/main-app/main-app.csproj +++ b/preview-features/on-demand-sandboxes/samples/dotnet/main-app/main-app.csproj @@ -1,4 +1,4 @@ - + Exe @@ -7,6 +7,7 @@ enable CodegenMainApp Demo.Codegen.MainApp + 923d893f-feb6-44bc-a79c-89c368094649 diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/sandbox-worker/Containerfile b/preview-features/on-demand-sandboxes/samples/dotnet/sandbox-worker/Containerfile index 65ab938e..a756c9c9 100644 --- a/preview-features/on-demand-sandboxes/samples/dotnet/sandbox-worker/Containerfile +++ b/preview-features/on-demand-sandboxes/samples/dotnet/sandbox-worker/Containerfile @@ -15,7 +15,7 @@ # The sandbox worker is a .NET app that shells out to python3, so we install # pandas in the runtime stage. -FROM --platform=$TARGETPLATFORM mcr.microsoft.com/dotnet/sdk:10.0 AS build +FROM mcr.microsoft.com/dotnet/sdk:10.0 AS build ARG TARGETARCH WORKDIR /src/dts-ondemand-sandbox-codegen-demo diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.ps1 b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.ps1 new file mode 100644 index 00000000..60114f1f --- /dev/null +++ b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.ps1 @@ -0,0 +1,55 @@ +# Builds the two container images for the On-demand Sandboxes demo server-side using +# ACR Tasks (az acr build) - no local Docker required. Called by azd as a predeploy hook +# on Windows (the POSIX equivalent is acr-build.sh). +# +# - main-app : the orchestrator, deployed to AKS (azd reads SERVICE_MAINAPP_IMAGE_NAME +# and skips its own build/push). 
+# - sandbox : the worker image DTS starts on demand. Not deployed to AKS; its full +# image reference is handed to the app via DTS_SANDBOX_CONTAINER_IMAGE. + +$ErrorActionPreference = 'Stop' + +function Get-RequiredEnv([string]$name) { + $value = [Environment]::GetEnvironmentVariable($name) + if ([string]::IsNullOrEmpty($value)) { + throw "$name must be set" + } + return $value +} + +$Registry = Get-RequiredEnv 'AZURE_CONTAINER_REGISTRY_NAME' +$RegistryEndpoint = Get-RequiredEnv 'AZURE_CONTAINER_REGISTRY_ENDPOINT' +$EnvName = Get-RequiredEnv 'AZURE_ENV_NAME' +$Tag = "azd-deploy-$([DateTimeOffset]::UtcNow.ToUnixTimeSeconds())" + +# The .NET build context is the sample root so Directory.Build.props is available. +function Build-Image([string]$imageRepo, [string]$containerfile) { + $fullImage = "$RegistryEndpoint/${imageRepo}:$Tag" + + Write-Host "==> Building ${imageRepo}:$Tag via ACR Tasks (--platform linux/amd64)..." + # The classic ACR builder does not auto-populate the BuildKit TARGETARCH arg, so we + # pass it explicitly. We always build linux/amd64 here, so amd64 is correct. + az acr build ` + --registry $Registry ` + --image "${imageRepo}:$Tag" ` + --platform linux/amd64 ` + --build-arg TARGETARCH=amd64 ` + --file $containerfile ` + . ` + --no-logs ` + --output none + if ($LASTEXITCODE -ne 0) { throw "az acr build failed for $imageRepo" } + + return $fullImage +} + +$MainAppImage = Build-Image "dts-ondemand-sandboxes/main-app-$EnvName" "main-app/Containerfile" +$SandboxImage = Build-Image "dts-ondemand-sandboxes/sandbox-worker-$EnvName" "sandbox-worker/Containerfile" + +# azd uses SERVICE__IMAGE_NAME to skip its own build and deploy this image instead. +azd env set SERVICE_MAINAPP_IMAGE_NAME $MainAppImage +# The app declares the sandbox worker profile using this image reference. 
+azd env set DTS_SANDBOX_CONTAINER_IMAGE $SandboxImage + +Write-Host "==> main-app image : $MainAppImage" +Write-Host "==> sandbox image : $SandboxImage" diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.sh b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.sh index b6192a8e..ca88e8c6 100755 --- a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.sh +++ b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/acr-build.sh @@ -7,7 +7,7 @@ # - sandbox : the worker image DTS starts on demand. Not deployed to AKS; its full # image reference is handed to the app via DTS_SANDBOX_CONTAINER_IMAGE. -set -euo pipefail +set -eu REGISTRY="${AZURE_CONTAINER_REGISTRY_NAME:?AZURE_CONTAINER_REGISTRY_NAME must be set}" REGISTRY_ENDPOINT="${AZURE_CONTAINER_REGISTRY_ENDPOINT:?AZURE_CONTAINER_REGISTRY_ENDPOINT must be set}" @@ -21,10 +21,13 @@ build() { local full_image="${REGISTRY_ENDPOINT}/${image_repo}:${TAG}" echo "==> Building ${image_repo}:${TAG} via ACR Tasks (--platform linux/amd64)..." >&2 + # The classic ACR builder does not auto-populate the BuildKit TARGETARCH arg, so we + # pass it explicitly. We always build linux/amd64 here, so amd64 is correct. az acr build \ --registry "${REGISTRY}" \ --image "${image_repo}:${TAG}" \ --platform linux/amd64 \ + --build-arg TARGETARCH=amd64 \ --file "${containerfile}" \ . \ --no-logs \ diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.ps1 b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.ps1 new file mode 100644 index 00000000..cafc1c53 --- /dev/null +++ b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.ps1 @@ -0,0 +1,39 @@ +# Attaches the sample's user-assigned managed identity to the existing Durable Task +# Scheduler so DTS can use it to pull the sandbox image and let the sandbox worker +# connect back. 
Runs as an azd postprovision hook on Windows (the POSIX equivalent is +# attach-scheduler-identity.sh). Uses the `durabletask` Azure CLI extension's identity +# command, which adds the identity without removing any that are already attached to +# the scheduler. +# +# NOTE: enabling the On-demand Sandboxes preview *feature* on the scheduler is a +# separate, out-of-band step handled during private-preview onboarding. + +$ErrorActionPreference = 'Stop' + +function Get-RequiredEnv([string]$name) { + $value = [Environment]::GetEnvironmentVariable($name) + if ([string]::IsNullOrEmpty($value)) { + throw "$name must be set" + } + return $value +} + +$SubscriptionId = Get-RequiredEnv 'AZURE_SUBSCRIPTION_ID' +$SchedulerName = Get-RequiredEnv 'DTS_SCHEDULER_NAME' +$SchedulerRg = Get-RequiredEnv 'DTS_SCHEDULER_RESOURCE_GROUP' +$IdentityId = Get-RequiredEnv 'AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID' + +Write-Host "==> Ensuring the 'durabletask' Azure CLI extension is installed..." +az extension add --name durabletask --upgrade --only-show-errors | Out-Null +if ($LASTEXITCODE -ne 0) { throw "Failed to add the 'durabletask' az extension" } + +Write-Host "==> Attaching managed identity to scheduler '$SchedulerName'..." +az durabletask scheduler identity assign ` + --subscription $SubscriptionId ` + --resource-group $SchedulerRg ` + --name $SchedulerName ` + --user-assigned $IdentityId | Out-Null +if ($LASTEXITCODE -ne 0) { throw "Failed to assign identity to scheduler '$SchedulerName'" } + +$IdentityName = $IdentityId.Substring($IdentityId.LastIndexOf('/') + 1) +Write-Host "==> Done. Identity $IdentityName is attached to '$SchedulerName'." 
diff --git a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.sh b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.sh index 3c216af4..f2a6c8da 100755 --- a/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.sh +++ b/preview-features/on-demand-sandboxes/samples/dotnet/scripts/attach-scheduler-identity.sh @@ -1,51 +1,28 @@ #!/usr/bin/env bash # Attaches the sample's user-assigned managed identity to the existing Durable Task # Scheduler so DTS can use it to pull the sandbox image and let the sandbox worker -# connect back. Runs as an azd postprovision hook. The PATCH is merge-safe: it keeps -# any identities already attached to the scheduler. +# connect back. Runs as an azd postprovision hook. Uses the `durabletask` Azure CLI +# extension's identity command, which adds the identity without removing any that are +# already attached to the scheduler. # # NOTE: enabling the On-demand Sandboxes preview *feature* on the scheduler is a # separate, out-of-band step handled during private-preview onboarding. -set -euo pipefail +set -eu SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID:?AZURE_SUBSCRIPTION_ID must be set}" SCHEDULER_NAME="${DTS_SCHEDULER_NAME:?DTS_SCHEDULER_NAME must be set}" SCHEDULER_RG="${DTS_SCHEDULER_RESOURCE_GROUP:?DTS_SCHEDULER_RESOURCE_GROUP must be set}" IDENTITY_ID="${AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID:?AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID must be set}" -API_VERSION="2026-05-01-preview" -if ! command -v python3 >/dev/null 2>&1; then - echo "ERROR: python3 is required to merge the scheduler identity block." >&2 - exit 1 -fi +echo "==> Ensuring the 'durabletask' Azure CLI extension is installed..." 
+az extension add --name durabletask --upgrade --only-show-errors >/dev/null -URI="https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${SCHEDULER_RG}/providers/Microsoft.DurableTask/schedulers/${SCHEDULER_NAME}?api-version=${API_VERSION}" +echo "==> Attaching managed identity to scheduler '${SCHEDULER_NAME}'..." +az durabletask scheduler identity assign \ + --subscription "${SUBSCRIPTION_ID}" \ + --resource-group "${SCHEDULER_RG}" \ + --name "${SCHEDULER_NAME}" \ + --user-assigned "${IDENTITY_ID}" >/dev/null -echo "==> Reading current identity on scheduler '${SCHEDULER_NAME}'..." -CURRENT="$(az rest --method get --uri "${URI}")" - -BODY="$(IDENTITY_ID="${IDENTITY_ID}" python3 - "${CURRENT}" <<'PY' -import json, os, sys - -current = json.loads(sys.argv[1]) -identity_id = os.environ["IDENTITY_ID"] - -identity = current.get("identity") or {} -user_assigned = identity.get("userAssignedIdentities") or {} -user_assigned[identity_id] = {} - -current_type = identity.get("type", "") or "" -new_type = "SystemAssigned, UserAssigned" if "SystemAssigned" in current_type else "UserAssigned" - -print(json.dumps({"identity": {"type": new_type, "userAssignedIdentities": user_assigned}})) -PY -)" - -TMP="$(mktemp)" -trap 'rm -f "${TMP}"' EXIT -printf '%s' "${BODY}" > "${TMP}" - -echo "==> Attaching managed identity to scheduler..." -az rest --method patch --uri "${URI}" --body "@${TMP}" >/dev/null echo "==> Done. Identity ${IDENTITY_ID##*/} is attached to '${SCHEDULER_NAME}'." 
diff --git a/preview-features/on-demand-sandboxes/samples/python/Containerfile b/preview-features/on-demand-sandboxes/samples/python/Containerfile index a0d1a71c..93c025bd 100644 --- a/preview-features/on-demand-sandboxes/samples/python/Containerfile +++ b/preview-features/on-demand-sandboxes/samples/python/Containerfile @@ -20,7 +20,6 @@ ENV GRPC_DEFAULT_SSL_ROOTS_FILE_PATH=/etc/ssl/certs/ca-certificates.crt # Install the Durable Task SDK (with the sandboxes extension), plus pandas for # the LLM-generated scripts the sandbox executes. RUN pip install --no-cache-dir \ - durabletask==1.6.0 \ durabletask-azuremanaged==1.6.0 \ azure-identity \ "pandas==2.2.*" diff --git a/preview-features/on-demand-sandboxes/samples/python/README.md b/preview-features/on-demand-sandboxes/samples/python/README.md index cf79e16f..da37872f 100644 --- a/preview-features/on-demand-sandboxes/samples/python/README.md +++ b/preview-features/on-demand-sandboxes/samples/python/README.md @@ -1,21 +1,12 @@ # On-demand Sandboxes demo (Python): LLM-generated code interpreter -The Python port of the [.NET demo](../dotnet/README.md). A three-step Durable Task -workflow that demonstrates the **On-demand Sandboxes** preview of Azure Durable +A three-step Durable Task workflow that demonstrates the **On-demand Sandboxes** preview of Azure Durable Task Scheduler (DTS), using the `durabletask.azuremanaged.preview.sandboxes` package. -``` - ┌─────────────────────────┐ ┌─────────────────────────┐ ┌─────────────────────────┐ - │ generate_code │ │ execute_code │ │ format_answer │ - │ (in-process Python) │ -> │ (on-demand sandbox) │ -> │ (in-process Python) │ - │ Azure OpenAI -> Python │ │ python3 + pandas │ │ Pick top region │ - └─────────────────────────┘ └─────────────────────────┘ └─────────────────────────┘ -``` - The orchestrator asks a natural-language question over `data/sales_q1.csv`. The LLM returns a self-contained pandas script. 
That script is **untrusted** code, so it runs -in a DTS-managed on-demand sandbox — not in the orchestrator's process. The first and +in a DTS-managed on-demand sandbox, not in the orchestrator's process. The first and last activities stay in-process; `execute_code` is fanned out one sandbox execution per region partition. @@ -41,14 +32,20 @@ python/ registered on the main app worker. - `generate_code` and `format_answer` run in-process in the main app worker. -## Prerequisites +## Prerequisites (local development) + +These prerequisites are for running the sample **locally** (building the sandbox image +and running the orchestrator on your machine). To deploy to Azure instead, skip to +[Deploy to Azure (AKS) with `azd`](#deploy-to-azure-aks-with-azd), which has its own +prerequisites. - Python 3.12+ - Docker (to build the sandbox image) +- Azure CLI (`az`), signed in with access to the scheduler, ACR, and Azure OpenAI - A DTS scheduler + task hub with the On-demand Sandboxes preview enabled -- An Azure Container Registry the sandbox platform can pull from +- An Azure Container Registry the image-pull identity can pull from (granted AcrPull) - Two user-assigned managed identities (image pull + scheduler connect) -- An Azure OpenAI deployment of a chat model (GPT-4o, GPT-4.1, etc.) +- An Azure OpenAI deployment of a chat model (GPT-5.1, GPT-5, GPT-4.1, etc.) ## Install @@ -56,7 +53,6 @@ From the `python/` directory: ```bash pip install -r requirements.txt -pip install durabletask==1.6.0 durabletask-azuremanaged==1.6.0 ``` ## Build the sandbox image @@ -72,10 +68,16 @@ docker build \ -t $IMAGE \ . -# Enable anonymous pull so DTS can fetch the sandbox image without credentials -az acr update --name $ACR --anonymous-pull-enabled true az acr login --name $ACR docker push $IMAGE + +# DTS pulls the sandbox image using the image-pull managed identity (not anonymous +# pull). 
Grant that identity AcrPull on the registry -- use the same UMI you pass as +# DTS_SANDBOX_IMAGE_PULL_UMI_CLIENT_ID when running the orchestrator. +az role assignment create \ + --assignee "" \ + --role AcrPull \ + --scope "$(az acr show --name $ACR --query id -o tsv)" ``` ## Run the orchestrator @@ -109,10 +111,11 @@ Kubernetes Service** with [`azd`](https://learn.microsoft.com/azure/developer/az The sandbox worker image (`remote_worker.py`) is built and pushed to ACR; DTS starts it on demand, so it is never deployed to the cluster. -> The Durable Task Scheduler is **not created** by this template — you pass in an +> The Durable Task Scheduler is **not created** by this template. You pass in an > existing one. On-demand Sandboxes is a private-preview feature that must be enabled on > the scheduler out of band, so the scheduler is patched separately and supplied here by -> name. +> name. The scheduler must be in a supported preview region: East US 2, West US 3, North +> Europe, or Australia East. ### What gets provisioned @@ -121,19 +124,19 @@ on demand, so it is never deployed to the cluster. | **AKS cluster** | Hosts the `main_app` orchestrator pod (workload identity enabled) | | **Azure Container Registry** | Stores the main-app and sandbox-worker images (built server-side via ACR Tasks) | | **User-assigned managed identity** + federated credential | Pod auth to DTS/Azure OpenAI, ACR pull for the sandbox, and the sandbox's connection back to DTS | -| **Azure OpenAI** + `gpt-4o` deployment | Backs the in-process `generate_code` activity | -| **VNet** | Network isolation for AKS | +| **Azure OpenAI** + `gpt-5.1` deployment | Backs the in-process `generate_code` activity | The deployment also **ensures the task hub** exists, grants the identity the roles it needs (AcrPull, Durable Task data access, Cognitive Services OpenAI User), and a `postprovision` hook **attaches the identity to your scheduler** (a merge-safe PATCH). 
-### Prerequisites +### Prerequisites (Azure deployment) - An existing **DTS scheduler** with the On-demand Sandboxes preview enabled, and its resource group name. - [Azure Developer CLI (`azd`)](https://learn.microsoft.com/azure/developer/azure-developer-cli/install-azd), [Azure CLI](https://learn.microsoft.com/cli/azure/install-azure-cli), and [kubectl](https://kubernetes.io/docs/tasks/tools/). -- Azure OpenAI quota for `gpt-4o` in your target region. +- Azure OpenAI quota for `gpt-5.1` (`GlobalStandard`) in your target region (default + `eastus`; override with `AZURE_OPENAI_LOCATION`). ### Deploy diff --git a/preview-features/on-demand-sandboxes/samples/python/azure.yaml b/preview-features/on-demand-sandboxes/samples/python/azure.yaml index 058f3559..a1d29e5a 100644 --- a/preview-features/on-demand-sandboxes/samples/python/azure.yaml +++ b/preview-features/on-demand-sandboxes/samples/python/azure.yaml @@ -12,11 +12,19 @@ metadata: name: dts-ondemand-sandboxes-python hooks: predeploy: - shell: bash - run: ./scripts/acr-build.sh + posix: + shell: sh + run: ./scripts/acr-build.sh + windows: + shell: pwsh + run: ./scripts/acr-build.ps1 postprovision: - shell: bash - run: ./scripts/attach-scheduler-identity.sh + posix: + shell: sh + run: ./scripts/attach-scheduler-identity.sh + windows: + shell: pwsh + run: ./scripts/attach-scheduler-identity.ps1 services: mainapp: project: . 
diff --git a/preview-features/on-demand-sandboxes/samples/python/infra/core/host/aks-cluster.bicep b/preview-features/on-demand-sandboxes/samples/python/infra/core/host/aks-cluster.bicep index d2a767e3..9830654b 100644 --- a/preview-features/on-demand-sandboxes/samples/python/infra/core/host/aks-cluster.bicep +++ b/preview-features/on-demand-sandboxes/samples/python/infra/core/host/aks-cluster.bicep @@ -9,8 +9,8 @@ param location string = resourceGroup().location @description('Tags to apply to the AKS cluster') param tags object = {} -@description('The Kubernetes version for the AKS cluster') -param kubernetesVersion string = '1.32' +@description('The Kubernetes version for the AKS cluster. Leave empty to use the AKS default supported version.') +param kubernetesVersion string = '' @description('The VM size for the default node pool') param agentVMSize string = 'standard_d4s_v5' @@ -45,7 +45,7 @@ resource aksCluster 'Microsoft.ContainerService/managedClusters@2024-09-01' = { type: 'SystemAssigned' } properties: { - kubernetesVersion: kubernetesVersion + kubernetesVersion: !empty(kubernetesVersion) ? 
kubernetesVersion : null dnsPrefix: name enableRBAC: true agentPoolProfiles: [ diff --git a/preview-features/on-demand-sandboxes/samples/python/infra/core/networking/vnet.bicep b/preview-features/on-demand-sandboxes/samples/python/infra/core/networking/vnet.bicep deleted file mode 100644 index 46764a8e..00000000 --- a/preview-features/on-demand-sandboxes/samples/python/infra/core/networking/vnet.bicep +++ /dev/null @@ -1,40 +0,0 @@ -@description('The name of the Virtual Network') -param name string - -@description('The Azure region where the Virtual Network should exist') -param location string = resourceGroup().location - -@description('Optional tags for the resources') -param tags object = {} - -@description('The address prefixes of the Virtual Network') -param addressPrefixes array = ['10.0.0.0/16'] - -@description('The subnets to create in the Virtual Network') -param subnets array = [ - { - name: 'aks-subnet' - properties: { - addressPrefix: '10.0.0.0/21' - delegations: [] - privateEndpointNetworkPolicies: 'Disabled' - privateLinkServiceNetworkPolicies: 'Enabled' - } - } -] - -resource vnet 'Microsoft.Network/virtualNetworks@2023-11-01' = { - name: name - location: location - tags: tags - properties: { - addressSpace: { - addressPrefixes: addressPrefixes - } - subnets: subnets - } -} - -output id string = vnet.id -output name string = vnet.name -output aksSubnetId string = resourceId('Microsoft.Network/virtualNetworks/subnets', name, 'aks-subnet') diff --git a/preview-features/on-demand-sandboxes/samples/python/infra/main.bicep b/preview-features/on-demand-sandboxes/samples/python/infra/main.bicep index 66ee57f2..7d1b31ca 100644 --- a/preview-features/on-demand-sandboxes/samples/python/infra/main.bicep +++ b/preview-features/on-demand-sandboxes/samples/python/infra/main.bicep @@ -20,7 +20,7 @@ param principalId string = '' // AKS parameters param aksClusterName string = '' -param kubernetesVersion string = '1.32' +param kubernetesVersion string = '' param 
aksVmSize string = 'standard_d4s_v5' param aksNodeCount int = 2 @@ -83,19 +83,6 @@ module identity './app/user-assigned-identity.bicep' = { } } -// ============================ -// Networking -// ============================ - -module vnet './core/networking/vnet.bicep' = { - scope: rg - params: { - name: '${abbrs.networkVirtualNetworks}${resourceToken}' - location: location - tags: tags - } -} - // ============================ // Container Registry // ============================ @@ -138,7 +125,6 @@ module aksCluster './core/host/aks-cluster.bicep' = { kubernetesVersion: kubernetesVersion agentVMSize: aksVmSize agentCount: aksNodeCount - subnetId: vnet.outputs.aksSubnetId containerRegistryName: containerRegistry.outputs.name } } diff --git a/preview-features/on-demand-sandboxes/samples/python/infra/main.parameters.json b/preview-features/on-demand-sandboxes/samples/python/infra/main.parameters.json index f98d2108..a2748fe0 100644 --- a/preview-features/on-demand-sandboxes/samples/python/infra/main.parameters.json +++ b/preview-features/on-demand-sandboxes/samples/python/infra/main.parameters.json @@ -19,6 +19,9 @@ }, "taskHubName": { "value": "${DTS_TASK_HUB=default}" + }, + "openAiLocation": { + "value": "${AZURE_OPENAI_LOCATION=eastus}" } } } diff --git a/preview-features/on-demand-sandboxes/samples/python/main_app.py b/preview-features/on-demand-sandboxes/samples/python/main_app.py index 44c65071..b96948a8 100644 --- a/preview-features/on-demand-sandboxes/samples/python/main_app.py +++ b/preview-features/on-demand-sandboxes/samples/python/main_app.py @@ -12,6 +12,7 @@ import os import sys +import time from azure.identity import DefaultAzureCredential, get_bearer_token_provider from openai import AzureOpenAI @@ -251,6 +252,18 @@ def main() -> int: print(state.serialized_output) elif state and state.failure_details: print(f"[failure] {state.failure_details}") + + # The app runs a single orchestration above. 
When deployed as an always-on + # Deployment, we keep the process (and worker) alive afterwards so the pod + # stays Running instead of exiting -- exiting would make Kubernetes restart + # the pod and schedule a new orchestration on every restart. Idle until the + # pod receives SIGTERM (or Ctrl+C locally). + print("\n[demo] Orchestration complete. Idling; send SIGTERM or Ctrl+C to exit.") + try: + while True: + time.sleep(3600) + except KeyboardInterrupt: + pass return 0 diff --git a/preview-features/on-demand-sandboxes/samples/python/requirements.txt b/preview-features/on-demand-sandboxes/samples/python/requirements.txt index 4b56dadc..0dc0dbb4 100644 --- a/preview-features/on-demand-sandboxes/samples/python/requirements.txt +++ b/preview-features/on-demand-sandboxes/samples/python/requirements.txt @@ -1,5 +1,4 @@ # Declarer-app (main_app.py) dependencies. -durabletask==1.6.0 durabletask-azuremanaged==1.6.0 azure-identity>=1.16 openai>=1.40 diff --git a/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.ps1 b/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.ps1 new file mode 100644 index 00000000..a4e1b4aa --- /dev/null +++ b/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.ps1 @@ -0,0 +1,52 @@ +# Builds the two container images for the On-demand Sandboxes demo server-side using +# ACR Tasks (az acr build) - no local Docker required. Called by azd as a predeploy hook +# on Windows (the POSIX equivalent is acr-build.sh). +# +# - main-app : the orchestrator (main_app.py), deployed to AKS (azd reads +# SERVICE_MAINAPP_IMAGE_NAME and skips its own build/push). +# - sandbox : the worker image (remote_worker.py) DTS starts on demand. Not deployed +# to AKS; its full image reference is handed to the app via +# DTS_SANDBOX_CONTAINER_IMAGE. 
+ +$ErrorActionPreference = 'Stop' + +function Get-RequiredEnv([string]$name) { + $value = [Environment]::GetEnvironmentVariable($name) + if ([string]::IsNullOrEmpty($value)) { + throw "$name must be set" + } + return $value +} + +$Registry = Get-RequiredEnv 'AZURE_CONTAINER_REGISTRY_NAME' +$RegistryEndpoint = Get-RequiredEnv 'AZURE_CONTAINER_REGISTRY_ENDPOINT' +$EnvName = Get-RequiredEnv 'AZURE_ENV_NAME' +$Tag = "azd-deploy-$([DateTimeOffset]::UtcNow.ToUnixTimeSeconds())" + +function Build-Image([string]$imageRepo, [string]$containerfile) { + $fullImage = "$RegistryEndpoint/${imageRepo}:$Tag" + + Write-Host "==> Building ${imageRepo}:$Tag via ACR Tasks (--platform linux/amd64)..." + az acr build ` + --registry $Registry ` + --image "${imageRepo}:$Tag" ` + --platform linux/amd64 ` + --file $containerfile ` + . ` + --no-logs ` + --output none + if ($LASTEXITCODE -ne 0) { throw "az acr build failed for $imageRepo" } + + return $fullImage +} + +$MainAppImage = Build-Image "dts-ondemand-sandboxes/main-app-$EnvName" "Containerfile.mainapp" +$SandboxImage = Build-Image "dts-ondemand-sandboxes/sandbox-worker-$EnvName" "Containerfile" + +# azd uses SERVICE__IMAGE_NAME to skip its own build and deploy this image instead. +azd env set SERVICE_MAINAPP_IMAGE_NAME $MainAppImage +# The app declares the sandbox worker profile using this image reference. +azd env set DTS_SANDBOX_CONTAINER_IMAGE $SandboxImage + +Write-Host "==> main-app image : $MainAppImage" +Write-Host "==> sandbox image : $SandboxImage" diff --git a/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.sh b/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.sh index 6f93d402..567155f7 100755 --- a/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.sh +++ b/preview-features/on-demand-sandboxes/samples/python/scripts/acr-build.sh @@ -8,7 +8,7 @@ # to AKS; its full image reference is handed to the app via # DTS_SANDBOX_CONTAINER_IMAGE. 
-set -euo pipefail +set -eu REGISTRY="${AZURE_CONTAINER_REGISTRY_NAME:?AZURE_CONTAINER_REGISTRY_NAME must be set}" REGISTRY_ENDPOINT="${AZURE_CONTAINER_REGISTRY_ENDPOINT:?AZURE_CONTAINER_REGISTRY_ENDPOINT must be set}" diff --git a/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.ps1 b/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.ps1 new file mode 100644 index 00000000..cafc1c53 --- /dev/null +++ b/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.ps1 @@ -0,0 +1,39 @@ +# Attaches the sample's user-assigned managed identity to the existing Durable Task +# Scheduler so DTS can use it to pull the sandbox image and let the sandbox worker +# connect back. Runs as an azd postprovision hook on Windows (the POSIX equivalent is +# attach-scheduler-identity.sh). Uses the `durabletask` Azure CLI extension's identity +# command, which adds the identity without removing any that are already attached to +# the scheduler. +# +# NOTE: enabling the On-demand Sandboxes preview *feature* on the scheduler is a +# separate, out-of-band step handled during private-preview onboarding. + +$ErrorActionPreference = 'Stop' + +function Get-RequiredEnv([string]$name) { + $value = [Environment]::GetEnvironmentVariable($name) + if ([string]::IsNullOrEmpty($value)) { + throw "$name must be set" + } + return $value +} + +$SubscriptionId = Get-RequiredEnv 'AZURE_SUBSCRIPTION_ID' +$SchedulerName = Get-RequiredEnv 'DTS_SCHEDULER_NAME' +$SchedulerRg = Get-RequiredEnv 'DTS_SCHEDULER_RESOURCE_GROUP' +$IdentityId = Get-RequiredEnv 'AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID' + +Write-Host "==> Ensuring the 'durabletask' Azure CLI extension is installed..." +az extension add --name durabletask --upgrade --only-show-errors | Out-Null +if ($LASTEXITCODE -ne 0) { throw "Failed to add the 'durabletask' az extension" } + +Write-Host "==> Attaching managed identity to scheduler '$SchedulerName'..." 
+az durabletask scheduler identity assign ` + --subscription $SubscriptionId ` + --resource-group $SchedulerRg ` + --name $SchedulerName ` + --user-assigned $IdentityId | Out-Null +if ($LASTEXITCODE -ne 0) { throw "Failed to assign identity to scheduler '$SchedulerName'" } + +$IdentityName = $IdentityId.Substring($IdentityId.LastIndexOf('/') + 1) +Write-Host "==> Done. Identity $IdentityName is attached to '$SchedulerName'." diff --git a/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.sh b/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.sh index 3c216af4..f2a6c8da 100755 --- a/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.sh +++ b/preview-features/on-demand-sandboxes/samples/python/scripts/attach-scheduler-identity.sh @@ -1,51 +1,28 @@ #!/usr/bin/env bash # Attaches the sample's user-assigned managed identity to the existing Durable Task # Scheduler so DTS can use it to pull the sandbox image and let the sandbox worker -# connect back. Runs as an azd postprovision hook. The PATCH is merge-safe: it keeps -# any identities already attached to the scheduler. +# connect back. Runs as an azd postprovision hook. Uses the `durabletask` Azure CLI +# extension's identity command, which adds the identity without removing any that are +# already attached to the scheduler. # # NOTE: enabling the On-demand Sandboxes preview *feature* on the scheduler is a # separate, out-of-band step handled during private-preview onboarding. -set -euo pipefail +set -eu SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID:?AZURE_SUBSCRIPTION_ID must be set}" SCHEDULER_NAME="${DTS_SCHEDULER_NAME:?DTS_SCHEDULER_NAME must be set}" SCHEDULER_RG="${DTS_SCHEDULER_RESOURCE_GROUP:?DTS_SCHEDULER_RESOURCE_GROUP must be set}" IDENTITY_ID="${AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID:?AZURE_USER_ASSIGNED_IDENTITY_RESOURCE_ID must be set}" -API_VERSION="2026-05-01-preview" -if ! 
command -v python3 >/dev/null 2>&1; then - echo "ERROR: python3 is required to merge the scheduler identity block." >&2 - exit 1 -fi +echo "==> Ensuring the 'durabletask' Azure CLI extension is installed..." +az extension add --name durabletask --upgrade --only-show-errors >/dev/null -URI="https://management.azure.com/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${SCHEDULER_RG}/providers/Microsoft.DurableTask/schedulers/${SCHEDULER_NAME}?api-version=${API_VERSION}" +echo "==> Attaching managed identity to scheduler '${SCHEDULER_NAME}'..." +az durabletask scheduler identity assign \ + --subscription "${SUBSCRIPTION_ID}" \ + --resource-group "${SCHEDULER_RG}" \ + --name "${SCHEDULER_NAME}" \ + --user-assigned "${IDENTITY_ID}" >/dev/null -echo "==> Reading current identity on scheduler '${SCHEDULER_NAME}'..." -CURRENT="$(az rest --method get --uri "${URI}")" - -BODY="$(IDENTITY_ID="${IDENTITY_ID}" python3 - "${CURRENT}" <<'PY' -import json, os, sys - -current = json.loads(sys.argv[1]) -identity_id = os.environ["IDENTITY_ID"] - -identity = current.get("identity") or {} -user_assigned = identity.get("userAssignedIdentities") or {} -user_assigned[identity_id] = {} - -current_type = identity.get("type", "") or "" -new_type = "SystemAssigned, UserAssigned" if "SystemAssigned" in current_type else "UserAssigned" - -print(json.dumps({"identity": {"type": new_type, "userAssignedIdentities": user_assigned}})) -PY -)" - -TMP="$(mktemp)" -trap 'rm -f "${TMP}"' EXIT -printf '%s' "${BODY}" > "${TMP}" - -echo "==> Attaching managed identity to scheduler..." -az rest --method patch --uri "${URI}" --body "@${TMP}" >/dev/null echo "==> Done. Identity ${IDENTITY_ID##*/} is attached to '${SCHEDULER_NAME}'."
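
Note on the idle loop added to `main_app.py` in this change: its comment says the process idles until the pod receives SIGTERM, but the loop only catches `KeyboardInterrupt` (SIGINT). Python installs no SIGTERM handler by default, so SIGTERM ends the process without raising an exception and without running cleanup code. If a clean shutdown on SIGTERM matters, one option is sketched below. This is a minimal sketch, not the sample's code: `install_stop_handlers` and `idle_until_stopped` are hypothetical names, and it assumes the handlers are installed from the main thread on a POSIX host.

```python
import signal
import threading


def install_stop_handlers(stop: threading.Event) -> None:
    """Set `stop` when the process receives SIGTERM or SIGINT.

    Hypothetical helper, not part of the sample. signal.signal() must be
    called from the main thread.
    """

    def _handle(signum, frame):
        # Only flip the flag here; the main loop performs the actual exit.
        stop.set()

    signal.signal(signal.SIGTERM, _handle)  # sent by Kubernetes on pod shutdown
    signal.signal(signal.SIGINT, _handle)   # Ctrl+C when running locally


def idle_until_stopped(stop: threading.Event, poll_seconds: float = 1.0) -> None:
    """Block the calling thread until `stop` is set.

    Waiting in short intervals keeps the loop responsive to signals.
    """
    while not stop.wait(timeout=poll_seconds):
        pass


if __name__ == "__main__":
    stop_event = threading.Event()
    install_stop_handlers(stop_event)
    print("[demo] Idling; send SIGTERM or Ctrl+C to exit.")
    idle_until_stopped(stop_event)
    print("[demo] Stop signal received; shutting down.")
```

With this shape, control returns to the caller after the signal arrives, so any worker shutdown or `with` block around the idle call gets a chance to run before the process exits.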