Python: Add Cosmos DB NoSQL Checkpoint Storage for Python Workflows #4916
Open
aayush3011 wants to merge 7 commits into microsoft:main from
Add native Cosmos DB NoSQL support for workflow checkpoint storage in the Python agent-framework-azure-cosmos package, achieving parity with the existing .NET CosmosCheckpointStore.

New files:
- _checkpoint_storage.py: CosmosCheckpointStorage implementing the CheckpointStorage protocol with 6 methods (save, load, list_checkpoints, delete, get_latest, list_checkpoint_ids)
- test_cosmos_checkpoint_storage.py: Unit and integration tests
- workflow_checkpointing.py: Sample demonstrating Cosmos DB-backed workflow checkpoint/resume

Auth support:
- Managed identity / RBAC via Azure credential objects (DefaultAzureCredential, ManagedIdentityCredential, etc.)
- Key-based auth via account key string or AZURE_COSMOS_KEY env var
- Pre-created CosmosClient or ContainerProxy

Key design decisions:
- Partition key: /workflow_name for efficient per-workflow queries
- Serialization: Reuses encode/decode_checkpoint_value for full Python object fidelity (hybrid JSON + pickle approach)
- Container auto-creation via create_container_if_not_exists

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
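To make the "hybrid JSON + pickle" serialization decision concrete, here is a minimal sketch of the general idea. It is not the framework's `encode_checkpoint_value` / `decode_checkpoint_value`; the helper names `encode_value` / `decode_value` and the `__pickled__` marker key are hypothetical and chosen only for illustration. JSON-native values pass through unchanged, and anything else is pickled and base64-encoded so the whole document stays JSON-serializable.

```python
import base64
import json
import pickle
from datetime import datetime, timezone
from typing import Any

# Hypothetical marker key; the framework's real encoding may differ.
PICKLE_MARKER = "__pickled__"


def encode_value(value: Any) -> Any:
    """Keep JSON-native values as-is; pickle anything else into a base64 string."""
    if value is None or isinstance(value, (bool, int, float, str)):
        return value
    if isinstance(value, list):
        return [encode_value(v) for v in value]
    if isinstance(value, dict) and all(isinstance(k, str) for k in value):
        return {k: encode_value(v) for k, v in value.items()}
    # Fallback for sets, tuples, datetimes, custom objects, and so on.
    payload = base64.b64encode(pickle.dumps(value)).decode("ascii")
    return {PICKLE_MARKER: payload}


def decode_value(value: Any) -> Any:
    """Reverse encode_value, restoring pickled objects where the marker is found."""
    if isinstance(value, list):
        return [decode_value(v) for v in value]
    if isinstance(value, dict):
        if set(value) == {PICKLE_MARKER}:
            # Only unpickle data you wrote yourself: pickle can run arbitrary code.
            return pickle.loads(base64.b64decode(value[PICKLE_MARKER]))
        return {k: decode_value(v) for k, v in value.items()}
    return value


if __name__ == "__main__":
    state = {
        "count": 3,
        "seen": {"a", "b"},  # a set is not JSON-native
        "at": datetime(2024, 1, 1, tzinfo=timezone.utc),
    }
    document = json.dumps(encode_value(state))  # safe to store as JSON
    assert decode_value(json.loads(document)) == state
```

One limitation of this sketch: a user dictionary whose only key happens to be `__pickled__` would be misread as a pickled payload, so a real implementation needs a more robust marker scheme.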
Pull request overview
Adds a Cosmos DB (NoSQL) checkpoint storage backend to the Python agent-framework-azure-cosmos package to enable durable workflow pause/resume (feature-parity with the .NET Cosmos checkpoint store).
Changes:
- Introduces `CosmosCheckpointStorage`, implementing workflow checkpoint persistence in Cosmos DB (auto-creates DB/container, partitions by `workflow_name`).
- Adds unit + integration tests covering the checkpoint storage behavior.
- Adds runnable samples + README updates showing Cosmos-backed workflow checkpointing (standalone and Azure AI Foundry).
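The six operations and the `workflow_name` partitioning described above can be illustrated with an in-memory stand-in. This is a sketch under stated assumptions, not the PR's `CosmosCheckpointStorage`: the `Checkpoint` fields, the method signatures, and the class name `InMemoryPartitionedStorage` are hypothetical, and a plain dict per workflow plays the role of a Cosmos DB logical partition.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Checkpoint:
    # Hypothetical minimal shape; the framework's checkpoint type carries more fields.
    checkpoint_id: str
    workflow_name: str
    timestamp: str  # ISO 8601 in UTC, so string order matches time order
    state: dict[str, Any] = field(default_factory=dict)


class InMemoryPartitionedStorage:
    """Stand-in for a container partitioned by /workflow_name."""

    def __init__(self) -> None:
        # One inner dict per partition key value (workflow name).
        self._partitions: dict[str, dict[str, Checkpoint]] = {}

    async def save(self, checkpoint: Checkpoint) -> str:
        partition = self._partitions.setdefault(checkpoint.workflow_name, {})
        partition[checkpoint.checkpoint_id] = checkpoint
        return checkpoint.checkpoint_id

    async def load(self, checkpoint_id: str) -> Checkpoint:
        # Without the partition key, every partition has to be searched.
        for partition in self._partitions.values():
            if checkpoint_id in partition:
                return partition[checkpoint_id]
        raise KeyError(checkpoint_id)

    async def list_checkpoints(self, workflow_name: str) -> list[Checkpoint]:
        # Touches a single partition, which is what makes this query cheap.
        items = self._partitions.get(workflow_name, {}).values()
        return sorted(items, key=lambda c: c.timestamp)

    async def list_checkpoint_ids(self, workflow_name: str) -> list[str]:
        return [c.checkpoint_id for c in await self.list_checkpoints(workflow_name)]

    async def get_latest(self, workflow_name: str) -> Optional[Checkpoint]:
        checkpoints = await self.list_checkpoints(workflow_name)
        return checkpoints[-1] if checkpoints else None

    async def delete(self, checkpoint_id: str) -> bool:
        for partition in self._partitions.values():
            if partition.pop(checkpoint_id, None) is not None:
                return True
        return False


async def demo() -> Optional[str]:
    storage = InMemoryPartitionedStorage()
    await storage.save(Checkpoint("c1", "orders", "2024-01-01T00:00:00Z"))
    await storage.save(Checkpoint("c2", "orders", "2024-01-02T00:00:00Z"))
    latest = await storage.get_latest("orders")
    return latest.checkpoint_id if latest else None


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The design point the sketch highlights is that listing and "latest" lookups are scoped to one workflow and therefore to one partition, while an ID-only lookup or delete has to cross partitions unless the caller also supplies the workflow name.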
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| python/packages/azure-cosmos/agent_framework_azure_cosmos/_checkpoint_storage.py | Implements Cosmos-backed checkpoint storage (save/load/list/delete/latest/ids). |
| python/packages/azure-cosmos/agent_framework_azure_cosmos/__init__.py | Exposes CosmosCheckpointStorage from the package. |
| python/packages/azure-cosmos/tests/test_cosmos_checkpoint_storage.py | Adds unit tests and an integration round-trip test for the new storage. |
| python/packages/azure-cosmos/samples/cosmos_workflow_checkpointing.py | Standalone workflow sample using Cosmos-backed checkpointing. |
| python/packages/azure-cosmos/samples/cosmos_workflow_checkpointing_foundry.py | Foundry multi-agent workflow sample using Cosmos checkpoint storage. |
| python/packages/azure-cosmos/samples/README.md | Documents the new samples. |
| python/packages/azure-cosmos/README.md | Documents CosmosCheckpointStorage usage and configuration. |
| python/packages/azure-cosmos/pyproject.toml | Extends the integration test task to include the new integration test. |
Author
@markwallace-microsoft, please review the above PR.
Motivation and Context
The .NET implementation of the Agent Framework already ships a native CosmosCheckpointStore for workflow checkpointing, but the Python side only supports in-memory and file-based storage. Cosmos DB customers building agents on Azure AI Foundry have been asking for native Cosmos DB checkpoint support so they can durably pause and resume workflows across process restarts without writing custom storage adapters.
This PR adds CosmosCheckpointStorage to the existing agent-framework-azure-cosmos Python package, achieving feature parity with .NET and enabling Cosmos DB customers to use workflow checkpointing out-of-the-box.
Description
Core implementation (_checkpoint_storage.py):
Tests (test_cosmos_checkpoint_storage.py):
Samples:
Contribution Checklist