Python: Add prompt caching support to Anthropic connector#13947
Vizhy wants to merge 3 commits into microsoft:main
Conversation
Adds AnthropicCacheSettings and a `cache` field on AnthropicChatPromptExecutionSettings
to enable opt-in prompt caching via the Anthropic cache_control API.
When enabled, prepare_settings_dict() injects cache_control blocks on the system
message and the last tool definition before the request is sent. No changes to
AnthropicChatCompletion — caching is fully contained in the settings layer.
Off by default; opt in with cache=AnthropicCacheSettings.on().
Convenience constructors: .on() .off() .system() .tools() .short() .long()
TTL: "5m" -> {"type":"ephemeral"}, "1h" -> {"type":"ephemeral","ttl":3600}
Includes 16 new unit tests and a usage sample at
samples/concepts/caching/anthropic_prompt_caching.py.
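A minimal sketch of the opt-in flow described above (`.short()`/`.long()` are assumed here to map to the `"5m"`/`"1h"` TTLs; treat the surrounding field values as illustrative):

```python
from semantic_kernel.connectors.ai.anthropic import (
    AnthropicCacheSettings,
    AnthropicChatPromptExecutionSettings,
)

# Off by default; opt in explicitly.
settings = AnthropicChatPromptExecutionSettings(
    max_tokens=1024,
    cache=AnthropicCacheSettings.on(),  # or .short() / .long() for the 5m / 1h TTLs
)

# prepare_settings_dict() injects cache_control blocks into the outbound payload;
# the cache field itself is excluded from serialization.
payload = settings.prepare_settings_dict()
```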
Pull request overview
Adds opt-in prompt caching support to the Python Anthropic connector by introducing a cache settings model and injecting Anthropic cache_control blocks into the serialized request payload (system content block and/or the last tool definition).
Changes:
- Introduces `AnthropicCacheSettings` and exposes it as a public API via `semantic_kernel.connectors.ai.anthropic`.
- Extends `AnthropicChatPromptExecutionSettings` with an excluded `cache` field and injects `cache_control` during `prepare_settings_dict()`.
- Adds unit tests for caching settings/injection behavior and a new sample demonstrating prompt caching usage.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| python/tests/unit/connectors/ai/anthropic/test_anthropic_request_settings.py | Adds unit tests covering cache settings constructors and prepare_settings_dict() injection behavior. |
| python/semantic_kernel/connectors/ai/anthropic/prompt_execution_settings/anthropic_prompt_execution_settings.py | Adds AnthropicCacheSettings, adds cache to execution settings (excluded from serialization), and injects cache_control into outbound payload. |
| python/semantic_kernel/connectors/ai/anthropic/__init__.py | Exports AnthropicCacheSettings as part of the Anthropic connector public surface. |
| python/samples/concepts/caching/anthropic_prompt_caching.py | Adds a runnable sample demonstrating multi-turn Anthropic prompt caching. |
```python
class AnthropicCacheSettings(BaseModel):
    """Configuration for Anthropic prompt caching.

    Controls which parts of the request receive cache_control injection.
    """
```
Fixed in f71799f. AnthropicCacheSettings now inherits KernelBaseSettings with env_prefix = ANTHROPIC_CACHE_, consistent with the rest of the SDK (validate_assignment, populate_by_name, arbitrary_types_allowed). This also unlocks env-var control out of the box: ANTHROPIC_CACHE_ENABLED=true, ANTHROPIC_CACHE_INCLUDE_SYSTEM=true, ANTHROPIC_CACHE_TTL=1h, etc. Took the opportunity to rename cache_system/cache_tools to include_system/include_tools to remove the redundant cache prefix on fields inside a cache settings class.
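A sketch of what that unlocks, assuming the field names implied by the env-var names (`enabled`, `include_system`, `include_tools`, `ttl`) and the usual KernelBaseSettings loading behavior:

```python
import os

# With env_prefix = "ANTHROPIC_CACHE_", KernelBaseSettings should pick these up
# at construction time (field names inferred from the env-var names above).
os.environ["ANTHROPIC_CACHE_ENABLED"] = "true"
os.environ["ANTHROPIC_CACHE_INCLUDE_SYSTEM"] = "true"
os.environ["ANTHROPIC_CACHE_TTL"] = "1h"

settings = AnthropicCacheSettings()
assert settings.enabled and settings.include_system and settings.ttl == "1h"
```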
```python
if self.cache.cache_tools:
    tools: list[dict[str, Any]] | None = data.get("tools")
    if tools:
        tools = copy.deepcopy(tools)
        tools[-1]["cache_control"] = cache_control
        data["tools"] = tools
```
Fixed in f71799f. Replaced copy.deepcopy with a shallow list + dict spread: `[*tools[:-1], {**tools[-1], "cache_control": cache_control}]`. Only the last element is modified, so only that dict needs copying — no full deep clone of the entire tools list.
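In context, the replacement looks roughly like this (a sketch; note the quoted `"cache_control"` key):

```python
if tools:
    # Copy only the last tool dict; earlier entries are shared by reference.
    data["tools"] = [*tools[:-1], {**tools[-1], "cache_control": cache_control}]
```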
```python
tools: list[dict[str, Any]] | None = data.get("tools")
if tools:
    tools = copy.deepcopy(tools)
    tools[-1]["cache_control"] = cache_control
```
Fixed in f71799f. Injection now checks cache_control not in tools[-1] (and the same for system blocks) before writing — existing values are preserved as-is.
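Combined with the shallow-copy change above, the guarded injection looks roughly like this (a sketch):

```python
if tools and "cache_control" not in tools[-1]:
    # Preserve a caller's explicit cache_control; only inject when absent.
    data["tools"] = [*tools[:-1], {**tools[-1], "cache_control": cache_control}]
```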
```python
    ctrl = AnthropicCacheSettings.on(ttl="5m")._cache_control()
    assert ctrl == {"type": "ephemeral"}


def test_cache_control_1h():
    ctrl = AnthropicCacheSettings.on(ttl="1h")._cache_control()
    assert ctrl == {"type": "ephemeral", "ttl": 3600}
```
Fixed in f71799f. Replaced the two _cache_control() direct tests (test_cache_control_5m / test_cache_control_1h) with prepare_settings_dict() equivalents that validate the same TTL output via the public API. The private helper is now only exercised transitively.
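A sketch of what such a public-API test could look like (the tool shape is hypothetical, and the expected value mirrors this point in the thread — a later commit corrects it to the string `"1h"`):

```python
def test_prepare_settings_dict_injects_1h_ttl_on_tools():
    # Exercises the TTL mapping through prepare_settings_dict() instead of
    # calling the private _cache_control() helper directly.
    settings = AnthropicChatPromptExecutionSettings(
        tools=[{"name": "lookup", "input_schema": {"type": "object"}}],  # hypothetical tool
        cache=AnthropicCacheSettings.on(ttl="1h"),
    )
    data = settings.prepare_settings_dict()
    assert data["tools"][-1]["cache_control"] == {"type": "ephemeral", "ttl": 3600}
```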
Automated Code Review
Reviewers: 4 | Confidence: 92%
✓ Correctness
The PR adds Anthropic prompt caching support with a well-structured `AnthropicCacheSettings` model and `prepare_settings_dict` override. There is one correctness bug: the `_cache_control()` method emits `"ttl": 3600` (an integer) for the 1-hour TTL, but the Anthropic SDK's `CacheControlEphemeralParam` type defines `ttl: Literal["5m", "1h"]` — it expects the string `"1h"`, not an integer. This will cause a runtime API error or silent rejection when 1-hour caching is used. The corresponding tests also assert the wrong expected value, so they pass but do not catch the bug.
✓ Security Reliability
This PR adds Anthropic prompt caching support via a new `AnthropicCacheSettings` model and `prepare_settings_dict` override. The implementation is clean from a security and reliability standpoint: TTL values are constrained by `Literal["5m", "1h"]`, the `cache` field is correctly excluded from API serialization (`exclude=True`), tools are deep-copied before mutation to prevent side effects, and edge cases (empty system string, missing tools) are handled properly. No secrets, injection risks, resource leaks, or unsafe deserialization were found.
✓ Test Coverage
The new AnthropicCacheSettings class and its integration into prepare_settings_dict are well-tested, covering factory methods, TTL variants, edge cases (empty system, no tools), mutation protection, and serialization exclusion. However, the PR widens the `system` field type to accept `list[dict[str, Any]]` in addition to `str`, yet there is no test verifying behavior when `system` is passed as a pre-structured list with caching enabled. The code silently skips cache injection in that case (line 158: `isinstance(system, str)`), and a test should document this intended behavior.
✗ Design Approach
I found one design-level issue. The new caching API broadens `system` to accept Anthropic-native block lists, but the caching implementation only injects `cache_control` when `system` is a plain string. That makes the newly supported structured-system form a silent no-op for `cache_system`, which is a contract gap in the core feature rather than a missing edge-case test.
Suggestions
- In python/semantic_kernel/connectors/ai/anthropic/prompt_execution_settings/anthropic_prompt_execution_settings.py:156-159, treat `system` as one normalized content-block sequence for serialization so caching works for both supported input shapes, rather than special-casing only raw strings.
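One possible shape for that normalization (a sketch under the suggestion's assumptions; `_normalize_system` and the surrounding variables are hypothetical names, and the text-block shape follows Anthropic's system content-block format):

```python
from typing import Any


def _normalize_system(system: str | list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Normalize both supported system shapes into Anthropic-native text blocks."""
    if isinstance(system, str):
        return [{"type": "text", "text": system}]
    return system


blocks = _normalize_system(system)
if blocks and "cache_control" not in blocks[-1]:
    # One injection path now covers plain strings and pre-structured block lists.
    data["system"] = [*blocks[:-1], {**blocks[-1], "cache_control": cache_control}]
```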
Automated review by Vizhy's agents
… blocks
- _cache_control() now emits {"ttl": "1h"} string per CacheControlEphemeralParam spec instead of integer 3600
- prepare_settings_dict() now injects cache_control on list[dict] system blocks in addition to plain strings, closing the silent no-op design gap
- add test covering cache injection when system is pre-structured as list[dict]
- update 1h TTL test assertions to match corrected string value
@microsoft-github-policy-service agree
Thanks for the thorough automated review — two valid issues were caught and both are addressed in the follow-up commit (da6de64):
1. TTL value fix (Correctness)
2. Pre-structured system blocks (Design Approach)

A test covering the pre-structured `list[dict]` system case was added as well.
…ow copy, no-overwrite
- AnthropicCacheSettings now inherits KernelBaseSettings (consistent with rest of SDK; enables validate_assignment, populate_by_name)
- Added env_prefix = "ANTHROPIC_CACHE_" so caching can be toggled via environment variables (ANTHROPIC_CACHE_ENABLED, ANTHROPIC_CACHE_TTL, etc.)
- Renamed cache_system/cache_tools fields to include_system/include_tools (removes redundant "cache" prefix on fields inside a cache settings class)
- Replaced copy.deepcopy with shallow list + dict spread — cheaper for large tool catalogs where caching is most beneficial
- inject now skips if cache_control already present on last block — avoids silently clobbering a caller's explicit setting
- Replaced two _cache_control() private-method tests with prepare_settings_dict() equivalents; added env-var tests (monkeypatch) and no-overwrite test

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
Adds opt-in prompt caching support to the Python Anthropic connector via a new `AnthropicCacheSettings` model and a `prepare_settings_dict()` override that injects Anthropic `cache_control` blocks into the outbound request payload.

What's included:
- `AnthropicCacheSettings` — inherits `KernelBaseSettings` with `env_prefix = ANTHROPIC_CACHE_`. Caching can be toggled via environment variables (`ANTHROPIC_CACHE_ENABLED`, `ANTHROPIC_CACHE_INCLUDE_SYSTEM`, `ANTHROPIC_CACHE_INCLUDE_TOOLS`, `ANTHROPIC_CACHE_TTL`) or set explicitly in code.
- `AnthropicChatPromptExecutionSettings.cache` field (excluded from serialization) + `prepare_settings_dict()` override that injects `cache_control` on the system message and/or the last tool definition.
- `AnthropicCacheSettings` exported from `semantic_kernel.connectors.ai.anthropic`.
- Sample: `samples/concepts/caching/anthropic_prompt_caching.py`.

Copilot review addressed (commit f71799f):
- Inherits `KernelBaseSettings` (not bare `BaseModel`) — consistent with rest of SDK; enables env-var support.
- `cache_system`/`cache_tools` → `include_system`/`include_tools` — removes redundant cache prefix.
- Replaced `copy.deepcopy` with shallow list + dict spread — cheaper for large tool catalogs.
- Skips injection if `cache_control` already present — avoids clobbering caller's explicit setting.
- Tests now exercise the `prepare_settings_dict()` public surface (no longer calling private `_cache_control()`).

Cache TTL:
- `5m` (default): 1.25x write cost, 0.1x read cost. Breaks even after one cache hit.
- `1h`: 2x write cost, 0.1x read cost. Breaks even after two cache hits.

Usage:
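(A sketch assembled from the pieces above; the service wiring follows the standard Semantic Kernel chat-completion pattern, and the model id is a placeholder.)

```python
import asyncio

from semantic_kernel.connectors.ai.anthropic import (
    AnthropicCacheSettings,
    AnthropicChatCompletion,
    AnthropicChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory


async def main() -> None:
    service = AnthropicChatCompletion(ai_model_id="claude-sonnet-4-5")  # placeholder model id
    settings = AnthropicChatPromptExecutionSettings(
        max_tokens=1024,
        cache=AnthropicCacheSettings.on(ttl="1h"),  # or toggle via ANTHROPIC_CACHE_* env vars
    )
    # Caching pays off when the system prompt is large and reused across turns.
    history = ChatHistory(system_message="<large, stable system prompt worth caching>")
    history.add_user_message("First question against the cached context.")
    print(await service.get_chat_message_content(history, settings))


asyncio.run(main())
```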