Fix sensitive_data_leakage tool context not reaching agent callback in Foundry path #46151

Merged
slister1001 merged 7 commits into Azure:main from slister1001:fix/sdl-tool-context-propagation
Apr 8, 2026

Conversation

@slister1001
Member

In the Foundry execution path, agent-specific context items (with tool_name fields like document_client_smode) were stored in SeedObjective.metadata but never propagated to the callback's context parameter. The agent received an empty context dict and could not recognize the injected tools, causing all sensitive_data_leakage objectives to score 0.0 (false negative).

Add a fallback in _CallbackChatTarget._send_prompt_impl() that extracts context_items from request.prompt_metadata when labels['context'] is empty. This matches the ACA runtime behavior where FunctionTool definitions are dynamically created from context items.

  • Add prompt_metadata fallback for context extraction
  • Add seed objectives with tool_name/context_type for E2E coverage
  • Update E2E test to assert tool context delivery to callback
  • Add unit tests for metadata fallback and labels precedence
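
A minimal sketch of the fallback and its precedence rule, for illustration only. The helper name `resolve_context_items` and the exact shapes of `labels['context']` and `prompt_metadata` are assumptions, not the SDK's actual code; only the key names `context`, `contexts`, and `context_items` come from this PR.

```python
from typing import Any, Dict, List, Optional


def resolve_context_items(
    labels: Optional[Dict[str, Any]],
    prompt_metadata: Optional[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return context items, preferring labels over prompt_metadata.

    Hypothetical helper: the real logic lives inside
    _CallbackChatTarget._send_prompt_impl() and may differ.
    """
    # Labels take precedence when they carry a non-empty context.
    label_context = (labels or {}).get("context")
    if isinstance(label_context, dict):
        items = label_context.get("contexts")
        if items:
            return list(items)

    # Fallback: labels carried nothing, so look in the prompt metadata.
    if isinstance(prompt_metadata, dict):
        items = prompt_metadata.get("context_items")
        if isinstance(items, list):
            return list(items)

    return []
```

With this ordering an existing caller that already populates labels sees no behavior change; the metadata path is only consulted when labels are empty.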

@github-actions github-actions bot added the Evaluation Issues related to the client library for Azure AI Evaluation label Apr 6, 2026
@slister1001 slister1001 force-pushed the fix/sdl-tool-context-propagation branch 6 times, most recently from b24dd5b to 5e13209 Compare April 7, 2026 15:33
@slister1001 slister1001 marked this pull request as ready for review April 7, 2026 15:52
@slister1001 slister1001 requested a review from a team as a code owner April 7, 2026 15:52
Copilot AI review requested due to automatic review settings April 7, 2026 15:52
@slister1001 slister1001 force-pushed the fix/sdl-tool-context-propagation branch from 5e13209 to 73d409c Compare April 7, 2026 15:55
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a Foundry-path regression where agent tool context (e.g., tool_name/context_type context items used by sensitive-data leakage objectives) was not making it into the callback context parameter, leading to false-negative scoring. The changes attempt to propagate tool context by creating context SeedPrompts and extracting tool context from conversation history in _CallbackChatTarget.

Changes:

  • Create context SeedPrompts for standard attacks so tool context can be recovered from prepended conversation history.
  • Extract tool context from conversation history prompt_metadata in _CallbackChatTarget as a fallback when labels don’t contain context.
  • Expand unit/E2E coverage (new seed objectives with tool context; assertions that callback receives tool context; PyRIT memory reset between E2E tests).

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/_foundry/_dataset_builder.py | Adds standard-attack context SeedPrompt creation and an objective SeedPrompt to drive Foundry context/tool propagation. |
| sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/red_team/_callback_chat_target.py | Extracts tool context from conversation history prompt_metadata and passes it via context['contexts'] to the callback. |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_redteam/test_dataset_builder_binary_path.py | Adds unit tests verifying context SeedPrompt creation and sequencing expectations. |
| sdk/evaluation/azure-ai-evaluation/tests/unittests/test_redteam/test_callback_chat_target.py | Adds unit test verifying tool context extraction from prepended conversation history. |
| sdk/evaluation/azure-ai-evaluation/tests/e2etests/test_red_team_foundry.py | Adds an autouse fixture to reset PyRIT singleton memory; updates sensitive-data-leakage E2E to assert tool context delivery. |
| sdk/evaluation/azure-ai-evaluation/tests/e2etests/data/redteam_seeds/sensitive_data_leakage_seeds.json | Adds seed objectives that include tool context payloads for E2E validation. |

…y path

In the Foundry execution path, agent-specific context items (with tool_name
fields like document_client_smode) were stored in SeedObjective.metadata but
PyRIT discards SeedObjective.metadata during attack execution -- only
objective.value is sent to the target. The target never saw the sensitive
data, causing all sensitive_data_leakage objectives to score 0.0.

Fix by creating context as SeedPrompt objects at lower sequence numbers so
PyRIT places them in prepended_conversation (conversation history). A user
SeedPrompt for the objective text is added at a higher sequence so it becomes
next_message (the actual prompt). _CallbackChatTarget filters these context
pieces out of the messages list (so the model doesn't see raw sensitive data
as prior user messages) and instead reconstructs context['contexts'] with
tool_name fields. This enables the ACA runtime agent_callback to build
FunctionTool injections without any changes to the ACA code -- the model
must call the tool to access the sensitive data, matching the intended
attack semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
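
The sequencing and filtering described in this commit message can be sketched roughly as follows. `Seed` is a hypothetical stand-in dataclass, not PyRIT's `SeedPrompt`, and the `content` field name and both helper functions are assumptions made for illustration; only `is_context`, `is_objective`, `tool_name`, `context_type`, and `context['contexts']` appear in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class Seed:
    """Hypothetical stand-in for a seed prompt (not the PyRIT class)."""
    value: str
    sequence: int
    role: str = "user"
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_seeds(objective: str, context_items: List[Dict[str, Any]]) -> List[Seed]:
    """Context seeds get the lower sequence numbers; the objective goes last."""
    seeds = [
        Seed(
            value=item["content"],  # assumed field name
            sequence=index,
            metadata={
                "is_context": True,
                "tool_name": item.get("tool_name"),
                "context_type": item.get("context_type"),
            },
        )
        for index, item in enumerate(context_items)
    ]
    seeds.append(
        Seed(value=objective, sequence=len(context_items), metadata={"is_objective": True})
    )
    return seeds


def split_for_callback(seeds: List[Seed]) -> Tuple[List[Dict[str, str]], Dict[str, Any]]:
    """Keep context out of the chat messages and rebuild context['contexts']."""
    messages: List[Dict[str, str]] = []
    contexts: List[Dict[str, Any]] = []
    for seed in sorted(seeds, key=lambda s: s.sequence):
        if seed.metadata.get("is_context") is True:
            contexts.append(
                {
                    "content": seed.value,
                    "tool_name": seed.metadata.get("tool_name"),
                    "context_type": seed.metadata.get("context_type"),
                }
            )
        else:
            messages.append({"role": seed.role, "content": seed.value})
    return messages, {"contexts": contexts}
```

In this sketch the callback would receive only the objective as a user message, while the sensitive payload travels in `context['contexts']` alongside its `tool_name`, so the model has to call the tool to reach it.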
@slister1001 slister1001 force-pushed the fix/sdl-tool-context-propagation branch from 73d409c to 99f4746 Compare April 7, 2026 16:13
Member

@nagkumar91 nagkumar91 left a comment

LGTM — two-part fix is clean. Context SeedPrompts + callback extraction solves the tool context gap. Good unit + e2e coverage.

slister1001 and others added 6 commits April 7, 2026 14:14
- Use underscores in risk-type ('sensitive_data_leakage') to match SDK
  validator. The service uses hyphens but the SDK expects underscores;
  seeds with hyphens were silently skipped, leaving no tool-context
  objectives in the test.
- Wrap CentralMemory.get_memory_instance() in try/except since it throws
  if called before any instance is set.
- Add CHANGELOG entry for 1.16.5.
- Merge upstream main.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
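
A rough sketch of the guarded memory lookup mentioned in the second bullet. `FakeCentralMemory` is a hypothetical stand-in that mimics an accessor which raises before any instance is set; the exception type (`ValueError`) and the helper `get_memory_or_none` are assumptions, not the PyRIT or test-fixture code.

```python
from typing import Optional


class FakeCentralMemory:
    """Hypothetical stand-in for a singleton memory registry."""

    _instance: Optional[object] = None

    @classmethod
    def get_memory_instance(cls) -> object:
        # Mirrors the behavior described above: raises if nothing is set yet.
        if cls._instance is None:
            raise ValueError("Central memory instance has not been set.")
        return cls._instance

    @classmethod
    def set_memory_instance(cls, instance: object) -> None:
        cls._instance = instance


def get_memory_or_none() -> Optional[object]:
    """Return the current memory instance, or None when none was ever set."""
    try:
        return FakeCentralMemory.get_memory_instance()
    except ValueError:  # assumed exception type
        return None
```

A fixture built this way can run before the first test without failing, and simply skips the reset when there is nothing to reset.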
The tool names document_client_smode and email_client_smode come from the
RAI service's attack objectives for sensitive_data_leakage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Reset PyRIT database (drop/recreate tables) before each test instead of
  using :memory: DB that gets overwritten by RedTeam.__init__
- Filter is_context pieces in FoundryResultProcessor._build_messages_from_pieces
  so context SeedPrompts don't appear as extra user messages in conversations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use isinstance(pm, dict) and pm.get('is_context') is True instead of
truthy checks. MagicMock objects return truthy values for any attribute
access, causing all conversation pieces to be filtered out in unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
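
The MagicMock pitfall is easy to reproduce with the standard library alone. The two helper functions below are illustrative names, not the SDK's code; the strict check follows the expression quoted in the commit message.

```python
from unittest.mock import MagicMock


def is_context_truthy(piece) -> bool:
    """Loose check: any truthy 'is_context' value counts."""
    pm = getattr(piece, "prompt_metadata", None)
    return bool(pm and pm.get("is_context"))


def is_context_strict(piece) -> bool:
    """Strict check: metadata must be a real dict with is_context exactly True."""
    pm = getattr(piece, "prompt_metadata", None)
    return isinstance(pm, dict) and pm.get("is_context") is True


# A bare MagicMock auto-creates prompt_metadata and a truthy .get() result.
mock_piece = MagicMock()

# A piece whose metadata is a real dict behaves as intended under both checks.
context_piece = MagicMock()
context_piece.prompt_metadata = {"is_context": True}
```

Here `is_context_truthy(mock_piece)` reports the mock as context, which is how every conversation piece ended up filtered out in unit tests, while `is_context_strict(mock_piece)` does not.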
Member

@nagkumar91 nagkumar91 left a comment

Re-reviewed after updates. Good improvements — stricter metadata checks, context filtering in _build_messages_from_pieces (was a gap in v1), is_objective tagging, and seed data fix. LGTM.

@slister1001 slister1001 merged commit 08e5f6a into Azure:main Apr 8, 2026
21 checks passed
slister1001 added a commit that referenced this pull request Apr 8, 2026
…n Foundry path (#46151)

* Fix sensitive_data_leakage tool context not reaching target in Foundry path

In the Foundry execution path, agent-specific context items (with tool_name
fields like document_client_smode) were stored in SeedObjective.metadata but
PyRIT discards SeedObjective.metadata during attack execution -- only
objective.value is sent to the target. The target never saw the sensitive
data, causing all sensitive_data_leakage objectives to score 0.0.

Fix by creating context as SeedPrompt objects at lower sequence numbers so
PyRIT places them in prepended_conversation (conversation history). A user
SeedPrompt for the objective text is added at a higher sequence so it becomes
next_message (the actual prompt). _CallbackChatTarget filters these context
pieces out of the messages list (so the model doesn't see raw sensitive data
as prior user messages) and instead reconstructs context['contexts'] with
tool_name fields. This enables the ACA runtime agent_callback to build
FunctionTool injections without any changes to the ACA code -- the model
must call the tool to access the sensitive data, matching the intended
attack semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix SDL seed risk-type, memory fixture, add changelog

- Use underscores in risk-type ('sensitive_data_leakage') to match SDK
  validator. The service uses hyphens but the SDK expects underscores;
  seeds with hyphens were silently skipped, leaving no tool-context
  objectives in the test.
- Wrap CentralMemory.get_memory_instance() in try/except since it throws
  if called before any instance is set.
- Add CHANGELOG entry for 1.16.5.
- Merge upstream main.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Add 'smode' to cspell ignoreWords

The tool names document_client_smode and email_client_smode come from the
RAI service's attack objectives for sensitive_data_leakage.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Fix conversation contamination between Foundry E2E tests

- Reset PyRIT database (drop/recreate tables) before each test instead of
  using :memory: DB that gets overwritten by RedTeam.__init__
- Filter is_context pieces in FoundryResultProcessor._build_messages_from_pieces
  so context SeedPrompts don't appear as extra user messages in conversations

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Use strict is_context check to avoid MagicMock false positives

Use isinstance(pm, dict) and pm.get('is_context') is True instead of
truthy checks. MagicMock objects return truthy values for any attribute
access, causing all conversation pieces to be filtered out in unit tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Labels

Evaluation Issues related to the client library for Azure AI Evaluation

3 participants