.NET: Foundry Evals integration for .NET #4914
Draft
alliscode wants to merge 4 commits into microsoft:main from
Conversation
Add evaluation framework with local and Foundry-hosted evaluator support:
- EvalItem/EvalCheck/EvalChecks core types with IConversationSplitter
- IAgentEvaluator interface and MeaiEvaluatorAdapter for the MEAI bridge
- FunctionEvaluator and LocalEvaluator for custom evaluation functions
- FoundryEvals provider for Azure AI Foundry hosted evaluations
- EvaluateAsync extension methods with expected-values support
- WorkflowEvaluationExtensions for multi-agent workflow evaluation
- Unit tests and evaluation samples

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Force-pushed from e74c569 to b3684c0
- Make Microsoft.Extensions.AI.Evaluation package references conditional on net8.0+ in the AI, AzureAI, and Workflows csproj files
- Exclude Evaluation/**/*.cs from compilation on legacy TFMs (net472, netstandard2.0), since MEAI.Evaluation does not support them
- Fix missing numRepetitions XML doc params in AgentEvaluationExtensions
- Fix expectedOutput parameter name bug in the BuildItemsFromResponses call

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
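The conditional references and compile exclusions described in this commit can be sketched roughly as below. This is a minimal illustration only; the exact csproj layout, property names, and TFM list in the PR may differ.

```xml
<!-- Sketch: reference MEAI.Evaluation only on net8.0-compatible TFMs.
     IsTargetFrameworkCompatible is a standard MSBuild property function. -->
<ItemGroup Condition="$([MSBuild]::IsTargetFrameworkCompatible('$(TargetFramework)', 'net8.0'))">
  <PackageReference Include="Microsoft.Extensions.AI.Evaluation" />
</ItemGroup>

<!-- Sketch: exclude evaluation sources on legacy TFMs where the package
     is unavailable, so the rest of the project still builds there. -->
<ItemGroup Condition="'$(TargetFramework)' == 'net472' Or '$(TargetFramework)' == 'netstandard2.0'">
  <Compile Remove="Evaluation/**/*.cs" />
</ItemGroup>
```

Gating both the package reference and the source files on the same TFM condition keeps legacy targets compiling without stubbing out the evaluation APIs.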
Force-pushed from b3684c0 to 704e804
Code fixes:
- Deduplicate ContentHarmEvaluator in BuildEvaluators (all safety names share one instance)
- Throw ArgumentException on unknown evaluator names instead of silently ignoring them
- BuildEvalItem no longer mutates the caller's messages list
- AllPassed checks both SubResults and _items when SubResults is populated
- Null guard for agent in sample finally blocks
- Fix README type reference (Evaluators -> FoundryEvals)

Test coverage:
- BuildItemsFromResponses validation (mismatched queries/responses/expectedOutput/expectedToolCalls)
- BuildEvaluators: quality names, safety deduplication, unknown name throws, default selection
- AllPassed: empty items, SubResults with overall failure
- BuildEvalItem: property correctness, input list not mutated
- ExtractAgentData: empty events, matched pairs, unmatched invocations, completions without invocations, multiple agents, duplicate executor IDs, multiple rounds, null data, splitter propagation

Infrastructure:
- Made BuildItemsFromResponses, ExtractAgentData, and BuildEvaluators internal for testability
- Added InternalsVisibleTo for AI.UnitTests in the AzureAI project
- Added a conditional AzureAI project reference in AI.UnitTests (net8.0+ only)
- Added a conditional compile exclusion for WorkflowEvaluationTests.cs on net472

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
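The InternalsVisibleTo wiring mentioned under Infrastructure typically takes the standard attribute form shown below. The test assembly name here is an assumption based on the commit message ("AI.UnitTests"); the PR may use a fully qualified assembly name or the equivalent MSBuild `<InternalsVisibleTo>` item instead.

```csharp
// In the AzureAI project (e.g., an AssemblyInfo.cs or any source file):
// grants the unit-test assembly access to internal members such as
// BuildItemsFromResponses, ExtractAgentData, and BuildEvaluators.
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("AI.UnitTests")] // assembly name assumed
```

Making the helpers internal rather than public keeps the evaluation surface area small while still allowing direct unit testing of the builder logic.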
Use item.Split() to separate query messages from the response, matching what FoundryEvals does. Previously the full conversation (including assistant turns) was passed as 'messages' alongside chatResponse, feeding duplicate assistant context to the evaluator and corrupting quality scores. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
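The shape of this fix can be illustrated with a minimal sketch. The `Split` helper below is a hypothetical stand-in for what `item.Split()` is described as doing; only the general messages/response contract (prior turns as context, final assistant turn as the response under evaluation) is taken from the commit message.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Extensions.AI;

// Hypothetical helper mirroring the described behavior of item.Split():
// everything before the final assistant turn becomes the evaluator's
// conversation context; the final assistant turn is the response.
static (IList<ChatMessage> Query, ChatMessage Response) Split(IList<ChatMessage> conversation)
{
    // Assumes the last message is the assistant response under evaluation.
    ChatMessage response = conversation[^1];
    List<ChatMessage> query = conversation.Take(conversation.Count - 1).ToList();
    return (query, response);
}

// Usage sketch: pass only the query turns as 'messages' and the assistant
// turn as the response, so assistant context is not fed in twice.
// var (query, response) = Split(item.Messages);
// await evaluator.EvaluateAsync(query, new ChatResponse(response), ...);
```

Without the split, the evaluator sees the assistant's answer both inside `messages` and again as `chatResponse`, which is the duplication the commit removes.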
This pull request introduces a new sample, FoundryAgents_Evaluations_Step03_AllPatterns, which demonstrates all evaluation patterns available in the Agent Framework for .NET. It also updates the existing FoundryAgents_Evaluations_Step02_SelfReflection sample to use the new Azure AI Foundry project endpoints and simplifies the evaluator setup. The changes primarily focus on showcasing evaluation capabilities, including function evaluators, built-in checks, MEAI quality evaluators, Foundry (cloud-based) evaluators, mixed evaluators, pre-existing response evaluation, and conversation split strategies.

Addition of comprehensive evaluation sample:
- FoundryAgents_Evaluations_Step03_AllPatterns, with a detailed Program.cs demonstrating all major evaluation patterns (function evaluators, built-in checks, MEAI, Foundry, mixed, pre-existing response evaluation, and conversation split strategies), including a custom conversation splitter.
- A README.md for the new sample, summarizing its purpose, prerequisites, key types, and usage instructions.

Updates to existing self-reflection evaluation sample:
- Program.cs now uses Azure AI Foundry project endpoints and deployment names, simplifying the evaluator setup to derive everything from the project endpoint.
- Removed the dependency on Azure.AI.OpenAI in the .csproj file, relying instead on Azure.AI.Projects and related packages.
- Updated Program.cs to use Azure.AI.Projects.OpenAI instead of Azure.AI.OpenAI.

Contribution Checklist