Python: Feature: telemetry for tracking provider latency #3631
castlenthesky wants to merge 7 commits into microsoft:main
Conversation
… measurements
* Added TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES and TIME_PER_OUTPUT_CHUNK_BUCKET_BOUNDARIES for improved metric tracking.
* Implemented _get_time_to_first_chunk_histogram and _get_time_per_output_chunk_histogram functions to create new histograms.
* Updated _trace_get_streaming_response to record metrics for time to first chunk and time per output chunk.
* Introduced _record_streaming_metrics function to handle the recording of streaming-specific metrics.
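The timing logic described in the commit message above can be sketched as a generator wrapper. This is a minimal illustration, not the framework's actual implementation: `instrument_stream` and the `record` callback are hypothetical stand-ins for the PR's `_trace_get_streaming_response` / `_record_streaming_metrics` and the underlying histogram `.record()` calls; only the three metric names come from the PR.

```python
import time
from typing import Callable, Iterable, Iterator

# Metric names taken from this PR; everything else in this sketch is illustrative.
TIME_TO_FIRST_CHUNK = "gen_ai.client.operation.time_to_first_chunk"
TIME_PER_OUTPUT_CHUNK = "gen_ai.client.operation.time_per_output_chunk"
OPERATION_DURATION = "gen_ai.client.operation.duration"


def instrument_stream(
    chunks: Iterable[str],
    record: Callable[[str, float], None],
) -> Iterator[str]:
    """Yield chunks unchanged while timing the stream from the client side."""
    start = time.perf_counter()
    previous = start
    first = True
    for chunk in chunks:
        now = time.perf_counter()
        if first:
            # Latency from request start to the first streamed chunk.
            record(TIME_TO_FIRST_CHUNK, now - start)
            first = False
        else:
            # Inter-chunk latency for every subsequent chunk.
            record(TIME_PER_OUTPUT_CHUNK, now - previous)
        previous = now
        yield chunk
    # Total wall-clock duration of the streaming operation.
    record(OPERATION_DURATION, time.perf_counter() - start)
```

For a three-chunk stream this records one time-to-first-chunk sample, two per-chunk samples, and one duration sample.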
Pull request overview
This PR adds telemetry metrics for tracking streaming provider latency in the agent framework's observability module. The implementation introduces three new OpenTelemetry metrics to measure streaming operation performance from the client's perspective.
Changes:
- Added three new histogram metrics for streaming latency: gen_ai.client.operation.time_to_first_chunk, gen_ai.client.operation.time_per_output_chunk, and gen_ai.client.operation.duration
- Modified trace_get_streaming_response to track timing information for chunks and record streaming-specific metrics
- Added bucket boundary configurations optimized for streaming latency measurements
- Created tests to verify streaming metrics are recorded during both successful and error scenarios
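Explicit bucket boundaries determine which histogram bucket each latency sample is counted in. The following sketch shows the bucketing rule conceptually; the boundary values here are hypothetical, and the PR's actual TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES tuple is not reproduced.

```python
import bisect
from typing import Sequence

# Hypothetical boundary values in seconds, for illustration only.
BOUNDARIES = (0.01, 0.05, 0.1, 0.5, 1.0, 5.0)


def bucket_index(value: float, boundaries: Sequence[float] = BOUNDARIES) -> int:
    """Return the index of the histogram bucket a sample falls into.

    Bucket 0 covers (-inf, boundaries[0]], bucket i covers
    (boundaries[i-1], boundaries[i]], and index len(boundaries) is the
    overflow bucket -- matching OpenTelemetry's explicit-bucket convention.
    """
    return bisect.bisect_left(boundaries, value)
```

Tuning these boundaries for streaming latency matters because time-to-first-chunk and per-chunk latencies cluster at very different scales than whole-request durations.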
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| python/packages/core/agent_framework/observability.py | Added new histogram creation functions, bucket boundaries for streaming metrics, modified streaming response tracing to track chunk timing, and implemented _record_streaming_metrics helper function |
| python/packages/core/tests/core/test_observability.py | Added test fixtures and test cases for streaming metrics recording in both success and error scenarios |
@microsoft-github-policy-service agree
@castlenthesky Thank you for contributing! Do you have an ETA on when these two metrics will be merged into the OTel GenAI semantic conventions?
@TaoChenOSU I don't yet. I just submitted the PR to their repo as well; we can track its status here: open-telemetry/semantic-conventions#3377. Happy to make the updates I flagged with the TODO and submit the final commit once they've officially adopted the convention.
…s do not mask original streaming operation exceptions
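The intent of the commit above (truncated title: metric recording must not mask the original streaming exception) can be sketched with a guarded finally block. This is an illustrative pattern, not the framework's actual code; the function and callback names are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator


def stream_with_duration(
    chunks: Iterable[str],
    record: Callable[[str, float], None],
) -> Iterator[str]:
    """Yield chunks, always recording total duration when the stream exits."""
    start = time.perf_counter()
    try:
        for chunk in chunks:
            yield chunk
    finally:
        try:
            record("gen_ai.client.operation.duration", time.perf_counter() - start)
        except Exception:
            # Never replace an in-flight streaming exception with a metrics error.
            pass
```

If the provider raises mid-stream, the duration is still recorded and the original exception propagates unchanged; if the recording itself fails, that failure is swallowed rather than surfacing in place of the real error.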
@TaoChenOSU, the telemetry metrics outlined above have been successfully merged into the OTel standard. I recommend we merge this PR as well. What can I do to help move that process forward?
@markwallace-microsoft and @TaoChenOSU Bumping this PR - the changes in OTel's semantic conventions that support the two telemetry metrics covered in this PR have been accepted and merged into the OTel standard's main branch. Accordingly, this PR should be ready for final review.
@castlenthesky Thank you for your contribution! I left a few comments, and Copilot left a few as well. There are also two merge conflicts. Please address them, and I think it's good to go.
Co-authored-by: Tao Chen <williamchan444307762@hotmail.com>
Motivation and Context
Description
Adding spans/metrics to track streaming latency. Specific metrics added to the OTel exports:
- gen_ai.client.operation.time_to_first_chunk
- gen_ai.client.operation.time_per_output_chunk

Contribution Checklist