
Python: Feature: telemetry for tracking provider latency#3631

Open
castlenthesky wants to merge 7 commits into microsoft:main from castlenthesky:feature-telemetry-provider_latency

Conversation

@castlenthesky

Motivation and Context

  1. Why is this change required? - Adds observability into latency from the LLM provider.
  2. What problem does it solve? - Current telemetry and observability provide no insight into where latency occurs during response generation. This PR aims to add some degree of insight into that question.
  3. What scenario does it contribute to? - It helps teams that are trying to optimize for response speed.
  4. If it fixes an open issue, please link to the issue here. - Open issue link

Description

This PR adds spans/metrics to track streaming latency. Specific metrics added to the OTel exports:

  • gen_ai.client.operation.time_to_first_chunk
  • gen_ai.client.operation.time_per_output_chunk
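To illustrate what these two measurements capture, here is a minimal sketch of timing a streaming response in plain Python. This is not the PR's actual implementation: `timed_stream` is a hypothetical wrapper, and `record(name, seconds)` stands in for calling `.record()` on the corresponding OTel histogram.

```python
import time
from typing import Callable, Iterable, Iterator


def timed_stream(
    chunks: Iterable[str],
    record: Callable[[str, float], None],
) -> Iterator[str]:
    """Yield chunks while measuring streaming latency.

    `record(metric_name, seconds)` is a stand-in for an OTel
    histogram's `.record()` call; the metric names mirror the
    ones proposed in this PR.
    """
    start = time.perf_counter()
    first_chunk_at = None
    count = 0
    for chunk in chunks:
        now = time.perf_counter()
        if first_chunk_at is None:
            first_chunk_at = now
            record("gen_ai.client.operation.time_to_first_chunk", now - start)
        count += 1
        yield chunk
    end = time.perf_counter()
    if first_chunk_at is not None and count > 1:
        # Average inter-chunk latency after the first chunk arrived.
        record(
            "gen_ai.client.operation.time_per_output_chunk",
            (end - first_chunk_at) / (count - 1),
        )


# Demo: collect the measurements into a dict instead of a histogram.
recorded: dict[str, float] = {}
out = list(timed_stream(iter(["a", "b", "c"]), recorded.__setitem__))
```

Measuring time-per-chunk as an average over the inter-chunk gaps (rather than recording every gap) is only one possible design; the PR's `_record_streaming_metrics` may differ.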

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

… measurements

* Added TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES and TIME_PER_OUTPUT_CHUNK_BUCKET_BOUNDARIES for improved metric tracking.
* Implemented _get_time_to_first_chunk_histogram and _get_time_per_output_chunk_histogram functions to create new histograms.
* Updated _trace_get_streaming_response to record metrics for time to first chunk and time per output chunk.
* Introduced _record_streaming_metrics function to handle the recording of streaming-specific metrics.
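For context on why dedicated bucket boundaries matter: an OTel explicit-bucket histogram assigns each measurement to a bucket using upper-inclusive boundaries, so boundaries tuned to sub-second streaming latencies give far better resolution than the default duration buckets. The boundary values below are hypothetical; the PR defines its own `TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES` and `TIME_PER_OUTPUT_CHUNK_BUCKET_BOUNDARIES` constants.

```python
import bisect

# Hypothetical boundaries in seconds; the PR's actual constants may differ.
BOUNDARIES = [0.05, 0.1, 0.25, 0.5, 1.0, 2.5, 5.0, 10.0]


def bucket_index(value: float, boundaries: list[float]) -> int:
    """Index of the bucket a measurement falls into, following OTel's
    explicit-bucket rule: bucket i holds values v with
    boundaries[i-1] < v <= boundaries[i]; the final bucket (index
    len(boundaries)) is the overflow bucket."""
    return bisect.bisect_left(boundaries, value)
```

For example, a 30 ms time-to-first-chunk lands in the first bucket, while a 20 s outlier lands in the overflow bucket, keeping the tail visible without flattening the common case.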
Copilot AI review requested due to automatic review settings February 3, 2026 00:28
@github-actions github-actions bot changed the title Feature: telemetry for tracking provider latency Python: Feature: telemetry for tracking provider latency Feb 3, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR adds telemetry metrics for tracking streaming provider latency in the agent framework's observability module. The implementation introduces three new OpenTelemetry metrics to measure streaming operation performance from the client's perspective.

Changes:

  • Added three new histogram metrics for streaming latency: gen_ai.client.operation.time_to_first_chunk, gen_ai.client.operation.time_per_output_chunk, and gen_ai.client.operation.duration
  • Modified trace_get_streaming_response to track timing information for chunks and record streaming-specific metrics
  • Added bucket boundaries configurations optimized for streaming latency measurements
  • Created tests to verify streaming metrics are recorded during both successful and error scenarios

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.

Files reviewed:

  • python/packages/core/agent_framework/observability.py - Added new histogram creation functions, bucket boundaries for streaming metrics, modified streaming response tracing to track chunk timing, and implemented the _record_streaming_metrics helper function
  • python/packages/core/tests/core/test_observability.py - Added test fixtures and test cases for streaming metrics recording in both success and error scenarios

@castlenthesky
Author

@microsoft-github-policy-service agree

@TaoChenOSU
Contributor

@castlenthesky Thank you for contributing!

Do you have an ETA on when these two metrics will be merged to the OTel GenAI Semantic conventions?

@castlenthesky
Author

@TaoChenOSU I don't yet. I just submitted the PR for their repo as well. We can track that status here: open-telemetry/semantic-conventions#3377

Happy to make the updates I flagged with the TODO and submit the final commit once they've officially adopted the convention.

@castlenthesky
Author

@TaoChenOSU, the telemetry metrics outlined above have been merged into the OTEL standard successfully. I recommend we merge these in as well. What can I do to help move that process forward?

@castlenthesky
Author

@markwallace-microsoft and @TaoChenOSU Bumping this PR: the OTel semantic-convention changes supporting the two telemetry metrics covered here have been accepted and merged into the standard's main branch. Accordingly, this PR should be ready for final review.

@TaoChenOSU
Contributor

@castlenthesky Thank you for your contribution!

I left a few comments. Copilot also left a few comments. There are two conflicts. Please address them and I think it's good to go.

