Python: Feature: telemetry for tracking provider latency #3631
castlenthesky wants to merge 7 commits into microsoft:main
Conversation
… measurements
* Added TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES and TIME_PER_OUTPUT_CHUNK_BUCKET_BOUNDARIES for improved metric tracking.
* Implemented _get_time_to_first_chunk_histogram and _get_time_per_output_chunk_histogram functions to create new histograms.
* Updated _trace_get_streaming_response to record metrics for time to first chunk and time per output chunk.
* Introduced _record_streaming_metrics function to handle the recording of streaming-specific metrics.
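The timing logic described in the commit message above can be sketched as a generator wrapper. This is a minimal illustration, not the framework's actual implementation: `instrument_stream` and the `record` callback are hypothetical stand-ins for the PR's `_trace_get_streaming_response` / `_record_streaming_metrics` and the underlying histogram `.record()` calls; only the three metric names come from the PR.

```python
import time
from typing import Callable, Iterable, Iterator

# Metric names taken from this PR; everything else in this sketch is illustrative.
TIME_TO_FIRST_CHUNK = "gen_ai.client.operation.time_to_first_chunk"
TIME_PER_OUTPUT_CHUNK = "gen_ai.client.operation.time_per_output_chunk"
OPERATION_DURATION = "gen_ai.client.operation.duration"


def instrument_stream(
    chunks: Iterable[str],
    record: Callable[[str, float], None],
) -> Iterator[str]:
    """Yield chunks unchanged while timing the stream from the client side."""
    start = time.perf_counter()
    previous = start
    first = True
    for chunk in chunks:
        now = time.perf_counter()
        if first:
            # Latency from request start to the first streamed chunk.
            record(TIME_TO_FIRST_CHUNK, now - start)
            first = False
        else:
            # Inter-chunk latency for every subsequent chunk.
            record(TIME_PER_OUTPUT_CHUNK, now - previous)
        previous = now
        yield chunk
    # Total wall-clock duration of the streaming operation.
    record(OPERATION_DURATION, time.perf_counter() - start)
```

For a three-chunk stream this records one time-to-first-chunk sample, two per-chunk samples, and one duration sample.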
Pull request overview
This PR adds telemetry metrics for tracking streaming provider latency in the agent framework's observability module. The implementation introduces three new OpenTelemetry metrics to measure streaming operation performance from the client's perspective.
Changes:
- Added three new histogram metrics for streaming latency: gen_ai.client.operation.time_to_first_chunk, gen_ai.client.operation.time_per_output_chunk, and gen_ai.client.operation.duration
- Modified trace_get_streaming_response to track timing information for chunks and record streaming-specific metrics
- Added bucket boundary configurations optimized for streaming latency measurements
- Created tests to verify streaming metrics are recorded during both successful and error scenarios
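Explicit bucket boundaries determine which histogram bucket each latency sample is counted in. The following sketch shows the bucketing rule conceptually; the boundary values here are hypothetical, and the PR's actual TIME_TO_FIRST_CHUNK_BUCKET_BOUNDARIES tuple is not reproduced.

```python
import bisect
from typing import Sequence

# Hypothetical boundary values in seconds, for illustration only.
BOUNDARIES = (0.01, 0.05, 0.1, 0.5, 1.0, 5.0)


def bucket_index(value: float, boundaries: Sequence[float] = BOUNDARIES) -> int:
    """Return the index of the histogram bucket a sample falls into.

    Bucket 0 covers (-inf, boundaries[0]], bucket i covers
    (boundaries[i-1], boundaries[i]], and index len(boundaries) is the
    overflow bucket -- matching OpenTelemetry's explicit-bucket convention.
    """
    return bisect.bisect_left(boundaries, value)
```

Tuning these boundaries for streaming latency matters because time-to-first-chunk and per-chunk latencies cluster at very different scales than whole-request durations.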
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| python/packages/core/agent_framework/observability.py | Added new histogram creation functions, bucket boundaries for streaming metrics, modified streaming response tracing to track chunk timing, and implemented _record_streaming_metrics helper function |
| python/packages/core/tests/core/test_observability.py | Added test fixtures and test cases for streaming metrics recording in both success and error scenarios |
@microsoft-github-policy-service agree
@castlenthesky Thank you for contributing! Do you have an ETA on when these two metrics will be merged into the OTel GenAI semantic conventions?
@TaoChenOSU I don't yet. I just submitted the PR to their repo as well; we can track its status here: open-telemetry/semantic-conventions#3377. Happy to make the updates I flagged with the TODO and submit the final commit once they've officially adopted the convention.
…s do not mask original streaming operation exceptions
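The intent of the commit above (truncated title: metric recording must not mask the original streaming exception) can be sketched with a guarded finally block. This is an illustrative pattern, not the framework's actual code; the function and callback names are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator


def stream_with_duration(
    chunks: Iterable[str],
    record: Callable[[str, float], None],
) -> Iterator[str]:
    """Yield chunks, always recording total duration when the stream exits."""
    start = time.perf_counter()
    try:
        for chunk in chunks:
            yield chunk
    finally:
        try:
            record("gen_ai.client.operation.duration", time.perf_counter() - start)
        except Exception:
            # Never replace an in-flight streaming exception with a metrics error.
            pass
```

If the provider raises mid-stream, the duration is still recorded and the original exception propagates unchanged; if the recording itself fails, that failure is swallowed rather than surfacing in place of the real error.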
@TaoChenOSU, the telemetry metrics outlined above have been successfully merged into the OTel standard. I recommend we merge this PR as well. What can I do to help move that process forward?
@markwallace-microsoft and @TaoChenOSU Bumping this PR - the changes in OTel's semantic conventions that support the two telemetry metrics covered in this PR have been accepted and merged into the OTel standard's main branch. Accordingly, this PR should be ready for final review.
@castlenthesky Thank you for your contribution! I left a few comments, and Copilot left a few as well. There are also two merge conflicts. Please address them, and I think it's good to go.
Co-authored-by: Tao Chen <williamchan444307762@hotmail.com>
Motivation and Context
Description
Adding spans/metrics to track streaming latency. Specific metrics added to the OTel exports:
- gen_ai.client.operation.time_to_first_chunk
- gen_ai.client.operation.time_per_output_chunk

Contribution Checklist