fix: record errors on spans for anthropic, groq, and mistralai#3970
fix: record errors on spans for anthropic, groq, and mistralai#3970Dheeraj-Bhaskaruni wants to merge 4 commits into
Conversation
…mentations When API calls to foundation model providers raise exceptions, the spans were not being marked as failed. This adds span.record_exception() and span.set_status(ERROR) to the error paths in both sync and async wrappers for these three packages. - anthropic: added record_exception, set_status(ERROR), and span.end() - groq: added record_exception, set_status(ERROR), and span.end() - mistralai: wrapped API calls in try/except with full error recording Fixes traceloop#412
📝 WalkthroughWalkthroughAdds explicit exception handling to three OpenTelemetry instrumentations: on exception spans now record the exception, set Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(200,200,255,0.5)
participant Client
participant Instrumentation
participant Tracer as "Tracer / Span"
participant Wrapped as "Wrapped Function/Service"
participant Exporter as "Span Exporter"
end
Client->>Instrumentation: call instrumented function
Instrumentation->>Tracer: start span
Instrumentation->>Wrapped: call/await wrapped(*args, **kwargs)
alt success
Wrapped-->>Instrumentation: returns response
Instrumentation->>Tracer: set attributes, set_status(OK), end span
Tracer->>Exporter: export finished span
Instrumentation-->>Client: return response
else exception
Wrapped-->>Instrumentation: raises Exception e
Instrumentation->>Tracer: record_exception(e)
Instrumentation->>Tracer: set_status(ERROR, str(e))
Instrumentation->>Tracer: end span
Tracer->>Exporter: export error span
Instrumentation-->>Client: re-raise e
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
|
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py (1)
567-581:⚠️ Potential issue | 🔴 CriticalAnthropicStream and context manager wrappers do not properly handle span closure on streaming errors.
The exception handler in
AnthropicStream.__next__()(lines 226-230) increments the exception counter but does not callrecord_exception(),set_status(StatusCode.ERROR), orspan.end(). When iteration fails mid-stream, the span remains incomplete and unmarked as errored.Additionally,
WrappedMessageStreamManager.__exit__()andWrappedAsyncMessageStreamManager.__aexit__()delegate entirely to the underlying stream manager without checking or handling the span state. If an exception propagates during thewith/async withblock, the span is never marked as errored or properly closed.While
AnthropicAsyncStream.__anext__()(lines 374-384) correctly handles streaming exceptions by setting error status and ending the span, the sync path and both context manager wrappers leave spans in an incomplete state.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py` around lines 567 - 581, The sync streaming path and context-manager wrappers must mirror the async error-handling: in AnthropicStream.__next__ ensure that when an exception occurs you call span.record_exception(e), span.set_status(Status(StatusCode.ERROR, str(e))) and span.end() (and record duration_histogram/exception_counter like the async path) before re-raising; similarly, in WrappedMessageStreamManager.__exit__ and WrappedAsyncMessageStreamManager.__aexit__ detect if an exception is propagating (non-None exc_type/exception) and, if a span exists on the wrapper, call span.record_exception(exc), span.set_status(Status(StatusCode.ERROR, str(exc))) and span.end() (and update duration/exception metrics) so spans are always marked errored and closed on stream errors.packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py (1)
256-269:⚠️ Potential issue | 🟠 MajorStream exception handling bypasses record_exception() and span.end() for mid-stream failures.
The try/except blocks at lines 269–276 and 355–361 only catch exceptions during generator creation, not during chunk iteration. Exceptions raised while consuming
_create_stream_processor()or_create_async_stream_processor()will not triggerrecord_exception(),set_status(StatusCode.ERROR), orspan.end()since those operations are only called after normal loop completion. Thefor/async forloops inside these processors must wrap iteration in try/except, plus add a regression test for exceptions raised after the first chunk.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py` around lines 256 - 269, The stream processors (_create_stream_processor and _create_async_stream_processor) only handle exceptions during generator creation but not during iteration, so wrap the actual for/async for iteration inside a try/except that mirrors the existing exception handler: on any exception compute end_time, build attributes via error_metrics_attributes(e), record duration to duration_histogram if present, call span.record_exception(e), span.set_status(Status(StatusCode.ERROR, str(e))), span.end(), then re-raise; ensure the normal span.end() remains on normal completion. Also add a regression test that raises an exception after yielding the first chunk to verify record_exception, set_status and span.end are invoked for mid-stream failures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py`:
- Around line 417-423: The streaming accumulator generators
_accumulate_streaming_response and _aaccumulate_streaming_response currently
iterate without protection, so exceptions raised mid-stream leave spans open;
update both functions to wrap their iteration loop in a try/except/finally: in
except call span.record_exception(e) and
span.set_status(Status(StatusCode.ERROR, str(e))) then re-raise, and in finally
ensure span.end() and any _handle_response() cleanup always run; also add a unit
test that simulates an exception thrown mid-stream (e.g., a generator that
raises partway through) to assert record_exception, set_status and span.end are
invoked to prevent regression.
---
Outside diff comments:
In
`@packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.py`:
- Around line 567-581: The sync streaming path and context-manager wrappers must
mirror the async error-handling: in AnthropicStream.__next__ ensure that when an
exception occurs you call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))) and span.end() (and record
duration_histogram/exception_counter like the async path) before re-raising;
similarly, in WrappedMessageStreamManager.__exit__ and
WrappedAsyncMessageStreamManager.__aexit__ detect if an exception is propagating
(non-None exc_type/exception) and, if a span exists on the wrapper, call
span.record_exception(exc), span.set_status(Status(StatusCode.ERROR, str(exc)))
and span.end() (and update duration/exception metrics) so spans are always
marked errored and closed on stream errors.
In
`@packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.py`:
- Around line 256-269: The stream processors (_create_stream_processor and
_create_async_stream_processor) only handle exceptions during generator creation
but not during iteration, so wrap the actual for/async for iteration inside a
try/except that mirrors the existing exception handler: on any exception compute
end_time, build attributes via error_metrics_attributes(e), record duration to
duration_histogram if present, call span.record_exception(e),
span.set_status(Status(StatusCode.ERROR, str(e))), span.end(), then re-raise;
ensure the normal span.end() remains on normal completion. Also add a regression
test that raises an exception after yielding the first chunk to verify
record_exception, set_status and span.end are invoked for mid-stream failures.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7db9efe9-3b41-47dd-95e1-3740189167fc
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.pypackages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py
| try: | ||
| response = wrapped(*args, **kwargs) | ||
| except Exception as e: | ||
| span.record_exception(e) | ||
| span.set_status(Status(StatusCode.ERROR, str(e))) | ||
| span.end() | ||
| raise |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && wc -l opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 120
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && sed -n '400,480p' opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 2298
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && grep -n "_accumulate_streaming_response\|_aaccumulate_streaming_response" opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 310
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && sed -n '200,280p' opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 3142
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && sed -n '410,445p' opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 948
🏁 Script executed:
cd packages/opentelemetry-instrumentation-mistralai && sed -n '460,495p' opentelemetry/instrumentation/mistralai/__init__.pyRepository: traceloop/openllmetry
Length of output: 1066
Stream consumption failures bypass span cleanup.
The new exception handlers (lines 417–423 and 469–478) only catch errors while obtaining the response object via wrapped(). When streaming is enabled, control is returned to the caller via generator functions (_accumulate_streaming_response and _aaccumulate_streaming_response) that iterate the stream without any try/except/finally wrapper. If an exception occurs during iteration—such as a network error, malformed chunk, or timeout—the exception propagates to the caller while the generator's remaining code (including _handle_response() and span.end()) is never executed. This leaves the span open and unrecorded.
Both accumulator functions must wrap their iteration loops in try/except/finally to ensure record_exception(), set_status(), and span.end() are always called. Add a test that raises an exception mid-stream to prevent regression.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py`
around lines 417 - 423, The streaming accumulator generators
_accumulate_streaming_response and _aaccumulate_streaming_response currently
iterate without protection, so exceptions raised mid-stream leave spans open;
update both functions to wrap their iteration loop in a try/except/finally: in
except call span.record_exception(e) and
span.set_status(Status(StatusCode.ERROR, str(e))) then re-raise, and in finally
ensure span.end() and any _handle_response() cleanup always run; also add a unit
test that simulates an exception thrown mid-stream (e.g., a generator that
raises partway through) to assert record_exception, set_status and span.end are
invoked to prevent regression.
| if exception_counter: | ||
| exception_counter.add(1, attributes=attributes) | ||
|
|
||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
| if exception_counter: | ||
| exception_counter.add(1, attributes=attributes) | ||
|
|
||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
| duration = end_time - start_time | ||
| duration_histogram.record(duration, attributes=attributes) | ||
|
|
||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
| duration = end_time - start_time | ||
| duration_histogram.record(duration, attributes=attributes) | ||
|
|
||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
| try: | ||
| response = wrapped(*args, **kwargs) | ||
| except Exception as e: | ||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
| else: | ||
| response = await wrapped(*args, **kwargs) | ||
| except Exception as e: | ||
| span.record_exception(e) |
There was a problem hiding this comment.
please add errorType attribute to comply with Otel semconv span.set_attribute(ERROR_TYPE, e.__class__.__name__)
|
Please fix failing tests, and add dedicated error-path tests |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py (1)
96-96: Consider extracting span filtering into a helper.The same filter expression is duplicated many times, which increases drift risk in future test updates.
♻️ Refactor sketch
+def _split_spans_by_status(span_exporter): + finished = span_exporter.get_finished_spans() + error = [s for s in finished if s.status.status_code == StatusCode.ERROR] + ok = [s for s in finished if s.status.status_code != StatusCode.ERROR] + return ok, error - spans = [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR] + spans, error_spans = _split_spans_by_status(span_exporter) + assert len(error_spans) >= 1Also applies to: 188-188, 350-350, 434-434, 522-522, 690-690, 782-782, 873-873, 1046-1046, 1134-1134, 1225-1225, 1409-1409
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py` at line 96, Extract the repeated filtering expression into a single test helper like non_error_spans(span_exporter) (or get_non_error_spans) that returns [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR]; replace every inline occurrence (e.g., the current instances used in tests in test_prompt_caching.py) with a call to that helper to avoid duplication and drift.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Line 96: The test currently filters out ERROR spans (using
span_exporter.get_finished_spans() and StatusCode.ERROR) without first asserting
that the failing call (the test's unknown_parameter invocation) produced an
error span; add an explicit assertion that at least one span with
status.status_code == StatusCode.ERROR exists before performing the filter,
locating the check around the same test block that calls unknown_parameter and
uses span_exporter.get_finished_spans() so regressions that stop emitting error
spans will fail the test.
In `@packages/opentelemetry-instrumentation-anthropic/tests/test_thinking.py`:
- Line 45: The tests currently filter out ERROR spans (using spans = [s for s in
span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR])
which loses coverage of expected exception spans; update each occurrence (e.g.,
in tests/test_thinking.py around span_exporter.get_finished_spans() and
StatusCode.ERROR) to also collect error_spans = [s for s in
span_exporter.get_finished_spans() if s.status.status_code == StatusCode.ERROR]
and add an assertion like assert error_spans, optionally checking the error span
count or attributes, so the tests verify that ERROR spans are actually emitted
for the failing pre-call.
---
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`:
- Line 96: Extract the repeated filtering expression into a single test helper
like non_error_spans(span_exporter) (or get_non_error_spans) that returns [s for
s in span_exporter.get_finished_spans() if s.status.status_code !=
StatusCode.ERROR]; replace every inline occurrence (e.g., the current instances
used in tests in test_prompt_caching.py) with a call to that helper to avoid
duplication and drift.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 09cdcd22-6d93-4d32-865f-81eee0c7a5b2
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-anthropic/tests/test_messages.pypackages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.pypackages/opentelemetry-instrumentation-anthropic/tests/test_thinking.py
✅ Files skipped from review due to trivial changes (1)
- packages/opentelemetry-instrumentation-anthropic/tests/test_messages.py
| ) | ||
|
|
||
| spans = span_exporter.get_finished_spans() | ||
| spans = [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR] |
There was a problem hiding this comment.
Assert error-span existence before filtering it out.
At Line 96 (and the same pattern in the other updated lines), the test excludes ERROR spans but never verifies that the intentionally failing unknown_parameter call actually produced one. That leaves a gap where regression to “no error span emitted” would still pass.
✅ Suggested pattern
- spans = [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR]
+ finished_spans = span_exporter.get_finished_spans()
+ error_spans = [s for s in finished_spans if s.status.status_code == StatusCode.ERROR]
+ assert len(error_spans) >= 1
+ spans = [s for s in finished_spans if s.status.status_code != StatusCode.ERROR]Also applies to: 188-188, 350-350, 434-434, 522-522, 690-690, 782-782, 873-873, 1046-1046, 1134-1134, 1225-1225, 1409-1409
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@packages/opentelemetry-instrumentation-anthropic/tests/test_prompt_caching.py`
at line 96, The test currently filters out ERROR spans (using
span_exporter.get_finished_spans() and StatusCode.ERROR) without first asserting
that the failing call (the test's unknown_parameter invocation) produced an
error span; add an explicit assertion that at least one span with
status.status_code == StatusCode.ERROR exists before performing the filter,
locating the check around the same test block that calls unknown_parameter and
uses span_exporter.get_finished_spans() so regressions that stop emitting error
spans will fail the test.
| ) | ||
|
|
||
| spans = span_exporter.get_finished_spans() | ||
| spans = [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR] |
There was a problem hiding this comment.
Same test-coverage gap: verify ERROR spans are present.
At Line 45 and the other updated filter sites, ERROR spans are excluded without asserting they exist for the deliberately failing pre-call. These tests can pass even if exception spans stop being emitted.
✅ Suggested pattern
- spans = [s for s in span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR]
+ finished_spans = span_exporter.get_finished_spans()
+ error_spans = [s for s in finished_spans if s.status.status_code == StatusCode.ERROR]
+ assert len(error_spans) >= 1
+ spans = [s for s in finished_spans if s.status.status_code != StatusCode.ERROR]Also applies to: 104-104, 178-178, 245-245, 305-305, 383-383, 461-461, 531-531, 623-623, 702-702, 773-773, 871-871
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@packages/opentelemetry-instrumentation-anthropic/tests/test_thinking.py` at
line 45, The tests currently filter out ERROR spans (using spans = [s for s in
span_exporter.get_finished_spans() if s.status.status_code != StatusCode.ERROR])
which loses coverage of expected exception spans; update each occurrence (e.g.,
in tests/test_thinking.py around span_exporter.get_finished_spans() and
StatusCode.ERROR) to also collect error_spans = [s for s in
span_exporter.get_finished_spans() if s.status.status_code == StatusCode.ERROR]
and add an assertion like assert error_spans, optionally checking the error span
count or attributes, so the tests verify that ERROR spans are actually emitted
for the failing pre-call.
Tests that deliberately make error calls (unknown_parameter="unknown") now correctly produce finished error spans after the error recording fix. Filter out error spans in test assertions so they only validate the successful call's span attributes, matching the original test intent.
f74afdf to
2f2fa2e
Compare
…, mistralai Set error.type attribute with exception class name per OTel semconv, matching the pattern already used in the openai instrumentation package.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py (1)
471-475: Redundant streaming branch in_awrap.Both branches of
if to_wrap.get("streaming")callawait wrapped(*args, **kwargs)identically, so the conditional is dead code. Collapse to a single await.♻️ Proposed simplification
- try: - if to_wrap.get("streaming"): - response = await wrapped(*args, **kwargs) - else: - response = await wrapped(*args, **kwargs) - except Exception as e: + try: + response = await wrapped(*args, **kwargs) + except Exception as e:🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py` around lines 471 - 475, The conditional branch inside the async wrapper _awrap that checks to_wrap.get("streaming") is redundant because both branches call await wrapped(*args, **kwargs); remove the if/else and replace it with a single await wrapped(*args, **kwargs) assignment to response to simplify the function and eliminate dead code while keeping exception handling and subsequent logic intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@packages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py`:
- Around line 471-475: The conditional branch inside the async wrapper _awrap
that checks to_wrap.get("streaming") is redundant because both branches call
await wrapped(*args, **kwargs); remove the if/else and replace it with a single
await wrapped(*args, **kwargs) assignment to response to simplify the function
and eliminate dead code while keeping exception handling and subsequent logic
intact.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 33677265-b9af-4a24-847f-fe6a837790d5
📒 Files selected for processing (3)
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.pypackages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.py
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/init.py
- packages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/init.py
`Span.set_status(...)` per the OTel SDK contract silently drops the
`description` argument and emits a warning when the first argument is
already a `Status` object:
Description ... ignored. Use either `Status` or `(StatusCode, Description)`
`TraceloopCallbackHandler._handle_error` was calling
span.set_status(Status(StatusCode.ERROR), str(error))
so the human-readable error message was never set on the span's status,
and every callback error caused a warning to be logged from
`opentelemetry.sdk.trace`.
Pass the description inside `Status(...)` instead — matching the
canonical pattern used elsewhere in the openllmetry monorepo (see
PR traceloop#3970 for anthropic/groq/mistralai).
Adds `tests/test_error_status.py` as a regression test that pins the
behaviour: the span's status description equals the exception message,
and no `Description ... ignored` warning is emitted.
`Span.set_status(...)` per the OTel SDK contract silently drops the
`description` argument and emits a warning when the first argument is
already a `Status` object:
Description ... ignored. Use either `Status` or `(StatusCode, Description)`
`TraceloopCallbackHandler._handle_error` was calling
span.set_status(Status(StatusCode.ERROR), str(error))
so the human-readable error message was never set on the span's status,
and every callback error caused a warning to be logged from
`opentelemetry.sdk.trace`.
Pass the description inside `Status(...)` instead — matching the
canonical pattern used elsewhere in the openllmetry monorepo (see
PR traceloop#3970 for anthropic/groq/mistralai).
Adds `tests/test_error_status.py` as a regression test that pins the
behaviour: the span's status description equals the exception message,
and no `Description ... ignored` warning is emitted.
Summary
Fixes #412
When API calls to foundation model providers raise exceptions, the OpenTelemetry spans were not being properly marked as failed. This PR adds error recording to three instrumentation packages:
span.record_exception(),span.set_status(ERROR), orspan.end()try/exceptat all aroundwrapped()calls, so exceptions propagated with spans never ended or markedChanges
All three packages now follow the same pattern used by
opentelemetry-instrumentation-openai:Files changed
packages/opentelemetry-instrumentation-anthropic/opentelemetry/instrumentation/anthropic/__init__.pypackages/opentelemetry-instrumentation-groq/opentelemetry/instrumentation/groq/__init__.pypackages/opentelemetry-instrumentation-mistralai/opentelemetry/instrumentation/mistralai/__init__.pyTest plan
ruff checkpasses on all changed filesSummary by CodeRabbit
Bug Fixes
Tests