feat(langchain): add semconv attributes, operation mapping, and content recording modules [1/5] #4450
…nt recording modules

Add foundation modules for enhanced LangChain GenAI semantic convention tracing:

- semconv_attributes.py: Per-operation attribute matrix based on OTel GenAI semantic conventions. Single source of truth for which attributes apply to which operations (chat, text_completion, invoke_agent, execute_tool, invoke_workflow, retrieval).
- operation_mapping.py: Callback-to-semconv operation mapping. Maps each LangChain callback to the correct GenAI semantic convention operation name. Includes heuristic classification for on_chain_start callbacks (agents vs workflows vs internal plumbing) and LangGraph marker recognition.
- content_recording.py: Thin integration layer over the shared genai content capture utilities. Provides clear APIs for the callback handler to decide what content should be recorded on spans and events.

Part 1 of a series breaking down open-telemetry#4389 into smaller reviewable PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@@ -0,0 +1,313 @@
# Copyright The OpenTelemetry Authors
I'm curious as to why we need this module and not just take a dependency on the semantic conventions package?
@@ -0,0 +1,109 @@
# Copyright The OpenTelemetry Authors
It's quite difficult to validate the correctness of these new files without seeing the functions within them being actually used end to end and, in addition, how they will be used in the existing code. For someone not super familiar with the inner workings of langchain, it would be good to see a simple scenario (like chat) in which a patched method with span creation, recording metrics is done end to end before adding all the other mechanisms (like invoke agent, etc). Would you mind building the solution piece by piece so that each PR as a standalone adds end to end functionality so it's easier to review/test?
Description
Add foundation modules for enhanced LangChain GenAI semantic convention tracing. These are internal building-block modules that will be consumed by subsequent PRs in this series.
New Modules
- semconv_attributes.py: Per-operation attribute matrix based on OTel GenAI semantic conventions. Single source of truth for which attributes apply to which operations (chat, text_completion, invoke_agent, execute_tool, invoke_workflow, retrieval). Defines requirement levels (REQUIRED, CONDITIONALLY_REQUIRED, RECOMMENDED, OPT_IN) following the GenAI semconv spec.
- operation_mapping.py: Callback-to-semconv operation mapping. Maps each LangChain callback to the correct GenAI semantic convention operation name. Includes heuristic classification for on_chain_start callbacks (agents vs workflows vs internal plumbing) and LangGraph marker recognition.
- content_recording.py: Thin integration layer over the shared genai content capture utilities. Provides clear APIs for the callback handler to decide what content should be recorded on spans and events.
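For reviewers less familiar with LangChain internals, the rough shape of the first two modules might look like the sketch below. This is not the code in this PR: every name here (`Requirement`, `ATTRIBUTE_MATRIX`, `CALLBACK_TO_OPERATION`, `classify_chain`, `resolve_operation`, `applies`) is hypothetical, the requirement levels shown are illustrative rather than copied from the spec, and the name-based chain heuristic is an assumption about how agents, workflows, and internal plumbing could be told apart.

```python
from enum import Enum
from typing import Optional


class Requirement(Enum):
    """Requirement levels named in the PR description."""
    REQUIRED = "required"
    CONDITIONALLY_REQUIRED = "conditionally_required"
    RECOMMENDED = "recommended"
    OPT_IN = "opt_in"


# Hypothetical per-operation matrix: operation name -> attribute -> level.
# Only two operations are shown; the levels are illustrative assumptions.
ATTRIBUTE_MATRIX = {
    "chat": {
        "gen_ai.operation.name": Requirement.REQUIRED,
        "gen_ai.request.model": Requirement.CONDITIONALLY_REQUIRED,
        "gen_ai.usage.input_tokens": Requirement.RECOMMENDED,
    },
    "execute_tool": {
        "gen_ai.operation.name": Requirement.REQUIRED,
        "gen_ai.tool.name": Requirement.RECOMMENDED,
    },
}

# Hypothetical direct mapping from LangChain callback names to operations.
CALLBACK_TO_OPERATION = {
    "on_chat_model_start": "chat",
    "on_llm_start": "text_completion",
    "on_tool_start": "execute_tool",
    "on_retriever_start": "retrieval",
}


def applies(operation: str, attribute: str) -> bool:
    """Return True if the matrix lists the attribute for the operation."""
    return attribute in ATTRIBUTE_MATRIX.get(operation, {})


def classify_chain(name: Optional[str]) -> Optional[str]:
    """Assumed heuristic for on_chain_start, based only on the run name.

    Returns None for internal plumbing that should not get its own span.
    """
    if not name:
        return None
    if "agent" in name.lower():
        return "invoke_agent"
    if name == "LangGraph":  # assumed marker for a compiled graph run
        return "invoke_workflow"
    return None


def resolve_operation(callback: str, name: Optional[str] = None) -> Optional[str]:
    """Map a callback (plus optional run name) to an operation name."""
    if callback == "on_chain_start":
        return classify_chain(name)
    return CALLBACK_TO_OPERATION.get(callback)
```

Under these assumptions, `resolve_operation("on_chain_start", "RunnableSequence")` returns `None`, so a callback handler would skip span creation for that run, while `resolve_operation("on_chat_model_start")` returns `"chat"` and `ATTRIBUTE_MATRIX["chat"]` tells the handler which attributes to consider.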
Tests
- test_operation_mapping.py: comprehensive tests for chain classification, agent name resolution, LangGraph markers
- test_content_recording.py: tests for content policy modes, redaction behavior
PR Series
This is PR 1 of 5 breaking down #4389 into smaller reviewable units: