feat(langchain): add semconv attributes, operation mapping, and content recording modules [1/5] #4450

Open
nagkumar91 wants to merge 1 commit into open-telemetry:main from nagkumar91:langchain/pr1-foundation-modules


@nagkumar91
Contributor

Description

Add foundation modules for enhanced LangChain GenAI semantic convention tracing. These are internal building-block modules that will be consumed by subsequent PRs in this series.

New Modules

semconv_attributes.py

Per-operation attribute matrix based on OTel GenAI semantic conventions. Single source of truth for which attributes apply to which operations (chat, text_completion, invoke_agent, execute_tool, invoke_workflow, retrieval). Defines requirement levels (REQUIRED, CONDITIONALLY_REQUIRED, RECOMMENDED, OPT_IN) following the GenAI semconv spec.
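
As a rough illustration of what a per-operation attribute matrix might look like, the sketch below maps two operations to a handful of attributes with requirement levels. It is not taken from `semconv_attributes.py`: the names `RequirementLevel`, `ATTRIBUTE_MATRIX`, and `attributes_for` are hypothetical, and the attribute selection is abbreviated and should be checked against the GenAI semconv spec.

```python
from enum import Enum


class RequirementLevel(Enum):
    """Requirement levels named in the PR description."""

    REQUIRED = "required"
    CONDITIONALLY_REQUIRED = "conditionally_required"
    RECOMMENDED = "recommended"
    OPT_IN = "opt_in"


# Hypothetical, abbreviated matrix: operation name -> {attribute: level}.
# The real module presumably covers every operation and many more attributes.
ATTRIBUTE_MATRIX = {
    "chat": {
        "gen_ai.operation.name": RequirementLevel.REQUIRED,
        "gen_ai.request.model": RequirementLevel.CONDITIONALLY_REQUIRED,
        "gen_ai.usage.input_tokens": RequirementLevel.RECOMMENDED,
        "gen_ai.input.messages": RequirementLevel.OPT_IN,
    },
    "execute_tool": {
        "gen_ai.operation.name": RequirementLevel.REQUIRED,
        "gen_ai.tool.name": RequirementLevel.RECOMMENDED,
    },
}


def attributes_for(operation, level=None):
    """Return sorted attribute names for an operation, optionally filtered by level."""
    entries = ATTRIBUTE_MATRIX.get(operation, {})
    return sorted(
        name for name, lvl in entries.items() if level is None or lvl is level
    )
```

A caller could then ask `attributes_for("chat", RequirementLevel.REQUIRED)` to find which attributes a chat span must always carry; unknown operations yield an empty list.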

operation_mapping.py

Callback-to-semconv operation mapping. Maps each LangChain callback to the correct GenAI semantic convention operation name. Includes heuristic classification for on_chain_start callbacks (agents vs workflows vs internal plumbing) and LangGraph marker recognition.
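
A minimal sketch of the callback-to-operation idea follows. The callback names are LangChain's, but everything else is an assumption for illustration: the helper names `classify_chain` and `resolve_operation`, the `agent_name` metadata key, the use of the run name `LangGraph` as a marker, and the heuristic rules themselves are hypothetical and may differ from what `operation_mapping.py` does.

```python
from typing import Any, Mapping, Optional

# Callbacks whose operation does not depend on runtime details.
_STATIC_OPERATIONS = {
    "on_chat_model_start": "chat",
    "on_llm_start": "text_completion",
    "on_tool_start": "execute_tool",
    "on_retriever_start": "retrieval",
}

# Assumed marker: treated here as the name of a LangGraph root run.
_LANGGRAPH_ROOT_NAME = "LangGraph"


def classify_chain(
    name: str,
    metadata: Optional[Mapping[str, Any]] = None,
    parent_run_id: Optional[str] = None,
) -> Optional[str]:
    """Classify an on_chain_start callback; None means internal plumbing."""
    metadata = metadata or {}
    # Assumed heuristic: agent-like names or explicit agent metadata.
    if "agent" in name.lower() or "agent_name" in metadata:
        return "invoke_agent"
    # Assumed heuristic: LangGraph roots and top-level chains are workflows.
    if name == _LANGGRAPH_ROOT_NAME or parent_run_id is None:
        return "invoke_workflow"
    # Nested chains without agent hints get no span of their own.
    return None


def resolve_operation(
    callback: str,
    name: str = "",
    metadata: Optional[Mapping[str, Any]] = None,
    parent_run_id: Optional[str] = None,
) -> Optional[str]:
    """Map a callback name to a GenAI operation name, or None if unmapped."""
    if callback == "on_chain_start":
        return classify_chain(name, metadata, parent_run_id)
    return _STATIC_OPERATIONS.get(callback)
```

Returning `None` for nested, non-agent chains is one way to keep internal plumbing out of the trace; the actual module may use a different signal.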

content_recording.py

Thin integration layer over the shared genai content capture utilities. Provides clear APIs for the callback handler to decide what content should be recorded on spans and events.
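
The kind of decision such a layer makes could be sketched as below. This is a self-contained stand-in, not the PR's API: `CaptureMode` is a local enum assumed to mirror the capture modes of the shared utilities, and `should_record_on_span`, `should_record_on_event`, and `span_content` are hypothetical helper names.

```python
from enum import Enum
from typing import Optional


class CaptureMode(Enum):
    """Local stand-in for a content capture policy (assumed mode names)."""

    NO_CONTENT = 0
    SPAN_ONLY = 1
    EVENT_ONLY = 2
    SPAN_AND_EVENT = 3


def should_record_on_span(mode: CaptureMode) -> bool:
    """True when message content may be attached to span attributes."""
    return mode in (CaptureMode.SPAN_ONLY, CaptureMode.SPAN_AND_EVENT)


def should_record_on_event(mode: CaptureMode) -> bool:
    """True when message content may be attached to emitted events."""
    return mode in (CaptureMode.EVENT_ONLY, CaptureMode.SPAN_AND_EVENT)


def span_content(mode: CaptureMode, content: str) -> Optional[str]:
    """Return the content to put on a span, or None when the policy forbids it."""
    return content if should_record_on_span(mode) else None
```

Keeping the policy checks in small pure functions like these would let a callback handler ask one question per span or event without knowing how the mode was configured.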

Tests

  • test_operation_mapping.py — comprehensive tests for chain classification, agent name resolution, LangGraph markers
  • test_content_recording.py — tests for content policy modes, redaction behavior

PR Series

This is PR 1 of 5 breaking down #4389 into smaller reviewable units:

  1. Foundation modules (this PR)
  2. Provider inference, message formatting, W3C propagation utilities
  3. Span manager enhancements + Event emitter
  4. Callback handler — model, chain, agent callbacks + wiring
  5. Callback handler — tool, retriever callbacks + E2E tests

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist

  • Followed the style guidelines of this project
  • Unit tests have been added

…nt recording modules

Add foundation modules for enhanced LangChain GenAI semantic convention tracing:

- semconv_attributes.py: Per-operation attribute matrix based on OTel GenAI
  semantic conventions. Single source of truth for which attributes apply to
  which operations (chat, text_completion, invoke_agent, execute_tool,
  invoke_workflow, retrieval).

- operation_mapping.py: Callback-to-semconv operation mapping. Maps each
  LangChain callback to the correct GenAI semantic convention operation name.
  Includes heuristic classification for on_chain_start callbacks (agents vs
  workflows vs internal plumbing) and LangGraph marker recognition.

- content_recording.py: Thin integration layer over the shared genai content
  capture utilities. Provides clear APIs for the callback handler to decide
  what content should be recorded on spans and events.

Part 1 of a series breaking down open-telemetry#4389 into smaller reviewable PRs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@@ -0,0 +1,313 @@
# Copyright The OpenTelemetry Authors
Contributor

I'm curious why we need this module rather than just taking a dependency on the semantic conventions package.

@@ -0,0 +1,109 @@
# Copyright The OpenTelemetry Authors
Contributor

@lzchen lzchen Apr 29, 2026

It's quite difficult to validate the correctness of these new files without seeing their functions actually used end to end and, in addition, how they will fit into the existing code. For someone not super familiar with the inner workings of LangChain, it would be good to see a simple scenario (like chat) in which a patched method, with span creation and metric recording, works end to end before adding all the other mechanisms (like invoke agent, etc.). Would you mind building the solution piece by piece so that each PR, standing alone, adds end-to-end functionality and is easier to review and test?

@lzchen lzchen added the gen-ai Related to generative AI label Apr 30, 2026
Labels

gen-ai Related to generative AI

2 participants