skill(add-apm-integrations): R13-R33 + Cat B (context-propagation) rules by jordan-wong · Pull Request #11760 · DataDog/dd-trace-java

jordan-wong · 2026-06-29T04:56:30Z

Summary

Ports reviewer-encoded rules from the toolkit's synced derivative (apm-integrations skill in DataDog/apm-instrumentation-toolkit) back to this canonical add-apm-integrations skill.

Over the past several weeks, expert reviewers (@PerfectSlayer, @amarziali) commented on toolkit-generated integration PRs (#11337 jedis-3.0, #11562 sparkjava-2.3, #11506 RxJava 3, #11709 feign, #11717 commons-httpclient). Their feedback was encoded as 21 numbered R-rules and a new "Step 4.4" section for context-propagation libraries.

Both the toolkit copy and this canonical copy should stay in sync — and they had drifted significantly (toolkit was 563/831 lines; canonical was 246).

What's added

Step 4.4 — Library category: span-creating vs context-propagation (NEW)

Routes reactive/async/executor/future/actor libraries (RxJava, Reactor, Kotlin coroutines, executors, etc.) into the context_propagation codegen path that produces InstrumenterModule.ContextTracking + per-type wrappers, rather than the standard span-creating advice. Includes:

Category A (span_creating) vs Category B (context_propagation) decision tree
Required output shape (target_kind, context_propagation sibling field)
The Flowable subscribe(FlowableSubscriber) overload rule (hook the framework-internal overload, not the public wrapper)

Step 4.5 — Java naming consistency (NEW)

Module-name and integration-name conventions.

Inline R-rules across Steps 4-12:

Where	Rule
Step 4	R32 — dir name must end with version OR allowed suffix
Step 5	R13 — no single-type helper for CallDepthThreadLocalMap
Step 5	R30 — preserve master's integration name when regenerating
Step 7	R15/R16/R17 — instrument the single delegate, not all overloads
Step 7	R33 — no NullPointerException catches; use null-check guards
Step 9	R14 — test error/exception scenarios + spotless
Step 9	R18 — muzzle excludes incompatible majors
Step 9	R19 — latestDepTestImplementation range matches instrumented
Step 9	R20 — Java tests only; no new .groovy files
Step 9	R28 — compileOnly / testImplementation version split rationale
Step 9	R29 — register names in `supported-configurations.json`
Step 9	R31 — assertInverse only when declared min is true min
Step 4.4	R21-R27 — Cat A vs Cat B classification + Cat B output schema

What's NOT added

The toolkit's "Quick Reference" navigation section (toolkit-internal)
The "Step 11.1 — disclose skill knowledge in every PR description" boilerplate that was added during eval research and later removed as research-process overhead, not a permanent rule
The toolkit's ## Step 4 (eval runs) – Use a -generated directory name section (toolkit-specific eval workflow, not relevant to the canonical skill)

Paired with

DataDog/apm-instrumentation-toolkit#472 — same content on the toolkit's synced derivative copy.

Test plan

This is a documentation/skill content PR. No runtime code changes, no CI hooks to run.

Both copies of the skill (canonical here + toolkit derivative) describe the same R-rules
No removal of existing Step 1-12 content; all additions are additive
(Future) re-run the rxjava3 toolkit generation against this updated canonical skill and verify the agent produces a working InstrumenterModule.ContextTracking integration autonomously

Stats

+542 -6 (one file: .claude/skills/add-apm-integrations/SKILL.md)

…ep 4.4 Port reviewer-encoded rules from the toolkit's synced derivative (`apm-integrations` skill in DataDog/apm-instrumentation-toolkit) back to the canonical `add-apm-integrations` skill. The toolkit accumulated 21+ numbered reviewer rules during eval research (commits flowing through PR #11337 jedis-3.0, PR #11562 sparkjava-2.3, PR #11717 commons-httpclient, PR #11709 feign, and PR #11506 RxJava 3). These rules encode real failure modes observed during agent-generated PRs and the fixes reviewers asked for. Sections added: - **Step 4.4** — Library category: span-creating vs context-propagation. New axis that routes Cat B libraries (reactive, async, executors, futures, actors) into a `context_propagation` codegen path producing `InstrumenterModule.ContextTracking` instead of span-creating advice. Includes the Flowable subscribe(FlowableSubscriber) overload rule (hook the framework-internal overload, not the public wrapper). - **Step 4.5** — Java naming consistency (module-name conventions). R-rule placements: - Step 4: R32 (dir name must end with version OR allowed suffix) - Step 5: R13 (no single-type helper class for CallDepthThreadLocalMap) R30 (preserve master's integration name when regenerating) - Step 7: R15/R16/R17 (single delegate method, not all overloads) R33 (no NullPointerException catches; use null-check guards) - Step 9: R14 (test error/exception scenarios + spotless) R18 (muzzle excludes incompatible majors) R19 (latestDepTestImplementation range matches instrumented) R20 (Java tests only; no new .groovy files) R28 (compileOnly/testImplementation version split rationale) R29 (register names in supported-configurations.json) R31 (assertInverse only when declared min is true min) - Step 4.4: R21-R27 (the Cat A vs Cat B classification + Cat B schema) This PR pairs with toolkit-side PR DataDog/apm-instrumentation-toolkit#472 which adds the same content to the toolkit's synced derivative copy. Both copies should stay in sync — this is the canonical home. Signed-off-by: Jordan Wong <jordan.wong@datadoghq.com>

dd-octo-sts · 2026-06-29T05:23:56Z

🟢 Java Benchmark SLOs — All performance SLOs passed

Suite	Status
Startup	🟢 pass

SLO thresholds are defined here based on automatically generated metrics. A warning is raised when results are within 5% of the threshold.

PR vs. master results

Scenario	Candidate	master	Δ (95% CI of mean)
startup:insecure-bank:iast:Agent	13.99 s	14.00 s	[-0.8%; +0.7%] (no difference)
startup:insecure-bank:tracing:Agent	12.96 s	13.08 s	[-1.9%; -0.0%] (maybe better)
startup:petclinic:appsec:Agent	16.96 s	16.75 s	[+0.2%; +2.4%] (maybe worse)
startup:petclinic:iast:Agent	16.91 s	16.95 s	[-1.1%; +0.5%] (no difference)
startup:petclinic:profiling:Agent	16.59 s	16.87 s	[-2.5%; -0.7%] (maybe better)
startup:petclinic:sca:Agent	16.92 s	16.53 s	[+1.4%; +3.3%] (significantly worse)
startup:petclinic:tracing:Agent	16.18 s	16.11 s	[-0.7%; +1.6%] (no difference)

Commit: de2291cd · CI Pipeline · Benchmarking Platform UI

Load and DaCapo benchmarks can be triggered manually in the GitLab pipeline. Results will appear in the Benchmarking Platform UI after completion.

Two cosmetic passes on the rule encoding ported from the toolkit's synced derivative copy. Substantive content unchanged. Pass 1 — strip R-numbering. The R13-R33 numbers are toolkit-internal traceability tags that tie each rule to a specific reviewer comment on a specific generated-PR review. They are meaningful to the toolkit-side eval research workflow but meaningless to readers of the canonical skill, who have no R1-R12 context here. Removes: - '#### R<NN> — <title>' prefixes (14 sub-section headings) - Inline '(R<NN>)', '(R<NN>/R<NN>/R<NN>)' parentheticals on existing bullets that already convey the rule in their wording - A stale cross-reference 'see R29 below' rewritten to 'see "Register new integration names"' The toolkit-side copy keeps the R-numbering — it remains the eval-research home where the traceability matters. This is a one-way port convention. Pass 2 — renumber half-steps so the decimals make sense. Was: Step 4 → Step 4.4 → Step 4.5 → Step 5 implying missing 4.1, 4.2, 4.3. The original numbering was an artifact of an earlier toolkit-side draft that had a 4.1-4.4 enumeration which got collapsed. Renumbered to: Step 4 → Step 4.1 → Step 4.2 → Step 5 Main integer steps (1-12) unchanged. Step 7.1 (Multiple advice classes and @AppliesOn) left as-is since its decimal already makes sense relative to Step 7. Signed-off-by: Jordan Wong <jordan.wong@datadoghq.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

skill(add-apm-integrations): R13-R33 + Cat B (context-propagation) rules#11760

skill(add-apm-integrations): R13-R33 + Cat B (context-propagation) rules#11760
jordan-wong wants to merge 2 commits into
masterfrom
skill/r13-r33-from-toolkit

jordan-wong commented Jun 29, 2026

Uh oh!

dd-octo-sts Bot commented Jun 29, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

jordan-wong commented Jun 29, 2026

Summary

What's added

What's NOT added

Paired with

Test plan

Stats

Uh oh!

dd-octo-sts Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🟢 Java Benchmark SLOs — All performance SLOs passed

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dd-octo-sts Bot commented Jun 29, 2026 •

edited

Loading