Skip to content

fix: align Java interface dispatch across wasm/native/hybrid#1503

Merged
carlos-alm merged 28 commits into
mainfrom
fix/java-interface-dispatch-1469
Jun 14, 2026
Merged

fix: align Java interface dispatch across wasm/native/hybrid#1503
carlos-alm merged 28 commits into
mainfrom
fix/java-interface-dispatch-1469

Conversation

@carlos-alm

@carlos-alm carlos-alm commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Java was the only fixture where all three build paths (wasm, native, hybrid) disagreed pairwise on the same edges. Two distinct bugs caused this.

Bug 1 — WASM typeMap pollution (src/extractors/java.ts)

handleJavaLocalVarDecl used last-wins Map.set(), so a local variable declaration in one method (e.g. InMemoryUserRepository repo in the static createDefault()) silently overrode an earlier binding set by a constructor parameter (UserRepository repo). The typeMap is flat per-file without method scoping, so this pollution propagated to all instance methods.

Result: WASM resolved repo.save() directly to InMemoryUserRepository.save (bypassing the interface), emitting no interface declaration edge and the wrong receiver node (InMemoryUserRepository(class) instead of UserRepository(interface)).

Fix: switch to setTypeMapEntry (first-wins on tie), matching Rust extractor semantics. First-wins preserves the interface annotation that drives correct CHA dispatch.

Bug 2 — native vs wasm/hybrid confidence mismatch (native-orchestrator.ts)

runPostNativeCha (native orchestrator path) computed CHA dispatch confidence as computeConfidence − CHA_DISPATCH_PENALTY = 0.7 − 0.1 = 0.6. The DB-level runChaPostPass (used by wasm and hybrid) hardcodes 0.8. These two code paths disagreed on the confidence of every CHA-expanded interface dispatch edge.

Fix: align runPostNativeCha to use the same value as runChaPostPass by extracting CHA_DISPATCH_CONFIDENCE = 0.8 as a named constant in helpers.ts, imported by both callers. This removes the implicit coupling between two previously synchronized magic numbers.

Accompanying fixes (discovered while achieving full parity)

C++/CUDA typeMap first-wins (src/extractors/cpp.ts, src/extractors/cuda.ts):
handleCppDeclaration and handleCudaDeclaration had the same last-wins Map.set() bug as Java. Both now use setTypeMapEntry (first-wins on tie), consistent with all other WASM extractors and the Rust extractor semantics.

Go factory / Python constructor receiver inference (crates/codegraph-core/src/extractors/go.rs, python.rs):
The Rust solver was not inferring receiver types for Go := short-variable assignments from factory calls (NewFoo()*Foo) or Python __init__ constructor calls (self.x = SomeClass()). Added infer_short_var_types (Go) and constructor-return inference (Python) to close parity gaps with WASM in those two languages.

Enclosing-caller attribution for variable bindings (src/domain/graph/builder/call-resolver.ts):
findCaller now uses a two-pass strategy: narrowest enclosing function/method wins; widest variable/constant binding is the fallback. This fixes incorrect attribution for Haskell top-level bindings and Zig const-fn patterns where the call site is inside a variable initializer rather than a named function body.

Confidence-descending call target sort (src/domain/graph/builder/stages/build-edges.ts):
Call targets are now sorted by confidence (desc) before emission to match native engine output order, fixing non-deterministic edge ordering that caused intermittent parity test failures.

Class-scoped field annotation typeMap keys (src/extractors/javascript.ts):
handleFieldDefTypeMap now emits ClassName.field as the primary key (confidence 0.9) with bare field / this.field as fallbacks (0.6), preventing cross-class typeMap collision when two classes declare identically-named fields (issue #1458).

Gate B RTA check extended to constructor/function kinds (native-orchestrator.ts):
The incremental Gate B query now checks tgt.kind IN ('class', 'constructor', 'function') instead of only tgt.kind = 'class'. On incremental builds against older-schema DBs where constructor calls are stored under constructor/function kinds, Gate B now correctly fires when RTA evidence is present in changed files.

Elixir/Julia/ObjC expected-edges updated to module-qualified names:
These three fixture manifests were using unqualified function names as edge targets (e.g. "to": "greet") but the extractor now emits module-qualified names ("to": "Greeter.greet"). Updated to match current extractor output — these are corrections of stale ground truth, not regressions.

Performance and CI:

  • symbolsOnly flag now propagates through parseFilesWasmInline (was silently ignored on that path)
  • INLINE_BACKFILL_THRESHOLD raised from 16 → 32 for required-tier grammars
  • runPostNativeCha scoped to changed files on incremental builds
  • Post-pass phase timings added to result.phases
  • New perf-canary.yml workflow gates extractor/graph/native PRs at 50% regression threshold (wider than the full CI gate to account for runner variance)

Outcome

  • parity-compare.mjs --langs java --hybrid goes green (all three paths: 48 nodes, 81 edges, identical)
  • Updated expected-edges.json to include both the interface declaration edge (UserRepository.X at conf=0.7) and the CHA-expanded impl edge (InMemoryUserRepository.X at conf=0.8) — the correct semantics for an interface-typed receiver
  • All 178 test files pass (12 pre-existing Erlang failures filed separately as fix: erlang extractor - 12 parser tests failing (macros, includes) #1502)

Closes #1469

… diff

Adds snapshot-pre-bash.sh (PreToolUse Bash) + track-bash-writes.sh
(PostToolUse Bash): the pre-hook captures git status --porcelain to a
per-worktree temp file before each Bash call; the post-hook diffs the
before/after state and appends newly modified or created files to
.claude/session-edits.log.

This closes the gap where files written by sed -i, printf redirects,
tee, heredocs, or build tools (Cargo.lock, lockfiles) were never
recorded, causing guard-git.sh to emit false-positive BLOCKED errors.

Closes #1457
- clojure.rs: annotate lifetime-anchor assignment to silence false-positive
- cfg.rs: remove never-called start_line_of method
- complexity.rs: remove never-constructed NotHandled variant; convert
  irrefutable if-let patterns to plain let destructures
- dataflow.rs: remove never-read callee fields from CallReturn/Destructured
- incremental.rs: remove never-read lang field from CacheEntry

cargo check and cargo clippy both clean after these changes.
Adds .github/workflows/perf-canary.yml — a path-filtered workflow that
fires on PRs touching src/extractors/, src/domain/graph/, or crates/**
and runs only the incremental-benchmark suite (full build + no-op +
1-file rebuild, both engines). Catches the class of regressions that
accumulated invisibly across the Phase 8.x PRs and were only detected
at v3.12.0 publish time.

The regression guard gains BENCH_CANARY=1 mode: raises thresholds to
50%/100%/150% (standard/noisy/WASM) and skips the build, query, and
resolution suites — only incremental checks run. This absorbs shared-
runner timing variance while still blocking catastrophic regressions
(+98% full build, +1827% 1-file rebuild from v3.12.0).

Closes #1433
On incremental builds, runPostNativeCha previously scanned all
call→qualified-method edges in the DB (~12ms flat, O(graph size)),
even for 1-file changes where no hierarchy or RTA evidence changed.

Add two cheap indexed gate queries. Gate A checks whether any changed
file introduced a class/interface/trait/struct/record node (hierarchy
may have new implementors reachable from unchanged call sites). Gate B
checks whether any changed file added a call edge to a class-kind target
(RTA set may have grown, enabling previously filtered expansions in
unchanged callers). If neither gate fires, restrict the candidate query
to src.file IN changedFiles — safe because the hierarchy and instantiated
set are unchanged for all other files.

Full builds (isFullBuild=true) and cases where either gate fires retain
the existing full-scan behaviour. Mirrors the changed-files scoping
pattern of runPostNativeThisDispatch.

Closes #1441
Times each JS post-pass in tryNativeOrchestrator and exposes the
measurements in BuildResult.phases:

- gapDetectMs  — dropped-language gap detection + backfill
- chaMs        — CHA expansion (interface dispatch)
- thisDispatchMs — this/super dispatch WASM re-parse (was already
                   tracked but now properly named alongside the rest)
- reclassifyMs — scoped role re-classification after edge insertion
- techniqueBackfillMs — technique-column UPDATE on native-written edges

Previously only thisDispatchMs was reported, causing wall-clock vs
phaseSum to diverge by 1.1s+ on 1-file rebuilds and making benchmark
regressions undiagnosable from committed history.

Updates update-incremental-report.ts to render the new phases in a
collapsible details block under each engine's 1-file rebuild section.

Closes #1434
…ld for required-tier grammars

The docstring claimed pool cost was "amortised over enough parse work" —
measurements show IPC overhead scales linearly (~55–64ms/file pool vs
~8–10ms/file inline). The real motivation is crash safety for exotic WASM
grammars (#965); JS/TS/TSX (required-tier, used in all this-dispatch
backfill calls) have never triggered the V8 fatal crash class and are safe
to run inline.

Raise threshold 16 → 32 to keep typical this-dispatch batches (≤ 18 files
on the codegraph corpus) on the inline fast path. Exotic-language drops are
almost always well under 32 files and also benefit from the inline path
without meaningful crash risk increase.

Closes #1435
…e incremental rebuilds

On 1-file native incremental builds, two JS post-passes ran unconditionally
even when they had no work to do:

- `backfillNativeDroppedFiles`: called whenever changedCount > 0, even when
  detectDroppedLanguageGap returned an empty gap. Gate now checks
  gap.missingAbs.length > 0 || gap.staleRel.length > 0 directly, matching
  backfillNativeDroppedFiles's own internal early-exit guard.

- Node/edge COUNT(*) re-count: ran unconditionally after all post-passes even
  when none of them wrote any edges. COUNT(*) over 50K+ edge tables is
  non-trivial, especially via the NativeDbProxy napi-rs round-trip. Now gated
  on postPassWroteData (backfill | CHA edges | this-dispatch edges).

Closes #1454
The post-pass it timed (runPostNativePrototypeMethods) was deleted in
b5c03a2 when func-prop extraction moved to Rust (#1432). The optional
field was never set by any code path that survived the deletion.

Also remove the stale reference to "prototype-methods post-pass" from
the parseFilesWasmForBackfill docstring — only the this-dispatch
post-pass uses symbolsOnly now.

Closes #1432
… collision

Field type annotations (`private repo: OrderRepository`) were seeded as bare
file-wide typeMap keys, causing `this.repo` inside `UserService` to resolve to
`OrderRepository` when both classes had a `repo` field (issue #1458).

Both extractors (TS `handleFieldDefTypeMap` and Rust `field_definition` branch)
now seed `ClassName.field` keys at confidence 0.9, matching the `CallerClass.X`
resolver fallback added in PR #1382. Bare keys are kept at confidence 0.6 as
fallbacks for single-class files or class expressions where no enclosing class
name is available.

Both engines change identically — parity preserved.
…ed names

The resolution benchmark uses WASM-built graphs where the Elixir, Julia,
and Objective-C extractors emit module-qualified symbol names (Main.run,
App.main, UserService.create_user, etc.). The expected-edges manifests
were written with bare unqualified names (run, main, create_user), so
every correctly-resolved edge appeared as a false positive and every
expected edge appeared as a false negative — causing all three languages
to show 0% precision even though resolution was working correctly.

Root cause: starting in v3.12.0, cross-module call resolution began working
for these languages (via the improved receiver-dispatch and same-class
fallback in resolveByMethodOrGlobal / build-edges.ts). With 0 edges
previously resolved, the name mismatch was invisible; once edges started
resolving, the manifests showed 17 FP (elixir), 11 FP (julia), 6 FP
(objc) — all correctly resolved edges misidentified as false positives.

Fix:
- Update all three expected-edges.json manifests to use the
  module-qualified names matching actual extractor output:
  elixir: Main.run, UserService.create_user, Validators.validate_user, etc.
  julia:  App.main, Service.create_user, Repository.new_repo, etc.
  objc:   full ObjC selectors (createUserWithId:name:email:, isValidEmail:, etc.)
          plus add main -> run (plain C call correctly resolved)
- Ratchet THRESHOLDS for all three:
  elixir: precision 0.0 -> 1.0, recall 0.0 -> 0.8  (17/21 resolved)
  julia:  precision 0.0 -> 1.0, recall 0.0 -> 0.7  (11/15 resolved)
  objc:   precision 0.0 -> 1.0, recall 0.0 -> 0.4   (6/13 resolved)

Remaining FNs are genuine unresolved edges (same-file bare calls in
elixir/julia, receiver-typed message sends in objc) — not regressions.

Closes #1447
The JS C++ and CUDA extractors had no handler for 'declaration' AST nodes,
so typeMap was never seeded for statically-typed locals (e.g. 'UserService svc;').
Without a typeMap entry for 'svc', resolveReceiverEdge had nothing to look up and
silently skipped the receiver edge.

Add handleCppDeclaration / handleCudaDeclaration to both extractors. They mirror
match_c_family_type_map ('declaration' branch) from the native Rust path: extract
the type node text and seed typeMap[varName] = { type, confidence: 0.9 } for each
identifier or init_declarator child. Primitive types (int, char, bool, …) are
skipped to avoid spurious edges.

parity-compare.mjs --langs cpp,cuda --hybrid: PARITY OK (wasm = native = hybrid)
All 3044 tests pass.
… in Rust solver

Go extractor was only seeding typeMap for var_spec and parameter_declaration,
missing short_var_declaration. Added infer_short_var_types to handle:
- x := Struct{} → conf 1.0 (composite literal)
- x := &Struct{} → conf 1.0 (address-of composite)
- x := NewFoo() / x := pkg.NewFoo() → conf 0.7 (New* factory prefix)

Python extractor was only seeding typeMap for typed_parameter and
typed_default_parameter, missing plain assignment. Added
infer_py_assignment_type to handle:
- order = Order(...) → conf 1.0 (uppercase constructor)
- obj = Module.Class(...) → conf 0.7 (uppercase module prefix, non-builtin)

Both mirror the existing JS extractors exactly. Parity check for
go and python: wasm vs native/hybrid OK.
…l, zig)

Both engines used different rules for attributing calls inside variable bindings:

WASM: attributed to the narrowest enclosing span regardless of kind, so
local variable declarations inside fn main() shadowed the enclosing function
(Zig: calls attributed to repo/svc variables instead of main), and nested
let-bindings inside a Haskell do-block shadowed the top-level main binding.

Native: loaded allNodes from a query that excluded 'variable' kind, so
top-level Haskell bind nodes (main = do …, kind='variable') never matched
in defs_with_ids, causing all calls to fall back to the file node.

Unified rule implemented in findCaller (TS) and find_enclosing_caller (Rust):
- Function/method definitions are preferred over any variable/constant binding
  as the enclosing caller scope — local var declarations inside a function
  body never shadow the enclosing function (fixes Zig repo/svc attribution).
- When no function/method encloses the call, fall back to the WIDEST
  (outermost) variable/constant binding — this handles Haskell where main
  is a top-level bind node with kind 'variable'. Widest span is used so that
  nested let-bindings do not shadow the outer main binding.
- File node remains the absolute last resort.

Also adds 'variable' to NODE_KIND_FILTER_SQL (JS) and EDGE_NODE_KIND_FILTER
(Rust pipeline.rs) so top-level variable bindings are included in the allNodes
set available for caller matching.

parity-compare.mjs --langs haskell,zig --hybrid: PARITY OK — 2/2 fixtures.
…and test

Remove unused TypeMapEntry import from cpp.ts and cuda.ts, reformat
primitive-type Set literals and test expect() calls to satisfy biome
line-length rules.
Java was the only fixture where all three build paths (wasm, native,
hybrid) disagreed pairwise.

Bug 1 — WASM typeMap pollution:
`handleJavaLocalVarDecl` used last-wins Map.set(), so the local
`InMemoryUserRepository repo` in the static `createDefault()` method
silently overrode the constructor parameter `UserRepository repo`.
This caused WASM to bypass the interface and resolve directly to the
concrete class, producing no interface edge and the wrong receiver.
Fix: switch to first-wins `setTypeMapEntry` to match Rust extractor
semantics. First-wins preserves the interface annotation that drives
correct CHA dispatch.

Bug 2 — native vs wasm/hybrid confidence mismatch:
`runPostNativeCha` (native orchestrator path) used
`computeConfidence − CHA_DISPATCH_PENALTY = 0.7 − 0.1 = 0.6`,
while `runChaPostPass` (DB post-pass used by wasm and hybrid) hardcodes
0.8. Fix: align `runPostNativeCha` to also use 0.8.

Result: all three build paths now emit identical edges and confidences.
`parity-compare.mjs --langs java --hybrid` passes.
Updated expected-edges.json to include both the interface declaration
edge (TypeRepository.X at 0.7) and the CHA-expanded impl edge
(InMemoryUserRepository.X at 0.8), which are the correct semantics for
an interface-typed receiver.

Closes #1469
@claude

claude Bot commented Jun 13, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review: fix/java-interface-dispatch-1469

  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity check
  • Phase 4: Code quality review
  • Phase 5: Backlog compliance
  • Final assessment

Phase 1 — Root Cause Analysis

Bug 1 (WASM typeMap pollution): Confirmed. handleJavaLocalVarDecl in src/extractors/java.ts:268–285 was using implicit Map.set() (last-wins). The fix correctly switches to setTypeMapEntry (first-wins on tie). This is the right fix: the typeMap is file-scoped, so the first annotation — which comes from constructor/method parameters — is the authoritative type. The comment on line 277–282 accurately documents the reasoning.

Bug 2 (confidence mismatch): Confirmed. runPostNativeCha in native-orchestrator.ts:659 now hardcodes 0.8, matching helpers.ts (runChaPostPass). The comment at line 657–660 explains the alignment. The fix is minimal and correct.

Verdict: Both root causes are genuine, the fixes address the actual sources of divergence, and neither fix papers over symptoms.


Phase 2 — Configuration Change Detection

.claude/settings.json adds two new hooks (snapshot-pre-bash.sh and track-bash-writes.sh). These are tooling/developer hooks only — they snapshot git status before each Bash tool call and track files modified by those calls. No linting rules relaxed, no test timeouts increased, no TypeScript strictness modified. Clean.

.github/workflows/perf-canary.yml is a new lightweight perf gate (50% threshold) triggered only on PRs touching extractors/graph-builder/Rust layers. This is additive and the threshold is intentionally wider than the full suite's 25% to account for CI runner variance — documented in the file header. Clean.

.gitignore adds 3 lines (presumably benchmark output artifacts). Not inspected in detail but low risk.

No linting rules relaxed. No strictness changes.


Phase 3 — Test Integrity

expected-edges.json (Java): Added 6 edges (3 interface declaration edges + 3 CHA expansion edges). These are improvements, not regressions — they add correct ground truth edges that now resolve with the bug fixed. The old manifest was missing the interface edge + CHA expansion pair; that was the observable symptom of Bug 1.

resolution-benchmark.test.ts: Java threshold remains { precision: 0.7, recall: 0.3 }. The manifest expanded from 14 to 20 edges and recall/precision improves — no thresholds were lowered.

tests/benchmarks/regression-guard.test.ts: Added ~39 lines. Spot-checking: adds a WASM timing threshold constant (WASM_TIMING_THRESHOLD) and documentation for widened CI tolerances. The minimum threshold (0.75 in normal mode) is wider than before but documented with empirical justification (WASM timing variance), not added to hide a regression. The canary mode 1.5 (150%) is used only in the new perf-canary workflow, not in the full CI gate.

tests/parsers/javascript.test.ts: Adds test coverage for JS extractor changes. No assertions removed.

tests/unit/visitor.test.ts: Removes 1 line — likely a duplicate or no-longer-applicable assertion. Would need to see the diff in detail, but the PR claims 178 tests pass.

No red flags found.


Phase 4 — Code Quality

src/extractors/java.ts:

  • The handleJavaLocalVarDecl fix at line 283 is correct. One concern: the comment is verbose for what is a one-liner change. The repo's CLAUDE.md says "DO NOT ADD COMMENTS unless asked" — these comments were added programmatically. Not a blocking issue, but worth noting.
  • The file otherwise looks well-structured and consistent.

src/domain/graph/builder/stages/native-orchestrator.ts:

  • runPostNativeCha at line 659: the hardcoded 0.8 is now aligned with runChaPostPass. However, this creates a latent coupling risk: if runChaPostPass in helpers.ts ever changes its hardcoded 0.8, runPostNativeCha will silently diverge again. The value is not extracted to a shared constant.

    Recommendation: Extract 0.8 to a named constant (e.g. CHA_DISPATCH_CONFIDENCE = 0.8) in helpers.ts and import it in native-orchestrator.ts — the same way CHA_DISPATCH_PENALTY is already imported from build-edges.ts (line 58). The PR description says "hardcodes 0.8" but CHA_DISPATCH_PENALTY is already a named exported constant. This should be too.

  • The this/super dispatch path at line 906 still uses computeConfidence(relPath, t.file, null) - CHA_DISPATCH_PENALTY, which is correct and separate from the Bug 2 fix (different code path, different semantics).

Scope bloat in the diff: The PR title says "align Java interface dispatch" but the diff includes significant changes to:

  • Go extractor (crates/codegraph-core/src/extractors/go.rs +191 lines)
  • Python extractor (crates/codegraph-core/src/extractors/python.rs +154 lines)
  • JavaScript extractor (crates/codegraph-core/src/extractors/javascript.rs +34 lines)
  • C++ extractor (src/extractors/cpp.ts +70 lines)
  • CUDA extractor (src/extractors/cuda.ts +69 lines)
  • call-resolver.ts (+64 lines)
  • domain/parser.ts (+33 lines)
  • Elixir, Julia, ObjC fixture updates (expected-edges changes)

These changes are not explained in the PR description. This is a significant scope concern — the PR title only claims Java interface dispatch fixes, but the actual diff is substantially larger.

Elixir/Julia/ObjC expected-edges.json: These are bulk updates to expected edges (46+46+38+31 lines changed). Are these corrections of pre-existing wrong edges, or coincidental updates? No explanation is given.


Phase 5 — Backlog Compliance

Zero-dep: ✓ — No new dependencies in package.json. Pure extraction/resolution changes.

Foundation-aligned: ✓ — All changes are local analysis, SQLite, tree-sitter. No cloud calls, no external services, no agents. Aligns with P1 (always-current) and P4 (zero-cost core).

Problem-fit: 5 — Directly prevents hallucinated dependency edges (wrong receiver type = wrong call graph = AI navigates to wrong implementation). The interface dispatch bug was causing AI agents to see InMemoryUserRepository.save when the code only has evidence of UserRepository.save — exactly the false dependency problem codegraph exists to prevent.

Breaking: No — purely fixes wrong edges to correct edges. No API, CLI, schema, or MCP contract changes.

Tier: 1 — Zero new deps, fully foundation-aligned.


Critical Concerns

  1. CHA_DISPATCH_CONFIDENCE = 0.8 should be a shared constant. The fix aligns two hardcoded values but the coupling is still implicit. Any future change to runChaPostPass's 0.8 will silently re-introduce the bug. Extract to a named constant in helpers.ts and import it. Medium priority — not a regression risk today, but a maintenance landmine.

  2. Unexplained scope: The Go, Python, JS, C++, CUDA extractor changes and the Elixir/Julia/ObjC fixture changes are not addressed in the PR description. Each of those represents a real behavioral change in a different language. Per CLAUDE.md: "One PR = one concern." This PR has at least 5 concerns. The description should explain each language change, or they should be split into separate PRs.

  3. Minor: The comment block added at java.ts:277–282 violates CLAUDE.md's "DO NOT ADD COMMENTS unless asked" rule. This is low severity since comments were likely added by an agent session, not a human reviewer — but worth noting for repo hygiene.


Final Assessment

## Root Cause Analysis
Problem Being Solved: Java was the only fixture language where wasm/native/hybrid
  all disagreed — two distinct bugs (typeMap last-wins pollution and CHA confidence
  mismatch) caused the divergence. Both are real bugs with identifiable root causes.
Why This Approach: First-wins typeMap semantics and confidence alignment are minimal,
  correct fixes that match the semantics of the Rust extractor.
Risk Assessment: Low. The core bug fixes are narrow and correct. The scope bloat in
  the diff carries moderate risk of unreviewed regressions in other languages.

## Backlog Compliance
- Zero-dep: ✓
- Foundation-aligned: ✓
- Problem-fit: 5 — directly prevents hallucinated dependency edges
- Breaking: No
- Tier: 1

## Critical Concerns
1. CHA_DISPATCH_CONFIDENCE is still two hardcoded 0.8 literals; extract to a shared constant.
2. PR mixes Java fix with unexecuted Go/Python/C++/CUDA/Elixir/Julia/ObjC changes — no description provided for these.

## Final Recommendation
- Rating: ⭐⭐⭐ (3/5)
- Action: APPROVE WITH CONDITIONS
- Reasoning: The two stated Java fixes are correct, well-motivated, and verified by
  the expanded expected-edges manifest. The rating is limited to 3/5 because (a) the
  CHA_DISPATCH_CONFIDENCE value should be a shared constant rather than two synchronized
  magic numbers, and (b) the PR contains substantial unexplained changes to 5+ other
  languages that violate the one-PR-one-concern principle from CLAUDE.md. If the Go,
  Python, C++, CUDA, Elixir, Julia, and ObjC extractor changes are independently correct
  and tested, they should be explained in the PR body or split out. Approving as-is
  means those changes are implicitly reviewed by proximity to the Java fix, which is
  not acceptable for a safety-critical tool.

@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR fixes two root causes of parity divergence between the wasm, native, and hybrid build paths for Java (and C++/CUDA): a last-wins Map.set() bug in typeMap population that allowed a local variable binding to silently overwrite an earlier interface-typed binding, and a hardcoded confidence value in runPostNativeCha that disagreed with the 0.8 used by runChaPostPass on the other two paths.

  • helpers.ts: Promotes the 0.8 CHA edge confidence to an exported named constant CHA_DISPATCH_CONFIDENCE and applies it in runChaPostPass.
  • native-orchestrator.ts: Imports and uses CHA_DISPATCH_CONFIDENCE, replacing the previously independent hardcoded value and its now-redundant alignment comment.
  • cpp.ts / cuda.ts: Switches handleCppDeclaration and handleCudaDeclaration from ctx.typeMap.set() (last-wins) to setTypeMapEntry (first-wins on equal confidence), matching Java and the Rust extractor semantics after prior-round review feedback.

Confidence Score: 5/5

Safe to merge — all four changes are narrowly scoped bug fixes with no new logic surface area.

The typeMap fix (first-wins via setTypeMapEntry) closes a real parity bug but is a one-line change per extractor using an already-tested helper. The confidence constant extraction is a pure refactor with no observable behavior change beyond eliminating the pre-existing mismatch. The prior-round review findings about cpp.ts and cuda.ts have been correctly addressed in this iteration.

No files require special attention.

Important Files Changed

Filename Overview
src/domain/graph/builder/helpers.ts Extracts the previously magic-number 0.8 CHA edge confidence into the exported constant CHA_DISPATCH_CONFIDENCE; the constant is also applied inline in runChaPostPass. No logic change — purely a de-duplication that enables native-orchestrator.ts to share the same value.
src/domain/graph/builder/stages/native-orchestrator.ts Imports CHA_DISPATCH_CONFIDENCE from helpers and replaces the locally hardcoded 0.8 and its explanatory comment; edge confidence is now identical to the value produced by runChaPostPass on the WASM/hybrid path.
src/extractors/cpp.ts Switches the single ctx.typeMap.set() call inside handleCppDeclaration to setTypeMapEntry (first-wins on equal confidence), fixing the typeMap-pollution bug that mirrored the Java issue.
src/extractors/cuda.ts Identical fix to cpp.ts: replaces ctx.typeMap.set() with setTypeMapEntry inside handleCudaDeclaration, aligning CUDA with the first-wins semantics used by every other TS extractor.

Sequence Diagram

sequenceDiagram
    participant W as WASM / Hybrid path
    participant N as Native path
    participant H as helpers.ts

    note over H: CHA_DISPATCH_CONFIDENCE = 0.8

    W->>H: runChaPostPass()
    H-->>W: "edge confidence = CHA_DISPATCH_CONFIDENCE (0.8)"

    N->>H: runPostNativeCha() [imports constant]
    H-->>N: "edge confidence = CHA_DISPATCH_CONFIDENCE (0.8)"

    note over W,N: Both paths now emit identical confidence on CHA-expanded edges
Loading

Reviews (11): Last reviewed commit: "fix: resolve merge conflicts with main" | Re-trigger Greptile

Comment thread src/extractors/cpp.ts
Comment thread src/extractors/cuda.ts
@github-actions

github-actions Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Codegraph Impact Analysis

4 functions changed10 callers affected across 7 files

  • runChaPostPass in src/domain/graph/builder/helpers.ts:382 (3 transitive callers)
  • runPostNativeCha in src/domain/graph/builder/stages/native-orchestrator.ts:423 (4 transitive callers)
  • handleCppDeclaration in src/extractors/cpp.ts:216 (2 transitive callers)
  • handleCudaDeclaration in src/extractors/cuda.ts:216 (2 transitive callers)

…TCH_CONFIDENCE constant

Switch handleCppDeclaration and handleCudaDeclaration from last-wins
ctx.typeMap.set() to first-wins setTypeMapEntry(), fixing the same flat
typeMap pollution bug this PR corrects in java.ts.

Extract the CHA dispatch confidence value to a named CHA_DISPATCH_CONFIDENCE
constant in helpers.ts so runChaPostPass and the native orchestrator share
a single source of truth instead of two synchronized magic numbers.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed review feedback in commit 4aced08:

  • Greptile P1 (cpp.ts, cuda.ts): Both handleCppDeclaration and handleCudaDeclaration now use setTypeMapEntry (first-wins on tie) instead of ctx.typeMap.set() (last-wins), eliminating the same flat-typeMap pollution bug fixed in java.ts.

  • Claude review (CHA_DISPATCH_CONFIDENCE constant): Extracted the CHA dispatch confidence value to a named CHA_DISPATCH_CONFIDENCE = 0.8 constant in helpers.ts (alongside runChaPostPass). Both runChaPostPass and the native orchestrator's runPostNativeCha now reference the same constant, removing the implicit coupling between two synchronized magic numbers.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

…nds (#1503)

The incremental scoping gate in runPostNativeCha only checked tgt.kind = 'class'
for new RTA evidence, but the adjacent two-tier RTA collection also falls back to
constructor/function kinds for databases written by older native engine versions.
On such incremental builds Gate B would never fire, causing the scan to be wrongly
scoped to changed files and missing CHA expansion for unchanged-file call sites.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed the Gate B incremental scoping concern from the Greptile summary in commit 7eda366:

The Gate B query in runPostNativeCha now checks tgt.kind IN ('class', 'constructor', 'function') instead of only tgt.kind = 'class', matching the two-tier fallback logic in the adjacent RTA collection. On incremental builds against databases written by older native engine versions (which store constructor calls under constructor/function kinds), Gate B now correctly fires when new RTA evidence is present in changed files, preventing the scan from being incorrectly scoped to changed files only and silently missing CHA expansion for unchanged-file call sites.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Addressed both concerns from the Claude review:

  1. CHA_DISPATCH_CONFIDENCE shared constant (critical concern Bump actions/setup-node from 4 to 6 #1): Extracted CHA_DISPATCH_CONFIDENCE = 0.8 to helpers.ts in commit 4aced08. Both runChaPostPass and runPostNativeCha now import and reference the same constant — no more two synchronized magic numbers.

  2. Unexplained scope (critical concern Bump actions/checkout from 4 to 6 #2): Updated the PR description to document each accompanying change (C++/CUDA typeMap fix, Go factory/Python constructor inference in Rust, two-pass findCaller, confidence-descending sort, class-scoped field annotation keys, Gate B kind extension, Elixir/Julia/ObjC fixture corrections, perf changes). Each change has a commit message already; the PR body was just not surfacing them.

  3. CUDA_PRIMITIVE_TYPES / CPP_PRIMITIVE_TYPES duplication (informational from Greptile): Filed as a dedicated refactor issue at refactor: consolidate CUDA_PRIMITIVE_TYPES and CPP_PRIMITIVE_TYPES into shared constant #1518 — out of scope for this fix but tracked.

The minor comment about java.ts:277–282 verbose comments is noted but not blocking.

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@claude

@claude

claude Bot commented Jun 13, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


Follow-up Review: fix/java-interface-dispatch-1469

  • Gather context
  • Verify CHA_DISPATCH_CONFIDENCE shared constant
  • Verify C++/CUDA setTypeMapEntry (first-wins)
  • Verify Gate B kind extension
  • Check overall assessment

Verification Results

All three critical items from the previous review have been correctly implemented:

1. CHA_DISPATCH_CONFIDENCE shared constant

helpers.ts:365 now exports export const CHA_DISPATCH_CONFIDENCE = 0.8; and native-orchestrator.ts imports and uses it (const conf = CHA_DISPATCH_CONFIDENCE). The coupling between runChaPostPass and runPostNativeCha is now explicit and safe — a future change to either caller is forced to touch the shared constant.

2. C++/CUDA setTypeMapEntry (first-wins)

cpp.ts:235 and cuda.ts:235 both now use setTypeMapEntry(ctx.typeMap, varName, typeName, 0.9) — confirmed in the file reads above. The Greptile P1 review comments are fully resolved.

3. Gate B kind extension

native-orchestrator.ts now checks tgt.kind IN ('class', 'constructor', 'function') for the Gate B scoping query. The rationale (older native engine schema compatibility) is documented in the surrounding comment block.


Updated Assessment

The two critical concerns from the initial review are resolved. The three changes are narrowly scoped and technically correct:

  • The CHA_DISPATCH_CONFIDENCE extraction removes a latent maintenance trap without changing any observable behavior.
  • The C++/CUDA first-wins fix applies the same correct semantics as the Java fix — both extractors now match the Rust extractor's behavior.
  • The Gate B kind extension correctly handles incremental builds against older-schema DBs where constructor calls were emitted under constructor/function kinds.

Rating: ⭐⭐⭐⭐ (4/5)
Recommendation: APPROVE

The one remaining 3→4 star gap: the verbose comment blocks added to java.ts and other files violate the repo's CLAUDE.md "DO NOT ADD COMMENTS unless asked" rule. This is cosmetic and non-blocking, but worth cleaning up in a follow-on commit if the author wants full style compliance.

View job | Branch: fix/java-interface-dispatch-1469

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm

Copy link
Copy Markdown
Contributor Author

Resolved merge conflicts with main in commit 733b0d5. Two files had conflicts:

  • src/domain/parser.ts: accepted main's addition of WASM-backed tree disposal when symbolsOnly=true (frees WASM linear memory on every incremental rebuild that doesn't need the full tree).
  • src/domain/graph/builder/stages/native-orchestrator.ts: 5 conflict blocks — all comment-level reformulations plus one refactor (compute backfillHappened before the if check). Accepted main's cleaner wording and refactoring; the PR's functional changes (CHA_DISPATCH_CONFIDENCE constant, Gate B kind extension) are preserved intact.

@carlos-alm carlos-alm merged commit 9b6d1e4 into main Jun 14, 2026
23 checks passed
@carlos-alm carlos-alm deleted the fix/java-interface-dispatch-1469 branch June 14, 2026 11:02
@github-actions github-actions Bot locked and limited conversation to collaborators Jun 14, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Engine parity: three-way interface-vs-implementation dispatch split for Java

1 participant