## Summary

On the v3.9.5 self-build benchmark (same source, same worktree), native's Rust-orchestrated `edges` and `roles` phases are substantially slower than WASM's JS pipeline, even though native emits fewer edges and classifies fewer nodes. That inverts the expected native-is-faster ordering and is a distinct bug from #1010 (DB bloat from excess `ast_nodes` rows).
## Evidence

Full-build phase timings (median of 3), from today's `npm run benchmark`:

| Phase | WASM | Native | Δ |
| --- | --- | --- | --- |
| edges | 179 ms | 310 ms | +131 ms (+73 %) |
| roles | 62 ms | 269 ms | +207 ms (+334 %) |
| ast | 392 ms | 405 ms | +13 ms (+3 %, parity) |
| insert | 625 ms | 568 ms | parity |
| structure | 313 ms | 56 ms | native faster |
| complexity | 617 ms | 38 ms | native 16× faster |
| cfg | 374 ms | 233 ms | native faster |
| dataflow | 159 ms | 143 ms | parity |
| parse | 5 729 ms | 87 ms | native 66× faster |
And the outputs being produced by the slower phases:

| | WASM | Native | Δ |
| --- | --- | --- | --- |
| edges rows | 37 367 | 36 949 | −418 |
| nodes rows (input to roles) | 17 984 | 17 727 | −257 |
So native's `edges` phase is 73 % slower per-build while producing 1 % fewer edges, and its `roles` phase is 4.3× slower while classifying 1.4 % fewer nodes. Per-item cost:

| Phase | ms/item, WASM | ms/item, Native | Native overhead |
| --- | --- | --- | --- |
| edges | 0.0048 | 0.0084 | +75 % |
| roles | 0.0034 | 0.0152 | +347 % |
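The per-item figures are just phase time divided by row count. A quick sketch (values copied from the tables in this issue, not re-measured) reproduces the table above:

```typescript
// Per-item cost = phase wall time / rows produced, rounded to 4 decimals.
// Values are the medians reported in this issue.
const phases = {
  edges: { wasmMs: 179, nativeMs: 310, wasmItems: 37_367, nativeItems: 36_949 },
  roles: { wasmMs: 62, nativeMs: 269, wasmItems: 17_984, nativeItems: 17_727 },
};

function perItem(ms: number, items: number): number {
  return +(ms / items).toFixed(4);
}

for (const [name, p] of Object.entries(phases)) {
  const w = perItem(p.wasmMs, p.wasmItems);
  const n = perItem(p.nativeMs, p.nativeItems);
  const overheadPct = Math.round((n / w - 1) * 100);
  console.log(`${name}: ${w} vs ${n} ms/item, native +${overheadPct}%`);
}
// edges: 0.0048 vs 0.0084 ms/item, native +75%
// roles: 0.0034 vs 0.0152 ms/item, native +347%
```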
## Architectural note

`src/domain/graph/builder/pipeline.ts` shows these timings come from the Rust orchestrator:

```ts
const resultJson = ctx.nativeDb.buildGraph(...);
const result = JSON.parse(resultJson) as NativeOrchestratorResult;
const p = result.phases;
// …
edgesMs: +(p.edgesMs ?? 0).toFixed(1),
rolesMs: +(p.rolesMs ?? 0).toFixed(1),
```

So this is Rust-reported wall time, not napi overhead from repeated JS↔Rust crossings. The Rust implementation of edge-building and role-classification is genuinely doing more work (or less efficient work) per unit than the JS pipeline does on WASM-parsed trees.
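One way to double-check that claim: compare JS-side wall time around the single `buildGraph` call with the sum of the Rust-reported phase timings. A sketch, with a stand-in `nativeDb` since the real binding isn't available here (field names follow the snippet above; the stub and its values are illustrative):

```typescript
// Stand-in for the napi binding: returns Rust-reported phase timings as JSON,
// the same shape pipeline.ts parses. Values are this issue's medians.
const nativeDb = {
  buildGraph: (): string =>
    JSON.stringify({ phases: { edgesMs: 310, rolesMs: 269 } }),
};

const t0 = performance.now();
const resultJson = nativeDb.buildGraph();
const jsWallMs = performance.now() - t0;

const { phases } = JSON.parse(resultJson) as {
  phases: { edgesMs?: number; rolesMs?: number };
};
const rustSumMs = (phases.edgesMs ?? 0) + (phases.rolesMs ?? 0);

// In the real build: if jsWallMs ≈ rustSumMs (+ other phases), the binding
// layer is innocent and the slowdown is inside the Rust phases themselves.
console.log({ jsWallMs: +jsWallMs.toFixed(1), rustSumMs });
```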
## Investigation hints

- `crates/codegraph-core/` — `edges` and `roles` phases of the native orchestrator. Likely candidates:
  - `roles`: full-table scans (e.g. per-role-check SELECTs instead of a single pass), or recomputing role metrics that the JS side caches/indexes.
  - `edges`: non-indexed resolution lookups, or redundant symbol-resolution passes that the JS side short-circuits.
- Compare the SQL emitted by the Rust `roles` phase vs `src/domain/analysis/roles.ts` (or wherever WASM's `rolesMs` is accumulated). A simple `EXPLAIN QUERY PLAN` diff on the hot queries may be sufficient to spot missing index use.
- The `edges` delta could compound with the missing ~418 edges — if some code path is doing an O(N²) lookup that short-circuits when an edge matches, fewer matches mean more iterations.
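The short-circuit hypothesis in the last bullet can be illustrated with a toy model (names and numbers are hypothetical, not the real resolver): a linear probe that stops at the first match does strictly more work on misses, so fewer matching edges means more full scans.

```typescript
// Toy model: a linear symbol probe that short-circuits on the first match.
// An unmatched probe pays for the whole table; a matched one stops early.
function probeSteps(table: string[], target: string): number {
  let steps = 0;
  for (const s of table) {
    steps++;
    if (s === target) return steps; // short-circuit on match
  }
  return steps; // no match: full scan
}

const table = Array.from({ length: 1_000 }, (_, i) => `sym${i}`);
console.log(probeSteps(table, "sym10"));   // early match: 11 steps
console.log(probeSteps(table, "missing")); // no match: 1000 steps
```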
## Repro

```shell
rm -rf .codegraph && npx codegraph build --engine wasm --verbose 2>&1 | grep -iE 'edges|roles'
rm -rf .codegraph && npx codegraph build --engine native --verbose 2>&1 | grep -iE 'edges|roles'
```

Or run the full benchmark: `npm run benchmark` — the JSON output includes per-phase ms under `wasm.phases` and `native.phases`.
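Pulling the two ratios out of that JSON is a one-liner per phase. A sketch, assuming the `wasm.phases`/`native.phases` shape described above (the inlined sample stands in for the real benchmark output; values are from this issue's tables):

```typescript
// Sample benchmark output, shaped as { wasm: { phases }, native: { phases } }.
// In practice this would be read from the benchmark's JSON file.
const bench = {
  wasm: { phases: { edges: 179, roles: 62 } as Record<string, number> },
  native: { phases: { edges: 310, roles: 269 } as Record<string, number> },
};

// Native-over-WASM wall-time ratio for one phase, rounded to 2 decimals.
const ratio = (phase: string): number =>
  +(bench.native.phases[phase] / bench.wasm.phases[phase]).toFixed(2);

console.log(ratio("edges")); // 1.73
console.log(ratio("roles")); // 4.34
```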
## Acceptance

- Native `edges` phase is ≤ 1.2× WASM on the codegraph self-build.
- Native `roles` phase is ≤ 1.2× WASM on the codegraph self-build.
- The benchmark asserts a ceiling on these ratios so a re-regression is caught automatically.
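The third bullet could look something like this (the guard's name, shape, and placement are hypothetical; only the 1.2× ceiling comes from the acceptance criteria):

```typescript
// Hypothetical regression guard for the benchmark: fail the run if a
// native phase exceeds the agreed ceiling relative to WASM.
const CEILING = 1.2;

function assertPhaseCeiling(phase: string, wasmMs: number, nativeMs: number): void {
  const ratio = nativeMs / wasmMs;
  if (ratio > CEILING) {
    throw new Error(
      `${phase}: native is ${ratio.toFixed(2)}× WASM (ceiling ${CEILING}×)`,
    );
  }
}

// With today's numbers both slow phases would trip the guard:
//   assertPhaseCeiling("edges", 179, 310) throws (1.73× > 1.2×)
//   assertPhaseCeiling("roles", 62, 269)  throws (4.34× > 1.2×)
assertPhaseCeiling("dataflow", 159, 143); // 0.90×: passes
```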
## Related
- #1010 — `string` AST nodes vs WASM — 7.5% DB bloat in 3.9.5 (distinct: row-count inflation in `ast_nodes`, not phase timings)