Skip to content

fix(perf): correct INLINE_BACKFILL_THRESHOLD docstring and raise threshold#1492

Open
carlos-alm wants to merge 2 commits into
fix/native-phase-timings-1434from
fix/backfill-threshold-1435
Open

fix(perf): correct INLINE_BACKFILL_THRESHOLD docstring and raise threshold#1492
carlos-alm wants to merge 2 commits into
fix/native-phase-timings-1434from
fix/backfill-threshold-1435

Conversation

@carlos-alm

Copy link
Copy Markdown
Contributor

The INLINE_BACKFILL_THRESHOLD docstring claimed pool cost was "amortised over enough parse work" — measurements show IPC overhead scales linearly (~55–64ms/file pool vs ~8–10ms/file inline). The real motivation is grammar-crash safety for exotic parsers (#965); JS/TS/TSX (required-tier, used in all this-dispatch backfill calls) have never triggered the V8 fatal crash class and are safe to run inline.

Changes

  • Docstring corrected: explains the crash-safety rationale (exotic grammars can trigger uncatchable V8 fatal errors), clarifies that JS/TS/TSX are required-tier and safe inline, and accurately states that IPC overhead scales linearly rather than amortising
  • Threshold raised 16 → 32: keeps typical this-dispatch batches (≤ 18 files on the codegraph corpus) on the inline fast path. Exotic-language drops (e.g. recurring HCL case is 4 files) are almost always well below 32 and also benefit from the inline path without meaningful crash risk increase

Why 32

The primary hot caller (runPostNativeThisDispatch) measured 18 JS/TS files at 55–64ms/file through the pool vs 8–10ms/file inline — a 3–6x penalty. Raising to 32 covers that batch with headroom. Anything larger (truly exotic-language drops at scale) still routes through the pool for crash isolation.

Closes #1435

…ld for required-tier grammars

The docstring claimed pool cost was "amortised over enough parse work" —
measurements show IPC overhead scales linearly (~55–64ms/file pool vs
~8–10ms/file inline). The real motivation is crash safety for exotic WASM
grammars (#965); JS/TS/TSX (required-tier, used in all this-dispatch
backfill calls) have never triggered the V8 fatal crash class and are safe
to run inline.

Raise threshold 16 → 32 to keep typical this-dispatch batches (≤ 18 files
on the codegraph corpus) on the inline fast path. Exotic-language drops are
almost always well under 32 files and also benefit from the inline path
without meaningful crash risk increase.

Closes #1435
@claude

claude Bot commented Jun 13, 2026

Copy link
Copy Markdown

Claude finished @carlos-alm's task —— View job


PR Review

  • Read changed files and gather context
  • Phase 1: Root cause analysis
  • Phase 2: Configuration change detection
  • Phase 3: Test integrity
  • Phase 4: Code quality
  • Phase 5: Backlog compliance
  • Final assessment

Phase 1: Root Cause Analysis

Problem Being Solved: The INLINE_BACKFILL_THRESHOLD was set to 16, but the primary hot caller (runPostNativeThisDispatch) processes up to ~18 JS/TS files and was hitting the pool path unnecessarily. Measurements showed IPC overhead at ~55–64ms/file through the pool vs ~8–10ms/file inline — a ~3–6x penalty. The old docstring falsely implied pool overhead was "amortised over parse work," when in reality it scales linearly.

Why This Approach: Two orthogonal fixes in one minimal change — (1) raise the threshold past the measured batch size of 18, and (2) correct the misleading docstring to accurately explain the crash-safety rationale (not cost amortisation). The threshold of 32 gives headroom over 18 while still routing truly large batches through the pool.

Risk Assessment: Low. JS/TS/TSX are required-tier grammars with no history of the V8 fatal crash class. Raising the inline ceiling to 32 from 16 for files that call parseFilesWasmForBackfill means the only meaningful difference is performance. The exotic-language backfill path (e.g. HCL, ~4 files) benefits from inline too, with negligible additional crash risk.


Phase 2: Configuration Change Detection

No linting rules, test timeouts, or build configuration were changed. The only constant modified is INLINE_BACKFILL_THRESHOLD — a behavioral performance tuning parameter, not a quality gate. No concerns here.


Phase 3: Test Integrity

No tests were modified. There are no direct unit tests for parseFilesWasmForBackfill or INLINE_BACKFILL_THRESHOLD (confirmed via search) — the threshold is an internal constant exercised through integration tests. No assertions were weakened or removed.

Minor observation: the threshold is a magic constant with no test asserting the inline path is taken for expected batch sizes. This isn't a blocker, but a test that exercises parseFilesWasmForBackfill with a batch at the boundary (e.g. 31 vs 33 files) would prevent future regressions. Not required for this PR.


Phase 4: Code Quality

Docstring correctness (src/domain/parser.ts:1181–1201): The new docstring is materially more accurate:

  • Correctly identifies crash safety (not amortisation) as the core rationale ✓
  • Cites the specific measured numbers (55–64ms/file pool, 8–10ms/file inline) with issue reference ✓
  • Explains why JS/TS/TSX are safe inline (required-tier, no V8 fatal crash history) ✓
  • Acknowledges exotic drops also benefit from inline given they're typically ≤ 4 files ✓

Threshold value (src/domain/parser.ts:1202): 32 is well-justified. The measured batch is ≤ 18; 32 gives ~1.7x headroom. No realistic exotic-language scenario is expected to exceed 32 based on the documented HCL case being 4 files.

Secondary docstring (src/domain/parser.ts:1253–1265, parseFilesWasmForBackfill): This docstring still reads "Threshold matches typical engine-parity drop sizes (a few fixture files in one or two languages)" — that was accurate for the old 16 value but is now slightly misleading since the threshold was raised specifically to cover the larger this-dispatch batches, not just engine-parity drops. This is a minor inconsistency that could be addressed in a follow-up, but it's not incorrect.

No bugs, security issues, or architectural concerns introduced.


Phase 5: Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies
  • Foundation-aligned: ✓ — Pure internal performance tuning; no cloud APIs, no external services, no agent behavior
  • Problem-fit: 3/5 — Improves build performance (especially in native-engine post-pass), which indirectly speeds up graph generation for AI agents. Doesn't directly address token waste or hallucinated deps, but faster indexing is a meaningful quality-of-life improvement
  • Breaking: No — internal constant, no API or CLI contract changes
  • Tier: 1

Final Assessment

## Root Cause Analysis
Problem Being Solved: INLINE_BACKFILL_THRESHOLD=16 was below the measured
  batch size of ~18 JS/TS files in runPostNativeThisDispatch, causing a
  3–6x performance penalty from unnecessary IPC overhead. The old docstring
  also incorrectly described cost amortisation as the rationale.
Why This Approach: Raise to 32 (covers the 18-file batch with headroom),
  and correct the docstring to explain the real crash-safety motivation.
Risk Assessment: Low — JS/TS/TSX are required-tier with no V8 fatal crash
  history; raising the inline ceiling for them carries negligible crash risk.

## Backlog Compliance
- Zero-dep: ✓
- Foundation-aligned: ✓
- Problem-fit: 3/5 — performance improvement to graph indexing pipeline
- Breaking: No
- Tier: 1

## Minor Concerns
- src/domain/parser.ts:1253–1265: The `parseFilesWasmForBackfill` docstring
  still says "threshold matches typical engine-parity drop sizes" which is
  now somewhat misleading — the threshold was raised for this-dispatch batch
  coverage. Not blocking; can be addressed in follow-up.
- No boundary test exists for the threshold (31 vs 33 files). Not required
  but would prevent silent regressions.

## Final Recommendation
- Rating: ⭐⭐⭐⭐ (4/5)
- Action: APPROVE
- Reasoning: The change is well-motivated with cited measurements and a
  referenced issue (#1435). The docstring correction accurately replaces
  misleading "amortisation" framing with the true crash-safety rationale.
  The threshold value is justified by the measured batch size. Minor
  inconsistency in the secondary docstring is cosmetic and non-blocking.

@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR corrects the INLINE_BACKFILL_THRESHOLD docstring — replacing the inaccurate "IPC cost amortises over parse work" narrative with the real crash-safety rationale — and raises the threshold from 16 to 32 to keep typical JS/TS/TSX this-dispatch batches (≤ 18 files) on the faster inline path.

Confidence Score: 5/5

Safe to merge — the change is a docstring correction and a threshold increase from 16 to 32, with no logic changes.

The routing logic in parseFilesWasmForBackfill is unchanged; only the boundary value and its documentation are touched. Exotic-language drops (typically 4 files) were already below the old threshold of 16, so their inline-vs-pool routing is unaffected in practice. JS/TS/TSX are explicitly documented as safe for the inline path, and the 32-file ceiling still routes large batches through the crash-isolated worker pool.

No files require special attention. src/domain/parser.ts is the only changed file and the modification is isolated to a constant value and its surrounding documentation.

Important Files Changed

Filename Overview
src/domain/parser.ts INLINE_BACKFILL_THRESHOLD raised 16→32 with corrected docstring; parseFilesWasmForBackfill docstring de-duped to point at constant. Logic is unchanged; only the routing boundary and documentation are affected.

Reviews (3): Last reviewed commit: "fix(perf): update stale parseFilesWasmFo..." | Re-trigger Greptile

…nce threshold constant

The secondary docstring still described the old 16-value rationale
("engine-parity drop sizes"). Replace with a pointer to
INLINE_BACKFILL_THRESHOLD where the full rationale now lives.
@carlos-alm

Copy link
Copy Markdown
Contributor Author

Fixed — updated the stale parseFilesWasmForBackfill docstring at line 1256 to replace the old 16-value rationale ("engine-parity drop sizes") with a pointer to INLINE_BACKFILL_THRESHOLD where the full rationale now lives (f90d4ba).

@carlos-alm

Copy link
Copy Markdown
Contributor Author

@greptileai

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant