Skip to content

fix(test): stabilize lifecycle, sleep, queue, and run edge cases#4660

Closed
NathanFlurry wants to merge 1 commit intobreak-up/reenable-gateway-url-coveragefrom
break-up/stabilize-driver-lifecycle-tests
Closed

fix(test): stabilize lifecycle, sleep, queue, and run edge cases#4660
NathanFlurry wants to merge 1 commit intobreak-up/reenable-gateway-url-coveragefrom
break-up/stabilize-driver-lifecycle-tests

Conversation

@NathanFlurry
Copy link
Copy Markdown
Member

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Copy link
Copy Markdown
Member Author

NathanFlurry commented Apr 14, 2026

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@claude
Copy link
Copy Markdown

claude Bot commented Apr 14, 2026

PR #4660 Review: fix(test): stabilize lifecycle, sleep, queue, and run edge cases

Summary: This PR stabilizes a set of test fixtures and test files for the driver test suite, addressing race conditions, polling gaps, abort-signal handling, API call ordering, and a sandbox-environment workaround. Several changes are clear improvements; the items below flag issues worth addressing before merge.

Medium Issues

1. queue.ts - Duck-typed error discrimination instead of canonical RivetError check

Per CLAUDE.md: only passthrough canonical structured errors. The cast and string-field comparison is fragile. The check should use instanceof RivetError or the __type tag. If RivetError is not importable from a fixture context, that gap is the deeper issue; the workaround should include a comment explaining why.

2. sleep-db.ts - triggerSleepTwice is dead code tied to a still-skipped test

The new triggerSleepTwice action is added to sleepWithDb, sleepWithDbConn, and sleepWithDbAction, but the corresponding test is still test.skip with a TODO(#4705) comment. This adds unverified fixture code with no live coverage. Either lift the skip if the semantics are now correct, or defer these fixture additions to the PR that removes the skip.

3. start-stop-race.ts - Unguarded async RPC in onDestroy

An awaited outgoing RPC in onDestroy can fail silently during eviction or engine shutdown. If recordEvent times out or errors, destroyCalled is true but the observer event is missing and the test will not catch the gap. Per the Fail-By-Default Runtime principle, required lifecycle operations should surface failures explicitly. At minimum, wrap this in a try/catch that re-throws, or assert the observer event count in the test to catch the silent drop.

4. inline-client.ts - isDynamicSandboxRuntime() is an undocumented, fragile heuristic

The check process.cwd() === "/root" silently switches the entire connectToCounterAndIncrement implementation (no WebSocket connection in the sandbox path) based on a hardcoded CWD string. If the sandbox CWD changes or this actor is tested in a different environment, the test takes the wrong branch silently. This warrants a comment explaining the capability gap in the sandbox runtime that necessitates the workaround, and ideally a TODO linking to a tracked issue for removing this once the capability gap is closed.

Low / Style Issues

5. actor-queue.test.ts - Polling loop lacks a justification comment

CLAUDE.md requires a one-line comment on every vi.waitFor call explaining why polling is used instead of direct awaiting. The new for loop polling getSpawned() uses the local waitFor helper rather than vi.waitFor, so the linter will not flag it. A one-line comment explaining why event-driven coordination is not possible here should be added.

6. actor-run.test.ts - describe.sequential choice is unexplained

All run tests now execute sequentially even when skip?.sleep is false. The reason is not documented. Is it because tests share actor keys, or because the sandbox environment is single-threaded? A brief comment would help the next reader understand the tradeoff.

7. inline-client.ts - Potential race in waitForConnectionOpen

In waitForConnectionOpen, if the connection transitions to connected between the connStatus check and the new Promise subscription, onOpen fires its callback via queueMicrotask before unsubscribeOpen/unsubscribeError are initialized and the cleanup could call undefined. This is an existing pattern issue but worth noting since this PR introduces the helper.

Positive Changes

  • access-control.ts / destroy.ts: Replacing Response.json with new Response(JSON.stringify) is the right fix for runtime compatibility.
  • run.ts: Simplifying sendMessage to use c.queue.send directly removes unnecessary client-indirection back to the actors own queue.
  • actor-handle.test.ts: Adding await handle.increment(0) after create correctly ensures actor startup before the second create call, fixing the race.
  • lifecycle-hooks.test.ts: Moving connection params from connect to getOrCreate(undefined, params) is the correct API usage.
  • actor-queue.test.ts: Adding explicit timeout values to the two new test() calls provides clear timeout semantics.

Overall this is a solid stabilization pass. The main asks before merge are: canonical error checking in queue.ts, guarding the async RPC in onDestroy, and either enabling or deferring the triggerSleepTwice dead code.

@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from ecf268c to 4c4e267 Compare April 15, 2026 02:40
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from dcb82e8 to 1d9c571 Compare April 15, 2026 02:40
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 4c4e267 to e823f78 Compare April 15, 2026 02:50
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 1d9c571 to 8e50d57 Compare April 15, 2026 02:50
@NathanFlurry NathanFlurry marked this pull request as ready for review April 15, 2026 02:57
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 8e50d57 to f2052c1 Compare April 15, 2026 03:07
@NathanFlurry NathanFlurry marked this pull request as draft April 15, 2026 03:17
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from e823f78 to 83bf5c0 Compare April 15, 2026 06:55
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from f2052c1 to 5aeb5f4 Compare April 15, 2026 06:55
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 83bf5c0 to c25da72 Compare April 27, 2026 05:57
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 5aeb5f4 to 4d25019 Compare April 27, 2026 05:57
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from c4f0322 to 6b63eed Compare April 27, 2026 17:35
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 8ceea0a to 50332ec Compare April 27, 2026 17:35
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 6b63eed to 392102e Compare April 27, 2026 19:06
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch 2 times, most recently from 32fd32d to 43f0986 Compare April 27, 2026 19:21
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 392102e to e81836a Compare April 27, 2026 19:30
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch 2 times, most recently from abdf328 to f4f7606 Compare April 27, 2026 19:40
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch 2 times, most recently from e5ca28f to 49982a7 Compare April 27, 2026 20:48
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from f4f7606 to 3277534 Compare April 27, 2026 20:48
@NathanFlurry NathanFlurry force-pushed the break-up/reenable-gateway-url-coverage branch from 49982a7 to e6b5003 Compare April 27, 2026 21:38
@NathanFlurry NathanFlurry force-pushed the break-up/stabilize-driver-lifecycle-tests branch from 3277534 to d14747a Compare April 27, 2026 21:38
@NathanFlurry
Copy link
Copy Markdown
Member Author

Landed in main via stack-merge fast-forward push. Commits are in main; closing to match.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant