feat(ai-sandbox): Sprites sandbox provider#868
Conversation
Add @tanstack/ai-sandbox-sprites, a SandboxProvider/SandboxHandle implementation backed by Sprites (sprites.dev, Fly.io) cloud sandboxes, following the @tanstack/ai-sandbox-daytona / -vercel shape. - Dependency-free client (Sprites REST + exec control WebSocket); no SDK. - exec with separate stdout/stderr, background spawn, native /fs I/O, exec-backed git, env injection, durable filesystem, resume-by-id. - ports.connect() exposes the Sprite's single proxied public-URL port (default 8080), switching the URL to public auth. - Hardened exec lifecycle: abnormal-close surfaces an error instead of a false exit 0; control-message parses drained before reading the exit code; exec URL (cmd/env) stripped from connection errors. - Unit tests (fake client) + gated live tests against api.sprites.dev; docs/sandbox/providers.md entry; changeset. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Wire Sprite checkpoints through the SandboxHandle:
- capabilities.snapshots = true
- handle.snapshot(label?) creates a checkpoint, returns {id: "<name>#vN"}
- handle.listCheckpoints() / handle.restoreCheckpoint(idOrRef) for in-place
restore (restarts the Sprite; readiness polled via a fetch probe bounded by
an abort signal)
- client.createCheckpoint/listCheckpoints/restoreCheckpoint over the REST API
Restore is in-place and a checkpoint does not survive Sprite deletion, so the
provider intentionally does not implement reconstruct-after-gone
restoreSnapshot — the framework degrades to a fresh create when a Sprite is
gone. Unit tests cover the wiring; the gated live test covers create+list
(restore restarts the Sprite and can take minutes, so it's excluded from CI).
Docs + changeset updated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ting domain - Describe Sprites as "stateful sandboxes" (its own term) instead of "cloud sandboxes". - Remove the sprites.dev marketing link from the core README table, package description, and JSDoc — matching the other providers there, which carry no domain. Keep the link in docs/sandbox/providers.md, where Daytona/Vercel also link, and keep the functional api.sprites.dev endpoint. - Harden post-restore readiness: probe the workdir (not just root) and require two consecutive successes, since the overlay can briefly return I/O errors right after a restart; raise the default ready timeout to match observed multi-minute restarts. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Document that immediately after a checkpoint restore the overlay can be listable while individual file reads briefly return an I/O error as it settles — callers acting on the filesystem the instant restore returns should retry reads. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lifecycle / cancellation (client.ts): - exec(): parse control frames synchronously so an early kill/close can't read a stale (undefined) session id; kill()/abort now reach the server-side kill endpoint even before session_info arrives, instead of orphaning the remote process (H4). Add a connect watchdog so a CONNECTING stall fails wait() instead of hanging forever (M3). An explicit kill() resolves wait() with a conventional 137 rather than rejecting (G3). - restoreCheckpoint(): probe a write→read round-trip (not just a directory list) so it resolves only once the restored overlay actually serves reads, not while it is merely listable (H5); honor the caller's AbortSignal during the multi-minute readiness wait (M5). - createCheckpoint(): return the version THIS call created (pre/post diff + stream parse) instead of the current max, which a concurrent or eventually-consistent list could make wrong (M4). Handle / provider: - ports.connect() no longer silently downgrades URL auth to public; it returns a token-authenticated channel for sprite-auth Sprites and never mutates the mode (H3). - create() runs the workspace mkdir from '/', so a non-default workdir is not created with its own (not-yet-existent) dir as cwd (G2). - fs error messages fall back to stdout, since the fast path folds stderr into stdout for instant commands (M1). - restoreCheckpoint(ref) validates the Sprite-name component of a name#vN ref (L1). Packaging / docs: - Add engines node>=22.4 (global undici WebSocket); fix the Node-version comment and the readyTimeoutMs JSDoc (M8/N1). Tests: add deterministic client.test.ts (stub WebSocket + fetch: frame demux, abnormal-close→throw, kill endpoint, early-abort kill, createCheckpoint id, fsRead 404→throw, lifecycle) and provider.test.ts (create naming/urlAuth/mkdir, resume branches); update handle.test.ts for the no-downgrade connect and ref validation (H6). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds a new ChangesSprites sandbox provider
Estimated code review effort: 4 (Complex) | ~75 minutes Suggested reviewers: 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 10
🧹 Nitpick comments (2)
packages/ai-sandbox-sprites/tests/provider.test.ts (1)
86-96: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick winTest doesn't actually verify the mkdir-from-
/cwd claim.
mkdirWsis alwaysundefinedhere —callsonly recordsfetchinvocations, but exec runs over the stubbed WebSocket, never touchingfetch. The lookup result is then discarded viavoid mkdirWs, and the only real assertion (capabilities.snapshots) is unrelated to the test's name/comment. This behavior (mkdir running withcwd: '/'rather than the not-yet-existent workdir) is called out explicitly inprovider.tsas a deliberate fix, but it isn't actually exercised by this test.To genuinely cover it, capture the constructor URL in
AutoExecWebSocket(likeStubWebSocketdoes inclient.test.ts) and assert on thedirquery param.♻️ Proposed fix
class AutoExecWebSocket { + static last: AutoExecWebSocket | undefined binaryType = 'blob' + url: string private listeners: Record<string, Array<Listener>> = {} - constructor() { + constructor(url: string) { + this.url = url + AutoExecWebSocket.last = this setTimeout(() => { this.emit('open') this.emit('message', { data: JSON.stringify({ type: 'exit', exit_code: 0 }), }) this.emit('close', { code: 1000, reason: '' }) }, 0) }const handle = await provider.create({}) expect(handle.id).toMatch(/^tanstack-ai-[0-9a-f]{12}$/) - // The workdir mkdir must run with cwd '/', not the not-yet-existent workdir. - const mkdirWs = calls.find((c) => c.url.includes('/exec')) - // exec goes over WebSocket, not fetch, so assert via the handle behavior: - expect(handle.capabilities.snapshots).toBe(true) - void mkdirWs + // The workdir mkdir must run with cwd '/', not the not-yet-existent workdir. + const mkdirUrl = new URL(AutoExecWebSocket.last?.url ?? '') + expect(mkdirUrl.searchParams.get('dir')).toBe('/')🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-sandbox-sprites/tests/provider.test.ts` around lines 86 - 96, The test in spritesSandbox.create() is not actually verifying the mkdir-from-root behavior because exec uses the stubbed AutoExecWebSocket, not fetch, so the current calls lookup and handle.capabilities.snapshots assertion miss the intended claim. Update the provider.test.ts coverage to inspect the WebSocket constructor URL (similar to StubWebSocket usage in client.test.ts) and assert that the exec request includes the expected dir query param of “/”, using the spritesSandbox/create and AutoExecWebSocket symbols to locate the behavior.packages/ai-sandbox-sprites/tests/sprites.test.ts (1)
93-103: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick winFixed 2s sleep before hitting the spawned server is a flakiness risk.
Waiting a hardcoded 2000ms for the Node server to bind before fetching through the public proxy can be flaky if startup is slower (cold sandbox, network variance). A short retry/poll loop on the fetch would be more robust than a fixed sleep, though this is a low-frequency gated suite so the impact is limited.
♻️ Proposed fix
- // Give the listener a moment, then fetch through the public proxy. - await new Promise((r) => setTimeout(r, 2000)) - const res = await fetch(channel.url, { - ...(channel.headers ? { headers: channel.headers } : {}), - }) - expect(res.status).toBe(200) + // Poll until the listener is up, then fetch through the public proxy. + let res: Response | undefined + for (let attempt = 0; attempt < 10; attempt++) { + try { + res = await fetch(channel.url, { + ...(channel.headers ? { headers: channel.headers } : {}), + }) + if (res.status === 200) break + } catch { + // not ready yet + } + await new Promise((r) => setTimeout(r, 500)) + } + expect(res?.status).toBe(200)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-sandbox-sprites/tests/sprites.test.ts` around lines 93 - 103, The test in sprites.test.ts uses a fixed 2-second sleep before fetching from the spawned server, which is flaky when startup is slower. Replace that hardcoded wait in the test body around sbx.process.spawn and sbx.ports.connect with a small retry/poll loop that repeatedly fetches channel.url until it returns 200 or times out, using the existing server/channel variables to keep the assertion stable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/sandbox/providers.md`:
- Around line 163-164: The `ports.connect()` documentation is inaccurate because
`connect()` does not change auth mode; it only returns a URL based on the
existing `urlAuth` setting. Update the wording near the `ports.connect(8080)`
description in the sandbox providers docs to say it returns the Sprite’s public
URL by default and that `urlAuth: 'sprite'` must be set at creation time to
require org-token access. Use the `ports.connect()` and `urlAuth` symbols to
locate and rephrase the affected text.
- Around line 148-149: The Sprites provider note is making an unsupported claim
about harness credentials being injected as workspace secrets. Reword or remove
that line so it only states behavior that is confirmed by the
SpritesProvider/SpritesHandle implementation, and if workspace-secrets injection
is intended as a generic defineWorkspace behavior, move it out of the
Sprites-specific bullet. Use the SpritesProvider, SpritesHandle, and
defineWorkspace references to verify the wording matches the actual
implementation.
In `@packages/ai-sandbox-sprites/package.json`:
- Around line 48-50: The peer dependency declaration for `@tanstack/ai-sandbox` in
package.json should follow the repository standard for internal packages. Update
the peerDependencies entry to use workspace:* to match the devDependencies
pattern and keep the package manifest consistent.
In `@packages/ai-sandbox-sprites/src/client.ts`:
- Around line 428-440: The readiness probe in the restore flow is using a fixed
`.tanstack-restore-probe` path inside `probePath`, which can overwrite an
existing user file during checkpoint restore. Update the logic in the restore
readiness loop that calls `probeReadWrite()` and `deleteSentinel()` so the
sentinel is created outside the restored tree or uses a unique temporary
location/name that cannot collide with user data, and ensure cleanup only
removes the probe artifact you created.
- Around line 339-350: The checkpoint version selection in this method is
picking the max from all fresh versions, which can return another writer’s
checkpoint instead of the version observed by this call. Update the logic in
this flow after checkpoin tVersions so it prefers the stream-reported checkpoint
version first, and only falls back to the fresh set or overall max when that
version is unavailable. Keep the fix localized to this return path and preserve
the existing error handling when no versioned checkpoint exists.
In `@packages/ai-sandbox-sprites/src/handle.ts`:
- Around line 303-305: The fork() method on SandboxHandle currently throws
synchronously even though it is declared to return Promise<SandboxHandle>, so
update fork() in handle.ts to return a rejected Promise instead of throwing.
Keep the UnsupportedCapabilityError('sprites', 'fork') as the rejection reason
so callers can handle it with standard promise rejection flows.
In `@packages/ai-sandbox-sprites/src/provider.ts`:
- Around line 79-90: The handle() flow is always passing the provider’s config
urlAuth into SpritesHandle, which breaks resume() for existing sprites because
the original sprite-auth mode is lost and ports.connect() may not send the
needed auth headers. Update handle() (and the resume path that reuses it) to
preserve the sprite’s actual URL auth mode from the existing sprite/handle state
instead of blindly using this.urlAuth, so SpritesHandle is constructed with the
correct auth setting for resumed sprites.
- Around line 126-133: In SandboxProvider.resume, the current catch-all swallows
every failure and incorrectly treats auth, network, and server errors as a
missing Sprite. Narrow the handling around this.client.getSprite so only a
confirmed “not found/gone” error returns null, and rethrow or propagate all
other errors. Use the resume method and the getSprite call to identify the error
type/code from the client before deciding whether to fall back.
- Around line 95-123: After createSprite() succeeds, any failure in the
post-create setup path (setUrlAuth, the mkdir exec/wait workspace creation, or
handle.env.set) should clean up the newly created Sprite before rethrowing.
Update the flow in provider.ts around createSprite, setUrlAuth,
this.client.exec(...).wait(), and handle.env.set to wrap the setup steps in a
try/catch (or equivalent) and delete the remote sprite on error so no live
sandbox is leaked.
In `@packages/ai-sandbox-sprites/tests/provider.test.ts`:
- Around line 71-79: The test setup in provider.test.ts deletes SPRITES_API_KEY
but never restores it, which can leak state across files. Update the
beforeEach/afterEach pair around the WebSocket stubbing so the original
process.env.SPRITES_API_KEY value is saved before deletion and restored in
afterEach, alongside the existing vi.unstubAllGlobals and vi.restoreAllMocks
cleanup. Use the existing test hooks in provider.test.ts to keep the env
mutation isolated.
---
Nitpick comments:
In `@packages/ai-sandbox-sprites/tests/provider.test.ts`:
- Around line 86-96: The test in spritesSandbox.create() is not actually
verifying the mkdir-from-root behavior because exec uses the stubbed
AutoExecWebSocket, not fetch, so the current calls lookup and
handle.capabilities.snapshots assertion miss the intended claim. Update the
provider.test.ts coverage to inspect the WebSocket constructor URL (similar to
StubWebSocket usage in client.test.ts) and assert that the exec request includes
the expected dir query param of “/”, using the spritesSandbox/create and
AutoExecWebSocket symbols to locate the behavior.
In `@packages/ai-sandbox-sprites/tests/sprites.test.ts`:
- Around line 93-103: The test in sprites.test.ts uses a fixed 2-second sleep
before fetching from the spawned server, which is flaky when startup is slower.
Replace that hardcoded wait in the test body around sbx.process.spawn and
sbx.ports.connect with a small retry/poll loop that repeatedly fetches
channel.url until it returns 200 or times out, using the existing server/channel
variables to keep the assertion stable.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b6d5109d-e30e-4b41-957a-79b241415b8a
⛔ Files ignored due to path filters (1)
pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (15)
.changeset/sprites-sandbox-provider.mddocs/sandbox/providers.mdpackages/ai-sandbox-sprites/CHANGELOG.mdpackages/ai-sandbox-sprites/package.jsonpackages/ai-sandbox-sprites/src/client.tspackages/ai-sandbox-sprites/src/handle.tspackages/ai-sandbox-sprites/src/index.tspackages/ai-sandbox-sprites/src/provider.tspackages/ai-sandbox-sprites/tests/client.test.tspackages/ai-sandbox-sprites/tests/handle.test.tspackages/ai-sandbox-sprites/tests/provider.test.tspackages/ai-sandbox-sprites/tests/sprites.test.tspackages/ai-sandbox-sprites/tsconfig.jsonpackages/ai-sandbox-sprites/vite.config.tspackages/ai-sandbox/README.md
| `apiUrl` / `SPRITES_API_URL`. Harness credentials are injected as workspace | ||
| secrets. |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
Verify or rephrase "Harness credentials are injected as workspace secrets."
This claim is not substantiated by the provider implementation shown in the snippets. The SpritesProvider accepts an apiKey and apiUrl, and the SpritesHandle supports env.set() for runtime environment variables, but there is no evidence that harness credentials (e.g., XAI_API_KEY) are treated differently for Sprites than for other providers. If this is just describing generic defineWorkspace({ secrets: ... }) behavior, it should not be listed as a Sprites-specific bullet. If Sprites does have special workspace-secrets injection, please point to the implementation.
- - **Auth / env:** needs `SPRITES_API_KEY` (token form
- `org/projectNumber/tokenId/secret`); override the control-plane URL with
- `apiUrl` / `SPRITES_API_URL`. Harness credentials are injected as workspace
- secrets.
+ - **Auth / env:** needs `SPRITES_API_KEY` (token form
+ `org/projectNumber/tokenId/secret`); override the control-plane URL with
+ `apiUrl` / `SPRITES_API_URL`.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| `apiUrl` / `SPRITES_API_URL`. Harness credentials are injected as workspace | |
| secrets. | |
| `apiUrl` / `SPRITES_API_URL`. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/sandbox/providers.md` around lines 148 - 149, The Sprites provider note
is making an unsupported claim about harness credentials being injected as
workspace secrets. Reword or remove that line so it only states behavior that is
confirmed by the SpritesProvider/SpritesHandle implementation, and if
workspace-secrets injection is intended as a generic defineWorkspace behavior,
move it out of the Sprites-specific bullet. Use the SpritesProvider,
SpritesHandle, and defineWorkspace references to verify the wording matches the
actual implementation.
| configurable via `httpPort`) to its always-on public URL. `ports.connect(8080)` | ||
| switches the URL to `public` auth and returns it; other ports are not exposed. |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Correct the ports.connect() auth description.
The docs claim that ports.connect(8080) "switches the URL to public auth," but the implementation does not mutate the auth mode. The urlAuth mode is set at creation time (defaulting to 'public') and connect() merely returns a plain public URL when the mode is 'public', or a token-authenticated channel when it is 'sprite'. The wording "switches" implies a state change that does not happen.
Suggested rephrase: "ports.connect(8080) returns the Sprite's public URL (already public auth by default; use urlAuth: 'sprite' at creation to gate it with an org token instead)."
- `ports.connect(8080)` switches the URL to `public` auth and returns it; other ports are not exposed.
+ `ports.connect(8080)` returns the public URL (already `public` auth by default; use `urlAuth: 'sprite'` to keep it org-token gated); other ports are not exposed.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| configurable via `httpPort`) to its always-on public URL. `ports.connect(8080)` | |
| switches the URL to `public` auth and returns it; other ports are not exposed. | |
| configurable via `httpPort`) to its always-on public URL. `ports.connect(8080)` | |
| returns the public URL (already `public` auth by default; use `urlAuth: 'sprite'` to keep it org-token gated); other ports are not exposed. |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/sandbox/providers.md` around lines 163 - 164, The `ports.connect()`
documentation is inaccurate because `connect()` does not change auth mode; it
only returns a URL based on the existing `urlAuth` setting. Update the wording
near the `ports.connect(8080)` description in the sandbox providers docs to say
it returns the Sprite’s public URL by default and that `urlAuth: 'sprite'` must
be set at creation time to require org-token access. Use the `ports.connect()`
and `urlAuth` symbols to locate and rephrase the affected text.
| "peerDependencies": { | ||
| "@tanstack/ai-sandbox": "workspace:^" | ||
| }, |
There was a problem hiding this comment.
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Use workspace:* in peerDependencies for internal packages.
The peerDependencies field uses workspace:^ for @tanstack/ai-sandbox, but the repository standard requires workspace:* for all internal package dependencies. Align with the devDependencies entry on line 52.
"peerDependencies": {
- "`@tanstack/ai-sandbox`": "workspace:^"
+ "`@tanstack/ai-sandbox`": "workspace:*"
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "peerDependencies": { | |
| "@tanstack/ai-sandbox": "workspace:^" | |
| }, | |
| "peerDependencies": { | |
| "`@tanstack/ai-sandbox`": "workspace:*" | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/package.json` around lines 48 - 50, The peer
dependency declaration for `@tanstack/ai-sandbox` in package.json should follow
the repository standard for internal packages. Update the peerDependencies entry
to use workspace:* to match the devDependencies pattern and keep the package
manifest consistent.
Source: Coding guidelines
| const after = await this.checkpointVersions(name, options.signal) | ||
| const fresh = after.filter((v) => !before.has(v)) | ||
| // Prefer a version that did not exist before this call; fall back to the | ||
| // stream-reported version, then to the overall max. | ||
| const candidates = fresh.length > 0 ? fresh : streamVersions | ||
| const pool = candidates.length > 0 ? candidates : after | ||
| if (pool.length === 0) { | ||
| throw new Error( | ||
| `Sprites: checkpoint created for "${name}" but no versioned checkpoint was found.`, | ||
| ) | ||
| } | ||
| return `v${Math.max(...pool)}` |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Prefer this call’s stream-reported checkpoint version before the fresh max.
When another checkpoint is created between the before/after lists, fresh can contain multiple versions and Math.max(...pool) may return the other writer’s checkpoint instead of this call’s snapshot.
Proposed fix
- const candidates = fresh.length > 0 ? fresh : streamVersions
- const pool = candidates.length > 0 ? candidates : after
+ const streamFresh = streamVersions.filter((v) => !before.has(v))
+ const candidates =
+ streamFresh.length > 0
+ ? streamFresh
+ : fresh.length > 0
+ ? fresh
+ : streamVersions
+ const pool = candidates.length > 0 ? candidates : after📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const after = await this.checkpointVersions(name, options.signal) | |
| const fresh = after.filter((v) => !before.has(v)) | |
| // Prefer a version that did not exist before this call; fall back to the | |
| // stream-reported version, then to the overall max. | |
| const candidates = fresh.length > 0 ? fresh : streamVersions | |
| const pool = candidates.length > 0 ? candidates : after | |
| if (pool.length === 0) { | |
| throw new Error( | |
| `Sprites: checkpoint created for "${name}" but no versioned checkpoint was found.`, | |
| ) | |
| } | |
| return `v${Math.max(...pool)}` | |
| const after = await this.checkpointVersions(name, options.signal) | |
| const fresh = after.filter((v) => !before.has(v)) | |
| // Prefer a version that did not exist before this call; fall back to the | |
| // stream-reported version, then to the overall max. | |
| const streamFresh = streamVersions.filter((v) => !before.has(v)) | |
| const candidates = | |
| streamFresh.length > 0 | |
| ? streamFresh | |
| : fresh.length > 0 | |
| ? fresh | |
| : streamVersions | |
| const pool = candidates.length > 0 ? candidates : after | |
| if (pool.length === 0) { | |
| throw new Error( | |
| `Sprites: checkpoint created for "${name}" but no versioned checkpoint was found.`, | |
| ) | |
| } | |
| return `v${Math.max(...pool)}` |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/client.ts` around lines 339 - 350, The
checkpoint version selection in this method is picking the max from all fresh
versions, which can return another writer’s checkpoint instead of the version
observed by this call. Update the logic in this flow after checkpoin tVersions
so it prefers the stream-reported checkpoint version first, and only falls back
to the fresh set or overall max when that version is unavailable. Keep the fix
localized to this return path and preserve the existing error handling when no
versioned checkpoint exists.
| const deadline = Date.now() + timeoutMs | ||
| const sentinel = `${probePath.replace(/\/$/, '')}/.tanstack-restore-probe` | ||
| const marker = `ready-${Date.now()}` | ||
| let lastError: unknown | ||
| let consecutive = 0 | ||
| while (Date.now() < deadline) { | ||
| signal?.throwIfAborted() | ||
| try { | ||
| await this.probeReadWrite(name, sentinel, marker, signal) | ||
| consecutive += 1 | ||
| if (consecutive >= 2) { | ||
| // Best-effort cleanup; ignore failures. | ||
| await this.deleteSentinel(name, sentinel).catch(() => undefined) |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Avoid overwriting restored user files with the readiness sentinel.
The fixed .tanstack-restore-probe path is written inside probePath, and deleteSentinel() only rewrites it as an empty file. If the restored checkpoint already contains that file, restore readiness corrupts user data.
Proposed fix direction
- const sentinel = `${probePath.replace(/\/$/, '')}/.tanstack-restore-probe`
+ const probeBase = probePath.replace(/\/$/, '')
+ const sentinel = `${probeBase}/.tanstack-restore-probe-${Date.now()}-${Math.random()
+ .toString(36)
+ .slice(2)}`- private async deleteSentinel(name: string, path: string): Promise<void> {
- const res = await fetch(
- this.spritePath(name, `/fs/write?path=${encodeURIComponent(path)}`),
- {
- method: 'PUT',
- headers: this.headers({ 'content-type': 'application/octet-stream' }),
- body: new Uint8Array(0),
- },
- ).catch(() => undefined)
- await res?.body?.cancel()
- }
+ private async deleteSentinel(name: string, path: string): Promise<void> {
+ const proc = this.exec(name, {
+ argv: ['rm', '-f', path],
+ connectTimeoutMs: 8000,
+ })
+ await Promise.allSettled([
+ drain(proc.stdout),
+ drain(proc.stderr),
+ proc.wait(),
+ ])
+ }async function drain(stream: AsyncIterable<string>): Promise<void> {
for await (const _ of stream) {
// discard
}
}Also applies to: 498-508
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/client.ts` around lines 428 - 440, The
readiness probe in the restore flow is using a fixed `.tanstack-restore-probe`
path inside `probePath`, which can overwrite an existing user file during
checkpoint restore. Update the logic in the restore readiness loop that calls
`probeReadWrite()` and `deleteSentinel()` so the sentinel is created outside the
restored tree or uses a unique temporary location/name that cannot collide with
user data, and ensure cleanup only removes the probe artifact you created.
| fork = (): Promise<SandboxHandle> => { | ||
| throw new UnsupportedCapabilityError('sprites', 'fork') | ||
| } |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Return a rejected Promise from fork().
The method is typed as Promise<SandboxHandle> but throws synchronously, so callers using promise rejection handling can miss it.
Proposed fix
- fork = (): Promise<SandboxHandle> => {
+ fork = async (): Promise<SandboxHandle> => {
throw new UnsupportedCapabilityError('sprites', 'fork')
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fork = (): Promise<SandboxHandle> => { | |
| throw new UnsupportedCapabilityError('sprites', 'fork') | |
| } | |
| fork = async (): Promise<SandboxHandle> => { | |
| throw new UnsupportedCapabilityError('sprites', 'fork') | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/handle.ts` around lines 303 - 305, The fork()
method on SandboxHandle currently throws synchronously even though it is
declared to return Promise<SandboxHandle>, so update fork() in handle.ts to
return a rejected Promise instead of throwing. Keep the
UnsupportedCapabilityError('sprites', 'fork') as the rejection reason so callers
can handle it with standard promise rejection flows.
| private handle(sprite: { | ||
| name: string | ||
| url: string | ||
| }): SpritesHandle { | ||
| return new SpritesHandle({ | ||
| client: this.client, | ||
| name: sprite.name, | ||
| url: sprite.url, | ||
| workdir: this.workdir, | ||
| httpPort: this.httpPort, | ||
| urlAuth: this.urlAuth, | ||
| }) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Preserve the actual URL auth mode when resuming.
handle() always uses config urlAuth; on resume(), that can make ports.connect() omit required auth headers for an existing sprite-auth Sprite.
Proposed fix
private handle(sprite: {
name: string
url: string
+ urlAuth?: SpriteUrlAuth
+ }, urlAuth: SpriteUrlAuth = sprite.urlAuth ?? this.urlAuth): SpritesHandle {
- }): SpritesHandle {
return new SpritesHandle({
client: this.client,
name: sprite.name,
url: sprite.url,
workdir: this.workdir,
httpPort: this.httpPort,
- urlAuth: this.urlAuth,
+ urlAuth,
})
}- const handle = this.handle(sprite)
+ const handle = this.handle(sprite, this.urlAuth)Also applies to: 121-129
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/provider.ts` around lines 79 - 90, The
handle() flow is always passing the provider’s config urlAuth into
SpritesHandle, which breaks resume() for existing sprites because the original
sprite-auth mode is lost and ports.connect() may not send the needed auth
headers. Update handle() (and the resume path that reuses it) to preserve the
sprite’s actual URL auth mode from the existing sprite/handle state instead of
blindly using this.urlAuth, so SpritesHandle is constructed with the correct
auth setting for resumed sprites.
| const sprite = await this.client.createSprite(name, { | ||
| ...(this.config.waitForCapacity !== undefined | ||
| ? { waitForCapacity: this.config.waitForCapacity } | ||
| : {}), | ||
| ...(input.signal ? { signal: input.signal } : {}), | ||
| }) | ||
|
|
||
| if (sprite.urlAuth !== this.urlAuth) { | ||
| await this.client.setUrlAuth(sprite.name, this.urlAuth, input.signal) | ||
| } | ||
|
|
||
| // Ensure the workspace dir exists before any cwd-bound command runs in it. | ||
| // Run from `/` (not the workdir, which does not exist yet) so the exec's own | ||
| // cwd resolution does not fail for a non-default `workdir`. | ||
| const mkdir = this.client.exec(sprite.name, { | ||
| argv: ['mkdir', '-p', this.workdir], | ||
| cwd: '/', | ||
| ...(input.signal ? { signal: input.signal } : {}), | ||
| }) | ||
| const code = await mkdir.wait() | ||
| if (code !== 0) { | ||
| throw new Error( | ||
| `Sprites: failed to create workspace directory "${this.workdir}" (exit ${code}).`, | ||
| ) | ||
| } | ||
|
|
||
| const handle = this.handle(sprite) | ||
| if (input.env) await handle.env.set(input.env) | ||
| return handle |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Clean up the Sprite if post-create setup fails.
After createSprite() succeeds, failures in setUrlAuth, workspace creation, or env setup throw without deleting the remote Sprite, leaking a live sandbox.
Proposed fix direction
- const sprite = await this.client.createSprite(name, {
+ const sprite = await this.client.createSprite(name, {
...(this.config.waitForCapacity !== undefined
? { waitForCapacity: this.config.waitForCapacity }
: {}),
...(input.signal ? { signal: input.signal } : {}),
})
- if (sprite.urlAuth !== this.urlAuth) {
- await this.client.setUrlAuth(sprite.name, this.urlAuth, input.signal)
- }
+ try {
+ if (sprite.urlAuth !== this.urlAuth) {
+ await this.client.setUrlAuth(sprite.name, this.urlAuth, input.signal)
+ }
- // Ensure the workspace dir exists before any cwd-bound command runs in it.
+ // Ensure the workspace dir exists before any cwd-bound command runs in it.
...
- if (input.env) await handle.env.set(input.env)
- return handle
+ if (input.env) await handle.env.set(input.env)
+ return handle
+ } catch (error) {
+ await this.client.deleteSprite(sprite.name).catch(() => undefined)
+ throw error
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const sprite = await this.client.createSprite(name, { | |
| ...(this.config.waitForCapacity !== undefined | |
| ? { waitForCapacity: this.config.waitForCapacity } | |
| : {}), | |
| ...(input.signal ? { signal: input.signal } : {}), | |
| }) | |
| if (sprite.urlAuth !== this.urlAuth) { | |
| await this.client.setUrlAuth(sprite.name, this.urlAuth, input.signal) | |
| } | |
| // Ensure the workspace dir exists before any cwd-bound command runs in it. | |
| // Run from `/` (not the workdir, which does not exist yet) so the exec's own | |
| // cwd resolution does not fail for a non-default `workdir`. | |
| const mkdir = this.client.exec(sprite.name, { | |
| argv: ['mkdir', '-p', this.workdir], | |
| cwd: '/', | |
| ...(input.signal ? { signal: input.signal } : {}), | |
| }) | |
| const code = await mkdir.wait() | |
| if (code !== 0) { | |
| throw new Error( | |
| `Sprites: failed to create workspace directory "${this.workdir}" (exit ${code}).`, | |
| ) | |
| } | |
| const handle = this.handle(sprite) | |
| if (input.env) await handle.env.set(input.env) | |
| return handle | |
| const sprite = await this.client.createSprite(name, { | |
| ...(this.config.waitForCapacity !== undefined | |
| ? { waitForCapacity: this.config.waitForCapacity } | |
| : {}), | |
| ...(input.signal ? { signal: input.signal } : {}), | |
| }) | |
| try { | |
| if (sprite.urlAuth !== this.urlAuth) { | |
| await this.client.setUrlAuth(sprite.name, this.urlAuth, input.signal) | |
| } | |
| // Ensure the workspace dir exists before any cwd-bound command runs in it. | |
| // Run from `/` (not the workdir, which does not exist yet) so the exec's own | |
| // cwd resolution does not fail for a non-default `workdir`. | |
| const mkdir = this.client.exec(sprite.name, { | |
| argv: ['mkdir', '-p', this.workdir], | |
| cwd: '/', | |
| ...(input.signal ? { signal: input.signal } : {}), | |
| }) | |
| const code = await mkdir.wait() | |
| if (code !== 0) { | |
| throw new Error( | |
| `Sprites: failed to create workspace directory "${this.workdir}" (exit ${code}).`, | |
| ) | |
| } | |
| const handle = this.handle(sprite) | |
| if (input.env) await handle.env.set(input.env) | |
| return handle | |
| } catch (error) { | |
| await this.client.deleteSprite(sprite.name).catch(() => undefined) | |
| throw error | |
| } |
🧰 Tools
🪛 OpenGrep (1.23.0)
[ERROR] 109-113: Dynamic command passed to child_process.exec/execSync. Use child_process.execFile or spawn with an argument array instead.
(coderabbit.command-injection.exec-js)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/provider.ts` around lines 95 - 123, After
createSprite() succeeds, any failure in the post-create setup path (setUrlAuth,
the mkdir exec/wait workspace creation, or handle.env.set) should clean up the
newly created Sprite before rethrowing. Update the flow in provider.ts around
createSprite, setUrlAuth, this.client.exec(...).wait(), and handle.env.set to
wrap the setup steps in a try/catch (or equivalent) and delete the remote sprite
on error so no live sandbox is leaked.
| async resume(input: SandboxResumeInput): Promise<SandboxHandle | null> { | ||
| try { | ||
| const sprite = await this.client.getSprite(input.id, input.signal) | ||
| return this.handle(sprite) | ||
| } catch { | ||
| // Gone / not found. | ||
| return null | ||
| } |
There was a problem hiding this comment.
🗄️ Data Integrity & Integration | 🟠 Major | ⚡ Quick win
Only return null for a confirmed missing Sprite.
This catch-all also hides auth, network, and 5xx errors as “gone,” which can make callers fall back to a fresh sandbox and lose resumable state.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/src/provider.ts` around lines 126 - 133, In
SandboxProvider.resume, the current catch-all swallows every failure and
incorrectly treats auth, network, and server errors as a missing Sprite. Narrow
the handling around this.client.getSprite so only a confirmed “not found/gone”
error returns null, and rethrow or propagate all other errors. Use the resume
method and the getSprite call to identify the error type/code from the client
before deciding whether to fall back.
| beforeEach(() => { | ||
| calls = [] | ||
| delete process.env.SPRITES_API_KEY | ||
| vi.stubGlobal('WebSocket', AutoExecWebSocket) | ||
| }) | ||
| afterEach(() => { | ||
| vi.unstubAllGlobals() | ||
| vi.restoreAllMocks() | ||
| }) |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win
SPRITES_API_KEY deletion isn't restored, risking cross-file leakage.
beforeEach deletes process.env.SPRITES_API_KEY but afterEach never restores it. Vitest can run multiple test files in the same worker process, and sprites.test.ts reads this same env var at module scope to decide whether to skip its gated live-API suite. If that file's module loads after this one runs, a developer's real SPRITES_API_KEY could be silently wiped, causing the integration suite to skip unexpectedly.
🛡️ Proposed fix
+let originalApiKey: string | undefined
beforeEach(() => {
calls = []
- delete process.env.SPRITES_API_KEY
+ originalApiKey = process.env.SPRITES_API_KEY
+ delete process.env.SPRITES_API_KEY
vi.stubGlobal('WebSocket', AutoExecWebSocket)
})
afterEach(() => {
+ if (originalApiKey !== undefined) process.env.SPRITES_API_KEY = originalApiKey
vi.unstubAllGlobals()
vi.restoreAllMocks()
})📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| beforeEach(() => { | |
| calls = [] | |
| delete process.env.SPRITES_API_KEY | |
| vi.stubGlobal('WebSocket', AutoExecWebSocket) | |
| }) | |
| afterEach(() => { | |
| vi.unstubAllGlobals() | |
| vi.restoreAllMocks() | |
| }) | |
| let originalApiKey: string | undefined | |
| beforeEach(() => { | |
| calls = [] | |
| originalApiKey = process.env.SPRITES_API_KEY | |
| delete process.env.SPRITES_API_KEY | |
| vi.stubGlobal('WebSocket', AutoExecWebSocket) | |
| }) | |
| afterEach(() => { | |
| if (originalApiKey !== undefined) process.env.SPRITES_API_KEY = originalApiKey | |
| vi.unstubAllGlobals() | |
| vi.restoreAllMocks() | |
| }) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@packages/ai-sandbox-sprites/tests/provider.test.ts` around lines 71 - 79, The
test setup in provider.test.ts deletes SPRITES_API_KEY but never restores it,
which can leak state across files. Update the beforeEach/afterEach pair around
the WebSocket stubbing so the original process.env.SPRITES_API_KEY value is
saved before deletion and restored in afterEach, alongside the existing
vi.unstubAllGlobals and vi.restoreAllMocks cleanup. Use the existing test hooks
in provider.test.ts to keep the env mutation isolated.
|
@kylemclaren ty for this, I love fly! |
|
View your CI Pipeline Execution ↗ for commit 93ce4b0
☁️ Nx Cloud last updated this comment at |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/ai-sandbox-sprites/src/client.ts (1)
526-531: 🩺 Stability & Availability | 🟠 Major | ⚡ Quick winBound remote kill before it can block local teardown.
terminate()waits forkillSession()before closing the WebSocket; becausekillSession()has no timeout, an unresponsive kill endpoint can leavewait()blocked on abort/kill.Proposed fix
const response = await fetch( this.spritePath(name, `/exec/${encodeURIComponent(sessionId)}/kill`), - { method: 'POST', headers: this.headers() }, + { + method: 'POST', + headers: this.headers(), + signal: AbortSignal.timeout(5000), + }, ).catch(() => undefined)const terminate = async (): Promise<void> => { const id = await waitForSessionId(5000) - if (id !== undefined) await this.killSession(name, id) + const kill = + id !== undefined ? this.killSession(name, id) : Promise.resolve() try { ws.close() } catch { // already closing/closed } + await kill }Also applies to: 689-693
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/ai-sandbox-sprites/src/client.ts` around lines 526 - 531, The remote kill call in killSession can block local teardown because terminate() waits for it before closing the WebSocket. Update killSession in client.ts to bound the POST request with a timeout or abort signal, and make terminate() proceed with WebSocket shutdown even if the kill endpoint is unresponsive. Use the killSession and terminate methods to locate the flow, and ensure any failure or timeout from the remote kill is handled without delaying wait().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@packages/ai-sandbox-sprites/src/client.ts`:
- Around line 526-531: The remote kill call in killSession can block local
teardown because terminate() waits for it before closing the WebSocket. Update
killSession in client.ts to bound the POST request with a timeout or abort
signal, and make terminate() proceed with WebSocket shutdown even if the kill
endpoint is unresponsive. Use the killSession and terminate methods to locate
the flow, and ensure any failure or timeout from the remote kill is handled
without delaying wait().
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1a50fc0a-ce03-4a45-bb21-8a61cbc62996
📒 Files selected for processing (5)
packages/ai-sandbox-sprites/src/client.tspackages/ai-sandbox-sprites/src/handle.tspackages/ai-sandbox-sprites/src/provider.tspackages/ai-sandbox-sprites/tests/client.test.tspackages/ai-sandbox-sprites/tests/provider.test.ts
🚧 Files skipped from review as they are similar to previous changes (4)
- packages/ai-sandbox-sprites/src/provider.ts
- packages/ai-sandbox-sprites/tests/provider.test.ts
- packages/ai-sandbox-sprites/tests/client.test.ts
- packages/ai-sandbox-sprites/src/handle.ts
@tanstack/ai
@tanstack/ai-acp
@tanstack/ai-angular
@tanstack/ai-anthropic
@tanstack/ai-bedrock
@tanstack/ai-claude-code
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-codex
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-grok-build
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-mcp
@tanstack/ai-mistral
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-opencode
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-sandbox
@tanstack/ai-sandbox-cloudflare
@tanstack/ai-sandbox-daytona
@tanstack/ai-sandbox-docker
@tanstack/ai-sandbox-local-process
@tanstack/ai-sandbox-sprites
@tanstack/ai-sandbox-vercel
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-utils
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/openai-base
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
commit: |
The Sprites provider (added in #868) was documented in providers.md but omitted from the provider lists in the overview, quick-start, harnesses, and tools pages. List it alongside Daytona/Vercel where the provider set is enumerated. The sandbox-web example picker prose is left unchanged (that example wires docker/local/vercel/daytona only). Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
🎯 Changes
Adds
@tanstack/ai-sandbox-sprites— a Sprites (Fly.io stateful sandboxes) provider implementingSandboxProvider/SandboxHandle, modeled onai-sandbox-daytona/-vercel.spawn, filesystem, exec-backed git, env, durable filesystem, resume-by-id, in-place checkpoints, andports.connect()for the proxied public-URL port.SPRITES_API_KEY.Verified live against
api.sprites.dev(exec, fs round-trip, spawn streaming, public-port fetch, checkpoint create/list). Unit tests stubfetch/WebSocket; live tests are gated onSPRITES_API_KEY.✅ Checklist
pnpm run test:pr.🚀 Release Impact
Summary by CodeRabbit
ports.connect()and environment/cwd support for processes.SPRITES_API_KEYand optional API URL override).