docs/freebuff-waiting-room.md
Lines changed: 6 additions & 8 deletions
@@ -4,7 +4,7 @@
The waiting room is the admission control layer for **free-mode** requests against the freebuff Fireworks deployment. It has three jobs:

-1. **Drip-admit users**: admit at a steady trickle (default 1 per 15s) so load ramps up gradually rather than stampeding the deployment when the queue is long.
+1. **Drip-admit users**: admit at a steady trickle (default 1 per `ADMISSION_TICK_MS`, currently 15s) so load ramps up gradually rather than stampeding the deployment when the queue is long.
2. **Gate on upstream health**: before each admission tick, probe the Fireworks metrics endpoint with a short timeout (`isFireworksAdmissible` in `web/src/server/free-session/admission.ts`). If it doesn't respond OK, admission halts until it does; this is the primary concurrency control, not a static cap.
3. **One instance per account**: prevent a single user from running N concurrent freebuff CLIs to get N× throughput.

@@ -132,14 +132,13 @@ One pod runs the admission loop at a time, coordinated via Postgres advisory loc
Each tick does (in order):

1. **Sweep expired.** `DELETE FROM free_session WHERE status='active' AND expires_at < now() - grace`. Runs regardless of upstream health so zombie sessions are cleaned up even during an outage.
-2. **Admit.** `admitFromQueue()` first calls `isFireworksAdmissible()` (short-timeout GET against the Fireworks metrics endpoint). If the probe fails, returns `{ skipped: 'health' }`: admission pauses and the queue grows until recovery. Otherwise opens a transaction, takes `pg_try_advisory_xact_lock(FREEBUFF_ADMISSION_LOCK_ID)`, and `SELECT ... WHERE status='queued' ORDER BY queued_at, user_id LIMIT MAX_ADMITS_PER_TICK FOR UPDATE SKIP LOCKED` → `UPDATE` the rows to `status='active'` with `admitted_at=now()`, `expires_at=now()+sessionLength`. Staggering at `MAX_ADMITS_PER_TICK=1` / 15s keeps Fireworks from a thundering herd of newly-admitted CLIs.
+2. **Admit.** `admitFromQueue()` first calls `isFireworksAdmissible()` (short-timeout GET against the Fireworks metrics endpoint). If the probe fails, returns `{ skipped: 'health' }`: admission pauses and the queue grows until recovery. Otherwise opens a transaction, takes `pg_try_advisory_xact_lock(FREEBUFF_ADMISSION_LOCK_ID)`, and `SELECT ... WHERE status='queued' ORDER BY queued_at, user_id LIMIT 1 FOR UPDATE SKIP LOCKED` → `UPDATE` the row to `status='active'` with `admitted_at=now()`, `expires_at=now()+sessionLength`. One admit per tick keeps Fireworks from a thundering herd of newly-admitted CLIs.

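The two tick steps above can be sketched without a database. This is a minimal in-memory illustration, not the code in `admission.ts`: the `SessionRow` shape, the `runTickSketch` name, and passing upstream health in as a boolean are all assumptions made for the example, and the advisory lock and `FOR UPDATE SKIP LOCKED` are deliberately left out.

```typescript
// Hypothetical in-memory stand-in for rows of the free_session table.
interface SessionRow {
  userId: string;
  status: "queued" | "active";
  queuedAt: number; // ms since epoch
  admittedAt?: number;
  expiresAt?: number;
}

type TickResult = { skipped: "health" } | { admitted: string | null };

// One tick: sweep first (always), then admit at most one queued user if upstream is healthy.
function runTickSketch(
  rows: SessionRow[],
  now: number,
  upstreamHealthy: boolean, // stands in for the isFireworksAdmissible() probe result
  sessionLengthMs: number,
  graceMs: number,
): { rows: SessionRow[]; result: TickResult } {
  // 1. Sweep expired: drop active rows whose expires_at < now - grace.
  const swept = rows.filter(
    (r) => !(r.status === "active" && r.expiresAt !== undefined && r.expiresAt < now - graceMs),
  );

  // 2. Admit: a failed health probe pauses admission but not the sweep.
  if (!upstreamHealthy) {
    return { rows: swept, result: { skipped: "health" } };
  }

  // ORDER BY queued_at, user_id LIMIT 1
  const next = swept
    .filter((r) => r.status === "queued")
    .sort(
      (a, b) =>
        a.queuedAt - b.queuedAt || (a.userId < b.userId ? -1 : a.userId > b.userId ? 1 : 0),
    )[0];
  if (!next) {
    return { rows: swept, result: { admitted: null } };
  }

  const admitted: SessionRow = {
    ...next,
    status: "active",
    admittedAt: now,
    expiresAt: now + sessionLengthMs,
  };
  return {
    rows: swept.map((r) => (r === next ? admitted : r)),
    result: { admitted: next.userId },
  };
}
```

Calling `runTickSketch` with `upstreamHealthy` set to `false` still removes expired rows, which mirrors the point that the sweep runs regardless of upstream health.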
### Tunables

| Constant | Location | Default | Purpose |
|---|---|---|---|
-| `ADMISSION_TICK_MS` | `config.ts` | 15000 | How often the ticker fires |
-| `MAX_ADMITS_PER_TICK` | `config.ts` | 1 | Upper bound on admits per tick |
+| `ADMISSION_TICK_MS` | `config.ts` | 15000 | How often the ticker fires. One user is admitted per tick. |
| `FREEBUFF_SESSION_GRACE_MS` | env | 1_800_000 | Drain window after expiry: the gate still admits requests so an in-flight agent can finish, but the CLI is expected to block new prompts. Hard cutoff at `expires_at + grace`. |

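To make the grace window concrete, here is a small sketch of how a session's phase could be derived from `expires_at` and the grace period. The `sessionPhaseSketch` function, the phase names, and the exact boundary handling are assumptions for illustration; only the default of 1_800_000 ms and the `expires_at + grace` cutoff come from the table above.

```typescript
// Default taken from the Tunables table (30 minutes).
const DEFAULT_SESSION_GRACE_MS = 1_800_000;

// Hypothetical phase names, not identifiers from the codebase.
type SessionPhase = "active" | "draining" | "expired";

function sessionPhaseSketch(
  nowMs: number,
  expiresAtMs: number,
  graceMs: number = DEFAULT_SESSION_GRACE_MS,
): SessionPhase {
  // Before expiry: normal operation.
  if (nowMs < expiresAtMs) return "active";
  // Inside the drain window: requests are still admitted so an in-flight agent can finish.
  // Inclusive bound assumed, to match the sweep predicate `expires_at < now() - grace`.
  if (nowMs <= expiresAtMs + graceMs) return "draining";
  // Past the hard cutoff at expires_at + grace.
  return "expired";
}
```

With the default grace, a session that expired 10 minutes ago is still `"draining"`, and one that expired 31 minutes ago is `"expired"`.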
@@ -224,6 +223,7 @@ For free-mode requests (`codebuff_metadata.cost_mode === 'free'`), `_post.ts` ca

| HTTP | `error` | When |
|---|---|---|
+| 426 | `freebuff_update_required` | Request did not include a `freebuff_instance_id`, meaning the client is a pre-waiting-room build. The CLI shows the server-supplied message verbatim. |
| 428 | `waiting_room_required` | No session row exists. Client should call POST /session. |
| 429 | `waiting_room_queued` | Row exists with `status='queued'`. Client should keep polling GET. |
| 409 | `session_superseded` | Claimed `instance_id` does not match the stored one: another CLI took over. |
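The rows shown above can be read as a decision function. The sketch below covers only those four rows; the type names, the `gateErrorSketch` function, and the order in which the checks run are assumptions (the order simply follows the table), and any rows outside this hunk are not modelled.

```typescript
// Hypothetical request and row shapes for illustration only.
interface GateRequest {
  freebuffInstanceId?: string;
}
interface StoredSession {
  status: "queued" | "active";
  instanceId: string;
}
interface GateError {
  http: number;
  error: string;
}

// Returns the error to send, or null when none of the four listed conditions applies.
function gateErrorSketch(req: GateRequest, row: StoredSession | null): GateError | null {
  // Pre-waiting-room clients do not send an instance id at all.
  if (!req.freebuffInstanceId) {
    return { http: 426, error: "freebuff_update_required" };
  }
  // No session row: the client should call POST /session.
  if (row === null) {
    return { http: 428, error: "waiting_room_required" };
  }
  // Still queued: the client should keep polling.
  if (row.status === "queued") {
    return { http: 429, error: "waiting_room_queued" };
  }
  // Another CLI instance has taken over the session.
  if (row.instanceId !== req.freebuffInstanceId) {
    return { http: 409, error: "session_superseded" };
  }
  return null;
}
```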
@@ -249,13 +249,11 @@ This is a **trust-the-client** design: the server still admits requests during t
Computed in `session-view.ts` from the drip-admission rate:
This estimate **ignores health-gated pauses**: during a Fireworks incident admission halts entirely, so the actual wait can be longer. We choose to under-report here because showing "unknown" / "indefinite" is worse UX for the common case where the deployment is healthy.
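The actual formula in `session-view.ts` is not shown in this diff. As a rough illustration of an estimate derived from the drip-admission rate, the sketch below assumes the wait is simply the 1-based queue position multiplied by the tick interval; `estimateWaitMsSketch` and its parameters are hypothetical names.

```typescript
// Assumption: one admit per tick, so position N waits roughly N ticks.
// Health-gated pauses are ignored, as the surrounding text describes.
function estimateWaitMsSketch(queuePosition: number, tickMs: number = 15_000): number {
  // Clamp so a non-positive position never yields a negative wait.
  return Math.max(0, queuePosition) * tickMs;
}
```

Under these assumptions, a user at position 4 would be shown an estimate of 60 seconds; the real wait is longer whenever admission is paused.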