Drop MAX_CONCURRENT_SESSIONS; drip admission is sole concurrency control
FREEBUFF_MAX_CONCURRENT_SESSIONS is gone. Admission now runs purely
as a drip (MAX_ADMITS_PER_TICK=1 every 15s) gated by the Fireworks
health monitor — utilisation ramps up slowly and pauses the moment
metrics degrade, so a static cap is redundant.
Renamed SessionDeps' getMaxConcurrentSessions/getSessionLengthMs to
getAdmissionTickMs/getMaxAdmitsPerTick (those are what the wait-time
estimate actually needs now). estimateWaitMs is rewritten from the
session-cycle model to the drip model:
waitMs = ceil((position - 1) / maxAdmitsPerTick) * admissionTickMs
Dropped the 'full' branch of AdmissionTickResult and the full-capacity
admission test — the only reason admission skips now is health.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/freebuff-waiting-room.md (11 additions, 14 deletions)
@@ -4,8 +4,8 @@
 
 The waiting room is the admission control layer for **free-mode** requests against the freebuff Fireworks deployment. It has three jobs:
 
-1. **Bound concurrency** — cap the number of simultaneously-active free users so one deployment does not degrade under load.
-2. **Gate on upstream health** — only admit new users while the Fireworks deployment is reporting `healthy` (via the separate monitor in `web/src/server/fireworks-monitor/`).
+1. **Drip-admit users** — admit at a steady trickle (default 1 per 15s) so load ramps up gradually rather than stampeding the deployment when the queue is long.
+2. **Gate on upstream health** — only admit new users while the Fireworks deployment is reporting `healthy` (via the separate monitor in `web/src/server/fireworks-monitor/`). Once metrics degrade, admission halts until they recover — this is the primary concurrency control, not a static cap.
 3. **One instance per account** — prevent a single user from running N concurrent freebuff CLIs to get N× throughput.
 
 Users who cannot be admitted immediately are placed in a FIFO queue and given an estimated wait time. Admitted users get a fixed-length session (default 1h) during which they can make free-mode requests subject to the existing per-user rate limits.
Flipping the flag is safe at runtime: existing rows stay in the DB and will be admitted / expired correctly whenever the flag is flipped back on.
@@ -127,17 +126,15 @@ Each tick does (in order):
 
 1. **Sweep expired.** `DELETE FROM free_session WHERE status='active' AND expires_at < now()`. Runs regardless of upstream health so zombie sessions are cleaned up even during an outage.
 2. **Check upstream health.** `isFireworksAdmissible()` from the monitor. If not `healthy`, skip admission for this tick (queue grows; users see `status: 'queued'` with increasing position).
-3. **Measure capacity.** `capacity = min(MAX_CONCURRENT - activeCount, MAX_ADMITS_PER_TICK)`. `MAX_ADMITS_PER_TICK=20` caps thundering-herd admission when a large block of sessions expires simultaneously.
-4. **Admit.** `SELECT ... WHERE status='queued' ORDER BY queued_at, user_id LIMIT capacity FOR UPDATE SKIP LOCKED`, then `UPDATE` those rows to `status='active'` with `admitted_at=now()`, `expires_at=now()+sessionLength`.
+3. **Admit.** `SELECT ... WHERE status='queued' ORDER BY queued_at, user_id LIMIT MAX_ADMITS_PER_TICK FOR UPDATE SKIP LOCKED`, then `UPDATE` those rows to `status='active'` with `admitted_at=now()`, `expires_at=now()+sessionLength`. Staggering the queue at `MAX_ADMITS_PER_TICK=1` / 15s keeps Fireworks from getting hit by a thundering herd of newly-admitted CLIs; once metrics show the deployment is saturated, step 2 halts further admissions.
 
 ### Tunables
 
 | Constant | Location | Default | Purpose |
 |---|---|---|---|
-| `ADMISSION_TICK_MS` | `config.ts` | 5000 | How often the ticker fires |
-| `MAX_ADMITS_PER_TICK` | `config.ts` | 20 | Upper bound on admits per tick |
+| `ADMISSION_TICK_MS` | `config.ts` | 15000 | How often the ticker fires |
+| `MAX_ADMITS_PER_TICK` | `config.ts` | 1 | Upper bound on admits per tick |
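The tick order described in this hunk (sweep, health gate, admit) can be sketched as a small in-memory TypeScript model. This is an illustration under assumptions, not the repository's code: `SessionRow`, `TickConfig`, and `runTick` are hypothetical names, a plain array stands in for the `free_session` table, and the `healthy` flag stands in for the result of `isFireworksAdmissible()`.

```typescript
// Hypothetical in-memory stand-in for a free_session row.
interface SessionRow {
  userId: string
  status: 'queued' | 'active'
  queuedAt: number
  admittedAt?: number
  expiresAt?: number
}

// Hypothetical config shape; the documented defaults are 1 admit per tick and a 1h session.
interface TickConfig {
  maxAdmitsPerTick: number
  sessionLengthMs: number
}

interface TickResult {
  rows: SessionRow[]
  admitted: string[]
  skipped: 'health' | null
}

function runTick(rows: SessionRow[], now: number, healthy: boolean, cfg: TickConfig): TickResult {
  // 1. Sweep expired active sessions. This runs even when upstream is unhealthy.
  const live = rows.filter(
    (r) => !(r.status === 'active' && r.expiresAt !== undefined && r.expiresAt < now),
  )

  // 2. Health gate: if upstream is not healthy, admit nobody this tick.
  if (!healthy) {
    return { rows: live, admitted: [], skipped: 'health' }
  }

  // 3. Admit the oldest queued rows, ordered by (queuedAt, userId), up to the per-tick limit.
  const admitted = live
    .filter((r) => r.status === 'queued')
    .sort((a, b) => a.queuedAt - b.queuedAt || a.userId.localeCompare(b.userId))
    .slice(0, cfg.maxAdmitsPerTick)
    .map((r) => r.userId)

  const admittedIds = new Set(admitted)
  const updated = live.map((r) =>
    admittedIds.has(r.userId)
      ? { ...r, status: 'active' as const, admittedAt: now, expiresAt: now + cfg.sessionLengthMs }
      : r,
  )
  return { rows: updated, admitted, skipped: null }
}
```

The model has no capacity branch: the only way a tick admits nobody while users are queued is the health gate, matching the commit's removal of the `'full'` result. It does not attempt to model row locking across concurrent tickers.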
-- Position 1..`maxConcurrent` → 0 (next tick will admit them)
-- Position `maxConcurrent`+1..`2*maxConcurrent` → one full session length
+- Position 1 → 0 (next tick admits you)
+- Position `maxAdmitsPerTick` + 1 → one tick
 - and so on.
 
-Actual wait is usually shorter because users call `DELETE /session` on CLI exit and sessions turn over naturally. We show an upper bound because under-promising on wait time is better UX than surprise delays.
+This estimate **ignores health-gated pauses**: during a Fireworks incident admission halts entirely, so the actual wait can be longer. We choose to under-report here because showing "unknown" / "indefinite" is worse UX for the common case where the deployment is healthy.
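The drip-model formula from the commit message, `waitMs = ceil((position - 1) / maxAdmitsPerTick) * admissionTickMs`, can be written as a small TypeScript function. This is a sketch under assumptions: the `DripParams` shape and the handling of positions below 1 are not taken from the repository, and the real `estimateWaitMs` reads its values through `SessionDeps`.

```typescript
// Hypothetical parameter bag; the commit exposes these values via
// getAdmissionTickMs / getMaxAdmitsPerTick on SessionDeps.
interface DripParams {
  admissionTickMs: number
  maxAdmitsPerTick: number
}

// Estimated wait for a 1-based queue position under the drip model.
// Health-gated pauses are deliberately not modelled.
function estimateWaitMs(position: number, params: DripParams): number {
  // Assumption: treat positions below 1 as "no wait" rather than throwing.
  if (position < 1) return 0
  const ticksAhead = Math.ceil((position - 1) / params.maxAdmitsPerTick)
  return ticksAhead * params.admissionTickMs
}

const defaults: DripParams = { admissionTickMs: 15000, maxAdmitsPerTick: 1 }
console.log(estimateWaitMs(1, defaults)) // 0
console.log(estimateWaitMs(2, defaults)) // 15000
console.log(estimateWaitMs(5, defaults)) // 60000
```

With the defaults, each position ahead of you adds one 15s tick. Because the formula uses `ceil`, setting `maxAdmitsPerTick` above 1 makes positions 2 through `maxAdmitsPerTick` report one tick as well.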