Freebuff waiting room by jahooma · Pull Request #509 · CodebuffAI/codebuff

jahooma · 2026-04-18T22:38:06Z

No description provided.

Co-authored-by: CodebuffAI <189203002+CodebuffAI@users.noreply.github.com>

greptile-apps · 2026-04-18T22:43:47Z

Greptile Summary

This PR introduces a freebuff waiting room — a queue-based admission system that gates free-tier CLI users before allowing them to send chat requests. It is implemented end-to-end across the server (new free_session table, admission ticker, /api/v1/freebuff/session REST endpoints, and a gate injected into the chat-completions hot path) and the CLI (a polling hook useFreebuffSession, zustand store, and two new terminal UI screens for the waiting room and the "superseded" terminal state).

Key design decisions:

Feature-flagged off by default (FREEBUFF_WAITING_ROOM_ENABLED=false), requiring explicit opt-in per environment.
Advisory lock + FOR UPDATE SKIP LOCKED in admitFromQueue prevents double-admission across pods.
Instance ID rotation on every POST /session enforces one-CLI-per-account.
Dependency injection throughout server code makes the layer fully unit-testable without DB.

Notable issues found:

No TTL/expiry on queued rows: If a CLI crashes without sending the cleanup DELETE, its queued row stays in the table indefinitely, inflating queue depth and wait-time estimates for all other users.
Null claimedInstanceId → session_superseded: A missing instance ID in an active-session request returns the same 409 code as a genuine takeover, making the two conditions indistinguishable in logs.
Non-transactional position + depth reads: queuePositionFor and queueDepth() run in parallel without a transaction, so position can transiently exceed depth on the client display.
onDisableAds={() => {}} in WaitingRoomScreen: The no-op handler may render a dead affordance if AdBanner shows a disable button.

Confidence Score: 4/5

Safe to merge behind the feature flag; the queued-row TTL gap is real but won't manifest until the feature is enabled in production.

The implementation is thorough and well-tested with dependency injection throughout. The feature defaults to OFF so no existing request path is affected. The main gap — no TTL on queued rows — is a correctness concern once the feature goes live but does not break the primary admission or chat path. The two P2 issues are minor observability/cosmetic concerns.

web/src/server/free-session/store.ts (queued-row TTL), web/src/server/free-session/public-api.ts (null instanceId path and non-transactional reads)

Important Files Changed

Filename	Overview
web/src/server/free-session/store.ts	Core DB layer; uses advisory lock + FOR UPDATE SKIP LOCKED for safe concurrent admission. Main concern: no TTL on queued rows means stale entries accumulate indefinitely if CLIs crash without sending DELETE.
web/src/server/free-session/public-api.ts	Session gate and state API; null claimedInstanceId is treated as session_superseded (same 409 as a rotated instance), conflating missing-field with takeover. Position + depth queries are non-transactional.
web/src/server/free-session/admission.ts	Background admission ticker with overlap-prevention (inFlight guard) and Fireworks health gate. Well-structured with injectable deps for testing.
cli/src/hooks/use-freebuff-session.ts	Polling hook managing POST/GET/DELETE lifecycle and bell-on-admission. Module-level activeRefreshHandle enables imperative refresh from gate-error handlers. Cleanup is well-guarded.
cli/src/components/waiting-room-screen.tsx	Waiting room UI with elapsed timer, position display, and ad banner. onDisableAds is a silent no-op — acceptable if AdBanner hides the affordance, but worth verifying.
web/src/app/api/v1/chat/completions/_post.ts	Gate check injected before rate-limiter; only runs for free-mode requests when waiting room is enabled. Correctly returns structured 4xx by error code.
cli/src/app.tsx	AuthedSurface wrapper correctly gates WaitingRoomScreen and FreebuffSupersededScreen before rendering Chat.
packages/internal/src/db/schema.ts	New free_session table with correct PK (user_id), cascade FK, and two targeted indexes for queue dequeue and expiry sweep.
web/src/server/free-session/session-view.ts	Pure view-projection function with injected clock; well-tested. estimateWaitMs uses wave-based upper-bound formula.
cli/src/hooks/helpers/send-message.ts	Gate errors mapped to user-readable messages; superseded is treated as terminal, others trigger a forcePost refresh.

Sequence Diagram

sequenceDiagram
    participant CLI
    participant SessionAPI as /api/v1/freebuff/session
    participant ChatAPI as /api/v1/chat/completions
    participant DB as free_session (Postgres)
    participant Ticker as AdmissionTicker

    CLI->>SessionAPI: POST /session (on startup)
    SessionAPI->>DB: joinOrTakeOver (UPSERT)
    DB-->>SessionAPI: row {status: queued, instanceId}
    SessionAPI-->>CLI: {status: queued, position, estimatedWaitMs}

    loop Poll every 5s while queued
        CLI->>SessionAPI: GET /session
        SessionAPI->>DB: getSessionRow
        DB-->>SessionAPI: row
        SessionAPI-->>CLI: {status: queued | active}
    end

    Ticker->>DB: sweepExpired() + admitFromQueue() every 5s
    Note over Ticker,DB: advisory lock prevents double-admission across pods
    DB-->>Ticker: admitted rows
    Note over CLI: bell plays on queued→active transition

    CLI->>ChatAPI: POST /completions + freebuff_instance_id
    ChatAPI->>DB: checkSessionAdmissible(userId, instanceId)
    alt session active and instance matches
        DB-->>ChatAPI: {ok: true}
        ChatAPI-->>CLI: 200 stream
    else queued / expired / superseded
        DB-->>ChatAPI: {ok: false, code}
        ChatAPI-->>CLI: 428/429/409/410
        CLI->>SessionAPI: refreshFreebuffSession() or markSuperseded()
    end

    CLI->>SessionAPI: DELETE /session (on exit)
    SessionAPI->>DB: endSession(userId)

Comments Outside Diff (1)

cli/src/components/waiting-room-screen.tsx, line 386-388 (link)

onDisableAds is a no-op — may render a dead UI affordance

onDisableAds={() => {}} passes a silent no-op to AdBanner. If AdBanner renders any "Disable ads" affordance based on this prop, clicking it will silently do nothing. Consider passing undefined if AdBanner hides the affordance in that case, or add a comment explaining why no-op is intentional here (e.g. "free users cannot disable ads in the waiting room").

_{Reviews (1): Last reviewed commit: "Merge branch 'main' into freebuff-waitin..." | Re-trigger Greptile}

greptile-apps · 2026-04-18T22:43:51Z

+        sql`(${schema.freeSession.queued_at}, ${schema.freeSession.user_id}) <= (${params.queuedAt}, ${params.userId})`,
+      ),
+    )
+  return Number(rows[0]?.n ?? 0)
+}
+
+/** Remove rows whose active session has expired. Safe to call repeatedly. */
+export async function sweepExpired(now: Date): Promise<number> {
+  const deleted = await db


No TTL on queued rows — stale entries accumulate indefinitely

sweepExpired only removes rows where status = 'active' AND expires_at < now. Queued rows are never automatically removed. A user who joins the queue (POST /session) and then hard-crashes or loses network before their CLI can send the fire-and-forget DELETE will leave a stale status=queued row in the table forever. These phantom entries:

Inflate queueDepth() for all other users, making wait-time estimates wrong.

Keep would-be new users at artificially higher positions until a real admin runs cleanup.

Consider adding a queued_at + max_queue_wait expiry sweep alongside the active-session sweep, e.g.:

// In sweepExpired (store.ts): also remove queued rows idle for too long await db.delete(schema.freeSession).where( and( eq(schema.freeSession.status, 'queued'), lt(schema.freeSession.queued_at, new Date(now.getTime() - MAX_QUEUE_IDLE_MS)), ), )

Alternatively, track the last heartbeat time and sweep queued rows that haven't polled in N minutes.

greptile-apps · 2026-04-18T22:43:52Z

+  if (!row.expires_at || row.expires_at.getTime() <= now.getTime()) {
+    return {
+      ok: false,
+      code: 'session_expired',
+      message: 'Your free session has expired. Re-join the waiting room via POST /api/v1/freebuff/session.',
+    }
+  }
+
+  if (!params.claimedInstanceId || params.claimedInstanceId !== row.active_instance_id) {


Null claimedInstanceId returns session_superseded instead of a distinct error

When claimedInstanceId is null or undefined (absent from the request), the code falls through to the session_superseded branch with the message "Another instance of freebuff has taken over this session." This conflates two separate conditions — a missing identifier vs. a rotated one — under the same 409 code.

In the current UI flow this is safe (the WaitingRoomScreen gate prevents users from sending messages before their instance ID is populated), but it makes server-side observability harder and would produce a confusing 409 for any future client that omits the field.

Consider splitting the two cases:

if (!params.claimedInstanceId) { return { ok: false, code: 'waiting_room_required', message: 'freebuff_instance_id missing from request. Call POST /api/v1/freebuff/session first.', } } if (params.claimedInstanceId !== row.active_instance_id) { return { ok: false, code: 'session_superseded', message: '...' } }

greptile-apps · 2026-04-18T22:43:53Z

+  | { ok: false; code: 'session_superseded'; message: string }
+  | { ok: false; code: 'session_expired'; message: string }
+
+/**
+ * Called from the chat/completions hot path for free-mode requests. Either
+ * returns `{ ok: true }` (request may proceed) or a structured rejection
+ * the caller translates into a 4xx response.
+ *
+ * Never trusts client timestamps. The caller supplies `claimedInstanceId`


queuePositionFor and queueDepth are not fetched in a transaction

These two queries run in parallel outside a transaction. If users are admitted between the two reads, position could exceed depth (e.g. position=3, depth=2), which would be confusing on the client's waiting-room display. This is cosmetic rather than a correctness bug, but wrapping both in a single read-only transaction or a WITH CTE would eliminate the inconsistency.

Replicas=0 or no replicas metric at all (the deployment has been scaled to zero or dropped from the scrape) now flips that deployment's health to unhealthy unconditionally, so admission fails closed instead of funneling users to a backend that cannot serve traffic. Also drop generationQueueMs degraded 5000 -> 400 and ttftMs degraded 8000 -> 2000, and comment out the kimi deployment since only glm-5.1 is in production. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

FREEBUFF_MAX_CONCURRENT_SESSIONS is gone. Admission now runs purely as a drip (MAX_ADMITS_PER_TICK=1 every 15s) gated by the Fireworks health monitor — utilisation ramps up slowly and pauses the moment metrics degrade, so a static cap is redundant. Renamed SessionDeps' getMaxConcurrentSessions/getSessionLengthMs to getAdmissionTickMs/getMaxAdmitsPerTick (those are what the wait-time estimate actually needs now). estimateWaitMs is rewritten from the session-cycle model to the drip model: waitMs = ceil((position - 1) / maxAdmitsPerTick) * admissionTickMs Dropped the 'full' branch of AdmissionTickResult and the full-capacity admission test — the only reason admission skips now is health. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Stdin is in raw mode on these screens, so SIGINT never fires — Ctrl+C had no effect and users had to kill the process. Now both screens hook Ctrl+C via OpenTUI's useKeyboard, flush analytics with a 1s cap, and exit. The waiting-room screen additionally sends a best-effort DELETE /api/v1/freebuff/session before exit so the seat frees up immediately instead of waiting on the server-side expiry sweep. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

p50 TTFT degraded 1000 → 1500ms and p50 generation queue degraded 200 → 300ms, so a healthy deployment running at steady-state 1s TTFT does not trip the admission gate. scripts/scrape-check.ts pulls the live Fireworks metrics and prints the same per-deployment health the admission gate sees — useful for tuning thresholds without guessing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Keep admitting requests for FREEBUFF_SESSION_GRACE_MS (default 30m) after a session's expires_at so in-flight agent runs can drain; hard cutoff past that. Also: replicas=0 → unhealthy, hoist chat/completions gate status map, fix stale threshold comment and a pre-existing free-mode test missing the checkSessionAdmissible override. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When the server flips the seat into `draining` (past `expires_at`) or past the hard cutoff, the chat input is replaced by a "Session ended" banner. While an agent is still streaming under the grace window, Enter is disabled and the banner shows "Agent is wrapping up. Rejoin the wait room after it's finished." — Esc still interrupts. Once idle, Enter re-POSTs /session to drop back into the waiting room. Adds a small countdown to the far right of the status bar (muted, turning soft warning in the final minute — no red) and schedules the next poll just after expires_at / gracePeriodEndsAt so the draining transition shows up promptly instead of stalling at 0 for a full interval. Moves getFreebuffInstanceId onto the session hook's module handle and deletes the now-vestigial freebuff-session-store. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Delete the 1.5k-LOC fireworks-monitor package (Prometheus scrape, health computation, admin endpoint, CLI scripts) in favor of a single-function reachability probe inline in free-session/admission.ts: GET the account metrics endpoint with a 5s timeout and fail closed on non-OK. The full-health-scoring machinery was load-bearing on nothing — admission only ever read the boolean gate, and reachability is what actually matters for halting during an outage. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

When the waiting-room gate rejects a free-mode request and no freebuff_instance_id was sent, return 426 with a "please restart to upgrade" message. Old CLI versions render the message verbatim in their error banner; new clients still get the normal gate responses. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Collapse client-side draining/ended into a single ended state, move freebuff session state into zustand (replacing the module-level handle singleton), host the Fireworks probe inside admitFromQueue, and share the wire types between server and CLI. Drops ~150 lines net. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The CLI used to layer two client-only states (`superseded`, `ended`) on top of the server's response and translate `draining` → `ended` in a mapper. Replaces all of that with a single union the server emits directly: - GET /session now reads `X-Freebuff-Instance-Id` and returns `{ status: 'superseded' }` when the active row's id no longer matches. Previously this was inferred client-side from the chat gate's 409. - The wire's `draining` is renamed to `ended` and carries the same grace fields. The CLI's post-grace synthetic `ended` (no instanceId) reuses that shape. Also drops the zustand `driver` indirection — imperative session controls (refresh / mark superseded / mark ended) live as module-level functions on the hook, talking to a private controller ref. Combines `refreshFreebuffSession` and `rejoinFreebuffSession` into one with an optional `{ resetChat }` flag. Inlines the constant getters in `SessionDeps`/`AdmissionDeps` so tests pass plain numbers, drops the `limit` parameter from `admitFromQueue` (always 1), and consolidates the three 1-Hz UI tickers into a shared `useNow` hook. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Ambient bar fills proportionally to time remaining so users have a passive sense of session progress; a bold warning-colored "X:XX" readout appears only for the final five minutes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

jahooma and others added 13 commits April 17, 2026 18:59

Freebuff waiting room backend

c7e5807

Freebuff waiting room client

41ffbab

Fix cli test

fa1f8f8

10 minutes before cache clears in free mode

2d9f081

Remove thinker-with-files-gemini from freebuff

90a9580

Bump version to 1.0.642

c76c5a3

Bump Freebuff version to 0.0.34

35d0775

Increase test timeout

0f261bf

fix: avoid DNS lookup after proxied release CONNECT (#506)

9f8de8a

Remove evalbuff and expensivebuff (#493)

e411821

Co-authored-by: CodebuffAI <189203002+CodebuffAI@users.noreply.github.com>

fix: correct code-map line counting (#508)

45fe312

Revert restrictions on using paid codebuff

5c51810

Merge branch 'main' into freebuff-waiting-room

bb53e06

jahooma requested review from brandonkachen and charleslien as code owners April 18, 2026 22:38

greptile-apps bot reviewed Apr 18, 2026

View reviewed changes

jahooma and others added 13 commits April 18, 2026 15:52

Fixes

282194a

Admit one user per 30s

8ca704a

Improve waiting room screen

8ee55ab

Session countdown

845bed1

jahooma and others added 6 commits April 18, 2026 18:45

Tweak text for session end banner

d575c88

More cleanup

fdf60ae

Fix basher test

59a0e48

Handle old backend result

1aeab98

jahooma merged commit 7e07b1a into main Apr 19, 2026
32 of 34 checks passed

jahooma deleted the freebuff-waiting-room branch April 19, 2026 05:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Freebuff waiting room#509

Freebuff waiting room#509
jahooma merged 32 commits intomainfrom
freebuff-waiting-room

jahooma commented Apr 18, 2026

Uh oh!

greptile-apps bot commented Apr 18, 2026 •

edited

Loading

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Apr 18, 2026

Uh oh!

greptile-apps bot Apr 18, 2026

Uh oh!

greptile-apps bot Apr 18, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jahooma commented Apr 18, 2026

Uh oh!

greptile-apps bot commented Apr 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Comments Outside Diff (1)

Uh oh!

greptile-apps bot Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Apr 18, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

greptile-apps bot commented Apr 18, 2026 •

edited

Loading