fix(event-service): refcount subscribe/unsubscribe by context #2927

Merged
iscekic merged 31 commits into feat/kilo-chat-migration-pr1 from feat/kilo-chat-migration-pr4-5 on Apr 30, 2026

Contributor

@iscekic iscekic commented Apr 29, 2026

Summary

Small SDK hardening pulled out of the kilo-chat migration stack as PR 4.5. Stacks on top of #2924.

EventServiceClient.subscribe/unsubscribe previously stored active contexts in a Set<string>, so two consumers subscribing to the same context and one of them unsubscribing would yank the subscription out from under the other. We don't hit this today (PR 4's three presence tiers each use a distinct context) but PR 5+ will start composing subscriptions over the same context (e.g., a sidebar unread-counts hook subscribing to kiloclawInstanceContext while KiloChatLayout already does).

Fix: change activeContexts to Map<string, number>. Wire context.subscribe is sent only on the 0→1 transition; context.unsubscribe only on 1→0. resubscribeContexts() dedupes by context key on reconnect.

This makes subscribe/unsubscribe idempotent and composable across consumers — the SDK owns the contract so callers (web usePresenceSubscription, the upcoming mobile equivalents) stay naive about it.
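
The refcounting described above can be sketched as follows. This is a minimal illustration, not the code in `packages/event-service/src/client.ts`: the class name `RefcountedContexts`, the array-based method signatures, the `WireMessage` shape, and the injected `send` callback are all assumptions made for the example. Only `activeContexts`, `resubscribeContexts`, and the `context.subscribe` / `context.unsubscribe` message names come from the description.

```typescript
// Hypothetical wire message shape; the real protocol payload may differ.
type WireMessage = {
  type: "context.subscribe" | "context.unsubscribe";
  contexts: string[];
};

class RefcountedContexts {
  // context key -> number of consumers currently holding it
  private readonly activeContexts = new Map<string, number>();
  private readonly send: (msg: WireMessage) => void;

  constructor(send: (msg: WireMessage) => void) {
    this.send = send;
  }

  subscribe(contexts: string[]): void {
    const newlyActive: string[] = [];
    for (const ctx of contexts) {
      const count = this.activeContexts.get(ctx) ?? 0;
      this.activeContexts.set(ctx, count + 1);
      if (count === 0) newlyActive.push(ctx); // 0 -> 1: needs a wire send
    }
    if (newlyActive.length > 0) {
      this.send({ type: "context.subscribe", contexts: newlyActive });
    }
  }

  unsubscribe(contexts: string[]): void {
    const released: string[] = [];
    for (const ctx of contexts) {
      const count = this.activeContexts.get(ctx);
      if (count === undefined) continue; // unknown context: no-op
      if (count <= 1) {
        this.activeContexts.delete(ctx);
        released.push(ctx); // 1 -> 0: last consumer left
      } else {
        this.activeContexts.set(ctx, count - 1);
      }
    }
    if (released.length > 0) {
      this.send({ type: "context.unsubscribe", contexts: released });
    }
  }

  // On reconnect, each held context is sent once regardless of its refcount,
  // because Map keys are unique.
  resubscribeContexts(): void {
    const keys = Array.from(this.activeContexts.keys());
    if (keys.length > 0) {
      this.send({ type: "context.subscribe", contexts: keys });
    }
  }
}
```

With this shape, a second consumer subscribing to an already-held context produces no wire traffic, and its later unsubscribe only decrements the count.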

Plan update

docs/superpowers/plans/2026-04-29-mobile-kilo-chat-migration.md (gitignored, local-only) updated:

  • Added a PR 4.5 entry in the slicing section + summary table.
  • Annotated Phase 7 with the SDK contract so future workers don't add a parallel hook-level refcount.

Test plan

  • cd packages/event-service && pnpm test — 17 passed (12 existing + 5 new refcount tests + 1 sanity).
  • cd packages/event-service && pnpm run typecheck — clean.
  • cd packages/event-service && pnpm run lint — clean.
  • cd services/event-service && pnpm test — 16 passed (consumer of the SDK; behavior unchanged for single-consumer call sites).
  • Format applied via pnpm run format:changed.

Stack

#2907 → #2914 → #2918 → #2924 (PR 4) → this PR.

iscekic added 30 commits April 29, 2026 17:28
The badge_counts.badge_bucket column is a free-form string. To prevent
namespace collisions as more surfaces start emitting badge updates
(per-instance today, per-conversation later), centralize bucket-key
derivation in @kilocode/notifications and route NotificationChannelDO
through it. Mirrors the presence-context builders in @kilocode/event-service.

Safe to introduce now without a data migration because PR 2's migration
already wipes badge_counts.
…-chat producer

Adds kiloclawInstanceContext and kiloclawConversationContext path
builders to @kilocode/event-service, replacing hardcoded template
literals in kilo-chat's event-push.ts and its test so all callers
share a single source of truth.
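
A minimal sketch of what such path builders could look like, assuming the `/kiloclaw/{sandboxId}[/{conversationId}]` segment shape that a later commit message in this list mentions. The parameter names and the template-literal return types are assumptions; the actual signatures in `@kilocode/event-service` are not shown in this PR.

```typescript
// Assumed path shapes, taken from the segment pattern quoted later in the stack.
type KiloclawInstanceContext = `/kiloclaw/${string}`;
type KiloclawConversationContext = `/kiloclaw/${string}/${string}`;

// Builds the instance-level context path for one sandbox.
function kiloclawInstanceContext(sandboxId: string): KiloclawInstanceContext {
  return `/kiloclaw/${sandboxId}`;
}

// Builds the conversation-level context path under a sandbox.
function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string,
): KiloclawConversationContext {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}
```

Producers and tests that call these builders instead of writing the template literal inline cannot drift apart on the path format.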
When a chat message is persisted, fire-and-forget a call to
NOTIFICATIONS.sendPushForConversation so non-sender human members of the
conversation receive a push. Runs after realtime/event-service delivery
inside postCommitFanOut, with errors swallowed so push failures cannot
fail the send.

- Skip when there are no other human recipients or no sandboxId.
- senderUserId = callerId for human senders, null for bot senders.
- title is "<sandboxLabel> · <conversationTitle>"; bodyPreview is the
  first 200 chars of the concatenated text blocks.
- Add @kilocode/notifications workspace dep and layer the RPC method
  shape into Env via bindings.d.ts.
- Add a notifications-stub worker to the vitest config so tests can
  spy on env.NOTIFICATIONS.sendPushForConversation, and globally mock
  sandbox-lookup in setup.ts (it imports pg via @kilocode/db).
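
The payload rules in the list above can be sketched roughly as follows. Every type and function name here (`PushInput`, `PushRequest`, `buildPushRequest`, `fireAndForget`) is hypothetical, as is joining text blocks with no separator; only the skip conditions, the `senderUserId` rule, the title format, and the 200-character preview come from the commit message.

```typescript
// Hypothetical shapes; the real kilo-chat message and RPC types are not shown here.
type ContentBlock = { type: string; text?: string };

interface PushInput {
  callerId: string;
  senderIsBot: boolean;
  sandboxId: string | null;
  sandboxLabel: string;
  conversationTitle: string;
  humanMemberIds: string[];
  blocks: ContentBlock[];
}

interface PushRequest {
  recipientIds: string[];
  senderUserId: string | null;
  title: string;
  bodyPreview: string;
}

// Returns null when the push should be skipped.
function buildPushRequest(input: PushInput): PushRequest | null {
  if (input.sandboxId === null) return null; // no sandboxId: skip
  const senderUserId = input.senderIsBot ? null : input.callerId;
  const recipientIds = input.humanMemberIds.filter((id) => id !== senderUserId);
  if (recipientIds.length === 0) return null; // no other human recipients: skip
  const text = input.blocks
    .filter((b) => b.type === "text" && typeof b.text === "string")
    .map((b) => b.text ?? "")
    .join(""); // separator is an assumption
  return {
    recipientIds,
    senderUserId,
    title: `${input.sandboxLabel} · ${input.conversationTitle}`,
    bodyPreview: text.slice(0, 200),
  };
}

// Runs a task without awaiting it and swallows both sync and async failures,
// so a push error cannot fail the send path.
function fireAndForget(task: () => Promise<unknown>): void {
  void Promise.resolve()
    .then(task)
    .catch(() => {
      /* intentionally swallowed */
    });
}
```

In a Worker, the fire-and-forget call would normally also be tied to the request lifetime; that wiring is left out of this sketch.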
…es, fix test mock

- Remove `stream-chat` from `services/notifications/package.json`; the Stream
  webhook (its only consumer) was deleted earlier in the stack.
- Regenerate `worker-configuration.d.ts` so the workerd runtime types match the
  current toolchain (sibling services were on `1.20260312.1`; this one had
  drifted to `1.20251217.0` from a stale local cache).
- Fix the global test mock to reference the renamed `badge_counts` table; the
  setup file was authored against the pre-rename name and never matched.
- Tidy two pre-existing lint nits in the new test files (`import type` for
  type-only import, drop unused `cols` parameter).
…leak

- Switch `NotificationsService` from default-only to a named class export
  with a separate default. `services/kilo-chat/wrangler.jsonc` binds via
  `entrypoint: "NotificationsService"`, which resolves named module
  exports. The default-only form (`export default class NotificationsService`)
  exports under the `default` key — kilo-chat's RPC binding would not have
  resolved at deploy. Mirrors the existing pattern in
  `services/kilo-chat/src/index.ts` (`KiloChatService`).

- `dispatchPush` now uses a two-stage idempotency record (`pending` →
  `delivered`). The badge increment was previously non-idempotent: an
  Expo failure returned `failed` without writing the idempotency key, so
  upstream retries (which the design explicitly invites) re-ran the
  increment before the next send and inflated the badge by one per
  retry. The `pending` marker is written before the increment and
  short-circuits the increment on retry; the `delivered` marker is only
  written on success.

- `setAlarm` is now gated on `getAlarm() === null`. Calling `setAlarm`
  unconditionally on each successful push — as the previous code did —
  replaces the pending alarm and pushes the cleanup forward indefinitely
  on a conversation receiving more than one push per `IDEM_TTL_MS`,
  leaking expired idempotency entries.

Adds two test cases covering the badge-retry and alarm-reset paths.
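
The two-stage idempotency record can be illustrated with an in-memory sketch. It is synchronous for brevity, whereas the real `dispatchPush` runs against Durable Object storage and an Expo call, both asynchronous. The class name `PushDispatcher`, the `sendToExpo` callback, and the `sent` / `duplicate` result names are assumptions; `pending`, `delivered`, and `failed` come from the commit message.

```typescript
type IdemState = "pending" | "delivered";
type DispatchResult = "sent" | "failed" | "duplicate";

class PushDispatcher {
  // idempotency key -> stage of the record (stands in for DO storage)
  private readonly idem = new Map<string, IdemState>();
  private readonly sendToExpo: (badge: number) => boolean;
  badge = 0;

  constructor(sendToExpo: (badge: number) => boolean) {
    this.sendToExpo = sendToExpo;
  }

  dispatchPush(idempotencyKey: string): DispatchResult {
    const state = this.idem.get(idempotencyKey);
    if (state === "delivered") return "duplicate";

    if (state === undefined) {
      // First attempt: write the pending marker BEFORE incrementing, so a
      // retry after a failed send finds the marker and skips the increment.
      this.idem.set(idempotencyKey, "pending");
      this.badge += 1;
    }

    const ok = this.sendToExpo(this.badge);
    if (!ok) return "failed"; // record stays pending; retry is safe

    this.idem.set(idempotencyKey, "delivered"); // only written on success
    return "sent";
  }
}
```

A failed first attempt followed by a successful retry therefore leaves the badge at one instead of two.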
- Schedule the cleanup alarm when writing the `pending` marker, not only
  on `delivered`. Without this, an Expo failure followed by no further
  push activity for the conversation leaves the `pending` record in DO
  storage forever (no alarm was ever set to prune it).

- After the alarm fires, reschedule for the earliest remaining record's
  expiry instead of leaving the alarm slot empty. Otherwise a quiet
  conversation strands its younger entries until some unrelated future
  dispatch wakes the DO up.

Both paths go through a small `ensureCleanupAlarm` helper that gates on
`getAlarm() === null` so a busy conversation still doesn't push the
alarm forward on every call.
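
A rough sketch of the alarm handling described in the two commits above. `AlarmStore` is a hypothetical synchronous stand-in for the Durable Object storage alarm methods, which are asynchronous in the real runtime; `InMemoryAlarmStore`, `onAlarmFired`, and the `records` map of key to expiry timestamp are also assumptions. Only `ensureCleanupAlarm` and its `getAlarm() === null` gate are named in the source.

```typescript
// Hypothetical stand-in for the storage alarm API (real calls are async).
interface AlarmStore {
  getAlarm(): number | null;
  setAlarm(fireAt: number): void;
}

class InMemoryAlarmStore implements AlarmStore {
  alarm: number | null = null;
  getAlarm(): number | null {
    return this.alarm;
  }
  setAlarm(fireAt: number): void {
    this.alarm = fireAt;
  }
}

// Schedules cleanup only when no alarm is pending, so a busy conversation
// cannot keep pushing the cleanup time forward.
function ensureCleanupAlarm(store: AlarmStore, fireAt: number): void {
  if (store.getAlarm() === null) store.setAlarm(fireAt);
}

// Called after the alarm has fired (the alarm slot is empty at that point):
// prune expired records, then reschedule for the earliest remaining expiry.
function onAlarmFired(
  store: AlarmStore,
  records: Map<string, number>, // key -> expiresAt (ms)
  now: number,
): void {
  Array.from(records.entries()).forEach(([key, expiresAt]) => {
    if (expiresAt <= now) records.delete(key);
  });
  const remaining = Array.from(records.values());
  if (remaining.length > 0) {
    ensureCleanupAlarm(store, Math.min(...remaining));
  }
}
```

Routing both the write path and the post-alarm path through the same gate keeps a single place where an alarm can be set.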
The kiloclaw-scoped presence paths are literally `/presence` prefixed
onto the kiloclaw event-context paths. Build them by composition so the
`/kiloclaw/{sandboxId}[/{conversationId}]` segment shape is defined in
exactly one place — `kiloclaw-contexts.ts`.

Pure refactor; same string output, template-literal types still narrow
to the same shape.
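
Building by composition might look like the sketch below. The presence builder names (`kiloclawInstancePresenceContext`, `kiloclawConversationPresenceContext`) are hypothetical, and the event-context builders are re-declared only to keep the example self-contained; the `/presence` prefix and the `/kiloclaw/{sandboxId}[/{conversationId}]` shape come from the commit message.

```typescript
// Event-context builders: the only place the /kiloclaw/... shape is spelled out.
function kiloclawInstanceContext(sandboxId: string): `/kiloclaw/${string}` {
  return `/kiloclaw/${sandboxId}`;
}

function kiloclawConversationContext(
  sandboxId: string,
  conversationId: string,
): `/kiloclaw/${string}/${string}` {
  return `/kiloclaw/${sandboxId}/${conversationId}`;
}

// Presence paths prefix the event-context path rather than repeating its shape.
function kiloclawInstancePresenceContext(
  sandboxId: string,
): `/presence/kiloclaw/${string}` {
  return `/presence${kiloclawInstanceContext(sandboxId)}`;
}

function kiloclawConversationPresenceContext(
  sandboxId: string,
  conversationId: string,
): `/presence/kiloclaw/${string}/${string}` {
  return `/presence${kiloclawConversationContext(sandboxId, conversationId)}`;
}
```

If the kiloclaw segment shape ever changes, only the two event-context builders need editing and the presence paths follow.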
Introduces a single app-shell EventServiceProvider that owns the
EventServiceClient and KiloChatClient for all authenticated routes.
Mounted in (app)/layout.tsx so platform/instance/conversation presence
subscriptions and the kilo-chat UI share one WebSocket.

KiloChatLayout now consumes the global clients via useEventServiceClient()
instead of spinning up its own pair, and the getToken prop is removed from
KiloChatLayoutProps (along with both call sites). The local
useEventService(getToken) factory is dead code and has been deleted;
useInstanceContext / useConversationContext stay since they take
EventServiceClient as a parameter.
Thin hook that subscribes the global EventServiceClient to a single
context for the lifetime of the calling component, gated by an `active`
flag. Will back upcoming platform- and instance-level presence
indicators.
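
The core of such a hook can be sketched without React as a function that subscribes and returns a cleanup callback, which is the shape an effect body would return. `ContextSubscriber`, `subscribeWhileActive`, and the array-based client signatures are assumptions, not the actual `usePresenceSubscription` code; treating `null` or an empty string as "nothing to subscribe to" anticipates the hardening that later commits in this list describe.

```typescript
// Hypothetical minimal view of the client used by the hook.
interface ContextSubscriber {
  subscribe(contexts: string[]): void;
  unsubscribe(contexts: string[]): void;
}

// Subscribes to one context while `active` is true and returns a cleanup
// function that releases it exactly once.
function subscribeWhileActive(
  client: ContextSubscriber,
  context: string | null,
  active: boolean,
): () => void {
  if (!active || context === null || context === "") {
    return () => {}; // nothing held, nothing to release
  }
  client.subscribe([context]);
  let released = false;
  return () => {
    if (released) return; // guard against double cleanup
    released = true;
    client.unsubscribe([context]);
  };
}
```

Inside a component, the returned function would be handed back from the effect so the subscription ends when the component unmounts or the inputs change.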
…eSubscription

- Drop dead getToken field from KiloChatContextValue (no consumers).
- Remove useInstanceContext / useConversationContext hooks; both call
  sites now use the shared usePresenceSubscription primitive directly.
- Harden usePresenceSubscription against empty-string contexts.
- usePresenceSubscription: accept 'string | null' instead of empty-string
  sentinel; update call sites (KiloChatLayout, MessageArea, useInstancePresence)
- kilo-chat router: validate expiresAt with z.iso.datetime()
- kilo-chat-router test: verify the JWT payload (kiloUserId, tokenSource,
  version) and that expiresAt lands in the expected ~1h window
- MessageArea: comment distinguishing the always-on chat-event subscription
  from the visibility-gated presence subscription
Multiple consumers can now independently hold the same context without
trampling each other. The wire context.subscribe/context.unsubscribe
messages are only sent on the 0->1 and 1->0 refcount transitions; the
intermediate churn stays client-side.

Resubscribe-on-reconnect dedupes by context key.

Tests cover: double-subscribe collapses to a single wire send, partial
unsubscribe keeps the context alive, last-consumer-out releases it,
mixed batches only send newly-active contexts, unknown-context
unsubscribes are no-ops, and reconnect resubscribes each context once.
@iscekic iscekic self-assigned this Apr 29, 2026
Contributor

kilo-code-bot Bot commented Apr 29, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Files Reviewed (2 files)
  • packages/event-service/src/client.ts
  • packages/event-service/src/__tests__/client.test.ts

Reviewed by gpt-5.5-2026-04-23 · 116,399 tokens

Base automatically changed from feat/kilo-chat-migration-pr4 to feat/kilo-chat-migration-pr1 April 30, 2026 13:05
…to feat/kilo-chat-migration-pr4-5

# Conflicts:
#	packages/notifications/src/badge-buckets.ts
#	services/kilo-chat/wrangler.jsonc
@iscekic iscekic merged commit f4489ac into feat/kilo-chat-migration-pr1 Apr 30, 2026
1 check passed
@iscekic iscekic deleted the feat/kilo-chat-migration-pr4-5 branch April 30, 2026 13:10