
feat: Add system subclusters and kernel facet service#803

Merged
rekmarks merged 48 commits into main from rekm/system-vats-redux
Feb 10, 2026

Conversation

@rekmarks
Member

@rekmarks rekmarks commented Feb 3, 2026

Adds support for "system subclusters" - statically declared subclusters that are launched at kernel initialization and persist across kernel restarts. System subclusters can receive powerful kernel services not available to normal vats via the KernelFacet. In summary:

  • System subcluster persistence: System subclusters now persist across kernel restarts like regular subclusters
    • Orphaned system subclusters (config removed) are deleted on boot without starting their vats
  • KernelFacet: Privileged API for system vats to interact with the kernel (launch subclusters, queue messages, etc.)
    • Basically, an expanded version of the previously introduced "kernel facade" (for background CapTP purposes)
    • Introducing this as a service necessitated eagerly dispatching kernel service invocations, which would otherwise deadlock the current crank if the called kernel method waited for the current crank to complete
  • Controller vat: Moved Omnium controllers to a TypeScript controller-vat that runs inside the kernel
  • Globals endowments: Added globals config to allow vats to receive specific globals (like Date) in their SES Compartment
  • Kernel subcluster representation: The kernel now stores a name -> id mapping for subcluster vats, which facilitates identifying the bootstrap vat of a launched subcluster
    • Previously, only an array of vat ids was stored
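To make the deadlock concern in the KernelFacet bullet concrete, here is a minimal sketch, assuming a toy kernel whose service method cannot finish until the crank that invoked it has ended. `MiniKernel`, `service`, and `runCrank` are hypothetical names for illustration only and are not the kernel's API.

```typescript
// Hypothetical toy model; none of these names come from @metamask/ocap-kernel.
type Deferred = { promise: Promise<void>; resolve: () => void };

function makeDeferred(): Deferred {
  let resolve: () => void = () => undefined;
  const promise = new Promise<void>((res) => {
    resolve = () => res();
  });
  return { promise, resolve };
}

class MiniKernel {
  readonly log: string[] = [];

  private crankEnd: Deferred = makeDeferred();

  // Stand-in for a kernel method that waits for the current crank to complete.
  async service(): Promise<string> {
    await this.crankEnd.promise;
    this.log.push('service done');
    return 'ko-result';
  }

  async runCrank(mode: 'awaited' | 'eager'): Promise<void> {
    this.crankEnd = makeDeferred();
    this.log.push('crank start');
    const result = this.service();
    if (mode === 'awaited') {
      // Deadlock: the service waits for the crank, and the crank waits for the service.
      await result;
    } else {
      // Eager dispatch: start the call and let its result settle after the crank ends.
      void result.then((value) => this.log.push(`result ${value}`));
    }
    this.log.push('crank end');
    this.crankEnd.resolve();
  }
}
```

In this model, `runCrank('eager')` returns once the crank ends and the service result settles shortly afterwards, whereas `runCrank('awaited')` never reaches `crank end`.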

Note

High Risk
Touches kernel initialization, message routing, and kernel-service invocation semantics (now non-awaited) and adds new persisted system-subcluster state, so regressions could affect boot behavior, crank processing, and subcluster/vat lifecycle.

Overview
Adds system subclusters: statically configured subclusters that are restored on boot, launched after the run queue starts, can be looked up via getSystemSubclusterRoot, and are cleaned up when configs are removed (including orphan deletion before vat startup and mapping updates on reload/reset).

Replaces the browser-runtime CapTP “kernel facade” with a first-class KernelFacet kernel service (exported from @metamask/ocap-kernel) and updates CapTP endpoints, RPC (launchSubcluster result now uses rootKref), and Omnium’s background to interact via the facet and a controller system vat.

Extends vat configuration with an allowlisted globals array to inject specific globals (e.g., Date) into SES worker endowments, and refactors subcluster storage from a vat-id array to a name→vat-id record (requiring VatManager.launchVat/store APIs to accept vat names) with broad test updates plus new Node e2e coverage for system-subcluster behavior and persistence.
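As a rough illustration of why a name→vat-id record makes the bootstrap vat easier to find than a bare array, consider the sketch below. The type shapes, the `bootstrap` field, and `getBootstrapVatId` are assumptions made for this example and may not match the kernel's actual definitions.

```typescript
// Hypothetical shapes for illustration; the kernel's real types may differ.
type VatId = string;

// Before: only vat ids were stored, so nothing says which one is the bootstrap vat.
type LegacySubcluster = { id: string; vats: VatId[] };

// After: vat ids are keyed by vat name.
type Subcluster = {
  id: string;
  bootstrap: string; // assumed: the name of the bootstrap vat from the config
  vats: Record<string, VatId>;
};

function getBootstrapVatId(subcluster: Subcluster): VatId {
  const vatId = subcluster.vats[subcluster.bootstrap];
  if (vatId === undefined) {
    throw new Error(`Subcluster "${subcluster.id}" has no bootstrap vat`);
  }
  return vatId;
}

const legacy: LegacySubcluster = { id: 's1', vats: ['v1', 'v2'] };
const current: Subcluster = {
  id: 's1',
  bootstrap: 'main',
  vats: { main: 'v1', helper: 'v2' },
};
```

With the record form, `getBootstrapVatId(current)` is a direct lookup by name; with `legacy`, the caller would have to rely on array ordering or some out-of-band convention.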

Written by Cursor Bugbot for commit 4763784. This will update automatically on new commits.

@rekmarks rekmarks force-pushed the rekm/system-vats-redux branch from f58f8c0 to 079a10a Compare February 3, 2026 20:08
@rekmarks
Member Author

rekmarks commented Feb 3, 2026

@cursor review

@github-actions
Contributor

github-actions bot commented Feb 3, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 77.93% (⬇️ -0.30%) | 6282 / 8061 |
| 🔵 | Statements | 77.88% (⬇️ -0.33%) | 6383 / 8195 |
| 🔵 | Functions | 76.05% (⬇️ -0.42%) | 1582 / 2080 |
| 🔵 | Branches | 77.62% (⬇️ -0.64%) | 2296 / 2958 |

File Coverage

Changed Files

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
| --- | --- | --- | --- | --- | --- |
| packages/cli/src/vite/vat-bundler.ts | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 18-60 |
| packages/extension/src/global.d.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/background-captp.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/index.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/types.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/kernel-worker/kernel-worker.ts | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 26-111 |
| packages/kernel-browser-runtime/src/kernel-worker/captp/index.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/kernel-worker/captp/kernel-captp.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/kernel-browser-runtime/src/rpc-handlers/launch-subcluster.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/ocap-kernel/src/Kernel.ts | 90.42% (⬇️ -0.71%) | 78.57% (⬇️ -2.38%) | 86.36% (⬇️ -1.44%) | 90.42% (⬇️ -0.71%) | 122, 252-255, 272, 420, 488, 558, 568-569, 612 |
| packages/ocap-kernel/src/KernelRouter.ts | 90.16% (🟰 ±0%) | 75.38% (🟰 ±0%) | 100% (🟰 ±0%) | 90.16% (🟰 ±0%) | 110, 163, 175, 225, 252-261, 268, 314, 329, 332 |
| packages/ocap-kernel/src/KernelServiceManager.ts | 93.87% (⬇️ -6.13%) | 88.88% (⬇️ -11.12%) | 100% (🟰 ±0%) | 93.87% (⬇️ -6.13%) | 178-183 |
| packages/ocap-kernel/src/index.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/ocap-kernel/src/kernel-facet.ts | 100% | 100% | 100% | 100% | |
| packages/ocap-kernel/src/types.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/ocap-kernel/src/store/methods/subclusters.ts | 98.8% (⬇️ -1.20%) | 86.66% (⬇️ -2.62%) | 96.15% (⬇️ -3.85%) | 98.76% (⬇️ -1.24%) | 258 |
| packages/ocap-kernel/src/vats/SubclusterManager.ts | 95.62% (⬇️ -2.88%) | 83.92% (⬇️ -10.52%) | 100% (🟰 ±0%) | 95.56% (⬇️ -2.94%) | 187-189, 193-195, 259, 311-313, 486, 491-493, 497-499 |
| packages/ocap-kernel/src/vats/VatManager.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/ocap-kernel/src/vats/VatSupervisor.ts | 72.72% (⬇️ -1.92%) | 42.42% (⬇️ -2.40%) | 58.33% (🟰 ±0%) | 72.72% (⬇️ -1.92%) | 126, 137, 145, 183, 221-225, 236, 245-246, 267-269, 272, 276-278, 310-312, 329, 346-354 |
| packages/omnium-gatherum/src/background.ts | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 21-275 |
| packages/omnium-gatherum/src/global.d.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/omnium-gatherum/src/offscreen.ts | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 0% (🟰 ±0%) | 20-122 |
| packages/omnium-gatherum/src/controllers/index.ts | 100% (⬆️ +100.00%) | 100% (🟰 ±0%) | 100% (⬆️ +100.00%) | 100% (⬆️ +100.00%) | |
| packages/omnium-gatherum/src/controllers/caplet/caplet-controller.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/omnium-gatherum/src/controllers/caplet/index.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/omnium-gatherum/src/controllers/caplet/types.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/omnium-gatherum/src/controllers/storage/index.ts | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | 100% (🟰 ±0%) | |
| packages/omnium-gatherum/src/vats/controller-vat.ts | 0% | 0% | 0% | 0% | 58-196 |
| packages/omnium-gatherum/src/vats/storage/baggage-adapter.ts | 100% | 100% | 100% | 100% | |

Generated in workflow #3619 for commit 4763784 by the Vitest Coverage Report Action

@rekmarks
Member Author

rekmarks commented Feb 3, 2026

@cursor review

@rekmarks rekmarks force-pushed the rekm/system-vats-redux branch from 2d79ecf to 0a66993 Compare February 5, 2026 00:01
@rekmarks
Member Author

rekmarks commented Feb 5, 2026

@cursor review

@rekmarks rekmarks changed the title feat: Add system vats support with KernelFacet feat: Add system subclusters and kernel facet service Feb 5, 2026
@rekmarks rekmarks marked this pull request as ready for review February 5, 2026 02:20
@rekmarks rekmarks requested a review from a team as a code owner February 5, 2026 02:20
Comment on lines +23 to +28
// TODO: Remove this define block and add a process shim to VatSupervisor
// workerEndowments instead. This injects into ALL bundles but is only needed
// for libraries like immer that check process.env.NODE_ENV.
define: {
'process.env.NODE_ENV': JSON.stringify('production'),
},
Member Author

Going to address in a follow-up. This requires changes to how we bundle vats with Vite that are best not added to this PR.

Comment on lines +298 to +301
// Map of allowed global names to their values
const allowedGlobals: Record<string, unknown> = {
Date: globalThis.Date,
};
Member Author

See: #813
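For readers following the globals allowlist in the snippet under review, here is a minimal sketch of how a vat's requested `globals` might be filtered against it. The `allowedGlobals` map mirrors the snippet; `selectGlobals` and its choice to throw on an unknown name are assumptions for illustration, and the actual VatSupervisor behavior may differ.

```typescript
// Allowlist as in the snippet under review; everything else is a hypothetical sketch.
const allowedGlobals: Record<string, unknown> = {
  Date: globalThis.Date,
};

// Builds the extra endowments for a vat from the names listed in its config.
function selectGlobals(requested: readonly string[]): Record<string, unknown> {
  const endowments: Record<string, unknown> = {};
  for (const name of requested) {
    // Assumed policy: reject names that are not on the allowlist.
    if (!Object.prototype.hasOwnProperty.call(allowedGlobals, name)) {
      throw new Error(`Global "${name}" is not allowlisted`);
    }
    endowments[name] = allowedGlobals[name];
  }
  return endowments;
}
```

A vat configured with `globals: ['Date']` would receive `Date` in its endowments, while a request for any other name would be refused under this assumed policy.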

@rekmarks rekmarks requested a review from sirtimid February 6, 2026 20:46
@rekmarks
Member Author

rekmarks commented Feb 6, 2026

Stack


Managed by gh-stack

rekmarks and others added 3 commits February 6, 2026 15:03
Implement system vats that are launched at kernel initialization and have
access to privileged kernel services. Key changes:

- Add SystemVatConfig type and getSystemVatRoot method to Kernel
- Launch system vats after queue starts to avoid deadlock
- Terminate and relaunch existing system vat subclusters on restart
- Add bootstrap-vat.js for Omnium system services with CapletController
- Add baggage-backed storage adapter for vat persistence
- Pass systemVats config via URL params from offscreen to kernel worker
- Update background.ts to use system vat for caplet operations
- Add process.env.NODE_ENV replacement in vat bundler for SES compatibility
- Simplify kernel-facet.ts by removing SystemVatManager
- Add duplicate name check in KernelServiceManager.registerKernelServiceObject

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename bootstrap-vat.js to bootstrap-vat.ts with full type annotations
- Export Baggage type from baggage-adapter.ts
- Make logger optional throughout controller hierarchy
- Simplify defineMethods to take array of method names instead of object map
- Update background.ts to use simplified method names (install, uninstall, etc.)
- Update package.json build script to reference .ts file

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
rekmarks and others added 16 commits February 6, 2026 15:03
reloadSubcluster() creates a new subcluster with a new ID, but was not
updating #systemSubclusterRoots or the persisted systemSubcluster.*
mappings. This left stale mappings that caused 'has no bootstrap vat'
errors on subsequent kernel restarts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…o SubclusterManager

System subcluster state and logic (persist/restore/cleanup mappings,
launch new named subclusters, track roots) belongs in SubclusterManager
which already owns subcluster CRUD, termination, and reload. This moves
~140 lines out of Kernel.ts into SubclusterManager, keeping Kernel as a
thin orchestration layer that delegates to its managers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…registration

The kernelFacet kernel service now takes ko3, shifting all vat root
ko IDs by 1. Update hardcoded ko references in control-panel,
object-registry, and remote-comms e2e tests accordingly.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…usters

reloadAllSubclusters bypasses reloadSubcluster and has its own loop
that calls addSubcluster + launchVatsForSubcluster directly, so it
never updated the in-memory systemSubclusterRoots map or persisted
mappings. After a reload-all, getSystemSubclusterRoot() would return
stale krefs pointing to deleted objects.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace local Baggage type definition with the one exported from
ocap-kernel, which includes keys() for native iteration. This
eliminates the manual __storage_keys__ tracking in the baggage adapter.
Also replace local LaunchResult type with SubclusterLaunchResult from
ocap-kernel, and remove dead resuscitation guard in controller-vat
bootstrap.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Construct a fallback Logger in controller-vat when vatPowers.logger is
not provided, ensuring a real Logger is always passed downstream. This
makes the logger property non-optional in ControllerConfig,
Controller, ControllerStorage, and CapletController, eliminating
optional chaining on logger calls throughout.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nally

Instead of the caller manually binding each method, makeKernelFacet
now takes the kernel instance directly and iterates over a const array
of method names to bind them. This reduces the call site in Kernel.ts
from 12 lines to 1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the positional (resetStorage, mnemonicOrOptions) parameters
with a single options object. resetStorage defaults to true since
nearly every call site uses that value.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apply vitest eslint config to `**/test/**/*` in addition to
`**/*.test.ts` files, so non-test-named files under test directories
also get the right rules. Remove now-unnecessary eslint-disable
comments in system-vat.ts.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The omnium.caplet type was declared as Promisified<CapletControllerFacet>
but the implementation routes through queueMessage, returning raw
CapData instead of deserialized values. Replace with explicit method
signatures using QueueMessageResult, and add the missing
callCapletMethod and getCapletRoot methods.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On restart with an empty systemSubclusters array, the kernel facet was
never registered because provideFacet() was guarded by configs.length > 0.
Persisted run queue items targeting the kernel facet kref would cause
invokeKernelService to throw, crashing the kernel queue.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@rekmarks rekmarks force-pushed the rekm/system-vats-redux branch from 48ff5d1 to 31868d8 Compare February 6, 2026 23:04
sirtimid
sirtimid previously approved these changes Feb 9, 2026
Contributor

@sirtimid sirtimid left a comment

LGTM! 👏

Comment on lines +127 to +135
const mappings = this.#kernelStore.getAllSystemSubclusterMappings();
for (const [name, mappedSubclusterId] of mappings) {
if (mappedSubclusterId === subclusterId) {
this.#systemSubclusterRoots.delete(name);
this.#kernelStore.deleteSystemSubclusterMapping(name);
this.#logger.info(`Cleaned up system subcluster mapping "${name}"`);
break;
}
}
Contributor

@FUDCo FUDCo Feb 10, 2026

(A more general comment than for just the particular code fragment highlighted here:) Possibly a dumb question as I haven't entirely swapped the intended model into my understanding yet, but: why are the system subclusters managed as a special case here? That is, if I shut a kernel down and then restart it, it generally should come back up running the same stuff it was running before, so I'd think that all this code to manage the persistent knowledge of subclusters would be generic (indeed, I thought that was how it worked all along, so I would think there would already be stuff in there that does something like this).

Member Author

This extra bookkeeping exists because:

  • System subclusters are specified in a config object provided to Kernel.make(), where they are identified by name
  • After the system subclusters are launched, we store the subcluster -> name mapping for these subclusters for bookkeeping purposes
  • This enables us to:
    • Get system subcluster bootstrap vat root objects by name via the Kernel's API, which we do in omnium
    • Perform cleanup if the system subcluster config changes. For instance, if a previously configured and launched system subcluster foo is no longer in the config on kernel restart, that subcluster will be garbage collected using the subcluster -> name mapping.
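The bookkeeping described above can be sketched roughly as follows, assuming the persisted mapping is available as a map from system subcluster name to subcluster id. `reconcileSystemSubclusters` and its return shape are hypothetical and only illustrate the comparison between persisted names and configured names.

```typescript
// Hypothetical sketch; the kernel's actual store and manager APIs may differ.
type SubclusterId = string;

type Reconciliation = {
  restore: [string, SubclusterId][]; // persisted and still configured
  orphaned: [string, SubclusterId][]; // persisted but removed from the config
  launch: string[]; // configured but never launched before
};

function reconcileSystemSubclusters(
  persisted: ReadonlyMap<string, SubclusterId>,
  configuredNames: readonly string[],
): Reconciliation {
  const configured = new Set(configuredNames);
  const restore: [string, SubclusterId][] = [];
  const orphaned: [string, SubclusterId][] = [];
  persisted.forEach((subclusterId, name) => {
    if (configured.has(name)) {
      restore.push([name, subclusterId]);
    } else {
      orphaned.push([name, subclusterId]);
    }
  });
  const launch = configuredNames.filter((name) => !persisted.has(name));
  return { restore, orphaned, launch };
}
```

For example, if `foo` and `bar` were persisted and the new config lists only `bar` and `baz`, then `foo` lands in `orphaned` (to be deleted before its vats start), `bar` in `restore`, and `baz` in `launch`.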

…cet registration

The kernel facet now always takes ko3, shifting all vat root ko IDs
by 1 in tests that don't use system subclusters.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Comment on lines +268 to +279
provideFacet(): KernelFacet {
const existing = this.#kernelServiceManager.getKernelService('kernelFacet');
if (existing) {
return existing.service as KernelFacet;
}

const kernelFacet = makeKernelFacet(this);
this.#kernelServiceManager.registerKernelServiceObject(
'kernelFacet',
kernelFacet,
);
return kernelFacet;
Contributor

This follows the "provide" pattern, but the value it returns is not used at the one call site, and since it is only called that one time, I presume it's never used in a mode where existing has a value but instead is only used at initialization time, where it is always going to register the kernel service object. Is this in anticipation of some future use that you haven't gotten to yet, or is it the vestigial remnant of some earlier stage of development?

Member Author

We actually do call this in one location in kernel-browser-runtime, in kernel-captp.ts.

Member Author

I suppose we could switch to a private initializer and a public getter, but I don't know if that's actually any cleaner than what we have.
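For context on the "provide" pattern discussed in this thread, here is a minimal standalone sketch of get-or-create semantics. `ServiceRegistry`, `provide`, and `makeFacet` are hypothetical names and do not reproduce `KernelServiceManager`.

```typescript
// Hypothetical sketch of the "provide" (get-or-create) pattern.
class ServiceRegistry {
  private readonly services = new Map<string, object>();

  // Returns the existing service for `name`, or creates and registers it once.
  provide<Service extends object>(name: string, make: () => Service): Service {
    const existing = this.services.get(name);
    if (existing !== undefined) {
      return existing as Service;
    }
    const created = make();
    this.services.set(name, created);
    return created;
  }
}

let makeCalls = 0;
const registry = new ServiceRegistry();
const makeFacet = (): { ping: () => string } => {
  makeCalls += 1;
  return { ping: () => 'pong' };
};

// Both call sites get the same object; the factory runs only once.
const first = registry.provide('kernelFacet', makeFacet);
const second = registry.provide('kernelFacet', makeFacet);
```

The alternative mentioned above, a private initializer plus a public getter, would split this into two methods; the single `provide` entry point trades that separation for call sites that need not know whether initialization has already happened.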

Contributor

@FUDCo FUDCo left a comment

Approving, but with one minor issue that you might want to consider addressing before merging. If you do, let me know and I'll promise a quick re-review and approval cycle.

* @returns The kernel facet.
*/
provideFacet(): KernelFacet {
const existing = this.#kernelServiceManager.getKernelService('kernelFacet');
Contributor

Probably not a blocker since we are still in dev mode, but worth attending to: this puts the 'kernelFacet' kernel service into the same namespace with the other kernel services, but this will make it available to any subcluster's bootstrap method, which is almost certainly something we don't want. I think we want to partition kernel services into system mode and user mode services, such that only system subclusters can have access to the former.

Member Author

Good catch. That's bad, but it doesn't leave us in an invalid state, so I'll address it via #832 in a follow-up.
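One possible shape for the system-mode versus user-mode split suggested in this thread is sketched below. This is purely an assumption for discussion: `PartitionedServices`, `ServiceMode`, and `resolveFor` are hypothetical, and nothing here reflects what #832 actually implements.

```typescript
// Hypothetical sketch of partitioning kernel services by privilege level.
type ServiceMode = 'user' | 'system';

class PartitionedServices {
  private readonly entries = new Map<
    string,
    { mode: ServiceMode; service: object }
  >();

  register(name: string, service: object, mode: ServiceMode = 'user'): void {
    if (this.entries.has(name)) {
      throw new Error(`Kernel service "${name}" is already registered`);
    }
    this.entries.set(name, { mode, service });
  }

  // Resolves the services a subcluster asked for, refusing system-mode
  // services unless the requester is a system subcluster.
  resolveFor(
    requested: readonly string[],
    isSystemSubcluster: boolean,
  ): Record<string, object> {
    const granted: Record<string, object> = {};
    for (const name of requested) {
      const entry = this.entries.get(name);
      if (entry === undefined) {
        throw new Error(`No kernel service "${name}"`);
      }
      if (entry.mode === 'system' && !isSystemSubcluster) {
        throw new Error(
          `Kernel service "${name}" is restricted to system subclusters`,
        );
      }
      granted[name] = entry.service;
    }
    return granted;
  }
}
```

Under this sketch, registering `kernelFacet` with mode `'system'` would keep it out of reach of an ordinary subcluster's bootstrap method while leaving user-mode services unaffected.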

@rekmarks rekmarks added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 844671e Feb 10, 2026
44 checks passed
@rekmarks rekmarks deleted the rekm/system-vats-redux branch February 10, 2026 22:56

Labels

None yet

Projects

None yet

3 participants