feat: Add system subclusters and kernel facet service#803
Conversation
f58f8c0 to
079a10a
Compare
|
@cursor review |
|
@cursor review |
2d79ecf to
0a66993
Compare
|
@cursor review |
| // TODO: Remove this define block and add a process shim to VatSupervisor | ||
| // workerEndowments instead. This injects into ALL bundles but is only needed | ||
| // for libraries like immer that check process.env.NODE_ENV. | ||
| define: { | ||
| 'process.env.NODE_ENV': JSON.stringify('production'), | ||
| }, |
There was a problem hiding this comment.
Going to address in a follow-up. Requires changes to how we bundle vats with Vite best not added to this PR.
| // Map of allowed global names to their values | ||
| const allowedGlobals: Record<string, unknown> = { | ||
| Date: globalThis.Date, | ||
| }; |
Stack
Managed by gh-stack |
Implement system vats that are launched at kernel initialization and have access to privileged kernel services. Key changes: - Add SystemVatConfig type and getSystemVatRoot method to Kernel - Launch system vats after queue starts to avoid deadlock - Terminate and relaunch existing system vat subclusters on restart - Add bootstrap-vat.js for Omnium system services with CapletController - Add baggage-backed storage adapter for vat persistence - Pass systemVats config via URL params from offscreen to kernel worker - Update background.ts to use system vat for caplet operations - Add process.env.NODE_ENV replacement in vat bundler for SES compatibility - Simplify kernel-facet.ts by removing SystemVatManager - Add duplicate name check in KernelServiceManager.registerKernelServiceObject Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Rename bootstrap-vat.js to bootstrap-vat.ts with full type annotations - Export Baggage type from baggage-adapter.ts - Make logger optional throughout controller hierarchy - Simplify defineMethods to take array of method names instead of object map - Update background.ts to use simplified method names (install, uninstall, etc.) - Update package.json build script to reference .ts file Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
reloadSubcluster() creates a new subcluster with a new ID, but was not updating #systemSubclusterRoots or the persisted systemSubcluster.* mappings. This left stale mappings that caused 'has no bootstrap vat' errors on subsequent kernel restarts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…o SubclusterManager System subcluster state and logic (persist/restore/cleanup mappings, launch new named subclusters, track roots) belongs in SubclusterManager which already owns subcluster CRUD, termination, and reload. This moves ~140 lines out of Kernel.ts into SubclusterManager, keeping Kernel as a thin orchestration layer that delegates to its managers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…registration The kernelFacet kernel service now takes ko3, shifting all vat root ko IDs by 1. Update hardcoded ko references in control-panel, object-registry, and remote-comms e2e tests accordingly. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…usters reloadAllSubclusters bypasses reloadSubcluster and has its own loop that calls addSubcluster + launchVatsForSubcluster directly, so it never updated the in-memory systemSubclusterRoots map or persisted mappings. After a reload-all, getSystemSubclusterRoot() would return stale krefs pointing to deleted objects. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace local Baggage type definition with the one exported from ocap-kernel, which includes keys() for native iteration. This eliminates the manual __storage_keys__ tracking in the baggage adapter. Also replace local LaunchResult type with SubclusterLaunchResult from ocap-kernel, and remove dead resuscitation guard in controller-vat bootstrap. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Construct a fallback Logger in controller-vat when vatPowers.logger is not provided, ensuring a real Logger is always passed downstream. This makes the logger property non-optional in ControllerConfig, Controller, ControllerStorage, and CapletController, eliminating optional chaining on logger calls throughout. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…nally Instead of the caller manually binding each method, makeKernelFacet now takes the kernel instance directly and iterates over a const array of method names to bind them. This reduces the call site in Kernel.ts from 12 lines to 1. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace the positional (resetStorage, mnemonicOrOptions) parameters with a single options object. resetStorage defaults to true since nearly every call site uses that value. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Apply vitest eslint config to `**/test/**/*` in addition to `**/*.test.ts` files, so non-test-named files under test directories also get the right rules. Remove now-unnecessary eslint-disable comments in system-vat.ts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The omnium.caplet type was declared as Promisified<CapletControllerFacet> but the implementation routes through queueMessage, returning raw CapData instead of deserialized values. Replace with explicit method signatures using QueueMessageResult, and add the missing callCapletMethod and getCapletRoot methods. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
On restart with an empty systemSubclusters array, the kernel facet was never registered because provideFacet() was guarded by configs.length > 0. Persisted run queue items targeting the kernel facet kref would cause invokeKernelService to throw, crashing the kernel queue. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
48ff5d1 to
31868d8
Compare
| const mappings = this.#kernelStore.getAllSystemSubclusterMappings(); | ||
| for (const [name, mappedSubclusterId] of mappings) { | ||
| if (mappedSubclusterId === subclusterId) { | ||
| this.#systemSubclusterRoots.delete(name); | ||
| this.#kernelStore.deleteSystemSubclusterMapping(name); | ||
| this.#logger.info(`Cleaned up system subcluster mapping "${name}"`); | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
(A more general comment than for just the particular code fragment highlighted here:) Possibly a dumb question as I haven't entirely swapped the intended model into my understanding yet, but: why are the system subclusters managed as a special case here? That is, if I shut a kernel down and then restart it, it generally should come back up running the same stuff it was running before, so I'd think that all this code to manage the persistent knowledge of subclusters would be generic (indeed, I thought that was how it worked all along, so I would think there would already be stuff in there that does something like this).
There was a problem hiding this comment.
This extra bookkeeping exists because:
- System subclusters are specified in a config object provided to
Kernel.make(), where they are identified by name - After the system subclusters are launched, we store the
subcluster -> namemapping for these subclusters for bookkeeping purposes - This enables us to:
- Get system subcluster boostrap vat root objects by name via the Kernel's API, which we do in omnium
- Perform cleanup if the system subcluster config changes. For instance, if a previously configured and launched system subcluster
foois no longer in the config on kernel restart, that subcluster will be garbage collected using thesubcluster -> namemapping.
…cet registration The kernel facet now always takes ko3, shifting all vat root ko IDs by 1 in tests that don't use system subclusters. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| provideFacet(): KernelFacet { | ||
| const existing = this.#kernelServiceManager.getKernelService('kernelFacet'); | ||
| if (existing) { | ||
| return existing.service as KernelFacet; | ||
| } | ||
|
|
||
| const kernelFacet = makeKernelFacet(this); | ||
| this.#kernelServiceManager.registerKernelServiceObject( | ||
| 'kernelFacet', | ||
| kernelFacet, | ||
| ); | ||
| return kernelFacet; |
There was a problem hiding this comment.
This follows the "provide" pattern, but the value it returns is not used at the one call site, and since it is only called that one time, I presume it's never used in a mode where existing has a value but instead is only used at initialization time where it always going to register the kernel service object. Is this in anticipation of some future use that you haven't gotten to yet or is it the vestigial remnant of some earlier stage of development?
There was a problem hiding this comment.
We actually do call this in one location in kernel-browser-runtime, in kernel-captp.ts.
There was a problem hiding this comment.
I suppose we could switch to a private initializer and a public getter, but I don't know if that's actually any cleaner than what we have.
FUDCo
left a comment
There was a problem hiding this comment.
Approving, but with one minor issue that you might want to consider addressing before merging. If you do, let me know and I'll promise a quick re-review and approval cycle.
| * @returns The kernel facet. | ||
| */ | ||
| provideFacet(): KernelFacet { | ||
| const existing = this.#kernelServiceManager.getKernelService('kernelFacet'); |
There was a problem hiding this comment.
Probably not a blocker since we are still in dev mode, but worth attending to: this puts the 'kernelFacet' kernel service into the same namespace with the other kernel services, but this will make it available to any subcluster's bootstrap method, which is almost certainly something we don't want. I think we want to partition kernel services into system mode and user mode services, such that only system subclusters can have access to the former.
There was a problem hiding this comment.
Good catch. That's bad, but doesn't leave us an invalid state, so I'll address via #832 in a follow-up.
Adds support for "system subclusters" - statically declared subclusters that are launched at kernel initialization and persist across kernel restarts. System subclusters can receive powerful kernel services not available to normal vats via the
KernelFacet. In summary:globalsconfig to allow vats to receive specific globals (likeDate) in their SES Compartmentname -> idmapping for subcluster vats, which facilitates identifying the bootstrap vat of a launched subclusterNote
High Risk
Touches kernel initialization, message routing, and kernel-service invocation semantics (now non-awaited) and adds new persisted system-subcluster state, so regressions could affect boot behavior, crank processing, and subcluster/vat lifecycle.
Overview
Adds system subclusters: statically configured subclusters that are restored on boot, launched after the run queue starts, can be looked up via
getSystemSubclusterRoot, and are cleaned up when configs are removed (including orphan deletion before vat startup and mapping updates on reload/reset).Replaces the browser-runtime CapTP “kernel facade” with a first-class
KernelFacetkernel service (exported from@metamask/ocap-kernel) and updates CapTP endpoints, RPC (launchSubclusterresult now usesrootKref), and Omnium’s background to interact via the facet and a controller system vat.Extends vat configuration with an allowlisted
globalsarray to inject specific globals (e.g.,Date) into SES worker endowments, and refactors subcluster storage from a vat-id array to a name→vat-id record (requiringVatManager.launchVat/store APIs to accept vat names) with broad test updates plus new Node e2e coverage for system-subcluster behavior and persistence.Written by Cursor Bugbot for commit 4763784. This will update automatically on new commits. Configure here.