
SeiNodeDeployment.spec.template.spec.peers changes don't propagate to existing child SeiNodes #243

@bdchatham

Problem

When a SeiNodeDeployment is updated to change `spec.template.spec.peers` (e.g., adding a new `ec2Tags` peer source alongside an existing `label` selector), the change does not propagate to existing child SeiNode resources. New pods rolled via image-drift NodeUpdate plans come up with the old peer config.

Concrete repro

sei-protocol/platform#525 added an `ec2Tags` peer source to `clusters/prod/protocol/pacific-1/node.yaml`:

peers:
  - ec2Tags:
      region: eu-central-1
      tags:
        ChainIdentifier: pacific-1
        Component: state-syncer
  - label:
      selector:
        sei.io/chain: pacific-1

After merge:

$ kubectl get snd -n pacific-1 node-0 -o jsonpath='{.spec.template.spec.peers}'
[{"ec2Tags":{...}},{"label":{...}}]   # both entries present ✓

$ kubectl get seinode -n pacific-1 node-0-0 -o jsonpath='{.spec.peers}'
[{"label":{...}}]                       # ec2Tags missing ✗

$ kubectl get seinode -n pacific-1 node-0-0 -o jsonpath='{.status.resolvedPeers}'
[<4 in-cluster .svc.cluster.local addresses>]   # no EC2 IPs ✗

The same PR also bumped `spec.image` from v6.4.1 to v6.4.3, which triggered a NodeUpdate plan that rolled the pod successfully, but the peer-config change did not ride along with it. Image-bump propagation works; peers-config propagation doesn't.

Impact

Operators can't update peer sources via the SND once child SeiNodes exist. The workaround would be deleting the child SeiNodes so the SND recreates them from the current template, which is destructive: it loses the PVC if deletionPolicy isn't Retain, and it loses peer-discovery continuity.

Relevant experts

  • kubernetes-specialist (planner / reconciler authority)

Proposed approach

Two paths to investigate:

  1. SND reconciler's spec-propagation scope. The SND controller emits `await-spec-update` tasks (visible in logs). Determine whether that task propagates the full `template.spec` to child SeiNodes or only the subset that touches the pod template (image, validator/fullNode mode, etc.). If it's a subset, decide whether to extend to peers or to introduce a separate spec-propagation surface.
  2. Child SeiNode spec is owned by the SND. If the SND's server-side apply (SSA) on child SeiNodes claims `spec.image` but not `spec.peers`, the existing field manager (whoever set peers on create) keeps ownership and the SND's new peer config never lands. Audit SSA field ownership on the SND→SeiNode write path.
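
For path 2, one possible starting point for the ownership audit is the child's `metadata.managedFields`. This is a minimal sketch, not the controller's code. It assumes the object JSON came from something like `kubectl get seinode -n pacific-1 node-0-0 -o json --show-managed-fields`. The helper name `managers_owning` and the manager names in the sample are hypothetical.

```python
import json


def managers_owning(obj: dict, path: list[str]) -> list[tuple[str, str]]:
    """Return (manager, operation) pairs whose fieldsV1 entry covers `path`.

    `path` is a list of plain field names, e.g. ["spec", "peers"].
    fieldsV1 prefixes each field name with "f:".
    """
    owners = []
    for entry in obj.get("metadata", {}).get("managedFields", []):
        node = entry.get("fieldsV1", {})
        for name in path:
            node = node.get(f"f:{name}")
            if node is None:
                break  # this manager does not own the field
        else:
            owners.append((entry.get("manager", ""), entry.get("operation", "")))
    return owners


# Hypothetical managedFields: manager names here are made up for illustration.
sample = json.loads("""
{
  "metadata": {
    "managedFields": [
      {"manager": "creator", "operation": "Update",
       "fieldsV1": {"f:spec": {"f:peers": {}}}},
      {"manager": "snd-controller", "operation": "Apply",
       "fieldsV1": {"f:spec": {"f:image": {}}}}
    ]
  }
}
""")

print(managers_owning(sample, ["spec", "peers"]))
print(managers_owning(sample, ["spec", "image"]))
```

If the manager the SND writes with appears for `spec.image` but not for `spec.peers`, that would be consistent with the ownership hypothesis. If it appears for both, path 1 is the more likely culprit.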

Acceptance criteria

  • SND spec.template.spec.peers updates propagate to existing child SeiNode.spec.peers on the next reconcile
  • Test demonstrates: add an ec2Tags peer to an existing SND → child SeiNode picks it up → status.resolvedPeers includes the EC2 IPs
  • Existing image-bump propagation path remains untouched
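
The first two criteria could be checked with something like the sketch below. It works on the JSON of the two objects and uses only the field paths shown in the repro (`spec.template.spec.peers`, `spec.peers`, `status.resolvedPeers`). The helper names, the sample address, and the assumption that an EC2-sourced peer is any resolved address not containing `.svc.cluster.local` are illustrative, not taken from the controller.

```python
def peers_in_sync(snd: dict, child: dict) -> bool:
    """True when the child's spec.peers equals the SND template's peers.

    List comparison is order-sensitive, so a reordered list counts as drift.
    """
    want = snd.get("spec", {}).get("template", {}).get("spec", {}).get("peers", [])
    have = child.get("spec", {}).get("peers", [])
    return want == have


def has_external_peer(child: dict) -> bool:
    """True when any resolved peer is not an in-cluster service address."""
    resolved = child.get("status", {}).get("resolvedPeers", [])
    return any(".svc.cluster.local" not in addr for addr in resolved)


# The two peer sources from the repro.
ec2_peer = {"ec2Tags": {"region": "eu-central-1",
                        "tags": {"ChainIdentifier": "pacific-1",
                                 "Component": "state-syncer"}}}
label_peer = {"label": {"selector": {"sei.io/chain": "pacific-1"}}}

snd = {"spec": {"template": {"spec": {"peers": [ec2_peer, label_peer]}}}}

# State observed in the repro: the child still has only the label source.
# The resolved address is a hypothetical placeholder.
stale_child = {
    "spec": {"peers": [label_peer]},
    "status": {"resolvedPeers": ["node-0-1.pacific-1.svc.cluster.local"]},
}

print(peers_in_sync(snd, stale_child))     # False: ec2Tags never reached the child
print(has_external_peer(stale_child))      # False: only in-cluster addresses
```

A real test would read both objects from the cluster after a reconcile and expect both checks to return true.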

Out of scope

  • Whether `label` selector and `ec2Tags` peer sources interact correctly when both are present at the SeiNode level (presumed to compose — the rpc-fleet pattern uses both). Test should verify but the union-vs-replace semantics aren't being changed here.

References

  • sei-protocol/platform#525 — the PR that surfaced this (pacific-1 node.yaml peer config addition rolled without propagating)
