chore(ci): harden nested e2e module setup #2349

Open
universal-itengineer wants to merge 1 commit into main from chore/ci/nested-e2e-add-retries

Conversation

@universal-itengineer
Member

@universal-itengineer universal-itengineer commented May 14, 2026

Description

This PR hardens the nested e2e setup flow by moving module setup logic from the reusable workflow into dedicated bash helpers under .github/scripts/bash/e2e.

The new helpers cover:

  • enable-sdn.sh: SDN ModuleConfig creation, module readiness, workloads, and SDN admission endpoint readiness.
  • apply-clusternetworks.sh: retry wrapper for ClusterNetwork apply while SDN admission becomes ready.
  • configure-virtualization.sh: ModuleSource creation, Deckhouse queue waits, deckhouse-dev source propagation, and Virtualization ModuleConfig/ModulePullOverride apply.
  • common.sh and deckhouse.sh: shared error/env handling and Deckhouse diagnostics/queue wait helpers.

Shell scripts are checked with shellcheck -x and bash -n.
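A check of this kind could be scripted roughly as follows. This is a minimal sketch, not the PR's actual CI step: the function name `lint_scripts` is hypothetical, and skipping `shellcheck` when it is not installed is an assumption made so the sketch runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: lint_scripts is a hypothetical helper name, not part of this PR.
# Runs `bash -n` (syntax check) and, if available, `shellcheck -x` on every
# *.sh file in the given directory. Returns non-zero if any check fails.
lint_scripts() {
  local dir="$1" rc=0 script
  for script in "$dir"/*.sh; do
    [ -e "$script" ] || continue          # glob matched nothing
    bash -n "$script" || rc=1             # parse without executing
    # Assumption: shellcheck is optional here; a real CI job would likely
    # fail instead of silently skipping it.
    if command -v shellcheck >/dev/null 2>&1; then
      shellcheck -x "$script" || rc=1     # -x follows sourced files
    fi
  done
  return "$rc"
}

# Example (path taken from the description above):
#   lint_scripts .github/scripts/bash/e2e
```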

Why do we need it, and what problem does it solve?

Nested e2e setup can fail while Deckhouse is still processing module changes after bootstrap. One observed failure happened when virtualization was configured with source: deckhouse-dev before that source was available for the module.

Another failure mode is invalid registry credentials for deckhouse-dev: Deckhouse reports a registry authentication problem in ModuleSource.status.message, and the later admission failure can look like an unavailable source. The new script checks ModuleSource state first and reports a safe 401 Unauthorized classification without printing the full status message or dockerCfg contents.
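The "safe classification" idea can be illustrated with a small sketch. The function name, the returned labels, and the matched substrings are all assumptions for illustration; the actual script in this PR may match different text. The point is that only a fixed label is printed, never the status message itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a ModuleSource status message to a fixed label
# without echoing the message, so registry details never reach the CI log.
classify_module_source_message() {
  local message="$1"
  case "$message" in
    *401*|*[Uu]nauthorized*) echo "registry-auth-401-unauthorized" ;;
    "")                      echo "ok" ;;
    *)                       echo "other-error" ;;
  esac
}

# Example (requires cluster access, shown only as a comment):
#   msg="$(kubectl get modulesource deckhouse-dev -o jsonpath='{.status.message}')"
#   echo "ModuleSource state: $(classify_module_source_message "$msg")"
```

Returning a label instead of the raw message trades some diagnostic detail for the guarantee that credentials embedded in an error string cannot leak.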

What is the expected result?

Rerunning the nested e2e workflow should be more stable during module setup:

  • SDN setup waits for module/workload/admission readiness before ClusterNetwork configuration.
  • ClusterNetwork apply retries while the SDN admission endpoint is becoming ready.
  • Virtualization setup waits for deckhouse-dev ModuleSource, empty Deckhouse queue, and source propagation before applying the module config.
  • CI logs include actionable diagnostics for module setup and registry authentication failures without leaking credentials.
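The retry behavior described above could look roughly like the following. This is a sketch under assumptions: `retry` and its arguments are illustrative names, and the attempt count, delay, and manifest filename in the usage comment are made up rather than taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the PR's implementation.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
# Re-runs the command until it succeeds or max_attempts is reached.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "ERROR: command failed after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt}/${max_attempts} failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}

# Example (assumed values and filename):
#   retry 30 10 kubectl apply -f clusternetworks.yaml
```

A bounded retry keeps the job from hanging forever if the admission endpoint never becomes ready, while still absorbing the short window after the module is enabled.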

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: ci
type: fix
summary: "Harden nested e2e module setup against Deckhouse module propagation and admission readiness races."

@universal-itengineer universal-itengineer marked this pull request as ready for review May 14, 2026 16:39
@universal-itengineer universal-itengineer added this to the v1.9.0 milestone May 14, 2026
@universal-itengineer universal-itengineer force-pushed the chore/ci/nested-e2e-add-retries branch from ee9c69b to 87f3829 Compare May 14, 2026 17:11
Signed-off-by: Nikita Korolev <nikita.korolev@flant.com>
@universal-itengineer universal-itengineer force-pushed the chore/ci/nested-e2e-add-retries branch from 87f3829 to 8d247a6 Compare May 14, 2026 17:23
