[DCN] Add tests for cinderBackups#3962

Closed
gais-ameer-rh wants to merge 22 commits into
openstack-k8s-operators:mainfrom
gais-ameer-rh:OSPRH-28343

@gais-ameer-rh gais-ameer-rh commented May 26, 2026

spec.cinder.template.cinderBackup (singular) in the DCN DT is
replaced with cinderBackups (plural) to deploy multiple cinder backups per edge site.
The hooks/playbooks/dz_storage_cinder_backups.yaml playbook is invoked to validate the behaviour of cinderBackups in the DCN scenario.

The playbook tests different scenarios of cinder backup creation
and restoring the backups across availability zones.

Jira: OSPRH-28343
Signed-off-by: Gais Ameer <gameer@redhat.com>

gais-ameer-rh and others added 22 commits May 19, 2026 16:18
spec.cinder.template.cinderBackup (singular) in the DZ-Storage DT is
replaced with cinderBackups (plural) to deploy multiple cinder backups based on AZ topology.
hooks/playbooks/dz_storage_cinder_backups.yaml validates the behaviour of cinderBackups.

The playbook tests different scenarios of cinder backup creation
and restoring the backups across availability zones.

Jira: OSPRH-28342

Signed-off-by: Gais Ameer <gameer@redhat.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add an optional `cipher` parameter (choices: aes, aes256k; default: aes)
to the `cephx_key` Ansible module so CI jobs can generate AES-256k
(32-byte, type=2) CephX keys.

- Refactor __create_cephx_key() to accept cipher argument; use
  key_type=2 and os.urandom(32) for aes256k, key_type=1 and
  os.urandom(16) for aes (default, backward compatible).
- Update DOCUMENTATION, EXAMPLES and RETURN docstrings.
- Update the "Generate a cephx key" task in hooks/playbooks/ceph.yml
  to pass `cipher: "{{ cifmw_ceph_key_cipher | default('aes') }}"`,
  allowing scenarios to opt in via a single variable.
- Add tests/unit/modules/test_cephx_key.py with 8 tests covering both
  cipher modes, invalid input, base64 validity, and key randomness.

Jira: OSPRH-29667

Signed-off-by: John Fulton <fulton@redhat.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
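
The cipher selection described in the cephx_key commit above could look roughly like the sketch below. It is not the module's actual code: the function name `create_cephx_key`, the `CIPHERS` table, and the packed header layout are assumptions made for illustration. Only the key types (1 and 2) and key lengths (16 and 32 bytes) come from the commit message.

```python
import base64
import os
import struct
import time

# Cipher name -> (key_type, key length in bytes), as stated in the commit message.
CIPHERS = {"aes": (1, 16), "aes256k": (2, 32)}


def create_cephx_key(cipher="aes"):
    """Return a base64-encoded CephX secret for the given cipher (sketch)."""
    if cipher not in CIPHERS:
        raise ValueError(f"unsupported cipher: {cipher!r}")
    key_type, key_len = CIPHERS[cipher]
    key = os.urandom(key_len)
    # Assumed header layout (little-endian): key type (int16),
    # creation seconds (int32), nanoseconds (int32), key length (int16).
    header = struct.pack("<hiih", key_type, int(time.time()), 0, len(key))
    return base64.b64encode(header + key).decode("utf-8")
```

Calling `create_cephx_key()` with no argument keeps the 16-byte behaviour, which is how the default stays backward compatible; `create_cephx_key("aes256k")` opts in to the longer key.
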
After controller-0 reboots, SSH may become transiently unreachable
while background system initialization completes. The configure_controller
block delegates tasks to controller-0 without verifying connectivity
first, causing intermittent UNREACHABLE failures on the install_ca task.

Add wait_for_connection at the start of the delegate_to block to ensure
controller-0 is fully accessible before proceeding.

Signed-off-by: Miguel Angel Nieto Jimenez <mnietoji@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The containers-built.log contains a list of images as registry/namespace/repository.
The containers.podman.podman_image module treats a two-part dest (host/namespace)
as a prefix and appends the entire local image name, which produced nested
repo names on quay.rdoproject.org.
Pass a complete dest (registry, namespace, repository basename, and tag)
so Podman pushes to the intended repository.

The commit also includes changes that remove calls to {{ item }} when
loop is used and stop piping `cat` into `grep`, since `grep` can be
called directly without `cat`.
It also renames tasks whose names contained the wrong script name,
which was confusing.

Signed-off-by: Daniel Pawlik <dpawlik@redhat.com>
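
To illustrate the dest fix in the commit above, here is a minimal sketch of composing a complete push destination from a `registry/namespace/repository` entry. The helper name `build_push_dest` and the example namespace and tag are hypothetical, and the sketch assumes log entries carry at most a trailing `:tag`. The real change lives in Ansible tasks, not in Python.

```python
def build_push_dest(image, registry, namespace, tag):
    """Compose registry/namespace/repository:tag from a built-image entry (sketch)."""
    # Keep only the repository basename so the source registry and
    # namespace are not nested under the target namespace.
    basename = image.strip().rsplit("/", 1)[-1]
    # Drop a trailing tag if the entry happens to carry one (assumption).
    repository = basename.split(":", 1)[0]
    return f"{registry}/{namespace}/{repository}:{tag}"


# Hypothetical values, for illustration only.
dest = build_push_dest(
    "registry.example.com/source-ns/openstack-nova-api",
    "quay.rdoproject.org",
    "example-namespace",
    "current",
)
```
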
This commit adds an extra parameter for horizontest to modify the
projecttext xpath based on the upstream or downstream dashboard theme.

Signed-off-by: Jan Jasek <jjasek@redhat.com>
PCP hook playbooks hardcode hosts: all,!localhost, which causes
ansible-playbook to exit rc=2 when compute nodes are unreachable
(e.g. not yet provisioned or already torn down). This kills the
CI job for a metrics side-effect.

Introduce cifmw_pcp_metrics_hosts (default: all,!localhost) so that
downstream jobs can exclude host groups that are expected to be
unavailable. By not targeting those hosts at all, Ansible never
encounters UNREACHABLE errors and exits cleanly.

Co-Authored-By: Claude <noreply@anthropic.com>

Signed-off-by: Roberto Alfieri <ralfieri@redhat.com>
…e repo in all NFV templates

Extend the fix from fffa721 (HCI template) to all remaining NFV
templates. Preserve the complete node configuration (ansibleHost,
networks, fixedIP) from the architecture repository instead of
overwriting it with just hostName.

Affected templates: ovs-dpdk, ovs-dpdk-sriov, ovs-dpdk-sriov-ipv6,
sriov, ovs-dpdk-sriov-2nodesets, ovs-dpdk-sriov-ipv6-2nodesets,
and ovs-dpdk-sriov-networker.

Signed-off-by: Miguel Angel Nieto Jimenez <mnietoji@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Because `cifmw_pcp_metrics_hosts` was defined in the pcp_metrics role's
defaults file, the `pcp-metrics-pre.yml` playbook was trying to use this
var before it was imported, leading to `'cifmw_pcp_metrics_hosts' is
undefined` errors.

Signed-off-by: Michael Burke <michburk@redhat.com>
Add focused unit coverage for CRI-O pulled report verification, including
successful enrichment, cross-node evidence accounting, and failure when no logs
are provided. Reuse shared test utilities with a local fallback so tests run
in both collection-style and local environments.

Co-authored-by: Cursor <cursoragent@cursor.com>
Signed-off-by: nemarjan <nemarjan@redhat.com>
Add structured documentation for AI coding agents working on this
repository. AGENTS.md provides repo-wide context: variable naming
rules, generated file warnings, testing commands, commit conventions,
and repository layout. CLAUDE.md adds Claude-specific behavioral
guidance and references AGENTS.md as the primary source of truth.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Roberto Alfieri <ralfieri@redhat.com>
New reproducer for the dt-sharded deployment topology which uses
per-service dedicated galera, rabbitmq and memcached clusters.
Based on dt-vhosts-compact with designateext network removed (not used by
dt-sharded) and ceph/service-values paths adjusted.

Signed-Off-By: Luca Miccini <lmiccini@redhat.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Enable systemd predictable network interface naming inside guest VMs
by removing net.ifnames=0 from kernel args via virt-customize. This
gives guests consistent PCI-topology-based names (enp1s0, enp2s0, etc.)
instead of legacy ethN naming. Predictable network interface names are
a requirement for testing Leapp upgrade functionality.

Controlled by cifmw_libvirt_manager_predictable_nic_names (defaults
to false).

Jira: OSPRH-29381

Co-Authored-By: Lukas Bezdicka <lbezdick@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sergii Golovatiuk <sgolovat@redhat.com>
…asks

Long-running tasks like `oc adm wait-for-stable-cluster` after certificate
rotation can cause SSH connections to be dropped by intermediate network
devices (firewalls, NAT) due to inactivity. Add ServerAliveInterval and
ServerAliveCountMax to keep the connection alive.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Sergii Golovatiuk <sgolovat@redhat.com>
cifmw_cephadm_log_path uses ansible_user_dir which resolves to
/root on compute nodes running with become: true. The post.yml
and logs.yml tasks delegate_to: localhost to write log files, but
on the controller the zuul user cannot create directories under
/root.

Use cifmw_basedir (always /home/zuul/ci-framework-data in CI)
with a fallback to the original expression for non-CI contexts.

Related-Issue: ANVIL-109
Co-authored-by: Cursor <cursoragent@cursor.com>

Signed-off-by: Roberto Alfieri <ralfieri@redhat.com>
A couple of spelling and formatting errors weren't caught in their
original PRs. This adds some words to the dictionary and fixes an error
where an asterisk needed to be escaped.

Signed-off-by: Michael Burke <michburk@redhat.com>
Deploy MinIO as a lightweight S3-compatible object store for use
as the Velero backup target in development and CI environments.

Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Install and configure the OADP (OpenShift API for Data Protection)
operator with an S3-compatible storage backend, create the
DataProtectionApplication CR, set up VolumeSnapshotClass for CSI
snapshots, and verify the BackupStorageLocation is available.

Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Orchestrate backup, restore, and cleanup of OpenStack control plane
and data plane resources, including Galera database dumps, Velero CSI
volume snapshots, and ordered multi-phase restore sequences.

Also adds playbooks (backup_restore.yaml) and integrates backup and
restore into the post-deployment pipeline.

Signed-off-by: Andrew Bays <abays@redhat.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
This PR switches the baremetal and end-to-end jobs to OCP 4.20.

Jobs:
baremetal and end-to-end

Signed-off-by: Bhagyashri Shewale <bshewale@redhat.com>
- Wait for compute services and network agents to be ready with
  retry loops before proceeding to workload validation, preventing
  tempest from running against a partially recovered control plane
- Delete test-operator CRs (Tempest, Tobiko, AnsibleTest, HorizonTest)
  at the beginning of cleanup while controllers and dependencies are
  still running, so finalizers get processed properly
- Wait for test-operator pods to terminate after CR deletion
- Adapt GaleraRestore pod discovery to the shortened resource names
  from mariadb-operator which drops the galera instance name prefix
  from generated resources (restore-<name> instead of
  <galera>-restore-<name>). Uses the galerarestore/name label selector
  when available, with fallback to the old naming convention so this
  change can land independently of the mariadb-operator PR
- Increase control plane ready timeout from 10m to 30m
- Fix loop_var collision with _delete_all_of_kind.yml

Related-To: openstack-k8s-operators/mariadb-operator#463

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Martin Schuppert <mschuppert@redhat.com>
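
The GaleraRestore pod discovery described in the commit above (label selector first, old naming convention as fallback) can be sketched as a pure function over pod objects. The function name `find_restore_pods`, the dict shape (as in `oc get pods -o json` items), and the assumption that the `galerarestore/name` label value equals the restore name are all illustrative, not taken from the actual tasks.

```python
def find_restore_pods(pods, restore_name, galera_name):
    """Return names of pods that belong to a GaleraRestore (sketch)."""
    # Preferred: pods labelled galerarestore/name=<restore_name>
    # (assumed label value), as newer mariadb-operator builds provide.
    labelled = [
        pod["metadata"]["name"]
        for pod in pods
        if (pod["metadata"].get("labels") or {}).get("galerarestore/name")
        == restore_name
    ]
    if labelled:
        return labelled
    # Fallback: old naming convention <galera>-restore-<name>.
    old_prefix = f"{galera_name}-restore-{restore_name}"
    return [
        pod["metadata"]["name"]
        for pod in pods
        if pod["metadata"]["name"].startswith(old_prefix)
    ]
```

Checking the label first and only then falling back to the name prefix is what lets the same logic work against both operator versions.
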
…H shuffling

The OpenStackBaremetalSet operator assigns BMHs to nodeset hostnames
non-deterministically (OSPRH-10282), causing compute-0 to get the
network config of compute-1 and vice versa. Add bmhLabelSelector
with nodeName per node so each edpm-compute-X is deterministically
bound to the BMH named compute-X.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Miguel Angel Nieto Jimenez <mnietoji@redhat.com>
spec.cinder.template.cinderBackup (singular) in the DCN DT is
replaced with cinderBackups (plural) to deploy multiple cinder backups per edge site.
The hooks/playbooks/dz_storage_cinder_backups.yaml playbook is invoked to validate the behaviour of cinderBackups in the DCN scenario.

The playbook tests different scenarios of cinder backup creation
and restoring the backups across availability zones.

openshift-ci Bot commented May 26, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


openshift-ci Bot commented May 26, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign brjackma for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


openshift-ci Bot commented May 26, 2026

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@centosinfra-prod-github-app

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-k8s-operators/ci-framework for 3962,cb79f67edd668c2a62ff0a6a9092c62712246c32


fultonj commented May 27, 2026

obsoleted by #3963
