WKBCH-25: Complete CRR support#77

Open
dvasilas wants to merge 3 commits into main from improvement/complete-crr

Conversation

@dvasilas
Contributor

Builds on #49 to make replication actually fire end-to-end.

  • Single cloudserver instance. Source and destination buckets owned by the same account.
  • The replication role is created in testaccount (the bucket-owning account) via vault's accountSeeds.

Adds a helper script (scripts/enable-crr.sh) that creates the source and destination buckets, enables versioning, and applies the replication policy.
Idempotent for CI use.
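The script body isn't included in this thread; a minimal sketch of the idempotency pattern it describes might look like the following. `bucket_exists`/`create_bucket` are hypothetical stand-ins for the real S3 calls (e.g. `aws s3api head-bucket` / `aws s3api create-bucket`), and the bucket name is illustrative:

```shell
#!/usr/bin/env bash
# Sketch of an idempotent bucket-setup step, safe to re-run in CI.
# bucket_exists/create_bucket are local stand-ins for real S3 calls.
set -euo pipefail

STATE_DIR="$(mktemp -d)"
bucket_exists() { [ -e "${STATE_DIR}/$1" ]; }
create_bucket() { touch "${STATE_DIR}/$1"; }

ensure_bucket() {
  local name="$1"
  if bucket_exists "${name}"; then
    echo "bucket ${name} already exists, skipping"
  else
    create_bucket "${name}"
    echo "created bucket ${name}"
  fi
}

# Re-running the script is a no-op the second time around; the versioning
# and replication-policy steps would follow the same check-before-create guard.
ensure_bucket source-bucket
ensure_bucket source-bucket
```

Guarding each step with an existence check (rather than ignoring "already exists" errors wholesale) keeps `set -e` in force, so genuine failures still abort the CI job.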

PR #49 added the CRR plumbing but didn't create the IAM role backbeat
assumes for replication. Add it in the data account via vault
accountSeeds so role-issued credentials authorize source and destination
S3 calls.

Ship scripts/enable-crr.sh to create the buckets and apply the
replication policy idempotently.
Comment thread cmd/config.go
@claude

claude Bot commented Apr 28, 2026

  • cmd/config.go:65 — HostUID/HostGID initialization is bypassed by the early return on an empty path in LoadEnvironmentConfig, so both fields would default to 0 (root). Move the initialization into DefaultEnvironmentConfig() so both code paths are covered.

    Rest of the PR looks solid: replication-role seed, admin-backbeat credentials file, kafka topic name fix (backbeat-metrics-group-crr → backbeat-metrics, to match backbeat config), set +e guard for zookeeper idempotency, and user: HOST_UID:HOST_GID on setup containers are all correct.

    Review by Claude Code

Both containers write into host-mounted directories.
Because they run as root, they leave root-owned files behind.
@dvasilas force-pushed the improvement/complete-crr branch from 15800df to 705326d on April 29, 2026 13:07
@claude

claude Bot commented Apr 29, 2026

LGTM — changes are internally consistent and well-structured. Replication role, admin credentials file, Kafka topic rename, host UID/GID mapping, and the enable-crr.sh helper all line up correctly across the touched files. The set +e guard for zookeeper-shell.sh NodeExists is appropriate, and the new setup-kafka dependency on backbeat closes a real race condition.

Review by Claude Code

  "port": 8500,
  "adminPort": 8600,
- "adminCredentialsFile": "/home/scality/backbeat/node_modules/vaultclient/tests/utils/admincredentials.json"
+ "adminCredentialsFile": "/conf/admin-backbeat.json"
Contributor


What's the diff between the two? Or was there some permissions issue?

Contributor Author


It's the same content; I just thought it's a bit better for future readers to have the file here.

Comment thread cmd/config.go

func LoadEnvironmentConfig(path string) (EnvironmentConfig, error) {
cfg := DefaultEnvironmentConfig()
cfg.HostUID = os.Getuid()
Contributor


I remember messing around with this but it was for the logs/volumes with the host, anything related to CRR in this case?

Contributor Author


Not related to CRR, just a side-fix for an issue I noticed (705326d).

It's about file permissions in volumes on the host: because the containers run as root, files they write into volumes are owned by root and need sudo to remove.
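As a hedged sketch of the mechanism being discussed (the exact variable wiring in the compose templates is assumed, not shown in this thread): export the invoking user's UID/GID so the `user: HOST_UID:HOST_GID` entries on the setup containers run them as the host user rather than root:

```shell
#!/usr/bin/env bash
# Export the host user's UID/GID before bringing the stack up, so a compose
# entry like `user: "${HOST_UID}:${HOST_GID}"` runs the setup containers as
# the invoking user. Files they write into host-mounted volumes are then
# owned by that user, and no sudo is needed to clean them up afterwards.
export HOST_UID="$(id -u)"
export HOST_GID="$(id -g)"
echo "containers will run as ${HOST_UID}:${HOST_GID}"
```

Compose substitutes the exported variables at `up` time, so the mapping follows whichever user launches the stack.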

dvasilas added a commit to scality/log-courier that referenced this pull request Apr 30, 2026
Pin workbench to the head of WKBCH-25 (scality/workbench#77), which
provisions the queue-populator's raft-id-dispatcher to cover all data
raft sessions. The previous SHA only provisioned raftIds 0,1,2 while
metadata-s3 with raftSessions=3 creates 4 sessions (0=admin, 1..3=data),
leaving raftId 3 uncovered: any bucket whose hash landed there sat at
ReplicationStatus=PENDING forever and the CRR e2e test timed out.

Temporary pin; revert to a tagged workbench release once WKBCH-25 is
merged and tagged.
dvasilas added a commit to scality/log-courier that referenced this pull request Apr 30, 2026

WKBCH-25 added redis to backbeat config, which activates backlogControl
(previously a no-op since redis was unreachable). It then throttles the
conductor whenever the bucket-tasks topic has any consumer lag, which
causes test flakiness under bursty parallel load: lifecycle stalls long
enough to blow expiration timeouts.
Pin workbench to the head of WKBCH-25 (scality/workbench#77) which
disables the lifecycle conductor's backlogControl. Without this, the
conductor throttles whenever the bucket-tasks topic has any consumer
lag, which the parallel e2e suite triggers reliably and stalls
lifecycle expiration past the test timeout.
Comment thread templates/global/docker-compose.yaml

setup-vault:
condition: service_completed_successfully
setup-kafka:
condition: service_completed_successfully


The PR adds a redis config block to backbeat's config.json, making redis a new runtime dependency. Consider adding an explicit depends_on so backbeat doesn't start before redis is healthy. In practice redis starts faster than the setup services, so this is unlikely to bite, but it keeps the ordering guarantees consistent with the other dependencies.

Suggested change
      condition: service_completed_successfully
+   redis:
+     condition: service_healthy

— Claude Code

@claude

claude Bot commented Apr 30, 2026

  • templates/global/docker-compose.yaml:187 — backbeat service is missing depends_on: redis: condition: service_healthy now that its config.json includes a redis block

    Otherwise clean: credentials/ARNs are consistent across files, embed directive covers the new template, the set +e scope in kafka/setup.sh is correctly bounded, and the HostUID/HostGID injection is safe through the mergo merge.

    Review by Claude Code
