Skip to content

fix(test-cluster): manage containers via testcontainers' Docker client so start/stop share one runtime#624

Open
beryllw wants to merge 1 commit into
apache:mainfrom
beryllw:fix-fluss-cluster-stop
Open

fix(test-cluster): manage containers via testcontainers' Docker client so start/stop share one runtime#624
beryllw wants to merge 1 commit into
apache:mainfrom
beryllw:fix-fluss-cluster-stop

Conversation

@beryllw

@beryllw beryllw commented Jun 15, 2026

Copy link
Copy Markdown
Contributor

Purpose

Linked issue: close #623

Brief change log

  • stop_cluster, FlussTestingCluster::stop, and the start pre-clean now force-remove via the shared client. A 404 (already gone) is treated as success; other errors warn instead of being silently swallowed.
  • all_containers_exist lists containers through the client with an anchored name filter (now async).
  • New run_blocking helper drives the async teardown from the synchronous stop() callers, working both inside a tokio runtime (async integration tests) and from the atexit cleanup handler.
  • Imports the Docker types via the testcontainers::bollard re-export, so the crate takes no direct bollard dependency (Cargo.toml/Cargo.lock unchanged on balance).

Tests

Verified against a live Podman cluster:

start            -> zookeeper / coordinator / tablet containers Up
start (again)    -> idempotent, no recreation, no name conflict
stop             -> all three containers removed

cargo build / cargo clippy clean for the crate (the 3 pre-existing main.rs readiness-poll lints are untouched and out of scope).

API and Format

Documentation

@leekeiabstraction leekeiabstraction left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the improvements. Left some small comments.

Comment on lines +225 to +238
for name in self.container_names() {
// Anchored exact-name match; `all(false)` = running only, so a stopped
// leftover counts as absent and gets recreated.
let mut filters = HashMap::new();
filters.insert("name".to_string(), vec![format!("^{name}$")]);
let options = ListContainersOptionsBuilder::default()
.all(false)
.filters(&filters)
.build();
match docker.list_containers(Some(options)).await {
Ok(list) if !list.is_empty() => continue,
_ => return false,
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Seems like we're making a list_containers per-container. Anyway to make this more efficient?

Comment on lines +576 to +596
let mut filters = HashMap::new();
filters.insert(
"name".to_string(),
vec![
format!("zookeeper-{name}"),
format!("coordinator-server-{name}"),
format!("tablet-server-{name}-"),
],
);
let options = ListContainersOptionsBuilder::default()
.all(true)
.filters(&filters)
.build();

let containers = match docker.list_containers(Some(options)).await {
Ok(containers) => containers,
Err(e) => {
eprintln!("warning: failed to list cluster containers: {e}");
return;
}
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This looks similar to parts of all_containers_exist, can these be refactored?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fluss-test-cluster stop reports success but leaves containers running

2 participants