
fix(docker): use cargo-chef for dependency caching in CI #3635

Open
jowparks wants to merge 8 commits into l3 from joeparks/fix-docker-l3-cache

@jowparks

Collaborator

Problem

The Docker L3 CI build cache is not working — every push recompiles all Rust dependencies from scratch (~30+ min builds).

Root cause: Two issues compound:

  1. COPY . . invalidates the source layer on every commit. The source stage copies the entire repo, so any file change produces a different layer hash. All downstream cargo build layers miss the registry cache.

  2. --mount=type=cache,target=/app/target is ephemeral on GitHub Actions (GHA). BuildKit cache mounts persist only on the daemon that created them, and GHA starts a fresh runner for each job, so the mounts are always empty.

Solution

Replace the single-stage source copy with cargo-chef's three-stage caching pattern:

| Stage | Purpose |
|---|---|
| planner | `cargo chef prepare` → produces recipe.json (deterministic from Cargo.toml/Cargo.lock) |
| deps | `cargo chef cook` → compiles all external deps into a cacheable layer |
| source | `COPY . .` on top of pre-built deps |
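
A minimal sketch of how these stages might fit together. The `rust:1.94` base image, the `/app` working directory, the extra `chef` and `builder` stages, the `--release` flag, and the package name `my-service` are assumptions for illustration; the real `Dockerfile.rust-services` uses a `PROFILE` build arg and its own stage layout.

```dockerfile
# Sketch only: base image, paths, and the package name are assumptions.
FROM rust:1.94 AS chef
# Pin cargo-chef so the tool layer itself is reproducible.
RUN cargo install cargo-chef@0.1.77 --locked
WORKDIR /app

FROM chef AS planner
COPY . .
# recipe.json is derived from the Cargo manifests and lockfile,
# not from the contents of the source files.
RUN cargo chef prepare --recipe-path recipe.json

FROM chef AS deps
COPY --from=planner /app/recipe.json recipe.json
# Compiles external dependencies only. The layer can be reused from
# the registry cache for as long as recipe.json is unchanged.
RUN cargo chef cook --release --recipe-path recipe.json

FROM deps AS source
# Source changes invalidate layers from here on, not the deps layer.
COPY . .

FROM source AS builder
# Hypothetical package name; only workspace crates are compiled here.
RUN cargo build --release -p my-service
```

The `planner` stage still runs `COPY . .`, so it re-executes on every commit, but its output (`recipe.json`) is identical for source-only changes, which is what lets the `deps` layer hit the cache.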

Cache behavior after this change

Source-only change (most commits):

  • recipe.json unchanged → deps layer hits registry cache → dependency compilation skipped entirely
  • Builder stages only recompile workspace crates (~5-10 min)

Cargo.toml/Cargo.lock change (dependency updates):

  • recipe.json changes → deps layer rebuilds → full dep compile (unavoidable)

The zk-prover gets its own zk-deps cook stage since it requires Go + protoc from zk-builder-base.
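
As a rough sketch, the `zk-deps` stage could look like the fragment below. It is not standalone: it assumes `zk-builder-base`, `planner`, and a `chef` stage holding the cargo-chef binary are defined earlier in the same Dockerfile. The binary path (`/usr/local/cargo/bin`, the default in the official Rust images), the `/app` directory, and the `--release` flag are assumptions.

```dockerfile
# Sketch only: paths and the build profile are assumptions.
FROM zk-builder-base AS zk-deps
WORKDIR /app
# Reuse the cargo-chef binary from the chef stage instead of compiling it again.
COPY --from=chef /usr/local/cargo/bin/cargo-chef /usr/local/cargo/bin/cargo-chef
COPY --from=planner /app/recipe.json recipe.json
# Cook only the ZK prover's dependencies, in the image that has Go and protoc.
RUN cargo chef cook --release --recipe-path recipe.json -p base-prover-zk
```

Scoping the cook with `-p` keeps the ZK dependency graph out of the standard `deps` stage, which has no Go or protoc toolchain.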

Impact

  • L3 CI: Source-only builds drop from ~30+ min to ~5-10 min
  • Main branch (docker.yml): Also benefits since it uses the same Dockerfile
  • Local builds: No regression — docker buildx bake works identically

The Docker build cache was not working on GitHub Actions because:

1. `COPY . .` invalidated the source layer on every commit, causing
   all downstream cargo build layers to miss the registry cache.

2. `--mount=type=cache,target=/app/target` mounts are local to the
   BuildKit daemon and ephemeral on GHA runners (empty every run).

Replace the single-stage source copy with cargo-chef's three-stage
pattern:

- `planner`: prepares a recipe.json capturing workspace deps
- `deps`: cooks all external dependencies (layer cached until
  Cargo.toml/Cargo.lock change)
- `source`: copies full source on top of pre-built deps

Source-only commits now skip dependency compilation entirely, reducing
CI build times from ~30+ min to ~5-10 min for the common case.

The zk-prover retains its own cook stage (zk-deps) since it requires
the Go + protoc toolchain from zk-builder-base.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
Comment thread etc/docker/Dockerfile.rust-services
Comment thread etc/docker/Dockerfile.rust-services Outdated
jowparks and others added 4 commits June 18, 2026 10:37
Dockerfile.rust-services dropped the challenger stages but
docker-bake.hcl and docker-compose.yml still referenced them,
breaking `bake rust-services` and `bake devnet-l3`.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
The challenger-builder and challenger runtime stages were
accidentally removed in the cargo-chef migration. Restore them
using the same cache-mount-free pattern as the other services.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Comment thread etc/docker/Dockerfile.rust-services Outdated
jowparks and others added 3 commits June 18, 2026 11:02
Limit the standard dependency cook to non-ZK service packages so it does not build the ZK prover dependencies without Go/protoc tooling. Scope the ZK dependency cook to base-prover-zk.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
@github-actions

Contributor

Review Summary

No issues found. The cargo-chef migration is well-structured:

  • Three-stage pattern (planner → deps → source) follows cargo-chef best practices and correctly separates dependency resolution from source compilation
  • cargo-chef pinned to @0.1.77 --locked — deterministic installs
  • ZK stage correctly copies the pre-built cargo-chef binary from the chef stage instead of recompiling
  • Package list in cargo chef cook matches all builder stages (including challenger)
  • PROFILE ARG correctly redeclared in each stage (required for multi-stage builds) with proper dev → debug directory mapping
  • Rust version bump 1.93 → 1.94 consistently applied in both Dockerfile.rust-services and docker-bake.hcl

Note: The three previous inline comments from an earlier review run are stale — they raised issues (missing challenger stages, unpinned cargo-chef, duplicate cargo-chef compile) that have all been addressed in the current revision.

@github-actions

Contributor

❌ base-std fork tests did not run

The build or setup step failed before any tests could execute. Check the workflow logs for details.
