fix(docker): use cargo-chef for dependency caching in CI #3635
Open
jowparks wants to merge 8 commits into
The Docker build cache was not working on GitHub Actions because:

1. `COPY . .` invalidated the source layer on every commit, causing all downstream cargo build layers to miss the registry cache.
2. `--mount=type=cache,target=/app/target` mounts are local to the BuildKit daemon and ephemeral on GHA runners (empty every run).

Replace the single-stage source copy with cargo-chef's three-stage pattern:

- `planner`: prepares a recipe.json capturing workspace deps
- `deps`: cooks all external dependencies (layer cached until Cargo.toml/Cargo.lock change)
- `source`: copies full source on top of pre-built deps

Source-only commits now skip dependency compilation entirely, reducing CI build times from ~30+ min to ~5-10 min for the common case.

The zk-prover retains its own cook stage (zk-deps) since it requires the Go + protoc toolchain from zk-builder-base.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
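The three-stage pattern described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the actual `Dockerfile.rust-services`: the base image tag, the `chef` and `builder` stage names, the `CARGO_CHEF_VERSION` build argument, and the final build command are all assumptions made for the example.

```dockerfile
# Hypothetical base stage: installs cargo-chef once so later stages can share it.
FROM rust:1-bookworm AS chef
# Assumed build argument; pass it with --build-arg so the tool version is pinned.
ARG CARGO_CHEF_VERSION
RUN cargo install cargo-chef --locked --version "${CARGO_CHEF_VERSION}"
WORKDIR /app

# planner: sees the full source, but only emits recipe.json.
FROM chef AS planner
COPY . .
RUN cargo chef prepare --recipe-path recipe.json

# deps: depends only on recipe.json, so this layer stays cached
# until Cargo.toml/Cargo.lock change.
FROM chef AS deps
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json

# source: full source copied on top of the pre-built dependencies.
FROM deps AS source
COPY . .

# Hypothetical final build stage; only workspace crates compile here.
FROM source AS builder
RUN cargo build --release
```

The `planner` stage still reruns on every commit because it copies the whole repository, but it is cheap. The `deps` stage is keyed on the content of `recipe.json`, which is what lets a source-only commit reuse the cooked dependency layer.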
Dockerfile.rust-services dropped the challenger stages but docker-bake.hcl and docker-compose.yml still referenced them, breaking `bake rust-services` and `bake devnet-l3`.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
This reverts commit 083fd1f.
The challenger-builder and challenger runtime stages were accidentally removed in the cargo-chef migration. Restore them using the same cache-mount-free pattern as the other services.

Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Limit the standard dependency cook to non-ZK service packages so it does not build the ZK prover dependencies without Go/protoc tooling. Scope the ZK dependency cook to base-prover-zk.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
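One way the scoping described in this commit could look is sketched below, using the `--package` option of `cargo chef cook`. This is a fragment, not the PR's actual Dockerfile: it assumes `chef`, `planner`, and `zk-builder-base` stages are defined earlier in the same file, that `zk-builder-base` already has cargo-chef installed alongside Go and protoc, and the package names `service-a` and `service-b` are hypothetical placeholders. Only `base-prover-zk` comes from the commit message.

```dockerfile
# Standard cook: restricted to non-ZK service packages (names are hypothetical),
# so ZK prover dependencies are never built without Go/protoc.
FROM chef AS deps
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json \
    --package service-a \
    --package service-b

# ZK cook: runs on the toolchain image and is scoped to the prover package.
# Assumes zk-builder-base provides Go, protoc, and cargo-chef.
FROM zk-builder-base AS zk-deps
WORKDIR /app
COPY --from=planner /app/recipe.json recipe.json
RUN cargo chef cook --release --recipe-path recipe.json \
    --package base-prover-zk
```

Both stages consume the same `recipe.json`, so a single `planner` stage can feed them. The split only changes which packages' dependency graphs each cook step compiles.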
Co-authored-by: OpenCode <opencode-noreply@coinbase.com>
Contributor
Review Summary
No issues found. The cargo-chef migration is well-structured.
Note: The three previous inline comments from an earlier review run are stale. They raised issues (missing challenger stages, unpinned cargo-chef, duplicate cargo-chef compile) that have all been addressed in the current revision.
Contributor
❌ base-std fork tests did not run
The build or setup step failed before any tests could execute. Check the workflow logs for details.
Problem
The Docker L3 CI build cache is not working — every push recompiles all Rust dependencies from scratch (~30+ min builds).
Root cause: Two issues compound:
1. `COPY . .` invalidates the source layer on every commit. The `source` stage copies the entire repo, so any file change produces a different layer hash. All downstream `cargo build` layers miss the registry cache.
2. `--mount=type=cache,target=/app/target` is ephemeral on GHA. These BuildKit cache mounts persist on the local daemon, but GHA spins up a fresh runner per job, so the mounts are always empty.
Solution
Replace the single-stage source copy with cargo-chef's three-stage caching pattern:
- `planner`: `cargo chef prepare` → produces `recipe.json` (deterministic from Cargo.toml/lock)
- `deps`: `cargo chef cook` → compiles all external deps into a cacheable layer
- `source`: `COPY . .` on top of pre-built deps
Cache behavior after this change
Source-only change (most commits):
`recipe.json` unchanged → `deps` layer hits registry cache → dependency compilation skipped entirely
Cargo.toml/lock change (dep updates):
`recipe.json` changes → `deps` layer rebuilds → full dep compile (unavoidable)
The zk-prover gets its own `zk-deps` cook stage since it requires Go + protoc from `zk-builder-base`.
Impact
docker.yml): Also benefits since it uses the same Dockerfile
`docker buildx bake` works identically