feat(service): add layer cache for cross-mount dedup#40
Conversation
📊 Code Coverage Report
📦 Per-package breakdown |
There was a problem hiding this comment.
Code Review
This pull request upgrades the Go environment to version 1.25, updates several core dependencies, and introduces a new LayerCache system to optimize model pulls. The cache implements layer-level singleflighting and cross-volume deduplication using hardlinks, with state persisted via layers.json files. Review feedback identifies a critical path traversal vulnerability when resolving layer filepaths from OCI annotations. Additionally, there are concerns regarding the performance impact of frequent synchronous disk I/O for cache persistence, potential thundering herd issues in the concurrency model during context cancellation, and the functional limitations of hardlinking across different filesystems in CSI environments.
d9fcc54 to
c4129c9
Compare
|
Thanks for the PR! This is a co-reviews by me and GPT 5.4 here :) I reviewed both PRs (#39 and #40) in detail. Both move the project in the right direction, but they make different tradeoffs. For PR39, the main strengths are:
For PR39, have the following issues:
For PR40, the main strengths are:
For PR40, I see the following issues:
Overall, I think PR39 is more ambitious and covers more of the requested functionality in one shot, especially the node-level layer concurrency control. However, it does so with a much heavier implementation that is harder to maintain and riskier to merge. My current preference is PR40. It is simpler, closer to the existing architecture, and easier to harden incrementally. :) |
Signed-off-by: chlins <chlins.zhang@gmail.com>
This pull request introduces a new layer cache hook system to improve layer deduplication and persistence in the model CSI driver, alongside several dependency and tooling updates. The most significant changes are the addition of the
layercache_hookimplementation and its comprehensive tests, as well as updates to Go versioning and dependencies to ensure compatibility and improved performance.Layer cache hook implementation and testing:
pkg/service/layercache_hook.go, which implements acombinedHookandlayerCacheHookto manage layer deduplication, cache hits, and persistence across volumes. This ensures that all volumes maintain accurate cache state, even after owner deletion or restarts.pkg/service/layercache_hook_test.gowith thorough unit tests for the new hook logic, covering cache hits, ownership transfer, deduplication persistence, and edge cases.Go version and dependency upgrades:
go.mod,build/Dockerfile, and GitHub Actions workflows for both coverage and lint jobs, ensuring the project uses the latest language features and security updates. [1] [2] [3]go.mod, includinggo-git,modctl,go.opentelemetry.io/otel, and others, to newer versions for improved stability, performance, and compatibility. [1] [2] [3] [4]CI/CD and tooling improvements:
golangci-lint-actionversion in.github/workflows/lint.ymlfor better linting performance and reliability.Test and interface adjustments:
pkg/server/server_test.goto support the new layer cache hook interface and its additional parameters. [1] [2]These changes collectively enhance the reliability, maintainability, and efficiency of layer handling in the driver, while keeping the codebase up-to-date with the latest Go ecosystem advancements.