Skip to content

feat(service): add layer cache for cross-mount dedup#40

Open
chlins wants to merge 1 commit into
mainfrom
feat/cache
Open

feat(service): add layer cache for cross-mount dedup#40
chlins wants to merge 1 commit into
mainfrom
feat/cache

Conversation

@chlins
Copy link
Copy Markdown
Member

@chlins chlins commented May 10, 2026

This pull request introduces a new layer cache hook system to improve layer deduplication and persistence in the model CSI driver, alongside several dependency and tooling updates. The most significant changes are the addition of the layercache_hook implementation and its comprehensive tests, as well as updates to Go versioning and dependencies to ensure compatibility and improved performance.

Layer cache hook implementation and testing:

  • Added pkg/service/layercache_hook.go, which implements a combinedHook and layerCacheHook to manage layer deduplication, cache hits, and persistence across volumes. This ensures that all volumes maintain accurate cache state, even after owner deletion or restarts.
  • Added pkg/service/layercache_hook_test.go with thorough unit tests for the new hook logic, covering cache hits, ownership transfer, deduplication persistence, and edge cases.

Go version and dependency upgrades:

  • Upgraded the Go version to 1.25 in go.mod, build/Dockerfile, and GitHub Actions workflows for both coverage and lint jobs, ensuring the project uses the latest language features and security updates. [1] [2] [3]
  • Updated multiple dependencies in go.mod, including go-git, modctl, go.opentelemetry.io/otel, and others, to newer versions for improved stability, performance, and compatibility. [1] [2] [3] [4]

CI/CD and tooling improvements:

  • Updated the golangci-lint-action version in .github/workflows/lint.yml for better linting performance and reliability.
  • Minor cleanup in workflow scripts, such as removing unnecessary trailing lines.

Test and interface adjustments:

  • Updated test mocks and function signatures in pkg/server/server_test.go to support the new layer cache hook interface and its additional parameters. [1] [2]

These changes collectively enhance the reliability, maintainability, and efficiency of layer handling in the driver, while keeping the codebase up-to-date with the latest Go ecosystem advancements.

@chlins chlins added the enhancement New feature or request label May 10, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 10, 2026

📊 Code Coverage Report

Metric Coverage Threshold Status
Overall 76.9% 70%
Changed lines 90% 90%
📦 Per-package breakdown
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:104:                    80.0%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:116:                    80.0%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:31:                     85.7%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:60:                     75.0%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:69:                     80.0%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:81:                     80.0%
github.com/modelpack/model-csi-driver/pkg/client/grpc.go:92:                     80.0%
github.com/modelpack/model-csi-driver/pkg/client/http.go:105:                    100.0%
github.com/modelpack/model-csi-driver/pkg/client/http.go:23:                     80.0%
github.com/modelpack/model-csi-driver/pkg/client/http.go:49:                     74.3%
github.com/modelpack/model-csi-driver/pkg/client/request.go:12:                  100.0%
github.com/modelpack/model-csi-driver/pkg/client/request.go:34:                  75.0%
github.com/modelpack/model-csi-driver/pkg/client/request.go:50:                  66.7%
github.com/modelpack/model-csi-driver/pkg/client/request.go:65:                  75.0%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:112:             87.5%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:26:              100.0%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:32:              100.0%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:55:              100.0%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:66:              92.9%
github.com/modelpack/model-csi-driver/pkg/config/auth/docker.go:88:              92.3%
github.com/modelpack/model-csi-driver/pkg/config/auth/keychain.go:20:            100.0%
github.com/modelpack/model-csi-driver/pkg/config/auth/keychain.go:31:            100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:103:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:108:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:113:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:118:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:123:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:128:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:133:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:138:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:143:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:148:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:152:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:156:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:160:                  62.8%
github.com/modelpack/model-csi-driver/pkg/config/config.go:17:                   87.5%
github.com/modelpack/model-csi-driver/pkg/config/config.go:240:                  83.3%
github.com/modelpack/model-csi-driver/pkg/config/config.go:253:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:261:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:265:                  100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:71:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:75:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:79:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:83:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:87:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:91:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:95:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/config.go:99:                   100.0%
github.com/modelpack/model-csi-driver/pkg/config/watcher.go:13:                  65.2%
github.com/modelpack/model-csi-driver/pkg/logger/logger.go:19:                   100.0%
github.com/modelpack/model-csi-driver/pkg/logger/logger.go:29:                   100.0%
github.com/modelpack/model-csi-driver/pkg/logger/logger.go:41:                   100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/mount_collector.go:21:         100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/mount_collector.go:34:         100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/mount_collector.go:38:         100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/mount_collector.go:42:         83.3%
github.com/modelpack/model-csi-driver/pkg/metrics/registry.go:117:               100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/registry.go:126:               100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/registry.go:135:               100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/registry.go:146:               100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/registry.go:24:                100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/serve.go:26:                   100.0%
github.com/modelpack/model-csi-driver/pkg/metrics/serve.go:37:                   90.9%
github.com/modelpack/model-csi-driver/pkg/metrics/serve.go:59:                   83.3%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:39:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:50:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:54:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:59:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:64:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:69:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:74:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:82:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/builder.go:88:                 80.0%
github.com/modelpack/model-csi-driver/pkg/mounter/mounter.go:15:                 100.0%
github.com/modelpack/model-csi-driver/pkg/mounter/mounter.go:26:                 66.7%
github.com/modelpack/model-csi-driver/pkg/mounter/mounter.go:37:                 90.9%
github.com/modelpack/model-csi-driver/pkg/mounter/mounter.go:57:                 71.4%
github.com/modelpack/model-csi-driver/pkg/mounter/mounter.go:81:                 83.3%
github.com/modelpack/model-csi-driver/pkg/provider/provider.go:15:               100.0%
github.com/modelpack/model-csi-driver/pkg/service/artifact.go:11:                100.0%
github.com/modelpack/model-csi-driver/pkg/service/cache.go:119:                  85.7%
github.com/modelpack/model-csi-driver/pkg/service/cache.go:135:                  85.7%
github.com/modelpack/model-csi-driver/pkg/service/cache.go:28:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/cache.go:37:                   73.9%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:101:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:115:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:122:             76.2%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:157:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:164:             50.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:189:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:196:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:203:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:210:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:217:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:22:              84.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:249:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:256:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:263:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:270:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller.go:62:              91.3%
github.com/modelpack/model-csi-driver/pkg/service/controller_local.go:143:       79.3%
github.com/modelpack/model-csi-driver/pkg/service/controller_local.go:184:       28.1%
github.com/modelpack/model-csi-driver/pkg/service/controller_local.go:25:        57.3%
github.com/modelpack/model-csi-driver/pkg/service/controller_remote.go:134:      0.0%
github.com/modelpack/model-csi-driver/pkg/service/controller_remote.go:199:      0.0%
github.com/modelpack/model-csi-driver/pkg/service/controller_remote.go:34:       100.0%
github.com/modelpack/model-csi-driver/pkg/service/controller_remote.go:46:       0.0%
github.com/modelpack/model-csi-driver/pkg/service/controller_remote.go:60:       0.0%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:107:         66.7%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:151:         71.4%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:190:         88.9%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:46:          100.0%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:54:          82.4%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server.go:84:          83.3%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:124: 83.3%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:156: 90.9%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:185: 85.7%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:203: 100.0%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:27:  83.3%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:38:  100.0%
github.com/modelpack/model-csi-driver/pkg/service/dynamic_server_handler.go:56:  80.0%
github.com/modelpack/model-csi-driver/pkg/service/identity.go:21:                100.0%
github.com/modelpack/model-csi-driver/pkg/service/identity.go:41:                100.0%
github.com/modelpack/model-csi-driver/pkg/service/identity.go:9:                 100.0%
github.com/modelpack/model-csi-driver/pkg/service/kube.go:14:                    0.0%
github.com/modelpack/model-csi-driver/pkg/service/kube.go:25:                    0.0%
github.com/modelpack/model-csi-driver/pkg/service/kube.go:34:                    0.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:111:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:128:             88.9%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:197:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:237:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:264:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:293:             95.8%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:334:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:364:             95.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:397:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:406:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:420:             92.3%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:442:             93.3%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:465:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:480:             90.9%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:509:             81.2%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:534:             85.7%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:547:             100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:54:              100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:59:              100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache.go:96:              100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:119:        82.4%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:147:        100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:153:        100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:28:         100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:40:         100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:61:         100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:70:         100.0%
github.com/modelpack/model-csi-driver/pkg/service/layercache_hook.go:95:         100.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:114:                  89.5%
github.com/modelpack/model-csi-driver/pkg/service/model.go:157:                  91.7%
github.com/modelpack/model-csi-driver/pkg/service/model.go:177:                  100.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:27:                   80.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:48:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:60:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:68:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/model.go:96:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:123:                   94.4%
github.com/modelpack/model-csi-driver/pkg/service/node.go:154:                   66.7%
github.com/modelpack/model-csi-driver/pkg/service/node.go:202:                   94.4%
github.com/modelpack/model-csi-driver/pkg/service/node.go:233:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:241:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:249:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:269:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:26:                    100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:34:                    100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:42:                    100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:46:                    100.0%
github.com/modelpack/model-csi-driver/pkg/service/node.go:50:                    48.8%
github.com/modelpack/model-csi-driver/pkg/service/node_dynamic.go:18:            73.3%
github.com/modelpack/model-csi-driver/pkg/service/node_dynamic.go:52:            68.4%
github.com/modelpack/model-csi-driver/pkg/service/node_static.go:16:             81.8%
github.com/modelpack/model-csi-driver/pkg/service/node_static.go:42:             69.2%
github.com/modelpack/model-csi-driver/pkg/service/node_static_inline.go:18:      0.0%
github.com/modelpack/model-csi-driver/pkg/service/node_static_inline.go:54:      57.1%
github.com/modelpack/model-csi-driver/pkg/service/puller.go:46:                  100.0%
github.com/modelpack/model-csi-driver/pkg/service/puller.go:54:                  88.5%
github.com/modelpack/model-csi-driver/pkg/service/quota.go:21:                   94.1%
github.com/modelpack/model-csi-driver/pkg/service/quota.go:49:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/quota.go:55:                   100.0%
github.com/modelpack/model-csi-driver/pkg/service/quota.go:67:                   84.2%
github.com/modelpack/model-csi-driver/pkg/service/service.go:41:                 100.0%
github.com/modelpack/model-csi-driver/pkg/service/service.go:45:                 0.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:127:                 100.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:136:                 87.5%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:161:                 71.7%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:250:                 77.1%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:28:                  100.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:34:                  100.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:46:                  100.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:63:                  75.0%
github.com/modelpack/model-csi-driver/pkg/service/worker.go:82:                  79.2%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:100:                    100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:107:                    93.8%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:141:                    100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:180:                    88.9%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:201:                    100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:28:                     100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:34:                     100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:46:                     100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:53:                     100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:69:                     100.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:76:                     80.0%
github.com/modelpack/model-csi-driver/pkg/status/hook.go:87:                     80.0%
github.com/modelpack/model-csi-driver/pkg/status/status.go:110:                  87.5%
github.com/modelpack/model-csi-driver/pkg/status/status.go:125:                  83.3%
github.com/modelpack/model-csi-driver/pkg/status/status.go:136:                  100.0%
github.com/modelpack/model-csi-driver/pkg/status/status.go:51:                   75.0%
github.com/modelpack/model-csi-driver/pkg/status/status.go:68:                   100.0%
github.com/modelpack/model-csi-driver/pkg/status/status.go:74:                   66.7%
github.com/modelpack/model-csi-driver/pkg/status/status.go:92:                   100.0%
github.com/modelpack/model-csi-driver/pkg/tracing/tracing.go:22:                 85.7%
github.com/modelpack/model-csi-driver/pkg/tracing/tracing.go:36:                 55.6%
github.com/modelpack/model-csi-driver/pkg/tracing/tracing.go:72:                 100.0%
github.com/modelpack/model-csi-driver/pkg/tracing/tracing.go:79:                 81.8%
github.com/modelpack/model-csi-driver/pkg/utils/utils.go:16:                     100.0%
github.com/modelpack/model-csi-driver/pkg/utils/utils.go:34:                     75.0%
github.com/modelpack/model-csi-driver/pkg/utils/utils.go:54:                     81.8%

total:											(statements)				76.9%

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request upgrades the Go environment to version 1.25, updates several core dependencies, and introduces a new LayerCache system to optimize model pulls. The cache implements layer-level singleflighting and cross-volume deduplication using hardlinks, with state persisted via layers.json files. Review feedback identifies a critical path traversal vulnerability when resolving layer filepaths from OCI annotations. Additionally, there are concerns regarding the performance impact of frequent synchronous disk I/O for cache persistence, potential thundering herd issues in the concurrency model during context cancellation, and the functional limitations of hardlinking across different filesystems in CSI environments.

Comment thread pkg/service/layercache_hook.go Outdated
Comment thread pkg/service/layercache_hook.go
Comment thread pkg/service/layercache.go
Comment thread pkg/service/layercache.go Outdated
Comment thread pkg/service/layercache.go Outdated
@chlins chlins force-pushed the feat/cache branch 3 times, most recently from d9fcc54 to c4129c9 Compare May 11, 2026 13:19
@imeoer imeoer self-requested a review May 12, 2026 11:40
@imeoer
Copy link
Copy Markdown
Collaborator

imeoer commented May 15, 2026

Thanks for the PR! This is a co-reviews by me and GPT 5.4 here :)

I reviewed both PRs (#39 and #40) in detail. Both move the project in the right direction, but they make different tradeoffs.

For PR39, the main strengths are:

  1. It covers the full feature set more explicitly. It introduces digest-level singleflight, node-level layer concurrency control, hardlink-based layer reuse, and cache rebuild support after restart.
  2. It has stronger observability semantics. The separate LayerCached path is a good idea because it avoids counting a cache hit as a real network pull.
  3. It clearly targets the original problem statement end-to-end: deduplicating concurrent layer pulls, limiting node-wide fan-out, and reducing duplicate disk usage.

For PR39, have the following issues:

  • [Important] The implementation is significantly more complex because it partially re-implements layer orchestration in the driver instead of staying on top of the existing modctl pull flow. It manually resolves manifests, fetches blobs, and decodes layers. That gives more control, but it also creates a long-term maintenance burden and increases the chance of divergence from upstream modctl behavior.
  • [Important] The solution is not fully uniform across all pull modes. For filtered or partial pulls, it falls back to the old path, so the dedup/concurrency behavior is not consistently applied everywhere.
  • [Important] The semaphore-based throttling is count-based, not size-aware. It limits the number of in-flight layers, but it does not account for layer size, so it is not a real bandwidth control mechanism.
  • [Important] One review comment is valid: acquiring the semaphore with context.Background() in the tracking hook means cancellation may not stop goroutines that are blocked waiting for permits.
  • The cache eviction logic based on strings.HasPrefix is risky, because one volume path can be a prefix of another unrelated path.
  • [Important] Manifest fetching appears to assume a tag-based reference. Digest-based references should also be handled correctly.
  • On hardlink retry failure, using RemoveByPrefix(sourcePath) is conceptually wrong and can remove too much state. If the source path is still valid and the error is only EXDEV, the cache entry should not be removed.

For PR40, the main strengths are:

  • The design is much simpler and fits the current codebase better. It keeps the existing pull/fetch flow and extends it through hooks instead of replacing the orchestration layer.
  • It is easier to reason about operationally. The LayerCache Acquire/Publish/Fail model is straightforward, and the ownership/hit flow is localized.
  • The persistence model is practical. Writing per-volume layers.json and rebuilding from it is a good way to preserve dedup state across restarts.
  • I especially like that cache-hit volumes also persist their own layers.json. That avoids losing dedup state when the original owner volume is removed.
  • From a maintenance perspective, this approach is less invasive and should be easier to evolve incrementally.

For PR40, I see the following issues:

  • It does not fully implement node-level layer concurrency throttling. It deduplicates same-digest work, but it does not add a global limit for different digests being pulled at the same time.
  • [Important] The target path is derived from OCI annotations and joined directly with the target directory. That introduces a path traversal risk unless the resolved path is validated to stay inside the target directory.
  • [Important] Persisting layers.json synchronously on every publish, including cache hits, may create unnecessary disk I/O during large pulls.
  • Waking all waiters with cond.Broadcast on cancellation can cause unnecessary contention and a thundering herd effect.
  • The reused-layer path still deserves careful review from a metrics perspective, because “reused” and “actually pulled from remote” should remain clearly distinguishable.

Overall, I think PR39 is more ambitious and covers more of the requested functionality in one shot, especially the node-level layer concurrency control. However, it does so with a much heavier implementation that is harder to maintain and riskier to merge.

My current preference is PR40. It is simpler, closer to the existing architecture, and easier to harden incrementally. :)

Signed-off-by: chlins <chlins.zhang@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants