You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Introduces streaming LFS upload with on-the-fly SHA‑256 hashing and temp-path staging (lfs/tmp/UUID), verifies OID before moving to final path, and adds Store interface methods Move and Delete with filesystem and GCS implementations and tests.
Refactors upload to stream through a hashingReader that computes SHA‑256 while writing to a temp path; verifies hash against provided OID, deletes temp on mismatch, moves temp to final path on success, and returns the final object path.
Blob Storage Interface blob/interface.go
Adds Move(ctx, srcPath, dstPath string) error and Delete(ctx, filePath string) error to the Store interface.
Filesystem Implementation blob/filesystem.go
Implements Move (ensures destination dir, renames file) and Delete (removes file, returns ErrNotFound when missing) for FileSystemStore.
Filesystem Tests blob/filesystem_test.go
Adds tests for Move (same dir, nested dir, missing source) and Delete (existing, nested, non-existent error handling).
GCS Implementation blob/gcs.go
Implements Move using CopierFrom then deleting source (logs on delete failure) and Delete for GCSStore, mapping not-found to ErrNotFound.
Sequence Diagram
sequenceDiagram
participant Client
participant Handler as Upload Handler
participant Reader as HashingReader
participant TempStore as Temp Storage
participant Verifier as Hash Verifier
participant Mover as Move Handler
participant FinalStore as Final Storage
Client->>Handler: Upload file + OID
Handler->>Reader: Wrap incoming stream
Reader->>TempStore: Stream write to lfs/tmp/UUID
Reader->>Reader: Compute SHA-256 on-the-fly
TempStore-->>Reader: Write complete
Reader->>Verifier: Compare computed hash with OID
alt Hash matches
Verifier->>Mover: Request move temp -> final
Mover->>FinalStore: Rename/move path
FinalStore-->>Mover: Move success
Mover-->>Handler: Final path
Handler-->>Client: 200 + final path
else Hash mismatch
Verifier->>TempStore: Delete temp file
TempStore-->>Verifier: Deleted (or log failure)
Verifier-->>Handler: Hash mismatch error
Handler-->>Client: Error response
end
Loading
Estimated code review effort
🎯 3 (Moderate) | ⏱️ ~22 minutes
Poem
🐰 I nibbled bytes in streaming light,
Hashing as I hopped through night,
Temp holes closed, final burrow made,
Move and delete—no crumbs left laid! ✨
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
Check name
Status
Explanation
Resolution
Docstring Coverage
⚠️ Warning
Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.
Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name
Status
Explanation
Description Check
✅ Passed
Check skipped - CodeRabbit’s high-level summary is enabled.
Title check
✅ Passed
The title accurately describes the main change: refactoring the Git LFS upload flow with streaming and hash verification. The ticket reference provides traceability.
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
📝 Generate docstrings (stacked PR)
📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch atmsn/CODE-4993_b
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (2)
blob/gcs.go (1)
166-189: Non-atomic Move: source deletion failure is silently swallowed.
If the copy succeeds but the source deletion fails (Lines 183-186), the method returns nil — the caller believes the move succeeded while the source object still exists. For the LFS temp-path use case this is tolerable (data integrity is preserved at the destination), but orphaned temp objects will accumulate over time without cleanup.
Consider either:
Returning a wrapped error so the caller can decide (and still record the successful copy), or
Implementing a periodic cleanup job for lfs/tmp/ prefixed objects.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@blob/gcs.go` around lines 166 - 189, The Move method currently swallows
errors from srcObj.Delete after a successful dstObj.CopierFrom(...).Run, leaving
orphaned objects; update Move (function Move on GCSStore) to surface deletion
failures instead of returning nil: after a successful copy, if
srcObj.Delete(ctx) returns an error, return a wrapped error (e.g.,
fmt.Errorf("copied %q to %q but failed to delete source: %w", srcPath, dstPath,
err)) so callers know the copy succeeded but cleanup failed; keep the existing
log.Warn but also return the error so higher-level code can decide to retry or
schedule cleanup.
app/api/controller/lfs/upload.go (1)
146-153: Orphaned blob if lfsStore.Create fails with a non-duplicate error.
If the Move succeeds but Create fails (non-duplicate), the file sits at finalPath with no DB record. A retry would self-heal (re-upload overwrites), so this is low-risk — but worth noting for operational awareness. A periodic reconciliation or a cleanup attempt here could prevent blob storage drift over time.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/api/controller/lfs/upload.go` around lines 146 - 153, When
c.lfsStore.Create(ctx, object) fails with a non-duplicate error after the Move
succeeded, the blob at finalPath is left orphaned; update the error path in the
same block to attempt cleanup: after detecting err != nil && !errors.Is(err,
store.ErrDuplicate), call the storage deletion method to remove finalPath (eg.
use the component that performed Move — e.g., c.blobStore.Delete(ctx, finalPath)
or equivalent), log any deletion errors, and then return the original Create
error (wrap it) so callers see the root failure; ensure you do not swallow the
Create error and that deletion errors are only logged, not returned.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/api/controller/lfs/upload.go`:
- Line 97: Validate that pointer.OId begins with the "sha256:" prefix before
trimming: check strings.HasPrefix(pointer.OId, "sha256:") and return/bubble an
explicit error (or BadRequest) if it does not, instead of calling
strings.TrimPrefix blindly; then compute expectedHash by trimming the prefix and
proceed with the existing hash comparison logic (referencing pointer.OId and
expectedHash in upload handling code) so malformed/unsupported OIDs produce a
clear validation error rather than a misleading "content hash doesn't match".
In `@blob/filesystem_test.go`:
- Around line 407-413: The test case "move non-existent file" in
filesystem_test.go currently only expects an error but doesn't assert its type;
update the test case struct for that entry to include errorType: ErrNotFound and
ensure the test assertion in the Move test checks for that specific error type
(same pattern used in the Delete test), i.e., when calling Move verify returned
error is ErrNotFound after the Move implementation is changed to return
ErrNotFound for missing sources.
In `@blob/filesystem.go`:
- Around line 106-122: The Move implementation on FileSystemStore should
translate a missing source into blob.ErrNotFound like other backends; modify
FileSystemStore.Move (look up srcDiskPath and dstDiskPath) to check the error
returned by os.Rename and, if errors.Is(err, fs.ErrNotExist) (or unwrapping
yields fs.ErrNotExist), return blob.ErrNotFound instead of the wrapped
*os.PathError; otherwise return the wrapped error as-is so behavior matches the
Delete method and the GCS backend.
---
Nitpick comments:
In `@app/api/controller/lfs/upload.go`:
- Around line 146-153: When c.lfsStore.Create(ctx, object) fails with a
non-duplicate error after the Move succeeded, the blob at finalPath is left
orphaned; update the error path in the same block to attempt cleanup: after
detecting err != nil && !errors.Is(err, store.ErrDuplicate), call the storage
deletion method to remove finalPath (eg. use the component that performed Move —
e.g., c.blobStore.Delete(ctx, finalPath) or equivalent), log any deletion
errors, and then return the original Create error (wrap it) so callers see the
root failure; ensure you do not swallow the Create error and that deletion
errors are only logged, not returned.
In `@blob/gcs.go`:
- Around line 166-189: The Move method currently swallows errors from
srcObj.Delete after a successful dstObj.CopierFrom(...).Run, leaving orphaned
objects; update Move (function Move on GCSStore) to surface deletion failures
instead of returning nil: after a successful copy, if srcObj.Delete(ctx) returns
an error, return a wrapped error (e.g., fmt.Errorf("copied %q to %q but failed
to delete source: %w", srcPath, dstPath, err)) so callers know the copy
succeeded but cleanup failed; keep the existing log.Warn but also return the
error so higher-level code can decide to retry or schedule cleanup.
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Potential issue | 🟠 Major
Missing validation that pointer.OId actually has the sha256: prefix.
strings.TrimPrefix is a no-op when the prefix is absent — expectedHash would silently become the full OID string. The hash comparison on Line 113 would then always fail, and the uploaded temp file would be deleted. This masks the real problem (unsupported/malformed OID) behind a misleading "content hash doesn't match" error.
‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Verify each finding against the current code and only fix it if needed.
In `@app/api/controller/lfs/upload.go` at line 97, Validate that pointer.OId
begins with the "sha256:" prefix before trimming: check
strings.HasPrefix(pointer.OId, "sha256:") and return/bubble an explicit error
(or BadRequest) if it does not, instead of calling strings.TrimPrefix blindly;
then compute expectedHash by trimming the prefix and proceed with the existing
hash comparison logic (referencing pointer.OId and expectedHash in upload
handling code) so malformed/unsupported OIDs produce a clear validation error
rather than a misleading "content hash doesn't match".
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Potential issue | 🟡 Minor
Missing errorType check for the non-existent source move test.
The "move non-existent file" case only asserts that some error is returned but doesn't verify it's ErrNotFound. The analogous Delete test (Line 493) correctly sets errorType: ErrNotFound. Once the Move implementation is updated to return ErrNotFound for missing sources (see comment on blob/filesystem.go), this test should verify the error type too.
‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Suggested change
{
name: "move non-existent file",
srcPath: "nonexistent.txt",
dstPath: "dst.txt",
content: "",
expectError: true,
},
{
name: "move non-existent file",
srcPath: "nonexistent.txt",
dstPath: "dst.txt",
content: "",
expectError: true,
errorType: ErrNotFound,
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@blob/filesystem_test.go` around lines 407 - 413, The test case "move
non-existent file" in filesystem_test.go currently only expects an error but
doesn't assert its type; update the test case struct for that entry to include
errorType: ErrNotFound and ensure the test assertion in the Move test checks for
that specific error type (same pattern used in the Delete test), i.e., when
calling Move verify returned error is ErrNotFound after the Move implementation
is changed to return ErrNotFound for missing sources.
The reason will be displayed to describe this comment to others. Learn more.
⚠️ Potential issue | 🟠 Major
Move does not return ErrNotFound for a missing source, unlike the GCS backend and the Delete method.
When the source file doesn't exist, os.Rename returns an *os.PathError wrapping fs.ErrNotExist, but this method doesn't translate it to blob.ErrNotFound. The GCS implementation (Lines 177-178 of blob/gcs.go) returns ErrNotFound when the source is missing. The Delete method in this same file (Lines 128-129) also correctly maps fs.ErrNotExist → ErrNotFound.
This inconsistency means callers cannot reliably use errors.Is(err, blob.ErrNotFound) across backends for Move.
Proposed fix
if err := os.Rename(srcDiskPath, dstDiskPath); err != nil {
+ if errors.Is(err, fs.ErrNotExist) {+ return ErrNotFound+ }
return fmt.Errorf("failed to move file: %w", err)
}
📝 Committable suggestion
‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
Verify each finding against the current code and only fix it if needed.
In `@blob/filesystem.go` around lines 106 - 122, The Move implementation on
FileSystemStore should translate a missing source into blob.ErrNotFound like
other backends; modify FileSystemStore.Move (look up srcDiskPath and
dstDiskPath) to check the error returned by os.Rename and, if errors.Is(err,
fs.ErrNotExist) (or unwrapping yields fs.ErrNotExist), return blob.ErrNotFound
instead of the wrapped *os.PathError; otherwise return the wrapped error as-is
so behavior matches the Delete method and the GCS backend.
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
app/api/controller/lfs/upload.go (1)
146-153: ⚠️ Potential issue | 🟠 Major
Orphaned blob at finalPath when lfsStore.Create fails.
After blobStore.Move succeeds, finalPath holds the object in blob storage. If lfsStore.Create then returns a non-ErrDuplicate error (line 147), the function returns an error without deleting the blob at finalPath. The blob and the metadata record are left in an inconsistent state. While a retry would likely self-heal (Move overwrites; Create succeeds), the inconsistency window is real and the pattern is inconsistent with the cleanup applied after hash mismatch and Move failure.
🛡️ Proposed fix
err = c.lfsStore.Create(ctx, object)
if err != nil && !errors.Is(err, store.ErrDuplicate) {
+ if deleteErr := c.blobStore.Delete(ctx, finalPath); deleteErr != nil {+ if !errors.Is(deleteErr, blob.ErrNotFound) {+ log.Ctx(ctx).Warn().Err(deleteErr).+ Str("final_path", finalPath).+ Msg("failed to delete final blob after store create failure")+ }+ }
return nil, fmt.Errorf("failed to create object: %w", err)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/api/controller/lfs/upload.go` around lines 146 - 153, When
lfsStore.Create(ctx, object) fails with a non-ErrDuplicate error after
blobStore.Move has already placed the blob at finalPath, clean up the orphaned
blob by calling blobStore.Delete(ctx, finalPath) before returning the create
error; i.e., in the error branch following lfsStore.Create use
blobStore.Delete(ctx, finalPath) (handle/ignore its error but ensure the
original create error is returned/wrapped), referencing lfsStore.Create,
blobStore.Delete, blobStore.Move, and finalPath to locate the code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@app/api/controller/lfs/upload.go`:
- Around line 146-153: When lfsStore.Create(ctx, object) fails with a
non-ErrDuplicate error after blobStore.Move has already placed the blob at
finalPath, clean up the orphaned blob by calling blobStore.Delete(ctx,
finalPath) before returning the create error; i.e., in the error branch
following lfsStore.Create use blobStore.Delete(ctx, finalPath) (handle/ignore
its error but ensure the original create error is returned/wrapped), referencing
lfsStore.Create, blobStore.Delete, blobStore.Move, and finalPath to locate the
code.
---
Duplicate comments:
In `@app/api/controller/lfs/upload.go`:
- Line 97: The code currently trims "sha256:" silently from pointer.OId using
strings.TrimPrefix which allows missing prefixes to go unnoticed and later
causes a misleading "content hash doesn't match" error; update the validation so
you first check strings.HasPrefix(pointer.OId, "sha256:") and if absent return a
clear validation error (or bad request) indicating the OID is missing the
required "sha256:" prefix, otherwise set expectedHash =
strings.TrimPrefix(pointer.OId, "sha256:") and proceed with the existing hash
comparison logic.
We reviewed changes in 156a9b5...c279b6b on this pull request. Below is the summary for the review, and you can see the individual issues we found as inline review comments.
Some issues found as part of this review are outside of the diff in this pull request and aren't shown in the inline review comments due to GitHub's API limitations. You can see those issues on the DeepSource dashboard.
AI Review is run only on demand for your team. We're only showing results of static analysis review right now. To trigger AI Review, comment @deepsourcebot review on this thread.
The reason will be displayed to describe this comment to others. Learn more.
Type "FileSystemStore" has both value and pointer receivers
(Go's FAQ)[https://go.dev/doc/faq#methods_on_values_or_pointers] recommends
that method receivers should be consistent. If some of the methods of the type
must have pointer receivers, the rest should too, so the method set is
consistent regardless of how the type is used. This is because value and
pointer receivers have different method sets.
The reason will be displayed to describe this comment to others. Learn more.
Type "FileSystemStore" has both value and pointer receivers
(Go's FAQ)[https://go.dev/doc/faq#methods_on_values_or_pointers] recommends
that method receivers should be consistent. If some of the methods of the type
must have pointer receivers, the rest should too, so the method set is
consistent regardless of how the type is used. This is because value and
pointer receivers have different method sets.
Hash is computed over truncated bytes, but a too-short upload is never rejected.io.LimitReader(file, pointer.Size) caps reads at pointer.Size, and the hash is taken over whatever bytes actually flow through hashReader. If the client sends fewer than pointer.Size bytes, the upload still streams to the temp path and the OID is verified against the partial content. The only thing that catches a corrupted/truncated body is the SHA-256 comparison — which is correct — so this is not a hash bypass. However, there is no explicit length check: a client that lies with a large pointer.Size but sends a short body that happens to hash to the claimed OID is impossible (collision), so integrity holds. No action strictly required, but note that the previous io.ReadAll path had identical semantics. Downgrade/ignore if size enforcement lives upstream.
Successful-path temp object can leak silently on GCS. On the happy path the temp object is removed only as a side effect of Move. In GCSStore.Move, after a successful CopierFrom().Run(), a failure of srcObj.Delete(ctx) is logged as a warning and the function returns nil (gcs.go NEW:183-186). The caller treats the move as fully successful, so the temp object under lfs/tmp/<uuid> is never cleaned up and accumulates indefinitely (no GC/lifecycle is added in this PR). Recommend either configuring a bucket lifecycle rule on the lfs/tmp/ prefix, or surfacing the delete failure so an out-of-band cleanup can retry. Business impact: unbounded storage growth of orphaned temp blobs.
Move is non-atomic on GCS (copy-then-delete), so a crash between copy and delete leaves a dangling temp object. Unlike the filesystem backend which uses os.Rename (atomic), GCSStore.Move copies then deletes. If the process crashes after CopierFrom().Run() succeeds but before srcObj.Delete, the final object is correctly written (good) but the temp object leaks. Combined with the warning-only delete handling, temp leakage is the consistent failure mode for the GCS backend. Same mitigation as above (lifecycle rule on lfs/tmp/). Calling out as a distinct durability/cleanup concern across backends per the refactor's stated goal.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary by CodeRabbit
New Features
Improvements
Tests