Skip to content

Feat(Storage): Enable full object checksum validation on JSON path#8825

Open
thiyaguk09 wants to merge 6 commits intogoogleapis:mainfrom
thiyaguk09:feat/enable-full-checksum-validation
Open

Feat(Storage): Enable full object checksum validation on JSON path#8825
thiyaguk09 wants to merge 6 commits intogoogleapis:mainfrom
thiyaguk09:feat/enable-full-checksum-validation

Conversation

@thiyaguk09
Copy link
Contributor

@thiyaguk09 thiyaguk09 commented Dec 26, 2025

Enhanced Checksum Validation & Header Logic

This PR implements comprehensive MD5 and CRC32c checksum validation for object uploads, ensuring data integrity via the X-Goog-Hash header, improving data integrity verification. It refactors the upload architecture to handle hashes dynamically across different upload strategies.

Key Technical Changes

1. Core Library Enhancements (google-cloud-core)

  • ResumableUploader & StreamableUploader: Added type-safe logic (int)($rangeEnd + 1) === (int)$size to ensure X-Goog-Hash is transmitted only on the final chunk/request, preventing intermediate validation errors.
  • MultipartUploader: Standardized header merging to ensure hashes calculated by the connection layer are always included in single-shot uploads.
  • Header Integrity: Refactored restOptions merging to ensure custom metadata and encryption headers are preserved alongside checksums.

2. Storage Package Improvements (google-cloud-storage)

  • Automatic Hashing: Implemented logic to calculate missing MD5 or CRC32c hashes when the validate option is enabled.
  • Validation Logic: Updated Bucket::upload() to honor user-provided checksums and prevent redundant re-calculation.
  • Test Coverage: Added unit tests in BucketTest and RestTest to verify hash behavior in resumable, streamable, and multipart scenarios.

Note

CI "Lowest Dependencies" Failure: This failure occurs because the CI environment pulls the tagged version of google-cloud-core from Packagist instead of using the local changes in this PR. This will resolve once the Core changes are merged.

@product-auto-label product-auto-label bot added the api: storage Issues related to the Cloud Storage API. label Dec 26, 2025
@thiyaguk09 thiyaguk09 force-pushed the feat/enable-full-checksum-validation branch 2 times, most recently from d61822d to 4091fcc Compare March 23, 2026 11:33
Refactor Resumable, Streamable, and Multipart uploaders to ensure
integrity headers (X-Goog-Hash) are only attached to the request
when an upload is being finalized.

- In StreamableUploader, introduced `$isFinalRequest` to track
  intent before writeSize recalculations.
- In ResumableUploader, added a boundary check to only attach
  the hash when the current range matches the total file size.
- Aligns with GCS best practices for resumable upload integrity.
@thiyaguk09 thiyaguk09 force-pushed the feat/enable-full-checksum-validation branch from 4091fcc to 2ac9aad Compare March 23, 2026 11:40
@thiyaguk09 thiyaguk09 marked this pull request as ready for review March 23, 2026 12:54
@thiyaguk09 thiyaguk09 requested review from a team as code owners March 23, 2026 12:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: storage Issues related to the Cloud Storage API.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant