GH-49752: [C++][Gandiva] Fix potential buffer overrun in gandiva ssl function#49780
GH-49752: [C++][Gandiva] Fix potential buffer overrun in gandiva ssl function#49780lriggs wants to merge 4 commits intoapache:mainfrom
Conversation
|
|
||
| // SHA-256 digest is 32 bytes, so result_buf_size must be 64. Pass 63 to trigger | ||
| // the error path that was previously guarded by && instead of ||. | ||
| const char* result = gandiva::gdv_hash_using_openssl( |
There was a problem hiding this comment.
We now accessing gdv_hash_using_openssl() in a roundabout way.
I'd be fine in this case to proceed without the test.
There's a way to make gdv_hash_using_openssl() "internal" for the sake of this test bug is it worth it?
There was a problem hiding this comment.
p.s. rest looks good so once we have a call on test vs no test I can approve.
There was a problem hiding this comment.
None of the changes are because of the test. I removed GANDIVA_EXPORT since that wasn't needed and was identified as potential risk.
There was a problem hiding this comment.
The test actually does cause a problem on Windows when linking against the ssl function for some reason. It works on other OSs. Im going to just remove it since it wasn't a highly valuable test, it only tested the input check.
|
|
||
| #include "gandiva/execution_context.h" | ||
| #include "gandiva/hash_utils.h" | ||
| #include "openssl/evp.h" |
| /// It uses the EVP API in the OpenSSL library to generate | ||
| /// the hash. The type of the hash is defined by the | ||
| /// \b hash_type \b parameter. | ||
| GANDIVA_EXPORT |
There was a problem hiding this comment.
Pull request overview
This PR addresses correctness/safety issues in Gandiva’s OpenSSL-based hashing helper (gdv_hash_using_openssl) and reduces its intended API exposure.
Changes:
- Stops exporting
gdv_hash_using_openssl(intended to be internal). - Fixes validation logic so mismatched digest size or result buffer size triggers an error.
- Adjusts result-buffer allocation and
snprintfsizing to avoid potential out-of-bounds writes when hex-encoding the digest.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
cpp/src/gandiva/hash_utils_test.cc |
Adds an OpenSSL header include in the hash utils unit tests. |
cpp/src/gandiva/hash_utils.h |
Removes GANDIVA_EXPORT from gdv_hash_using_openssl declaration. |
cpp/src/gandiva/hash_utils.cc |
Fixes validation condition and tightens result buffer allocation / snprintf bounds. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const char* gdv_hash_using_openssl(int64_t context, const void* message, | ||
| size_t message_length, const EVP_MD* hash_type, | ||
| uint32_t result_buf_size, int32_t* out_length); |
There was a problem hiding this comment.
gdv_hash_using_openssl is still declared in the installed public header but is no longer exported. That leaves a callable declaration that will typically fail to link for downstream users when building shared libs (and it also forces exposing OpenSSL types via a public header). If this helper is truly internal, remove its declaration from hash_utils.h and keep it static/in an unnamed namespace in hash_utils.cc (or move it to a non-installed internal header).
| unsigned int result_length; | ||
| EVP_DigestFinal_ex(md_ctx, result, &result_length); | ||
|
|
There was a problem hiding this comment.
The return value of EVP_DigestFinal_ex is not checked. On failure it can leave result_length unset/unchanged, and the subsequent validation/loop can read past result (allocated to hash_digest_size) and potentially write out of bounds. Capture and validate the return status (and set result_length = 0 before calling) and error out if finalization fails.
| unsigned int result_length; | |
| EVP_DigestFinal_ex(md_ctx, result, &result_length); | |
| unsigned int result_length = 0; | |
| if (EVP_DigestFinal_ex(md_ctx, result, &result_length) != evp_success_status) { | |
| gdv_fn_context_set_error_msg(context, | |
| "Could not obtain the hash for the defined value"); | |
| EVP_MD_CTX_free(md_ctx); | |
| OPENSSL_free(result); | |
| *out_length = 0; | |
| return ""; | |
| } |
|
|
||
| #include "gandiva/execution_context.h" | ||
| #include "gandiva/hash_utils.h" | ||
| #include "openssl/evp.h" |
There was a problem hiding this comment.
hash_utils.h already includes <openssl/evp.h>, and this test file doesn’t reference any OpenSSL APIs directly. This extra include is redundant and adds an unnecessary dependency/compile cost; consider removing it.
| #include "openssl/evp.h" |
Rationale for this change
Fixes security related problems found in gdv_hash_using_openssl. Those problems were not deemed to be a security risk.
What changes are included in this PR?
[hash_utils.h:41, hash_utils.cc:66] Removed GANDIVA_EXPORT from gdv_hash_using_openssl — it's an internal helper, not part of the public API.
[hash_utils.cc:105] Changed && → || in the validation condition. The original only errored when both checks failed; now it errors when either result_length != hash_digest_size or result_buf_size != (2 * hash_digest_size).
[hash_utils.cc:135] Fixed snprintf buffer size, so it correctly accounts for the already-written bytes and prevents potential out-of-bounds writes. Allocate result_buf_size + 1 bytes — the extra byte absorbs the final null terminator. Pass result_buf_size - result_buff_index + 1 to snprintf — reflects the actual remaining space (2 hex chars + 1 null = 3 bytes on the last call), preventing any potential overflow if the format ever changed.
Are these changes tested?
Yes, unit tests.
Are there any user-facing changes?
No.