Environment
| Component | Version |
|---|---|
| OS | Linux 5.15.0 |
| python-blosc2 | 4.1.2 |
| C-Blosc2 | 2.23.1 (2026-03-03) |
| Python | 3.x (conda) |
| NumPy | 2.x |
Summary
Calling blosc2.ndarray_from_cframe(cframe, copy=True) aborts the process with a native heap corruption error whenever the cframe contains a SPECIAL (zero-run-length) chunk that is followed by at least one NORMAL chunk.
munmap_chunk(): invalid pointer
Aborted (core dumped)
Minimal Reproduction
import blosc2
import numpy as np
# 14-element array: first 7 elements are zero (-> SPECIAL chunk),
# last 7 elements are nonzero (-> NORMAL chunk).
arr = np.zeros(14, dtype=np.float32)
arr[7:] = 1.0
b2arr = blosc2.asarray(arr, chunks=(7,), blocks=(3,))
cframe = b2arr.to_cframe()
# Crashes: munmap_chunk(): invalid pointer / Aborted (core dumped)
blosc2.ndarray_from_cframe(cframe, copy=True)
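Because the abort kills the whole interpreter, it can be convenient to run the reproduction in a child process and inspect the exit status instead. The following is a minimal sketch using only the standard library; the helper names `run_isolated` and `crashed` and the `REPRO` constant are hypothetical, and the signal check assumes a POSIX system, where a child killed by a signal is reported with a negative return code.

```python
import subprocess
import sys

# The reproduction script from this report, kept as a string so it can be
# executed in a separate interpreter.
REPRO = """
import blosc2
import numpy as np
arr = np.zeros(14, dtype=np.float32)
arr[7:] = 1.0
b2arr = blosc2.asarray(arr, chunks=(7,), blocks=(3,))
cframe = b2arr.to_cframe()
blosc2.ndarray_from_cframe(cframe, copy=True)
"""


def run_isolated(code: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run `code` in a fresh interpreter so a native abort cannot kill the caller."""
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )


def crashed(result: subprocess.CompletedProcess) -> bool:
    """True if the child was terminated by a signal (POSIX: negative return code)."""
    return result.returncode < 0
```

Calling `crashed(run_isolated(REPRO))` on an affected installation should then report the abort without taking down the calling process, and `run_isolated(REPRO).stderr` holds whatever the child wrote before dying.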
Observed vs. Expected Behaviour
| Case | Behaviour |
|---|---|
| Expected | ndarray_from_cframe returns a valid NDArray with the original data. |
| Observed | Process aborts immediately with munmap_chunk(): invalid pointer. |
Trigger Condition
The crash depends solely on the ordering of SPECIAL and NORMAL chunks in the cframe. A chunk is SPECIAL (zero-run-length encoded) when all its elements are zero.
| Chunk sequence | Result |
|---|---|
| [NORMAL, NORMAL] | OK |
| [SPECIAL, SPECIAL] | OK |
| [NORMAL, NORMAL, SPECIAL] | OK (SPECIAL chunks at tail only) |
| [SPECIAL, NORMAL] | CRASH |
| [NORMAL, SPECIAL, NORMAL] | CRASH |
| [SPECIAL, SPECIAL, NORMAL] | CRASH |
Rule: crash occurs when any SPECIAL chunk has at least one NORMAL chunk after it.
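The rule can be written as a small predicate, which may help when generating test cases. This is an illustrative sketch only: `expect_crash` and `chunk_values` are hypothetical helpers, not part of blosc2, and they encode the observations in the table rather than anything read from the C source. `chunk_values` builds a flat list of values (zeros for SPECIAL, ones for NORMAL) that can be passed to `np.asarray(..., dtype=np.float32)` and then to `blosc2.asarray(..., chunks=(chunk_len,))` as in the reproduction.

```python
def expect_crash(sequence):
    """Return True if any SPECIAL chunk is followed by at least one NORMAL chunk."""
    seen_special = False
    for kind in sequence:
        if kind == "SPECIAL":
            seen_special = True
        elif kind == "NORMAL" and seen_special:
            return True
    return False


def chunk_values(sequence, chunk_len=7):
    """Build element values that yield the given chunk sequence.

    All-zero chunks become SPECIAL; chunks filled with ones become NORMAL.
    """
    values = []
    for kind in sequence:
        fill = 0.0 if kind == "SPECIAL" else 1.0
        values.extend([fill] * chunk_len)
    return values
```

For example, `chunk_values(["SPECIAL", "NORMAL"])` gives the same 14 values as the minimal reproduction: seven zeros followed by seven ones.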
Additional Observations
- The bug is codec- and filter-independent: it reproduces with BLOSCLZ, LZ4, LZ4HC, ZLIB, ZSTD and with NOFILTER, SHUFFLE, BITSHUFFLE.
- ndarray_from_cframe(cframe, copy=False) does not crash (but the cframe bytes object must outlive the returned NDArray).
- blosc2.open() on a file-backed NDArray does not crash.
- The crash is in native code (blosc2_schunk_from_buffer), not in Python.
Suspected Root Cause
In blosc2_schunk_from_buffer(copy=True), the C code appears to store a raw pointer into the source cframe buffer for SPECIAL chunks rather than malloc()-ing a private copy. When the schunk (or its chunks) are later freed, free() is called on that non-malloc()'d pointer, corrupting the heap.
The copy=False path is unaffected because no ownership transfer is attempted for SPECIAL chunks — the pointer into the caller-owned buffer is valid for as long as the buffer lives.
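If copy=False is used as a stopgap, the lifetime requirement can be made explicit by keeping the buffer and the object that borrows from it together. The sketch below is a generic pattern, not a blosc2 API: `PinnedView` and `load_pinned` are hypothetical names, and whether python-blosc2 already holds its own reference to the bytes object is not established in this report.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PinnedView:
    """Pairs a zero-copy view with the buffer it points into."""

    buffer: bytes  # owner of the memory; kept alive as long as this object lives
    view: Any      # object that borrows from `buffer`


def load_pinned(buffer: bytes, loader: Callable[[bytes], Any]) -> PinnedView:
    """Build a view with `loader` and keep `buffer` referenced alongside it."""
    return PinnedView(buffer=buffer, view=loader(buffer))
```

With blosc2 this would be used as `load_pinned(cframe, lambda b: blosc2.ndarray_from_cframe(b, copy=False))`, and the array accessed through the `.view` attribute for as long as the `PinnedView` is held.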
Workaround
Avoid the cframe round-trip when loading a file-backed NDArray. blosc2.open() returns a fully functional NDArray without going through ndarray_from_cframe:
# Safe alternative for file-backed arrays
store = blosc2.open(filepath, mode="r")
If an in-memory copy is required:
store = blosc2.open(filepath, mode="r")
in_memory = blosc2.asarray(store[:])