GH-49644: [Python] Support converting list of multi-dimensional array…#50203

Open
aboderinsamuel wants to merge 3 commits into apache:main from aboderinsamuel:gh-49644-list-multidim-to-fixed-shape-tensor

@aboderinsamuel commented Jun 17, 2026

Rationale for this change

Constructing a fixed-shape-tensor array from a list of individual ndarrays only
worked when each element was 1-D; ≥2-D elements failed with
ArrowInvalid: Can only convert 1-dimensional array values. The only workaround
was stacking the list into a single ndarray and using
FixedShapeTensorArray.from_numpy_ndarray.

What changes are included in this PR?

The C++ list converter PyListConverter::AppendNdarray now accepts
multi-dimensional ndarray elements for fixed-size lists (the storage of a
fixed-shape tensor) by flattening them in C order. The fixed-size-list builder
still validates that the flattened length matches the list width, so wrong sizes
error cleanly. Variable-sized lists remain restricted to 1-D values to avoid
ambiguity. As a side benefit, plain fixed_size_list also accepts
multi-dimensional ndarray elements now.
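
The rule described above can be modeled in plain NumPy. The sketch below is a hypothetical Python illustration of the described behavior, not the C++ implementation; the function name flatten_element and its list_size parameter are assumptions made for the example.

```python
import numpy as np


def flatten_element(value, list_size=None):
    """Hypothetical model of the conversion rule; not the actual C++ code.

    list_size=None stands for a variable-sized list, an integer for a
    fixed-size list of that width.
    """
    value = np.asarray(value)
    if list_size is None:
        # Variable-sized lists stay restricted to 1-D values.
        if value.ndim != 1:
            raise ValueError("Can only convert 1-dimensional array values")
        return value
    # Fixed-size lists: flatten in C (row-major) order, then check the width.
    flat = value.ravel(order="C")
    if flat.size != list_size:
        raise ValueError(
            f"expected {list_size} values per element, got {flat.size}"
        )
    return flat


print(flatten_element(np.array([[1, 2], [3, 4]]), list_size=4))
```

C order here refers to the logical row-major order of the element, so a transposed view of the same matrix flattens to 1, 3, 2, 4 rather than 1, 2, 3, 4.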

Are these changes tested?

Yes:

  • test_tensor_array_from_list_of_ndarrays — construction from 2-D and 3-D
    ndarrays, null handling, storage parity with from_numpy_ndarray, and the
    size-mismatch error, across int8/int64/float32.
  • test_fixed_size_list_from_multidim_ndarray — plain fixed_size_list from
    multi-dim arrays, plus a check that variable-sized lists still reject 2-D.

Are there any user-facing changes?

Yes — pa.array([multi-dim ndarrays], type=fixed_shape_tensor(...)) (and the
same for fixed_size_list) now works instead of raising. Existing 1-D behavior
and variable-sized-list behavior are unchanged.

This PR is scoped to construction only. Shape preservation in the reverse
direction (to_numpy), which the issue also raised, is intentionally left as a
separate follow-up.

Copilot AI review requested due to automatic review settings June 17, 2026 09:21
@github-actions


⚠️ GitHub issue #49644 has been automatically assigned in GitHub to PR creator.

Copilot AI left a comment

Contributor


Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds support for constructing fixed-shape tensors and fixed-size lists from lists of multi-dimensional NumPy ndarrays by flattening values in C order (GH-49644).

Changes:

  • Add tests covering tensor arrays built from lists of ndarrays (including nulls and shape mismatch).
  • Add tests ensuring fixed-size lists accept multi-dimensional ndarray elements (and reject invalid cases).
  • Update ndarray-to-list conversion to allow flattening for FIXED_SIZE_LIST while keeping variable-sized lists restricted to 1D.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| python/pyarrow/tests/test_extension_type.py | Adds coverage for building FixedShapeTensorArray from a list of ndarrays. |
| python/pyarrow/tests/test_array.py | Adds a regression test for fixed-size list conversion from multi-dimensional ndarrays. |
| python/pyarrow/src/arrow/python/python_to_arrow.cc | Implements multi-dimensional ndarray flattening for fixed-size lists during conversion. |

Comment thread python/pyarrow/tests/test_array.py Outdated
Comment thread python/pyarrow/tests/test_array.py
@github-actions bot added the "awaiting committer review" label and removed the "awaiting review" label on Jun 17, 2026
@aboderinsamuel force-pushed the gh-49644-list-multidim-to-fixed-shape-tensor branch from 90c2ac4 to e695d01 on June 17, 2026 20:42
Copilot AI review requested due to automatic review settings June 17, 2026 20:42

Copilot AI left a comment

Contributor


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread python/pyarrow/src/arrow/python/python_to_arrow.cc