Update pandas version upper bound to support python 3.14 #39056

Merged
shunping merged 4 commits into apache:master from shunping:update-pandas on Jun 22, 2026

@shunping shunping commented Jun 22, 2026

Collaborator

Bump the pandas version to 2.3.x to fix a segfault in the dataframe tests when they run on Python 3.14.

Stacktrace of segfault from failed tests: https://github.com/apache/beam/actions/runs/27927544557/job/82678205191

Current thread 0x00000002066c22c0 (most recent call first):
  File "/Users/runner/work/beam/beam/sdks/python/target/.tox/py314-macos/lib/python3.14/site-packages/pandas/core/arrays/datetimes.py", line 439 in _generate_range
  File "/Users/runner/work/beam/beam/sdks/python/target/.tox/py314-macos/lib/python3.14/site-packages/pandas/core/indexes/datetimes.py", line 1008 in date_range
  File "/Users/runner/work/beam/beam/sdks/python/apache_beam/dataframe/frames_test.py", line 796 in test_loc
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/unittest/case.py", line 615 in _callTestMethod
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/unittest/case.py", line 669 in run
  File "/Library/Frameworks/Python.framework/Versions/3.14/lib/python3.14/unittest/case.py", line 725 in __call__

The failure occurs when the tests run with pandas 2.2.3; pandas 2.3.3 is the first release to support Python 3.14 (https://github.com/pandas-dev/pandas/releases/tag/v2.3.3).

Also see #38691 (comment)


Note that a few tests failed (https://github.com/apache/beam/actions/runs/27951859585/job/82710533408?pr=39056) because of a change in pandas 2.3.0+: the label that Series.str.get_dummies() returns for null/missing values changed from 'nan' to 'NaN'.

This PR also applies a fix in the code so that the tests pass on pandas versions both before and after 2.3.0.
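
A minimal sketch of how a test could tolerate both labels, assuming the only difference between pandas versions is the column name used for missing values. The helpers `find_nan_label` and `normalize_columns` and the sample column lists are hypothetical and are not taken from this PR.

```python
from typing import Iterable, List, Optional

# Labels that, per the description above, pandas uses for the missing-value
# dummy column: 'nan' before 2.3.0 and 'NaN' from 2.3.0 on.
NAN_LABELS = ("nan", "NaN")


def find_nan_label(columns: Iterable[str]) -> Optional[str]:
    """Return whichever missing-value label appears in `columns`, if any."""
    present = set(columns)
    for label in NAN_LABELS:
        if label in present:
            return label
    return None


def normalize_columns(columns: Iterable[str], target: str = "nan") -> List[str]:
    """Rename the missing-value label to `target` so results compare equal."""
    return [target if c in NAN_LABELS else c for c in columns]


# Hypothetical column lists as they might look on each pandas line.
old_style = ["a", "b", "nan"]
new_style = ["a", "b", "NaN"]
assert normalize_columns(old_style) == normalize_columns(new_style)
```

One caveat of this approach: a real category that happens to be spelled 'NaN' or 'nan' would be renamed too, so it only suits tests where that cannot occur.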

Also related to #31238.

@shunping shunping marked this pull request as ready for review June 22, 2026 18:04
@shunping

Collaborator Author

r: @tvalentyn

@gemini-code-assist

Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request updates the project's pandas dependency to support Python 3.14, addressing critical segmentation faults encountered during testing. It includes necessary adjustments to handle behavioral changes in pandas 2.3.0+ regarding how null values are represented in dummy columns, ensuring the codebase remains compatible with both older and newer versions of the library.

Highlights

  • Dependency Update: Updated the pandas version upper bound from <2.3 to <2.4 in setup.py to support Python 3.14 and resolve segmentation faults.
  • Compatibility Fix: Implemented dynamic detection of the NaN column name in get_dummies() to ensure compatibility across different pandas versions (2.2.x vs 2.3.x+).
  • Test Infrastructure: Introduced PD_FULL_VERSION to allow for more granular version checks in test suites, ensuring robust test execution across pandas releases.
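
The version logic in these highlights can be illustrated with plain tuples. This is a sketch only: `parse_version` and `below_upper_bound` are hypothetical helpers, and how `PD_FULL_VERSION` is actually computed in Beam is not shown in this conversation.

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse the leading dotted integers of a version string, e.g. '2.3.3'."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"not a version string: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def below_upper_bound(version: str, upper: str) -> bool:
    """True if `version` satisfies an exclusive upper bound such as '<2.4'."""
    return parse_version(version) < parse_version(upper)


# The old bound (<2.3) excludes pandas 2.3.3; the new bound (<2.4) admits it.
assert not below_upper_bound("2.3.3", "2.3")
assert below_upper_bound("2.3.3", "2.4")

# A (major, minor) pair cannot tell 2.3.0 from 2.3.3; the full tuple can.
assert parse_version("2.3.0")[:2] == parse_version("2.3.3")[:2]
assert parse_version("2.3.0") < parse_version("2.3.3")
```

Tuple comparison treats (2, 3) and (2, 3, 0) as different values, so production code that needs full PEP 440 semantics would be better served by `packaging.version.Version`.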

@github-actions

Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment "assign set of reviewers".

@gemini-code-assist gemini-code-assist Bot left a comment

Contributor

Code Review

This pull request updates the pandas dependency constraint to support versions up to <2.4, adjusts test skip conditions to use a full version check, and dynamically detects the NaN column name used by pandas in get_dummies(). The feedback suggests replacing the dynamic pandas operation with a static version check to avoid runtime overhead and potential IndexError.
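
The static alternative suggested in this review might look like the sketch below. It assumes, as stated in the PR description, that the label changed in pandas 2.3.0; the function name and the tuple argument are hypothetical and are not Beam's actual code.

```python
from typing import Tuple


def nan_dummy_label(pd_full_version: Tuple[int, ...]) -> str:
    """Pick the missing-value column label from the pandas version alone."""
    # No pandas operation runs here, so there is no per-call overhead and
    # no intermediate result to index into (hence no IndexError).
    # Only (major, minor) is compared so that (2, 3) and (2, 3, 0) agree.
    return "NaN" if pd_full_version[:2] >= (2, 3) else "nan"


assert nan_dummy_label((2, 2, 3)) == "nan"
assert nan_dummy_label((2, 3, 3)) == "NaN"
```

The trade-off is that a static check encodes the version boundary by hand, whereas probing pandas at runtime follows whatever the installed version does.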

Comment thread sdks/python/apache_beam/dataframe/frames.py Outdated
@shunping shunping merged commit 9aed894 into apache:master Jun 22, 2026
111 of 112 checks passed
@shunping shunping deleted the update-pandas branch June 22, 2026 20:09