Skip to content

fix(rdkit): avoid deprecated explicit valence API#969

Open
njzjz-bot wants to merge 5 commits intodeepmodeling:masterfrom
njzjz-bot:fix/issue-968
Open

fix(rdkit): avoid deprecated explicit valence API#969
njzjz-bot wants to merge 5 commits intodeepmodeling:masterfrom
njzjz-bot:fix/issue-968

Conversation

@njzjz-bot
Copy link
Copy Markdown
Contributor

@njzjz-bot njzjz-bot commented Apr 18, 2026

Problem

  • RDKit deprecated Atom.GetExplicitValence() in recent releases, which makes dpdata CI emit a deprecation warning.
  • dpdata.rdkit.sanitize.get_explicit_valence() still called the deprecated API directly.

Change

  • Prefer Atom.GetValence(Chem.ValenceType.EXPLICIT) when available, with a fallback to Atom.GetExplicitValence() for compatibility with older RDKit versions.
  • Add a focused unit test that verifies both the new-API path and the legacy fallback path without requiring RDKit at test time.

Notes

Authored by OpenClaw (model: gpt-5.4)

Summary by CodeRabbit

Release Notes

  • Chores

    • Enhanced compatibility by adding automatic version detection and fallback support for different RDKit library versions to ensure consistent behavior.
  • Tests

    • Added comprehensive test coverage to verify valence calculations work correctly across different RDKit library versions.

Prefer the new RDKit explicit valence API when available and
fall back to the legacy method for older RDKit versions.

Add a focused unit test that covers both code paths without
requiring RDKit at test time.

Authored by OpenClaw (model: gpt-5.4)
@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Apr 18, 2026
@dosubot dosubot Bot added the dpdata label Apr 18, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 18, 2026

Warning

Rate limit exceeded

@pre-commit-ci[bot] has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 39 minutes and 38 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 39 minutes and 38 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7caa9e7f-1f34-4306-a7b1-e59a5fefbb31

📥 Commits

Reviewing files that changed from the base of the PR and between f1cf0be and 31c12f3.

📒 Files selected for processing (1)
  • tests/test_rdkit_sanitize.py
📝 Walkthrough

Walkthrough

Modified get_explicit_valence() function to prefer RDKit's newer GetValence(Chem.ValenceType.EXPLICIT) API when available, with fallback to the deprecated method for older versions. Added comprehensive tests for both code paths using mocked atom objects.

Changes

Cohort / File(s) Summary
RDKit API Compatibility Update
dpdata/rdkit/sanitize.py
Updated get_explicit_valence() to use newer RDKit valence API when available, tracking which method was used; falls back to legacy method if new API is unavailable; verbose logging updated to reference actual method name.
Unit Tests
tests/test_rdkit_sanitize.py
New test module with dynamic imports that validates get_explicit_valence() behavior under both RDKit API scenarios (new and legacy), using mocked atom objects and patched sys.modules.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely summarizes the main change: fixing deprecated RDKit API usage by adopting the newer valence API.
Linked Issues check ✅ Passed The pull request fully addresses issue #968 by replacing deprecated Atom.GetExplicitValence() with Atom.GetValence(Chem.ValenceType.EXPLICIT) and providing backward compatibility with fallback logic.
Out of Scope Changes check ✅ Passed All changes are directly aligned with the stated objective: updating the valence API in sanitize.py and adding comprehensive unit tests for both new and legacy API paths.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Apr 18, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 2 untouched benchmarks


Comparing njzjz-bot:fix/issue-968 (53ce12e) with master (e0f56ac)

Open in CodSpeed

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 18, 2026

Codecov Report

❌ Patch coverage is 57.14286% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.67%. Comparing base (e0f56ac) to head (53ce12e).

Files with missing lines Patch % Lines
dpdata/rdkit/sanitize.py 57.14% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #969      +/-   ##
==========================================
- Coverage   86.70%   86.67%   -0.03%     
==========================================
  Files          86       86              
  Lines        8039     8045       +6     
==========================================
+ Hits         6970     6973       +3     
- Misses       1069     1072       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_rdkit_sanitize.py (1)

8-12: Optional: consider importing via the package instead of loading the file by path.

Loading sanitize.py via importlib.util.spec_from_file_location bypasses dpdata.rdkit.__init__ and the normal package import path, which is unusual relative to the rest of the suite (e.g., tests/test_gaussian_driver.py uses importlib.util.find_spec + @unittest.skipIf to gate on optional deps). Since get_explicit_valence itself doesn't require rdkit at import time, a plain from dpdata.rdkit.sanitize import get_explicit_valence (optionally guarded) would be more conventional and avoid the assert SPEC.loader is not None dance. Not blocking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_rdkit_sanitize.py` around lines 8 - 12, The test currently loads
sanitize.py by path using MODULE_PATH, SPEC and
SPEC.loader.exec_module(sanitize); replace this with a normal package import to
avoid bypassing dpdata.rdkit.__init__: import the symbol directly (e.g. from
dpdata.rdkit.sanitize import get_explicit_valence) and, if you need to guard for
optional rdkit, use importlib.util.find_spec("rdkit") or a try/except
ImportError and apply unittest.skipIf or skip the test accordingly rather than
using importlib.util.spec_from_file_location and the SPEC.loader assertion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/test_rdkit_sanitize.py`:
- Around line 71-82: The test test_falls_back_to_legacy_api_when_new_api_missing
can silently succeed by accidentally using a real rdkit submodule cached at
sys.modules['rdkit.Chem']; when patching sys.modules to fake a top-level rdkit
you must also ensure sys.modules['rdkit.Chem'] is patched (either removed or set
to a harmless fake) so Python cannot fall back to a real submodule; update the
test to set sys.modules['rdkit.Chem'] to the same fake (or to None/placeholder)
alongside sys.modules['rdkit'] when exercising sanitize.get_explicit_valence to
guarantee the legacy-api fallback is actually being tested.

---

Nitpick comments:
In `@tests/test_rdkit_sanitize.py`:
- Around line 8-12: The test currently loads sanitize.py by path using
MODULE_PATH, SPEC and SPEC.loader.exec_module(sanitize); replace this with a
normal package import to avoid bypassing dpdata.rdkit.__init__: import the
symbol directly (e.g. from dpdata.rdkit.sanitize import get_explicit_valence)
and, if you need to guard for optional rdkit, use
importlib.util.find_spec("rdkit") or a try/except ImportError and apply
unittest.skipIf or skip the test accordingly rather than using
importlib.util.spec_from_file_location and the SPEC.loader assertion.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c35f8b2a-9824-4e42-a0bc-6476550c80cc

📥 Commits

Reviewing files that changed from the base of the PR and between e0f56ac and f1cf0be.

📒 Files selected for processing (2)
  • dpdata/rdkit/sanitize.py
  • tests/test_rdkit_sanitize.py

Comment thread tests/test_rdkit_sanitize.py Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Updates RDKit valence handling to avoid using a deprecated API while keeping compatibility with older RDKit versions, and adds a unit test to exercise both code paths without requiring RDKit at test time.

Changes:

  • Use Atom.GetValence(Chem.ValenceType.EXPLICIT) when available, with fallback to Atom.GetExplicitValence().
  • Improve verbose inconsistency message to include which valence API was used.
  • Add a new unit test for the preferred (new) API path and the legacy fallback path using mocked RDKit modules.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
dpdata/rdkit/sanitize.py Prefer non-deprecated RDKit explicit valence API with a compatibility fallback; update verbose message accordingly.
tests/test_rdkit_sanitize.py Add RDKit-free unit tests using mocks to cover new-API and fallback behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/test_rdkit_sanitize.py Outdated
Comment thread tests/test_rdkit_sanitize.py Outdated
@njzjz-bot
Copy link
Copy Markdown
Contributor Author

Addressed the review comments in 7a7f862:

  • import dpdata.rdkit.sanitize through the normal package path instead of loading the file by path
  • patch both sys.modules["rdkit"] and sys.modules["rdkit.Chem"] in the tests
  • assert the legacy fallback actually calls GetExplicitValence()

I could only do a local syntax check here because this machine does not currently have the full test deps installed, but CI on the PR should verify the RDKit-backed path.

@njzjz njzjz requested a review from wanghan-iapcm April 18, 2026 16:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dpdata size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[BUG] DEPRECATION WARNING: please use GetValence(Chem.ValenceType.EXPLICIT) instead

2 participants