fix(pt): clone inference coords before enabling grad#5476
Conversation
Avoid enabling gradients in-place on LAMMPS-provided coordinate views. Clone the extended coordinates before building force graphs and pass that cloned tensor to derivative generation while keeping auxiliary metadata private to model output conversion. Closes deepmodeling#5165 Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThree atomic models avoid in-place ChangesCoordinate Gradient Safety and force_coord Propagation
Sequence Diagram(s)sequenceDiagram
participant coord_view as coord_view (leaf view)
participant force_coord as force_coord (clone if needed)
participant atomic as AtomicModel.forward_common_atomic
participant fitter as fit_output_to_model_output
coord_view->>force_coord: if not requires_grad -> detach().clone().requires_grad_(True)
force_coord->>atomic: pass as extended_coord / positional coord
atomic->>fitter: atomic outputs + force_coord -> assemble final model outputs
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Only clone extended coordinates when gradients are not already enabled. This keeps existing autodiff and Hessian graphs intact while still avoiding in-place requires_grad_ on non-grad coordinate views. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #5476 +/- ##
==========================================
- Coverage 82.25% 81.36% -0.90%
==========================================
Files 833 868 +35
Lines 89094 96439 +7345
Branches 4227 4235 +8
==========================================
+ Hits 73287 78467 +5180
- Misses 14515 16672 +2157
- Partials 1292 1300 +8 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Pull request overview
This PR fixes PyTorch inference gradient handling so view tensors passed from lower-level callers such as LAMMPS are not modified in-place when force/virial derivatives are enabled.
Changes:
- Clones coordinate tensors before enabling gradients in PT atomic-model paths.
- Carries the derivative coordinate tensor through
_force_coordmetadata to model output transformation. - Adds regression tests for leaf-view coordinate inputs in atomic and lower model inference paths.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
deepmd/pt/model/atomic_model/base_atomic_model.py |
Preserves private metadata keys while applying atom masks. |
deepmd/pt/model/atomic_model/dp_atomic_model.py |
Clones grad-enabled coordinates and returns _force_coord. |
deepmd/pt/model/atomic_model/linear_atomic_model.py |
Applies the same coordinate cloning and _force_coord propagation for linear models. |
deepmd/pt/model/atomic_model/pairtab_atomic_model.py |
Applies cloning and keeps pairtab energy connected to _force_coord. |
deepmd/pt/model/model/make_model.py |
Uses _force_coord as the autograd derivative source when available. |
source/tests/pt/model/test_dp_atomic_model.py |
Adds atomic-model regression coverage for leaf-view coordinate inputs. |
source/tests/pt/model/test_dp_model.py |
Adds lower-model regression coverage for leaf-view coordinate inputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| coord = to_torch_tensor(self.coord_ext).requires_grad_(True) | ||
| coord_view = coord.view(self.nf, self.nall, 3) |
| coord_view = coord_ext.requires_grad_(True).view(self.nf, -1, 3) | ||
|
|
||
| ret = md0.forward_lower(coord_view, atype_ext, nlist, do_atomic_virial=True) | ||
|
|
||
| self.assertIn("extended_force", ret) |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
source/tests/pt/model/test_dp_atomic_model.py (1)
77-103: ⚡ Quick winAdd documentation and explicit verification for the leaf-view regression test.
This test protects against issue
#5165(RuntimeError when enabling gradients on leaf-view tensors), but that context isn't documented. Future maintainers won't understand why this specific test setup matters.Consider these improvements:
- Add a docstring or comment explaining this is a regression test for issue
#5165and why the leaf-view scenario is critical (LAMMPS provides view tensors)- Add explicit assertions to verify the test preconditions:
self.assertTrue(coord.is_leaf, "coord must be a leaf tensor") self.assertTrue(coord_view._is_view(), "coord_view must be a view")- Consider validating output correctness beyond just key existence—for example, check that energy values are reasonable or match a known reference
The current test will catch crashes (good), but stronger documentation and precondition checks would make the test's purpose clearer and more robust.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@source/tests/pt/model/test_dp_atomic_model.py` around lines 77 - 103, The test test_forward_common_atomic_accepts_leaf_view_input lacks documentation and explicit precondition checks for the leaf-view regression it protects; add a brief docstring or inline comment referencing issue `#5165` and why a leaf tensor with a view matters, then add explicit assertions before calling md0.forward_* to ensure coord.is_leaf is True and coord_view._is_view() (or coord_view.is_view() depending on API) is True; finally add at least one lightweight correctness check on outputs (e.g., energy tensor shape and finite values from ret["energy"] or a reasonable range) to validate results beyond key existence while keeping the original crash-protection behavior in test_forward_common_atomic_accepts_leaf_view_input.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@source/tests/pt/model/test_dp_atomic_model.py`:
- Around line 77-103: The test
test_forward_common_atomic_accepts_leaf_view_input lacks documentation and
explicit precondition checks for the leaf-view regression it protects; add a
brief docstring or inline comment referencing issue `#5165` and why a leaf tensor
with a view matters, then add explicit assertions before calling md0.forward_*
to ensure coord.is_leaf is True and coord_view._is_view() (or
coord_view.is_view() depending on API) is True; finally add at least one
lightweight correctness check on outputs (e.g., energy tensor shape and finite
values from ret["energy"] or a reasonable range) to validate results beyond key
existence while keeping the original crash-protection behavior in
test_forward_common_atomic_accepts_leaf_view_input.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 35aef69e-ecc3-4b04-8a62-ab185f5ee8c0
📒 Files selected for processing (2)
source/tests/pt/model/test_dp_atomic_model.pysource/tests/pt/model/test_dp_model.py
🚧 Files skipped from review as they are similar to previous changes (1)
- source/tests/pt/model/test_dp_model.py
|
Updated the implementation to avoid passing the cloned coordinate through a private This keeps atomic model outputs clean while still avoiding in-place Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5) |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/pt/model/atomic_model/dp_atomic_model.py`:
- Around line 274-275: In DPAtomicModel.forward_atomic (and any call paths into
forward_common_atomic), avoid calling extended_coord.requires_grad_(True)
in-place on a potential leaf/view tensor; instead create a non-leaf clone of
extended_coord before enabling gradients so autograd/view semantics are
preserved (e.g., clone extended_coord and then set requires_grad on the clone),
and use that cloned tensor for the rest of the forward pass.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d5a0a4a7-fbb5-43e8-af5d-2b26080a3e6c
📒 Files selected for processing (5)
deepmd/pt/model/atomic_model/dp_atomic_model.pydeepmd/pt/model/atomic_model/linear_atomic_model.pydeepmd/pt/model/atomic_model/pairtab_atomic_model.pydeepmd/pt/model/model/make_model.pysource/tests/pt/model/test_dp_atomic_model.py
💤 Files with no reviewable changes (1)
- source/tests/pt/model/test_dp_atomic_model.py
| nframes, nloc, nnei = nlist.shape | ||
| extended_coord = extended_coord.view(nframes, -1, 3) | ||
| if self.do_grad_r() or self.do_grad_c(): | ||
| if (self.do_grad_r() or self.do_grad_c()) and not extended_coord.requires_grad: |
There was a problem hiding this comment.
Line 277 reshapes extended_coord = extended_coord.view(...), making the local variable a non-leaf view. The new guard at line 278 still lets line 279 call extended_coord.requires_grad_(True) on that non-leaf view, which raises RuntimeError: you can only change requires_grad flags of leaf variables. Compare with linear_atomic_model.py:261-263 where the requires_grad_ call precedes the .view(), and with dp_atomic_model.py:274-275 which clones to obtain a leaf. The standard make_model.forward_common_lower path masks this because it clones upstream, but a direct caller of PairTabAtomicModel.forward_common_atomic with a leaf view — the exact scenario the new test_dp_atomic_model.py test is designed to cover — would crash. Reorder (requires_grad_ before .view()) or apply the same .clone() treatment as dp_atomic_model.py.
There was a problem hiding this comment.
Fixed in 912ea401: LinearEnergyAtomicModel now also clones before requires_grad_(True), matching the DP/pair-tab treatment, so a leaf-view input is not mutated in-place before reshaping. PairTabAtomicModel was already using the clone path on the current branch.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
| ] | ||
| ret = md0.forward_common_atomic(*args) | ||
|
|
||
| self.assertIn("energy", ret) |
There was a problem hiding this comment.
test_forward_common_atomic_accepts_leaf_view_input only asserts assertIn("energy", ret) — no check that coord_view is unmutated (e.g. self.assertFalse(coord_view.requires_grad) and/or value equality against a pre-call clone). The test would pass even if dp_atomic_model.py reverted to the in-place requires_grad_(True) pattern — i.e. it does not exercise the regression the PR claims to fix. The corresponding test_dp_model.py::test_forward_lower_accepts_leaf_view_input does include the requires_grad assertion; mirror that here.
There was a problem hiding this comment.
Thanks, agreed. The existing DP atomic test now checks both that the caller's view remains requires_grad=False and that values are unchanged after the call. I also added the same leaf-view regression coverage for linear and pair-tab atomic models in 912ea401.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
| atype = extended_atype[:, :nloc] | ||
| if self.do_grad_r() or self.do_grad_c(): | ||
| extended_coord.requires_grad_(True) | ||
| if (self.do_grad_r() or self.do_grad_c()) and not extended_coord.requires_grad: |
There was a problem hiding this comment.
The new compound guard and not extended_coord.requires_grad introduces a second boolean branch, but the new tests cover only the requires_grad=False branch (where the clone happens). The requires_grad=True branch — the motivating case for the guard, where existing autograd state should be preserved — is untested. The same guard added at linear_atomic_model.py:261 and pairtab_atomic_model.py:278 has no new tests at all.
There was a problem hiding this comment.
Added/kept coverage for both branches:
requires_grad=False: verifies leaf-view inputs are accepted without mutating the caller tensor.requires_grad=True: verifies existing autograd state is preserved and gradients flow.
This now covers DP atomic, linear atomic, and pair-tab atomic paths. Local uvx pre-commit run --files ... passes. I could not run the PyTorch tests locally because this environment does not have torch; trying to fetch it transiently timed out due the large package download.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
Address review feedback for the coordinate gradient regression by cloning PairTab coordinates before reshaping, asserting direct atomic callers do not mutate view inputs, and covering already grad-enabled inputs for DP, model lower, linear, and pairtab paths. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Apply the same clone-before-requires-grad treatment to the linear atomic wrapper so leaf views are not mutated in-place before reshaping. Add regression coverage for leaf-view inputs in linear and pair-tab atomic models. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Apply ruff-format's line wrapping to the DP model regression test so pre-commit.ci does not need to autofix the PR branch. Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Summary
Tests
uvx pre-commit run --files deepmd/pt/model/atomic_model/base_atomic_model.py deepmd/pt/model/atomic_model/dp_atomic_model.py deepmd/pt/model/atomic_model/linear_atomic_model.py deepmd/pt/model/atomic_model/pairtab_atomic_model.py deepmd/pt/model/model/make_model.py source/tests/pt/model/test_dp_atomic_model.py source/tests/pt/model/test_dp_model.pyPYTHONPATH=$PWD uv run --index-strategy unsafe-best-match --with pytest --with numpy --with scipy --with pyyaml --with dargs --with typing_extensions --with h5py --with wcmatch --with packaging --with ml_dtypes --with mendeleev --with array-api-compat --with lmdb --with msgpack --with torch==2.5.1+cpu --extra-index-url https://download.pytorch.org/whl/cpu --no-project pytest source/tests/pt/model/test_dp_model.py::TestDPModel::test_forward_lower_accepts_leaf_view_input source/tests/pt/model/test_dp_atomic_model.py::TestDPAtomicModel::test_forward_common_atomic_accepts_leaf_view_input -qCloses #5165
Authored by OpenClaw (model: custom-chat-jinzhezeng-group/gpt-5.5)
Summary by CodeRabbit
Refactor
Tests