
feat(pt_expt): full model and refact the module output names of dpmodel backend#5243

Closed
wanghan-iapcm wants to merge 77 commits into deepmodeling:master from wanghan-iapcm:feat-full-model

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Feb 14, 2026

Summary by CodeRabbit

  • New Features

    • Exposed model introspection (descriptor and output/bias accessors) and a PyTorch experimental energy model with traceable lower-level export and translated output mapping.
  • Improvements

    • Better device propagation for GPU/accelerator allocations, backend-agnostic input/output casting, and removal of in-place mutations for safer computation.
  • Refactor

    • Streamlined PyTorch module wrapping to decorator-based classes for cleaner runtime integration.
  • Tests

    • Added extensive autodiff and cross-backend tests for energy, force, and virial (including PT-Expt).

@dosubot dosubot bot added the new feature label Feb 14, 2026
Comment on lines +36 to +44
def forward(
self,
coord: torch.Tensor,
atype: torch.Tensor,
box: torch.Tensor | None = None,
fparam: torch.Tensor | None = None,
aparam: torch.Tensor | None = None,
do_atomic_virial: bool = False,
) -> dict[str, torch.Tensor]:

Check warning

Code scanning / CodeQL

Signature mismatch in overriding method Warning

This method requires at least 3 positional arguments, whereas overridden
Identity.forward
requires 2.
This method requires at least 3 positional arguments, whereas overridden test_torch_module_respects_explicit_forward.MockModule.forward requires 2.
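A minimal sketch of how such a mismatch can be resolved (the classes here are hypothetical stand-ins, not the actual deepmd modules): keep the override's required positionals identical to the base class and make every new input an optional keyword argument, forwarding **kwargs.

```python
# Hypothetical base/child pair illustrating a call-compatible override.
class Base:
    def forward(self, x, **kwargs):
        return x

class Child(Base):
    # Same required positionals as Base; everything new is optional,
    # so any call valid on Base is also valid on Child.
    def forward(self, x, extra=None, **kwargs):
        y = super().forward(x, **kwargs)
        return y if extra is None else y + extra

print(Child().forward(1))           # 1
print(Child().forward(1, extra=2))  # 3
```

Whether this is desirable for `forward` depends on how the base class is used; the warning can also be suppressed if the base signature is intentionally narrower.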
return
return super().__setattr__(name, value)

def call(self, x: torch.Tensor) -> torch.Tensor:

Check warning

Code scanning / CodeQL

Signature mismatch in overriding method Warning

This method does not accept arbitrary keyword arguments, which overridden
NativeOP.call
does.
This call
correctly calls the base method, but does not match the signature of the overriding method.
This method requires 2 positional arguments, whereas overridden
NativeOP.call
may be called with 1.
This call
correctly calls the base method, but does not match the signature of the overriding method.
This method requires 2 positional arguments, whereas overridden
NativeOP.call
may be called with arbitrarily many.
This call
correctly calls the base method, but does not match the signature of the overriding method.
)
# Compare the common keys
common_keys = set(dp_ret.keys()) & set(pt_ret.keys())
self.assertTrue(len(common_keys) > 0)

Check notice

Code scanning / CodeQL

Imprecise assert Note test

assertTrue(a > b) cannot provide an informative message. Using assertGreater(a, b) instead will give more informative messages.
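For comparison, a standalone sketch of the recommended style (`Demo` is a toy test case, not part of the suite): on failure, `assertGreater` reports both operands (e.g. "0 not greater than 0") instead of `assertTrue`'s bare "False is not true".

```python
import unittest

class Demo(unittest.TestCase):
    def test_common_keys(self):
        common_keys = {"energy"}  # stand-in for set(dp_ret) & set(pt_ret)
        # Reports both operands on failure, unlike assertTrue(len(...) > 0).
        self.assertGreater(len(common_keys), 0)

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
)
print(result.wasSuccessful())  # True
```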
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request implements the full model support for the experimental PyTorch backend (pt_expt), including hooks for descriptor and fitting layer evaluation, and autograd-based derivative calculations. The changes also improve backend-agnostic array operations in the core dpmodel and ensure compatibility with torch.fx tracing. I have identified a few issues regarding potential runtime errors in the new evaluation hooks and unintended side effects in the output definition translation logic.

@coderabbitai
Contributor

coderabbitai bot commented Feb 14, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds PT-Expt PyTorch-exportable model layers and autodiff-based output transformations, introduces descriptor and bias accessors, refactors input/output casting across backends, enables middle-output capture in fitting, updates device-aware tensor creation, replaces in-place network ops, and adds extensive pt_expt tests.

Changes

Cohort / File(s) Summary
Fitting output path
deepmd/dpmodel/fitting/general_fitting.py
Always return a results dict keyed by the model var name; introduce local results dict and ensure middle-output is stored under result mapping for mixed and non-mixed flows.
Model accessors & energy mapping (dpmodel)
deepmd/dpmodel/model/dp_model.py, deepmd/dpmodel/model/ener_model.py
Added get_descriptor() accessor in DPModelCommon; added translated_output_def() in dpmodel energy model to map internal names to user-facing outputs.
Type-cast refactor (dpmodel core)
deepmd/dpmodel/model/make_model.py
Renamed public casting helpers to internal _input_type_cast/_output_type_cast, generalized dtype handling/return types, added out-bias accessors (get_out_bias, set_out_bias, change_out_bias) and updated call wiring.
Device-aware output transforms (dpmodel)
deepmd/dpmodel/model/transform_output.py
Propagate device when allocating arrays (zeros/virial/hessian) to ensure tensors are created on the mapping/device.
In-place arithmetic removal (dpmodel utils)
deepmd/dpmodel/utils/network.py
Replaced in-place operators with explicit arithmetic assignments in network forward logic.
PT-Expt module decorator migration
deepmd/pt_expt/atomic_model/dp_atomic_model.py, deepmd/pt_expt/fitting/ener_fitting.py, deepmd/pt_expt/fitting/invar_fitting.py
Replace manual torch.nn.Module plumbing with @torch_module decorator; remove custom init/call/setattr and dpmodel_setattr usages.
PT-Expt model infrastructure & EnergyModel
deepmd/pt_expt/model/__init__.py, deepmd/pt_expt/model/make_model.py, deepmd/pt_expt/model/ener_model.py
Add pt_expt make_model factory and a new EnergyModel class exposing forward / forward_lower and traceable lower-path exports; wire DPModelCommon integration for PyTorch.
PT-Expt transform & autodiff utilities
deepmd/pt_expt/model/transform_output.py
Add atomic_virial_corr, task_deriv_one, take_deriv, fit_output_to_model_output and helpers to convert fitting-network outputs into model outputs via torch.autograd, handling forces, virials, masking, and atomic decomposition.
PT-Expt network/tracing compatibility
deepmd/pt_expt/utils/network.py
Adjust parameter wrapping and add NativeLayer.call plus _torch_activation to avoid make_fx proxy-tracing issues; ensure parameters register appropriately.
PD/PT small wiring changes
deepmd/pd/model/model/make_model.py, deepmd/pt/model/model/make_model.py
Rename input/output cast helpers to underscored variants and update call sites to match dpmodel internal convention.
Atomic model runtime checks
deepmd/pt/model/atomic_model/dp_atomic_model.py
Add runtime validation to raise clear errors if eval_descriptor or eval_fitting_last_layer caches are empty when queried.
Tests & test infra additions
source/tests/... (multiple files)
Add PT-Expt test support and many new tests: PT-Expt energy model unit tests, autodiff finite-difference force/virial tests, integration with existing cross-backend Ener tests, and eval_pt_expt_model helper in common test utilities.
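The autodiff pathway described above for deepmd/pt_expt/model/transform_output.py can be sketched in isolation. This is an illustrative toy (toy_energy and energy_and_force are stand-ins, not the actual take_deriv/fit_output_to_model_output API): the force is the negative gradient of the reduced energy with respect to the coordinates, obtained via torch.autograd.grad.

```python
import torch

def toy_energy(coord):
    # Stand-in for the fitting-network energy: per-frame sum of squares.
    return (coord ** 2).sum(dim=(-2, -1))

def energy_and_force(coord):
    # Enable grad on the coordinates, reduce the per-frame energy,
    # then differentiate; force = -dE/dx.
    coord = coord.detach().requires_grad_(True)
    energy = toy_energy(coord)
    (grad,) = torch.autograd.grad(energy.sum(), coord, create_graph=False)
    return energy, -grad

coord = torch.ones(1, 4, 3)  # [nframes, natoms, 3]
e, f = energy_and_force(coord)
# For E = sum(x^2), the analytic force is -2x, so f is all -2 here.
```

The real implementation additionally handles virials, masking of padded atoms, and the atomic decomposition, but the grad call above is the core mechanism.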

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client Code
    participant EM as EnergyModel (PT-Expt)
    participant Desc as Descriptor
    participant Fit as Fitting Net
    participant TO as TransformOutput
    participant Autograd as PyTorch Autograd

    Client->>EM: forward(coord, atype, box, ...)
    EM->>Desc: compute descriptors
    Desc-->>EM: descriptor tensor
    EM->>Fit: evaluate fitting network
    Fit-->>EM: fitting outputs (per-atom/reducible)
    EM->>TO: fit_output_to_model_output(fit_ret, coord_ext, ...)
    TO->>Autograd: enable grad on extended coords
    Autograd->>TO: compute ∇energy -> forces, virial, atom_virial
    TO-->>EM: model outputs (energy, atom_energy, force, virial, ...)
    EM-->>Client: return outputs
sequenceDiagram
    participant GP as GeneralFitting
    participant PT as Per-type Net(s)
    participant Acc as MiddleOutput Accumulator
    participant Res as Results Dict

    GP->>GP: eval_return_middle_output?
    alt mixed_types False
        GP->>Acc: init per-type accumulation
        loop per-type
            GP->>PT: evaluate net(xx)
            PT-->>GP: output (+ middle_output)
            GP->>Acc: accumulate middle_output
        end
        GP->>Res: store Acc as "middle_output"
    else mixed_types True
        GP->>PT: call_until_last(xx)
        PT-->>GP: middle_output
        GP->>Res: store middle_output
    end
    GP-->>Res: set var_name -> output tensor
    GP-->>Client: return Res

(Note: colored rectangles not used; flows kept minimal.)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Suggested reviewers

  • iProzd
  • njzjz
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage: ⚠️ Warning. Docstring coverage is 64.18%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
  • Merge Conflict Detection: ✅ Passed (no merge conflicts detected when merging into master).
  • Title check: ✅ Passed (the title accurately describes the main changes: full-model support for the pt_expt backend and refactoring of the dpmodel backend's module output names).


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@source/tests/consistent/model/test_ener.py`:
- Around line 1184-1190: The variable nloc is assigned but never used—remove the
unused assignment or replace it with a meaningful use; specifically in the test
block where nframes, coords_2f, atype_2f, box_2f, natoms_data, and energy_data
are set, delete the declaration "nloc = 6" (or if intended, use nloc to drive
array shapes/validation) so there are no unused local variables in the test_ener
setup.
- Around line 124-126: The test contains a duplicate assignment to pd_class
(assigned to EnergyModelPD twice); remove the redundant second assignment so
pd_class is only set once and leave the other class assignments (pt_expt_class =
EnergyModelPTExpt and jax_class = EnergyModelJAX) unchanged.
- Around line 979-988: In test_change_out_bias the local variable nloc is
assigned but never used; remove the unused assignment (nloc = 6) from the
test_change_out_bias function to satisfy static analysis (or if the intended
intent was to use it, replace references to hardcoded 6 with nloc where
appropriate) so that only necessary variables (e.g. coords_2f, atype_2f, box_2f,
natoms_data, energy_data) remain.
🧹 Nitpick comments (6)
deepmd/dpmodel/fitting/general_fitting.py (1)

599-639: Minor: double forward pass when eval_return_middle_output is enabled.

When the hook is active, the network is evaluated twice per type (or once extra for mixed types) — once for the full output (line 609/635) and again via call_until_last (line 621/639). This duplicates computation of all layers except the last.

Since the dpmodel backend is primarily for reference/testing rather than production, this is acceptable. However, if performance becomes a concern, you could refactor the network to return both the final and penultimate outputs in a single pass.
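The suggested single-pass refactor could look roughly like this (forward_with_middle and the toy layers are hypothetical, not the actual network API): run all layers once, keeping the penultimate activation before applying the last layer.

```python
# Sketch: return both the final and penultimate outputs from one pass,
# instead of evaluating the full network and then call_until_last.
def forward_with_middle(layers, x):
    for layer in layers[:-1]:
        x = layer(x)
    middle = x            # penultimate activation, the "middle_output"
    out = layers[-1](x)   # final layer
    return out, middle

layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
out, middle = forward_with_middle(layers, 1)
print(out, middle)  # 1 4
```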

deepmd/pt_expt/model/transform_output.py (1)

109-110: zip() without strict=True (ruff B905).

Both split_vv1 and split_svv1 are produced from the same size split, so they're guaranteed to have equal length. Adding strict=True makes this invariant explicit and guards against future refactors that might change one without the other.

Suggested fix
-    for vvi, svvi in zip(split_vv1, split_svv1):
+    for vvi, svvi in zip(split_vv1, split_svv1, strict=True):
source/tests/pt_expt/model/test_ener_model.py (1)

207-250: Consider extending DP-consistency test to also verify force values.

The consistency test validates energy and atom_energy but doesn't check force. Since the dpmodel sets derivative outputs to None, this is understandable, but you could compute a numerical finite-difference force from the dpmodel to cross-validate the autograd-based force from pt_expt. This would strengthen confidence in the derivative pathway.

(Note: the autodiff test file mentioned in the summary may already cover this — feel free to disregard if so.)
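A minimal version of the suggested finite-difference cross-check (numeric_force and energy_fn are illustrative helpers, not part of the test suite): perturb each coordinate component by ±eps and take the central difference of the energy.

```python
import numpy as np

def numeric_force(energy_fn, coord, eps=1e-4):
    """Central finite-difference force -dE/dx for a coord array [natoms, 3].

    energy_fn is any callable returning a scalar energy; here it stands in
    for a dpmodel evaluation.
    """
    force = np.zeros_like(coord)
    for idx in np.ndindex(coord.shape):
        cp, cm = coord.copy(), coord.copy()
        cp[idx] += eps
        cm[idx] -= eps
        force[idx] = -(energy_fn(cp) - energy_fn(cm)) / (2 * eps)
    return force

coord = np.ones((2, 3))
f = numeric_force(lambda c: (c ** 2).sum(), coord)
# Analytic force for E = sum(x^2) is -2x, so f is all -2 here.
```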

deepmd/pt_expt/model/ener_model.py (1)

100-147: forward_lower bakes fparam, aparam, do_atomic_virial into the traced graph.

The closure captures these values as constants during make_fx tracing, so the returned module is specialized for the specific fparam/aparam/do_atomic_virial values passed at trace time. This is appropriate for export workflows but means the caller must re-trace for different parameter configurations. The docstring correctly documents this ("Sample inputs with representative shapes"), but it might be worth explicitly noting that fparam/aparam/do_atomic_virial are baked in as well.
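The baking behavior described above can be reproduced in isolation (a sketch; model and scale are illustrative, not the actual forward_lower inputs): values captured by the traced closure are recorded into the graph as constants, so later changes have no effect on the traced module.

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx

def model(coord, scale):
    return coord * scale

# `scale` is captured by the closure at trace time and baked into the
# graph as the constant 2.0.
scale = 2.0
traced = make_fx(lambda c: model(c, scale))(torch.ones(3))

print(traced(torch.ones(3)))  # tensor([2., 2., 2.])
scale = 5.0                   # rebinding does not affect the traced graph
print(traced(torch.ones(3)))  # still tensor([2., 2., 2.])
```

Re-tracing with new sample values is required to change any baked-in input.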

source/tests/pt_expt/model/test_autodiff.py (2)

49-76: Docstring says cell shape is [nf, 3, 3], but the body reshapes to [nframes, 9].

The function accepts cell with shape [nf, 3, 3] per the docstring, which is consistent with the callers (e.g., cell.unsqueeze(0) producing [1, 3, 3]). However, line 72 reshapes it to [nframes, 9]. This works, but the docstring could note that either [nf, 3, 3] or [nf, 9] is accepted, since reshape(nframes, 9) handles both shapes silently. Very minor — just a documentation clarity nit.


154-181: Duplicated setUp between TestEnergyModelSeAForce and TestEnergyModelSeAVirial.

Both test classes have identical setUp methods. Consider extracting a shared helper or a common base mixin to reduce duplication.

That said, this pattern (identical setUp in mixin-based test classes) is common in this codebase and the duplication is small, so this is a minor nit.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1fa1eb27b2


@codecov

codecov bot commented Feb 14, 2026

Codecov Report

❌ Patch coverage is 75.09579% with 130 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.06%. Comparing base (a0bd530) to head (53a7e7f).
⚠️ Report is 11 commits behind head on master.

Files with missing lines Patch % Lines
deepmd/dpmodel/model/dp_zbl_model.py 40.32% 37 Missing ⚠️
deepmd/dpmodel/model/dipole_model.py 40.42% 28 Missing ⚠️
deepmd/dpmodel/model/dos_model.py 48.14% 14 Missing ⚠️
deepmd/dpmodel/model/polar_model.py 48.14% 14 Missing ⚠️
deepmd/dpmodel/model/property_model.py 48.14% 14 Missing ⚠️
deepmd/pt_expt/utils/network.py 75.75% 8 Missing ⚠️
deepmd/dpmodel/model/ener_model.py 91.30% 6 Missing ⚠️
deepmd/pt_expt/model/transform_output.py 93.90% 5 Missing ⚠️
deepmd/dpmodel/model/make_model.py 96.96% 1 Missing ⚠️
deepmd/dpmodel/model/spin_model.py 85.71% 1 Missing ⚠️
... and 2 more
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5243      +/-   ##
==========================================
- Coverage   82.12%   82.06%   -0.06%     
==========================================
  Files         736      740       +4     
  Lines       74237    74667     +430     
  Branches     3615     3616       +1     
==========================================
+ Hits        60966    61277     +311     
- Misses      12107    12225     +118     
- Partials     1164     1165       +1     


@njzjz njzjz linked an issue Feb 14, 2026 that may be closed by this pull request
@njzjz
Member

njzjz commented Feb 14, 2026

How do we handle model output keys in the new backend?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/atomic_model/dp_atomic_model.py`:
- Around line 217-221: The runtime check using assert in the block guarded by
self.enable_eval_fitting_last_layer_hook is unsafe because asserts are removed
with -O; replace it with an explicit runtime exception (e.g., raise RuntimeError
or ValueError) that checks "middle_output" in ret and raises a clear error
message, then pop the key and append to self.eval_fitting_last_layer_list as
before (refer to the symbols self.enable_eval_fitting_last_layer_hook,
ret.pop("middle_output"), and self.eval_fitting_last_layer_list to locate the
code).
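A sketch of the requested change (class and method names are illustrative, modeled on the symbols mentioned above): raise an explicit exception so the check survives `python -O`, which strips `assert` statements.

```python
# Hypothetical mixin modeled on the hook symbols named in the review.
class AtomicModelHookMixin:
    def __init__(self):
        self.enable_eval_fitting_last_layer_hook = True
        self.eval_fitting_last_layer_list = []

    def _collect_middle_output(self, ret):
        if self.enable_eval_fitting_last_layer_hook:
            if "middle_output" not in ret:
                # Explicit exception instead of `assert`, so the check
                # is not removed under -O.
                raise RuntimeError(
                    "eval_fitting_last_layer hook is enabled, but the "
                    "fitting network returned no 'middle_output'."
                )
            self.eval_fitting_last_layer_list.append(ret.pop("middle_output"))
        return ret

m = AtomicModelHookMixin()
m._collect_middle_output({"energy": 1.0, "middle_output": [0.5]})
print(m.eval_fitting_last_layer_list)  # [[0.5]]
```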
🧹 Nitpick comments (1)
deepmd/dpmodel/atomic_model/dp_atomic_model.py (1)

140-148: Consider clearing the cache after retrieval to prevent stale data and unbounded growth.

Both eval_descriptor() and eval_fitting_last_layer() leave the cache intact after returning the concatenated result. If a caller invokes them, then runs more forward passes, a subsequent call silently includes data from both the old and new passes. If clearing isn't desired, at minimum document the accumulation semantics so callers know to call set_eval_*_hook(True) again to reset.

Also applies to: 156-164
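A sketch of the clearing variant (DescriptorCache is a hypothetical stand-in for the accessor pair, assuming eval_descriptor-like semantics): concatenate the accumulated results, then empty the cache so a later retrieval cannot mix old and new passes.

```python
import numpy as np

class DescriptorCache:
    def __init__(self):
        self._cache = []

    def record(self, descriptor):
        # Called once per forward pass while the hook is enabled.
        self._cache.append(descriptor)

    def eval_descriptor(self):
        out = np.concatenate(self._cache)
        self._cache.clear()  # avoid stale data and unbounded growth
        return out

c = DescriptorCache()
c.record(np.zeros((2, 4)))
c.record(np.ones((3, 4)))
print(c.eval_descriptor().shape)  # (5, 4)
print(len(c._cache))              # 0
```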

@wanghan-iapcm
Collaborator Author

How do we handle model output keys in the new backend?

Shouldn't it always use the output keys in the dpmodel backend?

@wanghan-iapcm wanghan-iapcm requested a review from njzjz February 15, 2026 11:52
Comment on lines +101 to +109
def call(
self,
coord: Array,
atype: Array,
box: Array | None = None,
fparam: Array | None = None,
aparam: Array | None = None,
do_atomic_virial: bool = False,
) -> dict[str, Array]:

Check warning

Code scanning / CodeQL

Signature mismatch in overriding method Warning

This method does not accept arbitrary keyword arguments, which overridden
NativeOP.call
does.
This call
correctly calls the base method, but does not match the signature of the overriding method.
This method requires at most 7 positional arguments, whereas overridden
NativeOP.call
may be called with arbitrarily many.
This call
correctly calls the base method, but does not match the signature of the overriding method.
This method requires at least 3 positional arguments, whereas overridden
NativeOP.call
may be called with 1.
This call
correctly calls the base method, but does not match the signature of the overriding method.
The same CodeQL warning (signature mismatch with the overridden NativeOP.call, which may be called with arbitrary positional and keyword arguments) is repeated verbatim for five more identical call overrides, commented on lines +56 to +64, +48 to +56, +58 to +66, +48 to +56, and +51 to +59.
@wanghan-iapcm wanghan-iapcm marked this pull request as draft February 16, 2026 12:02
@wanghan-iapcm wanghan-iapcm changed the title feat(pt_expt): full model feat(pt_expt): full model and refact the module output names of dpmodel backend Feb 16, 2026
@wanghan-iapcm wanghan-iapcm added the Test CUDA Trigger test CUDA workflow label Feb 16, 2026
@github-actions github-actions bot removed the Test CUDA Trigger test CUDA workflow label Feb 16, 2026


Development

Successfully merging this pull request may close these issues.

Full Model and Autograd Support (PyTorch Exportable)

2 participants