feat(pt/pt_expt): O(N) vesin neighbor list for Python/ASE inference#5486
feat(pt/pt_expt): O(N) vesin neighbor list for Python/ASE inference#5486wanghan-iapcm wants to merge 6 commits into
Conversation
…erence The Python/ASE inference path builds the neighbor list with the native all-pairs O(N^2) algorithm, which dominates end-to-end throughput for large systems (e.g. via the ASE calculator). Add an opt-in `nlist_backend` option (default "vesin") that builds the neighbor list with the O(N) vesin cell list on the host, then feeds it to the model's own forward_common_lower + communicate_extended_output, leaving the model graph (and results) unchanged. - deepmd/dpmodel/utils/nlist.py: shared `build_neighbor_list_vesin` + `is_vesin_available`; converts vesin (i,j,S) edges to the deepmd extended format (extended_coord/atype, mapping, candidate nlist). - deepmd/pt/infer/deep_eval.py: `nlist_backend` kwarg; `_eval_model_vesin` calls the exported forward_common_lower and reduces extended outputs to local. Gated to non-spin, non-hessian models; falls back to native when vesin is unavailable. - deepmd/pt_expt/infer/deep_eval.py: `nlist_backend` kwarg; vesin branch in `_prepare_inputs` (ASE neighbor_list still takes precedence). - deepmd/calculator.py: thread `nlist_backend` through the ASE DP calculator. - pyproject.toml: add `vesin` dependency. Results match the native builder bit-for-bit in energy and to float64 round-off in forces/virials (ghost ordering differs). Works with auto_batch_size. Tests: - common/dpmodel/test_nlist.py::TestNeighListVesin (builder vs native) - pt/model/test_deeppot.py::TestDeepPot (dispatch + native-vs-vesin, PBC/non-PBC, atomic, multiframe, auto_batch_size) - pt_expt/infer/test_deep_eval.py::TestDeepEvalEner (vesin consistency)
Add committed tests for the pt_expt DeepPot interface mirroring the pt backend coverage: native disables vesin, invalid backend -> ValueError, vesin-unavailable fallback, and spin gate-off.
Replace the synthetic spin gate-off tests (which mutated the _is_spin flag on an energy model) with end-to-end tests that load a real spin model and run the spin eval path with nlist_backend="vesin": - pt: TestDeepPotSpinNlistBackend builds an in-process spin model, asserts vesin is gated off, and checks the spin eval (e/f/v/ae/av/fm/mask_mag) matches nlist_backend="native". - pt_expt: TestSpinInference.test_nlist_backend_vesin_gated_off_for_spin loads the real .pt2/.pte spin fixtures, asserts vesin is gated off, and checks the spin eval matches the reference.
Replace the silent spin/hessian/vesin-missing fallback with explicit behavior: - "auto" (new default): use the vesin O(N) builder when applicable (vesin installed, non-spin, non-hessian) and silently fall back to native otherwise. Preserves the fast path for energy models by default. - "vesin": strict — raise ValueError if vesin is missing, the model is spin or hessian, or (pt_expt) an explicit ASE neighbor_list is also given. - "native": force the native builder. The ASE DP calculator default also becomes "auto". Tests updated to cover every dispatch branch (invalid, native, auto-fallback and strict-raise for each blocker) on real spin models for both backends.
Add committed tests that the ASE `deepmd.calculator.DP` threads `nlist_backend` to the underlying DeepPot backend and gives backend-independent results: - pt backend (.pth) and pt_expt backend (.pte) calculator tests compare native vs vesin energy/forces. - pt_expt: an explicit ASE neighbor_list takes precedence over the auto vesin path (_use_vesin stays False).
vesin.torch runs the cell list on the device of the input tensors (CPU or CUDA), and torch is a hard dependency of both the pt and pt_expt backends, so the separate numpy builder was redundant. Consolidate on a single torch builder: - New deepmd/pt_expt/utils/nlist.py: build_neighbor_list_vesin_torch + is_vesin_torch_available. The neighbor search stays on the input device, so GPU inference avoids the host round-trip. The vesin call is pinned to the input device to ignore any ambient default-device setting. - Remove the numpy build_neighbor_list_vesin / is_vesin_available from deepmd/dpmodel/utils/nlist.py (dpmodel is now vesin-free; torch code does not belong in the array-API backend). - Both pt and pt_expt deep_eval import the torch builder from deepmd.pt_expt.utils.nlist and gate on is_vesin_torch_available(). - pyproject: depend on vesin[torch] so the device-capable bindings ship by default. Tests: vesin builder test moved to source/tests/pt_expt/utils/test_nlist.py (compares against the native builder, plus an input-device-placement check); skip guards switched to vesin.torch.
|
|
||
| def test_nlist_backend_vesin_unavailable(self) -> None: | ||
| # "auto" silently falls back; explicit "vesin" raises. | ||
| import deepmd.pt.infer.deep_eval as deep_eval_mod |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 237fd1119c
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| native all-pairs O(N^2) neighbor-list build, which dominates runtime for | ||
| large systems on the Python / ASE inference path. | ||
| """ | ||
| model = self.dp.model["Default"].to(DEVICE) |
There was a problem hiding this comment.
Preserve data modifiers on the vesin path
For frozen .pth models that carry data_modifier.pth, this path now bypasses ModelWrapper by calling the bare Default model, so the modifier contribution that the native path adds in ModelWrapper.forward is silently dropped whenever nlist_backend resolves to auto/vesin. In those deployments (e.g. models with an added correction term), energies/forces/virials from the default inference path will be missing the modifier rather than matching the existing native results; either apply the modifier after forward_common_lower or keep such models on the native wrapper path.
Useful? React with 👍 / 👎.
📝 WalkthroughWalkthroughThe PR introduces configurable neighbor-list backend selection for inference, allowing users to opt into faster Vesin torch-based construction instead of the native implementation. It adds validation, device-resident execution, automatic fallback semantics for ChangesNeighbor-list backend selection
Sequence DiagramsequenceDiagram
participant User as User code
participant DP as DP (ASE calculator)
participant DeepPot as DeepPot (inference)
participant DeepEval as DeepEval
participant Vesin as vesin.torch
User->>DP: DP(model, nlist_backend='auto')
DP->>DeepPot: nlist_backend parameter
DeepPot->>DeepEval: initialize with nlist_backend
DeepEval->>DeepEval: _setup_nlist_backend validation
alt vesin available & supported
DeepEval->>Vesin: build_neighbor_list_vesin_torch
Vesin->>DeepEval: (ext_coord, ext_atype, nlist, mapping)
else fallback to native
DeepEval->>DeepEval: native neighbor list
end
DeepEval->>DeepPot: extended/native neighbor list
DeepPot->>User: energy/forces
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/pt_expt/infer/deep_eval.py`:
- Around line 144-184: The vesin backend must also be blocked for models that
expose Hessian outputs; in _setup_nlist_backend, extend the unsupported check
(currently using self._is_spin) to detect Hessian output variables (e.g. inspect
self.output_variables for any with category ==
OutputVariableCategory.DERV_R_DERV_R) and set unsupported accordingly (e.g.
"hessian"), then use that unsupported flag in the existing vesin and auto
branches so vesin is rejected when Hessians are requested and self._use_vesin is
False in that case.
In `@deepmd/pt_expt/utils/nlist.py`:
- Around line 170-180: The code currently truncates candidate neighbors to keep
= min(nsel, max_nn) before calling nlist_distinguish_types, which drops
farther-type candidates; change the flow so that when distinguish_types is True
you preserve the larger candidate pool for type-based selection: build nlist
using a larger keep (e.g., keep = min(max_nn, sorted_idx.shape[1]) or otherwise
the full available candidates) and call nlist_distinguish_types(nlist[None],
extended_atype[None], sel)[0] first, then after the per-type selection crop the
resulting nlist back to nsel slots; update references to nlist, keep,
sorted_idx, sorted_valid, and nlist_distinguish_types accordingly.
In `@deepmd/pt/infer/deep_eval.py`:
- Around line 144-145: The constructor in deepmd/pt/infer/deep_eval.py currently
accepts neighbor_list and nlist_backend but never uses neighbor_list before the
vesin path is chosen (see the constructor and the vesin-enabling branch around
the vesin path), so update the constructor logic to either (A) pass the provided
neighbor_list through to wherever the backend is configured (wire the ASE
NewPrimitiveNeighborList into the vesin/other backend setup) or (B) immediately
raise a ValueError when neighbor_list is not None and nlist_backend == "vesin"
(or when neighbor_list is non-None and the chosen backend cannot consume it),
ensuring the constructor (the __init__ / class constructor that defines
neighbor_list and nlist_backend) enforces or propagates the caller-provided list
rather than silently ignoring it.
In `@pyproject.toml`:
- Around line 61-63: The pyproject currently lists 'vesin[torch]' as an
unconditional runtime dependency; move it into an optional extra and update
metadata injection so it's only requested for the Torch/ASE inference path.
Create a new optional extra name (e.g., "vesin-torch" or "torch-inference") in
pyproject.toml and remove 'vesin[torch]' from [project].dependencies; instead
add it under [project.optional-dependencies] with that extra key. Update
backend/dynamic_metadata.py (which currently exposes
get_pt_requirement/get_tf_requirement) to either add a new
get_vesin_requirement/get_torch_inference_requirement that returns the optional
extra name or ensure code that triggers the Torch/ASE path references the new
optional extra so runtime can fall back when vesin is unavailable.
In `@source/tests/pt_expt/infer/test_deep_eval.py`:
- Around line 721-731: The test
test_nlist_backend_vesin_conflicts_with_neighbor_list imports ase.neighborlist
unconditionally which causes failures when ASE is not installed; update the test
to detect ASE availability and skip if missing by attempting to import ase (or
using pytest.importorskip) and calling self.skipTest when import fails before
constructing DeepPot with nlist_backend="vesin" and
neighbor_list=ase.neighborlist.NewPrimitiveNeighborList(...); keep references to
DeepPot, nlist_backend, and neighbor_list intact so the behavior under test
remains unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: e18f825f-19c5-480c-8700-3c04ffcbbc31
📒 Files selected for processing (10)
deepmd/calculator.pydeepmd/pt/infer/deep_eval.pydeepmd/pt_expt/infer/deep_eval.pydeepmd/pt_expt/utils/nlist.pypyproject.tomlsource/tests/pt/model/test_deeppot.pysource/tests/pt/test_calculator.pysource/tests/pt_expt/infer/test_deep_eval.pysource/tests/pt_expt/infer/test_deep_eval_spin.pysource/tests/pt_expt/utils/test_nlist.py
| def _setup_nlist_backend(self, nlist_backend: str) -> None: | ||
| """Resolve the requested neighbor-list backend for the inference path. | ||
|
|
||
| ``"auto"`` uses the O(N) vesin cell list when applicable and silently | ||
| falls back to native otherwise; ``"vesin"`` is strict and raises if it | ||
| cannot be honored; ``"native"`` forces the native builder. An explicitly | ||
| supplied ASE ``neighbor_list`` takes precedence over the vesin path. | ||
| """ | ||
| if nlist_backend not in ("auto", "vesin", "native"): | ||
| raise ValueError( | ||
| f"Unknown nlist_backend {nlist_backend!r}; " | ||
| "expected 'auto', 'vesin', or 'native'." | ||
| ) | ||
| self.nlist_backend = nlist_backend | ||
| ase_provided = self.neighbor_list is not None | ||
| unsupported = "spin" if self._is_spin else None | ||
| if nlist_backend == "vesin": | ||
| # explicit request: fail loudly if it cannot be honored | ||
| if not is_vesin_torch_available(): | ||
| raise ValueError( | ||
| "nlist_backend='vesin' was requested but the 'vesin.torch' " | ||
| "package is not installed; install it (`pip install " | ||
| "vesin[torch]`) or use nlist_backend='native' (or 'auto')." | ||
| ) | ||
| if unsupported is not None: | ||
| raise ValueError( | ||
| f"nlist_backend='vesin' is not supported for {unsupported} " | ||
| "models; use nlist_backend='native' (or 'auto')." | ||
| ) | ||
| if ase_provided: | ||
| raise ValueError( | ||
| "nlist_backend='vesin' conflicts with an explicitly supplied " | ||
| "neighbor_list; pass only one." | ||
| ) | ||
| self._use_vesin = True | ||
| elif nlist_backend == "native": | ||
| self._use_vesin = False | ||
| else: # auto: use vesin when possible, otherwise fall back silently | ||
| self._use_vesin = ( | ||
| is_vesin_torch_available() and unsupported is None and not ase_provided | ||
| ) |
There was a problem hiding this comment.
Block vesin for hessian models here too.
Line 159 only treats spin as unsupported, but this backend still exposes hessian outputs via OutputVariableCategory.DERV_R_DERV_R. A hessian archive can therefore select the vesin path here instead of failing fast, which breaks the advertised backend contract and risks incorrect second-derivative results.
Suggested fix
- unsupported = "spin" if self._is_spin else None
+ unsupported = None
+ if self._is_spin:
+ unsupported = "spin"
+ elif any(
+ v.category == OutputVariableCategory.DERV_R_DERV_R
+ for v in self._model_output_def.def_outp.get_data().values()
+ ):
+ unsupported = "hessian"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/pt_expt/infer/deep_eval.py` around lines 144 - 184, The vesin backend
must also be blocked for models that expose Hessian outputs; in
_setup_nlist_backend, extend the unsupported check (currently using
self._is_spin) to detect Hessian output variables (e.g. inspect
self.output_variables for any with category ==
OutputVariableCategory.DERV_R_DERV_R) and set unsupported accordingly (e.g.
"hessian"), then use that unsupported flag in the existing vesin and auto
branches so vesin is rejected when Hessians are requested and self._use_vesin is
False in that case.
| nlist = torch.full((nloc, nsel), -1, dtype=torch.int64, device=device) | ||
| keep = min(nsel, max_nn) | ||
| if keep > 0: | ||
| nlist[:, :keep] = torch.where( | ||
| sorted_valid[:, :keep], | ||
| sorted_idx[:, :keep], | ||
| torch.full_like(sorted_idx[:, :keep], -1), | ||
| ) | ||
|
|
||
| if distinguish_types: | ||
| nlist = nlist_distinguish_types(nlist[None], extended_atype[None], sel)[0] |
There was a problem hiding this comment.
Don't truncate before enforcing per-type slots.
When distinguish_types=True, Lines 170-177 keep only the sum(sel) nearest neighbors before nlist_distinguish_types() runs. That drops farther candidates from underrepresented types, so crowded systems can return the wrong typed neighbor list even though valid per-type neighbors still exist inside rcut. This will silently change model inputs relative to the native builder.
Suggested fix
- nlist = torch.full((nloc, nsel), -1, dtype=torch.int64, device=device)
- keep = min(nsel, max_nn)
- if keep > 0:
- nlist[:, :keep] = torch.where(
- sorted_valid[:, :keep],
- sorted_idx[:, :keep],
- torch.full_like(sorted_idx[:, :keep], -1),
- )
-
- if distinguish_types:
- nlist = nlist_distinguish_types(nlist[None], extended_atype[None], sel)[0]
+ sorted_nlist = torch.where(
+ sorted_valid,
+ sorted_idx,
+ torch.full_like(sorted_idx, -1),
+ )
+
+ if distinguish_types:
+ sorted_nlist = nlist_distinguish_types(
+ sorted_nlist[None], extended_atype[None], sel
+ )[0]
+
+ nlist = torch.full((nloc, nsel), -1, dtype=torch.int64, device=device)
+ keep = min(nsel, sorted_nlist.shape[1])
+ if keep > 0:
+ nlist[:, :keep] = sorted_nlist[:, :keep]🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/pt_expt/utils/nlist.py` around lines 170 - 180, The code currently
truncates candidate neighbors to keep = min(nsel, max_nn) before calling
nlist_distinguish_types, which drops farther-type candidates; change the flow so
that when distinguish_types is True you preserve the larger candidate pool for
type-based selection: build nlist using a larger keep (e.g., keep = min(max_nn,
sorted_idx.shape[1]) or otherwise the full available candidates) and call
nlist_distinguish_types(nlist[None], extended_atype[None], sel)[0] first, then
after the per-type selection crop the resulting nlist back to nsel slots; update
references to nlist, keep, sorted_idx, sorted_valid, and nlist_distinguish_types
accordingly.
| neighbor_list: Optional["ase.neighborlist.NewPrimitiveNeighborList"] = None, | ||
| nlist_backend: str = "auto", |
There was a problem hiding this comment.
Don’t silently ignore an explicit neighbor_list in the pt backend.
This constructor still exposes neighbor_list, but the value is never stored or consulted before Line 297 enables vesin. With deepmd.calculator.DP now forwarding both knobs, DP(..., neighbor_list=..., nlist_backend="vesin") will take the vesin path instead of the caller-provided list and never warn. Please either wire the ASE list through here or fail fast when neighbor_list is non-None; at minimum, "vesin" should reject that combination.
Also applies to: 260-299
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@deepmd/pt/infer/deep_eval.py` around lines 144 - 145, The constructor in
deepmd/pt/infer/deep_eval.py currently accepts neighbor_list and nlist_backend
but never uses neighbor_list before the vesin path is chosen (see the
constructor and the vesin-enabling branch around the vesin path), so update the
constructor logic to either (A) pass the provided neighbor_list through to
wherever the backend is configured (wire the ASE NewPrimitiveNeighborList into
the vesin/other backend setup) or (B) immediately raise a ValueError when
neighbor_list is not None and nlist_backend == "vesin" (or when neighbor_list is
non-None and the chosen backend cannot consume it), ensuring the constructor
(the __init__ / class constructor that defines neighbor_list and nlist_backend)
enforces or propagates the caller-provided list rather than silently ignoring
it.
| # O(N) neighbor-list backend for the Python / ASE inference path; the | ||
| # [torch] extra adds vesin.torch for a device-resident (GPU) build. | ||
| 'vesin[torch]', |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Inspect base dependency declaration:"
rg -n -C2 '^\s*dependencies\s*=\s*\[|vesin|torch' pyproject.toml
echo
echo "Inspect optional dependency definitions and dynamic metadata hooks:"
rg -n -C2 'optional-dependencies|torch|cpu|gpu|ase' pyproject.toml backend README.mdRepository: deepmodeling/deepmd-kit
Length of output: 25749
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "All occurrences of vesin:"
rg -n --hidden --no-ignore -S "vesin" pyproject.toml backend README.md
echo
echo "Inspect optional-dependencies block in pyproject.toml around dynamic metadata config:"
rg -n -C2 "optional-dependencies|tool\.deepmd_build_backend|deepmd_build_backend" pyproject.toml
echo
echo "Inspect backend dynamic/metadata helpers that may add torch-related deps:"
ls -1 backend
for f in backend/*.py; do
echo "---- $f ----"
rg -n "optional-dependencies|optional_dependencies|get_requires|vesin|torch\]" "$f" || true
doneRepository: deepmodeling/deepmd-kit
Length of output: 2257
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "--- pyproject.toml: core dependencies section around lines 35-90 ---"
sed -n '35,95p' pyproject.toml
echo
echo "--- pyproject.toml: tool.deepmd_build_backend.optional-dependencies full block (approx 80-160) ---"
sed -n '80,170p' pyproject.toml
echo
echo "--- pyproject.toml: find any optional extra keys referencing torch / ase ---"
rg -n --context 1 "optional-dependencies|\\base\\b|\\btorch\\b|vesin\\[" pyproject.toml -S
echo
echo "--- backend/dynamic_metadata.py ---"
sed -n '1,120p' backend/dynamic_metadata.py
echo
echo "--- backend/find_pytorch.py (search for vesin / torch extra injection) ---"
rg -n "vesin|optional|extra|ase|torch\\]" backend/find_pytorch.py backend/find_pytorch.py -S || true
sed -n '1,240p' backend/find_pytorch.pyRepository: deepmodeling/deepmd-kit
Length of output: 13414
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "Search for vesin in backend:"
rg -n --hidden --no-ignore -S "vesin" backend || true
echo
echo "Search for vesin in optional-dependencies groups only (pyproject.toml):"
rg -n --hidden --no-ignore -S "vesin" pyproject.toml || true
echo
echo "Show dynamic metadata optional-dependencies logic (does it add vesin?):"
rg -n "optional-dependencies|get_pt_requirement|get_tf_requirement" backend/dynamic_metadata.pyRepository: deepmodeling/deepmd-kit
Length of output: 652
Move vesin[torch] out of core runtime dependencies (pyproject.toml: [project].dependencies, lines 61-63).
vesin[torch]is added unconditionally to[project].dependencies, so every install attempts it (even non-PyTorch/ASE users and platforms where the compiled extension may not load).- No corresponding optional-dependencies extra includes
vesin, andbackend/dynamic_metadata.pyonly injects PyTorch/TensorFlow requirements (get_pt_requirement/get_tf_requirement), notvesin.
Place vesin[torch] behind a dedicated optional extra (e.g., torch/ASE inference path) so runtime can fall back when it’s unavailable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pyproject.toml` around lines 61 - 63, The pyproject currently lists
'vesin[torch]' as an unconditional runtime dependency; move it into an optional
extra and update metadata injection so it's only requested for the Torch/ASE
inference path. Create a new optional extra name (e.g., "vesin-torch" or
"torch-inference") in pyproject.toml and remove 'vesin[torch]' from
[project].dependencies; instead add it under [project.optional-dependencies]
with that extra key. Update backend/dynamic_metadata.py (which currently exposes
get_pt_requirement/get_tf_requirement) to either add a new
get_vesin_requirement/get_torch_inference_requirement that returns the optional
extra name or ensure code that triggers the Torch/ASE path references the new
optional extra so runtime can fall back when vesin is unavailable.
| def test_nlist_backend_vesin_conflicts_with_neighbor_list(self) -> None: | ||
| import ase.neighborlist | ||
|
|
||
| with self.assertRaises(ValueError): | ||
| DeepPot( | ||
| self.tmpfile.name, | ||
| nlist_backend="vesin", | ||
| neighbor_list=ase.neighborlist.NewPrimitiveNeighborList( | ||
| cutoffs=self.rcut, bothways=True | ||
| ), | ||
| ) |
There was a problem hiding this comment.
Skip this ASE-only test when ASE is unavailable.
Lines 722-728 import ase.neighborlist unconditionally, unlike the surrounding ASE-backed tests. In an environment without ASE this test errors instead of skipping, even though the behavior under test depends on an explicit ASE neighbor list.
Suggested fix
+ `@unittest.skipUnless`(
+ importlib.util.find_spec("ase") is not None, "ase not installed"
+ )
def test_nlist_backend_vesin_conflicts_with_neighbor_list(self) -> None:
import ase.neighborlist🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@source/tests/pt_expt/infer/test_deep_eval.py` around lines 721 - 731, The
test test_nlist_backend_vesin_conflicts_with_neighbor_list imports
ase.neighborlist unconditionally which causes failures when ASE is not
installed; update the test to detect ASE availability and skip if missing by
attempting to import ase (or using pytest.importorskip) and calling
self.skipTest when import fails before constructing DeepPot with
nlist_backend="vesin" and
neighbor_list=ase.neighborlist.NewPrimitiveNeighborList(...); keep references to
DeepPot, nlist_backend, and neighbor_list intact so the behavior under test
remains unchanged.
There was a problem hiding this comment.
Pull request overview
This PR introduces an opt-in O(N) neighbor-list construction path for Python/ASE inference by integrating vesin.torch into the PyTorch (pt) and exportable PyTorch (pt_expt) inference backends, and wires the selection through DeepPot/DeepEval and the ASE DP calculator via a new nlist_backend argument.
Changes:
- Added
nlist_backenddispatch (auto/vesin/native) to inference backends and threaded it through the ASE calculator interface. - Implemented a device-resident
vesin.torchneighbor-list builder (deepmd/pt_expt/utils/nlist.py) and integrated it into both backends. - Added targeted unit tests covering backend selection, fallbacks/raises, spin gating, and native-vs-vesin numerical consistency.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
deepmd/pt/infer/deep_eval.py |
Adds nlist_backend selection and a vesin-powered inference evaluation path using forward_common_lower + communicate_extended_output. |
deepmd/pt_expt/infer/deep_eval.py |
Adds nlist_backend selection and integrates vesin builder into _prepare_inputs with ASE-conflict handling. |
deepmd/pt_expt/utils/nlist.py |
New torch-specific vesin neighbor-list builder producing extended coords/types, mapping, and candidate nlists. |
deepmd/calculator.py |
Threads nlist_backend through the ASE calculator to DeepPot. |
pyproject.toml |
Adds vesin[torch] to runtime dependencies. |
source/tests/pt/model/test_deeppot.py |
Adds tests for nlist_backend behavior (auto/native/vesin, missing-vesin behavior, hessian/spin gating, consistency). |
source/tests/pt/test_calculator.py |
Adds ASE calculator-level test ensuring nlist_backend reaches the backend and preserves results. |
source/tests/pt_expt/utils/test_nlist.py |
New tests validating vesin builder vs native neighbor relationships and device placement. |
source/tests/pt_expt/infer/test_deep_eval.py |
Adds pt_expt inference tests for vesin consistency, multiframe, backend selection, ASE conflict behavior, and calculator threading. |
source/tests/pt_expt/infer/test_deep_eval_spin.py |
Adds spin-model gating test ensuring explicit vesin raises and auto falls back while matching reference outputs. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| """Evaluate using an O(N) ``vesin`` neighbor list built on the host. | ||
|
|
||
| The neighbor list is built outside the model graph and fed to the | ||
| exported ``forward_common_lower`` interface; the extended-region output | ||
| is mapped back to local atoms with :func:`communicate_extended_output`, | ||
| exactly mirroring the native ``forward_common`` path. This avoids the | ||
| native all-pairs O(N^2) neighbor-list build, which dominates runtime for | ||
| large systems on the Python / ASE inference path. |
| # O(N) neighbor-list backend for the Python / ASE inference path; the | ||
| # [torch] extra adds vesin.torch for a device-resident (GPU) build. | ||
| 'vesin[torch]', |
| ``"vesin"`` strictly requires the vesin path (raises if it is missing or | ||
| the model is spin / hessian); ``"native"`` forces the built-in | ||
| all-pairs builder. Ignored when an explicit ``neighbor_list`` is given. |
|
I will make deepmd-kit directly consume vesin style nlist. |
Motivation
The Python / ASE inference path builds the neighbor list with the native all-pairs O(N²) algorithm (
build_neighbor_listmaterializes a dense[nf, nloc, nall, 3]displacement tensor). For large systems this dominates end-to-end throughput — exactly the front-end bottleneck described in the DPA4 manuscript §2.4 (the ASE throughput curves bend down at large N because of native O(N²) neighbor lists).This PR adds an opt-in, O(N) neighbor-list backend based on the
vesincell list for the Python inference interfaces (DeepPot, the ASEDPcalculator), for both thept(.pth/.pt) andpt_expt(.pte/.pt2) backends. Model results are unchanged: the neighbor search is non-differentiable, so its result is fed to the model's ownforward_common_lower+communicate_extended_output, leaving the autograd graph and outputs intact.API
New
nlist_backendargument onDeepPot/DeepEval/ the ASEDPcalculator:"auto"(default): use the O(N)vesincell list when applicable (vesin.torchinstalled, non-spin, non-hessian model) and silently fall back to the native builder otherwise. Keeps the fast path by default; never breaks spin/hessian."vesin": strict — raiseValueErrorifvesin.torchis missing, the model is spin/hessian, or (pt_expt) an explicit ASEneighbor_listis also given."native": force the native all-pairs builder.Design
vesin.torch, which runs the cell list on the device of the input tensors (CPU or CUDA). On GPU the whole extended-system build stays on the GPU (no host round-trip). Lives indeepmd/pt_expt/utils/nlist.py; both backends import it.dpmodelstays vesin-free (no torch code in the array-API backend).vesin's(i, j, S)edge list into deepmd's extended format (extended coords/types,mapping, candidatenlist) and letsforward_common_lower.format_nlistre-format (distance-sort, truncate tosel, type-split) — matchingforward_commonexactly.auto_batch_size.pyproject: depends onvesin[torch]so the device-capable bindings ship by default.Verification
native vs vesin agree to floating-point round-off (energy bit-identical; forces/virials ~1e-16, the only difference being ghost-atom enumeration order):
0.0, F/V~1e-160.0, F/V~5e-16(spot-checked)0.0(spot-checked)Tests
source/tests/pt_expt/utils/test_nlist.py— vesin.torch builder vs native (PBC/non-PBC/distinguish), input-device placement.source/tests/pt/model/test_deeppot.py,source/tests/pt_expt/infer/test_deep_eval.py,test_deep_eval_spin.py,source/tests/pt/test_calculator.py— everynlist_backenddispatch branch (invalid → ValueError, native,autofallback and strict-vesinraise for vesin-missing / spin / hessian / ASE-conflict), native-vs-vesin consistency (PBC/non-PBC/atomic/multiframe/auto_batch_size), real-spin gate-off end-to-end, and the ASE calculator threadingnlist_backendfor both backends.Known limitations
vesin.torchis implemented but exercised only via the CPU branch here.vesin[torch]is a compiled torch-ABI extension: on a platform without a matching wheel, install could fail (runtime is guarded byis_vesin_torch_available, but pyproject is not). There is no numpy-vesin fallback by design —autofalls back to native, explicitvesinraises.torch.export;extend_coord_with_ghostsover-generating ~27×N ghosts; the (still dead) ASEneighbor_listarg on theptbackend; a committed throughput benchmark.charge_spinand.pt2-runner+vesin are covered by equivalence/.pterather than dedicated committed tests; message-passing and dipole through vesin are spot-checked, not committed.Summary by CodeRabbit
New Features
nlist_backendparameter to calculator and inference backends for selecting neighbor-list construction algorithms.Chores