feat(pt): add atomic charge population fitting and model#5494
feat(pt): add atomic charge population fitting and model#5494chrisahart wants to merge 8 commits into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughAdds a new "population" model type: fitting net, atomic/model adapters, loss implementation and registration, training dispatch, inference wrappers, arg schemas, documentation, and example data/configs. ChangesPopulation Model Training and Inference
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Suggested reviewers:
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/infer/deep_population.py`:
- Around line 68-106: The eval() docstring in deep_population.py incorrectly
describes returning energy/force/virial but this method returns population data
only; update the docstring for the eval function to describe that it computes
and returns population arrays (shape/details), clarify parameters (coords,
cells, atom_types, fparam, aparam, mixed_type, atomic if applicable) if any
affect population, and remove or replace the energy/force/virial return section
with precise return descriptions for population (and any auxiliary arrays the
method actually returns); use the function name eval in deep_population.py to
locate and edit the docstring so generated docs correctly reflect its behavior.
In `@deepmd/pt/loss/ener.py`:
- Line 610: Remove the trailing whitespace after the closing parenthesis ) on
the affected line in deepmd/pt/loss/ener.py (the line that currently ends with
") "); edit the line to end with just ")" (no spaces), then run "ruff
format ." (or your project's formatter) to ensure formatting compliance before
committing.
In `@deepmd/pt/loss/population.py`:
- Around line 155-176: The forward() currently hard-codes F.l1_loss; instead use
the configured loss function and beta parameter (the module-level loss_func and
beta set in the constructor/config) when computing spin_loss, spin_total_loss,
pop_loss, pop_alpha_total_loss, and pop_beta_total_loss. Replace direct calls to
F.l1_loss with a call that invokes the configured loss function (or a mapped
function when loss_func is a string) and pass the configured reduction and beta
(e.g., for smooth_l1/huber variants) so the chosen loss type and beta are
honored throughout forward(). Ensure the same change is applied to all
occurrences: spin_pred/spin_label, spin_total_pred/spin_total_label,
pop_pred/pop_label, pop_alpha_total_pred/pop_alpha_total_label, and
pop_beta_total_pred/pop_beta_total_label.
- Around line 157-183: The per-atom terms are not normalized by the number of
atoms: use the existing natoms (or compute it if needed) to divide per-atom
losses before weighting so the loss is size-independent; specifically, in the
aggregation where loss adds pref_spin * spin_loss and pref_pop * pop_loss,
replace those with pref_spin * (spin_loss / natoms) and pref_pop * (pop_loss /
natoms) (leave spin_total/pop_alpha_total/pop_beta_total as totals if they are
already global), ensuring natoms is referenced where currently unused and that
loss_func outputs are divided as described.
- Line 28: The constructor PopulationLoss.__init__ currently uses a mutable
default parameter metric: list = ["mae"]; change the signature to metric:
Optional[list] = None and inside __init__ set self.metric = list(metric) if
metric is not None else ["mae"] (or self.metric = ["mae"] when metric is None)
to avoid sharing the same list instance across instances and ensure you copy any
provided list to prevent external mutations affecting the internal state.
In `@deepmd/pt/model/task/population.py`:
- Around line 46-57: The class docstring for Population is out of sync with its
__init__ signature: remove or update parameters listed (task_dim, property_name,
bias_atom_p, intensive, neuron, resnet_dt) that are not accepted by
Population.__init__, and instead document only the actual parameters and their
meanings as defined in Population.__init__; locate the Population class and its
__init__ method and make the docstring match the real signature and semantics
(or add the missing parameters to __init__ if they were intended), ensuring
parameter names and types in the docstring match the identifiers used in the
code.
- Around line 80-92: The constructor currently uses a mutable default list for
the parameter `neuron` which can be shared across instances; change the
signature to use neuron: Optional[list[int]] = None (or Union[None, list[int]]),
and inside the constructor (the __init__ / population initializer that currently
accepts `neuron`) set a new list when None, e.g. self.neuron = list(neuron) if
neuron is not None else [128, 128, 128] so each instance gets its own list;
update the type hint accordingly and run ruff check . and ruff format . before
committing.
In `@doc/model/train-fitting-population.md`:
- Line 1: The title header "Fit atomic charge population {{ tensorflow_icon }}
{{ pytorch_icon }} {{ jax_icon }} {{ dpmodel_icon }}" shows backend badges that
don't match the document body (which states PyTorch-only); update the header to
reflect actual support by removing the unsupported badges and leaving only "{{
pytorch_icon }}" or, if other backends are supported, update the body to
document TensorFlow/JAX/DPModel support; specifically edit the header text to
remove or adjust the tokens tensorflow_icon, jax_icon, and dpmodel_icon so the
title and content are consistent.
- Line 19: Replace the incorrect "dos" mention with "population" in the sentence
about fitting, and replace the non-descriptive link text "here" with a
descriptive label like "population" (e.g., change "their meaning can be found
[here](dpa3.md)" to "their meaning can be found in the population section" using
a descriptive link); keep references to {ref}`model[standard]/fitting_net
<model[standard]/fitting_net>`, {ref}`loss <loss>`, and the other refs
({ref}`model <model>`, {ref}`learning_rate <learning_rate>`, {ref}`training
<training>`) intact so the sentence correctly reads that to fit the population
you must modify model[standard]/fitting_net and loss and that meanings are in
the population section.
- Around line 106-111: The fenced code block in train-fitting-population.md is
missing a language tag; edit the opening triple-backtick that precedes the table
and add a language identifier (e.g., use ```text) so the block becomes fenced
with a language tag and satisfies markdown linting.
- Line 97: Update the example command to use the actual provided training config
filename: replace the shown command `dp --pt train input.json` with `dp --pt
train input_torch.json` so the documentation matches the referenced
`input_torch.json` PyTorch backend training configuration.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: f45972c1-a108-4b7c-acdb-6f5fae0bd468
📒 Files selected for processing (24)
deepmd/infer/deep_eval.pydeepmd/infer/deep_population.pydeepmd/pt/infer/deep_eval.pydeepmd/pt/loss/__init__.pydeepmd/pt/loss/ener.pydeepmd/pt/loss/population.pydeepmd/pt/model/atomic_model/__init__.pydeepmd/pt/model/atomic_model/population_atomic_model.pydeepmd/pt/model/model/__init__.pydeepmd/pt/model/model/population_model.pydeepmd/pt/model/task/__init__.pydeepmd/pt/model/task/population.pydeepmd/pt/train/training.pydeepmd/utils/argcheck.pydoc/model/train-fitting-population.mdexamples/population/data/set.000/aparam.npyexamples/population/data/set.000/atomic_population.npyexamples/population/data/set.000/box.npyexamples/population/data/set.000/coord.npyexamples/population/data/set.000/energy.npyexamples/population/data/set.000/force.npyexamples/population/data/type.rawexamples/population/data/type_map.rawexamples/population/train/input_torch.json
Implements a new population fitting network (PopulationFittingNet) and corresponding atomic/full model (DPPopulationAtomicModel, PopulationModel) for predicting electronic population properties. Includes a dedicated loss function, inference interface (DeepPopulation), argument definitions, training integration, example data, and documentation. Co-authored-by: Cursor <cursoragent@cursor.com>
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (4)
deepmd/infer/deep_population.py (1)
66-103:⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
eval()docstring still documents energy/force/virial instead of population output.The documented returns don’t match runtime behavior and can mislead API users.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/infer/deep_population.py` around lines 66 - 103, The docstring for eval() in deep_population.py incorrectly documents energy/force/virial outputs; update the eval() function's docstring to describe the actual population outputs returned by DeepPopulation (e.g., population arrays, fitness/score fields, generation metadata, and shapes/types), include parameter behavior (coords/cells/atom_types/mixed_type/fparam/aparam/atomic) only as relevant to population generation, and list the correct return values and their shapes/types (and note when optional fields are present) so the documentation matches runtime behavior and calling code expectations.deepmd/pt/loss/population.py (3)
170-180:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConfigured
loss_func/betaare not honored inforward().
forward()always uses L1 (Line 170), so non-maeconfigurations silently behave incorrectly.Suggested fix
- loss_func = partial(F.l1_loss, reduction="sum") + if self.loss_func == "mae": + loss_func = partial(F.l1_loss, reduction="sum") + elif self.loss_func == "smooth_mae": + loss_func = partial(F.smooth_l1_loss, reduction="sum", beta=self.beta) + elif self.loss_func == "rmse": + def loss_func(input: torch.Tensor, target: torch.Tensor) -> torch.Tensor: + return torch.sqrt(F.mse_loss(input, target, reduction="sum")) + else: + raise ValueError(f"Unsupported loss_func: {self.loss_func}")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` around lines 170 - 180, The forward() implementation currently hardcodes loss_func = partial(F.l1_loss, reduction="sum") and always uses L1; update forward() to respect the configured loss type and beta by selecting the proper loss function (e.g., F.l1_loss for "mae" and F.mse_loss for "mse") and applying the configured reduction and beta weight from the instance (use existing config fields like self.loss or self.beta if present). Replace the hardcoded loss_func with a selector that creates the partial with the correct function and reduction, then use that for spin_loss, spin_total_loss, pop_loss, pop_alpha_total_loss, and pop_beta_total_loss so the configured loss and beta are honored. Ensure you reference the existing symbols forward(), loss_func, spin_loss, spin_total_loss, pop_loss, pop_alpha_total_loss, and pop_beta_total_loss when making the change.
172-187:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPer-atom loss terms should be normalized by
natomsto match the documented objective.
spin_lossandpop_lossare sum-reduced but not divided byN, so loss scale grows with system size.Suggested fix
+ if natoms <= 0: + raise ValueError("natoms must be positive") @@ - spin_loss = loss_func(input=spin_pred, target=spin_label) + spin_loss = loss_func(input=spin_pred, target=spin_label) / natoms @@ - pop_loss = loss_func(input=pop_pred, target=pop_label) + pop_loss = loss_func(input=pop_pred, target=pop_label) / natoms🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` around lines 172 - 187, spin_loss and pop_loss (and any other sum-reduced per-atom terms such as pop_alpha_total_loss/pop_beta_total_loss if they are computed as sums) are not normalized by the number of atoms, so the total loss scales with system size; update the accumulation before adding into loss to divide these per-atom sum losses by natoms (use the existing natoms variable) so you add pref_spin * (spin_loss / natoms) and pref_pop * (pop_loss / natoms) (and similarly normalize pop_alpha_total_loss and pop_beta_total_loss if they are per-atom sums) when computing the total loss.
30-30:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMutable default for
metricis still shared across instances.Line 30 uses a mutable list default; instances can accidentally share state.
Suggested fix
- metric: list = ["mae"], + metric: list[str] | None = None, @@ - self.metric = metric + self.metric = ["mae"] if metric is None else list(metric)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` at line 30, The parameter "metric" in deepmd/pt/loss/population.py is defined with a mutable default list (metric: list = ["mae"]) which can be shared across instances; change the signature to use metric: Optional[list] = None (or omit the default) and inside the constructor/function (e.g., in the PopulationLoss initializer or the function where "metric" is defined) set if metric is None: metric = ["mae"]; also update any type annotations/imports as needed and ensure downstream code uses the local copy so instances do not share the same list.
🧹 Nitpick comments (1)
deepmd/infer/deep_population.py (1)
126-129: ⚡ Quick win
atomicbranch is redundant; both paths return the same tuple.You can simplify this to a single return for readability and easier maintenance.
Suggested simplification
- if atomic: - return (atomic_population,) - else: - return (atomic_population,) + return (atomic_population,)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/infer/deep_population.py` around lines 126 - 129, The conditional in deep_population.py that checks the boolean variable atomic and returns (atomic_population,) in both branches is redundant; simplify by removing the if/else and replace with a single return of (atomic_population,) (referencing the atomic variable and atomic_population identifier) so the function returns that tuple directly without the unnecessary branching.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/pt/loss/population.py`:
- Line 134: The division by self.starter_learning_rate when computing coef (coef
= learning_rate / self.starter_learning_rate) can raise ZeroDivisionError;
update the code in the method that computes coef to validate
self.starter_learning_rate first (e.g., if self.starter_learning_rate == 0:
raise ValueError("starter_learning_rate must be non-zero") or use a safe
fallback like eps = 1e-12 and compute coef = learning_rate /
max(self.starter_learning_rate, eps)), and ensure any error message references
starter_learning_rate so callers can correct the configuration.
---
Duplicate comments:
In `@deepmd/infer/deep_population.py`:
- Around line 66-103: The docstring for eval() in deep_population.py incorrectly
documents energy/force/virial outputs; update the eval() function's docstring to
describe the actual population outputs returned by DeepPopulation (e.g.,
population arrays, fitness/score fields, generation metadata, and shapes/types),
include parameter behavior
(coords/cells/atom_types/mixed_type/fparam/aparam/atomic) only as relevant to
population generation, and list the correct return values and their shapes/types
(and note when optional fields are present) so the documentation matches runtime
behavior and calling code expectations.
In `@deepmd/pt/loss/population.py`:
- Around line 170-180: The forward() implementation currently hardcodes
loss_func = partial(F.l1_loss, reduction="sum") and always uses L1; update
forward() to respect the configured loss type and beta by selecting the proper
loss function (e.g., F.l1_loss for "mae" and F.mse_loss for "mse") and applying
the configured reduction and beta weight from the instance (use existing config
fields like self.loss or self.beta if present). Replace the hardcoded loss_func
with a selector that creates the partial with the correct function and
reduction, then use that for spin_loss, spin_total_loss, pop_loss,
pop_alpha_total_loss, and pop_beta_total_loss so the configured loss and beta
are honored. Ensure you reference the existing symbols forward(), loss_func,
spin_loss, spin_total_loss, pop_loss, pop_alpha_total_loss, and
pop_beta_total_loss when making the change.
- Around line 172-187: spin_loss and pop_loss (and any other sum-reduced
per-atom terms such as pop_alpha_total_loss/pop_beta_total_loss if they are
computed as sums) are not normalized by the number of atoms, so the total loss
scales with system size; update the accumulation before adding into loss to
divide these per-atom sum losses by natoms (use the existing natoms variable) so
you add pref_spin * (spin_loss / natoms) and pref_pop * (pop_loss / natoms) (and
similarly normalize pop_alpha_total_loss and pop_beta_total_loss if they are
per-atom sums) when computing the total loss.
- Line 30: The parameter "metric" in deepmd/pt/loss/population.py is defined
with a mutable default list (metric: list = ["mae"]) which can be shared across
instances; change the signature to use metric: Optional[list] = None (or omit
the default) and inside the constructor/function (e.g., in the PopulationLoss
initializer or the function where "metric" is defined) set if metric is None:
metric = ["mae"]; also update any type annotations/imports as needed and ensure
downstream code uses the local copy so instances do not share the same list.
---
Nitpick comments:
In `@deepmd/infer/deep_population.py`:
- Around line 126-129: The conditional in deep_population.py that checks the
boolean variable atomic and returns (atomic_population,) in both branches is
redundant; simplify by removing the if/else and replace with a single return of
(atomic_population,) (referencing the atomic variable and atomic_population
identifier) so the function returns that tuple directly without the unnecessary
branching.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 59cc6b29-20ab-40f4-a039-36584f6f2aae
📒 Files selected for processing (10)
deepmd/infer/deep_population.pydeepmd/pt/infer/deep_eval.pydeepmd/pt/loss/__init__.pydeepmd/pt/loss/population.pydeepmd/pt/model/model/__init__.pydeepmd/pt/model/task/population.pydeepmd/pt/train/training.pydeepmd/utils/argcheck.pydoc/model/train-fitting-population.mdexamples/population/train/input_torch.json
✅ Files skipped from review due to trivial changes (1)
- doc/model/train-fitting-population.md
🚧 Files skipped from review as they are similar to previous changes (7)
- examples/population/train/input_torch.json
- deepmd/pt/loss/init.py
- deepmd/pt/infer/deep_eval.py
- deepmd/pt/model/model/init.py
- deepmd/pt/model/task/population.py
- deepmd/pt/train/training.py
- deepmd/utils/argcheck.py
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (5)
deepmd/pt/loss/population.py (4)
30-30:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAvoid mutable default for
metricand copy user input.
metric: list = ["mae"]shares state across instances; assigning directly also allows outside mutation of internal state.Proposed fix
- metric: list = ["mae"], + metric: list[str] | None = None, @@ - self.metric = metric + self.metric = ["mae"] if metric is None else list(metric)#!/bin/bash set -euo pipefail ruff check deepmd/pt/loss/population.py --select B006As per coding guidelines
**/*.py: Install linter and runruff check .before committing changes or the CI will fail; format code withruff format .before committing changes or the CI will fail.Also applies to: 74-74
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` at line 30, The function/class signature using metric: list = ["mae"] uses a mutable default; change the default to None (metric: Optional[List[str]] = None) and inside the constructor or function (e.g., in PopulationLoss.__init__ or the relevant function that accepts metric) set self.metric = list(metric) if metric is not None else ["mae"] to ensure you copy user input and avoid shared state; apply the same fix to the other occurrence noted (the metric parameter at the second location) and run the linter/formatter (ruff) before committing.
172-187:⚠️ Potential issue | 🟠 Major | ⚡ Quick winNormalize per-atom terms by
natomsto keep loss size-independent.
spin_lossandpop_lossare summed over atoms but not divided bynatoms, so system size changes the effective weighting.Proposed fix
+ if natoms <= 0: + raise ValueError("natoms must be positive") @@ - spin_loss = loss_func(input=spin_pred, target=spin_label) + spin_loss = loss_func(input=spin_pred, target=spin_label) / natoms @@ - pop_loss = loss_func(input=pop_pred, target=pop_label) + pop_loss = loss_func(input=pop_pred, target=pop_label) / natoms🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` around lines 172 - 187, The spin and per-atom population losses are currently summed over atoms without normalization, so update the loss accumulation in the block that adds spin_loss and pop_loss (the lines using variables spin_loss and pop_loss) to divide those terms by natoms (e.g., use (spin_loss / natoms) and (pop_loss / natoms)) before multiplying by their prefactors; leave system-total terms (spin_total_loss, pop_alpha_total_loss, pop_beta_total_loss) unchanged. Ensure you reference the existing variables spin_loss, pop_loss, natoms, pref_spin and pref_pop when making the change.
170-180:⚠️ Potential issue | 🟠 Major | ⚡ Quick winHonor configured
loss_funcandbetainstead of hard-coding L1.The constructor exposes
loss_func/beta, butforward()always usesF.l1_loss, which makes non-maeconfigs incorrect.Proposed fix
- loss_func = partial(F.l1_loss, reduction="sum") + if self.loss_func == "mae": + loss_func = partial(F.l1_loss, reduction="sum") + elif self.loss_func == "smooth_mae": + loss_func = partial(F.smooth_l1_loss, reduction="sum", beta=self.beta) + elif self.loss_func == "rmse": + def loss_func(input: torch.Tensor, target: torch.Tensor) -> torch.Tensor: + return torch.sqrt(F.mse_loss(input, target, reduction="mean")) + else: + raise ValueError(f"Unsupported loss_func: {self.loss_func}")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` around lines 170 - 180, The forward() currently hard-codes F.l1_loss via loss_func = partial(F.l1_loss, reduction="sum") and uses that for spin_loss, pop_loss, pop_alpha_total_loss, pop_beta_total_loss; change it to use the configured loss function and beta from the constructor (e.g. self.loss_func and self.beta). Replace the partial(F.l1_loss, ...) with a call that uses self.loss_func and always passes reduction="sum", and when the configured function accepts a beta (e.g. F.smooth_l1_loss or similar) pass beta=self.beta; then compute spin_loss, spin_total_loss, pop_loss, pop_alpha_total_loss, pop_beta_total_loss using that configured function instead of the hard-coded F.l1_loss.
134-134:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard
starter_learning_ratebefore computingcoef.
learning_rate / self.starter_learning_ratecan divide by zero for invalid config.Proposed fix
self.starter_learning_rate = starter_learning_rate + if self.starter_learning_rate <= 0.0: + raise ValueError("starter_learning_rate must be positive")🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/loss/population.py` at line 134, The computation coef = learning_rate / self.starter_learning_rate can divide by zero; before dividing in the population logic (where coef, learning_rate, and self.starter_learning_rate are used) check self.starter_learning_rate for zero or near-zero and handle it: either raise a clear ValueError or set coef to a safe default (e.g., 1.0) when starter_learning_rate is 0, and update any downstream logic accordingly; reference the variables coef, learning_rate and self.starter_learning_rate to locate and fix the issue.deepmd/pt/model/task/population.py (1)
68-68:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winReplace mutable default for
neuronwithNone.
neuron: list[int] = [128, 128, 128]creates shared state across instances.Proposed fix
- neuron: list[int] = [128, 128, 128], + neuron: list[int] | None = None, @@ - super().__init__( + if neuron is None: + neuron = [128, 128, 128] + super().__init__( var_name="population",#!/bin/bash set -euo pipefail ruff check deepmd/pt/model/task/population.py --select B006As per coding guidelines
**/*.py: Install linter and runruff check .before committing changes or the CI will fail; format code withruff format .before committing changes or the CI will fail.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@deepmd/pt/model/task/population.py` at line 68, Change the mutable default for the parameter named "neuron" from a shared list to None: update the signature that currently has "neuron: list[int] = [128, 128, 128]" to use "neuron: Optional[list[int]] = None" (add Optional import from typing), and inside the function/class (wherever the "neuron" parameter is used) set "if neuron is None: neuron = [128, 128, 128]". Ensure this change touches the function or constructor that declares "neuron" so no instances share the default list, and run ruff (ruff format . and ruff check ...) to satisfy lint rules.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@deepmd/pt/loss/population.py`:
- Around line 72-73: The class mixes the label keys "atom_population" and
"atomic_population" which breaks the label contract; update the class so
self.var_name and all internal references use "atomic_population" consistently
(replace the current self.var_name = "atom_population" with "atomic_population"
and change any lookups/usages that reference the old key at the other
occurrences mentioned around the sections corresponding to lines ~158 and ~205),
ensuring any code that reads labels uses self.var_name or the literal
"atomic_population" uniformly.
In `@deepmd/pt/model/task/population.py`:
- Line 106: The population output is currently marked reducible which allows
per-atom reductions and will break the downstream reshape in
deepmd/infer/deep_population.py where results["population"] is expected to be
(nframes, natoms, 2); change the OutputSpec (the entry for "population") to
reducible=False so the output is not reduced and preserves the full (nframes,
natoms, 2) tensor shape when produced. Ensure the symbol for the population
output (named "population" in the outputs specification) is updated accordingly.
---
Duplicate comments:
In `@deepmd/pt/loss/population.py`:
- Line 30: The function/class signature using metric: list = ["mae"] uses a
mutable default; change the default to None (metric: Optional[List[str]] = None)
and inside the constructor or function (e.g., in PopulationLoss.__init__ or the
relevant function that accepts metric) set self.metric = list(metric) if metric
is not None else ["mae"] to ensure you copy user input and avoid shared state;
apply the same fix to the other occurrence noted (the metric parameter at the
second location) and run the linter/formatter (ruff) before committing.
- Around line 172-187: The spin and per-atom population losses are currently
summed over atoms without normalization, so update the loss accumulation in the
block that adds spin_loss and pop_loss (the lines using variables spin_loss and
pop_loss) to divide those terms by natoms (e.g., use (spin_loss / natoms) and
(pop_loss / natoms)) before multiplying by their prefactors; leave system-total
terms (spin_total_loss, pop_alpha_total_loss, pop_beta_total_loss) unchanged.
Ensure you reference the existing variables spin_loss, pop_loss, natoms,
pref_spin and pref_pop when making the change.
- Around line 170-180: The forward() currently hard-codes F.l1_loss via
loss_func = partial(F.l1_loss, reduction="sum") and uses that for spin_loss,
pop_loss, pop_alpha_total_loss, pop_beta_total_loss; change it to use the
configured loss function and beta from the constructor (e.g. self.loss_func and
self.beta). Replace the partial(F.l1_loss, ...) with a call that uses
self.loss_func and always passes reduction="sum", and when the configured
function accepts a beta (e.g. F.smooth_l1_loss or similar) pass beta=self.beta;
then compute spin_loss, spin_total_loss, pop_loss, pop_alpha_total_loss,
pop_beta_total_loss using that configured function instead of the hard-coded
F.l1_loss.
- Line 134: The computation coef = learning_rate / self.starter_learning_rate
can divide by zero; before dividing in the population logic (where coef,
learning_rate, and self.starter_learning_rate are used) check
self.starter_learning_rate for zero or near-zero and handle it: either raise a
clear ValueError or set coef to a safe default (e.g., 1.0) when
starter_learning_rate is 0, and update any downstream logic accordingly;
reference the variables coef, learning_rate and self.starter_learning_rate to
locate and fix the issue.
In `@deepmd/pt/model/task/population.py`:
- Line 68: Change the mutable default for the parameter named "neuron" from a
shared list to None: update the signature that currently has "neuron: list[int]
= [128, 128, 128]" to use "neuron: Optional[list[int]] = None" (add Optional
import from typing), and inside the function/class (wherever the "neuron"
parameter is used) set "if neuron is None: neuron = [128, 128, 128]". Ensure
this change touches the function or constructor that declares "neuron" so no
instances share the default list, and run ruff (ruff format . and ruff check
...) to satisfy lint rules.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d3db218e-752b-461e-88a1-883f41a64383
📒 Files selected for processing (23)
deepmd/infer/deep_eval.pydeepmd/infer/deep_population.pydeepmd/pt/infer/deep_eval.pydeepmd/pt/loss/__init__.pydeepmd/pt/loss/population.pydeepmd/pt/model/atomic_model/__init__.pydeepmd/pt/model/atomic_model/population_atomic_model.pydeepmd/pt/model/model/__init__.pydeepmd/pt/model/model/population_model.pydeepmd/pt/model/task/__init__.pydeepmd/pt/model/task/population.pydeepmd/pt/train/training.pydeepmd/utils/argcheck.pydoc/model/train-fitting-population.mdexamples/population/data/set.000/aparam.npyexamples/population/data/set.000/atomic_population.npyexamples/population/data/set.000/box.npyexamples/population/data/set.000/coord.npyexamples/population/data/set.000/energy.npyexamples/population/data/set.000/force.npyexamples/population/data/type.rawexamples/population/data/type_map.rawexamples/population/train/input_torch.json
✅ Files skipped from review due to trivial changes (4)
- examples/population/data/type.raw
- examples/population/data/type_map.raw
- deepmd/pt/loss/init.py
- deepmd/pt/model/atomic_model/init.py
🚧 Files skipped from review as they are similar to previous changes (8)
- deepmd/pt/model/atomic_model/population_atomic_model.py
- deepmd/infer/deep_eval.py
- deepmd/pt/train/training.py
- deepmd/pt/model/task/init.py
- deepmd/pt/model/model/population_model.py
- deepmd/infer/deep_population.py
- examples/population/train/input_torch.json
- deepmd/pt/infer/deep_eval.py
- Fix eval() docstring in DeepPopulation to describe population outputs - Remove redundant if/else branch in DeepPopulation.eval() - Fix mutable default argument for metric in PopulationLoss - Fix mutable default argument for neuron in PopulationFittingNet - Add starter_learning_rate > 0 guard in PopulationLoss - Add natoms > 0 guard in PopulationLoss.forward() - Normalize per-atom spin_loss and pop_loss by natoms - Fix __init__ docstring parameter names in PopulationLoss - Fix doc equation to match natoms normalization in code - Fix non-descriptive link and add language tag in train-fitting-population.md Co-authored-by: Cursor <cursoragent@cursor.com>
Improves docstring coverage across PopulationFittingNet, PopulationModel, and DPPopulationAtomicModel to satisfy the 80% coverage threshold. Co-authored-by: Cursor <cursoragent@cursor.com>
… argcheck Brings docstring coverage above the 80% threshold required by CodeRabbit. Co-authored-by: Cursor <cursoragent@cursor.com>
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #5494 +/- ##
==========================================
- Coverage 81.37% 81.30% -0.08%
==========================================
Files 868 873 +5
Lines 96598 96788 +190
Branches 4242 4243 +1
==========================================
+ Hits 78611 78695 +84
- Misses 16683 16791 +108
+ Partials 1304 1302 -2 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
…et docstring - Honor configured loss_func (mae, smooth_mae, rmse) and beta in PopulationLoss.forward(), matching property.py style - Remove unused functools.partial import - Rename inner _loss parameter to avoid shadowing outer label dict - Update loss_func docstring to list all supported values - Add missing trainable and type_map parameters to class docstring Co-authored-by: Cursor <cursoragent@cursor.com>
|
@coderabbitai review |
✅ Action performedReview finished.
|
Co-authored-by: Cursor <cursoragent@cursor.com>
…m shape Aligns PopulationFittingNet.output_def() with DeepPopulation.output_def() and ensures downstream reshape to (nframes, natoms, 2) is never broken by atom-wise reduction. Co-authored-by: Cursor <cursoragent@cursor.com>
|
@coderabbitai resume |
✅ Action performedReviews resumed. |
|
@njzjz Can you please review this PR when you have a chance? This is my first time contributing to DeePMD-kit. Thank you! |
njzjz-bot
left a comment
There was a problem hiding this comment.
Thanks for adding the population fitting path. I think the loss needs one fix before this is safe to merge: the total-population / total-spin terms currently sum over the whole batch, so with batch_size > 1 different frames can cancel each other out and the loss depends on how samples are batched. Please keep the frame dimension and compute totals per frame instead.
I also left a note on the rmse branch: its reduction semantics differ from the mae/smooth_mae branches, which makes the prefactors hard to interpret. A batch-size > 1 unit test for this loss would be especially useful, since batch=1 would miss the total-term issue.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
| spin_pred = torch.sub(pop_pred[:, 0], pop_pred[:, 1]) | ||
| spin_label = torch.sub(pop_label[:, 0], pop_label[:, 1]) | ||
|
|
||
| spin_total_pred = torch.sum(spin_pred) |
There was a problem hiding this comment.
This looks like a blocker: after reshape([-1, self.task_dim]), the frame/batch dimension is gone, so these total terms are summed over the entire batch rather than per structure/frame. With batch_size > 1, errors from different frames can cancel each other and the loss becomes batch-composition dependent.
Could you keep the shape as something like (nframes, natoms, task_dim) and sum over atoms only, e.g. spin_pred.sum(dim=1) / pop_pred[:, :, 0].sum(dim=1), then apply the total loss per frame? Please also add a batch-size > 1 test for this, because batch=1 would not catch the issue.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
| elif self.loss_func == "mae": | ||
| return F.l1_loss(pred, tgt, reduction="sum") | ||
| elif self.loss_func == "rmse": | ||
| return torch.sqrt(F.mse_loss(pred, tgt, reduction="mean")) |
There was a problem hiding this comment.
rmse has different reduction semantics from the other branches: mae and smooth_mae return a sum that is later divided by natoms for atomic terms, while rmse returns sqrt(mean(...)). That makes loss_func change the effective prefactor/scale rather than only the pointwise penalty shape.
It would be better to normalize all branches consistently, ideally with a clear definition such as per-frame totals and per-atom averages.
— OpenClaw 2026.5.28 (model: custom-chat-jinzhezeng-group/gpt-5.5)
There was a problem hiding this comment.
Pull request overview
This PR introduces a new DeepPopulation workflow to fit per-atom (alpha, beta) electron populations using the PyTorch backend, integrating it into training, model construction, and inference, and providing docs/examples.
Changes:
- Added a new PyTorch population fitting net + atomic model + model wrapper registered under
fitting_net.type = "population"/ model type"population". - Added a new
PopulationLossand wired it into the PyTorch training loss factory. - Added inference support via
DeepPopulation, plus documentation and example training input/data scaffolding.
Reviewed changes
Copilot reviewed 17 out of 23 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| examples/population/train/input_torch.json | Example PyTorch training configuration for population fitting. |
| examples/population/data/type.raw | Example dataset type indices for the population example. |
| examples/population/data/type_map.raw | Example dataset type map for the population example. |
| doc/model/train-fitting-population.md | New user documentation for population fitting (config, loss, data format, workflow). |
| deepmd/utils/argcheck.py | Registers population fitting-net args and population loss args for config validation/docs. |
| deepmd/pt/train/training.py | Wires loss.type == "population" to instantiate PopulationLoss. |
| deepmd/pt/model/task/population.py | Implements PopulationFittingNet registered as Fitting.register("population"). |
| deepmd/pt/model/task/init.py | Exports PopulationFittingNet from the task package. |
| deepmd/pt/model/model/population_model.py | Implements PopulationModel registered as BaseModel.register("population"). |
| deepmd/pt/model/model/init.py | Routes fitting_net.type == "population" to PopulationModel; exports it. |
| deepmd/pt/model/atomic_model/population_atomic_model.py | Adds DPPopulationAtomicModel wrapper enforcing the population fitting net. |
| deepmd/pt/model/atomic_model/init.py | Exports DPPopulationAtomicModel. |
| deepmd/pt/loss/population.py | Adds PopulationLoss implementation. |
| deepmd/pt/loss/init.py | Exports PopulationLoss. |
| deepmd/pt/infer/deep_eval.py | Selects DeepPopulation wrapper when the model outputs include population. |
| deepmd/infer/deep_population.py | Adds DeepPopulation inference API returning (nframes, natoms, 2) populations. |
| deepmd/infer/deep_eval.py | Adds output-name mapping for population in backend translation. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| self.task_dim = 2 | ||
| self.var_name = "atom_population" | ||
| self.loss_func = loss_func |
| pop_pred = model_pred["population"].reshape([-1, self.task_dim]) | ||
| pop_label = label["atom_population"].reshape([-1, self.task_dim]) | ||
|
|
| more_loss["spin_total"] = spin_total_pred | ||
| more_loss["spin_loss"] = spin_loss | ||
| more_loss["spin_total_loss"] = spin_total_loss | ||
| more_loss["pop_loss"] = pop_loss | ||
| more_loss["pop_alpha_total_loss"] = pop_alpha_total_loss | ||
| more_loss["pop_beta_total_loss"] = pop_beta_total_loss |
| \mathcal{L}_{N} = | ||
| & \lambda_N \left( N^{ML} - N^{DFT} \right) + | ||
| \\ & \lambda_{N_{\alpha}} \left( N^{ML}_{\alpha} - N^{DFT}_{\alpha} \right) + | ||
| \\ & \lambda_{N_{\beta}} \left( N^{ML}_{\beta} - N^{DFT}_{\beta} \right) + |
|
|
||
| This includes contributions from: | ||
|
|
||
| - The total number of electrons ( N ), |
| loss_func: str = "smooth_mae", | ||
| metric: list[str] | None = None, | ||
| starter_learning_rate: float = 1.0, | ||
| start_pref_spin: float = 1.00, |
| class PopulationLoss(TaskLoss): | ||
| """Loss function for training the atomic charge population model. | ||
|
|
||
| Computes weighted L1 losses for per-atom spin, total spin, per-atom | ||
| population, and total alpha/beta population channels. | ||
| """ |
We implement the DeepPopulation model, which can be used to fit the atomic charge population.
See the preprint on arXiv for details and an example of how the DeepPopulation model was used to study small polaron transport with DeepPolaron.
The Fitting Network
The JSON of
populationtype should be provided liketypespecifies which type of fitting net should be used. It should bepopulation.enermode.Loss
The loss function for DeepPopulation has the form:
This includes contributions from:
The loss section should be provided like
Training Data Preparation
The atomic charge population file should be named
atomic_population.npy, with shape[num_frames, num_atoms, 2], where[num_frames, num_atoms, 0]represents the alpha spin channel and[num_frames, num_atoms, 1]represents the beta spin channel.Summary by CodeRabbit
New Features
Documentation
Examples