fix(pt_expt): respect model default_fparam in data requirement#5417

Merged
wanghan-iapcm merged 1 commit into deepmodeling:master from wanghan-iapcm:fix-pt-expt-fparam-default
Apr 24, 2026

Conversation

@wanghan-iapcm (Collaborator) commented Apr 23, 2026

Summary

  • pt_expt's get_additional_data_requirement hard-coded default=0.0 for the fparam data item, ignoring the model's default_fparam. For datasets without fparam.npy, the data system fed zeros to the fitting net while pt correctly fed the model's default — after fparam normalisation a constant DC offset (e.g. [0, -100] when stats are avg=[0,1], inv_std=[100,100]) was injected on every step, slowing multi-task training convergence vs the pt baseline.
  • Mirror pt's logic (deepmd/pt/train/training.py:1762-1779): pull default_fparam from the model and set must=not has_default_fparam. The fparam-present path is unchanged (find_fparam=1.0 → the default value is unused).
  • Add a focused unit test that covers both branches (has_default_fparam true / false) per the project's UT-coverage rule.
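The fix described above can be sketched as follows. This is a minimal illustration, not the actual deepmd code: the accessor names (get_dim_fparam, has_default_fparam, get_default_fparam) come from this PR's description, and a plain dict stands in for the real DataRequirementItem.

```python
import numpy as np

# Minimal sketch of the corrected branching; accessor names follow the PR
# description, and the dict stands in for deepmd's DataRequirementItem.
def fparam_requirement(model):
    if model.get_dim_fparam() <= 0:
        return None  # the model consumes no fparam at all
    if model.has_default_fparam():
        # optional: systems lacking fparam.npy fall back to the model default
        return {"key": "fparam", "ndof": model.get_dim_fparam(),
                "must": False, "default": np.asarray(model.get_default_fparam())}
    # no model default: fparam.npy becomes mandatory, so 0.0 is never used
    return {"key": "fparam", "ndof": model.get_dim_fparam(),
            "must": True, "default": 0.0}
```

This mirrors the pt behaviour: the default only matters on the must=False branch, since a present fparam.npy (find_fparam=1.0) always wins.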

Test plan

  • pytest source/tests/pt_expt/test_training.py::TestAdditionalDataRequirement -v — both new branches pass.
  • pytest source/tests/pt_expt/test_training.py::TestTraining -v — 4 existing training-loop tests still pass.
  • pytest source/tests/pt_expt/test_training.py::TestCompiledVaryingNframesWithParams -v — fparam-present compile path with numb_fparam=2 still passes.
  • End-to-end multi-task convergence comparison (pt vs pt_expt) on the user's reported script — requires Electrolyte data not available in this environment; intent is verified analytically.

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Improved frame parameter (fparam) handling: the system now properly respects model-provided default values when determining whether frame parameters are required during training.
  • Tests

    • Added comprehensive test coverage for the frame parameter (fparam) requirement logic.

pt_expt's get_additional_data_requirement hard-coded default=0.0 for
fparam, ignoring the model's default_fparam. For datasets without
fparam.npy this fed zeros to the fitting net while pt fed the model
default — after fparam normalisation a constant DC offset (e.g.
[0, -100]) was injected every step, slowing multi-task training
convergence. Mirror pt's logic so must/default come from the model.
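The [0, -100] figure in the message above follows directly from the usual normalisation (fparam - avg) * inv_std. A quick check, assuming the model default equals avg, so the correct path normalises to zero:

```python
import numpy as np

# With the reported stats avg=[0, 1], inv_std=[100, 100], feeding raw zeros
# (the buggy path) instead of a default equal to avg yields a constant bias.
avg = np.array([0.0, 1.0])
inv_std = np.array([100.0, 100.0])
zeros_fed = np.array([0.0, 0.0])          # buggy pt_expt data-system output
offset = (zeros_fed - avg) * inv_std      # -> [0., -100.] on every step
print(offset)
```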
@dosubot dosubot Bot added the bug label Apr 23, 2026
@wanghan-iapcm wanghan-iapcm requested a review from iProzd April 23, 2026 16:17
coderabbitai Bot commented Apr 23, 2026

📝 Walkthrough

The implementation of get_additional_data_requirement() now conditionally determines fparam requirement status based on whether the model provides a default fparam value. If available, fparam is marked optional with the model's default; otherwise, it becomes required with a 0.0 fallback.

Changes

Cohort / File(s) — Summary
  • Implementation — deepmd/pt_expt/train/training.py: modified get_additional_data_requirement() to check the has_default_fparam() flag; sets must=False and uses the model's default when available, otherwise sets must=True with a 0.0 fallback.
  • Testing — source/tests/pt_expt/test_training.py: added a TestAdditionalDataRequirement suite covering both branches — model with a default fparam (optional, uses the model default) and without (required, uses 0.0).

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 25.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (4 passed)
  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check — ✅ Passed: the title 'fix(pt_expt): respect model default_fparam in data requirement' clearly and specifically summarizes the main change.
  • Linked Issues check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: check skipped because no linked issues were found for this pull request.


Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@coderabbitai Bot (Contributor) left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
source/tests/pt_expt/test_training.py (1)

564-587: Use a tensor default to cover the real model API.

The fake currently returns a Python list, but real models expose get_default_fparam() as torch.Tensor | None. Returning a tensor here makes this regression test cover the conversion path used in production.

🧪 Proposed test adjustment
-            def get_default_fparam(self) -> list[float]:
-                return [0.0, 1.0]
+            def get_default_fparam(self) -> torch.Tensor:
+                device = "cuda" if torch.cuda.is_available() else "cpu"
+                return torch.tensor([0.0, 1.0], device=device)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/pt_expt/test_training.py` around lines 564 - 587, The test's
fake model class _M should return a torch.Tensor from get_default_fparam to
match the real model API and exercise the conversion path in
get_additional_data_requirement; update _M.get_default_fparam to return
torch.tensor([0.0, 1.0]) (and import torch in the test) so fparam_req.default is
produced from a tensor and the existing np.testing.assert_array_equal still
verifies the values for fparam_req in the test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/train/training.py`:
- Around line 122-132: The default fparam may be a CUDA torch.Tensor causing
np.asarray(cuda_tensor) to throw; update the code around
has_default_fparam/_model.get_default_fparam() and the fparam_default assignment
so that if get_default_fparam() returns a torch.Tensor you first detach it and
move it to CPU then convert to numpy (e.g. value.detach().cpu().numpy()), and if
it returns None keep 0.0; ensure this converted numpy array is what you pass as
default into DataRequirementItem("fparam", _model.get_dim_fparam(), ...).
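The conversion this comment asks for can be sketched like so. The helper name is ours, purely illustrative; it duck-types on .detach so the example runs without torch installed, but a real torch.Tensor (including one on CUDA) would take the same path.

```python
import numpy as np

# Illustrative helper (name is not deepmd's): normalise a model-provided
# default fparam before handing it to the data requirement.
def default_fparam_to_numpy(value):
    if value is None:
        return 0.0                       # keep the plain-float fallback
    if hasattr(value, "detach"):         # torch.Tensor, possibly on CUDA
        return value.detach().cpu().numpy()
    return np.asarray(value)             # lists / tuples / ndarrays
```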


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aef05d71-a0a9-48fb-9e30-927287910285

📥 Commits

Reviewing files that changed from the base of the PR and between 54f42d9 and 2548c01.

📒 Files selected for processing (2)
  • deepmd/pt_expt/train/training.py
  • source/tests/pt_expt/test_training.py

Comment thread deepmd/pt_expt/train/training.py
@codecov codecov Bot commented Apr 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.47%. Comparing base (54f42d9) to head (2548c01).
⚠️ Report is 1 commit behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5417   +/-   ##
=======================================
  Coverage   80.46%   80.47%           
=======================================
  Files         821      821           
  Lines       86075    86076    +1     
  Branches     4140     4139    -1     
=======================================
+ Hits        69263    69266    +3     
+ Misses      15536    15534    -2     
  Partials     1276     1276           

☔ View full report in Codecov by Sentry.

@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Apr 24, 2026
Merged via the queue into deepmodeling:master with commit aa44999 Apr 24, 2026
72 of 78 checks passed
@wanghan-iapcm wanghan-iapcm deleted the fix-pt-expt-fparam-default branch April 24, 2026 07:42