Describe the bug
In `monai/losses/image_dissimilarity.py`, `LocalNormalizedCrossCorrelationLoss.__init__` contains two related problems in kernel initialization:
Problem 1: Typo in `require_grads` (silent no-op)

```python
self.kernel = _kernel(self.kernel_size)
self.kernel.require_grads = False  # BUG: 'require_grads' is NOT a valid tensor attribute!
self.kernel_vol = self.get_kernel_vol()
```

`require_grads` (plural, with a trailing `s`) is not a valid PyTorch tensor attribute; the correct attribute is `requires_grad`. Because tensors accept arbitrary Python attributes, this line silently does nothing — it creates a new attribute called `require_grads` on the tensor object instead of controlling gradient tracking. As a result, the kernel tensor silently tracks gradients in every forward pass, consuming unnecessary memory in the computation graph.
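The no-op is reproducible on any plain tensor, independent of MONAI. A minimal sketch of the misspelled attribute being silently accepted:

```python
import torch

kernel = torch.ones(3, requires_grad=True)

kernel.require_grads = False  # typo: silently creates a new Python attribute
print(kernel.requires_grad)   # True -- gradient tracking is unchanged
print(kernel.require_grads)   # False -- spurious attribute, ignored by autograd

kernel.requires_grad = False  # correct spelling actually disables tracking
print(kernel.requires_grad)   # False
```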
Problem 2: Plain attribute assignment instead of `register_buffer`

Using `self.kernel = ...` (plain attribute) instead of `self.register_buffer("kernel", ...)` means:
- When the user calls `loss.to("cuda")`, `loss.cuda()`, or `loss.half()` on the loss module, the kernel does NOT move to the target device — it stays on CPU. This causes a device mismatch at runtime.
- The kernel is not included in `state_dict()` / `load_state_dict()`, which leads to silent inconsistencies when checkpointing the loss object.
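The difference is easy to demonstrate with two toy modules (illustrative names, not MONAI code); `half()` is used here instead of `cuda()` so the sketch runs without a GPU:

```python
import torch
from torch import nn

class PlainAttr(nn.Module):
    def __init__(self):
        super().__init__()
        self.kernel = torch.ones(3)  # plain attribute: invisible to nn.Module

class Buffered(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("kernel", torch.ones(3))  # tracked by nn.Module

plain, buffered = PlainAttr(), Buffered()
print(dict(plain.named_buffers()))        # {} -- kernel not tracked
print("kernel" in buffered.state_dict())  # True -- saved/loaded with checkpoints

# half() (like to()/cuda()) only converts registered tensors
plain.half(); buffered.half()
print(plain.kernel.dtype)     # torch.float32 -- left behind
print(buffered.kernel.dtype)  # torch.float16 -- converted with the module
```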
To Reproduce

```python
import torch
from monai.losses import LocalNormalizedCrossCorrelationLoss

loss = LocalNormalizedCrossCorrelationLoss(spatial_dims=3, kernel_type="gaussian")

# Bug 1: require_grads typo silently does nothing
print(loss.kernel.requires_grad)              # True! Not False as intended
print(hasattr(loss.kernel, 'require_grads'))  # True -- spurious attribute created

# Bug 2: kernel not a registered buffer
print(dict(loss.named_buffers()))  # {} -- kernel is NOT here!
loss.cuda()
print(loss.kernel.device)          # cpu -- kernel did NOT move to GPU!
```
Expected behavior
- `loss.kernel.requires_grad` should be `False`
- `loss.kernel` and `loss.kernel_vol` should appear in `loss.named_buffers()`
- After `loss.cuda()`, `loss.kernel.device` should be `cuda:0`
Fix
Replace both assignments with `register_buffer`:

```python
self.register_buffer("kernel", _kernel(self.kernel_size))
self.register_buffer("kernel_vol", self.get_kernel_vol())
```
This is tracked in PR #8818.
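As a sanity check of the fixed pattern, a hypothetical stand-in module (not the actual MONAI class) shows that `register_buffer` also resolves the gradient concern: tensors created by factory functions default to `requires_grad=False`, and buffers are excluded from `parameters()`, so an optimizer never sees them:

```python
import torch
from torch import nn

class FixedLoss(nn.Module):  # illustrative stand-in for the fixed __init__
    def __init__(self):
        super().__init__()
        self.register_buffer("kernel", torch.ones(3, 3, 3))
        self.register_buffer("kernel_vol", torch.ones(3, 3, 3).sum())

loss = FixedLoss()
print(loss.kernel.requires_grad)           # False
print(sorted(dict(loss.named_buffers())))  # ['kernel', 'kernel_vol']
print(list(loss.parameters()))             # [] -- buffers are not parameters
```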
Environment
Affects all versions. Reproducible on MONAI dev branch as of 2026-04-11.
Related chain of issues
This bug reveals a broader pattern worth auditing across the MONAI losses module:
- `GlobalMutualInformationLoss` — check if `bin_centers` is properly registered as a buffer
- Other custom loss classes that use constant tensors initialized in `__init__`
- Test coverage for device movement of loss modules (`loss.cuda()` should not cause device mismatch)
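One possible shape for such an audit (a hypothetical helper, not an existing MONAI utility) is to flag any tensor stored as a plain instance attribute, since `nn.Module.__setattr__` routes buffers and parameters into `_buffers`/`_parameters` and any `torch.Tensor` left in the instance `__dict__` will be skipped by `.to()`/`.cuda()`:

```python
import torch
from torch import nn

def find_untracked_tensors(module: nn.Module):
    # Buffers/parameters never land in vars(module), so any tensor found
    # there is a plain attribute that device movement will leave behind.
    return [name for name, value in vars(module).items()
            if isinstance(value, torch.Tensor)]

class Buggy(nn.Module):
    def __init__(self):
        super().__init__()
        self.kernel = torch.ones(3)  # plain attribute: flagged

class Fixed(nn.Module):
    def __init__(self):
        super().__init__()
        self.register_buffer("kernel", torch.ones(3))  # buffer: not flagged

print(find_untracked_tensors(Buggy()))  # ['kernel']
print(find_untracked_tensors(Fixed()))  # []
```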