
ddpm model/pipeline review #13649

@hlky

Description


Commit tested: 0f1abc4ae8b0eb2a3b40e82a310507281144c423

Review performed against the repository review rules.

Coverage status: fast DDPM tests exist alongside a slow accelerator integration test. Local run: 2 passed, 1 skipped for tests/pipelines/ddpm/test_ddpm.py; the skipped test is marked @slow.

Issue 1: DDPMPipeline does not create latents on the offload execution device

Affected code:

if self.device.type == "mps":
    # randn does not work reproducibly on mps
    image = randn_tensor(image_shape, generator=generator, dtype=self.unet.dtype)
    image = image.to(self.device)
else:
    image = randn_tensor(image_shape, generator=generator, device=self.device, dtype=self.unet.dtype)

# set step values
self.scheduler.set_timesteps(num_inference_steps)

for t in self.progress_bar(self.scheduler.timesteps):
    # 1. predict noise model_output
    model_output = self.unet(image, t).sample

    # 2. compute previous image: x_t -> x_t-1
    image = self.scheduler.step(model_output, t, image, generator=generator).prev_sample

    if XLA_AVAILABLE:
        xm.mark_step()

image = (image / 2 + 0.5).clamp(0, 1)
image = image.cpu().permute(0, 2, 3, 1).numpy()
if output_type == "pil":
    image = self.numpy_to_pil(image)

if not return_dict:
    return (image,)

return ImagePipelineOutput(images=image)

Problem:
DDPMPipeline declares model_cpu_offload_seq = "unet" but initializes image on self.device, not self._execution_device, and never calls self.maybe_free_model_hooks() before returning. Under enable_model_cpu_offload(), self.device remains CPU while the UNet forward is executed on the accelerator by accelerate hooks. That leaves the scheduler step operating on mismatched CPU/GPU tensors.
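
A quick way to see the mismatch (a sketch; assumes `pipe` is built and offloaded as in the reproduction below):

# Under model offload, the pipeline still reports CPU as its device while
# accelerate hooks run the UNet on the accelerator.
print(pipe.device)             # cpu  -> where the latents are created today
print(pipe._execution_device)  # cuda -> where the UNet forward actually runs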

Duplicate search:
No duplicate found for DDPMPipeline enable_model_cpu_offload, DDPMPipeline maybe_free_model_hooks, or DDPMPipeline _execution_device.

Impact:
Users enabling CPU offload can hit device mismatch failures or leave the UNet resident on the accelerator after the call, defeating the offload contract.

Reproduction:

import torch
from diffusers import DDPMPipeline, DDPMScheduler, UNet2DModel

if not torch.cuda.is_available():
    raise SystemExit("CUDA/accelerator required")

unet = UNet2DModel(
    block_out_channels=(4, 8),
    layers_per_block=1,
    norm_num_groups=4,
    sample_size=8,
    in_channels=3,
    out_channels=3,
    down_block_types=("DownBlock2D", "AttnDownBlock2D"),
    up_block_types=("AttnUpBlock2D", "UpBlock2D"),
)
pipe = DDPMPipeline(unet=unet, scheduler=DDPMScheduler(num_train_timesteps=2))
pipe.set_progress_bar_config(disable=True)
pipe.enable_model_cpu_offload(device="cuda")

pipe(num_inference_steps=1, output_type="np")

Relevant precedent:

image = randn_tensor(image_shape, generator=generator, device=self._execution_device, dtype=self.unet.dtype)

def maybe_free_model_hooks(self):
    r"""
    Method that performs the following:
    - Offloads all components.
    - Removes all model hooks that were added when using `enable_model_cpu_offload`, and then applies them
      again. In case the model has not been offloaded, this function is a no-op.
    - Resets stateful diffusers hooks of denoiser components if they were added with
      [`~hooks.HookRegistry.register_hook`].

    Make sure to add this function to the end of the `__call__` function of your pipeline so that it
    functions correctly when applying `enable_model_cpu_offload`.
    """

Suggested fix:

device = self._execution_device

if device.type == "mps":
    image = randn_tensor(image_shape, generator=generator, dtype=self.unet.dtype)
    image = image.to(device)
else:
    image = randn_tensor(image_shape, generator=generator, device=device, dtype=self.unet.dtype)

# ...

if output_type == "pil":
    image = self.numpy_to_pil(image)

self.maybe_free_model_hooks()
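
With both changes applied, the reproduction above should complete, and because maybe_free_model_hooks() offloads all components on exit, the UNet should end up back on CPU (a sketch, not a verified result):

# Sketch: after a successful call, hook teardown should return the UNet to CPU.
pipe(num_inference_steps=1, output_type="np")
assert next(pipe.unet.parameters()).device.type == "cpu"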

Issue 2: Generator lists are not validated against batch_size

Affected code:

if self.device.type == "mps":
    # randn does not work reproducibly on mps
    image = randn_tensor(image_shape, generator=generator, dtype=self.unet.dtype)
    image = image.to(self.device)
else:
    image = randn_tensor(image_shape, generator=generator, device=self.device, dtype=self.unet.dtype)

Problem:
DDPMPipeline.__call__ accepts generator: torch.Generator | list[torch.Generator], but it does not validate list length. A one-element list for a larger batch is silently treated like a single generator, and other short lists can fail with a raw IndexError.
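
Both behaviors come straight from the helper (a sketch of randn_tensor's current handling; the import path is diffusers.utils.torch_utils):

import torch
from diffusers.utils.torch_utils import randn_tensor

shape = (3, 3, 8, 8)

# A one-element list is unwrapped to a single generator, so all three samples
# share one RNG stream instead of failing fast.
latents = randn_tensor(shape, generator=[torch.Generator("cpu").manual_seed(0)])
print(latents.shape)  # torch.Size([3, 3, 8, 8])

# A two-element list for batch_size=3 fails with a raw IndexError inside the helper.
randn_tensor(shape, generator=[torch.Generator("cpu").manual_seed(i) for i in range(2)])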

Duplicate search:
No duplicate found for DDPMPipeline generator list or DDPMPipeline generator list length.

Impact:
Per-sample seeding behaves inconsistently and invalid inputs do not produce the clear ValueError users get from related pipelines.

Reproduction:

import torch
from diffusers import DDPMPipeline, DDPMScheduler, UNet2DModel

unet = UNet2DModel(
    block_out_channels=(4, 8),
    layers_per_block=1,
    norm_num_groups=4,
    sample_size=8,
    in_channels=3,
    out_channels=3,
    down_block_types=("DownBlock2D", "AttnDownBlock2D"),
    up_block_types=("AttnUpBlock2D", "UpBlock2D"),
)
pipe = DDPMPipeline(unet=unet, scheduler=DDPMScheduler(num_train_timesteps=2))
pipe.set_progress_bar_config(disable=True)

pipe(
    batch_size=3,
    generator=[torch.Generator(device="cpu").manual_seed(i) for i in range(2)],
    num_inference_steps=1,
    output_type="np",
)

Relevant precedent:

if isinstance(generator, list) and len(generator) != batch_size:
    raise ValueError(
        f"You have passed a list of generators of length {len(generator)}, but requested an effective batch"
        f" size of {batch_size}. Make sure the batch size matches the length of the generators."
    )

image = randn_tensor(image_shape, generator=generator, device=self._execution_device, dtype=self.unet.dtype)

Suggested fix:

if isinstance(generator, list) and len(generator) != batch_size:
    raise ValueError(
        f"You have passed a list of generators of length {len(generator)}, but requested an effective batch"
        f" size of {batch_size}. Make sure the batch size matches the length of the generators."
    )
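
With the check in place, the mismatched call from the reproduction fails with the descriptive ValueError, while a matching list seeds each sample independently (a sketch; assumes `pipe` from the reproduction above):

generators = [torch.Generator(device="cpu").manual_seed(i) for i in range(3)]
images = pipe(batch_size=3, generator=generators, num_inference_steps=1, output_type="np").images
assert images.shape[0] == 3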

Issue 3: Invalid output_type values silently return NumPy

Affected code:

image = (image / 2 + 0.5).clamp(0, 1)
image = image.cpu().permute(0, 2, 3, 1).numpy()
if output_type == "pil":
    image = self.numpy_to_pil(image)

Problem:
Any output_type other than "pil" falls through to NumPy output. This includes typos and common values like "pt", with no warning or error.
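
The fall-through also affects output_type="pt", which a caller would reasonably expect to return tensors (a sketch; assumes `pipe` from the reproduction below):

images = pipe(num_inference_steps=1, output_type="pt").images
print(type(images).__name__)  # ndarray, not Tensor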

Duplicate search:
No duplicate found for DDPMPipeline output_type or DDPMPipeline invalid output_type.

Impact:
A misspelled output type can silently change behavior, making downstream code receive np.ndarray when the caller expected a different format or a validation error.

Reproduction:

from diffusers import DDPMPipeline, DDPMScheduler, UNet2DModel

unet = UNet2DModel(
    block_out_channels=(4, 8),
    layers_per_block=1,
    norm_num_groups=4,
    sample_size=8,
    in_channels=3,
    out_channels=3,
    down_block_types=("DownBlock2D", "AttnDownBlock2D"),
    up_block_types=("AttnUpBlock2D", "UpBlock2D"),
)
pipe = DDPMPipeline(unet=unet, scheduler=DDPMScheduler(num_train_timesteps=2))
pipe.set_progress_bar_config(disable=True)

images = pipe(num_inference_steps=1, output_type="definitely-not-valid").images
print(type(images).__name__)  # ndarray

Relevant precedent:

# Follows diffusers.VaeImageProcessor.postprocess
def postprocess_image(self, sample: torch.Tensor, output_type: str = "pil"):
    if output_type not in ["pt", "np", "pil"]:
        raise ValueError(
            f"output_type={output_type} is not supported. Make sure to choose one of ['pt', 'np', or 'pil']"
        )

    # Equivalent to diffusers.VaeImageProcessor.denormalize
    sample = (sample / 2 + 0.5).clamp(0, 1)
    if output_type == "pt":
        return sample

    # Equivalent to diffusers.VaeImageProcessor.pt_to_numpy
    sample = sample.cpu().permute(0, 2, 3, 1).numpy()

if output_type not in ["latent", "pt", "np", "pil"]:
    deprecation_message = (
        f"the output_type {output_type} is outdated and has been set to `np`. Please make sure to set it to one of these instead: "
        "`pil`, `np`, `pt`, `latent`"
    )
    deprecate("Unsupported output_type", "1.0.0", deprecation_message, standard_warn=False)
    output_type = "np"

if output_type == "latent":
    return image

image = self._denormalize_conditionally(image, do_denormalize)

if output_type == "pt":
    return image
Suggested fix:

if output_type not in ["pt", "np", "pil"]:
    raise ValueError(
        f"output_type={output_type} is not supported. Make sure to choose one of ['pt', 'np', 'pil']"
    )

image = (image / 2 + 0.5).clamp(0, 1)
if output_type != "pt":
    image = image.cpu().permute(0, 2, 3, 1).numpy()
    if output_type == "pil":
        image = self.numpy_to_pil(image)
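
With the fix, "pt" keeps the image as a tensor on the pipeline output and invalid strings fail fast (a sketch; assumes the reproduction setup above plus import torch):

images = pipe(num_inference_steps=1, output_type="pt").images
assert isinstance(images, torch.Tensor)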
