Raise ValueError instead of tearing down CUDA when AuraFlow latents exceed pos_embed_max_size #13740
Open
Ricardo-M-L wants to merge 1 commit into
Why
Fixes #12656.
When the input latent grid exceeds the pretrained positional embedding grid,
`pe_selection_index_based_on_dim` silently produces negative / out-of-range gather indices. On CUDA this trips a `vectorized_gather_kernel` device-side assert, which destroys the CUDA context for the entire process and forces a Python restart.
Fix
Check the bounds up front and raise a `ValueError` with a clear message about the largest supported resolution, matching how `PatchEmbed.cropped_pos_embed` in `models/embeddings.py` handles the same situation for SD3.
```
AuraFlow positional embedding only supports up to N latent tokens
per axis, but got M. Reduce height/width below ...
```
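To illustrate the idea of checking bounds before building gather indices, here is a minimal, framework-free sketch. It is not the code in this PR: the function name `centered_crop_indices` and its parameters are hypothetical, and it assumes the crop is a window centered in a square positional grid. The error wording only loosely echoes the message above.

```python
def centered_crop_indices(h_tokens, w_tokens, grid_side):
    """Return flat indices of a centered h_tokens x w_tokens window
    inside a grid_side x grid_side positional grid (hypothetical helper)."""
    # Validate first: an oversized window would otherwise yield a negative
    # start offset and indices outside the grid.
    if h_tokens > grid_side or w_tokens > grid_side:
        raise ValueError(
            f"Positional embedding only supports up to {grid_side} latent "
            f"tokens per axis, but got {max(h_tokens, w_tokens)}."
        )
    start_h = grid_side // 2 - h_tokens // 2
    start_w = grid_side // 2 - w_tokens // 2
    return [
        row * grid_side + col
        for row in range(start_h, start_h + h_tokens)
        for col in range(start_w, start_w + w_tokens)
    ]


# A window that fits yields valid indices.
print(centered_crop_indices(2, 2, 4))

# An oversized window raises a catchable exception instead of producing
# out-of-range indices.
try:
    centered_crop_indices(6, 6, 4)
except ValueError as err:
    print(f"caught: {err}")
```

Because the check runs on plain Python integers before any index tensor is built, the failure surfaces as an ordinary exception on the host rather than as an assert inside a device kernel.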
Verification
13 LOC, 1 file. Pure error-path improvement; the happy path is unchanged. Without the fix, the failure mode is a CUDA context tear-down that requires a process restart; with it, the user gets a clean exception they can catch.