fix 4326: bound exported dynamic shapes and normalize symbolic slice …#4341
fix 4326: bound exported dynamic shapes and normalize symbolic slice …#4341micwill755 wants to merge 4 commits into
Conversation
|
@micwill755 |
|
|
||
| with gm.graph.inserting_before(node): | ||
| dim_size = gm.graph.call_function( | ||
| torch.ops.aten.sym_size.int, args=(input_node, dim) |
There was a problem hiding this comment.
Nice yeah I think is exactly what we are looking for.
If you want something a bit cleaner to read, we have this utility:
https://github.com/pytorch/TensorRT/blob/main/py/torch_tensorrt/dynamo/lowering/_SubgraphBuilder.py
but its more style than functional
There was a problem hiding this comment.
Cool, let me review and edit accordingly.
| partitioned_module.recompile() | ||
|
|
||
|
|
||
| def _apply_dynamic_shape_bounds( |
There was a problem hiding this comment.
@apbose please review / comment on if this overlaps with your work
There was a problem hiding this comment.
This was adapted from a previous fix in Wenbing’s branch: _apply_dynamic_shape_bounds.
It fixes the case where ZoomASR is exported with broad or unbounded Dim.DYNAMIC ranges by applying the user-provided torch_tensorrt.Input min/max bounds during Torch-TensorRT compilation. Without this compiler-side fix, I had to work around the issue in the ZoomASR export code by changing the dynamic shape annotations from:
"features": {0: Dim.DYNAMIC, 1: Dim.DYNAMIC, 2: Dim.STATIC},
"feature_lengths": {0: Dim.DYNAMIC},
to explicitly bounded dimensions:
batch_dim = Dim("batch", min=1, max=max_batch_size)
feature_len_dim = Dim("feature_len", min=1, max=3000)
dynamic_shapes={
"features": {0: batch_dim, 1: feature_len_dim, 2: Dim.STATIC},
"feature_lengths": {0: batch_dim},
}
With _apply_dynamic_shape_bounds, the model export can keep using Dim.DYNAMIC, while Torch-TensorRT still constructs the intended bounded TensorRT optimization profile from the provided input specs.
There was a problem hiding this comment.
My only concern with this code and Wenbings prior version of this is im not sure it properly inserts the updated shapes. Like I would expect some form of shape prop done here as well after constraining the placeholders. If we can solve this in export with explicit ranges, it might be better to split this into its own PR that fully handles constraining an already exported exported program. It should still unblock Shane without and the explicit shape in export right?
There was a problem hiding this comment.
Agreed and yes this will unblock Shane.
There was a problem hiding this comment.
yeah PR 4213 addresses this, validating the input bounds against the exporter bounds.
- One issue against changing the var_to_range is that it tracks the exporter bounds, and if we change the Input bounds for a later compile, it conflates the Input bound of TRT profile with the export bound, and can lead to error. Also it does not do the shape prop. So something like this fails in above
ep = torch.export.export(model, (sample,),
dynamic_shapes={"x": {0: Dim.DYNAMIC}})
# var_to_range[s0] = [2, int_oo] -- model valid for any size
Now two separate compilations:
# User A: small deployment, max batch 100
trt_a = torchtrt.dynamo.compile(ep, inputs=[Input(max_shape=(100, 8))])
# _apply_dynamic_shape_bounds mutates var_to_range[s0] → [2, 100]
# User B: larger deployment, max batch 500
trt_b = torchtrt.dynamo.compile(ep, inputs=[Input(max_shape=(500, 8))])
# _apply_dynamic_shape_bounds now sees var_to_range[s0] = [2, 100]
# computes min(100, 500) = 100
# User B silently gets max=100 instead of 500
- Second type which the other PR addresses is — if the export used explicit Dim("batch", min=10, max=20) and the user passes Input(min_shape=2) this method will clamp input bound to 10 ignoring 2.
There was a problem hiding this comment.
@apbose and @micwill755 can you guys figure out if a follow on PR is needed for this or if we should be covered by #4213
There was a problem hiding this comment.
The removal of no-op for sym_min and also the negative symbolic slice bounds are something which PR 4213 does not cover, which are needed
There was a problem hiding this comment.
synced with @micwill755. I will merge 4213 and he will rebase from there.
There was a problem hiding this comment.
I dont think he needs to rebase, I think we drop this specific change, merge the rest and then look at readding it on top of #4213 if needed
|
|
||
| from .pass_utils import clean_up_graph_after_modifications | ||
|
|
||
|
|
There was a problem hiding this comment.
cant we handle this case in the converter itself?
There was a problem hiding this comment.
Is there a reason we would prefer that?
There was a problem hiding this comment.
I think generally simpler converters and more in lowering is better, but I think the line needs to be clear, either we normalize dimensions in the graph or we do it in the converter
There was a problem hiding this comment.
going through this case ya lowering seems better, since the negative seems to be ITensor, so we would need extra Iselect layers, making the converter more complicated. Though we have done this before in converters, but if we want simpler converters now, doing in lowering makes more sense
|
|
||
|
|
||
| @dynamo_tensorrt_converter(torch.sym_min, supports_dynamic_shapes=True) | ||
| @dynamo_tensorrt_converter(torch.ops.aten.minimum.default, supports_dynamic_shapes=True) |
There was a problem hiding this comment.
why do we have this in both the converter and a lowering pass to remove sym_min?
There was a problem hiding this comment.
Lowering pass should only remove a specific instance of sym_min
Description
This change fixes three dynamic-shape issues exposed by the ZoomASR repro. Together, these fixes make Torch-TensorRT respect user-provided input bounds, lower negative symbolic slicing into TensorRT-friendly shape math, and clean up symbolic expressions that otherwise survive into runtime or conversion.
A previous working branch had a helper called _apply_dynamic_shape_bounds in Torch-TensorRT _compiler.py. That function propagated user-provided torch_tensorrt.Input min/max bounds into the FX shape_env, which explains why the old path used the intended ZoomASR feature length bound of 3000 instead of the broad exported range [9, 16384].
PyTorch 2.12 can emit slices with negative symbolic bounds, such as x[-n:] or x[:-n]. This change adds a lowering pass that rewrites those bounds to dim_size - n for both slice start and stop, making the graph explicit and TensorRT-friendly.
The issue discussion mentioned removing torch.sym_min(x, INT64_MAX) as a possible needed patch. This change adds that as a separate lowering pass; it removes the no-op sym_min expression before runtime, preventing failures where torch.sym_min receives tensor-like inputs.
Fixes #4326
Type of change
Checklist: