Add solve fuzzer and fix the soundness bugs it surfaced by alexreinking · Pull Request #9105 · halide/Halide

alexreinking · 2026-04-23T16:13:27Z

Summary

New libfuzzer harness at test/fuzz/solve.cpp for solve_expression, solve_for_{inner,outer}_interval, and and_condition_over_domain, running over random expression trees up to depth 6 across Int(8..64), UInt(1..64), Float(32), Float(64), with broadcasts and ramps enabled.
Fixes every soundness bug the fuzzer surfaced along the way — five in Solve.cpp, four in Bounds.cpp, and one in Simplify_Cast.cpp — each with a companion regression test in test/correctness/solve.cpp that reproduces the failure without the fix.
Clean at 250k+ fuzz iterations on the full type set after the fixes.

Solve.cpp

visit_cmp a + b @ c -> a @ c - b flipped ordering under modular wrap; restrict to no_overflow_int, keep EQ/NE.
visit_cmp Mul rewrites introduced div-by-zero UB on non-constant multipliers; move EQ/NE cases under the positive-const guard and gate on no_overflow_int.
visit(Add) a + a -> a * 2 aborted on UInt(1); switch to Mul::make(a, make_const(a.type(), 2)).
visit(Add)/visit(Sub) f(x)/a + g(x) -> (f(x) + g(x)*a)/a wrapped on narrow types; gate on no_overflow_int.
Canonicalize two inlined copies of is_int() && bits() >= 32 to no_overflow_int.

Bounds.cpp

visit(Cast) shortcut silently wrapped unsigned narrowings to signed (UInt(64) -> Int(32) carried a wrapped-negative as max); restrict to non-unsigned sources.
Bounded signed-int sources could still wrap and invert the interval (e.g. int64 [59, 4294967287] -> int32 [59, -9]); further restrict the shortcut to unbounded signed-int sources and let bounded ones fall through to the existing can_prove(cast(from, cast(to, x)) == x) fit check.
Float sources hit the same flavor of bug (two float endpoints wrap to opposite sides of zero, inverting the int interval); drop the float shortcut entirely — the bounded-a path already does a sound to.can_represent(*fmin) fit check.
visit(Mod) claimed [0, ...] even for floats; fmod takes the sign of the dividend and is NaN on zero divisor, so return bounds_of_type(t) for float types.

Simplify_Cast.cpp

Nested-cast rule stripped the inner cast on the "sign/zero extend then truncate" theory, but that only holds when the inner source is int-or-uint. For a float source (e.g. int32(uint64(float64(-21)))) stripping changes the value. Add the is_int_or_uint() guard.

Test plan

ctest -R 'correctness_(solve|bounds|simplify)$' passes
test/fuzz/fuzz_solve -runs=10000 across multiple seeds with floats enabled runs clean
Reverting each fix individually reproduces the corresponding regression test failure

Fixes #9101

🤖 Generated with Claude Code

Adds a libfuzzer harness at test/fuzz/solve.cpp that exercises solve_expression (semantic equivalence under random substitutions), solve_for_inner_interval / solve_for_outer_interval (inside-inner implies true; outside-outer implies false), and and_condition_over_domain (weakened condition implies input) on random expression trees up to depth 6 across Int(8..64), UInt(1..64), Float(32), Float(64), with broadcasts and ramps enabled. The random_expr_generator now skips bitwise ops on float types, which would otherwise abort the fuzzer on construction. Alongside the fuzzer, this fixes every soundness bug the fuzzer surfaced while building it up to the wide type set: Solve.cpp (src/Solve.cpp): - visit_cmp rewrote `a + b @ c` to `a @ c - b` for every comparison, but the subtraction wrap flips ordering (LT/LE/GT/GE) under modular arithmetic. Restrict ordering rewrites to no_overflow_int; EQ/NE stay (modular equivalence preserves equality). - visit_cmp's Mul rewrites `a*c @ b <=> a @ b/c && b%c == 0` computed b/c and b%c for any shape of c, introducing div-by-zero UB when c was a non-constant that evaluated to zero. The code comment claimed it only fired on constant positive multipliers; move the EQ/NE cases inside that guard and also require no_overflow_int. Narrow signed types (where overflow is still UB) go through the same gate. - visit(Add)'s `a + a -> a * 2` used `Expr * int`, which aborts on UInt(1) ("Integer constant 2 ... coerced to type uint1"). Switch to Mul::make(a, make_const(a.type(), 2)); make_const truncates modulo width, which is the correct modular value. - visit(Add)/visit(Sub) `f(x)/a + g(x) -> (f(x) + g(x)*a)/a` only holds under non-wrapping arithmetic. Gate on no_overflow_int. - Canonicalize two inlined `a.type().is_int() && bits() >= 32` predicates to no_overflow_int. Bounds.cpp (src/Bounds.cpp): - visit(Cast) took a blanket "no overflow possible" shortcut for signed-int/at-least-int32 destinations regardless of the source, which silently wrapped UInt(32)/UInt(64) -> Int(32) narrowings and carried a wrapped-negative as "max". Restrict the shortcut to non-unsigned sources. - After the first restriction, the fuzzer also found that *bounded* signed-int sources can exceed the destination and wrap (e.g. int64 bounds [59, 4294967287] becoming int32 [59, -9], an inverted interval that downstream treats as "always true"). Further restrict the shortcut to unbounded signed-int sources; let bounded ones fall through to the existing can_prove(cast(from, cast(to, x)) == x) fit check. - Later, the fuzzer (with floats enabled) caught the same flavor of bug for float sources: the comment claimed "float-to-int casts truncate in place and preserve interval orientation", but IntImm::make wraps by sign-extending low bits, and two float endpoints can wrap to opposite sides of zero (inverting the interval). Drop the float shortcut entirely; the bounded-a path already does a sound to.can_represent(*fmin) fit check with a fallback to bounds_of_type(to). - visit(Mod) unconditionally claimed the result is in [0, ...], which matches integer mod but not fmod (sign of the dividend; NaN on zero divisor). For float types, return bounds_of_type(t) (unbounded). Simplify_Cast.cpp (src/Simplify_Cast.cpp): - A nested-cast simplification stripped the inner cast when the outer was narrower, on the grounds that the inner is always a sign or zero extend. That assumption only holds when the inner cast's source is itself int-or-uint; for a float source, the inner is an fp-to-int conversion and stripping it changes the value. Concretely, `int32(uint64(float64(-21)))` evaluates to 0 (fp-to-unsigned of negative is implementation-defined, often 0), but the stripped `int32(float64(-21))` evaluates to -21. Add the `cast->value.type().is_int_or_uint()` guard, mirroring the sibling rule above it. Each of the solver and bounds fixes has a companion regression test in test/correctness/solve.cpp that, prior to the fix, reproduces the specific failure mode (verified by reverting each fix individually). With all fixes in place, the fuzzer runs clean for 250k+ iterations across the full type set; the correctness suite (solve, bounds, simplify, cast) still passes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The previous commit dropped the float-source shortcut from visit(Cast) to fix the fuzzer-found unsoundness where constant float bounds outside int range wrapped via IntImm::make and inverted the resulting int interval. That was too broad: bilateral_grid (and similar pipelines) produce SYMBOLIC float bounds like select(r_sigma > 0, 0/r_sigma, 1/r_sigma) + 0.5 at bounds-inference time, which can't be proven to fit in int range but are load-bearing for downstream BoundsInference to resolve Func call extents -- without them, "histogram calls histogram in an unbounded way in dimension 2" and the build fails. Restore the float-source shortcut for bounded-a, but reject it only when an endpoint is a float *constant* we can prove exceeds the int range. Symbolic bounds pass through (matching Halide's convention that out-of-range float-to-int is the programmer's problem); constant bounds that would wrap do not. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

abadams · 2026-04-23T17:00:41Z

            could_overflow = false;
+        } else if (from.is_float() && to.is_int() && to.bits() >= 32) {
+            // Float source to int32+: out-of-range float-to-signed-int
+            // casts are implementation-defined, so Halide's convention


This is slightly misleading. Float to signed int casts in Halide saturate for bits <= 16.

abadams · 2026-04-23T17:03:44Z


-        // Mod is always positive
+        if (t.is_float()) {
+            // fmod's result takes the sign of the dividend (so it can be


Mod in Halide is supposed to be positive. For integers we have the Euclidean identity:

(a / b) * b + (a % b) == a

The division rounds up when b is negative, so the LHS of the mul is larger than the true answer, which means, when multiplied by the negative b, the LHS of the add is smaller than the true answer, which means mod must be positive.

I think the bug might be in lowering of floating point mod. It's currently lowered as:

a - floor(a / b) * b

but it should be

a - floor((a/b) * b)

Thinking about this more, my rationale is this: Floating point mod should coincide with integer mod if the floats happen to be integers. This means if 7 % -2 == 1 in the integers 7.f % -2.f should be 1.f in floats.

Floating point mod should coincide with integer mod if the floats happen to be integers.

I would be wary of deviating from what the hardware does (fmod) for this.

We've already deviated for integer div. We should be consistent.

abadams · 2026-04-23T17:05:01Z

+        // a float source makes `cast` an fp-to-int conversion, whose low
+        // bits are not the same as an fp-to-int conversion of a narrower
+        // type. For example, int32(uint64(float64(-21))) evaluates to 0
+        // (undefined-behavior territory for negative float-to-unsigned)


I don't think casting a negative float to uint is UB in Halide. It's supposed to saturate.

Though again, my understanding may not match the current lowering!

abadams · 2026-04-23T17:07:09Z

+            // for the fit check below to prove, and the destination is
+            // wide enough that any concretization will fit.
+            //
+            // Unsigned sources are excluded unconditionally: unsigned


I don't think this is true either. A uint to int cast is assumed to not overflow.

See #7814 for what happened when I tried to claim they wrap.

abadams · 2026-04-23T17:09:31Z

+            // Several of these rewrites rearrange the comparison by adding
+            // or subtracting on both sides. That's only sound under an
+            // assumption of no integer overflow -- for types that may wrap
+            // (unsigned, or narrow signed whose overflow is UB and could


"narrow signed whose overflow is UB" Not entirely sure what narrow means here but our narrow signed ints don't have UB on overflow.

abadams · 2026-04-23T17:15:26Z

+                    // uint8 with c = 3, b = 7: the rewrite misses the
+                    // solutions that arise from wrap).
+                    //
+                    // Don't use operator/ and operator % to sneak past the


I think the original comment was stale. Division and mod by zero is now fine (returns zero). For the rewrite I think this means:

a * 0 == b <=> a == b / 0

which is wrong, so it's just the comment that needs correcting.

I'm worried that the earlier comment says we would only use this for positive or negative constants, but there's no guard to check that. It might mean that we don't statically know at this point that it's a non-zero constant, but the usage site knows that for some other reason. That's sort of horrifying so I hope it isn't the case.

abadams · 2026-04-23T17:20:40Z

@@ -644,16 +672,14 @@ class SolveExpression : public IRMutator {
                    } else if (is_ne) {
                        // f(x) * c != b -> f(x) != b/c || b%c != 0


Oh I see the other rules were guarded by is_positive_const, just not this one, so it's not the horrifying situation. Let's check

f(x) * 0 == b <=> f(x) == b/0 || b%0 == 0

Simplifying both sides

b == 0 <=> true

Nope. The NE case is also busted.

(meaning the code change here is correct)

abadams · 2026-04-23T17:23:47Z

@@ -0,0 +1,404 @@
+#include "Halide.h"


The solver tests are currently in Solve.cpp at the bottom. It would be good to move all of them here, instead of just making a new collection.

abadams · 2026-04-23T17:24:29Z

+// Returns true if the expression, under the given substitution, contains a
+// division or modulo whose divisor simplifies to zero. Solve is allowed to
+// rearrange such expressions in ways that cause the simplifier to fold the UB
+// inconsistently between the original and rewritten form -- so samples that


Division by zero is not UB!

abadams · 2026-04-23T17:24:49Z

+            val = random_int_val(reg.fuzz, -32, 32);
+        }
+
+        // Skip samples that invoke div/mod-by-zero UB in the input: the


Again, not UB

abadams · 2026-04-23T17:27:41Z

+            // extends the low bits and can invert the resulting
+            // interval, which is worse than just using bounds_of_type.
+            Interval s = a;
+            if (s.has_lower_bound()) s.min = simplify(s.min);


There is a method elsewhere in this file that simplifies whole intervals. It contains a critical difference to this: It only calls the simplifier once if max.same_as(min). This guarantees the two don't become two different but equal Expr objects, which would cause is_single_point to fail.

abadams · 2026-04-23T17:35:26Z

+            double lo = -std::ldexp(1.0, to.bits() - 1);
+            double hi_exclusive = std::ldexp(1.0, to.bits() - 1);
+            bool const_oob = false;
+            if (s.has_lower_bound()) {


This path may break pipelines that rely on Bounds inference "knowing" that cast<int>(abs(unbounded_float)) is non-negative. I really think we need to just throw up our hands here if the cast is to an int32 and assume it's not going to overflow. People use all sorts of types cast to int to index luts, and they assume the bounds on the original are still true through the cast.

abadams · 2026-04-23T17:37:55Z

I think, unfortunately, like other fuzzers this one is going to have to carefully avoid getting itself into a situation where an Int(32) Expr overflows.

abadams · 2026-04-23T17:46:18Z

+        if (t.is_float()) {
+            // fmod's result takes the sign of the dividend (so it can be
+            // negative), its magnitude is bounded by |b| but can reach it,
+            // and fmod(x, 0) = NaN which is not in any real interval. Don't


I think we're assuming no-nans here. Otherwise it would be a hypothetical "strict_mod" intrinsic.

abadams · 2026-04-23T17:48:21Z

        // eliminated. The inner cast is either a sign extend
-        // or a zero extend, and the outer cast truncates the extended bits
+        // or a zero extend, and the outer cast truncates the extended bits.
+        // The requirement that cast->value is itself int-or-uint is crucial:


Comments should not spend this much space arguing why a hypothetical alternative version of this would be buggy. After the first sentence it's enough to just say that everything involved needs to be an int.

abadams · 2026-04-23T17:54:53Z

+            Interval s = a;
+            if (s.has_lower_bound()) s.min = simplify(s.min);
+            if (s.has_upper_bound()) s.max = simplify(s.max);
+            double lo = -std::ldexp(1.0, to.bits() - 1);


Aside: I think we should consider allowing one-line if statements.

- Revert the Bounds.cpp uint-to-int and bounded-int-to-int shortcut restrictions from earlier in the branch. Halide's convention (per PR #7814 discussion) is that uint-to-int casts are assumed not to overflow; tightening the shortcut broke performance_boundary_conditions and bilateral_grid. Keep only the float-to-int precision/soundness fix from this branch. - Drop the two correctness tests that exercised the reverted shortcut (narrowing uint and int casts) -- they were exercising programmer-level contract violations, not real compiler bugs. - Teach the fuzzer to skip substitutions that violate the "assumed not to overflow" narrowing-cast contract, analogous to the existing div/mod-by-zero skip. - Move `solve_test()` and its helpers from the bottom of src/Solve.cpp (previously invoked from test/internal.cpp) into test/correctness/solve.cpp, as abadams suggested keeping all solver tests together. - Use the existing simplify(Interval) helper in the float cast branch so single-point intervals stay same_as-equal after simplify. - Update comments that incorrectly said "UB": narrow signed wraps (not UB), div/mod-by-zero returns 0 (not UB), float-to-uint of negative saturates (not UB), float-to-int for bits <= 16 saturates. - Reframe the float-mod bounds_of change: Halide mod is supposed to be Euclidean non-negative, but the current CodeGen_LLVM lowering `a - b * floor(a/b)` produces negative results when b is negative. bounds_of's unbounded fallback is a workaround until the lowering is fixed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

alexreinking · 2026-04-24T18:05:13Z

Thanks for the thorough review! Pushed a1cde0a82 with the following changes in response:

Semantics clarifications (thanks for setting me straight):

Reverted the Bounds.cpp visit(Cast) shortcut restrictions for uint→int and bounded int→int. I had added those based on fuzzer findings without realizing they conflict with Halide's "assumed not to overflow" convention (per Fix bounds inference for uint -> int casts #7814). Tightening them broke performance_boundary_conditions and bilateral_grid, which both rely on int32(uint32(...)) narrowing carrying bounds through. Kept only the float→int precision/soundness fix.
Dropped the two correctness tests that exercised the reverted shortcut — they were locking in programmer-level contract violations, not compiler bugs.
Taught the fuzzer to skip substitutions that violate the "assumed not to overflow" narrowing-cast contract, analogous to the existing div/mod-by-zero skip.

Comment corrections:

"narrow signed whose overflow is UB" → narrow signed wraps (not UB).
Div/mod-by-zero framing in both Solve.cpp and the fuzzer → returns 0 (not UB); the rewrite is unsound because a*0 == b becomes a == b/0 && b%0 == 0 which collapses to a == 0 && b == 0, losing the original semantics.
int32(uint64(float64(-21))) in Simplify_Cast.cpp → float-to-uint saturates to 0 in Halide (not UB).
Float-to-int implementation-defined comment tightened to note bits ≤ 16 saturates.
Switched to the existing simplify(Interval) helper for the new float-cast branch.

Reorganization:

Moved solve_test() body and its helpers from the bottom of src/Solve.cpp into test/correctness/solve.cpp, as you suggested. test/internal.cpp no longer calls it.

Float mod:

Kept the bounds_of unbounded fallback for float mod, but reframed the comment: Halide mod is supposed to be Euclidean non-negative, but the current CodeGen_LLVM lowering a - b * floor(a/b) produces negative results when b < 0 (e.g. fmod(5, -3) = -1 under that formula). The bounds change is a workaround until the lowering is tightened to enforce the Euclidean invariant. Happy to take a crack at the lowering in a separate PR if you'd like.

Full build + correctness + 200k fuzz iterations still clean.

— Claude

alexreinking marked this pull request as draft April 23, 2026 16:15

abadams reviewed Apr 23, 2026

View reviewed changes

abadams mentioned this pull request Apr 23, 2026

Improve the comments in IR.h to give more detail on intended semantics #9106

Merged

abadams added the dev_meeting Topic to be discussed at the next dev meeting label Apr 24, 2026

		@@ -644,16 +672,14 @@ class SolveExpression : public IRMutator {
		} else if (is_ne) {
		// f(x) * c != b -> f(x) != b/c \|\| b%c != 0

Conversation

alexreinking commented Apr 23, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Solve.cpp

Bounds.cpp

Simplify_Cast.cpp

Test plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

abadams commented Apr 23, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

alexreinking commented Apr 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alexreinking commented Apr 23, 2026 •

edited

Loading