Skip to content

RFC: priority-aware + canonical tie-breaks in tearing (order-robust tear-variable selection)#96

Merged
AayushSabharwal merged 6 commits into
mainfrom
priority-tearing-prototype
Jun 24, 2026
Merged

RFC: priority-aware + canonical tie-breaks in tearing (order-robust tear-variable selection)#96
AayushSabharwal merged 6 commits into
mainfrom
priority-tearing-prototype

Conversation

@baggepinnen

Copy link
Copy Markdown
Contributor

Motivation

SciML/ModelingToolkit.jl#4612: algebraic tearing is equation-order sensitive — reordering four semantically-identical connect statements in a multibody model flips it from reliably-solvable to InitialFailure at every initialization seed. Users currently have no way to influence the tearing result; state_priority is only consulted by dummy-derivative state selection.

What this adds (prototype, draft)

  1. Priority-aware tear-variable selection in CarpanzanoTearing (the default algorithm via DummyDerivativeTearing{CarpanzanoTearing}): heuristics 1 & 2 prefer keeping HIGH-state_priority variables as tear (iteration) variables, mirroring the dummy-derivative semantics. Model libraries can express e.g. "cut forces/torques are poor tear variables" by a negative priority on connector flows.
  2. Canonical (name-rank) tie-break after priorities: SystemStructure gains canonical_ranks (lexicographic rank of variable names, built alongside state_priorities in TearingState); remaining ties resolve by (priority DESC, canonical_rank ASC) — deterministic under equation/declaration reordering.
  3. Component-granular priority lookup in build_state_priorities (scalar variable before parent array) — currently unreachable from the user API because AtomicArrayDict rejects indexed keys (see below), included for forward-compatibility.
  4. (Inert in the default pipeline) the same candidate ordering for tear_graph_modia, and an ENV-gated torn-set dump (MTKT_DUMP_TORN) in reassembly that was essential for diagnosing all of this — can be dropped from the final PR if unwanted.

All changes are behavior-identical when priorities are uniform and names are not consulted (strict comparisons keep first-found candidates).

Validation (multibody quarter-car from MTK#4612, both connect orderings)

configuration torn reaction (order A / order B) outcome
baseline body_upright.frame_b.f[3] / b1.frame_a.tau[3] solves / fails all seeds
+ class priorities (all connector f,tau = −2) steered, but still order-split (uniform priority ⇒ new ties) solves / fails
+ canonical tie-break (this PR) r2.frame_b.tau[3] / r2.frame_b.tau[3] — converged solves / still fails (see below)

Known gaps (intentionally out of scope here, for discussion)

  • The remaining order-divergence in the test model is a within-array component choice (v_0[1] vs v_0[3]) that is decided before the Carpanzano heuristics — in the initial maximal_matching / single-solvable-equation cascade, which iterate in equation order. Full determinism additionally needs canonical ordering at those stages (equation-level canonicalization is harder: equations have no stable identity under reordering).
  • Per-component user priorities (v_0[1] => 5) are rejected by AtomicArrayDict ("treats symbolic arrays as atomic") — the user-facing representation needs extending before item 3 is reachable.
  • No unit tests yet; draft for design feedback first.

cc SciML/ModelingToolkit.jl#4612, #95

@AayushSabharwal AayushSabharwal left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty good overall. I'm a little concerned about the canonical_ranks building. Calling string on a variable/expression is bad and I want to avoid it as much as possible (including finding a way around the lexicographic sorting of equations). It may be possible to replace the strings with a Vector{Union{Symbol, Int}} with some amount of work.

I also would like to avoid merging this until the effect on TTFX is measured, given the env-var based dump (despite its usefulness) and additional string-ification.

Comment thread src/modia_tearing.jl Outdated
# semantics of dummy-derivative state selection. The sort is
# stable, so behavior is unchanged whenever priorities are equal
# (e.g. all default 0). Do not mutate the graph's adjacency list.
vs = sort(Int[v for v in vs]; by = varpriority, alg = Base.Sort.DEFAULT_STABLE)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
vs = sort(Int[v for v in vs]; by = varpriority, alg = Base.Sort.DEFAULT_STABLE)
vs = copy(vs)
sort!(vs; by = varpriority, alg = Base.Sort.DEFAULT_STABLE)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied (with sort! on the copy), thanks.

@baggepinnen

Copy link
Copy Markdown
Contributor Author

Addressed the review in 9bed425:

No more stringification

canonical_ranks are now built from a structured key per variable — no string() on any symbolic expression:

canonical_sort_key(v) :: Tuple{Symbol, NTuple{N,Int}, NTuple{M,Int}}
#                              name     indices        opsig
  • namegetname of the (array) variable, a Symbol; isless(::Symbol, ::Symbol) is a non-allocating C-level compare,
  • indices — the integer indices when v is a scalarized array element (empty otherwise),
  • opsig — the operator chain wrapping the variable (Differential1, Shift2, steps, other single-argument Operator3),

built with the existing @match/split_indexed_var-style destructuring and compared as plain tuples (tuple isless handles the varying lengths). This is essentially the Vector{Union{Symbol, Int}} you suggested, in tuple form so comparison is allocation-free. Computed once per TearingState.

One clarification on "the lexicographic sorting of equations": the PR never sorts equations — only this once-per-TearingState variable rank is built, and it's consulted purely as the last tie-break in tear-variable selection. If the concern is using name order at all as the canonical order: names are the only equation-order-independent identity a variable has, so some name-derived total order seems unavoidable for this purpose — but happy to discuss alternatives.

TTFX

Measured on the MultibodyComponents.jl test env (Julia 1.12.6, same machine, full Pkg.precompile() cascade after switching, then 3 fresh sessions each; model = an 8-state suspension mechanism):

metric merge-base 3d10ed7 this PR 9bed425
Pkg.precompile() (19-pkg cascade) 299.5 s 300.4 s
using ModelingToolkit 4.39 / 4.48 / 4.39 s 4.46 / 4.52 / 4.41 s
using MultibodyComponents 2.04 / 2.19 / 2.14 s 2.24 / 2.11 / 2.15 s
first multibody(model) 82.1 / 82.9 / 84.1 s 83.1 / 83.8 / 82.9 s

All deltas are below run-to-run spread — no measurable TTFX impact. (The ENV-gated dump compiles to a single haskey(ENV, ...) branch per generate_system_equations! call; I can drop it from the PR if preferred, though it has been very useful for debugging order-sensitivity.)

@AayushSabharwal

Copy link
Copy Markdown
Member

Hmm, what about runtime after TTFX on a large-ish model? i.e. something with a few thousand variables?

@baggepinnen

Copy link
Copy Markdown
Contributor Author

Runtime-after-TTFX on a large-ish model, as requested: the quadrotor (MultibodyComponents.LQRControlledRotorCraft, 4-link cable + suspended load, LQR state feedback) — 25 states, ~4830 observed equations after simplification. Fresh Pkg.precompile() per config, then one session per config running the full pipeline twice (multibody/mtkcompile → ODEProblemsolve(FBDF()), tspan (0, 20)):

with this PR (main + #96 + #99) without this PR (main + #99)
structure 25 states / 4830 obs 25 states / 4828 obs
first mtkcompile 22.17 s 21.80 s
first ODEProblem 28.58 s 30.21 s
first solve 17.45 s 18.03 s
second mtkcompile 1.86 s 1.86 s
second ODEProblem 2.39 s 2.42 s
second solve 2.27 s 2.82 s
retcode / steps Success / 1755 Success / 1714

All timing deltas — first and warm — are within single-run noise; warm mtkcompile is identical to the hundredth of a second (the priority lookup + canonical ranks are built once per TearingState and only consulted as a last tie-break). The PR does change the tearing result slightly on this model, as intended (4830 vs 4828 observed, 1755 vs 1714 solver steps; both Success).

For completeness, the same measurement without #97 (merge-base + this PR) is also indistinguishable: first 21.49 / 28.45 / 17.56 s, second 1.87 / 2.49 / 2.48 s, identical structure and trajectory to the with-#97 + this-PR column.

@baggepinnen

Copy link
Copy Markdown
Contributor Author

CI failures addressed in ec4b57e: canonical_sort_key was not total — fullvars can contain compound expressions (e.g. the multi-argument clock operators over non-variable arguments in the nested-clock-operators test), for which getname throws matching non-exhaustive. There is now a canonical_name that is total over all BSImpl variants: named variables key off their name (Sym/call-variable/getindex), operator and function applications key off their operation's name (nameof, still no stringification), and the remaining structural variants key off fixed sentinel symbols. Key collisions are fine by construction — the canonical tie-break simply falls back to the original order among colliding entries.

ModelingToolkitTearing tests pass locally (including Clock inference handles nested clock operators correctly, 12/12); expecting the InterfaceI jobs to clear too since they died in the same TearingState path.

Comment on lines +479 to +499
opsig = ()
x = v
while true
stripped = @match x begin
BSImpl.Term(; f, args) && if f isa Differential end => begin
opsig = (opsig..., 1)
args[1]
end
BSImpl.Term(; f, args) && if f isa MTKBase.Shift end => begin
opsig = (opsig..., 2, f.steps)
args[1]
end
BSImpl.Term(; f, args) && if f isa SU.Operator && length(args) == 1 end => begin
opsig = (opsig..., 3)
args[1]
end
_ => nothing
end
stripped === nothing && break
x = stripped
end

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This opsig approach is really bad for type-inference - the type after the while true is completely uninferrable. Can this be a Vector{Int} instead?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in e496df0opsig and idxs are now built as Vector{Int} (via push!) instead of growing tuples, so the key is a concrete Tuple{Symbol, Vector{Int}, Vector{Int}} and inferrable. Vectors compare lexicographically, so the sort order is unchanged. (Fixed idxs too since it had the same growing-tuple pattern.)

baggepinnen and others added 6 commits June 22, 2026 20:21
- CarpanzanoTearing heuristics 1&2 consult structure.state_priorities as
  tie-breaks (high priority preferred as tear variable); behavior-identical
  on uniform priorities
- tear_graph_modia candidate ordering by priority (inert in default pipeline)
- MTKT_DUMP_TORN env-gated torn-set dump in generate_system_equations!

Context: SciML/ModelingToolkit.jl#4612 — insufficient alone (array-component
and connection-set-alias ties are metadata-indistinguishable); canonical
tie-breaks needed for order-determinism.
- SystemStructure gains canonical_ranks (lexicographic rank of variable
  names), built in TearingState alongside state_priorities
- CarpanzanoTearing heuristics break remaining ties by (priority DESC,
  canonical rank ASC), making tear-variable selection deterministic under
  semantically-neutral equation/declaration reordering (MTK#4612)
…y+stable sort!

- canonical_ranks now sort by (name::Symbol, indices::NTuple{Int}, opsig::NTuple{Int})
  built structurally from the variable (getindex indices; operator chain with
  Differential/Shift encoding) — no string() on symbolic expressions.
- tearEquations!: apply suggested copy + in-place stable sort!.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fullvars can contain compound expressions (e.g. multi-argument clock
operators over non-variable arguments), for which getname throws
'matching non-exhaustive'. Introduce canonical_name, total over all
BSImpl variants: named variables key off their name, operator/function
applications off their operation's name, remaining structural variants
off fixed sentinels. Key collisions are acceptable — the canonical
tie-break then falls back to the original order among colliding entries.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Per review: `opsig` (and `idxs`) were accumulated as `(opsig..., …)` inside
the loop, so their type — and hence the key type after the loop — is
uninferrable. Build them as `Vector{Int}` instead. Vectors compare
lexicographically, so the sort order is unchanged, and the returned key is
now a concrete `Tuple{Symbol, Vector{Int}, Vector{Int}}`.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AayushSabharwal AayushSabharwal force-pushed the priority-tearing-prototype branch from e496df0 to ec3cc20 Compare June 23, 2026 11:29
@AayushSabharwal AayushSabharwal marked this pull request as ready for review June 23, 2026 11:32
@AayushSabharwal AayushSabharwal merged commit 4ddf8f0 into main Jun 24, 2026
11 of 23 checks passed
@AayushSabharwal AayushSabharwal deleted the priority-tearing-prototype branch June 24, 2026 08:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants