Skip to content

Fix inline-linear-SCC topsort cycle when chaining Euler orientation states (MTK#4608)#113

Closed
baggepinnen wants to merge 1 commit into
fbc/inline-linear-scc-buried-b-repairfrom
fbc/fix-4608-inline-scc-residual-incidence
Closed

Fix inline-linear-SCC topsort cycle when chaining Euler orientation states (MTK#4608)#113
baggepinnen wants to merge 1 commit into
fbc/inline-linear-scc-buried-b-repairfrom
fbc/fix-4608-inline-scc-residual-incidence

Conversation

@baggepinnen

Copy link
Copy Markdown
Contributor

Summary

Fixes the mtkcompile/structural_simplify failure ArgumentError: The equations have at least one cycle. (from topsort_equations) when chaining two or more rotation-matrix (Euler) orientation states — SciML/ModelingToolkit.jl#4608.

Stacks on #106, which adds the opt-in _assert_b_free_of_scc_vars invariant. This PR makes that invariant hold by construction (and thus actually fixes #4608), rather than only checking it under MTKTEARING_CHECK_REDUCTION.

Root cause

get_linear_scc_linsol builds the inline A \ b block by extracting each SCC variable's coefficient out of every (total_sub-substituted) residual, gated on the structural incidence graph via has_edge(graph, eq, var). That graph is built before total_sub substitution and is mutated during reassembly, so it can desync from the residual: total_sub can reintroduce an SCC variable into a residual whose structural edge was already torn. The has_edge gate then skips expanding that variable into the coefficient matrix A, leaving it buried in b. __reduce_linear_system! — which only inspects coefficient rows, never b — subsequently smuggles it onto the RHS, producing a self-referential A \ b (the solution vector appears inside its own RHS). The resulting observed equations form a cycle that topsort_equations rejects.

Confirmed on the #4608 MWE: the buried variables are exactly the second joint's Euler rates j2.phid[1:3], all with has_edge == false. The world-rooted first joint stays in sync (constant R, zero angular velocity), which is why a single Euler joint compiles but two chained do not.

Fix

Gate extraction on the actual residual incidence (get_variables on the substituted residual) instead of the stale structural graph. Every SCC variable genuinely present in a residual is expanded into A, so b is free of SCC variables by construction. Blocks whose structural graph is already in sync get identical extraction and identical output — no behavior change on healthy models.

Verification (registry MTK 11.26.8)

A/B over models that expose the cycle and controls that do not:

Model before after
1 Euler Spherical (control) compiles (6) compiles (6)
2 chained Euler Spherical cycle compiles (15), 0 self-ref obs
3 chained Euler Spherical cycle compiles (18), 0 self-ref obs
2 chained Quaternion (control) compiles (17) compiles (17)
Cable, 3 Euler segments cycle compiles (24), 0 self-ref obs
Cable, 5 Euler segments cycle compiles (36), 0 self-ref obs
Cable, 3 Quaternion (control) compiles (28) compiles (28)

All controls keep identical unknown counts → no regression. With MTKTEARING_CHECK_REDUCTION=1 the #106 self-check confirms the previously self-referential j2.phid block is now consistent (relres ≈ 4e-17).

Scope

Draft. This fixes the structural cycle only. Runtime solving of these models hits a separate, non-rank-tolerant inline-\ (LU) issue at degenerate configurations — independent of the cycle and out of scope here.

🤖 Generated with Claude Code

…ce (#98)

Resolves the `topsort_equations` "equations have at least one cycle" failure
when chaining >=2 rotation-matrix (Euler) orientation states
(SciML/ModelingToolkit.jl#4608).

`get_linear_scc_linsol` gated coefficient extraction on the structural
incidence `graph` (`has_edge`). That graph is built before `total_sub`
substitution and is mutated during reassembly, so it can desync from the
substituted residual: when `total_sub` reintroduces an SCC variable (e.g. a
chained joint inherits the upstream angular velocity, pulling the Euler-angle
rate into a residual whose structural edge was already torn), the gate skips
expanding that variable into the coefficient matrix `A` and leaves it buried
in `b`. `__reduce_linear_system!` then smuggles it onto the RHS, producing a
self-referential `A \ b` whose observed equations cannot be topologically
ordered.

Gate extraction on the actual residual incidence instead of the stale graph,
so every present SCC variable is expanded into `A` and `b` is free of SCC
variables by construction -- making the opt-in `_assert_b_free_of_scc_vars`
invariant (added in #106) hold rather than merely checking it.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@AayushSabharwal

Copy link
Copy Markdown
Member

Superseded by #114

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants