feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718
feat(persistent): in-place solver updates (Solver framework + HiGHS/Gurobi/Xpress/Mosek)#718FabianHofmann wants to merge 22 commits into
Conversation
Tracks per-Constraint coefficient mutation via a single boolean slot, flipped in coeffs/vars/lhs setters. Pure-constant rhs writes now short-circuit and leave coeffs/vars buffers untouched (by identity), so rhs-only updates don't trigger expensive coefficient recompare on the persistent-solver fast path.
Pure-Python snapshot primitives for the persistent-solver Phase 1. Deep-copies value-side fields (var_lb/ub, con_rhs/sign, obj_linear), holds vlabels/clabels by reference, stores canonical CSR (indptr, indices) per constraint container. No Solver import.
Pure-function diff for the persistent-solver Phase 1. Detects structural, coord, sparsity, quadratic-objective, value-only var/con, and objective-linear/sense changes. Supports same_model fast path via _coef_dirty and cross-model full re-scan. Includes a focused test suite covering capture, mutation paths, deep-copy invariant, and the same_model toggle.
- supports_persistent_update class flag (default False) - snapshot/_rebuilds/_in_place_updates/_last_rebuild_reason fields - snapshot capture at end of direct _build, _clear_coef_dirty helper - apply_update stub raising UnsupportedUpdate - solve(model, assign) dispatcher with diff-or-rebuild path - update(model, apply=True) primitive returning ModelDiff - threading.Lock around diff+apply+resnapshot - __getstate__/__setstate__ drop native handle and snapshot
…date support Skip diff computation entirely when supports_persistent_update is False on apply, per plan: 'dispatcher checks flag before calling — if False, skips diffing entirely and goes to rebuild.'
Replace xarray-based snapshot and CSR pattern compare with per-row canonicalised numpy buffers; new ContainerVarUpdate / ContainerRowUpdate payloads. Gurobi/HiGHS apply_update rewritten around batched setAttr / changeColsBounds / changeColsCost / changeColsIntegrality; coefficient writes touch only changed cells. Cross-model diff now ~matches same-model cost for bound/rhs/coef-value sweeps.
compute_diff/Solver.solve/Solver.update grow an ignore_dims kwarg. None (default) keeps the current no-coord-check behaviour; any iterable opts into per-container coord-equality on every dim not in the set, supporting rolling-horizon workflows where e.g. the snapshot dim is expected to drift.
…_rebuild - Solver.from_name now accepts model=None; the first solve(m, ...) builds. - compute_diff folded into ModelDiff.from_snapshot classmethod; new ModelDiff.from_models diffs two linopy models directly. - Solver.solve grows disallow_rebuild=True, which raises RebuildRequiredError instead of falling back to a rebuild.
…m_models - Add `track_updates` flag (default False) to Solver; skip ModelSnapshot capture when disabled. Raise UpdatesDisabledError on solve(model)/update() if a built solver was constructed without tracking. - Rewrite ModelDiff.from_models to build directly from two models without capturing snapshots; share helpers with from_snapshot. - Update persistent tests to opt into track_updates=True; add coverage for the disabled path.
Cross-instance resolves now diff via from_models against the previously built model, with no snapshot. Same-instance mutation still raises UpdatesDisabledError. Snapshot recapture is skipped in this mode. Add cross-instance solve/update tests for the no-snapshot path.
Collapse _diff_objective QUAD_OBJ branches; cache n_coef_updates; short-circuit _canonicalize_rows when rows already sorted; tighten buffer extraction. Introduce VarKind enum used across snapshot/diff and HiGHS/Gurobi apply_update; reuse linopy.constants sign tokens. Move _clear_coef_dirty into ModelSnapshot.capture.
Source con buffers from Constraint.to_matrix_with_rhs, replacing the dense (n_rows, max_n_term) arrays with CSR (indptr, indices, data). Sign dtype adopts 'U1' across the persistent layer and apply_update in HiGHS/Gurobi consumes CSR-slice payloads instead of -1 masks. Deletes _canonicalize_rows and the _INT64_MAX sentinel.
Replace per-container ContainerVarUpdate/ContainerRowUpdate dicts with flat arrays (var_bounds_*, var_type_*, con_coef_* COO, con_rhs_*, con_sign_*) plus VarSlice/ConSlice per-container offsets for diagnostics. Add con_rhs_as_bounds() for ranged-row solvers. Backend apply_update bodies collapse to flat-array calls; remove duplicated label->position resolution.
Implement in-place model updates for Xpress (chgbounds/chgrhs/chgmcoef/ chgrowtype/chgobj/chgobjsense/chgcoltype) and Mosek (chgvarbound/ chgconbound/putaijlist/putclist/putvartypelist/putobjsense). Mosek rejects constraint sign change to trigger rebuild. Consolidate gurobi/highs apply_update tests into a single parametrized file that also covers xpress and mosek.
for more information, see https://pre-commit.ci
|
@FBumann here we go, if you want to take a first look. docs to come |
Ill have a look ltoday |
* hold solver lock through _run_direct so two threads calling solve(model) on the same Solver no longer race on the native handle (HiGHS returned 0.0 from the second concurrent solve). * narrow Optional ndarrays in persistent.diff.push_var / push_con and in HiGHS/Gurobi/Xpress/Mosek apply_update objective paths. * widen Constraint.rhs setter to ExpressionLike | VariableLike | ConstantLike to match the as_expression call in the body. * widen Constraints.__getitem__(str) return type to Constraint (the dominant case) so tests can set .rhs/.coeffs/.sign without ignores. * add docs for in-place solver updates.
|
@FabianHofmann Sorry, I wont be able to review this today. |
take your time, there is no hurry. I'll do some integration tests anyway |
|
Are there any conflicts with #717? Are we sure we want to merge and publish this before we go v1? Just making sure... |
|
Benched PR #718 ModelDiff vs hand-rolled
Diff compute + Manually calling Suggestion: expose an opt-in on the persistent API so users with presolve-heavy LPs can drop the warm basis without bypassing the encapsulation. Either:
Zero structural rebuilds observed across all 7 re-solves in both ModelDiff routes ( Caveats:
|
|
Production-scale follow-up: PSA-MEE Solve-phase comparison (B vs SQ vs B+clear) at production scale couldn't be captured this round — the reduced-pipeline harness wasn't LP-feasible past chunk 1, so HiGHS timings came from an infeasibility path and would mislead. Will retry once the harness uses the full PSA-MEE chunk loop with storage + N-1 enabled. The presolve-skip mechanism that drove the synthetic regression (HiGHS logging Bottom line for the PR scope: the |
|
@MaykThewessen @FabianHofmann We might be able to add such later. But I'd definitely defer it from this first PR. We should put a note into the docs about users maybe profiting from calling such methods on the solver directly. One downside: We propose suage of |
|
I read through the code and overall api. I would propose the following api guide lines: 1. Mutation API:
|
| call | reads model | mutates handle | runs solver |
|---|---|---|---|
solver.diff(model) |
✓ | ||
solver.apply(model) |
✓ | ✓ | |
solver.solve(model=...) |
✓ | ✓ | ✓ |
update(model, apply=True|False)does two structurally different things behind one bool; defaultapply=Truesilently mutates.- Name collides with
gurobipy.Model.update()(lazy-flush attribute changes) — different semantic. - Split makes the basis-clear flow legible (
applyseparately fromsolve).
Good news: No conflicts with the new v1 semantics convention
|
@FBumann Agreed: drop Native handle via
A uniform My benchmark comment's "B+clear" route was exploratory, not a request for the wrapper. Happy to leave basis-management outside this PR and revisit only if a clear cross-solver use case emerges. Suggested docs note for
LGTM to defer the unified API: additive later, non-breaking. |
|
@FabianHofmann first round of my Review: ArchitectureA1. Through any supported mutation path — the existing setters at Two coherent positions:
Recommendation: B is the safer default. Cross-model paths already use False; in the same-model path the perf cost is small. Solver-aware callers who promise to use A2. Read-only A3. A4. Per-backend A5. Lock is too coarse. A6. Snapshot is the wrong unit of caching. DRYD1. D2. D3. Xpress D4. "Set binary integrality and then re-clamp bounds to [0,1]" is hand-rolled in HiGHS (1433-1442), Xpress (2485-2492), MOSEK (3221-3225). Lift to a Solver utility — the rule is solver-independent (it's linopy semantics, not a backend quirk). D5. |
Test coverage gaps —
|
| Reason | Covered? |
|---|---|
NONE |
✅ |
STRUCTURAL_CONTAINERS |
in {LABELS, CONTAINERS} sets |
STRUCTURAL_LABELS |
diff.py:557,606) unreachable from public API |
COORD_REINDEX |
✅ |
SPARSITY |
✅ |
QUAD_OBJ (diff.py:702) |
❌ no tests |
BACKEND_REJECTED |
✅ |
Gaps worth fixing:
QUAD_OBJhas zero tests — both directions (linear→quad, quad→linear) are reachable. Without coverage, QP users risk silent miscompare.- Loose
in {…}assertions intest_sparsity_change_triggers_rebuild,test_cross_model_sparsity_change_rebuilds,test_cross_model_structural_mismatch_rebuilds— tighten toisso future refactors of check ordering get caught. disallow_rebuild=Trueonly tested for one reason (STRUCTURAL_CONTAINERS). The contract is "every rebuild trigger raises" — add tests for SPARSITY, COORD_REINDEX, QUAD_OBJ, BACKEND_REJECTED.from_modelsnegative paths uncovered — only the happy path is tested. Mirror thefrom_snapshotcases (structural/sparsity/quad/coord) since the buffer-extraction path is independent.- Dead code or missing test at
diff.py:557and:606— the per-containerSTRUCTURAL_LABELSreturns can't fire after the global label check at:372-377short-circuits. Either delete (replace withassert) or write the test that justifies them.
|
@FBumann Strong review. Adding bench-grounded color on A1 / A5 / A6 since my numbers are the ones being cited. A1:
|
|
Correction to my A1 framing: I overshot. Microbench of the actual compare path (
So at the realistic production size (6 nnz/row), the extra walk is ~35 ms, not the 50-200 ms my last comment implied as the floor. That's ~22 % of a 160 ms apply phase, or ~0.3 % of a 12 s solve. Felix's original "small relative to solve time" framing was right; my "30-125 % of apply" was the worst-case interpretation, not the typical one. Doesn't change the recommendation (still Option B: default Bench script available on request, or I can drop it in |
Closes the A1 residual from the #718 review. The flag-trust path (`skip_coef_compare = same_model and not coef_dirty`) is correct through Constraint.update() (set in one place, shims forward), but `c.coeffs.values[...] = ...` still bypasses _coef_dirty. With same_model=True as the default, that bypass silently produces wrong diffs. Flip the default to False. Cross-model paths (the only production caller, Solver._update_locked, passes explicitly) are unaffected. Same-model warm-update paths now value-diff the CSR data — small perf hit (50-200ms at Mayk-scale per Mayk's bench), correct by default. Solver-aware callers who own the mutation contract can opt back into the optimization with `same_model=True`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Have not reviewed the code yet, just read the comments during some spare time, but my position is always to try to have perf by default:
So, i'd argue for the footgun A, but put the security mechanism on by flagging the numpy arrays as read only: a = np.array([1, 2, 3])
a.flags.writeable = False
a[0] = 99 # ValueError: assignment destination is read-only |
|
@coroa I wrote some comments about the user api here: #727 In short: We could skip the user parameter entirely by storing a Together with your suggestion about the writable flag this would be very safe, convenient and uses the performant branch whenever possible. |
|
@coroa I asked Claude code** about potential gaps of the writeable flag:
Feels 80% safe, but might produce rough edges in our internals and many places where we have to reassign this flag. |
Changes proposed in this Pull Request
Adds an opt-in persistent-update framework so a built solver can be re-solved against a mutated
Modelwithout a full rebuild.Core
linopy.persistent:ModelSnapshot,ModelDiff,StructuralKey,VarKind,RebuildReason.ModelDiffstores changes in flat-native arrays (bounds / var-types / RHS / signs / COO coefs / linear objective / sense) plus per-containerVarSlice/ConSliceviews.ModelDiff.from_snapshot(snap, model)andModelDiff.from_models(m1, m2)— snapshot-based and snapshot-free diffs._coef_dirtyflag on constraints with RHS-setter short-circuit so RHS-only edits skip the coefficient re-walk.Solver orchestration
Solvergainstrack_updates, lazy-build (firstsolve(model, …)builds),apply_update,update,disallow_rebuild, and structured rebuild reasons. Backends without persistent-update support short-circuit to rebuild.apply_update:changeColsBounds/changeColsIntegrality/changeRowBounds/changeCoeff/changeColsCost/changeObjectiveSense. Sign change → rebuild.setAttr(LB/UB/VType/RHS/Sense/Obj),chgCoeff,ModelSense. In-place sign change.chgbounds/chgcoltype/chgrhs/chgrowtype/chgmcoef/chgobj/chgobjsense. In-place sign change.chgvarbound/chgconbound/putvartypelist/putaijlist/putclist/putobjsense. Sign change → rebuild.Tests
test_persistent_snapshot_diff.pycovering allModelDiffsemantics.test_persistent_apply_update.pyrunning 9 cases × 4 backends (skipped per backend when license/installation is unavailable).test_persistent_solver_extras.py.Checklist
doc.doc/release_notes.rstof the upcoming release is included.