perf: direct CSR-to-LP writer for frozen constraints#631
Conversation
Override Constraint.to_polars() to expand CSR data directly into a polars DataFrame, bypassing the expensive mutable() → xarray Dataset reconstruction. Also override iterate_slices() to yield CSR row-batches instead of relying on xarray's isel(). Move eliminate_zeros() to freeze time (from_mutable) so the cleanup happens once rather than on every to_polars() call. LP write is now 20-40% faster than master across all benchmark models. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
thank you |
|
this is already nice! I made some LP writing experiment in https://github.com/FabianHofmann/crs2lp which showed quite some improvements in performance. It directly writes the LP file from the CRS objects using custom rust module. That said, I am happy to pull this one in to keep changes small but in the long term it is probably worth to move the implementation in the repo. |
|
@coroa this is great, one big question that came up when looking at the code is: why renaming |
this should be in the other pr, moving it there (sorry) |
…-rewrite+lp-writer
|
@FabianHofmann This was a pretty much vibe coded Proof of concept, so i dont really know to be honest. But as this change is quite small i can have a look and get back to you in a few hours. |
When _sign is a numpy array (per-row signs from from_rule with mixed <=/>=/= constraints), expand it per-nonzero via _sign[rows] instead of using pl.lit() which only works for scalar strings. Also slice _sign in iterate_slices when it's an array. Add test for frozen mixed-sign constraint LP output equivalence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
I think its ready to go |
5e26346
into
PyPSA:perf/matrix-accessor-rewrite
* perf: add to_matrix_via_csr * perf: improve per-constraint csr matrix construction * Add conversion functions * feat: add ability to freeze constraints into csr * Add io.to_netcdf support for frozen Constraint * fix: re-implement matrices * Move sum_duplicates * feat: VariableLabelIndex * fix: until solve * fix: disentangle range and ncons * fix: don't freeze if model is chunked * fix typing errors * fix: bring back forward-refs * fix issues in tests * fix: add doc strings to VariableLabelIndex * test: relax dtype assertions for Windows np.int32 compatibility Use np.issubdtype() instead of exact dtype comparisons to allow np.int32/np.float32 that Windows may produce. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix: review fixes for #630 (matrix accessor rewrite) (#632) * fix: review fixes for PR #630 (matrix accessor rewrite) - Fix __repr__ passing CSR positions instead of variable labels - Fix set_blocks failing on frozen Constraint - Extract _active_to_dataarray helper to reduce DRY violations - Simplify reset_dual to direct mutation instead of reconstruction - Add tests for freeze/mutable roundtrip, VariableLabelIndex, to_matrix_with_rhs, from_mutable mixed signs, repr correctness * feat: support mixed per-element signs in frozen Constraint, add rhs/lhs setters - Store _sign as str (uniform, fast) or np.ndarray (mixed, per-row) - Add rhs/lhs setters on Constraint via _refreeze_after pattern - Invalidate _dual on mutation; update netcdf serialization for array signs - Add tests for setters, mixed-sign freeze/roundtrip/sanitize/repr/netcdf * feat: mixed per-element signs in frozen Constraint, rhs/lhs setters, DRY helpers * rename Constraint to CSRConstraint * rename MutableConstraint to Constraint (original name) * fix: xpress crash with zero constraints and remove_variables not removing variable without referencing constraints * Fix MultiIndex deprecation warning in CSRConstraint by using assign_multiindex_safe * Add freeze_constraints option, default freeze to None (resolves from global setting) * bench: add pypsa carbon_management benchmark for direct solver path * Add set_names parameter to skip solver name-setting in direct IO, use polars for 3x faster name generation * perf: use polars to list logic in print_variables and print_constraints * Use non-deprecated formatting APIs in tests * Tighten direct-solver naming tests and repr formatting * fix mypy * Fix mypy failures in constraint and IO tests * Move freeze and direct IO naming settings to Model * perf: speed up mutable direct solver export * docs: add CSRConstraint documentation to release notes, API reference, and constraints notebook * perf: direct CSR-to-LP writer for frozen constraints (#631) * perf: direct CSR-to-LP writer for frozen constraints Override Constraint.to_polars() to expand CSR data directly into a polars DataFrame, bypassing the expensive mutable() → xarray Dataset reconstruction. Also override iterate_slices() to yield CSR row-batches instead of relying on xarray's isel(). Move eliminate_zeros() to freeze time (from_mutable) so the cleanup happens once rather than on every to_polars() call. LP write is now 20-40% faster than master across all benchmark models. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix: handle mixed per-row signs in CSR-to-LP writer When _sign is a numpy array (per-row signs from from_rule with mixed <=/>=/= constraints), expand it per-nonzero via _sign[rows] instead of using pl.lit() which only works for scalar strings. Also slice _sign in iterate_slices when it's an array. Add test for frozen mixed-sign constraint LP output equivalence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> * merge: move bench_pypsa_carbon_management into benchmarks/ suite, fix MatrixAccessor compat * fix: align CSRConstraint.iterate_slices return type with base class for mypy * fix: make assert_linequal compare semantic equality of expressions Sort both sides by variable labels along _term before comparing, so expressions with different term orderings (e.g. from CSR round-trip with freeze_constraints=True) are correctly recognized as equal. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix types from merge conflict * docs: note CSRConstraint API differences from Constraint * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * refac(CSRConstraint): simplify iterate_slices with _equal_nnz_slices helper Replace the convoluted cumsum/diff/range loop with a clean while-loop helper that uses searchsorted directly on indptr. Batch slices pass coords=[] since batches cover contiguous active rows, not a contiguous slice of the coordinate grid. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * deprecate(Constraint): add DeprecationWarning to .flat property Use to_polars() instead. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * refac(ConstraintBase): make ncons/lhs/to_matrix abstract; move to Constraint _matrix_export_data becomes a method on Constraint instead of a module-level function. ncons, lhs, and to_matrix are now abstract in ConstraintBase, with xarray-based implementations on Constraint and CSR-based implementations on CSRConstraint. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com> * fix(Model): add __weakref__ slot and drop stale matrices.clean_cached_properties call * refac(CSRConstraint): direct rhs setter, read-only lhs setter; drop xarray round-trip - rhs setter writes _rhs directly, rejects expressions - lhs setter raises AttributeError (call .mutable() to modify terms) - lhs getter skips mutable() wrapper, builds LinearExpression from _to_dataset - to_polars uses pl.lit for scalar sign * fix(types): drop duplicate MaskLike/PathLike; annotate sign_expr for mypy * refac(CSRConstraint): make rhs setter read-only; call .mutable() to modify --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com> Co-authored-by: Fabian Hofmann <fab.hof@gmx.de> Co-authored-by: Felix <117816358+FBumann@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Summary
Constraint.to_polars()to expand CSR data directly into a polars DataFrame, bypassing the expensivemutable()→ xarray Dataset reconstructionConstraint.iterate_slices()to yield CSR row-batches instead of relying on xarray'sisel()(which doesn't exist on frozen constraints — this was crashing)eliminate_zeros()to freeze time (from_mutable) so cleanup happens onceBenchmark results (vs master)
Test plan
pytest test/test_io.py— all 25 tests passpytest test/test_constraint.py— all 72 tests passtest_to_file_lp_frozen_vs_mutableverifies byte-identical LP output between frozen and mutable pathsbenchmarks/test_lp_write.py— all benchmarks pass including previously crashingpypsa_scigrid🤖 Generated with Claude Code