Zero Half(Odd-Cycle) cuts#1428
Conversation
Brings in upstream's unified OpenMP threading model (PR NVIDIA#1099) and other fixes (NVIDIA#1206 concurrent LP exception cleanup, NVIDIA#1214/NVIDIA#1216 destruction/ capture fixes) while preserving local work on the cut and clique stack. Conflict resolution highlights: - Drop std::future/std::async clique flow; adopt upstream's omp task + omp_atomic_t<bool> signal_extend pattern. - Drop modify_problem parameter from find_initial_cliques (we already removed the code that consumed it); adapt the omp-task call site in branch_and_bound::solve accordingly. - Take upstream's [this, &population] capture for the root-LP CPUFJ improvement callback; the new omp taskwait-before-destruction guarantee makes the prior context-lifetime fix unnecessary. - Take upstream's do_cut_pass refactor of the per-pass LP resolve loop; move our per-pass root_lp_with_cuts publish into do_cut_pass so the benchmark metric is still updated on early exits. - Keep our out-of-line omp_mutex_t definitions in omp_helpers.cpp; the enhanced omp_atomic_t with std::memory_order is taken from upstream.
|
/ok to test 3edceae |
|
/nvskills-ci |
aliceb-nv
left a comment
There was a problem hiding this comment.
Thanks Akif! Let's get Chris' eyes on this as well
One thought - maybe we could start thinking about splitting cuts.cpp into a file per cut family or something. It is becoming quite heavy :)
| if (clique_table_ == nullptr) { | ||
| if (signal_extend_) { signal_extend_->store(true, std::memory_order_release); } | ||
| #pragma omp taskwait depend(in : *signal_extend_) | ||
| } |
There was a problem hiding this comment.
Isn't there a risk of race here? In "find_initial_cliques", clique_table_out is set to non-null before extend_cliques runs, so you could encounter a scenario where prepare_fractional_sub_cg reads clique_table_, sees it is non-null, and races on the clique table while it is being updated by extend_cliques, right?
There was a problem hiding this comment.
You are right. I think in practice it never happened because the other cut generation functions were taking some time and most of the time clique table was non-null. Now it depends on signal_extend being null. But there was a convoluted logic of local, shared out clique_table and that also complicated things, it was because we were using clique generation in presolve before and that included ownership transfer logic. Now that logic is also simplified. Thanks for catching it Alice!
|
/ok to test 0ea3102 |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cpp/src/grpc/codegen/field_registry.yaml`:
- Around line 685-688: The gRPC path currently maps zero_half_cuts directly into
settings.zero_half_cuts without enforcing the local CUOPT_MIP_ZERO_HALF_CUTS
bounds. Add request-side validation in the generated protobuf-to-settings flow
(field_registry.yaml / generated_proto_to_mip_settings.inc path) so
zero_half_cuts is rejected unless it is within [-1, 1], or clamp it before
assigning to the MIP settings. Make sure the fix is applied where the field is
translated into the solver settings, not only in solver_settings.cu.
In `@cpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cu`:
- Around line 727-739: The publication order in clique_table.cu is unsafe
because clique_table_out is assigned in the clique_table extension path before
extend_cliques and fill_var_clique_maps finish mutating the same clique_table
object. Fix this by delaying the clique_table_out assignment until after all
mutations are complete, or by publishing a separate immutable snapshot from the
base table and extending a different object; use the clique_table_out
assignment, extend_cliques, and fill_var_clique_maps symbols to locate the
affected flow.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 93cd4392-45ad-481c-98b7-554bb5f60951
⛔ Files ignored due to path filters (3)
cpp/src/grpc/codegen/generated/cuopt_remote_data.protois excluded by!**/generated/**cpp/src/grpc/codegen/generated/generated_mip_settings_to_proto.incis excluded by!**/generated/**cpp/src/grpc/codegen/generated/generated_proto_to_mip_settings.incis excluded by!**/generated/**
📒 Files selected for processing (11)
cpp/include/cuopt/linear_programming/constants.hcpp/src/branch_and_bound/branch_and_bound.cppcpp/src/cuts/cuts.cppcpp/src/grpc/codegen/field_registry.yamlcpp/src/math_optimization/solver_settings.cucpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cucpp/src/mip_heuristics/presolve/conflict_graph/clique_table.cuhdocs/cuopt/source/cuopt-c/mip/mip-c-api.rstdocs/cuopt/source/mip-settings.rstskills/cuopt-developer/references/contributing.mdskills/cuopt-developer/references/conventions.md
✅ Files skipped from review due to trivial changes (5)
- cpp/include/cuopt/linear_programming/constants.h
- skills/cuopt-developer/references/conventions.md
- docs/cuopt/source/cuopt-c/mip/mip-c-api.rst
- docs/cuopt/source/mip-settings.rst
- skills/cuopt-developer/references/contributing.md
🚧 Files skipped from review as they are similar to previous changes (2)
- cpp/src/branch_and_bound/branch_and_bound.cpp
- cpp/src/cuts/cuts.cpp
|
/ok to test 758e045 |
| // aborts/terminates right after. Each channel below enables it through its own | ||
| // DEBUG_* flag and supplies its own prefix; when the flag is 0 the call expands | ||
| // to a no-op that still consumes its arguments. | ||
| #define CUTS_DEBUG_LOG(prefix, ...) \ |
There was a problem hiding this comment.
Nit: We have a settings.log.debug already. Could this be used instead of this macro?
There was a problem hiding this comment.
It seems strange to have cut specific debugging/logging.
There was a problem hiding this comment.
The setting.log.debug prints trace logs underneath because it was too verbose. I wanted to have some log in between which is CUTS_DEBUG_LOG. I can the trace logging to setting.log.trace so we can have finer control on the logs on another PR. I can convert this to settings.log.debug for now.
| f_t* work_estimate, | ||
| f_t max_work_estimate) | ||
| { | ||
| for (const auto candidate : candidates) { |
There was a problem hiding this comment.
Nit: prefer an actual type here instead of auto. Using a type makes the code self-documenting.
| if (toc(start_time) >= time_limit) { return; } | ||
| bool add = true; | ||
| i_t checks = 0; | ||
| for (const auto v : selected) { |
There was a problem hiding this comment.
Same here. Prefer actual type to auto
| scratch.ensure_size(static_cast<std::size_t>(total_idx)); | ||
| ++scratch.gen; | ||
| const std::uint64_t gen = scratch.gen; | ||
| auto& dist = scratch.dist; |
| ZERO_HALF_DEBUG("dijkstra_odd_cycle work_limit hit pops=%lld", static_cast<long long>(pops)); | ||
| return false; | ||
| } | ||
| for (const auto v_local : neigh) { |
|
|
||
| std::unordered_set<i_t> seen_local; | ||
| seen_local.reserve(local_seq.size()); | ||
| for (const auto lv : local_seq) { |
| cycle_vertices.reserve(local_seq.size()); | ||
| std::unordered_set<i_t> seen_var; | ||
| seen_var.reserve(local_seq.size()); | ||
| for (const auto lv : local_seq) { |
| std::unordered_set<i_t> cycle_members(cycle_vertices.begin(), cycle_vertices.end()); | ||
| std::vector<i_t> candidates; | ||
| candidates.reserve(adj_set.size()); | ||
| for (const auto candidate : adj_set) { |
| } | ||
|
|
||
| template <typename i_t, typename f_t> | ||
| void cut_generation_t<i_t, f_t>::prepare_fractional_sub_cg( |
There was a problem hiding this comment.
Nit: The code would be easier to read if the CG acronym was written out in full. Does CG stand for cut graph?
There was a problem hiding this comment.
The acronym is conflict_graph. I can extend it.
| total_adj_entries += adj_set.size(); | ||
| auto& adj = sub_cg_.adj_local[idx]; | ||
| adj.reserve(adj_set.size()); | ||
| for (const auto neighbor : adj_set) { |
| { | ||
| std::unordered_set<i_t> adj_global; | ||
| adj_global.reserve(adj.size()); | ||
| for (const auto neighbor : adj) { |
| } | ||
| } | ||
|
|
||
| // Build the fractional conflict-graph subgraph once (resolving the async |
There was a problem hiding this comment.
Ah it stands for conflict graph. My ignorance/confusion is a reason to write it out in full.
| added_per_var++; | ||
| // mark all CG vertices that participated so we do not re-derive the same | ||
| // cycle from a different source vertex | ||
| for (const auto v : cycle_vertices) { |
There was a problem hiding this comment.
Prefer explicit type here instead of auto
chris-maes
left a comment
There was a problem hiding this comment.
Thanks for adding these cuts in @akifcorduk
LGTM. Very minor style comments. I did not review the math.
|
/ok to test b90197c |
This branch adds zero-half (odd-cycle / odd-wheel) cuts to the set of cuts. The cut generator runs on the same fractional conflict-graph subgraph that clique cuts use: once per cut pass, prepare_fractional_sub_cg() builds the subgraph then Dijkstra finds violated odd cycles on the auxillarry conflict graph (edge weights summing below 0.5), optionally extends them to odd-wheels by adding fully conflicting variables(similar logic to clique extension). Clique cut was refactored to share that subgraph and the same greedy extension helpers, and implied-bound cuts were moved earlier in the pass so the background clique-table thread has more time to finish before we join it.
A-> this PR
B-> main
Gap Closed Metrics:
Top 10 A>B (A-B), gap_closed_pct
Top 10 B>A (B-A), gap_closed_pct
Average and shifted geomean gap closed
A avg=29.648 sgm=8.1513 (shift=1)
B avg=28.924 sgm=7.7679 (shift=1)
Benchmark Results
A Batch 1
total_optimal: 79, avg_mip_gap: 0.3173, geomean_mip_gap: 0.1947, n_low_error: 125
A Batch 2
total_optimal: 77, avg_mip_gap: 0.3175, geomean_mip_gap: 0.1972, n_low_error: 124
B Batch 1
total_optimal: 75, avg_mip_gap: 0.3419, geomean_mip_gap: 0.2062, n_low_error: 119
B Batch2
total_optimal: 75, avg_mip_gap: 0.3126, geomean_mip_gap: 0.1930, n_low_error: 118
On average +3 optimal solutions, -0.4% geomean MIP gap reductions. Overall 30 instances have zero-half cut in the benchmark set.