Skip to content

Run CPUFJ bursts at the root#1179

Merged
rapids-bot[bot] merged 23 commits into
NVIDIA:mainfrom
aliceb-nv:cpufj_cut_bursts
May 13, 2026
Merged

Run CPUFJ bursts at the root#1179
rapids-bot[bot] merged 23 commits into
NVIDIA:mainfrom
aliceb-nv:cpufj_cut_bursts

Conversation

@aliceb-nv
Copy link
Copy Markdown
Contributor

Some primal heuristics, like CPUFJ, may benefit in some cases from operating on the problem with root cuts.
This PR adds support for this, by running bursts of a few iterations of CPUFJ per cut pass opportunistically, on threads that would otherwise remain idle; and also once after the root cut passes are completed.

In most cases, this provides no benefits, but some instances like tutaki find their first integer incumbent much faster. Additionally, awhea is now reliably feasibilized and solved to a gap of <5% on 10min runs.

Description

Issue

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

@aliceb-nv aliceb-nv added this to the 26.06 milestone May 5, 2026
@aliceb-nv aliceb-nv added the non-breaking Introduces a non-breaking change label May 5, 2026
@aliceb-nv aliceb-nv requested a review from a team as a code owner May 5, 2026 09:01
@aliceb-nv aliceb-nv added the improvement Improves an existing functionality label May 5, 2026
@aliceb-nv aliceb-nv requested review from akifcorduk and hlinsen May 5, 2026 09:01
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 5, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@aliceb-nv
Copy link
Copy Markdown
Contributor Author

/ok to test 64ba333

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 5, 2026

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Integrates CPU feasibility-jump (CPUFJ) as a parallel heuristic during branch-and-bound root cut passes: adds task types and APIs, refactors host-LP initialization, extends cpufj_solve with work-unit limits, and orchestrates concurrent CPUFJ execution and final-task runs around the cut-pass loop.

Changes

CPUFJ Root-Cut Heuristic Integration

Layer / File(s) Summary
Header includes & declarations
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
Adds cpu_fj_thread.cuh include and extends cpufj_solve declaration with work_unit_limit defaulting to infinity.
Core includes & forward decls
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Adds dual-simplex/presolve headers, std includes, and forward-declares finalize_fj_cpu_host_initialization.
Host data view & finalization
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Implements set_host_data_view and finalize_fj_cpu_host_initialization, and updates init_fj_cpu to delegate finalization.
Init from host LP (CSR conversion)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Adds init_fj_cpu_from_host_lp to convert a dual-simplex host LP to an initialized fj_cpu_climber_t (CSR conversion, reverse adjacency, bounds/seed projection).
cpufj_solve signature & work-unit limit
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Extends cpufj_solve to accept work_unit_limit and adds early-exit when accumulated work units reach the limit; updates best-objective trace.
Task type & API declarations
cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
Declares fj_cpu_task_t (preemption flag, owned climber) and factory/run/stop function templates.
Task wrapper & implementations
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu, cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
Implements task deleter, make_fj_cpu_task_from_host_lp, run_fj_cpu_task, stop_fj_cpu_task, and updates explicit template instantiations.
Branch-and-Bound includes & wiring
cpp/src/branch_and_bound/branch_and_bound.cpp
Includes CPUFJ thread/task header and utilities/scope_guard.hpp; introduces per-scope task handle, improvement callback, and RAII stop guard.
Cut-pass control flow refactor
cpp/src/branch_and_bound/branch_and_bound.cpp
Refactors root cut-pass into do_cut_pass lambda returning structured cut_pass_result_t (CONTINUE/BREAK/RETURN + mip_status_t).
OpenMP orchestration per cut pass
cpp/src/branch_and_bound/branch_and_bound.cpp
Per-cut-pass OpenMP parallel/single optionally launches CPUFJ task concurrently with do_cut_pass, stops/waits for task, and handles returned action (including solver exit on RETURN); may prebuild next task in non-deterministic mode.
Final root-cut CPUFJ execution
cpp/src/branch_and_bound/branch_and_bound.cpp
After cut-pass loop, conditionally constructs and runs a final CPUFJ task when cuts exist, using deterministic vs non-deterministic time-limit policy, then resets the task handle.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Run CPUFJ bursts at the root' directly and clearly describes the main change—enabling CPUFJ heuristic bursts during root cut passes.
Description check ✅ Passed The description is directly related to the changeset, explaining the motivation for running CPUFJ during root cuts and providing concrete performance examples.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1768-1775: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce work_unit_limit on the actual consumed work.

work_units_elapsed is only accrued and checked every 100 iterations, and the last partial batch is never flushed before exit. For the short root-cut bursts this PR adds, that means a small work_unit_limit can be overshot by a wide margin while still underreporting the final work consumed.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1768 - 1775,
The code only updates fj_cpu.work_units_elapsed (via
fj_cpu.memory_aggregator.collect() and applying fj_cpu.work_unit_bias) every 100
iterations, causing the final partial batch to be unaccounted and allowing
work_unit_limit to be overshot; modify the loop so that you flush and apply the
remaining memory statistics on every iteration boundary that may exit: call auto
[loads, stores] = fj_cpu.memory_aggregator.collect(), compute biased_work =
(loads + stores) * fj_cpu.work_unit_bias / 1e10, add it to
fj_cpu.work_units_elapsed, call fj_cpu.producer_sync->notify_progress() if
non-null, and then check fj_cpu.work_units_elapsed >= work_unit_limit to
break—either by moving this logic out of the 100-iteration guard into the common
path or by ensuring you perform one final collect/update/notify/check right
before breaking/returning so the true consumed work is enforced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2231-2237: The callback root_cut_cpufj_improvement_callback is
decoding the host-LP assignment against the live original_lp_ (via
uncrush_primal_solution) which can be concurrently mutated by
add_cuts/remove_cuts; capture and use the exact LP snapshot used to build the
CPUFJ task (or change the task to emit user-space assignments directly) so
uncrush_primal_solution runs against an immutable snapshot, and add
synchronization (or assert/guideline comment) around shared state mutation to
flag the missing concurrency protection for original_problem_/original_lp_
before calling set_new_solution().
- Around line 2254-2256: These return paths call set_solution_at_root(...) and
return immediately, which bypasses the existing async clique cleanup; insert a
call to finish_clique_thread() immediately before each early return so the async
clique task is stopped and joined. Specifically, add finish_clique_thread() just
prior to the set_solution_at_root(solution, cut_info); return
mip_status_t::OPTIMAL; sequence shown, and make the same change at the other
analogous exit that calls set_solution_at_root and returns (the second location
referenced in the comment).

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1881-1886: The task restart fails because run_fj_cpu_task only
clears fj_cpu.halted but not fj_cpu.preemption_flag, so after stop_fj_cpu_task
sets preemption_flag = true subsequent runs immediately exit; update
run_fj_cpu_task (and the duplicate block around the other overload at the
1890–1896 region) to reset fj_cpu.preemption_flag to false before calling
cpufj_solve_loop (i.e., clear the atomic preemption flag on the fj_cpu within
the fj_cpu_task_t<i_t,f_t> instance so the while (!fj_cpu.halted &&
!fj_cpu.preemption_flag.load()) loop can run again).

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1768-1775: The code only updates fj_cpu.work_units_elapsed (via
fj_cpu.memory_aggregator.collect() and applying fj_cpu.work_unit_bias) every 100
iterations, causing the final partial batch to be unaccounted and allowing
work_unit_limit to be overshot; modify the loop so that you flush and apply the
remaining memory statistics on every iteration boundary that may exit: call auto
[loads, stores] = fj_cpu.memory_aggregator.collect(), compute biased_work =
(loads + stores) * fj_cpu.work_unit_bias / 1e10, add it to
fj_cpu.work_units_elapsed, call fj_cpu.producer_sync->notify_progress() if
non-null, and then check fj_cpu.work_units_elapsed >= work_unit_limit to
break—either by moving this logic out of the 100-iteration guard into the common
path or by ensuring you perform one final collect/update/notify/check right
before breaking/returning so the true consumed work is enforced.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 86405377-b8f0-4c1e-888b-d909dd15851b

📥 Commits

Reviewing files that changed from the base of the PR and between a4de253 and 64ba333.

📒 Files selected for processing (5)
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
  • cpp/src/mip_heuristics/local_search/local_search.cu

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp
Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp
Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated
Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated
Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated
Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1739-1752: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Guard the periodic objective log for host-LP CPUFJ tasks.

init_fj_cpu_from_host_lp() initializes pb_ptr to nullptr, but this log path still calls fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...). A root-cut CPUFJ burst that reaches log_interval iterations will dereference null here. Reuse h_best_objective directly, or gate the conversion on pb_ptr != nullptr.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1739 - 1752,
The periodic debug log dereferences fj_cpu->pb_ptr via
get_user_obj_from_solver_obj even though init_fj_cpu_from_host_lp can leave
pb_ptr null; modify the CUOPT_LOG_DEBUG call in the block guarded by
fj_cpu.iterations % fj_cpu.log_interval to avoid dereferencing pb_ptr by either
(a) using fj_cpu->h_best_objective directly in the log instead of
fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...), or (b) wrap the conversion in
a conditional that checks fj_cpu->pb_ptr != nullptr and falls back to
h_best_objective when pb_ptr is null (references: CUOPT_LOG_DEBUG call,
fj_cpu->pb_ptr, get_user_obj_from_solver_obj, h_best_objective,
init_fj_cpu_from_host_lp, fj_cpu.log_interval).
cpp/src/branch_and_bound/branch_and_bound.cpp (1)

2293-2300: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Return cut_pass_result_t struct, not bare mip_status_t enum.

The lambda is declared as -> cut_pass_result_t, but this branch returns mip_status_t::INFEASIBLE directly. This will not compile. Match the pattern used at other early exits in the same lambda: return {cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE};

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2293 - 2300, The
branch that handles !problem_feasible is returning a raw mip_status_t but the
surrounding lambda is declared to return cut_pass_result_t; update the return to
match other early exits by returning the struct, e.g. return
{cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE}; keep the existing
side-effects (calling settings_.heuristic_preemption_callback(), setting
signal_extend_cliques_.store(...), and the `#pragma` omp taskwait depend(in :
*clique_signal)) and replace the bare mip_status_t::INFEASIBLE with the
cut_pass_result_t form.
♻️ Duplicate comments (2)
cpp/src/branch_and_bound/branch_and_bound.cpp (2)

2549-2550: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run clique-thread cleanup on the remaining RETURN exits too.

This outer return still bypasses finish_clique_thread() for the RETURN statuses propagated from do_cut_pass() (numerical/time-limit/infeasible paths). That leaves the async clique task running on those exits.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2549 - 2550, The
return on cut_pass_action_t::RETURN bypasses finish_clique_thread(), leaving the
async clique task running; modify the block handling cut_pass_result from
do_cut_pass() so that when cut_pass_result.action == cut_pass_action_t::RETURN
you call finish_clique_thread() (or otherwise ensure clique-thread cleanup)
before returning cut_pass_result.status—e.g., capture the status, call
finish_clique_thread(), then return the status to guarantee the async task is
cleaned up on RETURN paths.

2235-2243: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Decode CPUFJ incumbents against the same LP snapshot that produced them.

assignment comes from the host-LP snapshot used to build the CPUFJ task, but this callback uncrushes it against the live original_lp_ while the next cut pass is mutating rows/columns. That can remap cut/slack columns and feed a corrupted incumbent into set_new_solution(). Please capture the exact snapshot used for task creation, or have the task emit user-space assignments directly.
As per coding guidelines, "Flag missing synchronization for shared state in concurrent code."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2235 - 2243, The
callback root_cut_cpufj_improvement_callback decodes a host-LP assignment using
uncrush_primal_solution against the live original_lp_ while concurrent cut
passes mutate rows/columns, risking remapped columns and corrupted incumbents;
fix by capturing the LP snapshot used to build the CPUFJ task (or have the task
produce a user-space assignment) and use that snapshot for
uncrush_primal_solution instead of original_lp_, or serialize access by holding
mutex_original_lp_ around both snapshot capture and uncrush/validation before
calling set_new_solution; ensure the code references the CPUFJ task snapshot,
uncrush_primal_solution, original_lp_, mutex_original_lp_, and set_new_solution
so the decoding uses a stable LP view.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1841-1847: The run_fj_cpu_task() body currently clears
fj_cpu.halted and fj_cpu.preemption_flag before calling cpufj_solve_loop(),
which can race with stop_fj_cpu_task() if stop is requested before the OpenMP
task starts; move the initialization/reset of these flags out of
run_fj_cpu_task() and into the code path that prepares/enqueues the
fj_cpu_task_t (i_t,f_t) task so flags are set before task creation; keep
run_fj_cpu_task() as a pure entry that only asserts task.fj_cpu and calls
cpufj_solve_loop(fj_cpu, time_limit, work_unit_limit) to avoid overwriting stop
requests.
- Around line 1637-1640: The definition of cpufj_solve must match the header:
change the function signature to return void and accept a pointer parameter
(fj_cpu_climber_t<i_t, f_t>* fj_cpu) and remove the static/internal linkage so
it has external linkage; also fix the call site that invokes
cpufj_solve_loop(...) to call cpufj_solve(...) instead so the implementation is
actually used, and ensure the explicit template instantiations reference the
corrected void cpufj_solve(fj_cpu_climber_t<i_t, f_t>*, ...) signature.

---

Outside diff comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2293-2300: The branch that handles !problem_feasible is returning
a raw mip_status_t but the surrounding lambda is declared to return
cut_pass_result_t; update the return to match other early exits by returning the
struct, e.g. return {cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE}; keep
the existing side-effects (calling settings_.heuristic_preemption_callback(),
setting signal_extend_cliques_.store(...), and the `#pragma` omp taskwait
depend(in : *clique_signal)) and replace the bare mip_status_t::INFEASIBLE with
the cut_pass_result_t form.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1739-1752: The periodic debug log dereferences fj_cpu->pb_ptr via
get_user_obj_from_solver_obj even though init_fj_cpu_from_host_lp can leave
pb_ptr null; modify the CUOPT_LOG_DEBUG call in the block guarded by
fj_cpu.iterations % fj_cpu.log_interval to avoid dereferencing pb_ptr by either
(a) using fj_cpu->h_best_objective directly in the log instead of
fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...), or (b) wrap the conversion in
a conditional that checks fj_cpu->pb_ptr != nullptr and falls back to
h_best_objective when pb_ptr is null (references: CUOPT_LOG_DEBUG call,
fj_cpu->pb_ptr, get_user_obj_from_solver_obj, h_best_objective,
init_fj_cpu_from_host_lp, fj_cpu.log_interval).

---

Duplicate comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2549-2550: The return on cut_pass_action_t::RETURN bypasses
finish_clique_thread(), leaving the async clique task running; modify the block
handling cut_pass_result from do_cut_pass() so that when cut_pass_result.action
== cut_pass_action_t::RETURN you call finish_clique_thread() (or otherwise
ensure clique-thread cleanup) before returning cut_pass_result.status—e.g.,
capture the status, call finish_clique_thread(), then return the status to
guarantee the async task is cleaned up on RETURN paths.
- Around line 2235-2243: The callback root_cut_cpufj_improvement_callback
decodes a host-LP assignment using uncrush_primal_solution against the live
original_lp_ while concurrent cut passes mutate rows/columns, risking remapped
columns and corrupted incumbents; fix by capturing the LP snapshot used to build
the CPUFJ task (or have the task produce a user-space assignment) and use that
snapshot for uncrush_primal_solution instead of original_lp_, or serialize
access by holding mutex_original_lp_ around both snapshot capture and
uncrush/validation before calling set_new_solution; ensure the code references
the CPUFJ task snapshot, uncrush_primal_solution, original_lp_,
mutex_original_lp_, and set_new_solution so the decoding uses a stable LP view.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0f8fa69a-6646-4ff3-ac2d-c175c950b37c

📥 Commits

Reviewing files that changed from the base of the PR and between 64ba333 and 7323d4b.

📒 Files selected for processing (3)
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated
Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1720-1727: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The burst budget is only enforced once per 100 iterations.

For the new root-cut “short burst” mode, a finite work_unit_limit can be exceeded by a full accounting window before this check runs. That makes the task much less opportunistic and can keep a supposedly idle cut-pass thread busy well past the requested budget.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1720 - 1727,
The burst budget check is only run every 100 iterations, allowing
work_unit_limit to be exceeded; update the loop so memory accounting and the
budget break happen every iteration: call fj_cpu->memory_aggregator.collect(),
compute biased_work and update fj_cpu->work_units_elapsed on every loop
iteration (not gated by fj_cpu->iterations % 100), then call
fj_cpu->producer_sync->notify_progress() if non-null and immediately check if
fj_cpu->work_units_elapsed >= work_unit_limit to break; modify the block around
memory_aggregator.collect()/biased_work/work_units_elapsed/producer_sync to
remove the 100-iteration guard so the budget is enforced promptly.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1479-1483: The host-LP path leaves fj_cpu->pb_ptr null which
causes a null-deref in cpufj_solve when calling
pb_ptr->get_user_obj_from_solver_obj(...); fix by supplying a valid
objective-conversion hook instead of nullptr: create a small dummy/proxy
implementing the same interface used by fj_cpu_climber_t (e.g., a lightweight
object with get_user_obj_from_solver_obj that returns an appropriate default or
identity conversion) and assign its pointer to fj_cpu->pb_ptr in the host-LP
initialization (the same place where fj_cpu->view and fj_cpu->settings are set),
or alternatively ensure cpufj_solve and any callers (including
make_fj_cpu_task_from_host_lp) check pb_ptr for nullptr before calling
get_user_obj_from_solver_obj; reference fj_cpu_climber_t<i_t,f_t>,
fj_t<i_t,f_t>::climber_data_t::view_t, pb_ptr, cpufj_solve, and
make_fj_cpu_task_from_host_lp when making the change.

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1720-1727: The burst budget check is only run every 100
iterations, allowing work_unit_limit to be exceeded; update the loop so memory
accounting and the budget break happen every iteration: call
fj_cpu->memory_aggregator.collect(), compute biased_work and update
fj_cpu->work_units_elapsed on every loop iteration (not gated by
fj_cpu->iterations % 100), then call fj_cpu->producer_sync->notify_progress() if
non-null and immediately check if fj_cpu->work_units_elapsed >= work_unit_limit
to break; modify the block around
memory_aggregator.collect()/biased_work/work_units_elapsed/producer_sync to
remove the 100-iteration guard so the budget is enforced promptly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9e0434df-533d-406b-ab19-a45d713b4d2b

📥 Commits

Reviewing files that changed from the base of the PR and between 7323d4b and 0138b07.

📒 Files selected for processing (3)
  • cpp/src/branch_and_bound/branch_and_bound.cpp
  • cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
  • cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
🚧 Files skipped from review as they are similar to previous changes (1)
  • cpp/src/branch_and_bound/branch_and_bound.cpp

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 11, 2026

/ok to test

@akifcorduk, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 9a64c14

Copy link
Copy Markdown
Contributor

@nguidotti nguidotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aliceb-nv and @akifcorduk for the hard work! A couple of things:

  • After #1099, the omp parallel/single directives are not longer needed. You can eliminate them completely.
  • Is wrapping the cut pass as a lambda necessary? You could just launch the CPU FJ, do the cut passes, and then stop the CPU FJ. Launching a omp task is an asynchronous operation.
  • I think cpu_fj_task can be eliminated since all its methods are operating over the cpu_fj_climber anyway.

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated
} else {
}

auto do_cut_pass = [&]() -> cut_pass_result_t {
Copy link
Copy Markdown
Contributor

@nguidotti nguidotti May 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh I understand why Alice designed this way. You cannot call return from within a omp parallel region. If you eliminate the directives like I recommend, then the lambda is no longer necessary. You could just launch the omp task with cpu fj at the beginning, do the cut pass and then do a taskwait with dependency on the root fj object.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also eliminate the enum cut_pass_action_t in this way.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason is that cut pass code has early return points. When we return, the cpufj task remains not synchonized.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A call to stop_fj + taskwait on the scope_guard should handle early returns, no?

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
Comment thread cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
Comment thread cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 9356dc8

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 2763151

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 26f2cbd

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 30596fa

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test 6f1f1d9

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated
@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test ef39841

@akifcorduk
Copy link
Copy Markdown
Contributor

/ok to test c45f3ff

Copy link
Copy Markdown
Contributor

@nguidotti nguidotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks @akifcorduk!

@nguidotti
Copy link
Copy Markdown
Contributor

I just merged #1201. So, you can revert the changes to the determinism test

@akifcorduk
Copy link
Copy Markdown
Contributor

/merge

@rapids-bot rapids-bot Bot merged commit ad098d1 into NVIDIA:main May 13, 2026
205 of 209 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants