Run CPUFJ bursts at the root by aliceb-nv · Pull Request #1179 · NVIDIA/cuopt

aliceb-nv · 2026-05-05T09:01:04Z

Some primal heuristics, like CPUFJ, may benefit in some cases from operating on the problem with root cuts.
This PR adds support for this, by running bursts of a few iterations of CPUFJ per cut pass opportunistically, on threads that would otherwise remain idle; and also once after the root cut passes are completed.

In most cases, this provides no benefits, but some instances like tutaki find their first integer incumbent much faster. Additionally, awhea is now reliably feasibilized and solved to a gap of <5% on 10min runs.

Description

Issue

Checklist

I am familiar with the Contributing Guidelines.
Testing
- New or existing tests cover these changes
- Added tests
- Created an issue to follow-up
- NA
Documentation
- The documentation is up to date with these changes
- Added new documentation
- NA

copy-pr-bot · 2026-05-05T09:01:07Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

aliceb-nv · 2026-05-05T09:01:36Z

/ok to test 64ba333

coderabbitai · 2026-05-05T09:10:56Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Integrates CPU feasibility-jump (CPUFJ) as a parallel heuristic during branch-and-bound root cut passes: adds task types and APIs, refactors host-LP initialization, extends cpufj_solve with work-unit limits, and orchestrates concurrent CPUFJ execution and final-task runs around the cut-pass loop.

Changes

CPUFJ Root-Cut Heuristic Integration

Layer / File(s)	Summary
Header includes & declarations `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh`	Adds `cpu_fj_thread.cuh` include and extends `cpufj_solve` declaration with `work_unit_limit` defaulting to infinity.
Core includes & forward decls `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`	Adds dual-simplex/presolve headers, std includes, and forward-declares `finalize_fj_cpu_host_initialization`.
Host data view & finalization `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`	Implements `set_host_data_view` and `finalize_fj_cpu_host_initialization`, and updates `init_fj_cpu` to delegate finalization.
Init from host LP (CSR conversion) `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`	Adds `init_fj_cpu_from_host_lp` to convert a dual-simplex host LP to an initialized `fj_cpu_climber_t` (CSR conversion, reverse adjacency, bounds/seed projection).
cpufj_solve signature & work-unit limit `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`	Extends `cpufj_solve` to accept `work_unit_limit` and adds early-exit when accumulated work units reach the limit; updates best-objective trace.
Task type & API declarations `cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh`	Declares `fj_cpu_task_t` (preemption flag, owned climber) and factory/run/stop function templates.
Task wrapper & implementations `cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`, `cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh`	Implements task deleter, `make_fj_cpu_task_from_host_lp`, `run_fj_cpu_task`, `stop_fj_cpu_task`, and updates explicit template instantiations.
Branch-and-Bound includes & wiring `cpp/src/branch_and_bound/branch_and_bound.cpp`	Includes CPUFJ thread/task header and `utilities/scope_guard.hpp`; introduces per-scope task handle, improvement callback, and RAII stop guard.
Cut-pass control flow refactor `cpp/src/branch_and_bound/branch_and_bound.cpp`	Refactors root cut-pass into `do_cut_pass` lambda returning structured `cut_pass_result_t` (CONTINUE/BREAK/RETURN + `mip_status_t`).
OpenMP orchestration per cut pass `cpp/src/branch_and_bound/branch_and_bound.cpp`	Per-cut-pass OpenMP parallel/single optionally launches CPUFJ task concurrently with `do_cut_pass`, stops/waits for task, and handles returned action (including solver exit on RETURN); may prebuild next task in non-deterministic mode.
Final root-cut CPUFJ execution `cpp/src/branch_and_bound/branch_and_bound.cpp`	After cut-pass loop, conditionally constructs and runs a final CPUFJ task when cuts exist, using deterministic vs non-deterministic time-limit policy, then resets the task handle.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'Run CPUFJ bursts at the root' directly and clearly describes the main change—enabling CPUFJ heuristic bursts during root cut passes.
Description check	✅ Passed	The description is directly related to the changeset, explaining the motivation for running CPUFJ during root cuts and providing concrete performance examples.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)

1768-1775: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Enforce work_unit_limit on the actual consumed work.

work_units_elapsed is only accrued and checked every 100 iterations, and the last partial batch is never flushed before exit. For the short root-cut bursts this PR adds, that means a small work_unit_limit can be overshot by a wide margin while still underreporting the final work consumed.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1768 - 1775,
The code only updates fj_cpu.work_units_elapsed (via
fj_cpu.memory_aggregator.collect() and applying fj_cpu.work_unit_bias) every 100
iterations, causing the final partial batch to be unaccounted and allowing
work_unit_limit to be overshot; modify the loop so that you flush and apply the
remaining memory statistics on every iteration boundary that may exit: call auto
[loads, stores] = fj_cpu.memory_aggregator.collect(), compute biased_work =
(loads + stores) * fj_cpu.work_unit_bias / 1e10, add it to
fj_cpu.work_units_elapsed, call fj_cpu.producer_sync->notify_progress() if
non-null, and then check fj_cpu.work_units_elapsed >= work_unit_limit to
break—either by moving this logic out of the 100-iteration guard into the common
path or by ensuring you perform one final collect/update/notify/check right
before breaking/returning so the true consumed work is enforced.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2231-2237: The callback root_cut_cpufj_improvement_callback is
decoding the host-LP assignment against the live original_lp_ (via
uncrush_primal_solution) which can be concurrently mutated by
add_cuts/remove_cuts; capture and use the exact LP snapshot used to build the
CPUFJ task (or change the task to emit user-space assignments directly) so
uncrush_primal_solution runs against an immutable snapshot, and add
synchronization (or assert/guideline comment) around shared state mutation to
flag the missing concurrency protection for original_problem_/original_lp_
before calling set_new_solution().
- Around line 2254-2256: These return paths call set_solution_at_root(...) and
return immediately, which bypasses the existing async clique cleanup; insert a
call to finish_clique_thread() immediately before each early return so the async
clique task is stopped and joined. Specifically, add finish_clique_thread() just
prior to the set_solution_at_root(solution, cut_info); return
mip_status_t::OPTIMAL; sequence shown, and make the same change at the other
analogous exit that calls set_solution_at_root and returns (the second location
referenced in the comment).

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1881-1886: The task restart fails because run_fj_cpu_task only
clears fj_cpu.halted but not fj_cpu.preemption_flag, so after stop_fj_cpu_task
sets preemption_flag = true subsequent runs immediately exit; update
run_fj_cpu_task (and the duplicate block around the other overload at the
1890–1896 region) to reset fj_cpu.preemption_flag to false before calling
cpufj_solve_loop (i.e., clear the atomic preemption flag on the fj_cpu within
the fj_cpu_task_t<i_t,f_t> instance so the while (!fj_cpu.halted &&
!fj_cpu.preemption_flag.load()) loop can run again).

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1768-1775: The code only updates fj_cpu.work_units_elapsed (via
fj_cpu.memory_aggregator.collect() and applying fj_cpu.work_unit_bias) every 100
iterations, causing the final partial batch to be unaccounted and allowing
work_unit_limit to be overshot; modify the loop so that you flush and apply the
remaining memory statistics on every iteration boundary that may exit: call auto
[loads, stores] = fj_cpu.memory_aggregator.collect(), compute biased_work =
(loads + stores) * fj_cpu.work_unit_bias / 1e10, add it to
fj_cpu.work_units_elapsed, call fj_cpu.producer_sync->notify_progress() if
non-null, and then check fj_cpu.work_units_elapsed >= work_unit_limit to
break—either by moving this logic out of the 100-iteration guard into the common
path or by ensuring you perform one final collect/update/notify/check right
before breaking/returning so the true consumed work is enforced.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 86405377-b8f0-4c1e-888b-d909dd15851b

📥 Commits

Reviewing files that changed from the base of the PR and between a4de253 and 64ba333.

📒 Files selected for processing (5)

cpp/src/branch_and_bound/branch_and_bound.cpp
cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
cpp/src/mip_heuristics/local_search/local_search.cu

coderabbitai

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)
1739-1752: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Guard the periodic objective log for host-LP CPUFJ tasks.

init_fj_cpu_from_host_lp() initializes pb_ptr to nullptr, but this log path still calls fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...). A root-cut CPUFJ burst that reaches log_interval iterations will dereference null here. Reuse h_best_objective directly, or gate the conversion on pb_ptr != nullptr.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1739 - 1752,
The periodic debug log dereferences fj_cpu->pb_ptr via
get_user_obj_from_solver_obj even though init_fj_cpu_from_host_lp can leave
pb_ptr null; modify the CUOPT_LOG_DEBUG call in the block guarded by
fj_cpu.iterations % fj_cpu.log_interval to avoid dereferencing pb_ptr by either
(a) using fj_cpu->h_best_objective directly in the log instead of
fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...), or (b) wrap the conversion in
a conditional that checks fj_cpu->pb_ptr != nullptr and falls back to
h_best_objective when pb_ptr is null (references: CUOPT_LOG_DEBUG call,
fj_cpu->pb_ptr, get_user_obj_from_solver_obj, h_best_objective,
init_fj_cpu_from_host_lp, fj_cpu.log_interval).
cpp/src/branch_and_bound/branch_and_bound.cpp (1)
2293-2300: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Return cut_pass_result_t struct, not bare mip_status_t enum.

The lambda is declared as -> cut_pass_result_t, but this branch returns mip_status_t::INFEASIBLE directly. This will not compile. Match the pattern used at other early exits in the same lambda: return {cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE};
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2293 - 2300, The
branch that handles !problem_feasible is returning a raw mip_status_t but the
surrounding lambda is declared to return cut_pass_result_t; update the return to
match other early exits by returning the struct, e.g. return
{cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE}; keep the existing
side-effects (calling settings_.heuristic_preemption_callback(), setting
signal_extend_cliques_.store(...), and the `#pragma` omp taskwait depend(in :
*clique_signal)) and replace the bare mip_status_t::INFEASIBLE with the
cut_pass_result_t form.

♻️ Duplicate comments (2)

cpp/src/branch_and_bound/branch_and_bound.cpp (2)
2549-2550: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Run clique-thread cleanup on the remaining RETURN exits too.

This outer return still bypasses finish_clique_thread() for the RETURN statuses propagated from do_cut_pass() (numerical/time-limit/infeasible paths). That leaves the async clique task running on those exits.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2549 - 2550, The
return on cut_pass_action_t::RETURN bypasses finish_clique_thread(), leaving the
async clique task running; modify the block handling cut_pass_result from
do_cut_pass() so that when cut_pass_result.action == cut_pass_action_t::RETURN
you call finish_clique_thread() (or otherwise ensure clique-thread cleanup)
before returning cut_pass_result.status—e.g., capture the status, call
finish_clique_thread(), then return the status to guarantee the async task is
cleaned up on RETURN paths.
2235-2243: ⚠️ Potential issue | 🔴 Critical | 🏗️ Heavy lift

Decode CPUFJ incumbents against the same LP snapshot that produced them.

assignment comes from the host-LP snapshot used to build the CPUFJ task, but this callback uncrushes it against the live original_lp_ while the next cut pass is mutating rows/columns. That can remap cut/slack columns and feed a corrupted incumbent into set_new_solution(). Please capture the exact snapshot used for task creation, or have the task emit user-space assignments directly.
As per coding guidelines, "Flag missing synchronization for shared state in concurrent code."
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/branch_and_bound/branch_and_bound.cpp` around lines 2235 - 2243, The
callback root_cut_cpufj_improvement_callback decodes a host-LP assignment using
uncrush_primal_solution against the live original_lp_ while concurrent cut
passes mutate rows/columns, risking remapped columns and corrupted incumbents;
fix by capturing the LP snapshot used to build the CPUFJ task (or have the task
produce a user-space assignment) and use that snapshot for
uncrush_primal_solution instead of original_lp_, or serialize access by holding
mutex_original_lp_ around both snapshot capture and uncrush/validation before
calling set_new_solution; ensure the code references the CPUFJ task snapshot,
uncrush_primal_solution, original_lp_, mutex_original_lp_, and set_new_solution
so the decoding uses a stable LP view.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1841-1847: The run_fj_cpu_task() body currently clears
fj_cpu.halted and fj_cpu.preemption_flag before calling cpufj_solve_loop(),
which can race with stop_fj_cpu_task() if stop is requested before the OpenMP
task starts; move the initialization/reset of these flags out of
run_fj_cpu_task() and into the code path that prepares/enqueues the
fj_cpu_task_t (i_t,f_t) task so flags are set before task creation; keep
run_fj_cpu_task() as a pure entry that only asserts task.fj_cpu and calls
cpufj_solve_loop(fj_cpu, time_limit, work_unit_limit) to avoid overwriting stop
requests.
- Around line 1637-1640: The definition of cpufj_solve must match the header:
change the function signature to return void and accept a pointer parameter
(fj_cpu_climber_t<i_t, f_t>* fj_cpu) and remove the static/internal linkage so
it has external linkage; also fix the call site that invokes
cpufj_solve_loop(...) to call cpufj_solve(...) instead so the implementation is
actually used, and ensure the explicit template instantiations reference the
corrected void cpufj_solve(fj_cpu_climber_t<i_t, f_t>*, ...) signature.

---

Outside diff comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2293-2300: The branch that handles !problem_feasible is returning
a raw mip_status_t but the surrounding lambda is declared to return
cut_pass_result_t; update the return to match other early exits by returning the
struct, e.g. return {cut_pass_action_t::RETURN, mip_status_t::INFEASIBLE}; keep
the existing side-effects (calling settings_.heuristic_preemption_callback(),
setting signal_extend_cliques_.store(...), and the `#pragma` omp taskwait
depend(in : *clique_signal)) and replace the bare mip_status_t::INFEASIBLE with
the cut_pass_result_t form.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1739-1752: The periodic debug log dereferences fj_cpu->pb_ptr via
get_user_obj_from_solver_obj even though init_fj_cpu_from_host_lp can leave
pb_ptr null; modify the CUOPT_LOG_DEBUG call in the block guarded by
fj_cpu.iterations % fj_cpu.log_interval to avoid dereferencing pb_ptr by either
(a) using fj_cpu->h_best_objective directly in the log instead of
fj_cpu->pb_ptr->get_user_obj_from_solver_obj(...), or (b) wrap the conversion in
a conditional that checks fj_cpu->pb_ptr != nullptr and falls back to
h_best_objective when pb_ptr is null (references: CUOPT_LOG_DEBUG call,
fj_cpu->pb_ptr, get_user_obj_from_solver_obj, h_best_objective,
init_fj_cpu_from_host_lp, fj_cpu.log_interval).

---

Duplicate comments:
In `@cpp/src/branch_and_bound/branch_and_bound.cpp`:
- Around line 2549-2550: The return on cut_pass_action_t::RETURN bypasses
finish_clique_thread(), leaving the async clique task running; modify the block
handling cut_pass_result from do_cut_pass() so that when cut_pass_result.action
== cut_pass_action_t::RETURN you call finish_clique_thread() (or otherwise
ensure clique-thread cleanup) before returning cut_pass_result.status—e.g.,
capture the status, call finish_clique_thread(), then return the status to
guarantee the async task is cleaned up on RETURN paths.
- Around line 2235-2243: The callback root_cut_cpufj_improvement_callback
decodes a host-LP assignment using uncrush_primal_solution against the live
original_lp_ while concurrent cut passes mutate rows/columns, risking remapped
columns and corrupted incumbents; fix by capturing the LP snapshot used to build
the CPUFJ task (or have the task produce a user-space assignment) and use that
snapshot for uncrush_primal_solution instead of original_lp_, or serialize
access by holding mutex_original_lp_ around both snapshot capture and
uncrush/validation before calling set_new_solution; ensure the code references
the CPUFJ task snapshot, uncrush_primal_solution, original_lp_,
mutex_original_lp_, and set_new_solution so the decoding uses a stable LP view.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0f8fa69a-6646-4ff3-ac2d-c175c950b37c

📥 Commits

Reviewing files that changed from the base of the PR and between 64ba333 and 7323d4b.

📒 Files selected for processing (3)

cpp/src/branch_and_bound/branch_and_bound.cpp
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)
1720-1727: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The burst budget is only enforced once per 100 iterations.

For the new root-cut “short burst” mode, a finite work_unit_limit can be exceeded by a full accounting window before this check runs. That makes the task much less opportunistic and can keep a supposedly idle cut-pass thread busy well past the requested budget.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1720 - 1727,
The burst budget check is only run every 100 iterations, allowing
work_unit_limit to be exceeded; update the loop so memory accounting and the
budget break happen every iteration: call fj_cpu->memory_aggregator.collect(),
compute biased_work and update fj_cpu->work_units_elapsed on every loop
iteration (not gated by fj_cpu->iterations % 100), then call
fj_cpu->producer_sync->notify_progress() if non-null and immediately check if
fj_cpu->work_units_elapsed >= work_unit_limit to break; modify the block around
memory_aggregator.collect()/biased_work/work_units_elapsed/producer_sync to
remove the 100-iteration guard so the budget is enforced promptly.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1479-1483: The host-LP path leaves fj_cpu->pb_ptr null which
causes a null-deref in cpufj_solve when calling
pb_ptr->get_user_obj_from_solver_obj(...); fix by supplying a valid
objective-conversion hook instead of nullptr: create a small dummy/proxy
implementing the same interface used by fj_cpu_climber_t (e.g., a lightweight
object with get_user_obj_from_solver_obj that returns an appropriate default or
identity conversion) and assign its pointer to fj_cpu->pb_ptr in the host-LP
initialization (the same place where fj_cpu->view and fj_cpu->settings are set),
or alternatively ensure cpufj_solve and any callers (including
make_fj_cpu_task_from_host_lp) check pb_ptr for nullptr before calling
get_user_obj_from_solver_obj; reference fj_cpu_climber_t<i_t,f_t>,
fj_t<i_t,f_t>::climber_data_t::view_t, pb_ptr, cpufj_solve, and
make_fj_cpu_task_from_host_lp when making the change.

---

Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1720-1727: The burst budget check is only run every 100
iterations, allowing work_unit_limit to be exceeded; update the loop so memory
accounting and the budget break happen every iteration: call
fj_cpu->memory_aggregator.collect(), compute biased_work and update
fj_cpu->work_units_elapsed on every loop iteration (not gated by
fj_cpu->iterations % 100), then call fj_cpu->producer_sync->notify_progress() if
non-null and immediately check if fj_cpu->work_units_elapsed >= work_unit_limit
to break; modify the block around
memory_aggregator.collect()/biased_work/work_units_elapsed/producer_sync to
remove the 100-iteration guard so the budget is enforced promptly.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 9e0434df-533d-406b-ab19-a45d713b4d2b

📥 Commits

Reviewing files that changed from the base of the PR and between 7323d4b and 0138b07.

📒 Files selected for processing (3)

cpp/src/branch_and_bound/branch_and_bound.cpp
cpp/src/mip_heuristics/feasibility_jump/cpu_fj_thread.cuh
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

🚧 Files skipped from review as they are similar to previous changes (1)

cpp/src/branch_and_bound/branch_and_bound.cpp

akifcorduk · 2026-05-11T13:13:08Z

/ok to test

copy-pr-bot · 2026-05-11T13:13:12Z

/ok to test

@akifcorduk, there was an error processing your request: E1

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/1/

akifcorduk · 2026-05-11T13:13:33Z

/ok to test 9a64c14

nguidotti

Thanks @aliceb-nv and @akifcorduk for the hard work! A couple of things:

After #1099, the omp parallel/single directives are not longer needed. You can eliminate them completely.
Is wrapping the cut pass as a lambda necessary? You could just launch the CPU FJ, do the cut passes, and then stop the CPU FJ. Launching a omp task is an asynchronous operation.
I think cpu_fj_task can be eliminated since all its methods are operating over the cpu_fj_climber anyway.

nguidotti · 2026-05-11T13:32:54Z

-    } else {
+    }
+
+    auto do_cut_pass = [&]() -> cut_pass_result_t {


Ahh I understand why Alice designed this way. You cannot call return from within a omp parallel region. If you eliminate the directives like I recommend, then the lambda is no longer necessary. You could just launch the omp task with cpu fj at the beginning, do the cut pass and then do a taskwait with dependency on the root fj object.

You could also eliminate the enum cut_pass_action_t in this way.

The reason is that cut pass code has early return points. When we return, the cpufj task remains not synchonized.

A call to stop_fj + taskwait on the scope_guard should handle early returns, no?

akifcorduk · 2026-05-11T15:42:31Z

/ok to test 9356dc8

akifcorduk · 2026-05-12T09:38:19Z

/ok to test 2763151

akifcorduk · 2026-05-12T10:15:31Z

/ok to test 26f2cbd

akifcorduk · 2026-05-12T10:29:49Z

/ok to test 30596fa

akifcorduk · 2026-05-12T13:01:29Z

/ok to test 6f1f1d9

akifcorduk · 2026-05-12T16:14:22Z

/ok to test ef39841

akifcorduk · 2026-05-12T19:01:24Z

/ok to test c45f3ff

nguidotti

Looks good to me. Thanks @akifcorduk!

nguidotti · 2026-05-13T08:26:34Z

I just merged #1201. So, you can revert the changes to the determinism test

akifcorduk · 2026-05-13T09:30:38Z

/merge

aliceb-nv added 9 commits April 27, 2026 05:28

run a single burst

9698d00

run cpufj in parallel during cuts

80ee80a

Merge branch 'main' into cpufj_cut_bursts

bb71745

adjust logs, fix time limit

9e464bb

Merge branch 'main' into cpufj_cut_bursts

0053a27

cleanup

e2300a4

Merge branch 'main' into cpufj_cut_bursts

d4f1975

some cleanup

45c8154

AI review comments

64ba333

aliceb-nv added this to the 26.06 milestone May 5, 2026

aliceb-nv added the non-breaking Introduces a non-breaking change label May 5, 2026

aliceb-nv requested a review from a team as a code owner May 5, 2026 09:01

aliceb-nv added the improvement Improves an existing functionality label May 5, 2026

aliceb-nv requested review from akifcorduk and hlinsen May 5, 2026 09:01

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp

akifcorduk reviewed May 5, 2026

View reviewed changes

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated

akifcorduk added 4 commits May 11, 2026 13:00

Merge branch 'main' of github.com:NVIDIA/cuopt into cpufj_cut_bursts

4c2140d

handle ai comments

a142c38

fix timer

fa0af6b

Merge branch 'main' of github.com:NVIDIA/cuopt into cpufj_cut_bursts

7323d4b

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu Outdated

fix merge conflicts

0138b07

coderabbitai Bot reviewed May 11, 2026

View reviewed changes

Comment thread cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu

akifcorduk added 2 commits May 11, 2026 15:10

fix ai comments

06dd9f7

fix ai review

9a64c14

akifcorduk approved these changes May 11, 2026

View reviewed changes

nguidotti requested changes May 11, 2026

View reviewed changes

fix few issues

9356dc8

move lambda to function

2763151

fix deterministic queue addition

26f2cbd

fix time limit

30596fa

fix seed issue

6f1f1d9

nguidotti reviewed May 12, 2026

View reviewed changes

Comment thread cpp/src/branch_and_bound/branch_and_bound.cpp Outdated

attempt fixing the infeasible issue in determinism tests

ef39841

add depend in/out

c45f3ff

nguidotti approved these changes May 13, 2026

View reviewed changes

rapids-bot Bot merged commit ad098d1 into NVIDIA:main May 13, 2026
205 of 209 checks passed

Conversation

aliceb-nv commented May 5, 2026

Description

Issue

Checklist

Uh oh!

copy-pr-bot Bot commented May 5, 2026

Uh oh!

aliceb-nv commented May 5, 2026

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Review ran into problems

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

akifcorduk commented May 11, 2026

Uh oh!

copy-pr-bot Bot commented May 11, 2026

Uh oh!

akifcorduk commented May 11, 2026

Uh oh!

nguidotti left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

nguidotti May 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nguidotti May 11, 2026

Choose a reason for hiding this comment

Uh oh!

akifcorduk May 11, 2026

Choose a reason for hiding this comment

Uh oh!

nguidotti May 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

akifcorduk commented May 11, 2026

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

akifcorduk commented May 12, 2026

Uh oh!

nguidotti left a comment

coderabbitai Bot commented May 5, 2026 •

edited

Loading

nguidotti May 11, 2026 •

edited

Loading