Skip to content

Replaced cusparse wrappers with simple unique_ptrs and more RAII #1342

Open
Bubullzz wants to merge 16 commits into
NVIDIA:mainfrom
Bubullzz:replace_wrappers_with_unique_ptrs
Open

Replaced cusparse wrappers with simple unique_ptrs and more RAII #1342
Bubullzz wants to merge 16 commits into
NVIDIA:mainfrom
Bubullzz:replace_wrappers_with_unique_ptrs

Conversation

@Bubullzz

@Bubullzz Bubullzz commented May 29, 2026

Copy link
Copy Markdown
Contributor

Currently in CuOpt we have many wrappers around cusparse data structures to implement the current behaviour:

  • limit ownership to a single point
  • allow for a non-owning access to be passed as arguments to fonctions
  • custom destructor calling the associated CuSparse destruction API

All of these features are handmade using an internal need_destruction state. This seems dangerous and easily replacable by a unique_ptr.

In this PR I replace these wrappers with unique_ptr when they are owned and I replace them with direct pointers, alliased as cusparse_object_view when passed as argument to functions.

@Bubullzz Bubullzz requested a review from a team as a code owner May 29, 2026 15:24
@Bubullzz Bubullzz requested review from chris-maes and rg20 May 29, 2026 15:24
@copy-pr-bot

copy-pr-bot Bot commented May 29, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@Bubullzz Bubullzz added the do not merge Do not merge if this flag is set label May 29, 2026
@Bubullzz Bubullzz changed the title Replaced wrappers with simple unique_ptrs and more RAII Replaced cusparse wrappers with simple unique_ptrs and more RAII May 29, 2026
@coderabbitai

coderabbitai Bot commented May 29, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: dd1ce70f-2908-4228-8198-d37039b47e4d

📥 Commits

Reviewing files that changed from the base of the PR and between 90f1214 and 1afbe40.

📒 Files selected for processing (1)
  • skills/cuopt-multi-objective-exploration/skill-card.md
✅ Files skipped from review due to trivial changes (1)
  • skills/cuopt-multi-objective-exploration/skill-card.md

📝 Walkthrough

Walkthrough

Refactors cuSPARSE descriptor ownership across barrier and PDLP code from wrapper and raw handles to RAII unique_ptr-based descriptors with non-owning view aliases. Updates constructor setup, helper APIs, and runtime cuSPARSE call sites to pass descriptors through .get(). Also updates one skill-card credential line.

Changes

cuSPARSE Descriptor RAII Migration

Layer / File(s) Summary
Descriptor type system and API contracts
cpp/src/pdlp/cusparse_view.hpp, cpp/src/barrier/cusparse_view.hpp, cpp/src/barrier/cusparse_info.hpp
Introduces custom deleters, owning alias types, non-owning descriptor-view aliases, and factory declarations for cuSPARSE matrices, vectors, dense matrices, and SpMVOp handles.
Barrier cusparse_view constructor and API
cpp/src/barrier/cusparse_view.{hpp,cu}
Constructor uses make_csr and make_dnvec factories, create_vector returns an owning vector handle, spmv and transpose_spmv accept descriptor-view parameters, and the destructor declaration is removed.
Barrier sparse matrix kernel updates
cpp/src/barrier/sparse_matrix_kernels.cuh
SpGEMM and SpMM paths replace wrapper construction with make_csr factories and pass descriptors via .get().
Barrier iteration_data_t descriptor migration
cpp/src/barrier/barrier.cu
iteration_data_t member descriptors switch to owning vector handles and gpu_adat_multiply accepts descriptor-view parameters.
Barrier compute-path call sites
cpp/src/barrier/barrier.cu
Residual, H-formation, dx-formation, debug-residual, and objective-computation cuSPARSE calls update to pass owned descriptors via .get().
PDLP cusparse_view main-path constructor
cpp/src/pdlp/cusparse_view.cu
The first cusparse_view_t constructor creates descriptors via make_* factories, and its buffer-size and preprocess calls pass .get() descriptor views.
PDLP cusparse_view alternate-path constructor
cpp/src/pdlp/cusparse_view.cu
The second cusparse_view_t constructor creates descriptors via make_* factories and updates its buffer-size and preprocess call sites to use .get().
PDLP cusparse_view restart-strategy constructor
cpp/src/pdlp/cusparse_view.cu
The restart constructor recreates descriptors from saved sizes and pointers, recovers scratch-buffer pointers, and updates buffer-size and preprocess calls to use .get().
PDLP helper functions and CSR pointer redirection
cpp/src/pdlp/cusparse_view.cu
Mixed-precision SpMV helpers, SpMM preprocess, SpMVOp factories and deleters, and CSR pointer redirection use descriptor-view parameters and .get() at cuSPARSE call sites.
Optimal batch-size benchmarking
cpp/src/pdlp/optimal_batch_size_handler/optimal_batch_size_handler.cu
SpMM_benchmarks_context_t and evaluate_node accept descriptor views, dense descriptors are created via make_dnmat, and SpMM and node-evaluation calls use .get().
PDHG solver iteration and updates
cpp/src/pdlp/pdhg.cu
compute_next_dual_solution, spmvop paths, compute_At_y/compute_A_x variants, and update_solution pass matrices, vectors, and plans via .get().
PDLP batch-context handling
cpp/src/pdlp/pdlp.cu
Batch-context resize rebuilds descriptors via make_dnmat, and later buffer-size, preprocess, compute_fixed_error, and initial step-size calls use .get().
Restart and termination strategy paths
cpp/src/pdlp/restart_strategy/*, cpp/src/pdlp/termination_strategy/*
Restart gradient and Lagrangian computations, convergence and infeasibility residuals, and adaptive-step-size interaction routines pass cuSPARSE descriptors via .get().
Skill card credential line update
skills/cuopt-multi-objective-exploration/skill-card.md
The requirements section credential-type line is updated and trailing whitespace is removed.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • Iroy30
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 5.26% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: replacing cuSPARSE wrappers with unique_ptr-based RAII and view aliases.
Description check ✅ Passed The description matches the diff by explaining the move from handmade cuSPARSE wrappers to unique_ptr ownership and non-owning view aliases.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/src/pdlp/cusparse_view.cu (1)

240-242: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing has_value() check before dereferencing optional.

func at line 242 is dereferenced without checking if the dlsym lookup succeeded.

Proposed fix
 void cusparse_spmvop_run(cusparseHandle_t handle,
                          cusparseSpMVOpPlan_t plan,
                          const void* alpha,
                          const void* beta,
                          cusparse_dn_vec_descr_view vecX,
                          cusparse_dn_vec_descr_view vecY,
                          cusparse_dn_vec_descr_view vecZ,
                          cudaStream_t stream)
 {
   static const auto func = dynamic_load_runtime::function<cusparseSpMVOp_sig>("cusparseSpMVOp");
+  cuopt_expects(func.has_value(), "cusparseSpMVOp symbol not found at runtime");
   RAFT_CUSPARSE_TRY(cusparseSetStream(handle, stream));
   RAFT_CUSPARSE_TRY((*func)(handle, plan, alpha, beta, vecX, vecY, vecZ));
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.cu` around lines 240 - 242, The code dereferences
the optional dynamic_load_runtime::function<cusparseSpMVOp_sig> named func
without checking it; before calling (*func)(handle, plan, alpha, beta, vecX,
vecY, vecZ) add a check like if (!func.has_value()) and handle the failure (log
or return an error/throw) with a clear message including the symbol name
"cusparseSpMVOp"; keep the existing RAFT_CUSPARSE_TRY usage for actual cuSPARSE
calls and ensure the early error path prevents the dereference of func and
returns/propagates an appropriate error.
🧹 Nitpick comments (2)
cpp/src/pdlp/cusparse_view.hpp (1)

38-57: 💤 Low value

Consider _t suffix for deleter types per project naming conventions.

The coding guidelines specify types/structs should use snake_case_t with _t suffix (e.g., cusparse_sp_mat_deleter_t). However, the current naming follows common STL deleter patterns, so this is a stylistic choice.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.hpp` around lines 38 - 57, The structs
cusparse_sp_mat_deleter, cusparse_dn_vec_deleter, and cusparse_dn_mat_deleter do
not follow the project naming convention requiring a snake_case_t suffix; rename
them to cusparse_sp_mat_deleter_t, cusparse_dn_vec_deleter_t, and
cusparse_dn_mat_deleter_t respectively, update all uses/typedefs/usings in this
compilation unit and any headers that reference these types (e.g., unique_ptr
deleter specializations or variable declarations), and ensure the operator()
implementations remain unchanged and still call RAFT_CUSPARSE_TRY_NO_THROW on
cusparseDestroySpMat/cusparseDestroyDnVec/cusparseDestroyDnMat.
cpp/src/pdlp/cusparse_view.cu (1)

147-149: 💤 Low value

Inconsistent error-checking macro.

Line 147 uses CUSPARSE_CHECK while other cuSPARSE calls in this file use RAFT_CUSPARSE_TRY. Consider using RAFT_CUSPARSE_TRY for consistency.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/src/pdlp/cusparse_view.cu` around lines 147 - 149, Replace the
inconsistent CUSPARSE_CHECK call with the RAFT_CUSPARSE_TRY macro to match the
rest of the file: change the call to cusparseSetStream(...) so it is wrapped
with RAFT_CUSPARSE_TRY rather than CUSPARSE_CHECK, keeping the same arguments
and preserving the subsequent RAFT_CUSPARSE_TRY(cusparseSpMM_preprocess(...))
call; ensure you reference and use the RAFT_CUSPARSE_TRY macro for both
cusparseSetStream and cusparseSpMM_preprocess to maintain consistent error
handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 213-218: The code dereferences the optional dynamic loader result
fn (dynamic_load_runtime::function<cusparseSpMVOp_createDescr_sig>) without
checking presence; update the factory that creates the cusparseSpMVOp_descr (the
block that calls (*fn)(...)) to first test fn.has_value() (or if(!fn) branch)
and handle the missing symbol by returning an empty cusparse_spmvop_descr_uptr
(or otherwise propagate a clear error) instead of dereferencing; ensure the
handling is applied where cusparseSpMVOp_createDescr is invoked so callers of
is_cusparse_runtime_spmvop_supported() are not assumed sufficient.
- Around line 221-228: The dynamic loader result `fn` in make_spmvop_plan is
dereferenced without checking it exists; update make_spmvop_plan to test
dynamic_load_runtime::function<...> fn for presence (e.g., if (!fn) throw or
return an error) before calling (*fn)(...), mirroring the fix used in
make_spmvop_descr so that cusparseSpMVOp_createPlan is only invoked when the
symbol lookup succeeded and RAFT_CUSPARSE_TRY is reached with a valid function
pointer.

---

Outside diff comments:
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 240-242: The code dereferences the optional
dynamic_load_runtime::function<cusparseSpMVOp_sig> named func without checking
it; before calling (*func)(handle, plan, alpha, beta, vecX, vecY, vecZ) add a
check like if (!func.has_value()) and handle the failure (log or return an
error/throw) with a clear message including the symbol name "cusparseSpMVOp";
keep the existing RAFT_CUSPARSE_TRY usage for actual cuSPARSE calls and ensure
the early error path prevents the dereference of func and returns/propagates an
appropriate error.

---

Nitpick comments:
In `@cpp/src/pdlp/cusparse_view.cu`:
- Around line 147-149: Replace the inconsistent CUSPARSE_CHECK call with the
RAFT_CUSPARSE_TRY macro to match the rest of the file: change the call to
cusparseSetStream(...) so it is wrapped with RAFT_CUSPARSE_TRY rather than
CUSPARSE_CHECK, keeping the same arguments and preserving the subsequent
RAFT_CUSPARSE_TRY(cusparseSpMM_preprocess(...)) call; ensure you reference and
use the RAFT_CUSPARSE_TRY macro for both cusparseSetStream and
cusparseSpMM_preprocess to maintain consistent error handling.

In `@cpp/src/pdlp/cusparse_view.hpp`:
- Around line 38-57: The structs cusparse_sp_mat_deleter,
cusparse_dn_vec_deleter, and cusparse_dn_mat_deleter do not follow the project
naming convention requiring a snake_case_t suffix; rename them to
cusparse_sp_mat_deleter_t, cusparse_dn_vec_deleter_t, and
cusparse_dn_mat_deleter_t respectively, update all uses/typedefs/usings in this
compilation unit and any headers that reference these types (e.g., unique_ptr
deleter specializations or variable declarations), and ensure the operator()
implementations remain unchanged and still call RAFT_CUSPARSE_TRY_NO_THROW on
cusparseDestroySpMat/cusparseDestroyDnVec/cusparseDestroyDnMat.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 43105512-0a29-4bb5-b0f5-b5cd32da0e79

📥 Commits

Reviewing files that changed from the base of the PR and between ea7acf0 and e471907.

📒 Files selected for processing (5)
  • cpp/src/barrier/barrier.cu
  • cpp/src/barrier/cusparse_info.hpp
  • cpp/src/barrier/cusparse_view.cu
  • cpp/src/pdlp/cusparse_view.cu
  • cpp/src/pdlp/cusparse_view.hpp

Comment thread cpp/src/pdlp/cusparse_view.cu Outdated
Comment thread cpp/src/pdlp/cusparse_view.cu
@mlubin

mlubin commented May 29, 2026

Copy link
Copy Markdown
Contributor

Trying to clean the code a bit as a side task while waiting for other jobs to finish.

I love the clean up effort! Leaving the review for the experts.

@Kh4ster Kh4ster requested review from Kh4ster and removed request for chris-maes and rg20 June 2, 2026 12:49
@github-actions

Copy link
Copy Markdown

🔔 Hi @anandhkb, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

@Bubullzz Bubullzz added improvement Improves an existing functionality and removed do not merge Do not merge if this flag is set labels Jun 11, 2026
@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 036469d

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 520218a

@Bubullzz Bubullzz added the non-breaking Introduces a non-breaking change label Jun 11, 2026
@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 5e199e8

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ ok to test ba95897

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 90f1214

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 90f1214

@copy-pr-bot

copy-pr-bot Bot commented Jun 23, 2026

Copy link
Copy Markdown

/ok to test 90f1214

@Bubullzz, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test b5ffd92

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 8bdebd7

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test f5eeb57

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test f5eeb57

@copy-pr-bot

copy-pr-bot Bot commented Jun 25, 2026

Copy link
Copy Markdown

/ok to test f5eeb57

@Bubullzz, there was an error processing your request: E2

See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 5c436b7

@Bubullzz Bubullzz requested a review from a team as a code owner June 25, 2026 16:19
@Bubullzz Bubullzz requested a review from Iroy30 June 25, 2026 16:19
@Bubullzz

Copy link
Copy Markdown
Contributor Author

/ok to test 1afbe40

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants