Add nightly CI for optional-dependency testing (PyTorch, numba-cuda) #1987
Draft
leofang wants to merge 12 commits into NVIDIA:main
Conversation
…ba-cuda)

Add ci-nightly.yml that downloads wheels from the latest successful CI run on main and tests them against PyTorch and numba-cuda, without rebuilding.

Key changes:
- ci-nightly.yml: new orchestrator (schedule 2 AM UTC + workflow_dispatch)
- test-wheel-linux/windows.yml: add run-id input for cross-run artifact downloads, and test-mode input (standard/nightly-pytorch/nightly-numba-cuda) with conditional test steps
- ci/test-matrix.yml: add nightly entries with MODE field (4 pytorch + 6 numba-cuda across linux-64, linux-aarch64, win-64)
- ci/tools/run-tests: add nightly-install mode that installs all wheels without running standard tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
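Based only on that summary, the orchestrator's shape might look roughly like the sketch below. The job and step names, the `inputs.run-id` override, and the exact `gh` query are illustrative assumptions, not the PR's actual file:

```yaml
# Hypothetical sketch of ci-nightly.yml (names and structure assumed).
name: ci-nightly
on:
  schedule:
    - cron: "0 2 * * *"   # 2 AM UTC daily
  workflow_dispatch:
    inputs:
      run-id:
        description: "CI run to pull wheels from (default: latest successful)"
        required: false

jobs:
  find-run:
    runs-on: ubuntu-latest
    outputs:
      run-id: ${{ steps.pick.outputs.run-id }}
    steps:
      - id: pick
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # Use the given run-id, else the latest successful CI run on main.
          RUN_ID="${{ inputs.run-id }}"
          if [ -z "$RUN_ID" ]; then
            RUN_ID=$(gh run list -R "${{ github.repository }}" \
              --workflow ci.yml --branch main --status success \
              --limit 1 --json databaseId --jq '.[0].databaseId')
          fi
          echo "run-id=$RUN_ID" >> "$GITHUB_OUTPUT"

  test-pytorch:
    needs: find-run
    uses: ./.github/workflows/test-wheel-linux.yml
    with:
      run-id: ${{ needs.find-run.outputs.run-id }}
      test-mode: nightly-pytorch
```

The key design point is that the nightly run builds nothing itself; it only resolves a run-id and hands it to the existing reusable test workflows.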
/ok to test 4aadce2
- Add concurrency group matching ci.yml's pattern
- Replace jq one-liner with explicit cancelled/failure checks per ci.yml's battle-tested pattern (see long comment there for rationale)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove before merging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/ok to test ac5238c
Full history is not needed — we only read ci/versions.yml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Artifact names embed the commit SHA from the build that created them. When the nightly workflow downloads artifacts from a different CI run, it must use that run's SHA — not github.sha (the nightly run's own SHA) — to construct the correct artifact names.

- ci-nightly.yml: resolve head_sha from the source CI run via `gh run view --json headSha`, pass it to test workflows
- test-wheel-linux/windows.yml: add `sha` input (defaults to github.sha for backward compatibility), use it in env-vars

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
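A step resolving the source run's head SHA could be sketched like this; the `gh run view --json headSha` command is from the commit message, while the step id, output name, and input plumbing are assumptions:

```yaml
# Hypothetical resolution step in ci-nightly.yml.
- id: resolve
  env:
    GH_TOKEN: ${{ github.token }}
  run: |
    # Artifact names embed the SHA of the build that produced them,
    # so we need the source run's headSha rather than github.sha.
    HEAD_SHA=$(gh run view "${{ inputs.run-id }}" \
      --json headSha --jq .headSha)
    echo "head-sha=$HEAD_SHA" >> "$GITHUB_OUTPUT"
```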
/ok to test 9286598
Force-pushed a279179 to 8720de0

/ok to test 8720de0
Force-pushed 8720de0 to 6976f8a

/ok to test 6976f8a
- Install ALL wheels (pathfinder + bindings + core) and the optional dep (torch/numba-cuda) in a single pip call so pip resolves everything together and avoids costly reinstall cycles from version conflicts
- Fix "Display structure" step: show only artifact files (cuda_python*.whl, cuda_pathfinder/) instead of `ls -lahR .`, which lists the entire repo
- Fix numba-cuda test command: python -m numba.runtests numba.cuda.tests
- Install Visual C++ Redistributable on Windows before PyTorch (pytorch/pytorch#166628)
- run-tests now does `pip list` at the end of nightly installs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
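The single-resolve install described in the first bullet could look roughly as follows; the wheel glob paths and the PyTorch index URL are assumptions for illustration, not taken verbatim from the PR:

```yaml
# Hypothetical "nightly-install" step for the pytorch mode.
- name: Install all wheels plus optional dependency in one resolve
  run: |
    # One pip invocation lets the resolver see every constraint at once,
    # avoiding install/uninstall churn from version conflicts between
    # pathfinder, bindings, core, and torch.
    pip install cuda_pathfinder/*.whl cuda_python*.whl \
      torch --extra-index-url https://download.pytorch.org/whl/cu130
    pip list
```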
Force-pushed 6976f8a to 0b7cc50

/ok to test fc1dc5d
Force-pushed fc1dc5d to 3653e7a

/ok to test 3653e7a
Force-pushed 3653e7a to bf62c2b

/ok to test bf62c2b
Force-pushed 0c98a26 to 5d653ce

/ok to test 5d653ce
Force-pushed 24ea333 to 8586cf7

/ok to test 8586cf7
Force-pushed 8586cf7 to 6953cdd

/ok to test 6953cdd
Force-pushed 6953cdd to 294eee4

/ok to test 294eee4
Force-pushed 294eee4 to 4f409a7

/ok to test 4f409a7
CUDA_VER in the test environment should match TORCH_CUDA in major.minor. BUILD_CUDA_VER (from build-ctk-ver input) is used for artifact names, so CUDA_VER can differ.

- cu126 → CUDA_VER: 12.6.3 (was 12.9.1)
- cu130 → CUDA_VER: 13.0.2 (was 13.2.1)

For CUDA 12 entries, USE_BACKPORT_BINDINGS kicks in automatically since BUILD_CUDA_MAJOR (13) != TEST_CUDA_MAJOR (12), pulling bindings from the backport branch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
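The adjusted matrix entries might read as below. The TORCH_CUDA, MODE, and CUDA_VER field names come from the commit messages; the surrounding keys and entry structure are illustrative guesses at ci/test-matrix.yml:

```yaml
# Hypothetical nightly matrix entries after the fix (structure assumed).
nightly:
  - ARCH: linux-64
    MODE: pytorch
    TORCH_CUDA: cu126
    CUDA_VER: "12.6.3"   # was 12.9.1; now matches cu126 in major.minor
  - ARCH: linux-64
    MODE: pytorch
    TORCH_CUDA: cu130
    CUDA_VER: "13.0.2"   # was 13.2.1; now matches cu130 in major.minor
```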
Force-pushed 4f409a7 to edeaa76

/ok to test edeaa76
The failing numba-cuda tests will be fixed by a new release (where the fix is included, NVIDIA/numba-cuda#873). The failing PyTorch tests will be fixed by #1988.
leofang commented Apr 30, 2026
Force-pushed ea811c1 to 6f04205
The indentation bug in test_linker.py was fixed in the latest numba-cuda release, so the workaround patch is no longer needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…python into ci-nightly-optdeps
/ok to test 833490e
leofang added a commit that referenced this pull request on May 1, 2026:
…1999)

* Fix torch-incompatible assertions in TestViewCudaArrayInterfaceGPU

  The _check_view method in TestViewCudaArrayInterfaceGPU was missed during the tensor bridge refactor (#1894) and still used raw numpy attributes (in_arr.size, in_arr.strides, in_arr.flags, etc.) that don't work with torch tensors. Use the _arr_* helpers that #1894 added for torch/numpy compatibility. Caught by the nightly optional-dependency CI (#1987).

* Fix strides assertion for torch CAI: allow explicit C-contiguous strides

  torch's __cuda_array_interface__ always reports strides, even for C-contiguous tensors. Use the same assertion pattern as the other _check_view methods: allow strides to equal the C-contiguous values instead of requiring None. Verified locally: 7/7 torch CAI tests pass.

* Unify strides assertion pattern across all _check_view methods

  Use the same if/else pattern with `in (None, strides_in_counts)` in all three _check_view methods for consistency. Previously TestViewCPU and TestViewCudaArrayInterfaceGPU used a one-liner that was harder to read and behaved slightly differently. Verified locally: 66/66 tests pass across TestViewCPU, TestViewGPU, and TestViewCudaArrayInterfaceGPU (including all torch variants).

* Address review: flip strides assertion, add _arr_dtype, merge main

  Per @rwgk's review:
  - Flip strides check to branch on view.strides (all 3 _check_view)
  - Add _arr_dtype helper using __cuda_array_interface__["typestr"] for torch tensors, restore dtype assertion in CAI _check_view
  - Merge main to pick up #1998 (numba flags fix)

  Verified locally: 76/76 tests pass across all three test classes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
/ok to test 2a749d5
Add a nightly CI pipeline that tests cuda-python wheels against optional dependencies (PyTorch and numba-cuda) without rebuilding wheels. Wheels are downloaded from the latest successful CI run on main.

Design

- ci-nightly.yml: New orchestrator workflow (2 AM UTC daily + workflow_dispatch for manual testing). Finds the latest successful CI run on main and passes its run-id to the existing test workflows.
- test-wheel-linux/windows.yml: Extended with two new inputs:
  - run-id: enables actions/download-artifact to pull wheels from a different workflow run (defaults to github.run_id for backward compatibility)
  - test-mode: standard (default, current behavior), nightly-pytorch, or nightly-numba-cuda
- test-matrix.yml: New nightly: entries with a MODE field. The orchestrator uses the existing matrix_filter input to select by mode.
- run-tests: New nightly-install mode that installs all wheels without running standard tests.

Test matrix (14 jobs)

PyTorch (8 jobs: 4 linux-64 + 4 win-64)
Tests: cuda_core/tests/test_utils.py (SMV/DLPack interop) + cuda_core/tests/example_tests/ (pytorch_example)

numba-cuda (6 jobs: 2 linux-64 + 2 linux-aarch64 + 2 win-64)
Tests: python -m numba_cuda.numba.cuda.tests (numba-cuda's bundled test suite)

How to test this PR

Trigger the workflow via workflow_dispatch with a run-id from a recent successful CI run.

Standard CI (ci.yml) is unaffected — test-mode defaults to standard and run-id defaults to github.run_id.

-- Leo's bot