tests: Mark tests as unsafe or limit number of threads. #2229

Open
seberg wants to merge 4 commits into NVIDIA:main from seberg:ft-testing-markers
seberg commented Jun 16, 2026

Contributor

This is straightforward (though that doesn't mean we won't want to skip some of it) and does two things:

  1. When running with many threads, the IPC tests run out of file descriptors and get very slow, so limit them to an arbitrary value (4 or 8).
  2. Mark tests as thread-unsafe (I'll add some inline comments).

For the thread-unsafe markers, we should sanity-check each one: if a test looks like it should be thread-safe, we may need either explicit tests or a fix.
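To illustrate why capping the thread count helps the IPC tests, here is a minimal stdlib-only sketch. It is not how the test plugin implements `parallel_threads_limit`; `run_parallel` and `ipc_like_test` are hypothetical names, and a pipe stands in for whatever descriptors the real IPC tests open. It shows that the number of simultaneously open descriptors scales with the number of threads actually used, so a limit of 4 bounds it regardless of how many threads were requested.

```python
import os
import threading

open_fds = 0        # descriptors currently held by running workers
peak_open_fds = 0   # highest value open_fds reached
lock = threading.Lock()


def ipc_like_test(barrier):
    """Hypothetical test body: holds one pipe (two fds) while it runs."""
    global open_fds, peak_open_fds
    read_fd, write_fd = os.pipe()
    try:
        with lock:
            open_fds += 2
            peak_open_fds = max(peak_open_fds, open_fds)
        # Wait until every worker holds its pipe, as in a fully parallel run.
        barrier.wait()
    finally:
        with lock:
            open_fds -= 2
        os.close(read_fd)
        os.close(write_fd)


def run_parallel(test_func, requested_threads, limit=None):
    """Hypothetical runner: run test_func in threads, capped at `limit`."""
    n_threads = requested_threads if limit is None else min(requested_threads, limit)
    barrier = threading.Barrier(n_threads, timeout=30)
    threads = [
        threading.Thread(target=test_func, args=(barrier,))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return n_threads


# 32 threads requested, but the per-test limit of 4 applies.
used = run_parallel(ipc_like_test, requested_threads=32, limit=4)
print(used, peak_open_fds)
```

With the cap in place, the peak is two descriptors per thread used, not per thread requested.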

seberg added 4 commits June 16, 2026 11:09
- thread_unsafe: nvml init ref-count, graphMem attr, mock-based tests,
  OpenGL, peer-access pool state, multiprocessing warning, program-cache
  race reproduction, and functools.cache mutation tests
- parallel_threads_limit: IPC / worker-pool tests that spawn subprocesses
  or open file descriptors (limit 4), example tests (limit 8), and the
  event-registration test whose timeouts are slow

Signed-off-by: Sebastian Berg <sebastianb@nvidia.com>
seberg added this to the cuda.core next milestone Jun 16, 2026
seberg self-assigned this Jun 16, 2026
seberg added labels Jun 16, 2026: P1 (Medium priority - Should do), test (Improvements or additions to tests), cuda.bindings (Everything related to the cuda.bindings module), cuda.core (Everything related to the cuda.core module), cuda.pathfinder (Everything related to the cuda.pathfinder module)
copy-pr-bot Bot commented Jun 16, 2026

Contributor

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

assert len(read_counters) == 5


@pytest.mark.thread_unsafe(reason="API appears to be thread-unsafe (2026-06)")

Contributor Author

This felt fine to me, as it writes the counter, but if it doesn't look right, it could be an upstream issue. (CI ran into a segfault, IIRC.)



@pytest.mark.usefixtures("clear_find_binary_cache")
@pytest.mark.thread_unsafe(reason="functools.cache may replace entry.")

Contributor Author

This tests identity on Path objects, which isn't a given with functools.cache right now. But it felt harmless enough to just ignore here.
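The identity issue can be reproduced in isolation. The sketch below assumes nothing about the project's code: `find_binary` here is a hypothetical stand-in, and the barrier exists only to force two threads into the cached function at the same time. Because `functools.cache` may call the wrapped function more than once when a second call arrives before the first has been cached, the two callers can receive equal but distinct `Path` objects.

```python
import functools
import threading
from pathlib import Path

# Forces both threads to be inside the cached function simultaneously.
barrier = threading.Barrier(2, timeout=10)


@functools.cache
def find_binary(name):
    """Hypothetical stand-in for a cached lookup that returns a Path."""
    barrier.wait()
    return Path("/usr/bin") / name


results = []


def worker():
    results.append(find_binary("nvcc"))


threads = [threading.Thread(target=worker) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

first, second = results
# Later calls hit the cache and return whichever entry was stored last.
cached = find_binary("nvcc")

print(first == second, first is second)
```

An `is` check on the cached value is therefore only reliable when the first call completes before any other thread makes the same call.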



@pytest.mark.skipif(sys.platform == "win32", reason="Test not supported on Windows")
@pytest.mark.thread_unsafe(reason="nvml init affects other threads")

Contributor

Definitely true.

seberg commented Jun 16, 2026

Contributor Author

/ok to test 1f7783f

