Join worker vCPU threads before tearing down guest memory by Max042004 · Pull Request #85 · sysprog21/elfuse

Max042004 · 2026-06-06T13:52:17Z

On exit_group the main thread leaves vcpu_run_loop as soon as it observes
the exit-group flag and proceeds to cleanup_main_resources(), which unmaps
the guest slab via guest_destroy(). Sibling vCPU threads may still be
mid-iteration in their own run loops (e.g. in shim_globals_recompute_attention,
which touches guest memory). A worker that reads the slab after the main
thread frees it faults at the host level and the elfuse process dies with
SIGSEGV, so a guest that requested exit_group(0) is reported as exit 139.

This was masked until now because workloads that exercise it (multi-threaded
JVMs) crashed earlier; with the fault-delivery fix javac runs to completion
and reaches the exit_group teardown, exposing the race.

Have the main thread call thread_join_workers() after vcpu_run_loop()
returns and before any teardown. It waits for the workers to wind down
(they respond to the hv_vcpus_exit() that exit_group already issued) and is
a no-op once they have. javac now exits 0.

Summary by cubic

Join worker vCPU threads before tearing down guest memory to prevent host SIGSEGVs and preserve correct exit codes on exit_group. Shut down the debugger first so paused GDB sessions don’t cause join timeouts.

Bug Fixes
- Call gdb_stub_shutdown() before thread_join_workers() to wake paused workers and avoid join timeouts/detach during GDB stops.
- Call thread_join_workers() after vcpu_run_loop() and before guest teardown to avoid workers touching freed guest memory.
- Include runtime/thread.h.

^{Written for commit b981ccc. Summary will update on new commits.}

jserv · 2026-06-06T16:19:48Z

(cherry picked from commit e2f63eab9575439bd2f900954c8e095986d321b8)

Don't mention this as this commit never existed in public.

jserv · 2026-06-06T16:27:02Z

+     * fault on freed guest memory and crash the host with SIGSEGV, masking the
+     * real exit code. thread_join_workers() is a no-op once the workers have
+     * already wound down (the common single-threaded case). */
+    thread_join_workers();


Ordering vs gdb_stub_shutdown() at the next line leaves a residual race. gdb_stub_shutdown() (gdbstub.c:1435-1440) broadcasts resume_cond to release any GDB-paused workers so they can exit. A worker stuck in gdb_stub_handle_stop has active=1 (paused in ptrace-stop, not deactivated), so the new join here polls 100ms, sees no deactivation, and detaches. Then gdb_stub_shutdown releases the worker, it resumes execution, and main proceeds to shim_globals_counters_dump / cleanup_main_resources -- same SIGSEGV pattern the PR is fixing, just gated on a GDB session.

Move the join one line down so the GDB-paused case is also covered:

int exit_code = vcpu_run_loop(...); /* Tear down debugger state before freeing guest/vCPU resources. */ gdb_stub_shutdown(); thread_join_workers(); if (shim_globals_stats_enabled()) shim_globals_counters_dump(&g);

jserv · 2026-06-06T16:27:02Z

+     * fault on freed guest memory and crash the host with SIGSEGV, masking the
+     * real exit code. thread_join_workers() is a no-op once the workers have
+     * already wound down (the common single-threaded case). */
+    thread_join_workers();


sc_exit_group at syscall.c:719 already calls thread_join_workers(), so this is a second join. Two consequences worth tracking:

If a worker calls exit_group, that worker's thread_join_workers snapshots all others including main (slot 0, which never thread_deactivates itself) -- so it polls 100ms then detaches main. Detaching the host process's main pthread is legal but architecturally odd.

After that first call returns and main eventually runs this new one, any worker that wound down in between has active=0 -- main's call then tries pthread_join on a thread that was already joined or detached. POSIX says undefined; macOS libpthread returns EINVAL in practice (return is unchecked at thread.c:354), so not catastrophic but contract drift.

Cleaner: drop the thread_join_workers() from sc_exit_group. The exit_group caller still does proc_request_exit_group + thread_for_each(thread_force_exit_cb) which forces every sibling out of hv_vcpu_run; main observes the flag, exits the loop, and runs the single authoritative join here. Removes the detach-main wart and the double-join. Can be a follow-up -- not a blocker for this PR.

On exit_group the main thread leaves vcpu_run_loop as soon as it observes the exit-group flag and proceeds to cleanup_main_resources(), which unmaps the guest slab via guest_destroy(). Sibling vCPU threads may still be mid-iteration in their own run loops (e.g. in shim_globals_recompute_attention, which touches guest memory). A worker that reads the slab after the main thread frees it faults at the host level and the elfuse process dies with SIGSEGV, so a guest that requested exit_group(0) is reported as exit 139. This was masked until now because workloads that exercise it (multi-threaded JVMs) crashed earlier; with the fault-delivery fix javac runs to completion and reaches the exit_group teardown, exposing the race. Have the main thread call thread_join_workers() after vcpu_run_loop() returns and before any teardown. It waits for the workers to wind down (they respond to the hv_vcpus_exit() that exit_group already issued) and is a no-op once they have. javac now exits 0. Call gdb_stub_shutdown() before thread_join_workers() rather than after: a worker parked in gdb_stub_handle_stop() stays active (not deactivated) until gdb_stub_shutdown() broadcasts resume_cond, so joining first only times out and detaches it while it is still paused in the GDB stop, reintroducing the same freed-memory race whenever a GDB session is attached.

This comment was marked as resolved.

Sign in to view

jserv reviewed Jun 6, 2026

View reviewed changes

Max042004 force-pushed the fix-worker-join-teardown branch 2 times, most recently from 951abef to 24287df Compare July 2, 2026 10:47

Max042004 force-pushed the fix-worker-join-teardown branch from 24287df to b981ccc Compare July 2, 2026 10:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Join worker vCPU threads before tearing down guest memory#85

Join worker vCPU threads before tearing down guest memory#85
Max042004 wants to merge 1 commit into
sysprog21:mainfrom
Max042004:fix-worker-join-teardown

Max042004 commented Jun 6, 2026 •

edited by cubic-dev-ai Bot

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

jserv commented Jun 6, 2026

Uh oh!

jserv Jun 6, 2026

Uh oh!

jserv Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Max042004 commented Jun 6, 2026 • edited by cubic-dev-ai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by cubic

Uh oh!

This comment was marked as resolved.

Uh oh!

jserv commented Jun 6, 2026

Uh oh!

jserv Jun 6, 2026

Choose a reason for hiding this comment

Uh oh!

jserv Jun 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Max042004 commented Jun 6, 2026 •

edited by cubic-dev-ai Bot

Loading