From 52604b6eca7bf793f5c14d579122d3ce18a776fd Mon Sep 17 00:00:00 2001
From: Edmond <1571649+EdmondDantes@users.noreply.github.com>
Date: Fri, 15 May 2026 11:04:37 +0000
Subject: [PATCH 1/3] =?UTF-8?q?#118:=20libuv=5Freactor=5Fshutdown=20?=
 =?UTF-8?q?=E2=80=94=20wait=20for=20threadpool=20cancel-completions?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The first half of #118 (NTS-only): async_dns_addrinfo_t (288 B) leaked
when a coroutine cancelled a DNS lookup and the reactor shut down before
the libuv threadpool worker finished its blocking getaddrinfo() call.

The dispose path sets LIBUV_DNS_F_DISPOSE_PENDING and relies on
on_addrinfo_event firing (re-entering dispose) to reach the pefree
branch. Our shutdown drain used UV_RUN_NOWAIT only — non-blocking peeks
that don't wait for the threadpool cancel-completion to surface via
libuv's internal pipe. After uv_loop_close() the callback can't fire,
so the struct leaks.

Per libuv docs (and issue libuv/libuv#2481), uv_cancel() on an in-flight
uv_getaddrinfo_t returns UV_EBUSY: workers cannot be preempted, only
waited on.

Two-phase drain in libuv_reactor_shutdown:
  Phase 1 — bounded UV_RUN_NOWAIT (sync callbacks already on loop).
  Phase 2 — bounded UV_RUN_ONCE (block until threadpool
            cancel-completions arrive; cap so a wedged DNS server
            can't hang shutdown forever).

If a worker is still wedged past the budget, leave the loop open:
pefree-ing pending structs would race the still-running worker (UAF,
much worse than a leak). The OS reclaims memory at process exit. A
ZEND_DEBUG-only warning is printed for visibility.

Local impact: ~15 ms shutdown for a normal cancel test (Phase 1 drains
the loop immediately, so Phase 2 is not entered). No functional
regressions in ext/async/tests.
---
 CHANGELOG.md    | 18 ++++++++++++++++++
 libuv_reactor.c | 23 ++++++++++++++++-------
 2 files changed, 34 insertions(+), 7 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index bb9c498..3217738 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -26,6 +26,24 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   squarely in PDO core overhead (PDOStatement object init, fetch wrapping).
 
 ### Fixed
+- **#118 — getaddrinfo event-struct leak on reactor shutdown (NTS).** The
+  `async_dns_addrinfo_t` (288 B) was never freed when a coroutine cancelled
+  a DNS resolution and the reactor shut down before libuv's threadpool
+  worker finished its blocking `getaddrinfo()` call. The dispose path
+  set `LIBUV_DNS_F_DISPOSE_PENDING` and relied on `on_addrinfo_event`
+  firing to reach the `pefree` branch — but our shutdown drain used
+  `UV_RUN_NOWAIT` (non-blocking peeks) and didn't wait for the threadpool
+  cancel-completion to surface via libuv's internal pipe; after
+  `uv_loop_close()` the callback could no longer fire. Note: per libuv
+  docs, `uv_cancel(uv_getaddrinfo_t*)` returns `UV_EBUSY` for an in-flight
+  request — we cannot preempt the worker, only wait for it.
+  Fix in `libuv_reactor_shutdown` (`ext/async/libuv_reactor.c`): two-phase
+  drain — bounded `UV_RUN_NOWAIT` for ready callbacks, then bounded
+  `UV_RUN_ONCE` for async cancel-completions from the threadpool. If a
+  worker is still wedged past the budget (e.g. DNS server not responding),
+  leave the loop open: `pefree`-ing pending structs would race with the
+  still-running worker (UAF, much worse than a leak); the OS reclaims the
+  memory at process exit.
 - **#118 — Tracing-JIT SEGV in `Async\Chaos` thread-pool fuzz tests
   (`FAST_CONCAT` deref of `0x1`).** Root cause was *not* in the async
   extension itself but in `ext/opcache/jit/zend_jit_ir.c`
diff --git a/libuv_reactor.c b/libuv_reactor.c
index 40a5324..dc841bd 100644
--- a/libuv_reactor.c
+++ b/libuv_reactor.c
@@ -420,16 +420,25 @@ bool libuv_reactor_shutdown(void)
 	libuv_cleanup_signal_events();
 	libuv_cleanup_process_events();
 
-	/* Drain pending callbacks before closing the loop. A single NOWAIT
-	 * pass misses threadpool requests (getaddrinfo) cancelled during
-	 * shutdown: their completion callback fires a few iterations later,
-	 * so uv_loop_close() would hit EBUSY and the request struct leaks.
-	 * Bounded busy-drain — cancelled requests always complete promptly. */
-	for (int guard = 0; guard < 10000 && uv_loop_alive(UVLOOP) != 0; guard++) {
+	/* Sync drain: pick up ready callbacks. */
+	for (int i = 0; i < 100 && uv_loop_alive(UVLOOP); i++) {
 		uv_run(UVLOOP, UV_RUN_NOWAIT);
 	}
+	/* Async drain: wait for threadpool cancel-completions (getaddrinfo,
+	 * fs). uv_cancel can't preempt an in-flight worker, we must wait. */
+	for (int i = 0; i < 500 && uv_loop_alive(UVLOOP); i++) {
+		uv_run(UVLOOP, UV_RUN_ONCE);
+	}
+	/* Worker still running past the budget — leave the loop open;
+	 * pefree would race with the worker (UAF > leak; OS reclaims). */
+	if (uv_loop_alive(UVLOOP)) {
+#if ZEND_DEBUG
+		fprintf(stderr, "async: libuv shutdown timeout; loop left open\n");
+#endif
+	} else {
+		uv_loop_close(UVLOOP);
+	}
 
-	uv_loop_close(UVLOOP);
 	ASYNC_G(reactor_started) = false;
 	zend_hash_destroy(&ASYNC_G(active_io_handles));
 }

From 578beac84bb71337aa31d35436984bae61f8a081 Mon Sep 17 00:00:00 2001
From: Edmond <1571649+EdmondDantes@users.noreply.github.com>
Date: Fri, 15 May 2026 11:57:30 +0000
Subject: [PATCH 2/3] =?UTF-8?q?#118:=20CHANGELOG=20=E2=80=94=20xferinfo/pr?=
 =?UTF-8?q?ogress=20callback=20exception=20delivery=20(macOS=20fix)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Companion entry for the php-src fix in ext/curl/interface.c. Documents
why the bug only surfaced on macOS (libuv/kqueue reentry path) and that
the same pattern was already applied to curl_prereqfunction and
curl_debug; the xferinfo and progress callbacks were the gap.
---
 CHANGELOG.md | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3217738..f49e221 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -26,6 +26,26 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   squarely in PDO core overhead (PDOStatement object init, fetch wrapping).
 
 ### Fixed
+- **#118 — curl `XFERINFOFUNCTION` / `PROGRESSFUNCTION` exception leak (macOS).**
+  When the user callback threw, `curl_xferinfo` / `curl_progress` in
+  `ext/curl/interface.c` left `EG(exception)` set and returned 0, so libcurl
+  kept driving the transfer; `zend_call_known_fcc` on subsequent ticks
+  short-circuited on the pending exception without clearing it. Eventually
+  the transfer ended and the dangling exception surfaced outside the
+  coroutine — on Linux it landed on a frame the awaiter unwound, but on
+  macOS the libuv/kqueue reentry path delivered it as **uncaught** at
+  engine top-level (`Fatal error: Uncaught RuntimeException`), failing
+  `tests/curl/035-progress_exception.phpt` and
+  `056-multi_progress_exception.phpt` on `MACOS_*_NTS`.
+  Two other async-aware curl callbacks (`curl_prereqfunction`,
+  `curl_debug`) already did the right thing; `xferinfo`/`progress` were
+  missed when that pattern was applied.
+  Fix: in both callbacks, after `zend_call_known_fcc` returns, if
+  `EG(exception)` is set and `ch->async_event` exists, hand the exception
+  off to `curl_async_event_set_callback_exception()`, clear it, and
+  return 1 to abort the transfer — the captured exception is then
+  re-thrown into the awaiter through the normal `curl_async_event_t`
+  delivery path (`curl_async.c:1104`).
 - **#118 — getaddrinfo event-struct leak on reactor shutdown (NTS).** The
   `async_dns_addrinfo_t` (288 B) was never freed when a coroutine cancelled
   a DNS resolution and the reactor shut down before libuv's threadpool

From ccec14e7fafedb92098f069daca798819067853c Mon Sep 17 00:00:00 2001
From: Edmond <1571649+EdmondDantes@users.noreply.github.com>
Date: Fri, 15 May 2026 12:13:03 +0000
Subject: [PATCH 3/3] ci: #118 install libiconv via brew so macOS configure finds it

MACOS_NTS job was failing at the Configure PHP step with 'Please specify
the install prefix of iconv with --with-iconv=' because the workflow
passes --with-iconv=$(brew --prefix)/opt/libiconv but libiconv was not
in the brew install list. LINUX_NTS_ASAN and LINUX_ZTS_JIT are green;
this unblocks verification of the curl xferinfo/progress macOS fix.
---
 .github/workflows/debug-bugfix.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/debug-bugfix.yml b/.github/workflows/debug-bugfix.yml
index 8631b52..ecd2c0a 100644
--- a/.github/workflows/debug-bugfix.yml
+++ b/.github/workflows/debug-bugfix.yml
@@ -80,7 +80,7 @@ jobs:
       - name: Install build dependencies (macOS)
         if: runner.os == 'macOS'
         run: |
-          brew install autoconf bison re2c libuv curl oniguruma openssl@3 || true
+          brew install autoconf bison re2c libuv curl oniguruma openssl@3 libiconv || true
           # bison is keg-only on macOS; the system bison (2.3) is too old.
           echo "$(brew --prefix bison)/bin" >> "$GITHUB_PATH"
 
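
Note on the two-phase drain from PATCH 1/3: the budget logic can be modeled without libuv, which may help when reasoning about the "leave the loop open" branch. The sketch below is not the extension's code. `mock_loop_t`, `mock_loop_alive`, `mock_run_nowait`, `mock_run_once` and `two_phase_drain` are hypothetical names introduced here, and the mock only assumes that a NOWAIT pass runs ready callbacks while a ONCE pass may also wait on a threadpool completion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the event loop; not a libuv type. */
typedef struct {
    int ready;              /* callbacks runnable without waiting */
    int worker_waits_left;  /* blocking iterations until the worker's
                             * completion arrives; 0 means no worker */
} mock_loop_t;

static bool mock_loop_alive(const mock_loop_t *loop)
{
    return loop->ready > 0 || loop->worker_waits_left > 0;
}

/* Models UV_RUN_NOWAIT: run one ready callback, never wait. */
static void mock_run_nowait(mock_loop_t *loop)
{
    if (loop->ready > 0) {
        loop->ready--;
    }
}

/* Models UV_RUN_ONCE: run a ready callback if there is one, otherwise
 * spend one blocking iteration waiting on the worker. */
static void mock_run_once(mock_loop_t *loop)
{
    if (loop->ready > 0) {
        loop->ready--;
    } else if (loop->worker_waits_left > 0) {
        loop->worker_waits_left--;
    }
}

/* Returns true when the loop drained within both budgets and may be
 * closed; false means the caller should leave it open rather than free
 * structs a worker may still touch. */
static bool two_phase_drain(mock_loop_t *loop, int nowait_budget, int once_budget)
{
    /* Phase 1: bounded non-blocking passes for ready callbacks. */
    for (int i = 0; i < nowait_budget && mock_loop_alive(loop); i++) {
        mock_run_nowait(loop);
    }
    /* Phase 2: bounded blocking passes for threadpool completions. */
    for (int i = 0; i < once_budget && mock_loop_alive(loop); i++) {
        mock_run_once(loop);
    }
    return !mock_loop_alive(loop);
}
```

With the budgets used in the patch (100 and 500), a mock loop whose worker needs 40 more waits drains and returns true, while one that needs 1000 returns false with 500 waits still outstanding, which corresponds to the branch that skips uv_loop_close().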