From d3b04ef7e5d17032ae3e428dea1b1d078f2e6e76 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Fri, 22 May 2026 20:04:53 +0000 Subject: [PATCH 1/8] #136 io-chaos: protocol-level HTTP coverage for async ext/curl MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the CURL half of the #136 coverage gap: nothing previously exercised async ext/curl under the random scheduler, cancellation, or a mid-response connection failure, even though every async curl_exec() goes through the libuv reactor. EvilPeer gains an `http` mode (EvilPeer::serveHttp): it drains one HTTP request and writes back an HTTP/1.1 response. The serve-mode body toxics (slice / drip / abrupt close / hard reset / forked peer / Toxiproxy) are joined by HTTP-specific ones — chunked transfer-encoding, a mendacious Content-Length (over/under-stated), dribbled headers, an arbitrary status. New chaos topic fuzzy-tests/curl/ with http_chaos.feature (12 scenarios): a reactor-driven curl client (`coroutine "C" fetches peer "EP" over HTTP`) exercised against every toxic, under the random scheduler, and cancelled mid-transfer. Generated .phpt carry a curl --SKIPIF-- probe. Harness: mode='http' dispatch in Context::run() and forked_peer.php, eight HTTP Given steps, the curlGet() cancellation-aware client routine, an HTTP status assertion, SKIP_RULES['curl']. Found a real bug: async curl dropped all but the first chunk of a chunked-encoded response body (CURLE_WRITE_ERROR) — fixed in php-src ext/curl/curl_async.c; see fuzzy-tests/FINDINGS.md. 
--- CHANGELOG.md | 1 + fuzzy-tests/FINDINGS.md | 30 ++++ fuzzy-tests/_harness/Context.php | 14 +- fuzzy-tests/_harness/Steps.php | 194 +++++++++++++++++++++++ fuzzy-tests/_harness/generate.php | 1 + fuzzy-tests/_peers/EvilPeer.php | 143 +++++++++++++++++ fuzzy-tests/_peers/forked_peer.php | 10 +- fuzzy-tests/curl/http_chaos.feature | 228 ++++++++++++++++++++++++++++ 8 files changed, 610 insertions(+), 11 deletions(-) create mode 100644 fuzzy-tests/curl/http_chaos.feature diff --git a/CHANGELOG.md b/CHANGELOG.md index 4c601a4..b41c45b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added +- **#136 HTTP chaos: async ext/curl coverage** — EvilPeer gains an `http` mode (`EvilPeer::serveHttp`): it drains one HTTP request and writes back an HTTP/1.1 response, with the serve-mode body toxics (slice/drip/abrupt close/hard reset/forked peer/Toxiproxy) joined by HTTP-specific ones — chunked transfer-encoding, a mendacious `Content-Length` (over/under-stated), dribbled headers, an arbitrary status code. New chaos topic `fuzzy-tests/curl/` with `http_chaos.feature`: a reactor-driven `ext/curl` client (`coroutine "C" fetches peer "EP" over HTTP`) is exercised against every toxic, under the random scheduler, and cancelled mid-transfer — closing the CURL half of the #136 coverage gap (nothing previously exercised async curl under chaos). Generated `.phpt` carry a `--SKIPIF--` curl probe. **Found a real bug:** async curl dropped all but the first chunk of a chunked-encoded response body (`CURLE_WRITE_ERROR`) — fixed in php-src `ext/curl/curl_async.c` (`#136`); see `fuzzy-tests/FINDINGS.md`. - **#127 I/O chaos: EvilPeer + transport×logic crossing** — new `fuzzy-tests/_peers/EvilPeer.php`, a deliberately misbehaving network peer driven by a declarative fault table. 
Toxics: payload slicing, inter-chunk drip delay, abrupt mid-stream close (`reset`); parameters accept the seeded-random fuzz syntax (`random:N`, `1|5`). New chaos topic `fuzzy-tests/io/`: `evil_peer.feature` (sliced/dripped stream reassembled exactly), `abrupt_close.feature` (dropped connection → clean payload prefix, no hang), and `combined_chaos.feature` — **crosses transport chaos with logic chaos**: toxic-selection mutation blocks × client-logic mutation blocks × the random scheduler, all checked against a fixed payload oracle. On a failure the executor prints a **chaos event log** — the exact low-level toxic sequence the EvilPeer played out plus each client's I/O trace. Harness gains `defineEvilPeer()` + a prep-phase that binds each peer's listening socket and serves it from a coroutine. - **#129 I/O chaos: Toxiproxy transport-level fault injection** — new `fuzzy-tests/_peers/ToxiproxyClient.php`, a minimal HTTP-API client for [Toxiproxy](https://github.com/Shopify/toxiproxy). An EvilPeer can now be fronted by a Toxiproxy proxy (`client → proxy → peer`), injecting transport faults a pure-PHP peer cannot reproduce precisely: real bandwidth throttling, latency with jitter, TCP-segment slicing, `limit_data` byte-counted truncation, `reset_peer` timed RST. New steps `evil peer "EP" is fronted by Toxiproxy` / `Toxiproxy throttles|adds latency to|slices|cuts off|resets peer "EP" …`; new feature `fuzzy-tests/io/toxiproxy.feature`. Opt-in by design: every generated `.phpt` carries a `--SKIPIF--` probe (`SKIP_RULES['toxiproxy']`) and skips wherever no Toxiproxy admin endpoint answers — so the suite never gates per-PR CI. A dedicated `nightly-io-chaos.yml` workflow stands Toxiproxy up and runs the suite under FIFO + four random scheduler seeds. Closes the last item of #129. - **#107 `ThreadPool` workers auto-detect** — `workers` is optional (default `0` → `Async\available_parallelism()`). 
diff --git a/fuzzy-tests/FINDINGS.md b/fuzzy-tests/FINDINGS.md index a52811d..3482a9b 100644 --- a/fuzzy-tests/FINDINGS.md +++ b/fuzzy-tests/FINDINGS.md @@ -66,3 +66,33 @@ Fixed in `php_stdiop_write()` / `php_stdiop_read()`: re-suspend until *this* request completed. Regression test `tests/io/083-concurrent_async_write.phpt`. The structural fix — per-request completion events instead of the broadcast — is tracked in true-async/php-async#130. + +## Async curl drops a chunked response body (real bug — fixed) + +The new `curl/http_chaos.feature` (issue #136) drives an async `ext/curl` +client against the EvilPeer in its `http` mode. Every scenario passed except +the three "chunked transfer encoding" rows, which failed with +`curl_get_ok == 0`: curl reported `CURLE_WRITE_ERROR` — +*"Failure writing output to destination, passed 272 returned 17"* — after +delivering only the first 17-byte chunk to the `CURLOPT_WRITEFUNCTION` +callback. The same program on stock PHP 8.3 returns the whole body. + +Root cause in `ext/curl/curl_async.c`. The async write path uses libcurl's +`CURL_WRITEFUNC_PAUSE` / unpause pattern: the first `curl_write` call copies +the data, spawns a coroutine for the PHP callback and returns `PAUSE`; the +completion callback stores the callback's return value and unpauses, and the +*re-call* returns that stored value. The re-call branch assumed libcurl +re-delivers exactly the slice that was paused on — but on unpause libcurl +re-delivers the whole paused window **and coalesces any freshly decoded data +into it**. With chunked transfer-encoding the de-chunker produces many small +pieces, so the re-call carried 272 bytes while the stored result was 17 → +`passed 272 returned 17` → `CURLE_WRITE_ERROR`. (A fixed Content-Length body +arrives one network read at a time, one write callback per reactor wakeup, +so it never tripped — only chunked decoding coalesces.) 
+ +Fixed in `curl_async_write_user()`: the re-call now tracks a +`consumed_offset` through the (possibly grown) window — it reports the full +length back to libcurl only once the PHP callback has accepted every byte, +otherwise it feeds the remainder through another callback slice. A genuine +short return / exception still surfaces verbatim via a new `aborted` flag. +Tracked in php-src as `#136`. diff --git a/fuzzy-tests/_harness/Context.php b/fuzzy-tests/_harness/Context.php index 390e04e..d0bd635 100644 --- a/fuzzy-tests/_harness/Context.php +++ b/fuzzy-tests/_harness/Context.php @@ -99,7 +99,7 @@ final class Context { * strategies referenced for the scenario's lifetime */ public array $poolStrategies = []; - /** @var array + /** @var array * EvilPeer fault tables, keyed by peer name. */ public array $evilPeerDefs = []; @@ -228,6 +228,8 @@ public function defineEvilPeer(string $name): void { 'payload' => '', 'slice' => 0, 'delay' => 0, 'reset' => -1, 'hold' => 0, 'hardReset' => false, 'mode' => 'serve', 'forked' => false, 'toxiproxy' => false, + 'httpStatus' => 200, 'httpChunked' => false, + 'httpClenLie' => 0, 'httpHeaderDelay' => 0, ]; } } @@ -492,11 +494,11 @@ public function run(): void { return; } $self->inc("evil_peer_served_$name"); - if (($spec['mode'] ?? 'serve') === 'consume') { - EvilPeer::consume($conn, $spec, $self, $name); - } else { - EvilPeer::serve($conn, $spec, $self, $name); - } + match ($spec['mode'] ?? 
'serve') { + 'consume' => EvilPeer::consume($conn, $spec, $self, $name), + 'http' => EvilPeer::serveHttp($conn, $spec, $self, $name), + default => EvilPeer::serve($conn, $spec, $self, $name), + }; }); } diff --git a/fuzzy-tests/_harness/Steps.php b/fuzzy-tests/_harness/Steps.php index bf562bc..6f8a416 100644 --- a/fuzzy-tests/_harness/Steps.php +++ b/fuzzy-tests/_harness/Steps.php @@ -383,6 +383,96 @@ function(Context $ctx, string $name, string $msExpr) { }) ->requires('toxiproxy'); + // ---- Evil HTTP peer: an EvilPeer that speaks HTTP/1.1 ---- + // The peer drains one HTTP request, then writes back a response. The + // body-level toxics ("slices output", "delays ms between chunks", + // "closes abruptly after N bytes", "uses a hard reset", "runs as a + // forked peer", every Toxiproxy step) all reuse the serve-mode steps + // above — they only set keys, mode-agnostic. The steps below add the + // HTTP-specific framing and toxics. An async ext/curl client driven by + // the reactor faces this peer. + + // Given an evil HTTP peer "EP" serving N bytes + $r->on('/^an evil HTTP peer "([^"]+)" serving (\S+) bytes$/', + function(Context $ctx, string $name, string $nExpr) { + $n = (int)$ctx->resolver->resolve($nExpr); + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $payload = ''; + for ($i = 0; $i < $n; $i++) { + $payload .= chr(33 + ($i % 94)); // printable ASCII cycle + } + $ctx->evilPeerDefs[$name]['payload'] = $payload; + }) + ->requires('curl'); + + // Given an evil HTTP peer "EP" serving "body" + $r->on('/^an evil HTTP peer "([^"]+)" serving "([^"]*)"$/', + function(Context $ctx, string $name, string $body) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['payload'] = $body; + }) + ->requires('curl'); + + // Given evil HTTP peer "EP" responds with status N + // The peer answers with an arbitrary HTTP status. 
curl still completes + // the transaction successfully (errno 0) — a 4xx/5xx is a valid HTTP + // response, not a transport error. + $r->on('/^evil HTTP peer "([^"]+)" responds with status (\S+)$/', + function(Context $ctx, string $name, string $sExpr) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['httpStatus'] = (int)$ctx->resolver->resolve($sExpr); + }) + ->requires('curl'); + + // Given evil HTTP peer "EP" uses chunked transfer encoding + // The body arrives Transfer-Encoding: chunked; curl must de-chunk it + // back to the exact byte stream regardless of how it was framed. + $r->on('/^evil HTTP peer "([^"]+)" uses chunked transfer encoding$/', + function(Context $ctx, string $name) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['httpChunked'] = true; + }) + ->requires('curl'); + + // Given evil HTTP peer "EP" overstates Content-Length by N bytes + // A mendacious header — the peer promises more than it delivers. curl + // waits for bytes that never come and must report CURLE_PARTIAL_FILE, + // never hang. + $r->on('/^evil HTTP peer "([^"]+)" overstates Content-Length by (\S+) bytes$/', + function(Context $ctx, string $name, string $nExpr) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['httpClenLie'] = (int)$ctx->resolver->resolve($nExpr); + }) + ->requires('curl'); + + // Given evil HTTP peer "EP" understates Content-Length by N bytes + // The peer promises fewer bytes than it sends; curl stops reading at + // the advertised length, so the client sees a clean prefix. 
+ $r->on('/^evil HTTP peer "([^"]+)" understates Content-Length by (\S+) bytes$/', + function(Context $ctx, string $name, string $nExpr) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['httpClenLie'] = -(int)$ctx->resolver->resolve($nExpr); + }) + ->requires('curl'); + + // Given evil HTTP peer "EP" delays N ms mid-headers + // Slow-headers toxic: the response status line and headers dribble in + // over two TCP writes with a pause between. curl's header parser must + // stay interruptible and reassemble them correctly. + $r->on('/^evil HTTP peer "([^"]+)" delays (\S+) ms mid-headers$/', + function(Context $ctx, string $name, string $nExpr) { + $ctx->defineEvilPeer($name); + $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->evilPeerDefs[$name]['httpHeaderDelay'] = (int)$ctx->resolver->resolve($nExpr); + }) + ->requires('curl'); + // ---- When: actions inside a coroutine ---- // When coroutine "X" downloads from peer "EP" @@ -410,6 +500,23 @@ function(Context $ctx, string $coro, string $peer) { }); }); + // When coroutine "X" fetches peer "EP" over HTTP + // Runs an async ext/curl GET against the evil HTTP peer. The body is + // captured incrementally through CURLOPT_WRITEFUNCTION, so a truncated + // or cancelled transfer still leaves the prefix that did arrive. + // Cancellation-aware: a cancel mid-request lands in curl_get_cancelled. + // The liveness invariant + // curl_get_ok + curl_get_cancelled + curl_get_failed + // + curl_get_no_peer == curl_get_attempts + // therefore holds for every interleaving. + $r->on('/^coroutine "([^"]+)" fetches peer "([^"]+)" over HTTP$/', + function(Context $ctx, string $coro, string $peer) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $peer) { + StandardSteps::curlGet($ctx, $coro, $peer); + }); + }) + ->requires('curl'); + // When coroutine "X" uploads N bytes to peer "EP" // Connects and writes N bytes in a single fwrite(). 
Against a slow or // never-reading consume-mode peer that fwrite() suspends on a full @@ -2942,6 +3049,20 @@ function(Context $ctx, string $coro, string $peer) { } }); + // Then coroutine "X" received HTTP status N + // The curl client stashes the response status code into the + // curl_http_code_$coro counter; a 4xx/5xx is a valid response, so this + // is decidable independently of the transport-level outcome bucket. + $r->on('/^coroutine "([^"]+)" received HTTP status (\S+)$/', + function(Context $ctx, string $coro, string $sExpr) { + $want = (int)$ctx->resolver->resolve($sExpr); + $got = $ctx->counters["curl_http_code_$coro"] ?? 0; + if ($got !== $want) { + throw new \RuntimeException(sprintf( + "coroutine %s expected HTTP status %d, got %d", $coro, $want, $got)); + } + }); + // Then group "G" is finished $r->on('/^group "([^"]+)" is finished$/', function(Context $ctx, string $name) { @@ -3169,4 +3290,77 @@ public static function ioUpload(Context $ctx, string $coro, string $peer, int $b $coro, $peer, $bytes, $writeSize, $writes, $sent, $outcome); } } + + /** + * Shared async-curl routine used by the "fetches peer over HTTP" step. + * Runs one ext/curl GET against an evil HTTP peer; ext/async drives the + * transfer through the libuv reactor, so the coroutine yields for the + * duration and a concurrent killer can cancel it mid-request. + * + * The body is captured incrementally via CURLOPT_WRITEFUNCTION into + * $ctx->ioData so a truncated / cancelled transfer still leaves the prefix + * that arrived — the same clean-prefix invariant the raw-socket download + * uses. The outcome is bucketed into exactly one counter: + * curl_get_ok — curl_errno() == 0 (a 4xx/5xx still counts) + * curl_get_cancelled — AsyncCancellation delivered into the transfer + * curl_get_failed — any curl error / other throwable + * curl_get_no_peer — peer address never resolved + * so curl_get_ok + cancelled + failed + no_peer == curl_get_attempts for + * every interleaving. 
The response status is stashed separately into + * curl_http_code_$coro. + */ + public static function curlGet(Context $ctx, string $coro, string $peer): void { + $ctx->inc("curl_get_attempts_$coro"); + // Define the body slot up front so a clean-prefix assertion stays valid + // even when the request never produces a byte. + $ctx->ioData[$coro] = ''; + $addr = $ctx->evilPeerAddr[$peer] ?? null; + if ($addr === null) { + $ctx->inc("curl_get_no_peer_$coro"); + return; + } + $buf = ''; + $outcome = 'ok'; + $errno = 0; + $httpCode = 0; + $ch = null; + try { + $ch = curl_init(); + curl_setopt($ch, CURLOPT_URL, 'http://' . $addr . '/'); + curl_setopt($ch, CURLOPT_TIMEOUT, 5); + curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 5); + // Append every delivered chunk; returning a short count would make + // curl abort, so always report the full length back. + curl_setopt($ch, CURLOPT_WRITEFUNCTION, + function($ch, string $data) use (&$buf) { + $buf .= $data; + return strlen($data); + }); + curl_exec($ch); + $errno = curl_errno($ch); + $httpCode = (int)curl_getinfo($ch, CURLINFO_HTTP_CODE); + if ($errno === 0) { + $ctx->inc("curl_get_ok_$coro"); + } else { + $outcome = 'failed'; + $ctx->inc("curl_get_failed_$coro"); + } + } catch (\Async\AsyncCancellation $e) { + $outcome = 'cancelled'; + $ctx->inc("curl_get_cancelled_$coro"); + } catch (\Throwable $e) { + $outcome = 'failed'; + $ctx->inc("curl_get_failed_$coro"); + } finally { + if ($ch instanceof \CurlHandle) { + @curl_close($ch); + } + $ctx->ioData[$coro] = $buf; + $ctx->inc("curl_recv_bytes_$coro", strlen($buf)); + $ctx->counters["curl_http_code_$coro"] = $httpCode; + $ctx->events[] = sprintf( + 'curl %s: peer=%s http=%d errno=%d recv=%dB outcome=%s', + $coro, $peer, $httpCode, $errno, strlen($buf), $outcome); + } + } } diff --git a/fuzzy-tests/_harness/generate.php b/fuzzy-tests/_harness/generate.php index 562eb99..3fb1cce 100644 --- a/fuzzy-tests/_harness/generate.php +++ b/fuzzy-tests/_harness/generate.php @@ -186,6 +186,7 @@ 
function findFeatures(string $root): array { 'unix-sockets' => 'if (PHP_OS_FAMILY === "Windows") { echo "skip unix-domain sockets not supported"; exit; }', 'tcp' => '/* TCP loopback is portable; no skip */', 'sockets' => 'if (!function_exists("socket_import_stream")) { echo "skip ext/sockets required"; exit; }', + 'curl' => 'if (!extension_loaded("curl")) { echo "skip ext/curl required"; exit; }', 'fork' => 'if (!function_exists("pcntl_fork")) { echo "skip fork() not available"; exit; }', 'tty' => 'if (PHP_OS_FAMILY === "Windows") { echo "skip TTY semantics differ on Windows"; exit; }', 'zts' => 'if (!ZEND_THREAD_SAFE) { echo "skip requires Thread-Safe (ZTS) PHP build"; exit; }', diff --git a/fuzzy-tests/_peers/EvilPeer.php b/fuzzy-tests/_peers/EvilPeer.php index 8191bb0..561c00d 100644 --- a/fuzzy-tests/_peers/EvilPeer.php +++ b/fuzzy-tests/_peers/EvilPeer.php @@ -20,6 +20,25 @@ * is the upload / back-pressure path — a peer that drains its * receive buffer slowly makes the client's fwrite() block on a * full send buffer, exercising the reactor's write-wait hook. + * - http: the peer drains one HTTP request, then writes back an HTTP/1.1 + * response — status line + headers + body. The same serve-mode + * toxics apply to the body (slice/delay/reset/hardReset); on top + * of them sit HTTP-specific toxics (see below). This is the path + * an async ext/curl client exercises through the reactor. + * + * HTTP-mode fault table keys (in addition to the serve keys above, which + * apply to the response body): + * httpStatus : int — response status code (default 200) + * httpChunked : bool — deliver the body with Transfer-Encoding: chunked + * instead of a fixed Content-Length + * httpClenLie : int — delta added to the Content-Length header; a + * positive value over-promises (curl waits for + * bytes that never come → CURLE_PARTIAL_FILE), a + * negative one under-promises (curl stops early → + * a clean prefix). 0 = honest header. 
+ * httpHeaderDelay : int — ms to pause partway through the header block, + * so the response headers dribble in (slow-headers + * toxic). 0 = headers sent in one write. * * Fault table keys (serve mode): * payload : string — the bytes the peer would deliver @@ -199,4 +218,128 @@ public static function consume($conn, array $spec, ?Context $ctx = null, string $name, $rate, $delay, $reset, $hold, (int) $hardReset, $total, implode(' ', $trace)); } } + + /** + * HTTP-mode: drain one HTTP request off the accepted connection, then play + * out an HTTP/1.1 response through the configured toxics. An async ext/curl + * client driven by the reactor faces this peer; the body toxics are exactly + * the serve-mode ones (slice/delay/reset/hardReset), with the HTTP-specific + * toxics layered on top (see the class docblock). + * + * @param resource $conn an accepted stream-socket connection + * @param array $spec the fault table + * @param Context|null $ctx scenario context for the chaos event log + * @param string $name peer name, used in the log line + */ + public static function serveHttp($conn, array $spec, ?Context $ctx = null, string $name = 'peer'): void { + $body = $spec['payload'] ?? ''; + $slice = $spec['slice'] ?? 0; + $delay = $spec['delay'] ?? 0; + $reset = $spec['reset'] ?? -1; + $hardReset = $spec['hardReset'] ?? false; + $forked = $spec['forked'] ?? false; + $status = $spec['httpStatus'] ?? 200; + $chunked = $spec['httpChunked'] ?? false; + $clenLie = $spec['httpClenLie'] ?? 0; + $hdrDelay = $spec['httpHeaderDelay'] ?? 0; + + $trace = []; + + // Drain the request — request line + headers, then a body if the + // client announced a Content-Length (so a POST does not leave bytes + // unread, which a hard close would otherwise turn into an RST). 
+ $req = ''; + while (strpos($req, "\r\n\r\n") === false && strlen($req) < 65536) { + $chunk = @fread($conn, 4096); + if ($chunk === false || $chunk === '') { + break; + } + $req .= $chunk; + } + if (preg_match('/Content-Length:\s*(\d+)/i', $req, $m)) { + $want = (int) $m[1]; + $have = strlen($req) - (int) strpos($req, "\r\n\r\n") - 4; + while ($have < $want) { + $chunk = @fread($conn, min(4096, $want - $have)); + if ($chunk === false || $chunk === '') { + break; + } + $have += strlen($chunk); + } + } + $trace[] = 'req' . strlen($req) . 'B'; + + // Status line + headers. Either a (possibly mendacious) Content-Length + // or chunked framing — never both. + $head = "HTTP/1.1 $status " . self::reason($status) . "\r\n"; + $head .= "Content-Type: application/octet-stream\r\n"; + if ($chunked) { + $head .= "Transfer-Encoding: chunked\r\n"; + } else { + $head .= 'Content-Length: ' . max(0, strlen($body) + $clenLie) . "\r\n"; + } + $head .= "Connection: close\r\n\r\n"; + + // Slow-headers toxic: split the header block and pause in the middle, + // so the response status/headers dribble in over two TCP writes. + if ($hdrDelay > 0 && strlen($head) > 2) { + $cut = intdiv(strlen($head), 2); + @fwrite($conn, substr($head, 0, $cut)); + self::pause($hdrDelay, $forked); + @fwrite($conn, substr($head, $cut)); + $trace[] = 'hdr-split d' . $hdrDelay; + } else { + @fwrite($conn, $head); + $trace[] = 'hdr'; + } + + // Body. A reset toxic caps delivery at `reset` bytes; otherwise the + // whole body goes out, optionally sliced/dripped and chunk-framed. + $len = strlen($body); + if ($reset >= 0 && $reset < $len) { + $len = $reset; + } + $step = $slice > 0 ? $slice : ($len > 0 ? $len : 1); + for ($off = 0; $off < $len; $off += $step) { + $n = min($step, $len - $off); + $piece = substr($body, $off, $n); + @fwrite($conn, $chunked ? sprintf("%x\r\n%s\r\n", $n, $piece) : $piece); + $trace[] = 'w' . 
$n; + if ($delay > 0 && $off + $step < $len) { + self::pause($delay, $forked); + $trace[] = 'd' . $delay; + } + } + // The chunked terminator only goes out on an untruncated response — a + // reset toxic closes mid-body, so the client must see a partial file. + $truncated = ($reset >= 0 && $reset < strlen($body)); + if ($chunked && !$truncated) { + @fwrite($conn, "0\r\n\r\n"); + $trace[] = 'chunk-end'; + } + + // Close — a hard-reset toxic forces an RST instead of a graceful FIN. + $resetSock = $hardReset ? self::armHardReset($conn) : null; + @fclose($conn); + $trace[] = $hardReset ? "hard-close@$len" : "close@$len"; + + if ($ctx !== null) { + $ctx->events[] = sprintf( + 'evil-http %s: status=%d body=%dB chunked=%d clenLie=%d slice=%d delay=%d reset=%d hdrDelay=%d | %s', + $name, $status, strlen($body), (int) $chunked, $clenLie, + $slice, $delay, $reset, $hdrDelay, implode(' ', $trace)); + } + } + + /** Reason phrase for the HTTP status codes the chaos suite uses. */ + private static function reason(int $status): string { + return [ + 200 => 'OK', 204 => 'No Content', + 301 => 'Moved Permanently', 302 => 'Found', + 400 => 'Bad Request', 403 => 'Forbidden', 404 => 'Not Found', + 418 => "I'm a teapot", + 500 => 'Internal Server Error', 502 => 'Bad Gateway', + 503 => 'Service Unavailable', + ][$status] ?? 'Status'; + } } diff --git a/fuzzy-tests/_peers/forked_peer.php b/fuzzy-tests/_peers/forked_peer.php index f81cbda..96522ea 100644 --- a/fuzzy-tests/_peers/forked_peer.php +++ b/fuzzy-tests/_peers/forked_peer.php @@ -38,11 +38,11 @@ $conn = @stream_socket_accept($server, 10); if ($conn !== false) { - if (($spec['mode'] ?? 'serve') === 'consume') { - EvilPeer::consume($conn, $spec); - } else { - EvilPeer::serve($conn, $spec); - } + match ($spec['mode'] ?? 
'serve') { + 'consume' => EvilPeer::consume($conn, $spec), + 'http' => EvilPeer::serveHttp($conn, $spec), + default => EvilPeer::serve($conn, $spec), + }; } @fclose($server); diff --git a/fuzzy-tests/curl/http_chaos.feature b/fuzzy-tests/curl/http_chaos.feature new file mode 100644 index 0000000..e9c6188 --- /dev/null +++ b/fuzzy-tests/curl/http_chaos.feature @@ -0,0 +1,228 @@ +Feature: HTTP chaos — an async ext/curl client against a misbehaving HTTP peer + + The io/ features drive a raw-TCP client against the EvilPeer. This feature + closes the gap tracked in issue #136: nothing exercised ext/curl under the + random scheduler, cancellation, or a mid-response connection failure, even + though every async curl_exec() goes through the libuv reactor. + + The peer here is the same EvilPeer in its `http` mode: it drains one HTTP + request, then writes back an HTTP/1.1 response. The body-level toxics are + the serve-mode ones (slicing, drip delay, abrupt close, hard reset, forked + peer, every Toxiproxy transport toxic); on top of them sit HTTP-specific + toxics — chunked framing, a mendacious Content-Length, dribbled headers, + an arbitrary status code. + + Invariants, decidable regardless of interleaving: + - a non-truncating response is de-framed back to the exact body bytes; + - a truncating fault (mid-body close / reset / over-promised length) + surfaces as a clean curl error, never a hang — and whatever bytes the + write callback captured are a clean prefix of the body; + - a cancel mid-fetch leaves the coroutine completed and unorphaned, and + the outcome buckets sum to the attempt count; + - the response status code is reported faithfully (a 4xx/5xx is a valid + HTTP transaction, not a transport error). 
+ + Scenario: a clean HTTP response arrives intact + Given an evil HTTP peer "EP" serving 4096 bytes + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_attempts_C" equals 1 + And counter "curl_get_ok_C" equals 1 + And counter "curl_recv_bytes_C" equals 4096 + And coroutine "C" received HTTP status 200 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Scenario: a sliced and dripped body is reassembled exactly + # The peer drips the body in small application-level chunks; curl must + # reassemble the byte stream regardless of how it was fragmented. + Given an evil HTTP peer "EP" serving 2048 bytes + And evil peer "EP" slices output into 64-byte chunks + And evil peer "EP" delays 2 ms between chunks + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_ok_C" equals 1 + And counter "curl_recv_bytes_C" equals 2048 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Scenario Outline: a chunked-encoded body is de-chunked exactly + # Transfer-Encoding: chunked — curl must de-chunk back to the exact bytes + # whatever the chunk size, including a 1-byte-per-chunk worst case. + Given an evil HTTP peer "EP" serving 1500 bytes + And evil HTTP peer "EP" uses chunked transfer encoding + And evil peer "EP" slices output into <chunk>-byte chunks + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_ok_C" equals 1 + And counter "curl_recv_bytes_C" equals 1500 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Examples: + | chunk | + | 1 | + | 17 | + | 512 | + + Scenario Outline: an arbitrary status code completes the transaction + # A 4xx/5xx is a valid HTTP response — curl finishes with errno 0 and the + # body still arrives intact; only the reported status differs. 
+ Given an evil HTTP peer "EP" serving 128 bytes + And evil HTTP peer "EP" responds with status <status> + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_ok_C" equals 1 + And coroutine "C" received HTTP status <status> + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Examples: + | status | + | 200 | + | 404 | + | 500 | + | 503 | + + Scenario: slow-dribbled response headers do not corrupt the body + # The status line and headers arrive over two TCP writes with a pause + # between — curl's header parser must reassemble them and still hand back + # the exact body. + Given an evil HTTP peer "EP" serving 1024 bytes + And evil HTTP peer "EP" delays 5 ms mid-headers + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_ok_C" equals 1 + And coroutine "C" received HTTP status 200 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Scenario: a peer closing mid-body surfaces as a clean partial-file error + # Honest Content-Length, but the peer closes after only 768 of 2048 body + # bytes. curl must report a transport error (not errno 0, not a hang) and + # whatever the write callback captured is a clean prefix of the body. + Given an evil HTTP peer "EP" serving 2048 bytes + And evil peer "EP" closes abruptly after 768 bytes + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_attempts_C" equals 1 + And counter "curl_get_failed_C" equals 1 + And counter "curl_recv_bytes_C" equals 768 + And coroutine "C" received a clean prefix of peer "EP" + And no orphan coroutines + + Scenario: an over-promised Content-Length errors without a hang + # The peer delivers the whole body but advertises 512 bytes more. curl + # waits for bytes that never come; when the connection closes it must + # report a partial-file error — the body it did receive is still intact. 
+ Given an evil HTTP peer "EP" serving 1024 bytes + And evil HTTP peer "EP" overstates Content-Length by 512 bytes + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_attempts_C" equals 1 + And counter "curl_get_failed_C" equals 1 + And counter "curl_recv_bytes_C" equals 1024 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines + + Scenario: an under-promised Content-Length yields a clean prefix + # The peer advertises 256 bytes fewer than it sends; curl stops at the + # advertised length and finishes cleanly with a prefix of the body. + Given an evil HTTP peer "EP" serving 1024 bytes + And evil HTTP peer "EP" understates Content-Length by 256 bytes + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_ok_C" equals 1 + And counter "curl_recv_bytes_C" equals 768 + And coroutine "C" received a clean prefix of peer "EP" + And no orphan coroutines + + Scenario: a hard reset mid-body terminates the fetch cleanly + # An RST (SO_LINGER 0) instead of a graceful FIN, mid-body. curl must + # surface a transport error and the coroutine must terminate — no UAF in + # the reactor's curl request, no hang. + Given an evil HTTP peer "EP" serving 4096 bytes + And evil peer "EP" slices output into 256-byte chunks + And evil peer "EP" delays 1 ms between chunks + And evil peer "EP" closes abruptly after 1024 bytes + And evil peer "EP" uses a hard reset + And a coroutine "C" + When coroutine "C" fetches peer "EP" over HTTP + Then counter "curl_get_attempts_C" equals 1 + And counter "curl_get_failed_C" equals 1 + And coroutine "C" received a clean prefix of peer "EP" + And coroutine "C" is completed + And no orphan coroutines + + Scenario Outline: cancel a coroutine mid-HTTP-fetch + # A killer cancels the fetching coroutine while curl is parked in the + # reactor waiting on a dripped response. 
The cancel must be delivered into + # the curl wait; the coroutine terminates via its catch block. Under the + # random scheduler the cancel can land before, during, or after the + # transfer — so only the liveness sum and the no-hang/no-orphan invariants + # are decidable. + Given an evil HTTP peer "EP" serving 2048 bytes + And evil peer "EP" slices output into 32-byte chunks + And evil peer "EP" delays 3 ms between chunks + And a coroutine "C" + And a coroutine "K" + When coroutine "C" fetches peer "EP" over HTTP + And coroutine "K" sleeps <ms> ms + And coroutine "K" cancels coroutine "C" + Then counter "curl_get_ok_C" plus counter "curl_get_cancelled_C" plus counter "curl_get_failed_C" plus counter "curl_get_no_peer_C" equals counter "curl_get_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Examples: + | ms | + | 0 | + | 5 | + | 25 | + | 80 | + + Scenario: many concurrent HTTP fetches under the random scheduler + # Three curl clients race three evil HTTP peers, each applying a different + # non-truncating toxic. Every transfer must still complete with its body + # intact regardless of how the scheduler interleaves the reactor work. 
+ Given an evil HTTP peer "EP1" serving 1024 bytes + And evil peer "EP1" slices output into 48-byte chunks + And an evil HTTP peer "EP2" serving 1024 bytes + And evil HTTP peer "EP2" uses chunked transfer encoding + And evil peer "EP2" slices output into 96-byte chunks + And an evil HTTP peer "EP3" serving 1024 bytes + And evil HTTP peer "EP3" delays 4 ms mid-headers + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" fetches peer "EP1" over HTTP + And coroutine "C2" fetches peer "EP2" over HTTP + And coroutine "C3" fetches peer "EP3" over HTTP + Then counter "curl_get_ok_C1" equals 1 + And counter "curl_get_ok_C2" equals 1 + And counter "curl_get_ok_C3" equals 1 + And coroutine "C1" received the payload of peer "EP1" intact + And coroutine "C2" received the payload of peer "EP2" intact + And coroutine "C3" received the payload of peer "EP3" intact + And no orphan coroutines + + Scenario: HTTP toxics crossed with logic and scheduler chaos + # Three chaos axes around a fixed 512-byte oracle: which non-truncating + # HTTP toxic the peer applies, whether a sibling coroutine perturbs the + # scheduler, and the interleaving. None of the toxics truncate, so the + # exact-value invariant stays decidable across the whole cross-product. 
+ Given an evil HTTP peer "EP" serving 512 bytes + One of: + - evil peer "EP" slices output into 32-byte chunks + - evil HTTP peer "EP" uses chunked transfer encoding + - evil HTTP peer "EP" delays 3 ms mid-headers + - evil HTTP peer "EP" responds with status 418 + Given a coroutine "C" + And a coroutine "N" + When coroutine "C" fetches peer "EP" over HTTP + Any of: + - coroutine "N" sleeps 2 ms + - coroutine "N" sleeps 6 ms + Then counter "curl_get_ok_C" equals 1 + And counter "curl_recv_bytes_C" equals 512 + And coroutine "C" received the payload of peer "EP" intact + And no orphan coroutines From f9572a7705aed4a0bcd79cfef805c74de6c70f61 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Fri, 22 May 2026 20:32:46 +0000 Subject: [PATCH 2/8] #136 db-chaos: protocol-level coverage for async PDO MySQL MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the database half of the #136 coverage gap. A DB driver speaks a binary wire protocol, so a pure-PHP mock is not worth building — the chaos is injected at the transport level: Toxiproxy sits between the async PDO client and a real MySQL server, and every connect / query / transaction goes through the libuv reactor. New chaos topic fuzzy-tests/db/ with mysql_chaos.feature (12 scenarios): a query / transaction completing intact through a non-truncating toxic (latency / bandwidth / TCP slicer), a dropped connection surfacing as a clean PDOException — never a hang, a coroutine cancelled mid-query, and the PDO connection pool failing every slot cleanly when the server connection is lost. Pooled and non-pooled connections both covered. Harness: - new ChaosNet class — the network-fixture layer (EvilPeers, Toxiproxy proxies, chaos databases) extracted from Context so that class stays the scenario orchestrator and does not also carry the whole peer / proxy / DB lifecycle. 
Context holds one ChaosNet as $ctx->net; Context drops from 850 to 588 lines. - DB steps: a [pooled] MySQL database "DB"; Toxiproxy adds latency / throttles / slices / resets database "DB"; coroutine "C" queries / runs a slow query on / runs a transaction on database "DB". - dbRun() / dbTransaction() client routines; connection parameters from CHAOS_MYSQL / CHAOS_MYSQL_USER / _PASS / _DB with chaos-friendly defaults. - generate.php: pdo_mysql + mysql-server --SKIPIF-- rules, so the suite runs only where ext/pdo_mysql and a MySQL server are present. Verified: 30/30 db scenarios pass under fifo + three random scheduler seeds; the full 563-test fuzzy suite is green after the Context split. --- CHANGELOG.md | 1 + fuzzy-tests/FINDINGS.md | 18 ++ fuzzy-tests/_harness/ChaosNet.php | 333 +++++++++++++++++++++++++++ fuzzy-tests/_harness/Context.php | 188 +-------------- fuzzy-tests/_harness/Steps.php | 353 ++++++++++++++++++++++++----- fuzzy-tests/_harness/generate.php | 13 ++ fuzzy-tests/db/mysql_chaos.feature | 216 ++++++++++++++++++ 7 files changed, 885 insertions(+), 237 deletions(-) create mode 100644 fuzzy-tests/_harness/ChaosNet.php create mode 100644 fuzzy-tests/db/mysql_chaos.feature diff --git a/CHANGELOG.md b/CHANGELOG.md index b41c45b..6503bec 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added +- **#136 Database chaos: async PDO MySQL coverage** — new chaos topic `fuzzy-tests/db/` with `mysql_chaos.feature`: the async PDO MySQL driver is exercised against a real MySQL server fronted by Toxiproxy (`client coroutine → Toxiproxy → MySQL`), so the transport toxics — latency, bandwidth caps, TCP slicing, `reset_peer` mid-query / mid-transaction — land on the driver's wire I/O. 
12 scenarios cover a non-pooled and a pool-enabled connection: a query / transaction completing intact through a non-truncating toxic, a dropped connection surfacing as a clean `PDOException` (never a hang), a coroutine cancelled mid-query, and the connection pool failing every slot cleanly when the server connection is lost. New steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency to|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`; connection parameters come from the environment (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in like Toxiproxy: generated `.phpt` carry a `--SKIPIF--` probe for ext/pdo_mysql + a reachable MySQL server, so the suite stays inert on dev machines and per-PR CI. Closes the database half of the #136 coverage gap. The network-fixture layer of the harness — EvilPeers, Toxiproxy proxies, chaos databases — is extracted from `Context` into a dedicated `ChaosNet` class (`$ctx->net`) so `Context` stays the scenario orchestrator. - **#136 HTTP chaos: async ext/curl coverage** — EvilPeer gains an `http` mode (`EvilPeer::serveHttp`): it drains one HTTP request and writes back an HTTP/1.1 response, with the serve-mode body toxics (slice/drip/abrupt close/hard reset/forked peer/Toxiproxy) joined by HTTP-specific ones — chunked transfer-encoding, a mendacious `Content-Length` (over/under-stated), dribbled headers, an arbitrary status code. New chaos topic `fuzzy-tests/curl/` with `http_chaos.feature`: a reactor-driven `ext/curl` client (`coroutine "C" fetches peer "EP" over HTTP`) is exercised against every toxic, under the random scheduler, and cancelled mid-transfer — closing the CURL half of the #136 coverage gap (nothing previously exercised async curl under chaos). Generated `.phpt` carry a `--SKIPIF--` curl probe. 
**Found a real bug:** async curl dropped all but the first chunk of a chunked-encoded response body (`CURLE_WRITE_ERROR`) — fixed in php-src `ext/curl/curl_async.c` (`#136`); see `fuzzy-tests/FINDINGS.md`. - **#127 I/O chaos: EvilPeer + transport×logic crossing** — new `fuzzy-tests/_peers/EvilPeer.php`, a deliberately misbehaving network peer driven by a declarative fault table. Toxics: payload slicing, inter-chunk drip delay, abrupt mid-stream close (`reset`); parameters accept the seeded-random fuzz syntax (`random:N`, `1|5`). New chaos topic `fuzzy-tests/io/`: `evil_peer.feature` (sliced/dripped stream reassembled exactly), `abrupt_close.feature` (dropped connection → clean payload prefix, no hang), and `combined_chaos.feature` — **crosses transport chaos with logic chaos**: toxic-selection mutation blocks × client-logic mutation blocks × the random scheduler, all checked against a fixed payload oracle. On a failure the executor prints a **chaos event log** — the exact low-level toxic sequence the EvilPeer played out plus each client's I/O trace. Harness gains `defineEvilPeer()` + a prep-phase that binds each peer's listening socket and serves it from a coroutine. - **#129 I/O chaos: Toxiproxy transport-level fault injection** — new `fuzzy-tests/_peers/ToxiproxyClient.php`, a minimal HTTP-API client for [Toxiproxy](https://github.com/Shopify/toxiproxy). An EvilPeer can now be fronted by a Toxiproxy proxy (`client → proxy → peer`), injecting transport faults a pure-PHP peer cannot reproduce precisely: real bandwidth throttling, latency with jitter, TCP-segment slicing, `limit_data` byte-counted truncation, `reset_peer` timed RST. New steps `evil peer "EP" is fronted by Toxiproxy` / `Toxiproxy throttles|adds latency to|slices|cuts off|resets peer "EP" …`; new feature `fuzzy-tests/io/toxiproxy.feature`. 
Opt-in by design: every generated `.phpt` carries a `--SKIPIF--` probe (`SKIP_RULES['toxiproxy']`) and skips wherever no Toxiproxy admin endpoint answers — so the suite never gates per-PR CI. A dedicated `nightly-io-chaos.yml` workflow stands Toxiproxy up and runs the suite under FIFO + four random scheduler seeds. Closes the last item of #129. diff --git a/fuzzy-tests/FINDINGS.md b/fuzzy-tests/FINDINGS.md index 3482a9b..d422749 100644 --- a/fuzzy-tests/FINDINGS.md +++ b/fuzzy-tests/FINDINGS.md @@ -96,3 +96,21 @@ length back to libcurl only once the PHP callback has accepted every byte, otherwise it feeds the remainder through another callback slice. A genuine short return / exception still surfaces verbatim via a new `aborted` flag. Tracked in php-src as `#136`. + +## Async PDO MySQL pool leaks a raw warning on a dropped connection (observation) + +The `db/mysql_chaos.feature` suite (issue #136) fronts a real MySQL server +with Toxiproxy and drops the connection mid-query with the `reset_peer` +toxic. With `PDO::ATTR_ERRMODE = ERRMODE_EXCEPTION` the driver correctly +raises a `PDOException` — but the **pool-enabled** path also emits a bare +`E_WARNING` ("Error while reading greeting packet") from mysqlnd on top of +the exception, where the non-pooled `new PDO()` path of the same failing +connect does not. + +It does not break error handling — the exception still propagates and is +caught — so the chaos steps simply `@`-silence the expected noise, the same +way the raw-socket I/O steps already do. Worth a follow-up: under +ERRMODE_EXCEPTION the pool's internal connect (`pdo_pool_acquire_conn` → +`db_handle_factory`) should suppress the low-level mysqlnd warning the way +the direct constructor path does. Not a correctness bug; tracked as a +loose end, not fixed here. 
diff --git a/fuzzy-tests/_harness/ChaosNet.php b/fuzzy-tests/_harness/ChaosNet.php new file mode 100644 index 0000000..e540df5 --- /dev/null +++ b/fuzzy-tests/_harness/ChaosNet.php @@ -0,0 +1,333 @@ +net`; steps reach fixtures through it — + * `$ctx->net->evilPeerDefs`, `$ctx->net->defineEvilDb()`, and so on. + * + * Context::run() drives two entry points: setUp() in the prep phase (bind + * peers, spawn serve coroutines, front peers + DBs with Toxiproxy) and + * tearDown() in the cleanup phase (close sockets, reap forked processes, + * delete proxies, drop pooled handles). + */ + +namespace Async\Chaos; + +use function Async\spawn; + +require_once __DIR__ . '/../_peers/EvilPeer.php'; +require_once __DIR__ . '/../_peers/ToxiproxyClient.php'; + +final class ChaosNet { + // ---- EvilPeer fixtures ---- + + /** @var array + * EvilPeer fault tables, keyed by peer name. */ + public array $evilPeerDefs = []; + + /** @var array peer name => "host:port" — populated by setUp() */ + public array $evilPeerAddr = []; + + /** @var array peer name => listening socket */ + public array $evilPeerServers = []; + + /** @var array peer name => + * forked peer process handle (only for peers run as a forked peer) */ + public array $evilPeerProcs = []; + + /** @var array> + * Toxiproxy toxics layered onto a fronted peer, keyed by peer name. */ + public array $evilPeerToxics = []; + + // ---- Toxiproxy ---- + + /** @var string[] Toxiproxy proxy names created in setUp(), torn down after. */ + public array $toxiproxyProxies = []; + + /** Toxiproxy admin client — constructed lazily when a peer/DB is fronted. */ + public ?ToxiproxyClient $toxiproxy = null; + + // ---- Database-under-chaos fixtures ---- + + /** @var array + * Database-under-chaos definitions, keyed by name. Unlike an EvilPeer, the + * upstream is a real DB server — the harness only fronts it with Toxiproxy + * so the chaos lands on the driver's wire I/O. 
*/ + public array $evilDbDefs = []; + + /** @var array> + * Toxiproxy toxics layered onto a fronted database, keyed by db name. */ + public array $evilDbToxics = []; + + /** @var array db name => proxy "host:port" the client + * connects through — populated by setUp() once the proxy is up. */ + public array $evilDbAddr = []; + + /** @var array shared pool-enabled PDO handles, keyed by db + * name — one handle many coroutines share, created in setUp(). */ + public array $evilDbPool = []; + + /** Define an EvilPeer; idempotent. Toxics are layered on by later steps. */ + public function defineEvilPeer(string $name): void { + if (!isset($this->evilPeerDefs[$name])) { + $this->evilPeerDefs[$name] = [ + 'payload' => '', 'slice' => 0, 'delay' => 0, 'reset' => -1, + 'hold' => 0, 'hardReset' => false, 'mode' => 'serve', 'forked' => false, + 'toxiproxy' => false, + 'httpStatus' => 200, 'httpChunked' => false, + 'httpClenLie' => 0, 'httpHeaderDelay' => 0, + ]; + } + } + + /** + * Front an EvilPeer with Toxiproxy and, optionally, append one transport + * toxic. `stream` may be 'auto' — resolved at proxy-creation time to + * 'downstream' for a serve peer or 'upstream' for a consume peer. + */ + public function addEvilPeerToxic( + string $name, + ?string $type = null, + string $stream = 'auto', + array $attributes = [] + ): void { + $this->defineEvilPeer($name); + $this->evilPeerDefs[$name]['toxiproxy'] = true; + if ($type !== null) { + $this->evilPeerToxics[$name][] = [ + 'type' => $type, 'stream' => $stream, 'attributes' => $attributes, + ]; + } + } + + /** Declare a database-under-chaos; idempotent. Toxics are layered on by + * later steps; setUp() fronts it with a Toxiproxy proxy. 
*/ + public function defineEvilDb(string $name, string $driver = 'mysql', bool $pool = false, int $poolMax = 4): void { + if (!isset($this->evilDbDefs[$name])) { + $this->evilDbDefs[$name] = ['driver' => $driver, 'pool' => false, 'poolMax' => $poolMax]; + } + if ($pool) { + $this->evilDbDefs[$name]['pool'] = true; + $this->evilDbDefs[$name]['poolMax'] = $poolMax; + } + } + + /** Append one Toxiproxy toxic to a fronted database. `stream` defaults to + * 'downstream' — the server→client direction that carries the result set; + * Toxiproxy rejects any value other than 'upstream' / 'downstream'. */ + public function addEvilDbToxic(string $name, string $type, array $attributes, string $stream = 'downstream'): void { + $this->defineEvilDb($name); + $this->evilDbToxics[$name][] = [ + 'type' => $type, 'stream' => $stream, 'attributes' => $attributes, + ]; + } + + /** + * Open a PDO connection to a fronted database, through its Toxiproxy proxy. + * Connection parameters come from the environment (so the suite adapts to + * whatever DB the CI / dev box exposes) with chaos-friendly defaults: + * CHAOS_MYSQL (host:port) · CHAOS_MYSQL_USER · CHAOS_MYSQL_PASS · + * CHAOS_MYSQL_DB. + * A pool-enabled handle is created with POOL_MIN 0, so the constructor + * itself opens no socket — it neither yields nor needs a coroutine. + */ + public function openDbConnection(string $db, bool $pool): \PDO { + $addr = $this->evilDbAddr[$db] ?? ''; + $colon = strrpos($addr, ':'); + $host = $colon === false ? $addr : substr($addr, 0, $colon); + $port = $colon === false ? 
3306 : (int) substr($addr, $colon + 1); + $user = getenv('CHAOS_MYSQL_USER') ?: 'test'; + $pass = getenv('CHAOS_MYSQL_PASS') ?: 'test'; + $name = getenv('CHAOS_MYSQL_DB') ?: 'chaos_test'; + $dsn = "mysql:host=$host;port=$port;dbname=$name"; + $opts = [ + \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION, + \PDO::ATTR_TIMEOUT => 5, + ]; + if ($pool) { + $opts[\PDO::ATTR_POOL_ENABLED] = true; + $opts[\PDO::ATTR_POOL_MIN] = 0; + $opts[\PDO::ATTR_POOL_MAX] = $this->evilDbDefs[$db]['poolMax'] ?? 4; + } + return new \PDO($dsn, $user, $pass, $opts); + } + + /** + * Prep phase: bind every in-process peer's listening socket synchronously + * (so the client address is known before any client coroutine runs), spawn + * one accept-and-serve coroutine per peer, then front the marked peers and + * every declared database with a Toxiproxy proxy. + * + * @return \Async\Coroutine[] the peer serve coroutines, for run() to await + */ + public function setUp(Context $ctx): array { + $peerCoros = []; + + // Each in-process peer: bind, then serve one connection from a + // coroutine awaited like a user coroutine. + foreach ($this->evilPeerDefs as $name => $spec) { + // A forked peer runs in its own OS process — no in-process socket + // or accept coroutine; proc_open binds and serves it externally. + if (!empty($spec['forked'])) { + $this->startForkedPeer($name, $spec); + continue; + } + $server = @stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr); + if ($server === false) { + throw new \RuntimeException("EvilPeer $name: cannot listen: $errstr"); + } + $this->evilPeerServers[$name] = $server; + $this->evilPeerAddr[$name] = stream_socket_get_name($server, false); + $peerCoros[] = spawn(function() use ($ctx, $name, $server, $spec) { + $ctx->inc("evil_peer_accept_attempts_$name"); + $conn = @stream_socket_accept($server, 5); + if ($conn === false) { + $ctx->inc("evil_peer_accept_failed_$name"); + return; + } + $ctx->inc("evil_peer_served_$name"); + match ($spec['mode'] ?? 
'serve') { + 'consume' => EvilPeer::consume($conn, $spec, $ctx, $name), + 'http' => EvilPeer::serveHttp($conn, $spec, $ctx, $name), + default => EvilPeer::serve($conn, $spec, $ctx, $name), + }; + }); + } + + // Toxiproxy fronting: for every peer marked `is fronted by Toxiproxy`, + // create a proxy whose upstream is the peer's real listening socket, + // attach the declared toxics, and rewrite $evilPeerAddr to the proxy's + // listen address. The client then connects through Toxiproxy without + // knowing it — the peer-address indirection makes this transparent. + // Generated .phpt for these scenarios carry a Toxiproxy --SKIPIF-- + // probe, so by the time we get here Toxiproxy is known to be up. + foreach ($this->evilPeerDefs as $name => $spec) { + if (empty($spec['toxiproxy'])) { + continue; + } + $this->toxiproxy ??= new ToxiproxyClient(); + $upstream = $this->evilPeerAddr[$name]; + $proxyName = sprintf('chaos_%d_%s_%s', getmypid(), bin2hex(random_bytes(3)), $name); + $listen = $this->toxiproxy->createProxy($proxyName, '127.0.0.1:0', $upstream); + $this->toxiproxyProxies[] = $proxyName; + $defaultStream = ($spec['mode'] ?? 'serve') === 'consume' ? 'upstream' : 'downstream'; + foreach ($this->evilPeerToxics[$name] ?? [] as $i => $tox) { + $stream = $tox['stream'] === 'auto' ? $defaultStream : $tox['stream']; + $this->toxiproxy->addToxic( + $proxyName, $proxyName . '_t' . $i, + $tox['type'], $stream, $tox['attributes']); + } + // From here on the client connects through the proxy, not the peer. + $this->evilPeerAddr[$name] = $listen; + $ctx->events[] = sprintf( + 'toxiproxy %s: proxy %s upstream=%s listen=%s toxics=%d', + $name, $proxyName, $upstream, $listen, + count($this->evilPeerToxics[$name] ?? [])); + } + + // Database-under-chaos fronting: a real DB server cannot be an + // in-process peer, so every declared DB is reached only through a + // Toxiproxy proxy (upstream = the real server). 
The toxics then land on + // the driver's wire I/O — latency / bandwidth / RST mid-query. A + // pool-enabled DB also gets its one shared PDO handle built here. + foreach ($this->evilDbDefs as $name => $spec) { + $this->toxiproxy ??= new ToxiproxyClient(); + $upstream = getenv('CHAOS_MYSQL') ?: '127.0.0.1:3306'; + $proxyName = sprintf('chaosdb_%d_%s_%s', getmypid(), bin2hex(random_bytes(3)), $name); + $listen = $this->toxiproxy->createProxy($proxyName, '127.0.0.1:0', $upstream); + $this->toxiproxyProxies[] = $proxyName; + $this->evilDbAddr[$name] = $listen; + // Build the shared pooled handle through the still-clean proxy, + // BEFORE any toxic is attached — a connection-killing toxic + // (reset_peer) must land on the chaos queries, not on the pooled + // PDO constructor's own validation connect. + if ($spec['pool']) { + $this->evilDbPool[$name] = $this->openDbConnection($name, true); + } + foreach ($this->evilDbToxics[$name] ?? [] as $i => $tox) { + $this->toxiproxy->addToxic( + $proxyName, $proxyName . '_t' . $i, + $tox['type'], $tox['stream'], $tox['attributes']); + } + $ctx->events[] = sprintf( + 'toxiproxy-db %s: proxy %s upstream=%s listen=%s pool=%d toxics=%d', + $name, $proxyName, $upstream, $listen, (int) $spec['pool'], + count($this->evilDbToxics[$name] ?? [])); + } + + return $peerCoros; + } + + /** + * Cleanup phase: drop pooled PDO handles (the pool destructor releases + * every per-coroutine connection), close peer listening sockets, reap + * forked peer processes, and delete every Toxiproxy proxy. + */ + public function tearDown(): void { + // Drop shared pool-enabled PDO handles — the PDO pool destructor + // releases every per-coroutine connection it still holds. + $this->evilDbPool = []; + + // Close every EvilPeer listening socket left open. 
+ foreach ($this->evilPeerServers as $server) { + if (is_resource($server)) { + @fclose($server); + } + } + + // Reap forked peer processes — each exits on its own after serving + // one connection; terminate is a safety net for an unconnected peer. + foreach ($this->evilPeerProcs as $entry) { + foreach ([1, 2] as $fd) { + if (isset($entry['pipes'][$fd]) && is_resource($entry['pipes'][$fd])) { + @fclose($entry['pipes'][$fd]); + } + } + if (is_resource($entry['proc'])) { + @proc_terminate($entry['proc']); + @proc_close($entry['proc']); + } + } + + // Delete every Toxiproxy proxy created for this scenario. + if ($this->toxiproxy !== null) { + foreach ($this->toxiproxyProxies as $proxyName) { + $this->toxiproxy->deleteProxy($proxyName); + } + } + } + + /** Launch a forked EvilPeer in its own process and record its address. + * The fault table is handed over on the child's stdin; the child prints + * its bound "host:port" as the first stdout line. */ + private function startForkedPeer(string $name, array $spec): void { + $script = __DIR__ . 
'/../_peers/forked_peer.php'; + $descriptors = [ + 0 => ['pipe', 'r'], // child stdin — the serialized fault table + 1 => ['pipe', 'w'], // child stdout — the bound address + 2 => ['pipe', 'w'], // child stderr + ]; + $pipes = []; + $proc = @proc_open([PHP_BINARY, $script], $descriptors, $pipes); + if (!is_resource($proc)) { + throw new \RuntimeException("EvilPeer $name: cannot fork peer process"); + } + fwrite($pipes[0], base64_encode(serialize($spec))); + fclose($pipes[0]); + $addr = trim((string) fgets($pipes[1])); + if ($addr === '') { + @proc_terminate($proc); + @proc_close($proc); + throw new \RuntimeException("EvilPeer $name: forked peer did not report an address"); + } + $this->evilPeerAddr[$name] = $addr; + $this->evilPeerProcs[$name] = ['proc' => $proc, 'pipes' => $pipes]; + } +} diff --git a/fuzzy-tests/_harness/Context.php b/fuzzy-tests/_harness/Context.php index d0bd635..15b5ec7 100644 --- a/fuzzy-tests/_harness/Context.php +++ b/fuzzy-tests/_harness/Context.php @@ -27,8 +27,7 @@ use function Async\await_all; use function Async\suspend; -require_once __DIR__ . '/../_peers/EvilPeer.php'; -require_once __DIR__ . '/../_peers/ToxiproxyClient.php'; +require_once __DIR__ . '/ChaosNet.php'; final class Context { public Rng $rng; @@ -99,30 +98,6 @@ final class Context { * strategies referenced for the scenario's lifetime */ public array $poolStrategies = []; - /** @var array - * EvilPeer fault tables, keyed by peer name. */ - public array $evilPeerDefs = []; - - /** @var array peer name => "host:port" — populated by run() */ - public array $evilPeerAddr = []; - - /** @var array peer name => listening socket */ - public array $evilPeerServers = []; - - /** @var array peer name => - * forked peer process handle (only for peers run as a forked peer) */ - public array $evilPeerProcs = []; - - /** @var array> - * Toxiproxy toxics layered onto a fronted peer, keyed by peer name. 
*/ - public array $evilPeerToxics = []; - - /** @var string[] Toxiproxy proxy names created in run(), torn down after. */ - public array $toxiproxyProxies = []; - - /** Toxiproxy admin client — constructed lazily when a peer is fronted. */ - public ?ToxiproxyClient $toxiproxy = null; - /** @var array coroutine name => bytes it received over I/O */ public array $ioData = []; @@ -158,9 +133,13 @@ final class Context { public bool $hasRun = false; + /** Network-fixture layer: EvilPeers, Toxiproxy proxies, chaos databases. */ + public ChaosNet $net; + public function __construct(int $seed) { $this->rng = new Rng($seed); $this->resolver = new ValueResolver($this->rng); + $this->net = new ChaosNet(); } /** Define a channel by name; idempotent (last wins). */ @@ -221,66 +200,6 @@ public function definePool(string $name, int $min = 0, int $max = 10, bool $reje $this->poolDefs[$name] = ['min' => $min, 'max' => $max, 'rejectRelease' => $rejectRelease]; } - /** Define an EvilPeer; idempotent. Toxics are layered on by later steps. */ - public function defineEvilPeer(string $name): void { - if (!isset($this->evilPeerDefs[$name])) { - $this->evilPeerDefs[$name] = [ - 'payload' => '', 'slice' => 0, 'delay' => 0, 'reset' => -1, - 'hold' => 0, 'hardReset' => false, 'mode' => 'serve', 'forked' => false, - 'toxiproxy' => false, - 'httpStatus' => 200, 'httpChunked' => false, - 'httpClenLie' => 0, 'httpHeaderDelay' => 0, - ]; - } - } - - /** - * Front an EvilPeer with Toxiproxy and, optionally, append one transport - * toxic. `stream` may be 'auto' — resolved at proxy-creation time to - * 'downstream' for a serve peer or 'upstream' for a consume peer. 
- */ - public function addEvilPeerToxic( - string $name, - ?string $type = null, - string $stream = 'auto', - array $attributes = [] - ): void { - $this->defineEvilPeer($name); - $this->evilPeerDefs[$name]['toxiproxy'] = true; - if ($type !== null) { - $this->evilPeerToxics[$name][] = [ - 'type' => $type, 'stream' => $stream, 'attributes' => $attributes, - ]; - } - } - - /** Launch a forked EvilPeer in its own process and record its address. - * The fault table is handed over on the child's stdin; the child prints - * its bound "host:port" as the first stdout line. */ - private function startForkedPeer(string $name, array $spec): void { - $script = __DIR__ . '/../_peers/forked_peer.php'; - $descriptors = [ - 0 => ['pipe', 'r'], // child stdin — the serialized fault table - 1 => ['pipe', 'w'], // child stdout — the bound address - 2 => ['pipe', 'w'], // child stderr - ]; - $pipes = []; - $proc = @proc_open([PHP_BINARY, $script], $descriptors, $pipes); - if (!is_resource($proc)) { - throw new \RuntimeException("EvilPeer $name: cannot fork peer process"); - } - fwrite($pipes[0], base64_encode(serialize($spec))); - fclose($pipes[0]); - $addr = trim((string) fgets($pipes[1])); - if ($addr === '') { - @proc_terminate($proc); - @proc_close($proc); - throw new \RuntimeException("EvilPeer $name: forked peer did not report an address"); - } - $this->evilPeerAddr[$name] = $addr; - $this->evilPeerProcs[$name] = ['proc' => $proc, 'pipes' => $pipes]; - } - public function defineThreadChannel(string $name, int $capacity): void { $this->threadChannelDefs[$name] = $capacity; } @@ -467,71 +386,10 @@ public function run(): void { await_all($seeders); } - // EvilPeer setup: bind each in-process peer's listening socket - // synchronously so $evilPeerAddr is known before any client coroutine - // runs, then spawn one accept-and-serve coroutine per peer. The - // coroutine is awaited like a user coroutine — it returns once it has - // served (or failed to serve) one connection. 
- $peerCoros = []; - foreach ($this->evilPeerDefs as $name => $spec) { - // A forked peer runs in its own OS process — no in-process socket - // or accept coroutine; proc_open binds and serves it externally. - if (!empty($spec['forked'])) { - $this->startForkedPeer($name, $spec); - continue; - } - $server = @stream_socket_server('tcp://127.0.0.1:0', $errno, $errstr); - if ($server === false) { - throw new \RuntimeException("EvilPeer $name: cannot listen: $errstr"); - } - $this->evilPeerServers[$name] = $server; - $this->evilPeerAddr[$name] = stream_socket_get_name($server, false); - $peerCoros[] = spawn(function() use ($self, $name, $server, $spec) { - $self->inc("evil_peer_accept_attempts_$name"); - $conn = @stream_socket_accept($server, 5); - if ($conn === false) { - $self->inc("evil_peer_accept_failed_$name"); - return; - } - $self->inc("evil_peer_served_$name"); - match ($spec['mode'] ?? 'serve') { - 'consume' => EvilPeer::consume($conn, $spec, $self, $name), - 'http' => EvilPeer::serveHttp($conn, $spec, $self, $name), - default => EvilPeer::serve($conn, $spec, $self, $name), - }; - }); - } - - // Toxiproxy fronting: for every peer marked `is fronted by Toxiproxy`, - // create a proxy whose upstream is the peer's real listening socket, - // attach the declared toxics, and rewrite $evilPeerAddr to the proxy's - // listen address. The client then connects through Toxiproxy without - // knowing it — the peer-address indirection makes this transparent. - // Generated .phpt for these scenarios carry a Toxiproxy --SKIPIF-- - // probe, so by the time we get here Toxiproxy is known to be up. 
- foreach ($this->evilPeerDefs as $name => $spec) { - if (empty($spec['toxiproxy'])) { - continue; - } - $this->toxiproxy ??= new ToxiproxyClient(); - $upstream = $this->evilPeerAddr[$name]; - $proxyName = sprintf('chaos_%d_%s_%s', getmypid(), bin2hex(random_bytes(3)), $name); - $listen = $this->toxiproxy->createProxy($proxyName, '127.0.0.1:0', $upstream); - $this->toxiproxyProxies[] = $proxyName; - $defaultStream = ($spec['mode'] ?? 'serve') === 'consume' ? 'upstream' : 'downstream'; - foreach ($this->evilPeerToxics[$name] ?? [] as $i => $tox) { - $stream = $tox['stream'] === 'auto' ? $defaultStream : $tox['stream']; - $this->toxiproxy->addToxic( - $proxyName, $proxyName . '_t' . $i, - $tox['type'], $stream, $tox['attributes']); - } - // From here on the client connects through the proxy, not the peer. - $this->evilPeerAddr[$name] = $listen; - $this->events[] = sprintf( - 'toxiproxy %s: proxy %s upstream=%s listen=%s toxics=%d', - $name, $proxyName, $upstream, $listen, - count($this->evilPeerToxics[$name] ?? [])); - } + // Network-fixture prep: bind EvilPeers + spawn their serve coroutines, + // front the marked peers and every chaos database with Toxiproxy. The + // peer serve coroutines are awaited alongside the user coroutines. + $peerCoros = $this->net->setUp($this); // First pass: spawn every coroutine, populate handles. Coroutine bodies // do NOT run yet (spawn just queues), so by the time the first body @@ -663,31 +521,9 @@ public function run(): void { try { $pool->close(); } catch (\Throwable $e) { /* already closing */ } } } - // Close every EvilPeer listening socket left open. - foreach ($this->evilPeerServers as $server) { - if (is_resource($server)) { - @fclose($server); - } - } - // Reap forked peer processes — each exits on its own after serving - // one connection; terminate is a safety net for an unconnected peer. 
- foreach ($this->evilPeerProcs as $entry) { - foreach ([1, 2] as $fd) { - if (isset($entry['pipes'][$fd]) && is_resource($entry['pipes'][$fd])) { - @fclose($entry['pipes'][$fd]); - } - } - if (is_resource($entry['proc'])) { - @proc_terminate($entry['proc']); - @proc_close($entry['proc']); - } - } - // Delete every Toxiproxy proxy created for this scenario. - if ($this->toxiproxy !== null) { - foreach ($this->toxiproxyProxies as $proxyName) { - $this->toxiproxy->deleteProxy($proxyName); - } - } + // Network-fixture teardown: drop pooled PDO handles, close peer + // sockets, reap forked peer processes, delete every Toxiproxy proxy. + $this->net->tearDown(); // Suppress "Future was never used" warnings for futures that no // coroutine got around to awaiting (await_any only consumes one). diff --git a/fuzzy-tests/_harness/Steps.php b/fuzzy-tests/_harness/Steps.php index 6f8a416..29f85e0 100644 --- a/fuzzy-tests/_harness/Steps.php +++ b/fuzzy-tests/_harness/Steps.php @@ -193,8 +193,8 @@ function(Context $ctx, string $name) { // toxics (slicing, delay) onto this fault table. 
$r->on('/^an evil peer "([^"]+)" serving "([^"]*)"$/', function(Context $ctx, string $name, string $payload) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['payload'] = $payload; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['payload'] = $payload; }); // Given an evil peer "EP" serving N bytes @@ -203,26 +203,26 @@ function(Context $ctx, string $name, string $payload) { $r->on('/^an evil peer "([^"]+)" serving (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { $n = (int)$ctx->resolver->resolve($nExpr); - $ctx->defineEvilPeer($name); + $ctx->net->defineEvilPeer($name); $payload = ''; for ($i = 0; $i < $n; $i++) { $payload .= chr(33 + ($i % 94)); // printable ASCII cycle } - $ctx->evilPeerDefs[$name]['payload'] = $payload; + $ctx->net->evilPeerDefs[$name]['payload'] = $payload; }); // Given evil peer "EP" slices output into N-byte chunks $r->on('/^evil peer "([^"]+)" slices output into (\S+)-byte chunks$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['slice'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['slice'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" delays N ms between chunks $r->on('/^evil peer "([^"]+)" delays (\S+) ms between chunks$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['delay'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['delay'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" closes abruptly after N bytes @@ -230,8 +230,8 @@ function(Context $ctx, string $name, string $nExpr) { // client must see a clean truncation, not a hang or a corrupt buffer. 
$r->on('/^evil peer "([^"]+)" closes abruptly after (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['reset'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['reset'] = (int)$ctx->resolver->resolve($nExpr); }); // Given an evil peer "EP" that never reads @@ -240,9 +240,9 @@ function(Context $ctx, string $name, string $nExpr) { // suspends on the reactor's write-wait hook — the back-pressure path. $r->on('/^an evil peer "([^"]+)" that never reads$/', function(Context $ctx, string $name) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'consume'; - $ctx->evilPeerDefs[$name]['slice'] = 0; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'consume'; + $ctx->net->evilPeerDefs[$name]['slice'] = 0; }); // Given an evil peer "EP" that reads N bytes at a time @@ -251,16 +251,16 @@ function(Context $ctx, string $name) { // the client's writer suspended for a while. 
$r->on('/^an evil peer "([^"]+)" that reads (\S+) bytes at a time$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'consume'; - $ctx->evilPeerDefs[$name]['slice'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'consume'; + $ctx->net->evilPeerDefs[$name]['slice'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" delays N ms between reads $r->on('/^evil peer "([^"]+)" delays (\S+) ms between reads$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['delay'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['delay'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" stops reading after N bytes @@ -268,8 +268,8 @@ function(Context $ctx, string $name, string $nExpr) { // writer must see a clean broken-pipe failure, not a hang. $r->on('/^evil peer "([^"]+)" stops reading after (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['reset'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['reset'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" holds the connection for N ms @@ -277,8 +277,8 @@ function(Context $ctx, string $name, string $nExpr) { // connection open before closing, i.e. the killer's window to cancel. 
$r->on('/^evil peer "([^"]+)" holds the connection for (\S+) ms$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['hold'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['hold'] = (int)$ctx->resolver->resolve($nExpr); }); // Given evil peer "EP" uses a hard reset @@ -287,8 +287,8 @@ function(Context $ctx, string $name, string $nExpr) { // faces a real ECONNRESET, not a clean EOF. $r->on('/^evil peer "([^"]+)" uses a hard reset$/', function(Context $ctx, string $name) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['hardReset'] = true; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['hardReset'] = true; }) ->requires('sockets'); @@ -298,8 +298,8 @@ function(Context $ctx, string $name) { // endpoint, no shared reactor. The same fault table applies. $r->on('/^evil peer "([^"]+)" runs as a forked peer$/', function(Context $ctx, string $name) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['forked'] = true; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['forked'] = true; }); // ---- Toxiproxy: external transport-level fault injection ---- @@ -315,7 +315,7 @@ function(Context $ctx, string $name) { // pass-through baseline (the proxy must be transparent on its own). 
$r->on('/^evil peer "([^"]+)" is fronted by Toxiproxy$/', function(Context $ctx, string $name) { - $ctx->addEvilPeerToxic($name); + $ctx->net->addEvilPeerToxic($name); }) ->requires('toxiproxy'); @@ -325,7 +325,7 @@ function(Context $ctx, string $name) { $r->on('/^Toxiproxy throttles peer "([^"]+)" to (\S+) KB\/s$/', function(Context $ctx, string $name, string $rateExpr) { $rate = (int)$ctx->resolver->resolve($rateExpr); - $ctx->addEvilPeerToxic($name, 'bandwidth', 'auto', ['rate' => $rate]); + $ctx->net->addEvilPeerToxic($name, 'bandwidth', 'auto', ['rate' => $rate]); }) ->requires('toxiproxy'); @@ -335,7 +335,7 @@ function(Context $ctx, string $name, string $rateExpr) { function(Context $ctx, string $latExpr, string $jitExpr, string $name) { $lat = (int)$ctx->resolver->resolve($latExpr); $jit = (int)$ctx->resolver->resolve($jitExpr); - $ctx->addEvilPeerToxic($name, 'latency', 'auto', + $ctx->net->addEvilPeerToxic($name, 'latency', 'auto', ['latency' => $lat, 'jitter' => $jit]); }) ->requires('toxiproxy'); @@ -345,7 +345,7 @@ function(Context $ctx, string $latExpr, string $jitExpr, string $name) { $r->on('/^Toxiproxy adds (\S+) ms latency to peer "([^"]+)"$/', function(Context $ctx, string $latExpr, string $name) { $lat = (int)$ctx->resolver->resolve($latExpr); - $ctx->addEvilPeerToxic($name, 'latency', 'auto', ['latency' => $lat]); + $ctx->net->addEvilPeerToxic($name, 'latency', 'auto', ['latency' => $lat]); }) ->requires('toxiproxy'); @@ -355,7 +355,7 @@ function(Context $ctx, string $latExpr, string $name) { $r->on('/^Toxiproxy slices peer "([^"]+)" into (\S+)-byte TCP segments$/', function(Context $ctx, string $name, string $sizeExpr) { $size = (int)$ctx->resolver->resolve($sizeExpr); - $ctx->addEvilPeerToxic($name, 'slicer', 'auto', [ + $ctx->net->addEvilPeerToxic($name, 'slicer', 'auto', [ 'average_size' => $size, 'size_variation' => intdiv($size, 4), 'delay' => 0, @@ -369,7 +369,7 @@ function(Context $ctx, string $name, string $sizeExpr) { 
$r->on('/^Toxiproxy cuts peer "([^"]+)" off after (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { $n = (int)$ctx->resolver->resolve($nExpr); - $ctx->addEvilPeerToxic($name, 'limit_data', 'auto', ['bytes' => $n]); + $ctx->net->addEvilPeerToxic($name, 'limit_data', 'auto', ['bytes' => $n]); }) ->requires('toxiproxy'); @@ -379,7 +379,7 @@ function(Context $ctx, string $name, string $nExpr) { $r->on('/^Toxiproxy resets peer "([^"]+)" after (\S+) ms$/', function(Context $ctx, string $name, string $msExpr) { $ms = (int)$ctx->resolver->resolve($msExpr); - $ctx->addEvilPeerToxic($name, 'reset_peer', 'auto', ['timeout' => $ms]); + $ctx->net->addEvilPeerToxic($name, 'reset_peer', 'auto', ['timeout' => $ms]); }) ->requires('toxiproxy'); @@ -396,22 +396,22 @@ function(Context $ctx, string $name, string $msExpr) { $r->on('/^an evil HTTP peer "([^"]+)" serving (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { $n = (int)$ctx->resolver->resolve($nExpr); - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; $payload = ''; for ($i = 0; $i < $n; $i++) { $payload .= chr(33 + ($i % 94)); // printable ASCII cycle } - $ctx->evilPeerDefs[$name]['payload'] = $payload; + $ctx->net->evilPeerDefs[$name]['payload'] = $payload; }) ->requires('curl'); // Given an evil HTTP peer "EP" serving "body" $r->on('/^an evil HTTP peer "([^"]+)" serving "([^"]*)"$/', function(Context $ctx, string $name, string $body) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['payload'] = $body; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['payload'] = $body; }) ->requires('curl'); @@ -421,9 +421,9 @@ function(Context $ctx, string $name, string $body) { // response, not a transport error. 
$r->on('/^evil HTTP peer "([^"]+)" responds with status (\S+)$/', function(Context $ctx, string $name, string $sExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['httpStatus'] = (int)$ctx->resolver->resolve($sExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['httpStatus'] = (int)$ctx->resolver->resolve($sExpr); }) ->requires('curl'); @@ -432,9 +432,9 @@ function(Context $ctx, string $name, string $sExpr) { // back to the exact byte stream regardless of how it was framed. $r->on('/^evil HTTP peer "([^"]+)" uses chunked transfer encoding$/', function(Context $ctx, string $name) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['httpChunked'] = true; + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['httpChunked'] = true; }) ->requires('curl'); @@ -444,9 +444,9 @@ function(Context $ctx, string $name) { // never hang. $r->on('/^evil HTTP peer "([^"]+)" overstates Content-Length by (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['httpClenLie'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['httpClenLie'] = (int)$ctx->resolver->resolve($nExpr); }) ->requires('curl'); @@ -455,9 +455,9 @@ function(Context $ctx, string $name, string $nExpr) { // the advertised length, so the client sees a clean prefix. 
$r->on('/^evil HTTP peer "([^"]+)" understates Content-Length by (\S+) bytes$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['httpClenLie'] = -(int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['httpClenLie'] = -(int)$ctx->resolver->resolve($nExpr); }) ->requires('curl'); @@ -467,12 +467,83 @@ function(Context $ctx, string $name, string $nExpr) { // stay interruptible and reassemble them correctly. $r->on('/^evil HTTP peer "([^"]+)" delays (\S+) ms mid-headers$/', function(Context $ctx, string $name, string $nExpr) { - $ctx->defineEvilPeer($name); - $ctx->evilPeerDefs[$name]['mode'] = 'http'; - $ctx->evilPeerDefs[$name]['httpHeaderDelay'] = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilPeer($name); + $ctx->net->evilPeerDefs[$name]['mode'] = 'http'; + $ctx->net->evilPeerDefs[$name]['httpHeaderDelay'] = (int)$ctx->resolver->resolve($nExpr); }) ->requires('curl'); + // ---- Database under chaos: a real DB server fronted by Toxiproxy ---- + // A DB driver speaks a binary wire protocol, so a pure-PHP mock is not + // worth it — the chaos lands at the transport level instead: Toxiproxy + // sits between the async PDO client and a real MySQL server, injecting + // latency / bandwidth caps / RST mid-query. Every DB step is tagged + // ->requires('toxiproxy','pdo_mysql','mysql-server'); the generator + // emits a --SKIPIF-- probe so the test runs only where all three are + // present (the nightly job) and skips everywhere else. + + // Given a MySQL database "DB" + // A non-pooled database: each query opens its own PDO connection. 
+ $r->on('/^a MySQL database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'mysql'); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given a pooled MySQL database "DB" + // A pool-enabled database: one shared PDO handle, per-coroutine slots. + $r->on('/^a pooled MySQL database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'mysql', true); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given a pooled MySQL database "DB" with N connections + $r->on('/^a pooled MySQL database "([^"]+)" with (\S+) connections$/', + function(Context $ctx, string $name, string $nExpr) { + $n = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilDb($name, 'mysql', true, $n > 0 ? $n : 1); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given Toxiproxy adds N ms latency to database "DB" + $r->on('/^Toxiproxy adds (\S+) ms latency to database "([^"]+)"$/', + function(Context $ctx, string $latExpr, string $name) { + $lat = (int)$ctx->resolver->resolve($latExpr); + $ctx->net->addEvilDbToxic($name, 'latency', ['latency' => $lat]); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given Toxiproxy throttles database "DB" to N KB/s + $r->on('/^Toxiproxy throttles database "([^"]+)" to (\S+) KB\/s$/', + function(Context $ctx, string $name, string $rateExpr) { + $rate = (int)$ctx->resolver->resolve($rateExpr); + $ctx->net->addEvilDbToxic($name, 'bandwidth', ['rate' => $rate]); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given Toxiproxy slices database "DB" into N-byte TCP segments + $r->on('/^Toxiproxy slices database "([^"]+)" into (\S+)-byte TCP segments$/', + function(Context $ctx, string $name, string $sizeExpr) { + $size = (int)$ctx->resolver->resolve($sizeExpr); + $ctx->net->addEvilDbToxic($name, 'slicer', [ + 'average_size' => $size, + 'size_variation' => intdiv($size, 4), + 'delay' => 0, + ]); + }) + 
->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // Given Toxiproxy resets database "DB" after N ms + // reset_peer toxic — a TCP RST N ms into the connection; lands + // mid-query for any query that runs longer than N ms. + $r->on('/^Toxiproxy resets database "([^"]+)" after (\S+) ms$/', + function(Context $ctx, string $name, string $msExpr) { + $ms = (int)$ctx->resolver->resolve($msExpr); + $ctx->net->addEvilDbToxic($name, 'reset_peer', ['timeout' => $ms]); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + // ---- When: actions inside a coroutine ---- // When coroutine "X" downloads from peer "EP" @@ -517,6 +588,49 @@ function(Context $ctx, string $coro, string $peer) { }) ->requires('curl'); + // When coroutine "X" queries database "DB" + // Runs a SELECT over the async PDO MySQL driver — connect + query I/O + // go through the libuv reactor. The query reads the five seed rows + // (ids 1..5), stable regardless of what transaction scenarios append. + // Cancellation-aware; the liveness invariant + // db_query_ok + db_query_cancelled + db_query_failed + // + db_query_no_db == db_query_attempts + // holds for every interleaving. + $r->on('/^coroutine "([^"]+)" queries database "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::dbRun($ctx, $coro, $db, + 'SELECT id, label, n FROM items WHERE id <= 5 ORDER BY id', 'query'); + }); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // When coroutine "X" runs a slow query on database "DB" + // SELECT SLEEP(2) — keeps the coroutine parked in the reactor on the + // DB socket long enough for a killer to cancel it or a reset_peer + // toxic to land mid-query. 
+ $r->on('/^coroutine "([^"]+)" runs a slow query on database "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::dbRun($ctx, $coro, $db, + 'SELECT SLEEP(2)', 'slow_query'); + }); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + + // When coroutine "X" runs a transaction on database "DB" + // BEGIN → INSERT → COMMIT. A connection fault mid-transaction must + // surface as a clean error and leave neither the connection nor the + // pool slot wedged; the server rolls the transaction back on the + // dropped connection. + $r->on('/^coroutine "([^"]+)" runs a transaction on database "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::dbTransaction($ctx, $coro, $db); + }); + }) + ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + // When coroutine "X" uploads N bytes to peer "EP" // Connects and writes N bytes in a single fwrite(). Against a slow or // never-reading consume-mode peer that fwrite() suspends on a full @@ -3008,10 +3122,10 @@ function(Context $ctx, string $name) { // must equal the peer's declared payload, byte for byte. $r->on('/^coroutine "([^"]+)" received the payload of peer "([^"]+)" intact$/', function(Context $ctx, string $coro, string $peer) { - if (!isset($ctx->evilPeerDefs[$peer])) { + if (!isset($ctx->net->evilPeerDefs[$peer])) { throw new \RuntimeException("evil peer $peer not defined"); } - $expected = $ctx->evilPeerDefs[$peer]['payload']; + $expected = $ctx->net->evilPeerDefs[$peer]['payload']; $got = $ctx->ioData[$coro] ?? null; if ($got === null) { throw new \RuntimeException("coroutine $coro received nothing from peer $peer"); @@ -3030,10 +3144,10 @@ function(Context $ctx, string $coro, string $peer) { // payload length. 
$r->on('/^coroutine "([^"]+)" received a clean prefix of peer "([^"]+)"$/', function(Context $ctx, string $coro, string $peer) { - if (!isset($ctx->evilPeerDefs[$peer])) { + if (!isset($ctx->net->evilPeerDefs[$peer])) { throw new \RuntimeException("evil peer $peer not defined"); } - $payload = $ctx->evilPeerDefs[$peer]['payload']; + $payload = $ctx->net->evilPeerDefs[$peer]['payload']; $got = $ctx->ioData[$coro] ?? null; if ($got === null) { throw new \RuntimeException("coroutine $coro received nothing from peer $peer"); @@ -3172,7 +3286,7 @@ public static function ioDownload(Context $ctx, string $coro, string $peer, int // stays valid even when the download never gets past connect — a hard // RST can reset the connection before stream_socket_client() returns. $ctx->ioData[$coro] = ''; - $addr = $ctx->evilPeerAddr[$peer] ?? null; + $addr = $ctx->net->evilPeerAddr[$peer] ?? null; if ($addr === null) { $ctx->inc("io_download_no_peer_$coro"); return; @@ -3236,7 +3350,7 @@ public static function ioDownload(Context $ctx, string $coro, string $peer, int */ public static function ioUpload(Context $ctx, string $coro, string $peer, int $bytes, int $writeSize): void { $ctx->inc("io_upload_attempts_$coro"); - $addr = $ctx->evilPeerAddr[$peer] ?? null; + $addr = $ctx->net->evilPeerAddr[$peer] ?? null; if ($addr === null) { $ctx->inc("io_upload_no_peer_$coro"); return; @@ -3314,7 +3428,7 @@ public static function curlGet(Context $ctx, string $coro, string $peer): void { // Define the body slot up front so a clean-prefix assertion stays valid // even when the request never produces a byte. $ctx->ioData[$coro] = ''; - $addr = $ctx->evilPeerAddr[$peer] ?? null; + $addr = $ctx->net->evilPeerAddr[$peer] ?? 
null; if ($addr === null) { $ctx->inc("curl_get_no_peer_$coro"); return; @@ -3363,4 +3477,121 @@ function($ch, string $data) use (&$buf) { $coro, $peer, $httpCode, $errno, strlen($buf), $outcome); } } + + /** + * Shared async-PDO routine used by the "queries / runs a slow query" + * database steps. Connects (pooled: reuses the shared handle; non-pooled: + * opens a private one), runs $sql, drains the result set, and buckets the + * outcome into per-coroutine counters keyed by $verb: + * db_<verb>_ok — query completed, rows drained + * db_<verb>_cancelled — AsyncCancellation delivered into the query wait + * db_<verb>_failed — PDOException / other throwable (e.g. RST) + * db_<verb>_no_db — database fixture never resolved + * so the four buckets sum to db_<verb>_attempts for every interleaving. + * db_<verb>_rows_<coro> records how many rows were drained. + * + * The PDO connect and the query both go through the libuv reactor, so a + * concurrent killer can cancel the coroutine mid-connect or mid-query, and + * a Toxiproxy reset_peer toxic can drop the connection mid-result. + */ + public static function dbRun(Context $ctx, string $coro, string $db, string $sql, string $verb): void { + $ctx->inc("db_{$verb}_attempts_$coro"); + $spec = $ctx->net->evilDbDefs[$db] ?? null; + if ($spec === null || !isset($ctx->net->evilDbAddr[$db])) { + $ctx->inc("db_{$verb}_no_db_$coro"); + return; + } + $outcome = 'ok'; + $rows = 0; + $pdo = null; + try { + // Pooled: one shared handle, the pool hands out a per-coroutine + // slot. Non-pooled: a private connection opened (and dropped) here. + $pdo = $spec['pool'] + ? $ctx->net->evilDbPool[$db] + : $ctx->net->openDbConnection($db, false); + // A dropped connection is the expected outcome under the toxics — + // mysqlnd emits a raw E_WARNING for it on top of the PDOException; + // @ silences that expected noise (the exception is still caught). 
+ $stmt = @$pdo->query($sql); + while (@$stmt->fetch(\PDO::FETCH_NUM) !== false) { + $rows++; + } + $ctx->inc("db_{$verb}_ok_$coro"); + } catch (\Async\AsyncCancellation $e) { + $outcome = 'cancelled'; + $ctx->inc("db_{$verb}_cancelled_$coro"); + } catch (\Throwable $e) { + $outcome = 'failed'; + $ctx->inc("db_{$verb}_failed_$coro"); + } finally { + $ctx->inc("db_{$verb}_rows_$coro", $rows); + // Non-pooled: drop this coroutine's private connection now. + if (!$spec['pool']) { + $pdo = null; + } + $ctx->events[] = sprintf( + 'db %s: db=%s verb=%s pool=%d rows=%d outcome=%s', + $coro, $db, $verb, (int) $spec['pool'], $rows, $outcome); + } + } + + /** + * Shared async-PDO routine for the "runs a transaction" database step: + * BEGIN → INSERT → COMMIT. A connection fault mid-transaction must surface + * as a clean error — the coroutine terminates, the connection (or pool + * slot) is not left wedged, and the server rolls the transaction back on + * the dropped connection. Outcome buckets mirror dbRun(): + * db_txn_ok / db_txn_cancelled / db_txn_failed / db_txn_no_db sum to + * db_txn_attempts; db_txn_committed counts the transactions that COMMIT + * actually acknowledged. + */ + public static function dbTransaction(Context $ctx, string $coro, string $db): void { + $ctx->inc("db_txn_attempts_$coro"); + $spec = $ctx->net->evilDbDefs[$db] ?? null; + if ($spec === null || !isset($ctx->net->evilDbAddr[$db])) { + $ctx->inc("db_txn_no_db_$coro"); + return; + } + $outcome = 'ok'; + $pdo = null; + try { + $pdo = $spec['pool'] + ? $ctx->net->evilDbPool[$db] + : $ctx->net->openDbConnection($db, false); + // @ silences mysqlnd's raw E_WARNING on a dropped connection — + // the expected outcome under the toxics; the PDOException still + // propagates to the catch blocks below. 
+ @$pdo->beginTransaction(); + $stmt = @$pdo->prepare('INSERT INTO items (label, n) VALUES (?, ?)'); + @$stmt->execute(["txn-$coro", 0]); + @$pdo->commit(); + $ctx->inc("db_txn_committed_$coro"); + $ctx->inc("db_txn_ok_$coro"); + } catch (\Async\AsyncCancellation $e) { + $outcome = 'cancelled'; + $ctx->inc("db_txn_cancelled_$coro"); + } catch (\Throwable $e) { + $outcome = 'failed'; + $ctx->inc("db_txn_failed_$coro"); + } finally { + // Best-effort rollback so a pooled slot is not left mid-transaction + // for the next coroutine — harmless if the connection already died. + if ($pdo !== null) { + try { + if (@$pdo->inTransaction()) { + @$pdo->rollBack(); + } + } catch (\Throwable $e) { + /* connection already gone — the server rolled back for us */ + } + } + if (!$spec['pool']) { + $pdo = null; + } + $ctx->events[] = sprintf( + 'db-txn %s: db=%s pool=%d outcome=%s', + $coro, $db, (int) $spec['pool'], $outcome); + } + } } diff --git a/fuzzy-tests/_harness/generate.php b/fuzzy-tests/_harness/generate.php index 3fb1cce..a996e42 100644 --- a/fuzzy-tests/_harness/generate.php +++ b/fuzzy-tests/_harness/generate.php @@ -201,6 +201,19 @@ function findFeatures(string $root): array { $ts = @stream_socket_client("tcp://$th:$tport", $te, $tm, 2); if ($ts === false) { echo "skip Toxiproxy not running at $tp (set CHAOS_TOXIPROXY)"; exit; } fclose($ts); +PROBE, + 'pdo_mysql' => 'if (!extension_loaded("pdo_mysql")) { echo "skip ext/pdo_mysql required"; exit; }', + // A reachable MySQL server is opt-in, like Toxiproxy: the database chaos + // tests run only where one answers and skip everywhere else. The probe is + // a plain TCP connect — the upstream the Toxiproxy proxy will point at. + 'mysql-server' => <<<'PROBE' +$ms = getenv("CHAOS_MYSQL") ?: "127.0.0.1:3306"; +$mp = strrpos($ms, ":"); +$mh = $mp === false ? $ms : substr($ms, 0, $mp); +$mport = $mp === false ? 
3306 : (int)substr($ms, $mp + 1); +$msk = @stream_socket_client("tcp://$mh:$mport", $me, $mm, 2); +if ($msk === false) { echo "skip MySQL not reachable at $ms (set CHAOS_MYSQL)"; exit; } +fclose($msk); PROBE, ]; diff --git a/fuzzy-tests/db/mysql_chaos.feature b/fuzzy-tests/db/mysql_chaos.feature new file mode 100644 index 0000000..51fcc0e --- /dev/null +++ b/fuzzy-tests/db/mysql_chaos.feature @@ -0,0 +1,216 @@ +Feature: Database chaos — async PDO MySQL against a Toxiproxy-fronted server + + Closes the database half of the #136 coverage gap. The PDO MySQL driver + speaks a binary wire protocol, so a pure-PHP mock is not worth building — + the chaos is injected at the transport level instead: Toxiproxy sits + between the async PDO client and a real MySQL server, and every connect / + query / transaction goes through the libuv reactor. + + client coroutine ──▶ Toxiproxy proxy ──▶ real MySQL server + └── latency / bandwidth / slicer / reset_peer + + This exercises the paths where reactor cancellation, use-after-free and + pool-state bugs hide: a coroutine cancelled mid-query, the connection + dropped mid-query / mid-transaction, and the PDO connection pool handing + out / reclaiming / replacing a slot whose connection was lost. + + Opt-in by design: every scenario needs ext/pdo_mysql, a reachable MySQL + server (CHAOS_MYSQL, default 127.0.0.1:3306, schema seeded with a five-row + `items` table) and a running Toxiproxy. The generated .phpt carry a + --SKIPIF-- probe for all three, so the suite stays inert on dev machines + and per-PR CI and runs only on the nightly job. 
+ + Invariants, decidable regardless of interleaving: + - a non-truncating toxic (latency / bandwidth / slicer) leaves the result + set byte-for-byte intact — the query still returns its five rows; + - a dropped connection surfaces as a clean PDOException, never a hang; + - a cancel mid-query leaves the coroutine completed and unorphaned, and + the outcome buckets sum to the attempt count; + - the connection pool keeps serving good connections to other coroutines + even while one slot's connection is being reset. + + Scenario: a query through a clean proxy returns every row + Given a MySQL database "DB" + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_attempts_C" equals 1 + And counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: latency does not corrupt the result set + Given a MySQL database "DB" + And Toxiproxy adds <latency> ms latency to database "DB" + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | latency | + | 1 | + | 5 | + | 20 | + + Scenario: a bandwidth-throttled query still returns every row + Given a MySQL database "DB" + And Toxiproxy throttles database "DB" to 64 KB/s + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: a TCP-sliced wire stream is reassembled exactly + # Toxiproxy chops the MySQL wire protocol into tiny TCP segments — the + # driver must reassemble the binary frames whatever the fragmentation. 
+ Given a MySQL database "DB" + And Toxiproxy slices database "DB" into <segment>-byte TCP segments + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | segment | + | 1 | + | 64 | + | 512 | + + Scenario: an RST mid-query surfaces as a clean error + # reset_peer fires a TCP RST 200 ms into the connection; the slow query + # runs for ~2 s, so the reset always lands mid-query. The driver must + # raise a clean PDOException — bucketed as failed — never hang. + Given a MySQL database "DB" + And Toxiproxy resets database "DB" after 200 ms + And a coroutine "C" + When coroutine "C" runs a slow query on database "DB" + Then counter "db_slow_query_attempts_C" equals 1 + And counter "db_slow_query_failed_C" equals 1 + And coroutine "C" is completed + And no orphan coroutines + + Scenario Outline: cancel a coroutine mid-query + # A killer cancels the coroutine while it is parked in the reactor waiting + # on the DB socket. Under the random scheduler the cancel can land before, + # during, or after the query — so only the liveness sum and the + # no-hang / no-orphan invariants are decidable. + Given a MySQL database "DB" + And a coroutine "C" + And a coroutine "K" + When coroutine "C" runs a slow query on database "DB" + And coroutine "K" sleeps <ms> ms + And coroutine "K" cancels coroutine "C" + Then counter "db_slow_query_ok_C" plus counter "db_slow_query_cancelled_C" plus counter "db_slow_query_failed_C" plus counter "db_slow_query_no_db_C" equals counter "db_slow_query_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Examples: + | ms | + | 0 | + | 50 | + | 300 | + | 900 | + + Scenario: many coroutines share one pooled connection handle + # Four coroutines each query through the same pool-enabled PDO; the pool + # hands every coroutine its own slot. All four must complete with the + # full result set — no slot cross-talk. 
+ Given a pooled MySQL database "PDB" with 4 connections + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + And a coroutine "C4" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + And coroutine "C4" queries database "PDB" + Then counter "db_query_ok_C1" equals 1 + And counter "db_query_ok_C2" equals 1 + And counter "db_query_ok_C3" equals 1 + And counter "db_query_ok_C4" equals 1 + And counter "db_query_rows_C1" equals 5 + And counter "db_query_rows_C4" equals 5 + And no orphan coroutines + + Scenario: concurrent pooled queries under latency + Given a pooled MySQL database "PDB" with 3 connections + And Toxiproxy adds 5 ms latency to database "PDB" + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + Then counter "db_query_ok_C1" equals 1 + And counter "db_query_ok_C2" equals 1 + And counter "db_query_ok_C3" equals 1 + And no orphan coroutines + + Scenario: the pool fails every query cleanly when the server connection drops + # reset_peer drops every connection through the proxy. Each pooled + # coroutine's query must fail with a clean PDOException — bucketed as + # failed — and the pool must not wedge, hang or leak: all coroutines + # complete, none is orphaned, and the shared handle tears down cleanly. + # This is the pool's behaviour when every slot's connection is lost. 
+ Given a pooled MySQL database "PDB" with 4 connections + And Toxiproxy resets database "PDB" after 200 ms + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + Then counter "db_query_ok_C1" plus counter "db_query_failed_C1" equals counter "db_query_attempts_C1" + And counter "db_query_ok_C2" plus counter "db_query_failed_C2" equals counter "db_query_attempts_C2" + And counter "db_query_ok_C3" plus counter "db_query_failed_C3" equals counter "db_query_attempts_C3" + And coroutine "C1" is completed + And coroutine "C2" is completed + And coroutine "C3" is completed + And no orphan coroutines + + Scenario: a transaction commits through a latency-toxic'd connection + Given a MySQL database "DB" + And Toxiproxy adds 5 ms latency to database "DB" + And a coroutine "C" + When coroutine "C" runs a transaction on database "DB" + Then counter "db_txn_attempts_C" equals 1 + And counter "db_txn_ok_C" equals 1 + And counter "db_txn_committed_C" equals 1 + And no orphan coroutines + + Scenario: an RST mid-transaction surfaces cleanly and wedges nothing + # The transaction opens, then reset_peer drops the connection 150 ms in. + # The driver must raise a clean error; the coroutine completes and the + # connection (pooled slot here) is left usable, not mid-transaction. 
+ Given a pooled MySQL database "PDB" + And Toxiproxy resets database "PDB" after 150 ms + And a coroutine "C" + And a coroutine "K" + When coroutine "C" runs a transaction on database "PDB" + And coroutine "K" runs a slow query on database "PDB" + Then counter "db_txn_ok_C" plus counter "db_txn_cancelled_C" plus counter "db_txn_failed_C" plus counter "db_txn_no_db_C" equals counter "db_txn_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Scenario: database toxics crossed with logic and scheduler chaos + # Three chaos axes around a fixed five-row oracle: which non-truncating + # transport toxic Toxiproxy applies, whether the client pools its + # connection, and the scheduler interleaving. None of the toxics + # truncate, so the exact row count stays decidable across the product. + Given a MySQL database "DB" + One of: + - Toxiproxy adds 3 ms latency to database "DB" + - Toxiproxy throttles database "DB" to 128 KB/s + - Toxiproxy slices database "DB" into random:64-byte TCP segments + And a coroutine "C" + And a coroutine "N" + When coroutine "C" queries database "DB" + Any of: + - coroutine "N" sleeps 2 ms + - coroutine "N" sleeps 6 ms + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines From fc25ff7513043a2c7b04f536f0fdbdc593348a87 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Fri, 22 May 2026 20:38:12 +0000 Subject: [PATCH 3/8] #136 db-chaos: async mysqli coverage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Extends the database chaos suite to the mysqli extension — the same Toxiproxy-fronted MySQL server, reached through mysqli's own connection / result / prepared-statement API instead of PDO. mysqli has no connection pool, so each chaos query opens and closes its own connection. 
- fuzzy-tests/db/mysqli_chaos.feature — 10 scenarios → 28 .phpt: a query / transaction intact through a non-truncating toxic (latency / bandwidth / slicer), a dropped connection surfacing as a clean mysqli_sql_exception never a hang, a coroutine cancelled mid-query, concurrent queries. - ChaosNet::openMysqliConnection() — builds the connection from the same CHAOS_MYSQL* environment, MYSQLI_REPORT_STRICT so faults raise mysqli_sql_exception; mysqliRun() / mysqliTransaction() client routines. - new steps: a MySQLi database "DB"; coroutine "C" queries / runs a slow query via / runs a transaction via mysqli "DB". The Toxiproxy toxic steps are now driver-agnostic (requires toxiproxy + mysql-server only) and shared by the PDO and mysqli suites. - generate.php: ext/mysqli --SKIPIF-- rule. Verified: 58/58 database scenarios (PDO MySQL + mysqli) pass under fifo + three random scheduler seeds; the full 591-test fuzzy suite is green. --- CHANGELOG.md | 1 + fuzzy-tests/_harness/ChaosNet.php | 22 ++++ fuzzy-tests/_harness/Steps.php | 141 +++++++++++++++++++++++++- fuzzy-tests/_harness/generate.php | 1 + fuzzy-tests/db/mysqli_chaos.feature | 151 ++++++++++++++++++++++++++++ 5 files changed, 312 insertions(+), 4 deletions(-) create mode 100644 fuzzy-tests/db/mysqli_chaos.feature diff --git a/CHANGELOG.md b/CHANGELOG.md index 6503bec..389e8b6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added +- **#136 Database chaos: async mysqli coverage** — `fuzzy-tests/db/mysqli_chaos.feature` (10 scenarios) extends the database chaos suite to the `mysqli` extension: the same Toxiproxy-fronted MySQL server reached through mysqli's own connection / result / prepared-statement API instead of PDO. mysqli has no connection pool, so each chaos query opens its own connection. 
New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`; the Toxiproxy toxic steps are now driver-agnostic and shared with the PDO suite. `ChaosNet::openMysqliConnection()` builds the connection (`MYSQLI_REPORT_STRICT` so faults raise `mysqli_sql_exception`). Generated `.phpt` carry an ext/mysqli `--SKIPIF--` probe. - **#136 Database chaos: async PDO MySQL coverage** — new chaos topic `fuzzy-tests/db/` with `mysql_chaos.feature`: the async PDO MySQL driver is exercised against a real MySQL server fronted by Toxiproxy (`client coroutine → Toxiproxy → MySQL`), so the transport toxics — latency, bandwidth caps, TCP slicing, `reset_peer` mid-query / mid-transaction — land on the driver's wire I/O. 12 scenarios cover a non-pooled and a pool-enabled connection: a query / transaction completing intact through a non-truncating toxic, a dropped connection surfacing as a clean `PDOException` (never a hang), a coroutine cancelled mid-query, and the connection pool failing every slot cleanly when the server connection is lost. New steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency to|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`; connection parameters come from the environment (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in like Toxiproxy: generated `.phpt` carry a `--SKIPIF--` probe for ext/pdo_mysql + a reachable MySQL server, so the suite stays inert on dev machines and per-PR CI. Closes the database half of the #136 coverage gap. The network-fixture layer of the harness — EvilPeers, Toxiproxy proxies, chaos databases — is extracted from `Context` into a dedicated `ChaosNet` class (`$ctx->net`) so `Context` stays the scenario orchestrator. 
- **#136 HTTP chaos: async ext/curl coverage** — EvilPeer gains an `http` mode (`EvilPeer::serveHttp`): it drains one HTTP request and writes back an HTTP/1.1 response, with the serve-mode body toxics (slice/drip/abrupt close/hard reset/forked peer/Toxiproxy) joined by HTTP-specific ones — chunked transfer-encoding, a mendacious `Content-Length` (over/under-stated), dribbled headers, an arbitrary status code. New chaos topic `fuzzy-tests/curl/` with `http_chaos.feature`: a reactor-driven `ext/curl` client (`coroutine "C" fetches peer "EP" over HTTP`) is exercised against every toxic, under the random scheduler, and cancelled mid-transfer — closing the CURL half of the #136 coverage gap (nothing previously exercised async curl under chaos). Generated `.phpt` carry a `--SKIPIF--` curl probe. **Found a real bug:** async curl dropped all but the first chunk of a chunked-encoded response body (`CURLE_WRITE_ERROR`) — fixed in php-src `ext/curl/curl_async.c` (`#136`); see `fuzzy-tests/FINDINGS.md`. - **#127 I/O chaos: EvilPeer + transport×logic crossing** — new `fuzzy-tests/_peers/EvilPeer.php`, a deliberately misbehaving network peer driven by a declarative fault table. Toxics: payload slicing, inter-chunk drip delay, abrupt mid-stream close (`reset`); parameters accept the seeded-random fuzz syntax (`random:N`, `1|5`). New chaos topic `fuzzy-tests/io/`: `evil_peer.feature` (sliced/dripped stream reassembled exactly), `abrupt_close.feature` (dropped connection → clean payload prefix, no hang), and `combined_chaos.feature` — **crosses transport chaos with logic chaos**: toxic-selection mutation blocks × client-logic mutation blocks × the random scheduler, all checked against a fixed payload oracle. On a failure the executor prints a **chaos event log** — the exact low-level toxic sequence the EvilPeer played out plus each client's I/O trace. Harness gains `defineEvilPeer()` + a prep-phase that binds each peer's listening socket and serves it from a coroutine. 
diff --git a/fuzzy-tests/_harness/ChaosNet.php b/fuzzy-tests/_harness/ChaosNet.php index e540df5..d3d02a6 100644 --- a/fuzzy-tests/_harness/ChaosNet.php +++ b/fuzzy-tests/_harness/ChaosNet.php @@ -159,6 +159,28 @@ public function openDbConnection(string $db, bool $pool): \PDO { return new \PDO($dsn, $user, $pass, $opts); } + /** + * Open a mysqli connection to a fronted database, through its Toxiproxy + * proxy. mysqli has no connection pool, so every chaos query opens (and + * closes) its own connection. Same environment-driven parameters as + * openDbConnection(); MYSQLI_REPORT_ERROR|STRICT makes connect/query + * failures throw mysqli_sql_exception, so the chaos steps can bucket them. + */ + public function openMysqliConnection(string $db): \mysqli { + $addr = $this->evilDbAddr[$db] ?? ''; + $colon = strrpos($addr, ':'); + $host = $colon === false ? $addr : substr($addr, 0, $colon); + $port = $colon === false ? 3306 : (int) substr($addr, $colon + 1); + $user = getenv('CHAOS_MYSQL_USER') ?: 'test'; + $pass = getenv('CHAOS_MYSQL_PASS') ?: 'test'; + $name = getenv('CHAOS_MYSQL_DB') ?: 'chaos_test'; + \mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); + // @ silences mysqlnd's raw E_WARNING when the connect fails under a + // toxic — MYSQLI_REPORT_STRICT still raises mysqli_sql_exception, + // which the chaos client routine catches and buckets. 
+ return @new \mysqli($host, $user, $pass, $name, $port); + } + /** * Prep phase: bind every in-process peer's listening socket synchronously * (so the client address is known before any client coroutine runs), spawn diff --git a/fuzzy-tests/_harness/Steps.php b/fuzzy-tests/_harness/Steps.php index 29f85e0..ea05485 100644 --- a/fuzzy-tests/_harness/Steps.php +++ b/fuzzy-tests/_harness/Steps.php @@ -512,7 +512,7 @@ function(Context $ctx, string $latExpr, string $name) { $lat = (int)$ctx->resolver->resolve($latExpr); $ctx->net->addEvilDbToxic($name, 'latency', ['latency' => $lat]); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy', 'mysql-server'); // Given Toxiproxy throttles database "DB" to N KB/s $r->on('/^Toxiproxy throttles database "([^"]+)" to (\S+) KB\/s$/', @@ -520,7 +520,7 @@ function(Context $ctx, string $name, string $rateExpr) { $rate = (int)$ctx->resolver->resolve($rateExpr); $ctx->net->addEvilDbToxic($name, 'bandwidth', ['rate' => $rate]); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy', 'mysql-server'); // Given Toxiproxy slices database "DB" into N-byte TCP segments $r->on('/^Toxiproxy slices database "([^"]+)" into (\S+)-byte TCP segments$/', @@ -532,7 +532,7 @@ function(Context $ctx, string $name, string $sizeExpr) { 'delay' => 0, ]); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy', 'mysql-server'); // Given Toxiproxy resets database "DB" after N ms // reset_peer toxic — a TCP RST N ms into the connection; lands @@ -542,7 +542,18 @@ function(Context $ctx, string $name, string $msExpr) { $ms = (int)$ctx->resolver->resolve($msExpr); $ctx->net->addEvilDbToxic($name, 'reset_peer', ['timeout' => $ms]); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy', 'mysql-server'); + + // Given a MySQLi database "DB" + // The same Toxiproxy-fronted MySQL server, reached through the mysqli + // extension instead of PDO. 
mysqli has no connection pool, so every + // query opens its own connection. The Toxiproxy toxic steps above are + // driver-agnostic and apply to a MySQLi database too. + $r->on('/^a MySQLi database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'mysqli'); + }) + ->requires('toxiproxy', 'mysqli', 'mysql-server'); // ---- When: actions inside a coroutine ---- @@ -631,6 +642,40 @@ function(Context $ctx, string $coro, string $db) { }) ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + // When coroutine "X" queries via mysqli "DB" + // Same SELECT as the PDO query step, but over the mysqli extension — + // connect + query I/O go through the libuv reactor. Liveness invariant + // mysqli_query_ok + cancelled + failed + no_db == mysqli_query_attempts. + $r->on('/^coroutine "([^"]+)" queries via mysqli "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::mysqliRun($ctx, $coro, $db, + 'SELECT id, label, n FROM items WHERE id <= 5 ORDER BY id', 'query'); + }); + }) + ->requires('toxiproxy', 'mysqli', 'mysql-server'); + + // When coroutine "X" runs a slow query via mysqli "DB" + // SELECT SLEEP(2) over mysqli — parks the coroutine in the reactor on + // the DB socket for a killer to cancel or a reset_peer toxic to hit. + $r->on('/^coroutine "([^"]+)" runs a slow query via mysqli "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::mysqliRun($ctx, $coro, $db, 'SELECT SLEEP(2)', 'slow_query'); + }); + }) + ->requires('toxiproxy', 'mysqli', 'mysql-server'); + + // When coroutine "X" runs a transaction via mysqli "DB" + // begin_transaction → prepared INSERT → commit over mysqli. 
+ $r->on('/^coroutine "([^"]+)" runs a transaction via mysqli "([^"]+)"$/', + function(Context $ctx, string $coro, string $db) { + $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { + StandardSteps::mysqliTransaction($ctx, $coro, $db); + }); + }) + ->requires('toxiproxy', 'mysqli', 'mysql-server'); + // When coroutine "X" uploads N bytes to peer "EP" // Connects and writes N bytes in a single fwrite(). Against a slow or // never-reading consume-mode peer that fwrite() suspends on a full @@ -3594,4 +3639,92 @@ public static function dbTransaction(Context $ctx, string $coro, string $db): vo $coro, $db, (int) $spec['pool'], $outcome); } } + + /** + * Shared async-mysqli routine used by the "queries / runs a slow query + * via mysqli" steps. mysqli has no connection pool, so each call opens + * (and closes) its own connection through the Toxiproxy proxy. Outcome + * buckets mirror dbRun(): + * mysqli_<verb>_ok / _cancelled / _failed / _no_db sum to + * mysqli_<verb>_attempts; mysqli_<verb>_rows records rows drained. + */ + public static function mysqliRun(Context $ctx, string $coro, string $db, string $sql, string $verb): void { + $ctx->inc("mysqli_{$verb}_attempts_$coro"); + if (!isset($ctx->net->evilDbDefs[$db]) || !isset($ctx->net->evilDbAddr[$db])) { + $ctx->inc("mysqli_{$verb}_no_db_$coro"); + return; + } + $outcome = 'ok'; + $rows = 0; + $my = null; + try { + $my = $ctx->net->openMysqliConnection($db); + // @ silences mysqlnd's raw E_WARNING on a dropped connection — + // the mysqli_sql_exception still propagates to the catch blocks.
+ $res = @$my->query($sql); + if ($res instanceof \mysqli_result) { + while (@$res->fetch_row() !== null) { + $rows++; + } + $res->free(); + } + $ctx->inc("mysqli_{$verb}_ok_$coro"); + } catch (\Async\AsyncCancellation $e) { + $outcome = 'cancelled'; + $ctx->inc("mysqli_{$verb}_cancelled_$coro"); + } catch (\Throwable $e) { + $outcome = 'failed'; + $ctx->inc("mysqli_{$verb}_failed_$coro"); + } finally { + $ctx->inc("mysqli_{$verb}_rows_$coro", $rows); + if ($my instanceof \mysqli) { + @$my->close(); + } + $ctx->events[] = sprintf( + 'mysqli %s: db=%s verb=%s rows=%d outcome=%s', + $coro, $db, $verb, $rows, $outcome); + } + } + + /** + * Shared async-mysqli routine for the "runs a transaction via mysqli" + * step: begin_transaction → prepared INSERT → commit. A connection fault + * mid-transaction must surface as a clean mysqli_sql_exception; the + * coroutine completes and nothing is left wedged. Outcome buckets: + * mysqli_txn_ok / _cancelled / _failed / _no_db sum to + * mysqli_txn_attempts; mysqli_txn_committed counts acknowledged COMMITs. 
+ */ + public static function mysqliTransaction(Context $ctx, string $coro, string $db): void { + $ctx->inc("mysqli_txn_attempts_$coro"); + if (!isset($ctx->net->evilDbDefs[$db]) || !isset($ctx->net->evilDbAddr[$db])) { + $ctx->inc("mysqli_txn_no_db_$coro"); + return; + } + $outcome = 'ok'; + $my = null; + try { + $my = $ctx->net->openMysqliConnection($db); + @$my->begin_transaction(); + $stmt = @$my->prepare('INSERT INTO items (label, n) VALUES (?, ?)'); + $label = "mtxn-$coro"; + $n = 0; + @$stmt->bind_param('si', $label, $n); + @$stmt->execute(); + @$my->commit(); + $ctx->inc("mysqli_txn_committed_$coro"); + $ctx->inc("mysqli_txn_ok_$coro"); + } catch (\Async\AsyncCancellation $e) { + $outcome = 'cancelled'; + $ctx->inc("mysqli_txn_cancelled_$coro"); + } catch (\Throwable $e) { + $outcome = 'failed'; + $ctx->inc("mysqli_txn_failed_$coro"); + } finally { + if ($my instanceof \mysqli) { + @$my->close(); + } + $ctx->events[] = sprintf( + 'mysqli-txn %s: db=%s outcome=%s', $coro, $db, $outcome); + } + } } diff --git a/fuzzy-tests/_harness/generate.php b/fuzzy-tests/_harness/generate.php index a996e42..8524dee 100644 --- a/fuzzy-tests/_harness/generate.php +++ b/fuzzy-tests/_harness/generate.php @@ -203,6 +203,7 @@ function findFeatures(string $root): array { fclose($ts); PROBE, 'pdo_mysql' => 'if (!extension_loaded("pdo_mysql")) { echo "skip ext/pdo_mysql required"; exit; }', + 'mysqli' => 'if (!extension_loaded("mysqli")) { echo "skip ext/mysqli required"; exit; }', // A reachable MySQL server is opt-in, like Toxiproxy: the database chaos // tests run only where one answers and skip everywhere else. The probe is // a plain TCP connect — the upstream the Toxiproxy proxy will point at. 
diff --git a/fuzzy-tests/db/mysqli_chaos.feature b/fuzzy-tests/db/mysqli_chaos.feature new file mode 100644 index 0000000..069a369 --- /dev/null +++ b/fuzzy-tests/db/mysqli_chaos.feature @@ -0,0 +1,151 @@ +Feature: Database chaos — async mysqli against a Toxiproxy-fronted server + + The mysqli half of the #136 database coverage. The mysqli extension reaches + the same real MySQL server through the same Toxiproxy proxy as the PDO MySQL + suite — connect / query / transaction all go through the libuv reactor — but + exercises a different driver surface (mysqli's own connection object, + result iteration and prepared-statement API). mysqli has no connection + pool, so every chaos query opens and closes its own connection. + + Opt-in by design: every scenario needs ext/mysqli, a reachable MySQL server + (CHAOS_MYSQL) and a running Toxiproxy; the generated .phpt carry a + --SKIPIF-- probe for all three. + + Invariants, decidable regardless of interleaving: + - a non-truncating toxic leaves the result set intact — the query still + returns its five rows; + - a dropped connection surfaces as a clean mysqli_sql_exception, never a + hang; + - a cancel mid-query leaves the coroutine completed and unorphaned, and + the outcome buckets sum to the attempt count. 
+ + Scenario: a query through a clean proxy returns every row + Given a MySQLi database "DB" + And a coroutine "C" + When coroutine "C" queries via mysqli "DB" + Then counter "mysqli_query_attempts_C" equals 1 + And counter "mysqli_query_ok_C" equals 1 + And counter "mysqli_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: latency does not corrupt the result set + Given a MySQLi database "DB" + And Toxiproxy adds <latency> ms latency to database "DB" + And a coroutine "C" + When coroutine "C" queries via mysqli "DB" + Then counter "mysqli_query_ok_C" equals 1 + And counter "mysqli_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | latency | + | 1 | + | 5 | + | 20 | + + Scenario: a bandwidth-throttled query still returns every row + Given a MySQLi database "DB" + And Toxiproxy throttles database "DB" to 64 KB/s + And a coroutine "C" + When coroutine "C" queries via mysqli "DB" + Then counter "mysqli_query_ok_C" equals 1 + And counter "mysqli_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: a TCP-sliced wire stream is reassembled exactly + Given a MySQLi database "DB" + And Toxiproxy slices database "DB" into <segment>-byte TCP segments + And a coroutine "C" + When coroutine "C" queries via mysqli "DB" + Then counter "mysqli_query_ok_C" equals 1 + And counter "mysqli_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | segment | + | 1 | + | 64 | + | 512 | + + Scenario: an RST mid-query surfaces as a clean error + Given a MySQLi database "DB" + And Toxiproxy resets database "DB" after 200 ms + And a coroutine "C" + When coroutine "C" runs a slow query via mysqli "DB" + Then counter "mysqli_slow_query_attempts_C" equals 1 + And counter "mysqli_slow_query_failed_C" equals 1 + And coroutine "C" is completed + And no orphan coroutines + + Scenario Outline: cancel a coroutine mid-query + # A killer cancels the coroutine while mysqli is parked in the reactor on + # the DB socket.
Only the liveness sum and no-hang / no-orphan are + # decidable under the random scheduler. + Given a MySQLi database "DB" + And a coroutine "C" + And a coroutine "K" + When coroutine "C" runs a slow query via mysqli "DB" + And coroutine "K" sleeps <ms> ms + And coroutine "K" cancels coroutine "C" + Then counter "mysqli_slow_query_ok_C" plus counter "mysqli_slow_query_cancelled_C" plus counter "mysqli_slow_query_failed_C" plus counter "mysqli_slow_query_no_db_C" equals counter "mysqli_slow_query_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Examples: + | ms | + | 0 | + | 50 | + | 300 | + | 900 | + + Scenario: many concurrent mysqli queries under latency + Given a MySQLi database "DB" + And Toxiproxy adds 5 ms latency to database "DB" + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" queries via mysqli "DB" + And coroutine "C2" queries via mysqli "DB" + And coroutine "C3" queries via mysqli "DB" + Then counter "mysqli_query_ok_C1" equals 1 + And counter "mysqli_query_ok_C2" equals 1 + And counter "mysqli_query_ok_C3" equals 1 + And counter "mysqli_query_rows_C2" equals 5 + And no orphan coroutines + + Scenario: a transaction commits through a latency-toxic'd connection + Given a MySQLi database "DB" + And Toxiproxy adds 5 ms latency to database "DB" + And a coroutine "C" + When coroutine "C" runs a transaction via mysqli "DB" + Then counter "mysqli_txn_attempts_C" equals 1 + And counter "mysqli_txn_ok_C" equals 1 + And counter "mysqli_txn_committed_C" equals 1 + And no orphan coroutines + + Scenario: an RST mid-transaction surfaces cleanly + # reset_peer drops the connection 150 ms in; the transaction must fail + # with a clean mysqli_sql_exception and the coroutine must complete.
+ Given a MySQLi database "DB" + And Toxiproxy resets database "DB" after 150 ms + And a coroutine "C" + When coroutine "C" runs a transaction via mysqli "DB" + Then counter "mysqli_txn_ok_C" plus counter "mysqli_txn_cancelled_C" plus counter "mysqli_txn_failed_C" plus counter "mysqli_txn_no_db_C" equals counter "mysqli_txn_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Scenario: mysqli toxics crossed with logic and scheduler chaos + Given a MySQLi database "DB" + One of: + - Toxiproxy adds 3 ms latency to database "DB" + - Toxiproxy throttles database "DB" to 128 KB/s + - Toxiproxy slices database "DB" into random:64-byte TCP segments + And a coroutine "C" + And a coroutine "N" + When coroutine "C" queries via mysqli "DB" + Any of: + - coroutine "N" sleeps 2 ms + - coroutine "N" sleeps 6 ms + Then counter "mysqli_query_ok_C" equals 1 + And counter "mysqli_query_rows_C" equals 5 + And no orphan coroutines From 35a7136b12284e079ae6d16cdba937c9ba9ff703 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 23 May 2026 09:33:16 +0000 Subject: [PATCH 4/8] #136 db-chaos: PgSQL coverage + concurrent transactions / heterogeneous workloads MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds the PgSQL half of the database chaos suite and a separate feature targeting the harder surface — multiple coroutines doing *different* things on one pooled PDO. PgSQL: - fuzzy-tests/db/pgsql_chaos.feature — 12 scenarios → 30 .phpt mirroring mysql_chaos.feature against a Toxiproxy-fronted PostgreSQL server. - ChaosNet::openDbConnection() and ChaosNet::dbUpstream() are now driver-aware (mysql / pgsql DSN, CHAOS_MYSQL / CHAOS_PGSQL upstream env vars). - Slow-query step picks SELECT pg_sleep(2) vs SELECT SLEEP(2) at action-run time from the database's declared driver. - New Given steps: a [pooled] PgSQL database "DB". 
Toxiproxy and PDO client steps relaxed to driver-agnostic requires — the database declaration carries the driver requirement. - generate.php: ext/pdo_pgsql + pgsql-server --SKIPIF-- probes. concurrent.feature (9 scenarios → 22 .phpt): - Concurrent transactions all commit. Reader + writer side by side (decidable: reader's WHERE id<=5 snapshot is unaffected by writer's id>5 insert). Mixed query/transaction/slow query. A transaction cancelled while a sibling reader keeps working. Concurrent transactions with one cancelled. Heterogeneous workload failing cleanly under reset_peer. Transaction/query storm on an undersized pool (4 conns, 6 coroutines). - StandardSteps::dbTransaction now runs a multi-statement transaction (BEGIN → INSERT → read-back SELECT → COMMIT) so each transaction yields several times and a sibling can be scheduled between any two statements — broader cross-coroutine interleaving. Found a real UAF in the PDO connection pool exposed by this feature (coroutine cancelled mid-transaction → SEGV in php_pdo_free_statement). Fixed in php-src ext/pdo (true-async/php-src#136). Verified: full 643-test fuzzy suite passes under fifo + five random scheduler seeds (1, 7, 42, 1009, 31337) — 100%. 
--- CHANGELOG.md | 2 + fuzzy-tests/_harness/ChaosNet.php | 44 ++++--- fuzzy-tests/_harness/Steps.php | 69 +++++++--- fuzzy-tests/_harness/generate.php | 11 ++ fuzzy-tests/db/concurrent.feature | 204 +++++++++++++++++++++++++++++ fuzzy-tests/db/pgsql_chaos.feature | 187 ++++++++++++++++++++++++++ 6 files changed, 485 insertions(+), 32 deletions(-) create mode 100644 fuzzy-tests/db/concurrent.feature create mode 100644 fuzzy-tests/db/pgsql_chaos.feature diff --git a/CHANGELOG.md b/CHANGELOG.md index 389e8b6..6bdefa6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added +- **#136 Database chaos: concurrent transactions and heterogeneous workloads** — new `fuzzy-tests/db/concurrent.feature` (9 scenarios → 22 .phpt) targets the harder surface where the per-driver suites stop: many coroutines doing *different* things at once on one pooled `$pdo` — concurrent transactions, a writer racing a reader (decidable: reader's `WHERE id<=5` snapshot is interleaving-independent of the writer's `id>5` insert), mixed query/transaction/slow-query workload, a transaction cancelled while a sibling reader keeps working, a transaction/query storm on an undersized pool (4-conn pool, 6 coroutines). The transaction body is now multi-statement (BEGIN → INSERT → read-back SELECT → COMMIT), so each transaction yields several times and a sibling can be scheduled between any two statements. **Found and exposed a real UAF in the PDO connection pool**: cancelling a coroutine mid-transaction segfaulted in `php_pdo_free_statement` — fixed in php-src `ext/pdo/pdo_pool.c` (`#136`); the pool slot release path force-zeroed `pool_slot_refcount` and cleared the template's per-coroutine error stash from a shared slot, dangling live `stmt->pooled_conn` pointers from the cancelled coroutine. 
+- **#136 Database chaos: async PDO PgSQL coverage** — new `fuzzy-tests/db/pgsql_chaos.feature` (12 scenarios → 30 .phpt) mirrors `mysql_chaos.feature` against a Toxiproxy-fronted PostgreSQL server, exercising the pdo_pgsql driver and the libpq wire protocol. The Toxiproxy toxic steps, `queries / runs a slow query / runs a transaction on database` client steps, `dbRun()` and `dbTransaction()` are now driver-aware: `ChaosNet::openDbConnection()` builds a `pgsql:` or `mysql:` DSN from the database's declared driver, `ChaosNet::dbUpstream()` selects the upstream env var (`CHAOS_PGSQL` / `CHAOS_MYSQL`, defaults `127.0.0.1:5432` / `:3306`), and the slow-query step picks `SELECT pg_sleep(2)` vs `SELECT SLEEP(2)` at action-run time. New steps `a [pooled] PgSQL database "DB"`; generate.php gains an ext/pdo_pgsql + a PostgreSQL-reachable `--SKIPIF--` probe. The Toxiproxy and PDO client steps' `requires` were relaxed to driver-agnostic — the database-declaration step pins the driver requirement. - **#136 Database chaos: async mysqli coverage** — `fuzzy-tests/db/mysqli_chaos.feature` (10 scenarios) extends the database chaos suite to the `mysqli` extension: the same Toxiproxy-fronted MySQL server reached through mysqli's own connection / result / prepared-statement API instead of PDO. mysqli has no connection pool, so each chaos query opens its own connection. New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`; the Toxiproxy toxic steps are now driver-agnostic and shared with the PDO suite. `ChaosNet::openMysqliConnection()` builds the connection (`MYSQLI_REPORT_STRICT` so faults raise `mysqli_sql_exception`). Generated `.phpt` carry an ext/mysqli `--SKIPIF--` probe. 
- **#136 Database chaos: async PDO MySQL coverage** — new chaos topic `fuzzy-tests/db/` with `mysql_chaos.feature`: the async PDO MySQL driver is exercised against a real MySQL server fronted by Toxiproxy (`client coroutine → Toxiproxy → MySQL`), so the transport toxics — latency, bandwidth caps, TCP slicing, `reset_peer` mid-query / mid-transaction — land on the driver's wire I/O. 12 scenarios cover a non-pooled and a pool-enabled connection: a query / transaction completing intact through a non-truncating toxic, a dropped connection surfacing as a clean `PDOException` (never a hang), a coroutine cancelled mid-query, and the connection pool failing every slot cleanly when the server connection is lost. New steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency to|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`; connection parameters come from the environment (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in like Toxiproxy: generated `.phpt` carry a `--SKIPIF--` probe for ext/pdo_mysql + a reachable MySQL server, so the suite stays inert on dev machines and per-PR CI. Closes the database half of the #136 coverage gap. The network-fixture layer of the harness — EvilPeers, Toxiproxy proxies, chaos databases — is extracted from `Context` into a dedicated `ChaosNet` class (`$ctx->net`) so `Context` stays the scenario orchestrator. - **#136 HTTP chaos: async ext/curl coverage** — EvilPeer gains an `http` mode (`EvilPeer::serveHttp`): it drains one HTTP request and writes back an HTTP/1.1 response, with the serve-mode body toxics (slice/drip/abrupt close/hard reset/forked peer/Toxiproxy) joined by HTTP-specific ones — chunked transfer-encoding, a mendacious `Content-Length` (over/under-stated), dribbled headers, an arbitrary status code. 
New chaos topic `fuzzy-tests/curl/` with `http_chaos.feature`: a reactor-driven `ext/curl` client (`coroutine "C" fetches peer "EP" over HTTP`) is exercised against every toxic, under the random scheduler, and cancelled mid-transfer — closing the CURL half of the #136 coverage gap (nothing previously exercised async curl under chaos). Generated `.phpt` carry a `--SKIPIF--` curl probe. **Found a real bug:** async curl dropped all but the first chunk of a chunked-encoded response body (`CURLE_WRITE_ERROR`) — fixed in php-src `ext/curl/curl_async.c` (`#136`); see `fuzzy-tests/FINDINGS.md`. diff --git a/fuzzy-tests/_harness/ChaosNet.php b/fuzzy-tests/_harness/ChaosNet.php index d3d02a6..32f2f96 100644 --- a/fuzzy-tests/_harness/ChaosNet.php +++ b/fuzzy-tests/_harness/ChaosNet.php @@ -129,25 +129,37 @@ public function addEvilDbToxic(string $name, string $type, array $attributes, st ]; } + /** + * The real DB-server address a driver's chaos databases are fronted onto. + * Comes from the environment so the suite adapts to whatever the CI / dev + * box exposes; chaos-friendly defaults match the local setup. + */ + public function dbUpstream(string $driver): string { + return $driver === 'pgsql' + ? (getenv('CHAOS_PGSQL') ?: '127.0.0.1:5432') + : (getenv('CHAOS_MYSQL') ?: '127.0.0.1:3306'); + } + /** * Open a PDO connection to a fronted database, through its Toxiproxy proxy. - * Connection parameters come from the environment (so the suite adapts to - * whatever DB the CI / dev box exposes) with chaos-friendly defaults: - * CHAOS_MYSQL (host:port) · CHAOS_MYSQL_USER · CHAOS_MYSQL_PASS · - * CHAOS_MYSQL_DB. - * A pool-enabled handle is created with POOL_MIN 0, so the constructor - * itself opens no socket — it neither yields nor needs a coroutine. + * Driver-aware (mysql / pgsql); connection parameters come from the + * environment — CHAOS_{MYSQL,PGSQL}[_USER|_PASS|_DB]. A pool-enabled handle + * is created with POOL_MIN 0, so the constructor opens no socket eagerly. 
*/ public function openDbConnection(string $db, bool $pool): \PDO { - $addr = $this->evilDbAddr[$db] ?? ''; - $colon = strrpos($addr, ':'); - $host = $colon === false ? $addr : substr($addr, 0, $colon); - $port = $colon === false ? 3306 : (int) substr($addr, $colon + 1); - $user = getenv('CHAOS_MYSQL_USER') ?: 'test'; - $pass = getenv('CHAOS_MYSQL_PASS') ?: 'test'; - $name = getenv('CHAOS_MYSQL_DB') ?: 'chaos_test'; - $dsn = "mysql:host=$host;port=$port;dbname=$name"; - $opts = [ + $driver = $this->evilDbDefs[$db]['driver'] ?? 'mysql'; + $addr = $this->evilDbAddr[$db] ?? ''; + $colon = strrpos($addr, ':'); + $pgsql = $driver === 'pgsql'; + $host = $colon === false ? $addr : substr($addr, 0, $colon); + $port = $colon === false ? ($pgsql ? 5432 : 3306) : (int) substr($addr, $colon + 1); + $envPfx = $pgsql ? 'CHAOS_PGSQL' : 'CHAOS_MYSQL'; + $user = getenv("{$envPfx}_USER") ?: 'test'; + $pass = getenv("{$envPfx}_PASS") ?: 'test'; + $name = getenv("{$envPfx}_DB") ?: 'chaos_test'; + $dsn = sprintf('%s:host=%s;port=%d;dbname=%s', + $pgsql ? 'pgsql' : 'mysql', $host, $port, $name); + $opts = [ \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION, \PDO::ATTR_TIMEOUT => 5, ]; @@ -261,7 +273,7 @@ public function setUp(Context $ctx): array { // pool-enabled DB also gets its one shared PDO handle built here. foreach ($this->evilDbDefs as $name => $spec) { $this->toxiproxy ??= new ToxiproxyClient(); - $upstream = getenv('CHAOS_MYSQL') ?: '127.0.0.1:3306'; + $upstream = $this->dbUpstream($spec['driver'] ?? 
'mysql'); $proxyName = sprintf('chaosdb_%d_%s_%s', getmypid(), bin2hex(random_bytes(3)), $name); $listen = $this->toxiproxy->createProxy($proxyName, '127.0.0.1:0', $upstream); $this->toxiproxyProxies[] = $proxyName; diff --git a/fuzzy-tests/_harness/Steps.php b/fuzzy-tests/_harness/Steps.php index ea05485..7efad43 100644 --- a/fuzzy-tests/_harness/Steps.php +++ b/fuzzy-tests/_harness/Steps.php @@ -512,7 +512,7 @@ function(Context $ctx, string $latExpr, string $name) { $lat = (int)$ctx->resolver->resolve($latExpr); $ctx->net->addEvilDbToxic($name, 'latency', ['latency' => $lat]); }) - ->requires('toxiproxy', 'mysql-server'); + ->requires('toxiproxy'); // Given Toxiproxy throttles database "DB" to N KB/s $r->on('/^Toxiproxy throttles database "([^"]+)" to (\S+) KB\/s$/', @@ -520,7 +520,7 @@ function(Context $ctx, string $name, string $rateExpr) { $rate = (int)$ctx->resolver->resolve($rateExpr); $ctx->net->addEvilDbToxic($name, 'bandwidth', ['rate' => $rate]); }) - ->requires('toxiproxy', 'mysql-server'); + ->requires('toxiproxy'); // Given Toxiproxy slices database "DB" into N-byte TCP segments $r->on('/^Toxiproxy slices database "([^"]+)" into (\S+)-byte TCP segments$/', @@ -532,7 +532,7 @@ function(Context $ctx, string $name, string $sizeExpr) { 'delay' => 0, ]); }) - ->requires('toxiproxy', 'mysql-server'); + ->requires('toxiproxy'); // Given Toxiproxy resets database "DB" after N ms // reset_peer toxic — a TCP RST N ms into the connection; lands @@ -542,7 +542,7 @@ function(Context $ctx, string $name, string $msExpr) { $ms = (int)$ctx->resolver->resolve($msExpr); $ctx->net->addEvilDbToxic($name, 'reset_peer', ['timeout' => $ms]); }) - ->requires('toxiproxy', 'mysql-server'); + ->requires('toxiproxy'); // Given a MySQLi database "DB" // The same Toxiproxy-fronted MySQL server, reached through the mysqli @@ -555,6 +555,33 @@ function(Context $ctx, string $name) { }) ->requires('toxiproxy', 'mysqli', 'mysql-server'); + // Given a PgSQL database "DB" + // A 
PostgreSQL server, fronted by Toxiproxy exactly like the MySQL + // one. The Toxiproxy toxic steps and the `queries / runs a slow query + // on / runs a transaction on database` client steps are all + // driver-agnostic — dbRun()/dbTransaction() build a pgsql: DSN when + // the database's driver is pgsql. + $r->on('/^a PgSQL database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'pgsql'); + }) + ->requires('toxiproxy', 'pdo_pgsql', 'pgsql-server'); + + // Given a pooled PgSQL database "DB" + $r->on('/^a pooled PgSQL database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'pgsql', true); + }) + ->requires('toxiproxy', 'pdo_pgsql', 'pgsql-server'); + + // Given a pooled PgSQL database "DB" with N connections + $r->on('/^a pooled PgSQL database "([^"]+)" with (\S+) connections$/', + function(Context $ctx, string $name, string $nExpr) { + $n = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilDb($name, 'pgsql', true, $n > 0 ? $n : 1); + }) + ->requires('toxiproxy', 'pdo_pgsql', 'pgsql-server'); + // ---- When: actions inside a coroutine ---- // When coroutine "X" downloads from peer "EP" @@ -614,20 +641,23 @@ function(Context $ctx, string $coro, string $db) { 'SELECT id, label, n FROM items WHERE id <= 5 ORDER BY id', 'query'); }); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy'); // When coroutine "X" runs a slow query on database "DB" - // SELECT SLEEP(2) — keeps the coroutine parked in the reactor on the - // DB socket long enough for a killer to cancel it or a reset_peer - // toxic to land mid-query. + // A ~2 s server-side sleep — keeps the coroutine parked in the reactor + // on the DB socket long enough for a killer to cancel it or a + // reset_peer toxic to land mid-query. The sleep SQL is driver-specific + // (MySQL SLEEP() vs PostgreSQL pg_sleep()), resolved when the action + // runs — by then the database's driver is known. 
$r->on('/^coroutine "([^"]+)" runs a slow query on database "([^"]+)"$/', function(Context $ctx, string $coro, string $db) { $ctx->planAction($coro, function(Context $ctx) use ($coro, $db) { - StandardSteps::dbRun($ctx, $coro, $db, - 'SELECT SLEEP(2)', 'slow_query'); + $driver = $ctx->net->evilDbDefs[$db]['driver'] ?? 'mysql'; + $sql = $driver === 'pgsql' ? 'SELECT pg_sleep(2)' : 'SELECT SLEEP(2)'; + StandardSteps::dbRun($ctx, $coro, $db, $sql, 'slow_query'); }); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy'); // When coroutine "X" runs a transaction on database "DB" // BEGIN → INSERT → COMMIT. A connection fault mid-transaction must @@ -640,7 +670,7 @@ function(Context $ctx, string $coro, string $db) { StandardSteps::dbTransaction($ctx, $coro, $db); }); }) - ->requires('toxiproxy', 'pdo_mysql', 'mysql-server'); + ->requires('toxiproxy'); // When coroutine "X" queries via mysqli "DB" // Same SELECT as the PDO query step, but over the mysqli extension — @@ -3583,10 +3613,13 @@ public static function dbRun(Context $ctx, string $coro, string $db, string $sql /** * Shared async-PDO routine for the "runs a transaction" database step: - * BEGIN → INSERT → COMMIT. A connection fault mid-transaction must surface - * as a clean error — the coroutine terminates, the connection (or pool - * slot) is not left wedged, and the server rolls the transaction back on - * the dropped connection. Outcome buckets mirror dbRun(): + * BEGIN → INSERT → read-back SELECT → COMMIT. The multi-statement body + * means several reactor round-trips inside one transaction, so a random + * scheduler can interleave a sibling coroutine's work between any two of + * them. A connection fault mid-transaction must surface as a clean error — + * the coroutine terminates, the connection (or pool slot) is not left + * wedged, and the server rolls the transaction back on the dropped + * connection. 
Outcome buckets mirror dbRun(): * db_txn_ok / db_txn_cancelled / db_txn_failed / db_txn_no_db sum to * db_txn_attempts; db_txn_committed counts the transactions that COMMIT * actually acknowledged. @@ -3610,6 +3643,10 @@ public static function dbTransaction(Context $ctx, string $coro, string $db): vo @$pdo->beginTransaction(); $stmt = @$pdo->prepare('INSERT INTO items (label, n) VALUES (?, ?)'); @$stmt->execute(["txn-$coro", 0]); + // Read-back inside the transaction — another reactor round-trip a + // sibling coroutine can be scheduled across. + $check = @$pdo->query('SELECT COUNT(*) FROM items WHERE id <= 5'); + @$check->fetch(\PDO::FETCH_NUM); @$pdo->commit(); $ctx->inc("db_txn_committed_$coro"); $ctx->inc("db_txn_ok_$coro"); diff --git a/fuzzy-tests/_harness/generate.php b/fuzzy-tests/_harness/generate.php index 8524dee..5131287 100644 --- a/fuzzy-tests/_harness/generate.php +++ b/fuzzy-tests/_harness/generate.php @@ -204,6 +204,17 @@ function findFeatures(string $root): array { PROBE, 'pdo_mysql' => 'if (!extension_loaded("pdo_mysql")) { echo "skip ext/pdo_mysql required"; exit; }', 'mysqli' => 'if (!extension_loaded("mysqli")) { echo "skip ext/mysqli required"; exit; }', + 'pdo_pgsql' => 'if (!extension_loaded("pdo_pgsql")) { echo "skip ext/pdo_pgsql required"; exit; }', + // A reachable PostgreSQL server, opt-in like the MySQL one. + 'pgsql-server' => <<<'PROBE' +$ps = getenv("CHAOS_PGSQL") ?: "127.0.0.1:5432"; +$pp = strrpos($ps, ":"); +$ph = $pp === false ? $ps : substr($ps, 0, $pp); +$pport = $pp === false ? 5432 : (int)substr($ps, $pp + 1); +$psk = @stream_socket_client("tcp://$ph:$pport", $pe, $pm, 2); +if ($psk === false) { echo "skip PostgreSQL not reachable at $ps (set CHAOS_PGSQL)"; exit; } +fclose($psk); +PROBE, // A reachable MySQL server is opt-in, like Toxiproxy: the database chaos // tests run only where one answers and skip everywhere else. The probe is // a plain TCP connect — the upstream the Toxiproxy proxy will point at. 
diff --git a/fuzzy-tests/db/concurrent.feature b/fuzzy-tests/db/concurrent.feature new file mode 100644 index 0000000..800d4df --- /dev/null +++ b/fuzzy-tests/db/concurrent.feature @@ -0,0 +1,204 @@ +Feature: Database chaos — concurrent transactions and heterogeneous workloads + + The per-driver suites (mysql_chaos / pgsql_chaos / mysqli_chaos) drive one + kind of work at a time. This feature targets the harder surface: many + coroutines doing *different* things at once against the same database — + one running a transaction while another reads, several transactions + committing concurrently, a transaction cancelled while a sibling keeps + working — all interleaved by the random scheduler. + + The transaction body is multi-statement (BEGIN → INSERT → read-back SELECT + → COMMIT), so each transaction yields to the reactor several times and a + sibling coroutine can be scheduled between any two of its statements. The + PDO connection pool gives every coroutine its own slot, so concurrent + transactions never share a connection — but the pool's acquire / release / + in-transaction bookkeeping is exercised hard. + + Invariants, decidable regardless of interleaving: + - every transaction either fully commits (db_txn_committed) or does not; + the outcome buckets sum to the attempt count; + - a reader querying the five seed rows (ids 1..5) always sees exactly + five — a concurrent writer only ever appends rows with id > 5, so the + reader's result is interleaving-independent; + - cancelling or faulting one coroutine never disturbs a sibling; + - no coroutine hangs or is orphaned, and the pool never wedges. 
+ + Scenario: concurrent transactions all commit + Given a pooled MySQL database "PDB" with 4 connections + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + And a coroutine "C4" + When coroutine "C1" runs a transaction on database "PDB" + And coroutine "C2" runs a transaction on database "PDB" + And coroutine "C3" runs a transaction on database "PDB" + And coroutine "C4" runs a transaction on database "PDB" + Then counter "db_txn_ok_C1" equals 1 + And counter "db_txn_ok_C2" equals 1 + And counter "db_txn_ok_C3" equals 1 + And counter "db_txn_ok_C4" equals 1 + And counter "db_txn_committed_C1" equals 1 + And counter "db_txn_committed_C4" equals 1 + And no orphan coroutines + + Scenario: a reader and a writer run side by side + # The writer commits a transaction (appending a row with id > 5) while the + # reader queries the five seed rows. The reader must see exactly five rows + # regardless of how the two interleave. + Given a pooled MySQL database "PDB" with 2 connections + And a coroutine "W" + And a coroutine "R" + When coroutine "W" runs a transaction on database "PDB" + And coroutine "R" queries database "PDB" + Then counter "db_txn_ok_W" equals 1 + And counter "db_txn_committed_W" equals 1 + And counter "db_query_ok_R" equals 1 + And counter "db_query_rows_R" equals 5 + And no orphan coroutines + + Scenario: a mixed workload — query, transaction and slow query together + # Three coroutines doing three different things at once, through one + # latency-toxic'd pool. None of the work truncates, so all three complete. 
+ Given a pooled MySQL database "PDB" with 3 connections + And Toxiproxy adds 5 ms latency to database "PDB" + And a coroutine "Q" + And a coroutine "T" + And a coroutine "S" + When coroutine "Q" queries database "PDB" + And coroutine "T" runs a transaction on database "PDB" + And coroutine "S" runs a slow query on database "PDB" + Then counter "db_query_ok_Q" equals 1 + And counter "db_query_rows_Q" equals 5 + And counter "db_txn_ok_T" equals 1 + And counter "db_slow_query_ok_S" equals 1 + And coroutine "S" is completed + And no orphan coroutines + + Scenario: concurrent transactions under latency + Given a pooled MySQL database "PDB" with 3 connections + And Toxiproxy adds 5 ms latency to database "PDB" + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" runs a transaction on database "PDB" + And coroutine "C2" runs a transaction on database "PDB" + And coroutine "C3" runs a transaction on database "PDB" + Then counter "db_txn_committed_C1" equals 1 + And counter "db_txn_committed_C2" equals 1 + And counter "db_txn_committed_C3" equals 1 + And no orphan coroutines + + Scenario Outline: a transaction cancelled while a sibling keeps working + # A killer cancels the transaction coroutine; the reader sibling is a + # different coroutine and must be wholly unaffected — it still returns its + # five rows. The transaction's own outcome is interleaving-dependent, so + # only its liveness sum is decidable. 
+ Given a pooled MySQL database "PDB" with 3 connections + And a coroutine "T" + And a coroutine "R" + And a coroutine "K" + When coroutine "T" runs a transaction on database "PDB" + And coroutine "R" queries database "PDB" + And coroutine "K" sleeps <ms> ms + And coroutine "K" cancels coroutine "T" + Then counter "db_txn_ok_T" plus counter "db_txn_cancelled_T" plus counter "db_txn_failed_T" plus counter "db_txn_no_db_T" equals counter "db_txn_attempts_T" + And counter "db_query_ok_R" equals 1 + And counter "db_query_rows_R" equals 5 + And coroutine "T" is completed + And coroutine "R" is completed + And no orphan coroutines + + Examples: + | ms | + | 0 | + | 2 | + | 8 | + + Scenario: concurrent transactions, one cancelled + # Three transactions run together; the killer cancels exactly one. The + # other two are independent coroutines and must still commit.
+ Given a pooled MySQL database "PDB" with 3 connections + And Toxiproxy resets database "PDB" after 200 ms + And a coroutine "T" + And a coroutine "Q" + And a coroutine "S" + When coroutine "T" runs a transaction on database "PDB" + And coroutine "Q" queries database "PDB" + And coroutine "S" runs a slow query on database "PDB" + Then counter "db_txn_ok_T" plus counter "db_txn_cancelled_T" plus counter "db_txn_failed_T" plus counter "db_txn_no_db_T" equals counter "db_txn_attempts_T" + And counter "db_query_ok_Q" plus counter "db_query_cancelled_Q" plus counter "db_query_failed_Q" plus counter "db_query_no_db_Q" equals counter "db_query_attempts_Q" + And coroutine "T" is completed + And coroutine "Q" is completed + And coroutine "S" is completed + And no orphan coroutines + + Scenario: a transaction/query storm on an undersized pool + # Six coroutines — three transactions, three reads — share a pool of only + # four connections, so two coroutines must wait for a slot to free. Every + # coroutine still completes its work: the pool serialises acquisition + # without losing or corrupting a transaction. 
+ Given a pooled MySQL database "PDB" with 4 connections + And a coroutine "T1" + And a coroutine "Q1" + And a coroutine "T2" + And a coroutine "Q2" + And a coroutine "T3" + And a coroutine "Q3" + When coroutine "T1" runs a transaction on database "PDB" + And coroutine "Q1" queries database "PDB" + And coroutine "T2" runs a transaction on database "PDB" + And coroutine "Q2" queries database "PDB" + And coroutine "T3" runs a transaction on database "PDB" + And coroutine "Q3" queries database "PDB" + Then counter "db_txn_committed_T1" equals 1 + And counter "db_txn_committed_T2" equals 1 + And counter "db_txn_committed_T3" equals 1 + And counter "db_query_rows_Q1" equals 5 + And counter "db_query_rows_Q2" equals 5 + And counter "db_query_rows_Q3" equals 5 + And no orphan coroutines + + Scenario: reader and writer crossed with transport and scheduler chaos + # A fixed writer/reader pair around a non-truncating toxic and a scheduler + # perturbation — the writer always commits, the reader always sees its + # five rows, whatever the cross-product. 
+ Given a pooled MySQL database "PDB" with 3 connections + One of: + - Toxiproxy adds 3 ms latency to database "PDB" + - Toxiproxy throttles database "PDB" to 128 KB/s + - Toxiproxy slices database "PDB" into random:64-byte TCP segments + Given a coroutine "W" + And a coroutine "R" + And a coroutine "N" + When coroutine "W" runs a transaction on database "PDB" + And coroutine "R" queries database "PDB" + Any of: + - coroutine "N" sleeps 2 ms + - coroutine "N" sleeps 6 ms + Then counter "db_txn_ok_W" equals 1 + And counter "db_txn_committed_W" equals 1 + And counter "db_query_ok_R" equals 1 + And counter "db_query_rows_R" equals 5 + And no orphan coroutines diff --git a/fuzzy-tests/db/pgsql_chaos.feature b/fuzzy-tests/db/pgsql_chaos.feature new file mode 100644 index 0000000..b4a7224 --- /dev/null +++ b/fuzzy-tests/db/pgsql_chaos.feature @@ -0,0 +1,187 @@ +Feature: Database chaos — async PDO PgSQL against a Toxiproxy-fronted server + + The PostgreSQL half of the #136 database coverage. Identical chaos model to + the PDO MySQL suite — Toxiproxy between the async PDO client and a real + server, every connect / query / transaction through the libuv reactor — but + exercising the pdo_pgsql driver and the libpq wire protocol. + + client coroutine ──▶ Toxiproxy proxy ──▶ real PostgreSQL server + └── latency / bandwidth / slicer / reset_peer + + Opt-in: every scenario needs ext/pdo_pgsql, a reachable PostgreSQL server + (CHAOS_PGSQL, default 127.0.0.1:5432, schema seeded with a five-row `items` + table) and a running Toxiproxy; the generated .phpt carry a --SKIPIF-- + probe for all three. 
+ + Invariants, decidable regardless of interleaving: + - a non-truncating toxic leaves the result set intact — the query still + returns its five rows; + - a dropped connection surfaces as a clean PDOException, never a hang; + - a cancel mid-query leaves the coroutine completed and unorphaned, and + the outcome buckets sum to the attempt count; + - the connection pool fails every slot cleanly when the server + connection is lost. + + Scenario: a query through a clean proxy returns every row + Given a PgSQL database "DB" + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_attempts_C" equals 1 + And counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: latency does not corrupt the result set + Given a PgSQL database "DB" + And Toxiproxy adds <latency> ms latency to database "DB" + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | latency | + | 1 | + | 5 | + | 20 | + + Scenario: a bandwidth-throttled query still returns every row + Given a PgSQL database "DB" + And Toxiproxy throttles database "DB" to 64 KB/s + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Scenario Outline: a TCP-sliced wire stream is reassembled exactly + # Toxiproxy chops the libpq wire protocol into tiny TCP segments — the + # driver must reassemble the message frames whatever the fragmentation.
+ Given a PgSQL database "DB" + And Toxiproxy slices database "DB" into <segment>-byte TCP segments + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Examples: + | segment | + | 1 | + | 64 | + | 512 | + + Scenario: an RST mid-query surfaces as a clean error + Given a PgSQL database "DB" + And Toxiproxy resets database "DB" after 200 ms + And a coroutine "C" + When coroutine "C" runs a slow query on database "DB" + Then counter "db_slow_query_attempts_C" equals 1 + And counter "db_slow_query_failed_C" equals 1 + And coroutine "C" is completed + And no orphan coroutines + + Scenario Outline: cancel a coroutine mid-query + Given a PgSQL database "DB" + And a coroutine "C" + And a coroutine "K" + When coroutine "C" runs a slow query on database "DB" + And coroutine "K" sleeps <ms> ms + And coroutine "K" cancels coroutine "C" + Then counter "db_slow_query_ok_C" plus counter "db_slow_query_cancelled_C" plus counter "db_slow_query_failed_C" plus counter "db_slow_query_no_db_C" equals counter "db_slow_query_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Examples: + | ms | + | 0 | + | 50 | + | 300 | + | 900 | + + Scenario: many coroutines share one pooled connection handle + Given a pooled PgSQL database "PDB" with 4 connections + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + And a coroutine "C4" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + And coroutine "C4" queries database "PDB" + Then counter "db_query_ok_C1" equals 1 + And counter "db_query_ok_C2" equals 1 + And counter "db_query_ok_C3" equals 1 + And counter "db_query_ok_C4" equals 1 + And counter "db_query_rows_C1" equals 5 + And counter "db_query_rows_C4" equals 5 + And no orphan coroutines + + Scenario: concurrent pooled queries under latency + Given a pooled PgSQL
database "PDB" with 3 connections + And Toxiproxy adds 5 ms latency to database "PDB" + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + Then counter "db_query_ok_C1" equals 1 + And counter "db_query_ok_C2" equals 1 + And counter "db_query_ok_C3" equals 1 + And no orphan coroutines + + Scenario: the pool fails every query cleanly when the server connection drops + Given a pooled PgSQL database "PDB" with 4 connections + And Toxiproxy resets database "PDB" after 200 ms + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + When coroutine "C1" queries database "PDB" + And coroutine "C2" queries database "PDB" + And coroutine "C3" queries database "PDB" + Then counter "db_query_ok_C1" plus counter "db_query_failed_C1" equals counter "db_query_attempts_C1" + And counter "db_query_ok_C2" plus counter "db_query_failed_C2" equals counter "db_query_attempts_C2" + And counter "db_query_ok_C3" plus counter "db_query_failed_C3" equals counter "db_query_attempts_C3" + And coroutine "C1" is completed + And coroutine "C2" is completed + And coroutine "C3" is completed + And no orphan coroutines + + Scenario: a transaction commits through a latency-toxic'd connection + Given a PgSQL database "DB" + And Toxiproxy adds 5 ms latency to database "DB" + And a coroutine "C" + When coroutine "C" runs a transaction on database "DB" + Then counter "db_txn_attempts_C" equals 1 + And counter "db_txn_ok_C" equals 1 + And counter "db_txn_committed_C" equals 1 + And no orphan coroutines + + Scenario: an RST mid-transaction surfaces cleanly and wedges nothing + Given a pooled PgSQL database "PDB" + And Toxiproxy resets database "PDB" after 150 ms + And a coroutine "C" + And a coroutine "K" + When coroutine "C" runs a transaction on database "PDB" + And coroutine "K" runs a slow query on database "PDB" + Then counter 
"db_txn_ok_C" plus counter "db_txn_cancelled_C" plus counter "db_txn_failed_C" plus counter "db_txn_no_db_C" equals counter "db_txn_attempts_C" + And coroutine "C" is completed + And no orphan coroutines + + Scenario: database toxics crossed with logic and scheduler chaos + Given a PgSQL database "DB" + One of: + - Toxiproxy adds 3 ms latency to database "DB" + - Toxiproxy throttles database "DB" to 128 KB/s + - Toxiproxy slices database "DB" into random:64-byte TCP segments + And a coroutine "C" + And a coroutine "N" + When coroutine "C" queries database "DB" + Any of: + - coroutine "N" sleeps 2 ms + - coroutine "N" sleeps 6 ms + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines From 0a5aaaaad79208e0da728164513c13ce22a0f490 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 23 May 2026 10:16:44 +0000 Subject: [PATCH 5/8] #136 db-chaos: SQLite pool-focused coverage MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the last driver of the "all four" set from #136. SQLite is a local file — no Toxiproxy, no network toxics, and pdo_sqlite operations do not yield to the reactor. The chaos surface is the PDO connection pool itself: per-coroutine sqlite3* slots over one shared file. - fuzzy-tests/db/sqlite_chaos.feature — 6 scenarios: pooled query, many concurrent queries, transaction commit, concurrent transactions, reader + writer sharing the pool, transaction/query storm on an undersized pool (3 conns, 5 coroutines). - ChaosNet::openDbConnection() learns the sqlite driver — DSN is "sqlite:", no user/pass, no upstream port. setUp() per-pid seeds a fresh file (CREATE TABLE items + 5 seed rows) and registers the path as the db address; tearDown() unlinks it. - New Given steps: a [pooled] SQLite database "DB". Existing driver-agnostic client steps (queries / runs a transaction on database) reuse as-is. 
- generate.php: pdo_sqlite --SKIPIF-- rule. Full 649-test fuzzy suite passes 100%. --- CHANGELOG.md | 1 + fuzzy-tests/_harness/ChaosNet.php | 51 +++++++++++++--- fuzzy-tests/_harness/Steps.php | 24 ++++++++ fuzzy-tests/_harness/generate.php | 1 + fuzzy-tests/db/sqlite_chaos.feature | 94 +++++++++++++++++++++++++++++ 5 files changed, 161 insertions(+), 10 deletions(-) create mode 100644 fuzzy-tests/db/sqlite_chaos.feature diff --git a/CHANGELOG.md b/CHANGELOG.md index 6bdefa6..a1517fd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added +- **#136 Database chaos: async PDO SQLite (pool-focused)** — `fuzzy-tests/db/sqlite_chaos.feature` (6 scenarios). SQLite is local-file-only: no Toxiproxy, no network toxics, and pdo_sqlite operations do not yield to the reactor (nothing to poll). What this exercises is the PDO connection pool itself — per-coroutine `sqlite3*` slots over one shared file, acquire / release / slot reuse under many coroutines, and the same multi-statement transaction body that surfaced the cancellation UAF on the network drivers. Schema is seeded per-scenario into a unique file under `sys_get_temp_dir()` (per-PID name) and removed in tearDown. New steps `a [pooled] SQLite database "DB"`; only `pdo_sqlite` is required by SKIPIF — runs everywhere `ext/pdo_sqlite` is present. 
- **#136 Database chaos: concurrent transactions and heterogeneous workloads** — new `fuzzy-tests/db/concurrent.feature` (9 scenarios → 22 .phpt) targets the harder surface where the per-driver suites stop: many coroutines doing *different* things at once on one pooled `$pdo` — concurrent transactions, a writer racing a reader (decidable: reader's `WHERE id<=5` snapshot is interleaving-independent of the writer's `id>5` insert), mixed query/transaction/slow-query workload, a transaction cancelled while a sibling reader keeps working, a transaction/query storm on an undersized pool (4-conn pool, 6 coroutines). The transaction body is now multi-statement (BEGIN → INSERT → read-back SELECT → COMMIT), so each transaction yields several times and a sibling can be scheduled between any two statements. **Found and exposed a real UAF in the PDO connection pool**: cancelling a coroutine mid-transaction segfaulted in `php_pdo_free_statement` — fixed in php-src `ext/pdo/pdo_pool.c` (`#136`); the pool slot release path force-zeroed `pool_slot_refcount` and cleared the template's per-coroutine error stash from a shared slot, dangling live `stmt->pooled_conn` pointers from the cancelled coroutine. - **#136 Database chaos: async PDO PgSQL coverage** — new `fuzzy-tests/db/pgsql_chaos.feature` (12 scenarios → 30 .phpt) mirrors `mysql_chaos.feature` against a Toxiproxy-fronted PostgreSQL server, exercising the pdo_pgsql driver and the libpq wire protocol. The Toxiproxy toxic steps, `queries / runs a slow query / runs a transaction on database` client steps, `dbRun()` and `dbTransaction()` are now driver-aware: `ChaosNet::openDbConnection()` builds a `pgsql:` or `mysql:` DSN from the database's declared driver, `ChaosNet::dbUpstream()` selects the upstream env var (`CHAOS_PGSQL` / `CHAOS_MYSQL`, defaults `127.0.0.1:5432` / `:3306`), and the slow-query step picks `SELECT pg_sleep(2)` vs `SELECT SLEEP(2)` at action-run time. 
New steps `a [pooled] PgSQL database "DB"`; generate.php gains an ext/pdo_pgsql + a PostgreSQL-reachable `--SKIPIF--` probe. The Toxiproxy and PDO client steps' `requires` were relaxed to driver-agnostic — the database-declaration step pins the driver requirement. - **#136 Database chaos: async mysqli coverage** — `fuzzy-tests/db/mysqli_chaos.feature` (10 scenarios) extends the database chaos suite to the `mysqli` extension: the same Toxiproxy-fronted MySQL server reached through mysqli's own connection / result / prepared-statement API instead of PDO. mysqli has no connection pool, so each chaos query opens its own connection. New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`; the Toxiproxy toxic steps are now driver-agnostic and shared with the PDO suite. `ChaosNet::openMysqliConnection()` builds the connection (`MYSQLI_REPORT_STRICT` so faults raise `mysqli_sql_exception`). Generated `.phpt` carry an ext/mysqli `--SKIPIF--` probe. diff --git a/fuzzy-tests/_harness/ChaosNet.php b/fuzzy-tests/_harness/ChaosNet.php index 32f2f96..2b98765 100644 --- a/fuzzy-tests/_harness/ChaosNet.php +++ b/fuzzy-tests/_harness/ChaosNet.php @@ -148,6 +148,18 @@ public function dbUpstream(string $driver): string { */ public function openDbConnection(string $db, bool $pool): \PDO { $driver = $this->evilDbDefs[$db]['driver'] ?? 'mysql'; + $opts = [\PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION]; + + if ($pool) { + $opts[\PDO::ATTR_POOL_ENABLED] = true; + $opts[\PDO::ATTR_POOL_MIN] = 0; + $opts[\PDO::ATTR_POOL_MAX] = $this->evilDbDefs[$db]['poolMax'] ?? 4; + } + + if ($driver === 'sqlite') { + return new \PDO('sqlite:' . $this->evilDbAddr[$db], null, null, $opts); + } + $addr = $this->evilDbAddr[$db] ?? 
''; $colon = strrpos($addr, ':'); $pgsql = $driver === 'pgsql'; @@ -159,15 +171,9 @@ public function openDbConnection(string $db, bool $pool): \PDO { $name = getenv("{$envPfx}_DB") ?: 'chaos_test'; $dsn = sprintf('%s:host=%s;port=%d;dbname=%s', $pgsql ? 'pgsql' : 'mysql', $host, $port, $name); - $opts = [ - \PDO::ATTR_ERRMODE => \PDO::ERRMODE_EXCEPTION, - \PDO::ATTR_TIMEOUT => 5, - ]; - if ($pool) { - $opts[\PDO::ATTR_POOL_ENABLED] = true; - $opts[\PDO::ATTR_POOL_MIN] = 0; - $opts[\PDO::ATTR_POOL_MAX] = $this->evilDbDefs[$db]['poolMax'] ?? 4; - } + + $opts[\PDO::ATTR_TIMEOUT] = 5; + return new \PDO($dsn, $user, $pass, $opts); } @@ -272,8 +278,26 @@ public function setUp(Context $ctx): array { // the driver's wire I/O — latency / bandwidth / RST mid-query. A // pool-enabled DB also gets its one shared PDO handle built here. foreach ($this->evilDbDefs as $name => $spec) { + $driver = $spec['driver'] ?? 'mysql'; + + if ($driver === 'sqlite') { + $path = sprintf('%s/chaos_sqlite_%d_%s.db', sys_get_temp_dir(), getmypid(), $name); + @unlink($path); + $seed = new \PDO('sqlite:' . $path); + $seed->exec('CREATE TABLE items (id INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT, n INT)'); + $seed->exec("INSERT INTO items (label, n) VALUES ('alpha',1),('beta',2),('gamma',3),('delta',4),('epsilon',5)"); + $seed = null; + $this->evilDbAddr[$name] = $path; + + if ($spec['pool']) { + $this->evilDbPool[$name] = $this->openDbConnection($name, true); + } + + continue; + } + $this->toxiproxy ??= new ToxiproxyClient(); - $upstream = $this->dbUpstream($spec['driver'] ?? 'mysql'); + $upstream = $this->dbUpstream($driver); $proxyName = sprintf('chaosdb_%d_%s_%s', getmypid(), bin2hex(random_bytes(3)), $name); $listen = $this->toxiproxy->createProxy($proxyName, '127.0.0.1:0', $upstream); $this->toxiproxyProxies[] = $proxyName; @@ -309,6 +333,13 @@ public function tearDown(): void { // releases every per-coroutine connection it still holds. 
$this->evilDbPool = []; + // Delete per-scenario SQLite files. + foreach ($this->evilDbDefs as $name => $spec) { + if (($spec['driver'] ?? '') === 'sqlite' && isset($this->evilDbAddr[$name])) { + @unlink($this->evilDbAddr[$name]); + } + } + // Close every EvilPeer listening socket left open. foreach ($this->evilPeerServers as $server) { if (is_resource($server)) { diff --git a/fuzzy-tests/_harness/Steps.php b/fuzzy-tests/_harness/Steps.php index 7efad43..545561e 100644 --- a/fuzzy-tests/_harness/Steps.php +++ b/fuzzy-tests/_harness/Steps.php @@ -582,6 +582,30 @@ function(Context $ctx, string $name, string $nExpr) { }) ->requires('toxiproxy', 'pdo_pgsql', 'pgsql-server'); + // Given a [pooled] SQLite database "DB" + // SQLite is a local file — no Toxiproxy, no network toxics. The chaos + // surface is the PDO pool itself: per-coroutine sqlite3* slots over one + // shared file, with the same client steps (queries / runs a transaction + // on database) as the network drivers. + $r->on('/^a SQLite database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'sqlite'); + }) + ->requires('pdo_sqlite'); + + $r->on('/^a pooled SQLite database "([^"]+)"$/', + function(Context $ctx, string $name) { + $ctx->net->defineEvilDb($name, 'sqlite', true); + }) + ->requires('pdo_sqlite'); + + $r->on('/^a pooled SQLite database "([^"]+)" with (\S+) connections$/', + function(Context $ctx, string $name, string $nExpr) { + $n = (int)$ctx->resolver->resolve($nExpr); + $ctx->net->defineEvilDb($name, 'sqlite', true, $n > 0 ? 
$n : 1); + }) + ->requires('pdo_sqlite'); + // ---- When: actions inside a coroutine ---- // When coroutine "X" downloads from peer "EP" diff --git a/fuzzy-tests/_harness/generate.php b/fuzzy-tests/_harness/generate.php index 5131287..a9546c1 100644 --- a/fuzzy-tests/_harness/generate.php +++ b/fuzzy-tests/_harness/generate.php @@ -203,6 +203,7 @@ function findFeatures(string $root): array { fclose($ts); PROBE, 'pdo_mysql' => 'if (!extension_loaded("pdo_mysql")) { echo "skip ext/pdo_mysql required"; exit; }', + 'pdo_sqlite' => 'if (!extension_loaded("pdo_sqlite")) { echo "skip ext/pdo_sqlite required"; exit; }', 'mysqli' => 'if (!extension_loaded("mysqli")) { echo "skip ext/mysqli required"; exit; }', 'pdo_pgsql' => 'if (!extension_loaded("pdo_pgsql")) { echo "skip ext/pdo_pgsql required"; exit; }', // A reachable PostgreSQL server, opt-in like the MySQL one. diff --git a/fuzzy-tests/db/sqlite_chaos.feature b/fuzzy-tests/db/sqlite_chaos.feature new file mode 100644 index 0000000..efdf6e1 --- /dev/null +++ b/fuzzy-tests/db/sqlite_chaos.feature @@ -0,0 +1,94 @@ +Feature: Database chaos — async PDO SQLite (pool-focused) + + SQLite is a local file: no Toxiproxy, no network toxics, and pdo_sqlite + operations do not yield to the reactor (there is no socket to poll on). + Cancellation mid-query and transport-level chaos therefore do not apply. + + What the chaos suite verifies here is the PDO connection pool itself: + per-coroutine sqlite3* slots over one shared file, acquire / release / + slot reuse under many coroutines, and the same multi-statement + transaction body that surfaced the cancellation UAF on the network + drivers. + + Schema is seeded per-scenario (the .phpt runs in its own process); the + scenario file lives under sys_get_temp_dir() and is removed in tearDown. 
+ + Scenario: a query through a pooled SQLite handle returns every row + Given a pooled SQLite database "DB" + And a coroutine "C" + When coroutine "C" queries database "DB" + Then counter "db_query_ok_C" equals 1 + And counter "db_query_rows_C" equals 5 + And no orphan coroutines + + Scenario: many coroutines share one pooled SQLite handle + Given a pooled SQLite database "DB" with 4 connections + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + And a coroutine "C4" + When coroutine "C1" queries database "DB" + And coroutine "C2" queries database "DB" + And coroutine "C3" queries database "DB" + And coroutine "C4" queries database "DB" + Then counter "db_query_ok_C1" equals 1 + And counter "db_query_ok_C2" equals 1 + And counter "db_query_ok_C3" equals 1 + And counter "db_query_ok_C4" equals 1 + And counter "db_query_rows_C1" equals 5 + And counter "db_query_rows_C4" equals 5 + And no orphan coroutines + + Scenario: a transaction commits on a pooled SQLite handle + Given a pooled SQLite database "DB" + And a coroutine "C" + When coroutine "C" runs a transaction on database "DB" + Then counter "db_txn_ok_C" equals 1 + And counter "db_txn_committed_C" equals 1 + And no orphan coroutines + + Scenario: concurrent transactions all commit + Given a pooled SQLite database "DB" with 4 connections + And a coroutine "C1" + And a coroutine "C2" + And a coroutine "C3" + And a coroutine "C4" + When coroutine "C1" runs a transaction on database "DB" + And coroutine "C2" runs a transaction on database "DB" + And coroutine "C3" runs a transaction on database "DB" + And coroutine "C4" runs a transaction on database "DB" + Then counter "db_txn_committed_C1" equals 1 + And counter "db_txn_committed_C2" equals 1 + And counter "db_txn_committed_C3" equals 1 + And counter "db_txn_committed_C4" equals 1 + And no orphan coroutines + + Scenario: a reader and a writer share the pool + Given a pooled SQLite database "DB" with 2 connections + And a coroutine "W" + And a 
coroutine "R" + When coroutine "W" runs a transaction on database "DB" + And coroutine "R" queries database "DB" + Then counter "db_txn_committed_W" equals 1 + And counter "db_query_ok_R" equals 1 + And counter "db_query_rows_R" equals 5 + And no orphan coroutines + + Scenario: a transaction/query storm on an undersized SQLite pool + Given a pooled SQLite database "DB" with 3 connections + And a coroutine "T1" + And a coroutine "Q1" + And a coroutine "T2" + And a coroutine "Q2" + And a coroutine "T3" + When coroutine "T1" runs a transaction on database "DB" + And coroutine "Q1" queries database "DB" + And coroutine "T2" runs a transaction on database "DB" + And coroutine "Q2" queries database "DB" + And coroutine "T3" runs a transaction on database "DB" + Then counter "db_txn_committed_T1" equals 1 + And counter "db_txn_committed_T2" equals 1 + And counter "db_txn_committed_T3" equals 1 + And counter "db_query_rows_Q1" equals 5 + And counter "db_query_rows_Q2" equals 5 + And no orphan coroutines From f91c1538686661c0c2b44623eb29170c2f5730e2 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 23 May 2026 10:20:06 +0000 Subject: [PATCH 6/8] docs: move #136 fuzzy-tests entries out of main CHANGELOG.md The main CHANGELOG.md is for user-facing extension features. Chaos / fuzzy-tests / harness changes are test-suite history and belong in a separate file. Moved this session's #136 entries (CURL, PDO MySQL, mysqli, PgSQL, concurrent, SQLite coverage) into the new fuzzy-tests/CHANGELOG.md. Older #127 / #129 chaos entries left in the main CHANGELOG.md as historical record (already shipped). 
--- CHANGELOG.md | 6 ------ fuzzy-tests/CHANGELOG.md | 15 +++++++++++++++ 2 files changed, 15 insertions(+), 6 deletions(-) create mode 100644 fuzzy-tests/CHANGELOG.md diff --git a/CHANGELOG.md b/CHANGELOG.md index a1517fd..4c601a4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,12 +8,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [0.7.0] - ### Added -- **#136 Database chaos: async PDO SQLite (pool-focused)** — `fuzzy-tests/db/sqlite_chaos.feature` (6 scenarios). SQLite is local-file-only: no Toxiproxy, no network toxics, and pdo_sqlite operations do not yield to the reactor (nothing to poll). What this exercises is the PDO connection pool itself — per-coroutine `sqlite3*` slots over one shared file, acquire / release / slot reuse under many coroutines, and the same multi-statement transaction body that surfaced the cancellation UAF on the network drivers. Schema is seeded per-scenario into a unique file under `sys_get_temp_dir()` (per-PID name) and removed in tearDown. New steps `a [pooled] SQLite database "DB"`; only `pdo_sqlite` is required by SKIPIF — runs everywhere `ext/pdo_sqlite` is present. -- **#136 Database chaos: concurrent transactions and heterogeneous workloads** — new `fuzzy-tests/db/concurrent.feature` (9 scenarios → 22 .phpt) targets the harder surface where the per-driver suites stop: many coroutines doing *different* things at once on one pooled `$pdo` — concurrent transactions, a writer racing a reader (decidable: reader's `WHERE id<=5` snapshot is interleaving-independent of the writer's `id>5` insert), mixed query/transaction/slow-query workload, a transaction cancelled while a sibling reader keeps working, a transaction/query storm on an undersized pool (4-conn pool, 6 coroutines). The transaction body is now multi-statement (BEGIN → INSERT → read-back SELECT → COMMIT), so each transaction yields several times and a sibling can be scheduled between any two statements. 
**Found and exposed a real UAF in the PDO connection pool**: cancelling a coroutine mid-transaction segfaulted in `php_pdo_free_statement` — fixed in php-src `ext/pdo/pdo_pool.c` (`#136`); the pool slot release path force-zeroed `pool_slot_refcount` and cleared the template's per-coroutine error stash from a shared slot, dangling live `stmt->pooled_conn` pointers from the cancelled coroutine. -- **#136 Database chaos: async PDO PgSQL coverage** — new `fuzzy-tests/db/pgsql_chaos.feature` (12 scenarios → 30 .phpt) mirrors `mysql_chaos.feature` against a Toxiproxy-fronted PostgreSQL server, exercising the pdo_pgsql driver and the libpq wire protocol. The Toxiproxy toxic steps, `queries / runs a slow query / runs a transaction on database` client steps, `dbRun()` and `dbTransaction()` are now driver-aware: `ChaosNet::openDbConnection()` builds a `pgsql:` or `mysql:` DSN from the database's declared driver, `ChaosNet::dbUpstream()` selects the upstream env var (`CHAOS_PGSQL` / `CHAOS_MYSQL`, defaults `127.0.0.1:5432` / `:3306`), and the slow-query step picks `SELECT pg_sleep(2)` vs `SELECT SLEEP(2)` at action-run time. New steps `a [pooled] PgSQL database "DB"`; generate.php gains an ext/pdo_pgsql + a PostgreSQL-reachable `--SKIPIF--` probe. The Toxiproxy and PDO client steps' `requires` were relaxed to driver-agnostic — the database-declaration step pins the driver requirement. -- **#136 Database chaos: async mysqli coverage** — `fuzzy-tests/db/mysqli_chaos.feature` (10 scenarios) extends the database chaos suite to the `mysqli` extension: the same Toxiproxy-fronted MySQL server reached through mysqli's own connection / result / prepared-statement API instead of PDO. mysqli has no connection pool, so each chaos query opens its own connection. New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`; the Toxiproxy toxic steps are now driver-agnostic and shared with the PDO suite. 
`ChaosNet::openMysqliConnection()` builds the connection (`MYSQLI_REPORT_STRICT` so faults raise `mysqli_sql_exception`). Generated `.phpt` carry an ext/mysqli `--SKIPIF--` probe. -- **#136 Database chaos: async PDO MySQL coverage** — new chaos topic `fuzzy-tests/db/` with `mysql_chaos.feature`: the async PDO MySQL driver is exercised against a real MySQL server fronted by Toxiproxy (`client coroutine → Toxiproxy → MySQL`), so the transport toxics — latency, bandwidth caps, TCP slicing, `reset_peer` mid-query / mid-transaction — land on the driver's wire I/O. 12 scenarios cover a non-pooled and a pool-enabled connection: a query / transaction completing intact through a non-truncating toxic, a dropped connection surfacing as a clean `PDOException` (never a hang), a coroutine cancelled mid-query, and the connection pool failing every slot cleanly when the server connection is lost. New steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency to|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`; connection parameters come from the environment (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in like Toxiproxy: generated `.phpt` carry a `--SKIPIF--` probe for ext/pdo_mysql + a reachable MySQL server, so the suite stays inert on dev machines and per-PR CI. Closes the database half of the #136 coverage gap. The network-fixture layer of the harness — EvilPeers, Toxiproxy proxies, chaos databases — is extracted from `Context` into a dedicated `ChaosNet` class (`$ctx->net`) so `Context` stays the scenario orchestrator. 
-- **#136 HTTP chaos: async ext/curl coverage** — EvilPeer gains an `http` mode (`EvilPeer::serveHttp`): it drains one HTTP request and writes back an HTTP/1.1 response, with the serve-mode body toxics (slice/drip/abrupt close/hard reset/forked peer/Toxiproxy) joined by HTTP-specific ones — chunked transfer-encoding, a mendacious `Content-Length` (over/under-stated), dribbled headers, an arbitrary status code. New chaos topic `fuzzy-tests/curl/` with `http_chaos.feature`: a reactor-driven `ext/curl` client (`coroutine "C" fetches peer "EP" over HTTP`) is exercised against every toxic, under the random scheduler, and cancelled mid-transfer — closing the CURL half of the #136 coverage gap (nothing previously exercised async curl under chaos). Generated `.phpt` carry a `--SKIPIF--` curl probe. **Found a real bug:** async curl dropped all but the first chunk of a chunked-encoded response body (`CURLE_WRITE_ERROR`) — fixed in php-src `ext/curl/curl_async.c` (`#136`); see `fuzzy-tests/FINDINGS.md`. - **#127 I/O chaos: EvilPeer + transport×logic crossing** — new `fuzzy-tests/_peers/EvilPeer.php`, a deliberately misbehaving network peer driven by a declarative fault table. Toxics: payload slicing, inter-chunk drip delay, abrupt mid-stream close (`reset`); parameters accept the seeded-random fuzz syntax (`random:N`, `1|5`). New chaos topic `fuzzy-tests/io/`: `evil_peer.feature` (sliced/dripped stream reassembled exactly), `abrupt_close.feature` (dropped connection → clean payload prefix, no hang), and `combined_chaos.feature` — **crosses transport chaos with logic chaos**: toxic-selection mutation blocks × client-logic mutation blocks × the random scheduler, all checked against a fixed payload oracle. On a failure the executor prints a **chaos event log** — the exact low-level toxic sequence the EvilPeer played out plus each client's I/O trace. Harness gains `defineEvilPeer()` + a prep-phase that binds each peer's listening socket and serves it from a coroutine. 
- **#129 I/O chaos: Toxiproxy transport-level fault injection** — new `fuzzy-tests/_peers/ToxiproxyClient.php`, a minimal HTTP-API client for [Toxiproxy](https://github.com/Shopify/toxiproxy). An EvilPeer can now be fronted by a Toxiproxy proxy (`client → proxy → peer`), injecting transport faults a pure-PHP peer cannot reproduce precisely: real bandwidth throttling, latency with jitter, TCP-segment slicing, `limit_data` byte-counted truncation, `reset_peer` timed RST. New steps `evil peer "EP" is fronted by Toxiproxy` / `Toxiproxy throttles|adds latency to|slices|cuts off|resets peer "EP" …`; new feature `fuzzy-tests/io/toxiproxy.feature`. Opt-in by design: every generated `.phpt` carries a `--SKIPIF--` probe (`SKIP_RULES['toxiproxy']`) and skips wherever no Toxiproxy admin endpoint answers — so the suite never gates per-PR CI. A dedicated `nightly-io-chaos.yml` workflow stands Toxiproxy up and runs the suite under FIFO + four random scheduler seeds. Closes the last item of #129. - **#107 `ThreadPool` workers auto-detect** — `workers` is optional (default `0` → `Async\available_parallelism()`). diff --git a/fuzzy-tests/CHANGELOG.md b/fuzzy-tests/CHANGELOG.md new file mode 100644 index 0000000..a318e61 --- /dev/null +++ b/fuzzy-tests/CHANGELOG.md @@ -0,0 +1,15 @@ +# Fuzzy-tests changelog + +Test-coverage and harness changes for `ext/async/fuzzy-tests/`. Kept +separate from the main `CHANGELOG.md` so user-facing extension features +are not mixed with test-suite history. + +## Unreleased + +### Added +- **#136 SQLite pool-focused coverage** — `db/sqlite_chaos.feature` (6 scenarios). Local file, no Toxiproxy, pdo_sqlite operations do not yield — exercises only the PDO connection pool itself (per-coroutine `sqlite3*` slots over one shared file). New steps `a [pooled] SQLite database "DB"`. Schema seeded per-scenario into a per-PID file under `sys_get_temp_dir()`, removed in tearDown. Only `pdo_sqlite` required by SKIPIF. 
+- **#136 Concurrent transactions and heterogeneous workloads** — `db/concurrent.feature` (9 scenarios → 22 .phpt) targets many coroutines doing different things at once on one pooled `$pdo`: concurrent transactions, writer racing reader (reader's `WHERE id<=5` snapshot is interleaving-independent of writer's `id>5` insert), mixed workloads, a transaction cancelled while a sibling reader keeps working, a storm on an undersized pool. `StandardSteps::dbTransaction` runs a multi-statement transaction now (BEGIN → INSERT → read-back SELECT → COMMIT) so each yields several times. **Found a UAF in the PDO connection pool** under coroutine cancellation mid-transaction — fixed in php-src `ext/pdo/pdo_pool.c`. +- **#136 PgSQL coverage** — `db/pgsql_chaos.feature` (12 scenarios → 30 .phpt) mirrors `mysql_chaos.feature` against a Toxiproxy-fronted PostgreSQL server. `ChaosNet::openDbConnection()` and `dbUpstream()` driver-aware (mysql / pgsql); slow-query step picks `SELECT pg_sleep(2)` vs `SELECT SLEEP(2)` at action-run time. New steps `a [pooled] PgSQL database "DB"`. Toxiproxy and PDO client steps relaxed to driver-agnostic `requires`; the database declaration carries the driver requirement. `generate.php` gains `pdo_pgsql` and `pgsql-server` SKIPIF probes. +- **#136 mysqli coverage** — `db/mysqli_chaos.feature` (10 scenarios) — same Toxiproxy-fronted MySQL through the mysqli extension. mysqli has no pool, every chaos query opens its own connection. New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`. `ChaosNet::openMysqliConnection()` with `MYSQLI_REPORT_STRICT`. Toxiproxy toxic steps driver-agnostic now. +- **#136 PDO MySQL coverage** — `db/mysql_chaos.feature` (12 scenarios → 30 .phpt). Async PDO MySQL through Toxiproxy: latency, bandwidth, TCP slicer, `reset_peer` mid-query / mid-transaction. Pooled and non-pooled. 
`ChaosNet` is the new network-fixture layer extracted from `Context`; new steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency to|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`. Env-driven connection (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in via SKIPIF (ext/pdo_mysql + reachable MySQL). +- **#136 HTTP / CURL coverage** — `EvilPeer::serveHttp` (HTTP/1.1 evil server mode) + `curl/http_chaos.feature` (12 scenarios → 35 .phpt). Reactor-driven `ext/curl` client under every body toxic plus HTTP-specific ones — chunked TE, mendacious Content-Length, dribbled headers, arbitrary status. Killer-cancel mid-transfer covered. New step `coroutine "C" fetches peer "EP" over HTTP`. **Found a real bug** — async curl dropped all but the first chunk of a chunked response body (`CURLE_WRITE_ERROR`); fixed in php-src `ext/curl/curl_async.c`. See `FINDINGS.md`. From aa781baf11b4a4b0a854a8256edfc5db9a933895 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 23 May 2026 10:20:24 +0000 Subject: [PATCH 7/8] ci: fix missing 'done' in nightly-io-chaos randomised-schedulers loop MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The for-loop iterating TRUE_ASYNC_SCHED seeds was missing its `done` terminator, causing bash to swallow the following "- name:" step as loop body and the job to fail with a syntax error / exit 2 — exactly what the recent nightly run reported. 
--- .github/workflows/nightly-io-chaos.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/nightly-io-chaos.yml b/.github/workflows/nightly-io-chaos.yml index a3621b2..3ad99ea 100644 --- a/.github/workflows/nightly-io-chaos.yml +++ b/.github/workflows/nightly-io-chaos.yml @@ -127,6 +127,7 @@ jobs: -g FAIL,BORK,LEAK,XLEAK \ --no-progress --offline --show-diff --set-timeout 120 \ fuzzy-tests/_generated/io/toxiproxy__*.phpt + done - name: Toxiproxy log if: always() From 523bf75fd4b0020a3c445b39c62c01461cf09d68 Mon Sep 17 00:00:00 2001 From: Edmond <1571649+EdmondDantes@users.noreply.github.com> Date: Sat, 23 May 2026 10:26:18 +0000 Subject: [PATCH 8/8] ci: nightly DB chaos workflow (MySQL + PostgreSQL + Toxiproxy) DB half of #136 had no scheduled run. Mirrors nightly-io-chaos: stands up MySQL and PostgreSQL (apt + service start, test/test user, chaos_test schema seeded with the 5-row items table), Toxiproxy, builds PHP with pdo_mysql/mysqli/pdo_pgsql/pdo_sqlite, generates fuzzy-tests and runs fuzzy-tests/_generated/db/ under FIFO + four random scheduler seeds (random:1, :7, :42, :1337). Schedule 03:43 UTC daily, offset from the I/O nightly so they do not contend for the runner pool. --- .github/workflows/nightly-db-chaos.yml | 180 +++++++++++++++++++++++++ 1 file changed, 180 insertions(+) create mode 100644 .github/workflows/nightly-db-chaos.yml diff --git a/.github/workflows/nightly-db-chaos.yml b/.github/workflows/nightly-db-chaos.yml new file mode 100644 index 0000000..9f15cf9 --- /dev/null +++ b/.github/workflows/nightly-db-chaos.yml @@ -0,0 +1,180 @@ +name: Nightly DB chaos (Toxiproxy + MySQL + PostgreSQL) + +# Database half of issue #136: async PDO MySQL / mysqli / PDO PgSQL / PDO +# SQLite under Toxiproxy transport faults (latency, bandwidth, slicer, +# reset_peer) + a separate concurrent.feature for transactions and +# heterogeneous coroutine workloads. 
Opt-in like the I/O nightly: every +# generated .phpt carries --SKIPIF-- probes (ext/pdo_*, ext/mysqli, real +# server reachable, Toxiproxy admin endpoint) and skips wherever the env +# does not satisfy them. This job is where the env is actually stood up. + +on: + schedule: + # 03:43 UTC daily — offset from nightly-io-chaos so the two do not + # contend for the same runner pool. + - cron: '43 3 * * *' + workflow_dispatch: + +jobs: + db-chaos: + name: "DB_CHAOS" + runs-on: ubuntu-24.04 + timeout-minutes: 60 + + env: + CHAOS_TOXIPROXY: 127.0.0.1:8474 + CHAOS_MYSQL: 127.0.0.1:3306 + CHAOS_MYSQL_USER: test + CHAOS_MYSQL_PASS: test + CHAOS_MYSQL_DB: chaos_test + CHAOS_PGSQL: 127.0.0.1:5432 + CHAOS_PGSQL_USER: test + CHAOS_PGSQL_PASS: test + CHAOS_PGSQL_DB: chaos_test + TOXIPROXY_VERSION: 2.12.0 + + steps: + - name: Checkout php-async repo + uses: actions/checkout@v4 + with: + path: async + + - name: Clone php-src (true-async-stable) + run: | + git clone --depth=1 --branch=true-async-stable https://github.com/true-async/php-src php-src + + - name: Copy php-async extension into php-src + run: | + mkdir -p php-src/ext/async + cp -r async/* php-src/ext/async/ + + - name: Install build dependencies + run: | + set -x + sudo apt-get update -y + sudo apt-get install -y \ + autoconf bison build-essential re2c pkg-config \ + libxml2-dev libssl-dev libsqlite3-dev libonig-dev \ + libcurl4-openssl-dev libuv1-dev libpq-dev + + - name: Start MySQL (preinstalled on ubuntu-24.04) + run: | + set -x + sudo service mysql start + for i in $(seq 1 20); do + if sudo mysqladmin ping --silent; then break; fi + sleep 1 + done + sudo mysql -e "CREATE USER 'test'@'%' IDENTIFIED BY 'test';" + sudo mysql -e "CREATE DATABASE chaos_test;" + sudo mysql -e "GRANT ALL PRIVILEGES ON chaos_test.* TO 'test'@'%';" + mysql -h127.0.0.1 -utest -ptest chaos_test <<'SQL' + CREATE TABLE items (id INT PRIMARY KEY AUTO_INCREMENT, label VARCHAR(64), n INT); + INSERT INTO items (label, n) VALUES 
('alpha',1),('beta',2),('gamma',3),('delta',4),('epsilon',5); + SQL + + - name: Install and start PostgreSQL + run: | + set -x + sudo apt-get install -y postgresql + sudo service postgresql start + for i in $(seq 1 20); do + if sudo -u postgres pg_isready; then break; fi + sleep 1 + done + sudo -u postgres psql -c "CREATE USER test WITH PASSWORD 'test';" + sudo -u postgres psql -c "CREATE DATABASE chaos_test OWNER test;" + # Allow TCP md5 from localhost for the test user (default ubuntu cluster). + PG_HBA=$(sudo -u postgres psql -tA -c "SHOW hba_file;") + echo "host chaos_test test 127.0.0.1/32 md5" | sudo tee -a "$PG_HBA" + sudo service postgresql reload + PGPASSWORD=test psql -h 127.0.0.1 -U test -d chaos_test <<'SQL' + CREATE TABLE items (id SERIAL PRIMARY KEY, label VARCHAR(64), n INT); + INSERT INTO items (label, n) VALUES ('alpha',1),('beta',2),('gamma',3),('delta',4),('epsilon',5); + SQL + + - name: Install Toxiproxy + run: | + set -x + curl -sSL -o /tmp/toxiproxy-server \ + "https://github.com/Shopify/toxiproxy/releases/download/v${TOXIPROXY_VERSION}/toxiproxy-server-linux-amd64" + sudo install -m 0755 /tmp/toxiproxy-server /usr/local/bin/toxiproxy-server + toxiproxy-server --version + + - name: Start Toxiproxy + run: | + toxiproxy-server -host 127.0.0.1 -port 8474 >/tmp/toxiproxy.log 2>&1 & + for i in $(seq 1 30); do + if curl -sf http://127.0.0.1:8474/version >/dev/null; then + echo "Toxiproxy up: $(curl -s http://127.0.0.1:8474/version)" + exit 0 + fi + sleep 1 + done + echo "Toxiproxy did not come up"; cat /tmp/toxiproxy.log; exit 1 + + - name: Configure PHP + working-directory: php-src + run: | + set -x + ./buildconf --force + ./configure \ + --enable-option-checking=fatal \ + --prefix=/usr/local \ + --without-pear \ + --with-openssl \ + --with-curl \ + --enable-sockets \ + --enable-pcntl \ + --enable-mbstring \ + --enable-debug \ + --enable-zts \ + --enable-async \ + --enable-async-fuzz \ + --enable-pdo \ + --with-pdo-mysql=mysqlnd \ + 
--with-mysqli=mysqlnd \ + --with-pdo-pgsql \ + --with-pdo-sqlite + + - name: Build PHP + working-directory: php-src + run: make -j$(nproc) >/dev/null + + - name: Install PHP + working-directory: php-src + run: sudo make install + + - name: Generate chaos tests + working-directory: php-src/ext/async + run: PHP_BIN=/usr/local/bin/php bash fuzzy-tests/regen.sh + + - name: Run DB chaos suite (FIFO) + working-directory: php-src/ext/async + run: | + /usr/local/bin/php ../../run-tests.php \ + -P -q -j$(nproc) \ + -g FAIL,BORK,LEAK,XLEAK \ + --no-progress --offline --show-diff --set-timeout 120 \ + fuzzy-tests/_generated/db/ + + - name: Run DB chaos suite (randomised schedulers) + working-directory: php-src/ext/async + run: | + set -e + for sched in random:1 random:7 random:42 random:1337; do + echo "=== TRUE_ASYNC_SCHED=$sched ===" + TRUE_ASYNC_SCHED=$sched /usr/local/bin/php ../../run-tests.php \ + -P -q -j$(nproc) \ + -g FAIL,BORK,LEAK,XLEAK \ + --no-progress --offline --show-diff --set-timeout 120 \ + fuzzy-tests/_generated/db/ + done + + - name: Toxiproxy log + if: always() + run: cat /tmp/toxiproxy.log || true + + - name: PostgreSQL log + if: always() + run: sudo tail -n 200 /var/log/postgresql/postgresql-*.log || true
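Review note, not part of the patch: the MySQL and PostgreSQL readiness loops in nightly-db-chaos.yml break out on success but fall through silently when the service never answers, whereas the Toxiproxy loop prints its log and exits 1. A minimal sketch of a shared helper that fails loudly on timeout is shown here. The name `wait_for` and its argument order are hypothetical; nothing in the workflow defines them.

```shell
# Hypothetical helper (not defined anywhere in the workflow): run a probe
# command up to <max_attempts> times, one second apart, and return non-zero
# if it never succeeds.
# Usage: wait_for <label> <max_attempts> <command> [args...]
wait_for() {
  label=$1
  attempts=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    # The probe's own output is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      echo "$label up after $i attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the final failed one.
    if [ "$i" -lt "$attempts" ]; then sleep 1; fi
    i=$((i + 1))
  done
  echo "$label did not come up after $attempts attempts" >&2
  return 1
}
```

With the probes the workflow already uses, the calls would look like `wait_for MySQL 20 sudo mysqladmin ping --silent` and `wait_for PostgreSQL 20 sudo -u postgres pg_isready`. Returning non-zero lets the step fail at the readiness check instead of at the first SQL statement, which should make a nightly failure easier to attribute.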