Merged
180 changes: 180 additions & 0 deletions .github/workflows/nightly-db-chaos.yml
@@ -0,0 +1,180 @@
name: Nightly DB chaos (Toxiproxy + MySQL + PostgreSQL)

# Database half of issue #136: async PDO MySQL / mysqli / PDO PgSQL / PDO
# SQLite under Toxiproxy transport faults (latency, bandwidth, slicer,
# reset_peer) + a separate concurrent.feature for transactions and
# heterogeneous coroutine workloads. Opt-in like the I/O nightly: every
# generated .phpt carries --SKIPIF-- probes (ext/pdo_*, ext/mysqli, real
# server reachable, Toxiproxy admin endpoint) and skips wherever the env
# does not satisfy them. This job is where the env is actually stood up.

on:
  schedule:
    # 03:43 UTC daily — offset from nightly-io-chaos so the two do not
    # contend for the same runner pool.
    - cron: '43 3 * * *'
  workflow_dispatch:

jobs:
  db-chaos:
    name: "DB_CHAOS"
    runs-on: ubuntu-24.04
    timeout-minutes: 60

    env:
      CHAOS_TOXIPROXY: 127.0.0.1:8474
      CHAOS_MYSQL: 127.0.0.1:3306
      CHAOS_MYSQL_USER: test
      CHAOS_MYSQL_PASS: test
      CHAOS_MYSQL_DB: chaos_test
      CHAOS_PGSQL: 127.0.0.1:5432
      CHAOS_PGSQL_USER: test
      CHAOS_PGSQL_PASS: test
      CHAOS_PGSQL_DB: chaos_test
      TOXIPROXY_VERSION: 2.12.0

    steps:
      - name: Checkout php-async repo
        uses: actions/checkout@v4
        with:
          path: async

      - name: Clone php-src (true-async-stable)
        run: |
          git clone --depth=1 --branch=true-async-stable https://github.com/true-async/php-src php-src

      - name: Copy php-async extension into php-src
        run: |
          mkdir -p php-src/ext/async
          cp -r async/* php-src/ext/async/

      - name: Install build dependencies
        run: |
          set -x
          sudo apt-get update -y
          sudo apt-get install -y \
            autoconf bison build-essential re2c pkg-config \
            libxml2-dev libssl-dev libsqlite3-dev libonig-dev \
            libcurl4-openssl-dev libuv1-dev libpq-dev

      - name: Start MySQL (preinstalled on ubuntu-24.04)
        run: |
          set -x
          sudo service mysql start
          for i in $(seq 1 20); do
            if sudo mysqladmin ping --silent; then break; fi
            sleep 1
          done
          sudo mysql -e "CREATE USER 'test'@'%' IDENTIFIED BY 'test';"
          sudo mysql -e "CREATE DATABASE chaos_test;"
          sudo mysql -e "GRANT ALL PRIVILEGES ON chaos_test.* TO 'test'@'%';"
          mysql -h127.0.0.1 -utest -ptest chaos_test <<'SQL'
          CREATE TABLE items (id INT PRIMARY KEY AUTO_INCREMENT, label VARCHAR(64), n INT);
          INSERT INTO items (label, n) VALUES ('alpha',1),('beta',2),('gamma',3),('delta',4),('epsilon',5);
          SQL

      - name: Install and start PostgreSQL
        run: |
          set -x
          sudo apt-get install -y postgresql
          sudo service postgresql start
          for i in $(seq 1 20); do
            if sudo -u postgres pg_isready; then break; fi
            sleep 1
          done
          sudo -u postgres psql -c "CREATE USER test WITH PASSWORD 'test';"
          sudo -u postgres psql -c "CREATE DATABASE chaos_test OWNER test;"
          # Allow TCP md5 from localhost for the test user (default ubuntu cluster).
          PG_HBA=$(sudo -u postgres psql -tA -c "SHOW hba_file;")
          echo "host chaos_test test 127.0.0.1/32 md5" | sudo tee -a "$PG_HBA"
          sudo service postgresql reload
          PGPASSWORD=test psql -h 127.0.0.1 -U test -d chaos_test <<'SQL'
          CREATE TABLE items (id SERIAL PRIMARY KEY, label VARCHAR(64), n INT);
          INSERT INTO items (label, n) VALUES ('alpha',1),('beta',2),('gamma',3),('delta',4),('epsilon',5);
          SQL

      - name: Install Toxiproxy
        run: |
          set -x
          curl -sSL -o /tmp/toxiproxy-server \
            "https://github.com/Shopify/toxiproxy/releases/download/v${TOXIPROXY_VERSION}/toxiproxy-server-linux-amd64"
          sudo install -m 0755 /tmp/toxiproxy-server /usr/local/bin/toxiproxy-server
          toxiproxy-server --version

      - name: Start Toxiproxy
        run: |
          toxiproxy-server -host 127.0.0.1 -port 8474 >/tmp/toxiproxy.log 2>&1 &
          for i in $(seq 1 30); do
            if curl -sf http://127.0.0.1:8474/version >/dev/null; then
              echo "Toxiproxy up: $(curl -s http://127.0.0.1:8474/version)"
              exit 0
            fi
            sleep 1
          done
          echo "Toxiproxy did not come up"; cat /tmp/toxiproxy.log; exit 1

      - name: Configure PHP
        working-directory: php-src
        run: |
          set -x
          ./buildconf --force
          ./configure \
            --enable-option-checking=fatal \
            --prefix=/usr/local \
            --without-pear \
            --with-openssl \
            --with-curl \
            --enable-sockets \
            --enable-pcntl \
            --enable-mbstring \
            --enable-debug \
            --enable-zts \
            --enable-async \
            --enable-async-fuzz \
            --enable-pdo \
            --with-pdo-mysql=mysqlnd \
            --with-mysqli=mysqlnd \
            --with-pdo-pgsql \
            --with-pdo-sqlite

      - name: Build PHP
        working-directory: php-src
        run: make -j$(nproc) >/dev/null

      - name: Install PHP
        working-directory: php-src
        run: sudo make install

      - name: Generate chaos tests
        working-directory: php-src/ext/async
        run: PHP_BIN=/usr/local/bin/php bash fuzzy-tests/regen.sh

      - name: Run DB chaos suite (FIFO)
        working-directory: php-src/ext/async
        run: |
          /usr/local/bin/php ../../run-tests.php \
            -P -q -j$(nproc) \
            -g FAIL,BORK,LEAK,XLEAK \
            --no-progress --offline --show-diff --set-timeout 120 \
            fuzzy-tests/_generated/db/

      - name: Run DB chaos suite (randomised schedulers)
        working-directory: php-src/ext/async
        run: |
          set -e
          for sched in random:1 random:7 random:42 random:1337; do
            echo "=== TRUE_ASYNC_SCHED=$sched ==="
            TRUE_ASYNC_SCHED=$sched /usr/local/bin/php ../../run-tests.php \
              -P -q -j$(nproc) \
              -g FAIL,BORK,LEAK,XLEAK \
              --no-progress --offline --show-diff --set-timeout 120 \
              fuzzy-tests/_generated/db/
          done

      - name: Toxiproxy log
        if: always()
        run: cat /tmp/toxiproxy.log || true

      - name: PostgreSQL log
        if: always()
        run: sudo tail -n 200 /var/log/postgresql/postgresql-*.log || true
1 change: 1 addition & 0 deletions .github/workflows/nightly-io-chaos.yml
@@ -127,6 +127,7 @@ jobs:
              -g FAIL,BORK,LEAK,XLEAK \
              --no-progress --offline --show-diff --set-timeout 120 \
              fuzzy-tests/_generated/io/toxiproxy__*.phpt
          done

      - name: Toxiproxy log
        if: always()
15 changes: 15 additions & 0 deletions fuzzy-tests/CHANGELOG.md
@@ -0,0 +1,15 @@
# Fuzzy-tests changelog

Test-coverage and harness changes for `ext/async/fuzzy-tests/`. Kept
separate from the main `CHANGELOG.md` so user-facing extension features
are not mixed with test-suite history.

## Unreleased

### Added
- **#136 SQLite pool-focused coverage**: `db/sqlite_chaos.feature` (6 scenarios). Uses a local file and no Toxiproxy. pdo_sqlite operations do not yield, so the suite exercises only the PDO connection pool itself (per-coroutine `sqlite3*` slots over one shared file). New steps `a [pooled] SQLite database "DB"`. The schema is seeded per scenario into a per-PID file under `sys_get_temp_dir()` and removed in tearDown. SKIPIF requires only `pdo_sqlite`.
- **#136 Concurrent transactions and heterogeneous workloads**: `db/concurrent.feature` (9 scenarios → 22 .phpt) targets many coroutines doing different things at once on one pooled `$pdo`: concurrent transactions, a writer racing a reader (the reader's `WHERE id<=5` result does not depend on how it interleaves with the writer's `id>5` insert), mixed workloads, a transaction cancelled while a sibling reader keeps working, and a storm on an undersized pool. `StandardSteps::dbTransaction` now runs a multi-statement transaction (BEGIN → INSERT → read-back SELECT → COMMIT), so each transaction yields several times. **Found a use-after-free (UAF) in the PDO connection pool** under coroutine cancellation mid-transaction; fixed in php-src `ext/pdo/pdo_pool.c`.
- **#136 PgSQL coverage**: `db/pgsql_chaos.feature` (12 scenarios → 30 .phpt) mirrors `mysql_chaos.feature` against a Toxiproxy-fronted PostgreSQL server. `ChaosNet::openDbConnection()` and `dbUpstream()` are now driver-aware (mysql / pgsql); the slow-query step picks `SELECT pg_sleep(2)` or `SELECT SLEEP(2)` at action-run time. New steps `a [pooled] PgSQL database "DB"`. Toxiproxy and PDO client steps are relaxed to driver-agnostic `requires`; the database declaration carries the driver requirement. `generate.php` gains `pdo_pgsql` and `pgsql-server` SKIPIF probes.
- **#136 mysqli coverage**: `db/mysqli_chaos.feature` (10 scenarios) drives the same Toxiproxy-fronted MySQL through the mysqli extension. mysqli has no pool, so every chaos query opens its own connection. New steps `a MySQLi database "DB"` and `coroutine "C" queries|runs a slow query via|runs a transaction via mysqli "DB"`. `ChaosNet::openMysqliConnection()` uses `MYSQLI_REPORT_STRICT`. Toxiproxy toxic steps are now driver-agnostic.
- **#136 PDO MySQL coverage**: `db/mysql_chaos.feature` (12 scenarios → 30 .phpt). Covers async PDO MySQL through Toxiproxy (latency, bandwidth, TCP slicer, `reset_peer` mid-query / mid-transaction), both pooled and non-pooled. `ChaosNet` is the new network-fixture layer extracted from `Context`; new steps `a [pooled] MySQL database "DB"` / `Toxiproxy adds latency|throttles|slices|resets database "DB"` / `coroutine "C" queries|runs a slow query on|runs a transaction on database "DB"`. The connection is env-driven (`CHAOS_MYSQL`, `CHAOS_MYSQL_USER/PASS/DB`). Opt-in via SKIPIF (ext/pdo_mysql + reachable MySQL).
- **#136 HTTP / CURL coverage**: `EvilPeer::serveHttp` (HTTP/1.1 evil server mode) + `curl/http_chaos.feature` (12 scenarios → 35 .phpt). Puts a reactor-driven `ext/curl` client under every body toxic plus HTTP-specific ones: chunked transfer encoding, mendacious Content-Length, dribbled headers, arbitrary status. Killer-cancel mid-transfer is covered. New step `coroutine "C" fetches peer "EP" over HTTP`. **Found a real bug**: async curl dropped all but the first chunk of a chunked response body (`CURLE_WRITE_ERROR`); fixed in php-src `ext/curl/curl_async.c`. See `FINDINGS.md`.
48 changes: 48 additions & 0 deletions fuzzy-tests/FINDINGS.md
@@ -66,3 +66,51 @@ Fixed in `php_stdiop_write()` / `php_stdiop_read()`: re-suspend until *this*
request completed. Regression test `tests/io/083-concurrent_async_write.phpt`.
The structural fix — per-request completion events instead of the broadcast —
is tracked in true-async/php-async#130.

## Async curl drops a chunked response body (real bug — fixed)

The new `curl/http_chaos.feature` (issue #136) drives an async `ext/curl`
client against the EvilPeer in its `http` mode. Every scenario passed except
the three "chunked transfer encoding" rows, which failed with
`curl_get_ok == 0`: curl reported `CURLE_WRITE_ERROR` —
*"Failure writing output to destination, passed 272 returned 17"* — after
delivering only the first 17-byte chunk to the `CURLOPT_WRITEFUNCTION`
callback. The same program on stock PHP 8.3 returns the whole body.

Root cause in `ext/curl/curl_async.c`. The async write path uses libcurl's
`CURL_WRITEFUNC_PAUSE` / unpause pattern: the first `curl_write` call copies
the data, spawns a coroutine for the PHP callback and returns `PAUSE`; the
completion callback stores the callback's return value and unpauses, and the
*re-call* returns that stored value. The re-call branch assumed libcurl
re-delivers exactly the slice that was paused on — but on unpause libcurl
re-delivers the whole paused window **and coalesces any freshly decoded data
into it**. With chunked transfer-encoding the de-chunker produces many small
pieces, so the re-call carried 272 bytes while the stored result was 17 →
`passed 272 returned 17` → `CURLE_WRITE_ERROR`. (A fixed Content-Length body
arrives one network read at a time, one write callback per reactor wakeup,
so it never tripped — only chunked decoding coalesces.)
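
To make the failure mode concrete, here is a minimal C sketch of the flawed re-call, assuming only what the paragraph above states. The `paused_write` struct and both function names are hypothetical and do not appear in `curl_async.c`. The one real rule it leans on is libcurl's write-callback contract, under which a return value that differs from the number of bytes passed (other than the special pause value) fails the transfer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the state kept between the paused call and
 * the re-call. Field names are illustrative, not the real struct. */
typedef struct {
    int    has_result;     /* set once the PHP callback has completed */
    size_t stored_result;  /* bytes the PHP callback reported as written */
} paused_write;

/* Flawed re-call: hands back the stored result no matter how many bytes
 * libcurl re-delivers on unpause. */
static size_t recall_buggy(const paused_write *pw, size_t redelivered_len)
{
    assert(pw->has_result);   /* only valid after the callback finished */
    (void)redelivered_len;    /* ignored: this is the wrong assumption */
    return pw->stored_result;
}

/* Model of libcurl's check: a write callback must return exactly the
 * number of bytes it was passed, otherwise the transfer fails. */
static int is_write_error(size_t passed, size_t returned)
{
    return returned != passed;
}
```

With a stored result of 17, `is_write_error(17, recall_buggy(&pw, 17))` evaluates to 0, while a window that grew to 272 bytes gives `is_write_error(272, recall_buggy(&pw, 272))` equal to 1, which is the `passed 272 returned 17` case.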

Fixed in `curl_async_write_user()`: the re-call now tracks a
`consumed_offset` through the (possibly grown) window — it reports the full
length back to libcurl only once the PHP callback has accepted every byte,
otherwise it feeds the remainder through another callback slice. A genuine
short return / exception still surfaces verbatim via a new `aborted` flag.
Tracked in php-src as `#136`.
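
The offset tracking described above can be sketched in C as follows. This is a simplified model under stated assumptions, not the code in `curl_async_write_user()`: the `write_state` struct, `recall_fixed`, and the two sample callbacks are hypothetical names, and the loop runs the callback synchronously, whereas the real path runs each callback slice in a coroutine and pauses the transfer again in between.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user callback: returns how many of `len` bytes it accepted. */
typedef size_t (*user_write_fn)(const char *data, size_t len, void *ctx);

/* Illustrative per-transfer state; names follow the prose, not the source. */
typedef struct {
    size_t consumed_offset; /* bytes of the current window already accepted */
    int    aborted;         /* set when the callback did not take a full slice */
    size_t abort_result;    /* the callback's return value, kept verbatim */
} write_state;

/* Re-call handler. `window`/`len` is what libcurl re-delivers on unpause,
 * which may be longer than the slice the transfer originally paused on. */
static size_t recall_fixed(write_state *st, const char *window, size_t len,
                           user_write_fn cb, void *ctx)
{
    assert(st->consumed_offset <= len);

    /* Feed whatever the callback has not seen yet as further slices. */
    while (!st->aborted && st->consumed_offset < len) {
        size_t remaining = len - st->consumed_offset;
        size_t accepted  = cb(window + st->consumed_offset, remaining, ctx);

        if (accepted != remaining) {
            st->aborted = 1;
            st->abort_result = accepted;
        } else {
            st->consumed_offset += accepted;
        }
    }

    if (st->aborted) {
        /* A short return is smaller than len, so libcurl fails the write. */
        return st->abort_result;
    }

    st->consumed_offset = 0;  /* window fully consumed; reset for the next one */
    return len;               /* report the full (possibly grown) length */
}

/* Sample callback: accepts everything and adds the byte count to *ctx. */
static size_t accept_all(const char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
    return len;
}

/* Sample callback: accepts nothing, modelling a failing PHP callback. */
static size_t accept_none(const char *data, size_t len, void *ctx)
{
    (void)data; (void)len; (void)ctx;
    return 0;
}
```

Starting from `consumed_offset = 17` and a 272-byte window, `recall_fixed` with `accept_all` passes the remaining 255 bytes to the callback and returns 272, so libcurl sees a complete write. With `accept_none` it sets `aborted` and returns 0, so the failure still reaches libcurl.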

## Async PDO MySQL pool leaks a raw warning on a dropped connection (observation)

The `db/mysql_chaos.feature` suite (issue #136) fronts a real MySQL server
with Toxiproxy and drops the connection mid-query with the `reset_peer`
toxic. With `PDO::ATTR_ERRMODE` set to `PDO::ERRMODE_EXCEPTION` the driver
correctly raises a `PDOException`, but the **pool-enabled** path also emits a
bare `E_WARNING` ("Error while reading greeting packet") from mysqlnd on top
of the exception. The non-pooled `new PDO()` path does not emit that warning
for the same failing connect.

It does not break error handling — the exception still propagates and is
caught — so the chaos steps simply `@`-silence the expected noise, the same
way the raw-socket I/O steps already do. Worth a follow-up: under
ERRMODE_EXCEPTION the pool's internal connect (`pdo_pool_acquire_conn` →
`db_handle_factory`) should suppress the low-level mysqlnd warning the way
the direct constructor path does. Not a correctness bug; tracked as a
loose end, not fixed here.