fix(ci): Fix E2E flakiness without retries#5888

Draft
antonis wants to merge 1 commit into main from fix/e2e-stable-checks

Conversation

@antonis
Contributor

@antonis antonis commented Mar 25, 2026

📢 Type of change

  • Bugfix

📜 Description

Alternative to #5830 — fixes E2E test flakiness without per-flow retries.

  • Per-flow process isolation in cli.mjs: each Maestro flow runs in its own process, preventing a crash cascade
  • Maestro driver warm-up flow before the real tests: the first launchApp after simulator boot is unreliable on Tart VMs
  • Simulator boot readiness: xcrun simctl bootstatus, Settings.app warm-up, MAESTRO_DRIVER_STARTUP_TIMEOUT bumped to 180s
  • crash.yml runs first: the next flow's launchTestAppClear verifies post-crash recovery
  • Sample app test fixes: search all envelopes for the app start transaction, sort by timestamp, TTID/TTFD allow-list (navigation, ui.load)
  • execSync → execFileSync to avoid shell interpolation
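
To make the isolation, warm-up, and ordering points concrete, here is a minimal sketch of how such a runner could be shaped. It is not the code from cli.mjs: `orderFlows`, `runFlows`, and the injected `run` callback are hypothetical names, and the sketch only assumes that `run` throws on a non-zero exit code, as `execFileSync` does.

```javascript
// Hypothetical helpers; the real cli.mjs may be organized differently.

// Put crash.yml first so the flow that follows it exercises post-crash recovery.
function orderFlows(flows) {
  const crash = flows.filter((f) => f.endsWith('crash.yml'));
  const rest = flows.filter((f) => !f.endsWith('crash.yml'));
  return [...crash, ...rest];
}

// Run every flow through its own `maestro test <flow>` invocation.
// `run(command, args)` is injected and is expected to throw on failure.
function runFlows(flows, run, warmupFlow) {
  if (warmupFlow) {
    try {
      run('maestro', ['test', warmupFlow]);
    } catch (err) {
      // Warm-up is non-fatal: its only job is to get the Maestro driver started.
      console.warn(`warm-up flow failed (ignored): ${err.message}`);
    }
  }

  const failed = [];
  for (const flow of orderFlows(flows)) {
    try {
      run('maestro', ['test', flow]);
    } catch (err) {
      // Record the failure and keep going; later flows start in a fresh process.
      failed.push(flow);
    }
  }
  return failed;
}
```

In a real runner, `run` would presumably wrap `execFileSync` from `node:child_process`; injecting it keeps the ordering and failure handling testable without a simulator.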

No test coverage lost. No retries.

#skip-changelog

💡 Motivation and Context

iOS E2E tests fail on Cirrus Labs Tart VM runners due to Maestro driver connection issues on first launch, crash cascade in shared sessions, and envelope delivery order on slow VMs.

💚 How did you test it?

CI

📝 Checklist

  • I added tests to verify changes
  • No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled
  • All tests passing
  • No breaking changes

🔮 Next steps

Close #5830 if this approach proves stable.

@antonis antonis added the ready-to-merge label (Triggers the full CI test suite) Mar 25, 2026
@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Semver Impact of This PR

None (no version bump detected)

📋 Changelog Preview

This is how your changes will appear in the changelog.
Entries from this PR are highlighted with a left border (blockquote style).


This PR will not appear in the changelog.


🤖 This preview updates automatically when you update the PR.

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Android (legacy) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 460.06 ms | 501.76 ms | 41.70 ms |
| Size | 43.75 MiB | 48.08 MiB | 4.33 MiB |

Baseline results on branch: main

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 4a17c8f+dirty | 406.62 ms | 400.58 ms | -6.04 ms |
| df1f7df+dirty | 442.64 ms | 427.16 ms | -15.48 ms |
| a483f9f+dirty | 396.82 ms | 453.28 ms | 56.46 ms |
| 60cd796+dirty | 445.84 ms | 492.45 ms | 46.61 ms |
| 5c16cdc+dirty | 423.48 ms | 452.35 ms | 28.88 ms |
| 80e4616+dirty | 411.58 ms | 462.12 ms | 50.54 ms |
| 55b77fc+dirty | 411.87 ms | 417.16 ms | 5.29 ms |
| bca62c0+dirty | 414.36 ms | 451.06 ms | 36.70 ms |
| 0b64753+dirty | 448.67 ms | 474.61 ms | 25.94 ms |
| 4e6d7d7+dirty | 480.73 ms | 515.73 ms | 35.00 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 4a17c8f+dirty | 43.75 MiB | 47.99 MiB | 4.24 MiB |
| df1f7df+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
| a483f9f+dirty | 43.75 MiB | 48.41 MiB | 4.66 MiB |
| 60cd796+dirty | 43.75 MiB | 48.07 MiB | 4.32 MiB |
| 5c16cdc+dirty | 17.75 MiB | 19.68 MiB | 1.94 MiB |
| 80e4616+dirty | 43.75 MiB | 48.55 MiB | 4.80 MiB |
| 55b77fc+dirty | 43.75 MiB | 47.99 MiB | 4.24 MiB |
| bca62c0+dirty | 43.75 MiB | 48.41 MiB | 4.66 MiB |
| 0b64753+dirty | 17.75 MiB | 19.70 MiB | 1.95 MiB |
| 4e6d7d7+dirty | 43.75 MiB | 48.40 MiB | 4.64 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 405.82 ms | 423.92 ms | 18.10 ms |
| 9530cff+dirty | 399.24 ms | 427.22 ms | 27.98 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |
| 9530cff+dirty | 43.75 MiB | 48.08 MiB | 4.33 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

iOS (legacy) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 1203.77 ms | 1224.11 ms | 20.34 ms |
| Size | 3.38 MiB | 4.73 MiB | 1.35 MiB |

Baseline results on branch: main

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ea3e26e+dirty | 1229.13 ms | 1228.46 ms | -0.67 ms |
| 80e4616+dirty | 1221.32 ms | 1225.64 ms | 4.32 ms |
| 818a608+dirty | 1205.76 ms | 1208.00 ms | 2.24 ms |
| 77061ed+dirty | 1233.16 ms | 1234.88 ms | 1.71 ms |
| bef3709+dirty | 1222.07 ms | 1220.24 ms | -1.83 ms |
| a206511+dirty | 1185.00 ms | 1186.35 ms | 1.35 ms |
| 74979ac+dirty | 1210.49 ms | 1213.31 ms | 2.82 ms |
| a2bb688+dirty | 1223.53 ms | 1232.90 ms | 9.37 ms |
| 8a868fe+dirty | 1221.50 ms | 1230.78 ms | 9.28 ms |
| d590428+dirty | 1211.77 ms | 1220.51 ms | 8.75 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ea3e26e+dirty | 3.41 MiB | 4.58 MiB | 1.17 MiB |
| 80e4616+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| 818a608+dirty | 2.63 MiB | 3.91 MiB | 1.28 MiB |
| 77061ed+dirty | 2.63 MiB | 3.98 MiB | 1.34 MiB |
| bef3709+dirty | 3.38 MiB | 4.78 MiB | 1.40 MiB |
| a206511+dirty | 3.41 MiB | 4.67 MiB | 1.25 MiB |
| 74979ac+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| a2bb688+dirty | 2.63 MiB | 3.99 MiB | 1.36 MiB |
| 8a868fe+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| d590428+dirty | 3.38 MiB | 4.78 MiB | 1.39 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 1195.00 ms | 1190.48 ms | -4.52 ms |
| 9530cff+dirty | 1230.51 ms | 1231.96 ms | 1.45 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 9530cff+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

iOS (new) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 1218.96 ms | 1223.65 ms | 4.69 ms |
| Size | 3.38 MiB | 4.73 MiB | 1.35 MiB |

Baseline results on branch: main

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ea3e26e+dirty | 1216.61 ms | 1214.15 ms | -2.47 ms |
| 80e4616+dirty | 1206.90 ms | 1205.94 ms | -0.96 ms |
| 818a608+dirty | 1218.84 ms | 1223.18 ms | 4.34 ms |
| 77061ed+dirty | 1210.77 ms | 1218.45 ms | 7.68 ms |
| bef3709+dirty | 1217.79 ms | 1225.33 ms | 7.54 ms |
| a206511+dirty | 1225.02 ms | 1223.74 ms | -1.28 ms |
| 74979ac+dirty | 1212.33 ms | 1212.54 ms | 0.21 ms |
| a2bb688+dirty | 1244.82 ms | 1238.60 ms | -6.22 ms |
| 8a868fe+dirty | 1206.85 ms | 1215.04 ms | 8.19 ms |
| d590428+dirty | 1221.23 ms | 1225.27 ms | 4.03 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| ea3e26e+dirty | 3.41 MiB | 4.58 MiB | 1.17 MiB |
| 80e4616+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| 818a608+dirty | 3.19 MiB | 4.48 MiB | 1.29 MiB |
| 77061ed+dirty | 3.19 MiB | 4.54 MiB | 1.36 MiB |
| bef3709+dirty | 3.38 MiB | 4.78 MiB | 1.40 MiB |
| a206511+dirty | 3.41 MiB | 4.67 MiB | 1.25 MiB |
| 74979ac+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| a2bb688+dirty | 3.19 MiB | 4.56 MiB | 1.37 MiB |
| 8a868fe+dirty | 3.38 MiB | 4.60 MiB | 1.22 MiB |
| d590428+dirty | 3.38 MiB | 4.78 MiB | 1.39 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 1246.91 ms | 1241.61 ms | -5.30 ms |
| 9530cff+dirty | 1220.98 ms | 1216.18 ms | -4.80 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |
| 9530cff+dirty | 3.38 MiB | 4.73 MiB | 1.35 MiB |

@github-actions
Contributor

github-actions bot commented Mar 25, 2026

Android (new) Performance metrics 🚀

| | Plain | With Sentry | Diff |
|---|---|---|---|
| Startup time | 440.46 ms | 484.47 ms | 44.01 ms |
| Size | 43.94 MiB | 48.94 MiB | 5.00 MiB |

Baseline results on branch: main

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 70250df+dirty | 418.08 ms | 480.84 ms | 62.76 ms |
| 8d89cc9+dirty | 357.69 ms | 415.79 ms | 58.10 ms |
| 1853710+dirty | 360.67 ms | 396.28 ms | 35.61 ms |
| 55b77fc+dirty | 410.46 ms | 414.11 ms | 3.65 ms |
| 69602ce+dirty | 375.37 ms | 405.28 ms | 29.91 ms |
| c1573b3+dirty | 355.65 ms | 448.82 ms | 93.17 ms |
| 90afdd3+dirty | 367.79 ms | 404.84 ms | 37.05 ms |
| 955f2eb+dirty | 388.13 ms | 433.56 ms | 45.44 ms |
| 80e4616+dirty | 427.31 ms | 461.15 ms | 33.84 ms |
| 276d348+dirty | 356.30 ms | 405.27 ms | 48.97 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| 70250df+dirty | 43.94 MiB | 48.91 MiB | 4.97 MiB |
| 8d89cc9+dirty | 7.15 MiB | 8.41 MiB | 1.26 MiB |
| 1853710+dirty | 7.15 MiB | 8.41 MiB | 1.26 MiB |
| 55b77fc+dirty | 43.94 MiB | 48.82 MiB | 4.88 MiB |
| 69602ce+dirty | 7.15 MiB | 8.41 MiB | 1.26 MiB |
| c1573b3+dirty | 7.15 MiB | 8.42 MiB | 1.27 MiB |
| 90afdd3+dirty | 7.15 MiB | 8.43 MiB | 1.28 MiB |
| 955f2eb+dirty | 7.15 MiB | 8.42 MiB | 1.27 MiB |
| 80e4616+dirty | 43.94 MiB | 49.38 MiB | 5.44 MiB |
| 276d348+dirty | 7.15 MiB | 8.42 MiB | 1.26 MiB |

Previous results on branch: fix/e2e-stable-checks

Startup times

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 444.45 ms | 473.36 ms | 28.91 ms |
| 9530cff+dirty | 395.80 ms | 426.40 ms | 30.60 ms |

App size

| Revision | Plain | With Sentry | Diff |
|---|---|---|---|
| f5fb57c+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |
| 9530cff+dirty | 43.94 MiB | 48.94 MiB | 5.00 MiB |

@antonis antonis force-pushed the fix/e2e-stable-checks branch from 05eb0bd to 605f13a March 26, 2026 10:07
@antonis antonis changed the title from "fix(ci): Fix E2E flakiness with stable checks instead of retries" to "fix(ci): Fix E2E flakiness without retries" Mar 26, 2026
@antonis antonis force-pushed the fix/e2e-stable-checks branch from 2da1a10 to 0d6c6da March 26, 2026 14:36
Replace retry-based approach (PR #5830) with deterministic fixes:

### Simulator stability (Cirrus Labs Tart VMs)
- `wait_for_boot: true` / `erase_before_boot: false` on simulator-action
- `xcrun simctl bootstatus booted -b` to block until boot completes
- Settings.app warm-up for SpringBoard/system service initialization
- `MAESTRO_DRIVER_STARTUP_TIMEOUT` bumped to 180s
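
The boot-readiness steps above could be expressed as data, sketched here in JavaScript to match cli.mjs (the PR may well run them as plain workflow steps instead). The function names are hypothetical, and the Settings bundle id `com.apple.Preferences`, the `terminate` step, and the millisecond unit for the timeout are assumptions rather than details confirmed by this PR.

```javascript
// Commands are [command, args] pairs so they can be passed to execFileSync
// without going through a shell.
function bootReadinessSteps(device = 'booted') {
  return [
    // Blocks until the device has finished booting; -b boots it first if needed.
    ['xcrun', ['simctl', 'bootstatus', device, '-b']],
    // Launching Settings gives SpringBoard and system services time to initialize.
    // Assumption: com.apple.Preferences is the Settings.app bundle id.
    ['xcrun', ['simctl', 'launch', device, 'com.apple.Preferences']],
    // Assumption: the warm-up app is closed again before the tests start.
    ['xcrun', ['simctl', 'terminate', device, 'com.apple.Preferences']],
  ];
}

// Environment for the Maestro child process with the raised driver timeout.
// Assumption: the variable is read in milliseconds, so 180s becomes '180000'.
function maestroEnv(baseEnv) {
  return { ...baseEnv, MAESTRO_DRIVER_STARTUP_TIMEOUT: '180000' };
}
```

Keeping the steps as data makes it easy to log them or to run them one by one and stop at the first failure.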

### e2e-v2 test runner (cli.mjs)
- Per-flow process isolation via individual `maestro test` calls
- Maestro driver warm-up flow before real tests (non-fatal)
- crash.yml runs first so the next flow verifies post-crash recovery
- `execSync` → `execFileSync` to avoid shell interpolation
- SENTRY_AUTH_TOKEN redaction in debug logs
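
As an illustration of the last two points, the sketch below keeps a command as a file plus an argument array (the form `execFileSync` accepts, which by default spawns no shell) and redacts the token only in the string that gets logged. `redactSecrets` and `describeCommand` are hypothetical helpers, not names taken from cli.mjs.

```javascript
// Replace every occurrence of the token value with a placeholder before logging.
function redactSecrets(text, env = process.env) {
  const token = env.SENTRY_AUTH_TOKEN;
  if (!token) return text; // nothing to redact
  return text.split(token).join('[REDACTED]');
}

// Build a printable form of a command for debug output.
// The joined string is for display only; the actual call should still receive
// `command` and `args` separately so no shell ever interprets the arguments.
function describeCommand(command, args, env = process.env) {
  return redactSecrets([command, ...args].join(' '), env);
}
```

Because arguments are never concatenated into a shell string, a flow path containing spaces or `$` characters is passed through unchanged.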

### Sample application test fixes
- Search all envelopes for app start transaction (slow VM delivery)
- Sort envelopes by timestamp for deterministic ordering
- Allow-list for TTID/TTFD ops (`navigation`, `ui.load`)
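
A rough sketch of these three test fixes follows. It assumes a simplified shape in which each captured envelope is an array of item payloads and a transaction payload looks like `{ type: 'transaction', timestamp, contexts, spans }`; the helper names and the `app.start` span-op prefix are also assumptions, so the sample app's real helpers may differ.

```javascript
// Transaction ops that are allowed to carry TTID/TTFD data.
const ALLOWED_DISPLAY_OPS = new Set(['navigation', 'ui.load']);

// Collect transactions from all envelopes and order them by timestamp,
// so the result does not depend on delivery order.
function transactionsByTimestamp(envelopes) {
  return envelopes
    .flat()
    .filter((item) => item && item.type === 'transaction')
    .sort((a, b) => a.timestamp - b.timestamp);
}

// Search every envelope, not just the first, for the app start transaction.
function findAppStartTransaction(envelopes) {
  return transactionsByTimestamp(envelopes).find((tx) =>
    (tx.spans || []).some((span) => (span.op || '').startsWith('app.start')),
  );
}

// True when the transaction's trace op is on the allow-list.
function hasAllowedDisplayOp(tx) {
  const op = tx.contexts && tx.contexts.trace && tx.contexts.trace.op;
  return ALLOWED_DISPLAY_OPS.has(op);
}
```

Sorting before searching means a slow VM that delivers envelopes late or out of order still yields the same transaction.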

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@antonis antonis force-pushed the fix/e2e-stable-checks branch from 0d6c6da to 60075ed March 26, 2026 15:20

Labels

ready-to-merge Triggers the full CI test suite
