cppa-cursor-browser: Performance benchmark suite — create benchmark infra + baselines #110

Description

@clean6378-max-it

Calendar Day

Thursday, June 25, 2026 (PR 1 of 2)

Planned Effort

5 story points — sprint item #8 (Medium-High)

Depends on: Wednesday PR 1 (summary cache benchmark #7) merged — this suite organizes the summary-cache benchmarks alongside parse/export/search and captures baselines for the full set. Capstone of the week's benchmark work.

Problem

The project has individual test files that measure specific latencies but no unified benchmark infrastructure with tracked baselines, regression gates, and CI integration. benchmarks/baselines.json exists but has empty groups. Without a benchmark suite that gates CI, performance-sensitive code paths (parse, export, search, summary cache) can degrade silently across releases — "the operations most likely to degrade under growth are exactly the ones with no regression gate."

Goal

One merged PR that unifies all benchmark files under a common conftest, populates benchmarks/baselines.json from a reference run, adds a CI gate failing on >20% regression, and documents local + CI usage.

Scope

Touch points

  • tests/benchmarks/ — organize test_parse_bench.py, test_export_bench.py, test_search_bench.py, and test_summary_cache_bench.py (from sprint item #7) under a common conftest.py with shared fixtures
  • benchmarks/baselines.json — populate from a reference run
  • .github/workflows/ci.yml — add benchmark job with --benchmark-compare
  • benchmarks/README.md (new) — document local + CI usage
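
One way to get the "consistent data" that the shared fixtures need is to generate the input from a fixed seed. The sketch below is only an illustration: the record schema, the function names make_sessions and write_sessions, and the file name sessions.json are hypothetical, not taken from the project.

```python
import json
import random
from pathlib import Path


def make_sessions(count: int = 200, seed: int = 1234) -> list[dict]:
    """Build a deterministic list of fake session records (hypothetical schema)."""
    rng = random.Random(seed)  # fixed seed, so every run sees identical data
    words = ["parse", "export", "search", "cache", "cursor", "browser"]
    sessions = []
    for i in range(count):
        # Each session gets 1-10 messages of 5-40 words each.
        messages = [
            " ".join(rng.choice(words) for _ in range(rng.randint(5, 40)))
            for _ in range(rng.randint(1, 10))
        ]
        sessions.append({"id": f"session-{i:05d}", "messages": messages})
    return sessions


def write_sessions(directory: Path, sessions: list[dict]) -> Path:
    """Write the records to a single JSON file so benchmarks read real bytes."""
    path = directory / "sessions.json"
    path.write_text(json.dumps(sessions), encoding="utf-8")
    return path
```

In conftest.py, helpers like these could be wrapped in a session-scoped pytest fixture so that the parse, export, search, and summary-cache benchmarks all measure against the same input.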

Gate behavior

  • Fail if a current mean exceeds its baseline by >20%
  • Pinned runner + consistent data to control variance
  • Missing baselines for new benchmark names: warn, do not fail
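
The three gate rules above can be sketched as a small pure function. This is a minimal illustration, assuming current results and baselines are both available as name-to-mean mappings in seconds; GateResult and check_regressions are hypothetical names, and the actual gate may rely entirely on pytest-benchmark's own comparison flags.

```python
from dataclasses import dataclass, field

THRESHOLD = 0.20  # fail when a mean is more than 20% above its baseline


@dataclass
class GateResult:
    failures: list[str] = field(default_factory=list)
    warnings: list[str] = field(default_factory=list)

    @property
    def passed(self) -> bool:
        # Warnings never fail the gate; only regressions do.
        return not self.failures


def check_regressions(
    current: dict[str, float],
    baselines: dict[str, float],
    threshold: float = THRESHOLD,
) -> GateResult:
    """Compare current means against baseline means (both in seconds)."""
    result = GateResult()
    for name, mean in sorted(current.items()):
        baseline = baselines.get(name)
        if baseline is None or baseline <= 0:
            # New benchmark name (or unusable baseline): warn, do not fail.
            result.warnings.append(f"{name}: no usable baseline, skipping")
            continue
        if mean > baseline * (1 + threshold):
            pct = (mean / baseline - 1) * 100
            result.failures.append(
                f"{name}: mean {mean:.6f}s is {pct:.1f}% over baseline {baseline:.6f}s"
            )
    return result
```

Keeping the comparison free of I/O makes the threshold logic easy to unit-test separately from the benchmark run itself.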

Acceptance Criteria

  • A pytest-benchmark based suite exists covering: parse latency, export latency, search latency, and summary cache (from sprint item #7)
  • benchmarks/baselines.json is populated with initial values from a reference run
  • CI has a benchmark job that compares against baselines and fails on >20% regression
  • The benchmark job runs on a consistent environment (pinned runner, consistent data)
  • A benchmarks/README.md documents how to run benchmarks locally and update baselines
  • PR approved by at least 1 reviewer

Verification

cd C:\Users\Jasen\CppAliance\cppa-cursor-browser
.\.venv\Scripts\Activate.ps1
pytest tests/benchmarks/ --benchmark-only
pytest tests/benchmarks/ --benchmark-compare --benchmark-compare-fail=mean:20%
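
Populating benchmarks/baselines.json from a reference run could look roughly like the sketch below. It assumes the reference run was saved with pytest-benchmark's --benchmark-json option, whose report contains a "benchmarks" list with "name", "group", and "stats" entries. The baselines layout used here (a "groups" mapping of benchmark name to mean seconds) is an assumption, since the issue only says the file has empty groups, and both function names are hypothetical.

```python
import json
from pathlib import Path


def baselines_from_report(report: dict) -> dict:
    """Reduce a pytest-benchmark JSON report to {"groups": {group: {name: mean}}}."""
    groups: dict[str, dict[str, float]] = {}
    for bench in report.get("benchmarks", []):
        # Benchmarks without an explicit group are collected under "ungrouped".
        group = bench.get("group") or "ungrouped"
        groups.setdefault(group, {})[bench["name"]] = bench["stats"]["mean"]
    return {"groups": groups}


def update_baselines(report_path: Path, baselines_path: Path) -> dict:
    """Read a saved report and overwrite the baselines file with its means."""
    report = json.loads(report_path.read_text(encoding="utf-8"))
    baselines = baselines_from_report(report)
    baselines_path.write_text(
        json.dumps(baselines, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    return baselines
```

Writing the file with sorted keys keeps baseline updates easy to review as a diff in the PR.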
