Skip to content

[Security hardening] Add automated security audit workflow#2442

Open
PascalThuet wants to merge 36 commits into
github:mainfrom
PascalThuet:codex/add-security-audit-workflow
Open

[Security hardening] Add automated security audit workflow#2442
PascalThuet wants to merge 36 commits into
github:mainfrom
PascalThuet:codex/add-security-audit-workflow

Conversation

@PascalThuet

@PascalThuet PascalThuet commented May 2, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Add a dedicated Security Audit workflow for dependency and static-analysis checks.
  • Audit committed hashed requirements on push, pull request, and manual dispatch so ordinary CI stays deterministic.
  • Verify committed audit requirements are refreshed when dependency inputs change, while keeping live dependency resolution for the scheduled audit across the supported Python and OS matrix.
  • Run Bandit for high-severity findings with a scoped baseline for the accepted shell-step B602 finding.
  • Pin all repository GitHub Actions to full commit SHAs.
  • Harden extension, preset, integration, and workflow downloads with bounded reads, strict HTTPS redirects, optional sha256 verification, bounded ZIP extraction, and source-path traversal guards.

Security context

This creates a repeatable CI baseline for dependency advisories and high-severity static-analysis issues while preserving a scheduled live-resolution signal for newly published dependency problems. It also closes similar supply-chain and archive-handling gaps found during follow-up review.

Closes #2438

Validation

  • git diff --check
  • uvx ruff check on all changed Python surfaces and tests
  • uv run python -m pytest tests/test_download_security.py tests/test_extensions.py tests/test_presets.py tests/integrations/test_integration_catalog.py tests/test_security_workflow.py tests/test_upgrade.py -q
  • uv run python -m pytest -q
  • uv pip compile pyproject.toml --extra test --universal --generate-hashes --quiet --no-header --output-file , compared with .github/security-audit-requirements.txt
  • uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r .github/security-audit-requirements.txt --progress-spinner off
  • uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json

@PascalThuet PascalThuet marked this pull request as ready for review May 2, 2026 06:45
@PascalThuet PascalThuet requested a review from mnriem as a code owner May 2, 2026 06:45
@mnriem mnriem requested a review from Copilot May 4, 2026 12:39

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new GitHub Actions workflow to introduce automated security checks for the Python codebase, aiming to catch dependency advisories and high-severity static-analysis findings in CI alongside the existing test/lint workflows.

Changes:

  • Add .github/workflows/security.yml with a dedicated Security Audit workflow.
  • Run pip-audit on pushes to main, pull requests, a weekly cron, and manual dispatch.
  • Run Bandit against src/, temporarily skipping B602 pending the shell-step hardening tracked in #2440.
Show a summary per file
File Description
.github/workflows/security.yml New CI workflow that adds dependency-audit and static-analysis jobs for the Python project.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 1/1 changed files
  • Comments generated: 5

Comment thread .github/workflows/security.yml Outdated
Comment thread .github/workflows/security.yml Outdated
Comment thread .github/workflows/security.yml Outdated
Comment thread .github/workflows/security.yml Outdated
Comment thread .github/workflows/security.yml Outdated

@mnriem mnriem left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address Copilot feedback. If not applicable, please explain why

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Addressed the Copilot feedback in the latest push:

  • Dependency auditing now exports the locked runtime + test extra dependency set with uv export --extra test --frozen --no-emit-project, then audits the generated requirements file.
  • pip-audit and Bandit are pinned via uvx --from (pip-audit==2.10.0, bandit==1.9.4).
  • Removed the global Bandit --skip B602; the existing shell execution call sites now use targeted # nosec B602 suppressions with local context.
  • Added CONTRIBUTING.md security-check commands so contributors can reproduce the workflow locally.

Local validation:

uv export --quiet --extra test --frozen --format requirements.txt --no-emit-project --output-file /tmp/spec-kit-audit-requirements.txt
uvx --from pip-audit==2.10.0 pip-audit -r /tmp/spec-kit-audit-requirements.txt --progress-spinner off
uvx --from bandit==1.9.4 bandit -r src -lll
git diff --check

Results: pip-audit reported no known vulnerabilities, Bandit reported no issues, and git diff --check passed.

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Added automated regression coverage for the security workflow in tests/test_security_workflow.py.

The new tests statically verify that:

  • dependency auditing exports the locked runtime + test extra requirements before running pip-audit;
  • pip-audit and Bandit remain pinned via uvx --from;
  • Bandit does not use a global --skip B602;
  • CONTRIBUTING.md documents the same security commands.

Validation after this commit:

uv run python -m pytest tests/test_security_workflow.py -q
uv export --quiet --extra test --frozen --format requirements.txt --no-emit-project --output-file /tmp/spec-kit-audit-requirements.txt
uvx --from pip-audit==2.10.0 pip-audit -r /tmp/spec-kit-audit-requirements.txt --progress-spinner off
uvx --from bandit==1.9.4 bandit -r src -lll
git diff --check

All passed locally.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

src/specify_cli/init.py:414

  • This second # nosec B602 suppresses Bandit for the non-capturing branch as well, so neither path through run_command will ever raise B602 again. If the intent is only to defer the current finding until #2440, the skip needs to stay in the workflow/configuration layer rather than permanently muting this call site in source.
            # shell=True is only available to callers that opt in explicitly.
            subprocess.run(cmd, check=check_return, shell=shell)  # nosec B602
  • Files reviewed: 5/5 changed files
  • Comments generated: 5

Comment thread .github/workflows/security.yml Outdated
Comment thread CONTRIBUTING.md Outdated
Comment thread tests/test_security_workflow.py Outdated
Comment thread src/specify_cli/workflows/steps/shell/__init__.py Outdated
Comment thread src/specify_cli/__init__.py Outdated
@mnriem

mnriem commented May 5, 2026

Copy link
Copy Markdown
Collaborator

Please address Copilot feedback. If not applicable, please explain why. Note the shell step should indeed be ignored

@mnriem mnriem left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments above

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Addressed the follow-up review in e1e8051:

  • Removed --frozen from the dependency export because this repo intentionally does not commit uv.lock; the workflow and contributor docs now use the same clean-checkout-safe command.
  • Replaced source-level # nosec B602 suppressions with a CI-level Bandit baseline at .github/bandit-baseline.json.
  • Kept that baseline scoped to exactly one accepted finding: B602 in ShellStep, matching the maintainer note that the shell step should be ignored.
  • Hardened the generic run_command helper so shell=True is rejected instead of suppressed, which removes the similar non-shell-step B602 sink entirely.
  • Expanded tests/test_security_workflow.py to catch these regressions: no lockfile flags, pinned tools, no global --skip B602, one-entry Bandit baseline, no source # nosec B602, docs in sync, and run_command(shell=True) rejection.

Validation:

uv export --quiet --extra test --format requirements.txt --no-emit-project --output-file /tmp/spec-kit-audit-requirements.txt
uvx --from pip-audit==2.10.0 pip-audit -r /tmp/spec-kit-audit-requirements.txt --progress-spinner off
uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json
uv run python -m pytest tests/test_security_workflow.py tests/test_workflows.py -q
uvx ruff check src/
git diff --check

I also verified the uv export command in a temporary clean directory without uv.lock; it succeeds before running the audit.

@PascalThuet

Copy link
Copy Markdown
Contributor Author

One more small cleanup after re-review: pushed 6f1da27 to use uv pip compile pyproject.toml --extra test for the audit requirements instead of uv export.

Reason: it still audits the runtime + test extra dependency set, but it avoids both requiring and generating a project uv.lock, which makes the CI and CONTRIBUTING.md command cleaner for this repo.

Revalidated locally:

uv pip compile pyproject.toml --extra test --quiet --output-file /tmp/spec-kit-audit-requirements.txt
uvx --from pip-audit==2.10.0 pip-audit -r /tmp/spec-kit-audit-requirements.txt --progress-spinner off
uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json
uv run python -m pytest tests/test_security_workflow.py -q
git diff --check

All passed.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (1)

tests/test_security_workflow.py:91

  • Bandit doesn't require the exact literal # nosec B602 to suppress this finding; plain # nosec, #nosec, and multi-ID forms also disable B602. This check leaves those suppression paths untested, so a future source-level bypass can slip in without failing CI.
        )

    def test_b602_is_not_suppressed_in_source(self):
        source_text = "\n".join(
            path.read_text(encoding="utf-8")
            for path in (REPO_ROOT / "src").rglob("*.py")
        )
  • Files reviewed: 5/5 changed files
  • Comments generated: 3

Comment thread .github/workflows/security.yml Outdated
Comment thread CONTRIBUTING.md Outdated
Comment thread tests/test_security_workflow.py Outdated
@mnriem

mnriem commented May 5, 2026

Copy link
Copy Markdown
Collaborator

Please address Copilot feedback

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Addressed the latest Copilot review in 129d19e:

  • Dependency audit now runs on the supported OS/Python matrix: ubuntu-latest and windows-latest across Python 3.11, 3.12, and 3.13.
  • The requirements output path in CI now uses ${{ runner.temp }}, and the contributor docs use a relative spec-kit-audit-requirements.txt path instead of /tmp, so the commands are Windows-compatible.
  • The security workflow regression test now verifies GitHub Actions are pinned to full 40-character commit SHAs.
  • The Bandit suppression guard now catches any # nosec form, not only the exact # nosec B602 spelling.
  • The dependency audit now compiles hashed requirements and runs pip-audit --disable-pip --require-hashes, avoiding a second pip resolution step.

Validation:

uv run python -m pytest tests/test_security_workflow.py -q
uv pip compile pyproject.toml --extra test --python-version 3.11 --generate-hashes --quiet --output-file /tmp/spec-kit-audit-py311.txt
uv pip compile pyproject.toml --extra test --python-version 3.12 --generate-hashes --quiet --output-file /tmp/spec-kit-audit-py312.txt
uv pip compile pyproject.toml --extra test --python-version 3.13 --generate-hashes --quiet --output-file /tmp/spec-kit-audit-py313.txt
uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r /tmp/spec-kit-audit-py311.txt --progress-spinner off
uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r /tmp/spec-kit-audit-py312.txt --progress-spinner off
uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r /tmp/spec-kit-audit-py313.txt --progress-spinner off
uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json
git diff --check

All passed locally.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 5/5 changed files
  • Comments generated: 5

Comment thread .github/workflows/security.yml Outdated
Comment thread CONTRIBUTING.md Outdated
Comment thread tests/test_security_workflow.py
Comment thread src/specify_cli/__init__.py Outdated
Comment thread .github/workflows/security.yml

@mnriem mnriem left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please address Copilot feedback

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Addressed the latest Copilot review round in commit aa02062:

  • Push/PR/manual dependency audits now use committed hashed requirements at .github/security-audit-requirements.txt for deterministic CI.
  • Scheduled audits still perform live resolution across the supported Python/OS matrix.
  • CONTRIBUTING now runs pip-audit from the committed requirements file and documents how to refresh it without creating a root-level artifact.
  • The security workflow tests now assert triggers, deterministic audit behavior, action SHA pins, hashed committed requirements, and the absence of run_command(shell=...).
  • PR validation notes now match the Bandit baseline command used by CI.

Local validation passed:

  • git diff --check
  • uv run python -m pytest tests/test_security_workflow.py -q
  • uv run python -m pytest -q
  • uvx ruff check src tests/test_security_workflow.py
  • uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r .github/security-audit-requirements.txt --progress-spinner off
  • uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json

The new GitHub workflow runs are currently waiting for fork PR approval (action_required).

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

  • Files reviewed: 6/6 changed files
  • Comments generated: 2

Comment thread .github/workflows/security.yml
Comment thread tests/test_security_workflow.py
PascalThuet and others added 23 commits June 17, 2026 17:03
Polishes from the previous review pass that I noticed after pushing.

- security.yml: drop the unneeded fetch-depth: 0 from the secret-scan
  checkout. detect-secrets-hook reads the working tree only — fetching
  full history slows the job without adding signal.
- security.yml: add a follow-on step that surfaces the Bandit medium-
  severity informational pass in $GITHUB_STEP_SUMMARY. With
  continue-on-error: true the previous step never marks the job
  yellow/red, so findings were buried in the log; the summary now flags
  them with a ⚠️ heading (or ✅ when clean) at the top of the run page.
- CONTRIBUTING.md: document the new tooling and gates so contributors
  don't bounce off CI:
  - detect-secrets-hook command + how to regenerate .secrets.baseline
  - the bandit baseline label gate (security-baseline-change)
  - shellcheck --severity=error invocation
  - explicit note that committed security-audit-requirements.txt can
    drift purely from upstream package releases and needs periodic
    regeneration even on unrelated PRs.
Four hardening / robustness items raised during self-review of the PR.

- check_bandit_baseline.py: normalize whitespace in the code-snippet
  hash that's part of each entry's identity. A bandit version bump that
  reformats the snippet (different number of context lines, different
  indentation) would otherwise make every baseline entry look "new",
  forcing the security-baseline-change label on every unrelated PR.
- security.yml + check_secrets_baseline.py: symmetric growth gate on
  .secrets.baseline. detect-secrets-hook already blocks unknown secrets,
  but extending the baseline (whitelisting a new finding) was silent.
  Mirror the bandit gate — PR must carry secrets-baseline-change to
  acknowledge any new identity (filename + line + type + hashed_secret).
- test_security_workflow.py: drop the brittle exact-name lookup for the
  blocking bandit step. The test now finds it by the baseline-arg
  signature, so future renames of the step don't silently bypass the
  --skip B602 check. Added _find_step_by_run_signature helper that
  insists on exactly one match. Strict assertions on OS matrix and tool
  version pins are kept — those are intentional security choices.
- workflows/PUBLISHING.md: the shell-step NOTE in
  src/specify_cli/workflows/steps/shell/__init__.py points authors here
  for "security guidance", but the section didn't exist. Added an
  explicit "Security: shell steps execute arbitrary code" subsection
  covering the no-sandbox model, the inspect-before-install obligation,
  input-interpolation hygiene, and reviewer expectations.
Two items from the second self-review:

- workflows/PUBLISHING.md: fix invented command name. The first draft
  recommended `specify workflow inspect <name>` which doesn't exist —
  the actual subcommand is `workflow info`, and even that only shows
  metadata (name/version/inputs/step IDs+types), never the shell
  `run` content. Replace with explicit guidance to read the raw
  workflow.yml directly when auditing shell steps.
- tests/test_baseline_gates.py: new file. 12 unit tests covering both
  check_bandit_baseline.py and check_secrets_baseline.py — no PR base
  ref, introduction (baseline absent at base), identical baselines,
  growth without ack label, growth with ack label, swap attack
  (constant count, new identity), and (bandit-only) whitespace-only
  drift in the code snippet hash. The latter verifies the
  normalization added earlier protects against bandit reformatting
  its output.
Three polish items from the third self-review pass.

- tests/test_upgrade.py: new TestBoundedRead class. Pins the contract
  that _fetch_latest_release_tag wraps the response body through
  read_response_limited with max_bytes=1024*1024. Protects the hardening
  against a silent revert to `resp.read()` in a future refactor (the
  extraction to _version.py during the last merge would have lost it
  if we hadn't caught it manually).
- tests/test_baseline_gates.py: replace the two near-identical test
  classes with a parametrized TestSharedBaselineGate (bandit + secrets
  via a GateConfig dataclass and a `gate` fixture). Bandit-only quirks
  (no-base-ref short-circuit, whitespace-normalized identity) stay in
  TestBanditSpecific. Removes ~80 lines of duplication; the two scripts
  now exercise the same scenarios by construction so a divergence is
  caught the moment one drifts.
- tests/test_baseline_gates.py: new shared scenario
  test_corrupt_json_at_base_falls_back_to_empty. Covers the
  except JSONDecodeError branch of _read_baseline_at — corrupt base
  doesn't crash the script; instead the head set becomes "all new" and
  the normal label gate fires. Was previously dead code from a coverage
  standpoint.

3009 passed (up from 3006 — 14 baseline tests now parametrized as
12 + 2 bandit-specific, plus 1 new bounded-read test).
Three micro-cleanups raised during review github#4 of my own work — no
behavior change, just clarity.

- Replace the __import__("specify_cli._download_security", fromlist=...)
  dance with a plain `import ... as _real_read_response_limited` at the
  top of the file. Easier to grep, no runtime difference.
- Type the recorded dict explicitly and make max_bytes/label keyword-
  only without defaults on the spy. If a future refactor drops either
  argument the spy now raises TypeError immediately, instead of
  silently recording None and tripping the post-call assertion with a
  more confusing message.
- Tighten the label check from fuzzy substring match ("github" in
  label.lower()) to exact equality ("GitHub latest release"). Both
  catch regressions; exact equality also catches typos.
Six items from the new Copilot pass. Three were latent bugs in the
guardrails added by earlier commits, two are documentation/wording, one
is parity coverage.

Bugs
- security.yml: the MEDIUM Bandit informational pass ran without
  --baseline, so the whitelisted HIGH B602 finding re-fired there on
  every run, turning the job summary into a permanent warning. Apply
  the same baseline to both passes; medium-only NEW findings now
  surface, as intended.
- security.yml: the summary step ran with if: always() but the MEDIUM
  pass has the default if: success() — when the blocking HIGH step
  fails, the MEDIUM pass is skipped (outcome=skipped, not failure) and
  the summary wrote "✅ clean" anyway. Switch to a case statement that
  handles failure/success/skipped distinctly (⚠️ / ✅ / ⏭️).
- check_bandit_baseline.py and check_secrets_baseline.py used
  `git show <head_ref>:` on both sides, so an unreadable/unfetched
  head ref returned empty results and the diff computed 0 new
  identities → fail-open. Read the head side from the working tree
  instead (CI is checked out at the PR head), fail-closed when the
  file is missing, and SystemExit on corrupt JSON. The base side
  keeps the lenient JSONDecodeError fallback because that's historical
  state we can't change.

Wording
- security.yml + CONTRIBUTING.md: both mentioned `# nosec` as a
  suppression mechanism, but tests/test_security_workflow.py::
  test_bandit_nosec_is_not_suppressed_in_source explicitly forbids
  `# nosec` under src/. Replace with the actually-supported paths
  (bandit baseline for HIGH findings, `# noqa: S6xx` for ruff
  subprocess-shell rules) and flag the forbidden-comment policy.

Parity coverage
- tests/test_security_workflow.py: three new tests for the secret-scan
  job mirroring the dependency-audit / static-analysis coverage —
  detect-secrets-hook command, baseline path, excluded paths, growth
  gate env wiring (BASE only, no HEAD env), and fetch-depth: 0.
- tests/test_workflows.py: regression test that
  WorkflowCatalog._fetch_single_catalog routes through
  read_response_limited with error_type=WorkflowCatalogError and label
  "workflow catalog". Mirrors TestBoundedRead for _fetch_latest_
  release_tag and the equivalent test in test_integration_catalog.py.
- tests/test_baseline_gates.py: two new fail-closed cases (head
  missing in working tree, head corrupt in working tree); drop the
  now-unused head_sha returns and the head env var from GateHandle.run.

Note: Copilot also flagged "no tests on baseline gate scripts" — those
tests already shipped in tests/test_baseline_gates.py (commit 2fd8071,
posted before the review). Updated here with the new fail-closed cases.

Tests: 3017 passed (was 3009).
- read_response_limited: read in a loop until EOF or one byte past the
  limit instead of a single read(max_bytes + 1). A server using chunked
  transfer encoding can return fewer bytes per read() than requested
  while streaming more than max_bytes total, defeating the single-read
  bound. Add regression tests for the short-read and within-limit paths.
- _download_security: annotate _raise / _raise_from as NoReturn so type
  checkers treat call sites as unreachable.
- Extract the duplicated is_https_or_localhost_http redirect-safety
  predicate into _download_security and import it from both _github_http
  and authentication/http so the rule lives in one place.
- azure_devops: stop catching broad ValueError/KeyError around token
  acquisition; give the bounded read a dedicated _TokenResponseTooLarge
  type and catch only URLError, OSError, JSONDecodeError, and that type
  so unrelated programming errors still surface.
- tests: make response mocks faithful streams (advancing cursor, b"" at
  EOF) so the bounded read loop terminates as it would against a real
  http.client.HTTPResponse.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- __init__.py preset/extension URL installs: give read_response_limited a
  domain error_type (PresetError / ExtensionError) and catch that instead
  of a blanket ValueError, so an oversized body is reported cleanly while
  unrelated ValueErrors surface as real errors. The extension catch now
  also covers install_from_zip's ValidationError (an ExtensionError).
- _utils.run_command: rewrite the misleading docstring — shell=False is the
  only honoured mode; shell=True is rejected with ValueError, the parameter
  is retained only so existing keyword callers don't hit TypeError.
- _download_security: document that the loopback allowance is an exact-string
  match (not an IP-range check), that read_response_limited's max_bytes
  default is the 50 MiB ceiling (callers with tighter budgets should pass an
  explicit value), and how _safe_zip_name handles single trailing-slash
  directory markers vs malformed empty segments.
- authentication/http: comment the empty-hosts _StripAuthOnRedirect use as the
  HTTPS-downgrade guard on the unauthenticated path.
- check_security_requirements: document the HEAD^ fallback failing safe (audit
  anyway) on shallow / single-commit checkouts.
- security.yml: document the universal committed snapshot vs per-Python
  scheduled compile distinction.
- tests: add a regression test that a symlink alongside benign members is
  rejected with no partial extraction to disk.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Audit follow-up to the download-hardening work, closing similar cases
within the scope of this PR:

- Add read_zip_member_limited() and use it for the inline extension.yml
  read in the extension *update* path (__init__.py). That read happened
  before install_from_zip()'s safe_extract_zip(), so a raw zf.open().read()
  bypassed the per-member size bound: a manifest declaring a huge file_size
  (few KB compressed, gigabytes uncompressed) would be fully loaded by
  yaml.safe_load. The helper rejects on declared size and reads bounded.
- Route the Azure DevOps OAuth token request through a strict-redirect
  opener so a 307/308 redirect cannot forward the client_secret POST body
  to a non-HTTPS, non-loopback host.
- Tests for the new helper and the updated ADO opener path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…P bytes

Addresses the Copilot review on PR github#2442 and the same pattern elsewhere.

- safe_extract_zip(): track the cumulative bytes actually written and fail past
  max_total_bytes, so the total-size bound holds even if member headers
  understate file_size (the declared-total check alone could be evaded). Mirrors
  the existing per-member written guard — defense-in-depth consistency.
- Pass an explicit max_bytes to read_response_limited() at every JSON call site
  instead of inheriting the 50 MiB archive/payload default:
    * MAX_JSON_METADATA_BYTES (1 MiB): Azure AD token, GitHub release metadata,
      and the existing latest-release fetch (migrated off an inline literal).
    * MAX_JSON_CATALOG_BYTES (8 MiB): preset, extension, workflow and
      integration catalog fetches.
  Binary/archive downloads keep the 50 MiB ceiling.

Both ceilings are centralized as documented constants in _download_security.py.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- pin actions/checkout to the repo-wide df4cb1c (v6.0.3) in lint.yml
  and security.yml
- replace the ad-hoc ip_address loopback checks in the workflow add
  URL/catalog flows with the shared is_https_or_localhost_http
  predicate, so HTTP-on-loopback rules match the redirect handler
- drop the empty member name from the zip dot-segment test: zipfile
  cannot write such an entry, the case crashed in the test itself
…gates

- align the setup-uv pin in security.yml with test.yml (v8.2.0)
- use is_https_or_localhost_http for the preset_add/extension_add URL
  checks and pass strict_redirects=True to the latest-release fetch and
  the release-asset resolver call sites
- baseline gate scripts fail closed on unresolvable refs and git read
  errors instead of treating them as "baseline did not exist"; the
  security workflow re-runs on labeled/unlabeled so the ack label can
  turn the gate green without a push
- regenerate the bandit baseline against HEAD (two entries referenced
  removed code, one had drifted); track baseline entries by
  file+test_id in tests so line drift no longer breaks them
- raise ZIP size-limit errors outside the broad except in
  safe_extract_zip so an error_type subclassing OSError/RuntimeError
  cannot re-wrap them
- tests: drop two redirect tests duplicated from test_authentication,
  move the downgrade test next to its siblings, assert the workflow
  catalog max_bytes, route OpenerDirector.open through urlopen in the
  modules that patch urlopen, add set -euo pipefail to the secret scan,
  misc cleanup (unused helper, redundant imports, EOF-less fake read)
is_https_or_localhost_http allows HTTP for localhost, 127.0.0.1 and
::1; the user-facing messages and the open_url docstring only said
localhost.
The non-HTTPS redirect rejection in _StripAuthOnRedirect applies to every
authenticated attempt regardless of strict_redirects; the flag only extends
the same guard to the unauthenticated fallback. Document both guards on the
class and correct the open_url docstring, which previously gated the whole
scheme restriction under strict_redirects.
A URL without a hostname (e.g. https:///x) has no real target; reject it
regardless of scheme. Folds main's hostless-HTTPS guard into the shared
predicate so every download/redirect call site benefits.
@PascalThuet

PascalThuet commented Jun 17, 2026

Copy link
Copy Markdown
Contributor Author

Rebased the branch on current origin/main and resolved the remaining conflicts. Copilot feedback is addressed by rejecting hostless URLs consistently, clarifying HTTPS/loopback guidance, preserving strict redirect handling for authenticated and strict unauthenticated downloads, and keeping bounded-read tests aligned with real stream behavior.

Pushed commit 1f1b20c9e7544b67d7be067c45c7c63b3c75de45.

Local verification:

  • uvx ruff check ... passed.
  • uv run pytest tests/test_download_security.py tests/test_github_http.py tests/test_authentication.py tests/test_upgrade.py tests/test_workflows.py tests/test_extensions.py tests/test_presets.py tests/integrations/test_integration_catalog.py tests/test_security_workflow.py tests/test_baseline_gates.py -q passed: 1217 passed, 19 warnings from temporary auth.json fixture permissions.

GitHub reports the PR as mergeable. No GitHub checks are currently reported for this branch.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 43/43 changed files
  • Comments generated: 2

Comment thread src/specify_cli/authentication/http.py Outdated
Comment on lines +157 to +162
Redirect scheme safety: every authenticated attempt goes through
``_StripAuthOnRedirect``, which always rejects redirects to non-HTTPS
URLs (except HTTP to localhost / 127.0.0.1 / ::1, the hosts allowed by
``is_https_or_localhost_http``). *strict_redirects* extends that same
scheme guard and the optional redirect validator to the unauthenticated
fallback; without it, the fallback follows redirects without that handler.
Comment on lines +117 to +128
from specify_cli.authentication.http import (
_StripAuthOnRedirect,
_validate_strict_redirect,
)

# A 307/308 redirect preserves the POST body, which carries the
# client_secret. Reuse the package HTTPS-downgrade guard (empty host
# list means no auth header to strip, just the scheme check) so the
# secret can never be forwarded to a non-HTTPS, non-loopback host.
opener = urllib.request.build_opener(
_StripAuthOnRedirect((), _validate_strict_redirect)
)
@mnriem

mnriem commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

Please address Copilot feedback and test & lint errors

@PascalThuet

Copy link
Copy Markdown
Contributor Author

Updated on behalf of @PascalThuet by Codex (GPT-5).

Commit: e3f0153

What changed:

  • Clarified open_url() redirect-handler documentation so it matches the strict_redirects or redirect_validator is not None implementation.
  • Removed the redundant Azure DevOps redirect validator; _StripAuthOnRedirect already applies the strict scheme guard.
  • Removed lingering shell= arguments from _utils.run_command() after the explicit shell=True rejection, clearing the new Bandit B602 findings without source suppressions.
  • Preserved the real PATH in baseline-gate subprocess tests so Windows can find git.
  • Regenerated .github/security-audit-requirements.txt with the CI compile command and updated .secrets.baseline for the false positives surfaced by the secret scan.

Local validation:

  • uvx ruff check src/specify_cli/_utils.py src/specify_cli/authentication/http.py src/specify_cli/authentication/azure_devops.py tests/test_baseline_gates.py passed.
  • git diff --check passed.
  • uvx --from bandit==1.9.4 bandit -r src -lll --baseline .github/bandit-baseline.json passed.
  • git ls-files -z -- ':!:.secrets.baseline' ':!:uv.lock' ':!:.github/security-audit-requirements.txt' | xargs -0 uvx --from detect-secrets==1.5.0 detect-secrets-hook --baseline .secrets.baseline passed.
  • GENERATED_REQUIREMENTS=/tmp/spec-kit-2442-security-audit-requirements.txt DEPENDENCY_DIFF_BASE=cedbf484d7a94b5566b24169c642ae48a3c81394 DEPENDENCY_DIFF_HEAD=HEAD python .github/scripts/check_security_requirements.py passed.
  • uvx --from pip-audit==2.10.0 pip-audit --disable-pip --require-hashes -r .github/security-audit-requirements.txt --progress-spinner off passed.
  • uv run pytest tests/test_authentication.py tests/test_baseline_gates.py tests/test_security_workflow.py -q passed: 124 passed, 19 expected auth fixture permission warnings.

GitHub created new pull_request runs for e3f0153, but they are currently action_required, so they need maintainer approval before they will execute.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot's findings

  • Files reviewed: 43/43 changed files
  • Comments generated: 7

Comment on lines +1027 to +1031
from specify_cli._download_security import read_response_limited as _read_response_limited
from specify_cli.authentication.http import open_url as _open_url

with _open_url(from_url, timeout=60) as response:
zip_data = response.read()
zip_data = _read_response_limited(
Comment on lines 2587 to +2591
try:
with self._open_url(
download_url, timeout=60, extra_headers=extra_headers
) as response:
zip_data = response.read()
zip_data = read_response_limited(
Comment thread src/specify_cli/presets/__init__.py Outdated
Comment on lines 2517 to 2518
try:
with self._open_url(download_url, timeout=60, extra_headers=extra_headers) as response:
Comment thread src/specify_cli/extensions.py Outdated
Comment on lines 2222 to 2223
try:
with self._open_url(entry.url, timeout=10) as response:
Comment on lines 2404 to +2409
try:
import urllib.error

with self._open_url(catalog_url, timeout=10) as response:
catalog_data = json.loads(response.read())
catalog_data = json.loads(
read_response_limited(
Comment thread src/specify_cli/presets/__init__.py Outdated
Comment on lines 2163 to 2164
try:
with self._open_url(entry.url, timeout=10) as response:
Comment thread src/specify_cli/presets/__init__.py Outdated
Comment on lines 2321 to 2322
try:
with self._open_url(catalog_url, timeout=10) as response:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Security hardening] Add automated security audit checks for Python dependencies and static analysis

4 participants