ci: cap Vercel deploy jobs at 30min (prevent concurrency lockups) #710

Merged
blove merged 1 commit into main from blove/deploy-job-timeouts
Jun 20, 2026

@blove blove commented Jun 19, 2026

Contributor

Summary

Add timeout-minutes: 30 to the four Vercel deploy jobs in ci.yml (Deploy → Vercel, Canonical demo → Vercel, ag-ui demo → Vercel, Cockpit — deploy smoke dry-run), which previously had no timeout (GitHub's 6h default).

Why

The main CI concurrency group is cancel-in-progress: false (intentional — don't kill a deploy mid-flight), so a single hung job blocks all later runs and deploys in the group. Today a Deploy → Vercel job hung for ~4 hours and froze main deploys. A 30-min cap (legitimate deploys take ~12 min) turns a hang into a fast failure instead of a multi-hour lockup.

Not touching the concurrency semantics or strengthening anything else — just bounding the failure mode.
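
For illustration, a minimal sketch of the shape of the change, assuming a workflow-level concurrency group. The group name, job id, trigger, and step contents are hypothetical and not copied from ci.yml; only the `cancel-in-progress: false` setting, the job display name, and the `timeout-minutes: 30` value come from this PR.

```yaml
# Illustrative fragment only: the group name, job id, trigger, and steps
# are assumptions, not the repository's actual ci.yml.
name: CI

on:
  push:
    branches: [main]

concurrency:
  group: ci-main               # hypothetical group name
  cancel-in-progress: false    # don't kill a deploy mid-flight

jobs:
  deploy-vercel:               # hypothetical job id
    name: Deploy → Vercel
    runs-on: ubuntu-latest
    # Fail fast instead of blocking the main concurrency group on a hang.
    # Without this key the job falls back to GitHub's 6h default.
    timeout-minutes: 30
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: echo "deploy command goes here"   # placeholder step
```

The same one-line key would be repeated on each of the other three deploy jobs.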

Test Plan

  • CI green; deploys still complete within the cap

🤖 Generated with Claude Code

The main CI concurrency group is cancel-in-progress:false (don't kill a
deploy mid-flight), so a hung job blocks all later runs/deploys. The four
Vercel deploy jobs had no timeout (GitHub's 6h default) — today a hung
'Deploy → Vercel' froze deploys for ~4h. A 30min cap (legit deploys take
~12min) turns a hang into a fast failure instead of a multi-hour lockup.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@blove blove enabled auto-merge (squash) June 19, 2026 23:58

vercel Bot commented Jun 19, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| threadplane | Ready | Preview, Comment | Jun 20, 2026 12:03am |

@github-actions github-actions Bot left a comment

Contributor

Automated approval: this PR received an intelligent (AI) code review. See the review comments on this PR.

github-actions Bot commented Jun 20, 2026

Contributor

Claude finished @blove's task in 33s


Review

  • Read the diff and PR description
  • Analyze the change for correctness, security, and appropriateness

The change is correct and well-targeted. Four lines added, each placing timeout-minutes: 30 at the right level (job-level, not step-level), covering exactly the jobs that sit inside the cancel-in-progress: false concurrency group and can block the queue if they hang.

What I checked:

  • Placement: timeout-minutes at the job level is correct. A step-level timeout bounds only the step it is set on, so a hang in any other step would still run until GitHub's 6h default; job-level is the right hammer here.
  • Value: 30 min with ~12 min typical runtime gives ≈2.5× headroom. That's a sensible ratio: tight enough to kill a genuine hang quickly, loose enough to survive a slow Vercel cold-start or a transient network delay without false positives.
  • Coverage: All four deploy-path jobs are included: the primary deploy, both demo deploys, and cockpit-deploy-smoke. Nothing is left uncapped in the problematic concurrency group.
  • Concurrency semantics: The PR correctly leaves cancel-in-progress: false alone. Adding a timeout is purely additive; it doesn't change what happens to in-flight deploys, only puts a ceiling on how long a stuck one can hold the queue.
  • Security: No workflow script injection surface, no secrets exposure, no user-controlled inputs. Pure timeout config.
  • Comments: The inline comments ("fail fast instead of blocking the main concurrency group on a hang") are accurate and add useful context without being noisy.
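
To make the placement point concrete, here is a rough sketch contrasting the two levels at which `timeout-minutes` can be set. The job id, step names, and the 20-minute step value are hypothetical and chosen only for contrast; this PR sets the job-level key alone.

```yaml
# Illustrative only: job id, steps, and the step-level value are assumptions.
jobs:
  example-deploy:              # hypothetical job id
    runs-on: ubuntu-latest
    timeout-minutes: 30        # job-level: bounds the whole job, every step included
    steps:
      - name: Build
        run: echo "build"      # placeholder; no cap of its own, covered by the job cap
      - name: Deploy
        timeout-minutes: 20    # step-level: bounds only this one step
        run: echo "deploy"     # placeholder
```

With only the step-level key, a hang in the Build step would still be limited by nothing shorter than the job's own timeout, which is why the job-level cap is the one that protects the queue.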

No issues found. This is a straightforward, well-scoped fix for a real incident. Good to merge.

@blove blove merged commit 119c49d into main Jun 20, 2026
24 checks passed