feat(webapp): tag Prisma spans with db.datasource attribute #3422

Merged
ericallam merged 1 commit into main from feat/prisma-span-datasource-attribute
Apr 21, 2026

@ericallam
Member

@ericallam ericallam commented Apr 21, 2026

Summary

Stamp every Prisma span with db.datasource: "writer" | "replica" so traces can distinguish which client the query went through.

Both PrismaClient instances share the same global @prisma/instrumentation, so their spans come out with identical names and attributes today. The new attribute makes them trivially filterable.

How

Two pieces in apps/webapp/app/:

  1. v3/tracer.server.ts: a DatasourceAttributeSpanProcessor reads an OTel context key in onStart and calls span.setAttribute("db.datasource", value). It is registered as the first span processor.
  2. db.server.ts: tagDatasource(datasource, client) wraps each PrismaClient with $extends({ query: { $allOperations } }). The middleware sets the context key around the query and also tags the active span directly (to catch prisma:client:operation, which Prisma creates before the middleware fires).
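
The interplay between the two pieces can be sketched without any dependencies. This is a simplified, synchronous model and not the code from this PR: `withContext`, `startSpan`, `runTagged` and `DATASOURCE_KEY` are hypothetical stand-ins for `context.with`, the tracer, the `$allOperations` middleware and the OTel context key.

```typescript
// Dependency-free model of the two pieces; every name here is illustrative.
type Datasource = "writer" | "replica";
type Context = ReadonlyMap<symbol, unknown>;
type Span = { name: string; attributes: Record<string, string> };

const DATASOURCE_KEY = Symbol("db.datasource"); // stands in for an OTel context key
let activeContext: Context = new Map();

// Stand-in for context.with(): swap the active context for the duration of fn.
function withContext<T>(ctx: Context, fn: () => T): T {
  const previous = activeContext;
  activeContext = ctx;
  try {
    return fn();
  } finally {
    activeContext = previous;
  }
}

// Stand-in for the span processor: copy the context value onto the new span.
function onStart(span: Span, parentContext: Context): void {
  const ds = parentContext.get(DATASOURCE_KEY);
  if (typeof ds === "string") {
    span.attributes["db.datasource"] = ds;
  }
}

// Stand-in for a tracer: every span creation passes through onStart.
function startSpan(name: string): Span {
  const span: Span = { name, attributes: {} };
  onStart(span, activeContext);
  return span;
}

// Stand-in for the middleware: run the operation with the key set.
function runTagged<T>(datasource: Datasource, operation: () => T): T {
  const ctx = new Map(activeContext).set(DATASOURCE_KEY, datasource);
  return withContext(ctx, operation);
}

const tagged = runTagged("replica", () => startSpan("prisma:engine:connection"));
const untagged = startSpan("http.request");
```

In this model only spans started inside `runTagged` pick up the attribute, which mirrors why unrelated spans elsewhere in the webapp stay untouched.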

Context-propagation gotcha

PrismaPromise is lazy: query(args) returns a thenable that only starts when something calls .then() on it. The naive context.with(ctx, () => query(args)) restores the AsyncLocalStorage (ALS) store synchronously, so when Prisma's internal code awaits the thenable later, the engine spans fire with the original ALS store. Wrapping as async () => await query(args) forces the .then() inside the context.with callback, so ALS stays on our context for the engine spans.
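
A small reproduction of the gotcha, assuming Node's `AsyncLocalStorage` as a stand-in for the OTel context manager. `LazyQuery` is a hypothetical thenable that imitates a PrismaPromise by doing nothing until `.then()` is called. It is not Prisma's implementation.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// AsyncLocalStorage stands in for the OTel context manager.
const als = new AsyncLocalStorage<string>();

// Records which store was active each time a lazy query actually started.
const startedIn: string[] = [];

// Hypothetical stand-in for a PrismaPromise: work begins only in then().
class LazyQuery implements PromiseLike<string> {
  then<R1 = string, R2 = never>(
    onFulfilled?: ((value: string) => R1 | PromiseLike<R1>) | null,
    onRejected?: ((reason: unknown) => R2 | PromiseLike<R2>) | null
  ): PromiseLike<R1 | R2> {
    startedIn.push(als.getStore() ?? "none");
    return Promise.resolve("row").then(onFulfilled, onRejected);
  }
}

// Naive: run() hands back the unstarted thenable and restores the store,
// so the await below calls then() under the caller's store.
async function naive(): Promise<string> {
  return await als.run("writer", () => new LazyQuery());
}

// Fixed: the async callback awaits inside run(), so then() is triggered
// from within the "writer" store.
async function fixed(): Promise<string> {
  return await als.run("writer", async () => await new LazyQuery());
}

async function demo(): Promise<string[]> {
  startedIn.length = 0;
  await naive();
  await fixed();
  return [...startedIn];
}
```

Calling `demo()` from code that has no store of its own should show the naive wrapper starting the query outside the store and the async wrapper starting it inside.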

Coverage

  • Tagged: all prisma:engine:* (connection, db_query, serialize, query, etc.), prisma:client:operation, prisma:client:serialize, prisma:client:connect
  • Not tagged: prisma:client:load_engine — one-time startup, fires before any query

Concurrent Promise.all([writer.x, replica.y]) correctly tags each pool separately (ALS isolates per-Promise chain).
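
The isolation claim can be checked with a short sketch, again using Node's `AsyncLocalStorage` in place of the OTel context manager. `fakeQuery` and `onDatasource` are hypothetical helpers, not functions from this PR.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

type Datasource = "writer" | "replica";
const als = new AsyncLocalStorage<Datasource>();

// Hypothetical stand-in for a Prisma operation: yields to the event loop,
// then reports which datasource is active when an engine span would start.
async function fakeQuery(): Promise<string> {
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  return als.getStore() ?? "none";
}

// Each call gets its own store for the whole async chain it starts.
function onDatasource(ds: Datasource): Promise<string> {
  return als.run(ds, async () => await fakeQuery());
}

// Both operations are in flight at the same time.
async function runConcurrently(): Promise<string[]> {
  return Promise.all([onDatasource("writer"), onDatasource("replica")]);
}
```

Each chain resumes after the timer with the store it was started under, so the interleaving does not leak one datasource into the other.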

Performance

Each Prisma operation adds one context.with (~200ns) plus one setAttribute per span (effectively free per the OTel JS benchmarks). This is negligible against a query path measured in milliseconds.

Test plan

  • Verify db.datasource appears on prisma:engine:connection spans after the webapp is restarted
  • Spot-check a handful of real traces carry the attribute

Wrap the writer and replica Prisma clients with a $extends middleware
that sets an OTel context key around each operation, and a span
processor that reads the key and stamps 'db.datasource' = 'writer' |
'replica' on every span created in that scope.

Also directly tags the prisma:client:operation span via
trace.getActiveSpan() since that outer span is created before the
extension middleware runs and would otherwise miss the context.

Motivation: writer and replica emit identical span names through the
same global instrumentation, so pool-saturation monitors on
prisma:engine:connection could not distinguish the two pools. With this
change, monitors can filter by the new attribute.

Context propagation note: PrismaPromise is lazy, so wrapping query(args)
directly with context.with leaves the thenable unstarted and Prisma's
.then() fires outside the scope. The inner 'async () => await query(args)'
forces the .then() inside the context.with callback so engine spans
see the correct active context.

Not tagged: prisma:client:load_engine (one-time startup, irrelevant).
@changeset-bot

changeset-bot Bot commented Apr 21, 2026

⚠️ No Changeset found

Latest commit: 7bd7398

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

@coderabbitai
Contributor

coderabbitai Bot commented Apr 21, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: cd545daa-95dc-49be-833a-1adeaa1b7c88

📥 Commits

Reviewing files that changed from the base of the PR and between b570586 and 7bd7398.

⛔ Files ignored due to path filters (1)
  • references/hello-world/src/trigger/example.ts is excluded by !references/**
📒 Files selected for processing (3)
  • .server-changes/prisma-span-datasource-attribute.md
  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (27)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: sdk-compat / Bun Runtime
  • GitHub Check: sdk-compat / Node.js 20.20 (ubuntu-latest)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: sdk-compat / Cloudflare Workers
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: sdk-compat / Deno Runtime
  • GitHub Check: sdk-compat / Node.js 22.12 (ubuntu-latest)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: typecheck / typecheck
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
{packages/core,apps/webapp}/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use zod for validation in packages/core and apps/webapp

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

Use function declarations instead of default exports

Add crumbs as you write code using `// @crumbs` comments or `// #region @crumbs` blocks. These are temporary debug instrumentation and must be stripped using agentcrumbs strip before merge.

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
**/*.ts

📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)

**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
**/*.{js,ts,jsx,tsx,json,md,yaml,yml}

📄 CodeRabbit inference engine (AGENTS.md)

Format code using Prettier before committing

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
**/*.ts{,x}

📄 CodeRabbit inference engine (CLAUDE.md)

Always import from @trigger.dev/sdk when writing Trigger.dev tasks. Never use @trigger.dev/sdk/v3 or deprecated client.defineJob.

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
apps/webapp/**/*.{ts,tsx}

📄 CodeRabbit inference engine (.cursor/rules/webapp.mdc)

apps/webapp/**/*.{ts,tsx}: Access environment variables through the env export of env.server.ts instead of directly accessing process.env
Use subpath exports from @trigger.dev/core package instead of importing from the root @trigger.dev/core path

Use named constants for sentinel/placeholder values (e.g. const UNSET_VALUE = '__unset__') instead of raw string literals scattered across comparisons

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
apps/webapp/**/*.server.ts

📄 CodeRabbit inference engine (apps/webapp/CLAUDE.md)

apps/webapp/**/*.server.ts: Never use request.signal for detecting client disconnects. Use getRequestAbortSignal() from app/services/httpAsyncStorage.server.ts instead, which is wired directly to Express res.on('close') and fires reliably
Access environment variables via env export from app/env.server.ts. Never use process.env directly
Always use findFirst instead of findUnique in Prisma queries. findUnique has an implicit DataLoader that batches concurrent calls and has active bugs even in Prisma 6.x (uppercase UUIDs returning null, composite key SQL correctness issues, 5-10x worse performance). findFirst is never batched and avoids this entire class of issues

Files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/database/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:17.177Z
Learning: Applies to internal-packages/database/**/{app,src,webapp}/**/*.{ts,tsx,js,jsx} : Use `$replica` from `~/db.server` for read-heavy queries in the webapp instead of the primary database connection
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : Use Prisma for database interactions in internal-packages/database with PostgreSQL
📚 Learning: 2025-11-27T16:26:37.432Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .github/copilot-instructions.md:0-0
Timestamp: 2025-11-27T16:26:37.432Z
Learning: Applies to internal-packages/database/**/*.{ts,tsx} : Use Prisma for database interactions in internal-packages/database with PostgreSQL

Applied to files:

  • .server-changes/prisma-span-datasource-attribute.md
  • apps/webapp/app/db.server.ts
📚 Learning: 2026-03-02T12:43:17.177Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/database/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:17.177Z
Learning: Applies to internal-packages/database/**/{app,src,webapp}/**/*.{ts,tsx,js,jsx} : Use `$replica` from `~/db.server` for read-heavy queries in the webapp instead of the primary database connection

Applied to files:

  • .server-changes/prisma-span-datasource-attribute.md
  • apps/webapp/app/db.server.ts
📚 Learning: 2026-03-24T10:42:43.111Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3255
File: apps/webapp/app/routes/api.v1.runs.$runId.spans.$spanId.ts:100-100
Timestamp: 2026-03-24T10:42:43.111Z
Learning: In `apps/webapp/app/routes/api.v1.runs.$runId.spans.$spanId.ts` (and related span-handling code in trigger.dev), `span.entity` is a required (non-optional) field on the `SpanDetail` type and is always present. Do not flag `span.entity.type` as a potential null pointer / suggest optional chaining (`span.entity?.type`) in this context.

Applied to files:

  • .server-changes/prisma-span-datasource-attribute.md
📚 Learning: 2026-04-13T21:44:00.032Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/app/services/taskIdentifierRegistry.server.ts:24-67
Timestamp: 2026-04-13T21:44:00.032Z
Learning: In `apps/webapp/app/services/taskIdentifierRegistry.server.ts`, the sequential upsert/updateMany/findMany writes in `syncTaskIdentifiers` are intentionally NOT wrapped in a Prisma transaction. This function runs only during deployment-change events (low-concurrency path), and any partial `isInLatestDeployment` state is acceptable because it self-corrects on the next deployment. Do not flag this as a missing-transaction/atomicity issue in future reviews.

Applied to files:

  • apps/webapp/app/db.server.ts
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Use Prisma for data persistence with support for read-only replica queries via `readOnlyPrisma`

Applied to files:

  • apps/webapp/app/db.server.ts
📚 Learning: 2026-04-15T15:39:06.868Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-15T15:39:06.868Z
Learning: Use Prisma 6.14.0 from `internal-packages/database` for database operations. This is the pinned version for the monorepo.

Applied to files:

  • apps/webapp/app/db.server.ts
📚 Learning: 2026-04-16T13:45:18.782Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3368
File: apps/webapp/test/engine/taskIdentifierRegistry.test.ts:3-19
Timestamp: 2026-04-16T13:45:18.782Z
Learning: In `apps/webapp/test/engine/taskIdentifierRegistry.test.ts`, the `vi.mock` calls for `~/services/taskIdentifierCache.server` (stubbing `getTaskIdentifiersFromCache` and `populateTaskIdentifierCache`), `~/models/task.server` (stubbing `getAllTaskIdentifiers`), and `~/db.server` (stubbing `prisma` and `$replica`) are intentional. The suite uses real Postgres via testcontainers for all `TaskIdentifier` DB operations, but isolates the Redis cache layer and legacy query fallback as separate concerns not exercised in this test file. Do not flag these mocks as violations of the no-mocks policy in future reviews.

Applied to files:

  • apps/webapp/app/db.server.ts
📚 Learning: 2026-03-22T13:26:12.060Z
Learnt from: ericallam
Repo: triggerdotdev/trigger.dev PR: 3244
File: apps/webapp/app/components/code/TextEditor.tsx:81-86
Timestamp: 2026-03-22T13:26:12.060Z
Learning: In the triggerdotdev/trigger.dev codebase, do not flag `navigator.clipboard.writeText(...)` calls for `missing-await`/`unhandled-promise` issues. These clipboard writes are intentionally invoked without `await` and without `catch` handlers across the project; keep that behavior consistent when reviewing TypeScript/TSX files (e.g., usages like in `apps/webapp/app/components/code/TextEditor.tsx`).

Applied to files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
📚 Learning: 2026-03-22T19:24:14.403Z
Learnt from: matt-aitken
Repo: triggerdotdev/trigger.dev PR: 3187
File: apps/webapp/app/v3/services/alerts/deliverErrorGroupAlert.server.ts:200-204
Timestamp: 2026-03-22T19:24:14.403Z
Learning: In the triggerdotdev/trigger.dev codebase, webhook URLs are not expected to contain embedded credentials/secrets (e.g., fields like `ProjectAlertWebhookProperties` should only hold credential-free webhook endpoints). During code review, if you see logging or inclusion of raw webhook URLs in error messages, do not automatically treat it as a credential-leak/secrets-in-logs issue by default—first verify the URL does not contain embedded credentials (for example, no username/password in the URL, no obvious secret/token query params or fragments). If the URL is credential-free per this project’s conventions, allow the logging.

Applied to files:

  • apps/webapp/app/db.server.ts
  • apps/webapp/app/v3/tracer.server.ts
📚 Learning: 2026-03-02T12:43:25.254Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: internal-packages/run-engine/CLAUDE.md:0-0
Timestamp: 2026-03-02T12:43:25.254Z
Learning: Applies to internal-packages/run-engine/src/engine/systems/**/*.ts : Integrate OpenTelemetry tracer and meter instrumentation in RunEngine systems for observability

Applied to files:

  • apps/webapp/app/v3/tracer.server.ts
📚 Learning: 2026-03-29T19:16:28.864Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 3291
File: apps/webapp/app/v3/featureFlags.ts:53-65
Timestamp: 2026-03-29T19:16:28.864Z
Learning: When reviewing TypeScript code that uses Zod v3, treat `z.coerce.*()` schemas as their direct Zod type (e.g., `z.coerce.boolean()` returns a `ZodBoolean` with `_def.typeName === "ZodBoolean"`) rather than a `ZodEffects`. Only `.preprocess()`, `.refine()`/`.superRefine()`, and `.transform()` are expected to wrap schemas in `ZodEffects`. Therefore, in reviewers’ logic like `getFlagControlType`, do not flag/unblock failures that require unwrapping `ZodEffects` when the input schema is a `z.coerce.*` schema.

Applied to files:

  • apps/webapp/app/v3/tracer.server.ts
🔇 Additional comments (5)
.server-changes/prisma-span-datasource-attribute.md (1)

1-6: LGTM — scope matches implementation.

The enumerated span coverage (prisma:engine:*, prisma:engine:connection, outer prisma:client:operation) lines up with how DatasourceAttributeSpanProcessor + tagDatasource actually tag spans. Good to have this contract written down for future monitor/dashboard authors.

apps/webapp/app/v3/tracer.server.ts (2)

65-81: Span processor is correctly ordered and minimal.

onStart reads the key from parentContext (not the span's own context), which is the right choice: the processor runs synchronously at span creation time, and any descendant created while tagDatasource's context.with is active will inherit the key via its parent context. The typeof ds === "string" guard keeps it safe against unrelated context pollution. No concerns.


227-227: Good call registering as the first SpanProcessor.

Placing DatasourceAttributeSpanProcessor before BatchSpanProcessor/SimpleSpanProcessor guarantees the attribute is set before any exporter processes onEnd (processors run in registration order). Attribute propagation to exported spans is deterministic.
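
A toy model of the ordering and guard behaviour described in the two comments above. The `Processor` type, `datasourceTagger` and `recorder` are hypothetical and only imitate the shape of OTel span processors. The real `onStart` receives a parent context rather than a raw value.

```typescript
type Span = { name: string; attributes: Record<string, string> };
type Processor = { onStart(span: Span, datasource: unknown): void };

// Tags the span, ignoring anything that is not a string.
const datasourceTagger: Processor = {
  onStart(span, datasource) {
    if (typeof datasource === "string") {
      span.attributes["db.datasource"] = datasource;
    }
  },
};

// Stand-in for a downstream processor: snapshots what it sees at start.
const snapshots: Array<Record<string, string>> = [];
const recorder: Processor = {
  onStart(span) {
    snapshots.push({ ...span.attributes });
  },
};

function startSpan(name: string, datasource: unknown, processors: Processor[]): Span {
  const span: Span = { name, attributes: {} };
  // Processors are invoked in registration order.
  for (const processor of processors) processor.onStart(span, datasource);
  return span;
}

// Tagger registered first: the recorder already sees the attribute.
startSpan("prisma:engine:db_query", "writer", [datasourceTagger, recorder]);
// Tagger registered last: the recorder's snapshot is taken too early.
startSpan("prisma:engine:db_query", "writer", [recorder, datasourceTagger]);
// Non-string value: the guard leaves the span untagged.
startSpan("prisma:engine:db_query", 42, [datasourceTagger, recorder]);
```

Only the first of the three snapshots carries `db.datasource`.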

apps/webapp/app/db.server.ts (2)

119-124: Replica fallback tags correctly.

When DATABASE_READ_REPLICA_URL is unset, $replica resolves to the writer-wrapped prisma, so $replica.* calls accurately emit db.datasource="writer" — that's the honest answer for monitoring (the query really did hit the writer). Nice.
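
The fallback can be pictured with a minimal sketch. `tag` and `createClients` are hypothetical stand-ins for tagDatasource and the client setup in db.server.ts, and the URLs are placeholders.

```typescript
type Datasource = "writer" | "replica";
type TaggedClient = { datasource: Datasource; url: string };

// Hypothetical stand-in for tagDatasource(datasource, client).
function tag(datasource: Datasource, url: string): TaggedClient {
  return { datasource, url };
}

function createClients(writerUrl: string, replicaUrl?: string) {
  const prisma = tag("writer", writerUrl);
  // No replica configured: reuse the writer-tagged client rather than
  // tagging the same connection a second time as "replica".
  const $replica = replicaUrl ? tag("replica", replicaUrl) : prisma;
  return { prisma, $replica };
}

const withoutReplica = createClients("postgres://writer");
const withReplica = createClients("postgres://writer", "postgres://replica");
```

With no replica URL, `$replica` is the very same object as `prisma`, so its spans report `writer`.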


101-117: Dual-tagging approach is sound; transaction span gap warrants clarification in migration doc.

The trace.getActiveSpan()?.setAttribute(...) + context.with(... DATASOURCE_CONTEXT_KEY ...) combo correctly handles Prisma's ordering quirk (outer prisma:client:operation is created before the middleware runs, so only the active-span path tags it; child engine/serialize/connect spans then get picked up by DatasourceAttributeSpanProcessor). Wrapping as async () => await query(args) rather than returning the PrismaPromise directly is also necessary to keep ALS active when the lazy thenable starts.

The .server-changes/prisma-span-datasource-attribute.md migration doc currently lists only prisma:engine:*, prisma:engine:connection, and prisma:client:operation as tagged spans. The prisma:client:transaction span generated by prisma.$transaction(...) is not routed through $allOperations, so neither the setAttribute line nor the context.with runs for it—meaning the outer transaction span won't carry db.datasource (its inner prisma:client:operation children still will). Update the migration doc to explicitly note this exclusion and confirm whether production monitors filtering on transaction spans require extending tagging to the $transaction path or whether child-span coverage is sufficient.


Walkthrough

The changes implement OpenTelemetry instrumentation to automatically tag Prisma database operation spans with a db.datasource attribute indicating whether the operation targets a "writer" or "replica" datasource. The implementation introduces a DatasourceAttributeSpanProcessor that reads the datasource from context and applies it to spans, a tagDatasource wrapper function that manages the context during Prisma operations, and updates the client initialization to apply the wrapper to both writer and replica clients. Documentation is added describing the feature's scope and behavior.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Description check | ❓ Inconclusive | The description provides comprehensive technical detail about the implementation, but it is missing key template sections including the Closes issue reference, testing steps checklist, and changelog. | Complete the PR description template by adding Closes #issue, marking checklist items, documenting testing steps, and including a changelog summary section. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately and concisely describes the main change: adding db.datasource attribute tagging to Prisma spans to distinguish between writer and replica clients. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@ericallam ericallam marked this pull request as ready for review April 21, 2026 15:48
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

@ericallam ericallam merged commit 7c95ee4 into main Apr 21, 2026
43 checks passed
@ericallam ericallam deleted the feat/prisma-span-datasource-attribute branch April 21, 2026 15:56
@github-actions github-actions Bot mentioned this pull request Apr 21, 2026
ericallam pushed a commit that referenced this pull request May 1, 2026
## Summary
8 new features, 18 improvements, 11 bug fixes.

## Breaking changes
- Add server-side deprecation gate for deploys from v3 CLI versions
(gated by `DEPRECATE_V3_CLI_DEPLOYS_ENABLED`). v4 CLI deploys are
unaffected.
([#3415](#3415))

## Improvements
- Add `--no-browser` flag to `init` and `login` to skip auto-opening the
browser during authentication. Also error loudly when `init` is run
without `--yes` under non-TTY stdin (previously default-and-exited
silently, leaving the project half-initialized). Both commands now show
an `Examples` section in `--help`.
([#3483](#3483))
- Add `isReplay` boolean to the run context (`ctx.run.isReplay`),
derived from the existing `replayedFromTaskRunFriendlyId` database
field. Defaults to `false` for backwards compatibility.
([#3454](#3454))
- Redact the `resolveWaitpoint` runtime log so it only emits `id` and
`type` instead of the full completed waitpoint. Previously the log
printed the entire waitpoint (including `output`) to stdout in
production runs, which could leak sensitive payloads. The value returned
by `wait.forToken()` is unchanged.
([#3490](#3490))
- Add `SessionId` friendly ID generator and schemas for the new durable
Session primitive. Exported from `@trigger.dev/core/v3/isomorphic`
alongside `RunId`, `BatchId`, etc. Ships the
`CreateSessionStreamWaitpoint` request/response schemas alongside the
main Session CRUD.
([#3417](#3417))
- Truncate large error stacks and messages to prevent OOM crashes. Stack
traces are capped at 50 frames (keeping top 5 + bottom 45 with an
omission notice), individual stack lines at 1024 chars, and error
messages at 1000 chars. Applied in parseError, sanitizeError, and OTel
span recording.
([#3405](#3405))
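
As an illustration of the stack-truncation limits listed above, here is a minimal sketch. The helper names, the wording of the omission notice, and the choice to apply limits to an array of frame lines are assumptions. This is not the code of parseError or sanitizeError.

```typescript
// Limits taken from the release note; everything else is illustrative.
const MAX_FRAMES = 50;
const KEEP_TOP = 5;
const KEEP_BOTTOM = 45;
const MAX_LINE_LENGTH = 1024;
const MAX_MESSAGE_LENGTH = 1000;

// Clip each frame line, then keep the top 5 and bottom 45 frames
// with a single notice in between when the stack is too deep.
function truncateFrames(frames: string[]): string[] {
  const clipped = frames.map((frame) => frame.slice(0, MAX_LINE_LENGTH));
  if (clipped.length <= MAX_FRAMES) return clipped;
  const omitted = clipped.length - KEEP_TOP - KEEP_BOTTOM;
  return [
    ...clipped.slice(0, KEEP_TOP),
    `    ... ${omitted} frames omitted ...`, // assumed notice format
    ...clipped.slice(-KEEP_BOTTOM),
  ];
}

function truncateMessage(message: string): string {
  return message.slice(0, MAX_MESSAGE_LENGTH);
}

// A 200-frame stack collapses to 5 + notice + 45 lines.
const deepStack = Array.from({ length: 200 }, (_, i) => `    at frame${i} (file.ts:${i}:1)`);
const truncatedStack = truncateFrames(deepStack);
```

Keeping both ends preserves the throw site and the entry point, which are usually the most useful parts of a very deep stack.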

## Server changes

These changes affect the self-hosted Docker image and Trigger.dev Cloud:

- Add a "Back office" tab to `/admin` and a per-organization detail page
at `/admin/back-office/orgs/:orgId`. The first action available on that
page is editing the org's API rate limit: admins can save a
`tokenBucket` override (refill rate, interval, max tokens) and see a
plain-English preview of the resulting sustained rate and burst
allowance. Writes are audit-logged via the server logger.
([#3434](#3434))
- Optional `DEPLOY_REGISTRY_ECR_DEFAULT_REPOSITORY_POLICY` env var to
apply a default repository policy when the webapp creates new ECR repos
([#3467](#3467))
- Ship the Errors page to all users, with a polish + bug-fix pass:
pinned "No channel" item in the Slack alert channel picker,
viewer-timezone alert timestamps via Slack's `<!date^>` token, Activity
sparkline peak tooltip, centered loading spinner and bug-icon empty
state on the error detail page, ellipsis on the Configure alerts
trigger.
([#3477](#3477))
- Configure the set of machine presets to build boot snapshots for at
deploy time via `COMPUTE_TEMPLATE_MACHINE_PRESETS` (CSV of preset names,
default `small-1x`). Use `COMPUTE_TEMPLATE_MACHINE_PRESETS_REQUIRED`
(CSV, default = full PRESETS list) to scope which preset failures fail a
required-mode deploy. Optional preset failures are logged and don't
block the deploy.
([#3492](#3492))
- Regenerating a RuntimeEnvironment API key no longer invalidates the
previous key immediately. The old key is recorded in a new
`RevokedApiKey` table with a 24 hour grace window, and
`findEnvironmentByApiKey` falls back to it when the submitted key
doesn't match any live environment. The grace window can be ended early
(or extended) by updating `expiresAt` on the row.
([#3420](#3420))
- Add the `Session` primitive — a durable, task-bound, bidirectional I/O
channel that outlives a single run and acts as the run manager for
`chat.agent`. Ships the Postgres `Session` + `SessionRun` tables,
ClickHouse `sessions_v1` + replication service, the `sessions` JWT
scope, and the public CRUD + realtime routes (`/api/v1/sessions`,
`/realtime/v1/sessions/:session/:io`) including `end-and-continue` for
server-orchestrated run handoffs and session-stream waitpoints.
([#3417](#3417))
- Add `KUBERNETES_POD_DNS_NDOTS_OVERRIDE_ENABLED` flag (off by default)
that overrides the cluster default and sets `dnsConfig.options.ndots` on
runner pods (defaulting to 2, configurable via
`KUBERNETES_POD_DNS_NDOTS`). Kubernetes defaults pods to `ndots: 5`, so
any name with fewer than 5 dots — including typical external domains
like `api.example.com` — is first walked through every entry in the
cluster search list (`<ns>.svc.cluster.local`, `svc.cluster.local`,
`cluster.local`) before being tried as-is, turning one resolution into
4+ CoreDNS queries (×2 with A+AAAA). Using a lower `ndots` value reduces
DNS query amplification in the `cluster.local` zone.
  
Note: before enabling, make sure no code path relies on search-list
expansion for names with dots ≥ the configured value — those names will
hit their as-is form first and could resolve externally before falling
back to the cluster search path.
([#3441](#3441))
- Vercel integration option to disable auto promotions
([#3376](#3376))
- Make it clear in the admin that feature flags are global and should
rarely be changed.
([#3408](#3408))
- Admin worker groups API: add GET loader and expose more fields on
POST. ([#3390](#3390))
- Add 60s fresh / 60s stale SWR cache to `getEntitlement` in
`platform.v3.server.ts`. Eliminates a synchronous billing-service HTTP
round trip on every trigger. Reuses the existing `platformCache` (LRU
memory + Redis) pattern already used for `limits` and `usage`. Cache key
is `${orgId}`. Errors return a permissive `{ hasAccess: true }` fallback
(existing behavior) and are also cached to prevent thundering-herd on
billing outages.
([#3388](#3388))
- Show a `MicroVM` badge next to the region name on the regions page.
([#3407](#3407))
- Increase default maximum project count per organization from 10 to 25
([#3409](#3409))
- Merge execution snapshot creation into the dequeue taskRun.update
transaction, reducing 2 DB commits to 1 per dequeue operation
([#3395](#3395))
- Add per-worker Node.js heap metrics to the OTel meter —
`nodejs.memory.heap.used`, `nodejs.memory.heap.total`,
`nodejs.memory.heap.limit`, `nodejs.memory.external`,
`nodejs.memory.array_buffers`, `nodejs.memory.rss`. Host-metrics only
publishes RSS, which overstates V8 heap by the external + native
footprint; these give direct heap visibility per cluster worker so
`NODE_MAX_OLD_SPACE_SIZE` can be sized against observed heap peaks
rather than RSS.
([#3437](#3437))
- Tag Prisma spans with `db.datasource: "writer" | "replica"` so
monitors and trace queries can distinguish the writer pool from the
replica pool. Applies to all `prisma:engine:*` spans (including
`prisma:engine:connection` used by the connection-pool monitors) and the
outer `prisma:client:operation` span.
([#3422](#3422))
- Clarify the cross-region intent in the Terraform and AI-prompt helpers
on the Add Private Connection page. Both already default
`supported_regions` to `["us-east-1", "eu-central-1"]`; added an inline
comment / parenthetical so the user understands why both regions are
listed (Trigger.dev runs in both, so the service must be consumable from
either).
([#3465](#3465))
- Add `RUN_ENGINE_READ_REPLICA_SNAPSHOTS_SINCE_ENABLED` flag (default
off) to route the Prisma reads inside `RunEngine.getSnapshotsSince`
through the read-only replica client. Offloads the snapshot polling
queries (fired by every running task runner) from the primary. When
disabled, behavior is unchanged.
([#3423](#3423))
- Stop creating `TaskRunTag` records and `_TaskRunToTaskRunTag` join table
entries during task triggering. The denormalized `runTags` string array on
`TaskRun` already stores tag names, so the many-to-many relation was
redundant write overhead.
([#3369](#3369))
- Stop writing per-tick state (`lastScheduledTimestamp`,
`nextScheduledTimestamp`, `lastRunTriggeredAt`) on `TaskSchedule` and
`TaskScheduleInstance`. The schedule engine now carries the previous
fire time forward via the worker queue payload, eliminating ~270K
dead-tuple-driven autovacuums per year on these hot tables and the
associated `IO:XactSync` mini-spikes on the writer. Customer-facing
`payload.lastTimestamp` semantics are unchanged.
([#3476](#3476))
- Replace the expensive DISTINCT query for task filter dropdowns with a
dedicated TaskIdentifier registry table backed by Redis. Environments
migrate automatically on their next deploy, with a transparent fallback
to the legacy query for unmigrated environments. Also fixes duplicate
dropdown entries when a task changes trigger source, and adds
active/archived grouping for removed tasks. Moves BackgroundWorkerTask
reads in the trigger hot path to the read replica.
([#3368](#3368))
- Public Access Tokens (PATs) minted before an API key rotation now keep
working during the 24h grace window. `validatePublicJwtKey` falls back
to any non-expired `RevokedApiKey` rows for the signing environment when
the primary signature check against the env's current `apiKey` fails.
The fallback query only runs on the failure path, so the hot success
path is unchanged.
([#3464](#3464))
- Batch items that hit the environment queue size limit now fast-fail
without retries and without creating pre-failed TaskRuns.
([#3352](#3352))
- Show the cancel button in the runs list for runs in `DEQUEUED` status.
`DEQUEUED` was missing from `NON_FINAL_RUN_STATUSES` so the list hid the
button even though the single run page allowed it.
([#3421](#3421))
- Reduce 5xx feedback loops on hot debounce keys by quantizing
`delayUntil`, adding an unlocked fast-path skip, and gracefully handling
redlock contention in `handleDebounce` so the SDK no longer retries into
a herd.
([#3453](#3453))
- Fix RSS memory leak in the realtime proxy routes. `/realtime/v1/runs`,
`/realtime/v1/runs/:id`, and `/realtime/v1/batches/:id` called `fetch()`
into Electric with no abort signal, so when a client disconnected mid
long-poll, undici kept the upstream socket open and buffered response
chunks that would never be consumed — retained only in RSS, invisible to
V8 heap tooling. Thread `getRequestAbortSignal()` through
`RealtimeClient.streamRun/streamRuns/streamBatch` to `longPollingFetch`
and cancel the upstream body in the error path. Isolated reproducer
showed ~44 KB retained per leaked request; signal propagation releases
it cleanly.
([#3442](#3442))
- Fix memory leak where every aborted SSE connection pinned the full
request/response graph on Node 20, caused by `AbortSignal.any()` in
`sse.ts` retaining its source signals indefinitely (see
nodejs/node#54614, nodejs/node#55351). Also clear the
`setTimeout(abort)` timer in `entry.server.tsx` so successful HTML
renders don't pin the React tree for 30s per request.
([#3430](#3430))
- Preserve filters on the queues page when submitting modal actions.
([#3471](#3471))
- Fix Redis connection leak in realtime streams and broken abort signal
propagation.

  **Redis connections**: Non-blocking methods (`ingestData`, `appendPart`,
  `getLastChunkIndex`) now share a single Redis connection instead of
  creating one per request. `streamResponse` still uses dedicated
  connections (required for `XREAD BLOCK`) but now tears them down
  immediately via `disconnect()` instead of graceful `quit()`, with a 15s
  inactivity fallback.

  **Abort signal**: `request.signal` is broken in Remix/Express due to a
  Node.js undici GC bug (nodejs/node#55428) that severs the signal chain
  when Remix clones the Request internally. Added `getRequestAbortSignal()`
  wired to Express `res.on("close")` via `httpAsyncStorage`, which fires
  reliably on client disconnect. All SSE/streaming routes updated to use
  it.
([#3399](#3399))
- Prevent dashboard crash (React error #31) when span accessory item
text is not a string. Filters out malformed accessory items in
SpanCodePathAccessory instead of passing objects to React as children.
([#3400](#3400))
- Upgrade Remix packages from 2.1.0 to 2.17.4 to address security
vulnerabilities in React Router
([#3372](#3372))
- Fix Vercel integration settings page (remove redundant section
toggles) and improve the Vercel onboarding flow so the modal closes
after connecting a GitHub repo and the marketplace `next` URL is
preserved across the GitHub app install redirect.
([#3424](#3424))
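
The Public Access Token entry above (#3464) describes a two-step check: verify against the environment's current key first, and only on failure consult the non-expired revoked keys. A minimal sketch of that control flow follows. It is not the actual `validatePublicJwtKey` code: `validateWithGraceWindow`, `RevokedKey`, `VerifyFn`, and the injected `loadRevokedKeys` callback are hypothetical names, the verifier is passed in rather than tied to a real JWT library, and the real implementation is presumably asynchronous.

```typescript
// Hypothetical shapes: the real `RevokedApiKey` model and JWT verifier
// are not shown in this changelog.
interface RevokedKey {
  apiKey: string;
  expiresAt: Date; // end of the grace window for this key
}

// Returns true when `token` was signed with `key`.
type VerifyFn = (token: string, key: string) => boolean;

interface ValidationResult {
  ok: boolean;
  usedRevokedKey: boolean;
}

function validateWithGraceWindow(
  token: string,
  currentKey: string,
  loadRevokedKeys: () => RevokedKey[],
  verify: VerifyFn,
  now: Date = new Date()
): ValidationResult {
  // Hot success path: the current key verifies, so revoked keys are never loaded.
  if (verify(token, currentKey)) {
    return { ok: true, usedRevokedKey: false };
  }

  // Failure path only: try revoked keys that are still inside their grace window.
  const nowMs = now.getTime();
  const matched = loadRevokedKeys()
    .filter((k) => k.expiresAt.getTime() > nowMs)
    .some((k) => verify(token, k.apiKey));

  return { ok: matched, usedRevokedKey: matched };
}
```

Passing the revoked-key lookup as a callback keeps the extra query off the success path, which is the property the changelog entry calls out.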

<details>
<summary>Raw changeset output</summary>

# Releases
## @trigger.dev/build@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## trigger.dev@4.4.5

### Patch Changes

- Add `--no-browser` flag to `init` and `login` to skip auto-opening the
browser during authentication. Also error loudly when `init` is run
without `--yes` under non-TTY stdin (previously default-and-exited
silently, leaving the project half-initialized). Both commands now show
an `Examples` section in `--help`.
([#3483](#3483))
-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`
    -   `@trigger.dev/build@4.4.5`
    -   `@trigger.dev/schema-to-json@4.4.5`

## @trigger.dev/core@4.4.5

### Patch Changes

- Add `isReplay` boolean to the run context (`ctx.run.isReplay`),
derived from the existing `replayedFromTaskRunFriendlyId` database
field. Defaults to `false` for backwards compatibility.
([#3454](#3454))
- Redact the `resolveWaitpoint` runtime log so it only emits `id` and
`type` instead of the full completed waitpoint. Previously the log
printed the entire waitpoint (including `output`) to stdout in
production runs, which could leak sensitive payloads. The value returned
by `wait.forToken()` is unchanged.
([#3490](#3490))
- Add `SessionId` friendly ID generator and schemas for the new durable
Session primitive. Exported from `@trigger.dev/core/v3/isomorphic`
alongside `RunId`, `BatchId`, etc. Ships the
`CreateSessionStreamWaitpoint` request/response schemas alongside the
main Session CRUD.
([#3417](#3417))
- Truncate large error stacks and messages to prevent OOM crashes. Stack
traces are capped at 50 frames (keeping top 5 + bottom 45 with an
omission notice), individual stack lines at 1024 chars, and error
messages at 1000 chars. Applied in parseError, sanitizeError, and OTel
span recording.
([#3405](#3405))
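
The truncation limits in the last entry (#3405) can be illustrated with a small sketch. This is an assumption-laden illustration, not the code in `parseError` or `sanitizeError`: the function names, the way frames are detected (lines beginning with `at `), and the wording of the omission notice are all hypothetical. Only the numbers (50 frames as top 5 plus bottom 45, 1024 characters per line, 1000 characters per message) come from the changelog.

```typescript
// Limits taken from the changelog entry; everything else is illustrative.
const MAX_FRAMES = 50;
const KEEP_TOP = 5;
const KEEP_BOTTOM = 45;
const MAX_LINE_LENGTH = 1024;
const MAX_MESSAGE_LENGTH = 1000;

// Cap an error message at MAX_MESSAGE_LENGTH characters.
function truncateMessage(message: string): string {
  return message.slice(0, MAX_MESSAGE_LENGTH);
}

// Cap every line, then keep the first KEEP_TOP and last KEEP_BOTTOM frames.
function truncateStack(stack: string): string {
  const lines = stack.split("\n").map((line) => line.slice(0, MAX_LINE_LENGTH));

  // Assumption: frames are the lines from the first "at ..." line onwards;
  // anything before that is the header (error name and message).
  const firstFrame = lines.findIndex((line) => /^\s*at /.test(line));
  const split = firstFrame === -1 ? lines.length : firstFrame;
  const header = lines.slice(0, split);
  const frames = lines.slice(split);

  if (frames.length <= MAX_FRAMES) {
    return [...header, ...frames].join("\n");
  }

  const omitted = frames.length - KEEP_TOP - KEEP_BOTTOM;
  return [
    ...header,
    ...frames.slice(0, KEEP_TOP),
    `    ... ${omitted} frames omitted ...`, // hypothetical notice format
    ...frames.slice(frames.length - KEEP_BOTTOM),
  ].join("\n");
}
```

In this sketch the omission notice is an extra line on top of the 50 kept frames; whether the real implementation counts it toward the cap is not stated in the entry.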

## @trigger.dev/python@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`
    -   `@trigger.dev/build@4.4.5`
    -   `@trigger.dev/sdk@4.4.5`

## @trigger.dev/react-hooks@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/redis-worker@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/rsc@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/schema-to-json@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

## @trigger.dev/sdk@4.4.5

### Patch Changes

-   Updated dependencies:
    -   `@trigger.dev/core@4.4.5`

</details>

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>