Skip to content

[HDX-4125] Enable end-to-end PR testing on Vercel previews (inline API)#2184

Merged
kodiakhq[bot] merged 8 commits intomainfrom
warren/HDX-4125-vercel-preview-inline-api
May 5, 2026
Merged

[HDX-4125] Enable end-to-end PR testing on Vercel previews (inline API)#2184
kodiakhq[bot] merged 8 commits intomainfrom
warren/HDX-4125-vercel-preview-inline-api

Conversation

@wrn14897
Copy link
Copy Markdown
Member

@wrn14897 wrn14897 commented May 4, 2026

Linear: https://linear.app/clickhouse/issue/HDX-4125/enable-end-to-end-pr-testing-on-vercel-previews-inline-api

Cherry-picked from DeploySentinel/hyperdx-ee#1996 (commit 316c4c494).

Ref: HDX-4137

Why

Today the Vercel preview only deploys the frontend (@hyperdx/app) and proxies /api/* to a separately-deployed long-lived API. PR previews can therefore only exercise UI changes — API-side changes are invisible.

This change inlines the Express API into the Next.js /api/[...all] serverless function so each preview deployment is a self-contained app + API stack. Reviewers can register, configure connections/sources, and run real ClickHouse queries against the PR's actual API code.

Production deploys (Docker fullstack image, standalone Next output) are unchanged — they keep proxying /api/* to the separately-deployed API service.

What

  • packages/api/src/serverless.ts — stateless handler that lazily connects to MongoDB on first invocation, strips the /api prefix, and delegates to the existing Express app from api-app.ts.
  • packages/app/pages/api/[...all].ts — branches on HDX_PREVIEW_INLINE_API. When 'true', lazy-requires the compiled serverless handler. Otherwise falls back to the proxy.
  • packages/app/next.config.mjs — webpack externals rule that marks @hyperdx/api (and subpaths) as external whenever HDX_PREVIEW_INLINE_API !== 'true'. Production app builds stay byte-for-byte equivalent and never bundle passport, mongoose, AWS SDK, etc.
  • packages/api/tsconfig.vercel.json + build:vercel script — produces build/<file>.js layout (no src/ subdirectory) so tsc-alias correctly rewrites @/ imports to relative paths for webpack. The default build script is unchanged, so production Docker builds and bin/hyperdx are unaffected.
  • vercel.json — declares install/build commands and an ignoreCommand that skips builds for changes outside app/api/common-utils.
  • config.IS_INLINE_API + FRONTEND_REDIRECT_BASE — emit relative redirects on Vercel previews so post-login/logout/auth-error/join-team flows stay on the auto-assigned preview host instead of bouncing to the production host configured at the project level. Outside preview, FRONTEND_REDIRECT_BASE === FRONTEND_URL, so all redirect URLs are byte-for-byte identical to today.
  • redirectToDashboard / handleAuthError now use 303 See Other so browsers follow the redirect with GET (avoids 405 on Next.js pages router).
  • DEFAULT_FRONTEND_URL falls back to '' when HYPERDX_APP_PORT is unset (avoids http://localhost:undefined / Invalid URL on Vercel previews).
  • OnboardingChecklist — gate sourceRows query on firstConnection?.id so brand-new teams don't fire a query with empty connection id.
  • agent_docs/development.md — documents the preview mode, required env vars (HDX_PREVIEW_INLINE_API, MONGO_URI, EXPRESS_SESSION_SECRET, DISABLED_AUTH_METHODS), and limitations (no background tasks, no alerts, no OPAMP, function size budget, cold starts).

OSS-specific deltas vs upstream commit

  • Slack router (packages/api/src/routers/api/slack.ts) doesn't exist in OSS, so its FRONTEND_URLFRONTEND_REDIRECT_BASE rename is dropped along with the slack.test.ts mock update.
  • Kept OSS-only HYPERDX_IMAGE / IS_APP_IMAGE / IS_ALL_IN_ONE_IMAGE / IS_LOCAL_IMAGE constants in config.ts.

Verification

  • yarn workspace @hyperdx/api ci:lint
  • yarn workspace @hyperdx/app ci:lint
  • yarn workspace @hyperdx/api build:vercel ✅ (zero unresolved @/ imports in build/)
  • yarn workspace @hyperdx/app ci:unit ✅ (1533 tests passed)

Cherry-picked from DeploySentinel/hyperdx-ee#1996 (commit 316c4c494).

Adds an opt-in code path that bundles the entire Express API into the
Next.js `/api/[...all]` serverless function so a single Vercel preview
deployment can serve both the app and the API. In all other environments
(Docker fullstack, standalone production) the existing
http-proxy-middleware path is preserved unchanged.

Key changes:
- packages/api/src/serverless.ts: stateless handler that lazily connects
  to MongoDB on first invocation and delegates to the Express app.
- packages/app/pages/api/[...all].ts: branches on HDX_PREVIEW_INLINE_API.
  When 'true', lazy-requires the compiled serverless handler. Otherwise
  falls back to the proxy.
- packages/app/next.config.mjs: webpack `externals` rule that marks
  `@hyperdx/api` as external whenever HDX_PREVIEW_INLINE_API is not
  'true', so production builds stay byte-for-byte equivalent.
- packages/api/tsconfig.vercel.json + build:vercel script: produces
  build/<file>.js layout (no src subdirectory) so tsc-alias correctly
  rewrites `@/` imports to relative paths for webpack.
- vercel.json: declares install/build commands and an ignoreCommand
  that skips builds for changes outside app/api/common-utils.
- config.IS_INLINE_API + FRONTEND_REDIRECT_BASE: emit relative redirects
  on Vercel previews so post-login/logout/auth-error/join-team flows
  stay on the auto-assigned preview host instead of bouncing to
  production. Outside preview, FRONTEND_REDIRECT_BASE === FRONTEND_URL,
  so all redirect URLs are byte-for-byte identical to today.
- redirectToDashboard / handleAuthError now use 303 See Other so
  browsers follow the redirect with GET (avoids 405 on Next.js pages).
- DEFAULT_FRONTEND_URL falls back to '' when HYPERDX_APP_PORT is unset
  (avoids 'http://localhost:undefined' / Invalid URL on Vercel previews).
- OnboardingChecklist: gate sourceRows query on firstConnection?.id so
  brand-new teams don't fire a query with empty connection id.
- agent_docs/development.md: documents the preview mode, env vars, and
  limitations (no background tasks, no alerts, no OPAMP).

OSS-specific deltas vs upstream commit:
- Slack router (packages/api/src/routers/api/slack.ts) doesn't exist in
  OSS, so its FRONTEND_URL → FRONTEND_REDIRECT_BASE rename is dropped
  along with the slack.test.ts mock update.
- Kept OSS-only HYPERDX_IMAGE / IS_APP_IMAGE / IS_ALL_IN_ONE_IMAGE /
  IS_LOCAL_IMAGE constants in config.ts.
@vercel
Copy link
Copy Markdown

vercel Bot commented May 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 5, 2026 2:07pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented May 4, 2026

🦋 Changeset detected

Latest commit: aa5e00e

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/api Patch
@hyperdx/app Patch
@hyperdx/otel-collector Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (2):
    • packages/api/src/config.ts
    • packages/api/src/middleware/auth.ts
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 14
  • Production lines changed: 294 (+ 72 in test files, excluded from tier calculation)
  • Branch: warren/HDX-4125-vercel-preview-inline-api
  • Author: wrn14897

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label May 4, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

PR Review

Well-structured PR with clear scope isolation (preview-only path, production untouched). A few items worth fixing:

  • ⚠️ POST-only enforcement in wrong location (clickhouseProxy.ts:173): The proxy filter still allows GET (_req.method === 'GET' || _req.method === 'POST'), which lets GET requests pass filter setup before being rejected inside the proxyReq event handler. The proxy connection is already established at that point, and calling res.sendStatus(405) from inside the event can race with the in-flight proxy request causing "headers already sent" errors → Move the POST-only check to the filter: filter: _req => _req.method === 'POST'

  • ⚠️ Missing error boundary on lazy require ([...all].ts:628): require('@hyperdx/api/build/serverless') is called bare. If the build artifact is missing or the module throws at load time (e.g. wrong APP_TYPE), the error propagates as an unhandled exception → Wrap in try-catch and return a 500 so callers get a meaningful error.

  • ⚠️ Potential double-serialization in JSON body path (clickhouseProxy.ts:398-403): When content-type: application/json, body may already be a parsed object (by body-parser), so JSON.stringify(body) is correct — but if body-parser is absent/bypassed and body is already a string, JSON.stringify wraps it in extra quotes and corrupts the ClickHouse request. Add a guard: body = typeof body === 'string' ? body : JSON.stringify(body)

  • ℹ️ Behavior change for all environments (auth.ts:243): Redirect after login changed from /search to / for all deployments (not just Vercel). This is intentional per the PR description (client-side routing handles it), but worth noting in release notes as it changes post-login UX for Docker/standalone users.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

E2E Test Results

All tests passed • 157 passed • 3 skipped • 1192s

Status Count
✅ Passed 157
❌ Failed 0
⚠️ Flaky 6
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

When the API runs inline inside the Next.js `/api/[...all]` serverless
function on Vercel previews (HDX_PREVIEW_INLINE_API=true), Next.js
populates req.query.all with the catch-all path segments before Express
sees the request. Express's query middleware short-circuits on
`if (!req.query)` and never re-parses, so the polluting `all` key
stays in req.query.

The proxy then built URL search params with
`new URLSearchParams(req.query)`, which forwarded
`?all=clickhouse-proxy&...` to ClickHouse. Modern ClickHouse rejects
this with:

  Code: 115. DB::Exception: Setting all is neither a builtin setting
  nor started with the prefix 'custom_' registered for user-defined
  settings. (UNKNOWN_SETTING)

Mirror the upstream EE pattern: parse the original request URL (the
`path` argument http-proxy-middleware passes) into a URL object and
use its searchParams. This sidesteps Next.js's req.query mutation
entirely. The original TODO in the file already pointed in this
direction.

Also drop the bogus `req.path.replace('^/clickhouse-proxy', '')` (a
literal-string replace that never matched) and replace it with a real
regex that strips the mount prefix as documented.

The OnboardingChecklist data check that fired with an empty connection
id has already been gated separately, so this fix is the remaining
regression preventing /search from loading on Vercel previews.
Replace OSS clickhouseProxy.ts with the downstream EE version (modulo
CHC and RBAC code paths, which don't exist in OSS). This eliminates
drift between OSS and EE so future merges of the inline-API ticket
back into OSS apply cleanly.

Behaviorally vs the previous OSS version:

- Query params now come from the request URL via
  validateAndSanitizePath() + URL.searchParams, not from req.query.
  This fixes the 'Setting all is neither a builtin setting nor started
  with the prefix custom_' regression on Vercel previews, where the
  Next.js [...all] catch-all route populates req.query.all and
  Express's query middleware (`if (!req.query)`) refuses to re-parse,
  so the polluting key gets forwarded to ClickHouse as ?all=...
  (see HDX-4125).
- Adds path-injection hardening: rejects pathnames containing
  query-string chars (?, &) — including double-encoded variants —
  and protocol-based payloads (javascript:, data:, //evil.com).
- POST-only proxy now: GETs return 405. Matches EE; the previous OSS
  permitted GETs and let them through unchanged.
- Body is JSON-stringified when content-type is application/json
  before forwarding (was raw write before).
- In dev (`IS_DEV`), captures the parameterized SQL on the active
  OTel span as http.request.body for trace-based debugging.
- proxyTimeout set to 800s (Vercel max function duration), matching
  the upstream MAX_CLICKHOUSE_PROXY_TIMEOUT_SECONDS constant. Inlined
  here rather than importing from @hyperdx/common-utils/dist/constants
  because the OSS constants module doesn't carry it (the rest of the
  EE constants file is RBAC/alert plumbing OSS doesn't have).
- proxyRes adds X-ClickHouse-Mixed-Response (207) and
  X-ClickHouse-Service-Unavailable (206) headers so the browser
  ClickHouse client can detect those statuses (it can read headers
  but not status codes).

OSS-specific deltas vs upstream commit (intentional, must remain):

- /test endpoint drops the optional `roles` body field and the
  per-role activation probe. Pure RBAC feature.
- getConnection drops isChcUser / role / accessToken /
  isRBACEnforced / clickhouseRoleName from req._hdx_connection. None
  of those fields exist on OSS connections or auth context.
- pathRewrite drops the CHC branch (request_timeout injection) and
  the RBAC branch (forced role= injection). OSS connections never
  set isChcUser or isRBACEnforced.
- proxyReq drops the CHC Authorization-pass-through path. OSS uses
  X-ClickHouse-User / X-ClickHouse-Key only.
- Replaces `STAGE !== 'production'` (downstream-only env) with
  IS_DEV for the span SQL attribute guard.
- Adds @braintree/sanitize-url ^7.1.1 to packages/api deps.
It's only referenced inside clickhouseProxy.ts. Upstream EE exports it
because it's covered by a unit test there; OSS has no such test, so
the export is dead and Knip flags it. Re-add the export when/if a
test file is brought over.
Comment thread packages/api/src/__tests__/config.test.ts
Comment thread packages/api/src/routers/api/clickhouseProxy.ts Outdated
Comment thread packages/api/src/routers/api/clickhouseProxy.ts Outdated
Comment thread packages/api/src/routers/api/clickhouseProxy.ts Outdated
Comment thread packages/api/src/config.ts Outdated
Match prior OSS behavior of leaving proxyTimeout unset (defaults to
http-proxy's behavior). The downstream EE 800s timeout is tied to
Vercel's max function duration; OSS deploys via Docker fullstack
where the timeout isn't relevant, so leaving it at the default keeps
the OSS deployment unchanged from before the EE clone.
clickhouseProxy.ts:
- Drop the dev-only OTel span SQL attribute (req body \u2192 parameterized
  SQL on http.request.body). Not used in OSS, removes the
  parameterizedQueryToSql / @opentelemetry/api / IS_DEV imports.
- Drop the X-ClickHouse-Mixed-Response (207) and
  X-ClickHouse-Service-Unavailable (206) response-header bridges.
  Those are downstream EE conventions; the OSS browser client
  doesn't read them.

config.ts: shorten the FRONTEND_REDIRECT_BASE comment.
@kodiakhq kodiakhq Bot merged commit 29586e7 into main May 5, 2026
17 checks passed
@kodiakhq kodiakhq Bot deleted the warren/HDX-4125-vercel-preview-inline-api branch May 5, 2026 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants