[HDX-4125] Enable end-to-end PR testing on Vercel previews (inline API)#2184
Conversation
Cherry-picked from DeploySentinel/hyperdx-ee#1996 (commit 316c4c494). Adds an opt-in code path that bundles the entire Express API into the Next.js `/api/[...all]` serverless function so a single Vercel preview deployment can serve both the app and the API. In all other environments (Docker fullstack, standalone production) the existing http-proxy-middleware path is preserved unchanged. Key changes: - packages/api/src/serverless.ts: stateless handler that lazily connects to MongoDB on first invocation and delegates to the Express app. - packages/app/pages/api/[...all].ts: branches on HDX_PREVIEW_INLINE_API. When 'true', lazy-requires the compiled serverless handler. Otherwise falls back to the proxy. - packages/app/next.config.mjs: webpack `externals` rule that marks `@hyperdx/api` as external whenever HDX_PREVIEW_INLINE_API is not 'true', so production builds stay byte-for-byte equivalent. - packages/api/tsconfig.vercel.json + build:vercel script: produces build/<file>.js layout (no src subdirectory) so tsc-alias correctly rewrites `@/` imports to relative paths for webpack. - vercel.json: declares install/build commands and an ignoreCommand that skips builds for changes outside app/api/common-utils. - config.IS_INLINE_API + FRONTEND_REDIRECT_BASE: emit relative redirects on Vercel previews so post-login/logout/auth-error/join-team flows stay on the auto-assigned preview host instead of bouncing to production. Outside preview, FRONTEND_REDIRECT_BASE === FRONTEND_URL, so all redirect URLs are byte-for-byte identical to today. - redirectToDashboard / handleAuthError now use 303 See Other so browsers follow the redirect with GET (avoids 405 on Next.js pages). - DEFAULT_FRONTEND_URL falls back to '' when HYPERDX_APP_PORT is unset (avoids 'http://localhost:undefined' / Invalid URL on Vercel previews). - OnboardingChecklist: gate sourceRows query on firstConnection?.id so brand-new teams don't fire a query with empty connection id. - agent_docs/development.md: documents the preview mode, env vars, and limitations (no background tasks, no alerts, no OPAMP). OSS-specific deltas vs upstream commit: - Slack router (packages/api/src/routers/api/slack.ts) doesn't exist in OSS, so its FRONTEND_URL → FRONTEND_REDIRECT_BASE rename is dropped along with the slack.test.ts mock update. - Kept OSS-only HYPERDX_IMAGE / IS_APP_IMAGE / IS_ALL_IN_ONE_IMAGE / IS_LOCAL_IMAGE constants in config.ts.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
🦋 Changeset detectedLatest commit: aa5e00e The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
🔴 Tier 4 — CriticalTouches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD. Why this tier:
Review process: Deep review from a domain expert. Synchronous walkthrough may be required. Stats
|
PR ReviewWell-structured PR with clear scope isolation (preview-only path, production untouched). A few items worth fixing:
|
E2E Test Results✅ All tests passed • 157 passed • 3 skipped • 1192s
Tests ran across 4 shards in parallel. |
When the API runs inline inside the Next.js `/api/[...all]` serverless
function on Vercel previews (HDX_PREVIEW_INLINE_API=true), Next.js
populates req.query.all with the catch-all path segments before Express
sees the request. Express's query middleware short-circuits on
`if (!req.query)` and never re-parses, so the polluting `all` key
stays in req.query.
The proxy then built URL search params with
`new URLSearchParams(req.query)`, which forwarded
`?all=clickhouse-proxy&...` to ClickHouse. Modern ClickHouse rejects
this with:
Code: 115. DB::Exception: Setting all is neither a builtin setting
nor started with the prefix 'custom_' registered for user-defined
settings. (UNKNOWN_SETTING)
Mirror the upstream EE pattern: parse the original request URL (the
`path` argument http-proxy-middleware passes) into a URL object and
use its searchParams. This sidesteps Next.js's req.query mutation
entirely. The original TODO in the file already pointed in this
direction.
Also drop the bogus `req.path.replace('^/clickhouse-proxy', '')` (a
literal-string replace that never matched) and replace it with a real
regex that strips the mount prefix as documented.
The OnboardingChecklist data check that fired with an empty connection
id has already been gated separately, so this fix is the remaining
regression preventing /search from loading on Vercel previews.
Replace OSS clickhouseProxy.ts with the downstream EE version (modulo CHC and RBAC code paths, which don't exist in OSS). This eliminates drift between OSS and EE so future merges of the inline-API ticket back into OSS apply cleanly. Behaviorally vs the previous OSS version: - Query params now come from the request URL via validateAndSanitizePath() + URL.searchParams, not from req.query. This fixes the 'Setting all is neither a builtin setting nor started with the prefix custom_' regression on Vercel previews, where the Next.js [...all] catch-all route populates req.query.all and Express's query middleware (`if (!req.query)`) refuses to re-parse, so the polluting key gets forwarded to ClickHouse as ?all=... (see HDX-4125). - Adds path-injection hardening: rejects pathnames containing query-string chars (?, &) — including double-encoded variants — and protocol-based payloads (javascript:, data:, //evil.com). - POST-only proxy now: GETs return 405. Matches EE; the previous OSS permitted GETs and let them through unchanged. - Body is JSON-stringified when content-type is application/json before forwarding (was raw write before). - In dev (`IS_DEV`), captures the parameterized SQL on the active OTel span as http.request.body for trace-based debugging. - proxyTimeout set to 800s (Vercel max function duration), matching the upstream MAX_CLICKHOUSE_PROXY_TIMEOUT_SECONDS constant. Inlined here rather than importing from @hyperdx/common-utils/dist/constants because the OSS constants module doesn't carry it (the rest of the EE constants file is RBAC/alert plumbing OSS doesn't have). - proxyRes adds X-ClickHouse-Mixed-Response (207) and X-ClickHouse-Service-Unavailable (206) headers so the browser ClickHouse client can detect those statuses (it can read headers but not status codes). OSS-specific deltas vs upstream commit (intentional, must remain): - /test endpoint drops the optional `roles` body field and the per-role activation probe. Pure RBAC feature. - getConnection drops isChcUser / role / accessToken / isRBACEnforced / clickhouseRoleName from req._hdx_connection. None of those fields exist on OSS connections or auth context. - pathRewrite drops the CHC branch (request_timeout injection) and the RBAC branch (forced role= injection). OSS connections never set isChcUser or isRBACEnforced. - proxyReq drops the CHC Authorization-pass-through path. OSS uses X-ClickHouse-User / X-ClickHouse-Key only. - Replaces `STAGE !== 'production'` (downstream-only env) with IS_DEV for the span SQL attribute guard. - Adds @braintree/sanitize-url ^7.1.1 to packages/api deps.
It's only referenced inside clickhouseProxy.ts. Upstream EE exports it because it's covered by a unit test there; OSS has no such test, so the export is dead and Knip flags it. Re-add the export when/if a test file is brought over.
Match prior OSS behavior of leaving proxyTimeout unset (defaults to http-proxy's behavior). The downstream EE 800s timeout is tied to Vercel's max function duration; OSS deploys via Docker fullstack where the timeout isn't relevant, so leaving it at the default keeps the OSS deployment unchanged from before the EE clone.
clickhouseProxy.ts: - Drop the dev-only OTel span SQL attribute (req body \u2192 parameterized SQL on http.request.body). Not used in OSS, removes the parameterizedQueryToSql / @opentelemetry/api / IS_DEV imports. - Drop the X-ClickHouse-Mixed-Response (207) and X-ClickHouse-Service-Unavailable (206) response-header bridges. Those are downstream EE conventions; the OSS browser client doesn't read them. config.ts: shorten the FRONTEND_REDIRECT_BASE comment.
Linear: https://linear.app/clickhouse/issue/HDX-4125/enable-end-to-end-pr-testing-on-vercel-previews-inline-api
Cherry-picked from DeploySentinel/hyperdx-ee#1996 (commit
316c4c494).Ref: HDX-4137
Why
Today the Vercel preview only deploys the frontend (
@hyperdx/app) and proxies/api/*to a separately-deployed long-lived API. PR previews can therefore only exercise UI changes — API-side changes are invisible.This change inlines the Express API into the Next.js
/api/[...all]serverless function so each preview deployment is a self-contained app + API stack. Reviewers can register, configure connections/sources, and run real ClickHouse queries against the PR's actual API code.Production deploys (Docker fullstack image, standalone Next output) are unchanged — they keep proxying
/api/*to the separately-deployed API service.What
packages/api/src/serverless.ts— stateless handler that lazily connects to MongoDB on first invocation, strips the/apiprefix, and delegates to the existing Express app fromapi-app.ts.packages/app/pages/api/[...all].ts— branches onHDX_PREVIEW_INLINE_API. When'true', lazy-requires the compiled serverless handler. Otherwise falls back to the proxy.packages/app/next.config.mjs— webpackexternalsrule that marks@hyperdx/api(and subpaths) as external wheneverHDX_PREVIEW_INLINE_API !== 'true'. Production app builds stay byte-for-byte equivalent and never bundle passport, mongoose, AWS SDK, etc.packages/api/tsconfig.vercel.json+build:vercelscript — producesbuild/<file>.jslayout (nosrc/subdirectory) sotsc-aliascorrectly rewrites@/imports to relative paths for webpack. The defaultbuildscript is unchanged, so production Docker builds andbin/hyperdxare unaffected.vercel.json— declares install/build commands and anignoreCommandthat skips builds for changes outsideapp/api/common-utils.config.IS_INLINE_API+FRONTEND_REDIRECT_BASE— emit relative redirects on Vercel previews so post-login/logout/auth-error/join-team flows stay on the auto-assigned preview host instead of bouncing to the production host configured at the project level. Outside preview,FRONTEND_REDIRECT_BASE === FRONTEND_URL, so all redirect URLs are byte-for-byte identical to today.redirectToDashboard/handleAuthErrornow use 303 See Other so browsers follow the redirect with GET (avoids 405 on Next.js pages router).DEFAULT_FRONTEND_URLfalls back to''whenHYPERDX_APP_PORTis unset (avoidshttp://localhost:undefined/Invalid URLon Vercel previews).OnboardingChecklist— gatesourceRowsquery onfirstConnection?.idso brand-new teams don't fire a query with empty connection id.agent_docs/development.md— documents the preview mode, required env vars (HDX_PREVIEW_INLINE_API,MONGO_URI,EXPRESS_SESSION_SECRET,DISABLED_AUTH_METHODS), and limitations (no background tasks, no alerts, no OPAMP, function size budget, cold starts).OSS-specific deltas vs upstream commit
packages/api/src/routers/api/slack.ts) doesn't exist in OSS, so itsFRONTEND_URL→FRONTEND_REDIRECT_BASErename is dropped along with theslack.test.tsmock update.HYPERDX_IMAGE/IS_APP_IMAGE/IS_ALL_IN_ONE_IMAGE/IS_LOCAL_IMAGEconstants inconfig.ts.Verification
yarn workspace @hyperdx/api ci:lint✅yarn workspace @hyperdx/app ci:lint✅yarn workspace @hyperdx/api build:vercel✅ (zero unresolved@/imports inbuild/)yarn workspace @hyperdx/app ci:unit✅ (1533 tests passed)