feat(api): redactSecrets util for LLM input from observability data #2188
alex-fedotyev wants to merge 4 commits into main
Adds a reusable best-effort secret redactor with conservative allowlist patterns covering: PEM blocks, basic-auth URLs, key=value pairs, JSON-shaped secrets, HTTP secret headers, Bearer/Basic auth values, JWTs, AWS access keys, Slack tokens, and GitHub token shapes. Codifies the design rule for HyperDX AI endpoints in the file header: LLM input derived from observability data passes through redactSecrets; user-authored prose does not. Internal-only; no consumer in this commit. Imported by the upcoming /ai/summarize endpoint and any future LLM endpoints that ingest observability data. Refs HDX-3992.
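The design rule in the description can be sketched roughly as follows. This is an illustration only: `buildSummarizePrompt`, its parameters, and the one-pattern stand-in body of `redactSecrets` are assumptions made for the example, not the code added in this PR.

```typescript
// Stand-in for the real util: a single illustrative pattern (Bearer tokens).
// The redactSecrets added in this PR covers many more secret shapes.
function redactSecrets(input: string): string {
  return input.replace(/\b(Bearer)\s+[A-Za-z0-9._~+\/-]+=*/g, "$1 [REDACTED]");
}

// Hypothetical prompt builder illustrating the rule:
// observability-derived text is redacted, user-authored prose is not.
function buildSummarizePrompt(userQuestion: string, logLines: string[]): string {
  const redactedLogs = logLines.map((line) => redactSecrets(line)).join("\n");
  return `${userQuestion}\n\nLogs:\n${redactedLogs}`;
}

const prompt = buildSummarizePrompt("Why did checkout fail?", [
  "GET /cart 401 Authorization: Bearer abc.def-123",
]);
// The log line in `prompt` reads:
// "GET /cart 401 Authorization: Bearer [REDACTED]"
```

The user's question is passed through untouched on purpose: redacting prose the user typed would risk mangling their intent, while log and trace content is where leaked credentials are most likely to appear.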
|
🦋 Changeset detected
Latest commit: 0829acd
The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages.
|
🔵 Tier 2: Low Risk
Small, isolated change with no API route or data model modifications.
Review process: AI review + quick human skim (target: 5–15 min). Reviewer validates AI assessment and checks for domain-specific concerns.
|
PR Review
✅ No critical security bugs. PEM bounded lazy quantifier ( |
E2E Test Results
✅ All tests passed • 161 passed • 3 skipped • 1143s
Tests ran across 4 shards in parallel. |
Address review comments on #2188:

- basic-auth-url now handles "@" in passwords. Previous regex stopped at the first "@", leaving any password tail before the host visible. New regex greedily consumes the password and backtracks to the last "@" before the host; host is captured and preserved in the replacement. New test: a password containing "@" must be fully redacted, with the host intact.
- key-value pattern now matches shell-style quoted values: PASSWORD="hunter2 with spaces" and API_KEY='abc 123' are redacted. Previously the unquoted character class stopped at the leading quote, so neither pattern fired. Two new tests cover both quote styles.
- pem pattern is bounded by {0,16000}? on the lazy match so an unmatched BEGIN does not scan an unbounded amount of trailing input. Real PEM blocks are well under 16KB; the API caps the whole request body at 50KB. New test asserts unchanged output and sub-500ms wall-clock on a 50KB unmatched-BEGIN payload.
- Header "Known gaps" comment now mentions raw "@" in basic-auth usernames (ambiguous to parse without percent-encoding).

44 tests pass; eight new cases for the items above. No changes to the public surface.

Refs HDX-3992.
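For readers following the three regex changes above, here is a minimal sketch of what such patterns might look like. The exact expressions, the `[REDACTED]` placeholder, the choice to redact the username together with the password, and the names `BASIC_AUTH_URL`, `KEY_VALUE`, `PEM`, and `redactSketch` are all assumptions for illustration; the regexes in the PR may differ.

```typescript
// Password class allows "@", so the greedy match backtracks to the LAST "@"
// before the host. The host (group 2) is kept in the replacement.
const BASIC_AUTH_URL =
  /\b([a-z][a-z0-9+.-]*:\/\/)[^\s:@\/]+:[^\s\/]+@([^\s\/@]+)/gi;

// Value is either a double-quoted string, a single-quoted string,
// or an unquoted run of non-delimiter characters.
const KEY_VALUE =
  /\b(password|api_key|token)(\s*=\s*)("[^"]*"|'[^']*'|[^\s"'&;,]+)/gi;

// Lazy body bounded at 16000 characters: an unmatched BEGIN gives up after
// the bound instead of scanning all remaining input for an END marker.
const PEM =
  /-----BEGIN ([A-Z ]*)PRIVATE KEY-----[\s\S]{0,16000}?-----END \1PRIVATE KEY-----/g;

// Hypothetical helper applying the three sketched patterns in order.
function redactSketch(input: string): string {
  return input
    .replace(PEM, "[REDACTED]")
    .replace(BASIC_AUTH_URL, "$1[REDACTED]@$2")
    .replace(KEY_VALUE, "$1$2[REDACTED]");
}

const urlOut = redactSketch("postgres://admin:p@ss@db.internal:5432/app");
// urlOut === "postgres://[REDACTED]@db.internal:5432/app"

const quotedOut = redactSketch('PASSWORD="hunter2 with spaces"');
// quotedOut === "PASSWORD=[REDACTED]"

// An unmatched BEGIN followed by a large body is returned unchanged.
const unmatched = "-----BEGIN RSA PRIVATE KEY-----\n" + "A".repeat(50000);
const unmatchedOut = redactSketch(unmatched);
```

In the URL case the password class stops at "/", so the greedy match first swallows the host as well and then gives characters back until an "@" is followed by a valid host. That backtracking is what makes the last "@" the separator rather than the first.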
|
Thanks for the review. Pushed fixes in 9753dc1.
44 tests pass; eight new cases. |
The previous review-fix commit pushed prod lines from 139 to 153, just over the Tier 2 threshold (< 150 prod lines). Compressing the verbose comments on PEM, basic-auth-url, and key-value patterns brings prod back to 144. No behavior change.
Co-Authored-By: Claude Opus <model> <noreply@anthropic.com>
Summary
Adds a reusable best-effort secret redactor at packages/api/src/utils/redactSecrets.ts. Internal-only utility; no consumer in this PR. The next AI-summarize PR (HDX-3992 split) imports it; future LLM endpoints that ingest observability data should also.

The file header codifies the design rule for HyperDX AI endpoints: LLM input derived from observability data passes through redactSecrets; user-authored prose does not.
Patterns covered

| Pattern | Matches |
| --- | --- |
| pem | `-----BEGIN ... PRIVATE KEY-----` blocks (RSA, EC, DSA, OPENSSH, PKCS#8) |
| basic-auth-url | `https://user:pass@host` |
| key-value | `password=...`, `api_key=...`, `token=...`, etc. |
| json-quoted | `{"password":"..."}` and similar |
| http-header | `X-Api-Key:`, `X-Auth-Token:`, `Api-Key:` |
| bearer | `Authorization: Bearer ...` |
| basic | `Authorization: Basic ...` |
| jwt | `eyJ...` three-segment base64url |
| aws-access-key | `AKIA[16]`, `ASIA[16]` |
| slack-token | `xox[a-z]-...` |
| github-token | `ghp_`, `gho_`, `ghu_`, `ghs_`, `ghr_` |

Known gaps (deferred to follow-ups)
Why this is its own PR
Splits cleanly from the larger AI summarize work (HDX-3992 / #2108). Lands as a small, isolated, test-heavy change so review is fast and the util is in place before downstream consumers arrive.
Tests
38 cases in packages/api/src/utils/__tests__/redactSecrets.test.ts covering: each pattern with a positive case, a "looks similar but isn't" negative, and at least one multi-secret payload. Pattern-coverage assertion exposes the registry shape so future additions get a compile-time signal.

All neighboring api utils tests still pass (8 suites, 61 tests).
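One way a registry shape can give a compile-time signal is to key it by a union of pattern names. The sketch below is a guess at that idea, not the PR's actual registry: `PatternName`, `RedactionPattern`, `PATTERNS`, `redactWithRegistry`, the three regexes, and the `[REDACTED:name]` placeholders are all hypothetical.

```typescript
// Hypothetical subset of pattern names, mirroring three rows of the table.
type PatternName = "jwt" | "aws-access-key" | "github-token";

interface RedactionPattern {
  regex: RegExp;
  replacement: string;
}

// Record<PatternName, ...> requires exactly one entry per name in the union,
// so adding a name without a pattern (or a pattern without a name) fails to
// type-check.
const PATTERNS: Record<PatternName, RedactionPattern> = {
  jwt: {
    regex: /\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+/g,
    replacement: "[REDACTED:jwt]",
  },
  "aws-access-key": {
    regex: /\b(?:AKIA|ASIA)[A-Z0-9]{16}\b/g,
    replacement: "[REDACTED:aws-access-key]",
  },
  "github-token": {
    regex: /\bgh[pousr]_[A-Za-z0-9]{20,}/g,
    replacement: "[REDACTED:github-token]",
  },
};

const PATTERN_NAMES = Object.keys(PATTERNS) as PatternName[];

// Apply every registered pattern in insertion order.
function redactWithRegistry(input: string): string {
  return PATTERN_NAMES.reduce(
    (text, name) => text.replace(PATTERNS[name].regex, PATTERNS[name].replacement),
    input
  );
}

const awsOut = redactWithRegistry("key=AKIAIOSFODNN7EXAMPLE region=us-east-1");
// awsOut === "key=[REDACTED:aws-access-key] region=us-east-1"
```

A test can then iterate `PATTERN_NAMES` and assert that each name has at least one positive case, which would complement the type-level check with a runtime one.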
No user-facing change
The util is internal API code with no production consumer in this PR. No changeset.
References