Skip to content

feat: optimized default otel-logs schema#2125

Merged
kodiakhq[bot] merged 12 commits into
mainfrom
aaron/new-otel-schema
May 5, 2026
Merged

feat: optimized default otel-logs schema#2125
kodiakhq[bot] merged 12 commits into
mainfrom
aaron/new-otel-schema

Conversation

@knudtty
Copy link
Copy Markdown
Contributor

@knudtty knudtty commented Apr 15, 2026

Summary

After weeks of benchmarking, this optimized logs schema proves to beat the existing schema in almost all queries. The primary key is now prepended with toStartOfFiveMinutes, which groups rows together well enough that TimestampTime is not needed. Additionally, full text search is added for all indexes as it shows superior performance with minimal ingest overhead.

How to test locally

  1. If you have a .volumes directory locally, (re)move it
  2. yarn dev
  3. Poke around and make sure everything is working. Check the schema to ensure its the new one

References

  • Linear Issue: Closes HDX-4034

@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 15, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
hyperdx-oss Ready Ready Preview, Comment May 5, 2026 3:31pm

Request Review

@changeset-bot
Copy link
Copy Markdown

changeset-bot Bot commented Apr 15, 2026

🦋 Changeset detected

Latest commit: adfd9aa

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages
Name Type
@hyperdx/otel-collector Minor
@hyperdx/api Minor
@hyperdx/app Minor

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions github-actions Bot added the review/tier-4 Critical — deep review + domain expert sign-off label Apr 15, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 15, 2026

🔴 Tier 4 — Critical

Touches auth, data models, config, tasks, OTel pipeline, ClickHouse, or CI/CD.

Why this tier:

  • Critical-path files (8):
    • .github/workflows/main.yml
    • docker/clickhouse/local/init-db-e2e.sh
    • docker/hyperdx/Dockerfile
    • docker/hyperdx/entry.local.base.sh
    • docker/otel-collector/schema/seed/00002_otel_logs.sql
    • docker/otel-collector/schema/seed/00002_otel_logs_compat.sql
    • packages/otel-collector/cmd/migrate/main.go
    • packages/otel-collector/cmd/migrate/main_test.go
  • Cross-layer change: touches frontend (packages/app) + backend (packages/api)

Review process: Deep review from a domain expert. Synchronous walkthrough may be required.
SLA: Schedule synchronous review within 2 business days.

Stats
  • Production files changed: 34
  • Production lines changed: 513 (+ 16 in test files, excluded from tier calculation)
  • Branch: aaron/new-otel-schema
  • Author: knudtty

To override this classification, remove the review/tier-4 label and apply a different review/tier-* label. Manual overrides are preserved on subsequent pushes.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 15, 2026

PR Review

  • ⚠️ Breaking schema change with no automated migration → Existing deployments with data in otel_logs will silently lose it (table won't be recreated by goose if it already exists). Consider adding a goose -- +goose Down migration or a doc/warning for ops teams upgrading from the old schema; "rm .volumes" only works locally.

  • ⚠️ EventName column added to schema but never referenced in any query, config, or API code → Confirm this is intentional (forward-looking for OTEL spec) or wire it up; silent dead columns cause confusion.

  • ⚠️ enable_block_number_column / enable_block_offset_column settings added to new schema but absent from e2e init script (docker/clickhouse/local/init-db-e2e.sh:129) → E2E tests run against a schema that differs from production, which could mask bugs dependent on those settings.

  • ⚠️ INDEX idx_trace_id TraceId TYPE text(tokenizer = 'array')array tokenizer treats the value as an array of strings; TraceId is a single String (hex UUID), so this indexes it as a one-element array and won't help partial/prefix lookups. The old bloom_filter(0.001) was correct for exact TraceId lookups. Consider using text(tokenizer = 'default') or keeping bloom_filter for this specific index.

  • ✅ Version-gated compat schema logic (Go), unit tests, and smoke tests for the compat path are solid.

  • ✅ Typo fix in CI workflow (otel-ccollectorotel-collector) is a good catch.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 15, 2026

E2E Test Results

All tests passed • 158 passed • 3 skipped • 1246s

Status Count
✅ Passed 158
❌ Failed 0
⚠️ Flaky 5
⏭️ Skipped 3

Tests ran across 4 shards in parallel.

View full report →

@knudtty
Copy link
Copy Markdown
Contributor Author

knudtty commented Apr 15, 2026

Edit: This is completed

I still need to add some logic that only adds FTS if the CH version is > 26.2. Will require some modification to the migration logic

@knudtty knudtty requested review from a team, bot-hyperdx and dhable and removed request for a team and bot-hyperdx April 16, 2026 01:16
INDEX idx_scope_attr_value mapValues(ScopeAttributes) TYPE text(tokenizer = 'array'),
INDEX idx_log_attr_key mapKeys(LogAttributes) TYPE text(tokenizer = 'array'),
INDEX idx_log_attr_value mapValues(LogAttributes) TYPE text(tokenizer = 'array'),
INDEX idx_lower_body lower(Body) TYPE text(tokenizer = 'splitByNonAlpha')
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about adding FTS on all fields like

ADD INDEX fulltext_lower_v1_idx concatWithSeparator(';', TraceId, SpanId, TraceFlags, SeverityText, SeverityNumber, ServiceName, Body, ResourceSchemaUrl, ResourceAttributes, ScopeSchemaUrl, ScopeName, ScopeVersion, ScopeAttributes, LogAttributes) TYPE text(tokenizer = 'splitByNonAlpha', preprocessor = lower(concatWithSeparator(';', TraceId, SpanId, TraceFlags, SeverityText, SeverityNumber, ServiceName, Body, ResourceSchemaUrl, ResourceAttributes, ScopeSchemaUrl, ScopeName, ScopeVersion, ScopeAttributes, LogAttributes))) GRANULARITY 100000000

Add change the implicit search field to concatWithSeparator(';', TraceId, SpanId, TraceFlags, SeverityText, SeverityNumber, ServiceName, Body, ResourceSchemaUrl, ResourceAttributes, ScopeSchemaUrl, ScopeName, ScopeVersion, ScopeAttributes, LogAttributes). This is going to massively improve DX since developers won’t always need to look up the attributes. Any thoughts?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'd like to, it isn't supported yet. Probably made it into 26.4

@@ -1 +1 @@
SELECT SeverityText FROM otel_logs WHERE ResourceAttributes['suite-id'] = 'normalize-severity' AND ResourceAttributes['test-id'] = 'text-case' ORDER BY TimestampTime FORMAT CSV
SELECT SeverityText FROM otel_logs WHERE ResourceAttributes['suite-id'] = 'normalize-severity' AND ResourceAttributes['test-id'] = 'text-case' ORDER BY (toStartOfFiveMinutes(Timestamp), Timestamp) FORMAT CSV
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For smoke tests, I’d suggest adding another ClickHouse instance with an older version (< 26.2) and ensuring the schema and SELECT queries still work correctly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@knudtty Is that something you're working on?

ENABLE_SWAGGER=true
DEFAULT_CONNECTIONS=[{"name":"Local ClickHouse","host":"http://localhost:${HDX_DEV_CH_HTTP_PORT}","username":"default","password":""}]
DEFAULT_SOURCES=[{"from":{"databaseName":"default","tableName":"otel_logs"},"kind":"log","timestampValueExpression":"TimestampTime","name":"Logs","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","traceSourceId":"Traces","sessionSourceId":"Sessions","metricSourceId":"Metrics"},{"from":{"databaseName":"default","tableName":"otel_traces"},"kind":"trace","timestampValueExpression":"Timestamp","name":"Traces","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"SpanName","serviceNameExpression":"ServiceName","eventAttributesExpression":"SpanAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,StatusCode,round(Duration/1e6),SpanName","traceIdExpression":"TraceId","spanIdExpression":"SpanId","durationExpression":"Duration","durationPrecision":9,"parentSpanIdExpression":"ParentSpanId","spanNameExpression":"SpanName","spanKindExpression":"SpanKind","statusCodeExpression":"StatusCode","statusMessageExpression":"StatusMessage","connection":"Local ClickHouse","logSourceId":"Logs","sessionSourceId":"Sessions","metricSourceId":"Metrics"},{"from":{"databaseName":"default","tableName":""},"kind":"metric","timestampValueExpression":"TimeUnix","name":"Metrics","resourceAttributesExpression":"ResourceAttributes","metricTables":{"gauge":"otel_metrics_gauge","histogram":"otel_metrics_histogram","sum":"otel_metrics_sum","_id":"682586a8b1f81924e628e808","id":"682586a8b1f81924e628e808"},"connection":"Local ClickHouse","logSourceId":"Logs","traceSourceId":"Traces","sessionSourceId":"Sessions"},{"from":{"databaseName":"default","tableName":"hyperdx_sessions"},"kind":"session","timestampValueExpression":"TimestampTime","name":"Sessions","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","logSourceId":"Logs","traceSourceId":"Traces","metricSourceId":"Metrics"},{"from":{"databaseName":"otel_json","tableName":"otel_logs"},"kind":"log","timestampValueExpression":"Timestamp","name":"JSON Logs","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","traceSourceId":"JSON Traces","metricSourceId":"JSON Metrics"},{"from":{"databaseName":"otel_json","tableName":"otel_traces"},"kind":"trace","timestampValueExpression":"Timestamp","name":"JSON Traces","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"SpanName","serviceNameExpression":"ServiceName","eventAttributesExpression":"SpanAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,StatusCode,round(Duration/1e6),SpanName","traceIdExpression":"TraceId","spanIdExpression":"SpanId","durationExpression":"Duration","durationPrecision":9,"parentSpanIdExpression":"ParentSpanId","spanNameExpression":"SpanName","spanKindExpression":"SpanKind","statusCodeExpression":"StatusCode","statusMessageExpression":"StatusMessage","connection":"Local ClickHouse","logSourceId":"JSON Logs","metricSourceId":"JSON Metrics"},{"from":{"databaseName":"otel_json","tableName":""},"kind":"metric","timestampValueExpression":"TimeUnix","name":"JSON Metrics","resourceAttributesExpression":"ResourceAttributes","metricTables":{"gauge":"otel_metrics_gauge","histogram":"otel_metrics_histogram","sum":"otel_metrics_sum"},"connection":"Local ClickHouse","logSourceId":"JSON Logs","traceSourceId":"JSON Traces"}]
DEFAULT_SOURCES=[{"from":{"databaseName":"default","tableName":"otel_logs"},"kind":"log","timestampValueExpression":"Timestamp","name":"Logs","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","traceSourceId":"Traces","sessionSourceId":"Sessions","metricSourceId":"Metrics"},{"from":{"databaseName":"default","tableName":"otel_traces"},"kind":"trace","timestampValueExpression":"Timestamp","name":"Traces","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"SpanName","serviceNameExpression":"ServiceName","eventAttributesExpression":"SpanAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,StatusCode,round(Duration/1e6),SpanName","traceIdExpression":"TraceId","spanIdExpression":"SpanId","durationExpression":"Duration","durationPrecision":9,"parentSpanIdExpression":"ParentSpanId","spanNameExpression":"SpanName","spanKindExpression":"SpanKind","statusCodeExpression":"StatusCode","statusMessageExpression":"StatusMessage","connection":"Local ClickHouse","logSourceId":"Logs","sessionSourceId":"Sessions","metricSourceId":"Metrics"},{"from":{"databaseName":"default","tableName":""},"kind":"metric","timestampValueExpression":"TimeUnix","name":"Metrics","resourceAttributesExpression":"ResourceAttributes","metricTables":{"gauge":"otel_metrics_gauge","histogram":"otel_metrics_histogram","sum":"otel_metrics_sum","_id":"682586a8b1f81924e628e808","id":"682586a8b1f81924e628e808"},"connection":"Local ClickHouse","logSourceId":"Logs","traceSourceId":"Traces","sessionSourceId":"Sessions"},{"from":{"databaseName":"default","tableName":"hyperdx_sessions"},"kind":"session","timestampValueExpression":"TimestampTime","name":"Sessions","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","logSourceId":"Logs","traceSourceId":"Traces","metricSourceId":"Metrics"},{"from":{"databaseName":"otel_json","tableName":"otel_logs"},"kind":"log","timestampValueExpression":"Timestamp","name":"JSON Logs","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"Body","serviceNameExpression":"ServiceName","bodyExpression":"Body","eventAttributesExpression":"LogAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,SeverityText,Body","severityTextExpression":"SeverityText","traceIdExpression":"TraceId","spanIdExpression":"SpanId","connection":"Local ClickHouse","traceSourceId":"JSON Traces","metricSourceId":"JSON Metrics"},{"from":{"databaseName":"otel_json","tableName":"otel_traces"},"kind":"trace","timestampValueExpression":"Timestamp","name":"JSON Traces","displayedTimestampValueExpression":"Timestamp","implicitColumnExpression":"SpanName","serviceNameExpression":"ServiceName","eventAttributesExpression":"SpanAttributes","resourceAttributesExpression":"ResourceAttributes","defaultTableSelectExpression":"Timestamp,ServiceName,StatusCode,round(Duration/1e6),SpanName","traceIdExpression":"TraceId","spanIdExpression":"SpanId","durationExpression":"Duration","durationPrecision":9,"parentSpanIdExpression":"ParentSpanId","spanNameExpression":"SpanName","spanKindExpression":"SpanKind","statusCodeExpression":"StatusCode","statusMessageExpression":"StatusMessage","connection":"Local ClickHouse","logSourceId":"JSON Logs","metricSourceId":"JSON Metrics"},{"from":{"databaseName":"otel_json","tableName":""},"kind":"metric","timestampValueExpression":"TimeUnix","name":"JSON Metrics","resourceAttributesExpression":"ResourceAttributes","metricTables":{"gauge":"otel_metrics_gauge","histogram":"otel_metrics_histogram","sum":"otel_metrics_sum"},"connection":"Local ClickHouse","logSourceId":"JSON Logs","traceSourceId":"JSON Traces"}]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread docker/clickhouse/local/init-db-e2e.sh
Comment thread docker/clickhouse/local/init-db-e2e.sh
PARTITION BY toDate(Timestamp)
ORDER BY (toStartOfFiveMinutes(Timestamp), ServiceName, Timestamp)
TTL toDateTime(Timestamp) + ${TABLES_TTL}
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need to add enable_block_number_column = 1, enable_block_offset_column = 1; here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Added

Copy link
Copy Markdown
Member

@wrn14897 wrn14897 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great optimization!

@kodiakhq kodiakhq Bot merged commit aaba3e9 into main May 5, 2026
19 checks passed
@kodiakhq kodiakhq Bot deleted the aaron/new-otel-schema branch May 5, 2026 16:23
knudtty added a commit to ClickHouse/ClickStack-helm-charts that referenced this pull request May 5, 2026
New schema is introduced in
hyperdxio/hyperdx#2125

The schema removes the `TimestampTime` column, so default sources should
be updated appropriately
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

automerge review/tier-4 Critical — deep review + domain expert sign-off

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants