Merged
2 changes: 1 addition & 1 deletion docs-mintlify/admin/account-billing/distribution.mdx
@@ -124,4 +124,4 @@ open http://localhost:3000
[ref-infrastructure]: /admin/deployment/infrastructure
[ref-workspace]: /admin/workspace
[ref-deployment]: /admin/deployment
[ref-update-channels]: /admin/deployment/deployments#update-channels
[ref-update-channels]: /admin/deployment/index#update-channels
182 changes: 182 additions & 0 deletions docs-mintlify/admin/connect-to-data/data-sources/ksqldb.mdx
@@ -349,6 +349,19 @@ cubes:
          - CUBE.user_id
          - CUBE.status
        stream_offset: latest
        output_column_types:
          - member: CUBE.order_id
            type: text
          - member: CUBE.user_id
            type: text
          - member: CUBE.status
            type: text
          - member: CUBE.count
            type: int
          - member: CUBE.total_amount
            type: decimal
          - member: CUBE.failed_count
            type: int
```

```javascript title="JavaScript"
@@ -514,6 +527,14 @@ cube("order_events_stream", {
},
},
stream_offset: `latest`,
output_column_types: [
{ member: CUBE.order_id, type: `text` },
{ member: CUBE.user_id, type: `text` },
{ member: CUBE.status, type: `text` },
{ member: CUBE.count, type: `int` },
{ member: CUBE.total_amount, type: `decimal` },
{ member: CUBE.failed_count, type: `int` },
],
},
},
});
@@ -533,6 +554,9 @@ Key properties for the streaming pre-aggregation:
from the last processed offset regardless of this setting.
- `unique_key_columns` — columns that uniquely identify a record, used
for deduplication (see [below](#unique-key-columns-and-deduplication)).
- `output_column_types` — declares the output column types for the Cube
Store table; required in Kafka streams mode when `unique_key_columns`
is set (see [below](#output-column-types)).

#### Primary key and ungrouped queries

@@ -557,6 +581,35 @@ or table. Cube discovers its schema automatically. With Kafka streams
mode enabled, the streaming pre-aggregation reads the backing Kafka topic
directly — no objects are created in ksqlDB.

#### Topic name matching {#topic-name-matching}

In Kafka streams mode, Cube Store parses the `select_statement`
(generated from the cube's `sql` property) and matches the `FROM` table
name against the actual Kafka topic name. On managed platforms like
Confluent Cloud, the Kafka topic name often differs from the ksqlDB
stream or table name — for example, a ksqlDB stream called
`ORDER_EVENTS_STREAM` might be backed by a Kafka topic named
`pksqlc-abc123ORDER_EVENTS_STREAM`.

The cube's `sql` property must reference the **ksqlDB stream or table
name** (not the Kafka topic), because Cube uses ksqlDB `DESCRIBE` to
discover the schema and resolve the backing topic. However, Cube does
not currently rewrite the `FROM` clause in the generated
`select_statement` to use the resolved Kafka topic name. If the ksqlDB
object name differs from the Kafka topic name, Cube Store will fail
with:

> Topic table ORDER_EVENTS_STREAM is not found

<Warning>

This is a known limitation of Kafka streams mode. It does not occur
when the ksqlDB object name and the Kafka topic name are the same,
which is the default behavior when ksqlDB creates a stream or table
with the default topic naming strategy.

</Warning>
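
One way to anticipate this error is to compare each ksqlDB object name with its backing topic name before enabling Kafka streams mode. The sketch below is illustrative only: `findTopicNameMismatches`, the `{ name, topic }` record shape, and the `PAYMENTS_STREAM` entry are assumptions made for this example and are not part of Cube or ksqlDB.

```javascript
// Hypothetical helper, not part of Cube: returns the ksqlDB objects whose
// backing Kafka topic name differs from the object name.
function findTopicNameMismatches(objects) {
  return objects.filter(({ name, topic }) => name !== topic);
}

const ksqlObjects = [
  // Name and topic differ, as in the Confluent Cloud example above.
  { name: "ORDER_EVENTS_STREAM", topic: "pksqlc-abc123ORDER_EVENTS_STREAM" },
  // Hypothetical stream created with the default topic naming strategy.
  { name: "PAYMENTS_STREAM", topic: "PAYMENTS_STREAM" },
];

const mismatched = findTopicNameMismatches(ksqlObjects);
console.log(mismatched.map(({ name }) => name));
```

Any object reported by such a check would be expected to trigger the "Topic table ... is not found" error in Kafka streams mode.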

### Unique key columns and deduplication

When `unique_key_columns` is set, Cube Store appends an internal
@@ -579,6 +632,135 @@ falls back to the Kafka message key: for a single unique key column, the
raw key value is used; for composite keys, the key is expected to be a
JSON object with matching field names.
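
The message-key fallback described above can be sketched as follows. This is a simplified illustration and not Cube Store's implementation: `uniqueKeyFromMessageKey` is a hypothetical helper, and the sample key values are made up.

```javascript
// Hypothetical sketch of the message-key fallback, not Cube Store's code.
function uniqueKeyFromMessageKey(rawKey, uniqueKeyColumns) {
  if (uniqueKeyColumns.length === 1) {
    // Single unique key column: the raw key value is used as-is.
    return { [uniqueKeyColumns[0]]: rawKey };
  }
  // Composite key: the message key is expected to be a JSON object
  // whose field names match the unique key columns.
  const parsed = JSON.parse(rawKey);
  return Object.fromEntries(uniqueKeyColumns.map((col) => [col, parsed[col]]));
}

console.log(uniqueKeyFromMessageKey("ord-1", ["order_id"]));
console.log(
  uniqueKeyFromMessageKey('{"order_id":"ord-1","user_id":"u-7"}', [
    "order_id",
    "user_id",
  ])
);
```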

### Output column types

In Kafka streams mode, Cube Store creates its internal pre-aggregation
table based on column type information. By default, column types are
inferred from the source ksqlDB stream using `DESCRIBE`. However, the
pre-aggregation's `select_statement` (generated from the rollup
definition) renames and transforms columns — for example, a source
column `CREATED_AT` becomes `order_events_stream__created_at_second` in
the output.
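
The renaming can be pictured as joining the cube name and the member name with a double underscore, then appending the granularity for the time dimension. The sketch below only mirrors the pattern visible in the example above; `aliasedColumnName` is a hypothetical helper, and Cube's actual aliasing logic may differ.

```javascript
// Hypothetical helper that reproduces the naming pattern from the example
// above. It is not Cube's implementation.
function aliasedColumnName(cubeName, memberName, granularity) {
  const base = `${cubeName}__${memberName}`.toLowerCase();
  // Only the time dimension carries a granularity suffix.
  return granularity ? `${base}_${granularity}` : base;
}

console.log(aliasedColumnName("order_events_stream", "created_at", "second"));
// order_events_stream__created_at_second
console.log(aliasedColumnName("order_events_stream", "order_id"));
// order_events_stream__order_id
```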

When this renaming happens, the raw source column types no longer match
the output column names, causing errors like:

> Key column `order_events_stream__id` not found among column definitions

To fix this, define `output_column_types` on the streaming
pre-aggregation. Cube then uses the declared types, under the aliased
output column names, for the Cube Store table and passes the source
schema separately so that Cube Store can still deserialize the raw Kafka
messages.

<CodeGroup>

```yaml title="YAML"
pre_aggregations:
  - name: stream
    type: rollup
    read_only: true
    measures:
      - CUBE.count
      - CUBE.total_amount
      - CUBE.failed_count
    dimensions:
      - CUBE.order_id
      - CUBE.user_id
      - CUBE.status
    unique_key_columns:
      - order_id
    time_dimension: CUBE.created_at
    granularity: second
    partition_granularity: day
    build_range_start:
      sql: "SELECT date_trunc('day', DATE_SUB(NOW(), INTERVAL '5 hour'))"
    build_range_end:
      sql: "SELECT DATE_ADD(NOW(), INTERVAL '15 minute')"
    refresh_key:
      every: 1 minute
      update_window: 1 hour
      incremental: true
    stream_offset: latest
    output_column_types:
      - member: CUBE.order_id
        type: text
      - member: CUBE.user_id
        type: text
      - member: CUBE.status
        type: text
      - member: CUBE.count
        type: int
      - member: CUBE.total_amount
        type: decimal
      - member: CUBE.failed_count
        type: int
```

```javascript title="JavaScript"
pre_aggregations: {
  stream: {
    type: `rollup`,
    read_only: true,
    measures: [CUBE.count, CUBE.total_amount, CUBE.failed_count],
    dimensions: [CUBE.order_id, CUBE.user_id, CUBE.status],
    unique_key_columns: [`order_id`],
    time_dimension: CUBE.created_at,
    granularity: `second`,
    partition_granularity: `day`,
    build_range_start: {
      sql: `SELECT date_trunc('day', DATE_SUB(NOW(), INTERVAL '5 hour'))`,
    },
    build_range_end: {
      sql: `SELECT DATE_ADD(NOW(), INTERVAL '15 minute')`,
    },
    refresh_key: {
      every: `1 minute`,
      update_window: `1 hour`,
      incremental: true,
    },
    stream_offset: `latest`,
    output_column_types: [
      { member: CUBE.order_id, type: `text` },
      { member: CUBE.user_id, type: `text` },
      { member: CUBE.status, type: `text` },
      { member: CUBE.count, type: `int` },
      { member: CUBE.total_amount, type: `decimal` },
      { member: CUBE.failed_count, type: `int` },
    ],
  },
},
```

</CodeGroup>

Each entry in `output_column_types` has two properties:

- `member` — a reference to a dimension or measure included in the
pre-aggregation.
- `type` — the Cube Store column type. Common values: `text`, `int`,
`bigint`, `decimal`, `float`, `boolean`, `timestamp`.

The time dimension used in `time_dimension` does not need an entry in
`output_column_types` — its type is always `timestamp` and is set
automatically.

When `output_column_types` is defined, Cube uses the aliased column
names (matching the `select_statement`) for the Cube Store table
definition and passes the raw source schema separately via
`source_table`, so Cube Store knows how to deserialize incoming Kafka
messages. Without it, column names come from the raw ksqlDB `DESCRIBE`
output and will not match the aliased names in the `select_statement`
or `unique_key_columns`.
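
A quick sanity check for a pre-aggregation definition is to list the members that have no entry in `output_column_types`. The sketch below assumes that every dimension and measure other than the time dimension needs an entry; `missingOutputColumnTypes` is a hypothetical helper, and members are written as plain strings purely for illustration (in a real data model they are `CUBE` references).

```javascript
// Hypothetical check, not part of Cube: returns the dimensions and measures
// that have no entry in output_column_types.
function missingOutputColumnTypes(preAggregation) {
  const {
    measures = [],
    dimensions = [],
    output_column_types = [],
  } = preAggregation;
  const typed = new Set(output_column_types.map((entry) => entry.member));
  // The time dimension is deliberately not checked: its type is always
  // `timestamp` and is set automatically.
  return [...dimensions, ...measures].filter((member) => !typed.has(member));
}

const stream = {
  measures: ["count", "total_amount", "failed_count"],
  dimensions: ["order_id", "user_id", "status"],
  time_dimension: "created_at",
  output_column_types: [
    { member: "order_id", type: "text" },
    { member: "user_id", type: "text" },
    { member: "status", type: "text" },
    { member: "count", type: "int" },
    { member: "total_amount", type: "decimal" },
    // The entry for failed_count is left out on purpose.
  ],
};

console.log(missingOutputColumnTypes(stream)); // [ 'failed_count' ]
```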

<Warning>

`output_column_types` is required for Kafka streams mode when the
pre-aggregation uses `unique_key_columns`. Without it, the unique key
column names will not match the table column definitions, causing the
pre-aggregation build to fail.

</Warning>

### Stream format

Cube Store expects Kafka messages to have a **JSON object** as their
2 changes: 1 addition & 1 deletion docs-mintlify/admin/deployment/auto-suspension.mdx
@@ -124,4 +124,4 @@ response times to be significantly longer than usual.
[self-effects]: #effects-on-experience
[ref-refresh-worker]: /cube-core/architecture#refresh-worker
[ref-sls]: /docs/integrations/semantic-layer-sync#on-schedule
[ref-cube-version]: /admin/deployment/deployments#cube-version
[ref-cube-version]: /admin/deployment/index#cube-version
10 changes: 2 additions & 8 deletions docs-mintlify/admin/deployment/environments.mdx
@@ -9,12 +9,6 @@ Every Cube Cloud deployment provides a number of environments:
- Multiple [staging environments](#staging-environments).
- Per-user [development environments](#development-environments).

<Note>

Available on [all plans](https://cube.dev/pricing).

</Note>

## Production environment

This is the main environment. It runs the data model from the _main branch_.
@@ -120,5 +114,5 @@ credentials**][ref-credentials].
[ref-suspend]: /docs/deployment/cloud/auto-suspension
[ref-overview]: /docs/workspace/integrations#review-integrations
[ref-credentials]: /docs/workspace/integrations#view-api-credentials
[ref-version]: /docs/deployment/cloud/deployments#cube-version
[ref-version-channel]: /docs/deployment/cloud/deployments#update-channels
[ref-version]: /admin/deployment/index#cube-version
[ref-version-channel]: /admin/deployment/index#update-channels
2 changes: 1 addition & 1 deletion docs-mintlify/admin/deployment/index.mdx
@@ -147,7 +147,7 @@ You can view or change the update channel by navigating to **Settings →
General → Cube version**:

<Frame>
<img src="https://ucarecdn.com/3ef5fa36-4b27-437c-aa70-dbc0aa01255d/" />
<img src="https://ucarecdn.com/3ef5fa36-4b27-437c-aa70-dbc0aa01255d/-/format/auto/" />
</Frame>

You can select a specific version in the drop-down. Only versions that have been used by
6 changes: 0 additions & 6 deletions docs-mintlify/admin/deployment/infrastructure.mdx
@@ -26,12 +26,6 @@ scaling, and monitoring your Cube Deployments, as well as managing Cube Store
and persisting pre-aggregated data. This option requires the least effort to
set up.

<Note>

Available on [all plans](https://cube.dev/pricing).

</Note>

Please note that some Enterprise features, such as VPC peering or PrivateLink, are
not available on the multi-tenant infrastructure. Resource contention (the
"noisy neighbor" problem) is also possible.
6 changes: 0 additions & 6 deletions docs-mintlify/admin/monitoring/pre-aggregations.mdx
@@ -9,12 +9,6 @@ can see which pre-aggregations are accelerating queries, if they are [being
refreshed][ref-caching-using-preaggs-refresh], along with the last 24 hours of
build history.

<Note>

Available on [all plans](https://cube.dev/pricing).

</Note>

<Frame>
<img src="https://ucarecdn.com/eb10de99-4ff6-4a9f-abd5-736971497583/" />
</Frame>
4 changes: 2 additions & 2 deletions docs-mintlify/admin/monitoring/query-history.mdx
@@ -10,8 +10,8 @@ failed.

<Note>

Available on [all plans](https://cube.dev/pricing).
You can also choose a [Query History tier](/admin/account-billing/pricing#query-history-tiers).
You can choose a [Query History tier](/admin/account-billing/pricing#query-history-tiers)
to fit your retention and throughput needs.

</Note>

3 changes: 2 additions & 1 deletion docs-mintlify/docs.json
@@ -175,7 +175,8 @@
"pages": [
"docs/data-modeling/visual-modeler",
"docs/data-modeling/data-model-ide",
"docs/data-modeling/dev-mode"
"docs/data-modeling/dev-mode",
"docs/data-modeling/access-policies-viewer"
]
}
]
6 changes: 0 additions & 6 deletions docs-mintlify/docs/data-modeling/access-control/context.mdx
@@ -300,12 +300,6 @@ enrich the security context with additional attributes.
When using Cube Cloud, you can enrich the security context with information about
an authenticated user, obtained during their authentication.

<Note>

Available on [all plans](https://cube.dev/pricing).

</Note>

You can enable the authentication integration by navigating to **Settings → Configuration**
in your Cube Cloud deployment and using the **Enable Cloud Auth Integration** toggle.
