8 changes: 4 additions & 4 deletions .github/workflows/build.yaml
@@ -23,9 +23,9 @@ on:

env:
OPERATOR_NAME: "spark-k8s-operator"
-RUST_NIGHTLY_TOOLCHAIN_VERSION: "nightly-2025-10-23"
-NIX_PKG_MANAGER_VERSION: "2.30.0"
-RUST_TOOLCHAIN_VERSION: "1.89.0"
+RUST_NIGHTLY_TOOLCHAIN_VERSION: "nightly-2026-02-24"
+NIX_PKG_MANAGER_VERSION: "2.33.3"
+RUST_TOOLCHAIN_VERSION: "1.93.0"
HADOLINT_VERSION: "v2.14.0"
PYTHON_VERSION: "3.14"
CARGO_TERM_COLOR: always
@@ -139,7 +139,7 @@ jobs:
set -euo pipefail
[ -n "$GITHUB_DEBUG" ] && set -x

-CURRENT_VERSION=$(cargo metadata --format-version 1 --no-deps | jq -r '.packages[0].version')
+CURRENT_VERSION=$(cargo metadata --format-version 1 --no-deps | jq -r '.packages[] | select(.name == "stackable-spark-k8s-operator") | .version')

if [ "$GITHUB_EVENT_NAME" == 'pull_request' ]; then
# Include a PR suffix if this workflow is triggered by a PR
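An aside on the jq filter change above: it swaps positional selection for selection by package name. A minimal sketch of why that matters in a Cargo workspace, using a hypothetical, abridged `cargo metadata` payload (the crate name `some-helper-crate` is illustrative; the real output is much larger):

```shell
# Abridged, hypothetical `cargo metadata` output: in a workspace the
# package order is not guaranteed, so `.packages[0]` may pick a dependency.
metadata='{"packages":[{"name":"some-helper-crate","version":"0.1.0"},{"name":"stackable-spark-k8s-operator","version":"25.11.0"}]}'

# Old filter: returns whichever package happens to be listed first.
echo "$metadata" | jq -r '.packages[0].version'
# 0.1.0

# New filter: selects the operator crate by name, regardless of order.
echo "$metadata" | jq -r \
  '.packages[] | select(.name == "stackable-spark-k8s-operator") | .version'
# 25.11.0
```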
6 changes: 3 additions & 3 deletions .github/workflows/pr_pre-commit.yaml
@@ -7,11 +7,11 @@ on:

env:
CARGO_TERM_COLOR: always
-NIX_PKG_MANAGER_VERSION: "2.30.0"
-RUST_TOOLCHAIN_VERSION: "nightly-2025-10-23"
+NIX_PKG_MANAGER_VERSION: "2.33.3"
+RUST_TOOLCHAIN_VERSION: "nightly-2026-02-24"
HADOLINT_VERSION: "v2.14.0"
PYTHON_VERSION: "3.14"
-JINJA2_CLI_VERSION: "0.8.2"
+JINJA2_CLI_VERSION: "1.0.0"

jobs:
pre-commit:
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
@@ -105,7 +105,7 @@ repos:
- id: cargo-rustfmt
name: cargo-rustfmt
language: system
-entry: cargo +nightly-2025-10-23 fmt --all -- --check
+entry: cargo +nightly-2026-02-24 fmt --all -- --check
stages: [pre-commit, pre-merge-commit]
pass_filenames: false
files: \.rs$
2 changes: 1 addition & 1 deletion .vscode/settings.json
@@ -1,7 +1,7 @@
{
"rust-analyzer.rustfmt.overrideCommand": [
"rustfmt",
-"+nightly-2025-10-23",
+"+nightly-2026-02-24",
"--edition",
"2024",
"--"
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ All notable changes to this project will be documented in this file.
Previously, Jobs were retried at most 6 times by default ([#647]).
- Support for Spark `3.5.8` ([#650]).
- First class support for S3 on Spark connect clusters ([#652]).
+- Spark applications can now reference templates that are merged into the application manifest before reconciliation. This allows users with many applications to factor out common configuration into a central place and reduce duplication ([#660]).

### Fixed

@@ -45,6 +46,7 @@ All notable changes to this project will be documented in this file.
[#652]: https://github.com/stackabletech/spark-k8s-operator/pull/652
[#655]: https://github.com/stackabletech/spark-k8s-operator/pull/655
[#656]: https://github.com/stackabletech/spark-k8s-operator/pull/656
+[#660]: https://github.com/stackabletech/spark-k8s-operator/pull/660

## [25.11.0] - 2025-11-07

1 change: 1 addition & 0 deletions Cargo.lock

Some generated files are not rendered by default.

17 changes: 13 additions & 4 deletions Cargo.nix

Some generated files are not rendered by default.

1 change: 1 addition & 0 deletions Cargo.toml
@@ -29,6 +29,7 @@ tokio = { version = "1.40", features = ["full"] }
tracing = "0.1"
tracing-futures = { version = "0.2", features = ["futures-03"] }
indoc = "2"
+regex = "1"

[patch."https://github.com/stackabletech/operator-rs.git"]
# stackable-operator = { git = "https://github.com/stackabletech//operator-rs.git", branch = "main" }
1 change: 1 addition & 0 deletions deploy/helm/spark-k8s-operator/templates/roles.yaml
@@ -124,6 +124,7 @@ rules:
- sparkapplications
- sparkhistoryservers
- sparkconnectservers
+- sparkapptemplates
verbs:
- get
- list
6 changes: 3 additions & 3 deletions docker/Dockerfile
@@ -13,12 +13,12 @@

# We want to automatically use the latest. We also don't tag our images with a version.
# hadolint ignore=DL3007
-FROM oci.stackable.tech/sdp/ubi9-rust-builder:latest AS builder
+FROM oci.stackable.tech/sdp/ubi10-rust-builder:latest AS builder


# We want to automatically use the latest.
# hadolint ignore=DL3007
-FROM registry.access.redhat.com/ubi9/ubi-minimal:latest AS operator
+FROM registry.access.redhat.com/ubi10/ubi-minimal:latest AS operator

ARG VERSION
# NOTE (@Techassi): This is required for OpenShift/Red Hat certification
@@ -74,7 +74,7 @@ LABEL org.opencontainers.image.description="Deploy and manage Apache Spark-on-Ku

# https://docs.openshift.com/container-platform/4.16/openshift_images/create-images.html#defining-image-metadata
# https://github.com/projectatomic/ContainerApplicationGenericLabels/blob/master/vendor/redhat/labels.md
-LABEL io.openshift.tags="ubi9,stackable,sdp,spark-k8s"
+LABEL io.openshift.tags="ubi10,stackable,sdp,spark-k8s"
LABEL io.k8s.description="Deploy and manage Apache Spark-on-Kubernetes clusters."
LABEL io.k8s.display-name="Stackable Operator for Apache Spark-on-Kubernetes"

99 changes: 99 additions & 0 deletions docs/modules/spark-k8s/pages/usage-guide/app_templates.adoc
@@ -0,0 +1,99 @@
= Spark Application Templates
:description: Learn how to configure application templates for Spark applications on the Stackable Data Platform.

Spark application templates are used to define reusable configurations for Spark applications.
When you have many applications with similar configurations, templates help you avoid duplication by grouping common settings together.
Application templates are available for the `v1alpha1` version of the SparkApplication custom resource.
They share the exact same structure as the SparkApplication resource, but the operator handles them differently in a few ways:
Review comment (Member): Do we know yet if the plan is to keep versions in step with one another (e.g. a v1alpha2 is created for one entity when the other entity is bumped to v1alpha2)? I think that might make mapping between app and template a little easier.

1. Application templates are cluster wide resources, while Spark application resources are namespace-scoped. This means that application templates can be used across multiple namespaces, while Spark application resources are limited to the namespace they are created in.
Review comment (Member): Sentences should be on new lines in the docs, e.g.

Suggested change:
-1. Application templates are cluster wide resources, while Spark application resources are namespace-scoped. This means that application templates can be used across multiple namespaces, while Spark application resources are limited to the namespace they are created in.
+1. Application templates are cluster wide resources, while Spark application resources are namespace-scoped.
+This means that application templates can be used across multiple namespaces, while Spark application resources are limited to the namespace they are created in.

(not sure if we want indenting there...)

2. Application templates are not reconciled by the operator, but must be referenced from a SparkApplication resource to be applied. This means that changes to an application template will not automatically trigger updates to SparkApplication resources that reference it.
3. An application can reference multiple application templates, and the settings from these templates will be merged together. The merging order of the templates is indicated by their index in the reference list. The application fields have the highest precedence and will override any conflicting settings from the templates. This allows you to have a base template with common settings and then override specific settings in the application resource as needed.
4. Application template references are immutable: once applied to an application, they cannot be changed. Templates are applied when the application is created, and any later changes to the template references are ignored.
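The precedence rules in points 3 and 4 amount to a right-biased deep merge. A minimal sketch of that behavior using jq's `*` operator on two hypothetical manifest fragments (the operator's actual merge implementation is in Rust and may differ in detail):

```shell
# Template fragment (lowest precedence) and application fragment (highest).
template='{"sparkImage":{"productVersion":"4.1.1","pullPolicy":"IfNotPresent"},"mainApplicationFile":"placeholder"}'
app='{"mainApplicationFile":"/examples.jar"}'

# jq's `*` deep-merges objects, with the right-hand side winning on conflict:
# the application's mainApplicationFile replaces the template placeholder,
# while the template-only sparkImage settings survive.
echo "$template $app" | jq -cs '.[0] * .[1]'
# {"sparkImage":{"productVersion":"4.1.1","pullPolicy":"IfNotPresent"},"mainApplicationFile":"/examples.jar"}
```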

== Examples

Applications use `metadata.annotations` to reference application templates as shown below:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: app
  annotations:
    spark-application.template.merge: "true" # <1>
    spark-application.template.0.name: "app-template" # <2>
spec: # <3>
  sparkImage:
    productVersion: "4.1.1"
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "/examples.jar"
----
<1> Enable application template merging for this application.
<2> Name of the application template to reference.
<3> Application specification. The fields `sparkImage`, `mode`, `mainClass`, and `mainApplicationFile` are required for the application to be valid; the remaining fields are optional and can be defined in the application template.

Review comment (Member) on the `spark-application.template.merge` annotation: I think it would be good to include all available annotations in the example to be comprehensive.

The application template referenced in the example above is defined as follows:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplicationTemplate # <1>
metadata:
  name: app-template # <2>
spec:
  sparkImage:
    productVersion: "4.1.1"
    pullPolicy: IfNotPresent
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "placeholder" # <3>
  sparkConf:
    spark.kubernetes.file.upload.path: "s3a://my-bucket"
  s3connection:
    reference: spark-history-s3-connection
  logFileDirectory:
    s3:
      prefix: eventlogs/
      bucket:
        reference: spark-history-s3-bucket
  driver:
    config:
      logging:
        enableVectorAgent: false
  executor:
    replicas: 1
    config:
      logging:
        enableVectorAgent: false
----
<1> The kind of the resource is `SparkApplicationTemplate` to indicate that this is an application template.
<2> Name of the application template.
<3> The value of `mainApplicationFile` is set to a placeholder, which is overridden by the application resource. As in the application, the fields `sparkImage`, `mode`, `mainClass`, and `mainApplicationFile` are required for the template to be valid.
Review comment (Member): Why do we have to define `sparkImage` etc. both in the template and the app? Is the plan with v1alpha2 to remove them as being mandatory in the application so we can only have them in the template (or override them)? If so, can they be optional in the template?


An application can reference multiple application templates as shown below:

[source,yaml]
----
---
apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: app
  annotations:
    spark-application.template.merge: "true" # <1>
    spark-application.template.0.name: "app-template-0" # <2>
    spark-application.template.1.name: "app-template-1"
    spark-application.template.2.name: "app-template-2"
spec: # <3>
  sparkImage:
    productVersion: "4.1.1"
  mode: cluster
  mainClass: com.example.Main
  mainApplicationFile: "/examples.jar"
----
<1> Enable application template merging for this application.
<2> The names of the application templates to reference. The settings from these templates are merged in the order they are referenced, with `app-template-0` having the lowest precedence and `app-template-2` the highest. The application fields have the highest overall precedence and override any conflicting settings from the templates.
Review comment (Member), suggested change (typo fix): `Tha application fields` → `The application fields`.
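The indexed annotation scheme implies an ordering step before merging. A hypothetical sketch of collecting the references in merge order (the operator implements this in Rust; the key/value representation below is illustrative):

```shell
# Annotations as key=value pairs, deliberately listed out of order.
annotations='spark-application.template.merge=true
spark-application.template.2.name=app-template-2
spark-application.template.0.name=app-template-0
spark-application.template.1.name=app-template-1'

# Keep only the indexed name annotations, sort numerically on the index
# (field 3 when split on "."), and print the template names in merge order.
echo "$annotations" \
  | grep -E '^spark-application\.template\.[0-9]+\.name=' \
  | sort -t '.' -k3,3n \
  | cut -d '=' -f2
# app-template-0
# app-template-1
# app-template-2
```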

1 change: 1 addition & 0 deletions docs/modules/spark-k8s/partials/nav.adoc
@@ -6,6 +6,7 @@
** xref:spark-k8s:usage-guide/job-dependencies.adoc[]
** xref:spark-k8s:usage-guide/resources.adoc[]
** xref:spark-k8s:usage-guide/s3.adoc[]
+** xref:spark-k8s:usage-guide/app_templates.adoc[]
** xref:spark-k8s:usage-guide/security.adoc[]
** xref:spark-k8s:usage-guide/logging.adoc[]
** xref:spark-k8s:usage-guide/history-server.adoc[]