Many images rebuild the same thing as a stage (for example, hadoop, druid, etc. all build `hadoop/hadoop`, then copy the artifacts out into the final image).

One way to speed up image builds is to move that compile stage out on its own and publish the resulting image for reuse.

Each version would then only need to be built once per architecture (to account for platform-dependent code). Any patches applied after that would trigger a new build, and the builds would be timestamped.
> [!IMPORTANT]
> The SDP version is irrelevant to the source build, so the same artifact can be used for both `0.0.0-dev` images and the actual SDP release images.
We decided to abuse the `sdp-version` input of the reusable workflow to store a timestamp, so the source images will look like `oci.stackable.tech/precompiled/hadoop:1.2.3-stackable1234567`. The timestamp makes it easy to see which build is newer.
A second PR can be raised to update images to use the new timestamp.
If that becomes a burden, we can explore tooling to make this easier (maybe in boil).
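Since updating a consumer is just swapping one timestamp string per `boil-config.toml`, such tooling could be as simple as the following rough sketch (the key name `precompiled_hadoop_ts` comes from the example changes in this issue; the function itself is hypothetical):

```python
import re

def bump_precompiled_ts(toml_text: str, new_ts: int) -> str:
    """Point a boil-config.toml at a newer precompiled-source build by
    replacing every precompiled_hadoop_ts value with the new timestamp."""
    return re.sub(
        r'(precompiled_hadoop_ts\s*=\s*")\d+(")',
        rf"\g<1>{new_ts}\g<2>",
        toml_text,
    )

old = 'precompiled_hadoop_ts = "1776412753"\n'
print(bump_precompiled_ts(old, 1800000000), end="")
# precompiled_hadoop_ts = "1800000000"
```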
PRs will be raised in pairs. For example:

- a PR for `precompiled/hadoop` (or vector, or nifi) so the image is available,
- a PR to replace the `hadoop-builder` (or vector, or nifi) stage in the dependent product images.
## Groups of changes

We don't have to do them all at once; instead we can start with the ones that appear to take significant time:

- Hadoop
- Vector (Note: this will be somewhat quicker since #1462)
- Nifi

## Example of the Part 2 type changes

A sample of the second part of this change, for `hadoop/boil-config.toml` and `hadoop/Dockerfile`:
hadoop/boil-config.toml

```diff
diff --git a/hadoop/boil-config.toml b/hadoop/boil-config.toml
index 4c6ef2a..a7989da 100644
--- a/hadoop/boil-config.toml
+++ b/hadoop/boil-config.toml
@@ -3,21 +3,27 @@
 # Not part of SDP 25.7.0, but still required for hbase, hive, spark-k8s
 [versions."3.3.6".local-images]
-"hadoop/hadoop" = "3.3.6"
 java-base = "11"
 java-devel = "11"
 
 [versions."3.3.6".build-arguments]
+precompiled-hadoop-version = "3.3.6"
+# Find the latest build timestamp here:
+# https://oci.stackable.tech/harbor/projects/52/repositories/hadoop/artifacts-tab
+precompiled_hadoop_ts = "1776412753"
 async-profiler-version = "2.9"
 jmx-exporter-version = "1.3.0"
 hdfs-utils-version = "0.4.0"
 
 [versions."3.4.2".local-images]
-"hadoop/hadoop" = "3.4.2"
 java-base = "11"
 java-devel = "11"
 
 [versions."3.4.2".build-arguments]
+precompiled-hadoop-version = "3.4.2"
+# Find the latest build timestamp here:
+# https://oci.stackable.tech/harbor/projects/52/repositories/hadoop/artifacts-tab
+precompiled_hadoop_ts = "1776412753"
 async-profiler-version = "2.9"
 jmx-exporter-version = "1.3.0"
 hdfs-utils-version = "0.5.0"
```
hadoop/Dockerfile
```diff
diff --git i/hadoop/Dockerfile w/hadoop/Dockerfile
index 2757b4e..a4076d0 100644
--- i/hadoop/Dockerfile
+++ w/hadoop/Dockerfile
@@ -1,7 +1,9 @@
 # syntax=docker/dockerfile:1.16.0@sha256:e2dd261f92e4b763d789984f6eab84be66ab4f5f08052316d8eb8f173593acf7
-# check=error=true
+# check=error=true;skip=InvalidDefaultArgInFrom
 
-FROM local-image/hadoop/hadoop AS hadoop-builder
+ARG PRECOMPILED_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_TS
+FROM oci.stackable.tech/precompiled/hadoop:${PRECOMPILED_HADOOP_VERSION}-stackable${PRECOMPILED_HADOOP_TS} AS hadoop-builder
 
 FROM local-image/java-devel AS hdfs-utils-builder
@@ -9,11 +11,9 @@ ARG HDFS_UTILS_VERSION
 ARG PRODUCT_VERSION
 ARG RELEASE_VERSION
 ARG STACKABLE_USER_UID
-ARG HADOOP_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_VERSION
 
 # Reassign the arg to `HADOOP_VERSION` for better readability.
-# It is passed as `HADOOP_HADOOP_VERSION`, because boil-config.toml has to contain `hadoop/hadoop` to establish a dependency on the Hadoop builder.
-# The value of `hadoop/hadoop` is transformed by `bake` and automatically passed as `HADOOP_HADOOP_VERSION` arg.
-ENV HADOOP_VERSION=${HADOOP_HADOOP_VERSION}
+ENV HADOOP_VERSION=${PRECOMPILED_HADOOP_VERSION}
 
 # Starting with hdfs-utils 0.4.0 we need to use Java 17 for compilation.
 # We can not simply use java-devel with Java 17, as it is also used to compile Hadoop in this
@@ -69,9 +69,9 @@ FROM local-image/java-base AS final
 ARG PRODUCT_VERSION
 ARG RELEASE_VERSION
-ARG HADOOP_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_VERSION
 # Reassign the arg to `HADOOP_VERSION` for better readability.
-ENV HADOOP_VERSION=${HADOOP_HADOOP_VERSION}
+ENV HADOOP_VERSION=${PRECOMPILED_HADOOP_VERSION}
 ARG HDFS_UTILS_VERSION
 ARG STACKABLE_USER_UID
 ARG ASYNC_PROFILER_VERSION
```
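Because the two new `ARG`s are declared before the first `FROM`, they can be interpolated into the base-image reference (hence the `InvalidDefaultArgInFrom` check being skipped: the args have no defaults). A hypothetical manual invocation, assembling the same `--build-arg` flags that boil/bake would inject from `boil-config.toml`:

```python
# Hypothetical sketch: building hadoop by hand with the two new build
# arguments; in practice the tooling passes these automatically.
version, ts = "3.4.2", "1776412753"
cmd = [
    "docker", "build", "hadoop/",
    "--build-arg", f"PRECOMPILED_HADOOP_VERSION={version}",
    "--build-arg", f"PRECOMPILED_HADOOP_TS={ts}",
]
print(" ".join(cmd))
# docker build hadoop/ --build-arg PRECOMPILED_HADOOP_VERSION=3.4.2 --build-arg PRECOMPILED_HADOOP_TS=1776412753
```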