chore: Faster image builds #1465

@NickLarsenNZ

Description

Many images rebuild the same artifact in a builder stage (for example, hadoop, druid, and others all build hadoop/hadoop, then copy the results out into the final image).

One way we can speed up image builds is to move that compile stage out on its own and publish the resulting image for reuse.

Each version would then only need to be built once per architecture (to account for platform-dependent code). Any patch applied after that would trigger a new build. The builds would be timestamped.
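As a rough illustration, a standalone builder image could look something like the sketch below. The stage names, paths, and build commands here are hypothetical and not taken from the repository:

```dockerfile
# Hypothetical sketch of a precompiled/hadoop Dockerfile; names and paths
# are illustrative only.
FROM local-image/java-devel AS builder

ARG PRODUCT_VERSION
WORKDIR /build

# Fetch the sources, apply any Stackable patches, then compile once.
# A rebuild is only needed when the version or a patch changes.
RUN curl -fsSL "https://archive.apache.org/dist/hadoop/common/hadoop-${PRODUCT_VERSION}/hadoop-${PRODUCT_VERSION}-src.tar.gz" | tar -xz && \
    cd "hadoop-${PRODUCT_VERSION}-src" && \
    mvn --no-transfer-progress clean package -Pdist -DskipTests

# The published image carries only the build output; dependent product
# images COPY --from it instead of recompiling Hadoop themselves.
FROM scratch AS final
COPY --from=builder /build/hadoop-${PRODUCT_VERSION}-src/hadoop-dist/target /stackable
```

Dependent product images would then pull this published image by tag instead of running the compile stage locally.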

Important

The SDP version is irrelevant to the source build, so the same artifact can be used for 0.0.0-dev images and for the actual SDP release images.
We decided to abuse the sdp-version input of the reusable workflow to store a timestamp, so the source images will be named oci.stackable.tech/precompiled/hadoop:1.2.3-stackable1234567.

  • The timestamp makes it easy to see which build is newer.
  • A second PR can be raised to update images to use the new timestamp.
  • If that becomes a burden, we can explore tooling to make this easier (maybe in boil).
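A minimal sketch of the tag scheme described above; the product version here is an example, and the tag construction is an assumption about how the workflow would format it:

```shell
# Hypothetical illustration of the timestamped tag scheme.
PRODUCT_VERSION="3.3.6"
BUILD_TS="$(date +%s)"  # Unix timestamp taken at build time
IMAGE="oci.stackable.tech/precompiled/hadoop:${PRODUCT_VERSION}-stackable${BUILD_TS}"
echo "${IMAGE}"
```

Because the timestamp is monotonically increasing, comparing two tags immediately shows which build is newer.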

PRs will be raised in pairs. For example:

  1. PR for precompiled/hadoop (or vector, or nifi) so the image is available.
  2. PR to replace the hadoop-builder (or vector, or nifi) stage for the dependent product images.

Groups of changes

We don't have to do it all at once; instead, we can start with the builds that appear to take significant time.

Hadoop

Vector

Note

This will be somewhat quicker since #1462.

  • precompiled/vector: TODO
  • whatever uses vector: TODO

NiFi

  • precompiled/nifi: TODO
  • nifi: TODO

Example of the Part 2 type changes

A sample of the second PR in a pair, switching hadoop over to the precompiled image:

hadoop/boil-config.toml

diff --git a/hadoop/boil-config.toml b/hadoop/boil-config.toml
index 4c6ef2a..a7989da 100644
--- a/hadoop/boil-config.toml
+++ b/hadoop/boil-config.toml
@@ -3,21 +3,27 @@
 
 # Not part of SDP 25.7.0, but still required for hbase, hive, spark-k8s
 [versions."3.3.6".local-images]
-"hadoop/hadoop" = "3.3.6"
 java-base = "11"
 java-devel = "11"
 
 [versions."3.3.6".build-arguments]
+precompiled-hadoop-version = "3.3.6"
+# Find the latest build timestamp here:
+# https://oci.stackable.tech/harbor/projects/52/repositories/hadoop/artifacts-tab
+precompiled_hadoop_ts = "1776412753"
 async-profiler-version = "2.9"
 jmx-exporter-version = "1.3.0"
 hdfs-utils-version = "0.4.0"
 
 [versions."3.4.2".local-images]
-"hadoop/hadoop" = "3.4.2"
 java-base = "11"
 java-devel = "11"
 
 [versions."3.4.2".build-arguments]
+precompiled-hadoop-version = "3.4.2"
+# Find the latest build timestamp here:
+# https://oci.stackable.tech/harbor/projects/52/repositories/hadoop/artifacts-tab
+precompiled_hadoop_ts = "1776412753"
 async-profiler-version = "2.9"
 jmx-exporter-version = "1.3.0"
 hdfs-utils-version = "0.5.0"

hadoop/Dockerfile

diff --git i/hadoop/Dockerfile w/hadoop/Dockerfile
index 2757b4e..a4076d0 100644
--- i/hadoop/Dockerfile
+++ w/hadoop/Dockerfile
@@ -1,7 +1,9 @@
 # syntax=docker/dockerfile:1.16.0@sha256:e2dd261f92e4b763d789984f6eab84be66ab4f5f08052316d8eb8f173593acf7
-# check=error=true
+# check=error=true;skip=InvalidDefaultArgInFrom
 
-FROM local-image/hadoop/hadoop AS hadoop-builder
+ARG PRECOMPILED_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_TS
+FROM oci.stackable.tech/precompiled/hadoop:${PRECOMPILED_HADOOP_VERSION}-stackable${PRECOMPILED_HADOOP_TS} AS hadoop-builder
 
 FROM local-image/java-devel AS hdfs-utils-builder
 
@@ -9,11 +11,9 @@ ARG HDFS_UTILS_VERSION
 ARG PRODUCT_VERSION
 ARG RELEASE_VERSION
 ARG STACKABLE_USER_UID
-ARG HADOOP_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_VERSION
 # Reassign the arg to `HADOOP_VERSION` for better readability.
-# It is passed as `HADOOP_HADOOP_VERSION`, because boil-config.toml has to contain `hadoop/hadoop` to establish a dependency on the Hadoop builder.
-# The value of `hadoop/hadoop` is transformed by `bake` and automatically passed as `HADOOP_HADOOP_VERSION` arg.
-ENV HADOOP_VERSION=${HADOOP_HADOOP_VERSION}
+ENV HADOOP_VERSION=${PRECOMPILED_HADOOP_VERSION}
 
 # Starting with hdfs-utils 0.4.0 we need to use Java 17 for compilation.
 # We can not simply use java-devel with Java 17, as it is also used to compile Hadoop in this
@@ -69,9 +69,9 @@ FROM local-image/java-base AS final
 
 ARG PRODUCT_VERSION
 ARG RELEASE_VERSION
-ARG HADOOP_HADOOP_VERSION
+ARG PRECOMPILED_HADOOP_VERSION
 # Reassign the arg to `HADOOP_VERSION` for better readability.
-ENV HADOOP_VERSION=${HADOOP_HADOOP_VERSION}
+ENV HADOOP_VERSION=${PRECOMPILED_HADOOP_VERSION}
 ARG HDFS_UTILS_VERSION
 ARG STACKABLE_USER_UID
 ARG ASYNC_PROFILER_VERSION
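For clarity, here is a hypothetical sketch of how the build-arguments from boil-config.toml could reach the Dockerfile ARGs; boil/bake normally wires this up automatically, and the command below is only an assumed plain-docker equivalent:

```shell
# Hypothetical: build-arguments forwarded as --build-arg values.
PRECOMPILED_HADOOP_VERSION="3.3.6"
PRECOMPILED_HADOOP_TS="1776412753"
cmd="docker build \
  --build-arg PRECOMPILED_HADOOP_VERSION=${PRECOMPILED_HADOOP_VERSION} \
  --build-arg PRECOMPILED_HADOOP_TS=${PRECOMPILED_HADOOP_TS} \
  hadoop/"
echo "$cmd"
```

Both ARGs are declared before the first FROM so they can be interpolated into the precompiled image reference, which is also why the InvalidDefaultArgInFrom check is skipped.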
