Skip to content

Commit eeb87de

Browse files
authored
Fix selectSparkVersion() crash on non-SemVer runtime keys (#832)
## Summary `WorkspaceClient.clusters().selectSparkVersion(...)` throws `IllegalArgumentException: Not a valid SemVer: ...` when the Spark versions API returns a runtime key that is not a valid SemVer. This makes the sort that picks the latest runtime resilient to such keys, mirroring the behavior of the Go SDK. ## Why `selectSparkVersion(latest)` gathers the matching runtime keys and sorts them to pick the newest: ```java versions.sort((v1, v2) -> SemVer.parse(v2).compareTo(SemVer.parse(v1))); ``` `SemVer.parse` throws `IllegalArgumentException` on any string it cannot parse, and because that happens *inside* the sort comparator, a single unparseable key aborts the entire selection. The clusters API recently started returning the key `v18.x-scala2.13` — two version segments (`18.x`) plus a leading `v` — which the SemVer regex (it expects `major.minor.patch`) rejects. As a result, every caller of `selectSparkVersion(latest)` in such a workspace fails, and the `ClustersIT.latestRuntime` integration test started failing across all clouds. The Go SDK does not have this problem: its comparator uses `golang.org/x/mod/semver.Compare`, which is total and never throws — invalid versions simply sort lowest and are effectively ignored when picking "latest". This PR brings the Java behavior in line, which also keeps the two SDKs consistent and makes selection robust to any future malformed key, not just this specific shape. ## What changed ### Interface changes - **`SemVer.parseOrNull(String)`** — parses a version string, returning `null` instead of throwing when the input is not a valid SemVer. ### Behavioral changes - `selectSparkVersion(latest)` no longer throws when the API returns a non-SemVer runtime key. Unparseable keys are ranked lowest and the latest parseable runtime is returned, matching the Go SDK. (Previously every such call threw `IllegalArgumentException`.) ### Internal changes - The version sort in `ClustersExt.selectSparkVersion` now uses a null-safe comparator (`compareSparkVersionsDescending`) built on `SemVer.parseOrNull`. ## How is this tested? Unit tests (run via `mvn`): - `SemVerTest` — `parseOrNull` returns `null` for the `v18.x-scala2.13` shape, malformed input, `null`, and empty string, and still parses valid versions. - `ClustersExtTest` — new regression test feeds a versions list containing `v18.x-scala2.13` and asserts `selectSparkVersion(latest)` does not throw and returns the latest parseable runtime (`15.4.x-scala2.12`). This test fails on the pre-fix code. NO_CHANGELOG=true
1 parent bb48b1e commit eeb87de

4 files changed

Lines changed: 82 additions & 1 deletion

File tree

databricks-sdk-java/src/main/java/com/databricks/sdk/mixin/ClustersExt.java

Lines changed: 22 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -63,11 +63,32 @@ public String selectSparkVersion(SparkVersionSelector selector) throws IllegalAr
6363
if (!selector.latest) {
6464
throw new IllegalArgumentException("spark versions query returned multiple results");
6565
}
66-
versions.sort((v1, v2) -> SemVer.parse(v2).compareTo(SemVer.parse(v1)));
66+
versions.sort(ClustersExt::compareSparkVersionsDescending);
6767
}
6868
return versions.get(0);
6969
}
7070

71+
/**
72+
* Compares two Spark runtime keys so that the latest version sorts first. Mirrors the
73+
* databricks-sdk-go behavior (golang.org/x/mod/semver.Compare), where keys that are not valid
74+
* SemVer (for example "v18.x-scala2.13") are treated as lowest priority instead of throwing. This
75+
* ensures a single unparseable runtime key returned by the API cannot break version selection.
76+
*/
77+
private static int compareSparkVersionsDescending(String v1, String v2) {
78+
SemVer s1 = SemVer.parseOrNull(v1);
79+
SemVer s2 = SemVer.parseOrNull(v2);
80+
if (s1 == null && s2 == null) {
81+
return 0;
82+
}
83+
if (s1 == null) {
84+
return 1; // v1 is unparseable: sort it after v2
85+
}
86+
if (s2 == null) {
87+
return -1; // v2 is unparseable: keep v1 before v2
88+
}
89+
return s2.compareTo(s1); // descending order: latest first
90+
}
91+
7192
public String selectNodeType(NodeTypeSelector selector) {
7293
// Logic ported from
7394
// https://github.com/databricks/databricks-sdk-go/blob/main/service/clusters/node_type.go

databricks-sdk-java/src/main/java/com/databricks/sdk/mixin/SemVer.java

Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -47,6 +47,19 @@ public static SemVer parse(String v) {
4747
m.group("build"));
4848
}
4949

50+
/**
51+
* Parses the given version string, returning {@code null} instead of throwing when it is not a
52+
* valid SemVer. Useful when sorting collections that may contain non-SemVer values (for example
53+
* Spark runtime keys such as "v18.x-scala2.13").
54+
*/
55+
public static SemVer parseOrNull(String v) {
56+
try {
57+
return parse(v);
58+
} catch (IllegalArgumentException e) {
59+
return null;
60+
}
61+
}
62+
5063
@Override
5164
public int compareTo(SemVer other) {
5265
if (other == null) {

databricks-sdk-java/src/test/java/com/databricks/sdk/mixin/ClustersExtTest.java

Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -143,4 +143,34 @@ void sparkVersionWithSparkVersionParameter() {
143143
clustersExt.selectSparkVersion(new SparkVersionSelector().withSparkVersion("3.4.1"));
144144
assertEquals("13.3.x-scala2.12", sparkVersion);
145145
}
146+
147+
private GetSparkVersionsResponse testGetSparkVersionsWithUnparseableKey() {
148+
Collection<SparkVersion> versions = new ArrayList<>();
149+
versions.add(
150+
new SparkVersion()
151+
.setName("14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12)")
152+
.setKey("14.3.x-scala2.12"));
153+
versions.add(
154+
new SparkVersion()
155+
.setName("15.4 LTS (includes Apache Spark 3.5.0, Scala 2.12)")
156+
.setKey("15.4.x-scala2.12"));
157+
// Non-SemVer runtime key returned by the API. Sorting this with SemVer.parse() previously threw
158+
// "Not a valid SemVer: v18.x-scala2.13" and broke selection for every caller.
159+
versions.add(
160+
new SparkVersion()
161+
.setName("18.x (includes Apache Spark 4.0.0, Scala 2.13)")
162+
.setKey("v18.x-scala2.13"));
163+
return new GetSparkVersionsResponse().setVersions(versions);
164+
}
165+
166+
@Test
167+
void selectLatestSparkVersionIgnoresUnparseableKey() {
168+
ClustersExt clustersExt = new ClustersExt(clustersMock);
169+
Mockito.doReturn(testGetSparkVersionsWithUnparseableKey()).when(clustersMock).sparkVersions();
170+
171+
// Must not throw on the non-SemVer key, and must return the latest parseable runtime - matching
172+
// databricks-sdk-go, which ranks unparseable keys lowest rather than failing.
173+
String sparkVersion = clustersExt.selectSparkVersion(new SparkVersionSelector().withLatest());
174+
assertEquals("15.4.x-scala2.12", sparkVersion);
175+
}
146176
}

databricks-sdk-java/src/test/java/com/databricks/sdk/mixin/SemVerTest.java

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -37,4 +37,21 @@ void parseTest() {
3737
int compareResult = parsedSemVer.compareTo(expectedSemVer);
3838
assertEquals(0, compareResult);
3939
}
40+
41+
@Test
42+
void parseOrNullReturnsNullForInvalid() {
43+
// Spark runtime key shape that crashed selectSparkVersion(): only two version segments and a
44+
// leading "v", which is not a valid SemVer.
45+
assertNull(SemVer.parseOrNull("v18.x-scala2.13"));
46+
assertNull(SemVer.parseOrNull("not-a-version"));
47+
assertNull(SemVer.parseOrNull(null));
48+
assertNull(SemVer.parseOrNull(""));
49+
}
50+
51+
@Test
52+
void parseOrNullParsesValid() {
53+
SemVer parsed = SemVer.parseOrNull("v1.2.3-alpha+build-20230510");
54+
assertNotNull(parsed);
55+
assertEquals(0, parsed.compareTo(new SemVer(1, 2, 3, "alpha", "build-20230510")));
56+
}
4057
}

0 commit comments

Comments
 (0)