[SPARK-56865][INFRA] Suggest a minimal Fix Versions set when resolving JIRA (#55878)
cloud-fan wants to merge 2 commits
Conversation
…ranch-M.x is also being merged

When the merge script targets master plus a branch-M.x integration dev branch, master's N.0.0 entry is redundant: branch-M.x -> master is a merge-forward chain, so master's next release inherits anything that lands on branch-M.x. The prior `_dedupe_fix_versions_adjacent_major_zero` rule also wrongly dropped the branch-M.x entry whenever the corresponding branch-M.N release branch was in the merge set, because both versions ended up in the defaults list. Replace it with a rule that skips master's contribution only when a branch-M.x is present, and let `dict.fromkeys` handle exact duplicates.
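The rule this commit describes can be sketched as follows. This is a hypothetical simplified illustration, not the actual `dev/merge_spark_pr.py` code; the function name `suggest_fix_versions` and its `(merge_target, fix_version)` pair shape are assumptions made for the example.

```python
# Hypothetical illustration of the rule described above; the real merge
# script's function names and data shapes may differ.
def suggest_fix_versions(candidates):
    """candidates: list of (merge_target, fix_version) pairs in merge order."""
    has_major_x = any(target.endswith(".x") for target, _ in candidates)
    kept = [
        version
        for target, version in candidates
        # Skip master's N.0.0 only when a branch-M.x is also being merged:
        # the next master release inherits whatever lands on branch-M.x.
        if not (target == "master" and has_major_x)
    ]
    # dict.fromkeys drops exact duplicates while keeping first-seen order.
    return list(dict.fromkeys(kept))
```

For example, `suggest_fix_versions([("master", "5.0.0"), ("branch-4.x", "4.3.0")])` yields `["4.3.0"]`, while a master-only merge keeps `["5.0.0"]`.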
To @cloud-fan, I agree with the purpose and functionality of this PR.
However, let's not confuse the community via this PR, because the Apache Spark community's backporting policy has never been a merge-forward chain. Of course, we allow some exceptions, but that's not the general policy.
branch-M.x -> master is a merge-forward chain
If we want to choose a terminology, Apache Spark's backporting policy has always been an Upstream-First policy, to prevent regressions.
@dongjoon-hyun do you have a concrete proposal? Look at the table, when we merge a PR to … Another idea is to be verbose: merged to 4.x means 4.3, merged to master means 5.0. The case above will become …
@cloud-fan, what I am suggesting is simply revising your wording in the phrase,
@dongjoon-hyun Your suggestion makes me think about what the rule for deciding the released versions should be. In the past, when we merged a PR to master only, the released version was the next feature release version (e.g. …). So the principle is to use a minimal set of released versions to tell users which Spark versions contain the commit. That being said:
The last 2 rows are mostly for completeness; they should not happen in practice. What do you think of this principle? No backport or forward-merge wording, just explain …
To @cloud-fan, sorry, but it sounds like a generated answer? Are you reading my comments?
To be clear,
@dongjoon-hyun yes, and the previous comment was 100% hand-crafted. I got your point about changing the wording of the PR description, but I want to go deeper and clarify what "released versions" means. The last comment actually gives a different proposal for how to pick the released versions than what this PR currently implements.
Specifically, for the following: if you want to suggest anything new to the community about the released versions, we need to discuss it on dev@spark first. Otherwise, we stick to the existing policy only. It's not something we can decide as two PMC members.
As you know, this PR's code simply mimics existing PMC members' behavior. I also dropped …
…am-First framing

Extends the Fix Version default to also drop branch-M.x's minor.0 when a sibling branch-M.N contributes M.N.0 (an unreleased M.N.0 means branch-M.N has not diverged from branch-M.x yet, so M.N.0 alone covers both lineages). Reframes the comments from "merge-forward chain" to the Apache Spark "Upstream-First" backporting policy.

Co-authored-by: Isaac
OK, new proposal implemented. The principle is actually the same as in the old days, just adapted to the new … Let me know if this needs a dev list discussion, or whether it's better to drop it to be safe.
The PR description is updated as well; hopefully it explains the philosophy of this proposed change.
cc @peter-toth, too.
What changes were proposed in this pull request?
When `dev/merge_spark_pr.py` resolves a JIRA after merging a PR to multiple branches, it computes a default Fix Versions list from the merge targets. This PR rewrites that computation to produce the minimal set of Fix Versions that tells users which Spark releases will contain the commit, leveraging the Apache Spark Upstream-First backporting policy (cherry-picks flow master -> branch-M.x -> branch-M.N).

Each merge target contributes a candidate version:
- `master` -> the greatest unreleased `N.0.0`
- `branch-M.x` -> that major's greatest unreleased `M.minor.0`
- `branch-M.N` -> its greatest unreleased `M.N.patch`

Two suppression rules then trim redundancy:
- Drop master's `N.0.0` when any `branch-M.x` is in the merge set: a cherry-pick to `branch-M.x` has already landed on master, so the next master release is implied.
- Drop `branch-M.x`'s `minor.0` when a sibling `branch-M.N` contributes `M.N.0`: an unreleased `M.N.0` means `branch-M.N` has not diverged from `branch-M.x` yet, so `M.N.0` alone covers both lineages.

Sanity check against the current unreleased SPARK versions (`5.0.0`, `4.3.0`, `4.2.0`, `4.1.2`, `4.0.3`, `3.5.9`) plus the hypothetical "4.2.0 has released" scenarios raised in the review discussion:

| Merge targets | Old default | New default |
| --- | --- | --- |
| `master` | `[5.0.0]` | `[5.0.0]` |
| `master`, `branch-4.x` | `[5.0.0, 4.3.0]` | `[4.3.0]` |
| `master`, `branch-4.x`, `branch-4.2` (4.2.0 unreleased) | `[5.0.0, 4.2.0]` | `[4.2.0]` |
| `master`, `branch-4.x`, `branch-4.2` (4.2.0 released) | `[5.0.0, 4.2.1]` | `[4.3.0, 4.2.1]` |
| `master`, `branch-4.2` (4.2.0 unreleased) | `[5.0.0, 4.2.0]` | `[5.0.0, 4.2.0]` |
| `master`, `branch-4.2` (4.2.0 released) | `[5.0.0, 4.2.1]` | `[5.0.0, 4.2.1]` |

The rules are parameterized on `M` and `N`, so they generalize to any major/minor pair as the project moves forward (e.g. post-5.0.0 `master + branch-5.x + branch-5.1` with `5.1.0` unreleased -> `[5.1.0]`; double-digit minors like `branch-4.11` work the same way).

Why are the changes needed?
The previous defaults forced committers to manually edit Fix Versions in multiple common cases:
- `master + branch-M.x`: strip the extra `N.0.0`.
- `master + branch-M.x + branch-M.N` (just-cut `branch-M.N`): the change becomes `[M.N.0]` rather than `[5.0.0, M.N.0]` or `[M.minor.0, M.N.0]`.

Both are now produced as defaults that match the convention Spark committers apply by hand.
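The candidate selection and both suppression rules described above can be sketched end to end. This is a simplified, hypothetical `default_fix_versions`, not the actual `compute_merge_default_fix_versions` in `dev/merge_spark_pr.py`; it only assumes the branch-naming scheme (`master`, `branch-M.x`, `branch-M.N`) stated in this description.

```python
# Simplified sketch of the candidate + suppression scheme described above.
# `default_fix_versions` is hypothetical; the real script differs in detail.

def _parse(v):
    return tuple(int(p) for p in v.split("."))

def default_fix_versions(targets, unreleased):
    vers = [_parse(v) for v in unreleased]
    candidates = []  # (merge_target, (major, minor, patch))
    for t in targets:
        if t == "master":
            # greatest unreleased N.0.0
            pick = max(v for v in vers if v[1:] == (0, 0))
        elif t.endswith(".x"):
            # that major's greatest unreleased M.minor.0
            major = int(t.split("-")[1].split(".")[0])
            pick = max(v for v in vers if v[0] == major and v[2] == 0)
        else:
            # branch-M.N: its greatest unreleased M.N.patch
            major, minor = (int(p) for p in t.split("-")[1].split("."))
            pick = max(v for v in vers if v[:2] == (major, minor))
        candidates.append((t, pick))

    has_major_x = any(t.endswith(".x") for t, _ in candidates)
    # (M, N) pairs where a branch-M.N contributes an unreleased M.N.0
    mn_zero = {
        v[:2]
        for t, v in candidates
        if t != "master" and not t.endswith(".x") and v[2] == 0
    }
    kept = []
    for t, v in candidates:
        if t == "master" and has_major_x:
            continue  # rule 1: next master release is implied
        if t.endswith(".x") and any(v[0] == m for m, _ in mn_zero):
            continue  # rule 2: M.N.0 covers branch-M.x too
        kept.append(".".join(map(str, v)))
    return list(dict.fromkeys(kept))
```

With the current unreleased set, `default_fix_versions(["master", "branch-4.x", "branch-4.2"], ["5.0.0", "4.3.0", "4.2.0", "4.1.2", "4.0.3", "3.5.9"])` matches the table's minimal defaults.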
Does this PR introduce any user-facing change?
No. The committer prompt still lets the user override the defaults, so any committer who prefers the old behavior can type it in. Only the suggested default changes.
How was this patch tested?
- `python3 -m doctest dev/merge_spark_pr.py -v` -- 34 doctests pass, including 16 cases in `compute_merge_default_fix_versions` covering all six rows in the table plus the existing scenarios.
- Cases with double-digit minors (`branch-4.11`) confirm the rules generalize across `M` and `N`.
Generated-by: Claude Code (Claude Opus 4.7)