AbsaOSS · miroslavpojer · May 29, 2026 · Jun 1, 2026 · Jun 1, 2026 · Jun 1, 2026
@@ -183,6 +183,7 @@ Each feature is documented separately — click a name below to learn configurat
 | [Service Chapters](docs/features/service_chapters.md)                 | Quality & Warnings        | Surfaces gaps: issues without PRs, unlabeled items, PRs without notes, etc.                                    |
 | [Duplicity Handling](docs/features/duplicity_handling.md)             | Quality & Warnings        | Marks duplicate lines when the same issue appears in multiple chapters.                                        |
 | [Tag Range Selection](docs/features/tag_range.md)                     | Time Range                | Chooses scope via `tag-name`/`from-tag-name`.                                                                  |
+| [Compare Mode](docs/features/compare_mode.md)                         | Time Range                | Graph-based commit selection via `repo.compare()`; correct for parallel maintenance and develop branches.      |
 | [Date Selection](docs/features/date_selection.md)                     | Time Range                | Chooses scope via timestamps (`published-at` vs `created-at`).                                                 |
 | [Custom Row Formats](docs/features/custom_row_formats.md)             | Formatting & Presentation | Controls row templates and placeholders (`{number}`, `{title}`, `{developers}`, …).                            |
 | [Custom Chapters](docs/features/custom_chapters.md)                   | Formatting & Presentation | Maps labels to chapter headings; aggregates multiple labels under one title.                                   |

@@ -0,0 +1,152 @@
+# Feature: Compare Mode
+
+## Purpose
+Ensure release notes for a maintenance branch contain **only** the changes that belong
+to that branch, even when a parallel development line (for example `develop`, tagged
+`v2.7.x`) is active and produces commits in the same time window.
+
+## The Problem Compare Mode Solves
+
+The default (timestamp) approach asks GitHub: *"give me all commits/PRs since time T".*
+That works perfectly when every release is on a single linear history. It breaks the
+moment two release streams run in parallel.
+
+**Concrete example — two active streams:**
+
+```text
+develop:
+*  (tag: v2.7.1) 2026-05-20  Improve Kafka consumer throughput (#1401)
+*  (tag: v2.7.0) 2026-05-14  Fix new service access role (#1363)
+*  2026-05-07  Fix/1346 custom hive table (#1349)
+|
+| maintenance/v2.6.x:
+| *  (tag: v2.6.5) 2026-05-20  Backport: handle empty schema in Hive (#1402)
+| *  (tag: v2.6.4) 2026-05-14  Fix new service access role (#1363)  ← cherry-pick
+| *  (tag: v2.6.3) 2026-05-07  Fix/1346 custom hive table (#1349)   ← cherry-pick
+|/
+*  (tag: v2.6.0) 2026-04-21  Fixes for update-ca-certificates (#1318)
+```
+
+Generating release notes for **`v2.6.5`** (previous: `v2.6.4`):
+
+| Mode | What is fetched | Correct? |
+|---|---|---|
+| **Timestamp** | Everything between 2026-05-14 and 2026-05-20 on *any* branch → `#1363` (v2.7.0) + `#1401` (v2.7.1) + `#1402` | ❌ two develop PRs contaminate the patch notes |
+| **Compare** | Only commits reachable from `v2.6.5` but **not** `v2.6.4` → `#1402` only | ✅ |
+
+---
+
+## How Compare Mode Works
+
+### Activation
+
+Compare mode is active **when `from-tag-name` is explicitly provided**. When it is absent
+the existing timestamp path runs unchanged.
+
+### Step 1 — Graph-based commit selection
+
+Instead of asking "what happened after time T?", the action asks GitHub: *"what commits
+exist in `tag-name` that do not exist in `from-tag-name`?"*
+
+This is a pure graph operation: it follows the commit ancestry graph, not the clock.
+The result is exactly the set of commits unique to the current release, regardless of
+when they were authored or which branch they live on.
+
+### Step 2 — PRs derived from commit messages, not from a time filter
+
+Rather than fetching all closed PRs and filtering by timestamp, compare mode reads the
+PR numbers directly from the commit messages returned in Step 1. Both common merge
+styles are recognised:
+
+- **Squash-merge:** `Fix new service access role (#1363)`
+- **Merge-commit:** `Merge pull request #1363 from org/branch`
+
+Each unique PR number is then fetched individually by number. This means only the PRs
+that actually belong to the release are ever loaded.
+
+Cherry-picks are handled automatically: the commit message on the maintenance branch
+preserves the original PR number, so the right PR is always found even though the
+commit SHA differs from the one on develop.
+
+### Step 3 — Why a PR can have a date before `data.since`
+
+When a commit is cherry-picked, the PR object that gets fetched is the *original* PR,
+the one that was merged onto develop, possibly weeks or months earlier. Its `merged_at`
+date is that develop merge date, which can be earlier than the previous maintenance
+tag's timestamp (`data.since`).
+
+Timestamp mode would silently drop it. Compare mode keeps it, because the commit graph
+(not the clock) is the authority on what belongs in the release.
+
+### Step 4 — `data.since` is still set, but no longer filters PRs or commits
+
+`data.since` is always derived from the previous release's timestamp, in both modes.
+In compare mode it is **not used to filter PRs or commits** — that job is already done
+by the graph in Step 1. It is only used for:
+
+- Fetching recently-updated **issues** (issue filtering is timestamp-based in both modes)
+- Date-gating **release notes extraction** from PR/issue body text
+
+### Step 5 — The filter stage passes PRs and commits through unchanged
+
+`FilterByRelease`, the stage that normally drops PRs and commits older than
+`data.since`, detects compare mode through a non-empty `data.compare_commit_shas` and
+skips that timestamp check entirely. The PR and commit sets arriving from `mine_data`
+are already exact, so further trimming is neither needed nor correct.
+
+Issues are always filtered by timestamp regardless of mode.
+
+---
+
+## Data Flow
+
+```text
+from-tag-name provided?
+        │
+     ┌──┴──────────────────────┐
+    YES (compare mode)         NO (timestamp mode)
+     │                         │
+  GitHub Compare API:          get_commits(since=data.since)
+  commits unique to tag-name   get_pulls(state=closed)
+     │                         │
+  extract PR numbers           FilterByRelease drops
+  from commit messages         PRs/commits before since
+     │
+  fetch each PR by number
+     │
+  FilterByRelease: skip timestamp check, pass PRs/commits through
+```
+
+---
+
+## Configuration
+
+```yaml
+- name: Generate Release Notes
+  uses: AbsaOSS/generate-release-notes@v1
+  env:
+    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+  with:
+    tag-name: "v2.6.5"       # the release being generated
+    from-tag-name: "v2.6.4"  # providing this activates compare mode
+    chapters: |
+      - {"title": "Bugfixes 🛠", "label": "bug"}
+      - {"title": "Features 🎉", "label": "feature"}
+```
+
+> **When to use:** Always supply `from-tag-name` when releasing from a maintenance branch
+> that runs in parallel with a development branch. Omitting it is fine for purely
+> linear release histories.
+
+---
+
+## Related Features
+
+- [Tag Range Selection](./tag_range.md) – explains the user-facing `from-tag-name` input and
+  its interaction with compare mode.
+- [Date Selection](./date_selection.md) – controls whether `created_at` or `published_at`
+  is used as `data.since` (applies in both modes).
+- [Release Notes Extraction](./release_notes_extraction.md) – uses `data.since` for body
+  scanning; unaffected by compare mode.
+
+← [Back to Feature Tutorials](../../README.md#feature-tutorials)
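
The graph-based selection described in Step 1 of the new `compare_mode.md` can be pictured as a set difference over commit ancestry. The following is a minimal sketch under stated assumptions, not the action's code and not GitHub's implementation: the `PARENTS` mapping and the helpers `ancestors` and `unique_commits` are hypothetical, and tag names stand in for commit SHAs from the example history.

```python
# Toy commit graph: commit -> list of parent commits.
# Tag names stand in for SHAs; "dev-1349" is an illustrative name for the
# untagged develop commit in the example history.
PARENTS = {
    "v2.6.0": [],
    "dev-1349": ["v2.6.0"],
    "v2.7.0": ["dev-1349"],
    "v2.7.1": ["v2.7.0"],
    "v2.6.3": ["v2.6.0"],
    "v2.6.4": ["v2.6.3"],
    "v2.6.5": ["v2.6.4"],
}


def ancestors(graph, tip):
    """Return `tip` plus every commit reachable from it through parent links."""
    seen = set()
    stack = [tip]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph[node])
    return seen


def unique_commits(graph, base, head):
    """Commits reachable from `head` that are not reachable from `base`."""
    return ancestors(graph, head) - ancestors(graph, base)


patch_release = unique_commits(PARENTS, "v2.6.4", "v2.6.5")
whole_stream = unique_commits(PARENTS, "v2.6.0", "v2.6.5")
print(sorted(patch_release))
print(sorted(whole_stream))
```

On this toy graph, comparing `v2.6.4` with `v2.6.5` yields only the `v2.6.5` commit. The develop commits never enter the result because they are not ancestors of `v2.6.5`, whatever their dates are.
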
@@ -38,6 +38,7 @@ https://github.com/org/repo/compare/v1.5.0...v1.6.0
 (The compare URL reflects both `from-tag-name` and `tag-name`.)
 
 ## Related Features
+- [Compare Mode](./compare_mode.md) – activated automatically when `from-tag-name` is set; explains why graph-based selection is needed for branching histories and how it works internally.
 - [Date Selection](./date_selection.md) – defines which timestamp of the previous release becomes the cutoff.
 - [Service Chapters](./service_chapters.md) – uses the same time window to assess gaps.
 - [Release Notes Extraction](./release_notes_extraction.md) – only processes PRs/issues within the computed window.

@@ -68,34 +68,47 @@ def filter(self, data: MinedData) -> MinedData:
         md = MinedData(data.home_repository)
         md.release = data.release
         md.since = data.since
+        md.compare_commit_shas = data.compare_commit_shas
 
         if data.release is not None:
             logger.info("Starting issue, prs and commit reduction by the latest release since time.")
 
             issues_dict = self._filter_issues(data)
             logger.debug("Count of issues reduced from %d to %d", len(data.issues), len(issues_dict))
 
-            # filter out merged PRs and commits before the date
-            pulls_seen: set[int] = set()
-            pulls_dict: dict[PullRequest, Repository] = {}
-            for pull, repo in data.pull_requests.items():
-                if data.since and (
-                    (pull.merged_at and pull.merged_at >= data.since)
-                    or (pull.closed_at and pull.closed_at >= data.since)
-                ):
-                    if pull.number not in pulls_seen:
-                        pulls_seen.add(pull.number)
-                        pulls_dict[pull] = repo
-            logger.debug(
-                "Count of pulls reduced from %d to %d", len(data.pull_requests.items()), len(pulls_dict.items())
-            )
-
-            commits_dict = {
-                commit: repo
-                for commit, repo in data.commits.items()
-                if data.since and commit.commit.author.date > data.since
-            }
-            logger.debug("Count of commits reduced from %d to %d", len(data.commits.items()), len(commits_dict.items()))
+            if data.compare_commit_shas:
+                # compare mode: PR and commit sets are already exact — pass through unchanged
+                pulls_dict = dict(data.pull_requests)
+                commits_dict = dict(data.commits)
+                logger.debug("Compare mode: skipping PR/commit timestamp filter.")
+            else:
+                # filter out merged PRs and commits before the date
+                pulls_seen: set[int] = set()
+                pulls_dict = {}
+                for pull, repo in data.pull_requests.items():
+                    if data.since and (
+                        (pull.merged_at and pull.merged_at >= data.since)
+                        or (pull.closed_at and pull.closed_at >= data.since)
+                    ):
+                        if pull.number not in pulls_seen:
+                            pulls_seen.add(pull.number)
+                            pulls_dict[pull] = repo
+                logger.debug(
+                    "Count of pulls reduced from %d to %d",
+                    len(data.pull_requests.items()),
+                    len(pulls_dict.items()),
+                )
+
+                commits_dict = {
+                    commit: repo
+                    for commit, repo in data.commits.items()
+                    if data.since and commit.commit.author.date > data.since
+                }
+                logger.debug(
+                    "Count of commits reduced from %d to %d",
+                    len(data.commits.items()),
+                    len(commits_dict.items()),
+                )
 
             md.issues = issues_dict
             md.pull_requests = pulls_dict
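
To make the effect of the new branch in `filter` easier to see, here is a simplified, self-contained sketch of the same decision. It is an illustration only: `ToyPull` and `select_pulls` are hypothetical names, the real code also checks `closed_at` and de-duplicates by PR number, and PR `1300` with its dates is invented to represent a cherry-picked change whose original PR was merged before `data.since`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ToyPull:
    """Hypothetical stand-in for a pull request object."""
    number: int
    merged_at: Optional[datetime]


def select_pulls(pulls, since, compare_commit_shas):
    """Mimic the filter decision: pass through in compare mode, else gate by date."""
    if compare_commit_shas:
        # Compare mode: the PR set was derived from the commit graph, keep it as is.
        return list(pulls)
    # Timestamp mode (simplified): keep only PRs merged at or after `since`.
    return [p for p in pulls if since and p.merged_at and p.merged_at >= since]


since = datetime(2026, 5, 14, tzinfo=timezone.utc)  # assumed timestamp of the previous tag
pulls = [
    ToyPull(1300, datetime(2026, 4, 30, tzinfo=timezone.utc)),  # invented cherry-picked PR
    ToyPull(1402, datetime(2026, 5, 20, tzinfo=timezone.utc)),
]

timestamp_mode = select_pulls(pulls, since, set())
compare_mode = select_pulls(pulls, since, {"abc123"})  # any non-empty SHA set
print([p.number for p in timestamp_mode])
print([p.number for p in compare_mode])
```

In the timestamp path the older PR is dropped because its merge date precedes `since`; in the compare path both PRs survive, which matches the behaviour documented in Step 3 and Step 5.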

@@ -19,6 +19,7 @@
 """
 
 import logging
+import re
 import sys
 import traceback
 from concurrent.futures import ThreadPoolExecutor, as_completed, CancelledError
@@ -30,16 +31,21 @@
 from github.Issue import Issue
 from github.PullRequest import PullRequest
 from github.Repository import Repository
+from github.Commit import Commit as GithubCommit
 
 from release_notes_generator.action_inputs import ActionInputs
 from release_notes_generator.data.utils.bulk_sub_issue_collector import BulkSubIssueCollector
+
 from release_notes_generator.model.record.issue_record import IssueRecord
 from release_notes_generator.model.mined_data import MinedData
 from release_notes_generator.model.record.pull_request_record import PullRequestRecord
 from release_notes_generator.utils.decorators import safe_call_decorator
 from release_notes_generator.utils.github_rate_limiter import GithubRateLimiter
 from release_notes_generator.utils.record_utils import get_id, parse_issue_id
 
+_PR_NUMBER_RE = re.compile(r"\(#(\d+)\)|Merge pull request #(\d+)")
+_COMPARE_COMMITS_MAX_RESULTS = 10_000
+
 logger = logging.getLogger(__name__)
 
 
@@ -66,16 +72,55 @@ def mine_data(self) -> MinedData:
 
         self._get_issues(data)
 
-        # pulls and commits, and then reduce them by the latest release since time
-        pull_requests = list(
-            self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED, base=repo.default_branch)
-        )
-        data.pull_requests = {pr: data.home_repository for pr in pull_requests}
-        if data.since:
-            commits = list(self._safe_call(repo.get_commits)(since=data.since))
+        if ActionInputs.is_from_tag_name_defined():
+            logger.info(
+                "Compare mode: using repo.compare('%s', '%s').",
+                ActionInputs.get_from_tag_name(),
+                ActionInputs.get_tag_name(),
+            )
+            comparison = self._safe_call(repo.compare)(ActionInputs.get_from_tag_name(), ActionInputs.get_tag_name())
+            if comparison is None:
+                logger.error(
+                    "Compare API returned no result for '%s'...'%s'. Ending!",
+                    ActionInputs.get_from_tag_name(),
+                    ActionInputs.get_tag_name(),
+                )
+                sys.exit(1)
+            compare_commits: list[GithubCommit] = list(comparison.commits)
+            total_commits = getattr(comparison, "total_commits", None)
+            if isinstance(total_commits, int) and total_commits > len(compare_commits):
+                logger.warning(
+                    "Compare mode: retrieved %d commit(s) but comparison reports %d total; results may be truncated.",
+                    len(compare_commits),
+                    total_commits,
+                )
+            elif len(compare_commits) >= _COMPARE_COMMITS_MAX_RESULTS:
+                logger.warning(
+                    "Compare mode: retrieved %d commit(s); comparison ranges over %d commits may be truncated.",
+                    len(compare_commits),
+                    _COMPARE_COMMITS_MAX_RESULTS,
+                )
+            data.compare_commit_shas = {c.sha for c in compare_commits}
+            data.commits = {c: data.home_repository for c in compare_commits}
+            pr_numbers = self._extract_pr_numbers_from_commits(compare_commits)
+            pulls: dict[PullRequest, Repository] = {}
+            for number in sorted(pr_numbers):
+                pr = self._safe_call(repo.get_pull)(number)
+                if pr is not None:
+                    pulls[pr] = data.home_repository
+            data.pull_requests = pulls
+            logger.info("Compare mode: found %d commit(s), %d PR(s).", len(compare_commits), len(data.pull_requests))
         else:
-            commits = list(self._safe_call(repo.get_commits)())
-        data.commits = {c: data.home_repository for c in commits}
+            # pulls and commits, then reduce them by the latest release since time
+            pull_requests = list(
+                self._safe_call(repo.get_pulls)(state=PullRequestRecord.PR_STATE_CLOSED, base=repo.default_branch)
+            )
+            data.pull_requests = {pr: data.home_repository for pr in pull_requests}
+            if data.since:
+                commits = list(self._safe_call(repo.get_commits)(since=data.since))
+            else:
+                commits = list(self._safe_call(repo.get_commits)())
+            data.commits = {c: data.home_repository for c in commits}
 
         logger.info("Initial data mining from GitHub completed.")
 
@@ -423,6 +468,27 @@ def __get_latest_semantic_release(releases) -> Optional[GitRelease]:
 
         return rls
 
+    @staticmethod
+    def _extract_pr_numbers_from_commits(commits: list[GithubCommit]) -> set[int]:
+        """
+        Extract unique PR numbers from commit messages.
+
+        Note: Only the first line (subject) of each commit message is scanned to avoid matching
+        references in the commit body.
+
+        Parameters:
+            commits: Commit objects whose messages are scanned.
+        Returns:
+            set[int]: Unique PR numbers found across all messages.
+        """
+        pr_numbers: set[int] = set()
+        for commit in commits:
+            subject = commit.commit.message.splitlines()[0] if commit.commit.message else ""
+            for match in _PR_NUMBER_RE.finditer(subject):
+                number_str = match.group(1) or match.group(2)
+                pr_numbers.add(int(number_str))
+        return pr_numbers
+
     @staticmethod
     def __filter_duplicated_issues(data: MinedData) -> "MinedData":
         """

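The subject-line scan in `_extract_pr_numbers_from_commits` can be tried in isolation. The sketch below reuses the regular expression from the diff but works on plain strings instead of PyGithub commit objects; `extract_pr_numbers` and the sample messages are assumptions made for illustration, not part of the change.

```python
import re

# Same pattern as `_PR_NUMBER_RE` in the diff: squash-merge or merge-commit subjects.
PR_NUMBER_RE = re.compile(r"\(#(\d+)\)|Merge pull request #(\d+)")


def extract_pr_numbers(messages):
    """Collect unique PR numbers from the first line of each commit message."""
    numbers = set()
    for message in messages:
        # Only the subject line is scanned, so references in the body are ignored.
        subject = message.splitlines()[0] if message else ""
        for match in PR_NUMBER_RE.finditer(subject):
            numbers.add(int(match.group(1) or match.group(2)))
    return numbers


messages = [
    "Fix new service access role (#1363)",                          # squash-merge style
    "Merge pull request #1402 from org/branch\n\nSee also (#9999)",  # body reference ignored
    "Bump version",                                                 # no PR reference
    "",                                                             # empty message
]
found = extract_pr_numbers(messages)
print(sorted(found))
```

One consequence of using `finditer` on the subject: a subject that quotes another PR, such as `Revert "Fix X (#1363)" (#1410)`, contributes both numbers. Whether that is desirable depends on how revert commits should appear in the notes.
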
@@ -47,6 +47,7 @@ def __init__(self, repository: Repository):
         self.issues: dict[Issue, Repository] = {}
         self.pull_requests: dict[PullRequest, Repository] = {}
         self.commits: dict[Commit, Repository] = {}
+        self.compare_commit_shas: set[str] = set()
 
         self.parents_sub_issues: dict[str, list[str]] = {}  # parent issue id -> list of its sub-issues ids
         # dictionary of fetched cross issues and their pull requests