feat(gitutils): add parse_vcs_url for pip VCS URL parsing #1215

Open
tiran wants to merge 1 commit into python-wheel-build:main from tiran:parse-vcs-url
tiran (Collaborator) commented Jun 24, 2026

Pull Request Description

What

Add parse_vcs_url to parse pip VCS URLs (git+https, git+ssh) into a repo clone URL and git ref. Use it in bootstrapper and sources to replace duplicated manual git+ URL parsing logic.

Why

Replace ad-hoc implementations of pip VCS url parsing with a single, well-designed, and reusable function.

Add `parse_vcs_url` to parse pip VCS URLs (git+https, git+ssh) into a
repo clone URL and git ref. Use it in `bootstrapper` and `sources` to
replace duplicated manual git+ URL parsing logic.

Co-Authored-By: Claude <claude@anthropic.com>
Signed-off-by: Christian Heimes <cheimes@redhat.com>

tiran requested a review from a team as a code owner on June 24, 2026 11:15

coderabbitai Bot commented Jun 24, 2026

📝 Walkthrough

gitutils.py gains a GIT_HEAD constant ("HEAD") and a new parse_vcs_url function that parses git+https/git+ssh pip-style VCS URLs into a plain clone URL and a git ref, with validation for unsupported schemes and missing/empty refs. git_clone_fast's ref default is updated to use GIT_HEAD. bootstrapper.py and sources.py remove their duplicated urlparse-based @ref extraction logic and replace it with calls to parse_vcs_url. Tests are updated to reflect the new error message wording and the normalized (prefix-stripped) clone URL.
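
The parsing behaviour described in the walkthrough can be sketched roughly as follows. This is not the code from the PR: the name `parse_vcs_url_sketch`, the use of `ValueError`, the error messages, and the decision to reject a URL without `@ref` are assumptions made for illustration. Only the supported `git+https`/`git+ssh` schemes, the prefix-stripped clone URL, and the dropped fragment come from the discussion above.

```python
from urllib.parse import urlsplit, urlunsplit

# Schemes named in the PR description; the set name itself is hypothetical.
SUPPORTED_SCHEMES = {"git+https", "git+ssh"}


def parse_vcs_url_sketch(url: str) -> tuple[str, str]:
    """Split a pip-style VCS URL into (clone_url, ref).

    Illustrative only; the real parse_vcs_url may differ in details.
    """
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported VCS URL scheme: {parts.scheme!r}")

    # Look for '@' only in the path, so the user part of the netloc
    # ('git@git.test') is never mistaken for the ref separator.
    path, sep, ref = parts.path.rpartition("@")
    if not sep or not ref:
        # Assumption: a missing or empty ref is treated as an error.
        raise ValueError(f"VCS URL has no git ref: {url!r}")

    # Strip the 'git+' prefix and drop the fragment (e.g. '#subdirectory=...').
    clone_scheme = parts.scheme.removeprefix("git+")
    clone_url = urlunsplit((clone_scheme, parts.netloc, path, parts.query, ""))
    return clone_url, ref
```

Under these assumptions, `parse_vcs_url_sketch("git+ssh://git@git.test/org/project.git@abc123")` yields the clone URL `ssh://git@git.test/org/project.git` and the ref `abc123`, matching the expectations in the test suggested later in this review.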

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 passed

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly describes the main change: adding parse_vcs_url for pip VCS URL parsing. |
| Description check | ✅ Passed | The description matches the changeset by describing the new parser and its use in bootstrapper and sources. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@mergify mergify Bot added the ci label Jun 24, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tests/test_gitutils.py (1)

79-93: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win

Cover #subdirectory= fragments in the parser tests.

parse_vcs_url() explicitly drops URL fragments before returning the clone URL, but this suite never exercises a pip-style #subdirectory= URL. That leaves a common VCS form unprotected against regressions in both callers.

Suggested test addition
 def test_parse_vcs_url() -> None:
     assert parse_vcs_url("git+https://git.test/org/project.git@v1.0") == (
         "https://git.test/org/project.git",
         "v1.0",
     )
+    assert parse_vcs_url(
+        "git+https://git.test/org/project.git@v1.0#subdirectory=src"
+    ) == (
+        "https://git.test/org/project.git",
+        "v1.0",
+    )
     # '@' in netloc must not be confused with the ref '@'
     assert parse_vcs_url("git+ssh://git@git.test/org/project.git@abc123") == (
         "ssh://git@git.test/org/project.git",
         "abc123",
     )

As per path instructions, "tests/**: Verify test actually tests the intended behavior. Check for missing edge cases."
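
For context on why the fragment and the `@` in the netloc need separate handling, this is how Python's standard `urllib.parse.urlsplit` decomposes such a URL. The snippet shows standard-library behaviour only; whether `parse_vcs_url` uses `urlsplit` internally is not shown in this review, and the example URL is made up for illustration.

```python
from urllib.parse import urlsplit

# Illustrative URL combining a user in the netloc, a ref, and a fragment.
parts = urlsplit("git+ssh://git@git.test/org/project.git@abc123#subdirectory=src")

print(parts.scheme)    # git+ssh
print(parts.netloc)    # git@git.test
print(parts.path)      # /org/project.git@abc123
print(parts.fragment)  # subdirectory=src
```

The ref separator appears only in `path`, and the `#subdirectory=` part is isolated in `fragment`, so a parser that works on these components can drop the fragment without touching the clone URL or the ref.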

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_gitutils.py` around lines 79-93, add a test case in
test_parse_vcs_url() to cover a pip-style VCS URL with a `#subdirectory=`
fragment, and assert that parse_vcs_url() still returns the clone URL plus
the correct ref while dropping the fragment. Use the existing parse_vcs_url and
GIT_HEAD patterns in this test module so the new assertion verifies the fragment
is ignored without changing the returned clone URL/ref behavior.

Source: Path instructions

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 41e8cbc8-5fc4-450a-a86b-b0796f44c7ae

📥 Commits

Reviewing files that changed from the base of the PR and between 694f04f and 0f0e286.

📒 Files selected for processing (6)
  • src/fromager/bootstrapper.py
  • src/fromager/gitutils.py
  • src/fromager/sources.py
  • tests/test_bootstrap.py
  • tests/test_gitutils.py
  • tests/test_sources.py
