fix: resolve safetensors shard index prefix splitting #44
…er, add error tests
Up to standards ✅🟢 Issues
Pull Request Overview
While this PR successfully addresses the SafeTensors sharding issue for hyphenated model names, it introduces substantial scope creep by adding remote GGUF inspection and comparison UI views.
A critical concern is that src/modelinfo/parsers/huggingface.py has seen a massive increase in cyclomatic complexity (+41) without adequate test coverage. Furthermore, authentication logic for accessing gated or private models on the Hugging Face Hub is inconsistently implemented and missing from the new remote streaming requests, which will result in 401 errors in production. The regex fix for SafeTensors shards is also too restrictive and should be generalized to support different padding lengths. Codacy indicates the project remains up to standards, but the high complexity in the parser module and duplicated mock logic in tests should be addressed before merging.
About this PR
- Authentication logic for the Hugging Face Hub is currently duplicated and missing in several new network request paths. This should be centralized into a shared utility to ensure gated/private models are handled consistently across all remote fetching operations.
- The PR contains significant scope creep. The title focuses on a SafeTensors fix, but the majority of the changes implement a new Remote GGUF support system and UI tables. This should ideally be split into separate PRs to simplify review and testing.
1 comment outside of the diff
src/modelinfo/cli.py
line 133 🟡 MEDIUM RISK
The analyze_model function has reached a cyclomatic complexity of 16. It is managing too many responsibilities, including local file validation, remote resolution, and multi-format dispatch. Consider refactoring local parser dispatch and remote fetching into separate helper functions.
Test suggestions
- Resolve SafeTensors index path when the model name contains multiple hyphens
- Fetch and parse remote GGUF header via stream/range requests
- Render a comparison table in the UI for repositories with multiple GGUF variants
- Handle unauthorized (401) and not found (404) responses from Hugging Face Hub
- Automate unit tests for high-complexity remote fetching logic in src/modelinfo/parsers/huggingface.py
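As a rough sketch of what the 401/404 test suggestion could exercise, the snippet below maps HTTP status codes to user-facing messages. It assumes the request layer surfaces `urllib.error.HTTPError`; the names `explain_hub_error` and `make_http_error` are hypothetical and do not come from this PR.

```python
import urllib.error


def explain_hub_error(exc: urllib.error.HTTPError) -> str:
    """Hypothetical helper: turn a Hub HTTP error into a readable message."""
    if exc.code == 401:
        return "Unauthorized: the model may be gated or private; provide a token."
    if exc.code == 404:
        return "Not found: check the repository and file name."
    return f"Hub request failed with HTTP {exc.code}."


def make_http_error(code: int) -> urllib.error.HTTPError:
    """Build a synthetic HTTPError for tests (placeholder URL, no body)."""
    return urllib.error.HTTPError(
        url="https://huggingface.co/api/models/org/repo",
        code=code,
        msg="",
        hdrs=None,
        fp=None,
    )
```

A test can then raise `make_http_error(401)` from the mocked request function and assert on the resulting message, without touching the network.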
Prompt proposal for missing tests
Consider implementing these tests if applicable:
1. Automate unit tests for high-complexity remote fetching logic in src/modelinfo/parsers/huggingface.py
headers = {"Range": f"bytes={start_bytes}-{end_bytes}"}
try:
    chunk = _make_request(
🔴 HIGH RISK
Authentication logic is missing from the new RemoteFileStream and config.json requests. This will cause 401 Unauthorized errors when users attempt to inspect gated or private models. Additionally, this file is flagged as complex and lacks sufficient test coverage for these new paths.
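One way the centralization could look is a single header builder that every remote request path calls. This is a minimal sketch under assumptions: `build_hub_headers` is a hypothetical name, and reading the token from the `HF_TOKEN` environment variable is assumed rather than taken from this PR. The Hub accepts tokens as an `Authorization: Bearer` header.

```python
import os
from typing import Dict, Optional, Tuple


def build_hub_headers(
    token: Optional[str] = None,
    byte_range: Optional[Tuple[int, int]] = None,
) -> Dict[str, str]:
    """Hypothetical shared helper: headers for any Hugging Face Hub request."""
    headers: Dict[str, str] = {}
    # Assumption: fall back to the HF_TOKEN environment variable.
    token = token or os.environ.get("HF_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    if byte_range is not None:
        start, end = byte_range
        # Same Range format as the streaming request in the diff.
        headers["Range"] = f"bytes={start}-{end}"
    return headers
```

With a helper like this, the range request and the config.json fetch would both pick up the token the same way, so gated and private repositories behave consistently.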
elif "-of-" in base_name and path.endswith(".safetensors"):
    prefix = base_name.split("-")[0]
    import re
    match = re.match(r"^(.*?)-\d{5}-of-\d{5}\.safetensors$", base_name)
🟡 MEDIUM RISK
Suggestion: The regex strictly expects exactly 5 digits for shard indexing (e.g., -00001-of-00005). This will fail for non-standard exports (e.g., 4-digit padding), causing the logic to fall back to the broken split('-')[0] behavior. Use a more flexible digit match.
- match = re.match(r"^(.*?)-\d{5}-of-\d{5}\.safetensors$", base_name)
+ match = re.match(r"^(.*?)-\d+-of-\d+\.safetensors$", base_name)
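To illustrate the difference between the two patterns, here is a small comparison. The regexes are the ones quoted in this comment; `shard_prefix` and the 4-digit file name are made up for the example.

```python
import re

# Pattern currently in the PR: exactly five digits on each side of "-of-".
STRICT = re.compile(r"^(.*?)-\d{5}-of-\d{5}\.safetensors$")
# Suggested pattern: any number of digits.
FLEXIBLE = re.compile(r"^(.*?)-\d+-of-\d+\.safetensors$")


def shard_prefix(pattern, base_name):
    """Return the model-name prefix, or None if the name does not match."""
    match = pattern.match(base_name)
    return match.group(1) if match else None


five_digit = "llama-3-8b-00001-of-00004.safetensors"
four_digit = "llama-3-8b-0001-of-0004.safetensors"  # hypothetical non-standard export

print(shard_prefix(STRICT, five_digit))    # llama-3-8b
print(shard_prefix(STRICT, four_digit))    # None
print(shard_prefix(FLEXIBLE, four_digit))  # llama-3-8b
```

When the strict pattern returns None, the code path in the diff falls back to `split("-")[0]`, which would yield only `llama` for the 4-digit name.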
if "/api/models/" in url:
    return json.dumps({
        "siblings": [
            {"rfilename": "model-q4.gguf", "size": 1000000000}
⚪ LOW RISK
The mock implementation for network requests and GGUF headers is duplicated across multiple test cases. Consolidating this into a shared pytest fixture or a parameterized test would improve maintainability.
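A possible shape for the consolidation is a small factory that builds the fake request function once, which a pytest fixture could then return to each test. This is only a sketch: `make_fake_request` is a hypothetical name, and the `(url, headers=None)` signature is an assumption about how the real request helper is called.

```python
import json


def make_fake_request(siblings):
    """Return a stand-in for the network call that serves a canned model listing."""

    def fake_request(url, headers=None):
        # Mirrors the branch in the duplicated mocks: model API calls get JSON back.
        if "/api/models/" in url:
            return json.dumps({"siblings": siblings})
        # Fail loudly on anything the test did not anticipate.
        raise ValueError(f"unexpected URL in test: {url}")

    return fake_request


fake = make_fake_request([{"rfilename": "model-q4.gguf", "size": 1000000000}])
listing = json.loads(fake("https://huggingface.co/api/models/org/repo"))
```

Each test would then pass its own sibling list instead of repeating the mock body, and parameterized tests could vary only that list.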
…, and test helper
c788d4c to 9985fd0
Summary
This pull request implements regex-based shard prefix parsing in the SafeTensors parser to support model names containing hyphens.
Motivation & Context
Previously, the SafeTensors shard index logic extracted the model name prefix by splitting the base filename at the first hyphen (base_name.split("-")[0]). For model names that contain hyphens (for example, llama-3-8b-00001-of-00004.safetensors), this split logic wrongly extracted only llama as the prefix. As a result, the parser failed to locate the correct index file (llama-3-8b.safetensors.index.json).

This change uses a regular expression to match standard shard formats and correctly extract the prefix. It falls back to the previous split-based method for non-standard formats.
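The behavior described above can be sketched roughly as follows. The regex is the one shown in the diff; `resolve_index_path` is a hypothetical stand-alone helper written for illustration, and it assumes its input is already known to be a shard path.

```python
import os
import re

# Pattern from the diff: <prefix>-NNNNN-of-NNNNN.safetensors
_SHARD_RE = re.compile(r"^(.*?)-\d{5}-of-\d{5}\.safetensors$")


def resolve_index_path(path: str) -> str:
    """Return the expected index file path for a SafeTensors shard (illustrative)."""
    base_name = os.path.basename(path)
    match = _SHARD_RE.match(base_name)
    if match:
        # Everything before the shard counter, hyphens included.
        prefix = match.group(1)
    else:
        # Fallback to the previous behavior for non-standard names.
        prefix = base_name.split("-")[0]
    return os.path.join(os.path.dirname(path), f"{prefix}.safetensors.index.json")


print(resolve_index_path("llama-3-8b-00001-of-00004.safetensors"))
# llama-3-8b.safetensors.index.json
```

The lazy group `(.*?)` combined with the `^` and `$` anchors means the prefix grows until the remainder is exactly the shard counter plus extension, so hyphens inside the model name are preserved.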
How Has This Been Tested?
Added a unit test, test_safetensors_sharded_with_hyphens in tests/test_parsers.py, that verifies the index path is correctly resolved when parsing a shard file path with multiple hyphens (e.g. mock-llama-3-8b-00001-of-00002.safetensors).

Ran all unit tests to verify: