fix: handle reverse tensor shape ordering for gguf shape guessing by pipe1os · Pull Request #42 · pipe1os/modelinfo-cli

pipe1os · 2026-06-27T15:50:46Z

Summary

Fix the fallback shape guessing logic for GGUF models to correctly handle their column-major dimension ordering ([width, height]).

Motivation & Context

GGUF models store tensors in a column-major format compared to PyTorch/SafeTensors' row-major layout. As a result, when fallback shape guessing is triggered (such as when GGUF/config metadata is missing or unrecognized), querying shape[0] for key projection or fused projection weights accesses the width (input dimension) instead of the height (output/key-value dimension). This leads to extremely inflated kv_dim calculations and inaccurate memory/VRAM estimations.

This change detects whether the input tensors represent a GGUF model by checking for standard GGUF metadata keys starting with general., and then uses shape[-1] (the last dimension) instead of shape[0] for fallback dimension extraction.

Type of Change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
Refactoring (no functional changes, no api changes)
Documentation update

How Has This Been Tested?

Added regression tests in tests/test_calculator.py verifying:

Fallback shape guessing for GGUF models correctly uses shape[-1] for k_proj tensors.
Fallback shape guessing for fused qkv_proj GGUF tensors correctly uses (shape[-1] // 3).

All tests pass successfully.

Unit tests
Integration tests
Manual testing

Screenshots (if appropriate)

N/A

Checklist

My code follows the code style of this project.
My commit messages follow the Conventional Commits format, are lowercase, imperative, and specific.
I have updated the documentation accordingly (if applicable).
I have added tests to cover my changes.
All new and existing tests passed.

…er, add error tests

…xity

… remote routing

coderabbitai · 2026-06-27T15:50:54Z

Warning

Review limit reached

@pipe1os, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 52 minutes and 12 seconds. Learn how PR review limits work.

To continue reviewing without waiting, enable usage-based billing in the billing tab.

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: fb151d3e-a40b-4c8a-9635-93044f79a5dd

📥 Commits

Reviewing files that changed from the base of the PR and between 0e82634 and e972e57.

📒 Files selected for processing (2)

src/modelinfo/architecture.py
tests/test_calculator.py

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch advisor/006-fix-gguf-shape-guessing

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

codacy-production · 2026-06-27T15:51:59Z

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

AI Reviewer: first review requested successfully. AI can make mistakes. Always validate suggestions.

_{TIP This summary will be updated as you push new changes.}

codacy-production

Pull Request Overview

While this PR is technically 'up to standards' according to Codacy, it introduces substantial scope creep that is not reflected in the title. Beyond the GGUF shape-guessing fix, it implements remote GGUF fetching, a custom RemoteFileStream, and UI comparison tables.

The implementation has caused a significant spike in complexity, particularly in src/modelinfo/parsers/huggingface.py (+41), which now functions as a monolithic dispatcher. Additionally, there are minor logic issues in the CLI error handling and code duplication in the new test suite. Addressing the complexity deltas through the suggested refactors is highly recommended before merging.

About this PR

This PR contains significant scope creep. The title and summary focus on a specific bug fix for shape guessing, but the implementation introduces major new features (remote GGUF support, UI comparison tables, and a custom RemoteFileStream) that are not detailed in the description. This makes the review process more difficult and obscures architectural changes.

2 comments outside of the diff

src/modelinfo/cli.py

_{line 133 🟡 MEDIUM RISK}
The analyze_model function exceeds the cyclomatic complexity threshold (16). It currently handles URI resolution, local path validation, and multi-format parser dispatch. It is recommended to extract the format detection and parser selection logic into a separate factory function.

_{line 183 ⚪ LOW RISK}
The error message is misleading when a local file exists but does not match supported extensions. If the file exists but no extension matches, the exception should state that the format is unsupported rather than saying the file was 'not found locally'.

Test suggestions

Verify fallback shape guessing for GGUF k_proj uses shape[-1]
Verify fallback shape guessing for GGUF fused qkv_proj uses shape[-1] // 3
Verify identification of GGUF architecture based on 'general.' metadata keys
Verify remote fetching of a single GGUF file from Hugging Face
Verify remote fetching and grouping of multiple GGUF variants from a repository

_{TIP Improve review quality by adding custom instructions}
_{TIP How was this review? Give us feedback}

codacy-production · 2026-06-27T15:53:29Z

+        return 0.0
+
+
+class RemoteFileStream:


_{🔴 HIGH RISK}

Suggestion: The complexity of this file has increased significantly (+41) with the addition of remote GGUF support and the RemoteFileStream implementation. fetch_huggingface_repo has become a monolithic function handling multiple file formats and sharding strategies. Consider refactoring it into smaller, format-specific functions (e.g., _handle_gguf_repo, _handle_safetensors_repo) to improve maintainability.

See Complexity in Codacy

codacy-production · 2026-06-27T15:53:29Z

+                    {"rfilename": "model-q4.gguf", "size": 1000000000}
+                ]
+            }).encode("utf-8")
+        elif "model-q4.gguf" in url:
+            import struct
+            header = b"GGUF" + struct.pack("<IQQ", 2, 0, 0)
+            return header
+        raise ValueError(f"Unexpected url: {url}")
+
+    monkeypatch.setattr(huggingface, "_make_request", fake_make_request)
+
+    tensors, config, format_name, disk_size = huggingface.fetch_huggingface_repo("org/model-gguf")
+
+    assert format_name == "GGUF"


_{⚪ LOW RISK}

Nitpick: Detected code duplication in test mocks. The fake_make_request logic is repeated across multiple test cases. It is recommended to extract this logic into a reusable pytest fixture or a helper function configured with a URL-to-response mapping.

codacy-production · 2026-06-27T15:53:29Z

+            raise PermissionError(f"Gated/Private Model (401 Unauthorized). Set the HF_TOKEN environment variable to access {real_repo_id}")
        if e.code == 404:
-             raise FileNotFoundError(f"Could not find repository on Hugging Face (404 Not Found): {repo_id}")
+             raise FileNotFoundError(f"Could not find repository on Hugging Face (404 Not Found): {real_repo_id}")


_{⚪ LOW RISK}

Nitpick: Minor indentation inconsistency: line 302 has an extra leading space before 'raise FileNotFoundError'.

…, and test helper

pipe1os added 11 commits June 27, 2026 11:16

implement remote gguf inspection on hugging face

1b1a090

split print_model_info test to comply with codacy method size limit

71ef3a3

fix codacy issues: add read limit, honor gpu_util, modularize hf pars…

d0c5474

…er, add error tests

refactor: split concurrent shards fetching to lower cyclomatic comple…

bebe2c1

…xity

fix codacy issues: compute GGUF group variant overhead dynamically

6555e0e

docs: document remote gguf inspection options in README.md

357ee16

fix: strip trailing slashes from model paths at entrypoint

5a8e6e2

fix: treat paths starting with local prefix as local files to prevent…

0ef126b

… remote routing

fix: handle reverse tensor shape ordering for gguf shape guessing

7dc8576

fix: handle concurrent remote shard download failures gracefully

b0b9744

fix: resolve safetensors shard index prefix splitting

9ba40d0

pipe1os added 4 commits June 27, 2026 11:52

merge: integrate branch 004-fix-safetensors-shard-prefix

738e291

merge: integrate branch 005-graceful-shard-downloads

1b1c58f

merge: integrate branch 006-fix-gguf-shape-guessing

56bbf66

merge: integrate branch 007-refine-remote-detection

0cfe2e5

codacy-production Bot reviewed Jun 27, 2026

View reviewed changes

pipe1os added 4 commits June 27, 2026 11:53

merge: integrate branch 008-fix-comparison-trailing-slash

2deb25a

fix: address codacy review feedback on disk size, regex, path parsing…

32bf12b

…, and test helper

merge: sync with main

e972e57

merge: sync with main

1a82ced

pipe1os force-pushed the main branch from 32bf12b to 71abc38 Compare June 27, 2026 18:25

pipe1os closed this Jun 27, 2026

pipe1os deleted the advisor/006-fix-gguf-shape-guessing branch June 27, 2026 18:29

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: handle reverse tensor shape ordering for gguf shape guessing#42

fix: handle reverse tensor shape ordering for gguf shape guessing#42
pipe1os wants to merge 19 commits into
mainfrom
advisor/006-fix-gguf-shape-guessing

pipe1os commented Jun 27, 2026

Uh oh!

coderabbitai Bot commented Jun 27, 2026 •

edited

Loading

Review limit reached

Uh oh!

codacy-production Bot commented Jun 27, 2026 •

edited

Loading

Uh oh!

codacy-production Bot left a comment

Uh oh!

codacy-production Bot Jun 27, 2026

Uh oh!

codacy-production Bot Jun 27, 2026

Uh oh!

codacy-production Bot Jun 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pipe1os commented Jun 27, 2026

Summary

Motivation & Context

Type of Change

How Has This Been Tested?

Screenshots (if appropriate)

Checklist

Uh oh!

coderabbitai Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Uh oh!

codacy-production Bot commented Jun 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Up to standards ✅

Uh oh!

codacy-production Bot left a comment

Choose a reason for hiding this comment

Pull Request Overview

About this PR

Test suggestions

Uh oh!

codacy-production Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

codacy-production Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

codacy-production Bot Jun 27, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

coderabbitai Bot commented Jun 27, 2026 •

edited

Loading

codacy-production Bot commented Jun 27, 2026 •

edited

Loading