Skip to content

Use new Safetensors packaging#666

Open
ilopezluna wants to merge 8 commits intomainfrom
use-new-packaging
Open

Use new Safetensors packaging#666
ilopezluna wants to merge 8 commits intomainfrom
use-new-packaging

Conversation

@ilopezluna
Copy link
Contributor

@ilopezluna ilopezluna commented Feb 12, 2026

This pull request simplifies and improves the docker model package command by removing the --dir-tar option and its related implementation, updating documentation for clarity and accuracy, and enhancing how configuration metadata is extracted for directory-based model packaging. The changes also improve the user experience with better help text and more robust handling of model formats.

CLI and Documentation Improvements:

  • Removed the --dir-tar option and all related code, including flag definitions, validation, processing logic, and documentation references. This option is no longer supported.
  • Updated the docker model package help text, command documentation, and CLI reference files to provide clearer, more structured instructions and usage details. The new help text explicitly describes supported input types and packaging behaviors.

Packaging Logic Enhancements:

  • Refactored Safetensors directory packaging to use the new FromDirectory builder method, simplifying the code and ensuring each file is packaged as a separate OCI layer.
  • Improved config metadata extraction for directory-based packaging: now uses format-specific handlers to extract metadata from model weight files, providing more accurate model configuration.

Compatibility Handling:

  • Updated the distribution client to accept both v0.1 and v0.2 model config media types, ensuring compatibility with new and existing model artifacts.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @ilopezluna, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a significant update to the model packaging system, primarily focusing on enhancing how Safetensors models are handled. It transitions to a more granular, layer-per-file approach for Safetensors, which improves flexibility and metadata extraction. Concurrently, it streamlines the packaging CLI by removing the generic --dir-tar option and updates documentation to reflect these changes. Additionally, the vllm inference backend has been removed, simplifying the overall backend architecture.

Highlights

  • New Safetensors Packaging: Implemented a new V0.2 layer-per-file packaging mechanism for Safetensors models, allowing each file to be an individual OCI layer and preserving directory structure.
  • Removed --dir-tar Flag: The --dir-tar flag and its associated functionality for packaging arbitrary directories as tar archives have been removed from the docker model package command.
  • Updated docker model package Command: The package command's short and long descriptions have been significantly expanded and clarified to detail the packaging behavior for GGUF, Safetensors, DDUF, and repackaging.
  • V0.2 Model Config Support: The distribution client now explicitly supports MediaTypeModelConfigV02, indicating readiness for the new model configuration format.
  • Recursive Safetensors Discovery: The model parsing logic now recursively searches for .safetensors files within a model directory, accommodating nested structures.
  • VLLM Backend Removal: The vllm inference backend and its related imports and initialization have been removed from the main application.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • cmd/cli/commands/package.go
    • Removed the import for the packaging utility.
    • Updated the Short and Long descriptions for the package command to provide more detailed guidance.
    • Removed validation logic for dir-tar paths.
    • Removed the --dir-tar flag definition.
    • Removed the dirTarPaths field from the packageOptions struct.
    • Modified the initializeBuilder function to use builder.FromDirectory for Safetensors models, simplifying the packaging process.
    • Removed the entire block of code responsible for processing directory tar archives in packageModel.
  • cmd/cli/docs/reference/docker_model_package.yaml
    • Updated the short and long descriptions to align with the new packaging behavior and details.
    • Removed the dir-tar option from the command's reference documentation.
  • cmd/cli/docs/reference/model.md
    • Updated the description for the package subcommand in the main model reference documentation.
  • cmd/cli/docs/reference/model_package.md
    • Updated the main description of the docker model package command.
    • Removed the dir-tar option from the options table.
  • main.go
    • Removed the import for the vllm inference backend.
    • Removed the initialization and registration of the vllm backend.
  • pkg/distribution/builder/builder.go
    • Removed the WithDirTar method, which was used for adding directory tar archives to the artifact.
  • pkg/distribution/builder/from_directory.go
    • Added an import for the format package.
    • Implemented logic within FromDirectory to extract config metadata from weight files using format-specific handlers.
  • pkg/distribution/distribution/client.go
    • Updated the checkCompat function to explicitly support MediaTypeModelConfigV02 in addition to MediaTypeModelConfigV01.
  • pkg/distribution/distribution/client_test.go
    • Modified a test case to use v99.0+json as an unsupported model config version, reflecting that v0.2+json is now supported.
  • pkg/distribution/huggingface/model.go
    • Removed the import for the packaging utility.
    • Refactored buildModelFromFiles to use V0.2 layer-per-file packaging for Safetensors models and retain V0.1 packaging for GGUF models.
    • Removed the createConfigArchive function and its associated logic.
  • pkg/distribution/internal/bundle/parse.go
    • Added an import for the strings package.
    • Modified the findSafetensorsFile function to recursively search for .safetensors files within a model directory.
  • pkg/distribution/internal/bundle/parse_test.go
    • Added a new test case, TestParse_WithNestedSafetensors, to verify the recursive discovery of safetensors files.
  • pkg/distribution/packaging/directory.go
    • Removed file.
  • pkg/distribution/packaging/directory_test.go
    • Removed file.
  • pkg/distribution/packaging/dirtar.go
    • Removed file.
  • pkg/distribution/packaging/dirtar_test.go
    • Removed file.
Activity
  • The pull request is currently marked as Work In Progress (WIP).
  • No additional comments or reviews have been posted yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a new V0.2 layer-per-file packaging mechanism and refactors the existing packaging logic. This involves removing the packaging package and the --dir-tar command-line flag. The documentation has been updated to reflect these changes. The overall changes simplify the packaging process, especially for Safetensors models.

I have a couple of suggestions for improvement:

  • In pkg/distribution/builder/from_directory.go, errors during format handler retrieval and config extraction are silently ignored. It would be better to log these errors to aid in debugging.
  • In pkg/distribution/huggingface/model.go, the new logic for handling GGUF models from Hugging Face seems to have dropped support for packaging auxiliary configuration files (like tokenizer.json), which could be a regression.

Additionally, I noticed the vllm backend has been removed. I assume this is an intentional part of the refactoring.

@ilopezluna ilopezluna changed the title [WIP] Use new packaging Use new Safetensors packaging Feb 13, 2026
@ilopezluna ilopezluna marked this pull request as ready for review February 13, 2026 09:05
@ilopezluna ilopezluna requested a review from a team February 13, 2026 09:05
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 2 issues, and left some high level feedback:

  • In findSafetensorsFile you return filepath.SkipAll from the Walk callback, but filepath doesn’t define SkipAll (only SkipDir), so this won’t compile; use the appropriate sentinel (e.g., filepath.SkipDir) or restructure the walk logic to terminate early in a supported way.
  • The refactor of buildModelFromFiles/buildGGUFModelV01 removes the previous config archive handling for GGUF models, so non-chat-template config files downloaded from HuggingFace are no longer added to the artifact; confirm this is intentional or reintroduce packaging of tokenizer/merges/etc. to avoid regressions when running GGUF models built from HF.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `findSafetensorsFile` you return `filepath.SkipAll` from the Walk callback, but `filepath` doesn’t define `SkipAll` (only `SkipDir`), so this won’t compile; use the appropriate sentinel (e.g., `filepath.SkipDir`) or restructure the walk logic to terminate early in a supported way.
- The refactor of `buildModelFromFiles`/`buildGGUFModelV01` removes the previous config archive handling for GGUF models, so non-chat-template config files downloaded from HuggingFace are no longer added to the artifact; confirm this is intentional or reintroduce packaging of tokenizer/merges/etc. to avoid regressions when running GGUF models built from HF.

## Individual Comments

### Comment 1
<location> `pkg/distribution/huggingface/model.go:118-113` </location>
<code_context>
+
+// buildGGUFModelV01 builds a GGUF model using V0.1 packaging (backward compatible).
+// GGUF uses FromPaths + config archive approach.
+func buildGGUFModelV01(localPaths map[string]string, weightFiles, configFiles []RepoFile, tempDir string) (types.ModelArtifact, error) {
 	// Collect weight file paths (sorted for reproducibility)
 	var weightPaths []string
</code_context>

<issue_to_address>
**issue (bug_risk):** GGUF V0.1 builder no longer uses configFiles despite comment claiming a config archive approach

The current `buildGGUFModelV01` implementation no longer uses `configFiles`: the previous `WithConfigArchive` flow is gone, but the comment still says `GGUF uses FromPaths + config archive approach.` Please either (a) update the comment and remove the unused `configFiles` parameter if config packaging is intentionally dropped, or (b) restore the config-archive handling to preserve the prior behavior.
</issue_to_address>

### Comment 2
<location> `pkg/distribution/internal/bundle/parse.go:115-124` </location>
<code_context>
+	// Search recursively for V0.2 models with nested directory structure
+	// (e.g., text_encoder/model.safetensors)
+	var firstFound string
+	walkErr := filepath.Walk(modelDir, func(path string, info os.FileInfo, err error) error {
+		if err != nil {
+			return nil // skip errors
+		}
+		if info.IsDir() {
+			return nil
+		}
+		if filepath.Ext(path) == ".safetensors" && !strings.HasPrefix(info.Name(), ".") {
+			rel, relErr := filepath.Rel(modelDir, path)
+			if relErr == nil {
+				firstFound = rel
+				return filepath.SkipAll // found one, stop walking
+			}
+		}
+		return nil
+	})
+	if walkErr != nil {
+		return "", fmt.Errorf("walk for safetensors files: %w", walkErr)
+	}
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Recursive safetensors search swallows Walk/Rel errors, which may hide real filesystem issues

In the recursive branch, the walk callback ignores all errors and also drops `filepath.Rel` failures, so permission/IO/path issues are indistinguishable from “no safetensors found” (returns `"", nil`). This can mask real problems with malformed or unreadable bundles. It would be better to propagate the first meaningful error (and return it) or at least log it so callers can tell the difference between absence of files and filesystem failures.

Suggested implementation:

```golang
	walkErr := filepath.Walk(modelDir, func(path string, info os.FileInfo, err error) error {
		if err != nil {
			// Propagate filesystem errors so callers can distinguish them from
			// the case where no safetensors files are present.
			return err
		}

```

```golang
		if filepath.Ext(path) == ".safetensors" && !strings.HasPrefix(info.Name(), ".") {
			rel, relErr := filepath.Rel(modelDir, path)
			if relErr != nil {
				// Treat a bad relative path as a real error instead of silently
				// ignoring it, so malformed bundles surface to the caller.
				return relErr
			}
			firstFound = rel
			return filepath.SkipAll // found one, stop walking
		}

```
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@ilopezluna
Copy link
Contributor Author

@sourcery-ai review

Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In findSafetensorsFile, returning filepath.SkipAll from a filepath.Walk callback will be treated as an error and wrapped as walk for safetensors files: skip all; use filepath.SkipDir with Walk or switch to WalkDir and handle SkipAll explicitly so early-exit doesn't surface as an error.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `findSafetensorsFile`, returning `filepath.SkipAll` from a `filepath.Walk` callback will be treated as an error and wrapped as `walk for safetensors files: skip all`; use `filepath.SkipDir` with `Walk` or switch to `WalkDir` and handle `SkipAll` explicitly so early-exit doesn't surface as an error.

## Individual Comments

### Comment 1
<location> `pkg/distribution/internal/bundle/parse.go:124` </location>
<code_context>
+		if info.IsDir() {
+			return nil
+		}
+		if filepath.Ext(path) == ".safetensors" && !strings.HasPrefix(info.Name(), ".") {
+			rel, relErr := filepath.Rel(modelDir, path)
+			if relErr != nil {
</code_context>

<issue_to_address>
**issue (bug_risk):** The use of `filepath.SkipAll` will not compile; use a supported sentinel (e.g., `filepath.SkipDir`) or a custom error to stop walking.

`filepath.Walk` can only short-circuit via `filepath.SkipDir` or a non-nil error; `filepath.SkipAll` does not exist in the package, so this won’t compile. To stop after the first match, either:

- Return a sentinel error (e.g. `errFoundSafetensors`) and treat it specially after `Walk`, or
- Carefully use `filepath.SkipDir` so it only affects the intended directory.

A sentinel error is typically the simplest and safest option.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants