Skip to content

configuring genai model deployment with model readonly filesystem #4139

Open
dtrawins wants to merge 23 commits intomainfrom
ro-model_path-configuration
Open

configuring genai model deployment with model readonly filesystem #4139
dtrawins wants to merge 23 commits intomainfrom
ro-model_path-configuration

Conversation

@dtrawins
Copy link
Copy Markdown
Collaborator

@dtrawins dtrawins commented Apr 15, 2026

🛠 Summary

CVS-186376

Allows starting generative models on readonly filesystem with option to change runtime parameters and accepts models without graph.pbtxt.
It can be done with --model_path and --task along with other task specific paramters.
There is also removed default --task value as text_generation. Task should be explicit if graph.pbtxt should be created.
The logic is:

  • if --source_model is provided, OVMS will assume to path of pulling the model from HF if missing on the local target path. It requires RW filesystem. Should be used always with --task.
  • if --source_model is not provided and deployment is with --model_name and --model_path, task is optional. With empty --task, graph from local model folder will be used. With set task, all runtime params will be from CLI and graph will be in memory without saving to model folder.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

@dtrawins dtrawins marked this pull request as ready for review April 15, 2026 23:27
Copilot AI review requested due to automatic review settings April 15, 2026 23:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates OVMS HF pull-and-start / --task startup flows to support deployments where the model directory is read-only by generating and consuming graph.pbtxt purely in memory instead of writing it to disk.

Changes:

  • Add an in-memory graph export path (GraphExport) and update graph loading logic to accept in-memory graph content.
  • Adjust CLI/config validation and module startup logic to allow --task + --model_path without requiring HF download parameters.
  • Update and add tests to validate “no graph.pbtxt written” behavior in pull-and-start and local model-path task scenarios.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/test/test_utils.hpp Adds a new SetUpServer overload supporting --task.
src/test/test_utils.cpp Implements the new SetUpServer overload building argv with --task.
src/test/pull_hf_model_test.cpp Updates tests to validate graph is stored in memory (no graph.pbtxt on disk) in pull-and-start mode.
src/test/llm/llmnode_test.cpp Adds a test ensuring --task + --model_path works without writing graph.pbtxt (read-only model dir).
src/server.cpp Updates module startup flow: skip HF pull when sourceModel is empty and generate graph in memory from model_path.
src/pull_module/hf_pull_model_module.cpp Controls whether graph is written to disk or stored in memory based on server mode.
src/modelmanager.cpp Treats in-memory graph content as satisfying “graph available” checks for Mediapipe startup.
src/mediapipe_internal/mediapipegraphdefinition.cpp Loads graph definition from in-memory content when available.
src/mediapipe_internal/mediapipegraphconfig.cpp Logs in-memory graph content instead of reading from file when available.
src/mediapipe_internal/BUILD Adds Bazel dependency on //src/graph_export:graph_export.
src/graph_export/graph_export.hpp Extends createServableConfig with writeToFile and adds in-memory graph accessors.
src/graph_export/graph_export.cpp Implements process-global in-memory graph storage and conditional file writing.
src/config.cpp Relaxes validation for HF pull-and-start when using --task with --model_path.
src/cli_parser.cpp Prevents model_name from being treated as source_model when model_path is set; preserves explicit --model_path.
src/BUILD Adds Bazel dependency on //src/graph_export:graph_export for server build.

Comment thread src/server.cpp Outdated
Comment on lines 381 to 404
if (config.getServerSettings().serverMode == HF_PULL_MODE || config.getServerSettings().serverMode == HF_PULL_AND_START_MODE) {
INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
START_MODULE(it);
if (!status.ok()) {
return status;
bool needsHfPull = !config.getServerSettings().hfSettings.sourceModel.empty();
if (needsHfPull) {
INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
START_MODULE(it);
if (!status.ok()) {
return status;
}
auto hfModule = dynamic_cast<const HfPullModelModule*>(it->second.get());
status = hfModule->clone();
// Return from modules only in --pull mode or error, otherwise start the rest of modules
if (config.getServerSettings().serverMode == HF_PULL_MODE || !status.ok())
return status;
} else {
// --task with --model_path: create graph in memory without HF download
GraphExport graphExporter;
const auto& hfSettings = config.getServerSettings().hfSettings;
status = graphExporter.createServableConfig(config.modelPath(), hfSettings, false);
if (!status.ok()) {
SPDLOG_ERROR("Failed to create in-memory graph config: {}", status.string());
return status;
}
SPDLOG_INFO("Graph config created in memory from model_path: {}", config.modelPath());
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startModules() can enter HF_PULL(_AND_START)_MODE with an existing GraphExport in-memory graph still set from a previous run (GraphExport is process-global). Because this function doesn’t clear that state, later graph detection paths may incorrectly treat a graph as available even when graph.pbtxt isn’t present on disk. Consider explicitly calling GraphExport::clearInMemoryGraphContent() at the beginning of the HF_PULL/HF_PULL_AND_START handling (and/or after successful disk write) to prevent stale in-memory graphs from affecting subsequent startups.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Comment thread src/modelmanager.cpp
Comment on lines 232 to 235
std::ifstream ifs(mpConfig.getGraphPath());
if (ifs.is_open()) {
bool graphAvailable = ifs.is_open() || GraphExport::hasInMemoryGraphContent();
if (graphAvailable) {
// Single model with graph.pbtxt, check if user passed model unsupported model parameters in cmd arguments
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ModelManager::startFromConfig() treats GraphExport::hasInMemoryGraphContent() as equivalent to having a graph.pbtxt on disk. Because the in-memory graph is process-global and not tied to mpConfig/modelPath, a stale value could make graphAvailable true and skip the non-graph startup path incorrectly. To avoid incorrect graph detection, gate this on the current server mode/CLI path (only use in-memory graph in the specific HF/task+model_path flow) and/or ensure the in-memory content is cleared deterministically after use.

Copilot uses AI. Check for mistakes.
Comment on lines 63 to +70
Status MediapipeGraphDefinition::validateForConfigFileExistence() {
if (GraphExport::hasInMemoryGraphContent()) {
const std::string& content = GraphExport::getInMemoryGraphContent();
this->chosenConfig = content;
this->mgconfig.setCurrentGraphPbTxtMD5(ovms::FileSystem::getStringMD5(content));
SPDLOG_LOGGER_DEBUG(modelmanager_logger, "Using in-memory graph content for mediapipe graph definition: {}", this->getName());
return StatusCode::OK;
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateForConfigFileExistence() unconditionally prefers GraphExport in-memory content whenever it is set. Since that content is global and not scoped to a particular graph/model, this can cause the wrong graph to be loaded (e.g., if a previous startup/test left in-memory content set) and will also bypass filesystem-based reload semantics. Consider scoping the in-memory graph to the specific startup flow (or at least clearing it after it has been consumed) so normal file-based graphs aren’t shadowed.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Comment thread src/test/pull_hf_model_test.cpp Outdated
Comment on lines 360 to 362
ASSERT_TRUE(ovms::GraphExport::hasInMemoryGraphContent());
std::string graphContents = ovms::GraphExport::getInMemoryGraphContent();
ASSERT_EQ(expectedGraphContents, removeVersionString(graphContents)) << graphContents;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests rely on GraphExport’s in-memory graph content, but GraphExport uses a process-global static string. Without clearing it in TearDown (or at the start of each test), the in-memory graph can leak into later tests and change their behavior (e.g., later graph existence checks may pass unexpectedly). Add GraphExport::clearInMemoryGraphContent() to the fixture TearDown (and possibly SetUp) to keep tests isolated.

Copilot uses AI. Check for mistakes.
Comment thread src/test/llm/llmnode_test.cpp Outdated
Comment on lines +4543 to +4547
void SetUp() override {
// Rename graph.pbtxt so it's not used from file
if (std::filesystem::exists(graphPath)) {
std::filesystem::rename(graphPath, graphPathRenamed);
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetUp() renames graph.pbtxt to graph.pbtxt.bak without handling the case where graph.pbtxt.bak already exists (e.g., from a previous failed run). std::filesystem::rename will throw in that situation and can make the test flaky. Consider removing/uniquifying the backup path or using copy+remove with overwrite semantics so the setup is idempotent.

Copilot uses AI. Check for mistakes.
Comment on lines +4551 to +4559
void TearDown() override {
ovms::Server& server = ovms::Server::instance();
server.setShutdownRequest(1);
if (t && t->joinable())
t->join();
server.setShutdownRequest(0);
// Restore write permissions and rename graph.pbtxt back
RemoveReadonlyFileAttributeFromDir(modelDir);
if (std::filesystem::exists(graphPathRenamed)) {
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixture starts the server in a mode that generates an in-memory graph (GraphExport uses a process-global static string). TearDown() restores filesystem state, but it doesn’t clear the in-memory graph content, which can leak into subsequent tests within the same gtest binary. Call GraphExport::clearInMemoryGraphContent() in TearDown (and/or SetUp) to keep test state isolated.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Comment thread src/test/test_utils.cpp
Comment on lines +895 to +903
char* argv[] = {(char*)"ovms",
(char*)"--model_name",
(char*)modelName,
(char*)"--model_path",
(char*)getGenericFullPathForSrcTest(modelPath).c_str(),
(char*)"--port",
(char*)port.c_str(),
(char*)"--task",
(char*)task};
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this overload, argv stores a pointer from getGenericFullPathForSrcTest(modelPath).c_str(), but that c_str() comes from a temporary std::string that is destroyed at the end of the initializer expression. That leaves argv containing a dangling pointer and can cause undefined behavior in server.start(). Keep the full model path in a named std::string (with lifetime spanning the server.start call) and build argv from stable storage (ideally avoid casting away const on string literals as well).

Copilot uses AI. Check for mistakes.
Comment thread src/graph_export/graph_export.cpp
@dtrawins dtrawins requested a review from mzegla May 7, 2026 14:52
@dtrawins dtrawins added this to the 2026.2_rc milestone May 8, 2026
Copy link
Copy Markdown
Collaborator

@rasapala rasapala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot suggested some good changes. Please verify if they apply.

Comment thread src/graph_export/graph_export.cpp Outdated
#endif
namespace ovms {

static std::string s_inMemoryGraphContent;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why s_ prefix? That's inconsistent with camelCase convention we use.


Status MediapipeGraphDefinition::validateForConfigFileExistence() {
if (GraphExport::hasInMemoryGraphContent()) {
const std::string& content = GraphExport::getInMemoryGraphContent();
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it global? Not tied to any specific servable object?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, assumption is that only one graph can be stored in memory. using multiple models requires creating graph.pbtxt in model folder via --pull command.

// EnvGuard guard;
// guard.set("HF_ENDPOINT", "https://modelscope.cn");
// guard.set("HF_ENDPOINT", "https://hf-mirror.com");
this->filesToPrintInCaseOfFailure.emplace_back("graph.pbtxt");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test now checks different flow. Don't we want to keep testing the previous version?
Maybe a separate test or this one parametrized?


TEST_F(HfDownloaderPullHfModel, PositiveDownloadAndStartModelOutsideOvOrg) {
SKIP_AND_EXIT_IF_NOT_RUNNING_UNSTABLE(); // CVS-180127
this->filesToPrintInCaseOfFailure.emplace_back("graph.pbtxt");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above. The path with graph.pbtxt present is dropped now.

Comment thread src/cli_parser.cpp Outdated
if (result->count("model_name")) {
modelsSettings.modelName = result->operator[]("model_name").as<std::string>();
} else {
} else if (!hfSettings.sourceModel.empty()) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change? what happens with modelsSettings.modelName on else path now?

Comment thread src/graph_export/graph_export.cpp Outdated
Comment thread src/cli_parser.cpp
// HF pull mode or pull and start mode
if (isHFPullOrPullAndStart(this->result)) {
// HF pull mode or pull and start mode or starting from local folder with graph created in memory
if (isHFPullOrPullAndStart(this->result) || isGenAIConfigureAndStart(this->result)) {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GenAI name but it has nothing to do with OpenVINO GenAI, I dont think its good naming

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dkalinowski it refers to generative models which can be started via --model_path and --task. It configures runtime params based on CLI options and start the server.
Do you have some other proposals for the name?

Copy link
Copy Markdown
Collaborator

@dkalinowski dkalinowski May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just what the new feature is about: InMemoryGraphStart? TemporaryGraphStart? AutoGraphCreationStart? AutomaticGraphStart?

I mean here but also other enums etc

Comment thread src/config.cpp Outdated
Comment thread src/graph_export/graph_export.cpp Outdated
Comment thread src/graph_export/graph_export.cpp Outdated
#endif
namespace ovms {

static std::string s_inMemoryGraphContent;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dtrawins @atobiszei
how does it support C-API? C-API allows starting and stopping the server multiple times, in a single process. Can we start with this mode via C-API? Do we still fully support C-API?

Copy link
Copy Markdown
Collaborator Author

@dtrawins dtrawins May 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capi does not support llm models. Also in-memory graph is cleared in server.cpp

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it supports starting the server with llms (and other graphs) configured in config.json

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.

Comment thread src/cli_parser.cpp
Comment thread src/config.cpp
Comment on lines +155 to +165
if (this->serverSettings.serverMode == HF_PULL_MODE || this->serverSettings.serverMode == HF_PULL_AND_START_MODE || this->serverSettings.serverMode == GENAI_CONFIGURE_AND_START) {
// When --task is used with --model_path (no HF pulling), sourceModel and downloadPath are not required
bool taskWithModelPath = (this->serverSettings.serverMode == HF_PULL_AND_START_MODE || this->serverSettings.serverMode == GENAI_CONFIGURE_AND_START) && !this->modelsSettings.modelPath.empty();
if (!taskWithModelPath) {
if (!serverSettings.hfSettings.sourceModel.size()) {
std::cerr << "source_model parameter is required for pull mode";
return false;
}
if (!serverSettings.hfSettings.downloadPath.size()) {
std::cerr << "model_repository_path parameter is required for pull mode";
return false;
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug to fix.

Comment thread src/server.cpp Outdated
#include "src/test/test_with_temp_dir.hpp"
#include "src/filesystem/filesystem.hpp"
#include "src/pull_module/hf_pull_model_module.hpp"
#include "src/graph_export/graph_export.hpp"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

Comment thread src/cli_parser.cpp Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants