feat: check_models external readiness check#712
Conversation
Greptile SummaryThis PR introduces a
|
| Filename | Overview |
|---|---|
| packages/data-designer-engine/src/data_designer/engine/readiness.py | New module; lifts model and MCP health check logic out of DatasetBuilder into a shared run_readiness_check() function. Logic is equivalent to the removed private methods, with the improvement that async vs. sync dispatch now follows the resolved client_concurrency_mode rather than the raw env-var flag. |
| packages/data-designer-engine/src/data_designer/engine/flags.py | New module centralizing DATA_DESIGNER_ASYNC_ENGINE flag; straightforward env-var read with a module-level bool, enabling tests to monkeypatch a single location. |
| packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py | Replaces the two inline health-check methods with run_readiness_check(); moves _resolve_async_compatibility() before the readiness call to correctly match client mode; switches all flag references to flags.DATA_DESIGNER_ASYNC_ENGINE. |
| packages/data-designer-engine/src/data_designer/engine/models/registry.py | Adds _parse_health_check_chat_response (validates non-empty string) and _validate_health_check_embedding_response (validates exactly one non-empty vector) to replace the permissive lambda x: x parser and unvalidated embedding calls. |
| packages/data-designer/src/data_designer/interface/data_designer.py | Adds check_models() method adjacent to validate(); delegates to run_readiness_check with the resolved client concurrency mode. |
| packages/data-designer/src/data_designer/cli/controllers/generation_controller.py | Adds run_check_models(); handles DataDesignerError and generic Exception separately to surface the error class name for typed engine errors. |
| scripts/health_checks.py | Refactored to use DataDesigner.check_models() rather than calling ModelFacade directly; delegates embedding response validation to the registry layer. |
| packages/data-designer-engine/tests/engine/test_readiness.py | New test file with thorough coverage: alias collection, skip-when-empty, error propagation, ordering, async dispatch, timeout/cancel, and client-mode selection. |
| packages/data-designer/tests/interface/test_data_designer.py | Adds check_models interface tests covering delegation to run_readiness_check, typed error propagation, sync fallback mode, and column materialization. |
Sequence Diagram
sequenceDiagram
participant User
participant CLI as check_models_command
participant Ctrl as GenerationController
participant DD as DataDesigner
participant RP as ResourceProvider
participant RC as readiness.py
participant MR as ModelRegistry
participant MCP as MCPRegistry
User->>CLI: data-designer check-models config.yaml
CLI->>Ctrl: run_check_models(config_source)
Ctrl->>Ctrl: _load_config(config_source)
Ctrl->>DD: DataDesigner()
Ctrl->>DD: check_models(config_builder)
DD->>RP: _create_resource_provider(check-models, config_builder)
DD->>DD: _resolve_client_concurrency_mode(config_builder)
DD->>RC: run_readiness_check(columns, resource_provider, client_concurrency_mode)
RC->>MR: run_health_check(model_aliases) or arun_health_check(...)
MR-->>RC: OK or raises typed error
RC->>MCP: run_health_check(tool_aliases)
MCP-->>RC: OK or raises
RC-->>DD: None
DD-->>Ctrl: None
Ctrl->>User: All models and tools responded successfully
Reviews (5): Last reviewed commit: "Be more strict with health check respons..." | Re-trigger Greptile
6ef34db to
88f1314
Compare
| if not model_aliases: | ||
| return | ||
|
|
||
| if flags.DATA_DESIGNER_ASYNC_ENGINE: |
There was a problem hiding this comment.
small edge case: check_models() can build sync-mode clients when the config forces sync fallback, like allow_resize=True, but readiness still branches on the raw async env flag and calls arun_health_check(). I know allow_resize / sync are probably on the way out, so maybe this is just a known limitation, but it might be worth either matching the resolved client mode here or adding a regression/explicit guard so users do not hit an async/sync client internals error.
There was a problem hiding this comment.
Interesting, I overlooked this, thanks. Addressed in c9c118f. This should be fairly straightforward to unwind if we fully drop the sync engine.
| fake_loop = Mock() | ||
|
|
||
| with ( | ||
| patch("data_designer.engine.readiness.ensure_async_engine_loop", return_value=fake_loop, create=True), |
There was a problem hiding this comment.
suggestion: this patch target does not look like it is used. _run_model_health_check() imports ensure_async_engine_loop inside the function from data_designer.engine.dataset_builders.utils.async_concurrency, so patching data_designer.engine.readiness.ensure_async_engine_loop will not intercept it. Could patch the source path instead and drop create=True; same thing applies to the timeout test below.
|
I think this new readiness path could eventually replace the implementation in |
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
c4715f9 to
c9c118f
Compare
|
@andreatgretel I updated the health_checks script to use this new method in b3a9d88, LMK what you think! |
andreatgretel
left a comment
There was a problem hiding this comment.
This replacement direction looks good to me: keeping scripts/health_checks.py as the predefined-provider/catalog entrypoint while delegating the actual probe path through DataDesigner.check_models() seems like the right shape.
One behavioral difference I noticed: the old script had stricter pass criteria than the shared readiness path. It retried chat completions and failed empty/non-string responses, and it required embeddings to return exactly one non-empty vector. The new path reports success as long as ModelRegistry.run_health_check() does not raise, so "" for chat or [[]] for embeddings could now pass.
Since this script is our maintainer smoke test for predefined provider defaults, I’d preserve those non-empty checks either in ModelRegistry.run_health_check() or as script-specific strict validation. Otherwise this looks good to me.
Signed-off-by: Mike Knepper <mknepper@nvidia.com>
|
I've updated |
📋 Summary
Introduces a new
check_modelsCLI command andDataDesignerinterface method for checking external readiness of models and tools without triggering a full workload (preview or create). This is the "external deps" analogue to the existingvalidatefunctionality (internal coherence).🔗 Related Issue
N/A, direct to PR
🔄 Changes
DatasetBuilderto a standalonereadiness.pyengine module. Used by both the builder and the end user-facing interfacesflags.pyengine module where the async engine flag is centralized. This cleans up some duplication of the env var magic string / constant that was floating around in a few places.🧪 Testing
make testpasses✅ Checklist