feat: check_models external readiness check by mikeknep · Pull Request #712 · NVIDIA-NeMo/DataDesigner

mikeknep · 2026-05-29T15:02:28Z

📋 Summary

Introduces a new check_models CLI command and DataDesigner interface method for checking external readiness of models and tools without triggering a full workload (preview or create). This is the "external deps" analogue to the existing validate functionality (internal coherence).

🔗 Related Issue

N/A, direct to PR

🔄 Changes

Lifts the checks run by DatasetBuilder to a standalone readiness.py engine module. Used by both the builder and the end user-facing interfaces
Creates a flags.py engine module where the async engine flag is centralized. This cleans up some duplication of the env var magic string / constant that was floating around in a few places.

🧪 Testing

make test passes
Unit tests added/updated
E2E tests added/updated (if applicable)

✅ Checklist

Follows commit message conventions
Commits are signed off (DCO)
Architecture docs updated (if applicable)

greptile-apps · 2026-05-29T15:07:46Z

Greptile Summary

This PR introduces a check_models CLI command and DataDesigner.check_models() interface method that runs the same external readiness probes (model health checks and MCP tool checks) that already gate preview and create, but without triggering a full workload. It also extracts the shared check logic into a new engine/readiness.py module and centralizes the async engine feature flag into a new engine/flags.py.

New readiness.py engine module lifts _run_model_health_check_if_needed and _run_mcp_tool_check_if_needed out of DatasetBuilder into a public run_readiness_check() function; both DatasetBuilder.build/build_preview and DataDesigner.check_models now share the exact same code path.
New flags.py centralizes DATA_DESIGNER_ASYNC_ENGINE so the engine, interface, and readiness modules read a single source of truth instead of each re-parsing the env var independently.
check-models CLI command mirrors the existing validate command shape — takes a config source, prints success or a typed error message, exits 1 on failure — complementing validate (internal readiness) with external readiness (provider liveness).

Confidence Score: 5/5

Safe to merge. The refactoring is a clean extraction of existing logic into a shared module with no behavioral regressions.

The readiness check logic is lifted verbatim from DatasetBuilder private methods into readiness.py, and both the existing workload startup path and the new check_models path call the same function. Moving _resolve_async_compatibility() before the readiness probe corrects a subtle inconsistency where the old health check dispatched on the raw env-var flag rather than the resolved async mode. Validation improvements in registry.py are additive correctness fixes. Test coverage is thorough across engine, interface, and CLI layers.

No files require special attention.

Important Files Changed

Filename	Overview
packages/data-designer-engine/src/data_designer/engine/readiness.py	New module; lifts model and MCP health check logic out of DatasetBuilder into a shared run_readiness_check() function. Logic is equivalent to the removed private methods, with the improvement that async vs. sync dispatch now follows the resolved client_concurrency_mode rather than the raw env-var flag.
packages/data-designer-engine/src/data_designer/engine/flags.py	New module centralizing DATA_DESIGNER_ASYNC_ENGINE flag; straightforward env-var read with a module-level bool, enabling tests to monkeypatch a single location.
packages/data-designer-engine/src/data_designer/engine/dataset_builders/dataset_builder.py	Replaces the two inline health-check methods with run_readiness_check(); moves _resolve_async_compatibility() before the readiness call to correctly match client mode; switches all flag references to flags.DATA_DESIGNER_ASYNC_ENGINE.
packages/data-designer-engine/src/data_designer/engine/models/registry.py	Adds _parse_health_check_chat_response (validates non-empty string) and _validate_health_check_embedding_response (validates exactly one non-empty vector) to replace the permissive lambda x: x parser and unvalidated embedding calls.
packages/data-designer/src/data_designer/interface/data_designer.py	Adds check_models() method adjacent to validate(); delegates to run_readiness_check with the resolved client concurrency mode.
packages/data-designer/src/data_designer/cli/controllers/generation_controller.py	Adds run_check_models(); handles DataDesignerError and generic Exception separately to surface the error class name for typed engine errors.
scripts/health_checks.py	Refactored to use DataDesigner.check_models() rather than calling ModelFacade directly; delegates embedding response validation to the registry layer.
packages/data-designer-engine/tests/engine/test_readiness.py	New test file with thorough coverage: alias collection, skip-when-empty, error propagation, ordering, async dispatch, timeout/cancel, and client-mode selection.
packages/data-designer/tests/interface/test_data_designer.py	Adds check_models interface tests covering delegation to run_readiness_check, typed error propagation, sync fallback mode, and column materialization.

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI as check_models_command
    participant Ctrl as GenerationController
    participant DD as DataDesigner
    participant RP as ResourceProvider
    participant RC as readiness.py
    participant MR as ModelRegistry
    participant MCP as MCPRegistry

    User->>CLI: data-designer check-models config.yaml
    CLI->>Ctrl: run_check_models(config_source)
    Ctrl->>Ctrl: _load_config(config_source)
    Ctrl->>DD: DataDesigner()
    Ctrl->>DD: check_models(config_builder)
    DD->>RP: _create_resource_provider(check-models, config_builder)
    DD->>DD: _resolve_client_concurrency_mode(config_builder)
    DD->>RC: run_readiness_check(columns, resource_provider, client_concurrency_mode)
    RC->>MR: run_health_check(model_aliases) or arun_health_check(...)
    MR-->>RC: OK or raises typed error
    RC->>MCP: run_health_check(tool_aliases)
    MCP-->>RC: OK or raises
    RC-->>DD: None
    DD-->>Ctrl: None
    Ctrl->>User: All models and tools responded successfully

_{Reviews (5): Last reviewed commit: "Be more strict with health check respons..." | Re-trigger Greptile}

andreatgretel · 2026-06-09T21:06:01Z

+    if not model_aliases:
+        return
+
+    if flags.DATA_DESIGNER_ASYNC_ENGINE:


small edge case: check_models() can build sync-mode clients when the config forces sync fallback, like allow_resize=True, but readiness still branches on the raw async env flag and calls arun_health_check(). I know allow_resize / sync are probably on the way out, so maybe this is just a known limitation, but it might be worth either matching the resolved client mode here or adding a regression/explicit guard so users do not hit an async/sync client internals error.

Interesting, I overlooked this, thanks. Addressed in c9c118f. This should be fairly straightforward to unwind if we fully drop the sync engine.

andreatgretel · 2026-06-09T21:06:02Z

+    fake_loop = Mock()
+
+    with (
+        patch("data_designer.engine.readiness.ensure_async_engine_loop", return_value=fake_loop, create=True),


suggestion: this patch target does not look like it is used. _run_model_health_check() imports ensure_async_engine_loop inside the function from data_designer.engine.dataset_builders.utils.async_concurrency, so patching data_designer.engine.readiness.ensure_async_engine_loop will not intercept it. Could patch the source path instead and drop create=True; same thing applies to the timeout test below.

Nice catch, thanks. Fixed in 8f52dbd

andreatgretel · 2026-06-09T21:06:03Z

I think this new readiness path could eventually replace the implementation in scripts/health_checks.py, but probably not the script itself. The script is catalog-scoped, checking the predefined provider/model defaults when env vars are present; check_models is config-scoped. A nice follow-up might be to keep the script as the maintainer entrypoint, but have it build small temporary configs from PREDEFINED_PROVIDERS_MODEL_MAP and call the shared readiness path so the actual probe logic does not drift.

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

mikeknep · 2026-06-10T16:29:06Z

@andreatgretel I updated the health_checks script to use this new method in b3a9d88, LMK what you think!

andreatgretel

This replacement direction looks good to me: keeping scripts/health_checks.py as the predefined-provider/catalog entrypoint while delegating the actual probe path through DataDesigner.check_models() seems like the right shape.

One behavioral difference I noticed: the old script had stricter pass criteria than the shared readiness path. It retried chat completions and failed empty/non-string responses, and it required embeddings to return exactly one non-empty vector. The new path reports success as long as ModelRegistry.run_health_check() does not raise, so "" for chat or [[]] for embeddings could now pass.

Since this script is our maintainer smoke test for predefined provider defaults, I’d preserve those non-empty checks either in ModelRegistry.run_health_check() or as script-specific strict validation. Otherwise this looks good to me.

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

mikeknep · 2026-06-11T19:33:42Z

I've updated ModelRegistry.[a]run_health_check with more strict checking of the responses in 95e213c

mikeknep requested a review from a team as a code owner May 29, 2026 15:02

mikeknep had a problem deploying to agentic-ci May 29, 2026 15:02 — with GitHub Actions Failure

mikeknep force-pushed the dd-model-health/mknepper branch from 6ef34db to 88f1314 Compare June 4, 2026 14:09

github-actions Bot mentioned this pull request Jun 8, 2026

Agentic CI: Issue & PR Triage Tracker #562

Open

andreatgretel reviewed Jun 9, 2026

View reviewed changes

mikeknep added 8 commits June 10, 2026 09:45

Plan for check-models

9bc45f2

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

Agent impl for check-models

ddb9ad8

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

Centralize async engine flag in a shared module

f1e519f

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

Cleanup

1527c84

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

post-rebase cleanup

2e142bb

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

patch async loop source in readiness tests

8f52dbd

Use in health_checks script

b3a9d88

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

align readiness with client mode

c9c118f

mikeknep force-pushed the dd-model-health/mknepper branch from c4715f9 to c9c118f Compare June 10, 2026 16:26

mikeknep requested a review from andreatgretel June 10, 2026 16:29

andreatgretel previously approved these changes Jun 11, 2026

View reviewed changes

Be more strict with health check responses

95e213c

Signed-off-by: Mike Knepper <mknepper@nvidia.com>

mikeknep dismissed andreatgretel’s stale review via 95e213c June 11, 2026 19:31

mikeknep requested a review from andreatgretel June 11, 2026 19:33

andreatgretel approved these changes Jun 11, 2026

View reviewed changes

mikeknep merged commit abe7667 into main Jun 11, 2026
61 checks passed

mikeknep deleted the dd-model-health/mknepper branch June 11, 2026 20:35

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: check_models external readiness check#712

feat: check_models external readiness check#712
mikeknep merged 9 commits into
mainfrom
dd-model-health/mknepper

mikeknep commented May 29, 2026

Uh oh!

greptile-apps Bot commented May 29, 2026 •

edited

Loading

Confidence Score: 5/5

Sequence Diagram

Uh oh!

andreatgretel Jun 9, 2026

Uh oh!

mikeknep Jun 10, 2026

Uh oh!

andreatgretel Jun 9, 2026

Uh oh!

mikeknep Jun 10, 2026

Uh oh!

andreatgretel commented Jun 9, 2026

Uh oh!

mikeknep commented Jun 10, 2026

Uh oh!

andreatgretel left a comment

Uh oh!

mikeknep commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

mikeknep commented May 29, 2026

📋 Summary

🔗 Related Issue

🔄 Changes

🧪 Testing

✅ Checklist

Uh oh!

greptile-apps Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

andreatgretel Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

mikeknep Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

andreatgretel Jun 9, 2026

Choose a reason for hiding this comment

Uh oh!

mikeknep Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

andreatgretel commented Jun 9, 2026

Uh oh!

mikeknep commented Jun 10, 2026

Uh oh!

andreatgretel left a comment

Choose a reason for hiding this comment

Uh oh!

mikeknep commented Jun 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

greptile-apps Bot commented May 29, 2026 •

edited

Loading