fix(evaluators): Update Luna support#206

Open
mikebranc wants to merge 1 commit into main from fix-luna-plugin-bugs

mikebranc (Collaborator) commented May 1, 2026

Summary

https://app.shortcut.com/galileo/story/63852/agent-control-luna-guardrails-not-working-as-expected

  • Fix metric names in Luna2Metric: output_toxicity → toxicity, pii_detection → pii; add input_pii and the not_empty operator
  • Require stage_name for local stages: it was previously validated only for central stages, but the API needs it for both; the validator and field description are updated accordingly
  • Add categorical operator support: Protect's local-stage rule engine doesn't evaluate not_empty/any server-side (it always returns not_triggered), so _evaluate_metric_results was added as a client-side fallback
  • Pass stage_name in local stage API calls: it was being configured but not forwarded to invoke_protect
  • Test fixes: updated all local stage configs to include stage_name, corrected stale metric names (output_toxicity, pii_detection), and added test_local_stage_requires_stage_name and TestLuna2BuildMessage coverage
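
For reviewers, here is a rough sketch of the two behavioral changes above: requiring stage_name for both stage types, and evaluating categorical operators client-side. It is not the code in evaluator.py. `StageConfig`, `is_empty`, and `evaluate_categorical` are hypothetical names used only for illustration, and the semantics shown for `any` (trigger when any detected category appears in a target list) are an assumption rather than something this description confirms.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class StageConfig:
    """Hypothetical stand-in for the evaluator's stage configuration."""

    stage_type: str  # "local" or "central"
    stage_name: Optional[str] = None

    def __post_init__(self) -> None:
        if self.stage_type not in ("local", "central"):
            raise ValueError(f"unknown stage_type: {self.stage_type!r}")
        # stage_name is required for both stage types, not only central.
        if not self.stage_name:
            raise ValueError("stage_name is required for local and central stages")


def is_empty(value: Any) -> bool:
    """Treat None, blank strings, and empty collections as empty."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def evaluate_categorical(
    operator: str, value: Any, targets: Optional[list] = None
) -> bool:
    """Return True when a rule with a categorical operator should trigger."""
    if operator == "not_empty":
        return not is_empty(value)
    if operator == "any":
        # Assumed semantics: trigger if any detected category is a target.
        detected = value if isinstance(value, (list, tuple, set)) else [value]
        return any(item in (targets or []) for item in detected)
    raise ValueError(f"unsupported categorical operator: {operator!r}")
```

With this sketch, `evaluate_categorical("not_empty", ["email"])` triggers while `evaluate_categorical("not_empty", [])` does not, and `StageConfig(stage_type="local")` raises a `ValueError` because no stage_name was given.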

mikebranc changed the title from "feat: (wip) update protect support" to "feat: (wip) update Luna support" on May 1, 2026
mikebranc force-pushed the fix-luna-plugin-bugs branch from a37f229 to a412ee9 on May 5, 2026 20:19
mikebranc changed the title from "feat: (wip) update Luna support" to "fix: Update Luna support" on May 5, 2026
mikebranc changed the title from "fix: Update Luna support" to "fix(evaluators): Update Luna support" on May 5, 2026
mikebranc force-pushed the fix-luna-plugin-bugs branch from a412ee9 to 9ebd657 on May 5, 2026 20:25
mikebranc force-pushed the fix-luna-plugin-bugs branch from 9ebd657 to 3d77d08 on May 5, 2026 20:28

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 41.37931% with 34 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...agent_control_evaluator_galileo/luna2/evaluator.py | 37.03% | 34 Missing ⚠️ |

