feat: shared assertion templates (include_assertions) #948

@christso

Problem

Users frequently reuse the same assertion sets across multiple tests and eval files — safety checks, format validation, tone requirements. Currently they must copy-paste assertion blocks, leading to:

  • Duplication across EVAL.yaml files
  • Drift when updating shared criteria
  • Verbose eval files that obscure test-specific logic

Proposed Design

Reusable assertion template files

# .agentv/templates/safe-response.yaml
assertions:
  - type: llm-grader
    prompt: ./graders/no-hallucination.md
    required: true
  - type: llm-grader
    prompt: ./graders/safe-content.md
    required: true
    min_score: 0.9
  - type: contains
    value: "disclaimer"
    negate: true

Reference from EVAL.yaml

# evals/EVAL.yaml
tests:
  - id: refund-request
    input: "I want a refund"
    assertions:
      - include: safe-response          # resolves .agentv/templates/safe-response.yaml
      - type: llm-grader
        prompt: ./graders/refund-quality.md   # test-specific assertion
  - id: greeting
    input: "Hello"
    assertions:
      - include: safe-response          # same shared template
      - type: contains
        value: "hello"

Suite-level includes

# Apply to all tests
assertions:
  - include: safe-response

tests:
  - id: test-1
    input: "..."
    # inherits safe-response assertions + can add test-specific ones
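A minimal sketch of how suite-level entries might combine with a test's own entries. The names `TestCase` and `effectiveAssertions` are hypothetical, and placing suite entries before test entries is an assumption; the `skip_defaults` flag is the one described under Resolution rules.

```typescript
// Hypothetical shapes for illustration; the real evaluator config types live in the codebase.
type Assertion = Record<string, unknown>;

interface TestCase {
  id: string;
  assertions?: Assertion[];
  skip_defaults?: boolean;
}

// Suite-level entries come first, then the test's own entries (ordering is an assumption).
// A test with skip_defaults: true keeps only its own entries.
function effectiveAssertions(suiteLevel: Assertion[], test: TestCase): Assertion[] {
  const own = test.assertions ?? [];
  return test.skip_defaults ? [...own] : [...suiteLevel, ...own];
}

const suiteLevel: Assertion[] = [{ include: "safe-response" }];
const inheriting: TestCase = {
  id: "test-1",
  assertions: [{ type: "contains", value: "hello" }],
};
const optedOut: TestCase = { id: "test-2", skip_defaults: true };

console.log(effectiveAssertions(suiteLevel, inheriting).length); // 2
console.log(effectiveAssertions(suiteLevel, optedOut).length); // 0
```

Include entries are left unexpanded here; expansion into concrete assertions would be a separate step.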

Resolution rules

  1. include: name resolves to .agentv/templates/{name}.yaml
  2. Relative paths also work: include: ./my-templates/safety.yaml
  3. Templates can include other templates (max depth 3; the limit also stops include cycles, which fail once they reach it)
  4. Test-level assertions merge with included assertions (not replace)
  5. skip_defaults: true on a test still skips suite-level includes
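Rules 3 and 4 could be sketched as a recursive flatten with a depth guard. Everything named here (`Entry`, `expandIncludes`, the `load` callback, the demo templates) is hypothetical, and counting depth as the number of include hops from the eval file is an assumption about how "max depth 3" is meant.

```typescript
interface IncludeEntry { include: string }
interface EvaluatorEntry { type: string; prompt?: string; value?: string }
type Entry = IncludeEntry | EvaluatorEntry;

const MAX_INCLUDE_DEPTH = 3;

// `load` is a hypothetical callback that returns a template's `assertions` array.
// Included entries are spliced in place, so they merge with their neighbours
// instead of replacing them.
function expandIncludes(
  entries: Entry[],
  load: (ref: string) => Entry[],
  depth = 0,
): Entry[] {
  const out: Entry[] = [];
  for (const entry of entries) {
    if ("include" in entry) {
      if (depth >= MAX_INCLUDE_DEPTH) {
        throw new Error(`include depth exceeds ${MAX_INCLUDE_DEPTH} at "${entry.include}"`);
      }
      out.push(...expandIncludes(load(entry.include), load, depth + 1));
    } else {
      out.push(entry);
    }
  }
  return out;
}

// In-memory stand-in for template files (illustrative data only).
const demoTemplates: Record<string, Entry[]> = {
  a: [{ include: "b" }, { type: "contains", value: "disclaimer" }],
  b: [{ type: "llm-grader", prompt: "./graders/safe-content.md" }],
  loop: [{ include: "loop" }],
};

function loadDemo(ref: string): Entry[] {
  const found = demoTemplates[ref];
  if (!found) throw new Error(`unknown template "${ref}"`);
  return found;
}

const flat = expandIncludes([{ include: "a" }, { type: "equals" }], loadDemo);
console.log(flat.map((e) => ("type" in e ? e.type : "include")));
// [ 'llm-grader', 'contains', 'equals' ]
```

A self-including template such as `loop` never terminates on its own; the depth guard is what turns it into an error.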

Template Location

Two resolution mechanisms, no config needed:

  1. Convention directory: include: safe-response resolves to .agentv/templates/safe-response.yaml
  2. Relative path: include: ./my-templates/safety.yaml resolves relative to the eval file

| Use case | How |
| --- | --- |
| Shared across repo | .agentv/templates/ (convention) |
| Co-located with evals | include: ./shared/safety.yaml |
| Shared across eval suites | include: ../../common/safety.yaml |
| Monorepo shared | include: ../../../packages/evals/templates/safety.yaml |

Relative paths are explicit and traceable — no ambiguity about which directory a template resolved from.
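The two mechanisms might be told apart by the leading `./` or `../`, as in the sketch below. `resolveTemplatePath` and its parameters are hypothetical, and anchoring the convention directory at a single `repoRoot` is a simplifying assumption. POSIX-style paths are used so the results are the same on every platform.

```typescript
import * as path from "node:path";

// Hypothetical helper: bare names go to the convention directory,
// anything starting with ./ or ../ resolves relative to the eval file.
function resolveTemplatePath(ref: string, evalFilePath: string, repoRoot: string): string {
  const isRelative = ref.startsWith("./") || ref.startsWith("../");
  if (isRelative) {
    return path.posix.resolve(path.posix.dirname(evalFilePath), ref);
  }
  return path.posix.join(repoRoot, ".agentv", "templates", `${ref}.yaml`);
}

console.log(resolveTemplatePath("safe-response", "/repo/evals/EVAL.yaml", "/repo"));
// /repo/.agentv/templates/safe-response.yaml
console.log(resolveTemplatePath("./shared/safety.yaml", "/repo/evals/EVAL.yaml", "/repo"));
// /repo/evals/shared/safety.yaml
```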

Implementation

Files to modify

  1. packages/core/src/evaluation/validation/eval-file.schema.ts — add include variant to EvaluatorSchema
  2. packages/core/src/evaluation/loaders/evaluator-parser.ts — resolve include references, load template files, flatten into assertion list
  3. packages/core/src/evaluation/yaml-parser.ts — handle include resolution during test loading

Template discovery

.agentv/templates/           # convention directory
  safe-response.yaml
  json-output.yaml
  professional-tone.yaml

Discovery chain (closest wins): {eval-dir}/.agentv/templates/, then {repo-root}/.agentv/templates/
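A sketch of the "closest wins" lookup, reading the chain as the eval file's directory first and the repo root second. `findTemplate` and the injected `exists` predicate are hypothetical; a real loader would probably pass something like `fs.existsSync`. The error lists every path that was tried, in line with the acceptance signal about reporting the resolved path.

```typescript
import * as path from "node:path";

// Hypothetical lookup: try the eval directory's convention folder first,
// then the repo root's. `exists` is injected so the sketch needs no real files.
function findTemplate(
  name: string,
  evalDir: string,
  repoRoot: string,
  exists: (p: string) => boolean,
): string {
  const candidates = [evalDir, repoRoot].map((dir) =>
    path.posix.join(dir, ".agentv", "templates", `${name}.yaml`),
  );
  const hit = candidates.find((p) => exists(p));
  if (hit === undefined) {
    throw new Error(`template "${name}" not found; searched: ${candidates.join(", ")}`);
  }
  return hit;
}

// Only the repo-root copy exists in this illustrative file set.
const onDisk = new Set(["/repo/.agentv/templates/json-output.yaml"]);
console.log(findTemplate("json-output", "/repo/evals", "/repo", (p) => onDisk.has(p)));
// /repo/.agentv/templates/json-output.yaml
```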

Template file format

Same as an assertion block — a YAML file with a top-level assertions array. Each entry is a standard evaluator config. This keeps templates authorable by AI agents (same schema as inline assertions).
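A rough shape check for a parsed template might look like the following. `parseTemplate` is a hypothetical name, the input is assumed to be the already-parsed YAML document, and requiring a string `type` or `include` on each entry is an assumption; the project's real schema would validate far more.

```typescript
function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Hypothetical validator: a template must carry a top-level `assertions` array
// whose entries look like either an evaluator config or an include reference.
function parseTemplate(doc: unknown, source: string): Record<string, unknown>[] {
  const list: unknown = isRecord(doc) ? doc.assertions : undefined;
  if (!Array.isArray(list)) {
    throw new Error(`${source}: expected a top-level "assertions" array`);
  }
  return list.map((entry: unknown, i: number) => {
    if (isRecord(entry) && (typeof entry.type === "string" || typeof entry.include === "string")) {
      return entry;
    }
    throw new Error(`${source}: assertions[${i}] needs a string "type" or "include"`);
  });
}

const parsedDoc = {
  assertions: [
    { type: "contains", value: "disclaimer", negate: true },
    { include: "json-output" },
  ],
};
console.log(parseTemplate(parsedDoc, "safe-response.yaml").length); // 2
```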

Research Context

Inspired by RSpec's shared_examples pattern — define reusable test behaviors that can be included with context. See test-framework-assertion-patterns research.

Dependencies

Acceptance Signals

  1. include: name resolves correctly — loads from .agentv/templates/{name}.yaml — verified by test
  2. include: ./path resolves relative paths — verified by test
  3. Suite-level includes apply to all tests unless skip_defaults: true — verified by test
  4. Test-level includes merge with test-specific assertions — verified by test
  5. Nested includes work up to depth 3; depth > 3 produces a clear error — verified by test
  6. Missing template produces a clear error with the resolved path — verified by test
  7. Schema validation accepts include entries in assertion arrays — verified by test
  8. eval-schema.json regenerated with include support
  9. All existing tests pass — no regressions

Non-goals

  • Template parameters/variables (keep it simple — templates are static assertion sets)
  • Template versioning
  • Remote template registries
  • Cross-repo template sharing (use relative paths with monorepo layout)
  • Configurable template_dirs in config.yaml (relative paths already cover custom locations — a config-based search path creates "which directory did this resolve from?" debugging headaches)
  • Template inheritance/override (templates are flat assertion lists, not class hierarchies)
