fix(airt): normalize attacks arg in generate_category_attack by rdheekonda · Pull Request #35 · dreadnode/capabilities

rdheekonda · 2026-06-04T00:31:59Z

Problem

generate_category_attack's attacks parameter was iterated character-by-character when passed as a bare string, producing cryptic errors like Unknown attack: 't' / Unknown attack: '['. This made the entire category-sweep path effectively unusable — no format (bare string, full SDK name, JSON list) worked.

Root cause: the runner read attacks_raw = params.get("attacks", []) and looped for a in attacks_raw directly, so "tap" became ['t', 'a', 'p']. By contrast, generate_attack correctly does attack_type.split(",").
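The failure mode can be reproduced in isolation. The following is a minimal sketch, not the runner's actual code: `KNOWN_ATTACKS` and `resolve_buggy` are hypothetical stand-ins, and only the `params.get("attacks", [])` read and the direct loop are taken from the description above.

```python
# Hypothetical registry; the real lookup table in attack_runner.py is not shown in this PR.
KNOWN_ATTACKS = {"tap", "goat"}


def resolve_buggy(params: dict) -> list[str]:
    """Mimic the pre-fix loop and collect the error messages it would produce."""
    attacks_raw = params.get("attacks", [])
    errors = []
    for a in attacks_raw:  # iterating a str yields one character at a time
        if a not in KNOWN_ATTACKS:
            errors.append(f"Unknown attack: {a!r}")
    return errors


# A bare string is split into characters, so every "name" is unknown.
print(resolve_buggy({"attacks": "tap"})[0])      # Unknown attack: 't'
# A JSON-style list passed as a string fails on its opening bracket.
print(resolve_buggy({"attacks": '["tap"]'})[0])  # Unknown attack: '['
```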

Fix

scripts/attack_runner.py: add _normalize_attack_names() accepting a list (["tap", "goat"]), comma-separated string ("tap,goat"), single name ("tap"), or stringified-list noise ("['tap']"). Return a clear validation error instead of a single-character failure.
tools/attacks.py: widen the attacks annotation to list[str] | str.
tests/test_attack_runner.py: add TestNormalizeAttackNames regression coverage (10 cases + explicit "must not split to chars" assertion).
skills/error-troubleshooting/SKILL.md: document the single-character failure signature.
agents/ai-red-teaming-agent.md: add category-tool auto-fallback guidance.

Verification

py_compile passes on all changed Python files.
Normalization logic validated against all parametrized cases.
End-to-end: re-ran the previously-failing generate_category_attack(attacks=["tap"], categories=["violence"], …) — it now parses correctly, loads bundled violence goals, and runs a real 3-goal sweep to completion.

The `attacks` parameter was iterated character-by-character when passed as a bare string, producing cryptic "Unknown attack: 't'" errors and making the entire category-sweep path unusable.

- Add _normalize_attack_names() to accept list, comma-separated string, single name, or stringified-list noise; mirrors generate_attack's attack_type.split(",") handling.
- Return a clear validation error instead of a single-character failure.
- Widen the tool annotation to list[str] | str.
- Add TestNormalizeAttackNames regression coverage.
- Document the failure signature in error-troubleshooting skill.
- Add category-tool auto-fallback guidance to the agent instructions.

rdheekonda added 3 commits June 4, 2026 00:31

fix(airt): use object instead of t.Any in _normalize_attack_names

849f98b

attack_runner.py does not import typing as t, so the t.Any annotation tripped ruff F821 (undefined name) in CI. Use the builtin object annotation, which needs no import.

chore(ai-red-teaming): bump version 1.3.6 -> 1.3.7

073885a

rdheekonda merged commit b2803d9 into main Jun 4, 2026
5 checks passed

rdheekonda merged 3 commits into main from fix/airt-category-attacks-arg-parsing
