nullhack · Apr 26, 2026
diff --git a/.flowception/.gitkeep b/.flowception/.gitkeep
diff --git a/.gitignore b/.gitignore
@@ -169,4 +169,8 @@ cython_debug/
 #  option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
 .mutmut-cache
+
+# Flowception session files (local working state)
+.flowception/session-*.yaml
+
 # Trigger CI run to verify linting fixes
diff --git a/.opencode/agents/product-owner.md b/.opencode/agents/product-owner.md
@@ -19,7 +19,7 @@ You interview the human stakeholder to discover what to build, write Gherkin spe
 
 ## Session Start
 
-Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
+Load `skill run-session` first — it reads docs/flows/feature-flow.yaml, orients you to the current step and feature, and tells you what to do next.
 
 **[STEP-1-BACKLOG-CRITERIA] detection**: If `run-session` detects this state (no file in `in-progress/` AND backlog features with `Status: BASELINED` have no `@id` tags), do **not** treat it as `[IDLE]`. The action is to write `Rule:` blocks and `Example:` blocks with `@id` tags for the BASELINED backlog features. Files stay in `backlog/`. Do NOT move any feature to `in-progress/` during this state.
 
@@ -45,8 +45,8 @@ After the system-architect approves (Step 4):
 
 1. Run or observe the feature yourself. If user interaction is involved, interact with it. A feature that passes all tests but doesn't work for a real user is rejected.
 2. Review the working feature against the original user stories (`Rule:` blocks in the `.feature` file).
-3. **If accepted**: move `docs/features/in-progress/<name>.feature` → `docs/features/completed/<name>.feature`; update `WORK.md` (`@state: STEP-5-MERGE`); notify stakeholder. The stakeholder decides when to trigger PR and release. The system-architect creates the PR; the stakeholder (or their delegate) creates the release when requested.
-4. **If rejected**: write specific feedback in `WORK.md` pointing to the failing step, then send back to the relevant step.
+3. **If accepted**: move `docs/features/in-progress/<name>.feature` → `docs/features/completed/<name>.feature`; update the session file in `.flowception/` (`@state: STEP-5-MERGE`); notify stakeholder. The stakeholder decides when to trigger PR and release. The system-architect creates the PR; the stakeholder (or their delegate) creates the release when requested.
+4. **If rejected**: write specific feedback in the session file in `.flowception/` pointing to the failing step, then send back to the relevant step.
 
 ## Handling Gaps
 
@@ -64,11 +64,11 @@ When a gap is reported (by software-engineer or system-architect):
 When a defect is reported against any feature:
 
 1. Add a `@bug` Example to the relevant `Rule:` block in the `.feature` file using the standard `Given/When/Then` format describing the correct behaviour.
-2. Update `WORK.md` `@state` to reflect the bug work and notify the software-engineer.
+2. Update `@state` in the session file in `.flowception/` to reflect the bug work and notify the software-engineer.
 3. SE implements the test in `tests/features/` **and** a `@given` Hypothesis property test in `tests/unit/`. Both are required.
 
 ## Available Skills
 
 - `run-session` — session start/end protocol
-- `select-feature` — when FLOW.md Status is [IDLE]: score and select next backlog feature using WSJF
+- `select-feature` — when docs/flows/feature-flow.yaml Status is [IDLE]: score and select next backlog feature using WSJF
 - `define-scope` — Step 1: Stage 1 (Discovery sessions with stakeholder) and Stage 2 (Stories + Criteria, PO alone)
diff --git a/.opencode/agents/software-engineer.md b/.opencode/agents/software-engineer.md
@@ -31,7 +31,7 @@ You implement everything the system-architect designed. You own the code: tests,
 
 ## Session Start
 
-Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
+Load `skill run-session` first — it reads docs/flows/feature-flow.yaml, orients you to the current step and feature, and tells you what to do next.
 
 ## Step Routing
 
@@ -47,20 +47,20 @@ Load `skill run-session` first — it reads FLOW.md, orients you to the current
 - You own git commits and releases
 - **System-architect approves**: any change to stubs, Protocols, or ADR decisions
 - **PO approves**: new runtime dependencies, changed entry points, scope changes
-- **You never move `.feature` files.** The PO is the sole owner of all feature file moves (backlog → in-progress → completed). If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Write the gap in FLOW.md and escalate to PO.
+- **You never move `.feature` files.** The PO is the sole owner of all feature file moves (backlog → in-progress → completed). If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Write the gap in the session file in `.flowception/` and escalate to PO.
 
 ## No In-Progress Feature
 
 If `docs/features/in-progress/` contains only `.gitkeep` (no `.feature` file):
 1. Do not pick a feature from backlog yourself.
-2. Update `WORK.md` `@state` to `[IDLE]` if it is not already.
+2. Set `@state` to `[IDLE]` in the session file in `.flowception/` if it is not already.
 3. Stop. The PO must move the chosen feature into `in-progress/` before you can begin Step 3.
 
 ## Spec Gaps
 
 If during implementation you discover behaviour not covered by existing acceptance criteria:
 - Do not extend criteria yourself — escalate to the PO
-- Note the gap in `WORK.md` and escalate to PO
+- Note the gap in the session file in `.flowception/` and escalate to PO
 
 ## Available Skills
 

diff --git a/.opencode/agents/system-architect.md b/.opencode/agents/system-architect.md
@@ -31,13 +31,13 @@ You design the system's structure and verify that the implementation respects th
 
 ## Session Start
 
-Load `skill run-session` first — it reads FLOW.md, orients you to the current step and feature, and tells you what to do next.
+Load `skill run-session` first — it reads docs/flows/feature-flow.yaml, orients you to the current step and feature, and tells you what to do next.
 
 ## Step Routing
 
 | Step | Action |
 |---|---|
-| **Step 2 — ARCH** | Load `skill architect` — verify on `feat/<stem>` branch, design domain model, write stubs, create ADRs, generate test stubs |
+| **Step 2 — ARCH** | Load `skill architect` — arch-cycle subflow (read → interview → validate → design → stubs), design domain model, write stubs, create ADRs, generate test stubs |
 | **Step 4 — VERIFY** | Load `skill verify` — adversarial technical review of the SE's implementation |
 | **Step 5 — after PO accepts** | Load `skill create-pr` — create and merge the feature pull request |
 
@@ -47,13 +47,13 @@ Load `skill run-session` first — it reads FLOW.md, orients you to the current
 - You own `docs/system.md` (including the `## Domain Model` section) and `docs/adr/ADR-*.md` — create and update these at Step 2; draft ADRs first, then present a validation table to the stakeholder before committing
 - You review implementation at Step 4 to ensure architectural decisions were respected
 - **PO approves**: new runtime dependencies, changed entry points, scope changes
-- **You never move `.feature` files.** The PO is the sole owner of all feature file moves. If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Update `WORK.md` `@state` to `[IDLE]` and escalate to PO.
+- **You never move `.feature` files.** The PO is the sole owner of all feature file moves. If you find no `.feature` file in `docs/features/in-progress/`, **STOP** — do not self-select a feature. Update the session file in `.flowception/` `@state` to `[IDLE]` and escalate to PO.
 
 ## Step 2 → Step 3 Handoff
 
-After architecture is complete and test stubs are generated:
-1. Commit all changes on `feat/<stem>`
-2. Update `WORK.md`: set `@state: STEP-3-WORKING`
+After architecture is complete (arch-cycle subflow exits `complete`) and test stubs are generated:
+1. Commit all changes on the current branch. The SE creates the feature branch at Step 3 start, so if no feature branch exists yet the SA may commit on `main` and the SE will branch from that commit.
+2. Update the session file in `.flowception/`: set `@state: step-3-working` (the TDD subflow's `setup` state handles branch creation)
 3. Stop. The SE takes over for implementation.
 
 ## Step 4 Review Stance
@@ -67,7 +67,7 @@ Your default hypothesis is that the code is broken despite passing automated che
 
 If during Step 2 or Step 4 you discover behaviour not covered by existing acceptance criteria:
 - Do not extend criteria yourself — escalate to the PO
-- Note the gap in `WORK.md` and escalate to PO
+- Note the gap in the session file in `.flowception/` and escalate to PO
 
 ## Available Skills
 

diff --git a/.opencode/knowledge/agent-design/opencode-format.md b/.opencode/knowledge/agent-design/opencode-format.md
@@ -0,0 +1,100 @@
+---
+domain: agent-design
+tags: [agents, opencode, format, configuration]
+last-updated: 2026-04-26
+---
+
+# OpenCode Agent Format
+
+## Key Takeaways
+
+- Agent files live at `.opencode/agents/<name>.md` (project) or `~/.config/opencode/agents/<name>.md` (global); the filename becomes the agent name.
+- Frontmatter requires `description` and `mode` (primary/subagent/all); optional fields include model, temperature, steps, permissions, and more.
+- Body sections in order: Role, Available Skills, Instructions, Escalation; write in third person.
+- Permission values are `allow` (run immediately), `ask` (prompt user), `deny` (hidden/rejected); wildcards supported, last matching rule wins.
+
+## Concepts
+
+**File Location and Naming**: Agent files are discovered at `.opencode/agents/<name>.md` (project-level) and `~/.config/opencode/agents/<name>.md` (global). The filename without `.md` becomes the agent name. Project-level takes precedence.
+
+**YAML Frontmatter Fields**: Required fields are `description` (1-sentence, shown in agent selection) and `mode` (primary for main agents, subagent for agents invoked by others, all for either). Key optional fields: `model` (override default model), `steps` (max agentic iterations), `hidden` (hide subagents from autocomplete), `permission` (fine-grained tool access control), `prompt` (custom system prompt).
+
+**Body Structure**: Body sections in order: Role (who the agent is and what it owns), Available Skills (which skills to load and when), Instructions (step-by-step actions), Escalation (when to hand off).
+
+**Permission Patterns**: Permission values are `allow` (run immediately), `ask` (prompt user), `deny` (hidden/rejected); wildcards are supported and the last matching rule wins. Common patterns: Read-only (deny edit, ask bash), Build (allow edit and bash), Restricted (ask for both).
+
+## Content
+
+### File Locations
+
+- Project: `.opencode/agents/<name>.md`
+- Global: `~/.config/opencode/agents/<name>.md`
+
+The filename (without `.md`) becomes the agent name.
+
+### YAML Frontmatter
+
+```yaml
+---
+description: <1-sentence description>  # Required
+mode: primary | subagent | all       # Required
+model: <provider/model-id>           # Optional; inherits from primary
+temperature: <0.0-1.0>               # Optional; model default
+steps: <integer>                      # Optional; max agentic iterations
+disable: true | false                 # Optional; default false
+hidden: true | false                   # Optional; subagent only; hides from @ autocomplete
+prompt: <text or {file:./path}>       # Optional; custom system prompt
+color: <hex or theme-color>           # Optional; UI color
+top_p: <0.0-1.0>                      # Optional; response diversity
+permission:
+  edit: allow | ask | deny
+  bash:
+    "*": ask | allow | deny            # Wildcard; last matching rule wins
+    "git status *": allow              # Specific command patterns
+  webfetch: allow | ask | deny
+  skill:
+    "<skill-name>": allow | deny
+  task:
+    "*": deny | allow
+    "<agent-name>": allow
+---
+```
+
+### Key Fields
+
+- **description** (required): What the agent does and when to use it. Shown in agent selection.
+- **mode**: `primary` for main agents, `subagent` for agents invoked by others, `all` for either.
+- **model**: Override the default model. Subagents inherit the invoking primary's model unless specified.
+- **steps**: Maximum agentic iterations before forced text-only response.
+- **hidden**: Only for `mode: subagent`. Hides from `@` autocomplete but still invokable via Task tool.
+- **permission.task**: Controls which subagents this agent can invoke via the Task tool. Glob patterns supported; last matching rule wins.
+
+### Permission Values
+
+| Value | Behavior |
+|---|---|
+| `allow` | Tool runs immediately without approval |
+| `ask` | User prompted for approval before running |
+| `deny` | Tool hidden from agent, access rejected |
+
+### Markdown Body
+
+After frontmatter, write the agent's instructions. Key sections:
+
+1. **Role** — who the agent is and what it owns
+2. **Available Skills** — which skills to load and when
+3. **Instructions** — step-by-step actions for each owned step
+4. **Escalation** — when to hand off to another agent or human
+
+### Common Permission Patterns
+
+| Pattern | edit | bash | Use Case |
+|---|---|---|---|
+| Read-only | deny | ask (specific: allow git read commands) | Review, analysis |
+| Build | allow | allow | Full development |
+| Restricted | ask | ask | Planning, cautious editing |
+
+## Related
+
+- [[agent-design/principles]]
+- [[skill-design/opencode-format]]
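
A minimal sketch, not part of this patch, of the "last matching rule wins" behaviour described in `opencode-format.md`. It assumes glob-style matching via Python's standard `fnmatch` module and a fallback of `ask` when no rule matches. Both are assumptions for illustration, and `resolve_permission` is a hypothetical helper rather than OpenCode's actual implementation.

```python
from fnmatch import fnmatchcase


def resolve_permission(rules: dict[str, str], command: str, default: str = "ask") -> str:
    """Return the value of the last rule whose pattern matches the command.

    Hypothetical helper: the glob semantics and the default are assumptions.
    """
    decision = default
    # Dicts preserve insertion order, so later rules override earlier ones.
    for pattern, value in rules.items():
        if fnmatchcase(command, pattern):
            decision = value
    return decision


bash_rules = {"*": "ask", "git status *": "allow"}
print(resolve_permission(bash_rules, "git status --short"))  # allow
print(resolve_permission(bash_rules, "rm -rf build"))  # ask
```

Under this ordering, the wildcard goes first and the specific patterns follow it, which is what lets a specific rule override the catch-all.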
diff --git a/.opencode/knowledge/agent-design/principles.md b/.opencode/knowledge/agent-design/principles.md
@@ -0,0 +1,93 @@
+---
+domain: agent-design
+tags: [agents, best-practices, ownership, context-isolation, research-backed]
+last-updated: 2026-04-26
+---
+
+# Agent Design Principles
+
+## Key Takeaways
+
+- Define the smallest agent that can own a clear task; add agents only for separate ownership, different instructions, different tool surface, or different approval policy.
+- Use subagents for investigation tasks that rapidly exhaust context; they quarantine token cost and prevent anchoring bias.
+- Maintain a three-file separation (AGENTS.md, agents, skills) to prevent instruction conflict, positional attention degradation, and redundancy interference.
+- Embed specific IF-THEN triggers at decision points, not vague references; error-specific feedback is actionable, vague feedback is not.
+
+## Concepts
+
+**Minimal-Scope Agent Design**: Define the smallest agent that can own a clear task. Add more agents only for separate ownership, different instructions (not just more detail), different tool surface, or different approval policy. The split criterion is ownership boundary, not instruction volume. Anti-pattern: creating agents just to organize instructions.
+
+**Context Isolation via Subagents**: Subagents run in their own context windows and report back summaries. This keeps the primary conversation clean for implementation. Every file read in a subagent burns tokens in a child window, not the primary window. Context window is the primary performance constraint for LLM agents. A fresh context also prevents anchoring bias from prior conversation state.
+
+**Three-File Separation**: Three failure modes (instruction conflict, positional attention degradation, redundancy interference) motivate a three-file split with defined content rules: AGENTS.md (loaded every session; project conventions), agents (loaded when a role is invoked; role identity), and skills (loaded on demand; procedural instructions). Knowledge files (loaded on demand; reference and explanation only) sit alongside the split and follow the same kind of content rules.
+
+**Effective Instruction Writing and Tool Permission Design**: Specific triggers at decision points are 2-3x more likely to execute than general intentions. Error-specific feedback like "FAIL: function > 20 lines at file:47" is actionable; "Apply function length rules" is not. Agent-Computer Interface design is as important as Human-Computer Interface design: start with bash for breadth, promote to dedicated tools for security, structured output, or audit patterns.
+
+## Content
+
+### Minimal-Scope Agent Design
+
+Define the smallest agent that can own a clear task. Add more agents only when you need:
+- **Separate ownership** — different domain responsibility
+- **Different instructions** — not just more detail, but fundamentally different guidance
+- **Different tool surface** — distinct actions and permissions
+- **Different approval policy** — different escalation rules
+
+The split criterion is **ownership boundary**, not instruction volume. A single agent with more tools is usually better than multiple agents that share the same domain. (Source: OpenAI Agents SDK, 2024; research entry #21.)
+
+Anti-pattern: Creating agents just to organize instructions. If two agents need the same knowledge and perform similar actions, they should be one agent with skill-based differentiation.
+
+### Context Isolation via Subagents
+
+Subagents run in their own context windows and report back summaries. This keeps the primary conversation clean for implementation. Every file read in a subagent burns tokens in a child window, not the primary window.
+
+Context window is the primary performance constraint for LLM agents. Investigation tasks rapidly exhaust context if done inline. Delegating to a subagent quarantines that cost; the primary agent receives only the distilled result. A fresh context also prevents anchoring bias from prior conversation state. (Source: Anthropic, 2025; research entry #22.)
+
+### Three-File Separation
+
+Three failure modes converge to produce a three-file split with defined content rules:
+
+| Failure Mode | Source | Prevention |
+|---|---|---|
+| Instruction conflict on drift | Entry #24 — LLMs cannot reliably resolve conflicting instructions | Single source of truth per concern |
+| Positional attention degradation | Entry #25 — Middle content gets less attention | Keep always-loaded files lean |
+| Redundancy interference | Entry #26 — Redundant content creates competing attention targets | De-duplicate across all files |
+
+| File | When Loaded | Contains | Must NOT Contain |
+|---|---|---|---|
+| `AGENTS.md` | Every session | Project conventions, commands, formats | Step procedures, role-specific rules, knowledge |
+| `.opencode/agents/*.md` | When role invoked | Role identity, skill loads, permissions, escalation | Workflow details, knowledge content |
+| `.opencode/skills/*/SKILL.md` | On demand | Procedural instructions, self-contained | Duplication of `AGENTS.md` or other skills |
+| `.opencode/knowledge/` | On demand | Reference + explanation only | Procedural instructions, step-by-step workflows |
+
+### Effective Instruction Writing
+
+- **Specific triggers**: "Load skill X when condition Y" not "use judgment"
+- **Clear actions**: Every step corresponds to a specific output
+- **Concrete examples**: Include before/after code where helpful (one is enough)
+- **Verification criteria**: How does the agent know it's done?
+- **Implementation intentions**: "If X then Y" plans are 2–3x more likely to execute than general intentions (Source: Gollwitzer, 1999; entry #2.)
+- **Error-specific feedback**: "FAIL: function > 20 lines at file:47" is actionable; "Apply function length rules" is not (Source: Hattie & Timperley, 2007; entry #9.)
+
+### Tool Permission Design (ACI)
+
+Agent-Computer Interface design is as important as Human-Computer Interface design. More time was spent optimizing tools than prompts in SWE-bench work. (Source: Anthropic Engineering Blog, 2024.)
+
+- Start with bash for breadth
+- Promote to dedicated tools when you need to: gate security-sensitive actions, render structured output, audit usage patterns
+- Poka-yoke your tools: make the right action easy and the wrong action hard
+- Give agents enough tokens to think — truncating tool descriptions to save tokens often costs more in misunderstandings
+
+### Adversarial Verification
+
+The reviewer's job is to try to break the feature, not to confirm it works. Default hypothesis: "it might be broken despite green checks; prove otherwise."
+
+Highest-quality thinking emerges when parties hold different hypotheses and are charged with finding flaws in each other's reasoning. (Source: Mellers, Hertwig, & Kahneman, 2001; entry #5.)
+
+Accountability to an unknown audience improves reasoning quality. Structured PASS/FAIL tables with evidence columns create commitment-device effects. (Source: Cialdini, 2001; entry #3; Tetlock, 1983; entry #6.)
+
+## Related
+
+- [[agent-design/opencode-format]]
+- [[skill-design/principles]]
+- [[knowledge-design/principles]]
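
A minimal sketch, not part of this patch, of the error-specific feedback principle from "Effective Instruction Writing" in `principles.md`. The 20-line limit and the message shape mirror the example given there. `check_function_length` is a hypothetical helper built on Python's standard `ast` module, not an existing project tool.

```python
import ast


def check_function_length(source: str, filename: str, max_lines: int = 20) -> list[str]:
    """Return one FAIL message per function longer than max_lines."""
    failures = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Count lines from the `def` line through the last body line.
            length = node.end_lineno - node.lineno + 1
            if length > max_lines:
                failures.append(
                    f"FAIL: function '{node.name}' is {length} lines "
                    f"(> {max_lines}) at {filename}:{node.lineno}"
                )
    return failures


# A function with one `def` line and 25 body lines.
sample = "def long_function():\n" + "    x = 1\n" * 25
for message in check_function_length(sample, "example.py"):
    print(message)
```

Each message names the function, the measured length, the limit, and the location, so an agent receiving it knows exactly what to change and where.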