From fcd296c499f263ed53db024d788cade049e47ded Mon Sep 17 00:00:00 2001
From: YeonGyu-Kim
Date: Wed, 24 Jun 2026 15:34:28 +0900
Subject: [PATCH] docs(web): restore skills detail copy

---
 packages/web/content/docs/skills.md        | 67 +++++++++++++++++++++-
 packages/web/lib/docs-content.generated.ts | 27 ++++++++-
 2 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/packages/web/content/docs/skills.md b/packages/web/content/docs/skills.md
index dfe7d49..5e0d82f 100644
--- a/packages/web/content/docs/skills.md
+++ b/packages/web/content/docs/skills.md
@@ -13,6 +13,33 @@ The command pillars stay simple:
 
 Skills add specialist judgment around those pillars. The sections below describe each skill and how it is typically used.
 
+### Skill index
+
+Most skills auto-activate when a request matches their domain, so you do not need to study or manually select every skill before using LazyCodex. When you want to be explicit, put the skill name in the prompt — for example `$visual-qa`, `$git-master`, or `$ulw-research`.
+
+| Skill | Use it for |
+| --- | --- |
+| `init-deep` | Hierarchical `AGENTS.md` context for large or old repos |
+| `ulw-plan` | Explore-first planning before coding |
+| `ulw-loop` | Evidence-bound loop until verified completion |
+| `start-work` | Execute a plan with durable Boulder progress |
+| `review-work` | Five-lane parallel post-implementation review |
+| `remove-ai-slops` | Behavior-preserving cleanup of AI-looking code |
+| `frontend` | Designed UI work instead of generic layout filling |
+| `programming` | Strict TypeScript, Rust, Python, or Go discipline, TDD-first |
+| `git-master` | Atomic commits, rebase/squash, push safety, history investigation |
+| `visual-qa` | Screenshot/TUI diff plus dual-oracle visual QA |
+| `debugging` | Evidence-led root-cause investigation |
+| `refactor` | Behavior-preserving restructure of existing code |
+| `ulw-research` | Maximum-saturation research with codebase, web, official-docs, and OSS-repo swarms |
+| `LSP` | Diagnostics, definitions, references, symbols, and renames |
+| `lsp-setup` | Configure language servers for a project |
+| `AST-grep` | Structural search and rewrite across code |
+| `rules` | Project instructions from AGENTS, rules, and instruction files |
+| `comment-checker` | Feedback after edit-like operations |
+
+### Skill highlights
+
 ---
 
 ### review-work
@@ -176,9 +203,47 @@ Finds code by syntactic shape rather than text — every function call matching
 
 ---
 
+### lsp-setup
+
+Language-server installation and workspace wiring.
+
+Configures language servers when a project does not already expose reliable diagnostics, definitions, references, and safe renames. It detects the language stack, installs or points to the right server, and validates that LSP calls work before higher-level coding or refactor skills depend on them.
+
+**When it activates:** When diagnostics are missing, definitions cannot be resolved, or a project needs LSP support before a refactor or programming task.
+
+---
+
+### rules
+
+Project instruction injection from repository and user rule files.
+
+Automatically loads project instructions from sources such as `AGENTS.md`, `CONTEXT.md`, `.omo/rules/`, `.claude/rules/`, `.github/instructions/`, and `.github/copilot-instructions.md`. There is no command to run — the harness treats these rules as active context when the plugin is enabled.
+
+**When it activates:** At session start and prompt submission, so agents inherit project constraints before planning or editing.
+
+---
+
+### comment-checker
+
+Immediate feedback after edit-like operations.
+
+After code changes, `comment-checker` inspects comments near the edited lines. If it flags comment drift — a comment that no longer matches the code below it — the agent must fix or justify the comment before proceeding. This catches stale comments at the moment they are introduced rather than during a later review.
+
+**When it activates:** After write, edit, patch, or other edit-like tool calls when the plugin has the guardrail enabled.
+
+---
+
 ### Where skills live
 
-LazyCodex installs skills as part of the OmO plugin. OmO can also load skills from project and user locations such as `.codex/skills`, `~/.codex/skills`, `.agents/skills`, and `~/.agents/skills`.
+LazyCodex installs skills as part of the OmO plugin. OmO can also load skills from project and user locations such as `.codex/skills`, `~/.codex/skills`, `.opencode/skills`, `~/.config/opencode/skills`, `.claude/skills`, `.agents/skills`, and `~/.agents/skills`.
+
+LazyCodex installs the Codex Light setup with:
+
+```bash
+npx lazycodex-ai install
+```
+
+That installer wires the Codex marketplace plugin as `omo@sisyphuslabs` while keeping the public package alias easy to remember.
 
 Each skill carries deep internal references — detailed playbooks, language-specific recipes, and per-phase instructions — but none of that is something you need to read. The harness reads it for you when the skill activates.
 
diff --git a/packages/web/lib/docs-content.generated.ts b/packages/web/lib/docs-content.generated.ts
index 92798bc..e342c12 100644
--- a/packages/web/lib/docs-content.generated.ts
+++ b/packages/web/lib/docs-content.generated.ts
@@ -9,7 +9,7 @@ export const DOC_SOURCES: Record = {
   "ulw-plan.md": "

$ulw-plan is the strategic planning consultant (Prometheus). It turns an idea into a decision-complete work plan. It is a planner, NOT an implementer. When you say "do X" it produces a plan for X and never writes product code.

\n

The flow

\n
  1. Socratic interview — ask only the forks that exploration cannot resolve. When intent is fuzzy, research to best practice instead of interrogating.
  2. Parallel codebase exploration — fan out read-only subagents to ground every decision in the actual code, never in memory.
  3. Metis gap analysis — name every unknown the plan depends on and either close it or surface it as an explicit fork.
  4. Write the plan to plans/<slug>.md — one decision-complete plan a worker executes with zero further interview.
  5. Optional Momus high-accuracy review — an adversarial pass that tries to break the plan before it ships.
\n

Output

\n

Questions, research, and a work plan whose every todo carries references, acceptance criteria, a QA plan, and a commit boundary. The plan records status: awaiting-approval and waits — it never begins execution itself.

\n

Handoff

\n

Once you approve, hand the plan to $start-work, which executes it against durable Boulder state with the five evidence gates.

\n", "start-work.md": "

$start-work executes a Prometheus work plan until every top-level checkbox is done.

\n

How it works

\n\n

Syntax

\n
$start-work [plan-name] [--worktree <absolute-path>]\n
\n

Done

\n

It prints an ORCHESTRATION COMPLETE block when every checkbox is checked.

\n", "ulw-loop.md": "

$ulw-loop is a self-referential development loop that decomposes work into systematic, evidence-bound steps and runs until verified completion.

\n

How it works

\n

The agent works continuously and emits <promise>DONE</promise> when it believes the task is complete, but that does NOT end the loop. An Oracle must verify the result first. The loop ends only after the system confirms the Oracle verified it. If verification fails, it continues with the message: "Oracle verification failed. Continuing ULTRAWORK loop."

\n

Each step carries its own evidence: a real artifact, not a dry-run claim. Progress is checkpointed, so a long run survives restarts without losing what was already proven.

\n

Bootstrap

\n

Before the first run, the loop reads its full workflow reference (Bootstrap tier triage, the Execution Loop, and the Manual-QA channels table) so every later phase executes the same way. In later phases it rereads only the sections the current phase needs.

\n

Manual-QA channels

\n

A step does not close on a status string. It closes on a captured artifact from a real surface — an HTTP call, a tmux session, or a browser — plus an adversarial pass and a cleanup receipt. See manual QA.

\n

Syntax

\n
$ulw-loop "task description" [--completion-promise=TEXT] [--strategy=reset|continue]\n
\n

Limits

\n

The iteration cap is 500 in ultrawork mode (100 in normal mode).
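
The loop described on this page can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the LazyCodex implementation: `run_loop`, `work`, and `oracle_verify` are hypothetical names, and only the `<promise>DONE</promise>` marker, the failure message, and the 500/100 caps are taken from the text.

```python
from typing import Callable

DONE_MARKER = "<promise>DONE</promise>"
FAILURE_MESSAGE = "Oracle verification failed. Continuing ULTRAWORK loop."


def run_loop(
    work: Callable[[int], str],
    oracle_verify: Callable[[str], bool],
    ultrawork: bool = True,
) -> tuple[bool, int, list[str]]:
    """Run until the Oracle verifies a DONE claim or the cap is reached.

    Returns (verified, iterations_used, messages). All names here are
    hypothetical; the real harness is not shown in this document.
    """
    cap = 500 if ultrawork else 100  # caps quoted in the Limits section
    messages: list[str] = []
    for iteration in range(1, cap + 1):
        output = work(iteration)
        if DONE_MARKER not in output:
            continue  # the agent has not claimed completion yet
        if oracle_verify(output):
            return True, iteration, messages  # only verification ends the loop
        messages.append(FAILURE_MESSAGE)  # a failed check keeps the loop going
    return False, cap, messages
```

The point of the sketch is the ordering: a DONE claim alone never returns, and every rejected claim produces the continuation message before the next iteration.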

\n

Reading more

\n\n", - "skills.md": "

Skills are specialist playbooks that LazyCodex loads on top of the command pillars. They auto-activate when a task matches their domain — you do not need to study or memorize them. Include ultrawork (or the short alias ulw) in your prompt and the harness picks the right skills internally.

\n

When you want to call a skill explicitly, put its name in the prompt: $review-work, $remove-ai-slops, $ulw-research, and so on.

\n

Commands

\n

The command pillars stay simple:

\n\n

Skills add specialist judgment around those pillars. The sections below describe each skill and how it is typically used.

\n
\n

review-work

\n

Five-lane parallel post-implementation review.

\n

After significant work, review-work launches five sub-agents in parallel — each covering a different angle: goal/constraint verification, hands-on QA execution, code quality, security, and context mining from git history and issues. All five must pass for the review to pass. One failure means the review fails.

\n

When it activates: After completing any meaningful implementation — especially when the change touches 3+ files or runs for 20+ minutes.

\n

Example: After finishing a PR, the user says:

\n
review my work\n
\n

The harness spawns five parallel reviewers in separate threads, each with a focused lens. The final verdict is PASS only when every lane agrees.

\n
\n

remove-ai-slops

\n

Behavior-preserving cleanup of AI-generated code smells.

\n

The safety invariant: regression tests lock behavior before a single line is deleted. Covers obvious comments, excessive defensive code, unnecessary abstractions, dead code, duplicates, and oversized modules (250+ pure LOC triggers a full modular refactoring). Workers run in parallel batches of five, and any test failure triggers an immediate revert.

\n

When it activates: When asked to clean, deslop, or remove AI-generated patterns.

\n

Example: Combining with refactor and programming for a full cleanup pass:

\n
ulw plan and manual qa, no behaviour changes, no regressions\n/refactor /remove-ai-slops through /programming\n
\n

The harness plans the cleanup first, locks behavior with tests, then dispatches parallel workers by slop category — safe to dangerous order.

\n
\n

frontend

\n

UI, UX, design, performance, accessibility, and visual QA — all in one router.

\n

Not a single rule file but a router. It reads design, perfection, and ui-ux-db references based on the task, then builds and verifies against the actual browser. Covers UI implementation, styling, layout, animation, Lighthouse 100, Core Web Vitals, accessibility, SEO, and React dev tools like react-scan and react-doctor.

\n

When it activates: Any task involving UI, styling, layout, animation, design, or performance auditing.

\n

Example:

\n
redesign the sidebar with better spacing and hit Lighthouse 100\n
\n

The skill routes to the right design references, builds to match the existing design system, then runs a real Playwright Chromium Lighthouse audit — never the Lighthouse CLI, never by weakening UX.

\n
\n

programming

\n

One philosophy across four languages: strict types, modern stacks, TDD.

\n

Applies to every .py, .pyi, .rs, .ts, .tsx, .mts, .cts, .go file. The skill gates on language, loads the matching reference set, and enforces: parse-don't-validate at boundaries, exhaustive variant matching, typed errors, no escape hatches (any, unwrap, @ts-ignore), 250 pure LOC ceiling per file, and mandatory TDD (RED → GREEN → REFACTOR).

\n

When it activates: Automatically on any code file edit in the supported languages.

\n

Example: The skill is always on. When editing TypeScript, it loads the TypeScript reference (Bun + Biome + strict tsconfig), enforces branded types and discriminated unions, and runs the post-write review loop: measure pure LOC, self-review seven questions, refactor if over 250 LOC.

\n
\n

debugging

\n

Hypothesis-driven runtime debugging across any language or binary.

\n

Every claim about why a bug happens must come from observed runtime state, not code reading. The skill runs a phased loop: setup and journal, form 3+ orthogonal hypotheses, investigate in parallel, escalate to independent verifiers after 2 failed rounds, confirm root cause by toggling, lock with a failing test, fix minimally, QA on the real surface, then clean up every debug artifact.

\n

When it activates: Crashes, silent failures, wrong responses, stuck processes, memory leaks, async misbehavior, or reverse engineering.

\n

Example:

\n
debug this — the API returns 200 but the body is empty\n
\n

The skill fires parallel investigation lanes, attaches real debuggers (pdb, node inspect, lldb, dlv), and does not close the bug until the root cause is confirmed by toggling and a failing test goes GREEN.

\n
\n

refactor

\n

Codemap-aware, LSP- and AST-grep-powered restructuring.

\n

Maps the codebase before touching anything, evaluates test coverage to set the verification strategy, plans atomic steps with rollback points, then executes with LSP renames and AST-grep structural rewrites. Any test failure during execution triggers an immediate stop and revert.

\n

When it activates: Requests to refactor, restructure, extract, simplify, or modernize code.

\n

Example:

\n
refactor the validation logic into its own module --scope=module\n
\n

The skill builds a dependency graph of the target, runs characterization tests to pin current behavior, then executes the restructuring step by step — verifying after each step.

\n
\n

visual-qa

\n

Screenshot and TUI diff plus dual-oracle visual QA.

\n

Captures reference and actual evidence — screenshots for web UIs, tmux capture-pane for terminal UIs — then runs a bundled pixel-diff or column-width script. Two parallel read-only oracle passes evaluate: one for design-system and functional integrity, one for visual fidelity and CJK text precision. The final verdict is a single good/bad score.

\n

When it activates: After building or changing any UI, or when asked to verify visual correctness.

\n
\n

git-master

\n

Atomic commits, rebase/squash, push safety, history investigation.

\n

Handles commit message style detection, semantic grouping, fixup autosquash, blame, bisect, log -S, and questions like "who wrote this" or "when was this added."

\n

When it activates: Any git operation — committing, rebasing, squashing, history search.

\n
\n

ulw-research

\n

Maximum-saturation research mode (formerly ultraresearch).

\n

Orchestrates parallel explore and librarian swarms across the codebase, the web, official documentation, and OSS repositories. Runs a recursive EXPAND loop driven by leads that workers return, verifies findings empirically by running code, and produces cited synthesis with optional reports.

\n

When it activates: Only on explicit demand — the word ulw-research, the legacy alias ultraresearch, or any request for deep research or an ultra-precise investigation.

\n

Example:

\n
ulw-research the typeclaw architecture — map every module and find the official docs\n
\n

The skill fans out 10+ parallel search lanes across GitHub, official docs, and web sources, recursively expands promising leads, then synthesizes a cited report.

\n
\n

LSP

\n

Language-server diagnostics, definitions, references, symbols, and safe renames.

\n

Gives the agent language-server precision via MCP tool calls. Runs diagnostics after every edit, finds definitions and references across the workspace, and performs safe renames through the language server's own workspace edit — not text find-and-replace.

\n

When it activates: Automatically after edit-like tool calls (diagnostics), and on demand for navigation and renames.

\n
\n

AST-grep

\n

Structural search and rewrite across 25 languages.

\n

Finds code by syntactic shape rather than text — every function call matching a pattern, every import shaped like X. Rewrites are deterministic and always previewed with dryRun=true before applying. Pairs with the refactor skill for safe, large-scale codemods.

\n

When it activates: Structural code matching, pattern-based search, or deterministic rewrites (strip as any, migrate require() to import, find empty catch blocks).

\n
\n

Where skills live

\n

LazyCodex installs skills as part of the OmO plugin. OmO can also load skills from project and user locations such as .codex/skills, ~/.codex/skills, .agents/skills, and ~/.agents/skills.

\n

Each skill carries deep internal references — detailed playbooks, language-specific recipes, and per-phase instructions — but none of that is something you need to read. The harness reads it for you when the skill activates.

\n

The command pillars and the disciplines behind them are covered in depth: ulw-plan, ulw-loop, start-work, TDD, manual QA, and git workflow.

\n", + "skills.md": "

Skills are specialist playbooks that LazyCodex loads on top of the command pillars. They auto-activate when a task matches their domain — you do not need to study or memorize them. Include ultrawork (or the short alias ulw) in your prompt and the harness picks the right skills internally.

\n

When you want to call a skill explicitly, put its name in the prompt: $review-work, $remove-ai-slops, $ulw-research, and so on.

\n

Commands

\n

The command pillars stay simple:

\n\n

Skills add specialist judgment around those pillars. The sections below describe each skill and how it is typically used.

\n

Skill index

\n

Most skills auto-activate when a request matches their domain, so you do not need to study or manually select every skill before using LazyCodex. When you want to be explicit, put the skill name in the prompt — for example $visual-qa, $git-master, or $ulw-research.

| Skill | Use it for |
| --- | --- |
| `init-deep` | Hierarchical `AGENTS.md` context for large or old repos |
| `ulw-plan` | Explore-first planning before coding |
| `ulw-loop` | Evidence-bound loop until verified completion |
| `start-work` | Execute a plan with durable Boulder progress |
| `review-work` | Five-lane parallel post-implementation review |
| `remove-ai-slops` | Behavior-preserving cleanup of AI-looking code |
| `frontend` | Designed UI work instead of generic layout filling |
| `programming` | Strict TypeScript, Rust, Python, or Go discipline, TDD-first |
| `git-master` | Atomic commits, rebase/squash, push safety, history investigation |
| `visual-qa` | Screenshot/TUI diff plus dual-oracle visual QA |
| `debugging` | Evidence-led root-cause investigation |
| `refactor` | Behavior-preserving restructure of existing code |
| `ulw-research` | Maximum-saturation research with codebase, web, official-docs, and OSS-repo swarms |
| `LSP` | Diagnostics, definitions, references, symbols, and renames |
| `lsp-setup` | Configure language servers for a project |
| `AST-grep` | Structural search and rewrite across code |
| `rules` | Project instructions from AGENTS, rules, and instruction files |
| `comment-checker` | Feedback after edit-like operations |
\n

Skill highlights

\n
\n

review-work

\n

Five-lane parallel post-implementation review.

\n

After significant work, review-work launches five sub-agents in parallel — each covering a different angle: goal/constraint verification, hands-on QA execution, code quality, security, and context mining from git history and issues. All five must pass for the review to pass. One failure means the review fails.

\n

When it activates: After completing any meaningful implementation — especially when the change touches 3+ files or runs for 20+ minutes.

\n

Example: After finishing a PR, the user says:

\n
review my work\n
\n

The harness spawns five parallel reviewers in separate threads, each with a focused lens. The final verdict is PASS only when every lane agrees.
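
The all-lanes-must-pass rule can be sketched as below. This is an illustration only: the lane labels are shorthand for the five angles listed above, and the `review` function, the callable-per-lane shape, and the `"FAIL"` label are assumptions rather than the skill's actual interface.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical labels for the five review angles named in the text.
LANES = (
    "goal-constraint",
    "qa-execution",
    "code-quality",
    "security",
    "context-mining",
)


def review(reviewers: dict[str, Callable[[], bool]]) -> str:
    """Run every lane in parallel and return PASS only if all lanes pass."""
    missing = [lane for lane in LANES if lane not in reviewers]
    if missing:
        raise ValueError(f"missing review lanes: {missing}")
    with ThreadPoolExecutor(max_workers=len(LANES)) as pool:
        futures = {lane: pool.submit(reviewers[lane]) for lane in LANES}
        results = {lane: future.result() for lane, future in futures.items()}
    # A single failing lane fails the whole review.
    return "PASS" if all(results.values()) else "FAIL"
```

Requiring every lane to be present before running keeps a missing reviewer from being mistaken for a passing one.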

\n
\n

remove-ai-slops

\n

Behavior-preserving cleanup of AI-generated code smells.

\n

The safety invariant: regression tests lock behavior before a single line is deleted. Covers obvious comments, excessive defensive code, unnecessary abstractions, dead code, duplicates, and oversized modules (250+ pure LOC triggers a full modular refactoring). Workers run in parallel batches of five, and any test failure triggers an immediate revert.

\n

When it activates: When asked to clean, deslop, or remove AI-generated patterns.

\n

Example: Combining with refactor and programming for a full cleanup pass:

\n
ulw plan and manual qa, no behaviour changes, no regressions\n/refactor /remove-ai-slops through /programming\n
\n

The harness plans the cleanup first, locks behavior with tests, then dispatches parallel workers by slop category — safe to dangerous order.

\n
\n

frontend

\n

UI, UX, design, performance, accessibility, and visual QA — all in one router.

\n

Not a single rule file but a router. It reads design, perfection, and ui-ux-db references based on the task, then builds and verifies against the actual browser. Covers UI implementation, styling, layout, animation, Lighthouse 100, Core Web Vitals, accessibility, SEO, and React dev tools like react-scan and react-doctor.

\n

When it activates: Any task involving UI, styling, layout, animation, design, or performance auditing.

\n

Example:

\n
redesign the sidebar with better spacing and hit Lighthouse 100\n
\n

The skill routes to the right design references, builds to match the existing design system, then runs a real Playwright Chromium Lighthouse audit — never the Lighthouse CLI, never by weakening UX.

\n
\n

programming

\n

One philosophy across four languages: strict types, modern stacks, TDD.

\n

Applies to every .py, .pyi, .rs, .ts, .tsx, .mts, .cts, .go file. The skill gates on language, loads the matching reference set, and enforces: parse-don't-validate at boundaries, exhaustive variant matching, typed errors, no escape hatches (any, unwrap, @ts-ignore), 250 pure LOC ceiling per file, and mandatory TDD (RED → GREEN → REFACTOR).

\n

When it activates: Automatically on any code file edit in the supported languages.

\n

Example: The skill is always on. When editing TypeScript, it loads the TypeScript reference (Bun + Biome + strict tsconfig), enforces branded types and discriminated unions, and runs the post-write review loop: measure pure LOC, self-review seven questions, refactor if over 250 LOC.
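
A minimal sketch of three of the rules above (parse-don't-validate at a boundary, typed errors, and exhaustive variant matching), written in Python as one of the four supported languages. Everything in it is hypothetical: the `Contact` variants, the parsing rules, and the helper names are invented for illustration and are not taken from the skill's reference set.

```python
from dataclasses import dataclass
from typing import NoReturn, Union


class ParseError(ValueError):
    """Typed error raised when boundary input cannot be parsed."""


@dataclass(frozen=True)
class Email:
    value: str


@dataclass(frozen=True)
class Phone:
    digits: str


Contact = Union[Email, Phone]


def parse_contact(raw: str) -> Contact:
    """Parse untrusted text once, at the boundary, into a typed value."""
    text = raw.strip()
    if "@" in text and not text.startswith("@") and not text.endswith("@"):
        return Email(text.lower())
    digits = "".join(ch for ch in text if ch.isdigit())
    if len(digits) >= 7 and all(ch.isdigit() or ch in "+-() " for ch in text):
        return Phone(digits)
    raise ParseError(f"not an email or phone number: {raw!r}")


def assert_never(value: NoReturn) -> NoReturn:
    """Fail loudly if a new variant is added but not handled."""
    raise AssertionError(f"unhandled variant: {value!r}")


def describe(contact: Contact) -> str:
    """Handle every variant explicitly; no catch-all default branch."""
    if isinstance(contact, Email):
        return f"email:{contact.value}"
    if isinstance(contact, Phone):
        return f"phone:{contact.digits}"
    assert_never(contact)
```

After the boundary, code receives an `Email` or a `Phone` and never a raw string, so downstream functions do not need to re-check the input.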

\n
\n

debugging

\n

Hypothesis-driven runtime debugging across any language or binary.

\n

Every claim about why a bug happens must come from observed runtime state, not code reading. The skill runs a phased loop: setup and journal, form 3+ orthogonal hypotheses, investigate in parallel, escalate to independent verifiers after 2 failed rounds, confirm root cause by toggling, lock with a failing test, fix minimally, QA on the real surface, then clean up every debug artifact.

\n

When it activates: Crashes, silent failures, wrong responses, stuck processes, memory leaks, async misbehavior, or reverse engineering.

\n

Example:

\n
debug this — the API returns 200 but the body is empty\n
\n

The skill fires parallel investigation lanes, attaches real debuggers (pdb, node inspect, lldb, dlv), and does not close the bug until the root cause is confirmed by toggling and a failing test goes GREEN.

\n
\n

refactor

\n

Codemap-aware, LSP- and AST-grep-powered restructuring.

\n

Maps the codebase before touching anything, evaluates test coverage to set the verification strategy, plans atomic steps with rollback points, then executes with LSP renames and AST-grep structural rewrites. Any test failure during execution triggers an immediate stop and revert.

\n

When it activates: Requests to refactor, restructure, extract, simplify, or modernize code.

\n

Example:

\n
refactor the validation logic into its own module --scope=module\n
\n

The skill builds a dependency graph of the target, runs characterization tests to pin current behavior, then executes the restructuring step by step — verifying after each step.

\n
\n

visual-qa

\n

Screenshot and TUI diff plus dual-oracle visual QA.

\n

Captures reference and actual evidence — screenshots for web UIs, tmux capture-pane for terminal UIs — then runs a bundled pixel-diff or column-width script. Two parallel read-only oracle passes evaluate: one for design-system and functional integrity, one for visual fidelity and CJK text precision. The final verdict is a single good/bad score.

\n

When it activates: After building or changing any UI, or when asked to verify visual correctness.

\n
\n

git-master

\n

Atomic commits, rebase/squash, push safety, history investigation.

\n

Handles commit message style detection, semantic grouping, fixup autosquash, blame, bisect, log -S, and questions like "who wrote this" or "when was this added."

\n

When it activates: Any git operation — committing, rebasing, squashing, history search.

\n
\n

ulw-research

\n

Maximum-saturation research mode (formerly ultraresearch).

\n

Orchestrates parallel explore and librarian swarms across the codebase, the web, official documentation, and OSS repositories. Runs a recursive EXPAND loop driven by leads that workers return, verifies findings empirically by running code, and produces cited synthesis with optional reports.

\n

When it activates: Only on explicit demand — the word ulw-research, the legacy alias ultraresearch, or any request for deep research or an ultra-precise investigation.

\n

Example:

\n
ulw-research the typeclaw architecture — map every module and find the official docs\n
\n

The skill fans out 10+ parallel search lanes across GitHub, official docs, and web sources, recursively expands promising leads, then synthesizes a cited report.

\n
\n

LSP

\n

Language-server diagnostics, definitions, references, symbols, and safe renames.

\n

Gives the agent language-server precision via MCP tool calls. Runs diagnostics after every edit, finds definitions and references across the workspace, and performs safe renames through the language server's own workspace edit — not text find-and-replace.

\n

When it activates: Automatically after edit-like tool calls (diagnostics), and on demand for navigation and renames.

\n
\n

AST-grep

\n

Structural search and rewrite across 25 languages.

\n

Finds code by syntactic shape rather than text — every function call matching a pattern, every import shaped like X. Rewrites are deterministic and always previewed with dryRun=true before applying. Pairs with the refactor skill for safe, large-scale codemods.

\n

When it activates: Structural code matching, pattern-based search, or deterministic rewrites (strip as any, migrate require() to import, find empty catch blocks).
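
To illustrate what matching by syntactic shape means, here is a small sketch that uses Python's standard `ast` module. It is not AST-grep and does not use AST-grep's pattern syntax; the `find_calls` helper and the sample source are made up for the example.

```python
import ast

# Sample input: the target call appears in a string, in a comment,
# and twice as real code (once with unusual spacing).
SOURCE = '''
import json
print("json.loads is mentioned in this string")
# json.loads(commented_out)
data = json.loads(payload)
other = json . loads(  payload2  )
'''


def find_calls(source: str, module: str, func: str) -> list[int]:
    """Return line numbers of calls shaped like `module.func(...)`."""
    lines: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == func
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == module
        ):
            lines.append(node.lineno)
    return sorted(lines)


print(find_calls(SOURCE, "json", "loads"))  # prints [5, 6]
```

A plain text search for `json.loads` would hit the string and the comment on lines 3 and 4 and miss the oddly spaced call on line 6. The structural match reports only the two real calls.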

\n
\n

lsp-setup

\n

Language-server installation and workspace wiring.

\n

Configures language servers when a project does not already expose reliable diagnostics, definitions, references, and safe renames. It detects the language stack, installs or points to the right server, and validates that LSP calls work before higher-level coding or refactor skills depend on them.

\n

When it activates: When diagnostics are missing, definitions cannot be resolved, or a project needs LSP support before a refactor or programming task.

\n
\n

rules

\n

Project instruction injection from repository and user rule files.

\n

Automatically loads project instructions from sources such as AGENTS.md, CONTEXT.md, .omo/rules/, .claude/rules/, .github/instructions/, and .github/copilot-instructions.md. There is no command to run — the harness treats these rules as active context when the plugin is enabled.

\n

When it activates: At session start and prompt submission, so agents inherit project constraints before planning or editing.
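
As a rough picture of what loading those sources could involve, the sketch below collects the rule files that exist under a project root. Only the paths come from the list above; the `discover_rule_sources` name, the ordering, the recursive directory walk, and treating every file in a rules directory as a rule are assumptions, not documented harness behavior.

```python
from pathlib import Path

# Paths named in the text; how the harness actually resolves them is not shown.
RULE_FILES = ("AGENTS.md", "CONTEXT.md", ".github/copilot-instructions.md")
RULE_DIRS = (".omo/rules", ".claude/rules", ".github/instructions")


def discover_rule_sources(root: Path) -> list[Path]:
    """Return rule files that exist under `root`, in a stable order."""
    found: list[Path] = []
    for name in RULE_FILES:
        candidate = root / name
        if candidate.is_file():
            found.append(candidate)
    for name in RULE_DIRS:
        directory = root / name
        if directory.is_dir():
            # Assumption: every regular file below a rules directory counts.
            found.extend(sorted(p for p in directory.rglob("*") if p.is_file()))
    return found
```

Calling `discover_rule_sources(Path.cwd())` in a repository lists whichever of these sources are present, which is a quick way to see what context an agent might inherit.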

\n
\n

comment-checker

\n

Immediate feedback after edit-like operations.

\n

After code changes, comment-checker inspects comments near the edited lines. If it flags comment drift — a comment that no longer matches the code below it — the agent must fix or justify the comment before proceeding. This catches stale comments at the moment they are introduced rather than during a later review.

\n

When it activates: After write, edit, patch, or other edit-like tool calls when the plugin has the guardrail enabled.

\n
\n

Where skills live

\n

LazyCodex installs skills as part of the OmO plugin. OmO can also load skills from project and user locations such as .codex/skills, ~/.codex/skills, .opencode/skills, ~/.config/opencode/skills, .claude/skills, .agents/skills, and ~/.agents/skills.

\n

LazyCodex installs the Codex Light setup with:

\n
npx lazycodex-ai install\n
\n

That installer wires the Codex marketplace plugin as omo@sisyphuslabs while keeping the public package alias easy to remember.

\n

Each skill carries deep internal references — detailed playbooks, language-specific recipes, and per-phase instructions — but none of that is something you need to read. The harness reads it for you when the skill activates.

\n

The command pillars and the disciplines behind them are covered in depth: ulw-plan, ulw-loop, start-work, TDD, manual QA, and git workflow.

\n", "ultrawork.md": "

ultrawork is the headline mode. Include ultrawork (or the short alias ulw) anywhere in your prompt — like adding ultrathink — and the harness switches to maximum-precision, outcome-first, evidence-driven orchestration. Skills activate internally; you do not need to name them.

\n
\n

"Plan, execute, verify, and keep the evidence attached."

\n
\n

The principle is simple. An agent saying it is done does not mean the work is done. The work is done when observable evidence verifies it.

\n

Usage

\n

Just include the word in your prompt. Nothing else to configure.

\n
ulw add authentication\n
\n
fix the flaky checkout test ultrawork\n
\n

The harness reads the task, picks the right skills (programming, debugging, refactor, etc.), and runs the evidence-bound loop automatically. You do not choose skills yourself unless you want to be explicit — for example $review-work or $ulw-research.

\n

What it enforces

\n\n

Relationship to $ulw-loop

\n

$ulw-loop is the command form of ultrawork discipline. The latest flow stores request, goals, success criteria, and an evidence ledger under .omo/ulw-loop:

| File | Role |
| --- | --- |
| .omo/ulw-loop/brief.md | Original request and persistent constraints |
| .omo/ulw-loop/goals.json | Goals and success criteria |
| .omo/ulw-loop/ledger.jsonl | pass, fail, block, steering, checkpoint records |
\n

Saying "done" is not enough. Each success criterion requires evidence captured from a real surface, and that evidence must pass before the loop stops.

\n

The exact syntax and flags live in the $ulw-loop command docs.

\n

Failure limits

\n

The loop does not run forever. The latest $ulw-loop workflow uses these caps:

| Condition | Limit |
| --- | --- |
| Iterations on one goal without a full pass | 5 cycles |
| Same failure on the same criterion | 3 times |
\n
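
The two caps above could be tracked along these lines. This is a sketch under assumptions: the `GoalTracker` class, the `(criterion, failure)` pair used to decide whether a failure is the same one, and the `"stop"` outcome are hypothetical. The text gives only the two limits and does not say what the loop does when a cap is hit.

```python
from collections import Counter

MAX_CYCLES_PER_GOAL = 5  # iterations on one goal without a full pass
MAX_SAME_FAILURE = 3     # same failure on the same criterion


class GoalTracker:
    """Track failed cycles for one goal (hypothetical helper)."""

    def __init__(self) -> None:
        self.cycles = 0
        self.failures = Counter()

    def record_cycle(self, failed: list[tuple[str, str]]) -> str:
        """Record one iteration and return 'pass', 'continue', or 'stop'.

        `failed` holds (criterion, failure signature) pairs. An empty
        list means every criterion passed in this cycle.
        """
        if not failed:
            return "pass"
        self.cycles += 1
        self.failures.update(failed)
        if self.cycles >= MAX_CYCLES_PER_GOAL:
            return "stop"
        if any(count >= MAX_SAME_FAILURE for count in self.failures.values()):
            return "stop"
        return "continue"
```

Counting the repeated-failure cap separately means a goal that keeps failing the same way stops after three cycles instead of using all five.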

Evidence over hope

\n

The loop does not stop at "I wrote some code." It stops when evidence confirms the result, meaning which check ran and what it showed, not when the agent reports the status it expects.

\n

Position among commands

\n

$ulw-loop is one of several commands, each for a different shape of work.

\n

The typical flow: $ulw-plan produces a decision-complete plan, $start-work executes it checkpoint by checkpoint, and $ulw-loop keeps open-ended work running until a verifier approves. Detailed syntax for each command is in the Commands section.

\n", "discipline-agents.md": "

LazyCodex ports a single discipline agent from OmO into Codex: Hephaestus, the autonomous deep worker. There is no Sisyphus orchestrator in the Codex package — Hephaestus is the one role, and it carries the whole run itself with read-only subagents for parallel exploration.

What Hephaestus is

Hephaestus is named after the Greek god of the forge and styled "The Legitimate Craftsman." It is goal-oriented: you give it objectives, not step-by-step recipes, and it executes them end-to-end. Methodical, thorough, and obsessive, it is built for deep architectural reasoning, complex debugging, and cross-domain synthesis.

Installed roles

As of 4.12.1, the following roles are installed. When Codex exposes agent_type, the role is set directly; otherwise the role description is included in the message as a fallback.

| Role | Primary use |
| --- | --- |
| explorer | Internal codebase context: structure, call flows, test locations. |
| librarian | External docs, library contracts, latest API research. |
| plan | Plan drafting and task decomposition. |
| momus / metis | Missing decisions, edge cases, risk review. |
| lazycodex-executor | Executing specific task units from a plan. |
| lazycodex-code-reviewer | Post-implementation code quality review. |
| lazycodex-qa-executor | Real-execution-based QA. |
| lazycodex-gate-reviewer | Pre-completion verification gates. |
| lazycodex-clone-fidelity-reviewer | Clone and sync operation fidelity checks. |
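The agent_type fallback described above might look roughly like the following. This is an illustrative sketch only: `SpawnRequest`, `buildSpawnRequest`, and the `supportsAgentType` flag are hypothetical names, not the actual Codex spawn API. The role descriptions are copied from the role list above.

```typescript
// Hypothetical request shape; not the actual Codex spawn API.
interface SpawnRequest {
  message: string;
  agent_type?: string;
}

// Role descriptions copied from the role list above (subset).
const ROLE_DESCRIPTIONS: Record<string, string> = {
  explorer: "Internal codebase context: structure, call flows, test locations.",
  librarian: "External docs, library contracts, latest API research.",
};

// Set the role directly when agent_type is available; otherwise
// prepend the role description to the message as a fallback.
function buildSpawnRequest(
  role: string,
  task: string,
  supportsAgentType: boolean,
): SpawnRequest {
  if (supportsAgentType) {
    return { agent_type: role, message: task };
  }
  const description = ROLE_DESCRIPTIONS[role] ?? role;
  return { message: "Role: " + role + " (" + description + ")\n\n" + task };
}
```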

Parent session ownership

Even with multiple roles, completion judgment is never handed wholesale to a sub-agent. The parent Codex session keeps ownership of goals, constraints, and final judgment. Sub-agents are used to read terrain, find gaps, or assist review.

The operating loop

Hephaestus runs a short, tight loop on every unit of work:

1. Explore: map the terrain. Read the code with tools, never speculate. Fire 2-5 parallel explore subagents before writing anything.
2. Plan: chart the course. Record files to modify, specific changes, and dependencies via update_plan.
3. Implement: build with precision. Surgical edits that match codebase style (naming, indentation, imports, error handling) even when a greenfield would read differently.
4. Verify: prove it works. LSP diagnostics on changed files, related tests, and build, in parallel where possible.
5. Manually QA: drive the artifact through its real surface (HTTP call, tmux, browser), then write the final message.

Non-goals

\n\n

Delegation, not orchestration

Hephaestus stays the parent. For parallel exploration it spawns read-only Codex subagent roles (multi_agent_v1.spawn_agent) and keeps the parent session live with brief status updates while children run. It does not hand the run off to a separate orchestrator — it owns the goal, delegates the grunt work, and verifies the results itself.

Boulder state

$start-work uses .omo/boulder.json to persist progress and the Stop-hook continuation to keep plan execution moving. This is the core visible behavior: checkboxes advance, and when all are done it prints ORCHESTRATION COMPLETE.
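The checkbox behavior can be pictured with a small model of the state. The shape below is assumed purely for illustration and is not the real .omo/boulder.json schema. `advance` and `statusLine` are hypothetical helpers, and the in-progress text format is invented. Only the ORCHESTRATION COMPLETE string comes from the description above.

```typescript
// Hypothetical shape; the real .omo/boulder.json schema may differ.
interface BoulderTask {
  title: string;
  done: boolean;
}

interface BoulderState {
  plan: string;
  tasks: BoulderTask[];
}

// Check off the first unfinished task and return a new state (no mutation).
function advance(state: BoulderState): BoulderState {
  let advanced = false;
  const tasks = state.tasks.map((task) => {
    if (!advanced && !task.done) {
      advanced = true;
      return { title: task.title, done: true };
    }
    return task;
  });
  return { plan: state.plan, tasks };
}

// Report progress; print the completion marker once every box is checked.
function statusLine(state: BoulderState): string {
  const done = state.tasks.filter((task) => task.done).length;
  return done === state.tasks.length
    ? "ORCHESTRATION COMPLETE"
    : done + "/" + state.tasks.length + " tasks done";
}
```

Because each step returns a fresh state object, persisting it after every `advance` would give a continuation hook a consistent snapshot to resume from.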

Where the boulder comes from

The full OmO has a second primary agent, Sisyphus, the orchestrator with .omo/boulder.json session continuity. The Codex package is the Hephaestus-only light port, so on Codex the durable progress state lives in .omo/boulder.json as written by $start-work and the Stop-hook continuation — without the Sisyphus orchestration layer.

Reading more

\n\n", "model-routing.md": "

Multi-model routing sends each part of a run to the model that fits it best, instead of running everything on one model. LazyCodex installs OmO's routing defaults into Codex so a serious repository is not bottlenecked by a single context window or price point.

Current baseline

The 4.12.1 bundled model-catalog.json centers the default profile on gpt-5.5:

| Profile | Model | Reasoning |
| --- | --- | --- |
| Default | gpt-5.5 | high |
| Plan mode | gpt-5.5 | xhigh |
| Worker | gpt-5.5 | high |
| Verifier | gpt-5.5 | high |

The actual model name you see may differ as Codex and OpenAI update their model lineup. This doc focuses on how LazyCodex uses model profiles, not on comparing specific models.
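One way to picture the baseline is a typed lookup with optional overrides layered on top. This is a sketch under assumptions: the profile keys, the `ModelProfile` shape, and `resolveProfile` are hypothetical and do not reflect the actual model-catalog.json schema. Only the model and reasoning values are taken from the baseline above.

```typescript
type Reasoning = "high" | "xhigh";
type ProfileName = "default" | "plan" | "worker" | "verifier";

interface ModelProfile {
  model: string;
  reasoning: Reasoning;
}

// Values from the 4.12.1 baseline above; keys and shape are assumed.
const BASELINE: Record<ProfileName, ModelProfile> = {
  default: { model: "gpt-5.5", reasoning: "high" },
  plan: { model: "gpt-5.5", reasoning: "xhigh" },
  worker: { model: "gpt-5.5", reasoning: "high" },
  verifier: { model: "gpt-5.5", reasoning: "high" },
};

type Overrides = Partial<Record<ProfileName, Partial<ModelProfile>>>;

// Resolve a profile, letting any override field win over the baseline.
function resolveProfile(name: ProfileName, overrides: Overrides = {}): ModelProfile {
  const base = BASELINE[name];
  const override: Partial<ModelProfile> = overrides[name] ?? {};
  return {
    model: override.model ?? base.model,
    reasoning: override.reasoning ?? base.reasoning,
  };
}
```

Merging field by field means an override can change only the model and still inherit the baseline reasoning level.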

What gets routed

\n\n

Why role profiles exist

Role-based profiles separate work by nature:

\n\n

This pairs with Agent Roles. Even when multiple roles move in parallel, each role's model profile is tracked in the Codex configuration.

How it fits the harness

Routing is part of the harness setup that npx lazycodex-ai install wires into Codex. It detects the available subscriptions and provider auth, then maps roles to models so you do not hand-configure each one.

Provider auth

Auth targets Codex itself, not LazyCodex. Once Codex is logged in, the installer's subscription detection and provider routing take over. If you let an LLM agent run the install, it walks the same detection and selection for you.

User notes

\n\n

Customizing it

Routing and provider settings live in the configuration. See Configuration for the fields that control which model handles which role, and how to override the defaults per project.

\n", @@ -263,6 +263,16 @@ export const DOC_TOC: Record = { "id": "commands", "text": "Commands" }, + { + "level": 3, + "id": "skill-index", + "text": "Skill index" + }, + { + "level": 3, + "id": "skill-highlights", + "text": "Skill highlights" + }, { "level": 3, "id": "review-work", @@ -318,6 +328,21 @@ export const DOC_TOC: Record = { "id": "ast-grep", "text": "AST-grep" }, + { + "level": 3, + "id": "lsp-setup", + "text": "lsp-setup" + }, + { + "level": 3, + "id": "rules", + "text": "rules" + }, + { + "level": 3, + "id": "comment-checker", + "text": "comment-checker" + }, { "level": 3, "id": "where-skills-live",