diff --git a/.agents/skills/obol-stack-dev/SKILL.md b/.agents/skills/obol-stack-dev/SKILL.md index 730af818..75acc995 100644 --- a/.agents/skills/obol-stack-dev/SKILL.md +++ b/.agents/skills/obol-stack-dev/SKILL.md @@ -62,6 +62,67 @@ Tier 2: Per-Instance Tier 1: Cluster-Wide Gateway | Integration testing | `references/integration-testing.md` | | Troubleshooting | `references/troubleshooting.md` | +## Dev Registry Cache + +When `OBOL_DEVELOPMENT=true`, `obol stack up` provisions pull-through k3d registry caches before creating a new cluster. Current mirrors: + +- `docker.io` -> `k3d-obol-docker-io.localhost:54100` +- `ghcr.io` -> `k3d-obol-ghcr-io.localhost:54101` +- `quay.io` -> `k3d-obol-quay-io.localhost:54102` + +The generated registry config lives at `$OBOL_CONFIG_DIR/registries.yaml`. Cached image layers are stored under `~/.local/state/obol/registry-cache/` by default, or under `OBOL_REGISTRY_CACHE_DIR` if set. + +Use this mental model: + +- Fresh dev cluster: new cluster creation gets `--registry-config` and `--registry-use` entries, so pulls benefit from the cache. +- Existing dev cluster: `obol stack up` only starts the cluster and does not re-run registry setup. +- This is an upstream pull cache, not a dedicated local-build publishing workflow. + +## Existing Dev Stack Refresh + +When testing a new frontend or stack branch against an already-initialized `.workspace/config`, rebuild the local CLI before `stack up`. Current `obol stack up` refreshes `$OBOL_CONFIG_DIR/defaults` when the embedded infrastructure digest, backend, or stack ID changes, then preserves mutable LiteLLM model entries across Helm sync. If testing raw file edits without rebuilding the CLI, patch the generated defaults copy manually or re-run after rebuilding. + +```bash +# Prefer origin-only fetches in this checkout. Some Radicle remotes may be stale +# and can make `git fetch --all` fail after GitHub has already fetched. 
+git fetch origin --prune + +# Rebuild the local CLI from the current branch. +go build -o .workspace/bin/obol ./cmd/obol + +# For a released frontend image, verify and pull the exact tag first. +docker manifest inspect obolnetwork/obol-stack-front-end:v0.1.17-rc.5 >/dev/null +docker pull obolnetwork/obol-stack-front-end:v0.1.17-rc.5 + +# Confirm the embedded source has the intended image tag before rebuilding. +rg -n 'obol-stack-front-end|tag:' internal/embed/infrastructure/values/obol-frontend.yaml.gotmpl + +OBOL_CONFIG_DIR="$PWD/.workspace/config" \ +OBOL_BIN_DIR="$PWD/.workspace/bin" \ +OBOL_DATA_DIR="$PWD/.workspace/data" \ + .workspace/bin/obol stack up +``` + +Expected verification for frontend image refresh: + +```bash +OBOL_CONFIG_DIR="$PWD/.workspace/config" OBOL_BIN_DIR="$PWD/.workspace/bin" OBOL_DATA_DIR="$PWD/.workspace/data" \ + .workspace/bin/obol kubectl -n obol-frontend get deploy obol-frontend-obol-app \ + -o jsonpath='{.spec.template.spec.containers[*].image}{"\n"}' + +OBOL_CONFIG_DIR="$PWD/.workspace/config" OBOL_BIN_DIR="$PWD/.workspace/bin" OBOL_DATA_DIR="$PWD/.workspace/data" \ + .workspace/bin/obol kubectl -n obol-frontend rollout status deploy/obol-frontend-obol-app --timeout=180s + +curl -sS -I --max-time 10 http://obol.stack +curl -sS --max-time 15 http://obol.stack/api/agents/instances +``` + +Known existing-stack migration failures: + +- `Namespace "hermes-obol-agent" ... exists and cannot be imported into the current release`: the namespace or monetize RBAC predated Helm ownership. Current `obol stack up` adopts known base-owned resources before Helm sync. If doing it manually, label and annotate the existing resource with `app.kubernetes.io/managed-by=Helm`, `meta.helm.sh/release-name=base`, and `meta.helm.sh/release-namespace=kube-system`. +- `conflict with "kubectl-patch" ... llm/litellm-config .data.config.yaml`: older writers used a non-Helm field manager for `data.config.yaml`, which conflicts with Helm server-side apply. 
Current writers use Helm's field manager. During `obol stack up`, the existing LiteLLM config is backed up and previous model entries are merged into the new chart config; if a non-Helm manager is detected, the ConfigMap is deleted before Helm sync so ownership is recreated cleanly. +- `/etc/hosts` updates require interactive sudo. Codex cannot satisfy the password prompt in non-interactive execution; if DNS fails in the browser, run `obol stack up` or `obol hermes sync obol-agent` from a normal terminal, or manually add `127.0.0.1 obol-agent.obol.stack` and flush local DNS. + ## 4 Inference Paths (All Through LiteLLM) | Path | Model Name | LiteLLM model_list | Example | @@ -80,7 +141,7 @@ All 4 paths use the same OpenClaw config pattern: ### Paid Routing Notes - The paid path uses the **Obol LiteLLM fork** because paid-model lifecycle relies on the config-only model management API. -- `litellm-config` carries one static route: `paid/* -> openai/* -> http://127.0.0.1:8402`. +- `litellm-config` carries one static route: `paid/* -> openai/* -> http://127.0.0.1:8402/v1`. - `x402-buyer` runs as a **sidecar in the LiteLLM pod**, not as a separate Service. - `buy.py buy` signs auths locally and creates a `PurchaseRequest`; the controller writes per-upstream buyer files and keeps LiteLLM model entries in sync. - The currently validated local OSS model is `qwen3.5:9b`. Prefer that exact model in live commerce tests. @@ -125,9 +186,9 @@ go test -tags integration -v -timeout 10m ./internal/openclaw/ # Integration te go test -tags integration -v -run TestIntegration_Tunnel_SellDiscoverBuySidecar_QuotaAndBalance -timeout 30m ./internal/openclaw/ ``` -## OpenClaw Skills System +## Agent Skills System -Skills are SKILL.md files (with optional scripts and references) that give the agent domain-specific capabilities. Delivered via host-path PVC injection to `/data/.openclaw/skills/` inside the pod. 
+Skills are SKILL.md files (with optional scripts and references) that give the agent domain-specific capabilities. Hermes receives embedded Obol skills through native `skills.external_dirs` at `/data/.hermes/obol-skills` with `OBOL_SKILLS_DIR` set. OpenClaw receives embedded skills through host-path PVC injection to `/data/.openclaw/skills/`. ### Default Embedded Skills @@ -242,9 +303,9 @@ obol sell http qwen35 \ --upstream ollama --port 11434 --namespace llm --health-path /api/tags \ --per-request "0.001" --chain "base-sepolia" --wallet "0x" -# Trigger reconciliation (or wait for heartbeat) -obol kubectl exec -n openclaw-obol-agent deploy/openclaw -c openclaw -- \ - python3 /data/.openclaw/skills/monetize/scripts/monetize.py process qwen35 --namespace llm +# Trigger reconciliation from the default Hermes agent pod +obol kubectl exec -n hermes-obol-agent deploy/hermes -c hermes -- \ + python3 /data/.hermes/obol-skills/monetize/scripts/monetize.py process qwen35 --namespace llm # Verify 402 curl -X POST http://obol.stack:8080/services/qwen35/v1/chat/completions \ diff --git a/.dockerignore b/.dockerignore index 9a420205..1ebc838c 100644 --- a/.dockerignore +++ b/.dockerignore @@ -1,2 +1,2 @@ -.workspace/ +.workspace* .worktrees/ diff --git a/CLAUDE.md b/CLAUDE.md index 1aa5598c..ff45927a 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -4,7 +4,7 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co ## Project Overview -Obol Stack: framework for AI agents to run decentralised infrastructure locally. k3d cluster with OpenClaw AI agent, blockchain networks, payment-gated inference (x402), Cloudflare tunnels. CLI: `github.com/urfave/cli/v3`. +Obol Stack: framework for AI agents to run decentralised infrastructure locally. k3d cluster with a Hermes default AI agent, optional OpenClaw instances, blockchain networks, payment-gated inference (x402), and Cloudflare tunnels. CLI: `github.com/urfave/cli/v3`. 
## Conventions @@ -60,6 +60,7 @@ obol ├── agent init (deploys obol-agent singleton) ├── network list, install, add, remove, status, sync, delete ├── sell inference, http, list, status, stop, delete, pricing, register +├── hermes onboard, setup, sync, list, delete, token, wallet, skills ├── openclaw onboard, setup, sync, list, delete, dashboard, cli, token, skills ├── model setup, status ├── app install, sync, list, delete @@ -73,7 +74,7 @@ obol Deployed on `obol stack up` from `internal/embed/infrastructure/`. Key templates in `base/templates/`: `llm.yaml` (LiteLLM + Ollama), `x402.yaml` (verifier + serviceoffer-controller), `obol-agent.yaml` (singleton), `serviceoffer-crd.yaml`, `registrationrequest-crd.yaml`, `obol-agent-monetize-rbac.yaml`, `local-path.yaml`. Plus `cloudflared/` chart and `values/` for eRPC, monitoring, frontend. -Components: eRPC (`erpc` ns), Frontend (`obol-frontend` ns), Cloudflared (`traefik` ns), Monitoring/Prometheus (`monitoring` ns), LiteLLM (`llm` ns), x402-verifier (`x402` ns), serviceoffer-controller (`x402` ns), obol-agent (`openclaw-obol-agent` ns), ServiceOffer + RegistrationRequest CRDs. +Components: eRPC (`erpc` ns), Frontend (`obol-frontend` ns), Cloudflared (`traefik` ns), Monitoring/Prometheus (`monitoring` ns), LiteLLM (`llm` ns), x402-verifier (`x402` ns), serviceoffer-controller (`x402` ns), default obol-agent (`hermes-obol-agent` ns), ServiceOffer + RegistrationRequest CRDs. ## Monetize Subsystem @@ -83,7 +84,7 @@ Payment-gated access to cluster services via x402 (HTTP 402 micropayments, USDC **Buy-side flow**: `buy.py probe` sees 402 pricing → `buy.py buy` pre-signs ERC-3009 auths into a `PurchaseRequest` CR in the agent namespace → serviceoffer-controller writes buyer config/auth files into `llm` and publishes `paid/` → the in-pod `x402-buyer` sidecar spends one auth per paid request. Agent-managed refill runs through `buy.py process --all`, not the controller. 
-**buy.py** lives at `/data/.openclaw/skills/buy-inference/scripts/buy.py` inside the agent pod (skill name: `buy-inference`, not `buy`). Commands: +**buy.py** lives at `${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py` inside the agent pod (skill name: `buy-inference`, not `buy`). Commands: ``` probe [--model ] Probe x402 pricing from a 402 endpoint buy --endpoint --model Pre-sign ERC-3009 auths + create PurchaseRequest @@ -179,6 +180,21 @@ k3d: 1 server, ports 80:80 + 8080:80 + 443:443 + 8443:443, `rancher/k3s:v1.35.1- **Local access**: On macOS, port 80 is privileged and may not bind without root. Always use `http://obol.stack:8080/` (not `http://obol.stack/`) for local browser and curl access. Port 8080 maps to the same Traefik load balancer as port 80. +### Dev Registry Cache + +When `OBOL_DEVELOPMENT=true`, `obol stack up` creates pull-through k3d registry caches and wires new clusters to use them on image pulls: + +- `docker.io` -> `k3d-obol-docker-io.localhost:54100` +- `ghcr.io` -> `k3d-obol-ghcr-io.localhost:54101` +- `quay.io` -> `k3d-obol-quay-io.localhost:54102` + +The generated k3d registry config is written to `$OBOL_CONFIG_DIR/registries.yaml`. Cache data is stored under `~/.local/state/obol/registry-cache/` by default, or under `OBOL_REGISTRY_CACHE_DIR` when set. + +Important caveats: + +- This is a pull-through cache for upstream registries, not a first-class local build registry workflow. +- It is only set up during cluster creation. If `obol stack up` is just starting an existing k3d cluster, registry setup is skipped. + ## LLM Routing **Service access from the Mac host** — not every cluster service is reachable via `obol.stack:8080`. Only routes published through Traefik are externally accessible. 
Everything else is ClusterIP-only and requires `kubectl port-forward`: @@ -194,7 +210,7 @@ k3d: 1 server, ports 80:80 + 8080:80 + 443:443 + 8443:443, `rancher/k3s:v1.35.1- **x402-buyer sidecar is distroless** — no `wget`, `curl`, or shell inside the container. Use port-forward from the host, not `kubectl exec`. -**LiteLLM gateway** (`llm` ns, port 4000): OpenAI-compatible proxy routing to Ollama/Anthropic/OpenAI. ConfigMap `litellm-config` (YAML config.yaml with model_list), Secret `litellm-secrets` (master key + API keys). Auto-configured with Ollama models during `obol stack up` (no manual `obol model setup` needed). `ConfigureLiteLLM()` patches config + Secret + restarts or hot-adds via the LiteLLM model API. Paid remote inference uses the Obol LiteLLM fork plus the `x402-buyer` sidecar, with a static `paid/* -> openai/* -> http://127.0.0.1:8402` route and explicit paid-model entries when needed. OpenClaw always routes through LiteLLM (openai provider slot), never native providers; `dangerouslyDisableDeviceAuth` is enabled for Traefik-proxied access. +**LiteLLM gateway** (`llm` ns, port 4000): OpenAI-compatible proxy routing to Ollama/Anthropic/OpenAI. ConfigMap `litellm-config` (YAML config.yaml with model_list), Secret `litellm-secrets` (master key + API keys). Auto-configured with Ollama models during `obol stack up` (no manual `obol model setup` needed). `ConfigureLiteLLM()` patches config + Secret + restarts or hot-adds via the LiteLLM model API. Paid remote inference uses the Obol LiteLLM fork plus the `x402-buyer` sidecar, with a static `paid/* -> openai/* -> http://127.0.0.1:8402/v1` route and explicit paid-model entries when needed. Hermes uses a custom OpenAI-compatible provider pointed at LiteLLM; optional OpenClaw instances use the OpenAI provider slot. `dangerouslyDisableDeviceAuth` is enabled for Traefik-proxied access. 
**Auto-configuration**: During `obol stack up`, `autoConfigureLLM()` detects host Ollama models and patches LiteLLM config so agent chat works immediately without manual `obol model setup`. During install, `obolup.sh` `check_agent_model_api_key()` reads `~/.openclaw/openclaw.json` agent model, resolves API key from environment (`ANTHROPIC_API_KEY`, `CLAUDE_CODE_OAUTH_TOKEN` for Anthropic; `OPENAI_API_KEY` for OpenAI), and exports it for downstream tools. @@ -204,9 +220,13 @@ k3d: 1 server, ports 80:80 + 8080:80 + 443:443 + 8443:443, `rancher/k3s:v1.35.1- `obol sell inference` — standalone OpenAI-compatible HTTP gateway with x402 payment gating, for bare metal / Secure Enclave. `--vm` flag runs Ollama in Apple Containerization Linux VM. Key code: `internal/inference/` (gateway, container, store) and `internal/enclave/` (Secure Enclave signing via CGo/Security.framework on Darwin, stub fallback elsewhere). -## OpenClaw & Skills +## Agent Runtimes & Skills + +Hermes is the stack-managed default runtime. Default instance state lives under `applications/hermes/obol-agent`, namespace `hermes-obol-agent`, service/deployment `hermes`, ConfigMap `hermes-config`, and PVC path `$DATA_DIR/hermes-obol-agent/hermes-data/.hermes`. + +OpenClaw remains an optional manual runtime. OpenClaw instances live under `applications/openclaw/`, namespace `openclaw-`, service/deployment `openclaw`, and ConfigMap `openclaw-config`. -Skills = SKILL.md + optional scripts/references, embedded in `obol` binary (`internal/embed/skills/`, 23 skills). Delivered via host-path PVC injection to `$DATA_DIR/openclaw-/openclaw-data/.openclaw/skills/`. Categories: Infrastructure (ethereum-networks, ethereum-local-wallet, obol-stack, distributed-validators, monetize, discovery), Ethereum Dev (addresses, building-blocks, concepts, gas, indexing, l2s, orchestration, security, standards, ship, testing, tools, wallets), Frontend (frontend-playbook, frontend-ux, qa, why). 
+Obol skills = SKILL.md + optional scripts/references, embedded in `obol` binary (`internal/embed/skills/`). Hermes receives them via native `skills.external_dirs` at `/data/.hermes/obol-skills` with `OBOL_SKILLS_DIR` set to that path. OpenClaw receives them via PVC injection at `/data/.openclaw/skills`. **Monetize skill** (`internal/embed/skills/monetize/`): thin compatibility wrapper around ServiceOffer CRUD, controller waiting, and `/skill.md` publication. diff --git a/README.md b/README.md index 0bd81603..9695f746 100644 --- a/README.md +++ b/README.md @@ -49,14 +49,15 @@ obol version obol stack init obol stack up -# Set up your AI agent (interactive — choose a model provider) +# Apply agent capabilities to the default stack-managed agent obol agent init -# Open the agent dashboard -obol openclaw dashboard default +# Inspect the default Hermes agent +obol hermes list +obol hermes token obol-agent ``` -The agent init flow will configure [OpenClaw](https://openclaw.ai) with your chosen model provider (Ollama, Anthropic, or OpenAI) and deploy it to the cluster. +`obol stack up` provisions the default [Hermes Agent](https://github.com/NousResearch/hermes-agent) runtime behind LiteLLM. `obol agent init` applies the controller-based agent capabilities used for monetization and reconciliation. ## Blockchain Networks @@ -142,39 +143,36 @@ obol model setup --provider openai --api-key sk-proj-... obol model status ``` -`model setup` patches the LiteLLM config and Secret with your API key, adds the model to the gateway, restarts LiteLLM, and syncs all deployed OpenClaw instances. +`model setup` patches the LiteLLM config and Secret with your API key, adds the model to the gateway, restarts LiteLLM, and syncs the stack-managed Hermes default agent. -## OpenClaw AI Agent +## AI Agent Runtimes -[OpenClaw](https://openclaw.ai) is the AI agent deployed by the stack. Multiple instances can run side-by-side, each with its own model provider configuration. 
+Hermes is the default AI agent runtime deployed by the stack as `obol-agent`. OpenClaw remains available as an optional manual runtime. Multiple Hermes and OpenClaw instances can run side-by-side. ```bash -# Create and deploy an instance (interactive provider setup) -obol openclaw onboard +# Default stack-managed Hermes agent +obol hermes list +obol hermes token obol-agent +obol hermes skills list + +# Create and deploy an additional Hermes instance +obol hermes onboard --id research -# Reconfigure model provider for an existing instance -obol openclaw setup +# Create and deploy an optional OpenClaw instance +obol openclaw onboard -# List instances +# List optional OpenClaw instances obol openclaw list -# Open the web dashboard +# Open the OpenClaw web dashboard obol openclaw dashboard - -# Manage skills (add, remove, list) -obol openclaw skills list -obol openclaw skills add -obol openclaw skills remove - -# Remove an instance -obol openclaw delete --force ``` -When only one OpenClaw instance is installed, the instance ID is optional — it is auto-selected. With multiple instances, specify the name: `obol openclaw setup prod`. +When only one runtime-specific instance is installed, the instance ID is optional. With multiple instances, specify the name: `obol hermes sync research` or `obol openclaw setup prod`. ### Skills -OpenClaw ships with 21 embedded skills that are installed automatically on first deploy. Skills give the agent domain-specific capabilities — from querying blockchains to understanding Ethereum development patterns. +The stack ships with embedded Obol skills that are installed automatically for the default Hermes agent and for OpenClaw instances. Skills give the agent domain-specific capabilities — from querying blockchains to understanding Ethereum development patterns. #### Infrastructure @@ -295,8 +293,8 @@ obol sell list obol sell status -n # 4) Buyer wallet and balances are available. 
-obol kubectl exec -n openclaw-obol-agent -- \ - python3 /data/.openclaw/skills/buy-inference/scripts/buy.py balance +obol kubectl exec -n hermes-obol-agent deploy/hermes -c hermes -- \ + python3 /data/.hermes/obol-skills/buy-inference/scripts/buy.py balance ``` Run the paid tests only after all four checks pass. diff --git a/cmd/obol/hermes.go b/cmd/obol/hermes.go new file mode 100644 index 00000000..2a5b6187 --- /dev/null +++ b/cmd/obol/hermes.go @@ -0,0 +1,210 @@ +package main + +import ( + "context" + "errors" + + "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/hermes" + "github.com/urfave/cli/v3" +) + +func hermesCommand(cfg *config.Config) *cli.Command { + return &cli.Command{ + Name: "hermes", + Aliases: []string{"herme"}, + Usage: "Manage Hermes agent instances", + Commands: []*cli.Command{ + { + Name: "onboard", + Usage: "Create and deploy a Hermes instance", + Flags: []cli.Flag{ + &cli.StringFlag{ + Name: "id", + Usage: "Instance ID (defaults to generated petname)", + }, + &cli.BoolFlag{ + Name: "force", + Aliases: []string{"f"}, + Usage: "Overwrite existing instance", + }, + &cli.BoolFlag{ + Name: "no-sync", + Usage: "Only scaffold config, don't deploy to cluster", + }, + }, + Action: func(ctx context.Context, cmd *cli.Command) error { + return hermes.Onboard(cfg, hermes.OnboardOptions{ + ID: cmd.String("id"), + Force: cmd.Bool("force"), + Sync: !cmd.Bool("no-sync"), + }, getUI(cmd)) + }, + }, + { + Name: "sync", + Usage: "Deploy or update a Hermes instance", + ArgsUsage: "[instance-name]", + Action: func(ctx context.Context, cmd *cli.Command) error { + id, _, err := hermes.ResolveInstance(cfg, cmd.Args().Slice()) + if err != nil { + return err + } + return hermes.Sync(cfg, id, getUI(cmd)) + }, + }, + { + Name: "token", + Usage: "Retrieve or regenerate the Hermes API server token", + ArgsUsage: "[instance-name]", + Flags: []cli.Flag{ + &cli.BoolFlag{ + Name: "regenerate", + Usage: "Delete and regenerate 
the API server token (restarts the instance)", + }, + }, + Action: func(ctx context.Context, cmd *cli.Command) error { + id, _, err := hermes.ResolveInstance(cfg, cmd.Args().Slice()) + if err != nil { + return err + } + + u := getUI(cmd) + if cmd.Bool("regenerate") { + newToken, err := hermes.RegenerateToken(cfg, id, u) + if err != nil { + return err + } + u.Print(newToken) + return nil + } + + return hermes.Token(cfg, id, u) + }, + }, + { + Name: "list", + Usage: "List Hermes instances", + Action: func(ctx context.Context, cmd *cli.Command) error { + return hermes.List(cfg, getUI(cmd)) + }, + }, + { + Name: "delete", + Usage: "Remove a Hermes instance and its cluster resources", + ArgsUsage: "[instance-name]", + Flags: []cli.Flag{ + &cli.BoolFlag{ + Name: "force", + Aliases: []string{"f"}, + Usage: "Skip confirmation prompt", + }, + }, + Action: func(ctx context.Context, cmd *cli.Command) error { + id, _, err := hermes.ResolveInstance(cfg, cmd.Args().Slice()) + if err != nil { + return err + } + return hermes.Delete(cfg, id, cmd.Bool("force"), getUI(cmd)) + }, + }, + { + Name: "setup", + Usage: "Re-render Hermes config from the current LiteLLM inventory", + ArgsUsage: "[instance-name]", + Action: func(ctx context.Context, cmd *cli.Command) error { + id, _, err := hermes.ResolveInstance(cfg, cmd.Args().Slice()) + if err != nil { + return err + } + return hermes.Setup(cfg, id, hermes.SetupOptions{}, getUI(cmd)) + }, + }, + { + Name: "dashboard", + Usage: "Pending product decision for Hermes-native dashboard behavior", + ArgsUsage: "[instance-name]", + Action: func(ctx context.Context, cmd *cli.Command) error { + return errors.New("Hermes dashboard semantics diverge from OpenClaw; choose a native Hermes dashboard flow or an Obol wrapper before enabling this command") + }, + }, + { + Name: "wallet", + Usage: "Inspect Hermes instance wallets", + Commands: []*cli.Command{ + { + Name: "address", + Usage: "Show the wallet address for a Hermes instance", + ArgsUsage: 
"[instance-name]", + Action: func(ctx context.Context, cmd *cli.Command) error { + args := cmd.Args().Slice() + + if len(args) == 0 { + addr, err := hermes.ResolveWalletAddress(cfg) + if err != nil { + return err + } + getUI(cmd).Print(addr) + return nil + } + + id, _, err := hermes.ResolveInstance(cfg, args) + if err != nil { + return err + } + + walletInfo, err := hermes.ReadWalletMetadata(hermes.DeploymentPath(cfg, id)) + if err != nil { + return err + } + getUI(cmd).Print(walletInfo.Address) + return nil + }, + }, + { + Name: "list", + Usage: "List wallets for Hermes instances", + ArgsUsage: "[instance-name]", + Action: func(ctx context.Context, cmd *cli.Command) error { + args := cmd.Args().Slice() + + var id string + if len(args) > 0 { + var err error + id, _, err = hermes.ResolveInstance(cfg, args) + if err != nil { + return err + } + } + + return hermes.ListWallets(cfg, id, getUI(cmd)) + }, + }, + }, + }, + { + Name: "skills", + Usage: "Run native Hermes skills commands against a deployed instance", + ArgsUsage: "[instance-name] [-- ]", + SkipFlagParsing: true, + Action: func(ctx context.Context, cmd *cli.Command) error { + id, remaining, err := hermes.ResolveInstance(cfg, cmd.Args().Slice()) + if err != nil { + return err + } + + return hermes.Skills(cfg, id, rawArgsAfterSeparator(remaining)) + }, + }, + }, + } +} + +func rawArgsAfterSeparator(args []string) []string { + for i, arg := range args { + if arg == "--" { + return args[i+1:] + } + } + return args +} diff --git a/cmd/obol/hermes_test.go b/cmd/obol/hermes_test.go new file mode 100644 index 00000000..4883c553 --- /dev/null +++ b/cmd/obol/hermes_test.go @@ -0,0 +1,42 @@ +package main + +import "testing" + +func TestHermesCommand_Structure(t *testing.T) { + cfg := newTestConfig(t) + cmd := hermesCommand(cfg) + + expected := map[string]bool{ + "onboard": false, + "sync": false, + "token": false, + "list": false, + "delete": false, + "setup": false, + "dashboard": false, + "wallet": false, + "skills": 
false, + } + + for _, sub := range cmd.Commands { + if _, ok := expected[sub.Name]; ok { + expected[sub.Name] = true + } + } + + for name, found := range expected { + if !found { + t.Errorf("missing Hermes subcommand %q", name) + } + } +} + +func TestHermesSkillsCommand_UsesRawFlagParsing(t *testing.T) { + cfg := newTestConfig(t) + cmd := hermesCommand(cfg) + skills := findSubcommand(t, cmd, "skills") + + if !skills.SkipFlagParsing { + t.Fatal("Hermes skills command should pass native Hermes flags through") + } +} diff --git a/cmd/obol/main.go b/cmd/obol/main.go index 86a392a2..5c495ab8 100644 --- a/cmd/obol/main.go +++ b/cmd/obol/main.go @@ -51,7 +51,16 @@ COMMANDS: network status Show eRPC gateway health and upstreams network delete Remove network deployment - OpenClaw (AI Agent): + Hermes (Default Agent Runtime): + hermes onboard Create and deploy a Hermes instance + hermes setup Re-render Hermes config for a deployed instance + hermes sync Deploy or update a Hermes instance + hermes token Retrieve Hermes API server token + hermes list List Hermes instances + hermes delete Remove instance and cluster resources + hermes wallet Inspect Hermes wallets + + OpenClaw (Alternate Agent Runtime): openclaw onboard Create and deploy an OpenClaw instance openclaw setup Reconfigure model providers for a deployed instance openclaw dashboard Open the dashboard in a browser @@ -365,6 +374,7 @@ GLOBAL OPTIONS:{{template "visibleFlagTemplate" .}}{{end}} updateCommand(cfg), upgradeCommand(cfg), networkCommand(cfg), + hermesCommand(cfg), openclawCommand(cfg), sellCommand(cfg), modelCommand(cfg), diff --git a/cmd/obol/model.go b/cmd/obol/model.go index cf5c78fa..a0375fb5 100644 --- a/cmd/obol/model.go +++ b/cmd/obol/model.go @@ -10,8 +10,8 @@ import ( "strings" "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/hermes" "github.com/ObolNetwork/obol-stack/internal/model" - "github.com/ObolNetwork/obol-stack/internal/openclaw" 
"github.com/ObolNetwork/obol-stack/internal/ui" "github.com/urfave/cli/v3" ) @@ -174,7 +174,7 @@ func setupOllama(cfg *config.Config, u *ui.UI, models []string) error { u.Successf("Ollama configured. To change later, run: obol model setup (or obol model remove )") - return syncOpenClawModels(cfg, u) + return syncAgentModels(cfg, u) } func setupCloudProvider(cfg *config.Config, u *ui.UI, provider, apiKey string, models []string) error { @@ -213,32 +213,24 @@ func setupCloudProvider(cfg *config.Config, u *ui.UI, provider, apiKey string, m u.Print("") u.Successf("Model configured. To change later, run: obol model setup (or obol model remove )") - return syncOpenClawModels(cfg, u) + return syncAgentModels(cfg, u) } -// syncOpenClawModels reads the full LiteLLM model list and updates all -// deployed OpenClaw instances so their "openai" provider (LiteLLM gateway) -// model list stays in sync. This prevents OpenClaw from trying to use -// native provider routing for models it discovers but doesn't recognise. -func syncOpenClawModels(cfg *config.Config, u *ui.UI) error { - allModels, err := model.GetConfiguredModels(cfg) - if err != nil { - u.Warnf("Could not read LiteLLM model list: %v", err) - return nil // non-fatal - } - - return openclaw.SyncOverlayModels(cfg, allModels, u) +// syncAgentModels re-renders the stack-managed Hermes default agent from the +// current LiteLLM model inventory. 
+func syncAgentModels(cfg *config.Config, u *ui.UI) error { + return hermes.SyncDefaultModels(cfg, u) } func modelSyncCommand(cfg *config.Config) *cli.Command { return &cli.Command{ Name: "sync", - Usage: "Sync LiteLLM model list to all OpenClaw instances", + Usage: "Sync LiteLLM model list to the stack-managed Hermes agent", Action: func(ctx context.Context, cmd *cli.Command) error { u := getUI(cmd) u.Info("Reading model list from LiteLLM...") - return syncOpenClawModels(cfg, u) + return syncAgentModels(cfg, u) }, } } @@ -264,7 +256,7 @@ func modelSetupCustomCommand(cfg *config.Config) *cli.Command { return err } - return syncOpenClawModels(cfg, u) + return syncAgentModels(cfg, u) }, } } @@ -514,7 +506,7 @@ func modelRemoveCommand(cfg *config.Config) *cli.Command { return err } - return syncOpenClawModels(cfg, u) + return syncAgentModels(cfg, u) }, } } diff --git a/cmd/obol/sell.go b/cmd/obol/sell.go index d2d7ce45..828aa2f0 100644 --- a/cmd/obol/sell.go +++ b/cmd/obol/sell.go @@ -24,10 +24,10 @@ import ( "github.com/ObolNetwork/obol-stack/internal/config" "github.com/ObolNetwork/obol-stack/internal/enclave" "github.com/ObolNetwork/obol-stack/internal/erc8004" + "github.com/ObolNetwork/obol-stack/internal/hermes" "github.com/ObolNetwork/obol-stack/internal/inference" "github.com/ObolNetwork/obol-stack/internal/kubectl" "github.com/ObolNetwork/obol-stack/internal/monetizeapi" - "github.com/ObolNetwork/obol-stack/internal/openclaw" "github.com/ObolNetwork/obol-stack/internal/schemas" "github.com/ObolNetwork/obol-stack/internal/stack" "github.com/ObolNetwork/obol-stack/internal/tee" @@ -185,7 +185,7 @@ Examples: wallet := cmd.String("wallet") if wallet == "" { - if resolved, err := openclaw.ResolveWalletAddress(cfg); err == nil { + if resolved, err := hermes.ResolveWalletAddress(cfg); err == nil { wallet = resolved u.Infof("Using wallet from remote-signer: %s", wallet) } else if u.IsTTY() { @@ -484,7 +484,7 @@ Examples: }, &cli.StringFlag{ Name: "private-key-file", - 
Usage: "Path to the ERC-8004 signing key file (defaults to the OpenClaw remote-signer wallet)", + Usage: "Path to the ERC-8004 signing key file (defaults to the Hermes remote-signer wallet)", }, &cli.StringSliceFlag{ Name: "register-skills", @@ -572,7 +572,7 @@ Examples: // Auto-discover wallet from remote-signer if not set. wallet := cmd.String("wallet") if wallet == "" { - if resolved, err := openclaw.ResolveWalletAddress(cfg); err == nil { + if resolved, err := hermes.ResolveWalletAddress(cfg); err == nil { wallet = resolved u.Infof("Using wallet from remote-signer: %s", wallet) } else if u.IsTTY() { @@ -796,8 +796,8 @@ func autoRegisterServiceOffer(ctx context.Context, cfg *config.Config, u *ui.UI, ) if strings.TrimSpace(opts.PrivateKeyInput) == "" { - if _, err := openclaw.ResolveWalletAddress(cfg); err == nil { - ns, nsErr := openclaw.ResolveInstanceNamespace(cfg) + if _, err := hermes.ResolveWalletAddress(cfg); err == nil { + ns, nsErr := hermes.ResolveInstanceNamespace(cfg) if nsErr == nil { pf, pfErr := startSignerPortForward(cfg, ns) if pfErr != nil { @@ -1624,7 +1624,7 @@ Reloads the payment verifier when configuration is changed.`, wallet := cmd.String("wallet") if wallet == "" { - if resolved, err := openclaw.ResolveWalletAddress(cfg); err == nil { + if resolved, err := hermes.ResolveWalletAddress(cfg); err == nil { wallet = resolved u.Infof("Using wallet from remote-signer: %s", wallet) } else { @@ -1753,14 +1753,14 @@ Examples: agentURI := endpoint + "/.well-known/agent-registration.json" // Determine signing method: private key file (if explicitly provided) - // or remote-signer (default when OpenClaw agent is deployed). + // or remote-signer (default when Hermes agent is deployed). useRemoteSigner := false var signerNS string // If --private-key-file is explicitly provided, honour user intent. 
 	if !cmd.IsSet("private-key-file") {
-		if _, err := openclaw.ResolveWalletAddress(cfg); err == nil {
-			ns, nsErr := openclaw.ResolveInstanceNamespace(cfg)
+		if _, err := hermes.ResolveWalletAddress(cfg); err == nil {
+			ns, nsErr := hermes.ResolveInstanceNamespace(cfg)
 			if nsErr == nil {
 				useRemoteSigner = true
 				signerNS = ns
diff --git a/docs/getting-started.md b/docs/getting-started.md
index 9d121ba7..94dee367 100644
--- a/docs/getting-started.md
+++ b/docs/getting-started.md
@@ -48,9 +48,9 @@ On first run, `stack up` will:
 1. Create the k3d cluster
 2. Deploy infrastructure (Traefik, monitoring, LLM gateway, etc.)
 3. Build and import the x402-verifier image (development mode only)
-4. Deploy a default OpenClaw agent instance with 23 skills
+4. Deploy a default Hermes agent instance with embedded Obol skills
 5. Generate an Ethereum signing wallet for the agent
-6. Import your local workspace (if `~/.openclaw/` exists)
+6. Import runtime state for the stack-managed agent
 
 ## Step 2 -- Verify the Cluster
@@ -70,8 +70,8 @@ All pods should show `Running` or `Completed` within ~2 minutes:
 | **Monitoring** | `monitoring` | Prometheus + kube-prometheus-stack |
 | **Reloader** | `reloader` | Auto-restarts workloads on config changes |
 | **x402 Gateway** | `x402` | Shared seller-owned payment gateway for priced HTTP routes |
-| **OpenClaw** | `openclaw-default` | AI agent with Ethereum wallet |
-| **Remote Signer** | `openclaw-default` | Ethereum transaction signing service |
+| **Hermes** | `hermes-obol-agent` | Default AI agent with Ethereum wallet |
+| **Remote Signer** | `hermes-obol-agent` | Ethereum transaction signing service |
 
 Open the frontend: http://obol.stack/
@@ -150,22 +150,21 @@ A successful response contains `tool_calls` with `get_weather` and `{"location":
 
 ## Step 4 -- Deploy the AI Agent
 
-The default OpenClaw instance was created during `stack up`. To deploy an additional agent:
+The default Hermes instance is created during `stack up`. Apply the agent-specific reconciliation capabilities with:
 
 ```bash
 obol agent init
 ```
 
-This creates an `obol-agent` instance with:
+The default `obol-agent` instance includes:
 
 - A unique Ethereum signing wallet
-- 23 embedded skills (Ethereum queries, monetization, cluster diagnostics, etc.)
 - RBAC permissions to manage ServiceOffers and Kubernetes resources
-- A heartbeat that runs the agent periodically
+- Hermes routed through LiteLLM for model access
 
 List all agent instances:
 
 ```bash
-obol openclaw list
+obol hermes list
 ```
 
 ## Step 5 -- Test Agent Inference
@@ -174,22 +173,22 @@ Get the gateway token for your agent instance:
 
 ```bash
 # For the default instance
-obol openclaw token default
+obol hermes token default
 
 # For obol-agent
-obol openclaw token obol-agent
+obol hermes token obol-agent
 ```
 
 Test inference through the agent gateway:
 
 ```bash
-TOKEN=$(obol openclaw token default)
+TOKEN=$(obol hermes token default)
 
-obol kubectl port-forward -n openclaw-default svc/openclaw 18789:18789 &
+obol kubectl port-forward -n hermes-obol-agent svc/hermes 8642:8642 &
 PF_PID=$!
 sleep 3
 
-curl -s --max-time 120 -X POST http://localhost:18789/v1/chat/completions \
+curl -s --max-time 120 -X POST http://localhost:8642/v1/chat/completions \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
   -d '{"model":"qwen3.5:35b","messages":[{"role":"user","content":"What is 2+2?"}],"max_tokens":50,"stream":false}' \
@@ -198,7 +197,7 @@ curl -s --max-time 120 -X POST http://localhost:18789/v1/chat/completions \
 
 kill $PF_PID
 ```
 
-This confirms the full inference chain: **OpenClaw → LiteLLM → Ollama**.
+This confirms the full inference chain: **Hermes → LiteLLM → Ollama**.
 
 ## Step 6 -- Deploy a Blockchain Network
diff --git a/docs/guides/monetize-inference.md b/docs/guides/monetize-inference.md
index 327bd7de..61e4f2dc 100644
--- a/docs/guides/monetize-inference.md
+++ b/docs/guides/monetize-inference.md
@@ -82,7 +82,7 @@ Verify the key components:
 | Check | Command | Expected |
 |-------|---------|----------|
 | Cluster nodes | `obol kubectl get nodes` | 1 node Ready |
-| Agent running | `obol kubectl get pods -n openclaw-obol-agent` | Running |
+| Agent running | `obol kubectl get pods -n hermes-obol-agent` | Running |
 | CRD installed | `obol kubectl get crd serviceoffers.obol.org` | Found |
 | x402 verifier | `obol kubectl get pods -n x402` | 2 replicas Running |
 | Traefik gateway | `obol kubectl get gateway -n traefik` | traefik-gateway |
@@ -751,23 +751,23 @@ Missing any of these fields causes the facilitator to reject the payment before
 
 ### RBAC: forbidden
 
-If the OpenClaw agent cannot create or patch Kubernetes resources (ServiceOffers, Middlewares, HTTPRoutes), the ClusterRoleBindings may have empty `subjects` lists. Patch them manually:
+If the default Hermes agent cannot create or patch Obol resources, the RBAC bindings may have empty `subjects` lists. Patch them manually:
 
 ```bash
-# Patch both ClusterRoleBindings
-for BINDING in openclaw-monetize-read-binding openclaw-monetize-workload-binding; do
+# Patch the read ClusterRoleBinding
+for BINDING in openclaw-monetize-read-binding; do
   kubectl patch clusterrolebinding "$BINDING" \
     --type=json \
-    -p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"openclaw","namespace":"openclaw-obol-agent"}]}]'
+    -p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"hermes","namespace":"hermes-obol-agent"}]}]'
 done
 
-# Patch x402 namespace RoleBinding
-kubectl patch rolebinding openclaw-x402-pricing-binding -n x402 \
+# Patch the default-agent write RoleBinding
+kubectl patch rolebinding openclaw-monetize-write-binding -n hermes-obol-agent \
   --type=json \
-  -p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"openclaw","namespace":"openclaw-obol-agent"}]}]'
+  -p '[{"op":"add","path":"/subjects","value":[{"kind":"ServiceAccount","name":"hermes","namespace":"hermes-obol-agent"}]}]'
 ```
 
-Replace `openclaw-obol-agent` with your actual OpenClaw namespace if different.
+Replace `hermes-obol-agent` with your actual Hermes namespace if different.
 
 ---
diff --git a/flows/flow-01-prerequisites.sh b/flows/flow-01-prerequisites.sh
index 95b5c06e..67287246 100755
--- a/flows/flow-01-prerequisites.sh
+++ b/flows/flow-01-prerequisites.sh
@@ -49,7 +49,7 @@ else
   fail "Missing Python packages and automatic venv setup failed — install eth-account httpx"
 fi
 
-# The default OpenClaw deployment depends on the published remote-signer chart.
+# The default Hermes deployment depends on the published remote-signer chart.
 step "remote-signer Helm chart version is published"
 rs_version=$(remote_signer_chart_version)
 if [ -z "$rs_version" ]; then
diff --git a/flows/flow-02-stack-init-up.sh b/flows/flow-02-stack-init-up.sh
index 607defcd..305c47ba 100755
--- a/flows/flow-02-stack-init-up.sh
+++ b/flows/flow-02-stack-init-up.sh
@@ -13,6 +13,9 @@ else
   run_step "obol stack up" "$OBOL" stack up
 fi
 
+refresh_obol_ingress_env
+INGRESS_URL="${OBOL_INGRESS_URL%/}"
+
 # §1: Verify stack config directory has required files (created by obol stack init)
 step "Stack config has cluster ID and kubeconfig"
 STACK_ID=$(cat "$OBOL_CONFIG_DIR/.stack-id" 2>/dev/null || true)
@@ -34,13 +37,12 @@ else
   fail "k3s server version unexpected — ${kube_ver:0:100}"
 fi
 
-# Poll for the core platform to settle. PR 299 no longer depends on the
-# default OpenClaw instance for ServiceOffer reconciliation, so exclude the
-# openclaw namespace here and validate it separately in flow-04.
-step "Core platform pods Running or Completed (excluding openclaw/cloudflared, max 180x5s)"
+# Poll for the core platform to settle. Default agent runtimes are validated
+# separately in flow-04, so exclude their namespaces from the platform gate.
+step "Core platform pods Running or Completed (excluding agent runtimes/cloudflared, max 180x5s)"
 for i in $(seq 1 180); do
   pod_output=$("$OBOL" kubectl get pods -A --no-headers 2>&1)
-  platform_pods=$(echo "$pod_output" | grep -v '^openclaw-' | grep -v ' cloudflared-' || true)
+  platform_pods=$(echo "$pod_output" | grep -v -E '^(openclaw|hermes)-' | grep -v ' cloudflared-' || true)
   bad_pods=$(echo "$platform_pods" | grep -v -E "Running|Completed" || true)
   if [ -z "$bad_pods" ]; then
     pass "All pods healthy (attempt $i)"
@@ -53,8 +55,8 @@ for i in $(seq 1 180); do
 done
 
 # Frontend via Traefik — wait up to 5 min for DNS + Traefik to be ready
-poll_step "Frontend at http://obol.stack:8080/" 60 5 \
-  $CURL_OBOL -sf --max-time 5 http://obol.stack:8080/
+poll_step "Frontend at $INGRESS_URL/" 60 5 \
+  $CURL_OBOL -sf --max-time 5 "$INGRESS_URL/"
 
 # §6: obol network list shows available networks (getting-started §6)
 # Tests the network management CLI without deploying any Ethereum clients.
@@ -69,7 +71,7 @@ run_step_grep "obol network status shows eRPC upstreams" \
 
 # §6/§1.6: eRPC /rpc JSON lists base-sepolia among available chains + all states OK
 step "eRPC /rpc lists base-sepolia (required for x402 payment chain)"
-erpc_json=$($CURL_OBOL -sf --max-time 5 http://obol.stack:8080/rpc 2>&1) || true
+erpc_json=$($CURL_OBOL -sf --max-time 5 "$INGRESS_URL/rpc" 2>&1) || true
 if echo "$erpc_json" | python3 -c "
 import sys, json
 d = json.load(sys.stdin)
@@ -98,7 +100,7 @@ fi
 
 # §2: Frontend returns the Obol Stack Next.js app (getting-started §2 Key URLs)
 step "Frontend serves Next.js app"
-frontend_out=$($CURL_OBOL -sf --max-time 10 http://obol.stack:8080/ 2>&1) || true
+frontend_out=$($CURL_OBOL -sf --max-time 10 "$INGRESS_URL/" 2>&1) || true
 if echo "$frontend_out" | grep -q "_next\|html"; then
   pass "Frontend returns Next.js app HTML"
 else
@@ -109,7 +111,7 @@ fi
 
 # Test an actual eth_blockNumber call via the eRPC proxy to verify end-to-end routing.
 step "eRPC proxies eth_blockNumber to mainnet"
 erpc_rpc_out=$($CURL_OBOL -sf --max-time 15 -X POST \
-  "http://obol.stack:8080/rpc/evm/1" \
+  "$INGRESS_URL/rpc/evm/1" \
   -H "Content-Type: application/json" \
   -d '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' 2>&1) || true
 if echo "$erpc_rpc_out" | python3 -c "
@@ -128,7 +130,7 @@ fi
 
 # eth_chainId should return 0x14a34 = 84532 confirming correct chain routing
 step "eRPC proxies Base Sepolia (chain 84532) for x402 payments"
 erpc_basesep=$($CURL_OBOL -sf --max-time 15 -X POST \
-  "http://obol.stack:8080/rpc/evm/84532" \
+  "$INGRESS_URL/rpc/evm/84532" \
   -H "Content-Type: application/json" \
   -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}' 2>&1) || true
 if echo "$erpc_basesep" | python3 -c "
diff --git a/flows/flow-04-agent.sh b/flows/flow-04-agent.sh
index 1bb029dd..a6e781e6 100755
--- a/flows/flow-04-agent.sh
+++ b/flows/flow-04-agent.sh
@@ -1,26 +1,26 @@
 #!/bin/bash
 # Flow 04: Agent Init + Inference — getting-started.md §4-5.
-# Tests: agent init, openclaw list, token, agent gateway inference.
+# Tests: agent init, hermes list, token, agent gateway inference.
 
 source "$(dirname "$0")/lib.sh"
 
 # §4: Deploy AI Agent (idempotent)
 run_step "obol agent init" "$OBOL" agent init
 
 # List agent instances — verify name AND URL are shown (getting-started §4)
-run_step_grep "openclaw list shows instances" "obol-agent\|default" "$OBOL" openclaw list
-step "openclaw list shows agent URL"
-list_out=$("$OBOL" openclaw list 2>&1) || true
+run_step_grep "hermes list shows instances" "obol-agent" "$OBOL" hermes list
+step "hermes list shows agent URL"
+list_out=$("$OBOL" hermes list 2>&1) || true
 if echo "$list_out" | grep -q "obol.stack\|URL:"; then
   url=$(echo "$list_out" | grep -oE 'http://[a-z0-9.-]+' | head -1)
-  pass "openclaw list shows agent URL: $url"
+  pass "hermes list shows agent URL: $url"
 else
-  fail "openclaw list missing URL — ${list_out:0:200}"
+  fail "hermes list missing URL — ${list_out:0:200}"
 fi
 
 # PR 299 moves monetization reconciliation to serviceoffer-controller.
 # agent init should remove the legacy heartbeat file instead of injecting it.
 step "Legacy HEARTBEAT.md removed from agent workspace"
-HEARTBEAT_FILE="$OBOL_DATA_DIR/openclaw-obol-agent/openclaw-data/.openclaw/workspace/HEARTBEAT.md"
+HEARTBEAT_FILE="$OBOL_DATA_DIR/hermes-obol-agent/hermes-data/.hermes/workspace/HEARTBEAT.md"
 if [ ! -f "$HEARTBEAT_FILE" ]; then
   pass "Legacy HEARTBEAT.md removed (controller owns reconciliation)"
 else
@@ -30,30 +30,30 @@ fi
 
 run_step_grep "serviceoffer-controller running" "Running" \
   "$OBOL" kubectl get pods -n x402 -l app=serviceoffer-controller --no-headers
 
-# §5: OpenClaw service on port 18789 (getting-started §5 uses port-forward 18789:18789)
-step "OpenClaw service on port 18789"
-NS=$("$OBOL" openclaw list 2>/dev/null | grep -oE 'openclaw-[a-z0-9-]+' | head -1 || echo "openclaw-obol-agent")
-oc_port=$("$OBOL" kubectl get svc openclaw -n "$NS" \
+# §5: Hermes service on port 8642 (getting-started §5 uses port-forward 8642:8642)
+step "Hermes service on port 8642"
+NS=$("$OBOL" hermes list 2>/dev/null | grep -oE 'hermes-[a-z0-9-]+' | head -1 || echo "hermes-obol-agent")
+oc_port=$("$OBOL" kubectl get svc hermes -n "$NS" \
   -o jsonpath='{.spec.ports[0].port}' 2>&1) || true
-if [ "$oc_port" = "18789" ]; then
-  pass "OpenClaw service port: 18789 (matches getting-started §5 port-forward)"
+if [ "$oc_port" = "8642" ]; then
+  pass "Hermes service port: 8642 (matches getting-started §5 port-forward)"
 else
-  fail "OpenClaw service port unexpected: $oc_port (expected 18789)"
+  fail "Hermes service port unexpected: $oc_port (expected 8642)"
 fi
 
 # §5: Test Agent Inference
-step "Get openclaw token"
-TOKEN=$("$OBOL" openclaw token obol-agent 2>/dev/null || "$OBOL" openclaw token default 2>/dev/null || true)
+step "Get Hermes API server token"
+TOKEN=$("$OBOL" hermes token obol-agent 2>/dev/null || "$OBOL" hermes token default 2>/dev/null || true)
 if [ -n "$TOKEN" ]; then
   pass "Got token: ${TOKEN:0:8}..."
 else
-  fail "Failed to get openclaw token"
+  fail "Failed to get Hermes token"
   emit_metrics
   exit 0
 fi
 
 # §5: Token is 32-char alphanumeric (validates token generation for gateway auth)
-step "OpenClaw gateway token is 32-char alphanumeric"
+step "Hermes API server token is 32-char alphanumeric"
 if echo "$TOKEN" | grep -qE '^[A-Za-z0-9]{32}$'; then
   pass "Token: ${TOKEN:0:8}... (32 chars, alphanumeric)"
 else
@@ -61,25 +61,28 @@ fi
 
 # Determine the namespace for port-forward
-NS=$("$OBOL" openclaw list 2>/dev/null | grep -oE 'openclaw-[a-z0-9-]+' | head -1 || echo "openclaw-obol-agent")
+NS=$("$OBOL" hermes list 2>/dev/null | grep -oE 'hermes-[a-z0-9-]+' | head -1 || echo "hermes-obol-agent")
 
 step "Agent inference via port-forward"
-kill $(lsof -ti:18789) 2>/dev/null || true
-"$OBOL" kubectl port-forward -n "$NS" svc/openclaw 18789:18789 &>/dev/null &
+AGENT_PF_PORT="${FLOW04_AGENT_PORT:-$(pick_free_port)}"
+"$OBOL" kubectl port-forward -n "$NS" "svc/hermes" "${AGENT_PF_PORT}:8642" &>/dev/null &
 PF_PID=$!
 
-# Poll until port 18789 is accepting connections
+# Poll until the selected local port is accepting connections
 for i in $(seq 1 15); do
-  if curl -sf --max-time 2 http://localhost:18789/health >/dev/null 2>&1; then
+  if curl -sf --max-time 2 "http://localhost:${AGENT_PF_PORT}/health" >/dev/null 2>&1; then
     break
   fi
   sleep 2
 done
 
-out=$(curl -sf --max-time 120 -X POST http://localhost:18789/v1/chat/completions \
+model_name=$("$OBOL" kubectl get cm hermes-config -n "$NS" -o jsonpath='{.data.config\.yaml}' 2>/dev/null | sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1)
+[ -n "$model_name" ] || model_name="qwen3.5:35b"
+
+out=$(curl -sf --max-time 120 -X POST "http://localhost:${AGENT_PF_PORT}/v1/chat/completions" \
   -H "Content-Type: application/json" \
   -H "Authorization: Bearer $TOKEN" \
-  -d "{\"model\":\"openclaw\",\"messages\":[{\"role\":\"user\",\"content\":\"What is 2+2?\"}],\"max_tokens\":50,\"stream\":false}" 2>&1) || true
+  -d "{\"model\":\"$model_name\",\"messages\":[{\"role\":\"user\",\"content\":\"What is 2+2?\"}],\"max_tokens\":50,\"stream\":false}" 2>&1) || true
 
 if echo "$out" | grep -q "choices"; then
   pass "Agent inference returned response"
@@ -89,81 +92,89 @@ fi
 
 cleanup_pid "$PF_PID"
 
-# §4: Verify obol-managed skills are installed (getting-started §4)
-# Skills like sell, buy-inference, discovery, obol-stack are obol-managed.
-step "obol openclaw skills list shows obol-managed skills"
-skills_out=$("$OBOL" openclaw skills list obol-agent 2>&1) || true
-if echo "$skills_out" | grep -q "sell\|buy-inference\|obol-stack"; then
-  ready_count=$(echo "$skills_out" | grep -c "ready" || echo 0)
-  pass "openclaw skills: $ready_count obol-managed skills ready"
-else
-  fail "openclaw skills list missing expected skills — ${skills_out:0:200}"
-fi
-
 # §4: Ethereum signing wallet created by obol agent init (getting-started §4)
 # "A unique Ethereum signing wallet" is listed as a feature of obol agent init.
-step "obol openclaw wallet list shows Ethereum address"
-wallet_out=$("$OBOL" openclaw wallet list obol-agent 2>&1) || true
+step "obol hermes wallet list shows Ethereum address"
+wallet_out=$("$OBOL" hermes wallet list obol-agent 2>&1) || true
 if echo "$wallet_out" | grep -q "0x[0-9a-fA-F]\{40\}\|Address:"; then
   addr=$(echo "$wallet_out" | grep -oE '0x[0-9a-fA-F]{40}' | head -1)
   pass "Agent wallet address: $addr"
 else
-  fail "openclaw wallet list missing address — ${wallet_out:0:200}"
+  fail "hermes wallet list missing address — ${wallet_out:0:200}"
 fi
 
-# §4: OpenClaw gateway health via HTTPRoute URL (getting-started §4 output shows URL)
-# The URL http://openclaw-obol-agent.obol.stack is shown after obol openclaw sync.
-step "OpenClaw gateway health via HTTPRoute hostname"
-OPENCLAW_URL="http://openclaw-obol-agent.obol.stack:8080"
+# §4: Hermes gateway health via HTTPRoute URL (getting-started §4 output shows URL)
+step "Hermes gateway health via HTTPRoute hostname"
+ingress_port=$(k3d_live_ingress_port || true)
+if [ -z "$ingress_port" ]; then
+  ingress_port=$(awk '
+    /- port:/ {
+      split($3, p, ":")
+      if (p[2] == "80") { print p[1]; exit }
+    }
+  ' "$OBOL_CONFIG_DIR/k3d.yaml" 2>/dev/null || true)
+fi
+[ -n "$ingress_port" ] || ingress_port=80
+if [ "$ingress_port" = "80" ]; then
+  HERMES_URL="http://hermes-obol-agent.obol.stack"
+else
+  HERMES_URL="http://hermes-obol-agent.obol.stack:${ingress_port}"
+fi
 
 # Use --resolve to bypass DNS (obol.stack not always in /etc/hosts for subdomains)
-oc_health=$(curl --resolve "openclaw-obol-agent.obol.stack:8080:127.0.0.1" \
-  -sf --max-time 10 "$OPENCLAW_URL/health" 2>&1) || true
-if echo "$oc_health" | grep -q "ok.*true\|status.*live"; then
-  pass "OpenClaw gateway health: $oc_health"
+oc_health=$(curl --resolve "hermes-obol-agent.obol.stack:${ingress_port}:127.0.0.1" \
+  -sf --max-time 10 "$HERMES_URL/health" 2>&1) || true
+if echo "$oc_health" | grep -q "ok\\|status"; then
+  pass "Hermes gateway health: $oc_health"
+else
+  fail "Hermes gateway health check failed — ${oc_health:0:100}"
+fi
+
+step "Hermes native dashboard UI via deeplink"
+HERMES_DASHBOARD_HOST="hermes-obol-agent-ui.obol.stack"
+if [ "$ingress_port" = "80" ]; then
+  HERMES_DASHBOARD_URL="http://${HERMES_DASHBOARD_HOST}"
 else
-  fail "OpenClaw gateway health check failed — ${oc_health:0:100}"
+  HERMES_DASHBOARD_URL="http://${HERMES_DASHBOARD_HOST}:${ingress_port}"
+fi
+dashboard_html=""
+for i in $(seq 1 15); do
+  dashboard_html=$(curl --resolve "${HERMES_DASHBOARD_HOST}:${ingress_port}:127.0.0.1" \
+    -sf --max-time 10 "$HERMES_DASHBOARD_URL/" 2>&1) || true
+  if echo "$dashboard_html" | grep -q "__HERMES_SESSION_TOKEN__"; then
+    pass "Hermes dashboard UI loaded: $HERMES_DASHBOARD_URL"
+    break
+  fi
+  sleep 2
+done
+if ! echo "$dashboard_html" | grep -q "__HERMES_SESSION_TOKEN__"; then
+  fail "Hermes dashboard UI deeplink failed — ${dashboard_html:0:100}"
 fi
 
-# §4: Verify openclaw config still has the expected model/provider wiring.
-oc_config=$("$OBOL" kubectl get cm openclaw-config -n openclaw-obol-agent \
-  -o jsonpath='{.data.openclaw\.json}' 2>&1) || true
+# §4: Verify Hermes config still has the expected model/provider wiring.
+oc_config=$("$OBOL" kubectl get cm hermes-config -n hermes-obol-agent \
+  -o jsonpath='{.data.config\.yaml}' 2>&1) || true
 
 step "Agent primary model is configured"
-model_val=$(echo "$oc_config" | python3 -c "
-import sys, json
-try:
-    d = json.load(sys.stdin)
-    m = d.get('agents',{}).get('defaults',{}).get('model',{}).get('primary','')
-    print(m)
-except: pass
-" 2>/dev/null) || model_val=""
+model_val=$(echo "$oc_config" | sed -n 's/^[[:space:]]*default: //p' | tr -d '"' | head -1)
 if [ -n "$model_val" ]; then
   pass "Agent primary model: $model_val"
 else
-  fail "Agent model not configured in openclaw-config"
+  fail "Agent model not configured in hermes-config"
 fi
 
-# §4: OpenClaw routes through LiteLLM (openai provider slot at litellm.llm.svc)
-# CLAUDE.md: "OpenClaw always routes through LiteLLM (openai provider slot)"
-step "OpenClaw openai provider routes to in-cluster LiteLLM"
-litellm_base=$(echo "$oc_config" | python3 -c "
-import sys, json
-try:
-    d = json.load(sys.stdin)
-    url = d.get('models',{}).get('providers',{}).get('openai',{}).get('baseUrl','')
-    print(url)
-except: pass
-" 2>/dev/null) || litellm_base=""
+# §4: Hermes routes through LiteLLM via a custom OpenAI-compatible endpoint.
+step "Hermes model provider routes to in-cluster LiteLLM"
+litellm_base=$(echo "$oc_config" | sed -n 's/^[[:space:]]*base_url: //p' | tr -d '"' | head -1)
 if echo "$litellm_base" | grep -q "litellm.llm.svc.cluster.local"; then
-  pass "OpenClaw openai provider baseUrl: $litellm_base"
+  pass "Hermes custom provider base_url: $litellm_base"
 else
-  fail "OpenClaw not routing through LiteLLM — base URL: ${litellm_base:-empty}"
+  fail "Hermes not routing through LiteLLM — base URL: ${litellm_base:-empty}"
 fi
 
 # §4 RBAC: controller design keeps read cluster-wide, but write namespace-scoped.
 step "RBAC: monetize read ClusterRole and write Role exist"
 cr_read=$("$OBOL" kubectl get clusterrole openclaw-monetize-read 2>&1) || true
-role_write=$("$OBOL" kubectl get role openclaw-monetize-write -n openclaw-obol-agent 2>&1) || true
+role_write=$("$OBOL" kubectl get role openclaw-monetize-write -n hermes-obol-agent 2>&1) || true
 if echo "$cr_read" | grep -q "openclaw-monetize-read" && \
    echo "$role_write" | grep -q "openclaw-monetize-write"; then
   pass "RBAC: read ClusterRole + write Role"
@@ -173,7 +184,7 @@ fi
 
 # §4 RBAC: write Role allows CRUD on ServiceOffers (obol.org) only in the agent namespace.
 step "RBAC: openclaw-monetize-write can CRUD ServiceOffers"
-write_rules=$("$OBOL" kubectl get role openclaw-monetize-write -n openclaw-obol-agent \
+write_rules=$("$OBOL" kubectl get role openclaw-monetize-write -n hermes-obol-agent \
   -o jsonpath='{.rules}' 2>&1) || true
 if echo "$write_rules" | python3 -c "
 import sys, json
@@ -192,32 +203,32 @@ else
   fail "RBAC write rule missing ServiceOffer CRUD — ${write_rules:0:100}"
 fi
 
-# §4: Read ClusterRoleBinding and write RoleBinding must include openclaw SA as subject.
-step "RBAC: openclaw-monetize bindings have openclaw SA as subject"
+# §4: Read ClusterRoleBinding and write RoleBinding must include hermes SA as subject.
+step "RBAC: openclaw-monetize bindings have hermes SA as subject"
 rbac_out=$("$OBOL" kubectl get clusterrolebinding openclaw-monetize-read-binding \
   -o jsonpath='{.subjects}' 2>&1) || true
-rbac_write=$("$OBOL" kubectl get rolebinding openclaw-monetize-write-binding -n openclaw-obol-agent \
+rbac_write=$("$OBOL" kubectl get rolebinding openclaw-monetize-write-binding -n hermes-obol-agent \
   -o jsonpath='{.subjects}' 2>&1) || true
-if echo "$rbac_out" | grep -q "openclaw" && echo "$rbac_write" | grep -q "openclaw"; then
-  pass "Read ClusterRoleBinding and write RoleBinding have openclaw SA"
+if echo "$rbac_out" | grep -q "hermes" && echo "$rbac_write" | grep -q "hermes"; then
+  pass "Read ClusterRoleBinding and write RoleBinding have hermes SA"
 else
-  fail "RBAC binding missing openclaw SA — read: ${rbac_out:0:50} write: ${rbac_write:0:50}"
+  fail "RBAC binding missing hermes SA — read: ${rbac_out:0:50} write: ${rbac_write:0:50}"
 fi
 
 # §2 component table: Remote Signer running (getting-started §2 lists it as a component)
 # The remote-signer provides signing services for the agent's Ethereum wallet.
 # It exposes a REST API on port 9000 for health and key management.
 step "Remote Signer health check"
-kill $(lsof -ti:9000) 2>/dev/null || true
-"$OBOL" kubectl port-forward -n "$NS" svc/remote-signer 9000:9000 &>/dev/null &
+REMOTE_SIGNER_PF_PORT="${FLOW04_REMOTE_SIGNER_PORT:-$(pick_free_port)}"
+"$OBOL" kubectl port-forward -n "$NS" "svc/remote-signer" "${REMOTE_SIGNER_PF_PORT}:9000" &>/dev/null &
 RS_PID=$!
 
 for i in $(seq 1 10); do
-  if curl -sf --max-time 2 http://localhost:9000/healthz >/dev/null 2>&1; then
+  if curl -sf --max-time 2 "http://localhost:${REMOTE_SIGNER_PF_PORT}/healthz" >/dev/null 2>&1; then
     break
   fi
   sleep 1
 done
-rs_out=$(curl -sf --max-time 5 http://localhost:9000/healthz 2>&1) || true
+rs_out=$(curl -sf --max-time 5 "http://localhost:${REMOTE_SIGNER_PF_PORT}/healthz" 2>&1) || true
 cleanup_pid "$RS_PID"
 if echo "$rs_out" | grep -q "ok\|status"; then
   pass "Remote Signer healthy: $rs_out"
diff --git a/flows/flow-07-sell-verify.sh b/flows/flow-07-sell-verify.sh
index af545161..3255a1d2 100755
--- a/flows/flow-07-sell-verify.sh
+++ b/flows/flow-07-sell-verify.sh
@@ -3,6 +3,9 @@
 # Runs AFTER flow-06 (ServiceOffer flow-qwen must be Ready).
 source "$(dirname "$0")/lib.sh"
 
+refresh_obol_ingress_env
+INGRESS_URL="${OBOL_INGRESS_URL%/}"
+
 # Controller-based reconciliation lives in the x402 namespace.
 run_step_grep "serviceoffer-controller pod running" "Running" \
   "$OBOL" kubectl get pods -n x402 -l app=serviceoffer-controller --no-headers
@@ -27,22 +30,22 @@ else
   fail "Frontend HTTPRoute missing hostname restriction — exposed to public tunnel! ($fe_hostnames)"
 fi
 
-# Security: OpenClaw dashboard restricted to local subdomain (not public)
-step "OpenClaw HTTPRoute restricted to subdomain (security: not fully public)"
-oc_hostnames=$("$OBOL" kubectl get httproute openclaw -n openclaw-obol-agent \
+# Security: default Hermes agent restricted to local subdomain (not public)
+step "Hermes HTTPRoute restricted to subdomain (security: not fully public)"
+agent_hostnames=$("$OBOL" kubectl get httproute hermes -n hermes-obol-agent \
   -o jsonpath='{.spec.hostnames}' 2>&1) || true
-if echo "$oc_hostnames" | grep -q "obol.stack"; then
-  pass "OpenClaw HTTPRoute hostname: $oc_hostnames (local subdomain)"
+if echo "$agent_hostnames" | grep -q "obol.stack"; then
+  pass "Hermes HTTPRoute hostname: $agent_hostnames (local subdomain)"
 else
-  fail "OpenClaw HTTPRoute missing hostname restriction — ${oc_hostnames:0:100}"
+  fail "Hermes HTTPRoute missing hostname restriction — ${agent_hostnames:0:100}"
 fi
 
 # §1.6 pre-check: eRPC accessible (local Traefik, obol.stack only — never via tunnel)
 # GET /rpc returns network list (from getting-started.md §2, monetize §1.6)
-step "eRPC accessible at obol.stack:8080/rpc"
-erpc_out=$($CURL_OBOL -sf --max-time 10 http://obol.stack:8080/rpc 2>&1) || true
+step "eRPC accessible at $INGRESS_URL/rpc"
+erpc_out=$($CURL_OBOL -sf --max-time 10 "$INGRESS_URL/rpc" 2>&1) || true
 if echo "$erpc_out" | python3 -c "import sys,json; d=json.load(sys.stdin); assert 'rpc' in d or 'error' in d" 2>/dev/null; then
-  pass "eRPC at obol.stack:8080/rpc returned JSON"
+  pass "eRPC at $INGRESS_URL/rpc returned JSON"
 else
   fail "eRPC not responding — ${erpc_out:0:100}"
 fi
@@ -78,7 +81,7 @@ done
 step "402 via local Traefik"
 for i in $(seq 1 6); do
   local_code=$($CURL_OBOL -s --max-time 5 -o /dev/null -w '%{http_code}' -X POST \
-    "http://obol.stack:8080/services/flow-qwen/v1/chat/completions" \
+    "$INGRESS_URL/services/flow-qwen/v1/chat/completions" \
     -H "Content-Type: application/json" \
     -d "{\"model\":\"$FLOW_MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}" 2>&1) || true
   if [ "$local_code" = "402" ]; then
@@ -92,7 +95,7 @@ done
 # Validate 402 JSON body has required x402 fields
 step "402 body has x402Version and accepts[]"
 body=$($CURL_OBOL -s --max-time 10 -X POST \
-  "http://obol.stack:8080/services/flow-qwen/v1/chat/completions" \
+  "$INGRESS_URL/services/flow-qwen/v1/chat/completions" \
   -H "Content-Type: application/json" \
   -d "{\"model\":\"$FLOW_MODEL\",\"messages\":[{\"role\":\"user\",\"content\":\"Hello\"}]}" 2>&1) || true
 if echo "$body" | python3 -c "
diff --git a/flows/flow-08-buy.sh b/flows/flow-08-buy.sh
index 2c007da2..de49e683 100755
--- a/flows/flow-08-buy.sh
+++ b/flows/flow-08-buy.sh
@@ -5,7 +5,8 @@ source "$(dirname "$0")/lib.sh"
 
 TUNNEL_OUTPUT=$("$OBOL" tunnel status 2>&1) || true
 TUNNEL_URL=$(echo "$TUNNEL_OUTPUT" | grep -oE 'https://[a-z0-9-]+\.trycloudflare\.com' | head -1 || true)
-BASE_URL="http://obol.stack:8080"
+refresh_obol_ingress_env
+BASE_URL="${OBOL_INGRESS_URL%/}"
 if [ -n "$TUNNEL_URL" ]; then
   tunnel_probe=$(curl -s -o /dev/null -w '%{http_code}' --max-time 15 -X POST \
     "$TUNNEL_URL/services/flow-qwen/v1/chat/completions" \
@@ -94,7 +95,7 @@ import httpx
 from eth_account import Account
 from eth_account.messages import encode_typed_data
 
-SERVICE_URL = os.environ.get('BASE_URL', 'http://obol.stack:8080')
+SERVICE_URL = os.environ.get('BASE_URL', os.environ.get('OBOL_INGRESS_URL', 'http://obol.stack:8080'))
 SERVICE_PATH = "/services/flow-qwen/v1/chat/completions"
 CONSUMER_KEY = os.environ["CONSUMER_PRIVATE_KEY"]  # derived from Hardhat mnemonic in lib.sh
 USDC_ADDRESS = "0x036CbD53842c5426634e7929541eC2318f3dCF7e"
diff --git a/flows/flow-11-dual-stack.sh b/flows/flow-11-dual-stack.sh
index d4381a80..06d185a7 100755
--- a/flows/flow-11-dual-stack.sh
+++ b/flows/flow-11-dual-stack.sh
@@ -3,11 +3,11 @@
 #
 # Two independent obol stacks on the same machine. Alice registers her
 # inference service on the ERC-8004 Identity Registry (Base Sepolia).
-# Bob's agent discovers her by scanning the registry, buys inference
+# Bob's Hermes agent discovers her by scanning the registry, buys inference
 # tokens via x402, and uses the paid/* sidecar route.
 #
 # This is the most human-like integration test: every interaction with
-# Bob is through natural language prompts to his OpenClaw agent.
+# Bob is through natural language prompts to his Hermes agent.
 #
 # Requires:
 #   - .env with REMOTE_SIGNER_PRIVATE_KEY (funded on Base Sepolia with ETH + USDC)
@@ -59,6 +59,12 @@ FLOW11_ARTIFACT_DIR="${FLOW11_ARTIFACT_DIR:-$OBOL_ROOT/.tmp/flow-11-$(date +%Y%m
 BASE_SEPOLIA_RPC="${FLOW11_BASE_SEPOLIA_RPC:-https://sepolia.base.org}"
 USDC_ADDRESS_BASE_SEPOLIA="0x036CbD53842c5426634e7929541eC2318f3dCF7e"
 ERC8004_IDENTITY_REGISTRY_BASE_SEPOLIA="0x8004A818BFB912233c491871b3d84c89A494BD9e"
+BOB_AGENT_NS="hermes-obol-agent"
+BOB_AGENT_DEPLOY="hermes"
+BOB_AGENT_CONTAINER="hermes"
+BOB_AGENT_SERVICE="hermes"
+BOB_AGENT_REMOTE_PORT="8642"
+BOB_OBOL_SKILLS_DIR="/data/.hermes/obol-skills"
 
 mkdir -p "$FLOW11_ARTIFACT_DIR"
 
 rewrite_k3d_ports() {
@@ -207,7 +213,7 @@ bob() {
 }
 
 purchase_request_status() {
-  bob kubectl get purchaserequests.obol.org -n openclaw-obol-agent --no-headers 2>&1 || true
+  bob kubectl get purchaserequests.obol.org -n "$BOB_AGENT_NS" --no-headers 2>&1 || true
 }
 
 buyer_sidecar_status() {
@@ -225,7 +231,7 @@ except Exception as e:
 }
 
 bob_tunnel_402_code() {
-  bob kubectl exec -n openclaw-obol-agent deploy/openclaw -c openclaw -- \
+  bob kubectl exec -n "$BOB_AGENT_NS" "deploy/$BOB_AGENT_DEPLOY" -c "$BOB_AGENT_CONTAINER" -- \
     python3 -c "
 import json
 import urllib.error
@@ -250,8 +256,8 @@ except Exception as e:
 
 bob_buy_skill_balance() {
   bob kubectl exec \
-    -n openclaw-obol-agent deploy/openclaw -c openclaw -- \
-    python3 /data/.openclaw/skills/buy-inference/scripts/buy.py balance 2>&1 || true
+    -n "$BOB_AGENT_NS" "deploy/$BOB_AGENT_DEPLOY" -c "$BOB_AGENT_CONTAINER" -- \
+    python3 "$BOB_OBOL_SKILLS_DIR/buy-inference/scripts/buy.py" balance 2>&1 || true
 }
 
 run_tail_or_fail() {
@@ -823,9 +829,9 @@ pass "Bob eRPC configured for Base Sepolia"
 
 ensure_bob_tunnel_dns "$TUNNEL_HOST" "$TUNNEL_IP"
 
-# Wait for Bob's OpenClaw agent to be ready
-poll_step_grep "Bob: OpenClaw agent ready" "Running" 24 5 \
-  bob kubectl get pods -n openclaw-obol-agent -l app.kubernetes.io/name=openclaw --no-headers
+# Wait for Bob's Hermes agent to be ready
+poll_step_grep "Bob: Hermes agent ready" "Running" 24 5 \
+  bob kubectl get pods -n "$BOB_AGENT_NS" -l app.kubernetes.io/name=hermes --no-headers
 
 step "Bob: tunnel reachable from agent pod"
 bob_tunnel_code=""
@@ -852,7 +858,7 @@ step "Bob: fund remote-signer wallet with USDC"
 BOB_SIGNER_ADDR=$(python3 -c "
 import json, sys
 try:
-    d = json.load(open('$BOB_DIR/config/applications/openclaw/obol-agent/wallet.json'))
+    d = json.load(open('$BOB_DIR/config/applications/hermes/obol-agent/wallet.json'))
     print(d.get('address',''))
 except: pass
 " 2>&1)
@@ -930,21 +936,21 @@ fi
 # BOB'S AGENT: DISCOVER ALICE VIA ERC-8004 + BUY + USE
 # ═════════════════════════════════════════════════════════════════
 
-step "Bob: get OpenClaw gateway token"
-BOB_TOKEN=$(bob openclaw token obol-agent 2>/dev/null || true)
+step "Bob: get Hermes API server token"
+BOB_TOKEN=$(bob hermes token obol-agent 2>/dev/null || true)
 if [ -z "$BOB_TOKEN" ]; then
   fail "Could not get Bob's gateway token"
   emit_metrics; exit 1
 fi
 pass "Token: ${BOB_TOKEN:0:10}..."
 
-# Port-forward to Bob's OpenClaw for chat API access.
+# Port-forward to Bob's Hermes API server for chat access.
 BOB_AGENT_PORT=$(pick_free_port)
 PF_AGENT_LOG=$(mktemp)
-bob kubectl port-forward -n openclaw-obol-agent svc/openclaw "${BOB_AGENT_PORT}:18789" >"$PF_AGENT_LOG" 2>&1 &
+bob kubectl port-forward -n "$BOB_AGENT_NS" "svc/$BOB_AGENT_SERVICE" "${BOB_AGENT_PORT}:${BOB_AGENT_REMOTE_PORT}" >"$PF_AGENT_LOG" 2>&1 &
 PF_AGENT=$!
-step "Bob: OpenClaw API port-forward ready" +step "Bob: Hermes API port-forward ready" pf_ready=0 for i in $(seq 1 20); do if python3 - "$BOB_AGENT_PORT" <<'PY' @@ -970,9 +976,9 @@ PY sleep 1 done if [ "$pf_ready" = "1" ]; then - pass "OpenClaw API available on localhost:$BOB_AGENT_PORT" + pass "Hermes API available on localhost:$BOB_AGENT_PORT" else - fail "OpenClaw port-forward failed: $(tail -n 10 "$PF_AGENT_LOG" 2>/dev/null | tr '\n' ' ')" + fail "Hermes port-forward failed: $(tail -n 10 "$PF_AGENT_LOG" 2>/dev/null | tr '\n' ' ')" cleanup_pid "$PF_AGENT" rm -f "$PF_AGENT_LOG" emit_metrics; exit 1 @@ -984,7 +990,7 @@ discover_response=$(curl -sf --max-time 300 \ -H "Authorization: Bearer $BOB_TOKEN" \ -H "Content-Type: application/json" \ -d "{ - \"model\": \"openclaw\", + \"model\": \"hermes-agent\", \"messages\": [{ \"role\": \"user\", \"content\": \"Search the ERC-8004 agent identity registry on Base Sepolia for recently registered AI inference services that support x402 payments. Use the discovery skill to scan for agents. Look for one named 'Dual-Stack Test Inference' or similar with natural_language_processing skills. Report what you find — the agent ID, name, endpoint URL, and whether it supports x402.\" @@ -1007,11 +1013,11 @@ buy_response=$(curl -sf --max-time 300 \ -H "Authorization: Bearer $BOB_TOKEN" \ -H "Content-Type: application/json" \ -d "{ - \"model\": \"openclaw\", + \"model\": \"hermes-agent\", \"messages\": [ {\"role\": \"user\", \"content\": \"Search the ERC-8004 registry on Base Sepolia for the agent named 'Dual-Stack Test Inference'. Report its endpoint.\"}, {\"role\": \"assistant\", \"content\": \"I found the agent. Its endpoint is $TUNNEL_URL/services/alice-inference\"}, - {\"role\": \"user\", \"content\": \"Now use the buy-inference skill to buy 5 inference tokens from Alice. 
Run exactly: python3 scripts/buy.py buy alice-inference --endpoint $TUNNEL_URL/services/alice-inference/v1/chat/completions --model qwen3.5:9b --count 5\"} + {\"role\": \"user\", \"content\": \"Now use the buy-inference skill to buy 5 inference tokens from Alice. Run exactly: python3 $BOB_OBOL_SKILLS_DIR/buy-inference/scripts/buy.py buy alice-inference --endpoint $TUNNEL_URL/services/alice-inference/v1/chat/completions --model qwen3.5:9b --count 5\"} ], \"max_tokens\": 4000, \"stream\": false diff --git a/flows/lib.sh b/flows/lib.sh index 73ac11c3..ba1f1111 100755 --- a/flows/lib.sh +++ b/flows/lib.sh @@ -47,22 +47,128 @@ HARDHAT_MNEMONIC="test test test test test test test test test test test junk" hh_key() { cast wallet derive-private-key "$HARDHAT_MNEMONIC" "$1"; } hh_addr() { cast wallet address --private-key "$(hh_key "$1")"; } -# Anvil deterministic accounts (derived at runtime -- no secrets in source) -export SELLER_WALLET=$(hh_addr 1) -export SELLER_KEY=$(hh_key 1) -export CONSUMER_WALLET=$(hh_addr 0) -export CONSUMER_PRIVATE_KEY=$(hh_key 0) -export FACILITATOR_PRIVATE_KEY=$(hh_key 3) +# Anvil deterministic accounts (derived at runtime -- no secrets in source). +# Flows that do not touch on-chain payment should not require Foundry/cast. 
+if command -v cast >/dev/null 2>&1; then + export SELLER_WALLET=$(hh_addr 1) + export SELLER_KEY=$(hh_key 1) + export CONSUMER_WALLET=$(hh_addr 0) + export CONSUMER_PRIVATE_KEY=$(hh_key 0) + export FACILITATOR_PRIVATE_KEY=$(hh_key 3) +else + export SELLER_WALLET="${SELLER_WALLET:-}" + export SELLER_KEY="${SELLER_KEY:-}" + export CONSUMER_WALLET="${CONSUMER_WALLET:-}" + export CONSUMER_PRIVATE_KEY="${CONSUMER_PRIVATE_KEY:-}" + export FACILITATOR_PRIVATE_KEY="${FACILITATOR_PRIVATE_KEY:-}" +fi export USDC_ADDRESS="0x036CbD53842c5426634e7929541eC2318f3dCF7e" export CHAIN="base-sepolia" export ANVIL_RPC="http://localhost:8545" # Model used for flow tests (small, fast, local Ollama) export FLOW_MODEL="${FLOW_MODEL:-qwen3.5:9b}" +OBOL_INGRESS_URL_OVERRIDE="${OBOL_INGRESS_URL:-}" # macOS mDNS can be slow resolving .stack TLD from /etc/hosts. # Use --resolve to bypass DNS and go straight to 127.0.0.1. -CURL_OBOL="curl --resolve obol.stack:80:127.0.0.1 --resolve obol.stack:8080:127.0.0.1 --resolve obol.stack:443:127.0.0.1" +obol_ingress_url() { + if [ -n "${OBOL_INGRESS_URL_OVERRIDE:-}" ]; then + echo "${OBOL_INGRESS_URL_OVERRIDE%/}" + return 0 + fi + + local live_host_port + live_host_port="$(k3d_live_ingress_port || true)" + if [ -n "$live_host_port" ]; then + if [ "$live_host_port" = "80" ]; then + echo "http://obol.stack" + else + echo "http://obol.stack:$live_host_port" + fi + return 0 + fi + + local k3d_config="$OBOL_CONFIG_DIR/k3d.yaml" + if [ -f "$k3d_config" ]; then + local host_port + host_port=$(awk ' + /- port:/ { + gsub(/"/, "", $3) + split($3, parts, ":") + if (parts[2] == "80") { + print parts[1] + exit + } + } + ' "$k3d_config") + if [ -n "$host_port" ]; then + if [ "$host_port" = "80" ]; then + echo "http://obol.stack" + else + echo "http://obol.stack:$host_port" + fi + return 0 + fi + fi + + if ! 
is_port_listening 80; then + echo "http://obol.stack" + else + echo "http://obol.stack:8080" + fi +} + +k3d_live_ingress_port() { + command -v docker >/dev/null 2>&1 || return 0 + + local stack_id_file="$OBOL_CONFIG_DIR/.stack-id" + [ -f "$stack_id_file" ] || return 0 + + local stack_id + stack_id="$(tr -d '[:space:]' < "$stack_id_file")" + [ -n "$stack_id" ] || return 0 + + local container="k3d-obol-stack-${stack_id}-serverlb" + if ! docker ps --format '{{.Names}}' | grep -qx "$container"; then + return 0 + fi + + docker port "$container" 80/tcp 2>/dev/null | awk -F: ' + /^[0-9.:]+:[0-9]+$/ { + print $NF + exit + } + ' +} + +obol_curl_command_for_url() { + local url="${1%/}" + local port="80" + + case "$url" in + http://obol.stack:*) + port="${url#http://obol.stack:}" + port="${port%%/*}" + ;; + https://obol.stack:*) + port="${url#https://obol.stack:}" + port="${port%%/*}" + ;; + https://obol.stack) + port="443" + ;; + esac + + echo "curl --resolve obol.stack:$port:127.0.0.1 --resolve obol.stack:80:127.0.0.1 --resolve obol.stack:8080:127.0.0.1 --resolve obol.stack:443:127.0.0.1" +} + +refresh_obol_ingress_env() { + export OBOL_INGRESS_URL + OBOL_INGRESS_URL="$(obol_ingress_url)" + CURL_OBOL="$(obol_curl_command_for_url "$OBOL_INGRESS_URL")" + export CURL_OBOL +} step() { STEP_COUNT=$((STEP_COUNT + 1)) @@ -208,3 +314,5 @@ with socket.socket() as sock: print(sock.getsockname()[1]) PY } + +refresh_obol_ingress_env diff --git a/internal/agent/agent.go b/internal/agent/agent.go index 8b94f2cd..16603637 100644 --- a/internal/agent/agent.go +++ b/internal/agent/agent.go @@ -3,20 +3,27 @@ package agent import ( "fmt" "os" + "os/exec" "path/filepath" + "github.com/ObolNetwork/obol-stack/internal/agentruntime" "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/hermes" "github.com/ObolNetwork/obol-stack/internal/ui" ) -// DefaultInstanceID is the canonical OpenClaw instance that runs both +// DefaultInstanceID is the canonical 
default-agent instance that runs both // user-facing inference and agent-mode monetize/heartbeat reconciliation. -const DefaultInstanceID = "obol-agent" +const DefaultInstanceID = agentruntime.DefaultInstanceID -// Init removes the legacy monetize heartbeat from the default OpenClaw instance. -// ServiceOffer reconciliation is now handled by the dedicated serviceoffer-controller -// in the x402 namespace rather than inside the OpenClaw runtime. +// Init provisions the stack-managed Hermes default agent and removes the legacy +// monetize heartbeat. ServiceOffer reconciliation is now handled by the +// dedicated serviceoffer-controller in the x402 namespace. func Init(cfg *config.Config, u *ui.UI) error { + if err := hermes.SetupDefault(cfg, u); err != nil { + return fmt.Errorf("failed to set up default Hermes agent: %w", err) + } + if err := removeHeartbeatFile(cfg, u); err != nil { return fmt.Errorf("failed to remove HEARTBEAT.md: %w", err) } @@ -26,11 +33,58 @@ func Init(cfg *config.Config, u *ui.UI) error { } func removeHeartbeatFile(cfg *config.Config, u *ui.UI) error { - namespace := fmt.Sprintf("openclaw-%s", DefaultInstanceID) - heartbeatPath := filepath.Join(cfg.DataDir, namespace, "openclaw-data", ".openclaw", "workspace", "HEARTBEAT.md") - if err := os.Remove(heartbeatPath); err != nil && !os.IsNotExist(err) { - return err + targets := []struct { + runtime agentruntime.Runtime + path string + }{ + { + runtime: agentruntime.Hermes, + path: filepath.Join(agentruntime.WorkspacePath(cfg, agentruntime.Hermes, DefaultInstanceID), "HEARTBEAT.md"), + }, + { + runtime: agentruntime.OpenClaw, + path: filepath.Join(agentruntime.WorkspacePath(cfg, agentruntime.OpenClaw, DefaultInstanceID), "HEARTBEAT.md"), + }, } - u.Successf("Legacy HEARTBEAT.md removed from %s", heartbeatPath) + + for _, target := range targets { + if err := os.Remove(target.path); err != nil { + if os.IsNotExist(err) { + continue + } + if os.IsPermission(err) { + if podErr := 
removeHeartbeatFileInPod(cfg, target.runtime); podErr != nil { + u.Warnf("Could not remove legacy HEARTBEAT.md from %s host path: %v", target.runtime, err) + continue + } + u.Successf("Legacy HEARTBEAT.md removed from %s runtime", agentruntime.Describe(target.runtime).DisplayName) + continue + } + return err + } + u.Successf("Legacy HEARTBEAT.md removed from %s", target.path) + } + return nil } + +func removeHeartbeatFileInPod(cfg *config.Config, runtime agentruntime.Runtime) error { + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) { + return err + } + + desc := agentruntime.Describe(runtime) + containerPath := filepath.Join("/data", desc.HomeDir, "workspace", "HEARTBEAT.md") + cmd := exec.Command( + filepath.Join(cfg.BinDir, "kubectl"), + "exec", + "-n", agentruntime.Namespace(runtime, DefaultInstanceID), + "-c", desc.ServiceName, + "deploy/"+desc.ServiceName, + "--", + "rm", "-f", containerPath, + ) + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + return cmd.Run() +} diff --git a/internal/agentruntime/runtime.go b/internal/agentruntime/runtime.go new file mode 100644 index 00000000..6fe117a8 --- /dev/null +++ b/internal/agentruntime/runtime.go @@ -0,0 +1,222 @@ +package agentruntime + +import ( + "errors" + "fmt" + "os" + "path/filepath" + "strings" + + "github.com/ObolNetwork/obol-stack/internal/config" +) + +type Runtime string + +const ( + OpenClaw Runtime = "openclaw" + Hermes Runtime = "hermes" + + DefaultDomain = "obol.stack" + DefaultInstanceID = "obol-agent" +) + +type Descriptor struct { + Runtime Runtime + DisplayName string + ServiceName string + ConfigMapName string + DataPVCName string + HomeDir string + DefaultPort int +} + +type DeploymentRef struct { + Runtime Runtime + ID string +} + +func Describe(runtime Runtime) Descriptor { + switch runtime { + case Hermes: + return Descriptor{ + Runtime: Hermes, + DisplayName: "Hermes", + ServiceName: "hermes", + 
ConfigMapName: "hermes-config", + DataPVCName: "hermes-data", + HomeDir: ".hermes", + DefaultPort: 8642, + } + default: + return Descriptor{ + Runtime: OpenClaw, + DisplayName: "OpenClaw", + ServiceName: "openclaw", + ConfigMapName: "openclaw-config", + DataPVCName: "openclaw-data", + HomeDir: ".openclaw", + DefaultPort: 18789, + } + } +} + +func DeploymentPath(cfg *config.Config, runtime Runtime, id string) string { + return filepath.Join(cfg.ConfigDir, "applications", string(runtime), id) +} + +func Namespace(runtime Runtime, id string) string { + return fmt.Sprintf("%s-%s", runtime, id) +} + +func Hostname(runtime Runtime, id string) string { + return fmt.Sprintf("%s-%s.%s", runtime, id, DefaultDomain) +} + +func DashboardHostname(runtime Runtime, id string) string { + if runtime == Hermes { + if id == DefaultInstanceID { + return fmt.Sprintf("%s.%s", DefaultInstanceID, DefaultDomain) + } + + return fmt.Sprintf("%s-ui.%s", Namespace(Hermes, id), DefaultDomain) + } + + return Hostname(runtime, id) +} + +func Hostnames(runtime Runtime, id string) []string { + if strings.TrimSpace(id) == "" { + return nil + } + + hostnames := []string{Hostname(runtime, id)} + if runtime == Hermes { + hostnames = append(hostnames, DashboardHostname(runtime, id)) + } + + return hostnames +} + +func CollectHostnames(cfg *config.Config, include ...DeploymentRef) []string { + var hostnames []string + seen := make(map[string]bool) + + add := func(runtime Runtime, id string) { + for _, hostname := range Hostnames(runtime, id) { + if hostname == "" || seen[hostname] { + continue + } + + hostnames = append(hostnames, hostname) + seen[hostname] = true + } + } + + for _, ref := range include { + add(ref.Runtime, ref.ID) + } + + for _, runtime := range []Runtime{Hermes, OpenClaw} { + ids, err := ListInstanceIDs(cfg, runtime) + if err != nil { + continue + } + + for _, id := range ids { + add(runtime, id) + } + } + + return hostnames +} + +func DataRoot(cfg *config.Config, runtime Runtime, id 
string) string { + desc := Describe(runtime) + return filepath.Join(cfg.DataDir, Namespace(runtime, id), desc.DataPVCName) +} + +func HomePath(cfg *config.Config, runtime Runtime, id string) string { + desc := Describe(runtime) + return filepath.Join(DataRoot(cfg, runtime, id), desc.HomeDir) +} + +func WorkspacePath(cfg *config.Config, runtime Runtime, id string) string { + return filepath.Join(HomePath(cfg, runtime, id), "workspace") +} + +func SkillsPath(cfg *config.Config, runtime Runtime, id string) string { + return filepath.Join(HomePath(cfg, runtime, id), "skills") +} + +func KeystoreVolumePath(cfg *config.Config, runtime Runtime, id string) string { + return filepath.Join(cfg.DataDir, Namespace(runtime, id), "remote-signer-keystores") +} + +func ListInstanceIDs(cfg *config.Config, runtime Runtime) ([]string, error) { + appsDir := filepath.Join(cfg.ConfigDir, "applications", string(runtime)) + + entries, err := os.ReadDir(appsDir) + if err != nil { + if os.IsNotExist(err) { + return nil, nil + } + + return nil, fmt.Errorf("failed to read %s instances: %w", Describe(runtime).DisplayName, err) + } + + var ids []string + + for _, entry := range entries { + if entry.IsDir() { + ids = append(ids, entry.Name()) + } + } + + return ids, nil +} + +func ResolveInstance(cfg *config.Config, runtime Runtime, args []string) (id string, remaining []string, err error) { + instances, err := ListInstanceIDs(cfg, runtime) + if err != nil { + return "", nil, err + } + + desc := Describe(runtime) + + switch len(instances) { + case 0: + return "", nil, fmt.Errorf("no %s instances found — run 'obol %s onboard' to create one", desc.DisplayName, runtime) + case 1: + return instances[0], args, nil + default: + if len(args) > 0 { + for _, inst := range instances { + if args[0] == inst { + return inst, args[1:], nil + } + } + } + + return "", nil, fmt.Errorf("multiple %s instances found, specify one: %s", desc.DisplayName, strings.Join(instances, ", ")) + } +} + +func 
MustDefaultDeploymentPath(cfg *config.Config) string { + return DeploymentPath(cfg, Hermes, DefaultInstanceID) +} + +func ResolveSingleDefaultNamespace(cfg *config.Config, runtime Runtime) (string, error) { + ids, err := ListInstanceIDs(cfg, runtime) + if err != nil { + return "", err + } + + switch len(ids) { + case 0: + return "", errors.New("no instances found") + case 1: + return Namespace(runtime, ids[0]), nil + default: + return "", fmt.Errorf("multiple %s instances found (%s), specify an instance", Describe(runtime).DisplayName, strings.Join(ids, ", ")) + } +} diff --git a/internal/agentruntime/runtime_test.go b/internal/agentruntime/runtime_test.go new file mode 100644 index 00000000..2bb53778 --- /dev/null +++ b/internal/agentruntime/runtime_test.go @@ -0,0 +1,144 @@ +package agentruntime + +import ( + "os" + "path/filepath" + "testing" + + "github.com/ObolNetwork/obol-stack/internal/config" +) + +func TestHermesPaths(t *testing.T) { + cfg := &config.Config{ + ConfigDir: "/tmp/obol-config", + DataDir: "/tmp/obol-data", + } + + if got, want := DeploymentPath(cfg, Hermes, DefaultInstanceID), filepath.Join("/tmp/obol-config", "applications", "hermes", DefaultInstanceID); got != want { + t.Fatalf("DeploymentPath() = %q, want %q", got, want) + } + + if got, want := Namespace(Hermes, DefaultInstanceID), "hermes-obol-agent"; got != want { + t.Fatalf("Namespace() = %q, want %q", got, want) + } + + if got, want := Hostname(Hermes, DefaultInstanceID), "hermes-obol-agent.obol.stack"; got != want { + t.Fatalf("Hostname() = %q, want %q", got, want) + } + + if got, want := HomePath(cfg, Hermes, DefaultInstanceID), filepath.Join("/tmp/obol-data", "hermes-obol-agent", "hermes-data", ".hermes"); got != want { + t.Fatalf("HomePath() = %q, want %q", got, want) + } +} + +func TestDashboardHostname(t *testing.T) { + tests := []struct { + name string + runtime Runtime + id string + want string + }{ + { + name: "default hermes", + runtime: Hermes, + id: DefaultInstanceID, + 
want: "obol-agent.obol.stack", + }, + { + name: "named hermes", + runtime: Hermes, + id: "alice", + want: "hermes-alice-ui.obol.stack", + }, + { + name: "openclaw", + runtime: OpenClaw, + id: "default", + want: "openclaw-default.obol.stack", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + if got := DashboardHostname(tt.runtime, tt.id); got != tt.want { + t.Fatalf("DashboardHostname(%q, %q) = %q, want %q", tt.runtime, tt.id, got, tt.want) + } + }) + } +} + +func TestCollectHostnamesIncludesHermesDashboardsAndOpenClawInstances(t *testing.T) { + cfg := &config.Config{ + ConfigDir: t.TempDir(), + DataDir: t.TempDir(), + } + + for _, path := range []string{ + DeploymentPath(cfg, Hermes, "alice"), + DeploymentPath(cfg, Hermes, DefaultInstanceID), + DeploymentPath(cfg, OpenClaw, "default"), + } { + if err := os.MkdirAll(path, 0o755); err != nil { + t.Fatalf("MkdirAll(%q): %v", path, err) + } + } + + got := CollectHostnames(cfg, DeploymentRef{ + Runtime: Hermes, + ID: "bob", + }) + + want := []string{ + "hermes-bob.obol.stack", + "hermes-bob-ui.obol.stack", + "hermes-alice.obol.stack", + "hermes-alice-ui.obol.stack", + "hermes-obol-agent.obol.stack", + "obol-agent.obol.stack", + "openclaw-default.obol.stack", + } + + assertSameStringSet(t, got, want) +} + +func TestCollectHostnamesDeduplicatesIncludedDeployment(t *testing.T) { + cfg := &config.Config{ + ConfigDir: t.TempDir(), + DataDir: t.TempDir(), + } + + if err := os.MkdirAll(DeploymentPath(cfg, Hermes, "alice"), 0o755); err != nil { + t.Fatalf("MkdirAll(): %v", err) + } + + got := CollectHostnames(cfg, DeploymentRef{ + Runtime: Hermes, + ID: "alice", + }) + + want := []string{ + "hermes-alice.obol.stack", + "hermes-alice-ui.obol.stack", + } + + assertSameStringSet(t, got, want) +} + +func assertSameStringSet(t *testing.T, got, want []string) { + t.Helper() + + if len(got) != len(want) { + t.Fatalf("got %d hostnames %v, want %d %v", len(got), got, len(want), want) + } + + seen := 
make(map[string]int, len(got)) + for _, value := range got { + seen[value]++ + } + + for _, value := range want { + if seen[value] != 1 { + t.Fatalf("hostname %q count = %d in %v, want 1", value, seen[value], got) + } + } +} diff --git a/internal/defaults/defaults.go b/internal/defaults/defaults.go new file mode 100644 index 00000000..da75a745 --- /dev/null +++ b/internal/defaults/defaults.go @@ -0,0 +1,216 @@ +package defaults + +import ( + "errors" + "fmt" + "net" + "os" + "path/filepath" + "runtime" + "strings" + + "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/embed" +) + +const ( + backendK3d = "k3d" + backendK3s = "k3s" + + stackIDFile = ".stack-id" + stackBackendFile = ".stack-backend" + stampFile = ".obol-defaults-stamp" +) + +// RefreshInfrastructureIfChanged refreshes the generated defaults tree when +// the embedded infrastructure assets, backend, or stack ID changed. +func RefreshInfrastructureIfChanged(cfg *config.Config, backendName, stackID string) (bool, error) { + defaultsDir := filepath.Join(cfg.ConfigDir, "defaults") + stamp, err := infrastructureStamp(backendName, stackID) + if err != nil { + return false, err + } + + currentStamp, _ := os.ReadFile(filepath.Join(defaultsDir, stampFile)) + _, helmfileErr := os.Stat(filepath.Join(defaultsDir, "helmfile.yaml")) + if string(currentStamp) == stamp && helmfileErr == nil { + return false, nil + } + + if err := CopyInfrastructure(cfg, backendName, stackID); err != nil { + return false, err + } + + return true, nil +} + +// CopyInfrastructure renders the embedded infrastructure defaults for the +// current stack and records the stamp that produced the copied tree. 
+func CopyInfrastructure(cfg *config.Config, backendName, stackID string) error { + defaultsDir := filepath.Join(cfg.ConfigDir, "defaults") + replacements, err := InfrastructureReplacements(backendName, stackID) + if err != nil { + return err + } + + if err := embed.CopyDefaults(defaultsDir, replacements); err != nil { + return err + } + + stamp, err := infrastructureStamp(backendName, stackID) + if err != nil { + return err + } + + return os.WriteFile(filepath.Join(defaultsDir, stampFile), []byte(stamp), 0o600) +} + +// InfrastructureReplacements returns the placeholder values used when copying +// embedded infrastructure defaults. +func InfrastructureReplacements(backendName, stackID string) (map[string]string, error) { + ollamaHost := OllamaHostForBackend(backendName) + + ollamaHostIP, err := OllamaHostIPForBackend(backendName) + if err != nil { + return nil, err + } + + return map[string]string{ + "{{OLLAMA_HOST}}": ollamaHost, + "{{OLLAMA_HOST_IP}}": ollamaHostIP, + "{{CLUSTER_ID}}": stackID, + }, nil +} + +// DetectedBackendName reads the persisted backend choice, defaulting to k3d for +// legacy stacks that predate .stack-backend. +func DetectedBackendName(cfg *config.Config) string { + data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, stackBackendFile)) + if err != nil { + return backendK3d + } + + backendName := strings.TrimSpace(string(data)) + if backendName == "" { + return backendK3d + } + + return backendName +} + +// StackID reads the persisted stack ID. 
+func StackID(cfg *config.Config) string { + data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, stackIDFile)) + if err != nil { + return "" + } + + return strings.TrimSpace(string(data)) +} + +func infrastructureStamp(backendName, stackID string) (string, error) { + digest, err := embed.InfrastructureDigest() + if err != nil { + return "", err + } + + return fmt.Sprintf("digest=%s\nbackend=%s\nstackID=%s\n", digest, backendName, stackID), nil +} + +// OllamaHostForBackend returns the hostname/IP that reaches the host Ollama +// instance from inside the cluster. +func OllamaHostForBackend(backendName string) string { + if backendName == backendK3s { + return "127.0.0.1" + } + + if runtime.GOOS == "darwin" { + return "host.docker.internal" + } + + return "host.k3d.internal" +} + +// OllamaHostIPForBackend resolves the Ollama host to an IP address. +// ClusterIP+Endpoints requires an IP, not a hostname. +func OllamaHostIPForBackend(backendName string) (string, error) { + host := OllamaHostForBackend(backendName) + + if net.ParseIP(host) != nil { + return host, nil + } + + addrs, err := net.LookupHost(host) + if err == nil && len(addrs) > 0 { + return addrs[0], nil + } + + if runtime.GOOS == "darwin" && backendName == backendK3d { + return DockerDesktopGatewayIP(), nil + } + + if runtime.GOOS == "linux" && backendName == backendK3d { + ip, bridgeErr := DockerBridgeGatewayIP() + if bridgeErr == nil { + return ip, nil + } + + return "", fmt.Errorf("cannot resolve Ollama host %q to IP: %w; docker0 fallback also failed: %w", host, err, bridgeErr) + } + + return "", fmt.Errorf("cannot resolve Ollama host %q to IP: %w\n\tEnsure Docker Desktop is running", host, err) +} + +// DockerDesktopGatewayIP returns the Docker Desktop VM gateway IP. +func DockerDesktopGatewayIP() string { + return "192.168.65.254" +} + +// DockerBridgeGatewayIP returns the IPv4 address of an active Docker bridge +// interface. 
+func DockerBridgeGatewayIP() (string, error) { + if ip, err := BridgeInterfaceIP("docker0"); err == nil { + return ip, nil + } + + ifaces, err := net.Interfaces() + if err != nil { + return "", fmt.Errorf("cannot list network interfaces: %w", err) + } + + for _, iface := range ifaces { + if !strings.HasPrefix(iface.Name, "br-") { + continue + } + if ip, err := BridgeInterfaceIP(iface.Name); err == nil { + return ip, nil + } + } + + return "", errors.New("no active Docker bridge interface found (docker0 or br-*)") +} + +// BridgeInterfaceIP returns the IPv4 address of a named network interface. +func BridgeInterfaceIP(name string) (string, error) { + iface, err := net.InterfaceByName(name) + if err != nil { + return "", fmt.Errorf("interface %s not found: %w", name, err) + } + + if iface.Flags&net.FlagUp == 0 { + return "", fmt.Errorf("interface %s is down", name) + } + + addrs, err := iface.Addrs() + if err != nil { + return "", fmt.Errorf("cannot get addresses for %s: %w", name, err) + } + + for _, addr := range addrs { + if ipNet, ok := addr.(*net.IPNet); ok && ipNet.IP.To4() != nil { + return ipNet.IP.String(), nil + } + } + + return "", fmt.Errorf("no IPv4 address found on interface %s", name) +} diff --git a/internal/defaults/defaults_test.go b/internal/defaults/defaults_test.go new file mode 100644 index 00000000..2717eebe --- /dev/null +++ b/internal/defaults/defaults_test.go @@ -0,0 +1,97 @@ +package defaults + +import ( + "os" + "path/filepath" + "strings" + "testing" + + "github.com/ObolNetwork/obol-stack/internal/config" +) + +func TestCopyInfrastructureRendersStackPlaceholders(t *testing.T) { + cfg := &config.Config{ConfigDir: t.TempDir()} + + if err := CopyInfrastructure(cfg, backendK3s, "test-stack"); err != nil { + t.Fatalf("CopyInfrastructure: %v", err) + } + + data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, "defaults", "base", "templates", "llm.yaml")) + if err != nil { + t.Fatalf("read llm defaults: %v", err) + } + + out := string(data) + 
for _, want := range []string{ + `ip: "127.0.0.1"`, + `LITELLM_MASTER_KEY: "sk-obol-test-stack"`, + } { + if !strings.Contains(out, want) { + t.Fatalf("rendered defaults missing %q:\n%s", want, out) + } + } + for _, unexpected := range []string{"{{OLLAMA_HOST_IP}}", "{{CLUSTER_ID}}"} { + if strings.Contains(out, unexpected) { + t.Fatalf("rendered defaults still contain %q:\n%s", unexpected, out) + } + } +} + +func TestRefreshInfrastructureIfChangedUsesStamp(t *testing.T) { + cfg := &config.Config{ConfigDir: t.TempDir()} + + refreshed, err := RefreshInfrastructureIfChanged(cfg, backendK3s, "test-stack") + if err != nil { + t.Fatalf("first RefreshInfrastructureIfChanged: %v", err) + } + if !refreshed { + t.Fatal("first refresh should copy defaults") + } + + marker := filepath.Join(cfg.ConfigDir, "defaults", "marker.txt") + if err := os.WriteFile(marker, []byte("keep"), 0o600); err != nil { + t.Fatalf("write marker: %v", err) + } + + refreshed, err = RefreshInfrastructureIfChanged(cfg, backendK3s, "test-stack") + if err != nil { + t.Fatalf("second RefreshInfrastructureIfChanged: %v", err) + } + if refreshed { + t.Fatal("second refresh should be skipped for the same stamp") + } + if _, err := os.Stat(marker); err != nil { + t.Fatalf("unchanged refresh should not rewrite defaults tree: %v", err) + } + + refreshed, err = RefreshInfrastructureIfChanged(cfg, backendK3s, "other-stack") + if err != nil { + t.Fatalf("changed RefreshInfrastructureIfChanged: %v", err) + } + if !refreshed { + t.Fatal("stack ID change should refresh defaults") + } + + data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, "defaults", "base", "templates", "llm.yaml")) + if err != nil { + t.Fatalf("read llm defaults: %v", err) + } + if !strings.Contains(string(data), `LITELLM_MASTER_KEY: "sk-obol-other-stack"`) { + t.Fatalf("refreshed defaults did not use new stack ID:\n%s", string(data)) + } +} + +func TestDetectedBackendNameDefaultsToK3d(t *testing.T) { + cfg := &config.Config{ConfigDir: 
t.TempDir()} + + if got := DetectedBackendName(cfg); got != backendK3d { + t.Fatalf("DetectedBackendName() = %q, want %q", got, backendK3d) + } + + if err := os.WriteFile(filepath.Join(cfg.ConfigDir, stackBackendFile), []byte("k3s\n"), 0o600); err != nil { + t.Fatalf("write backend: %v", err) + } + if got := DetectedBackendName(cfg); got != backendK3s { + t.Fatalf("DetectedBackendName() = %q, want %q", got, backendK3s) + } +} diff --git a/internal/embed/embed.go b/internal/embed/embed.go index e2ce9546..44843768 100644 --- a/internal/embed/embed.go +++ b/internal/embed/embed.go @@ -1,7 +1,9 @@ package embed import ( + "crypto/sha256" "embed" + "encoding/hex" "fmt" "io/fs" "os" @@ -27,6 +29,38 @@ var networksFS embed.FS //go:embed all:skills var skillsFS embed.FS +// InfrastructureDigest returns a stable digest of the embedded infrastructure +// assets. Callers use this to decide whether an existing copied defaults tree +// needs to be refreshed from the current binary. +func InfrastructureDigest() (string, error) { + hash := sha256.New() + + if err := fs.WalkDir(infrastructureFS, "infrastructure", func(path string, d fs.DirEntry, err error) error { + if err != nil { + return err + } + if d.IsDir() { + return nil + } + + data, err := infrastructureFS.ReadFile(path) + if err != nil { + return fmt.Errorf("failed to read embedded file %s: %w", path, err) + } + + hash.Write([]byte(path)) + hash.Write([]byte{0}) + hash.Write(data) + hash.Write([]byte{0}) + + return nil + }); err != nil { + return "", err + } + + return hex.EncodeToString(hash.Sum(nil)), nil +} + // CopyDefaults recursively copies all embedded infrastructure manifests to the destination directory. // The replacements map is applied to every file: each key (e.g. "{{OLLAMA_HOST}}") is replaced // with its value. Pass nil for a verbatim copy. 
diff --git a/internal/embed/embed_crd_test.go b/internal/embed/embed_crd_test.go index 7e294f9f..49059362 100644 --- a/internal/embed/embed_crd_test.go +++ b/internal/embed/embed_crd_test.go @@ -361,9 +361,9 @@ func TestMonetizeRBAC_Parses(t *testing.T) { docs := multiDoc(data) - ns := findDocByName(docs, "Namespace", "openclaw-obol-agent") + ns := findDocByName(docs, "Namespace", "hermes-obol-agent") if ns == nil { - t.Fatal("no Namespace 'openclaw-obol-agent' found") + t.Fatal("no Namespace 'hermes-obol-agent' found") } // ── Read ClusterRole ──────────────────────────────────────────────── @@ -409,8 +409,8 @@ func TestMonetizeRBAC_Parses(t *testing.T) { if writeRole == nil { t.Fatal("no Role 'openclaw-monetize-write' found") } - if ns := nested(writeRole, "metadata", "namespace"); ns != "openclaw-obol-agent" { - t.Errorf("write Role namespace = %v, want openclaw-obol-agent", ns) + if ns := nested(writeRole, "metadata", "namespace"); ns != "hermes-obol-agent" { + t.Errorf("write Role namespace = %v, want hermes-obol-agent", ns) } writeRules, ok := writeRole["rules"].([]interface{}) if !ok || len(writeRules) == 0 { diff --git a/internal/embed/infrastructure/base/templates/llm.yaml b/internal/embed/infrastructure/base/templates/llm.yaml index 745048b5..5ee34beb 100644 --- a/internal/embed/infrastructure/base/templates/llm.yaml +++ b/internal/embed/infrastructure/base/templates/llm.yaml @@ -13,7 +13,8 @@ # k3s → 127.0.0.1 (k3s runs directly on the host) # - Using ClusterIP+Endpoints instead of ExternalName for Traefik Gateway API # compatibility (ExternalName services are rejected as HTTPRoute backends). -# - LiteLLM starts with an empty model_list; models are added via `obol model setup`. +# - LiteLLM starts with a base paid/* route; provider models are added via +# `obol model setup`, and purchased remote models are added by the controller. 
apiVersion: v1 kind: Namespace metadata: @@ -54,8 +55,8 @@ subsets: protocol: TCP --- -# LiteLLM configuration: empty model_list by default. -# Models are added via `obol model setup` which patches this ConfigMap. +# LiteLLM configuration: base paid/* route plus provider and purchased models. +# Models are persisted here so LiteLLM reloads survive pod restarts. apiVersion: v1 kind: ConfigMap metadata: diff --git a/internal/embed/infrastructure/base/templates/obol-agent-admission-policy.yaml b/internal/embed/infrastructure/base/templates/obol-agent-admission-policy.yaml index 0114082d..52ef22dd 100644 --- a/internal/embed/infrastructure/base/templates/obol-agent-admission-policy.yaml +++ b/internal/embed/infrastructure/base/templates/obol-agent-admission-policy.yaml @@ -1,6 +1,6 @@ --- -# Admission Policy for OpenClaw-created resources -# Ensures that when OpenClaw service accounts create Traefik middlewares or +# Admission Policy for agent-created resources +# Ensures that when OpenClaw or Hermes service accounts create Traefik middlewares or # Gateway API HTTPRoutes, they conform to expected patterns: # - HTTPRoutes must reference the shared traefik-gateway # - ForwardAuth middlewares must target x402-verifier.x402.svc @@ -8,7 +8,7 @@ # Uses Kubernetes ValidatingAdmissionPolicy (v1, GA in 1.30+). 
#------------------------------------------------------------------------------ -# ValidatingAdmissionPolicy - Guards resources created by OpenClaw agents +# ValidatingAdmissionPolicy - Guards resources created by agent runtimes #------------------------------------------------------------------------------ apiVersion: admissionregistration.k8s.io/v1 kind: ValidatingAdmissionPolicy @@ -26,11 +26,11 @@ spec: resources: ["httproutes"] operations: ["CREATE", "UPDATE"] matchConditions: - - name: only-openclaw-sa - expression: 'request.userInfo.username.startsWith("system:serviceaccount:openclaw-")' + - name: only-agent-runtime-sa + expression: 'request.userInfo.username.startsWith("system:serviceaccount:openclaw-") || request.userInfo.username.startsWith("system:serviceaccount:hermes-")' validations: - expression: '!has(object.spec.parentRefs) || object.spec.parentRefs.all(p, p.name == "traefik-gateway")' - message: "HTTPRoutes created by OpenClaw must reference traefik-gateway" + message: "HTTPRoutes created by agent runtimes must reference traefik-gateway" - expression: '!has(object.spec.forwardAuth) || object.spec.forwardAuth.address.startsWith("http://x402-verifier.x402.svc")' message: "ForwardAuth middlewares must target x402-verifier.x402.svc" diff --git a/internal/embed/infrastructure/base/templates/obol-agent-monetize-rbac.yaml b/internal/embed/infrastructure/base/templates/obol-agent-monetize-rbac.yaml index 88b86048..63cc9f96 100644 --- a/internal/embed/infrastructure/base/templates/obol-agent-monetize-rbac.yaml +++ b/internal/embed/infrastructure/base/templates/obol-agent-monetize-rbac.yaml @@ -1,5 +1,5 @@ --- -# Monetize RBAC for OpenClaw Agents +# Monetize RBAC for the default agent runtime # # The agent remains a compatibility CLI surface for creating and inspecting # ServiceOffer objects. 
The serviceoffer-controller owns all child resources, @@ -14,7 +14,7 @@ apiVersion: v1 kind: Namespace metadata: - name: openclaw-obol-agent + name: hermes-obol-agent --- #------------------------------------------------------------------------------ @@ -46,7 +46,7 @@ apiVersion: rbac.authorization.k8s.io/v1 kind: Role metadata: name: openclaw-monetize-write - namespace: openclaw-obol-agent + namespace: hermes-obol-agent rules: - apiGroups: ["obol.org"] resources: ["serviceoffers"] @@ -72,8 +72,8 @@ roleRef: name: openclaw-monetize-read subjects: - kind: ServiceAccount - name: openclaw - namespace: openclaw-obol-agent + name: hermes + namespace: hermes-obol-agent --- #------------------------------------------------------------------------------ @@ -83,12 +83,12 @@ apiVersion: rbac.authorization.k8s.io/v1 kind: RoleBinding metadata: name: openclaw-monetize-write-binding - namespace: openclaw-obol-agent + namespace: hermes-obol-agent roleRef: apiGroup: rbac.authorization.k8s.io kind: Role name: openclaw-monetize-write subjects: - kind: ServiceAccount - name: openclaw - namespace: openclaw-obol-agent + name: hermes + namespace: hermes-obol-agent diff --git a/internal/embed/infrastructure/base/templates/obol-agent.yaml b/internal/embed/infrastructure/base/templates/obol-agent.yaml index 7c2cdf5b..d788f1a5 100644 --- a/internal/embed/infrastructure/base/templates/obol-agent.yaml +++ b/internal/embed/infrastructure/base/templates/obol-agent.yaml @@ -1,6 +1,6 @@ --- # Agent namespace — retained for backward compatibility. -# The obol-agent OpenClaw instance runs in openclaw-obol-agent namespace; +# The default obol-agent Hermes instance runs in hermes-obol-agent namespace; # RBAC is managed by the openclaw-monetize-read/workload ClusterRoles. 
apiVersion: v1 kind: Namespace diff --git a/internal/embed/infrastructure/values/obol-frontend.yaml.gotmpl b/internal/embed/infrastructure/values/obol-frontend.yaml.gotmpl index 4477acae..ccb70e0e 100644 --- a/internal/embed/infrastructure/values/obol-frontend.yaml.gotmpl +++ b/internal/embed/infrastructure/values/obol-frontend.yaml.gotmpl @@ -35,7 +35,7 @@ image: repository: obolnetwork/obol-stack-front-end pullPolicy: IfNotPresent - tag: "v0.1.16" + tag: "v0.1.17-rc.5" service: type: ClusterIP diff --git a/internal/embed/skills/buy-inference/SKILL.md b/internal/embed/skills/buy-inference/SKILL.md index 78e197a5..37fd785a 100644 --- a/internal/embed/skills/buy-inference/SKILL.md +++ b/internal/embed/skills/buy-inference/SKILL.md @@ -28,18 +28,18 @@ Purchase access to remote x402-gated inference endpoints using a risk-isolated s ```bash # Probe an endpoint to see its pricing -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py probe https://seller.example.com/services/my-model/v1/chat/completions +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py probe https://seller.example.com/services/my-model/v1/chat/completions # Probe with the concrete remote model when the seller validates model IDs -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py probe https://seller.example.com/services/my-model/v1/chat/completions --model qwen3.5:35b +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py probe https://seller.example.com/services/my-model/v1/chat/completions --model qwen3.5:35b # Buy access (probes, pre-signs auths, creates/updates a PurchaseRequest) -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py buy remote-qwen \ +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py buy remote-qwen \ --endpoint https://seller.example.com/services/my-model \ --model qwen3.5:35b # Buy with agent-managed auto-refill intent -python3 
/data/.openclaw/skills/buy-inference/scripts/buy.py buy remote-qwen \ +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py buy remote-qwen \ --endpoint https://seller.example.com/services/my-model \ --model qwen3.5:35b \ --count 100 \ @@ -48,25 +48,25 @@ python3 /data/.openclaw/skills/buy-inference/scripts/buy.py buy remote-qwen \ --refill-count 50 # Manual top-up on the same purchase name -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py buy remote-qwen \ +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py buy remote-qwen \ --endpoint https://seller.example.com/services/my-model \ --model qwen3.5:35b \ --count 25 # List purchased providers + remaining auths -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py list +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py list # Check sidecar health + remaining auths -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py status remote-qwen +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py status remote-qwen # Reconcile auto-refill policies (heartbeat / cron entrypoint) -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py process --all +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py process --all # Check your USDC balance -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py balance +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py balance # Compatibility alias for the same reconcile loop -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py maintain +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py maintain ``` ## Commands @@ -116,7 +116,7 @@ Use the absolute script path inside the pod. Do not rely on `cd ... && ...` shell wrapping. 
```bash -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py process --all +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py process --all ``` Tell the agent to schedule that as its maintenance loop only when at least one @@ -131,7 +131,7 @@ CLI example: ```bash hermes cron create "every 5m" \ - "Reconcile existing x402 PurchaseRequests. Use the buy-inference skill and run python3 /data/.openclaw/skills/buy-inference/scripts/buy.py process --all. Report only errors or state changes." \ + "Reconcile existing x402 PurchaseRequests. Use the buy-inference skill and run python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py process --all. Report only errors or state changes." \ --name "x402 buy reconcile" \ --skill buy-inference ``` @@ -142,7 +142,7 @@ Python API example: from cron.jobs import create_job create_job( - prompt="Reconcile existing x402 PurchaseRequests. Use the buy-inference skill and run python3 /data/.openclaw/skills/buy-inference/scripts/buy.py process --all. Report only errors or state changes.", + prompt="Reconcile existing x402 PurchaseRequests. Use the buy-inference skill and run python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py process --all. Report only errors or state changes.", schedule="every 5m", name="x402 buy reconcile", skills=["buy-inference"], @@ -220,7 +220,7 @@ flowchart LR 5. **Runtime mount**: A lean Go sidecar (`x402-buyer`) already runs inside the existing `litellm` pod in the `llm` namespace. It mounts both ConfigMaps and serves as an OpenAI-compatible reverse proxy on `127.0.0.1:8402`. -6. **Wire**: LiteLLM keeps one static wildcard route: `paid/* -> openai/* -> 127.0.0.1:8402`. The controller also adds explicit paid-model entries when required so models with colons resolve reliably. The public model name is always `paid/`. +6. **Wire**: LiteLLM keeps one static wildcard route: `paid/* -> openai/* -> 127.0.0.1:8402/v1`. 
The controller also adds explicit paid-model entries when required so models with colons resolve reliably. The public model name is always prefixed with `paid/`. 7. **Runtime**: On each request through the sidecar: - Sidecar forwards to upstream seller @@ -297,10 +297,10 @@ This is the complete journey from discovering a seller to using purchased infere ```bash # Search the ERC-8004 registry for recently registered agents -python3 /data/.openclaw/skills/discovery/scripts/discovery.py search --chain base-sepolia +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/discovery/scripts/discovery.py search --chain base-sepolia # Fetch a candidate's registration JSON to check x402Support and services -python3 /data/.openclaw/skills/discovery/scripts/discovery.py uri --chain base-sepolia +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/discovery/scripts/discovery.py uri --chain base-sepolia ``` Look for agents with `"x402Support": true` and a `"web"` service endpoint. @@ -309,7 +309,7 @@ Look for agents with `"x402Support": true` and a `"web"` service endpoint. ```bash # Send an unauthenticated request to get 402 pricing -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py probe --model +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py probe --model ``` This returns the seller's pricing: `payTo`, `network`, `price`, and `asset` (USDC contract).
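As a mental model for the `paid/* -> openai/* -> 127.0.0.1:8402/v1` wiring, the prefix rewrite can be sketched in a few lines of Python. This is an illustration only: the real routing happens inside LiteLLM and the Go sidecar, and `resolve_route` is a hypothetical helper that does not exist in the codebase.

```python
# Hypothetical sketch of the paid/* -> openai/* -> sidecar rewrite.
# Names are illustrative; they do not appear in the actual code.
def resolve_route(public_model: str, api_base: str = "http://127.0.0.1:8402/v1") -> dict:
    prefix = "paid/"
    if not public_model.startswith(prefix):
        raise ValueError(f"not a paid route: {public_model}")
    # Strip the public prefix; models with colons (e.g. qwen3.5:35b) pass through unchanged.
    return {"model": "openai/" + public_model[len(prefix):], "api_base": api_base}

print(resolve_route("paid/qwen3.5:35b"))
# -> {'model': 'openai/qwen3.5:35b', 'api_base': 'http://127.0.0.1:8402/v1'}
```

The explicit paid-model entries mentioned above exist because wildcard matching against model IDs containing colons can be ambiguous; pinning a concrete entry sidesteps that.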
@@ -318,10 +318,10 @@ This returns the seller's pricing: `payTo`, `network`, `price`, and `asset` (USD ```bash # Check USDC balance -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py balance --chain base-sepolia +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py balance --chain base-sepolia # Buy access (pre-sign auths, create PurchaseRequest, wait for controller reconciliation) -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py buy \ +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py buy \ --endpoint \ --model \ --count 20 @@ -344,13 +344,13 @@ The `paid/` prefix routes through the x402-buyer sidecar, which transparently at ```bash # Check remaining auths -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py list +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py list # Check one purchased upstream in detail -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py status +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py status # Reconcile auto-refill intent (what the heartbeat should run) -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py process --all +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py process --all ``` Manual `refill` and `remove` commands are still not available in the current diff --git a/internal/embed/skills/buy-inference/references/purchase-request-spec.md b/internal/embed/skills/buy-inference/references/purchase-request-spec.md index fe82d5ab..be368367 100644 --- a/internal/embed/skills/buy-inference/references/purchase-request-spec.md +++ b/internal/embed/skills/buy-inference/references/purchase-request-spec.md @@ -23,7 +23,7 @@ apiVersion: obol.org/v1alpha1 kind: PurchaseRequest metadata: name: remote-qwen - namespace: openclaw-obol-agent + namespace: hermes-obol-agent spec: endpoint: https://seller.example.com/services/qwen/v1/chat/completions 
model: qwen3.5:32b diff --git a/internal/embed/skills/buy-inference/references/x402-buyer-api.md b/internal/embed/skills/buy-inference/references/x402-buyer-api.md index 8d42e1dc..f09567a8 100644 --- a/internal/embed/skills/buy-inference/references/x402-buyer-api.md +++ b/internal/embed/skills/buy-inference/references/x402-buyer-api.md @@ -211,7 +211,7 @@ apiVersion: obol.org/v1alpha1 kind: PurchaseRequest metadata: name: remote-qwen - namespace: openclaw-obol-agent + namespace: hermes-obol-agent spec: endpoint: https://seller.example.com/services/qwen/v1/chat/completions model: qwen3.5:9b diff --git a/internal/embed/skills/buy-inference/scripts/buy.py b/internal/embed/skills/buy-inference/scripts/buy.py index 81e9d7cc..d5c9bfe8 100644 --- a/internal/embed/skills/buy-inference/scripts/buy.py +++ b/internal/embed/skills/buy-inference/scripts/buy.py @@ -195,7 +195,7 @@ def _get_agent_namespace(): with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f: return f.read().strip() except FileNotFoundError: - return os.environ.get("AGENT_NAMESPACE", "openclaw-obol-agent") + return os.environ.get("AGENT_NAMESPACE", "hermes-obol-agent") def _purchase_collection_path(ns=None): diff --git a/internal/embed/skills/discovery/SKILL.md b/internal/embed/skills/discovery/SKILL.md index 3ad40141..ee6c2b8f 100644 --- a/internal/embed/skills/discovery/SKILL.md +++ b/internal/embed/skills/discovery/SKILL.md @@ -123,10 +123,10 @@ Once you find an agent with `"x402Support": true` and a service endpoint, use th ```bash # 1. Probe the endpoint for pricing -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py probe --model +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py probe --model # 2. Buy access (pre-signs payment auths, configures sidecar) -python3 /data/.openclaw/skills/buy-inference/scripts/buy.py buy \ +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/buy-inference/scripts/buy.py buy \ --endpoint --model # 3. 
Use via LiteLLM as paid/ diff --git a/internal/embed/skills/monetize-guide/SKILL.md b/internal/embed/skills/monetize-guide/SKILL.md index 683b1b94..a7b08371 100644 --- a/internal/embed/skills/monetize-guide/SKILL.md +++ b/internal/embed/skills/monetize-guide/SKILL.md @@ -36,7 +36,7 @@ obol kubectl get nodes obol kubectl get clusterrolebinding openclaw-monetize-read-binding -o jsonpath='{.subjects}' # 3. Get the wallet address (auto-generated by agent) -python3 /data/.openclaw/skills/ethereum-local-wallet/scripts/signer.py accounts +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/ethereum-local-wallet/scripts/signer.py accounts ``` **If cluster is not running**: Tell the user to run `obol stack up` first. Do NOT run it yourself — it takes several minutes and changes system state. @@ -100,10 +100,10 @@ Query the ERC-8004 registry to see what comparable services charge. ```bash # Search for registered agents on Base Sepolia -python3 /data/.openclaw/skills/discovery/scripts/discovery.py search --limit 10 +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/discovery/scripts/discovery.py search --limit 10 # For each agent with x402Support, fetch their registration to see pricing -python3 /data/.openclaw/skills/discovery/scripts/discovery.py uri +python3 ${OBOL_SKILLS_DIR:-/data/.openclaw/skills}/discovery/scripts/discovery.py uri ``` **Pricing guidelines** (present these to the user with your research): @@ -208,7 +208,7 @@ Expected progression: If a stage is stuck, check: ```bash # Agent logs for reconciliation errors -obol kubectl logs -l app=openclaw -n openclaw-obol-agent --tail=50 +obol kubectl logs -l app.kubernetes.io/name=hermes -n hermes-obol-agent --tail=50 # x402-verifier is running obol kubectl get pods -n x402 diff --git a/internal/hermes/hermes.go b/internal/hermes/hermes.go new file mode 100644 index 00000000..79511af7 --- /dev/null +++ b/internal/hermes/hermes.go @@ -0,0 +1,1285 @@ +package hermes + +import ( + "bytes" + "crypto/rand" + 
"encoding/base64" + "encoding/json" + "errors" + "fmt" + "os" + "os/exec" + "path/filepath" + "strings" + "syscall" + + "github.com/ObolNetwork/obol-stack/internal/agentruntime" + "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/dns" + obolembed "github.com/ObolNetwork/obol-stack/internal/embed" + "github.com/ObolNetwork/obol-stack/internal/model" + "github.com/ObolNetwork/obol-stack/internal/tunnel" + "github.com/ObolNetwork/obol-stack/internal/ui" + petname "github.com/dustinkirkland/golang-petname" + "gopkg.in/yaml.v3" +) + +const ( + valuesFileName = "values-hermes.yaml" + helmfileFileName = "helmfile.yaml" + gatewayTokenFileName = ".gateway-token" + obolSkillsDirName = "obol-skills" + + // renovate: datasource=helm depName=remote-signer registryUrl=https://obolnetwork.github.io/helm-charts/ + remoteSignerChartVersion = "0.3.0" + // renovate: datasource=helm depName=raw registryUrl=https://bedag.github.io/helm-charts/ + rawChartVersion = "2.0.2" + + defaultImage = "nousresearch/hermes-agent:latest" + hermesInstallDir = "/data/.hermes/hermes-agent" + hermesRepoURL = "https://github.com/NousResearch/hermes-agent.git" + hermesBinary = hermesInstallDir + "/venv/bin/hermes" + + containerUID = 10000 + containerGID = 10000 + dashboardPort = 9119 +) + +type OnboardOptions struct { + ID string + Force bool + Sync bool + IsDefault bool + AgentMode bool +} + +type SetupOptions struct{} + +type DashboardOptions struct { + Port int + NoBrowser bool +} + +type instance struct { + ID string `json:"id"` + Namespace string `json:"namespace"` + URL string `json:"url"` +} + +func DeploymentPath(cfg *config.Config, id string) string { + return agentruntime.DeploymentPath(cfg, agentruntime.Hermes, id) +} + +func SetupDefault(cfg *config.Config, u *ui.UI) error { + if _, _, err := configuredModels(cfg, u); err != nil { + u.Warnf("Skipping default Hermes agent: %v", err) + u.Print(" Run 'obol model setup' to configure LiteLLM, then 
'obol hermes onboard obol-agent'.") + return nil + } + + return Onboard(cfg, OnboardOptions{ + ID: agentruntime.DefaultInstanceID, + Sync: true, + IsDefault: true, + AgentMode: true, + }, u) +} + +func Onboard(cfg *config.Config, opts OnboardOptions, u *ui.UI) error { + id := strings.TrimSpace(opts.ID) + if opts.IsDefault { + id = agentruntime.DefaultInstanceID + } + + if id == "" { + id = petname.Generate(2, "-") + u.Infof("Generated deployment ID: %s", id) + } else { + u.Infof("Using deployment ID: %s", id) + } + + deploymentDir := DeploymentPath(cfg, id) + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + hostname := agentruntime.Hostname(agentruntime.Hermes, id) + + if _, err := os.Stat(deploymentDir); err == nil && !opts.Force && !opts.IsDefault { + return fmt.Errorf("deployment already exists: hermes/%s\nDirectory: %s\nUse --force or -f to overwrite", id, deploymentDir) + } + + if opts.IsDefault && !opts.Force { + if _, err := os.Stat(deploymentDir); err == nil { + u.Info("Default Hermes instance already configured, re-syncing...") + if err := dns.EnsureHostsEntries(agentruntime.CollectHostnames(cfg, agentruntime.DeploymentRef{ + Runtime: agentruntime.Hermes, + ID: id, + })); err != nil { + u.Warnf("Could not update /etc/hosts for Hermes hostnames: %v", err) + } + if err := writeDeploymentFiles(cfg, id, deploymentDir, currentAgentBaseURL(deploymentDir), u); err != nil { + return err + } + if opts.Sync { + return Sync(cfg, id, u) + } + return nil + } + } + + if _, err := os.Stat(deploymentDir); err == nil && opts.Force { + u.Warnf("Overwriting existing deployment at %s", deploymentDir) + } + + if err := os.MkdirAll(deploymentDir, 0o755); err != nil { + return fmt.Errorf("failed to create deployment directory: %w", err) + } + + if err := dns.EnsureHostsEntries(agentruntime.CollectHostnames(cfg, agentruntime.DeploymentRef{ + Runtime: agentruntime.Hermes, + ID: id, + })); err != nil { + u.Warnf("Could not update /etc/hosts for Hermes hostnames: %v", 
err) + } + + u.Blank() + u.Info("Generating Ethereum wallet...") + wallet, err := GenerateWallet(cfg, id, u) + if err != nil { + _ = os.RemoveAll(deploymentDir) + return fmt.Errorf("failed to generate wallet: %w", err) + } + + rsValues := generateRemoteSignerValues(wallet) + if err := os.WriteFile(filepath.Join(deploymentDir, "values-remote-signer.yaml"), []byte(rsValues), 0o600); err != nil { + _ = os.RemoveAll(deploymentDir) + return fmt.Errorf("failed to write remote-signer values: %w", err) + } + + if err := WriteWalletMetadata(deploymentDir, wallet); err != nil { + _ = os.RemoveAll(deploymentDir) + return fmt.Errorf("failed to write wallet metadata: %w", err) + } + + agentBaseURL := "" + if opts.AgentMode { + if st, _ := tunnel.LoadTunnelState(cfg); st != nil && st.Hostname != "" { + agentBaseURL = "https://" + st.Hostname + } + } + + if err := writeDeploymentFiles(cfg, id, deploymentDir, agentBaseURL, u); err != nil { + _ = os.RemoveAll(deploymentDir) + return err + } + + u.Blank() + u.Success("Hermes instance configured!") + u.Detail("Deployment", fmt.Sprintf("hermes/%s", id)) + u.Detail("Namespace", namespace) + u.Detail("Hostname", hostname) + u.Detail("Wallet", wallet.Address) + u.Detail("Location", deploymentDir) + u.Blank() + u.Print("Files created:") + u.Print(" - values-hermes.yaml Hermes deployment manifest") + u.Print(" - values-remote-signer.yaml Remote-signer config") + u.Print(" - wallet.json Wallet metadata") + u.Print(" - helmfile.yaml Hermes + remote-signer deployment configuration") + u.Blank() + u.Print(" Back up your signing key:") + u.Printf(" cp -r %s ~/obol-wallet-backup/", agentruntime.KeystoreVolumePath(cfg, agentruntime.Hermes, id)) + + if opts.Sync { + u.Blank() + u.Info("Deploying to cluster...") + u.Blank() + return Sync(cfg, id, u) + } + + u.Printf("\nTo deploy: obol hermes sync %s", id) + return nil +} + +func Sync(cfg *config.Config, id string, u *ui.UI) error { + deploymentDir := DeploymentPath(cfg, id) + if _, err := 
os.Stat(deploymentDir); os.IsNotExist(err) { + return fmt.Errorf("deployment not found: hermes/%s\nDirectory: %s", id, deploymentDir) + } + + if err := dns.EnsureHostsEntries(agentruntime.CollectHostnames(cfg, agentruntime.DeploymentRef{ + Runtime: agentruntime.Hermes, + ID: id, + })); err != nil { + u.Warnf("Could not update /etc/hosts for Hermes hostnames: %v", err) + } + + if err := writeDeploymentFiles(cfg, id, deploymentDir, currentAgentBaseURL(deploymentDir), u); err != nil { + return err + } + + helmfilePath := filepath.Join(deploymentDir, helmfileFileName) + if _, err := os.Stat(helmfilePath); os.IsNotExist(err) { + return fmt.Errorf("helmfile.yaml not found in: %s", deploymentDir) + } + + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) { + return errors.New("cluster not running. Run 'obol stack up' first") + } + + if err := refreshHelmRepos(cfg); err != nil { + return err + } + + helmfileBinary := filepath.Join(cfg.BinDir, "helmfile") + cmd := exec.Command(helmfileBinary, "-f", helmfilePath, "sync") + cmd.Dir = deploymentDir + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + + u.Infof("Syncing Hermes: hermes/%s", id) + u.Detail("Deployment directory", deploymentDir) + + if err := u.Exec(ui.ExecConfig{ + Name: "Running helmfile sync", + Cmd: cmd, + }); err != nil { + return fmt.Errorf("helmfile sync failed: %w", err) + } + + u.Blank() + u.Success("Hermes installed successfully!") + u.Detail("Namespace", agentruntime.Namespace(agentruntime.Hermes, id)) + u.Detail("URL", "http://"+agentruntime.Hostname(agentruntime.Hermes, id)) + u.Detail("Dashboard", "http://"+dashboardHostname(id)) + u.Blank() + u.Dim("[Optional] Retrieve an API server token:") + u.Printf(" obol hermes token %s", id) + u.Blank() + u.Dim("[Optional] Port-forward fallback:") + u.Printf(" obol kubectl -n %s port-forward svc/%s %d:%d", + agentruntime.Namespace(agentruntime.Hermes, id), + 
agentruntime.Describe(agentruntime.Hermes).ServiceName, + agentruntime.Describe(agentruntime.Hermes).DefaultPort, + agentruntime.Describe(agentruntime.Hermes).DefaultPort, + ) + + return nil +} + +func Setup(cfg *config.Config, id string, _ SetupOptions, u *ui.UI) error { + u.Info("Re-rendering Hermes config from the current LiteLLM model inventory...") + return Sync(cfg, id, u) +} + +func List(cfg *config.Config, u *ui.UI) error { + ids, err := agentruntime.ListInstanceIDs(cfg, agentruntime.Hermes) + if err != nil { + return err + } + + var instances []instance + for _, id := range ids { + instances = append(instances, instance{ + ID: id, + Namespace: agentruntime.Namespace(agentruntime.Hermes, id), + URL: "http://" + agentruntime.Hostname(agentruntime.Hermes, id), + }) + } + + if u.IsJSON() { + return u.JSON(instances) + } + + if len(instances) == 0 { + u.Print("No Hermes instances installed") + u.Print("\nTo create one: obol hermes onboard") + return nil + } + + u.Info("Hermes instances:") + u.Blank() + for _, inst := range instances { + u.Bold(" " + inst.ID) + u.Detail(" Namespace", inst.Namespace) + u.Detail(" URL", inst.URL) + u.Blank() + } + u.Printf("Total: %d instance(s)", len(instances)) + return nil +} + +func Delete(cfg *config.Config, id string, force bool, u *ui.UI) error { + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + deploymentDir := DeploymentPath(cfg, id) + + u.Infof("Deleting Hermes: hermes/%s", id) + u.Detail("Namespace", namespace) + + configExists := false + if _, err := os.Stat(deploymentDir); err == nil { + configExists = true + } + + namespaceExists := false + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); err == nil { + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + cmd := exec.Command(kubectlBinary, "get", "namespace", namespace) + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + if err := cmd.Run(); err == nil { + namespaceExists = true + 
} + } + + if !namespaceExists && !configExists { + return fmt.Errorf("instance not found: %s", id) + } + + u.Blank() + u.Print("Resources to be deleted:") + if namespaceExists { + u.Printf(" [x] Kubernetes namespace: %s", namespace) + } else { + u.Printf(" [ ] Kubernetes namespace: %s (not found)", namespace) + } + if configExists { + u.Printf(" [x] Configuration: %s", deploymentDir) + } + + if !force && !u.Confirm("\nProceed with deletion?", false) { + u.Print("Deletion cancelled") + return nil + } + + if namespaceExists { + helmfilePath := filepath.Join(deploymentDir, helmfileFileName) + helmfileBinary := filepath.Join(cfg.BinDir, "helmfile") + if _, err := os.Stat(helmfilePath); err == nil { + if _, err := os.Stat(helmfileBinary); err == nil { + destroyCmd := exec.Command(helmfileBinary, "-f", helmfilePath, "destroy") + destroyCmd.Dir = deploymentDir + destroyCmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + if err := u.Exec(ui.ExecConfig{ + Name: "Removing Helm releases from " + namespace, + Cmd: destroyCmd, + }); err != nil { + u.Warnf("helmfile destroy failed (will force-delete namespace): %v", err) + } + } + } + + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + deleteCmd := exec.Command(kubectlBinary, "delete", "namespace", namespace, "--force", "--grace-period=0") + deleteCmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + if err := u.Exec(ui.ExecConfig{ + Name: "Deleting namespace " + namespace, + Cmd: deleteCmd, + }); err != nil { + u.Warnf("namespace deletion may still be in progress: %v", err) + } + } + + if configExists { + u.Info("Deleting configuration...") + if err := os.RemoveAll(deploymentDir); err != nil { + return fmt.Errorf("failed to delete config directory: %w", err) + } + u.Success("Configuration deleted") + parentDir := filepath.Join(cfg.ConfigDir, "applications", string(agentruntime.Hermes)) + if entries, err := os.ReadDir(parentDir); err == nil && len(entries) == 0 { + _ = os.Remove(parentDir) + } + } + + 
u.Blank() + u.Successf("Hermes %s deleted successfully!", id) + return nil +} + +func Token(cfg *config.Config, id string, u *ui.UI) error { + token, err := getToken(cfg, id) + if err != nil { + return err + } + u.Print(token) + return nil +} + +func RegenerateToken(cfg *config.Config, id string, u *ui.UI) (string, error) { + deploymentDir := DeploymentPath(cfg, id) + if _, err := os.Stat(deploymentDir); os.IsNotExist(err) { + return "", fmt.Errorf("deployment not found: hermes/%s", id) + } + + newToken, err := generateGatewayToken() + if err != nil { + return "", err + } + if err := persistGatewayToken(deploymentDir, newToken); err != nil { + return "", err + } + if err := writeDeploymentFiles(cfg, id, deploymentDir, currentAgentBaseURL(deploymentDir), u); err != nil { + return "", err + } + + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) { + return "", errors.New("cluster not running. 
Run 'obol stack up' first") + } + + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + manifest := map[string]any{ + "apiVersion": "v1", + "kind": "Secret", + "metadata": map[string]any{ + "name": "hermes-api-server", + "namespace": namespace, + "labels": map[string]string{ + "app.kubernetes.io/name": "hermes", + "app.kubernetes.io/managed-by": "obol", + }, + }, + "type": "Opaque", + "stringData": map[string]string{ + "API_SERVER_KEY": newToken, + }, + } + raw, _ := json.Marshal(manifest) //nolint:errchkjson // controlled payload + + applyCmd := exec.Command(kubectlBinary, "apply", "-f", "-") + applyCmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + applyCmd.Stdin = bytes.NewReader(raw) + if out, err := applyCmd.CombinedOutput(); err != nil { + return "", fmt.Errorf("failed to apply Hermes token secret: %w\n%s", err, string(out)) + } + + restartCmd := exec.Command(kubectlBinary, "rollout", "restart", "deployment/hermes", "-n", namespace) + restartCmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + if out, err := restartCmd.CombinedOutput(); err != nil { + return "", fmt.Errorf("failed to restart Hermes runtime: %w\n%s", err, string(out)) + } + + waitCmd := exec.Command(kubectlBinary, "rollout", "status", "deployment/hermes", "-n", namespace, "--timeout=120s") + waitCmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + if out, err := waitCmd.CombinedOutput(); err != nil { + return "", fmt.Errorf("rollout not confirmed: %w\n%s", err, string(out)) + } + + u.Success("Token regenerated successfully") + return newToken, nil +} + +func SyncDefaultModels(cfg *config.Config, u *ui.UI) error { + deploymentDir := DeploymentPath(cfg, agentruntime.DefaultInstanceID) + if _, err := os.Stat(deploymentDir); os.IsNotExist(err) { + return nil + } + return Sync(cfg, agentruntime.DefaultInstanceID, u) +} + +func Skills(cfg *config.Config, id string, args []string) error { + return cliViaKubectlExec(cfg, id, append([]string{"skills"}, args...)) +} + 
+func ResolveInstance(cfg *config.Config, args []string) (string, []string, error) { + return agentruntime.ResolveInstance(cfg, agentruntime.Hermes, args) +} + +func cliViaKubectlExec(cfg *config.Config, id string, args []string) error { + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) { + return errors.New("cluster not running. Run 'obol stack up' first") + } + + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + cmd := exec.Command(kubectlBinary, hermesExecArgs(namespace, args, stdinIsTerminal())...) + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + cmd.Stdin = os.Stdin + cmd.Stdout = os.Stdout + cmd.Stderr = os.Stderr + + if err := cmd.Run(); err != nil { + exitErr := &exec.ExitError{} + if errors.As(err, &exitErr) { + if status, ok := exitErr.Sys().(syscall.WaitStatus); ok { + os.Exit(status.ExitStatus()) + } + } + return err + } + + return nil +} + +func hermesExecArgs(namespace string, args []string, withTTY bool) []string { + execArgs := []string{"exec", "-i"} + if withTTY { + execArgs = append(execArgs, "-t") + } + execArgs = append(execArgs, + "-c", agentruntime.Describe(agentruntime.Hermes).ServiceName, + "-n", namespace, + "deploy/"+agentruntime.Describe(agentruntime.Hermes).ServiceName, + "--", + hermesBinary, + ) + return append(execArgs, args...) +} + +func stdinIsTerminal() bool { + info, err := os.Stdin.Stat() + return err == nil && info.Mode()&os.ModeCharDevice != 0 +} + +func getToken(cfg *config.Config, id string) (string, error) { + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) { + return "", errors.New("cluster not running. 
Run 'obol stack up' first") + } + + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + cmd := exec.Command(kubectlBinary, "get", "secret", "hermes-api-server", "-n", namespace, "-o", "json") + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + + out, err := cmd.Output() + if err != nil { + return "", fmt.Errorf("failed to get secret: %w", err) + } + + var secret struct { + Data map[string]string `json:"data"` + } + if err := json.Unmarshal(out, &secret); err != nil { + return "", fmt.Errorf("failed to parse secret: %w", err) + } + + encoded := secret.Data["API_SERVER_KEY"] + if encoded == "" { + return "", fmt.Errorf("API_SERVER_KEY not found in namespace %s secrets", namespace) + } + + decoded, err := base64.StdEncoding.DecodeString(encoded) + if err != nil { + return "", fmt.Errorf("failed to decode token: %w", err) + } + return string(decoded), nil +} + +func writeDeploymentFiles(cfg *config.Config, id, deploymentDir, agentBaseURL string, u *ui.UI) error { + models, primary, err := configuredModels(cfg, u) + if err != nil { + return err + } + + token, err := ensureGatewayToken(deploymentDir) + if err != nil { + return err + } + + namespace := agentruntime.Namespace(agentruntime.Hermes, id) + hostname := agentruntime.Hostname(agentruntime.Hermes, id) + dashboardHost := dashboardHostname(id) + configData, err := generateConfig(cfg, primary) + if err != nil { + return err + } + + if err := os.WriteFile(filepath.Join(deploymentDir, valuesFileName), []byte(generateValues(namespace, hostname, dashboardHost, agentBaseURL, token, primary, configData)), 0o600); err != nil { + return fmt.Errorf("failed to write %s: %w", valuesFileName, err) + } + if err := os.WriteFile(filepath.Join(deploymentDir, helmfileFileName), []byte(generateHelmfile(namespace)), 0o600); err != nil { + return fmt.Errorf("failed to write %s: %w", helmfileFileName, err) + } + + if err := syncRuntimeFiles(cfg, id, configData, u); err != nil { + return err + } + + u.Successf("Prepared 
Hermes runtime config (%d model(s), default: %s)", len(models), primary) + return nil +} + +func generateHelmfile(namespace string) string { + return fmt.Sprintf(`# Managed by obol hermes + +repositories: + - name: obol + url: https://obolnetwork.github.io/helm-charts/ + - name: bedag + url: https://bedag.github.io/helm-charts/ + +releases: + - name: hermes + namespace: %s + createNamespace: true + chart: bedag/raw + version: %s + values: + - %s + + - name: remote-signer + namespace: %s + chart: obol/remote-signer + version: %s + values: + - values-remote-signer.yaml +`, namespace, rawChartVersion, valuesFileName, namespace, remoteSignerChartVersion) +} + +func dashboardHostname(id string) string { + return agentruntime.DashboardHostname(agentruntime.Hermes, id) +} + +func generateValues(namespace, hostname, dashboardHostname, agentBaseURL, token, primary string, configData []byte) string { + desc := agentruntime.Describe(agentruntime.Hermes) + + var b strings.Builder + fmt.Fprintf(&b, `resources: + - apiVersion: v1 + kind: ServiceAccount + metadata: + name: %s + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + automountServiceAccountToken: true + + - apiVersion: v1 + kind: Secret + metadata: + name: hermes-api-server + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + type: Opaque + stringData: + API_SERVER_KEY: %s + + - apiVersion: v1 + kind: ConfigMap + metadata: + name: %s + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + data: + config.yaml: | +`, desc.ServiceName, namespace, desc.ServiceName, namespace, desc.ServiceName, quoteYAML(token), desc.ConfigMapName, namespace, desc.ServiceName) + b.WriteString(indentBlock(string(configData), " ")) + fmt.Fprintf(&b, ` + - apiVersion: v1 + kind: PersistentVolumeClaim + metadata: + name: %s + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + spec: + 
accessModes: + - ReadWriteOnce + resources: + requests: + storage: 5Gi + + - apiVersion: apps/v1 + kind: Deployment + metadata: + name: %s + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + spec: + replicas: 1 + selector: + matchLabels: + app.kubernetes.io/name: %s + template: + metadata: + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + spec: + serviceAccountName: %s + automountServiceAccountToken: true + securityContext: + runAsUser: %d + runAsGroup: %d + fsGroup: %d + initContainers: + - name: init-hermes-data + image: busybox:1.36 + command: + - sh + - -c + - mkdir -p /data/.hermes && chown -R %d:%d /data/.hermes + securityContext: + runAsUser: 0 + volumeMounts: + - name: data + mountPath: /data + - name: bootstrap-hermes-install + image: %s + imagePullPolicy: IfNotPresent + command: + - sh + - -ec + - | + install_dir=%s + repo_url=%s + mkdir -p /data/.hermes/home /data/.hermes/workspace + if [ ! -d "$install_dir/.git" ]; then + rm -rf "${install_dir}.tmp" + if [ -e "$install_dir" ]; then + mv "$install_dir" "${install_dir}.backup.$(date +%%s)" + fi + git clone --depth 1 "$repo_url" "$install_dir" + fi + cd "$install_dir" + if [ ! -x "$install_dir/venv/bin/hermes" ]; then + rm -rf "$install_dir/venv" + uv venv --python python3 --system-site-packages venv + VIRTUAL_ENV="$install_dir/venv" uv pip install -e "." + fi + if [ -f /data/.hermes/state.db ]; then + if ! 
python3 - <<'PY' + import sqlite3 + conn = sqlite3.connect('/data/.hermes/state.db') + row = conn.execute('PRAGMA quick_check').fetchone() + raise SystemExit(0 if row and row[0] == 'ok' else 1) + PY + then + ts="$(date -u +%%Y%%m%%dT%%H%%M%%SZ)" + backup_dir="/data/.hermes/backups/state-db-corrupt-$ts" + mkdir -p "$backup_dir" + cp -a /data/.hermes/state.db* "$backup_dir"/ 2>/dev/null || true + mv /data/.hermes/state.db "/data/.hermes/state.db.corrupt-$ts" + rm -f /data/.hermes/state.db-shm /data/.hermes/state.db-wal + echo "Backed up malformed Hermes state DB to $backup_dir" + fi + fi + volumeMounts: + - name: data + mountPath: /data + containers: + - name: %s + image: %s + imagePullPolicy: IfNotPresent + command: + - %s + args: + - gateway + - run + - --replace + ports: + - name: http + containerPort: %d + env: + - name: HERMES_HOME + value: /data/.hermes + - name: HOME + value: /data/.hermes/home + - name: API_SERVER_ENABLED + value: "true" + - name: API_SERVER_HOST + value: "0.0.0.0" + - name: API_SERVER_PORT + value: "%d" + - name: API_SERVER_KEY + valueFrom: + secretKeyRef: + name: hermes-api-server + key: API_SERVER_KEY + - name: API_SERVER_MODEL_NAME + value: %s + - name: REMOTE_SIGNER_URL + value: http://remote-signer:9000 + - name: AGENT_NAMESPACE + value: %s + - name: OBOL_SKILLS_DIR + value: /data/.hermes/%s + `, desc.DataPVCName, namespace, desc.ServiceName, desc.ServiceName, namespace, desc.ServiceName, desc.ServiceName, desc.ServiceName, desc.ServiceName, containerUID, containerGID, containerGID, containerUID, containerGID, quoteYAML(image()), quoteYAML(hermesInstallDir), quoteYAML(hermesRepoURL), desc.ServiceName, quoteYAML(image()), quoteYAML(hermesBinary), desc.DefaultPort, desc.DefaultPort, quoteYAML(primary), quoteYAML(namespace), obolSkillsDirName) + + if agentBaseURL != "" { + fmt.Fprintf(&b, " - name: AGENT_BASE_URL\n value: %s\n", quoteYAML(agentBaseURL)) + } + + fmt.Fprintf(&b, ` readinessProbe: + httpGet: + path: /health + port: %d + 
initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + httpGet: + path: /health + port: %d + initialDelaySeconds: 15 + periodSeconds: 20 + startupProbe: + httpGet: + path: /health + port: %d + periodSeconds: 5 + failureThreshold: 24 + volumeMounts: + - name: data + mountPath: /data + - name: hermes-dashboard + image: %s + imagePullPolicy: IfNotPresent + command: + - %s + args: + - dashboard + - --host + - 0.0.0.0 + - --port + - "%d" + - --no-open + - --insecure + ports: + - name: dashboard + containerPort: %d + env: + - name: HERMES_HOME + value: /data/.hermes + - name: HOME + value: /data/.hermes/home + - name: GATEWAY_HEALTH_URL + value: http://localhost:%d + - name: GATEWAY_HEALTH_TIMEOUT + value: "3" + readinessProbe: + httpGet: + path: /api/status + port: %d + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + httpGet: + path: /api/status + port: %d + initialDelaySeconds: 15 + periodSeconds: 20 + startupProbe: + httpGet: + path: /api/status + port: %d + periodSeconds: 5 + failureThreshold: 24 + volumeMounts: + - name: data + mountPath: /data + volumes: + - name: data + persistentVolumeClaim: + claimName: %s + + - apiVersion: v1 + kind: Service + metadata: + name: %s + namespace: %s + labels: + app.kubernetes.io/name: %s + app.kubernetes.io/managed-by: obol + spec: + selector: + app.kubernetes.io/name: %s + ports: + - name: http + port: %d + targetPort: http + - name: dashboard + port: %d + targetPort: dashboard + + - apiVersion: gateway.networking.k8s.io/v1 + kind: HTTPRoute + metadata: + name: %s + namespace: %s + spec: + hostnames: + - %s + parentRefs: + - name: traefik-gateway + namespace: traefik + sectionName: web + rules: + - backendRefs: + - name: %s + port: %d + + - apiVersion: gateway.networking.k8s.io/v1 + kind: HTTPRoute + metadata: + name: %s-dashboard + namespace: %s + spec: + hostnames: + - %s + parentRefs: + - name: traefik-gateway + namespace: traefik + sectionName: web + rules: + - backendRefs: + - name: %s + port: %d +`, 
desc.DefaultPort, desc.DefaultPort, desc.DefaultPort, + quoteYAML(image()), quoteYAML(hermesBinary), dashboardPort, dashboardPort, desc.DefaultPort, dashboardPort, dashboardPort, dashboardPort, + desc.DataPVCName, + desc.ServiceName, namespace, desc.ServiceName, desc.ServiceName, desc.DefaultPort, dashboardPort, + desc.ServiceName, namespace, quoteYAML(hostname), desc.ServiceName, desc.DefaultPort, + desc.ServiceName, namespace, quoteYAML(dashboardHostname), desc.ServiceName, dashboardPort) + + return strings.ReplaceAll(b.String(), "\t", "") +} + +func syncRuntimeFiles(cfg *config.Config, id string, configData []byte, u *ui.UI) error { + targetDir := agentruntime.HomePath(cfg, agentruntime.Hermes, id) + ensureVolumeWritable(cfg, targetDir, u) + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return fmt.Errorf("failed to create Hermes home %s: %w", targetDir, err) + } + if err := os.WriteFile(filepath.Join(targetDir, "config.yaml"), configData, 0o600); err != nil { + return fmt.Errorf("failed to write Hermes config: %w", err) + } + if err := syncObolSkills(cfg, id); err != nil { + return err + } + if err := removeLegacyHeartbeat(targetDir); err != nil { + return err + } + fixRuntimeVolumeOwnership(cfg, targetDir, u) + return nil +} + +func removeLegacyHeartbeat(hermesHome string) error { + heartbeatPath := filepath.Join(hermesHome, "workspace", "HEARTBEAT.md") + if err := os.Remove(heartbeatPath); err != nil && !os.IsNotExist(err) { + return fmt.Errorf("failed to remove legacy heartbeat: %w", err) + } + return nil +} + +func syncObolSkills(cfg *config.Config, id string) error { + targetDir := filepath.Join(agentruntime.HomePath(cfg, agentruntime.Hermes, id), obolSkillsDirName) + if err := os.MkdirAll(targetDir, 0o755); err != nil { + return fmt.Errorf("failed to create Obol skills directory: %w", err) + } + if err := obolembed.CopySkills(targetDir); err != nil { + return fmt.Errorf("failed to copy Obol skills: %w", err) + } + return nil +} + +func 
configuredModels(cfg *config.Config, u *ui.UI) ([]string, string, error) { + models, err := model.GetConfiguredModels(cfg) + if err == nil && len(models) > 0 { + primary, _ := rankModels(models) + return stripProviderPrefixes(models), stripProviderPrefix(primary), nil + } + + ollamaModels, ollamaErr := model.ListOllamaModels() + if ollamaErr != nil || len(ollamaModels) == 0 { + return nil, "", errors.New("no LiteLLM models configured") + } + + var names []string + for _, m := range ollamaModels { + name := m.Name + if before, ok := strings.CutSuffix(name, ":latest"); ok { + name = before + } + if name != "" { + names = append(names, name) + } + } + + if len(names) == 0 { + return nil, "", errors.New("no LiteLLM models configured") + } + + if err := model.ConfigureLiteLLM(cfg, u, "ollama", "", names); err != nil { + return nil, "", fmt.Errorf("failed to auto-configure LiteLLM for Ollama: %w", err) + } + + primary, _ := rankModels(names) + return names, stripProviderPrefix(primary), nil +} + +func generateConfig(cfg *config.Config, primary string) ([]byte, error) { + payload := map[string]any{ + "model": map[string]any{ + "default": primary, + "provider": "custom", + "base_url": "http://litellm.llm.svc.cluster.local:4000/v1", + "api_key": litellmMasterKey(cfg), + }, + "terminal": map[string]any{ + "backend": "local", + "cwd": "/data/.hermes/workspace", + "timeout": 180, + "lifetime_seconds": 300, + "docker_mount_cwd_to_workspace": false, + }, + "skills": map[string]any{ + "external_dirs": []string{"/data/.hermes/" + obolSkillsDirName}, + }, + } + return yaml.Marshal(payload) +} + +func currentAgentBaseURL(deploymentDir string) string { + raw, err := os.ReadFile(filepath.Join(deploymentDir, valuesFileName)) + if err != nil { + return "" + } + lines := strings.Split(string(raw), "\n") + for i, line := range lines { + if strings.Contains(line, "name: AGENT_BASE_URL") { + if i+1 < len(lines) && strings.Contains(lines[i+1], "value:") { + return 
strings.Trim(strings.TrimSpace(strings.TrimPrefix(strings.TrimSpace(lines[i+1]), "value:")), `"'`) + } + } + } + return "" +} + +func gatewayTokenPath(deploymentDir string) string { + return filepath.Join(deploymentDir, gatewayTokenFileName) +} + +func ensureGatewayToken(deploymentDir string) (string, error) { + if data, err := os.ReadFile(gatewayTokenPath(deploymentDir)); err == nil { + token := strings.TrimSpace(string(data)) + if token != "" { + return token, nil + } + } + + token, err := generateGatewayToken() + if err != nil { + return "", err + } + if err := persistGatewayToken(deploymentDir, token); err != nil { + return "", err + } + return token, nil +} + +func persistGatewayToken(deploymentDir, token string) error { + token = strings.TrimSpace(token) + if token == "" { + return errors.New("gateway token is empty") + } + return os.WriteFile(gatewayTokenPath(deploymentDir), []byte(token+"\n"), 0o600) +} + +func generateGatewayToken() (string, error) { + const alphabet = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" + // 248 is the largest multiple of len(alphabet) (62*4) that fits in a byte; + // rejecting bytes at or above it keeps the token free of modulo bias. + const unbiasedLimit = 248 + + out := make([]byte, 32) + buf := make([]byte, 1) + for i := 0; i < len(out); { + if _, err := rand.Read(buf); err != nil { + return "", fmt.Errorf("failed to generate gateway token: %w", err) + } + if buf[0] >= unbiasedLimit { + continue + } + out[i] = alphabet[int(buf[0])%len(alphabet)] + i++ + } + return string(out), nil +} + +func image() string { + if override := strings.TrimSpace(os.Getenv("OBOL_HERMES_IMAGE")); override != "" { + return override + } + return defaultImage +} + +// quoteYAML renders value as a YAML double-quoted scalar. +func quoteYAML(value string) string { + value = strings.TrimSpace(value) + // Escape backslashes before quotes so both survive double-quoted parsing. + value = strings.ReplaceAll(value, `\`, `\\`) + value = strings.ReplaceAll(value, `"`, `\"`) + return `"` + value + `"` +} + +func indentBlock(value, prefix string) string { + if value == "" { + return prefix + "\n" + } + + lines := strings.Split(strings.TrimRight(value, "\n"), "\n") + for i, line := range lines { + lines[i] = prefix + line + } + return strings.Join(lines, "\n") + "\n" +} + +func refreshHelmRepos(cfg *config.Config) error { + helmBinary := 
filepath.Join(cfg.BinDir, "helm") + kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml") + env := append(os.Environ(), "KUBECONFIG="+kubeconfigPath) + + addObolCmd := exec.Command(helmBinary, "repo", "add", "obol", "https://obolnetwork.github.io/helm-charts/") + addObolCmd.Env = env + addObolOut, addObolErr := addObolCmd.CombinedOutput() + if addObolErr != nil && !strings.Contains(string(addObolOut), `"obol" already exists`) { + return fmt.Errorf("helm repo add obol failed: %w\n%s", addObolErr, string(addObolOut)) + } + + addBedagCmd := exec.Command(helmBinary, "repo", "add", "bedag", "https://bedag.github.io/helm-charts/") + addBedagCmd.Env = env + addBedagOut, addBedagErr := addBedagCmd.CombinedOutput() + if addBedagErr != nil && !strings.Contains(string(addBedagOut), `"bedag" already exists`) { + return fmt.Errorf("helm repo add bedag failed: %w\n%s", addBedagErr, string(addBedagOut)) + } + + updateCmd := exec.Command(helmBinary, "repo", "update", "obol", "bedag") + updateCmd.Env = env + updateOut, updateErr := updateCmd.CombinedOutput() + if updateErr != nil { + return fmt.Errorf("helm repo update failed: %w\n%s", updateErr, string(updateOut)) + } + + return nil +} + +func litellmMasterKey(cfg *config.Config) string { + stackIDPath := filepath.Join(cfg.ConfigDir, ".stack-id") + data, err := os.ReadFile(stackIDPath) + if err != nil { + return "sk-obol-default" + } + return "sk-obol-" + strings.TrimSpace(string(data)) +} + +func stripProviderPrefix(modelName string) string { + modelName = strings.TrimSpace(strings.Trim(modelName, `"'`)) + if before, after, ok := strings.Cut(modelName, "/"); ok && before != "" && after != "" { + return after + } + return modelName +} + +func stripProviderPrefixes(modelNames []string) []string { + if len(modelNames) == 0 { + return nil + } + + out := make([]string, 0, len(modelNames)) + for _, name := range modelNames { + if trimmed := stripProviderPrefix(name); trimmed != "" { + out = append(out, trimmed) + } + } + 
return out +} + +// rankModels picks a primary model and ordered fallbacks, preferring cloud +// models over local ones. +func rankModels(models []string) (primary string, fallbacks []string) { + if len(models) == 0 { + return "", nil + } + + var cloud []string + var local []string + for _, m := range models { + trimmed := stripProviderPrefix(m) + if isCloudModel(trimmed) { + cloud = append(cloud, trimmed) + } else { + local = append(local, trimmed) + } + } + + if len(cloud) > 0 { + primary = cloud[0] + fallbacks = append(append([]string{}, cloud[1:]...), local...) + } else { + primary = local[0] + fallbacks = local[1:] + } + + return primary, fallbacks +} + +// isCloudModel reports whether name looks like a hosted model; the prefix +// list is a heuristic and may need updating as providers add model families. +func isCloudModel(name string) bool { + if strings.Contains(name, "claude") { + return true + } + if strings.HasPrefix(name, "gpt") || strings.HasPrefix(name, "o1") || strings.HasPrefix(name, "o3") || strings.HasPrefix(name, "o4") { + return true + } + return false +} + +func k3dNodeExec(cfg *config.Config, hostPath, shellCmd string) error { + stackID := "" + if data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, ".stack-id")); err == nil { + stackID = strings.TrimSpace(string(data)) + } + if stackID == "" { + return errors.New("stack ID not found") + } + + container := fmt.Sprintf("k3d-obol-stack-%s-server-0", stackID) + relPath, err := filepath.Rel(cfg.DataDir, hostPath) + if err != nil { + return fmt.Errorf("cannot compute relative path from %s to %s: %w", cfg.DataDir, hostPath, err) + } + if strings.HasPrefix(relPath, "..") { + return fmt.Errorf("path %s is not under DataDir %s", hostPath, cfg.DataDir) + } + + nodePath := filepath.Join("/data", relPath) + quoted := "'" + strings.ReplaceAll(nodePath, "'", "'\"'\"'") + "'" + expanded := strings.ReplaceAll(shellCmd, "{}", quoted) + + cmd := exec.Command("docker", "exec", container, "sh", "-c", expanded) + return cmd.Run() +} + +func ensureVolumeWritable(cfg *config.Config, hostPath string, u *ui.UI) { + backendName := "k3d" + if data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, ".stack-backend")); err == nil { + backendName = 
strings.TrimSpace(string(data)) + } + + if backendName != "k3d" { + return + } + + uid := os.Getuid() + gid := os.Getgid() + shellCmd := fmt.Sprintf("mkdir -p {} && chown -R %d:%d {}", uid, gid) + + if err := k3dNodeExec(cfg, hostPath, shellCmd); err != nil && u != nil { + u.Warnf("Could not pre-create volume directory %s: %v", hostPath, err) + } +} + +func fixRuntimeVolumeOwnership(cfg *config.Config, hostPath string, u *ui.UI) { + backendName := "k3d" + if data, err := os.ReadFile(filepath.Join(cfg.ConfigDir, ".stack-backend")); err == nil { + backendName = strings.TrimSpace(string(data)) + } + + owner := fmt.Sprintf("%d:%d", containerUID, containerGID) + switch backendName { + case "k3d": + if err := k3dNodeExec(cfg, hostPath, "chown -R "+owner+" {}"); err != nil && u != nil { + u.Warnf("Failed to fix volume ownership for %s: %v", hostPath, err) + } + default: + _ = os.Chown(hostPath, containerUID, containerGID) + } +} diff --git a/internal/hermes/hermes_test.go b/internal/hermes/hermes_test.go new file mode 100644 index 00000000..09d74e4d --- /dev/null +++ b/internal/hermes/hermes_test.go @@ -0,0 +1,164 @@ +package hermes + +import ( + "fmt" + "os" + "path/filepath" + "reflect" + "strings" + "testing" + + "github.com/ObolNetwork/obol-stack/internal/config" + "gopkg.in/yaml.v3" +) + +func testConfig(t *testing.T) *config.Config { + t.Helper() + + dir := t.TempDir() + if err := os.WriteFile(filepath.Join(dir, ".stack-id"), []byte("test-cluster"), 0o644); err != nil { + t.Fatalf("write .stack-id: %v", err) + } + + return &config.Config{ConfigDir: dir, DataDir: dir, BinDir: dir} +} + +func TestGenerateConfig_UsesLiteLLMCustomProvider(t *testing.T) { + raw, err := generateConfig(testConfig(t), "gpt-5.2") + if err != nil { + t.Fatalf("generateConfig() error = %v", err) + } + + var cfg map[string]any + if err := yaml.Unmarshal(raw, &cfg); err != nil { + t.Fatalf("yaml.Unmarshal() error = %v", err) + } + + modelCfg, ok := cfg["model"].(map[string]any) + if !ok { + 
t.Fatalf("model config missing or wrong type: %#v", cfg["model"]) + } + + if got := modelCfg["default"]; got != "gpt-5.2" { + t.Fatalf("model.default = %#v, want %q", got, "gpt-5.2") + } + + if got := modelCfg["provider"]; got != "custom" { + t.Fatalf("model.provider = %#v, want %q", got, "custom") + } + + if got := modelCfg["base_url"]; got != "http://litellm.llm.svc.cluster.local:4000/v1" { + t.Fatalf("model.base_url = %#v", got) + } + + if got := modelCfg["api_key"]; got != "sk-obol-test-cluster" { + t.Fatalf("model.api_key = %#v, want stack-derived LiteLLM key", got) + } + + terminalCfg, ok := cfg["terminal"].(map[string]any) + if !ok { + t.Fatalf("terminal config missing or wrong type: %#v", cfg["terminal"]) + } + + if got := terminalCfg["cwd"]; got != "/data/.hermes/workspace" { + t.Fatalf("terminal.cwd = %#v, want %q", got, "/data/.hermes/workspace") + } + + skillsCfg, ok := cfg["skills"].(map[string]any) + if !ok { + t.Fatalf("skills config missing or wrong type: %#v", cfg["skills"]) + } + if got := fmt.Sprint(skillsCfg["external_dirs"]); !strings.Contains(got, "/data/.hermes/obol-skills") { + t.Fatalf("skills.external_dirs = %#v, want Obol external skills dir", skillsCfg["external_dirs"]) + } +} + +func TestGenerateValues_UsesHermesNativeNames(t *testing.T) { + values := generateValues( + "hermes-obol-agent", + "hermes-obol-agent.obol.stack", + "obol-agent.obol.stack", + "https://agent.example.com", + "secret-token", + "gpt-5.2", + []byte("model:\n default: gpt-5.2\n"), + ) + + for _, needle := range []string{ + "name: hermes", + "name: hermes-config", + "name: hermes-data", + `API_SERVER_KEY: "secret-token"`, + `value: "https://agent.example.com"`, + "AGENT_NAMESPACE", + `value: "hermes-obol-agent"`, + "OBOL_SKILLS_DIR", + "/data/.hermes/obol-skills", + "containerPort: 8642", + "containerPort: 9119", + "init-hermes-data", + "bootstrap-hermes-install", + `install_dir="/data/.hermes/hermes-agent"`, + 
`repo_url="https://github.com/NousResearch/hermes-agent.git"`, + "uv venv --python python3 --system-site-packages venv", + `uv pip install -e "."`, + `PRAGMA quick_check`, + `state-db-corrupt-$ts`, + `- "/data/.hermes/hermes-agent/venv/bin/hermes"`, + `- "hermes-obol-agent.obol.stack"`, + `- "obol-agent.obol.stack"`, + "name: hermes-dashboard", + "name: GATEWAY_HEALTH_URL", + } { + if !strings.Contains(values, needle) { + t.Fatalf("generateValues() missing %q:\n%s", needle, values) + } + } + + var parsed any + if err := yaml.Unmarshal([]byte(values), &parsed); err != nil { + t.Fatalf("generateValues() produced invalid YAML: %v\n%s", err, values) + } +} + +func TestDashboardHostname_UsesDefaultAgentHostAndHermesUIHostForNamedInstances(t *testing.T) { + tests := []struct { + id string + want string + }{ + { + id: "obol-agent", + want: "obol-agent.obol.stack", + }, + { + id: "research-agent", + want: "hermes-research-agent-ui.obol.stack", + }, + } + + for _, tt := range tests { + t.Run(tt.id, func(t *testing.T) { + if got := dashboardHostname(tt.id); got != tt.want { + t.Fatalf("dashboardHostname(%q) = %q, want %q", tt.id, got, tt.want) + } + }) + } +} + +func TestHermesExecArgs_UsesNativeHermesBinary(t *testing.T) { + got := hermesExecArgs("hermes-obol-agent", []string{"skills", "audit"}, false) + want := []string{ + "exec", "-i", + "-c", "hermes", + "-n", "hermes-obol-agent", + "deploy/hermes", + "--", + "/data/.hermes/hermes-agent/venv/bin/hermes", + "skills", + "audit", + } + + if !reflect.DeepEqual(got, want) { + t.Fatalf("hermesExecArgs() = %#v, want %#v", got, want) + } +} diff --git a/internal/hermes/wallet.go b/internal/hermes/wallet.go new file mode 100644 index 00000000..d212edf6 --- /dev/null +++ b/internal/hermes/wallet.go @@ -0,0 +1,355 @@ +package hermes + +import ( + "crypto/aes" + "crypto/cipher" + "crypto/rand" + "encoding/hex" + "encoding/json" + "fmt" + "math/big" + "os" + "path/filepath" + "strings" + "time" + + 
"github.com/ObolNetwork/obol-stack/internal/agentruntime" + "github.com/ObolNetwork/obol-stack/internal/config" + "github.com/ObolNetwork/obol-stack/internal/ui" + secp256k1 "github.com/decred/dcrd/dcrec/secp256k1/v4" + "github.com/google/uuid" + "golang.org/x/crypto/scrypt" + "golang.org/x/crypto/sha3" +) + +type WalletInfo struct { + Address string `json:"address"` + PublicKey string `json:"publicKey"` + KeystoreUUID string `json:"keystore_uuid"` + KeystorePath string `json:"keystore_path"` + CreatedAt string `json:"createdAt"` + Password string `json:"-"` +} + +type v3Keystore struct { + Address string `json:"address"` + Crypto v3Crypto `json:"crypto"` + ID string `json:"id"` + Version int `json:"version"` +} + +type v3Crypto struct { + Cipher string `json:"cipher"` + CipherText string `json:"ciphertext"` + CipherParams cipherParams `json:"cipherparams"` + KDF string `json:"kdf"` + KDFParams kdfParams `json:"kdfparams"` + MAC string `json:"mac"` +} + +type cipherParams struct { + IV string `json:"iv"` +} + +type kdfParams struct { + DKLen int `json:"dklen"` + N int `json:"n"` + R int `json:"r"` + P int `json:"p"` + Salt string `json:"salt"` +} + +const ( + scryptN = 262144 + scryptR = 8 + scryptP = 1 + scryptDKLen = 32 +) + +func GenerateWallet(cfg *config.Config, id string, u *ui.UI) (*WalletInfo, error) { + privKey, pubKey, err := generateKeypair() + if err != nil { + return nil, fmt.Errorf("key generation failed: %w", err) + } + + address := addressFromPublicKey(pubKey) + + password, err := generateRandomPassword(32) + if err != nil { + return nil, fmt.Errorf("password generation failed: %w", err) + } + + keystoreJSON, keystoreID, err := encryptToV3Keystore(privKey, pubKey, password) + if err != nil { + return nil, fmt.Errorf("keystore encryption failed: %w", err) + } + + keystorePath, err := provisionKeystoreToVolume(cfg, id, keystoreID, keystoreJSON, u) + if err != nil { + return nil, fmt.Errorf("keystore provisioning failed: %w", err) + } + + return 
&WalletInfo{ + Address: address, + PublicKey: "0x04" + hex.EncodeToString(pubKey), + KeystoreUUID: keystoreID, + KeystorePath: keystorePath, + CreatedAt: time.Now().UTC().Format(time.RFC3339), + Password: password, + }, nil +} + +func generateKeypair() (privKeyBytes []byte, pubKeyUncompressed []byte, err error) { + privKey, err := secp256k1.GeneratePrivateKey() + if err != nil { + return nil, nil, fmt.Errorf("secp256k1 key generation: %w", err) + } + + privKeyBytes = privKey.Serialize() + pubKeyUncompressed = privKey.PubKey().SerializeUncompressed()[1:] + return privKeyBytes, pubKeyUncompressed, nil +} + +func addressFromPublicKey(pubKey []byte) string { + h := sha3.NewLegacyKeccak256() + _, _ = h.Write(pubKey) + hash := h.Sum(nil) + return toChecksumAddress(hex.EncodeToString(hash[12:])) +} + +func toChecksumAddress(addr string) string { + addr = strings.ToLower(strings.TrimPrefix(addr, "0x")) + hash := sha3.NewLegacyKeccak256() + _, _ = hash.Write([]byte(addr)) + sum := hex.EncodeToString(hash.Sum(nil)) + + var out strings.Builder + out.WriteString("0x") + for i, c := range addr { + if c >= '0' && c <= '9' { + out.WriteRune(c) + continue + } + if sum[i] >= '8' { + out.WriteRune(rune(strings.ToUpper(string(c))[0])) + } else { + out.WriteRune(c) + } + } + + return out.String() +} + +func generateRandomPassword(length int) (string, error) { + const charset = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789" + charsetLen := big.NewInt(int64(len(charset))) + + result := make([]byte, length) + for i := range result { + n, err := rand.Int(rand.Reader, charsetLen) + if err != nil { + return "", fmt.Errorf("random int: %w", err) + } + result[i] = charset[n.Int64()] + } + return string(result), nil +} + +func provisionKeystoreToVolume(cfg *config.Config, id, keystoreID string, keystoreJSON []byte, u *ui.UI) (string, error) { + dir := agentruntime.KeystoreVolumePath(cfg, agentruntime.Hermes, id) + ensureVolumeWritable(cfg, dir, u) + if err := os.MkdirAll(dir, 
0o700); err != nil { + return "", fmt.Errorf("create keystore directory: %w", err) + } + + path := filepath.Join(dir, keystoreID+".json") + if err := os.WriteFile(path, keystoreJSON, 0o600); err != nil { + return "", fmt.Errorf("write keystore: %w", err) + } + + fixRuntimeVolumeOwnership(cfg, dir, u) + return path, nil +} + +func encryptToV3Keystore(privKey, pubKey []byte, password string) ([]byte, string, error) { + salt := make([]byte, 32) + if _, err := rand.Read(salt); err != nil { + return nil, "", fmt.Errorf("salt generation: %w", err) + } + + iv := make([]byte, aes.BlockSize) + if _, err := rand.Read(iv); err != nil { + return nil, "", fmt.Errorf("iv generation: %w", err) + } + + dk, err := scrypt.Key([]byte(password), salt, scryptN, scryptR, scryptP, scryptDKLen) + if err != nil { + return nil, "", fmt.Errorf("scrypt: %w", err) + } + + block, err := aes.NewCipher(dk[:16]) + if err != nil { + return nil, "", fmt.Errorf("aes cipher: %w", err) + } + + stream := cipher.NewCTR(block, iv) + ciphertext := make([]byte, len(privKey)) + stream.XORKeyStream(ciphertext, privKey) + + macHasher := sha3.NewLegacyKeccak256() + _, _ = macHasher.Write(dk[16:32]) + _, _ = macHasher.Write(ciphertext) + mac := macHasher.Sum(nil) + + keystoreID := uuid.NewString() + keystore := v3Keystore{ + Address: strings.TrimPrefix(addressFromPublicKey(pubKey), "0x"), + Crypto: v3Crypto{ + Cipher: "aes-128-ctr", + CipherText: hex.EncodeToString(ciphertext), + CipherParams: cipherParams{ + IV: hex.EncodeToString(iv), + }, + KDF: "scrypt", + KDFParams: kdfParams{ + DKLen: scryptDKLen, + N: scryptN, + R: scryptR, + P: scryptP, + Salt: hex.EncodeToString(salt), + }, + MAC: hex.EncodeToString(mac), + }, + ID: keystoreID, + Version: 3, + } + + raw, err := json.MarshalIndent(keystore, "", " ") + if err != nil { + return nil, "", fmt.Errorf("marshal keystore: %w", err) + } + return raw, keystoreID, nil +} + +func generateRemoteSignerValues(wallet *WalletInfo) string { + return fmt.Sprintf(`# 
Remote-signer configuration +# Managed by obol hermes — do not edit manually. + +keystorePassword: + value: %q + +persistence: + enabled: true + size: 100Mi + +podSecurityContext: + fsGroup: 1000 +`, wallet.Password) +} + +func walletMetadataPath(deploymentDir string) string { + return filepath.Join(deploymentDir, "wallet.json") +} + +func WriteWalletMetadata(deploymentDir string, wallet *WalletInfo) error { + data, err := json.MarshalIndent(wallet, "", " ") + if err != nil { + return fmt.Errorf("marshal wallet metadata: %w", err) + } + return os.WriteFile(walletMetadataPath(deploymentDir), data, 0o600) +} + +func ReadWalletMetadata(deploymentDir string) (*WalletInfo, error) { + data, err := os.ReadFile(walletMetadataPath(deploymentDir)) + if err != nil { + return nil, err + } + + var wallet WalletInfo + if err := json.Unmarshal(data, &wallet); err != nil { + return nil, fmt.Errorf("unmarshal wallet metadata: %w", err) + } + return &wallet, nil +} + +func ResolveWalletAddress(cfg *config.Config) (string, error) { + ids, err := agentruntime.ListInstanceIDs(cfg, agentruntime.Hermes) + if err != nil { + return "", err + } + + switch len(ids) { + case 0: + return "", fmt.Errorf("no Hermes instances found — run 'obol hermes onboard' first, or use --wallet") + case 1: + wallet, err := ReadWalletMetadata(DeploymentPath(cfg, ids[0])) + if err != nil { + return "", fmt.Errorf("wallet not found for instance %q: %w (use --wallet to specify manually)", ids[0], err) + } + return wallet.Address, nil + default: + var addrs []string + for _, id := range ids { + w, err := ReadWalletMetadata(DeploymentPath(cfg, id)) + if err != nil { + continue + } + addrs = append(addrs, fmt.Sprintf(" %s (instance: %s)", w.Address, id)) + } + return "", fmt.Errorf("multiple Hermes instances found, use --wallet to specify:\n%s", strings.Join(addrs, "\n")) + } +} + +func ResolveInstanceNamespace(cfg *config.Config) (string, error) { + ids, err := agentruntime.ListInstanceIDs(cfg, agentruntime.Hermes) 
+ if err != nil { + return "", err + } + + switch len(ids) { + case 0: + return "", fmt.Errorf("no Hermes instances found — run 'obol hermes onboard' first") + case 1: + return agentruntime.Namespace(agentruntime.Hermes, ids[0]), nil + default: + return "", fmt.Errorf("multiple Hermes instances found (%s), specify an instance", strings.Join(ids, ", ")) + } +} + +func ListWallets(cfg *config.Config, id string, u *ui.UI) error { + var ids []string + if id != "" { + ids = []string{id} + } else { + var err error + ids, err = agentruntime.ListInstanceIDs(cfg, agentruntime.Hermes) + if err != nil { + return err + } + if len(ids) == 0 { + u.Info("No Hermes instances found") + return nil + } + } + + found := false + for _, instanceID := range ids { + wallet, err := ReadWalletMetadata(DeploymentPath(cfg, instanceID)) + if err != nil { + continue + } + found = true + u.Detail("Instance", instanceID) + u.Detail(" Address", wallet.Address) + u.Detail(" Keystore UUID", wallet.KeystoreUUID) + if wallet.CreatedAt != "" { + u.Detail(" Created", wallet.CreatedAt) + } + u.Blank() + } + + if !found { + u.Info("No wallets found") + } + return nil +} diff --git a/internal/kubectl/kubectl.go b/internal/kubectl/kubectl.go index 3bc17deb..db77d71c 100644 --- a/internal/kubectl/kubectl.go +++ b/internal/kubectl/kubectl.go @@ -193,6 +193,36 @@ func Apply(binary, kubeconfig string, data []byte) error { return err } +// ApplyServerSideForceConflicts pipes the given data into kubectl apply +// --server-side --force-conflicts -f -. Use it only for narrow compatibility +// migrations where the caller has already decided which manager must own the +// restored fields. 
+func ApplyServerSideForceConflicts(binary, kubeconfig string, data []byte, fieldManager string) error { + args := []string{"apply", "--server-side", "--force-conflicts", "-f", "-"} + if strings.TrimSpace(fieldManager) != "" { + args = []string{"apply", "--server-side", "--force-conflicts", "--field-manager=" + fieldManager, "-f", "-"} + } + + cmd := exec.Command(binary, args...) + cmd.Env = append(os.Environ(), "KUBECONFIG="+kubeconfig) + cmd.Stdin = bytes.NewReader(data) + + var stderr bytes.Buffer + cmd.Stderr = &stderr + cmd.Stdout = os.Stdout + + if err := cmd.Run(); err != nil { + errMsg := strings.TrimSpace(stderr.String()) + if errMsg != "" { + return wrapClusterDown(fmt.Errorf("kubectl apply --server-side: %w: %s", err, errMsg), errMsg) + } + + return wrapClusterDown(fmt.Errorf("kubectl apply --server-side: %w", err), "") + } + + return nil +} + // ApplyOutput pipes the given data into kubectl apply -f - and returns stdout. func ApplyOutput(binary, kubeconfig string, data []byte) (string, error) { cmd := exec.Command(binary, "apply", "-f", "-") diff --git a/internal/model/model.go b/internal/model/model.go index 1203eb56..6c59ecd5 100644 --- a/internal/model/model.go +++ b/internal/model/model.go @@ -613,7 +613,7 @@ func RemoveModel(cfg *config.Config, u *ui.UI, modelName string) error { if err := kubectl.Run(kubectlBinary, kubeconfigPath, "patch", "configmap", configMapName, "-n", namespace, - "-p", patchJSON, "--type=merge"); err != nil { + "-p", patchJSON, "--type=merge", "--field-manager=helm"); err != nil { return fmt.Errorf("failed to patch ConfigMap: %w", err) } @@ -1175,7 +1175,7 @@ func patchLiteLLMConfig(kubectlBinary, kubeconfigPath string, entries []ModelEnt return kubectl.Run(kubectlBinary, kubeconfigPath, "patch", "configmap", configMapName, "-n", namespace, - "-p", patchJSON, "--type=merge") + "-p", patchJSON, "--type=merge", "--field-manager=helm") } // detectProvider infers the provider name from a model_list entry. 
diff --git a/internal/openclaw/openclaw.go b/internal/openclaw/openclaw.go index 2c0922b6..46639ed6 100644 --- a/internal/openclaw/openclaw.go +++ b/internal/openclaw/openclaw.go @@ -20,6 +20,7 @@ import ( "syscall" "time" + "github.com/ObolNetwork/obol-stack/internal/agentruntime" "github.com/ObolNetwork/obol-stack/internal/config" "github.com/ObolNetwork/obol-stack/internal/dns" obolembed "github.com/ObolNetwork/obol-stack/internal/embed" @@ -256,7 +257,10 @@ func Onboard(cfg *config.Config, opts OnboardOptions, u *ui.UI) error { // Ensure /etc/hosts has an entry for this subdomain. // macOS Sequoia's /etc/resolver/ doesn't reliably forward subdomain queries. - if err := dns.EnsureHostsEntries(collectAllHostnames(cfg, hostname)); err != nil { + if err := dns.EnsureHostsEntries(agentruntime.CollectHostnames(cfg, agentruntime.DeploymentRef{ + Runtime: agentruntime.OpenClaw, + ID: id, + })); err != nil { u.Warnf("Could not update /etc/hosts for %s: %v", hostname, err) } @@ -2832,31 +2836,3 @@ releases: - values-remote-signer.yaml `, id, namespace, chartVersion, namespace, remoteSignerChartVersion) } - -// collectAllHostnames gathers all openclaw subdomain hostnames that should be -// in /etc/hosts. Scans existing deployments and includes the new hostname. 
-func collectAllHostnames(cfg *config.Config, newHostname string) []string { - hostnames := []string{newHostname} - appsDir := filepath.Join(cfg.ConfigDir, "applications", appName) - - entries, err := os.ReadDir(appsDir) - if err != nil { - return hostnames - } - - seen := map[string]bool{newHostname: true} - - for _, e := range entries { - if !e.IsDir() { - continue - } - - h := fmt.Sprintf("openclaw-%s.%s", e.Name(), defaultDomain) - if !seen[h] { - hostnames = append(hostnames, h) - seen[h] = true - } - } - - return hostnames -} diff --git a/internal/openclaw/resolve.go b/internal/openclaw/resolve.go index 881ec27b..b1d66b54 100644 --- a/internal/openclaw/resolve.go +++ b/internal/openclaw/resolve.go @@ -50,7 +50,7 @@ func ResolveInstance(cfg *config.Config, args []string) (id string, remaining [] switch len(instances) { case 0: - return "", nil, errors.New("no OpenClaw instances found — run 'obol agent init' to create one") + return "", nil, errors.New("no OpenClaw instances found — run 'obol openclaw onboard' to create one") case 1: return instances[0], args, nil default: diff --git a/internal/openclaw/resolve_test.go b/internal/openclaw/resolve_test.go index d87614a4..a1e18eba 100644 --- a/internal/openclaw/resolve_test.go +++ b/internal/openclaw/resolve_test.go @@ -74,7 +74,7 @@ func TestResolveInstance(t *testing.T) { t.Fatal("expected error for zero instances") } - if got := err.Error(); got != "no OpenClaw instances found — run 'obol agent init' to create one" { + if got := err.Error(); got != "no OpenClaw instances found — run 'obol openclaw onboard' to create one" { t.Fatalf("unexpected error: %s", got) } }) diff --git a/internal/openclaw/wallet_resolve.go b/internal/openclaw/wallet_resolve.go index 69d19b2e..c3c26918 100644 --- a/internal/openclaw/wallet_resolve.go +++ b/internal/openclaw/wallet_resolve.go @@ -21,7 +21,7 @@ func ResolveWalletAddress(cfg *config.Config) (string, error) { switch len(ids) { case 0: - return "", fmt.Errorf("no OpenClaw 
instances found — run 'obol agent init' first, or use --wallet") + return "", fmt.Errorf("no OpenClaw instances found — run 'obol openclaw onboard' first, or use --wallet") case 1: wallet, err := ReadWalletMetadata(DeploymentPath(cfg, ids[0])) if err != nil { @@ -51,7 +51,7 @@ func ResolveInstanceNamespace(cfg *config.Config) (string, error) { switch len(ids) { case 0: - return "", fmt.Errorf("no OpenClaw instances found — run 'obol agent init' first") + return "", fmt.Errorf("no OpenClaw instances found — run 'obol openclaw onboard' first") case 1: return instanceNamespace(ids[0]), nil default: diff --git a/internal/serviceoffercontroller/purchase_helpers.go b/internal/serviceoffercontroller/purchase_helpers.go index 33526acb..c2af4973 100644 --- a/internal/serviceoffercontroller/purchase_helpers.go +++ b/internal/serviceoffercontroller/purchase_helpers.go @@ -26,6 +26,11 @@ const ( buyerAuthsCM = "x402-buyer-auths" litellmSecret = "litellm-secrets" litellmMasterKey = "LITELLM_MASTER_KEY" + + // Helm owns litellm-config.data["config.yaml"] via server-side apply. + // Runtime writers use the same field manager so stack upgrades do not + // create managedFields conflicts on the shared persistence key. + litellmConfigFieldManager = "helm" ) // litellmBaseURL returns the LiteLLM HTTP base URL. 
In production it resolves @@ -317,7 +322,7 @@ func (c *Controller) addLiteLLMModelEntry(ctx context.Context, ns, modelName str } cm.Data["config.yaml"] = string(rendered) - if _, err := c.kubeClient.CoreV1().ConfigMaps(ns).Update(ctx, cm, metav1.UpdateOptions{}); err != nil { + if _, err := c.kubeClient.CoreV1().ConfigMaps(ns).Update(ctx, cm, metav1.UpdateOptions{FieldManager: litellmConfigFieldManager}); err != nil { log.Printf("purchase: failed to update litellm-config: %v", err) return } @@ -392,7 +397,7 @@ func (c *Controller) removeLiteLLMModelEntry(ctx context.Context, ns, modelName return } cm.Data["config.yaml"] = string(rendered) - if _, err := c.kubeClient.CoreV1().ConfigMaps(ns).Update(ctx, cm, metav1.UpdateOptions{}); err != nil { + if _, err := c.kubeClient.CoreV1().ConfigMaps(ns).Update(ctx, cm, metav1.UpdateOptions{FieldManager: litellmConfigFieldManager}); err != nil { log.Printf("purchase: remove model: failed to update litellm-config: %v", err) return } diff --git a/internal/serviceoffercontroller/purchase_helpers_test.go b/internal/serviceoffercontroller/purchase_helpers_test.go index 6366da0c..21de7deb 100644 --- a/internal/serviceoffercontroller/purchase_helpers_test.go +++ b/internal/serviceoffercontroller/purchase_helpers_test.go @@ -114,6 +114,9 @@ func TestAddLiteLLMModelEntryUpdatesConfigMapAndHotAdds(t *testing.T) { if entry.LiteLLMParams.Model != "openai/paid/qwen3.5:9b" { t.Fatalf("litellm_params.model = %q", entry.LiteLLMParams.Model) } + if entry.LiteLLMParams.APIBase != "http://127.0.0.1:8402/v1" { + t.Fatalf("litellm_params.api_base = %q", entry.LiteLLMParams.APIBase) + } if got := fakeAPI.addCalls.Load(); got != 1 { t.Fatalf("expected exactly 1 call to /model/new, got %d", got) @@ -170,6 +173,43 @@ func TestAddLiteLLMModelEntryHandlesMissingConfigMap(t *testing.T) { c.addLiteLLMModelEntry(context.Background(), "llm", "paid/test-model") } +func TestNormalizePurchasedUpstreamURL(t *testing.T) { + tests := []struct { + name string + 
endpoint string + want string + }{ + { + name: "openai v1 chat completions", + endpoint: "https://seller.example/services/inference/v1/chat/completions", + want: "https://seller.example/services/inference", + }, + { + name: "bare chat completions", + endpoint: "https://seller.example/services/inference/chat/completions", + want: "https://seller.example/services/inference", + }, + { + name: "trailing slash", + endpoint: "https://seller.example/services/inference/v1/chat/completions/", + want: "https://seller.example/services/inference", + }, + { + name: "already base", + endpoint: "https://seller.example/services/inference", + want: "https://seller.example/services/inference", + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + if got := normalizePurchasedUpstreamURL(tt.endpoint); got != tt.want { + t.Fatalf("normalizePurchasedUpstreamURL(%q) = %q, want %q", tt.endpoint, got, tt.want) + } + }) + } +} + func TestRemoveLiteLLMModelEntryUpdatesConfigMapAndHotDeletes(t *testing.T) { c, fakeAPI := newTestControllerWithLiteLLM("llm") defer fakeAPI.close() diff --git a/internal/stack/backend_k3d.go b/internal/stack/backend_k3d.go index d66d2bf5..f530f0e3 100644 --- a/internal/stack/backend_k3d.go +++ b/internal/stack/backend_k3d.go @@ -4,10 +4,12 @@ import ( "crypto/tls" "errors" "fmt" + "net" "net/http" "os" "os/exec" "path/filepath" + "strconv" "strings" "time" @@ -67,8 +69,8 @@ func (b *K3dBackend) Init(cfg *config.Config, u *ui.UI, stackID string) error { k3dConfig = strings.ReplaceAll(k3dConfig, "{{DATA_DIR}}", absDataDir) k3dConfig = strings.ReplaceAll(k3dConfig, "{{CONFIG_DIR}}", absConfigDir) - // Strip port mappings for occupied host ports so k3d cluster create won't - // fail. The fallback mappings (8080→80, 8443→443) are always preserved. + // Rewrite occupied host-port mappings so k3d cluster create won't fail + // when another local stack is already bound to the default ingress ports. 
k3dConfig = stripConflictingPorts(k3dConfig, u) k3dConfigPath := filepath.Join(cfg.ConfigDir, k3dConfigFile) @@ -260,37 +262,128 @@ func portBlock(host, container int) string { return fmt.Sprintf(" - port: %d:%d\n nodeFilters:\n - loadbalancer\n", host, container) } -// stripConflictingPorts removes the identity port mappings (80:80, 443:443) -// from a k3d config string when those host ports are already in use. The -// fallback mappings (8080→80, 8443→443) are always preserved so Traefik -// remains reachable on an alternative port. +// stripConflictingPorts removes occupied default k3d ingress mappings. If all +// default host ports for a container port are occupied, it adds an ephemeral +// host-port mapping so multiple dev stacks can coexist on the same machine. func stripConflictingPorts(k3dConfig string, u *ui.UI) string { - type mapping struct { - hostPort int - containerPort int - fallbackPort int - } + return rewriteConflictingPorts(k3dConfig, u, hostPortAvailable, pickAvailableHostPort) +} - // Only strip the identity mappings; the high-port fallbacks are kept. 
- candidates := []mapping{ - {80, 80, 8080}, - {443, 443, 8443}, +func rewriteConflictingPorts( + k3dConfig string, + u *ui.UI, + available func(int) bool, + pickPort func() (int, error), +) string { + hasMapping := map[int]bool{} + + for _, c := range parseK3dPortMappings(k3dConfig) { + if c.containerPort != 80 && c.containerPort != 443 { + continue + } + + block := portBlock(c.hostPort, c.containerPort) + if !strings.Contains(k3dConfig, block) { + continue + } + if available(c.hostPort) { + hasMapping[c.containerPort] = true + continue + } + + k3dConfig = strings.Replace(k3dConfig, block, "", 1) + if fallbackPort := fallbackForDefaultPort(c); fallbackPort > 0 { + u.Warnf("Port %d is in use — removed %d:%d mapping (use port %d instead if available)", + c.hostPort, c.hostPort, c.containerPort, fallbackPort) + } else { + u.Warnf("Port %d is in use — removed %d:%d mapping", c.hostPort, c.hostPort, c.containerPort) + } } - for _, c := range candidates { - if checkPortsAvailable([]int{c.hostPort}) != nil { - block := portBlock(c.hostPort, c.containerPort) - if strings.Contains(k3dConfig, block) { - k3dConfig = strings.Replace(k3dConfig, block, "", 1) - u.Warnf("Port %d is in use — removed %d:%d mapping (use port %d instead)", - c.hostPort, c.hostPort, c.containerPort, c.fallbackPort) - } + for _, containerPort := range []int{80, 443} { + if hasMapping[containerPort] { + continue } + + hostPort, err := pickPort() + if err != nil { + u.Warnf("No default host port is available for container port %d and no ephemeral port could be selected: %v", containerPort, err) + continue + } + k3dConfig = insertK3dPortMapping(k3dConfig, portBlock(hostPort, containerPort)) + u.Warnf("All default host ports for container port %d are in use — using %d:%d instead", containerPort, hostPort, containerPort) } return k3dConfig } +type k3dPortMapping struct { + hostPort int + containerPort int +} + +func parseK3dPortMappings(k3dConfig string) []k3dPortMapping { + var mappings []k3dPortMapping + 
for _, line := range strings.Split(k3dConfig, "\n") { + line = strings.TrimSpace(line) + if !strings.HasPrefix(line, "- port:") { + continue + } + + portSpec := strings.TrimSpace(strings.TrimPrefix(line, "- port:")) + parts := strings.Split(portSpec, ":") + if len(parts) != 2 { + continue + } + hostPort, hostErr := strconv.Atoi(parts[0]) + containerPort, containerErr := strconv.Atoi(parts[1]) + if hostErr != nil || containerErr != nil { + continue + } + mappings = append(mappings, k3dPortMapping{hostPort: hostPort, containerPort: containerPort}) + } + return mappings +} + +func fallbackForDefaultPort(mapping k3dPortMapping) int { + switch mapping { + case k3dPortMapping{hostPort: 80, containerPort: 80}: + return 8080 + case k3dPortMapping{hostPort: 443, containerPort: 443}: + return 8443 + default: + return 0 + } +} + +func hostPortAvailable(port int) bool { + return checkPortsAvailable([]int{port}) == nil +} + +func pickAvailableHostPort() (int, error) { + ln, err := net.Listen("tcp", "127.0.0.1:0") + if err != nil { + return 0, err + } + defer ln.Close() + + addr, ok := ln.Addr().(*net.TCPAddr) + if !ok || addr.Port == 0 { + return 0, fmt.Errorf("unexpected listener address %q", ln.Addr().String()) + } + return addr.Port, nil +} + +func insertK3dPortMapping(k3dConfig, block string) string { + if strings.Contains(k3dConfig, "options:\n") { + return strings.Replace(k3dConfig, "options:\n", block+"options:\n", 1) + } + if strings.Contains(k3dConfig, "ports:\n") { + return k3dConfig + block + } + return k3dConfig + "\nports:\n" + block +} + // ensureK3dPortsAvailable re-reads the k3d config file, strips any port // mappings that have become conflicting since the config was written, and // persists the result. 
This handles the case where port state changed between diff --git a/internal/stack/stack.go b/internal/stack/stack.go index 7dc0063a..6cfb367b 100644 --- a/internal/stack/stack.go +++ b/internal/stack/stack.go @@ -13,9 +13,12 @@ import ( "strings" "github.com/ObolNetwork/obol-stack/internal/agent" + "github.com/ObolNetwork/obol-stack/internal/agentruntime" "github.com/ObolNetwork/obol-stack/internal/config" + stackdefaults "github.com/ObolNetwork/obol-stack/internal/defaults" "github.com/ObolNetwork/obol-stack/internal/dns" - "github.com/ObolNetwork/obol-stack/internal/embed" + "github.com/ObolNetwork/obol-stack/internal/hermes" + "github.com/ObolNetwork/obol-stack/internal/kubectl" "github.com/ObolNetwork/obol-stack/internal/model" "github.com/ObolNetwork/obol-stack/internal/openclaw" "github.com/ObolNetwork/obol-stack/internal/tunnel" @@ -23,6 +26,7 @@ import ( "github.com/ObolNetwork/obol-stack/internal/update" x402verifier "github.com/ObolNetwork/obol-stack/internal/x402" petname "github.com/dustinkirkland/golang-petname" + "gopkg.in/yaml.v3" ) const ( @@ -90,25 +94,8 @@ func Init(cfg *config.Config, u *ui.UI, force bool, backendName string) error { return err } - // Copy embedded defaults (helmfile + charts for infrastructure) - // Resolve {{OLLAMA_HOST}} based on backend: - // - k3d (Docker): host.docker.internal (macOS) or host.k3d.internal (Linux) - // - k3s (bare-metal): 127.0.0.1 (k3s runs directly on the host) - // Resolve {{OLLAMA_HOST_IP}} to a numeric IP for the Endpoints object: - // - Endpoints require an IP, not a hostname (ClusterIP+Endpoints pattern) - ollamaHost := ollamaHostForBackend(backendName) - - ollamaHostIP, err := ollamaHostIPForBackend(backendName) - if err != nil { - return fmt.Errorf("failed to resolve Ollama host IP: %w", err) - } - - defaultsDir := filepath.Join(cfg.ConfigDir, "defaults") - if err := embed.CopyDefaults(defaultsDir, map[string]string{ - "{{OLLAMA_HOST}}": ollamaHost, - "{{OLLAMA_HOST_IP}}": ollamaHostIP, - 
"{{CLUSTER_ID}}": stackID, - }); err != nil { + // Copy embedded defaults (helmfile + charts for infrastructure). + if err := stackdefaults.CopyInfrastructure(cfg, backendName, stackID); err != nil { return fmt.Errorf("failed to copy defaults: %w", err) } @@ -172,123 +159,24 @@ func cleanupStaleBackendConfigs(cfg *config.Config, oldBackend string) { } } -// ollamaHostForBackend returns the hostname/IP that reaches the host Ollama -// instance from inside the cluster. func ollamaHostForBackend(backendName string) string { - if backendName == BackendK3s { - return "127.0.0.1" - } - - if runtime.GOOS == "darwin" { - return "host.docker.internal" - } - - return "host.k3d.internal" + return stackdefaults.OllamaHostForBackend(backendName) } -// ollamaHostIPForBackend resolves the Ollama host to an IP address. -// ClusterIP+Endpoints requires an IP (not a hostname). -// -// Resolution strategy: -// 1. If already an IP (k3s: 127.0.0.1), return as-is -// 2. Try host-side DNS resolution -// 3. macOS: use Docker Desktop VM gateway (192.168.65.254) -// 4. Linux: fall back to docker0 bridge interface IP func ollamaHostIPForBackend(backendName string) (string, error) { - host := ollamaHostForBackend(backendName) - - // If already an IP, return as-is (k3s: 127.0.0.1) - if net.ParseIP(host) != nil { - return host, nil - } - - // Try host-side DNS resolution first. - addrs, err := net.LookupHost(host) - if err == nil && len(addrs) > 0 { - return addrs[0], nil - } - - // macOS Docker Desktop: host.docker.internal is only resolvable inside - // containers (Docker injects it via DNS), not on the host. Use the - // well-known VM gateway IP that Docker Desktop exposes to containers. - if runtime.GOOS == "darwin" && backendName == BackendK3d { - return dockerDesktopGatewayIP(), nil - } - - // Linux fallback: docker0 bridge interface IP (reachable from all containers). 
- if runtime.GOOS == "linux" && backendName == BackendK3d { - ip, bridgeErr := dockerBridgeGatewayIP() - if bridgeErr == nil { - return ip, nil - } - - return "", fmt.Errorf("cannot resolve Ollama host %q to IP: %w; docker0 fallback also failed: %w", host, err, bridgeErr) - } - - return "", fmt.Errorf("cannot resolve Ollama host %q to IP: %w\n\tEnsure Docker Desktop is running", host, err) + return stackdefaults.OllamaHostIPForBackend(backendName) } -// dockerDesktopGatewayIP returns the Docker Desktop VM gateway IP. -// On macOS, Docker Desktop runs a LinuxKit VM. The host is reachable from -// containers at this well-known gateway address (192.168.65.254 maps to -// host.docker.internal inside the VM). This has been stable across Docker -// Desktop versions since the transition from HyperKit to Apple Virtualization. func dockerDesktopGatewayIP() string { - return "192.168.65.254" + return stackdefaults.DockerDesktopGatewayIP() } -// dockerBridgeGatewayIP returns the IPv4 address of an active Docker bridge -// interface. It prefers docker0 (the default bridge, typically 172.17.0.1). -// If docker0 is present but DOWN (e.g. when only k3d's custom bridge network -// is active), it falls back to the first UP interface whose name starts with -// "br-" — which is how Docker names per-network bridge interfaces. func dockerBridgeGatewayIP() (string, error) { - if ip, err := bridgeInterfaceIP("docker0"); err == nil { - return ip, nil - } - - // docker0 missing or DOWN — scan for an active br- bridge. 
- ifaces, err := net.Interfaces() - if err != nil { - return "", fmt.Errorf("cannot list network interfaces: %w", err) - } - - for _, iface := range ifaces { - if !strings.HasPrefix(iface.Name, "br-") { - continue - } - if ip, err := bridgeInterfaceIP(iface.Name); err == nil { - return ip, nil - } - } - - return "", errors.New("no active Docker bridge interface found (docker0 or br-*)") + return stackdefaults.DockerBridgeGatewayIP() } -// bridgeInterfaceIP returns the IPv4 address of a named network interface, -// or an error if the interface does not exist, is DOWN, or has no IPv4 address. func bridgeInterfaceIP(name string) (string, error) { - iface, err := net.InterfaceByName(name) - if err != nil { - return "", fmt.Errorf("interface %s not found: %w", name, err) - } - - if iface.Flags&net.FlagUp == 0 { - return "", fmt.Errorf("interface %s is down", name) - } - - addrs, err := iface.Addrs() - if err != nil { - return "", fmt.Errorf("cannot get addresses for %s: %w", name, err) - } - - for _, addr := range addrs { - if ipNet, ok := addr.(*net.IPNet); ok && ipNet.IP.To4() != nil { - return ipNet.IP.String(), nil - } - } - - return "", fmt.Errorf("no IPv4 address found on interface %s", name) + return stackdefaults.BridgeInterfaceIP(name) } // Up starts the cluster using the configured backend @@ -307,8 +195,6 @@ func Up(cfg *config.Config, u *ui.UI, wildcardDNS bool) error { u.Infof("Starting stack (id: %s, backend: %s)", stackID, backend.Name()) - portsBlocked := checkPortsAvailable([]int{80, 443}) != nil - kubeconfigData, err := backend.Up(cfg, u, stackID) if err != nil { return err @@ -319,17 +205,25 @@ func Up(cfg *config.Config, u *ui.UI, wildcardDNS bool) error { return fmt.Errorf("failed to write kubeconfig: %w", err) } + if refreshed, err := stackdefaults.RefreshInfrastructureIfChanged(cfg, backend.Name(), stackID); err != nil { + return fmt.Errorf("failed to refresh default infrastructure templates: %w", err) + } else if refreshed { + u.Dim("Refreshed 
default infrastructure templates from embedded assets") + } + + // Ensure the base host before syncing defaults. Include existing agent + // hostnames so stack up never shrinks the managed /etc/hosts block to only + // obol.stack when default setup is skipped. + if err := dns.EnsureHostsEntries(agentruntime.CollectHostnames(cfg)); err != nil { + u.Warnf("Could not update /etc/hosts for obol.stack: %v", err) + } + // Sync defaults with backend-aware dataDir dataDir := backend.DataDir(cfg) if err := syncDefaults(cfg, u, kubeconfigPath, dataDir); err != nil { return err } - // Ensure obol.stack resolves to localhost via /etc/hosts (works everywhere). - if err := dns.EnsureHostsEntries(nil); err != nil { - u.Warnf("Could not update /etc/hosts for obol.stack: %v", err) - } - // Wildcard *.obol.stack DNS is opt-in (--wildcard-dns) because it // modifies system DNS config (NetworkManager/resolv.conf on Linux, // /etc/resolver on macOS) which can break host DNS resolution. @@ -345,12 +239,11 @@ func Up(cfg *config.Config, u *ui.UI, wildcardDNS bool) error { u.Blank() u.Bold("Stack started successfully.") - if portsBlocked { - u.Warnf("Ports 80/443 are in use by another process — use http://obol.stack:8080 instead") - u.Print("Visit http://obol.stack:8080 in your browser to get started.") - } else { - u.Print("Visit http://obol.stack in your browser to get started.") + ingressURL := LocalIngressURL(cfg) + if ingressURL != "http://obol.stack" { + u.Warnf("Default ingress ports are in use by another process — use %s instead", ingressURL) } + u.Printf("Visit %s in your browser to get started.", ingressURL) update.HintIfStale(cfg) return nil @@ -457,6 +350,15 @@ func syncDefaults(cfg *config.Config, u *ui.UI, kubeconfigPath string, dataDir s defaultsHelmfilePath := filepath.Join(cfg.ConfigDir, "defaults") helmfilePath := filepath.Join(defaultsHelmfilePath, "helmfile.yaml") + if err := migrateBaseHelmOwnership(cfg, kubeconfigPath); err != nil { + u.Warnf("Failed to migrate 
existing base resources into Helm ownership: %v", err) + } + + previousLiteLLMConfig, err := preserveLiteLLMConfigForHelm(cfg, kubeconfigPath) + if err != nil { + u.Warnf("Failed to preserve LiteLLM config across Helm sync: %v", err) + } + // Compatibility migration if err := migrateDefaultsHTTPRouteHostnames(helmfilePath); err != nil { u.Warnf("Failed to migrate defaults helmfile hostnames: %v", err) @@ -486,6 +388,12 @@ func syncDefaults(cfg *config.Config, u *ui.UI, kubeconfigPath string, dataDir s }); err != nil { u.Warn("Helmfile sync failed, stopping cluster") + if previousLiteLLMConfig != "" { + if restoreErr := restoreLiteLLMConfig(cfg, kubeconfigPath, previousLiteLLMConfig); restoreErr != nil { + u.Warnf("Failed to restore LiteLLM config after Helmfile error: %v", restoreErr) + } + } + if downErr := Down(cfg, u); downErr != nil { u.Warnf("Failed to stop cluster during cleanup: %v", downErr) } @@ -495,6 +403,12 @@ func syncDefaults(cfg *config.Config, u *ui.UI, kubeconfigPath string, dataDir s u.Success("Default infrastructure deployed") + if previousLiteLLMConfig != "" { + if err := restoreLiteLLMConfig(cfg, kubeconfigPath, previousLiteLLMConfig); err != nil { + u.Warnf("Failed to restore LiteLLM config after base migration: %v", err) + } + } + // Populate the x402-verifier CA bundle from the host so TLS verification of // the facilitator works without needing to run `obol sell pricing` first. // Non-fatal: best-effort, the user can repopulate by running `obol sell pricing`. @@ -506,22 +420,22 @@ func syncDefaults(cfg *config.Config, u *ui.UI, kubeconfigPath string, dataDir s // step required. Non-fatal: the user can always run `obol model setup` later. autoConfigureLLM(cfg, u) - // Deploy default OpenClaw instance (non-fatal on failure). + // Deploy default Hermes instance (non-fatal on failure). // Not wrapped in RunWithSpinner because SetupDefault/Onboard produce their // own UI output (Info, Detail, Print) and run sub-spinners via u.Exec. 
// An outer spinner would fight with that output and block any sudo password // prompt (e.g. EnsureHostsEntries writing /etc/hosts). u.Blank() - u.Info("Setting up default OpenClaw instance") + u.Info("Setting up default Hermes instance") - if err := openclaw.SetupDefault(cfg, u); err != nil { - u.Warnf("Failed to set up default OpenClaw: %v", err) - u.Dim(" You can manually set up OpenClaw later with: obol openclaw onboard") - } else if walletAddr, walletErr := openclaw.ResolveWalletAddress(cfg); walletErr == nil { + if err := hermes.SetupDefault(cfg, u); err != nil { + u.Warnf("Failed to set up default Hermes: %v", err) + u.Dim(" You can manually set up Hermes later with: obol hermes onboard") + } else if walletAddr, walletErr := hermes.ResolveWalletAddress(cfg); walletErr == nil { u.Blank() u.Successf("Default agent wallet: %s", walletAddr) u.Dim(" Fund this wallet for x402 buying or direct on-chain registration.") - u.Dim(" Retrieve later with: obol openclaw wallet address obol-agent") + u.Dim(" Retrieve later with: obol hermes wallet list obol-agent") } // Apply agent capabilities (RBAC + heartbeat) to the default instance. @@ -679,11 +593,12 @@ func autoDetectCloudProvider(cfg *config.Config, u *ui.UI) string { // localImage describes a Docker image built from source in this repo. type localImage struct { tag string // e.g. "ghcr.io/obolnetwork/x402-verifier:latest" - dockerfile string // relative to project root, e.g. "Dockerfile.x402-verifier" + dockerfile string // relative to project root or absolute path + contextDir string // relative to project root or absolute path (empty = project root) } // localImages lists images that should be built locally and imported into k3d. 
-var localImages = []localImage{ +var baseLocalImages = []localImage{ {tag: "ghcr.io/obolnetwork/x402-verifier:latest", dockerfile: "Dockerfile.x402-verifier"}, {tag: "ghcr.io/obolnetwork/serviceoffer-controller:latest", dockerfile: "Dockerfile.serviceoffer-controller"}, {tag: "ghcr.io/obolnetwork/x402-buyer:latest", dockerfile: "Dockerfile.x402-buyer"}, @@ -697,6 +612,41 @@ func devPreloadImages() []string { return images } +func hermesSourceDir(projectRoot string) string { + if override := strings.TrimSpace(os.Getenv("OBOL_HERMES_SOURCE_DIR")); override != "" { + return override + } + + candidates := []string{ + filepath.Join(filepath.Dir(projectRoot), "hermes-agent"), + filepath.Join(os.Getenv("HOME"), "Development", "R&D", "hermes-agent"), + } + + for _, candidate := range candidates { + if candidate == "" { + continue + } + if _, err := os.Stat(filepath.Join(candidate, "Dockerfile")); err == nil { + return candidate + } + } + + return "" +} + +func devLocalImages(projectRoot string) []localImage { + images := append([]localImage(nil), baseLocalImages...) + if hermesDir := hermesSourceDir(projectRoot); hermesDir != "" { + images = append(images, localImage{ + tag: "nousresearch/hermes-agent:latest", + dockerfile: filepath.Join(hermesDir, "Dockerfile"), + contextDir: hermesDir, + }) + } + + return images +} + // buildAndImportLocalImages builds Docker images from source and imports them // into the k3d cluster. This ensures images are available even when the GHCR // publish workflow hasn't run. Non-fatal: logs warnings on failure. 
@@ -716,10 +666,20 @@ func buildAndImportLocalImages(cfg *config.Config) { clusterName := "obol-stack-" + stackID k3dBinary := filepath.Join(cfg.BinDir, "k3d") - for _, img := range localImages { + for _, img := range devLocalImages(projectRoot) { contextDir := projectRoot + if img.contextDir != "" { + if filepath.IsAbs(img.contextDir) { + contextDir = img.contextDir + } else { + contextDir = filepath.Join(projectRoot, img.contextDir) + } + } - dockerfilePath := filepath.Join(projectRoot, img.dockerfile) + dockerfilePath := img.dockerfile + if !filepath.IsAbs(dockerfilePath) { + dockerfilePath = filepath.Join(projectRoot, img.dockerfile) + } if _, err := os.Stat(dockerfilePath); os.IsNotExist(err) { continue // Dockerfile not present (production install without source) } @@ -925,3 +885,167 @@ func migrateDefaultsHTTPRouteHostnames(helmfilePath string) error { return os.WriteFile(helmfilePath, []byte(updated), 0o600) //nolint:gosec // G703: path from user's local config dir } + +type baseHelmResource struct { + Kind string + Name string + Namespace string +} + +func migrateBaseHelmOwnership(cfg *config.Config, kubeconfigPath string) error { + kubectlBinary := filepath.Join(cfg.BinDir, "kubectl") + resources := []baseHelmResource{ + {Kind: "namespace", Name: "agent"}, + {Kind: "namespace", Name: "hermes-obol-agent"}, + {Kind: "clusterrole", Name: "openclaw-monetize-read"}, + {Kind: "clusterrolebinding", Name: "openclaw-monetize-read-binding"}, + {Kind: "role", Name: "openclaw-monetize-write", Namespace: "hermes-obol-agent"}, + {Kind: "rolebinding", Name: "openclaw-monetize-write-binding", Namespace: "hermes-obol-agent"}, + } + + var failures []error + + for _, resource := range resources { + if err := kubectl.RunSilent(kubectlBinary, kubeconfigPath, append([]string{"get", resource.Kind, resource.Name}, resource.namespaceArgs()...)...); err != nil { + continue + } + + labelArgs := append([]string{"label", resource.Kind, resource.Name}, resource.namespaceArgs()...) 
+		labelArgs = append(labelArgs, "app.kubernetes.io/managed-by=Helm", "--overwrite")
+		if err := kubectl.RunSilent(kubectlBinary, kubeconfigPath, labelArgs...); err != nil {
+			failures = append(failures, fmt.Errorf("label %s/%s: %w", resource.Kind, resource.Name, err))
+			continue
+		}
+
+		annotateArgs := append([]string{"annotate", resource.Kind, resource.Name}, resource.namespaceArgs()...)
+		annotateArgs = append(annotateArgs,
+			"meta.helm.sh/release-name=base",
+			"meta.helm.sh/release-namespace=kube-system",
+			"--overwrite",
+		)
+		if err := kubectl.RunSilent(kubectlBinary, kubeconfigPath, annotateArgs...); err != nil {
+			failures = append(failures, fmt.Errorf("annotate %s/%s: %w", resource.Kind, resource.Name, err))
+		}
+	}
+
+	return errors.Join(failures...)
+}
+
+func (r baseHelmResource) namespaceArgs() []string {
+	if r.Namespace == "" {
+		return nil
+	}
+
+	return []string{"-n", r.Namespace}
+}
+
+// preserveLiteLLMConfigForHelm snapshots the mutable LiteLLM config before
+// Helm sync. Helm owns the ConfigMap object, but provider and purchase flows
+// append model routes to data["config.yaml"], which is a single scalar field
+// from Kubernetes' managedFields perspective.
+func preserveLiteLLMConfigForHelm(cfg *config.Config, kubeconfigPath string) (string, error) {
+	kubectlBinary := filepath.Join(cfg.BinDir, "kubectl")
+
+	raw, err := kubectl.Output(kubectlBinary, kubeconfigPath,
+		"get", "configmap", "litellm-config", "-n", "llm", "-o", "jsonpath={.data.config\\.yaml}")
+	if err != nil || strings.TrimSpace(raw) == "" {
+		return "", nil
+	}
+
+	managers, err := kubectl.Output(kubectlBinary, kubeconfigPath,
+		"get", "configmap", "litellm-config", "-n", "llm",
+		"--show-managed-fields", "-o", "jsonpath={.metadata.managedFields[*].manager}")
+	if err != nil || !needsLiteLLMConfigHelmMigration(managers) {
+		return raw, nil
+	}
+
+	if err := kubectl.RunSilent(kubectlBinary, kubeconfigPath,
+		"delete", "configmap", "litellm-config", "-n", "llm"); err != nil {
+		return "", err
+	}
+
+	return raw, nil
+}
+
+func restoreLiteLLMConfig(cfg *config.Config, kubeconfigPath, raw string) error {
+	if strings.TrimSpace(raw) == "" {
+		return nil
+	}
+
+	kubectlBinary := filepath.Join(cfg.BinDir, "kubectl")
+	if current, err := kubectl.Output(kubectlBinary, kubeconfigPath,
+		"get", "configmap", "litellm-config", "-n", "llm", "-o", "jsonpath={.data.config\\.yaml}"); err == nil && strings.TrimSpace(current) != "" {
+		merged, err := mergeLiteLLMConfig(current, raw)
+		if err != nil {
+			return err
+		}
+		raw = merged
+	}
+
+	manifest := configMapFieldOwnershipManifest("litellm-config", "llm", "config.yaml", raw)
+
+	return kubectl.ApplyServerSideForceConflicts(kubectlBinary, kubeconfigPath, manifest, "helm")
+}
+
+func needsLiteLLMConfigHelmMigration(managers string) bool {
+	for _, manager := range strings.Fields(managers) {
+		if manager != "helm" {
+			return true
+		}
+	}
+
+	return false
+}
+
+func mergeLiteLLMConfig(currentRaw, previousRaw string) (string, error) {
+	var current model.LiteLLMConfig
+	if err := yaml.Unmarshal([]byte(currentRaw), &current); err != nil {
+		return "", fmt.Errorf("parse current LiteLLM config: %w", err)
+	}
+
+	var previous model.LiteLLMConfig
+	if err := yaml.Unmarshal([]byte(previousRaw), &previous); err != nil {
+		return "", fmt.Errorf("parse previous LiteLLM config: %w", err)
+	}
+
+	byName := make(map[string]int, len(current.ModelList))
+	for i, entry := range current.ModelList {
+		byName[entry.ModelName] = i
+	}
+
+	for _, entry := range previous.ModelList {
+		if strings.TrimSpace(entry.ModelName) == "" {
+			continue
+		}
+		if _, ok := byName[entry.ModelName]; ok {
+			continue
+		}
+		byName[entry.ModelName] = len(current.ModelList)
+		current.ModelList = append(current.ModelList, entry)
+	}
+
+	if len(current.GeneralSettings) == 0 && len(previous.GeneralSettings) > 0 {
+		current.GeneralSettings = previous.GeneralSettings
+	}
+	if len(current.LiteLLMSettings) == 0 && len(previous.LiteLLMSettings) > 0 {
+		current.LiteLLMSettings = previous.LiteLLMSettings
+	}
+
+	merged, err := yaml.Marshal(&current)
+	if err != nil {
+		return "", fmt.Errorf("serialize merged LiteLLM config: %w", err)
+	}
+
+	return string(merged), nil
+}
+
+func configMapFieldOwnershipManifest(name, namespace, key, value string) []byte {
+	var b strings.Builder
+
+	fmt.Fprintf(&b, "apiVersion: v1\nkind: ConfigMap\nmetadata:\n  name: %s\n  namespace: %s\ndata:\n  %s: |\n", name, namespace, key)
+	for _, line := range strings.Split(value, "\n") {
+		fmt.Fprintf(&b, "    %s\n", line)
+	}
+
+	return []byte(b.String())
+}
diff --git a/internal/stack/stack_test.go b/internal/stack/stack_test.go
index e2201df6..b734799d 100644
--- a/internal/stack/stack_test.go
+++ b/internal/stack/stack_test.go
@@ -9,7 +9,9 @@ import (
 	"strings"
 	"testing"
 
+	"github.com/ObolNetwork/obol-stack/internal/model"
 	"github.com/ObolNetwork/obol-stack/internal/ui"
+	"gopkg.in/yaml.v3"
 
 	"github.com/ObolNetwork/obol-stack/internal/config"
 )
@@ -159,19 +161,77 @@ func TestStripConflictingPorts_StringManipulation(t *testing.T) {
 	}
 }
 
-func TestEnsureK3dPortsAvailable_RewritesConfig(t *testing.T) {
-	// Verify that ensureK3dPortsAvailable reads, strips, and rewrites the
-	// config file when port blocks are present and those ports are occupied.
-	// We can't actually block port 80, so we verify the no-op path: when
-	// ports 80/443 are free, the file should remain unchanged.
-	tmpDir := t.TempDir()
-	cfgPath := filepath.Join(tmpDir, "k3d.yaml")
+func TestRewriteConflictingPorts_PreservesAvailableFallbacks(t *testing.T) {
+	fullConfig := "ports:\n" +
+		portBlock(80, 80) +
+		portBlock(8080, 80) +
+		portBlock(443, 443) +
+		portBlock(8443, 443) +
+		"options:\n"
 
-	original := "ports:\n" +
+	got := rewriteConflictingPorts(fullConfig, ui.New(false), func(port int) bool {
+		return port == 8080 || port == 8443
+	}, func() (int, error) {
+		t.Fatal("should not pick an ephemeral port when fallbacks are available")
+		return 0, nil
+	})
+
+	for _, unexpected := range []string{"- port: 80:80", "- port: 443:443"} {
+		if strings.Contains(got, unexpected) {
+			t.Fatalf("expected %s mapping to be removed:\n%s", unexpected, got)
+		}
+	}
+	for _, expected := range []string{"- port: 8080:80", "- port: 8443:443"} {
+		if !strings.Contains(got, expected) {
+			t.Fatalf("expected %s mapping to be preserved:\n%s", expected, got)
+		}
+	}
+}
+
+func TestRewriteConflictingPorts_PicksEphemeralWhenAllDefaultsBusy(t *testing.T) {
+	fullConfig := "ports:\n" +
 		portBlock(80, 80) +
 		portBlock(8080, 80) +
 		portBlock(443, 443) +
-		portBlock(8443, 443)
+		portBlock(8443, 443) +
+		"options:\n"
 
+	picks := []int{18080, 18443}
+
+	got := rewriteConflictingPorts(fullConfig, ui.New(false), func(int) bool {
+		return false
+	}, func() (int, error) {
+		if len(picks) == 0 {
+			t.Fatal("unexpected extra port pick")
+		}
+		port := picks[0]
+		picks = picks[1:]
+		return port, nil
+	})
+
+	for _, unexpected := range []string{"- port: 80:80", "- port: 8080:80", "- port: 443:443", "- port: 8443:443"} {
+		if strings.Contains(got, unexpected) {
+			t.Fatalf("expected default mapping %s to be removed:\n%s", unexpected, got)
+		}
+	}
+	for _, expected := range []string{"- port: 18080:80", "- port: 18443:443"} {
+		if !strings.Contains(got, expected) {
+			t.Fatalf("expected %s mapping to be inserted:\n%s", expected, got)
+		}
+	}
+	if !strings.Contains(got, "options:\n") {
+		t.Fatal("YAML options key should remain")
+	}
+}
+
+func TestEnsureK3dPortsAvailable_NoDefaultMappings(t *testing.T) {
+	// Verify the file read/write path stays a no-op for configs that do not
+	// contain the default ingress mappings.
+	tmpDir := t.TempDir()
+	cfgPath := filepath.Join(tmpDir, "k3d.yaml")
+
+	original := "ports:\n" +
+		portBlock(18080, 80) +
+		portBlock(18443, 443)
 
 	if err := os.WriteFile(cfgPath, []byte(original), 0o600); err != nil {
 		t.Fatal(err)
@@ -185,10 +245,8 @@
 		t.Fatal(err)
 	}
 
-	// On most dev machines ports 80/443 are free (or permission-denied which
-	// is treated as available), so the config should be unchanged.
 	if string(data) != original {
-		t.Errorf("expected config unchanged when ports are free\ngot:\n%s", string(data))
+		t.Errorf("expected config unchanged when no default mappings are present\ngot:\n%s", string(data))
 	}
 }
@@ -438,3 +496,139 @@ func TestLLMTemplate_IncludesPaidRouteAndBuyerSidecar(t *testing.T) {
 	t.Fatalf("llm template should not require a custom provider:\n%s", out)
 }
 }
+
+func TestNeedsLiteLLMConfigHelmMigration(t *testing.T) {
+	tests := []struct {
+		name     string
+		managers string
+		want     bool
+	}{
+		{name: "helm only", managers: "helm", want: false},
+		{name: "empty", managers: "", want: false},
+		{name: "old kubectl patch", managers: "helm kubectl-patch", want: true},
+		{name: "controller update", managers: "helm serviceoffer-controller", want: true},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			if got := needsLiteLLMConfigHelmMigration(tt.managers); got != tt.want {
+				t.Fatalf("needsLiteLLMConfigHelmMigration(%q) = %v, want %v", tt.managers, got, tt.want)
+			}
+		})
+	}
+}
+
+func TestMergeLiteLLMConfigPreservesChartDefaultsAndPreviousModels(t *testing.T) {
+	current := `
+model_list:
+  - model_name: "paid/*"
+    litellm_params:
+      model: "openai/*"
+      api_base: "http://127.0.0.1:8402/v1"
+      api_key: "unused"
+general_settings:
+  master_key: os.environ/LITELLM_MASTER_KEY
+litellm_settings:
+  cache: false
+  drop_params: true
+`
+	previous := `
+model_list:
+  - model_name: "anthropic/*"
+    litellm_params:
+      model: "anthropic/claude-sonnet-4-5-20250929"
+`
+
+	merged, err := mergeLiteLLMConfig(current, previous)
+	if err != nil {
+		t.Fatalf("mergeLiteLLMConfig: %v", err)
+	}
+
+	var got model.LiteLLMConfig
+	if err := yaml.Unmarshal([]byte(merged), &got); err != nil {
+		t.Fatalf("unmarshal merged config: %v\n%s", err, merged)
+	}
+
+	if !hasLiteLLMModel(got, "paid/*") {
+		t.Fatalf("merged config lost chart paid route:\n%s", merged)
+	}
+	if !hasLiteLLMModel(got, "anthropic/*") {
+		t.Fatalf("merged config lost previous provider route:\n%s", merged)
+	}
+	if got.GeneralSettings["master_key"] != "os.environ/LITELLM_MASTER_KEY" {
+		t.Fatalf("merged config lost chart general_settings:\n%#v", got.GeneralSettings)
+	}
+	if got.LiteLLMSettings["drop_params"] != true {
+		t.Fatalf("merged config lost chart litellm_settings:\n%#v", got.LiteLLMSettings)
+	}
+}
+
+func TestMergeLiteLLMConfigCurrentEntryWinsForChartDefaults(t *testing.T) {
+	current := `
+model_list:
+  - model_name: "paid/*"
+    litellm_params:
+      model: "openai/*"
+      api_base: "http://127.0.0.1:8402/v1"
+      api_key: "unused"
+general_settings:
+  master_key: os.environ/LITELLM_MASTER_KEY
+`
+	previous := `
+model_list:
+  - model_name: "paid/*"
+    litellm_params:
+      model: "openai/*"
+      api_base: "http://custom-buyer:8402/v1"
+      api_key: "custom"
+`
+
+	merged, err := mergeLiteLLMConfig(current, previous)
+	if err != nil {
+		t.Fatalf("mergeLiteLLMConfig: %v", err)
+	}
+
+	var got model.LiteLLMConfig
+	if err := yaml.Unmarshal([]byte(merged), &got); err != nil {
+		t.Fatalf("unmarshal merged config: %v\n%s", err, merged)
+	}
+
+	for _, entry := range got.ModelList {
+		if entry.ModelName == "paid/*" {
+			if entry.LiteLLMParams.APIBase != "http://127.0.0.1:8402/v1" {
+				t.Fatalf("current paid route did not win over previous route:\n%+v", entry)
+			}
+			return
+		}
+	}
+
+	t.Fatalf("merged config missing paid route:\n%s", merged)
+}
+
+func TestConfigMapFieldOwnershipManifestUsesLiteralBlock(t *testing.T) {
+	manifest := string(configMapFieldOwnershipManifest("litellm-config", "llm", "config.yaml", "model_list:\n  - model_name: paid/*\n"))
+
+	for _, want := range []string{
+		"apiVersion: v1\n",
+		"kind: ConfigMap\n",
+		"  name: litellm-config\n",
+		"  namespace: llm\n",
+		"  config.yaml: |\n",
+		"    model_list:\n",
+		"      - model_name: paid/*\n",
+	} {
+		if !strings.Contains(manifest, want) {
+			t.Fatalf("manifest missing %q:\n%s", want, manifest)
+		}
+	}
+}
+
+func hasLiteLLMModel(cfg model.LiteLLMConfig, name string) bool {
+	for _, entry := range cfg.ModelList {
+		if entry.ModelName == name {
+			return true
+		}
+	}
+
+	return false
+}
diff --git a/internal/tunnel/agent.go b/internal/tunnel/agent.go
index 5eb4fa16..35a7d59b 100644
--- a/internal/tunnel/agent.go
+++ b/internal/tunnel/agent.go
@@ -7,12 +7,13 @@ import (
 	"os"
 	"path/filepath"
 	"strings"
 
+	"github.com/ObolNetwork/obol-stack/internal/agentruntime"
 	"github.com/ObolNetwork/obol-stack/internal/config"
 )
 
-const agentDeploymentID = "obol-agent"
+const agentDeploymentID = agentruntime.DefaultInstanceID
 
-// SyncAgentBaseURL patches AGENT_BASE_URL in the obol-agent's values-obol.yaml
+// SyncAgentBaseURL patches AGENT_BASE_URL in the default Hermes deployment
 // and runs helmfile sync to apply the change. It is a no-op if the obol-agent
 // deployment directory does not exist (agent not yet initialized).
 func SyncAgentBaseURL(cfg *config.Config, tunnelURL string) error {
@@ -27,28 +28,27 @@ func SyncAgentBaseURL(cfg *config.Config, tunnelURL string) error {
 	}
 
 	if err := patchAgentBaseURL(overlayPath, tunnelURL); err != nil {
-		return fmt.Errorf("failed to patch values-obol.yaml: %w", err)
+		return fmt.Errorf("failed to patch values-hermes.yaml: %w", err)
 	}
 
 	// Run helmfile sync to apply the change to the cluster.
 	deploymentDir := filepath.Dir(overlayPath)
-
 	helmfilePath := filepath.Join(deploymentDir, "helmfile.yaml")
 	if _, err := os.Stat(helmfilePath); os.IsNotExist(err) {
 		// Overlay exists but helmfile.yaml is missing — unusual, skip sync.
-		fmt.Printf("⚠ AGENT_BASE_URL updated in values-obol.yaml but helmfile.yaml not found; run 'obol openclaw sync %s' manually.\n", agentDeploymentID)
+		fmt.Printf("⚠ AGENT_BASE_URL updated in values-hermes.yaml but helmfile.yaml not found; run 'obol hermes sync %s' manually.\n", agentDeploymentID)
 		return nil
 	}
 
 	kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml")
 	if _, err := os.Stat(kubeconfigPath); os.IsNotExist(err) {
-		fmt.Printf("⚠ AGENT_BASE_URL updated but cluster not running; changes will apply on next 'obol openclaw sync %s'.\n", agentDeploymentID)
+		fmt.Printf("⚠ AGENT_BASE_URL updated but cluster not running; changes will apply on next 'obol hermes sync %s'.\n", agentDeploymentID)
 		return nil
 	}
 
 	helmfileBin := filepath.Join(cfg.BinDir, "helmfile")
 	if _, err := os.Stat(helmfileBin); os.IsNotExist(err) {
-		fmt.Printf("⚠ helmfile not found at %s; run 'obol openclaw sync %s' manually.\n", helmfileBin, agentDeploymentID)
+		fmt.Printf("⚠ helmfile not found at %s; run 'obol hermes sync %s' manually.\n", helmfileBin, agentDeploymentID)
 		return nil
 	}
@@ -72,7 +72,7 @@ func SyncAgentBaseURL(cfg *config.Config, tunnelURL string) error {
 }
 
 func agentOverlayPath(cfg *config.Config) string {
-	return filepath.Join(cfg.ConfigDir, "applications", "openclaw", agentDeploymentID, "values-obol.yaml")
+	return filepath.Join(cfg.ConfigDir, "applications", string(agentruntime.Hermes), agentDeploymentID, "values-hermes.yaml")
 }
 
 func readCurrentAgentBaseURL(overlayPath string) (string, error) {
@@ -91,10 +91,9 @@ func readCurrentAgentBaseURL(overlayPath string) (string, error) {
 	return "", nil
 }
 
-// patchAgentBaseURL reads values-obol.yaml and ensures the extraEnv list
-// contains an AGENT_BASE_URL entry with the given value. If the entry already
-// exists it is updated in place; otherwise it is appended after the
-// REMOTE_SIGNER_URL entry.
+// patchAgentBaseURL reads values-hermes.yaml and ensures the env list contains
+// an AGENT_BASE_URL entry with the given value. If the entry already exists it
+// is updated in place; otherwise it is appended after REMOTE_SIGNER_URL.
 func patchAgentBaseURL(path, tunnelURL string) error {
 	data, err := os.ReadFile(path)
 	if err != nil {
@@ -119,6 +118,7 @@ func patchAgentBaseURL(path, tunnelURL string) error {
 
 	for i := 0; i < len(lines); i++ {
 		line := lines[i]
+		leading := line[:len(line)-len(strings.TrimLeft(line, " "))]
 
 		// Case 1: AGENT_BASE_URL already present — update its value line.
 		if strings.Contains(line, "name: AGENT_BASE_URL") {
@@ -129,7 +129,7 @@ func patchAgentBaseURL(path, tunnelURL string) error {
 
 			if i+1 < len(lines) && strings.Contains(lines[i+1], "value:") {
 				i++
-				out = append(out, "    value: "+tunnelURL)
+				out = append(out, leading+"  value: "+tunnelURL)
 			}
 
 			continue
@@ -139,9 +139,13 @@ func patchAgentBaseURL(path, tunnelURL string) error {
 		// Case 2: AGENT_BASE_URL not yet in the file — insert after REMOTE_SIGNER_URL.
 		if !alreadyPresent && !inserted && strings.Contains(line, "value: http://remote-signer:9000") {
+			nameIndent := leading
+			if strings.HasSuffix(nameIndent, "  ") {
+				nameIndent = nameIndent[:len(nameIndent)-2]
+			}
 			out = append(out,
-				"  - name: AGENT_BASE_URL",
-				"    value: "+tunnelURL,
+				nameIndent+"- name: AGENT_BASE_URL",
+				leading+"value: "+tunnelURL,
 			)
 			inserted = true
 		}
@@ -149,7 +153,7 @@ func patchAgentBaseURL(path, tunnelURL string) error {
 
 	// Case 3: Neither AGENT_BASE_URL nor REMOTE_SIGNER_URL found (unusual).
 	if !inserted {
-		out = append(out, "extraEnv:", "  - name: AGENT_BASE_URL\n    value: "+tunnelURL)
+		out = append(out, "  - name: AGENT_BASE_URL", "    value: "+tunnelURL)
 	}
 
 	return os.WriteFile(path, []byte(strings.Join(out, "\n")), 0o600) //nolint:gosec // G703: path from user's local config dir
diff --git a/internal/tunnel/tunnel.go b/internal/tunnel/tunnel.go
index 9c0806c5..acb031c6 100644
--- a/internal/tunnel/tunnel.go
+++ b/internal/tunnel/tunnel.go
@@ -12,6 +12,7 @@ import (
 	"strings"
 	"time"
 
+	"github.com/ObolNetwork/obol-stack/internal/agentruntime"
 	"github.com/ObolNetwork/obol-stack/internal/config"
 	"github.com/ObolNetwork/obol-stack/internal/ui"
 )
@@ -120,16 +121,17 @@ func Status(cfg *config.Config, u *ui.UI) error {
 	return nil
 }
 
-// InjectBaseURL sets AGENT_BASE_URL on the default OpenClaw deployment so that
+// InjectBaseURL sets AGENT_BASE_URL on the default Hermes deployment so that
 // monetize.py uses the tunnel URL in registration JSON.
 func InjectBaseURL(cfg *config.Config, tunnelURL string) error {
 	kubectlPath := filepath.Join(cfg.BinDir, "kubectl")
 	kubeconfigPath := filepath.Join(cfg.ConfigDir, "kubeconfig.yaml")
+	desc := agentruntime.Describe(agentruntime.Hermes)
 
 	cmd := exec.Command(kubectlPath,
 		"--kubeconfig", kubeconfigPath,
-		"set", "env", "deployment/openclaw",
-		"-n", "openclaw-obol-agent",
+		"set", "env", "deployment/"+desc.ServiceName,
+		"-n", agentruntime.Namespace(agentruntime.Hermes, agentruntime.DefaultInstanceID),
 		"AGENT_BASE_URL="+strings.TrimRight(tunnelURL, "/"),
 	)
diff --git a/internal/tunnel/tunnel_test.go b/internal/tunnel/tunnel_test.go
index 4c39e9a4..884b5687 100644
--- a/internal/tunnel/tunnel_test.go
+++ b/internal/tunnel/tunnel_test.go
@@ -42,6 +42,30 @@ func TestParseQuickTunnelURL(t *testing.T) {
 	}
 }
 
+func TestBuildLocalManagedConfigYAMLRoutesOnlyRequestedHostname(t *testing.T) {
+	out := string(buildLocalManagedConfigYAML("stack.example.com", "00000000-0000-0000-0000-000000000000"))
+
+	for _, want := range []string{
+		"tunnel: 00000000-0000-0000-0000-000000000000",
+		"- hostname: stack.example.com",
+		"service: http://traefik.traefik.svc.cluster.local:80",
+		"- service: http_status:404",
+	} {
+		if !strings.Contains(out, want) {
+			t.Fatalf("config missing %q:\n%s", want, out)
+		}
+	}
+
+	if strings.Count(out, "hostname:") != 1 {
+		t.Fatalf("persistent tunnel config should expose exactly one hostname:\n%s", out)
+	}
+	for _, unexpected := range []string{"obol-agent.obol.stack", "hermes-obol-agent.obol.stack", "*.obol.stack"} {
+		if strings.Contains(out, unexpected) {
+			t.Fatalf("persistent tunnel config exposes local agent hostname %q:\n%s", unexpected, out)
+		}
+	}
+}
+
 func TestPatchAgentBaseURL_Insert(t *testing.T) {
 	dir := t.TempDir()
 	path := filepath.Join(dir, "values-obol.yaml")
@@ -113,3 +137,39 @@ skills:
 		t.Errorf("expected exactly 1 AGENT_BASE_URL entry:\n%s", content)
 	}
 }
+
+func TestPatchAgentBaseURL_InsertHermesManifestIndentation(t *testing.T) {
+	dir := t.TempDir()
+	path := filepath.Join(dir, "values-hermes.yaml")
+
+	original := `resources:
+  - apiVersion: apps/v1
+    kind: Deployment
+    spec:
+      template:
+        spec:
+          containers:
+            - name: openclaw
+              env:
+                - name: REMOTE_SIGNER_URL
+                  value: http://remote-signer:9000
+`
+	if err := os.WriteFile(path, []byte(original), 0o644); err != nil {
+		t.Fatal(err)
+	}
+
+	if err := patchAgentBaseURL(path, "https://mystack.example.com"); err != nil {
+		t.Fatal(err)
+	}
+
+	data, _ := os.ReadFile(path)
+	content := string(data)
+
+	if !strings.Contains(content, "                - name: AGENT_BASE_URL") {
+		t.Fatalf("patched Hermes manifest missing preserved indent for name:\n%s", content)
+	}
+
+	if !strings.Contains(content, "                  value: https://mystack.example.com") {
+		t.Fatalf("patched Hermes manifest missing preserved indent for value:\n%s", content)
+	}
+}
diff --git a/internal/update/update.go b/internal/update/update.go
index 6167763b..17c3283e 100644
--- a/internal/update/update.go
+++ b/internal/update/update.go
@@ -5,12 +5,11 @@ import (
 	"os"
 	"os/exec"
 	"path/filepath"
-	"runtime"
 	"strings"
 
 	"github.com/ObolNetwork/obol-stack/internal/app"
 	"github.com/ObolNetwork/obol-stack/internal/config"
-	"github.com/ObolNetwork/obol-stack/internal/embed"
+	stackdefaults "github.com/ObolNetwork/obol-stack/internal/defaults"
 	"github.com/ObolNetwork/obol-stack/internal/network"
 	"github.com/ObolNetwork/obol-stack/internal/ui"
 	"github.com/ObolNetwork/obol-stack/internal/version"
@@ -115,15 +114,13 @@ func ApplyUpgrades(cfg *config.Config, u *ui.UI, opts UpgradeOptions) error {
 	u.Blank()
 	u.Info("Refreshing default infrastructure templates...")
 
-	ollamaHost := "host.k3d.internal"
-	if runtime.GOOS == "darwin" {
-		ollamaHost = "host.docker.internal"
+	defaultsDir := filepath.Join(cfg.ConfigDir, "defaults")
+	stackID := stackdefaults.StackID(cfg)
+	if stackID == "" {
+		return fmt.Errorf("stack ID not found, run 'obol stack init' first")
 	}
 
-	defaultsDir := filepath.Join(cfg.ConfigDir, "defaults")
-	if err := embed.CopyDefaults(defaultsDir, map[string]string{
-		"{{OLLAMA_HOST}}": ollamaHost,
-	}); err != nil {
+	if err := stackdefaults.CopyInfrastructure(cfg, stackdefaults.DetectedBackendName(cfg), stackID); err != nil {
 		return fmt.Errorf("failed to refresh defaults: %w", err)
 	}
diff --git a/internal/x402/bdd_integration_test.go b/internal/x402/bdd_integration_test.go
index a025f78c..a8b0f463 100644
--- a/internal/x402/bdd_integration_test.go
+++ b/internal/x402/bdd_integration_test.go
@@ -104,7 +104,7 @@ func TestMain(m *testing.M) {
 		log.Fatalf("obol stack up: %v", err)
 	}
 
-	payTo, err := runObolOutput(obolBin, "openclaw", "wallet", "address", "obol-agent")
+	payTo, err := runObolOutput(obolBin, "hermes", "wallet", "address", "obol-agent")
 	if err != nil {
 		teardown(obolBin)
 		log.Fatalf("resolve seller wallet: %v", err)
diff --git a/renovate.json b/renovate.json
index f3c82183..f4e6324f 100644
--- a/renovate.json
+++ b/renovate.json
@@ -12,7 +12,7 @@
       "customType": "regex",
       "description": "Update obol-stack-front-end version from GitHub releases",
       "matchStrings": [
-        "tag:\\s*[\"'](?<currentValue>v[0-9]+\\.[0-9]+\\.[0-9]+)[\"']"
+        "tag:\\s*[\"'](?<currentValue>v[0-9]+\\.[0-9]+\\.[0-9]+(?:-rc\\.[0-9]+)?)[\"']"
      ],
      "fileMatch": [
        "^internal/embed/infrastructure/values/obol-frontend\\.yaml\\.gotmpl$"