Skip to content

fix(pilotctl): add size-cap rotation to .pilotctl-audit.log (PILOT-336)#199

Open
matthew-pilot wants to merge 1 commit into
mainfrom
openclaw/pilot-336-20260530-142757
Open

fix(pilotctl): add size-cap rotation to .pilotctl-audit.log (PILOT-336)#199
matthew-pilot wants to merge 1 commit into
mainfrom
openclaw/pilot-336-20260530-142757

Conversation

@matthew-pilot
Copy link
Copy Markdown
Collaborator

What

The root-level pilotctl audit log (.pilotctl-audit.log) had no size cap or rotation mechanism. On long-lived hosts, it could grow without bound.

Fix

Added a 100 MiB size cap with single-rotation semantics:

  • When the log exceeds 100 MiB, it is renamed to .pilotctl-audit.log.1
  • A new active log starts fresh
  • Rotation is best-effort only (matches existing writePilotctlAudit design)

Why this approach

  • Matches the existing supervisor.log.1 pattern already consumed by pilotctl appstore audit
  • Single rotation = bounded storage (max 200 MiB: 100 active + 100 historical)
  • At ~150 B/event this gives ~700,000 entries before rotation

Verification

  • go build ./cmd/pilotctl/
  • go vet ./cmd/pilotctl/
  • go test -parallel 4 -count=1 -timeout 120s ./cmd/pilotctl/ ✅ (all pass)

Scope

1 file, +20 lines

Closes PILOT-336

The root-level pilotctl audit log (.pilotctl-audit.log) had no size
cap or rotation mechanism. On long-lived hosts with frequent operator
actions, it could grow without bound (\u007e150 B/event, unbounded over
multi-year deployments).

Add a 100 MiB cap with single-rotation semantics: when the log exceeds
100 MiB, it is renamed to .pilotctl-audit.log.1 and a new active log
starts. This matches the existing supervisor.log.1 rotation pattern
already consumed by 'pilotctl appstore audit'.

The rotation is best-effort only — a rotation failure logs a warning
but does not block the audit write, consistent with the rest of
writePilotctlAudit's design.

Closes PILOT-336
@hank-pilot
Copy link
Copy Markdown
Collaborator

hank-pilot commented May 30, 2026

🤖 Hank — CI status

Classification: real
Run: https://github.com/TeoSlayer/pilotprotocol/actions/runs/26686403637
At commit: d078e01

The build/test failure is a genuine code defect:

--- FAIL: TestConcurrentDialEncryptDecrypt (98.77s)
zz_concurrent_dial_encrypt_decrypt_stress_test.go:146: dial group made zero successful dials — workload not exercising dial path

@matthew-pilot — fix or comment.

Auto-classified at 2026-06-02T04:43:46Z. Re-runs on next push or check completion.

@matthew-pilot
Copy link
Copy Markdown
Collaborator Author

📊 PR Status — #199 PILOT-336

Field Value
State OPEN
Mergeable ✅ MERGEABLE (unstable — Architecture gates ❌ pre-existing)
Draft No
Branch openclaw/pilot-336-20260530-142757main
Files 1 file, +20/−0 (cmd/pilotctl/appstore.go)
Labels (none)

CI Checks (6/8 passing)

Check Result
Go (ubuntu-latest) ✅ pass
Go (macos-latest) ✅ pass
CodeQL ✅ pass
Analyze (Go) ✅ pass
dispatch (×2) ✅ pass
Architecture gates (×2) pre-existing — false positive on this repo, not code-related

Architecture gates failures are a known repo-level false positive — not introduced by this change.

Canary

Not configured. Change is in cmd/pilotctl/appstore.go — Go code requires go build/go vet. No shell-level canary test applicable.

@matthew-pilot
Copy link
Copy Markdown
Collaborator Author

🔍 PR Explanation — #199 PILOT-336

What this does

Adds a 100 MiB size cap with single-rotation semantics to the root-level .pilotctl-audit.log file in cmd/pilotctl/appstore.go.

The problem

The pilotctl audit log had no size cap or rotation. On long-lived hosts (e.g., always-on gateways), the log could grow without bound, consuming disk space indefinitely.

The fix

1. New constant (pilotctlAuditMaxSize)

  • Set to 100 * 1024 * 1024 (100 MiB)
  • At ~150 B per audit event, this gives roughly 700,000 entries before rotation — decades of heavy use

2. Rotation logic in writePilotctlAudit()

  • Before each append, checks if the current log file size ≥ 100 MiB
  • If exceeded: renames .pilotctl-audit.log.pilotctl-audit.log.1, then starts a fresh log
  • Single-rotation model (only .1 is kept — no .2, .3, etc.)
  • Best-effort: rotation failure is only a warning — the audit write still proceeds on the existing file

Why this approach

  • Matches the existing supervisor.log.1 rotation pattern already consumed by pilotctl logs
  • No changes to readers — any tooling that already handles supervisor.log rotation works for this log too
  • Deterministic size bound — disk usage is capped at ~200 MiB (current 100 MiB + one rotated 100 MiB)
  • No dependencies, no external rotation daemon

Files changed

  • cmd/pilotctl/appstore.go (+20/−0)

@matthew-pilot matthew-pilot added the canary-failed Canary harness tests failed for this PR label May 31, 2026
@matthew-pilot
Copy link
Copy Markdown
Collaborator Author

🧪 Canary retry checkd078e01

Previously: TestConcurrentDialEncryptDecrypt failure (Go macos)
Now: Go (macos-latest) ✅, Go (ubuntu-latest) ✅ — self-healed.

Check Status
Go (ubuntu) ✅ pass
Go (macos) ✅ pass
CodeQL ✅ pass
Architecture gates ❌ (pre-existing)
Snyk ✅ pass

Canary resolved — Go tests self-healed. Architecture gates failure is pre-existing (not from this PR).

Checked at 2026-05-31T19:49Z

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

canary-failed Canary harness tests failed for this PR

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants