Skip to content

feat: wire telemetry into all remove.* commands#1069

Open
Hweinstock wants to merge 1 commit intoaws:mainfrom
Hweinstock:feat/telemetry-remove-commands
Open

feat: wire telemetry into all remove.* commands#1069
Hweinstock wants to merge 1 commit intoaws:mainfrom
Hweinstock:feat/telemetry-remove-commands

Conversation

@Hweinstock
Copy link
Copy Markdown
Contributor

@Hweinstock Hweinstock commented Apr 30, 2026

Description

Wire telemetry into all remove.* commands, following the pattern from #1050 (add commands).

  • Renamed cliCommandRunrunCliCommand, withAddTelemetrywithCommandRunTelemetry since they are now generalized.
  • Registered 13 remove commands in COMMAND_SCHEMAS
  • Wired CLI remove handlers (BasePrimitive + 6 custom primitives) and TUI remove hooks

Not addressed (follow-up): Error classification — remove ops return { success: false, error: string }, losing the original error class. classifyError sees UnknownError instead of a typed error. We'll likely need a larger refactor to how we handle exception flow to a consistent and generalized result type, or conventional error flow.

Related Issue

Closes #1068

Type of Change

  • New feature

Testing

  • I ran npm run test:unit and npm run test:integ
  • I ran npm run typecheck
  • I ran npm run lint

Added telemetry success + failure assertions for 10 remove commands across 6 integration test files. remove.agent and remove.runtime-endpoint lack coverage (no existing test files exercise removal).

Checklist

  • I have read the CONTRIBUTING document
  • I have added any necessary tests that prove my fix is effective or my feature works
  • My changes generate no new warnings

@Hweinstock
Copy link
Copy Markdown
Contributor Author

/strands review

@github-actions github-actions Bot added size/m PR size: M strands-running agentcore-harness-reviewing AgentCore Harness review in progress and removed agentcore-harness-reviewing AgentCore Harness review in progress labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 89d3db7 to 7cbfdc9 Compare April 30, 2026 21:36
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 7cbfdc9 to 9e49ed7 Compare April 30, 2026 21:39
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 9e49ed7 to 5c078b8 Compare April 30, 2026 21:41
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock closed this Apr 30, 2026
@Hweinstock Hweinstock reopened this Apr 30, 2026
@github-actions github-actions Bot added size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress and removed size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 5c078b8 to 30ec334 Compare April 30, 2026 21:45
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 30ec334 to af77f7d Compare April 30, 2026 21:47
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@aws aws deleted a comment from agentcore-cli-automation Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from af77f7d to 5ac07cb Compare April 30, 2026 21:59
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 304bda5 to 86abe2d Compare April 30, 2026 22:23
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Apr 30, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 86abe2d to 3df2172 Compare May 1, 2026 12:35
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 3df2172 to bb2532f Compare May 1, 2026 13:17
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from bb2532f to 951cf98 Compare May 1, 2026 13:24
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 951cf98 to 513de65 Compare May 1, 2026 13:27
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 513de65 to 2f80de5 Compare May 1, 2026 13:34
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 2f80de5 to 7a3a32a Compare May 1, 2026 13:48
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels May 1, 2026
@Hweinstock Hweinstock force-pushed the feat/telemetry-remove-commands branch from 7a3a32a to 0d16acd Compare May 1, 2026 13:55
@github-actions github-actions Bot removed the size/m PR size: M label May 1, 2026
@Hweinstock
Copy link
Copy Markdown
Contributor Author

/strands review

@agentcore-cli-automation
Copy link
Copy Markdown

Behavior change in withCommandRunTelemetry: thrown errors are now propagated instead of caught

The old withAddTelemetry had a catch-all that converted thrown exceptions from fn() into a { success: false, error: getErrorMessage(err) } result:

} catch (err) {
  if (!result) {
    return { success: false, error: getErrorMessage(err) };
  }
}

The new withCommandRunTelemetry removes that safety net and re-throws instead (if (!result) throw e;). This is a real behavioral change for every TUI call site that was migrated from withAddTelemetry — not just a rename.

Most migrated hooks are fine because they already wrap the call in a try/catch (e.g. useCreateMemory, useCreateEvaluator, useCreateGateway, useCreateOnlineEval, useCreateIdentity). But two call sites only rely on the previous catch-all and now can leak unhandled rejections / freeze the UI if the underlying primitive throws:

  1. src/cli/tui/screens/agent/useAddAgent.ts (lines 149–174)useAddAgent wraps the call in try/finally, not try/catch. addAgentInnerhandleCreatePath / handleByoPath / handleImportPath has no top-level try/catch and calls things that can throw (configIO.readProjectSpec, renderer.render, setupPythonProject, copyFileSync, executeImportAgent, etc.). Previously a throw became { success: false, error }; now it propagates to the caller.

    The caller in AddFlow.tsx:

    void addAgent(config).then(result => { ... });  // no .catch()

    …has no .catch(), so a throw becomes an unhandled promise rejection and the UI stays stuck on loading: true, loadingMessage: 'Creating agent...'.

  2. src/cli/tui/screens/runtime-endpoint/AddRuntimeEndpointFlow.tsx (lines 81–100) — same pattern:

    void withCommandRunTelemetry('add.runtime-endpoint', {}, () => runtimeEndpointPrimitive.add(...))
      .then(result => { ... });  // no .catch()

    If runtimeEndpointPrimitive.add throws, this becomes an unhandled rejection (the old withAddTelemetry would have routed it through setFlow({ name: 'error', ... })).

In practice the individual .add() methods on the primitives each have their own try/catch, so this rarely triggers today — but the previous helper was the backstop for anything the primitive forgot to catch (and addAgentInner in particular does lean on that).

Options

Option A (minimal): Restore the catch-all inside withCommandRunTelemetry so the contract matches the old withAddTelemetry:

} catch (e) {
  if (!result) return { success: false, error: getErrorMessage(e) } as R;
}

This requires the R generic to always be assignable from { success: false; error: string }, which is already the constraint (R extends OperationResult).

Option B: Keep the new re-throw semantics but add try/catch (or .catch(...) on the returned promise) at the two call sites that currently rely on the old behavior (useAddAgent and AddRuntimeEndpointFlow.handleCreateComplete), mapping the error to the appropriate error state.

Option C: Keep re-throw semantics but also wrap addAgentInner in a try/catch internally so it always returns AddAgentInnerResult and never throws — and do the same for the runtime-endpoint add flow.

Option A preserves the old behavior most cheaply; B/C are more explicit but touch more code.

@Hweinstock
Copy link
Copy Markdown
Contributor Author

format failure is fixed in #1080.

@agentcore-cli-automation
Copy link
Copy Markdown

Reviewed the current state of the PR. The two concerns previously raised in comments appear to be addressed:

  1. Unhandled rejections in TUI call sites — the current withCommandRunTelemetry implementation keeps the catch-all that was in the old withAddTelemetry:

    if (!result) {
      return { success: false, error: getErrorMessage(e) } as R;
    }

    So useAddAgent and AddRuntimeEndpointFlow.handleCreateComplete retain the same 'throw → failure result' contract as before. No behavior change.

  2. remove.runtime-endpoint gapRuntimeEndpointPrimitive.ts is updated in this diff to wrap this.remove() in withCommandRunTelemetry('remove.runtime-endpoint', ...), and the command is registered in COMMAND_SCHEMAS. Coverage for all 12 removable resource types plus remove.all is now complete.

Spot-checked the rest:

  • SubCommand<G, S> is sound — uses Extract<Command, ${G}.${S}>, so unregistered sub-commands collapse to never at the call site.
  • All 12 ResourceType variants have matching remove.* entries in COMMAND_SCHEMAS, so \remove.${this.kind}`inBasePrimitiveanduseRemove` is always a valid key.
  • RemovalResult ({success:true} | {success:false, error:string}) and AddResult<T> both satisfy the R extends OperationResult constraint.
  • The runCliCommand refactor preserves the 'telemetry init failure runs fn() untracked and exits 0' behavior; process.exit returning never keeps the subsequent client.withCommandRun call reachable only when client is defined.
  • Removing the ! non-null assertions on configBundles in the integ tests is safe — the schema declares .default([]), so the field is always present after parse.
  • Integ test coverage for success + failure telemetry is in place for 10 of 12 remove commands; remove.agent and remove.runtime-endpoint coverage gap is called out in the PR description.

No new blocking issues. LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/m PR size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants