Added UI Automation Crawler to Onboarding Agent by robgruen · Pull Request #2294 · microsoft/TypeAgent

robgruen · 2026-05-05T18:45:08Z

This pull request introduces the initial setup and implementation for the UiAutomationHelper .NET project, providing both project structure and core functionality for UI automation tasks. The main focus is on enabling programmatic control of Windows applications and their UI elements via a set of RPC-accessible methods. Additionally, the PR includes configuration for project management, dependencies, and best practices.

Project setup and configuration:

Added a new solution file UiAutomationHelper.sln with two projects: the main automation helper and its test project.
Created the initial .csproj file for UiAutomationHelper, targeting .NET 8.0 for Windows, referencing FlaUI libraries for UI automation, and configuring build and packaging settings.
Added a .gitignore for uiAutomationHelper to exclude build outputs and user-specific files.

Core functionality:

Implemented AppMethods in src/Methods/AppMethods.cs to support launching, attaching, listing, and killing applications, with robust parameter validation and error handling.
Implemented ActionMethods in src/Methods/ActionMethods.cs providing methods to interact with UI elements (invoke, toggle, set value, select, expand/collapse, scroll, focus, click, send keys), including parameter parsing, error handling, and support for various UI patterns.

Documentation:

Added Copilot instructions for Azure-related requests, specifying tool usage and best practices.

Skeleton for the UIA-based onboarding crawl: a .NET helper exposing a JSON-RPC stdio surface (app lifecycle, tree.dump, screenshot, do.invoke) backed by FlaUI/UIA3, and a TypeScript HelperClient. Verified end-to-end against Windows Clock; live smoke produces a tree-dump fixture and screenshot. SelectorParser has 20 xUnit tests. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
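To make the JSON-RPC stdio surface concrete, here is a minimal sketch of the request/response bookkeeping a client like HelperClient needs. It is not the PR's implementation: the newline-delimited framing, the `RpcCore` name, and the `write` callback are assumptions for illustration.

```typescript
// Sketch of a JSON-RPC 2.0 client core for a stdio helper process.
// Assumption: messages are newline-delimited JSON; the PR does not state the framing.

type RpcResponse = {
    jsonrpc: "2.0";
    id: number;
    result?: unknown;
    error?: { code: number; message: string };
};

type Waiter = { resolve: (value: unknown) => void; reject: (err: Error) => void };

class RpcCore {
    private nextId = 1;
    private buffer = "";
    private pending = new Map<number, Waiter>();
    private write: (line: string) => void;

    // `write` stands in for the helper's stdin (hypothetical wiring).
    constructor(write: (line: string) => void) {
        this.write = write;
    }

    get pendingCount(): number {
        return this.pending.size;
    }

    call(method: string, params?: unknown): Promise<unknown> {
        const id = this.nextId++;
        const promise = new Promise<unknown>((resolve, reject) => {
            this.pending.set(id, { resolve, reject });
        });
        this.write(JSON.stringify({ jsonrpc: "2.0", id, method, params }) + "\n");
        return promise;
    }

    // Feed raw stdout chunks; a chunk may hold a partial line or several lines.
    feed(chunk: string): void {
        this.buffer += chunk;
        let newline = this.buffer.indexOf("\n");
        while (newline >= 0) {
            const line = this.buffer.slice(0, newline).trim();
            this.buffer = this.buffer.slice(newline + 1);
            newline = this.buffer.indexOf("\n");
            if (line.length === 0) continue;
            const msg = JSON.parse(line) as RpcResponse;
            const waiter = this.pending.get(msg.id);
            if (waiter === undefined) continue;
            this.pending.delete(msg.id);
            if (msg.error) waiter.reject(new Error(msg.error.message));
            else waiter.resolve(msg.result);
        }
    }
}
```

Matching responses by `id` rather than by arrival order keeps the client correct even if the helper answers requests out of order.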

Helper RPCs: do.toggle/setValue/select/expand/scroll/focus/click/sendKeys (joining do.invoke from slice 1), plus find (with optional polling) and events.idle (focus-change-debounce). All app.list and tree.dump calls now retry transient COM errors that fire during UWP teardown. Selector resolution gained two fixes from real-world failures: when AutomationId is missing, capture-time selectors include ClassName as a disambiguator so siblings sharing a Name (UWP's nested ApplicationFrame and CoreWindow both named after the app) resolve correctly. App.launch's returned mainWindow is now the desktop-rooted ApplicationFrameWindow, not the inner CoreWindow which lives under it in UIA's logical tree — resolved via Win32 GA_ROOTOWNER + name match + a poll loop for the async-created frame. Smoke now drives Clock through invoke + select + focus + find + events.idle and produces a clock-tree-navigated.json fixture showing the post-navigation state. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
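The ClassName disambiguation described above can be sketched as follows. The bracketed selector syntax is hypothetical (the helper's real selector grammar is not shown in this PR), and `UiaNode` is a simplified stand-in for a tree-dump node.

```typescript
// Hypothetical selector syntax: ControlType[attr="value"][attr="value"].
// The helper's actual grammar (see SelectorParser) may differ.

type UiaNode = {
    controlType: string;
    name?: string;
    automationId?: string;
    className?: string;
};

function quote(value: string): string {
    return '"' + value.replace(/\\/g, "\\\\").replace(/"/g, '\\"') + '"';
}

// Prefer AutomationId; otherwise fall back to Name plus ClassName so that
// siblings sharing a Name still get distinct selectors.
function selectorSegment(node: UiaNode): string {
    if (node.automationId) {
        return `${node.controlType}[automationId=${quote(node.automationId)}]`;
    }
    let segment = node.controlType;
    if (node.name) segment += `[name=${quote(node.name)}]`;
    if (node.className) segment += `[className=${quote(node.className)}]`;
    return segment;
}

// The UWP case from the commit message: two windows named after the app.
const frameWindow: UiaNode = { controlType: "Window", name: "Clock", className: "ApplicationFrameWindow" };
const coreWindow: UiaNode = { controlType: "Window", name: "Clock", className: "Windows.UI.Core.CoreWindow" };
const distinct = selectorSegment(frameWindow) !== selectorSegment(coreWindow);
```

With Name alone both windows would produce the same segment; adding ClassName is what makes `distinct` true here.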

C# helper: snapshot.capture / snapshot.restore / snapshot.delete RPCs backed by FolderSnapshotter (recursive copy with exclude globs) and ProcessKiller (graceful close → force kill on identity match). Restore is replace-not-merge: the target directory is wiped before files come back, so files added to state between snapshots disappear on restore.

TS: snapshotPolicy.ts library with inferSnapshotPolicy (UWP via PowerShell Get-AppxPackage → PackageFamilyName → LocalState/Settings/RoamingState folder enumeration), plus load/save/approve/markStateless helpers. HelperClient gains snapshotCapture/Restore/Delete.

Smoke:
- inferSnapshotPolicy for Clock detects the 3 expected UWP folders (LocalState, Settings, RoamingState) under Microsoft.WindowsAlarms_*.
- Synthetic capture/restore round-trips against a sandboxed state directory: dirty all 3 files + add a new one + restore → all originals match expected content + the added file is gone.

Slice 3b (onboarding-action wiring: inferSnapshotPolicy / approveSnapshotPolicy / markStateless / editSnapshotPolicy actions on the manifest+grammar) deferred to when we wire the full pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
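The replace-not-merge semantics are easy to get wrong, so here is a small in-memory model of them. This is only a sketch: the real FolderSnapshotter is C# and works on disk, and the `FileTree` map, `capture`, `restore`, and the `*`/`?`-only glob support are simplifying assumptions.

```typescript
// In-memory model of replace-not-merge restore. A FileTree maps a relative
// path to file content; the real helper copies folders on disk.
type FileTree = Map<string, string>;

// Translate a simple glob ("*" and "?" only) into an anchored RegExp.
function globToRegExp(glob: string): RegExp {
    const escaped = glob.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Copy everything that does not match an exclude glob.
function capture(state: FileTree, excludeGlobs: string[] = []): FileTree {
    const excludes = excludeGlobs.map(globToRegExp);
    const snapshot: FileTree = new Map();
    state.forEach((content, path) => {
        if (!excludes.some((re) => re.test(path))) snapshot.set(path, content);
    });
    return snapshot;
}

// Replace, not merge: wipe the target first, then copy the snapshot back.
function restore(target: FileTree, snapshot: FileTree): void {
    target.clear();
    snapshot.forEach((content, path) => target.set(path, content));
}
```

A merge-style restore would only overwrite the snapshot's files and leave the newly added one behind, which is exactly the case the synthetic smoke test checks for.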

Helper: tree.fingerprint RPC computes a SHA-256 of the (filtered) UIA subtree, with optional dynamic-control rules that mask `value` / `name` / `toggleState` of matched controls. Matchers: automationId, exact selector, glob selectorPattern, container (subtree + controlType + optional name/className regexes).

TS dynamicControls.ts: calibrateDynamicControls runs N tree dumps (3 by default, 3s apart) with no input, diffs them by selector, and emits DynamicControlRule[] tagged `calibration-drift` with confidence = transitions / (N-1). Persistence (load/save) and rule-merge by matcher identity also included.

Smoke validates the mechanism on Clock's running stopwatch:
- Back-to-back fingerprints identical (deterministic hash).
- Applying a rule that masks Close button's name → different fingerprint than no-rule (rule application affects the hash).
- Calibration picks up StopwatchTimerText as dynamic with confidence 1.0 across 3 dumps.
- Naked fingerprints diverge across a 4s window (timer advanced); rules-aware fingerprints partially mask drift but don't fully (Clock has multiple time-display elements at different granularities).

The residual drift is a real-world finding to address with iterative explore-drift rule updates in the autonomous loop (slice 6).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
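The calibration formula, confidence = transitions / (N-1), can be illustrated with a reduced model. In this sketch a dump is flattened to a selector → value record and the rule carries only a selector; the real DynamicControlRule has richer matchers, so treat the shapes below as assumptions.

```typescript
// A dump is modeled as selector -> observed value; the real tree dump is richer.
type Dump = Record<string, string>;

type DynamicControlRule = {
    selector: string;
    source: "calibration-drift";
    confidence: number;
};

// Compare consecutive dumps per selector and count value changes.
function calibrate(dumps: Dump[]): DynamicControlRule[] {
    if (dumps.length < 2) return [];
    const selectors: string[] = [];
    for (const dump of dumps) {
        for (const selector of Object.keys(dump)) {
            if (selectors.indexOf(selector) < 0) selectors.push(selector);
        }
    }
    const rules: DynamicControlRule[] = [];
    for (const selector of selectors) {
        let transitions = 0;
        for (let i = 1; i < dumps.length; i++) {
            if (dumps[i - 1][selector] !== dumps[i][selector]) transitions++;
        }
        if (transitions > 0) {
            rules.push({
                selector,
                source: "calibration-drift",
                confidence: transitions / (dumps.length - 1),
            });
        }
    }
    return rules;
}
```

With 3 dumps there are 2 consecutive pairs, so a control that changes in both pairs gets confidence 1.0 (the StopwatchTimerText case) and one that changes once gets 0.5.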

Helper: JSON-RPC notifications (server → client, no id), routed through a shared write lock so UIA event-thread emissions don't interleave with RPC responses. New events.subscribe / events.unsubscribe RPCs accept eventTypes ["Invoked", "ValueChanged", "ToggleStateChanged", "StructureChanged"], scoped to a selector with TreeScope.Subtree. SubscriptionRegistry holds active subscriptions; Subscription.Dispose unregisters via FlaUI's IDisposable handlers. TS: HelperClient gains a notification dispatch path with `onEvent()` registration. Recorder library subscribes, writes captured events as JSONL into <workspaceDir>/recordings/<sessionId>/transitions.jsonl. Smoke launches Clock, drives it through navigation + click, and captures 10 StructureChanged events end-to-end into a transitions.jsonl fixture. Real-world finding: UIA's InvokedEvent doesn't propagate to in-process listeners for UWP apps even when triggered via real Mouse.LeftClick — likely a cross-process COM marshaling quirk between ApplicationFrameHost and the UWP package's CoreWindow. StructureChanged events DO fire reliably, which is enough for the autonomous-explore loop in slice 6 (which re-dumps the tree after each agent action and doesn't depend on Invoked events for its own actions). Record-mode for genuine user-driven sessions (separate process driving the app) is the case where Invoked events matter, and that case is untested here. Also: NavView Group elements can have dynamic Names that embed running state ("Stopwatch, Paused, 12 seconds 23 centiseconds"), invalidating selectors built on Name-only as soon as the app starts. Caught this in the smoke; selector fallbacks beyond ClassName disambiguation will be needed in slice 6. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
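On the client side, the notification path comes down to one check: a message without an `id` is an event, anything else is a response. A minimal sketch of that routing follows; `NotificationRouter`, `takeResponses`, and the handler signature are hypothetical and not HelperClient's actual API.

```typescript
// Incoming messages on the helper's stdout. A message with no `id` is a
// notification; anything with an `id` is a response to an earlier request.
type Incoming = {
    jsonrpc: "2.0";
    id?: number;
    method?: string;
    params?: unknown;
    result?: unknown;
};

type EventHandler = (method: string, params: unknown) => void;

class NotificationRouter {
    private handlers: EventHandler[] = [];
    private responses: Incoming[] = [];

    // Register a listener; the returned function unregisters it.
    onEvent(handler: EventHandler): () => void {
        this.handlers.push(handler);
        return () => {
            this.handlers = this.handlers.filter((h) => h !== handler);
        };
    }

    dispatch(msg: Incoming): void {
        if (msg.id === undefined && typeof msg.method === "string") {
            for (const handler of this.handlers) handler(msg.method, msg.params);
        } else {
            this.responses.push(msg);
        }
    }

    // Hand queued responses to the request/response layer.
    takeResponses(): Incoming[] {
        const out = this.responses;
        this.responses = [];
        return out;
    }
}
```

A recorder would register one `onEvent` handler and append each event to transitions.jsonl, while responses continue through the normal id-matching path.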

Outer loop with pluggable DecisionOracle:
- capture (treeFingerprint + treeDump → upsertState dedup by fingerprint)
- oracle.decide(input) → ExploreDecision
- execute (dispatch verb to do.* RPCs)
- eventsIdle → recapture → addTransition
- persist incrementally to states.jsonl + transitions.jsonl

State graph persists every iteration (JSONL append + per-state TreeNode JSON + optional screenshots) and rehydrates on construction, so a crashed run can resume from disk.

Budget gates: maxIterations / maxWallClockMs / maxStates / convergenceThreshold (iterations since last new state).

Frontier computation: maps Pattern set → ActionVerb candidates per actionable on-screen control, marks destructive (delete/remove/reset/clear/erase regex), priority-sorts (Button/MenuItem/ListItem first, unstable identifiers later, destructive last).

Stub oracle picks the first non-destructive non-window-management frontier item; smoke shows 6 iterations, 2 distinct states, 5 successful transitions persisted with correct fromStateId→toStateId dedup on revisits. Slice 6b will swap StubOracle for a typechat-backed LLM oracle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
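The budget gates and frontier ordering can be sketched as below. This is one plausible reading of the commit message, not the PR's code: the `Progress` and `FrontierItem` shapes, the numeric ranks, and matching the destructive regex against the control name are assumptions, and the window-management filter used by the stub oracle is left out.

```typescript
type Budget = {
    maxIterations: number;
    maxWallClockMs: number;
    maxStates: number;
    convergenceThreshold: number; // iterations since the last new state
};

type Progress = {
    iterations: number;
    elapsedMs: number;
    states: number;
    iterationsSinceNewState: number;
};

// Returns the name of the first gate that tripped, or undefined to continue.
function budgetExhausted(p: Progress, b: Budget): keyof Budget | undefined {
    if (p.iterations >= b.maxIterations) return "maxIterations";
    if (p.elapsedMs >= b.maxWallClockMs) return "maxWallClockMs";
    if (p.states >= b.maxStates) return "maxStates";
    if (p.iterationsSinceNewState >= b.convergenceThreshold) return "convergenceThreshold";
    return undefined;
}

type FrontierItem = { controlType: string; name: string; verb: string; stableId: boolean };

const DESTRUCTIVE = /delete|remove|reset|clear|erase/i;
const PREFERRED_TYPES = ["Button", "MenuItem", "ListItem"];

// Lower rank sorts first: preferred types, other types, unstable ids, destructive.
function rank(item: FrontierItem): number {
    if (DESTRUCTIVE.test(item.name)) return 3;
    if (!item.stableId) return 2;
    return PREFERRED_TYPES.indexOf(item.controlType) >= 0 ? 0 : 1;
}

function sortFrontier(items: FrontierItem[]): FrontierItem[] {
    return items
        .map((item, index) => ({ item, index }))
        .sort((a, b) => rank(a.item) - rank(b.item) || a.index - b.index)
        .map((entry) => entry.item);
}

// Stub-oracle style choice: first frontier item that is not destructive.
function pickFirstSafe(items: FrontierItem[]): FrontierItem | undefined {
    return sortFrontier(items).filter((item) => !DESTRUCTIVE.test(item.name))[0];
}
```

Sorting destructive items last rather than dropping them keeps them visible to an oracle that may decide to exercise them after a snapshot.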

LlmOracle implements DecisionOracle. ExploreDecision schema (act/stop/restore) lives at exploreLlmSchema.ts and is loaded as text by TypeChat's TypeScript-JSON validator, the same pattern as discoveryLlmSchema. Postbuild copies it into dist alongside the discovery schema. lib/llm.ts gets a getExploreModel() factory tagged "onboarding:explore".

Prompt template includes goal, frontier (rendered as a numbered list with controlType/name/automationId/verbs), recent transitions tail, visited-state ids, and remaining budget. On translation failure the oracle counts consecutive failures and either falls back to the first non-destructive frontier item (single retry) or stops.

Smoke against Windows Clock with budget(maxIterations=8): the model systematically navigates Focus sessions → Timer → Alarm → Stopwatch → World clock, discovering 11 distinct states across 8 successful transitions (no failures). State dedup confirmed via revisit on iter 8 (state-004 shows up again with the same fingerprint).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three TypeChat schemas in synthesisLlmSchema.ts: NeutralStatesClassification (per-state isNeutral + tabOrSection label), ClusteringResult (group chunks by user-intent), and SynthesizedAction (final action with playback recipe, parameters, preconditions). Postbuild copies the schema into dist.

synthesizer.ts pipelines:
1. classifyNeutralStates: one LLM call covering all states, summarized by their actionable controls + window title
2. chunkTransitions: deterministic; cuts the transition log at neutral boundaries, trailing-non-neutral chunks flagged isNeutralEnd=false
3. clusterChunks: one LLM call, intent-naming in camelCase verb-noun
4. synthesizeOneCluster: one LLM call per cluster, builds full PlaybackStep[] with valueRef/valueLiteral params extracted from chunk variations
5. writeOutput: discoveredActions.json (matches schema phraseGen consumes) + synthesisReport.md for the human approval gate

End-to-end smoke: explore Clock (8 iterations, 11 states, 8 transitions) → synthesize (3 actions, all "navigateToTab" by destination, with full selector paths in playback). discoveredActions.json contains valid, replay-ready recipes.

Quality finding: clustering didn't merge functionally identical chunks into one parameterized action. Design called for `navigateToTab(tab: "Focus sessions" | "Timer" | "Alarm" | ...)`, got three separate `navigateToTab` actions split by destination instead. The clustering prompt needs stronger emphasis on parameterization-via-variation (or a two-pass merge step). Mechanism is correct; output quality has room to improve.

Other findings: neutral classification produced sensible per-state labels (focusTab.setup / timerTab.empty / alarmTab.empty / etc.) just from actionable-control summaries; chunks merge correctly when intermediate states are non-neutral (2-step playbacks emerged where exploration crossed two neutrals).

This closes the seven-slice arc: helper → verbs → snapshot → calibration → record → autonomous loop → synthesis. Branch is now a working end-to-end UIA-based onboarding pipeline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
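The deterministic chunkTransitions step is the easiest stage to pin down in code. Below is a minimal sketch, assuming a transition records only `from`, `to`, and `verb`, and that neutral states arrive as a set of state ids; the real types carry more fields.

```typescript
type Transition = { from: string; to: string; verb: string };
type Chunk = { transitions: Transition[]; isNeutralEnd: boolean };

// Cut the transition log every time a transition lands on a neutral state.
// A trailing run that never reaches a neutral state is kept but flagged.
function chunkTransitions(log: Transition[], neutralStates: Set<string>): Chunk[] {
    const chunks: Chunk[] = [];
    let current: Transition[] = [];
    for (const transition of log) {
        current.push(transition);
        if (neutralStates.has(transition.to)) {
            chunks.push({ transitions: current, isNeutralEnd: true });
            current = [];
        }
    }
    if (current.length > 0) {
        chunks.push({ transitions: current, isNeutralEnd: false });
    }
    return chunks;
}
```

Because a chunk only closes on a neutral state, a step that opens a non-neutral state (a dialog, for example) stays in the same chunk as the step that leaves it, which is how multi-step playbacks emerge.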

Built an end-to-end demo proving the design works on Windows Clock:

clockCrawl.ts: full crawl with snapshot/restore safety net.
- inferSnapshotPolicy auto-detects the 3 UWP folders (LocalState + Settings + RoamingState under Microsoft.WindowsAlarms_*)
- captures baseline (82KB)
- drives Clock for 25 iterations against a task-oriented goal (create alarm, create timer, exercise stopwatch...)
- synthesis produces 4 actions with parameters and playback recipes
- restores baseline at the end

playbackExecutor.ts: generic SynthesizedAction → executed steps.
- resolves valueRef/${param} substitution against a runtime params map
- dispatches each step's verb to the corresponding do.* RPC
- waits for UIA idle after invoke/select by default

clockAgentDemo.ts: replays a crawled action with NEW parameters.
- loads discoveredActions.json from the most recent run
- restores Clock to the baseline snapshot (known starting point)
- runs createAlarm({alarmName: "Crawled Demo Alarm", hour: 8, minute: 15})
- verifies a new AlarmViewGrid DataItem named "Edit alarm, Crawled Demo Alarm, 8:15AM, Only once, " appears in the tree
- restores baseline again to leave Clock as we found it

Result: the LLM successfully crawled Clock, the synthesizer extracted correct UIA paths through Popup → EditFlyout → ContentScrollViewer → DurationPicker → HourPicker, and the executor replayed those exact paths to create a brand new alarm with new parameter values not seen during the crawl.

This validates the entire design end to end:
helper → exploration → state graph → synthesis → discoveredActions.json → playback executor → real Windows alarm

Quality findings still standing:
- Clustering didn't merge tab-switching chunks into one parameterized navigateToTab; got 3 by-destination clusters + a mislabeled "12-step navigateToTab" that's actually the create-alarm-then-create-timer full flow.
- Verification predicate in the demo had to use DataItem with AutomationId="AlarmViewGrid", not ListItem with Toggle; alarm rows don't render the way I first guessed. (Synthesis could be enhanced to emit better postcondition assertions.)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
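The valueRef/${param} substitution in the playback executor can be sketched like this. The semantics are an assumption drawn from the field names: `valueRef` is read as "take this parameter verbatim" and `valueLiteral` as "a template that may embed ${param} placeholders"; the actual PlaybackStep type may differ.

```typescript
type ParamValue = string | number | boolean;

type PlaybackStep = {
    verb: string;
    selector: string;
    valueRef?: string; // name of a runtime parameter (assumed meaning)
    valueLiteral?: string; // literal text, may contain ${param} placeholders (assumed)
};

function lookup(params: Record<string, ParamValue>, name: string): string {
    if (!Object.prototype.hasOwnProperty.call(params, name)) {
        throw new Error(`missing parameter: ${name}`);
    }
    return String(params[name]);
}

// Resolve the value a step should send, or undefined for value-less verbs.
function resolveValue(step: PlaybackStep, params: Record<string, ParamValue>): string | undefined {
    if (step.valueRef !== undefined) return lookup(params, step.valueRef);
    if (step.valueLiteral !== undefined) {
        return step.valueLiteral.replace(/\$\{(\w+)\}/g, (_match: string, name: string) =>
            lookup(params, name),
        );
    }
    return undefined;
}
```

Failing loudly on a missing parameter is a deliberate choice in this sketch: silently sending an empty string to a setValue step would create a wrong alarm rather than no alarm.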

…erge

The synthesis pipeline produces dramatically better-shaped actions when the underlying model is a reasoning model and the prompts encode structural rules explicitly:

- getSynthesisModel() defaults to GPT_5. Same for the explore oracle. Note: aiclient's getEnvSetting short-circuits on its empty-string default, so endpoint-suffixed timeout vars must be set explicitly (AZURE_OPENAI_MAX_TIMEOUT_GPT_5); smoke tests handle this in their preamble.
- Tightened prompts in synthesizer.ts:
  * NEUTRAL_RULES: modal/popup/flyout/wizard is NEVER neutral; "Save"-bearing controls are a hard signal; tab landing areas ARE neutral.
  * CLUSTERING_RULES: aggressively merge open→fill→save flows into one cluster; parameterize by variation; toggle-aware (split start/pause despite shared selector); don't emit fragments; aim for few clusters.
  * SYNTHESIS_RULES: use the LONGEST chunk as canonical playback (don't take intersection); declare parameters even from one chunk if value is clearly user-supplied; toggle-aware (one click per logical action, not the repeated count).
- New validation pass (synthesisLlmSchema.ts ValidationResult): GPT-5 reads the full synthesized set and emits per-action verdicts (ok / fragment / duplicate / broken / ambiguous) plus MergeRecommendations for duplicates that should be one parameterized action. Recommendations are applied automatically: target actions are removed, a single parameterized action replaces them.
- mergeIntoWorkspace appends/updates a workspace-level discoveredActions.json so successive crawls accumulate (rather than each run overwriting the canonical set). Per-action merge: longer playback wins, parameter examples union, destructive flags union.

Real-data result on the existing Windows Clock 54-iteration run, before vs after this change:

before (default model + loose prompts): 12 actions, mostly fragments
  confirmAlarm/Clock/Timer (1-step fragments), createAlarm 1-step, setAlarmDetails/Time fragments, startStopwatch with 9 alternating clicks merged

after (GPT-5 + tight prompts + validation):
  addWorldClock(city: string) - 3 steps
  createAlarm(name: string, minutes: number) - 4 steps
  createTimer() - 2 steps
  recordLap() - 1 step
  setStopwatchRunning(running: boolean) - auto-merged from start+pause
  startFocusSession() - 1 step

Also: clockFullCrawl.ts (popup-aware big-budget crawl) and resynthesize.ts (re-runs synthesis on an existing runDir without re-exploring; invaluable for iterating on synthesis prompts cheaply). uiCapture/README.md documents the full pipeline: helper RPCs, explorer loop, synthesis stages, on-disk layout, smoke tests, observed quality patterns.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
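The per-action merge rule (longer playback wins, parameter examples union, destructive flags union) might look roughly like the sketch below. The types are trimmed-down stand-ins, and two details are assumptions of this sketch rather than statements about the PR: ties keep the existing playback, and the merged parameter list follows whichever action supplied the winning playback.

```typescript
type ParamSpec = { name: string; examples: string[] };

type SynthesizedAction = {
    actionName: string;
    playback: string[]; // simplified: one string per step
    parameters: ParamSpec[];
    destructive: boolean;
};

function unionExamples(a: string[], b: string[]): string[] {
    const out = a.slice();
    for (const example of b) {
        if (out.indexOf(example) < 0) out.push(example);
    }
    return out;
}

function mergeAction(existing: SynthesizedAction, incoming: SynthesizedAction): SynthesizedAction {
    // Longer playback wins; on a tie the existing one is kept (assumption).
    const winner = incoming.playback.length > existing.playback.length ? incoming : existing;
    const other = winner === incoming ? existing : incoming;
    const parameters = winner.parameters.map((p) => {
        const match = other.parameters.filter((q) => q.name === p.name)[0];
        return {
            name: p.name,
            examples: match ? unionExamples(p.examples, match.examples) : p.examples.slice(),
        };
    });
    return {
        actionName: existing.actionName,
        playback: winner.playback.slice(),
        parameters,
        destructive: existing.destructive || incoming.destructive,
    };
}

// Accumulate a run's actions into the workspace set, keyed by actionName.
function mergeIntoWorkspace(workspace: SynthesizedAction[], run: SynthesizedAction[]): SynthesizedAction[] {
    const merged = workspace.slice();
    for (const action of run) {
        const index = merged.map((a) => a.actionName).indexOf(action.actionName);
        if (index < 0) merged.push(action);
        else merged[index] = mergeAction(merged[index], action);
    }
    return merged;
}
```

Treating `destructive` as a logical OR errs on the safe side: once any crawl has flagged an action as destructive, a later crawl cannot quietly clear the flag.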

Two reconnaissance modes that catalog what an app supports BEFORE the crawl, so the explore loop can drive specific actions instead of guessing.

reconLlmSchema.ts + tabReconnaissance.ts: simple per-tab variant. Walks the NavView tabs (heuristic: largest cluster of sibling ListItems with SelectionItem pattern), navigates to each, sends screenshot + filtered control tree to a vision LLM, gets back a TabRecon with expectedActions.

iterativeReconLlmSchema.ts + iterativeReconnaissance.ts: multi-turn loop. Per turn: screenshot + tree + already-discovered list go to the vision LLM, which returns newDiscoveries plus a click/back/done decision. Drills INTO modals/dialogs to enumerate their fields, then clicks Cancel to back out. Vastly richer than the per-tab variant because it sees what's BEHIND the buttons, not just on the surface.

getReconModel() defaults to GPT_v (the dedicated vision deployment in this Azure config). GPT-5 deployments here don't accept image_url content type ("API version not supported"); GPT-4o uses a /openai/v1/ URL shape that aiclient's request builder doesn't construct correctly. GPT-v on /openai/deployments/gpt-v/chat/completions just works.

TypeChat wiring fix: image content goes in promptHistory as a prior user message, NOT substituted for the createRequestPrompt result. That way TypeChat's standard schema-instruction wrapper still gets appended, and the model knows to respond in JSON.

Smoke result on Clock (clockIterativeRecon.ts, 20 turns):
→ 34 discovered actions across 5 tabs
→ screen path: Timer → Add timer dialog → Timer → Alarm → Add alarm dialog → Alarm → Stopwatch → World clock → Add location → Focus sessions → ...
→ caught secondary features explore alone wouldn't (keepTimerOnTop, linkSpotify, repeatAlarm with days enum, setAlarmSound enum, etc.)
→ correctly flagged resetStopwatch as destructive
→ properly typed parameters with plausible examples (hour=7, cityName='New York', period='AM')

Some actions are over-decomposed (nameAlarm / setAlarmTime / saveAlarm emitted as separate intents instead of fields of createAlarm). Expect the synthesis pass to roll these up when it sees the actual chunks.

Also adds clockReconCrawl.ts (full pipeline: simple recon → goal-from-recon → crawl → synthesize) and clockIterativeRecon.ts (recon-only smoke for fast feedback).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a "Status" header at the top with what's working and what's left (by priority: TypeAgent integration → synthesis prompts on richer input → selector decay → focused crawl tooling → multi-id selector fallback). Documents the reconnaissance subsystem (both per-tab and iterative variants, vision wiring, model selection). Updates pipeline diagram to show the optional recon phase feeding the explore loop's goal. Adds new findings to the quality observations: GPT-5 for synthesis, GPT-v for vision, aiclient gotchas (URL shape + endpoint-suffixed env-var fallback bug). Updates "adding a new integration" to make iterativeReconnoiter the recommended starting step. Also updates clockReconCrawl.ts to use iterativeReconnoiter (the richer recon variant) instead of the simpler per-tab survey. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Generates a TypeAgent agent package from a workspace's discoveredActions.json:

packages/agents/<name>/
  package.json ← exports agent/manifest + agent/handlers, deps on agent-sdk + onboarding-agent
  tsconfig.json + src/tsconfig.json
  src/<name>Schema.ts ← typed action union + per-action types with parameters mapped from ParamSpec (string/number/boolean/enum literals)
  src/<name>Manifest.json ← schemaType.action = "<Cap>Action"
  src/<name>ActionHandler.ts ← AppAgent.executeAction wires through HelperClient + executePlayback, auto-launches the app on first call
  data/discoveredActions.json ← copied alongside for runtime loading

Public exports surface from onboarding-agent: a new "./uiCapture" subpath entry point in package.json + dist/uiCapture/index.ts that re-exports HelperClient, executePlayback, and the relevant types so generated agents can import { ... } from "onboarding-agent/uiCapture" without depending on internal paths.

Generated handler manages a per-session AgentState with the helper client + tracked app pid + main window selector. ensureClient lazily spawns the helper; ensureAppRunning launches the target AUMID/exePath (taken from the scaffolder's appLaunch option) on first action and re-launches if the prior pid has exited. Each action looks up the SynthesizedAction by actionName and runs executePlayback with the caller's parameters.

Manifest currently omits grammarFile, so the dispatcher falls back to LLM-based translation against the .pas.json schema. Hand-tuned grammar or phraseGen-emitted grammar can be added later.

scaffoldClockAgent.ts is the one-shot CLI wrapper for Windows Clock.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
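To illustrate the ParamSpec → schema mapping, here is a rough sketch of how a scaffolder could emit one per-action type. The `ParamSpec` shape, `emitActionType`, and the exact output layout are hypothetical. The single-line `//` comment and the `{}` for zero-parameter actions mirror the scaffolder fixes reported later in this PR.

```typescript
// Hypothetical ParamSpec shape; the real one lives in the synthesis schema.
type ParamSpec =
    | { name: string; type: "string" | "number" | "boolean" }
    | { name: string; type: "enum"; values: string[] };

// Enums become a union of string literals; other types pass through.
function paramType(p: ParamSpec): string {
    return p.type === "enum" ? p.values.map((v) => JSON.stringify(v)).join(" | ") : p.type;
}

function capitalize(s: string): string {
    return s.charAt(0).toUpperCase() + s.slice(1);
}

// Emit the TypeScript source text for a single action type.
function emitActionType(actionName: string, description: string, params: ParamSpec[]): string {
    const fields = params.map((p) => `        ${p.name}: ${paramType(p)};`);
    const parameters = fields.length > 0 ? `{\n${fields.join("\n")}\n    }` : "{}";
    return [
        `// ${description}`,
        `export type ${capitalize(actionName)}Action = {`,
        `    actionName: ${JSON.stringify(actionName)};`,
        `    parameters: ${parameters};`,
        `};`,
    ].join("\n");
}
```

The full schema file would then export a union of these per-action types as the agent's entry type.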

…tcher

End-to-end pipeline now works through TypeAgent: 'start a timer in clock' → NL→typed action via TypeChat → playback recipe → real Clock UI.

Verified on Windows Clock with two distinct actions:

  $ node packages/cli/bin/run.js run request "start a timer in clock"
  [⏰ windowsClock] Translating 'start a timer in clock' into action 'startTimer'
  [⏰ windowsClock] Executing action windowsClock.startTimer
  Done: startTimer (3 steps)

  $ node packages/cli/bin/run.js run request "show me the timer tab in clock"
  [⏰ windowsClock] Executing action windowsClock.navigateToTimerTab
  Done: navigateToTimerTab (1 steps)

Changes to land the integration:
- Generated windowsClock-agent package via scaffoldUiAgent (4 actions from the latest crawl: navigateToTimerTab, renameTimer, startTimer, setTimerViewMode). Schema, manifest, action handler, package.json, tsconfigs all auto-generated from data/discoveredActions.json.
- Registered windowsClock in defaultAgentProvider/data/config.json and added windowsClock-agent as a workspace dependency in its package.json. Dispatcher loads it the same way as built-in agents.
- Three scaffolder fixes uncovered while integrating:
  * /** ... */ blocks rejected by action-schema-compiler; switched to single-line // comments per action description.
  * Record<string, never> not supported for parameter types; switched to {} for zero-parameter actions.
  * Comments above the entry-type union are rejected; moved auto-generation note out of the way.
- New scaffolder option: appTitleMatch. Each action handler probes app.list for an existing window matching the title before launching; UWP apps can't be launched twice and FlaUI returns "no main window" when they are. Without this fix, the second NL request in a session failed; with it, the handler attaches to the running Clock instance.

Quality issue still standing: the explore phase only covered the Timer tab (28 iter wall-clock-capped at 15 min, the LLM oracle drilled deep on Timer instead of moving on). The crawl produced just 4 actions instead of the ~10-15 implied by reconnaissance's 35 candidates. Fix is either focused per-tab re-crawls (merge logic already in place) or a prompt tweak to make the oracle move on after exhausting a tab.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per-tab focused crawls for Alarm, Stopwatch, World Clock, Focus session (Timer was already covered): snapshot once at start, restore between tabs, run an explore loop with a tightly-scoped goal naming the specific tasks for that tab, synthesize and merge into the workspace's discoveredActions.json. Final restore at the end.

Per-tab budget: 10-18 iterations / 4-5 min wall clock. Each tab's goal includes a short list of concrete tasks (open dialog, fill fields, click commit) and explicitly references the relevant AutomationIds and container patterns. The LLM oracle was very efficient: every tab stopped early via "Goal completed" except Focus (hit max-iterations cap, all 12 transitions still successful).

Crawl-by-crawl results:
  alarm → 9 iter, 8/8 successful, +3 actions → total 7
  stopwatch → 7 iter, 3/6 successful, +3 actions → total 10 (3 fails on the dynamic-name parent Group selector decay we already documented)
  worldclock → 5 iter, 4/4 successful, +2 actions → total 12
  focus → 12 iter, 12/12 successful, +2 actions → total 14

Final 14 actions:
  navigate{Alarm,Stopwatch,Timer,WorldClock,Focus}Tab - 5 tabs, 1 step each
  createAlarm(alarmName, hour, minute) - 5-step flow
  addWorldClock(cityQuery, suggestionItem) - 3-step flow
  setAlarmEnabled(enabled: boolean) - auto-merged from enable/disable
  setStopwatchRunning(running: boolean) - auto-merged from start/pause
  setFocusSessionRunning(running: boolean) - auto-merged from start/pause
  setTimerViewMode(mode: enum) - auto-merged from expand/restore
  recordLap()
  startTimer()
  renameTimer(name: string)

Verified end-to-end through TypeAgent:
  $ run request "in windows clock app, navigate to alarm tab"
  → Done: navigateToAlarmTab (1 steps)
  $ run request "in windows clock, set an alarm for 8:30 named morning"
  → Done: createAlarm (5 steps)
  → real "morning" alarm at 8:30AM appeared in Clock's tree

Quality issues still standing:
- Boolean parameter examples got nonsense values like 'stopwatch' from the merge-recommendation collectExamples heuristic (it derives from action-name suffix). Recipes still work since the boolean is unused in the playback (the merged action still just toggles), but the schema example values are misleading. Fix: pass true/false through proper merge-aware example synthesis.
- createAlarm assumes the Alarm tab is already active. Synthesis correctly extracted the alarm-creation flow but discarded the navigation prefix (it became its own navigateToAlarmTab action). Multi-step user requests through TypeAgent need to either chain navigate-then-create, or the runtime handler needs a precondition step (e.g., auto-call navigateToTab matching the action's tabOrSection). Current workflow: user says navigate first, then act.
- Dispatcher's construction cache is aggressive: phrases like "create an alarm" hit the onboarding-agent's scaffoldAgent action, "go to X" hits excel.navigateToCell, "switch to" hits player.selectDevice. Workaround: include "windows clock" in the request to disambiguate. Real fix: clear the cache or extend the windowsClock agent's NL patterns explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Updates the Status section (now 2026-05-05) to reflect TypeAgent integration shipping end-to-end and the multi-tab crawl producing 14 actions across all 5 Clock tabs verified through the dispatcher.

Adds 3 newly-surfaced issues to the prioritized backlog:
- createAlarm assumes the right tab is already active (synthesis dropped the navigate prefix into its own action; needs runtime handler to auto-navigate or synthesizer to inject the prefix).
- Auto-merged toggle actions get nonsense boolean examples like 'stopwatch' instead of [true, false]; applyMergeRecommendations' collectExamples fallback uses action-name suffix.
- Dispatcher construction cache misroutes common phrasings: "create an alarm" hits onboarding's scaffoldAgent, "go to X" hits excel, "switch to" hits player. Workaround: include "windows clock" in the request. Real fix: clear cache or write explicit .agr.

Plus an item for helper binary bundling once a UI-Automation agent needs to ship outside the dev repo.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…sualStudio agents Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot · 2026-05-05T20:10:07Z

@copilot resolve the merge conflicts in this pull request

Resolved the merge conflicts in commit 750df170. The two conflicts were:

ts/packages/defaultAgentProvider/data/config.json — kept both the windowsClock and visualStudio agent entries.
ts/pnpm-lock.yaml — took the origin/main version and regenerated it to include the windowsClock-agent workspace dependency.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

www.electronjs.org
- Triggering command: /usr/local/bin/node /usr/local/bin/node /home/REDACTED/work/TypeAgent/TypeAgent/ts/node_modules/.pnpm/@electron+rebuild@4.0.3/node_modules/@electron/rebuild/lib/module-type/node-gyp/worker.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

… remove invisible unicode Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/49ceea26-2edf-4b12-8dcf-5ab4e7f5f22c Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/da715f62-6a47-4132-a9eb-ac0c5ceffe20 Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot · 2026-05-05T21:52:49Z

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

…havior Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/067b6b29-26d7-4ab2-b919-536804ba697f Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot · 2026-05-05T23:11:22Z

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/0f8b27b9-e8b7-4348-87cc-f87261e29f81 Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot · 2026-05-06T00:57:08Z

Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

www.electronjs.org
- Triggering command: /usr/local/bin/node /usr/local/bin/node /home/REDACTED/work/TypeAgent/TypeAgent/ts/node_modules/.pnpm/@electron+rebuild@4.0.3/node_modules/@electron/rebuild/lib/module-type/node-gyp/worker.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

- Configure Actions setup steps to set up my environment, which run before the firewall is enabled
- Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

robgruen and others added 21 commits May 3, 2026 20:18

Onboarding agent: grammarGen and scaffolder updates

75087f8

marked UI crawling as experimental

59cf152

added choice for building uiautomation handler directly from shell

c5f9f27

desktop agent will now autobuild if user consents

c71432a

updated clock handler to handle neutral state

342a306

robgruen had a problem deploying to development-fork May 5, 2026 18:45 — with GitHub Actions Error

robgruen marked this pull request as ready for review May 5, 2026 18:45

lint

405bef2

robgruen had a problem deploying to development-fork May 5, 2026 18:46 — with GitHub Actions Error

updated status messaging during build action

fe1b471

robgruen had a problem deploying to development-fork May 5, 2026 20:04 — with GitHub Actions Error

Copilot started work on behalf of robgruen May 5, 2026 20:04

Merge origin/main: resolve conflicts keeping both windowsClock and visualStudio agents

750df17

Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot AI temporarily deployed to development-fork May 5, 2026 20:10 Inactive

Copilot AI had a problem deploying to development-fork May 5, 2026 20:10 Failure

Copilot finished work on behalf of robgruen May 5, 2026 20:10

Copilot started work on behalf of robgruen May 5, 2026 21:07

Fix repo policy check failures: lowercase package name, add metadata, remove invisible unicode

e1297f6

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/49ceea26-2edf-4b12-8dcf-5ab4e7f5f22c Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot finished work on behalf of robgruen May 5, 2026 21:17

Copilot AI had a problem deploying to development-fork May 5, 2026 21:41 Failure

Copilot started work on behalf of robgruen May 5, 2026 21:43

fix: align windowsclock-agent lockfile specifier casing

baefc8a

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/da715f62-6a47-4132-a9eb-ac0c5ceffe20 Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot finished work on behalf of robgruen May 5, 2026 21:52

Copilot AI had a problem deploying to development-fork May 5, 2026 22:32 Failure

Copilot AI temporarily deployed to development-fork May 5, 2026 22:32 Inactive

Copilot started work on behalf of robgruen May 5, 2026 23:01

test(shell): align completion toggle tests with @ command dropdown behavior

039c7ce

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/067b6b29-26d7-4ab2-b919-536804ba697f Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot finished work on behalf of robgruen May 5, 2026 23:11

Copilot AI had a problem deploying to development-fork May 5, 2026 23:25 Failure

Copilot AI temporarily deployed to development-fork May 5, 2026 23:25 Inactive

Copilot AI had a problem deploying to development-fork May 6, 2026 00:29 Failure

Copilot started work on behalf of robgruen May 6, 2026 00:46

test(shell): update completion toggle spec for command dropdown behavior

33a6e98

Agent-Logs-Url: https://github.com/microsoft/TypeAgent/sessions/0f8b27b9-e8b7-4348-87cc-f87261e29f81 Co-authored-by: robgruen <25374553+robgruen@users.noreply.github.com>

Copilot finished work on behalf of robgruen May 6, 2026 00:57

Copilot AI had a problem deploying to development-fork May 6, 2026 00:58 Failure

Copilot AI temporarily deployed to development-fork May 6, 2026 00:58 Inactive


Added UI Automation Crawler to Onboarding Agent #2294
robgruen wants to merge 28 commits into main from dev/robgruen/onboarding_experiment


2 participants
