diff --git a/CHANGELOG.md b/CHANGELOG.md
index fe336d935..d187613ce 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,6 +3,16 @@
 All notable changes to this project will be documented in this file.
 
 
+## [Unreleased]
+
+### What's New
+
+- Adds a `settings.lean: true` user config option (`~/.config/cagent/config.yaml`) that makes the lean TUI the default interface for all interactive runs, so `--lean` no longer needs to be passed each time
+
+### Pull Requests
+
+- [#3181](https://github.com/docker/docker-agent/pull/3181) - feat(tui): add lean user config setting
+
 ## [v1.83.0] - 2026-06-19
 
 This release adds an opt-in sudo askpass flow for shell commands, a headless embedded chat session API, and several bug fixes for cost accounting, session handling, and custom provider model resolution.
diff --git a/docs/guides/go-sdk/index.md b/docs/guides/go-sdk/index.md
index 9f3afbf14..6ac81e40b 100644
--- a/docs/guides/go-sdk/index.md
+++ b/docs/guides/go-sdk/index.md
@@ -31,6 +31,7 @@ docker-agent can be used as a Go library, allowing you to build AI agents direct
 | `pkg/model/provider/*` | Model provider clients                   |
 | `pkg/config/latest`    | Configuration types                      |
 | `pkg/environment`      | Environment and secrets                  |
+| `pkg/embeddedchat`     | Headless chat session for embedding the agent runtime in a custom UI |
 | `pkg/tui/components/toolconfirm` | Tool-confirmation policy: `Decision` enum, `BuildPermissionPattern`, key bindings, and rejection-reason presets. Share this instead of copying the permission-pattern logic. |
 | `pkg/tui/service`      | `StaticSessionState` — a `SessionStateReader` with conservative fixed values, for rendering message/tool views outside the full TUI app. Replaces hand-rolled nine-method stubs. |
 | `pkg/tui/animation`    | `Stopper` / `StopView` — animation lifecycle contract. Call `StopAnimation` on views removed from the UI to prevent leaked tick subscriptions. |
@@ -45,6 +46,128 @@ When building custom UIs on top of docker-agent's TUI primitives, four packages
 - **`pkg/tui/animation`** — implement `animation.Stopper` on any view that owns a tick-based animation. Call `StopAnimation` whenever a view is removed from the UI hierarchy to prevent leaked `time.Tick` subscriptions from firing against a dead view.
 - **`pkg/tui/components/transcript`** — embed the transcript view for displaying conversation history. Use the `Messages()` method to read the current slice of transcript messages (treat as read-only — mutations desync renders). This is useful for host-side tests asserting on chat history, and for persistence layers that need to snapshot conversation state.
 
+## Headless Embedded Chat (`pkg/embeddedchat`)
+
+`pkg/embeddedchat` is a thin wrapper around the docker-agent runtime that lets you drive an agent from your own UI instead of running docker-agent's Bubble Tea application. It handles runtime construction, event projection, and conversation state, exposing a simple `Send` / `Confirm` / `Restart` / `Close` API.
+
+### Creating a session
+
+```go
+import (
+    "context"
+    "fmt"
+    "strings"
+
+    dagentcfg "github.com/docker/docker-agent/pkg/config"
+    dagentruntime "github.com/docker/docker-agent/pkg/runtime"
+    "github.com/docker/docker-agent/pkg/embeddedchat"
+)
+
+chat, err := embeddedchat.New(ctx, embeddedchat.Config{
+    // AgentSource can be a file path, raw YAML bytes (as here, from the caller's agentYAML), or an OCI reference.
+    AgentSource: dagentcfg.NewBytesSource("agent", []byte(agentYAML)),
+})
+if err != nil {
+    return err
+}
+defer chat.Close()
+```
+
+### Sending a message and reading events
+
+`Send` appends the user message to the conversation and returns a channel of `Event` values. Drain the channel until it closes.
+
+```go
+events, err := chat.Send(ctx, "Hello! What can you do?")
+if err != nil {
+    return err
+}
+
+var response strings.Builder
+for ev := range events {
+    switch {
+    case ev.Text != "":
+        response.WriteString(ev.Text)
+    case ev.Tool != nil && ev.Tool.NeedsConfirmation:
+        // Approve the pending tool call (use ResumeApproveSession to allow all).
+        if err := chat.Confirm(ctx, dagentruntime.ResumeApprove()); err != nil {
+            return err
+        }
+    case ev.Tool != nil && ev.Tool.Finished:
+        fmt.Printf("[tool %s finished]\n", ev.Tool.Def.Name)
+    case ev.Err != nil:
+        fmt.Printf("error: %v\n", ev.Err)
+    case ev.Done:
+        fmt.Println("\n[turn complete]")
+    }
+}
+fmt.Print(response.String())
+```
+
+### Restarting the conversation
+
+To start a fresh conversation without recreating the runtime:
+
+```go
+if err := chat.Restart(); err != nil {
+    return err
+}
+```
+
+### Event types
+
+| Field                    | When set                                                            |
+| ------------------------ | ------------------------------------------------------------------- |
+| `Text`                   | Assistant text delta; accumulate into a string for the full reply.  |
+| `Tool`                   | A tool call started, needs confirmation, or finished.               |
+| `Tool.NeedsConfirmation` | Runtime is blocked until `Confirm` is called.                       |
+| `Tool.Finished`          | Tool call completed; `Tool.IsError` is true if it errored.          |
+| `Err`                    | A user-facing runtime error; no further content events follow.      |
+| `Done`                   | Clean end of turn; no more events.                                  |
+| `RuntimeEvent`           | The original `runtime.Event` for callers that need the full stream. |
+
+For advanced use (custom elicitation, raw event inspection), call `chat.Runtime()` to access the underlying `runtime.Runtime` directly.
+
+## Optional Provider Build Tags
+
+By default, docker-agent includes all four cloud providers (OpenAI, Anthropic, Google, Amazon Bedrock). When embedding docker-agent in your own binary, you can compile out unneeded providers, together with their transitive SDK dependencies, to reduce binary size.
+
+Each provider is gated by a negative build tag prefixed with `docker_agent_` to avoid collisions with your own project's tags:
+
+| Build tag                    | Provider dropped         | Major dependency removed                          |
+| ---------------------------- | ------------------------ | ------------------------------------------------- |
+| `docker_agent_no_openai`     | OpenAI                   | `github.com/openai/openai-go`                     |
+| `docker_agent_no_anthropic`  | Anthropic                | `github.com/anthropics/anthropic-sdk-go` (partial — see note) |
+| `docker_agent_no_google`     | Google / Vertex AI       | `google.golang.org/genai`, Vertex auth stack, and indirectly the Anthropic and OpenAI SDKs via Vertex Model Garden |
+| `docker_agent_no_bedrock`    | Amazon Bedrock           | `github.com/aws/aws-sdk-go-v2` stack (the largest provider dependency tree) |
+
+To build without Bedrock and OpenAI:
+
+```bash
+go build -tags 'docker_agent_no_bedrock docker_agent_no_openai' ./...
+```
+
+Requesting a model whose provider was compiled out fails at construction time with a clear `"not compiled into this build"` error. The `dmr` (Docker Model Runner) provider and the rule-based router are always compiled in.
+
+<div class="callout callout-warning" markdown="1">
+<div class="callout-title">Anthropic + Google dependency</div>
+  <p>The Google provider's Vertex Model Garden support also imports the Anthropic SDK, so the Anthropic dependency is only fully removed when <em>both</em> <code>docker_agent_no_anthropic</code> and <code>docker_agent_no_google</code> are set.</p>
+</div>
+
+## RAG Toolset (cgo-free builds)
+
+The RAG toolset (`type: rag`) uses a tree-sitter code parser that requires cgo. When building without cgo — or when you want to drop the cgo dependency entirely — do not import the `pkg/rag` package in your binary.
+
+By default the RAG toolset is **opt-in**: it is only linked when you blank-import its package:
+
+```go
+import (
+    _ "github.com/docker/docker-agent/pkg/tools/builtin/rag" // register RAG toolset
+)
+```
+
+Without this import, a config that declares `type: rag` fails with a "toolset type not registered" error at startup. If your application does not use RAG, simply omit the blank import; the rest of docker-agent works without cgo.
+
 ## Basic Example
 
 Create a simple agent and run it:
diff --git a/docs/guides/thinking/index.md b/docs/guides/thinking/index.md
index 88308caab..f1b2ec2da 100644
--- a/docs/guides/thinking/index.md
+++ b/docs/guides/thinking/index.md
@@ -84,9 +84,9 @@ models:
 
 docker-agent auto-adjusts `max_tokens` when you set a thinking budget but leave `max_tokens` at its default. If you set `max_tokens` explicitly, it must be greater than `thinking_budget`.
 
-### Adaptive thinking (Claude Opus 4.6+)
+### Adaptive thinking (Opus 4.6+ and Sonnet 4.6)
 
-Newer Claude models support adaptive thinking, where the model decides how much to think. **Claude Opus 4.6, 4.7 and 4.8 only support adaptive thinking** — they reject token-based budgets. Use `adaptive`, `adaptive/<effort>`, or a bare effort level — on Anthropic, a bare effort level like `high` is shorthand for adaptive thinking at that effort:
+Newer Claude models support adaptive thinking, where the model decides how much to think. **Claude Opus 4.6, 4.7, 4.8, and Sonnet 4.6 only support adaptive thinking** — they reject token-based budgets. Use `adaptive`, `adaptive/<effort>`, or a bare effort level — on Anthropic, a bare effort level like `high` is shorthand for adaptive thinking at that effort:
 
 ```yaml
 models:
@@ -106,21 +106,22 @@ models:
     thinking_budget: adaptive/max      # adaptive/low | adaptive/medium | adaptive/high | adaptive/xhigh | adaptive/max
 ```
 
-**Adaptive effort levels:**
+**Adaptive effort levels and per-model support:**
 
-| Level     | Description                                       |
-| --------- | ------------------------------------------------- |
-| `minimal` | Treated as `low` (bare form only).                |
-| `low`     | Minimal thinking; fastest adaptive mode.          |
-| `medium`  | Moderate effort.                                  |
-| `high`    | Thorough reasoning; default for `adaptive`.       |
-| `xhigh`   | Very high effort (newer models, e.g. Opus 4.7+).  |
-| `max`     | Maximum effort.                                   |
+| Level     | Opus 4.5 | Sonnet 4.5 / Haiku | Sonnet 4.6 | Opus 4.6 | Opus 4.7 / 4.8 | Fable 5 | Mythos 5 | Mythos preview |
+| --------- | :------: | :----------------: | :--------: | :------: | :------------: | :-----: | :------: | :------------: |
+| `low`     | ✓        | ✓                  | ✓          | ✓        | ✓              | ✓       | ✓        | ✓              |
+| `medium`  | ✓        | ✓                  | ✓          | ✓        | ✓              | ✓       | ✓        | ✓              |
+| `high`    | ✓        | ✓                  | ✓          | ✓        | ✓              | ✓       | ✓        | ✓              |
+| `xhigh`   | —        | —                  | —          | —        | ✓              | ✓       | ✓        | —              |
+| `max`     | —        | —                  | ✓          | ✓        | ✓              | ✓       | ✓        | ✓              |
+
+`minimal` is treated as `low` (bare form only). `high` is the default when `adaptive` is used without an effort level.
 
 <div class="callout callout-warning" markdown="1">
 <div class="callout-title">Effort strings require adaptive-capable models
 </div>
-  <p>Every string effort value on Anthropic is sent as adaptive thinking (<code>output_config.effort</code>), which only newer Claude models (Opus 4.6+) accept. For older models like Sonnet 4.5, use an integer token budget instead. Conversely, models that <em>only</em> support adaptive thinking (Opus 4.6, 4.7, 4.8) automatically have token budgets coerced to <code>adaptive</code> (a warning is logged).</p>
+  <p>Every string effort value on Anthropic is sent as adaptive thinking (<code>output_config.effort</code>), which only newer Claude models (Opus 4.6+, Sonnet 4.6) accept. For older models like Sonnet 4.5, use an integer token budget instead. Conversely, models that <em>only</em> support adaptive thinking (Opus 4.6, 4.7, 4.8, Sonnet 4.6) automatically have token budgets coerced to <code>adaptive</code> (a warning is logged).</p>
 </div>
 
 ### Disabling thinking
@@ -332,7 +333,7 @@ models:
 
 While running in the TUI, press **Shift+Tab** to cycle the thinking effort level for the current model without editing your YAML config:
 
-- The level steps through the model's supported range (model-specific), wrapping around — for example `none → minimal → low → medium → high → none` on OpenAI gpt-5/o-series, `none → minimal → low → medium → high → xhigh → none` on gpt-5.2+, `none → low → medium → high → max → none` on Anthropic Opus 4.6, and `none → low → medium → high → xhigh → none` on Anthropic Opus 4.7+. For older Anthropic models (e.g. Sonnet 4.5) that only accept token budgets, effort-string cycling has no effect — use an integer `thinking_budget` in your YAML config instead.
+- The level steps through the model's supported range (model-specific), wrapping around — for example `none → minimal → low → medium → high → none` on OpenAI gpt-5/o-series, `none → minimal → low → medium → high → xhigh → none` on gpt-5.2+, `none → low → medium → high → max → none` on Anthropic Opus 4.6 and Sonnet 4.6, and `none → low → medium → high → xhigh → max → none` on Anthropic Opus 4.7+, Fable 5, and Mythos 5. For older Anthropic models (e.g. Sonnet 4.5) that only accept token budgets, effort-string cycling has no effect — use an integer `thinking_budget` in your YAML config instead.
 - The current level is shown in the sidebar next to the model name (e.g. `openai/gpt-5 • high`).
 - This applies as a session override — it is **not** saved to the config file. The next session starts from the level defined in your YAML.
 - For models that don't support reasoning, and for remote runtimes, Shift+Tab is a no-op and an informational message is displayed.
diff --git a/docs/providers/anthropic/index.md b/docs/providers/anthropic/index.md
index c018d1e14..df0575c6b 100644
--- a/docs/providers/anthropic/index.md
+++ b/docs/providers/anthropic/index.md
@@ -116,7 +116,7 @@ models:
     thinking_budget: 16384 # must be < max_tokens
 ```
 
-**Adaptive / effort-based** (Claude Opus 4.6+ only — every string value is sent as adaptive thinking via `output_config.effort`):
+**Adaptive / effort-based** (Claude Opus 4.6+, Sonnet 4.6 — every string value is sent as adaptive thinking via `output_config.effort`):
 
 ```yaml
 models:
@@ -131,7 +131,7 @@ models:
     thinking_budget: high # low | medium | high | xhigh | max (same as adaptive/<effort>)
 ```
 
-On models that reject token-based thinking (Opus 4.6, 4.7, 4.8), an integer budget is automatically coerced to `adaptive` with a logged warning. See the [Thinking / Reasoning guide]({{ '/guides/thinking/' | relative_url }}) for the full cross-provider reference.
+On models that reject token-based thinking (Opus 4.6, 4.7, 4.8, Sonnet 4.6), an integer budget is automatically coerced to `adaptive` with a logged warning. See the [Thinking / Reasoning guide]({{ '/guides/thinking/' | relative_url }}) for the full cross-provider reference.
 
 ## Interleaved Thinking