diff --git a/content/en/docs/eino/Cookbook.md b/content/en/docs/eino/Cookbook.md index 0485c508545..c2230232d09 100644 --- a/content/en/docs/eino/Cookbook.md +++ b/content/en/docs/eino/Cookbook.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: Cookbook @@ -63,6 +63,27 @@ This document serves as an example index for the eino-examples project, helping adk/multiagent/integration-excel-agentExcel Agent (ADK Integration)ADK integrated Excel Agent, including Planner, Executor, Replanner, Reporter +### Agent + + + + +
+| Directory | Name | Description |
+| --- | --- | --- |
+| adk/agent/ralph-loop | Ralph Loop | Autonomous iteration pattern: external `for` loop with `Runner.Run` for single-round iteration, Agent perceives prior work through filesystem, validation gate checks BUG markers before accepting completion claims |
+ +### Cancel + + + + +
+| Directory | Name | Description |
+| --- | --- | --- |
+| adk/cancel/graceful-exit | Graceful Exit | Demonstrates Agent Cancel + Resume: captures terminal signals then cancels nested Agent with `CancelAfterChatModel` + `WithRecursive` mode, waits for safe point to save Checkpoint, then resumes execution |
+ +### Middlewares + + + + +
+| Directory | Name | Description |
+| --- | --- | --- |
+| adk/middlewares/skill | Skill Middleware | Loads Agent skills from filesystem (e.g., log_analyzer), demonstrating skill middleware usage |
+ ### GraphTool diff --git a/content/en/docs/eino/FAQ.md b/content/en/docs/eino/FAQ.md index d75225d43f8..f43462c6e3b 100644 --- a/content/en/docs/eino/FAQ.md +++ b/content/en/docs/eino/FAQ.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-21" lastmod: "" tags: [] title: FAQ -weight: 11 +weight: 10 --- # Q: cannot use openapi3.TypeObject (untyped string constant "object") as *openapi3.Types value in struct literal, cannot use types (variable of type string) as *openapi3.Types value in struct literal @@ -13,11 +13,7 @@ Check that the github.com/getkin/kin-openapi dependency version does not exceed # Q: Agent streaming calls do not enter the ToolsNode node. Or streaming effect is lost, behaving as non-streaming. -- First update Eino to the latest version - -Different models may output tool calls differently in streaming mode: some models (like OpenAI) output tool calls directly; some models (like Claude) output text first, then output tool calls. Therefore, different methods need to be used for judgment. This field is used to specify the function that determines whether the model's streaming output contains tool calls. - -The Config of ReAct Agent has a StreamToolCallChecker field. If not filled, the Agent will use "non-empty packet" to determine whether it contains tool calls: +- First update Eino to the latest version. Different models may output tool calls differently in streaming mode: some models (like OpenAI) output tool calls directly; some models (like Claude) output text first, then output tool calls. Therefore, different methods need to be used for judgment. This field is used to specify the function that determines whether the model's streaming output contains tool calls. The Config of ReAct Agent has a StreamToolCallChecker field. 
If not filled, the Agent will use "non-empty packet" to determine whether it contains tool calls: ```go func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -45,9 +41,7 @@ func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[ } ``` -The above default implementation is suitable for: Tool Call Messages output by the model only contain Tool Calls. - -Cases where the default implementation is not applicable: there are non-empty content chunks before outputting Tool Calls. In this case, a custom tool call checker is needed: +The above default implementation is suitable for: Tool Call Messages output by the model only contain Tool Calls. Cases where the default implementation is not applicable: there are non-empty content chunks before outputting Tool Calls. In this case, a custom tool call checker is needed: ```go toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -74,9 +68,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes The above custom StreamToolCallChecker needs to check **all packets** for ToolCall when the model normally outputs an answer, which causes the "streaming judgment" effect to be lost. To preserve the "streaming judgment" effect as much as possible, the suggestion is: > 💡 -> Try adding prompts to constrain the model not to output extra text when calling tools, for example: "If you need to call a tool, output the tool directly without outputting text." -> -> Different models may be affected differently by prompts. In actual use, you need to adjust the prompts yourself and verify the effect. +> Try adding prompts to constrain the model not to output extra text when calling tools, for example: "If you need to call a tool, output the tool directly without outputting text." Different models may be affected differently by prompts. 
In actual use, you need to adjust the prompts yourself and verify the effect. # Q: [github.com/bytedance/sonic/loader](http://github.com/bytedance/sonic/loader): invalid reference to runtime.lastmoduledatap @@ -91,17 +83,15 @@ Currently, models generally do not produce illegal JSON output. First confirm wh Eino currently does not support batch processing. There are two optional methods: 1. Dynamically build the graph on demand for each request, with low additional cost. Note that Chain Parallel requires more than one parallel node. -2. Custom batch processing node, where the node handles batch processing tasks internally - -Code example: [https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) +2. Custom batch processing node, where the node handles batch processing tasks internally. Code example: [https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) # Q: Does Eino support structured model output? Two steps. First, require the model to output structured data, with three methods: 1. Some models support direct configuration (like OpenAI's response format). Check if there's such configuration in the model settings. -2. Obtain through tool call functionality -3. Write prompts requiring the model to output structured data +2. Obtain through tool call functionality. +3. Write prompts requiring the model to output structured data. After getting structured output from the model, you can use schema.NewMessageJSONParser to convert the message to the struct you need. @@ -115,14 +105,8 @@ Discussion by case: 1. context.canceled: When executing a graph or agent, the user passed in a cancelable context and initiated a cancellation. Check the context cancel operation in the application layer code. This error is unrelated to the Eino framework. 2. Context deadline exceeded: Could be two situations: - 1. 
When executing a graph or agent, the user passed in a context with timeout, triggering a timeout. - 2. Timeout or httpclient with timeout was configured for ChatModel or other external resources, triggering a timeout. - -Check `node path: [node name x]` in the thrown error. If the node name is not a node with external calls like ChatModel, it's most likely situation 2-a; otherwise, it's most likely situation 2-b. - -If you suspect it's situation 2-a, check which link in the upstream chain set the timeout on context. Common possibilities include FaaS platforms, etc. - -If you suspect it's situation 2-b, check whether the node has its own timeout configuration, such as Ark ChatModel configured with Timeout, or OpenAI ChatModel configured with HttpClient (with internal Timeout configuration). If neither is configured but still timing out, it may be the model SDK's default timeout. Known default timeouts: Ark SDK 10 minutes, Deepseek SDK 5 minutes. +3. When executing a graph or agent, the user passed in a context with timeout, triggering a timeout. +4. Timeout or httpclient with timeout was configured for ChatModel or other external resources, triggering a timeout. Check `node path: [node name x]` in the thrown error. If the node name is not a node with external calls like ChatModel, it's most likely situation 2-a; otherwise, it's most likely situation 2-b. If you suspect it's situation 2-a, check which link in the upstream chain set the timeout on context. Common possibilities include FaaS platforms, etc. If you suspect it's situation 2-b, check whether the node has its own timeout configuration, such as Ark ChatModel configured with Timeout, or OpenAI ChatModel configured with HttpClient (with internal Timeout configuration). If neither is configured but still timing out, it may be the model SDK's default timeout. Known default timeouts: Ark SDK 10 minutes, Deepseek SDK 5 minutes. 
# Q: How to get the parent graph's State in a subgraph @@ -138,37 +122,268 @@ The latest version of Eino introduces UserInputMultiContent and AssistantGenMult # Q: After upgrading to version 0.6.x, there are incompatibility issues -According to the previous community announcement plan [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397), Eino V0.6.1 has been released. Important update content includes removing the getkin/kin-openapi dependency and all OpenAPI 3.0 related code. - -For errors like undefined: schema.NewParamsOneOfByOpenAPIV3 in some eino-ext modules, upgrade the error-reporting eino-ext module to the latest version. +According to the previous community announcement plan [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397), Eino V0.6.1 has been released. Important update content includes removing the getkin/kin-openapi dependency and all OpenAPI 3.0 related code. For errors like undefined: schema.NewParamsOneOfByOpenAPIV3 in some eino-ext modules, upgrade the error-reporting eino-ext module to the latest version. If schema transformation is complex, you can use the tool methods in the [JSONSchema conversion methods](https://bytedance.larkoffice.com/wiki/ZMaawoQC4iIjNykzahwc6YOknXf) document to assist with conversion. -If schema transformation is complex, you can use existing OpenAPI 3.0 → JSONSchema conversion tools to assist with conversion. +> 💡 -# Q: Which models provided by Eino-ext ChatModel support Response API format calls? +# Q: After creating a model, attempting model calls results in error: 400 Bad Request, message: code: missing_required_parameter; message: Missing required parameter: 'input'. -- Currently in Eino-Ext, only ARK's Chat Model can create ResponsesAPI ChatModel through **NewResponsesAPIChatModel**. 
Other models currently do not support ResponsesAPI creation and usage. - - If you encounter this error, confirm whether the base URL you used to create the chat model is the Chat Completions URL or the Responses API URL. In most cases, an incorrect Responses API base URL was passed. +- If you encounter this error, confirm whether the base URL you used to create the chat model is the Chat Completions URL or the Responses API URL. In most cases, an incorrect Responses API base URL was passed. # Q: How to troubleshoot ChatModel call errors? For example, [NodeRunError] failed to create chat completion: error, status code: 400, status: 400 Bad Request. -This type of error is an error from the model API (such as GPT, Ark, Gemini, etc.). The general approach is to check whether the actual HTTP Request calling the model API has missing fields, incorrect field values, wrong BaseURL, etc. It's recommended to print out the actual HTTP Request through logs and verify/modify the HTTP Request through direct HTTP request methods (such as sending Curl from command line or using Postman for direct requests). After locating the problem, modify the corresponding issues in the Eino code accordingly. - -For how to print out the actual HTTP Request of the model API through logs, refer to this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) +This type of error is an error from the model API (such as GPT, Ark, Gemini, etc.). The general approach is to check whether the actual HTTP Request calling the model API has missing fields, incorrect field values, wrong BaseURL, etc. It's recommended to print out the actual HTTP Request through logs and verify/modify the HTTP Request through direct HTTP request methods (such as sending Curl from command line or using Postman for direct requests). 
After locating the problem, modify the corresponding issues in the Eino code accordingly. For how to print out the actual HTTP Request of the model API through logs, refer to this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) # Q: The gemini chat model created under the eino-ext repository doesn't support using Image URL to pass multimodal data? How to adapt? Currently, the gemini Chat model under the Eino-ext repository has already added support for passing URL types. Use go get github.com/cloudwego/eino-ext/components/model/gemini to update to [components/model/gemini/v0.1.22](https://github.com/cloudwego/eino-ext/releases/tag/components%2Fmodel%2Fgemini%2Fv0.1.22), the current latest version. Test passing Image URL to see if it meets business requirements. -# Q: Before calling tools (including MCP tool), getting JSON Unmarshal failure error, how to solve +# Q: Tool Calls generated by the model have issues (invalid JSON arguments, calling non-existent tools, parameter name changes, etc.), how to handle? + +Tool Calls generated by models (LLMs) may have various issues. Eino provides multi-layered defense mechanisms to address them. Below are introductions by problem type: + +## 1. Tool Call Arguments Are Not Valid JSON (Unmarshal Failure) + +**Typical Error:** `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` **Root Cause:** In the Tool Call generated by ChatModel, the Argument field is a string. Eino performs JSON Unmarshal before calling the tool. If the JSON output by the model is invalid (extra prefix/suffix, special character escaping, missing braces, over-length truncation, etc.), it will throw an error. 
**Solution A: ToolArgumentsHandler (Recommended)** Configure `ToolArgumentsHandler` in `ToolsNodeConfig` (or ADK's `ToolsConfig`) to preprocess and fix arguments before tool execution: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: tools, + ToolArgumentsHandler: func(ctx context.Context, name, arguments string) (string, error) { + // Fix common JSON format issues here, such as missing braces, extra prefixes, etc. + return fixJSON(arguments), nil + }, + }, + }, +}) +``` + +A reference implementation for JSON fixing: [eino-examples/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) **Execution Order:** `ArgumentsAliases replacement → ToolArgumentsHandler → Tool execution` + +## 2. Model Calls Non-existent Tools (Tool Name Hallucination) + +**Typical Error:** `tool xxx not found in toolsNode indexes` **Root Cause:** The model may "hallucinate" non-existent tool names. **Solution: UnknownToolsHandler** When configured, if the model calls a non-existent tool, instead of throwing an error directly, the Handler returns a prompt text for the model to self-correct: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + UnknownToolsHandler: func(ctx context.Context, name, input string) (string, error) { + return fmt.Sprintf("Tool '%s' does not exist. Available tools: %s. Please retry.", name, availableToolNames), nil + }, +} +``` + +## 3. Tool Name or Parameter Name Changes (Compatibility Issues from Schema Migration) + +**Scenario:** Tool renamed (e.g., `search` → `web_search`), or parameter field renamed (e.g., `q` → `query`), but the model may still use old names. This is especially common when using LLM Cache or when old tool schemas are recorded in conversation history. 
**Solution: ToolAliases** Configure name aliases and parameter aliases for tools; the framework automatically resolves them during dispatch: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolAliases: map[string]compose.ToolAliasConfig{ + "web_search": { + NameAliases: []string{"search", "web-search"}, // old tool name → current tool name + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, // old param name → current param name + }, + }, + }, +} +``` + +> 💡 +> ToolAliases parameter alias replacement occurs before ToolArgumentsHandler. The complete execution order is: Name Alias resolution → Arguments Alias replacement → ToolArgumentsHandler → Tool execution. + +## 4. Let the Model Self-correct After Tool Execution Failure (Instead of Interrupting the Flow) + +**Scenario:** When Tool execution encounters an error (e.g., file not found, insufficient permissions, API call failure), the default behavior is to interrupt the Agent flow. However, a better approach is usually to return the error information as a normal Tool Result to the model, allowing the model to automatically correct and retry. **Solution A: ADK Middleware (WrapInvokableToolCall)** In ADK Agent, use `ChatModelAgentMiddleware`'s `WrapInvokableToolCall` method to convert errors to string results: + +```go +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err // Don't convert interrupt errors + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +Reference: [quickstart/chatwitheino Ch05 Middleware](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) **Solution B: compose layer ToolCallMiddlewares** Use `ToolCallMiddlewares` directly at the compose layer, suitable for scenarios using Graph/ToolsNode directly: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolCallMiddlewares: []compose.ToolMiddleware{ + { + Invokable: func(next compose.InvokableToolEndpoint) compose.InvokableToolEndpoint { + return func(ctx context.Context, in *compose.ToolInput) (*compose.ToolOutput, error) { + output, err := next(ctx, in) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + return &compose.ToolOutput{Result: fmt.Sprintf("[tool error] %v", err)}, nil + } + return output, nil + } + }, + }, + }, +} +``` + +Reference: [eino-examples/components/tool/middlewares/errorremover](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/errorremover) + +> 💡 +> Note: When converting errors, you must first check `compose.IsInterruptRerunError`. InterruptRerun errors are control flow signals used by the framework for Human-in-the-loop and similar scenarios, and should not be swallowed. -The Argument field in Tool Call generated by ChatModel is a string. When the Eino framework calls tools based on this Argument string, it first does JSON Unmarshal. 
At this point, if the Argument string is not valid JSON, JSON Unmarshal will fail, throwing an error like: `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` +## Summary -The fundamental solution to this problem is to rely on the model to output valid Tool Call Arguments. Engineering-wise, we can try to fix some common JSON format issues, such as extra prefixes/suffixes, special character escaping issues, missing braces, etc., but cannot guarantee 100% correction. A similar fix implementation can be referenced in this code example: [https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) +
+| Problem | Mechanism | Configuration Location |
+| --- | --- | --- |
+| Invalid JSON arguments | `ToolArgumentsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
+| Calling non-existent tools | `UnknownToolsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
+| Tool name/param name changes | `ToolAliases` | `ToolsNodeConfig` / `ToolsConfig` |
+| Tool execution errors need auto-correction | Middleware error conversion | ADK `Handlers` or `ToolCallMiddlewares` |
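
The `ToolArgumentsHandler` example in section 1 calls `fixJSON(arguments)` without defining it. The following is a minimal sketch of what such a helper might look like, using only the standard library. It is an illustration under stated assumptions, not the implementation in the linked `jsonfix` example: `fixJSON` and `unclosedBraces` are hypothetical names, and the repair covers only extra text around the object and missing closing braces. Input truncated inside a string value is returned unchanged, so correction is not guaranteed.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// unclosedBraces counts '{' that have no matching '}', ignoring braces inside JSON strings.
func unclosedBraces(s string) int {
	depth, inString, escaped := 0, false, false
	for i := 0; i < len(s); i++ {
		c := s[i]
		switch {
		case escaped:
			escaped = false // the character after a backslash is consumed as-is
		case inString && c == '\\':
			escaped = true
		case c == '"':
			inString = !inString
		case !inString && c == '{':
			depth++
		case !inString && c == '}':
			depth--
		}
	}
	return depth
}

// fixJSON is a hypothetical best-effort repair helper.
// It returns the original input whenever no valid JSON can be recovered.
func fixJSON(arguments string) string {
	s := strings.TrimSpace(arguments)
	if json.Valid([]byte(s)) {
		return s
	}
	// Drop any prefix before the first '{', such as a code fence or explanatory text.
	start := strings.Index(s, "{")
	if start < 0 {
		return arguments
	}
	s = s[start:]
	// Drop any suffix after the last '}'.
	if end := strings.LastIndex(s, "}"); end >= 0 {
		if candidate := s[:end+1]; json.Valid([]byte(candidate)) {
			return candidate
		}
	}
	// Append closing braces that the model omitted.
	if missing := unclosedBraces(s); missing > 0 {
		if candidate := s + strings.Repeat("}", missing); json.Valid([]byte(candidate)) {
			return candidate
		}
	}
	return arguments
}

func main() {
	fmt.Println(fixJSON("```json\n{\"city\": \"Shenzhen\"}\n```"))
	fmt.Println(fixJSON(`{"city": "Shenzhen"`))
	fmt.Println(fixJSON(`{"filter": {"date": "tomorrow"}`))
}
```

Returning the original string on failure lets the tool's own JSON decoding report the error, which can then be handed back to the model through the error-conversion middleware described in section 4.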
# Q: How to visualize the topology structure of a graph/chain/workflow? Use the `GraphCompileCallback` mechanism to export the topology structure during `graph.Compile`. A code example for exporting as a mermaid diagram: [https://github.com/cloudwego/eino-examples/tree/main/devops/visualize](https://github.com/cloudwego/eino-examples/tree/main/devops/visualize) +# Q: How to get the Tool Call Message and Tool Result in a Flow/ReAct Agent scenario in Eino? + - For obtaining intermediate structures in Flow/React Agent scenarios, refer to the document [Eino: ReAct Agent Manual](/docs/eino/core_modules/flow_integration_components/react_agent_manual) +- Additionally, you can replace Flow/React Agent with ADK's ChatModel Agent. For details, refer to [Eino ADK: Overview](/docs/eino/core_modules/eino_adk/agent_preview) + +# Q: When developing an Agent with Eino, I defined a tool (Tool) that requires no parameters. Why do I encounter JSON Schema validation failures (like `unknown msg type` or unsupported format) when calling certain LLMs? How to properly resolve this? + +**A: Root Cause:** In the Function Calling / tool calling ecosystem, many LLM providers have strict format validation logic for the JSON Schema they receive. If when defining a parameterless tool, the developer mistakenly passes an empty parameter mapping or empty struct (e.g., causing the framework to generate `{"type": "object", "properties": {}}` - a syntactically valid but meaningless Schema), some model validation engines will judge it as an unexpected abnormal format and directly reject the request. **Framework Mechanism and Code Behavior:** + +- In Eino framework's core definition (`eino/schema/tool.go`), the `schema.ToolInfo` struct specifically uses the `ParamsOneOf` field to describe parameters. +- The framework design explicitly allows: for tools that don't need parameters, `ParamsOneOf` should be `nil`. 
+- When `ParamsOneOf` is `nil`, Eino's underlying components directly omit the tool's `parameters` field when building requests to various model Providers, fundamentally avoiding triggering the model's strict validation rules. **Best Practice:** When constructing parameterless tools in Eino, **do not use empty structs or empty Maps to initialize parameter descriptions**. Simply let `ParamsOneOf` remain at its default `nil` state. + +```go +tool := &schema.ToolInfo{ + Name: "fetch_current_time", + Desc: "Get current system time, no parameters needed", + // Best practice: explicitly set to nil, or simply don't declare this field + ParamsOneOf: nil, +} +``` + +**(Note: If using **utils.InferTool** or similar reflection-based derivation tools, and the input is an empty struct, ensure that the Eino extension version being used correctly handles filtering of empty properties, or consider manually overriding its parameter definition as needed.)** + +# Q: How to get Session Values outside an Agent (e.g., deep agent's TODOs)? + +In ADK, `adk.GetSessionValues(ctx)` and `adk.AddSessionValue(ctx, key, value)` depend on the `runSession` injected into the context during Agent execution. This means they **can only be used within the Agent's execution context** — for example, in Middleware, Handler, or Tool callback functions. When a user obtains an `AsyncIterator` through Runner's `Run` method and consumes `AgentEvent` externally, they are no longer in the Agent's execution context, so `adk.GetSessionValues` cannot retrieve Session Values. If you need to obtain Session Values in real-time during Agent execution (e.g., while consuming streaming events), consider using Middleware/Callback Handler callbacks to pass the needed data through other channels (such as a channel). + +# Q: How to differentiate AgentEvents from multiple same-named SubAgents executing concurrently? 
+ +**Scenario:** When using DeepAgent, multiple same-named SubAgents (e.g., `general-purpose`) may execute concurrently. When consuming `AsyncIterator[*AgentEvent]` through Runner, events emitted by different instances are hard to distinguish. **Solution: Wrap Agent, inject identifier via CustomizedOutput** `AgentOutput` provides a `CustomizedOutput any` field that can carry custom data. By wrapping the Agent's `Run` method, inject a unique identifier on each emitted event: + +```go +type wrappedAgent struct { + adk.Agent + identifier int +} + +func (w *wrappedAgent) Run(ctx context.Context, input *adk.AgentInput, options ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter := w.Agent.Run(ctx, input, options...) + newIter, newGen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer newGen.Close() + for { + event, ok := iter.Next() + if !ok { + break + } + // Note: event.Output may be nil (e.g., error events, action-only events) + if event.Output == nil { + event.Output = &adk.AgentOutput{} + } + event.Output.CustomizedOutput = w.identifier + newGen.Send(event) + } + }() + return newIter +} +``` + +**Usage:** + +```go +agent1 := &wrappedAgent{Agent: generalAgent, identifier: 1} +agent2 := &wrappedAgent{Agent: generalAgent, identifier: 2} +// Pass agent1, agent2 as SubAgents to DeepAgent +``` + +**Consumer side differentiation:** + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Output != nil && event.Output.CustomizedOutput != nil { + id := event.Output.CustomizedOutput.(int) + fmt.Printf("Event from agent instance %d\n", id) + } +} +``` + +> 💡 +> Notes: +> +> 1. event.Output may be nil. You must do a nil check before setting CustomizedOutput. +> 2. This wrapper only covers the Run method. If the Agent implements the ResumableAgent interface (like Agents created by DeepAgent), the Resume method is called directly through the embedded Agent, and its events will not have the identifier injected. 
For complete coverage, you need to also wrap the Resume method. +> 3. This solution is a workaround, suitable for quickly solving differentiation problems. CustomizedOutput is not persisted to Checkpoint. + +# Q: How to load corresponding ToolInfo only when a Skill is triggered? / How to use Skill to force the model to call a specific tool? + +The root of both questions lies in confusing the concepts of Skill and Tool. **The essence of Skill is Prompt.** When triggered, the Skill middleware inserts a new UserMessage into the conversation, whose content is that Skill's Prompt text. You can write "please call tool xxx with parameters yyy" in a Skill Prompt, but this is still just a prompt — whether the model follows it depends on the quality of Prompt Engineering and the model's inherent randomness. **The essence of Tool (ToolInfo) is request parameters.** The ToolInfo list is sent to the model as the `tools` parameter of the ChatModel request, telling the model "which tools you can call". Unless using ToolSearch for dynamic loading (supported by Claude, GPT 5.4+, etc.), ToolInfo must be passed along with the request. **Regarding "dynamically loading ToolInfo when Skill triggers":** To achieve this effect means that when a Skill Prompt is inserted into the conversation, the corresponding tool definitions needed by that Skill are also appended to the current request's `[]ToolInfo`. This is entirely custom user-side behavior — you need to: 1) identify whether the current turn triggered a Skill; 2) determine which Tools that Skill needs; 3) append the corresponding ToolInfo to `[]ToolInfo` before constructing the ChatModel request. Note that `[]ToolInfo` is at the front of Prompt Cache; dynamically appending new tools very likely breaks Prompt Cache, causing cache hit rate decrease and latency increase. If you care about cache efficiency, pass all potentially needed tools at initialization. 
**Regarding "using Skill to force the model to call a specific tool":** Skill only sends a text prompt to the model. Whether the model strictly follows it depends on the clarity of the Prompt, the model's own instruction-following capability, and context interference. This is essentially a Prompt Engineering problem with inherent uncertainty. If business requirements demand 100% certainty of calling a specific tool, specify ToolChoice in the LLM request to force the model to select that tool, or call the tool directly in application-layer code without relying on model decisions. + +> 💡 +> Recommended practices: When Skill trigger should "very likely" call a tool → explicitly write the tool name, parameter format, and calling instructions in the Skill Prompt; Need dynamic control of available tool set → use ToolSearch or dynamically modify `[]ToolInfo` in ChatModel middleware based on context; Must 100% call a specific tool → call directly in application-layer code, don't rely on model decisions; Concerned about Prompt Cache invalidation → pass all potentially needed ToolInfo at initialization, avoid dynamic additions/removals. + +# Q: Supervisor sub-Agent transfer back to main Agent errors / transfer_to_agent causes user content changes after forwarding to sub-Agent + +These issues are all related to ADK's AgentTransfer mechanism. Supervisor is a multi-Agent collaboration pattern implemented based on AgentTransfer. The AgentTransfer mechanism has the following known limitations: + +- **Full context sharing**: Supervisor and SubAgents, and between SubAgents, are forced to share complete context, leading to high token costs and latency. +- **Attention dilution**: The fully shared context is often redundant for sub-Agents, diluting the sub-Agent's focus on its actual task and reducing execution quality. 
+- **Context pollution**: "Successfully transferred to xxx" messages generated during the transfer process remain in the context and can mislead later Agent Tool Call decisions (they act as incorrect few-shot examples).
+- **Forced tool injection**: The mechanism requires injecting a Transfer Tool (and possibly an Exit Tool), which makes the ToolInfo list more complex.
+
+> 💡
+> For these reasons, the AgentTransfer / Supervisor pattern in ADK is currently marked as "not recommended".
+
+**Recommended Alternative:** Use DeepAgent, or the ChatModelAgent + AgentTool combination. In this pattern:
+
+- Each AgentTool encapsulates its own context, which avoids mutual pollution and is faster, cheaper, and usually gives better results.
+- No "Successfully transferred to xxx" messages are generated, so they cannot mislead model decisions.
+
+# Q: DeepSeek V4 has issues returning reasoning content in tool call scenarios. How to solve this?
+
+The DeepSeek V4 model has a known issue with returning reasoning content in tool call scenarios. Multiple users have reported it.
+
+**Solution:** Upgrade the corresponding eino-ext deepseek module to the latest version, which fixes the issue.
+
+```shell
+go get github.com/cloudwego/eino-ext/components/model/deepseek@latest
+```
-# Q: Gemini model error missing a `thought_signature`
+After upgrading, run the Agent again and confirm that reasoning content is returned normally.
diff --git a/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md b/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md index 5975d0e8ce5..03c3f1c20a1 100644 --- a/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md +++ b/content/en/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Orchestration Design Principles' +title: Orchestration Design Principles weight: 2 --- diff --git a/content/en/docs/eino/core_modules/components/agentic_tools_node_guide.md b/content/en/docs/eino/core_modules/components/agentic_tools_node_guide.md index 8b28977eaa8..1d5194e611f 100644 --- a/content/en/docs/eino/core_modules/components/agentic_tools_node_guide.md +++ b/content/en/docs/eino/core_modules/components/agentic_tools_node_guide.md @@ -3,7 +3,7 @@ Description: "" date: "2026-03-03" lastmod: "" tags: [] -title: 'Eino: AgenticToolsNode & Tool User Guide [Beta]' +title: 'AgenticToolsNode & Tool User Guide [Beta]' weight: 12 --- @@ -237,7 +237,7 @@ input := &schema.AgenticMessage{ FunctionToolCall: &schema.FunctionToolCall{ CallID: "1", Name: "get_weather", - Arguments: `{"city": "Shenzhen", "date": "tomorrow"}`, + Arguments: `{"city": "深圳", "date": "tomorrow"}`, }, }, }, @@ -371,8 +371,8 @@ In the tool function body and tool callback handler, you can use the `compose.Ge There are multiple ways to implement tools. 
You can refer to the following approaches: -- HTTP API-based tool implementation: [How to create a tool/function call using OpenAPI?](/docs/eino/usage_guide/how_to_guide/openapi_tool_creation) -- gRPC-based tool implementation: [How to create a tool/function call using proto3?](/docs/eino/usage_guide/how_to_guide/proto3_tool_creation) -- Thrift-based tool implementation: [How to create a tool/function call using thrift IDL?](/docs/eino/usage_guide/how_to_guide/thrift_idl_tool_creation) +- HTTP API-based tool implementation: [How to create a tool/function call using OpenAPI?](https://bytedance.larkoffice.com/wiki/FjXzwf3exijtKyk2hh7cAmnZn1g) +- gRPC-based tool implementation: [How to create a tool/function call using proto3?](https://bytedance.larkoffice.com/wiki/EPkawUVbdiGwxCkWCJTcAMQonbh) +- Thrift-based tool implementation: [How to create a tool/function call using thrift IDL?](https://bytedance.larkoffice.com/wiki/PcHfwo6x0iOrXxkIjJecez8xnNg) - Local function-based tool implementation: [How to create a tool?](/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool) - …… diff --git a/content/en/docs/eino/core_modules/components/document_transformer_guide.md b/content/en/docs/eino/core_modules/components/document_transformer_guide.md index 080ba8e7c81..a5ec32da0b4 100644 --- a/content/en/docs/eino/core_modules/components/document_transformer_guide.md +++ b/content/en/docs/eino/core_modules/components/document_transformer_guide.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Document Transformer User Guide' +title: 'Document Transformer User Guide' weight: 3 --- @@ -160,9 +160,11 @@ for idx, doc := range outDocs { ## **Existing Implementations** -1. Markdown Header Splitter: Document splitting based on Markdown headers [Splitter - markdown](/docs/eino/ecosystem_integration/document/splitter_markdown) -2. 
Text Splitter: Document splitting based on text length or delimiters [Splitter - semantic](/docs/eino/ecosystem_integration/document/splitter_semantic) -3. Document Filter: Filter document content based on rules [Splitter - recursive](/docs/eino/ecosystem_integration/document/splitter_recursive) + + + + +
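+<table>
+<tr><td>markdown</td><td>README_zh.md</td><td>README.md</td></tr>
+<tr><td>recursive</td><td>README_zh.md</td><td>README.md</td></tr>
+<tr><td>semantic</td><td>README_zh.md</td><td>README.md</td></tr>
+</table>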
## **Implementation Reference** diff --git a/content/en/docs/eino/core_modules/components/embedding_guide.md b/content/en/docs/eino/core_modules/components/embedding_guide.md index b0cf613360d..a9e7bc5df9e 100644 --- a/content/en/docs/eino/core_modules/components/embedding_guide.md +++ b/content/en/docs/eino/core_modules/components/embedding_guide.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Embedding User Guide' +title: Embedding User Guide weight: 2 --- diff --git a/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md b/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md index 87b36f8e210..0d7867c6a2c 100644 --- a/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md +++ b/content/en/docs/eino/core_modules/components/tools_node_guide/_index.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2026-03-03" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: ToolsNode & Tool Guide' +title: ToolsNode & Tool Guide weight: 9 --- @@ -282,6 +282,82 @@ type ToolInfo struct { The Tool component uses ToolOption to define optional parameters. ToolsNode has no abstracted common options. Each specific implementation can define its own specific Options, wrapped into the unified ToolOption type using the WrapToolImplSpecificOptFn function. +## Tool Aliases 🏷️ alpha/09 + +The Tool Alias feature allows configuring **name aliases** and **argument aliases** for tools, so that when an LLM calls a tool using an alias, it is automatically resolved to the real tool and canonical parameters. 
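
The resolution idea behind tool aliases can be sketched in plain Go. This is a minimal illustration under stated assumptions, not the ToolsNode implementation: `aliasConfig`, `buildNameIndex`, and `remapArguments` are hypothetical names, and the sketch assumes a canonical key wins whenever both it and one of its aliases appear in the arguments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// aliasConfig is a hypothetical local stand-in holding name aliases and
// a canonical-key -> aliases map for one tool.
type aliasConfig struct {
	NameAliases      []string
	ArgumentsAliases map[string][]string
}

// buildNameIndex maps every canonical name and every alias to the canonical tool name.
func buildNameIndex(cfg map[string]aliasConfig) map[string]string {
	idx := make(map[string]string)
	for canonical, c := range cfg {
		idx[canonical] = canonical
		for _, alias := range c.NameAliases {
			idx[alias] = canonical
		}
	}
	return idx
}

// remapArguments rewrites alias keys in a JSON object to their canonical keys.
// Assumption: if the canonical key is already present, the alias key is left as-is.
func remapArguments(raw string, argAliases map[string][]string) (string, error) {
	var args map[string]json.RawMessage
	if err := json.Unmarshal([]byte(raw), &args); err != nil {
		return "", err
	}
	for canonical, aliases := range argAliases {
		if _, ok := args[canonical]; ok {
			continue // canonical key takes precedence
		}
		for _, alias := range aliases {
			if v, ok := args[alias]; ok {
				args[canonical] = v
				delete(args, alias)
				break
			}
		}
	}
	out, err := json.Marshal(args)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	cfg := map[string]aliasConfig{
		"search": {
			NameAliases: []string{"find", "search_v1"},
			ArgumentsAliases: map[string][]string{
				"query": {"q", "search_term"},
				"limit": {"max_results", "count"},
			},
		},
	}
	name := buildNameIndex(cfg)["find"]
	args, err := remapArguments(`{"q":"eino","count":5}`, cfg[name].ArgumentsAliases)
	if err != nil {
		fmt.Println("remap failed:", err)
		return
	}
	fmt.Println(name, args)
}
```

Building the index once keeps each name lookup to a single map access, which is why the sketch resolves names before touching the arguments.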
+ +### Configuration Structure + +```go +// ToolAliasConfig configures name and argument aliases for a single tool +type ToolAliasConfig struct { + // NameAliases is a list of alternative names for the tool + // If the model returns any of these names, it will be resolved to the canonical tool name + NameAliases []string + + // ArgumentsAliases maps canonical argument keys to their alias lists + // key=canonical name, value=[]aliases + // e.g.: {"query": ["q", "search_term"], "limit": ["max_results", "count"]} + ArgumentsAliases map[string][]string +} +``` + +Configure via the `ToolAliases` field in `ToolsNodeConfig`: + +```go +config := &compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool, weatherTool}, + ToolAliases: map[string]ToolAliasConfig{ + "search": { + NameAliases: []string{"find", "query", "search_v1"}, + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, + "limit": {"max_results", "count"}, + }, + }, + }, +} +toolsNode, err := compose.NewToolNode(ctx, config) +``` + +### Dynamic Override + +Use the `WithToolAliases()` call option to override global alias configuration at runtime: + +```go +// Override alias configuration (keeping original tool list) +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolAliases(map[string]compose.ToolAliasConfig{ + "search": { + NameAliases: []string{"new_alias"}, + }, + }), +) + +// Override both tool list and aliases +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolList(newSearchTool), + compose.WithToolAliases(map[string]compose.ToolAliasConfig{...}), +) +``` + +### Execution Flow + +Processing order during tool invocation: + +1. **Name Resolution**: The tool name returned by the LLM (which may be an alias) is resolved to the canonical tool name via index lookup +2. **Argument Remapping**: Alias keys in the JSON arguments are automatically replaced with canonical keys +3. 
**ToolArgumentsHandler** (if configured): Receives the canonical tool name and already-remapped arguments +4. **Tool Execution**: Calls the tool using canonical name and arguments + +### Notes + +- Name aliases **cannot** conflict with other tools' canonical names or already-registered aliases +- Argument aliases **cannot** conflict with existing property names in the tool's JSON Schema +- When both an alias key and canonical key are **present simultaneously** in argument JSON, the canonical key takes precedence and the alias key is left as-is +- Configuring aliases for non-existent tool names will be **silently ignored** +- The alias feature supports both **standard tools** and **enhanced tools** + ## Usage ### Standard Tool Usage diff --git a/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md b/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md index 65ecbba27e5..483d7d8b096 100644 --- a/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md +++ b/content/en/docs/eino/core_modules/devops/visual_debug_plugin_guide.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] -title: Eino Dev Visual Debugging Guide +title: Eino Dev Visual Debugging Plugin Guide weight: 3 --- @@ -155,6 +155,7 @@ Because debugging starts an HTTP service in your main process to interact with t > 1. Ensure the target orchestration has run `Compile()` at least once. > 2. `devops.Init()` must run before calling `Compile()`. > 3. Make sure the main process stays alive after `devops.Init()`. +> 4. Starting from v0.1.9, the debug service default listen address changed from `0.0.0.0` to `127.0.0.1` (local connections only). For remote debugging, explicitly specify the listen IP via `WithDevServerIP`, e.g.: `devops.Init(ctx, devops.WithDevServerIP("0.0.0.0"))`. ```go // 1. 
Initialize debug service @@ -201,10 +202,18 @@ func main() { ### Configure Address -- IP: `127.0.0.1` for local; remote server IP for remote (IPv4/IPv6). -- Port: default `52538`, configurable via `WithDevServerPort`. +- **IP**: IP address of the server where the user process is running. + - If the user process is running on local computer, enter `127.0.0.1`; + - If the user process is running on a remote server, enter the remote server's IP address, supporting both IPv4 and IPv6. +- **Port**: Port the debug service listens on, default is `52538`, configurable via the `WithDevServerPort` option method. -Allow network prompts locally; ensure remote ports are reachable. Once connected, the status indicator turns green. +> 💡 +> Notes +> +> - Local debugging: The system may pop up a network access warning; allow access. +> - Remote debugging: Ensure the port is accessible. Additionally, starting from v0.1.9, the default listen address is `127.0.0.1` only; for remote debugging you must specify an accessible IP (e.g., `0.0.0.0`) via `WithDevServerIP` when calling `devops.Init()`. + +Once IP and Port are configured, click confirm. The debug plugin will automatically connect to the target debug server. If successfully connected, the connection status indicator will turn green. @@ -222,10 +231,22 @@ Ensure your target orchestration has been compiled at least once. Multiple `Comp -- From a specific node: click the run button on that node. +- From a specific node: click the run button on that node to start debugging from there. + + +### View Execution Results + +Debugging from the START node: after clicking Test Run, view debug results in the plugin panel below. + + + +Debugging from any operable node: view debug results in the plugin panel below. 
+ + + ## Advanced ### Specify Implementation Type for Interface Fields @@ -290,11 +311,21 @@ func RegisterGraphOfInterfaceType(ctx context.Context) { err := devops.Init(ctx, devops.AppendType(&graph.NodeInfo{})) ``` -3) During Test Run, interface fields show `{}` by default. Type a space inside `{}` to view all built-in and custom types, select the concrete implementation, then fill `_value`. +3) During Test Run, interface fields show `{}` by default. Type a space inside `{}` to view all built-in and custom types, and select the concrete implementation type for that interface. + + + +4) Fill in the debug node input in the `_value` field. + + + +5) Click confirm to view the debug results. + + -### Debugging `map[string]any` +#### Debugging `map[string]any` -If a node input is `map[string]any`: +Here we explain how to debug when the input type is `map[string]any`. If a node's input type is `map[string]any`, as shown below: ```go func RegisterAnyInputGraph(ctx context.Context) { @@ -341,7 +372,7 @@ func RegisterAnyInputGraph(ctx context.Context) { } ``` -During debugging, in the Test Run JSON input box, use the following format to specify concrete types for values: +During debugging, in the Test Run JSON input box, you need to enter content in the following format: ```json { diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md index 7feea7b33e5..f57e9feb441 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md @@ -1,28 +1,24 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: PatchToolCalls -weight: 7 +weight: 8 --- adk/middlewares/patchtoolcalls > 💡 -> The PatchToolCalls middleware is 
used to fix "dangling tool calls" issues in the message history. This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> The PatchToolCalls middleware is used to fix "dangling tool calls" issues in the message history. Introduced in v0.8.0. Supports both `*schema.Message` and `*schema.AgenticMessage` message types. ## Overview -In multi-turn conversation scenarios, there may be cases where an Assistant message contains ToolCalls, but the corresponding Tool message response is missing from the conversation history. Such "dangling tool calls" can cause some model APIs to throw errors or produce abnormal behavior. - -**Common scenarios:** +In multi-turn conversation scenarios, there may be cases where an Assistant message contains ToolCalls, but the corresponding Tool response is missing from the conversation history. Such "dangling tool calls" can cause some model APIs to throw errors or produce abnormal behavior. **Common scenarios:** - User sent a new message before tool execution completed, causing the tool call to be interrupted - Some tool call results were lost when restoring a session -- User canceled tool execution in a Human-in-the-loop scenario - -The PatchToolCalls middleware scans the message history before each model call and automatically inserts placeholder messages for tool calls that lack responses. +- User canceled tool execution in a Human-in-the-loop scenario The PatchToolCalls middleware scans the message history before each model call (in the `BeforeModelRewriteState` hook) and automatically inserts placeholder messages for tool calls that lack responses. 
## Quick Start @@ -33,48 +29,64 @@ import ( "github.com/cloudwego/eino/adk/middlewares/patchtoolcalls" ) -// Create middleware with default configuration +// Use default configuration (cfg can be nil) mw, err := patchtoolcalls.New(ctx, nil) if err != nil { // Handle error } -// Use with ChatModelAgent agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, Middlewares: []adk.ChatModelAgentMiddleware{mw}, }) ``` -## Configuration Options +## API Reference + +### Config ```go type Config struct { - // PatchedContentGenerator custom function to generate placeholder message content - // Optional, uses default message if not set PatchedContentGenerator func(ctx context.Context, toolName, toolCallID string) (string, error) } ``` - +
 <table>
 <tr><td>Field</td><td>Type</td><td>Required</td><td>Description</td></tr>
-<tr><td><code>PatchedContentGenerator</code></td><td><code>func(ctx, toolName, toolCallID string) (string, error)</code></td><td>No</td><td>Custom function to generate placeholder message content. Parameters include tool name and call ID, returns the content to fill in</td></tr>
+<tr><td><code>PatchedContentGenerator</code></td><td><code>func(ctx context.Context, toolName, toolCallID string) (string, error)</code></td><td>No</td><td>Custom function for generating placeholder message content. Uses the built-in default message template when not set</td></tr>
 </table>
-### Default Placeholder Message +### New + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +Creates the PatchToolCalls middleware. `cfg` can be `nil`, in which case default configuration is used. Internally calls `NewTyped[*schema.Message]`. + +### NewTyped + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +Generic version constructor, supports `*schema.Message` and `*schema.AgenticMessage`. `cfg` can be `nil`. + +- When `M = *schema.Message`, matches Tool messages via the `ToolCallID` field +- When `M = *schema.AgenticMessage`, matches via `ContentBlock.FunctionToolResult.CallID` -If `PatchedContentGenerator` is not set, the middleware uses a default placeholder message: +### Default Placeholder Message -**English (default):** +If `PatchedContentGenerator` is not set, the middleware uses a built-in template (formatted via `fmt.Sprintf`, with `%s` corresponding to toolName and toolCallID respectively): **English (default):** ``` -Tool call {toolName} with id {toolCallID} was cancelled - another message came in before it could be completed. +Tool call %s with id %s was canceled - another message came in before it could be completed. ``` **Chinese:** ``` -工具调用 {toolName}(ID 为 {toolCallID})已被取消——在其完成之前收到了另一条消息。 +工具调用 %s(ID 为 %s)已被取消——在其完成之前收到了另一条消息。 ``` You can switch languages via `adk.SetLanguage()`. 
@@ -91,6 +103,20 @@ mw, err := patchtoolcalls.New(ctx, &patchtoolcalls.Config{ }) ``` +### Generic Usage (AgenticMessage) + +```go +mw, err := patchtoolcalls.NewTyped[*schema.AgenticMessage](ctx, nil) +if err != nil { + // Handle error +} + +agent, err := adk.NewTypedChatModelAgent[*schema.AgenticMessage](ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: yourChatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{mw}, +}) +``` + ### Combined with Other Middlewares ```go @@ -108,40 +134,33 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## How It Works - - -**Processing Logic:** - -1. Executes in the `BeforeModelRewriteState` hook -2. Iterates through all messages to find Assistant messages containing `ToolCalls` -3. For each ToolCall, checks if a corresponding Tool message exists in subsequent messages (matched by `ToolCallID`) -4. If no corresponding Tool message is found, inserts a placeholder message -5. Returns the repaired message list +> 💡 +> For `*schema.Message`, matching is done via `msg.Role == schema.Tool && msg.ToolCallID`; for `*schema.AgenticMessage`, matching is done via `ContentBlock.FunctionToolResult.CallID`. 
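
The patching idea can be illustrated with a small self-contained sketch. It is an approximation under stated assumptions, not the middleware's code: `message`, `toolCall`, and `patchDangling` are hypothetical stand-ins (not `schema.Message`), and the sketch assumes each placeholder is inserted right after the tool messages that directly follow the assistant message.

```go
package main

import "fmt"

// toolCall and message are simplified, hypothetical stand-ins for the real message types.
type toolCall struct {
	ID   string
	Name string
}

type message struct {
	Role       string     // "user", "assistant", or "tool"
	Content    string
	ToolCalls  []toolCall // set on assistant messages
	ToolCallID string     // set on tool messages
}

// placeholder mirrors the English default template quoted in this document.
const placeholder = "Tool call %s with id %s was canceled - another message came in before it could be completed."

// patchDangling returns a copy of msgs in which every tool call without a
// matching tool message receives a placeholder tool message.
func patchDangling(msgs []message) []message {
	answered := make(map[string]bool)
	for _, m := range msgs {
		if m.Role == "tool" {
			answered[m.ToolCallID] = true
		}
	}
	out := make([]message, 0, len(msgs))
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolCalls) == 0 {
			continue
		}
		// Keep the tool messages that directly follow this assistant message.
		for i+1 < len(msgs) && msgs[i+1].Role == "tool" {
			i++
			out = append(out, msgs[i])
		}
		// Append a placeholder for each call that never got a response.
		for _, tc := range m.ToolCalls {
			if !answered[tc.ID] {
				out = append(out, message{
					Role:       "tool",
					ToolCallID: tc.ID,
					Content:    fmt.Sprintf(placeholder, tc.Name, tc.ID),
				})
			}
		}
	}
	return out
}

func main() {
	history := []message{
		{Role: "user", Content: "Help me check the weather"},
		{Role: "assistant", ToolCalls: []toolCall{{ID: "call_1", Name: "get_weather"}, {ID: "call_2", Name: "get_location"}}},
		{Role: "tool", ToolCallID: "call_1", Content: "Sunny, 25°C"},
		{Role: "user", Content: "Just tell me Beijing's weather"},
	}
	for _, m := range patchDangling(history) {
		fmt.Printf("[%s] %s %s\n", m.Role, m.ToolCallID, m.Content)
	}
}
```

Collecting answered call IDs in one pass first keeps the repair linear in the number of messages.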
-## Example Scenario +### Example Scenario -### Message History Before Repair +**Before repair:** ``` -[User] "Help me check the weather" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: Sunny, 25°C" -[User] "No need to check the location, just tell me Beijing's weather" <- User interrupts +[User] "Help me check the weather" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: Sunny, 25°C" +[User] "No need to check the location, just tell me Beijing's weather" <- User interrupts ``` -### Message History After Repair +**After repair:** ``` -[User] "Help me check the weather" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: Sunny, 25°C" -[Tool] "call_2: Tool call get_location (ID: call_2) was cancelled..." <- Automatically inserted -[User] "No need to check the location, just tell me Beijing's weather" +[User] "Help me check the weather" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: Sunny, 25°C" +[Tool] "call_2: Tool call get_location (ID: call_2) was canceled..." <- Automatically inserted +[User] "No need to check the location, just tell me Beijing's weather" ``` ## Multi-language Support -Placeholder messages support both Chinese and English, switch via `adk.SetLanguage()`: +Placeholder messages support Chinese and English, switch via `adk.SetLanguage()`: ```go import "github.com/cloudwego/eino/adk" @@ -153,7 +172,8 @@ adk.SetLanguage(adk.LanguageEnglish) // English (default) ## Notes > 💡 -> This middleware only modifies the history messages for the current run in the `BeforeModelRewriteState` hook, and does not affect the actual stored message history. The repair is temporary and only used for the current agent call. 
+> The state returned by `BeforeModelRewriteState` is persisted by the framework into the agent's internal state (see the `ProcessState` call in `wrappers.go`). Therefore, placeholder messages inserted by PatchToolCalls **will be retained in subsequent iterations** and do not need to be re-patched each round. - It's recommended to place this middleware at the **front** of the middleware chain to ensure other middlewares process a complete message history -- If your scenario requires persisting the repaired messages, implement the corresponding logic in `PatchedContentGenerator` +- The `cfg` parameter can be `nil`, equivalent to `&Config{}` +- If the message list is empty (`len(state.Messages) == 0`), the middleware returns immediately without any processing diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md index 740b3775891..2e48f2ad797 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md @@ -1,33 +1,28 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: PlanTask -weight: 4 +weight: 6 --- -# PlanTask Middleware - -adk/middlewares/plantask - > 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> This middleware was introduced in v0.8.0. Package path: `github.com/cloudwego/eino/adk/middlewares/plantask` ## Overview -`plantask` is a task management middleware that allows Agents to create and manage task lists. 
The middleware injects four tools through the `BeforeAgent` hook: - -- **TaskCreate**: Create a task -- **TaskGet**: View task details -- **TaskUpdate**: Update a task -- **TaskList**: List all tasks +`plantask` is a task management middleware that injects four tools into the Agent through the `BeforeAgent` hook, giving it structured task planning capabilities: -Main purposes: + + + + + + +
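+<table>
+<tr><td>Tool</td><td>Function</td></tr>
+<tr><td><code>TaskCreate</code></td><td>Create a task</td></tr>
+<tr><td><code>TaskGet</code></td><td>Get details of a single task</td></tr>
+<tr><td><code>TaskUpdate</code></td><td>Update task status/fields, set dependencies, delete task</td></tr>
+<tr><td><code>TaskList</code></td><td>List summaries of all tasks</td></tr>
+</table>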
-- Track progress of complex tasks -- Break large tasks into smaller steps -- Manage dependencies between tasks +Core purpose: break complex requests into trackable sub-tasks, manage dependencies between tasks, and let users see execution progress. --- @@ -38,7 +33,7 @@ Main purposes: │ Agent │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ BeforeAgent: Inject task tools │ │ +│ │ BeforeAgent: Inject task tools (with sync.Mutex for concurrency) │ │ │ │ - TaskCreate │ │ │ │ - TaskGet │ │ │ │ - TaskUpdate │ │ @@ -53,7 +48,7 @@ Main purposes: │ │ │ Storage structure: │ │ baseDir/ │ -│ ├── .highwatermark # ID counter │ +│ ├── .highwatermark # Maximum assigned ID (plain numeric text) │ │ ├── 1.json # Task #1 │ │ ├── 2.json # Task #2 │ │ └── ... │ @@ -63,101 +58,122 @@ Main purposes: --- -## Configuration +## API + +### Constructors + +```go +// Generic version, supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) + +// Non-generic version, equivalent to NewTyped[*schema.Message] +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` + +### Config ```go type Config struct { Backend Backend // Storage backend, required - BaseDir string // Task file directory, required + BaseDir string // Task file storage directory, required } ``` -- Note that the Backend implementation should be isolated by session, with different sessions corresponding to different Backends (task lists) +> 💡 +> The Backend should be isolated at the session level — different sessions correspond to different Backend instances (i.e., different task lists). 
---- +### Backend Interface -## Backend Interface +`Backend` is defined within the `plantask` package and is a minimal subset of `filesystem.Backend`, retaining only the four methods needed for task storage: ```go type Backend interface { LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - Read(ctx context.Context, req *ReadRequest) (string, error) + Read(ctx context.Context, req *ReadRequest) (*filesystem.FileContent, error) Write(ctx context.Context, req *WriteRequest) error Delete(ctx context.Context, req *DeleteRequest) error } ``` +Type alias relationships: + +```go +type FileInfo = filesystem.FileInfo // Path, IsDir, Size, ModifiedAt +type LsInfoRequest = filesystem.LsInfoRequest // Path string +type ReadRequest = filesystem.ReadRequest // FilePath, Offset, Limit +type WriteRequest = filesystem.WriteRequest // FilePath, Content string + +// DeleteRequest is custom to the plantask package (filesystem package has no such type) +type DeleteRequest struct { + FilePath string +} +``` + +> 💡 +> Note that `Read` returns `*filesystem.FileContent` (containing a `Content string` field), not a bare string. Import path is `github.com/cloudwego/eino/adk/filesystem`. 
+ --- ## Task Structure ```go type task struct { - ID string `json:"id"` // Task ID - Subject string `json:"subject"` // Title - Description string `json:"description"` // Description - Status string `json:"status"` // Status - Blocks []string `json:"blocks"` // Tasks blocked by this one - BlockedBy []string `json:"blockedBy"` // Tasks blocking this one - ActiveForm string `json:"activeForm"` // Active form text - Owner string `json:"owner"` // Responsible agent - Metadata map[string]any `json:"metadata"` // Custom data + ID string `json:"id"` + Subject string `json:"subject"` + Description string `json:"description"` + Status string `json:"status"` + Blocks []string `json:"blocks"` + BlockedBy []string `json:"blockedBy"` + ActiveForm string `json:"activeForm,omitempty"` + Owner string `json:"owner,omitempty"` + Metadata map[string]any `json:"metadata,omitempty"` } ``` ### Status - - + + - +
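 <table>
-<tr><td>Status</td><td>Description</td></tr>
-<tr><td><code>pending</code></td><td>Pending (default)</td></tr>
+<tr><td>Status Value</td><td>Description</td></tr>
+<tr><td><code>pending</code></td><td>Pending (default on creation)</td></tr>
 <tr><td><code>in_progress</code></td><td>In progress</td></tr>
 <tr><td><code>completed</code></td><td>Completed</td></tr>
-<tr><td><code>deleted</code></td><td>Deleted (will delete the file)</td></tr>
+<tr><td><code>deleted</code></td><td>Deleted (physically deletes the JSON file and removes from other tasks' dependency lists)</td></tr>
 </table>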
-Status transition: `pending` → `in_progress` → `completed`, any status can be directly `deleted`. +Status transitions: `pending` → `in_progress` → `completed`; any status can be directly set to `deleted`. --- -## Tools +## Tool Parameters ### TaskCreate -Create a task. +Tool name constant: `TaskCreateToolName = "TaskCreate"` - - - - + + + +
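 <table>
 <tr><td>Parameter</td><td>Type</td><td>Required</td><td>Description</td></tr>
-<tr><td><code>subject</code></td><td>string</td><td>Yes</td><td>Title</td></tr>
-<tr><td><code>description</code></td><td>string</td><td>Yes</td><td>Description</td></tr>
-<tr><td><code>activeForm</code></td><td>string</td><td>No</td><td>Active form text, e.g., "Running tests"</td></tr>
-<tr><td><code>metadata</code></td><td>object</td><td>No</td><td>Custom data</td></tr>
+<tr><td><code>subject</code></td><td>string</td><td>Yes</td><td>Task title (imperative form)</td></tr>
+<tr><td><code>description</code></td><td>string</td><td>Yes</td><td>Detailed task description, including context and acceptance criteria</td></tr>
+<tr><td><code>activeForm</code></td><td>string</td><td>No</td><td>Active form text (e.g., "Running tests"), displayed to user when in_progress</td></tr>
+<tr><td><code>metadata</code></td><td>object</td><td>No</td><td>Custom key-value pairs</td></tr>
 </table>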
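
Task IDs come from a counter file (`.highwatermark`, plain numeric text) rather than from the set of existing task files. A minimal sketch of that allocation idea follows. `fileStore` and `nextTaskID` are hypothetical names, the map stands in for the storage backend, and starting from 0 when the file is missing is an assumption.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fileStore is a hypothetical in-memory stand-in for the storage backend (path -> content).
type fileStore map[string]string

// nextTaskID reads the highest assigned ID from baseDir/.highwatermark,
// increments it, writes it back, and returns the new ID as a numeric string.
func nextTaskID(fs fileStore, baseDir string) (string, error) {
	path := baseDir + "/.highwatermark"
	current := 0 // assumption: a missing counter file means no ID has been assigned yet
	if raw, ok := fs[path]; ok {
		n, err := strconv.Atoi(strings.TrimSpace(raw))
		if err != nil {
			return "", fmt.Errorf("corrupt highwatermark %q: %w", raw, err)
		}
		current = n
	}
	next := strconv.Itoa(current + 1)
	fs[path] = next
	return next, nil
}

func main() {
	fs := fileStore{}
	first, _ := nextTaskID(fs, "/tasks")
	fs["/tasks/"+first+".json"] = `{"id":"1"}`
	delete(fs, "/tasks/"+first+".json") // deleting a task does not lower the counter
	second, _ := nextTaskID(fs, "/tasks")
	fmt.Println(first, second)
}
```

Because the counter only moves forward, an ID is never reused even after its task file has been deleted.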
-When to use: - -- The task is relatively complex with 3 or more steps -- The user has given a list of things to do -- You need to show progress to the user - -When not to use: - -- It's just a simple task -- Something that can be done quickly +After creation, the task ID auto-increments (based on the `.highwatermark` file), with initial status `pending`. ### TaskGet -View task details. +Tool name constant: `TaskGetToolName = "TaskGet"` - +
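 <table>
 <tr><td>Parameter</td><td>Type</td><td>Required</td><td>Description</td></tr>
-<tr><td><code>taskId</code></td><td>string</td><td>Yes</td><td>Task ID</td></tr>
+<tr><td><code>taskId</code></td><td>string</td><td>Yes</td><td>Task ID (numeric string)</td></tr>
 </table>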
-Returns complete information about the task: title, description, status, dependencies, etc. +Returns complete task information: subject, description, status, blocks, blockedBy, owner. ### TaskUpdate -Update a task. +Tool name constant: `TaskUpdateToolName = "TaskUpdate"` @@ -165,24 +181,28 @@ Update a task. - - - - - + + + + +
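 <table>
 <tr><td>Parameter</td><td>Type</td><td>Required</td><td>Description</td></tr>
 <tr><td><code>subject</code></td><td>string</td><td>No</td><td>New title</td></tr>
 <tr><td><code>description</code></td><td>string</td><td>No</td><td>New description</td></tr>
 <tr><td><code>activeForm</code></td><td>string</td><td>No</td><td>New active form text</td></tr>
-<tr><td><code>status</code></td><td>string</td><td>No</td><td>New status</td></tr>
-<tr><td><code>addBlocks</code></td><td>[]string</td><td>No</td><td>Add blocked tasks</td></tr>
-<tr><td><code>addBlockedBy</code></td><td>[]string</td><td>No</td><td>Add tasks blocking this one</td></tr>
-<tr><td><code>owner</code></td><td>string</td><td>No</td><td>Responsible agent</td></tr>
-<tr><td><code>metadata</code></td><td>object</td><td>No</td><td>Custom data (set to null to delete)</td></tr>
+<tr><td><code>status</code></td><td>string</td><td>No</td><td>New status, enum: <code>pending</code> / <code>in_progress</code> / <code>completed</code> / <code>deleted</code></td></tr>
+<tr><td><code>addBlocks</code></td><td>[]string</td><td>No</td><td>Add task IDs that are blocked by the current task (written bidirectionally)</td></tr>
+<tr><td><code>addBlockedBy</code></td><td>[]string</td><td>No</td><td>Add task IDs that block the current task (written bidirectionally)</td></tr>
+<tr><td><code>owner</code></td><td>string</td><td>No</td><td>Responsible agent name</td></tr>
+<tr><td><code>metadata</code></td><td>object</td><td>No</td><td>Merged into existing metadata; setting a key to null deletes that key</td></tr>
 </table>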
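
The merge-with-null-deletes semantics of `metadata` can be shown with a short standalone sketch. `mergeMetadata` is a hypothetical helper written for illustration; it only assumes that the patch arrives as a JSON object in which `null` marks a key for removal.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// mergeMetadata applies a JSON patch object to existing metadata:
// non-null values overwrite or add keys, and null values delete keys.
// The existing map is not modified; a merged copy is returned.
func mergeMetadata(existing map[string]any, patchJSON string) (map[string]any, error) {
	var patch map[string]any
	if err := json.Unmarshal([]byte(patchJSON), &patch); err != nil {
		return nil, err
	}
	merged := make(map[string]any, len(existing)+len(patch))
	for k, v := range existing {
		merged[k] = v
	}
	for k, v := range patch {
		if v == nil { // JSON null decodes to nil
			delete(merged, k)
			continue
		}
		merged[k] = v
	}
	return merged, nil
}

func main() {
	existing := map[string]any{"priority": "low", "draft": true}
	merged, err := mergeMetadata(existing, `{"priority":"high","draft":null,"team":"infra"}`)
	if err != nil {
		fmt.Println("merge failed:", err)
		return
	}
	fmt.Println(merged)
}
```

Returning a copy instead of mutating the input keeps a failed or partial update from corrupting the stored task.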
-Notes: +Key behaviors: -- `status: "deleted"` will directly delete the task file -- Circular dependencies are checked when adding dependencies -- Automatic cleanup occurs when all tasks are completed +- `status: "deleted"` physically deletes the task file and removes the ID from all other tasks' blocks/blockedBy +- Adding dependencies performs **cycle detection**; an error is returned if a cycle would be created +- When **all tasks are completed**, all task files are automatically deleted (cleanup mechanism) ### TaskList -List all tasks, no parameters required. +Tool name constant: `TaskListToolName = "TaskList"` -Returns a summary of each task: ID, status, title, responsible agent, dependencies. +No parameters. Returns a summary list of all tasks (sorted by ID), each formatted as: + +``` +#ID [status] subject [owner: xxx] [blocked by #x, #y] +``` --- @@ -191,8 +211,7 @@ Returns a summary of each task: ID, status, title, responsible agent, dependenci ```go ctx := context.Background() -// The plantask middleware should normally be session-scoped -// Different sessions correspond to different task lists +// Backend should be isolated at the session level middleware, err := plantask.New(ctx, &plantask.Config{ Backend: myBackend, BaseDir: "/tasks", @@ -213,39 +232,40 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ 1. Receive complex task │ ▼ -2. TaskCreate to create tasks +2. TaskCreate to create multiple sub-tasks - #1: Analyze requirements - - #2: Write code + - #2: Implement code + - #3: Write tests │ ▼ 3. TaskUpdate to set dependencies - - #2 depends on #1 - - #3 depends on #2 + - #2 addBlockedBy: ["1"] + - #3 addBlockedBy: ["2"] │ ▼ -4. TaskList to see what tasks exist +4. TaskList to view available tasks │ ▼ -5. TaskUpdate to start working - - Change #1 to in_progress +5. TaskUpdate #1 → in_progress │ ▼ -6. When done, TaskUpdate - - Change #1 to completed +6. After completion, TaskUpdate #1 → completed │ ▼ 7. 
Loop 4-6 until all completed │ ▼ -8. Automatic cleanup +8. All completed → automatic cleanup of all files ``` --- ## Dependency Management -- **blocks**: These tasks can start after I complete -- **blockedBy**: I can start after these tasks complete +- **blocks**: "After I complete, these tasks can start" +- **blockedBy**: "After these tasks complete, I can start" + +Dependency writing is **bidirectional**: executing `addBlocks: ["2"]` on Task A will also write A's ID into Task #2's `blockedBy`. ``` Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) @@ -253,33 +273,32 @@ Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) #2 can only start after #1 completes ``` -Circular dependencies will throw an error: +Cycle detection is implemented via DFS reachability: ``` #1 blocks #2 -#2 blocks #1 ← Not allowed, circular +#2 blocks #1 ← Error: would create a cyclic dependency ``` --- -## Automatic Cleanup - -When all tasks are `completed`, all task files will be automatically deleted. - ---- - -## Notes +## Implementation Details -- Task files are stored in JSON format in the `BaseDir` directory, with filenames as `{id}.json` -- The `.highwatermark` file is used to record the maximum assigned task ID, ensuring IDs don't repeat -- All tool operations are protected by mutex locks and are concurrency-safe -- The tool descriptions contain detailed usage guidelines that the Agent will follow + + + + + + + + +
+| Mechanism | Description |
+| --- | --- |
+| ID allocation | The `.highwatermark` file stores the current maximum ID, incremented by 1 on creation |
+| Concurrency safety | All four tools share a single `sync.Mutex`, serializing execution within the same middleware instance |
+| File format | One `{id}.json` file per task, JSON serialized using `sonic` |
+| Automatic cleanup | After TaskUpdate marks a task as completed, it checks: if all tasks are completed, batch delete all files |
+| ID validation | Pure numeric regex `^\d+$` |
+| Delete cascading | When deleting a task, iterates all task files to remove references to that ID |
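The ID allocation, ID validation, and locking rows above can be illustrated with a small in-memory sketch. Everything here is hypothetical: `taskStore`, `create`, and `remove` are illustrative names, and a counter field stands in for the `.highwatermark` file. It is not the plantask implementation, only one way the described behavior could look.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"sync"
)

// idPattern mirrors the pure numeric ID validation described in the table.
var idPattern = regexp.MustCompile(`^\d+$`)

// taskStore is a hypothetical in-memory stand-in for the file-backed store.
// highWatermark plays the role of the .highwatermark file.
type taskStore struct {
	mu            sync.Mutex
	highWatermark int
	tasks         map[string]string // task ID -> subject
}

func newTaskStore() *taskStore {
	return &taskStore{tasks: make(map[string]string)}
}

// create allocates the next ID by incrementing the high-water mark under the lock.
func (s *taskStore) create(subject string) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.highWatermark++
	id := strconv.Itoa(s.highWatermark)
	s.tasks[id] = subject
	return id
}

// remove validates the ID and deletes the task. The high-water mark is never
// decremented, so a deleted ID is not handed out again.
func (s *taskStore) remove(id string) error {
	if !idPattern.MatchString(id) {
		return fmt.Errorf("invalid task id %q: must be numeric", id)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.tasks, id)
	return nil
}

func main() {
	s := newTaskStore()
	first := s.create("Analyze requirements")
	if err := s.remove(first); err != nil {
		fmt.Println("remove failed:", err)
	}
	second := s.create("Implement code")
	fmt.Println(first, second)
	fmt.Println(s.remove("1a"))
}
```

Keeping the counter separate from the set of live tasks is what lets IDs stay unique even after a task file is deleted.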
--- ## Multi-language Support -Tool descriptions support Chinese and English switching via `adk.SetLanguage()`: +Tool descriptions support Chinese and English, switchable via global setting: ```go // Use Chinese descriptions @@ -289,4 +308,4 @@ adk.SetLanguage(adk.LanguageChinese) adk.SetLanguage(adk.LanguageEnglish) ``` -This setting is global and affects all ADK built-in prompts and tool descriptions. +This setting affects all ADK built-in prompts and tool descriptions. diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md index f9afbf5c065..f34fac82c5d 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md @@ -1,17 +1,17 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Skill weight: 3 --- -Skill middleware adds Skill support to Eino ADK agents, enabling agents to dynamically discover and use predefined skills to complete tasks more accurately and efficiently. +Skill Middleware provides Skill support for Eino ADK Agents, enabling agents to dynamically discover and use predefined skills to complete tasks. # What is a Skill -A Skill is a folder that contains instructions, scripts, and resources. Agents can discover and use these skills on demand to extend their capabilities. The core of a Skill is a `SKILL.md` file, which includes metadata (at least `name` and `description`) and guidance for the agent to execute a specific type of task. +A Skill is a folder containing instructions, scripts, and resources. Agents can discover and use these Skills on demand to extend their capabilities. 
The core is a `SKILL.md` file, which includes metadata (at least name and description) and instructions guiding the Agent to execute tasks. ``` my-skill/ @@ -21,11 +21,13 @@ my-skill/ └── assets/ # Optional: templates/resources ``` -Skills use **Progressive Disclosure** to manage context efficiently: +Skills use **Progressive Disclosure** to efficiently manage context: -1. **Discovery**: on startup, the agent only loads each skill’s name and description — enough to decide when the skill might be useful -2. **Activation**: when a task matches a skill’s description, the agent loads the full `SKILL.md` content into context -3. **Execution**: the agent follows the instructions and can load other files or execute bundled code as needed. This keeps the agent responsive while still allowing on-demand access to additional context. + + +1. **Discovery**: The Agent only loads the name and description of each available Skill — enough to determine when the Skill might be needed +2. **Activation**: When a task matches a Skill, the Agent loads the full `SKILL.md` content into context +3. **Execution**: The Agent follows instructions to execute the task, loading other files or executing bundled code as needed > 💡 > Ref: [https://agentskills.io/home](https://agentskills.io/home) @@ -34,7 +36,7 @@ Skills use **Progressive Disclosure** to manage context efficiently: ## FrontMatter -Skill metadata used for quick display during discovery, avoiding loading full content: +Skill metadata structure, parsed from the YAML frontmatter of SKILL.md. Used for quick display during the discovery phase: ```go type FrontMatter struct { @@ -48,11 +50,11 @@ type FrontMatter struct { - - - - - + + + + +
 | Field | Type | Description |
 | --- | --- | --- |
-| `Name` | `string` | Unique identifier of a skill. The agent invokes the skill by name. Use short, meaningful names (e.g. `pdf-processing`, `web-research`). Corresponds to the `name` field in SKILL.md frontmatter. |
-| `Description` | `string` | Description of what the skill does. This is the key basis for the agent to decide whether to use the skill, so it should clearly describe applicable scenarios and capabilities. Corresponds to the `description` field in SKILL.md frontmatter. |
-| `Context` | `ContextMode` | Context mode. Supported values: `fork_with_context` (copy history messages to a new agent for execution), `fork` (create a new agent with isolated context for execution). Empty means inline mode (return skill content directly). |
-| `Agent` | `string` | Agent name to use. Used with `Context`, resolved via `AgentHub`. Empty means using the default agent. |
-| `Model` | `string` | Model name to use. Resolved via `ModelHub`. In context mode, passed to the agent factory; in inline mode, it switches the model used by subsequent ChatModel calls. |
+| `Name` | `string` | Unique identifier of the Skill. Use short, meaningful names (e.g. `pdf-processing`, `web-research`) |
+| `Description` | `string` | Description of the Skill's capabilities. Key basis for the Agent to decide whether to use it; it should clearly describe applicable scenarios and capabilities |
+| `Context` | `ContextMode` | Context mode. Values: `fork` (isolated context), `fork_with_context` (copy history messages). Empty means inline mode |
+| `Agent` | `string` | Agent name to use, used with `Context`, resolved via `AgentHub`. Empty uses the default Agent |
+| `Model` | `string` | Model name to use, resolved via `ModelHub` |
### ContextMode @@ -66,14 +68,14 @@ const ( - - - + + +
 | Mode | Description |
 | --- | --- |
-| Inline (default) | Skill content is returned as the tool result and the current agent continues processing |
-| ForkWithContext | Create a new agent, copy current conversation history, execute the skill independently, and return the result |
-| Fork | Create a new agent with isolated context (only skill content), execute independently, and return the result |
+| Inline (default) | Skill content is returned directly as the tool result, and the current Agent continues processing |
+| `fork_with_context` | Creates a new Agent, copies the current conversation history, executes the Skill task independently, and returns the result |
+| `fork` | Creates a new Agent with isolated context (only Skill content), executes independently, and returns the result |
## Skill -Complete skill structure (metadata + instruction content): +Complete Skill structure, containing metadata and instruction content: ```go type Skill struct { @@ -85,18 +87,14 @@ type Skill struct { - - - + + +
 | Field | Type | Description |
 | --- | --- | --- |
-| `FrontMatter` | `FrontMatter` | Embedded metadata: `Name`, `Description`, `Context`, `Agent`, `Model` |
-| `Content` | `string` | The body of SKILL.md after frontmatter. Contains detailed instructions, workflows, examples, etc. The agent reads it after skill activation. |
-| `BaseDirectory` | `string` | Absolute path of the skill directory. The agent can use this path to access other resources in the skill directory (scripts, templates, references, etc.). |
+| `FrontMatter` | `FrontMatter` | Embedded metadata structure |
+| `Content` | `string` | Body content of SKILL.md after frontmatter, containing detailed instructions, workflows, examples, etc. |
+| `BaseDirectory` | `string` | Absolute path of the Skill directory, through which the Agent can access other resource files in the directory |
## Backend -Skill backend interface defines how skills are retrieved. It decouples skill storage from usage: - -- **Flexible storage**: store skills in local filesystem, databases, remote services, cloud storage, etc. -- **Extensible**: implement custom backends (e.g. load from Git repos, config centers) -- **Test-friendly**: easy to build mock backends for unit tests +Skill backend interface, decoupling skill storage from usage: ```go type Backend interface { @@ -107,13 +105,13 @@ type Backend interface { - - + +
 | Method | Description |
 | --- | --- |
-| `List` | List metadata of all available skills. Called when the agent starts to build the skill tool description, so the agent knows what skills exist. |
-| `Get` | Get full skill content by name. Called when the agent decides to use a skill, returning the full Skill structure including detailed instructions. |
+| `List` | List metadata of all available skills. Called when the Agent starts to build the skill tool description |
+| `Get` | Get complete skill content by name. Called when the Agent decides to use a skill |
### NewBackendFromFilesystem -A filesystem-backed backend implementation that reads skills from a directory via `filesystem.Backend`: +Backend implementation based on the `filesystem.Backend` interface, scanning first-level subdirectories under a specified directory to read skills: ```go type BackendFromFilesystemConfig struct { @@ -126,314 +124,154 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - - + +
 | Field | Type | Required | Description |
 | --- | --- | --- | --- |
-| `Backend` | `filesystem.Backend` | Yes | Filesystem backend implementation used for file operations |
-| `BaseDir` | `string` | Yes | Root directory for skills. It scans all first-level subdirectories and treats the ones containing `SKILL.md` as skills. |
+| `Backend` | `filesystem.Backend` | Yes | Filesystem backend implementation for file operations |
+| `BaseDir` | `string` | Yes | Skill root directory path. Scans first-level subdirectories under this directory, looking for directories containing `SKILL.md` files |
How it works: -- scan first-level subdirectories under `BaseDir` -- look for `SKILL.md` in each subdirectory -- parse YAML frontmatter to get metadata -- deeply nested `SKILL.md` files are ignored - -### filesystem.Backend Implementations +- Scans first-level subdirectories under `BaseDir` +- Looks for `SKILL.md` in each subdirectory +- Parses YAML frontmatter to get metadata +- Deeply nested `SKILL.md` files are ignored -There are two `filesystem.Backend` implementations to choose from. See [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem). +`filesystem.Backend` has two implementations to choose from — see the FileSystem Backend documentation. ## AgentHub and ModelHub -When Skills use context mode (fork/isolate), you need to configure AgentHub and ModelHub: +When Skills use Context mode (fork / fork\_with\_context), AgentHub and ModelHub are needed to provide Agent instances and model instances. + +> 💡 +> The following shows non-generic alias types (i.e., `*schema.Message` specialization). Generic versions `TypedAgentHub[M]` and `TypedModelHub[M]` are available for `*schema.AgenticMessage` scenarios, with identical interface signatures differing only in the message type parameter. ```go -// AgentHubOptions contains options passed to AgentHub.Get when creating an agent for skill execution. -type AgentHubOptions struct { - // Model is the resolved model instance when a skill specifies a "model" field in frontmatter. - // nil means the skill did not specify a model override; implementations should use their default. - Model model.ToolCallingChatModel +// AgentHubOptions passed to AgentHub.Get +type AgentHubOptions = TypedAgentHubOptions[*schema.Message] + +type TypedAgentHubOptions[M adk.MessageType] struct { + // Model is the model instance specified in the skill frontmatter (resolved via ModelHub). + // nil means the skill did not specify a model override; implementations should use the default model. 
+ Model model.BaseModel[M] } -// AgentHub provides agent instances for context mode (fork/fork_with_context) execution. -type AgentHub interface { - // Get returns an Agent by name. When name is empty, implementations should return a default agent. - // The opts parameter carries skill-level overrides (e.g., model) resolved by the framework. - Get(ctx context.Context, name string, opts *AgentHubOptions) (adk.Agent, error) +// AgentHub provides Agent instances for Context mode +type AgentHub = TypedAgentHub[*schema.Message] + +type TypedAgentHub[M adk.MessageType] interface { + // Get returns an Agent by name. When name is empty, should return the default Agent. + Get(ctx context.Context, name string, opts *TypedAgentHubOptions[M]) (adk.TypedAgent[M], error) } -// ModelHub provides model instances. -type ModelHub interface { - Get(ctx context.Context, name string) (model.ToolCallingChatModel, error) +// ModelHub resolves model instances by name +type ModelHub = TypedModelHub[*schema.Message] + +type TypedModelHub[M adk.MessageType] interface { + Get(ctx context.Context, name string) (model.BaseModel[M], error) } ``` -## +> 💡 +> Note: The return type of `AgentHubOptions.Model` and `ModelHub.Get` is `model.BaseModel[M]`, not the `model.ToolCallingChatModel` from older documentation. 
-## Initialization +## SubAgentInput and SubAgentOutput -Create the Skill middleware (recommended: `NewMiddleware`): +These two structs are used when customizing fork mode behavior: ```go -func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +type SubAgentInput = TypedSubAgentInput[*schema.Message] + +type TypedSubAgentInput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string // Raw JSON arguments + SkillContent string // Constructed Skill content + History []M // Conversation history (only in fork_with_context mode) + ToolCallID string // Tool call ID (only in fork_with_context mode) +} + +type SubAgentOutput = TypedSubAgentOutput[*schema.Message] + +type TypedSubAgentOutput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string + Messages []M // All messages produced by the sub-Agent + Results []string // Extracted assistant message text content +} ``` -Config: +# Initialization + +## Config ```go -type Config struct { - // Backend is required - Backend Backend - - // SkillToolName defaults to "skill" - SkillToolName *string - - // AgentHub provides agent factories for context mode - // Required when skill uses "context: fork" or "context: isolate" - AgentHub AgentHub - - // ModelHub provides model instances for skill-specified models - ModelHub ModelHub - - // CustomSystemPrompt customizes system prompt - CustomSystemPrompt SystemPromptFunc - - // CustomToolDescription customizes tool description +type Config = TypedConfig[*schema.Message] + +type TypedConfig[M adk.MessageType] struct { + Backend Backend + SkillToolName *string + AgentHub TypedAgentHub[M] + ModelHub TypedModelHub[M] + + CustomSystemPrompt SystemPromptFunc CustomToolDescription ToolDescriptionFunc + CustomToolParams func(ctx context.Context, defaults map[string]*schema.ParameterInfo) (map[string]*schema.ParameterInfo, error) + BuildContent func(ctx context.Context, skill Skill, rawArgs string) 
(string, error) + BuildForkMessages func(ctx context.Context, in TypedSubAgentInput[M]) ([]M, error) + FormatForkResult func(ctx context.Context, in TypedSubAgentOutput[M]) (string, error) } ``` - - - - - - + + + + + + + + + +
 | Field | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
-| `Backend` | `Backend` | Yes | - | Skill backend implementation responsible for storage and retrieval. You can use the built-in `LocalBackend` or provide your own. |
-| `SkillToolName` | `*string` | No | `"skill"` | Name of the skill tool. Agents invoke skills via this tool name. If your agent already has a tool with the same name, set this to avoid conflicts. |
-| `AgentHub` | `AgentHub` | No | - | Provides agent factories. Required when a skill uses `context: fork` or `context: isolate`. |
-| `ModelHub` | `ModelHub` | No | - | Provides model instances. Used when a skill specifies the `model` field. |
-| `CustomSystemPrompt` | `SystemPromptFunc` | No | Built-in prompt | Custom system prompt function |
-| `CustomToolDescription` | `ToolDescriptionFunc` | No | Built-in description | Custom tool description function |
+| `Backend` | `Backend` | Yes | - | Skill backend implementation, responsible for skill storage and retrieval |
+| `SkillToolName` | `*string` | No | `"skill"` | Skill tool name. Can be customized to avoid conflicts if a tool with the same name already exists |
+| `AgentHub` | `TypedAgentHub[M]` | No | - | Provides Agent instances. Required when using `context: fork` or `fork_with_context` |
+| `ModelHub` | `TypedModelHub[M]` | No | - | Provides model instances. Passed to AgentHub in Context mode; in inline mode, switches the model for subsequent ChatModel calls via WrapModel |
+| `CustomSystemPrompt` | `SystemPromptFunc` | No | Built-in prompt | Custom system prompt. Signature: `func(ctx, toolName) string` |
+| `CustomToolDescription` | `ToolDescriptionFunc` | No | Built-in description | Custom tool description. Signature: `func(ctx, skills []FrontMatter) string` |
+| `CustomToolParams` | `func` | No | Only `skill` param | Custom tool parameter schema. Receives defaults and returns custom params; `skill` is always required |
+| `BuildContent` | `func` | No | Default formatting | Custom Skill content generation, can inject additional context into the content |
+| `BuildForkMessages` | `func` | No | See below | Custom initial messages for the sub-Agent in fork mode. Default: `fork`: `[UserMessage(content)]`, `fork_with_context`: `[history..., ToolMessage(content, callID)]` |
+| `FormatForkResult` | `func` | No | Concatenate content | Custom sub-Agent result formatting. Default concatenates assistant message content |
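The default behavior listed for `BuildForkMessages` can be sketched as follows. This is a simplified illustration rather than the middleware's code: `message`, the mode constant names, and `defaultForkMessages` are hypothetical stand-ins, and only the two string values `fork` and `fork_with_context` come from the documentation above.

```go
package main

import "fmt"

// contextMode and its constants are illustrative; only the string values
// "fork" and "fork_with_context" are taken from the documentation.
type contextMode string

const (
	modeFork            contextMode = "fork"
	modeForkWithContext contextMode = "fork_with_context"
)

// message is a hypothetical, minimal stand-in for a chat message.
type message struct {
	Role       string
	Content    string
	ToolCallID string
}

// defaultForkMessages builds the initial messages for the sub-Agent:
//   - fork: a single user message carrying the skill content
//   - fork_with_context: the copied history followed by a tool message
//     that carries the skill content and the originating tool call ID
func defaultForkMessages(mode contextMode, content string, history []message, callID string) ([]message, error) {
	switch mode {
	case modeFork:
		return []message{{Role: "user", Content: content}}, nil
	case modeForkWithContext:
		msgs := make([]message, 0, len(history)+1)
		msgs = append(msgs, history...) // copy so the caller's slice is not modified
		msgs = append(msgs, message{Role: "tool", Content: content, ToolCallID: callID})
		return msgs, nil
	default:
		return nil, fmt.Errorf("mode %q does not fork a sub-agent", mode)
	}
}

func main() {
	history := []message{{Role: "user", Content: "Analyze test.log"}}
	msgs, err := defaultForkMessages(modeForkWithContext, "skill instructions", history, "call-1")
	fmt.Println(len(msgs), err)
}
```

A custom `BuildForkMessages` would replace this function wholesale, for instance to trim the history before handing it to the sub-Agent.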
    -# Quick Start - -Example: loading a pdf skill locally. Full code: [https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill](https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill). - -- Create a skills directory under your working directory: +## NewMiddleware ```go -workdir/ -├── skills/ -│ └── pdf/ -│ ├── scripts -│ │ └── analyze.py -│ └── SKILL.md -└── other files +func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) ``` -- Create a local filesystem backend and build the Skill middleware: +Creates the Skill Middleware, returns `adk.ChatModelAgentMiddleware`, used in `ChatModelAgentConfig.Handlers`. -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/skill" - "github.com/cloudwego/eino-ext/adk/backend/local" -) +> 💡 +> Generic version `NewTyped[M](ctx, config)` returns `adk.TypedChatModelAgentMiddleware[M]`, usable with `*schema.AgenticMessage` type Agents. -ctx := context.Background() +## Usage Example -be, err := local.NewBackend(ctx, &local.Config{}) +```go +// 1. Create Backend +backend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ + Backend: fsBackend, + BaseDir: "/path/to/skills", +}) if err != nil { - log.Fatal(err) + return err } -skillBackend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ - Backend: be, - BaseDir: skillsDir, +// 2. 
Create Middleware +handler, err := skill.NewMiddleware(ctx, &skill.Config{ + Backend: backend, + AgentHub: myAgentHub, // Optional, only needed for fork mode + ModelHub: myModelHub, // Optional, only needed when using model field }) if err != nil { - log.Fatalf("Failed to create skill backend: %v", err) + return err } -sm, err := skill.NewMiddleware(ctx, &skill.Config{ - Backend: skillBackend, -}) -``` - -- Create a local FileSystem middleware so the agent can read other skill files and execute scripts: - -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -fsm, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: be, - StreamingShell: be, -}) -``` - -- Create an agent and configure middlewares: - -```go +// 3. Pass to Agent's Handlers agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LogAnalysisAgent", - Description: "An agent that can analyze logs", - Instruction: "You are a helpful assistant.", - Model: cm, - Handlers: []adk.ChatModelAgentMiddleware{fsm, sm}, + // ... 
other config + Handlers: []adk.ChatModelAgentMiddleware{handler}, }) ``` - -- Run the agent and observe output: - -```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: agent, -}) - -input := fmt.Sprintf("Analyze the %s file", filepath.Join(workDir, "test.log")) -log.Println("User: ", input) - -iterator := runner.Query(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - if event.Err != nil { - log.Printf("Error: %v\n", event.Err) - break - } - - prints.Event(event) -} -``` - -Agent output: - -```yaml -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: skill -arguments: {"skill":"log_analyzer"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Launching skill: log_analyzer -Base directory for this skill: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer -# SKILL.md content - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: execute -arguments: {"command": "python3 /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer/scripts/analyze.py /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Analysis Result for /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log: -Total Errors: 2 -Total Warnings: 1 - -Error Details: -Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -Line 5: [2024-05-20 10:03:05] ERROR: Connection timed out. - -Warning Details: -Line 2: [2024-05-20 10:01:23] WARNING: High memory usage detected. - - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -answer: Here's the analysis result of the log file: - -### Summary -- **Total Errors**: 2 -- **Total Warnings**: 1 - -### Detailed Entries -#### Errors: -1. Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -2. 
Line5: [2024-05-2010:03:05] ERROR: Connection timed out. - -#### Warnings: -1. Line2: [2024-05-2010:01:23] WARNING: High memory usage detected. - -The log file contains critical issues related to database connectivity and a warning about memory usage. Let me know if you need further analysis! -``` - -# How It Works - -The Skill middleware adds a system prompt and a skill tool to the agent. The system prompt is below, where `{tool_name}` is the tool name of the skill tool: - -```python -# Skills System - -**How to Use Skills (Progressive Disclosure):** - -Skills follow a **progressive disclosure** pattern - you see their name and description above, but only read full instructions when needed: - -1. **Recognize when a skill applies**: Check if the user's task matches a skill's description -2. **Read the skill's full instructions**: Use the '{tool_name}' tool to load skill -3. **Follow the skill's instructions**: tool result contains step-by-step workflows, best practices, and examples -4. **Access supporting files**: Skills may include helper scripts, configs, or reference docs - use absolute paths - -**When to Use Skills:** -- User's request matches a skill's domain (e.g., "research X" -> web-research skill) -- You need specialized knowledge or structured workflows -- A skill provides proven patterns for complex tasks - -**Executing Skill Scripts:** -Skills may contain Python scripts or other executable files. Always use absolute paths. - -**Example Workflow:** - -User: "Can you research the latest developments in quantum computing?" - -1. Check available skills -> See "web-research" skill -2. Call '{tool_name}' tool to read the full skill instructions -3. Follow the skill's research workflow (search -> organize -> synthesize) -4. Use any helper scripts with absolute paths - -Remember: Skills make you more capable and consistent. When in doubt, check if a skill exists for the task! 
-``` - -The skill tool takes a skill name to load and returns the full content of the corresponding SKILL.md. Its tool description lists all available skills with their names and descriptions: - -```sql -Execute a skill within the main conversation - - -When users ask you to perform tasks, check if any of the available skills below can help complete the task more effectively. Skills provide specialized capabilities and domain knowledge. - -How to invoke: -- Use this tool with the skill name only (no arguments) -- Examples: - - `skill: pdf` - invoke the pdf skill - - `skill: xlsx` - invoke the xlsx skill - - `skill: ms-office-suite:pdf` - invoke using fully qualified name - -Important: -- When a skill is relevant, you must invoke this tool IMMEDIATELY as your first action -- NEVER just announce or mention a skill in your text response without actually calling this tool -- This is a BLOCKING REQUIREMENT: invoke the relevant Skill tool BEFORE generating any other response about the task -- Only use skills listed in below -- Do not invoke a skill that is already running -- Do not use this tool for built-in CLI commands (like /help, /clear, etc.) - - - -{{- range .Matters }} - - -{{ .Name }} - - -{{ .Description }} - - -{{- end }} - -``` - -Example: - - - -> 💡 -> Skill middleware only provides the ability to load SKILL.md as shown above. If a skill requires the agent to read files, execute scripts, etc., users need to configure those capabilities for the agent separately. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md index 8895dc6f705..521446a5ebd 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md @@ -1,210 +1,343 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Summarization -weight: 3 +weight: 4 --- +> 💡 +> This middleware was introduced in v0.8.0. Package path: `github.com/cloudwego/eino/adk/middlewares/summarization` + ## Overview -The Summarization middleware automatically compresses conversation history when the token count exceeds a configured threshold. This helps maintain context continuity in long conversations while staying within the model's token limits. +The Summarization middleware automatically calls a summary model to compress conversation history when the conversation token count exceeds a threshold, keeping long conversations coherent within the model's context window. The middleware hooks into `BeforeModelRewriteState`, checking trigger conditions before each model call. When triggered, it executes: counting → summary generation (with retry/failover) → post-processing → state replacement. -> 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +## Generic System -## Quick Start +All core types and functions in this package provide both a **Typed generic version** (`M adk.MessageType`) and a **non-generic alias** (fixed to `*schema.Message`). -```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/summarization" -) + + + + + + + + + + + + + + + +
+| Generic Version | Non-generic Alias (= Typed\[*schema.Message\]) |
+| --- | --- |
+| `TypedConfig[M]` | `Config` |
+| `NewTyped[M](ctx, *TypedConfig[M])` | `New(ctx, *Config)` |
+| `TypedTokenCounterFunc[M]` | `TokenCounterFunc` |
+| `TypedGenModelInputFunc[M]` | `GenModelInputFunc` |
+| `TypedGetFailoverModelFunc[M]` | `GetFailoverModelFunc` |
+| `TypedFinalizeFunc[M]` | `FinalizeFunc` |
+| `TypedCallbackFunc[M]` | `CallbackFunc` |
+| `TypedUserMessageFilterFunc[M]` | `UserMessageFilterFunc` |
+| `TypedPreserveUserMessages[M]` | `PreserveUserMessages` |
+| `TypedRetryConfig[M]` | `RetryConfig` |
+| `TypedFailoverConfig[M]` | `FailoverConfig` |
+| `TypedFailoverContext[M]` | `FailoverContext` |
+| `TypedFinalizerBuilder[M]` | `FinalizerBuilder` |
    -// Create middleware with minimal configuration -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, // Required: model used for generating summaries -}) -if err != nil { - // Handle error -} +Unless otherwise noted, type signatures in this document use the generic form `M`. When using non-generic aliases, `M` = `*schema.Message`. -// Use with ChatModelAgent -agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, - Middlewares: []adk.ChatModelAgentMiddleware{mw}, -}) +### Constructors + +```go +// Generic version — supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) + +// Non-generic version — equivalent to NewTyped[*schema.Message] +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) ``` -## Configuration Options +## TypedConfig[M] Configuration - - - - - - - - - - - + + + + + + + + + + + + +
 | Field | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
-| `Model` | `model.BaseChatModel` | Yes | - | Chat model used for generating summaries |
-| `ModelOptions` | `[]model.Option` | No | - | Options passed to the model when generating summaries |
-| `TokenCounter` | `TokenCounterFunc` | No | ~4 chars/token | Custom token counting function |
-| `Trigger` | `*TriggerCondition` | No | 190,000 tokens | Condition to trigger summarization |
-| `Instruction` | `string` | No | Built-in prompt | Custom summarization instruction |
-| `TranscriptFilePath` | `string` | No | - | Full conversation transcript file path |
-| `Prepare` | `PrepareFunc` | No | - | Custom preprocessing function before summary generation |
-| `Finalize` | `FinalizeFunc` | No | - | Custom post-processing function for final messages |
-| `Callback` | `CallbackFunc` | No | - | Called after Finalize to observe state changes (read-only) |
-| `EmitInternalEvents` | `bool` | No | false | Whether to emit internal events |
-| `PreserveUserMessages` | `*PreserveUserMessages` | No | Enabled: true | Whether to preserve original user messages in summary |
+| `Model` | `model.BaseModel[M]` | Yes | | Model used to generate summaries |
+| `ModelOptions` | `[]model.Option` | No | | Options passed to the summary model |
+| `TokenCounter` | `TypedTokenCounterFunc[M]` | No | Estimates based on the most recent assistant message's total_tokens as baseline, incremental messages at ~4 chars/token | Custom token counting function |
+| `Trigger` | `*TriggerCondition` | No | ContextTokens=160,000 | Condition to trigger summarization |
+| `UserInstruction` | `string` | No | Built-in prompt | Custom user-level summarization instruction, overrides the default instruction |
+| `TranscriptFilePath` | `string` | No | | Full conversation transcript file path, appended to the summary to remind the model where to find original context. Only effective when Finalize is not set |
+| `GenModelInput` | `TypedGenModelInputFunc[M]` | No | sysInstruction → contextMsgs → userInstruction | Full control over constructing the summary model input |
+| `Finalize` | `TypedFinalizeFunc[M]` | No | Built-in post-processing | Custom summary post-processing. When set, the middleware no longer performs any default post-processing |
+| `Callback` | `TypedCallbackFunc[M]` | No | | Called after Finalize, with parameters `before, after adk.TypedChatModelAgentState[M]` (value types), read-only |
+| `EmitInternalEvents` | `bool` | No | false | Whether to send internal events at key points |
+| `PreserveUserMessages` | `*TypedPreserveUserMessages[M]` | No | Enabled: true | Preserve original user messages in the summary. Only effective when Finalize is not set |
+| `Retry` | `*TypedRetryConfig[M]` | No | nil (no retry) | Retry strategy for the primary model summary generation |
+| `Failover` | `*TypedFailoverConfig[M]` | No | nil | Failover strategy after primary model failure |
    -### TriggerCondition Structure +> 💡 +> **Finalize override semantics**: Once a custom `Finalize` is set, the middleware will **skip all default post-processing** — `PreserveUserMessages` and `TranscriptFilePath` will no longer take effect. To reuse default post-processing logic in a custom Finalize, use the `DefaultFinalizer` function. + +## Sub-configuration Structs + +### TriggerCondition + +Summarization is triggered when **any** condition is met. ```go type TriggerCondition struct { - // ContextTokens triggers summarization when total token count exceeds this threshold - ContextTokens int + ContextTokens int // Trigger when token count exceeds this threshold + ContextMessages int // Trigger when message count exceeds this threshold } ``` -### PreserveUserMessages Structure +### TypedPreserveUserMessages\[M\] + +When enabled, replaces the `...` section in the summary with the most recent original user messages. ```go -type PreserveUserMessages struct { - // Enabled whether to enable user message preservation - Enabled bool - - // MaxTokens maximum tokens for preserved user messages - // Only preserves the most recent user messages until this limit is reached - // Defaults to 1/3 of TriggerCondition.ContextTokens - MaxTokens int +type TypedPreserveUserMessages[M adk.MessageType] struct { + Enabled bool + MaxTokens int // Max tokens for preserved user messages; defaults to TriggerCondition.ContextTokens / 3 + Filter TypedUserMessageFilterFunc[M] // Filter function, return false to exclude a message } ``` -### Configuration Examples +### TypedRetryConfig[M] -**Custom Token Threshold** +```go +type TypedRetryConfig[M adk.MessageType] struct { + MaxRetries *int // Default 3 + ShouldRetry func(ctx context.Context, resp M, err error) bool // Default: retry when err != nil + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // Default: exponential backoff + jitter +} +``` + +### TypedFailoverConfig[M] ```go -mw, err := 
summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Trigger: &summarization.TriggerCondition{ - ContextTokens: 100000, // Trigger at 100k tokens - }, -}) +type TypedFailoverConfig[M adk.MessageType] struct { + MaxRetries *int // Default 3 + ShouldFailover func(ctx context.Context, resp M, err error) bool // Default: failover when err != nil + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration + GetFailoverModel TypedGetFailoverModelFunc[M] // Returns (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error) +} ``` -**Custom Token Counter** +### TypedFailoverContext[M] + +Context passed to the `GetFailoverModel` callback. ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - TokenCounter: func(ctx context.Context, input *summarization.TokenCounterInput) (int, error) { - // Use your tokenizer - return yourTokenizer.Count(input.Messages) - }, -}) +type TypedFailoverContext[M adk.MessageType] struct { + Attempt int // Current failover attempt number, starting from 1 + SystemInstruction M // System instruction (set internally by the middleware, not configurable) + UserInstruction M // User instruction + OriginalMessages []M // Original complete conversation + LastModelResponse M // Model response from the last attempt + LastErr error +} ``` -**Set Transcript File Path** +### TypedTokenCounterInput[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, +type TypedTokenCounterInput[M adk.MessageType] struct { + Messages []M + Tools []*schema.ToolInfo +} +``` + +## Function Type Signature Reference + +```go +type TypedTokenCounterFunc[M] func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error) +type TypedGenModelInputFunc[M] func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error) +type TypedGetFailoverModelFunc[M] func(ctx context.Context, failoverCtx 
*TypedFailoverContext[M]) (model.BaseModel[M], []M, error) +type TypedFinalizeFunc[M] func(ctx context.Context, originalMessages []M, summary M) ([]M, error) +type TypedCallbackFunc[M] func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error +type TypedUserMessageFilterFunc[M] func(ctx context.Context, msg M) (bool, error) +``` + +## DefaultFinalizer + +`DefaultFinalizer` is a standalone factory function that returns a `TypedFinalizeFunc[M]` consistent with the middleware's default post-processing logic. Use it when you need to reuse default logic (preserving user messages, appending transcript path, etc.) in a custom `Finalize`. + +```go +func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error) +``` + +### DefaultFinalizerConfig[M] + +```go +type DefaultFinalizerConfig[M adk.MessageType] struct { + PreserveUserMessages *TypedPreserveUserMessages[M] // Default Enabled=true, MaxTokens=30000 + TranscriptFilePath string +} +``` + +**Example**: Execute default post-processing first in a custom Finalize, then add a system message: + +```go +defaultFinalize, err := summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{ TranscriptFilePath: "/path/to/transcript.txt", }) +if err != nil { + // handle error +} + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, originalMessages, summary) + if err != nil { + return nil, err + } + // Add system message before the summary + return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil + }, +} ``` -**Custom Finalize Function** +## FinalizerBuilder + +`TypedFinalizerBuilder[M]` provides a chainable API for building `TypedFinalizeFunc[M]`, supporting linking multiple handlers and an optional custom finalizer. 
```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Finalize: func(ctx context.Context, originalMessages []adk.Message, summary adk.Message) ([]adk.Message, error) { - // Custom logic to build final messages - return []adk.Message{ - schema.SystemMessage("Your system prompt"), - summary, - }, nil - }, -}) +func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M] +func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message] + +func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error) ``` -**Using Callback to Observe State Changes/Store** +Execution order: Handlers transform the summary in registration order → Custom determines the final output message list. If Custom is not set, returns `[]M{summary}`. + +### PreserveSkills + +Preserves skill content loaded by the Skill middleware after summary compression, ensuring the agent retains skill knowledge after context window compression. ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Callback: func(ctx context.Context, before, after adk.ChatModelAgentState) error { - log.Printf("Summarization completed: %d messages -> %d messages", - len(before.Messages), len(after.Messages)) - return nil - }, -}) +type PreserveSkillsConfig struct { + SkillToolName string // Skill tool name, must match the Skill middleware. Default "skill" + MaxSkills *int // Maximum number of skills to preserve. Default 5; 0 means disabled + MaxTokensPerSkill *int // Maximum tokens per skill, truncated if exceeded. Default 5000 + SkillsTokenBudget *int // Total token budget for all skills. 
Default 25000 +} ``` -**Control User Message Preservation** +**Example**: ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - PreserveUserMessages: &summarization.PreserveUserMessages{ - Enabled: true, - MaxTokens: 50000, // Preserve up to 50k tokens of user messages - }, -}) +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{}). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## How It Works +## Summarize Method -```mermaid -flowchart TD - A[BeforeModelRewriteState] --> B{Token count exceeds threshold?} - B -->|No| C[Return original state] - B -->|Yes| D[Emit BeforeSummary event] - D --> E{Has custom Prepare?} - E -->|Yes| F[Call Prepare] - E -->|No| G[Call model to generate summary] - F --> G - G --> H{Has custom Finalize?} - H -->|Yes| I[Call Finalize] - H -->|No| L{Has custom Callback?} - I --> L - L -->|Yes| M[Call Callback] - L -->|No| J[Emit AfterSummary event] - M --> J - J --> K[Return new state] - - style A fill:#e3f2fd - style G fill:#fff3e0 - style D fill:#e8f5e9 - style J fill:#e8f5e9 - style K fill:#c8e6c9 - style C fill:#f5f5f5 - style M fill:#fce4ec - style F fill:#fff3e0 - style I fill:#fff3e0 +`TypedMiddleware[M]` exposes a `Summarize` method that can manually trigger a summarization outside of the middleware's automatic trigger: + +```go +func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error) ``` +This method executes the full summarization flow (generation → post-processing → Callback → events) but **does not check trigger conditions**. Returns the replaced message list. 
+ +## How It Works + + + +**Trigger condition check**: First checks `ContextMessages` (message count), then calculates token count via `TokenCounter` and compares with `ContextTokens`. Triggered if either is met. + +**Default post-processing** (when Finalize is not set): + +1. Replaces `...` in the summary with the most recent original user messages (controlled by `PreserveUserMessages`) +2. Appends `TranscriptFilePath` hint +3. Adds summary preamble and continuation instructions + ## Internal Events -When EmitInternalEvents is set to true, the middleware emits events at key points: +When `EmitInternalEvents = true`, the middleware sends events via `adk.TypedSendEvent`: - - + + +
 | Event Type | Trigger Timing | Carried Data |
 | --- | --- | --- |
-| ActionTypeBeforeSummary | Before generating summary | Original message list |
-| ActionTypeAfterSummary | After completing summary | Final message list |
+| `ActionTypeBeforeSummarize` | After trigger condition is met, before calling the model | `TypedBeforeSummarizeAction[M]{Messages}`: original message list |
+| `ActionTypeGenerateSummary` | After each model generation attempt (including retry/failover) | `TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()}` |
+| `ActionTypeAfterSummarize` | After summary completion and Finalize | `TypedAfterSummarizeAction[M]{Messages}`: final message list |
    -**Usage Example** +Events are wrapped in `TypedCustomizedAction[M]` and placed in the `adk.AgentAction.CustomizedAction` field. `GenerateSummaryPhase` has two values: `GenerateSummaryPhasePrimary` (primary model/retry) and `GenerateSummaryPhaseFailover` (failover). + +## Usage Examples + +### Minimal Configuration ```go mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - EmitInternalEvents: true, + Model: yourChatModel, }) -// Listen for events in your event handler +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: yourChatModel, + Middlewares: []adk.ChatModelAgentMiddleware{mw}, +}) +``` + +### Custom Trigger + Retry + Failover + +```go +mw, err := summarization.New(ctx, &summarization.Config{ + Model: yourChatModel, + Trigger: &summarization.TriggerCondition{ + ContextTokens: 100000, + ContextMessages: 80, + }, + TranscriptFilePath: "/path/to/transcript.txt", + Retry: &summarization.RetryConfig{ + MaxRetries: ptrOf(2), + }, + Failover: &summarization.FailoverConfig{ + MaxRetries: ptrOf(3), + GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) { + return backupModel, nil, nil // Returning nil input will reuse the default input + }, + }, +}) +``` + +### FinalizerBuilder + PreserveSkills + DefaultFinalizer + +```go +defaultFinalize, _ := summarization.DefaultFinalizer[*schema.Message]( + &summarization.DefaultFinalizerConfig[*schema.Message]{ + TranscriptFilePath: "/path/to/transcript.txt", + }, +) + +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{ + MaxSkills: ptrOf(3), + }). 
+ Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, origMsgs, summary) + if err != nil { + return nil, err + } + return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## Best Practices +## Notes -1. **Set TranscriptFilePath**: It's recommended to always provide a conversation transcript file path so the model can reference the original conversation when needed. -2. **Adjust Token Threshold**: Adjust `Trigger.MaxTokens` based on the model's context window size. Generally recommended to set it to 80-90% of the model's limit. -3. **Custom Token Counter**: In production environments, it's recommended to implement a custom `TokenCounter` that matches the model's tokenizer for accurate counting. +1. **Set TranscriptFilePath**: Strongly recommended to provide a conversation transcript file path so the model can trace back details from the original records after summarization. +2. **Adjust trigger threshold**: `Trigger.ContextTokens` should be set to 80-90% of the model's context window. The default value of 160,000 is suitable for models with 200k windows. +3. **Custom TokenCounter**: For production environments, it's recommended to implement a counter that precisely matches the model's tokenizer. The default estimator uses the most recent assistant message's `ResponseMeta.Usage.TotalTokens` as a baseline and estimates incremental messages at ~4 chars/token. +4. **Finalize override**: Setting `Finalize` means `PreserveUserMessages` and `TranscriptFilePath` no longer take effect automatically. To reuse them, use `DefaultFinalizer` or `FinalizerBuilder`. +5. **GetFailoverModel constraint**: The callback must return a non-nil model and non-empty input message list. 
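
The usage examples call `ptrOf(...)` for pointer-typed fields such as `MaxRetries` and `MaxSkills`, but the helper is not defined on this page. A minimal sketch, assuming it is an ordinary user-defined generic function (the name `ptrOf` is taken from the examples; its definition here is an assumption, not part of the `summarization` package):

```go
package main

import "fmt"

// ptrOf returns a pointer to a copy of v.
// Hypothetical helper: the examples use ptrOf without defining it.
func ptrOf[T any](v T) *T {
	return &v
}

func main() {
	maxRetries := ptrOf(2) // explicit value
	maxSkills := ptrOf(0)  // explicit zero, distinct from a nil pointer
	fmt.Println(*maxRetries, *maxSkills)
}
```

Pointer fields let the configuration distinguish "not set" (nil, so the documented default applies) from an explicit zero, which matters for fields like `MaxSkills`, where 0 is documented as "disabled".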
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md index 3fe9bf47048..cd738ecdbdf 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md @@ -1,25 +1,23 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-17" lastmod: "" tags: [] -title: ToolReduction -weight: 6 +title: Reduction +weight: 5 --- -# ToolReduction Middleware - -adk/middlewares/reduction +`adk/middlewares/reduction` > 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). +> This middleware was introduced in v0.8.0. ## Overview -The `reduction` middleware is used to control the token count occupied by tool results, providing two strategies: +The `reduction` middleware manages the token count occupied by tool outputs in Agent conversations, operating in two phases: -1. **Truncation**: Immediately truncate overly long outputs when a tool returns, saving the complete content to Backend -2. **Clear**: When total tokens exceed the threshold, store old tool results to the file system +1. **Truncation**: Triggered immediately when a tool call returns. When a single output exceeds `MaxLengthForTrunc`, the full content is stored in the Backend and the message is replaced with a truncated summary. +2. **Clear**: Triggered before model calls (`BeforeModelRewriteState`). When total tokens exceed `MaxTokensForClear`, it iterates through historical messages and offloads old tool arguments and results to the Backend. 
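
To make the truncation phase concrete, the following sketch keeps the head and tail of an oversized output, using the preview size `MaxLengthForTrunc/(textParts*2)` described in the How It Works section of this page. It is an illustration under assumptions, not the middleware's handler: `previewOf` and the `[truncated]` marker are hypothetical, "character" is taken to mean a Unicode code point, and the length check looks at a single text part rather than the combined length of all parts.

```go
package main

import "fmt"

// previewOf keeps the first and last `keep` runes of s, where
// keep = maxLen / (textParts * 2), and reports whether truncation happened.
// Hypothetical helper; the real handler also offloads the full content.
func previewOf(s string, maxLen, textParts int) (string, bool) {
	r := []rune(s)
	if textParts < 1 {
		textParts = 1
	}
	// A non-positive limit is treated as "do not truncate" in this sketch.
	if maxLen <= 0 || len(r) <= maxLen {
		return s, false
	}
	keep := maxLen / (textParts * 2)
	return string(r[:keep]) + "\n...[truncated]...\n" + string(r[len(r)-keep:]), true
}

func main() {
	preview, truncated := previewOf("abcdefghij", 4, 1)
	fmt.Printf("%q truncated=%v\n", preview, truncated)

	short, truncated := previewOf("abc", 4, 1)
	fmt.Printf("%q truncated=%v\n", short, truncated)
}
```

In the middleware, the truncated message would additionally carry the Backend file path so the agent can read the full content back with the configured read-file tool.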
--- @@ -30,12 +28,13 @@ Tool call returns result │ ▼ ┌─────────────────────────────────────────────────────────────┐ -│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapEnhancedInvokableToolCall / WrapEnhancedStreamable │ │ │ -│ Truncation strategy (can be skipped) │ +│ Truncation (can be skipped via SkipTruncation) │ │ Result length > MaxLengthForTrunc? │ │ Yes → Truncate content, save full content to Backend │ -│ No → Return as-is │ +│ No → Return as-is │ └─────────────────────────────────────────────────────────────┘ │ ▼ @@ -45,11 +44,14 @@ Tool call returns result ┌─────────────────────────────────────────────────────────────┐ │ BeforeModelRewriteState │ │ │ -│ Clear strategy (can be skipped) │ +│ Clear (can be skipped via SkipClear) │ │ Total tokens > MaxTokensForClear? │ -│ Yes → Store old tool results to Backend, replace with │ -│ file paths │ -│ No → Do nothing │ +│ Yes → ClearMessageRewriter preprocessing │ +│ → Old tool results stored to Backend, replaced │ +│ with file paths │ +│ → ClearAtLeastTokens minimum release check │ +│ → ClearPostProcess callback │ +│ No → Do nothing │ └─────────────────────────────────────────────────────────────┘ │ ▼ @@ -58,95 +60,75 @@ Tool call returns result --- -## Configuration +## Generic System -### Config Main Configuration +This middleware follows the ADK standard generic pattern, supporting both `*schema.Message` and `*schema.AgenticMessage`: ```go -type Config struct { - // Backend storage backend for saving truncated/cleared content - // Required when SkipTruncation is false - Backend Backend - - // SkipTruncation skip the truncation phase - SkipTruncation bool - - // SkipClear skip the clear phase - SkipClear bool - - // ReadFileToolName name of the tool for reading files - // After content is offloaded to a file, the agent needs this tool to read it - // Default "read_file" - ReadFileToolName string +// Generic config, M is constrained to adk.MessageType 
+type TypedConfig[M adk.MessageType] struct { ... } - // RootDir root directory for saving content - // Default "/tmp" - // Truncated content saved to {RootDir}/trunc/{tool_call_id} - // Cleared content saved to {RootDir}/clear/{tool_call_id} - RootDir string - - // MaxLengthForTrunc maximum length to trigger truncation - // Default 50000 - MaxLengthForTrunc int +// Backward-compatible alias +type Config = TypedConfig[*schema.Message] +``` - // TokenCounter token counter - // Used to determine if clearing needs to be triggered - // Default uses character_count/4 estimation - TokenCounter func(ctx context.Context, msg []adk.Message, tools []*schema.ToolInfo) (int64, error) +Constructors are also available in both generic and non-generic forms: - // MaxTokensForClear token threshold to trigger clearing - // Default 30000 - MaxTokensForClear int64 +```go +func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` - // ClearRetentionSuffixLimit how many recent conversation rounds to keep without clearing - // Default 1 - ClearRetentionSuffixLimit int +--- - // ClearPostProcess callback after clearing completes - // Can be used to save or notify current state - ClearPostProcess func(ctx context.Context, state *adk.ChatModelAgentState) context.Context +## Configuration - // ToolConfig configuration for specific tools - // Takes precedence over global configuration - ToolConfig map[string]*ToolReductionConfig -} -``` +### TypedConfig[M] Main Configuration + + + + + + + + + + + + + + + + + + + + +
+| Field | Type | Description |
+| --- | --- | --- |
+| `Backend` | `Backend` | Storage backend. Required when `SkipTruncation` is false; can be nil when only doing Clear without offload. |
+| `SkipTruncation` | `bool` | Skip the truncation phase. |
+| `SkipClear` | `bool` | Skip the clear phase. |
+| `ReadFileToolName` | `string` | Tool name for reading offloaded content. Default `"read_file"`. |
+| `RootDir` | `string` | Root directory for saving content. Default `"/tmp"`. Truncated content is saved to `{RootDir}/trunc/{tool_call_id}`, cleared content to `{RootDir}/clear/{tool_call_id}`. |
+| `GenTruncOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom truncation file path generator. When set, RootDir does not apply to truncation. Useful for scenarios where tool_call_id is not unique. |
+| `GenClearOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom clear file path generator. When set, RootDir does not apply to clear. |
+| `MaxLengthForTrunc` | `int` | Maximum character length to trigger truncation. Default `50000`. |
+| `TruncExcludeTools` | `[]string` | List of tool names to exclude from truncation. |
+| `TokenCounter` | `func(ctx, []M, []*schema.ToolInfo) (int64, error)` | Token counting function. Defaults to character_count/4 estimation. Recommend replacing with tiktoken-go/tokenizer. |
+| `MaxTokensForClear` | `int64` | Token threshold to trigger clear. Default `160000`. |
+| `ClearRetentionSuffixLimit` | `int` | Keep the most recent N assistant message rounds without clearing. Default `1`. |
+| `ClearAtLeastTokens` | `int64` | Minimum token amount that must be released by clearing. If not met, clearing is not executed (avoids needlessly breaking prompt cache). Default `0`. |
+| `ClearExcludeTools` | `[]string` | List of tool names to exclude from clearing. |
+| `ClearMessageRewriter` | `func(ctx, M, []M) ([]M, error)` | Message rewrite callback before clearing. Parameters are toolCallMsg and the corresponding toolResponseMsgs. Can be used to rewrite write_file/edit_file calls into system-reminders. Returning nil removes that message group. |
+| `ClearPostProcess` | `func(ctx, *adk.TypedChatModelAgentState[M]) context.Context` | Callback after clearing completes, can save state or send notifications. Returns a potentially updated context. |
+| `ToolConfig` | `map[string]*ToolReductionConfig` | Per-tool configuration, takes precedence over global settings. |
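
The interaction between `TokenCounter` and `ClearAtLeastTokens` can be illustrated with a small sketch: estimate tokens as character_count/4, run the clear step on a copy, and keep the result only if enough tokens are released. Everything here is a simplified stand-in: messages are modeled as plain strings, `estimateTokens`, `clearIfWorthwhile`, `offloadFirst`, and `sampleMessages` are hypothetical names, and counting Unicode code points as "characters" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// estimateTokens approximates the token count as character_count/4.
// Assumption: a "character" is a Unicode code point.
func estimateTokens(msgs []string) int64 {
	var chars int64
	for _, m := range msgs {
		chars += int64(utf8.RuneCountInString(m))
	}
	return chars / 4
}

// clearIfWorthwhile runs clear on a copy of msgs and keeps the result only
// when at least clearAtLeast estimated tokens are released.
func clearIfWorthwhile(msgs []string, clearAtLeast int64, clear func([]string) []string) ([]string, bool) {
	candidate := clear(append([]string(nil), msgs...))
	released := estimateTokens(msgs) - estimateTokens(candidate)
	if released < clearAtLeast {
		return msgs, false // not enough gain: leave history (and prompt cache) intact
	}
	return candidate, true
}

// offloadFirst is a stand-in clear step: replace the oldest entry with a file path.
func offloadFirst(msgs []string) []string {
	if len(msgs) > 0 {
		msgs[0] = "see /tmp/clear/call_1"
	}
	return msgs
}

// sampleMessages returns one large tool output followed by a short user turn.
func sampleMessages() []string {
	return []string{strings.Repeat("x", 4000), "user question"}
}

func main() {
	msgs := sampleMessages()
	_, applied := clearIfWorthwhile(msgs, 500, offloadFirst)
	fmt.Println("threshold 500 applied:", applied)
	_, applied = clearIfWorthwhile(msgs, 2000, offloadFirst)
	fmt.Println("threshold 2000 applied:", applied)
}
```

With a threshold of 0 the check never rejects a clearing pass that frees a non-negative number of tokens, which matches the documented default behavior.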
    ### ToolReductionConfig Tool-level Configuration ```go type ToolReductionConfig struct { - // Backend storage backend for this tool - Backend Backend - - // SkipTruncation skip truncation for this tool + Backend Backend SkipTruncation bool - - // TruncHandler custom truncation handler - // Uses default handler if not set - TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) - - // SkipClear skip clearing for this tool - SkipClear bool - - // ClearHandler custom clear handler - // Uses default handler if not set - ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) + TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) + SkipClear bool + ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) } ``` +- `TruncHandler` / `ClearHandler`: when nil and not skipped, the global default handler is used. +- `Backend`: independent storage backend for this tool, overrides the global Backend. 
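
The fallback rules above can be sketched as a small resolution function. This is an illustration under assumptions rather than the middleware's code: `handler`, `toolConfig`, and `resolveTruncHandler` are hypothetical local types and names, and letting a per-tool entry win over a global skip flag is an assumed reading of "takes precedence over global settings".

```go
package main

import "fmt"

// handler is a simplified stand-in for a truncation handler.
type handler func(result string) string

// toolConfig mirrors the truncation-related fields of ToolReductionConfig.
type toolConfig struct {
	SkipTruncation bool
	TruncHandler   handler
}

// resolveTruncHandler returns the handler to run for a tool,
// or nil when truncation is skipped for it.
func resolveTruncHandler(tool string, perTool map[string]*toolConfig, globalSkip bool, def handler) handler {
	if cfg := perTool[tool]; cfg != nil {
		if cfg.SkipTruncation {
			return nil
		}
		if cfg.TruncHandler != nil {
			return cfg.TruncHandler
		}
		return def // per-tool entry without a handler falls back to the default
	}
	if globalSkip {
		return nil
	}
	return def
}

func main() {
	def := func(s string) string { return "default:" + s }
	custom := func(s string) string { return "custom:" + s }
	perTool := map[string]*toolConfig{
		"grep":      {TruncHandler: custom},
		"read_file": {SkipTruncation: true},
		"ls":        {},
	}
	for _, name := range []string{"grep", "read_file", "ls", "other"} {
		h := resolveTruncHandler(name, perTool, false, def)
		if h == nil {
			fmt.Println(name, "-> skipped")
			continue
		}
		fmt.Println(name, "->", h("out"))
	}
}
```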
+ ### ToolDetail Tool Details ```go type ToolDetail struct { - // ToolContext tool metadata (tool name, call ID) - ToolContext *adk.ToolContext - - // ToolArgument input parameters - ToolArgument *schema.ToolArgument - - // ToolResult output result - ToolResult *schema.ToolResult + ToolContext *adk.ToolContext + ToolArgument *schema.ToolArgument + ToolResult *schema.ToolResult // non-streaming + StreamToolResult *schema.StreamReader[*schema.ToolResult] // streaming } ``` @@ -154,23 +136,12 @@ type ToolDetail struct { ```go type TruncResult struct { - // NeedTrunc whether truncation is needed - NeedTrunc bool - - // ToolResult tool result after truncation - // Required when NeedTrunc is true - ToolResult *schema.ToolResult - - // NeedOffload whether offloading to storage is needed - NeedOffload bool - - // OffloadFilePath offload file path - // Required when NeedOffload is true - OffloadFilePath string - - // OffloadContent offload content - // Required when NeedOffload is true - OffloadContent string + NeedTrunc bool + ToolResult *schema.ToolResult // Required when NeedTrunc && non-streaming + StreamToolResult *schema.StreamReader[*schema.ToolResult] // Required when NeedTrunc && streaming + NeedOffload bool + OffloadFilePath string // Required when NeedOffload + OffloadContent string // Required when NeedOffload } ``` @@ -178,30 +149,26 @@ type TruncResult struct { ```go type ClearResult struct { - // NeedClear whether clearing is needed - NeedClear bool - - // ToolArgument tool argument after clearing - // Required when NeedClear is true - ToolArgument *schema.ToolArgument - - // ToolResult tool result after clearing - // Required when NeedClear is true - ToolResult *schema.ToolResult - - // NeedOffload whether offloading to storage is needed - NeedOffload bool + NeedClear bool + ToolArgument *schema.ToolArgument // Required when NeedClear + ToolResult *schema.ToolResult // Required when NeedClear + NeedOffload bool + OffloadFilePath string // Required when 
NeedOffload + OffloadContent string // Required when NeedOffload +} +``` - // OffloadFilePath offload file path - // Required when NeedOffload is true - OffloadFilePath string +### Backend Interface - // OffloadContent offload content - // Required when NeedOffload is true - OffloadContent string +```go +// Defined in reduction/internal, exported via type alias +type Backend interface { + Write(context.Context, *filesystem.WriteRequest) error } ``` +`filesystem.WriteRequest` contains two fields: `FilePath string` and `Content string`. + --- ## Creating the Middleware @@ -209,67 +176,75 @@ type ClearResult struct { ### Basic Usage ```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/reduction" -) +import "github.com/cloudwego/eino/adk/middlewares/reduction" -// Use default configuration middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, // Required: storage backend + Backend: myBackend, }) -// Use with ChatModelAgent agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, + Model: chatModel, Middlewares: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +### Generic Usage (AgenticMessage) + +```go +middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{ + Backend: myBackend, + TokenCounter: myAgenticTokenCounter, +}) + +agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: chatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware}, +}) +``` + ### Custom Configuration ```go -config := &reduction.Config{ +middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, RootDir: "/data/agent", MaxLengthForTrunc: 30000, MaxTokensForClear: 100000, ClearRetentionSuffixLimit: 2, - TokenCounter: myTokenCounter, + ClearAtLeastTokens: 10000, + TruncExcludeTools: []string{"search_tool"}, + ClearExcludeTools: 
[]string{"read_file"}, + ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) { + // Rewrite write_file calls into system-reminder + return []*schema.Message{schema.UserMessage("file written")}, nil + }, ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context { log.Printf("Clear completed, messages: %d", len(state.Messages)) return ctx }, ToolConfig: map[string]*reduction.ToolReductionConfig{ - "grep": { - Backend: grepBackend, - SkipTruncation: false, - }, - "read_file": { - Backend: readFileBackend, - SkipClear: true, // Read file tool doesn't need clearing - }, + "grep": {Backend: grepBackend}, + "read_file": {SkipClear: true}, }, -} - -middleware, err := reduction.New(ctx, config) +}) ``` -### Using Truncation Strategy Only +### Truncation Only ```go middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, - SkipClear: true, // Skip clear phase + SkipClear: true, }) ``` -### Using Clear Strategy Only +### Clear Only ```go middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, - SkipTruncation: true, // Skip truncation phase + SkipTruncation: true, + MaxTokensForClear: 100000, + // When Backend is nil, clearing still replaces content with placeholders but does not perform offload }) ``` @@ -279,29 +254,37 @@ middleware, err := reduction.New(ctx, &reduction.Config{ ### Truncation -Handled in `WrapInvokableToolCall` / `WrapStreamableToolCall`: +Handled in `WrapInvokableToolCall` / `WrapStreamableToolCall` / `WrapEnhancedInvokableToolCall` / `WrapEnhancedStreamableToolCall`: 1. Tool returns result -2. Call TruncHandler to determine if truncation is needed -3. If truncation needed, save full content to Backend -4. Return truncated content with hint text telling the agent where to find the full content +2. Check `TruncExcludeTools`; skip if matched +3. 
Look up ToolConfig → global defaultConfig to obtain TruncHandler +4. TruncHandler determines: reads the full output, checks if the total length of all text parts exceeds `MaxLengthForTrunc` +5. If exceeded: retains the first and last `MaxLengthForTrunc/(textParts*2)` characters as a preview, stores the full content in the Backend +6. Returns a truncation notice informing the agent of the file path for the full content + +> 💡 +> For streaming tools, the default TruncHandler waits for the complete stream to be read before deciding whether to truncate. If you need strict incremental streaming behavior, provide a custom TruncHandler for that tool. ### Clear Handled in `BeforeModelRewriteState`: -1. Use TokenCounter to calculate total tokens -2. Only process if exceeds MaxTokensForClear -3. Iterate from old messages, skipping already processed ones and the most recent ClearRetentionSuffixLimit rounds -4. For each tool call in range, call ClearHandler -5. If clearing needed, write to Backend and replace message result with file path -6. Call ClearPostProcess callback +1. Use `TokenCounter` to calculate total tokens +2. Skip if not exceeding `MaxTokensForClear` +3. Determine clear range: from the first unprocessed assistant message to `len(messages) - ClearRetentionSuffixLimit` rounds +4. If `ClearMessageRewriter` is configured, execute rewrite preprocessing on messages within the range first +5. Iterate through tool call messages in range, skipping `ClearExcludeTools` +6. Call ClearHandler for each tool call, replacing arguments and results +7. If `ClearAtLeastTokens` is set: operate on a copy first, compare token difference before and after clearing; abandon this clearing attempt if threshold not met +8. Once threshold is met, execute actual offload writes and update state.Messages +9. 
Call `ClearPostProcess` --- ## Multi-language Support -Truncation and clear hint text supports Chinese and English, switch via `adk.SetLanguage()`: +Truncation and clear prompt text supports automatic Chinese/English switching: ```go adk.SetLanguage(adk.LanguageChinese) // Chinese @@ -312,7 +295,11 @@ adk.SetLanguage(adk.LanguageEnglish) // English (default) ## Notes -- When `SkipTruncation` is false, `Backend` must be set -- The default TokenCounter uses `character_count / 4` estimation, which is not accurate for Chinese; consider using `github.com/tiktoken-go/tokenizer` as a replacement -- Already processed messages are marked and won't be processed again -- Configuration in `ToolConfig` takes precedence over global configuration +- When `SkipTruncation` is false, `Backend` **must** be set +- The default TokenCounter uses character_count/4 estimation; recommend replacing with `github.com/tiktoken-go/tokenizer` +- Already processed messages are marked via the Extra field `_reduction_mw_processed` and will not be processed again +- Configuration in `ToolConfig` takes precedence over global settings; if a ToolConfig only sets `SkipTruncation: false` without providing a `TruncHandler`, it falls back to the default handler +- `GenTruncOffloadFilePath` / `GenClearOffloadFilePath` are useful for scenarios where tool_call_id is not unique (e.g., retries), preventing file overwrites +- `ClearMessageRewriter` executes after the clear range is determined but before per-tool clearing, suitable for compressing write/edit-type calls into brief prompts +- `ClearAtLeastTokens` set to 0 means clearing executes whenever the threshold is exceeded; values greater than 0 can avoid minimal clearing that would break prompt cache +- Legacy API (`NewClearToolResult`, `NewToolResultMiddleware`) is deprecated; recommend migrating to `New` / `NewTyped` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md 
b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md index f3d60b57c33..d25d0132a59 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md @@ -1,27 +1,29 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: ToolSearch -weight: 5 +weight: 7 --- -# ToolSearch Middleware - -adk/middlewares/dynamictool/toolsearch - -> 💡 -> This middleware was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1). - ## Overview The `toolsearch` middleware implements dynamic tool selection. When the tool library is large, passing all tools to the model would overflow the context. This middleware's approach is: -1. Add a `tool_search` meta-tool that accepts regex patterns to search tool names +1. Add a `tool_search` meta-tool that accepts keyword queries or direct selection to search for tools 2. Initially hide all dynamic tools 3. After the model calls `tool_search`, matched tools become available in subsequent calls +It supports three operating modes (two configuration values, but `UseModelToolSearch=true` has two end-to-end behaviors): + +- **Default mode** (`UseModelToolSearch=false`): The middleware manages tool visibility itself. Before each Model call, it filters `state.ToolInfos` via `BeforeModelRewriteState` based on `tool_search` call results, progressively adding selected dynamic tools back to the model's visible list +- **Model native mode — pure server-side retrieval** (`UseModelToolSearch=true`, model retrieves DeferredTools on its own): The middleware moves dynamic tools into `state.DeferredToolInfos` and passes them to the model via `model.WithDeferredTools`. 
If the model natively supports server-side tool retrieval (e.g., Claude's tool search), the model searches and selects directly from DeferredTools **without calling the tool_search tool** +- **Model native mode — client-side proxy retrieval** (`UseModelToolSearch=true`, model discovers tools by calling `tool_search`): Same middleware configuration as above, but the model does not have autonomous DeferredTools retrieval capability. Instead, it calls the `tool_search` tool (registered via `model.WithToolSearchTool`), the client-side `modelToolSearchTool` executes the search and returns a structured `ToolSearchResult` (containing full ToolInfo of matched tools), and the model selects tools accordingly + +> 💡 +> Package path: github.com/cloudwego/eino/adk/middlewares/dynamictool/toolsearch + --- ## Architecture @@ -31,18 +33,35 @@ Agent initialization │ ▼ ┌───────────────────────────────────────────┐ -│ BeforeAgent │ -│ - Inject tool_search tool │ -│ - Add DynamicTools to Tools list │ +│ BeforeAgent │ +│ - Inject tool_search tool │ +│ - Add DynamicTools to Tools list │ +│ - In model native mode, set │ +│ runCtx.ToolSearchTool │ └───────────────────────────────────────────┘ │ ▼ ┌────────────────────────────────────────────┐ -│ WrapModel │ -│ Before each Model call: │ -│ 1. Scan message history to find all tool_search return results │ -│ 2. Full Tools minus unselected DynamicTools = tools for this Model │ -│ call │ +│ BeforeModelRewriteState │ +│ (executed before each Model call) │ +│ │ +│ 1. 
Insert │ +│ User message listing all searchable │ +│ tool names │ +│ │ +│ First call (initialization): │ +│ Default mode: │ +│ Remove DynamicTools from ToolInfos │ +│ Model native mode: │ +│ DynamicTools → DeferredToolInfos │ +│ Remove DynamicTools and tool_search │ +│ from ToolInfos │ +│ │ +│ Subsequent calls (default mode - │ +│ forward selection): │ +│ Scan message history, collect matches │ +│ from tool_search returns, add matched │ +│ DynamicTools back to ToolInfos │ └────────────────────────────────────────────┘ │ ▼ @@ -57,33 +76,80 @@ Agent initialization type Config struct { // Tools that can be dynamically searched and loaded DynamicTools []tool.BaseTool + + // Whether to use the model's native tool search capability + // + // When true, the middleware delegates tool search to the model's native capability. + // + // When false (default), the middleware manages tool visibility by + // filtering the tool list before each Model call based on tool_search results. + // Note: this approach may invalidate the model's KV-cache + // (because the tool list changes between calls). + UseModelToolSearch bool } ``` --- -## tool_search Tool +## Constructors + +```go +// Standard constructor, uses *schema.Message +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) + +// Generic constructor, supports *schema.Message and *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` -The tool injected by the middleware. +## `New` internally calls `NewTyped[*schema.Message]`. If you are using `TypedChatModelAgent` (e.g., Agentic mode), use `NewTyped` directly. -**Parameters:** +## tool_search Tool + +The meta-tool injected by the middleware. **Parameters:** - + +
 | Parameter | Type | Required | Description |
 | --- | --- | --- | --- |
-| `regex_pattern` | string | Yes | Regex pattern to match tool names |
+| `query` | string | Yes | Query string for finding tools. Supports three modes: keyword search, `select:` direct selection, `+keyword` mandatory match |
+| `max_results` | integer | No | Maximum number of results to return (default: 5). Only applies to keyword search mode; direct selection mode is not limited by it |
    -**Returns:** +**Query Modes:** + + + + + + +
+| Mode | Syntax | Description |
+| --- | --- | --- |
+| Keyword search | `"weather forecast"` | Matches keywords in tool names and descriptions, sorted by relevance score. Supports camelCase and `_` / `__` (MCP) separator splitting |
+| Direct selection | `"select:tool_a,tool_b"` | Selects one or more tools by exact name, comma-separated. Not limited by `max_results` |
+| Mandatory match | `"+slack send message"` | Keywords prefixed with `+` are mandatory match items; tools not containing that keyword are filtered out. Remaining keywords are used for sorting |
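
The three query modes can be told apart from the query string alone. The sketch below is one plausible way to classify a query, not the middleware's actual parser: `parsedQuery` and `parseQuery` are hypothetical names, and lowercasing keywords is an assumption based on matching being described as case-insensitive.

```go
package main

import (
	"fmt"
	"strings"
)

// parsedQuery is a hypothetical representation of a tool_search query.
type parsedQuery struct {
	Selected  []string // exact tool names from "select:a,b"
	Mandatory []string // keywords written as "+keyword"
	Optional  []string // remaining keywords, used only for ranking
}

// parseQuery classifies a query into direct selection or keyword search.
func parseQuery(q string) parsedQuery {
	var p parsedQuery
	q = strings.TrimSpace(q)

	// Direct selection: everything after "select:" is a comma-separated name list.
	if strings.HasPrefix(q, "select:") {
		for _, name := range strings.Split(strings.TrimPrefix(q, "select:"), ",") {
			if name = strings.TrimSpace(name); name != "" {
				p.Selected = append(p.Selected, name)
			}
		}
		return p
	}

	// Keyword search: "+word" is mandatory, every other word is optional.
	for _, word := range strings.Fields(q) {
		if strings.HasPrefix(word, "+") {
			if kw := strings.TrimPrefix(word, "+"); kw != "" {
				p.Mandatory = append(p.Mandatory, strings.ToLower(kw))
			}
			continue
		}
		p.Optional = append(p.Optional, strings.ToLower(word))
	}
	return p
}

func main() {
	fmt.Printf("%+v\n", parseQuery("select:tool_a,tool_b"))
	fmt.Printf("%+v\n", parseQuery("+slack send message"))
}
```

Keeping mandatory and optional keywords in separate slices lets a caller filter on the first and rank on the second, which mirrors the split the table describes.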
    + +**Return value (default mode):** ```json -{ - "selectedTools": ["tool_a", "tool_b"] -} +{"matches": ["tool_a", "tool_b"]} ``` ---- +**Return value (model native mode):** Returns a structured `schema.ToolResult` containing the full `ToolInfo` of matched tools for native model processing. + +## Keyword Search Scoring Mechanism -## Usage Example +Keyword search uses a multi-layer scoring system, calculating the highest score for each keyword separately then summing: + + + + + + + +
+| Match Rule | Score |
+| --- | --- |
+| Tool name split part exactly matches keyword | 10 |
+| Tool name split part contains keyword (substring) | 5 |
+| Full tool name contains keyword | 3 |
+| Tool description contains keyword | 2 |
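
The scoring rules in this table can be sketched in a few lines. This is an illustrative reimplementation under stated assumptions, not the middleware's source: `splitName`, `scoreKeyword`, and `score` are hypothetical helpers, and names are assumed to split on underscores and camelCase boundaries with case-insensitive matching, as this section describes.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitName breaks a tool name into lowercase parts on runs of "_"
// (which also covers the "__" MCP separator) and on camelCase boundaries.
func splitName(name string) []string {
	var parts []string
	isSep := func(r rune) bool { return r == '_' }
	for _, seg := range strings.FieldsFunc(name, isSep) {
		runes := []rune(seg)
		start := 0
		for i := 1; i < len(runes); i++ {
			if unicode.IsUpper(runes[i]) && unicode.IsLower(runes[i-1]) {
				parts = append(parts, strings.ToLower(string(runes[start:i])))
				start = i
			}
		}
		parts = append(parts, strings.ToLower(string(runes[start:])))
	}
	return parts
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// scoreKeyword returns the single highest rule score for one keyword.
func scoreKeyword(name, desc, kw string) int {
	kw = strings.ToLower(kw)
	best := 0
	for _, part := range splitName(name) {
		if part == kw {
			best = maxInt(best, 10) // exact match on a name part
		} else if strings.Contains(part, kw) {
			best = maxInt(best, 5) // substring of a name part
		}
	}
	if strings.Contains(strings.ToLower(name), kw) {
		best = maxInt(best, 3) // substring of the full name
	}
	if strings.Contains(strings.ToLower(desc), kw) {
		best = maxInt(best, 2) // substring of the description
	}
	return best
}

// score sums the per-keyword best scores for one tool.
func score(name, desc string, keywords []string) int {
	total := 0
	for _, kw := range keywords {
		total += scoreKeyword(name, desc, kw)
	}
	return total
}

func main() {
	fmt.Println(splitName("mcp__slack__send_message"))
	fmt.Println(score("mcp__slack__send_message", "Send a message to a Slack channel", []string{"slack", "send"}))
	fmt.Println(score("NotebookEdit", "Edit notebook cells", []string{"note"}))
}
```

Taking the maximum per keyword instead of summing every matching rule keeps a tool with many similar name parts from outranking one with a single exact match.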
    + +> 💡 +> Each keyword takes the highest score (intMax) for each rule and does not accumulate scores from multiple parts within the same tool. Scores from multiple keywords are summed for the total. Tools with equal scores are sorted lexicographically by name. + +Tool names are split into parts by `_` (underscore), `__` (MCP server-tool separator), and camelCase boundaries for matching. For example, `mcp__slack__send_message` splits into `["mcp", "slack", "send", "message"]`, and `NotebookEdit` splits into `["Notebook", "Edit"]`. Matching is case-insensitive. + +## Usage Examples + +### Default Mode (middleware manages tool visibility) ```go middleware, err := toolsearch.New(ctx, &toolsearch.Config{ @@ -104,35 +170,72 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ }) ``` +### Model Native Mode + +```go +middleware, err := toolsearch.New(ctx, &toolsearch.Config{ + DynamicTools: []tool.BaseTool{ + weatherTool, + stockTool, + currencyTool, + }, + UseModelToolSearch: true, +}) +if err != nil { + return err +} + +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: myModel, // Model must support native tool search + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +The configuration is identical, but end-to-end behavior depends on the model adapter implementation: + +- If the model natively supports server-side retrieval (e.g., Claude): the model searches and selects tools directly from `DeferredToolInfos`; the `tool_search` tool is not called +- If the model uses client-side proxy retrieval: model initiates `tool_search` call → client-side `modelToolSearchTool` executes search → returns structured `ToolSearchResult` (with full ToolInfo) → model selects tools accordingly + --- ## How It Works ### BeforeAgent -1. Get all DynamicTools -2. Create `tool_search` tool using DynamicTools -3. Add `tool_search` and all DynamicTools to `runCtx.Tools`, at this point Agent has full Tools +1. 
Get all DynamicTool ToolInfos, validate no duplicate tool names +2. Create the corresponding type of `tool_search` tool based on `UseModelToolSearch` +3. Add `tool_search` and all DynamicTools to `runCtx.Tools` (at this point the Agent has the full tool set) +4. In model native mode, set `runCtx.ToolSearchTool`; the framework passes it to the model via `model.WithToolSearchTool` + +### BeforeModelRewriteState (before each Model call) + +**Common logic:** + +- Ensure the message list contains an `` reminder (inserted as a User message, listing all searchable tool names) + +**First call — initialization (both modes):** -### WrapModel + +
+- **Default mode**: Remove all DynamicTools from `state.ToolInfos`, so the model initially sees only static tools and `tool_search`
+- **Model native mode**:
+  1. Extract DynamicTools from `state.ToolInfos` into `state.DeferredToolInfos`
+  2. Remove `tool_search` from `state.ToolInfos` (handled natively by the model)
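
The first-call initialization for both modes amounts to partitioning the tool list. The following is a minimal sketch under assumptions, not the middleware's code: `toolInfo` stands in for `*schema.ToolInfo`, and `initVisibility` is a hypothetical helper that returns what would end up in `state.ToolInfos` and `state.DeferredToolInfos`.

```go
package main

import "fmt"

// toolInfo is a stand-in for *schema.ToolInfo; only the name matters here.
type toolInfo struct{ Name string }

// initVisibility partitions the full tool list on the first model call.
// dynamic holds the names of DynamicTools; native mirrors UseModelToolSearch.
func initVisibility(all []toolInfo, dynamic map[string]bool, native bool) (visible, deferred []toolInfo) {
	for _, t := range all {
		switch {
		case dynamic[t.Name]:
			// Dynamic tools are hidden in default mode and deferred in native mode.
			if native {
				deferred = append(deferred, t)
			}
		case native && t.Name == "tool_search":
			// In native mode tool_search is dropped from the visible list.
		default:
			visible = append(visible, t)
		}
	}
	return visible, deferred
}

func main() {
	all := []toolInfo{{"calculator"}, {"tool_search"}, {"weather_forecast"}}
	dynamic := map[string]bool{"weather_forecast": true}

	v, d := initVisibility(all, dynamic, false)
	fmt.Println("default:", v, d)

	v, d = initVisibility(all, dynamic, true)
	fmt.Println("native: ", v, d)
}
```

In default mode the deferred slice stays nil, which matches the note that `DeferredToolInfos` is nil when the native path is not used.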
    -Before each Model call: +**Subsequent calls — forward selection (default mode only):** -1. Iterate through message history to find all `tool_search` return results +1. Iterate through message history, find all JSON `matches` fields from `tool_search` return results 2. Collect selected tool names -3. Filter out unselected DynamicTools from full tools -4. Call Model with filtered tool list +3. Add matched DynamicTools back to `state.ToolInfos` (cumulative; previously added tools are not removed) -### Tool Selection Flow +### Tool Selection Flow (Default Mode) ``` Round 1: - Model can only see tool_search - Model calls tool_search(regex_pattern="weather.*") - Returns {"selectedTools": ["weather_forecast", "weather_history"]} + Model can only see tool_search + static tools + Model calls tool_search(query="weather forecast") + Returns {"matches": ["weather_forecast", "weather_history"]} Round 2: - Model can see tool_search + weather_forecast + weather_history + Model can see tool_search + static tools + weather_forecast + weather_history Model calls weather_forecast(...) ``` @@ -140,7 +243,10 @@ Round 2: ## Notes -- DynamicTools cannot be empty -- Regex matches tool names, not descriptions -- Selected tools remain available unless the tool_search call result is deleted or modified -- tool_search can be called multiple times, results accumulate +- `DynamicTools` cannot be empty, and tool names must not be duplicated +- Keyword search matches tool names and descriptions, case-insensitive +- In default mode, selected tools remain available permanently (accumulated based on `tool_search` results in message history) +- `tool_search` can be called multiple times; results accumulate +- In default mode, the tool list may change before each Model call, which may invalidate the model's KV-cache +- Model native mode requires the ChatModel to support `model.WithToolSearchTool` and/or `model.WithDeferredTools` options. Which path is taken (pure server-side retrieval vs. 
client-side proxy retrieval) depends on the model adapter implementation +- The `` reminder is inserted as a **User message** (not a System message) into the message list, positioned before the first non-System message diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md index 9a86ea95204..4d516f1cf37 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md @@ -1,298 +1,259 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-21" lastmod: "" tags: [] -title: 'Eino ADK: ChatModelAgentMiddleware' +title: ChatModelAgentMiddleware weight: 8 --- -## Overview +`ChatModelAgentMiddleware` is the core interface for customizing the behavior of `ChatModelAgent` (and `DeepAgent` built on top of it). Introduced in v0.8.0, it continues to evolve in subsequent versions. -## ChatModelAgentMiddleware Interface +## Type Conventions -`ChatModelAgentMiddleware` defines the interface for customizing `ChatModelAgent` behavior. +This document uses the default `M = *schema.Message` aliases. The generic raw types are prefixed with `Typed`: -**Important:** This interface is designed specifically for `ChatModelAgent` and Agents built on top of it (such as `DeepAgent`). - -> 💡 -> The ChatModelAgentMiddleware interface was introduced in [v0.8.0.Beta](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1) +```go +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +type BaseChatModelAgentMiddleware = TypedBaseChatModelAgentMiddleware[*schema.Message] +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +type ModelContext = TypedModelContext[*schema.Message] +``` -### Why Use ChatModelAgentMiddleware Instead of AgentMiddleware? 
+When using `*schema.AgenticMessage`, use the `Typed` generic versions directly. - - - - - -
-| Feature | AgentMiddleware (struct) | ChatModelAgentMiddleware (interface) |
-| --- | --- | --- |
-| Extensibility | Closed, users cannot add new methods | Open, users can implement custom handlers |
-| Context Propagation | Callbacks only return error | All methods return `(context.Context, ..., error)` |
-| Configuration Management | Scattered in closures | Centralized in struct fields |
    +--- -### Interface Definition +## Interface Definition ```go type ChatModelAgentMiddleware interface { - // BeforeAgent is called before each agent run, allows modifying instruction and tools configuration + // ── Lifecycle Hooks ── + + // BeforeAgent: called once before the agent runs, can modify instruction and tools configuration BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // BeforeModelRewriteState is called before each model call - // The returned state will be persisted to the agent's internal state and passed to the model - // The returned context will be propagated to the model call and subsequent handlers + // AfterAgent: called after the agent terminates successfully (final answer or return-directly tool result) + // Not called on error termination (max iterations, context cancellation, model error) + AfterAgent(ctx context.Context, state *ChatModelAgentState) (context.Context, error) + + // BeforeModelRewriteState: called before each model call + // The returned state is persisted; can modify Messages, ToolInfos, DeferredToolInfos BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // AfterModelRewriteState is called after each model call + // AfterModelRewriteState: called after each model call // The input state contains the model response as the last message AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // WrapInvokableToolCall wraps the synchronous execution of a tool with custom behavior - // If no wrapping is needed, return the original endpoint and nil error - // Only called for tools that implement InvokableTool - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + // ── Wrappers ── - // WrapStreamableToolCall 
wraps the streaming execution of a tool with custom behavior - // If no wrapping is needed, return the original endpoint and nil error - // Only called for tools that implement StreamableTool + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) - - // WrapEnhancedInvokableToolCall wraps the synchronous execution of an enhanced tool with custom behavior WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) - - // WrapEnhancedStreamableToolCall wraps the streaming execution of an enhanced tool with custom behavior WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) - // WrapModel wraps the chat model with custom behavior - // If no wrapping is needed, return the original model and nil error - // Called at request time, executed before each model call - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -### Using BaseChatModelAgentMiddleware - -Embed `*BaseChatModelAgentMiddleware` to get default no-op implementations: - -```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware -} - -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - return ctx, state, nil + // WrapModel: wraps the ChatModel. 
The parameter type is model.BaseModel[M] (not ToolCallingChatModel) + // The framework handles WithTools binding separately, not going through the user wrapper + WrapModel(ctx context.Context, m model.BaseModel[M], mc *ModelContext) (model.BaseModel[M], error) } ``` ---- - -## Tool Call Endpoint Types - -Tool wrapping uses function types instead of interfaces, more clearly expressing the wrapping intent: +> 💡 +> Embed `*BaseChatModelAgentMiddleware` to get default no-op implementations for all methods — only override the methods you care about. -```go -// InvokableToolCallEndpoint is the function signature for synchronous tool calls -type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +### AgentMiddleware is Deprecated -// StreamableToolCallEndpoint is the function signature for streaming tool calls -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) +> 💡 +> The `AgentMiddleware` struct and the `ChatModelAgentConfig.Middlewares` field have been marked as Deprecated and will be removed in a future version. All new code should use `ChatModelAgentMiddleware` (interface-based Handlers). -// EnhancedInvokableToolCallEndpoint is the function signature for enhanced synchronous tool calls -type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +`AgentMiddleware` is a struct with inherent limitations — users cannot extend methods, and callbacks only return error without propagating context. 
`ChatModelAgentMiddleware` is an interface: -// EnhancedStreamableToolCallEndpoint is the function signature for enhanced streaming tool calls -type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) -``` +- Hook methods return `(context.Context, ..., error)`, supporting context propagation +- Wrapper methods propagate modified context through the endpoint chain +- Custom handlers can carry arbitrary internal state -### Why Use Separate Endpoint Types? +Migration mapping: -The previous `ToolCall` interface contained both `InvokableRun` and `StreamableRun`, but most tools only implement one of them. -Separate endpoint types enable: + + + + + + + +
+| AgentMiddleware Field | ChatModelAgentMiddleware Replacement |
+| --- | --- |
+| `AdditionalInstruction` | Modify `runCtx.Instruction` in `BeforeAgent` |
+| `AdditionalTools` | Modify `runCtx.Tools` in `BeforeAgent` |
+| `BeforeChatModel` | `BeforeModelRewriteState` |
+| `AfterChatModel` | `AfterModelRewriteState` |
+| `WrapToolCall` | `WrapInvokableToolCall` / `WrapStreamableToolCall` etc. |
    -- Corresponding wrap methods are only called when the tool implements the respective interface -- Clearer contract for wrapper authors -- No ambiguity about which method to implement +In the current version, both can coexist (Handlers execute after Middlewares), but you should migrate as soon as possible. --- -## ChatModelAgentContext +## Context Types + +### ChatModelAgentContext -`ChatModelAgentContext` contains runtime information passed to handlers before each `ChatModelAgent` run. +Input to `BeforeAgent`, called once before each Run: ```go type ChatModelAgentContext struct { - // Instruction is the instruction for the current Agent execution - // Includes agent-configured instructions, framework and AgentMiddleware appended extra instructions, - // and modifications applied by previous BeforeAgent handlers + // Current instruction (includes agent config + framework appended + preceding handler modifications) Instruction string - // Tools are the original tools (without any wrappers or tool middleware) currently configured for Agent execution - // Includes tools passed in AgentConfig, tools implicitly added by the framework (like transfer/exit tools), - // and other tools added by middleware + // Original tool list (includes framework implicit tools like transfer/exit) Tools []tool.BaseTool - // ReturnDirectly is the set of tool names currently configured to make the Agent return directly + // Set of tool names configured to "return directly" ReturnDirectly map[string]bool + + // ToolInfo for the model's native tool search capability + // After being set by a handler, the framework passes it to the model via model.WithToolSearchTool + ToolSearchTool *schema.ToolInfo } ``` ---- - -## ChatModelAgentState +### ChatModelAgentState -`ChatModelAgentState` represents the state of the chat model agent during conversation. This is the primary state type for `ChatModelAgentMiddleware` and `AgentMiddleware` callbacks. 
+**Persistent state** passed before and after each model call (persists across iterations): ```go type ChatModelAgentState struct { - // Messages contains all messages in the current conversation session - Messages []Message + // All messages in the current session + Messages []*schema.Message + + // Tool definitions passed to the model (via model.WithTools), can be modified in BeforeModelRewriteState + ToolInfos []*schema.ToolInfo + + // Deferred tool definitions (via model.WithDeferredTools), for the model's native search capability + // nil when not used + DeferredToolInfos []*schema.ToolInfo } ``` ---- +> 💡 +> The recommended place to modify `ToolInfos` / `DeferredToolInfos` is `BeforeModelRewriteState` — this is the source of truth for tool configuration. Do not modify the tool list in `WrapModel`. -## ToolContext +### ModelContext -`ToolContext` provides metadata about the tool being wrapped. Created at request time, contains information about the current tool call. +Context for `WrapModel` and `Before/AfterModelRewriteState`: ```go -type ToolContext struct { - // Name is the tool name - Name string +type ModelContext struct { + // Deprecated: use ChatModelAgentState.ToolInfos instead + Tools []*schema.ToolInfo + + // Model retry configuration + ModelRetryConfig *ModelRetryConfig - // CallID is the unique identifier for this specific tool call - CallID string + // Model failover configuration + ModelFailoverConfig *ModelFailoverConfig[*schema.Message] } ``` -### Usage Example: Tool Call Wrapping +### ToolContext + +Metadata for tool wrapping: ```go -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - log.Printf("Tool %s (call %s) starting with args: %s", tCtx.Name, tCtx.CallID, argumentsInJSON) - - result, err := endpoint(ctx, argumentsInJSON, 
opts...) - - if err != nil { - log.Printf("Tool %s failed: %v", tCtx.Name, err) - return "", err - } - - log.Printf("Tool %s completed with result: %s", tCtx.Name, result) - return result, nil - }, nil +type ToolContext struct { + Name string // Tool name + CallID string // Unique identifier for this call } ``` --- -## ModelContext +## Tool Call Endpoint Types -`ModelContext` contains context information passed to `WrapModel`. Created at request time, contains tool configuration for the current model call. +Tool wrapping uses function types instead of interfaces. The framework calls the corresponding Wrap method based on which interface the tool implements: ```go -type ModelContext struct { - // Tools is the list of tools currently configured for the agent - // Populated at request time, contains the tools that will be sent to the model - Tools []*schema.ToolInfo +// Standard tools +type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - // ModelRetryConfig contains the retry configuration for the model - // Populated at request time from the agent's ModelRetryConfig - // Used by EventSenderModelWrapper to appropriately wrap stream errors - ModelRetryConfig *ModelRetryConfig -} +// Enhanced tools (using ToolArgument/ToolResult) +type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) ``` -### Usage Example: Model Wrapping - -```go -func (h *MyHandler) WrapModel(ctx context.Context, m model.BaseChatModel, mc *adk.ModelContext) (model.BaseChatModel, error) { - return &myModelWrapper{ - inner: m, - tools: 
mc.Tools, - }, nil -} - -type myModelWrapper struct { - inner model.BaseChatModel - tools []*schema.ToolInfo -} +> 💡 +> Each Wrap method is **only called when the tool implements the corresponding interface**. For example, if a tool only implements `InvokableTool`, only `WrapInvokableToolCall` will be called, not `WrapStreamableToolCall`. -func (w *myModelWrapper) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - log.Printf("Model called with %d tools", len(w.tools)) - return w.inner.Generate(ctx, msgs, opts...) -} +--- -func (w *myModelWrapper) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return w.inner.Stream(ctx, msgs, opts...) -} -``` +## Execution Order ---- +### Model Call Lifecycle (outer to inner) + +1. ~~AgentMiddleware.BeforeChatModel~~ (**Deprecated**, will be removed) +2. **ChatModelAgentMiddleware.BeforeModelRewriteState** +3. `failoverModelWrapper` (internal — model failover, if configured) +4. `retryModelWrapper` (internal — failure retry) +5. `eventSenderModelWrapper` preprocessing (internal — prepares event sending) +6. **ChatModelAgentMiddleware.WrapModel** preprocessing (first registered → first executed) +7. `callbackInjectionModelWrapper` (internal) +8. **Model.Generate / Stream** +9. `callbackInjectionModelWrapper` postprocessing +10. **ChatModelAgentMiddleware.WrapModel** postprocessing (first registered → last executed) +11. `eventSenderModelWrapper` postprocessing +12. `retryModelWrapper` postprocessing +13. `failoverModelWrapper` postprocessing +14. **ChatModelAgentMiddleware.AfterModelRewriteState** +15. ~~AgentMiddleware.AfterChatModel~~ (**Deprecated**, will be removed) + +### Tool Call Lifecycle (outer to inner) + +1. `eventSenderToolHandler` (internal — sends tool result events) +2. `ToolsConfig.ToolCallMiddlewares` +3. ~~AgentMiddleware.WrapToolCall~~ (**Deprecated**, will be removed) +4. 
`cancelMonitoredToolHandler` (internal — cancel monitoring, only for Streamable/EnhancedStreamable tools) +5. **ChatModelAgentMiddleware.WrapXxxToolCall** (first registered → outermost) +6. `callbackInjectedToolCall` (internal — injects callback) +7. **Tool.InvokableRun / StreamableRun** ## Run-Local Storage API -`SetRunLocalValue`, `GetRunLocalValue`, and `DeleteRunLocalValue` provide the ability to store, retrieve, and delete values during the current agent Run() call. +Store and retrieve key-value pairs during the current agent `Run()`. Values are compatible with interrupt/resume — they are serialized and persisted with checkpoints. ```go -// SetRunLocalValue sets a key-value pair that persists during the current agent Run() call -// The value is scoped to this specific execution and is not shared between different Run() calls or agent instances -// -// Values stored here are compatible with interrupt/resume cycles - they are serialized and restored when the agent resumes -// For custom types, they must be registered in init() using schema.RegisterName[T]() to ensure proper serialization -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func SetRunLocalValue(ctx context.Context, key string, value any) error - -// GetRunLocalValue retrieves a value set during the current agent Run() call -// The value is scoped to this specific execution and is not shared between different Run() calls or agent instances -// -// Values stored via SetRunLocalValue are compatible with interrupt/resume cycles - they are serialized and restored when the agent resumes -// For custom types, they must be registered in init() using schema.RegisterName[T]() to ensure proper serialization -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns (value, true, nil) if found, (nil, false, nil) if not found, -// returns 
error if called outside of agent execution context func GetRunLocalValue(ctx context.Context, key string) (any, bool, error) - -// DeleteRunLocalValue deletes a value set during the current agent Run() call -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func DeleteRunLocalValue(ctx context.Context, key string) error ``` -### Usage Example: Sharing Data Across Handler Points +> 💡 +> Custom types must be registered in `init()` via `schema.RegisterName[T]()` to ensure correct gob serialization. These functions can only be called within `ChatModelAgentMiddleware` callbacks. + +### Example: Sharing State Across Callbacks ```go func init() { - schema.RegisterName[*MyCustomData]("my_package.MyCustomData") + schema.RegisterName[*ToolStats]("mypackage.ToolStats") } -type MyCustomData struct { +type ToolStats struct { Count int Name string } -type MyHandler struct { +type MyMiddleware struct { *adk.BaseChatModelAgentMiddleware } -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - result, err := endpoint(ctx, argumentsInJSON, opts...) - - data := &MyCustomData{Count: 1, Name: tCtx.Name} - if err := adk.SetRunLocalValue(ctx, "my_handler.last_tool", data); err != nil { - log.Printf("Failed to set run local value: %v", err) - } - +// Record stats after tool call +func (m *MyMiddleware) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ + _ = adk.SetRunLocalValue(ctx, "last_tool", &ToolStats{Count: 1, Name: tCtx.Name}) return result, err }, nil } -func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - if val, found, err := adk.GetRunLocalValue(ctx, "my_handler.last_tool"); err == nil && found { - if data, ok := val.(*MyCustomData); ok { - log.Printf("Last tool was: %s (count: %d)", data.Name, data.Count) +// Read stats after model call +func (m *MyMiddleware) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + if val, found, _ := adk.GetRunLocalValue(ctx, "last_tool"); found { + if stats, ok := val.(*ToolStats); ok { + log.Printf("Last tool: %s (count=%d)", stats.Name, stats.Count) } } return ctx, state, nil @@ -303,226 +264,79 @@ func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatM ## SendEvent API -`SendEvent` allows sending custom `AgentEvent` to the event stream during agent execution. +Send custom `AgentEvent` to the event stream during agent execution, which callers can receive when iterating over the event stream: ```go -// SendEvent sends a custom AgentEvent to the event stream during agent execution -// Allows ChatModelAgentMiddleware implementations to emit custom events, -// which will be received by callers iterating over the agent's event stream -// -// This function can only be called from within a ChatModelAgentMiddleware during agent execution -// Returns an error if called outside of agent execution context func SendEvent(ctx context.Context, event *AgentEvent) error ``` ---- - -## State Type (To Be Deprecated) - -`State` holds agent runtime state, including messages and user-extensible storage. - -**⚠️ Deprecation Warning:** This type will be made unexported in v1.0.0. 
Please use `ChatModelAgentState` in `ChatModelAgentMiddleware` and `AgentMiddleware` callbacks. Direct use of `compose.ProcessState[*State]` is not recommended and will stop working in v1.0.0; please use the handler API instead. - -```go -type State struct { - Messages []Message - extra map[string]any // unexported, access via SetRunLocalValue/GetRunLocalValue - - // The following are internal fields - do not access directly - // Kept exported for backward compatibility with existing checkpoints - ReturnDirectlyToolCallID string - ToolGenActions map[string]*AgentAction - AgentName string - RemainingIterations int - - internals map[string]any -} -``` - ---- - -## Architecture Diagram - -The following diagram shows how `ChatModelAgentMiddleware` works during `ChatModelAgent` execution: - -``` -Agent.Run(input) - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ BeforeAgent(ctx, *ChatModelAgentContext) │ -│ Input: Current Instruction, Tools and other Agent runtime env │ -│ Output: Modified Agent runtime env │ -│ Purpose: Called once at Run start, modifies config for entire Run │ -│ lifecycle │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ ReAct Loop │ -│ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ BeforeModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ Input: Persistent state like message history, plus Model │ │ │ -│ │ │ runtime env │ │ │ -│ │ │ Output: Modified persistent state, returns new ctx │ │ │ -│ │ │ Purpose: Modify persistent state across iterations │ │ │ -│ │ │ (mainly message list) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ WrapModel(ctx, 
BaseChatModel, *ModelContext) │ │ │ -│ │ │ Input: ChatModel being wrapped, plus Model runtime env │ │ │ -│ │ │ Output: Wrapped Model (onion model) │ │ │ -│ │ │ Purpose: Modify input, output and config for single │ │ │ -│ │ │ Model request │ │ │ -│ │ │ │ │ │ │ -│ │ │ ▼ │ │ │ -│ │ │ ┌───────────────┐ │ │ │ -│ │ │ │ Model │ │ │ │ -│ │ │ │ Generate/Stream│ │ │ │ -│ │ │ └───────────────┘ │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ AfterModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ Input: Persistent state like message history (with Model │ │ │ -│ │ │ response), plus Model runtime env │ │ │ -│ │ │ Output: Modified persistent state │ │ │ -│ │ │ Purpose: Modify persistent state across iterations │ │ │ -│ │ │ (mainly message list) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌──────────────────┐ │ │ -│ │ │ Model return? 
│ │ │ -│ │ └──────────────────┘ │ │ -│ │ │ │ │ │ -│ │ Final response│ │ ToolCalls │ │ -│ │ │ ▼ │ │ -│ │ │ ┌─────────────────────────────────────┐ │ │ -│ │ │ │ WrapInvokableToolCall / WrapStream │ │ │ -│ │ │ │ ableToolCall(ctx, endpoint, *TC) │ │ │ -│ │ │ │ Input: Tool being wrapped plus │ │ │ -│ │ │ │ Tool runtime env │ │ │ -│ │ │ │ Output: Wrapped endpoint │ │ │ -│ │ │ │ (onion model) │ │ │ -│ │ │ │ Purpose: Modify input, output │ │ │ -│ │ │ │ and config for single │ │ │ -│ │ │ │ Tool request │ │ │ -│ │ │ │ │ │ │ │ -│ │ │ │ ▼ │ │ │ -│ │ │ │ ┌─────────────┐ │ │ │ -│ │ │ │ │ Tool.Run() │ │ │ │ -│ │ │ │ └─────────────┘ │ │ │ -│ │ │ └─────────────────────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ │ │ (Result added to Messages) │ │ -│ │ │ │ │ │ -│ │ │ ┌─────────┘ │ │ -│ │ │ │ │ │ -│ │ │ └──────────► Continue loop │ │ -│ │ │ │ │ -│ └─────────────────────┼─────────────────────────────────────────────┘ │ -│ │ │ -│ ▼ │ -│ Loop until complete or maxIterations reached │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ - Agent.Run() ends -``` - -### Handler Method Description - - - - - - - - - -
| Method | Input | Output | Scope |
| --- | --- | --- | --- |
| `BeforeAgent` | Agent runtime env (`*ChatModelAgentContext`) | Modified Agent runtime env | Entire Run lifecycle, called only once |
| `BeforeModelRewriteState` | Persistent state + Model runtime env | Modified persistent state | Persistent state across iterations (message list) |
| `WrapModel` | ChatModel being wrapped + Model runtime env | Wrapped Model | Single Model request input, output and config |
| `AfterModelRewriteState` | Persistent state (with response) + Model runtime env | Modified persistent state | Persistent state across iterations (message list) |
| `WrapInvokableToolCall` | Tool being wrapped + Tool runtime env | Wrapped endpoint | Single Tool request input, output and config |
| `WrapStreamableToolCall` | Tool being wrapped + Tool runtime env | Wrapped endpoint | Single Tool request input, output and config |
    +Can only be called within `ChatModelAgentMiddleware` callbacks. --- -## Execution Order +## State Type -### Model Call Lifecycle (wrapper chain from outer to inner) - -1. `AgentMiddleware.BeforeChatModel` (hook, runs before model call) -2. `ChatModelAgentMiddleware.BeforeModelRewriteState` (hook, can modify state before model call) -3. `retryModelWrapper` (internal - retries on failure, if configured) -4. `eventSenderModelWrapper` preprocessing (internal - prepares event sending) -5. `ChatModelAgentMiddleware.WrapModel` preprocessing (wrapper, wrapped at request time, first registered runs first) -6. `callbackInjectionModelWrapper` (internal - injects callbacks if not enabled) -7. `Model.Generate/Stream` -8. `callbackInjectionModelWrapper` postprocessing -9. `ChatModelAgentMiddleware.WrapModel` postprocessing (wrapper, first registered runs last) -10. `eventSenderModelWrapper` postprocessing (internal - sends model response event) -11. `retryModelWrapper` postprocessing (internal - handles retry logic) -12. `ChatModelAgentMiddleware.AfterModelRewriteState` (hook, can modify state after model call) -13. `AgentMiddleware.AfterChatModel` (hook, runs after model call) - -### Tool Call Lifecycle (from outer to inner) - -1. `eventSenderToolHandler` (internal ToolMiddleware - sends tool result event after all processing) -2. `ToolsConfig.ToolCallMiddlewares` (ToolMiddleware) -3. `AgentMiddleware.WrapToolCall` (ToolMiddleware) -4. `ChatModelAgentMiddleware.WrapInvokableToolCall/WrapStreamableToolCall` (wrapped at request time, first registered is outermost) -5. `Tool.InvokableRun/StreamableRun` +> 💡 +> `State` is kept exported only for checkpoint backward compatibility. **Do not use it directly** — use `ChatModelAgentState` in `ChatModelAgentMiddleware` callbacks, and use `SetRunLocalValue/GetRunLocalValue` instead of the original `State.Extra`. The `compose.ProcessState[*State]` usage will stop working in v1.0.0. 
--- ## Migration Guide -### Migrating from AgentMiddleware to ChatModelAgentMiddleware +### Migrating from compose.ProcessState[*State] -**Before (AgentMiddleware):** +**Before:** ```go -middleware := adk.AgentMiddleware{ - BeforeChatModel: func(ctx context.Context, state *adk.ChatModelAgentState) error { - return nil - }, -} +compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { + st.Extra["myKey"] = myValue + return nil +}) ``` -**After (ChatModelAgentMiddleware):** +**After:** ```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware +// Write +if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { + return ctx, state, err } -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - newCtx := context.WithValue(ctx, myKey, myValue) - return newCtx, state, nil +// Read +if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { + // use val } ``` -### Migrating from compose.ProcessState[*State] +### Adapting to AfterAgent (new in v0.9) -**Before:** +`AfterAgent` is called after the agent **terminates successfully** (final answer or return-directly tool result), and can be used for post-processing: ```go -compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { - st.Extra["myKey"] = myValue - return nil -}) +func (m *MyMiddleware) AfterAgent(ctx context.Context, state *adk.ChatModelAgentState) (context.Context, error) { + log.Printf("Agent completed, %d messages total", len(state.Messages)) + // Audit, statistics, cleanup, etc. + return ctx, nil +} ``` -**After (using SetRunLocalValue/GetRunLocalValue):** +> 💡 +> `AfterAgent` is called in registration order (same as `BeforeAgent`). If any handler returns an error, subsequent handlers are not called (fail-fast), and the error is sent to the event stream. 
-```go -if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { - return ctx, state, err -} +### Adapting to ToolInfos / DeferredToolInfos (new in v0.9) -if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { +`ChatModelAgentState` now has `ToolInfos` and `DeferredToolInfos` fields, replacing `ModelContext.Tools` as the source of truth for tool configuration: + +```go +func (m *MyMiddleware) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + // Dynamically filter tools + filtered := make([]*schema.ToolInfo, 0, len(state.ToolInfos)) + for _, t := range state.ToolInfos { + if shouldInclude(t.Name) { + filtered = append(filtered, t) + } + } + state.ToolInfos = filtered + return ctx, state, nil } ``` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md index 5eb63ed655d..468bd79f470 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md @@ -1,125 +1,162 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem Backend weight: 1 --- -> 💡 -> Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) +> 💡Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) ## Background and Goals -In AI Agent scenarios, the agent often needs to interact with a filesystem: reading file content, searching code, editing configs, executing commands, and so on. 
However, different runtime environments access the filesystem very differently: +AI Agents need to interact with file systems (read, search, edit, execute commands), but different runtime environments have very different access methods: local disk, remote sandbox, in-memory simulation, object storage, etc. If each environment independently implements file operation logic, it leads to coupling between Middleware/Agent code and the underlying storage. -- **Local development**: operate on the local filesystem directly, works out of the box -- **Cloud sandbox**: operate on an isolated sandbox filesystem via remote APIs, requires authentication and networking -- **Testing**: use an in-memory simulated filesystem without real disk I/O -- **Custom storage**: integrate with OSS, databases, or other non-traditional “filesystems” +The `filesystem.Backend` interface solves this problem — as a **unified file system operation protocol**: -If each environment implements its own set of file operations, middleware and agent code become tightly coupled to the underlying storage implementation, making reuse and testing difficult. - -To address this, Eino ADK defines the `filesystem.Backend` interface as a **unified filesystem operation protocol**. Its design goals are: - -1. **Decouple storage from business logic**: middleware depends only on the Backend interface and does not care whether the underlying implementation is local disk, a remote sandbox, or an in-memory mock -2. **Pluggable replacement**: by switching Backend implementations, the same agent can run in different environments without changing any business code -3. **Testability**: a built-in `InMemoryBackend` makes it easy to simulate filesystem behavior in unit tests -4. **Extensibility**: all methods use struct parameters, so adding new fields in the future won’t break compatibility for existing implementations +1. 
**Decouple storage from business logic** — Middleware depends only on the interface, not the underlying implementation +2. **Pluggable replacement** — Switching Backends enables running in different environments without modifying business code +3. **Testability** — Built-in `InMemoryBackend` requires no real disk I/O +4. **Forward compatibility** — All methods use struct parameters; adding new fields does not break existing implementations ## Backend Interface ```go type Backend interface { - // List files and directories under the given path LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - // Read file content, supports line-based pagination (offset + limit) Read(ctx context.Context, req *ReadRequest) (*FileContent, error) - // Search for matches of pattern under the given path and return the match list GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) - // Find matching files by glob pattern and base path GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) - // Write or create a file Write(ctx context.Context, req *WriteRequest) error - // Replace string content in a file Edit(ctx context.Context, req *EditRequest) error } ``` -### Extension Interfaces + + + + + + + + +
| Method | Function | Return |
| --- | --- | --- |
| `LsInfo` | List files and directory info under the specified path | `[]FileInfo` |
| `Read` | Read file content, supports line-based pagination (offset + limit) | `*FileContent` |
| `GrepRaw` | Search for content matching a pattern in files | `[]GrepMatch` |
| `GlobInfo` | Find matching files by glob pattern | `[]FileInfo` |
| `Write` | Write or create a file | `error` |
| `Edit` | Replace string content in a file | `error` |
    -Besides the core file operations, a Backend can optionally implement shell command execution: +## Extension Interfaces + +### Shell / StreamingShell + +Backends can optionally implement command execution capabilities. When a Backend implements `Shell` or `StreamingShell`, the Filesystem Middleware additionally registers the `execute` tool. The two are **mutually exclusive** and cannot be configured simultaneously. ```go -// Shell provides synchronous command execution type Shell interface { Execute(ctx context.Context, input *ExecuteRequest) (result *ExecuteResponse, err error) } -// StreamingShell provides streaming command execution for long-running commands type StreamingShell interface { ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (result *schema.StreamReader[*ExecuteResponse], err error) } ``` -When a Backend implements `Shell` or `StreamingShell`, the FileSystem middleware additionally registers the `execute` tool so the agent can run shell commands. +### MultiModalReader + +An optional extension interface supporting multi-modal file reading (images, PDFs, etc.), returning structured `MultiFileContent`. + +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` + +When a Backend implements this interface and the Middleware is configured with `UseMultiModalRead = true`, the `read_file` tool will use multi-modal reading. + +## Core Data Types -### Core Data Types +### Request Types - - - - - - - - - - + + + + + + + + +
| Type | Description |
| --- | --- |
| `FileInfo` | File/directory info: path, isDir, size, modified time |
| `FileContent` | File content with line number information |
| `GrepMatch` | Search match: content, path, line number |
| `ReadRequest` | Read request: path, offset (1-based line), limit (line count) |
| `GrepRequest` | Search request: pattern (regex), path, glob filter, file type filters, etc. |
| `WriteRequest` | Write request: path, content |
| `EditRequest` | Edit request: path, old string, new string, replace all |
| `ExecuteRequest` | Command request: command string, background flag |
| `ExecuteResponse` | Command result: stdout/stderr, exit code, truncated flag |

| Type | Fields | Description |
| --- | --- | --- |
| `LsInfoRequest` | `Path string` | Directory path to list |
| `ReadRequest` | `FilePath string`; `Offset int`; `Limit int` | File path; starting line number (1-based, <1 treated as 1); maximum lines to read (0=all) |
| `MultiModalReadRequest` | Embeds `ReadRequest`; `Pages string` | Inherits all ReadRequest fields; Pages specifies PDF page range (e.g. "1-5", "3") |
| `GrepRequest` | `Pattern string`; `Path string`; `Glob string`; `FileType string`; `CaseInsensitive bool`; `EnableMultiline bool`; `AfterLines int`; `BeforeLines int` | Regex search pattern (ripgrep syntax); search directory; glob file filter; file type filter (e.g. "go", "py"); case-insensitive; enable multiline matching; show N lines after match; show N lines before match |
| `GlobInfoRequest` | `Pattern string`; `Path string` | Glob expression (supports `*`, `**`, `?`, `[abc]`); starting search directory |
| `WriteRequest` | `FilePath string`; `Content string` | Target file path; content to write |
| `EditRequest` | `FilePath string`; `OldString string`; `NewString string`; `ReplaceAll bool` | File path; exact string to replace (non-empty); replacement string; when false, OldString must appear exactly once in the file |
| `ExecuteRequest` | `Command string`; `RunInBackendGround bool` | Command string to execute; whether to run in background |
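
The `ReadRequest` pagination rules in the table above (1-based `Offset`, values below 1 treated as 1, `Limit` of 0 meaning all lines) can be sketched with the standard library alone. `paginateLines` is a hypothetical helper written for illustration; it is not part of eino, and returning an empty string for an out-of-range offset is an assumption, not documented Backend behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// paginateLines is a hypothetical helper (not an eino API) that mimics the
// ReadRequest rules: Offset is 1-based, Offset < 1 is treated as 1, and
// Limit == 0 means "read every remaining line".
func paginateLines(content string, offset, limit int) string {
	lines := strings.Split(content, "\n")
	if offset < 1 {
		offset = 1
	}
	start := offset - 1
	if start >= len(lines) {
		// Assumption: an offset past the end yields no content.
		return ""
	}
	end := len(lines)
	if limit > 0 && start+limit < end {
		end = start + limit
	}
	return strings.Join(lines[start:end], "\n")
}

func main() {
	content := "Hello, World!\nLine 2\nLine 3"
	fmt.Println(paginateLines(content, 2, 1))            // prints "Line 2"
	fmt.Println(paginateLines(content, 0, 0) == content) // prints "true"
}
```

A real Backend may differ in details such as trailing newlines or error reporting, so treat this only as a reading aid for the field descriptions.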
    +### Response Types + + + + + + + + + +
| Type | Fields | Description |
| --- | --- | --- |
| `FileInfo` | `Path string`; `IsDir bool`; `Size int64`; `ModifiedAt string` | File/directory path; whether it is a directory; file size (bytes); last modified time (ISO 8601 format) |
| `FileContent` | `Content string` | Plain text content of the file |
| `MultiFileContent` | `*FileContent`; `Parts []FileContentPart` | Embeds FileContent; multi-modal output parts. Parts and FileContent are mutually exclusive: FileContent is ignored when Parts is non-empty |
| `FileContentPart` | `Type FileContentPartType`; `MIMEType string`; `Data []byte` | Content type (`"image"` or `"pdf"`); MIME type (e.g. "image/png"); raw binary data |
| `GrepMatch` | `Content string`; `Path string`; `Line int` | Matched line content; file path; 1-based line number |
| `ExecuteResponse` | `Output string`; `ExitCode *int`; `Truncated bool` | Command output content; exit code (pointer, may be nil); whether output was truncated |
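
Because `ExitCode` is a pointer that may be nil, callers need a nil check before dereferencing it. The sketch below uses a local stand-in struct, `execResult`, that copies only the three fields listed for `ExecuteResponse`; the struct, the `summarize` helper, and the wording of the status strings are assumptions for illustration. The table does not say when the exit code is nil, so the sketch only reports it as unknown.

```go
package main

import "fmt"

// execResult is a local stand-in mirroring the ExecuteResponse fields from
// the table above; it is not the eino type.
type execResult struct {
	Output   string
	ExitCode *int
	Truncated bool
}

// summarize is a hypothetical helper that builds a one-line status without
// ever dereferencing a nil ExitCode.
func summarize(r execResult) string {
	status := "exit code unknown"
	if r.ExitCode != nil {
		status = fmt.Sprintf("exit code %d", *r.ExitCode)
	}
	if r.Truncated {
		status += " (output truncated)"
	}
	return status
}

func main() {
	code := 1
	fmt.Println(summarize(execResult{Output: "boom", ExitCode: &code}))
	fmt.Println(summarize(execResult{Output: "partial", Truncated: true}))
}
```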
    + +### Constants + +```go +type FileContentPartType string + +const ( + FileContentPartTypeImage FileContentPartType = "image" + FileContentPartTypePDF FileContentPartType = "pdf" +) +``` + ## Built-in Implementation: InMemoryBackend -`InMemoryBackend` is a built-in Backend implementation that stores files in an in-memory map, mainly used for: +`InMemoryBackend` stores files in an in-memory map, primarily used for: -- **Unit tests**: test agent and middleware file operations without a real filesystem -- **Lightweight scenarios**: temporary file operations without persistence -- **Large tool result offloading**: the FileSystem middleware’s large tool result offloading feature uses InMemoryBackend by default +- **Unit testing** — Test Agent/Middleware file operation logic without a real file system +- **Lightweight scenarios** — Temporary file operations that don't require persistence +- **Tool result offloading** — The Filesystem Middleware's large tool result offloading feature uses InMemoryBackend by default + +### Constructor ```go -import "github.com/cloudwego/eino/adk/filesystem" +func NewInMemoryBackend() *InMemoryBackend +``` -ctx := context.Background() +Zero-parameter constructor, returns an empty in-memory file system. 
+ +### Usage Example + +```go backend := filesystem.NewInMemoryBackend() +ctx := context.Background() -// Write file -err := backend.Write(ctx, &filesystem.WriteRequest{ +// Write +_ = backend.Write(ctx, &filesystem.WriteRequest{ FilePath: "/example/test.txt", Content: "Hello, World!\nLine 2\nLine 3", }) -// Read file (paginated) -content, err := backend.Read(ctx, &filesystem.ReadRequest{ +// Read (paginated) +content, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/example/test.txt", Offset: 1, Limit: 10, }) // List directory -files, err := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/example", -}) +files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{Path: "/example"}) -// Search content (regex supported) -matches, err := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Pattern: "Hello", - Path: "/example", +// Search (regex) +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Pattern: "Hello", + Path: "/example", + CaseInsensitive: true, }) -// Edit file -err = backend.Edit(ctx, &filesystem.EditRequest{ +// Edit +_ = backend.Edit(ctx, &filesystem.EditRequest{ FilePath: "/example/test.txt", OldString: "Hello", NewString: "Hi", @@ -127,39 +164,42 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ }) ``` -Features: +### Implementation Features -- Thread-safe (based on `sync.RWMutex`) -- GrepRaw supports regex, case-insensitive, context lines, and other advanced options -- GrepRaw uses parallel processing internally (up to 10 workers) +- **Thread-safe** — Based on `sync.RWMutex`; read operations use read locks, write operations use write locks +- **GrepRaw parallel processing** — Launches up to 10 workers for parallel matching during multi-file searches +- **Regex support** — Supports full regex, case-insensitive (`(?i)` prefix), multiline mode +- **Context lines** — GrepRaw supports BeforeLines/AfterLines to show context around matches +- **Glob matching** — Uses the `doublestar` library to support `**` recursive matching +- 
**FileType mapping** — Built-in mapping table of 70+ file types to extensions (go, py, ts, rust, etc.) +- **Does not implement Shell** — InMemoryBackend does not implement the Shell/StreamingShell interfaces ## External Implementations -The following Backend implementations live in the [eino-ext](https://github.com/cloudwego/eino-ext) repository: +The following Backend implementations reside in the [eino-ext](https://github.com/cloudwego/eino-ext) repository: -- **Local Backend** — a local filesystem implementation that operates on the host disk with zero configuration -- **Ark Agentkit Sandbox Backend** — a Volcengine Agentkit remote sandbox implementation that executes file operations in an isolated cloud environment +- **Local Backend** (`github.com/cloudwego/eino-ext/adk/backend/local`) — Local file system implementation, directly operates on the host disk +- **Ark Agentkit Sandbox** (`github.com/cloudwego/eino-ext/adk/backend/agentkit`) — Volcengine Agentkit remote sandbox implementation ### Implementation Comparison - + - - + + +
| Feature | InMemory | Local | Agentkit Sandbox |
| --- | --- | --- | --- |
| Execution model | In-memory | Local direct | Remote sandbox |
| Network dependency | No | No | Yes |
| Network dependency | None | None | Required |
| Configuration complexity | Zero config | Zero config | Credentials required |
| Persistence | No | Yes | Yes |
| Shell support | No | Yes (including streaming) | Yes |
| Use cases | Tests/temporary | Development/local | Multi-tenant/production |
| Shell support | No | Shell + StreamingShell | Shell |
| MultiModalReader | No | Implementation-dependent | Implementation-dependent |
| Use cases | Tests / temporary storage | Development / local environment | Multi-tenant / production |
    -## Custom Implementations +## Custom Implementation -To integrate custom storage (e.g. OSS, databases), you only need to implement the `Backend` interface: +Implement the `Backend` interface to integrate custom storage. For command execution, additionally implement `Shell` or `StreamingShell`; for multi-modal reading, implement `MultiModalReader`. ```go -type MyBackend struct { - // ... -} +type MyBackend struct { /* ... */ } func (b *MyBackend) LsInfo(ctx context.Context, req *filesystem.LsInfoRequest) ([]filesystem.FileInfo, error) { // Custom implementation @@ -169,7 +209,29 @@ func (b *MyBackend) Read(ctx context.Context, req *filesystem.ReadRequest) (*fil // Custom implementation } -// ... implement the remaining methods -``` +func (b *MyBackend) GrepRaw(ctx context.Context, req *filesystem.GrepRequest) ([]filesystem.GrepMatch, error) { + // Custom implementation +} + +func (b *MyBackend) GlobInfo(ctx context.Context, req *filesystem.GlobInfoRequest) ([]filesystem.FileInfo, error) { + // Custom implementation +} -If you also need command execution, implement `Shell` or `StreamingShell` as well. 
+func (b *MyBackend) Write(ctx context.Context, req *filesystem.WriteRequest) error { + // Custom implementation +} + +func (b *MyBackend) Edit(ctx context.Context, req *filesystem.EditRequest) error { + // Custom implementation +} + +// Optional: implement Shell +func (b *MyBackend) Execute(ctx context.Context, input *filesystem.ExecuteRequest) (*filesystem.ExecuteResponse, error) { + // Custom implementation +} + +// Optional: implement MultiModalReader +func (b *MyBackend) MultiModalRead(ctx context.Context, req *filesystem.MultiModalReadRequest) (*filesystem.MultiFileContent, error) { + // Custom implementation +} +``` diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md index 7f10f023cdd..38e052b733f 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Ark Agentkit Sandbox @@ -147,7 +147,7 @@ files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ }) // Read file (paginated) -content, _ := backend.Read(ctx, &filesystem.ReadRequest{ +fcontent, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/home/gem/file.txt", Offset: 0, Limit: 100, diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md new file mode 100644 index 00000000000..4a5296da0db --- /dev/null +++ 
b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md @@ -0,0 +1,201 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Local Filesystem +weight: 2 +--- + +## Local Backend + +**Package**: `github.com/cloudwego/eino-ext/adk/backend/local` + +> 💡 +> eino v0.8.0+ requires local backend v0.2.1 or above. + +Local Backend is the local implementation of Eino ADK FileSystem, directly operating the local file system. It implements both the `filesystem.Backend` (file operations) and `filesystem.StreamingShell` (streaming command execution) interfaces. + +**Core features**: Zero configuration, native performance, enforced absolute paths, streaming command execution, optional command validation. + +--- + +## Installation + +```bash +go get github.com/cloudwego/eino-ext/adk/backend/local +``` + +## Configuration + +```go +type Config struct { + // Optional: Command validation function for security control of ExecuteStreaming. + // Rejects execution when returning non-nil error. 
+ ValidateCommand func(string) error +} +``` + +## Quick Start + +```go +backend, err := local.NewBackend(ctx, &local.Config{}) + +// Write file (must be absolute path; overwrites if file already exists) +err = backend.Write(ctx, &filesystem.WriteRequest{ + FilePath: "/tmp/hello.txt", + Content: "Hello, Local Backend!", +}) + +// Read file (supports line-level pagination) +fc, err := backend.Read(ctx, &filesystem.ReadRequest{ + FilePath: "/tmp/hello.txt", + Offset: 1, // Starting line number (1-based) + Limit: 50, // Maximum number of lines, 0 means all +}) +``` + +### Integration with Agent + +```go +import ( + "github.com/cloudwego/eino/adk" + fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" + "github.com/cloudwego/eino-ext/adk/backend/local" +) + +backend, _ := local.NewBackend(ctx, &local.Config{}) + +middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ + Backend: backend, // Required: registers ls/read/write/edit/glob/grep tools + StreamingShell: backend, // Optional: registers streaming execute tool +}) + +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +> 💡 +> In the middleware Config, `Shell` and `StreamingShell` are mutually exclusive. Local Backend only implements `StreamingShell` (streaming command execution), not the non-streaming `Shell`. + +--- + +## Implemented Interfaces and Methods + +### filesystem.Backend + + + + + + + + + +
| Method | Signature | Description |
| --- | --- | --- |
| `LsInfo` | `(ctx, *LsInfoRequest) ([]FileInfo, error)` | List directory contents |
| `Read` | `(ctx, *ReadRequest) (*FileContent, error)` | Read file, supports line-level pagination (Offset 1-based, Limit 0=all) |
| `Write` | `(ctx, *WriteRequest) error` | Write file; auto-creates parent directories; overwrites if file already exists |
| `Edit` | `(ctx, *EditRequest) error` | String replacement; supports `ReplaceAll`; errors when `OldString` is not unique (in non-ReplaceAll mode) |
| `GrepRaw` | `(ctx, *GrepRequest) ([]GrepMatch, error)` | Search based on ripgrep, supports full regex syntax; supports case-insensitive, multiline matching, context lines |
| `GlobInfo` | `(ctx, *GlobInfoRequest) ([]FileInfo, error)` | Glob pattern file matching, supports `*` / `**` / `?` / `[abc]` |
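
The `Edit` rule in the table above (a non-empty `OldString` that must be unique unless `ReplaceAll` is set) can be illustrated on an in-memory string. `applyEdit` is a hypothetical function, not the Local Backend implementation; in particular, treating a missing `OldString` as an error is an assumption that the table does not confirm.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// applyEdit is a hypothetical sketch of the Edit semantics, operating on a
// string instead of a file.
func applyEdit(content, oldStr, newStr string, replaceAll bool) (string, error) {
	if oldStr == "" {
		return "", errors.New("old string must be non-empty")
	}
	n := strings.Count(content, oldStr)
	if n == 0 {
		// Assumption: a missing match is reported as an error.
		return "", errors.New("old string not found")
	}
	if !replaceAll && n > 1 {
		return "", fmt.Errorf("old string appears %d times; it must be unique unless ReplaceAll is set", n)
	}
	// At this point either replaceAll is true or there is exactly one match.
	return strings.ReplaceAll(content, oldStr, newStr), nil
}

func main() {
	out, err := applyEdit("old text, old text", "old text", "new text", true)
	fmt.Println(out, err)

	_, err = applyEdit("old text, old text", "old text", "new text", false)
	fmt.Println(err != nil) // prints "true": the match is ambiguous
}
```

Rejecting ambiguous matches keeps a single-replacement edit from silently changing the wrong occurrence; callers can add surrounding context to `OldString` until it is unique.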
    + +### filesystem.StreamingShell + + + + +
| Method | Signature | Description |
| --- | --- | --- |
| `ExecuteStreaming` | `(ctx, *ExecuteRequest) (*StreamReader[*ExecuteResponse], error)` | Stream shell command execution with real-time output; supports background execution (`RunInBackendGround`) |
    + +--- + +## Usage Examples + +### Content Search (Regex) + +```go +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Path: "/home/user/project", + Pattern: "TODO|FIXME", // ripgrep regex syntax + Glob: "*.go", + CaseInsensitive: true, +}) +``` + +### Edit File + +```go +backend.Edit(ctx, &filesystem.EditRequest{ + FilePath: "/tmp/file.txt", + OldString: "old text", + NewString: "new text", + ReplaceAll: true, +}) +``` + +### Streaming Command Execution + +```go +reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ + Command: "tail -f /var/log/app.log", +}) +for { + resp, err := reader.Recv() + if err != nil { // io.EOF signals normal completion; stop on any other error too + break + } + fmt.Print(resp.Output) +} +``` + +### With Command Validation + +```go +backend, _ := local.NewBackend(ctx, &local.Config{ + ValidateCommand: func(cmd string) error { + allowed := map[string]bool{"ls": true, "cat": true, "grep": true} + parts := strings.Fields(cmd) + if len(parts) == 0 || !allowed[parts[0]] { + return fmt.Errorf("command not allowed: %q", cmd) + } + return nil + }, +}) +``` + +--- + +## Path Requirements + +All file paths must be absolute paths (starting with `/`). Relative paths can be converted using `filepath.Abs()`. + +--- + +## Comparison with Agentkit Backend + + + + + + + + + + +
| Feature | Local | Agentkit |
| --- | --- | --- |
| Execution model | Local direct | Remote sandbox |
| Network dependency | None | Required |
| Configuration complexity | Zero configuration | Requires credentials |
| Security model | OS permissions + ValidateCommand | Isolated sandbox |
| Streaming output | Supported (StreamingShell) | Not supported |
| Platform support | Unix/Linux/macOS | Any |
| Use case | Development/local environments | Multi-tenant/production environments |
    + +--- + +## FAQ + +**Q: Does GrepRaw support regex?** + +A: Yes. It uses ripgrep (`rg`) under the hood, supporting full regex syntax. The system must have ripgrep installed; otherwise it reports `ripgrep (rg) is not installed or not in PATH`. See [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation) for installation instructions. + +**Q: Does Write create or overwrite?** + +A: Overwrite. `Write` uses `O_CREATE|O_TRUNC` flags — if the file exists, it overwrites the content; if not, it creates the file (including auto-creating parent directories). + +**Q: Is Windows supported?** + +A: No. `ExecuteStreaming` depends on `/bin/sh`. File operations themselves can run on any platform, but command execution is limited to Unix-like systems. + +**Q: Does Local Backend support non-streaming Execute?** + +A: No. Local only implements `StreamingShell` (`ExecuteStreaming`), not `Shell` (`Execute`). In the middleware Config, `Shell` and `StreamingShell` are mutually exclusive — choose one. 
diff --git "a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" "b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" deleted file mode 100644 index 9334bf19a63..00000000000 --- "a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" +++ /dev/null @@ -1,231 +0,0 @@ ---- -Description: "" -date: "2026-03-24" -lastmod: "" -tags: [] -title: Local File System -weight: 2 ---- - -## Local Backend - -Package: `github.com/cloudwego/eino-ext/adk/backend/local` - -Note: If your eino version is v0.8.0 or above, you need to use local backend [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1). - -### Overview - -Local Backend is the local file system implementation of EINO ADK FileSystem, directly operating on the local file system, providing native performance and zero-configuration experience. 
- -#### Core Features - -- Zero Configuration - Works out of the box -- Native Performance - Direct file system access, no network overhead -- Path Safety - Enforces absolute paths -- Streaming Execution - Supports real-time command output streaming -- Command Validation - Optional security validation hooks - -### Installation - -```bash -go get github.com/cloudwego/eino-ext/adk/backend/local -``` - -### Configuration - -```go -type Config struct { - // Optional: Command validation function for Execute() security control - ValidateCommand func(string) error -} -``` - -### Quick Start - -#### Basic Usage - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/adk/backend/local" - "github.com/cloudwego/eino/adk/filesystem" -) - -func main() { - ctx := context.Background() - - backend, err := local.NewBackend(ctx, &local.Config{}) - if err != nil { - panic(err) - } - - // Write file (must be absolute path) - err = backend.Write(ctx, &filesystem.WriteRequest{ - FilePath: "/tmp/hello.txt", - Content: "Hello, Local Backend!", - }) - - // Read file - fcontent, err := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/tmp/hello.txt", - }) - fmt.Println(fcontent.Content) -} -``` - -#### With Command Validation - -```go -func validateCommand(cmd string) error { - allowed := map[string]bool{"ls": true, "cat": true, "grep": true} - parts := strings.Fields(cmd) - if len(parts) == 0 || !allowed[parts[0]] { - return fmt.Errorf("command not allowed: %s", parts[0]) - } - return nil -} - -backend, _ := local.NewBackend(ctx, &local.Config{ - ValidateCommand: validateCommand, -}) -``` - -#### Integration with Agent - -```go -import ( - "github.com/cloudwego/eino/adk" - fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -// Create Backend -backend, _ := local.NewBackend(ctx, &local.Config{}) - -// Create Middleware -middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ - Backend: backend, - StreamingShell: backend, -}) - -// Create Agent 
-agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LocalFileAgent", - Description: "AI Agent with local file system access capabilities", - Model: chatModel, - Handlers: []adk.ChatModelAgentMiddleware{middleware}, -}) -``` - -### API Reference - - - - - - - - - - - -
    MethodDescription
    LsInfoList directory contents
    ReadRead file content (supports pagination, default 200 lines)
    WriteCreate new file (error if exists)
    EditReplace file content
    GrepRawSearch file content (literal match)
    GlobInfoFind files by pattern
    ExecuteExecute shell commands
    ExecuteStreamingExecute commands with streaming output
    - -#### Examples - -```go -// List directory -files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/home/user", -}) - -// Read file (paginated) -content, _ := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/path/to/file.txt", - Offset: 0, - Limit: 50, -}) - -// Search content (literal match, not regex) -matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Path: "/home/user/project", - Pattern: "TODO", - Glob: "*.go", -}) - -// Find files -files, _ := backend.GlobInfo(ctx, &filesystem.GlobInfoRequest{ - Path: "/home/user", - Pattern: "**/*.go", -}) - -// Edit file -backend.Edit(ctx, &filesystem.EditRequest{ - FilePath: "/tmp/file.txt", - OldString: "old", - NewString: "new", - ReplaceAll: true, -}) - -// Execute command -result, _ := backend.Execute(ctx, &filesystem.ExecuteRequest{ - Command: "ls -la /tmp", -}) - -// Streaming execution -reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ - Command: "tail -f /var/log/app.log", -}) -for { - resp, err := reader.Recv() - if err == io.EOF { - break - } - fmt.Print(resp.Stdout) -} -``` - -### Path Requirements - -All paths must be absolute paths (starting with `/`): - -```go -// Correct -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "/home/user/file.txt"}) - -// Incorrect -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "./file.txt"}) -``` - -Convert relative paths: - -```go -absPath, _ := filepath.Abs("./relative/path") -``` - -### Comparison with Agentkit Backend - - - - - - - - - - -
    FeatureLocalAgentkit
    Execution ModelLocal DirectRemote Sandbox
    Network DependencyNoneRequired
    Configuration ComplexityZero ConfigRequires Credentials
    Security ModelOS PermissionsIsolated Sandbox
    Streaming OutputSupportedNot Supported
    Platform SupportUnix/Linux/macOSAny
    Use CasesDevelopment/LocalMulti-tenant/Production
    - -### FAQ - -**Q: Why does running grep fail with `ripgrep (rg) is not installed or not in PATH. Please install it:` [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation)?** - -The local Grep command relies on `ripgrep` by default. If your system does not have `ripgrep` installed, install it following the official guide. - -**Q: Does GrepRaw support regex?** - -Yes. GrepRaw uses `ripgrep` under the hood for grep operations, so regex patterns are supported. - -**Q: Windows support?** - -Not supported, depends on `/bin/sh`. diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md index 93301c65f2f..003d0aed3e6 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: AgentsMD @@ -9,85 +9,45 @@ weight: 9 ## Overview -`agentsmd` is an Eino ADK middleware that **automatically injects the content of Agents.md into the model input messages on every model call**. The injection is ephemeral: it is added dynamically for each model call and is not persisted into the session state, so it **won’t be processed by summarization/compression middlewares**. +`agentsmd` is an Eino ADK middleware that **automatically injects the content of Agents.md files into the message sequence on every model call**. The injected message is persisted by the framework into the agent's internal state, but **idempotency checks** (`Extra["__agentsmd_content__"]` marker) ensure it is never injected more than once. 
Since the injected content is fixed at its first appearance, **it will not change with subsequent summarization/compression**. -**Core value**: define system-level behavior instructions and context for an agent via an Agents.md file (similar to Claude Code’s CLAUDE.md), without manually composing system prompts. +**Core value**: Define system-level behavior instructions and context for an Agent via Agents.md files (similar to Claude Code's CLAUDE.md), without manually managing system prompt composition. -**Package**: `github.com/cloudwego/eino/adk/middlewares/agentsmd` - ---- +**Package path**: `github.com/cloudwego/eino/adk/middlewares/agentsmd` ## Quick Start -### Minimal Example - ```go -package main - -import ( - "context" - "fmt" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/middlewares/agentsmd" -) - -func main() { - ctx := context.Background() - - // 1. Prepare Backend (file reading backend) - backend := NewLocalFileBackend("/path/to/project") - - // 2. Create agentsmd middleware - mw, err := agentsmd.New(ctx, &agentsmd.Config{ - Backend: backend, - AgentsMDFiles: []string{"/home/user/project/agents.md"}, - }) - if err != nil { - panic(err) - } - - // 3. Attach the middleware to the agent - // agent := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // Middlewares: []adk.ChatModelAgentMiddleware{mw}, - // }) - _ = mw - fmt.Println("agentsmd middleware created successfully") +ctx := context.Background() + +// 1. Create agentsmd middleware +mw, err := agentsmd.New(ctx, &agentsmd.Config{ + Backend: myBackend, // Implements agentsmd.Backend interface + AgentsMDFiles: []string{"/project/agents.md"}, +}) +if err != nil { + panic(err) } + +// 2. 
Configure with Agent +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) ``` --- -## Configuration +## Configuration Details -### Config +### Config Struct ```go type Config struct { - // Backend provides file access to load Agents.md files. - // It can be a local filesystem, remote storage, or any other backend. - // Required. - Backend Backend - - // AgentsMDFiles is an ordered list of Agents.md file paths to load. - // Files are loaded and injected in the given order. - // Files support recursive @import (max depth 5). - AgentsMDFiles []string - - // AllAgentsMDMaxBytes limits the total bytes of all loaded Agents.md content. - // Files are loaded in order; once the cumulative size exceeds this limit, - // the remaining files will be skipped. - // Each individual file is always loaded in full. - // 0 means unlimited. + Backend Backend + AgentsMDFiles []string AllAgentsMDMaxBytes int - - // OnLoadWarning is an optional callback invoked on non-fatal errors during loading - // (e.g. file not found, cyclic @import, depth limit exceeded). - // If nil, warnings are printed via log.Printf. - // - // Note: Backend.Read errors other than os.ErrNotExist (e.g. permission denied, I/O errors) - // are not treated as warnings and will abort the loading process. - OnLoadWarning func(filePath string, err error) + OnLoadWarning func(filePath string, err error) } ``` @@ -95,56 +55,77 @@ type Config struct { - - - - + + + +
    ParameterTypeRequiredDefaultDescription
    Backend
    Backend
    Yes-File reading backend that performs the actual I/O
    AgentsMDFiles
    []string
    Yes-List of Agents.md file paths to load (at least one)
    AllAgentsMDMaxBytes
    int
    No
    0
    (unlimited)
    Total byte limit for all files
    OnLoadWarning
    func(string, error)
    No
    log.Printf
    Callback for non-fatal errors
    Backend
    Backend
    YesFile reading backend, responsible for actual file I/O
    AgentsMDFiles
    []string
    YesList of Agents.md file paths to load (at least one), loaded and injected in order
    AllAgentsMDMaxBytes
    int
    No
    0
    (unlimited)
    Total byte limit for all files; subsequent files are skipped once exceeded, but each file is always loaded in full
    OnLoadWarning
    func(string, error)
    No
    log.Printf
    Callback for non-fatal errors (file missing, cyclic @import, depth limit exceeded, etc.)
    +### Validation Rules + +`New` / `NewTyped` validates Config on creation: + +- `Config` must not be nil +- `Backend` must not be nil +- `AgentsMDFiles` must contain at least one path +- `AllAgentsMDMaxBytes` must not be negative + --- +## Constructors + +### New — Standard Constructor + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +Returns `ChatModelAgentMiddleware` (i.e., `TypedChatModelAgentMiddleware[*schema.Message]`), suitable for standard `ChatModelAgent`. + +### NewTyped — Generic Constructor + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +Generic version, supporting both `*schema.Message` and `*schema.AgenticMessage` message types. `New` internally calls `NewTyped[*schema.Message]`. + ## Backend Interface -### Definition +### Interface Definition ```go type Backend interface { - // Read reads file content. - // If the file does not exist, implementations should return an error that wraps os.ErrNotExist - // (so errors.Is(err, os.ErrNotExist) returns true). - // This lets the loader skip missing files silently and notify via OnLoadWarning. - // Other errors (permission denied, I/O errors) abort the loading process. 
Read(ctx context.Context, req *ReadRequest) (*FileContent, error) } ``` -### Types +### Type Definitions -```go -// ReadRequest defines request parameters for reading a file -type ReadRequest struct { - FilePath string // file path - Offset int // starting line number (1-based) -} +`ReadRequest` and `FileContent` are aliases for the same-named types in the `github.com/cloudwego/eino/adk/filesystem` package: -// FileContent defines the return structure of file content -type FileContent struct { - Content string // file text content -} +```go +type ReadRequest = filesystem.ReadRequest +type FileContent = filesystem.FileContent ``` +> 💡 +> **Backend Implementation Requirements** +> +> - When a file does not exist, implementations **must** return an error wrapping `os.ErrNotExist` (so that `errors.Is(err, os.ErrNotExist)` returns `true`); the loader uses this to distinguish "file missing" from "real I/O error" +> - Other errors (permission denied, I/O errors) will **abort the entire loading process** and are not treated as warnings +> - The `Read` method should be concurrency-safe + --- ## @import Syntax -Agents.md supports `@import` to recursively include other files. +Agents.md files support the `@path` syntax for recursive inclusion of other files. -### Syntax - -In Agents.md, use `@path/to/file` to reference another file: +### Syntax Format ```markdown -# Project instructions +# Project Instructions You are a coding assistant. @@ -153,68 +134,84 @@ Please follow these rules: @rules/api-conventions.md ``` -### Rules +### Matching Rules + +The loader uses the regex `@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)` to scan file content, with the following filtering logic: + +- **Paths containing /**: directly treated as @import (e.g., `@rules/style.md`) +- **Paths without /**: treated as @import only when the extension is in the allow list; otherwise ignored -1. **Path resolution**: relative paths are resolved from the current file’s directory; absolute paths are used as-is -2. 
**Max recursion depth**: 5 (beyond that the import is skipped and `OnLoadWarning` is triggered) -3. **Cycle detection**: cyclic imports are detected and skipped (`OnLoadWarning` is triggered) -4. **Global de-duplication**: the same file is not loaded twice -5. **Supported extensions** (when the path contains no `/`): `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml` -6. **False-positive filtering**: `@ref` without `/` whose extension is not allowed will be ignored (to avoid treating `@someone` or `@example.com` as an import) +**Allowed extensions**: `.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml` -### Example Directory Layout +This design avoids misinterpreting `@someone`, `@example.com`, etc. as import targets. + +### Resolution Behavior + + + + + + + + + +
    RuleDescription
    Path resolutionRelative paths are resolved from the current file's directory; absolute paths are used as-is
    Maximum recursion depth5 levels (exceeded paths are skipped and trigger
    OnLoadWarning
    )
    Cycle detectionPaths already present in the current ancestor chain are skipped (triggers
    OnLoadWarning
    )
    Global deduplicationThe same file path is read and injected only once across the entire load
    Original text preserved@imported files are appended as separate paragraphs; the
    @path
    text in the original is not removed
    Byte budgetOnce cumulative bytes exceed
    AllAgentsMDMaxBytes
    , subsequent imports are skipped
    + +### Directory Structure Example ``` project/ -├── Agents.md # entry file +├── Agents.md # Main entry file ├── rules/ -│ ├── code-style.md # code style rules -│ ├── api-conventions.md # API conventions -│ └── testing.md # testing rules +│ ├── code-style.md # @rules/code-style.md +│ ├── api-conventions.md # @rules/api-conventions.md +│ └── testing.md └── context/ - └── architecture.md # architecture notes + └── architecture.md ``` --- ## How It Works +### Implementation Hook + +The middleware implements the `BeforeModelRewriteState` method of the `TypedChatModelAgentMiddleware` interface (**not** WrapModel). This hook triggers before each model call, when the state is being rewritten. + ### Injection Flow +### Message Sequence After Injection + ``` -User message + history - │ - ▼ -┌─────────────────────┐ -│ agentsmd middleware │ -│ (WrapModel) │ -│ │ -│ 1. Load Agents.md │ -│ 2. Cache in RunLocal│ -│ 3. Build injected msg│ -└─────────────────────┘ - │ - ▼ -┌─────────────────────────────────────┐ -│ Injected message sequence │ -│ │ -│ [System] system prompt │ -│ [User] ← Agents.md injection │ ← inserted before the first User message -│ [User] previous user message 1 │ -│ [Assistant] assistant reply 1 │ -│ [User] current user message │ -└─────────────────────────────────────┘ - │ - ▼ -Model call (Generate / Stream) +[System] System prompt +[User] ← Agents.md content (with Extra marker) +[User] User historical message 1 +[Assistant] Assistant reply 1 +[User] Current user message ``` -### Key Mechanics +### Key Mechanisms + +**1. Persistent injection + idempotency guarantee** + +The framework persists the state returned by `BeforeModelRewriteState` into the agent's internal state (`st.Messages = state.Messages`). The injected message is marked with `Extra["__agentsmd_content__"]`; each time the hook is entered, it first scans for this marker — if found, it returns the original state directly, avoiding duplicate injection. 
Therefore, in effect: the content is injected and persisted on the first model call, and subsequent iterations do not re-insert it. + +**2. Run-level caching** + +Within the same `Run()`, content loaded for the first time is cached in RunLocal storage via `adk.SetRunLocalValue`. Subsequent model calls (e.g., during multi-turn tool calls) directly reuse the cache via `adk.GetRunLocalValue`. Each new `Run()` reloads from scratch, so file modifications take effect on the next Run. + +**3. Insertion position** + +Content is inserted as a `User` role message **before the first User message**. If there are no User messages in the sequence, it is appended to the end. -1. **Ephemeral injection**: Agents.md content is inserted only for model calls and not written into `ChatModelAgentState`, so it won’t be summarized/compressed -2. **Run-level caching**: within a single agent `Run()`, the loaded Agents.md content is cached in `RunLocalValue`; subsequent model calls reuse it to avoid repeated reads -3. **Insertion position**: injected as a `User` role message before the first user message; if there is no user message, it is appended to the end -4. **I18n**: formatted output adapts to Chinese/English automatically (based on the system language environment) +**4. 
Content formatting** + +Loaded file content is formatted: + +- Wrapped in `` tags +- Includes i18n header (prompting the model to follow instructions) and footer (noting the context may not be relevant) +- Each file is displayed independently with a `File content: {path} (instructions):` prefix +- Language (Chinese/English) is controlled globally via `adk.SetLanguage` --- @@ -222,15 +219,13 @@ Model call (Generate / Stream) ### Middleware Ordering -**It is recommended to place the `agentsmd` middleware after summarization/compression middlewares.** This ensures Agents.md content: - -- won’t be compressed away by summarization -- is fully available on every model call +> 💡 +> **It is recommended to place the agentsmd middleware after summarization/compression middlewares.** This ensures Agents.md content is not compressed by summarization, and the model receives full instructions on every call. ```go -Middlewares: []adk.ChatModelAgentMiddleware{ - summarizationMiddleware, // summarize first - agentsMDMiddleware, // then inject Agents.md +Handlers: []adk.ChatModelAgentMiddleware{ + summarizationMiddleware, // Summarize first + agentsMDMiddleware, // Then inject Agents.md } ``` @@ -238,44 +233,51 @@ Middlewares: []adk.ChatModelAgentMiddleware{ - - - - - + + + + +
    ScenarioBehavior
    File not found (
    os.ErrNotExist
    )
    Skip the file and trigger
    OnLoadWarning
    Cyclic
    @import
    Skip the cyclic file and trigger
    OnLoadWarning
    @import
    depth > 5
    Skip and trigger
    OnLoadWarning
    Total size exceeds
    AllAgentsMDMaxBytes
    Skip remaining files and trigger
    OnLoadWarning
    (the first file is always loaded fully)
    Permission denied / I/O errorAbort loading and return error
    File not found (
    os.ErrNotExist
    )
    Skip the file, trigger
    OnLoadWarning
    Cyclic @importSkip the cyclic file, trigger
    OnLoadWarning
    @import depth exceeds 5 levelsSkip, trigger
    OnLoadWarning
    Cumulative size exceeds
    AllAgentsMDMaxBytes
    Skip subsequent files, trigger
    OnLoadWarning
    (the first file is always loaded in full)
    Permission denied / I/O errorAbort loading, return error
    All file contents emptyDo not inject; pass through original messages
    -### Backend Requirements - -- When a file does not exist, implementations **must** return an error that wraps `os.ErrNotExist` (e.g. `fmt.Errorf(\"... : %w\", os.ErrNotExist)`), otherwise the loader cannot distinguish “missing file” vs “real I/O error” -- `Read` should be concurrency-safe - ### Performance Considerations -- Set `AllAgentsMDMaxBytes` reasonably to avoid injecting too much content and consuming the model context window -- Agents.md is loaded once per `Run()` (run-level caching), but **every new `Run()` reloads it**, so file edits take effect on the next run -- Avoid importing too many files; the recursion depth limit is 5 +- Set `AllAgentsMDMaxBytes` reasonably to avoid injecting too much content that occupies the context window +- Agents.md content is loaded only once per `Run()` (run-level caching), but **every new `Run()` reloads**, so file edits take effect on the next Run +- Avoid importing too many files; the recursion depth limit is 5 levels -### Writing Agents.md +### Agents.md Writing Guidelines -- Keep it concise and include only instructions that truly affect model behavior -- Use `@import` to split concerns (code style, API conventions, architecture notes, etc.) -- Avoid large code examples or datasets in Agents.md to prevent wasting context window -- The content is wrapped in `` tags when passed to the model, so the model treats it as system-level instructions +- Keep content concise; only include instructions that truly affect model behavior +- Use @import to split by concerns (code standards, API conventions, architecture notes, etc.) +- Avoid including large code examples or data to prevent wasting the context window +- File content is wrapped in `` tags when passed to the model --- ## FAQ **Q: Will Agents.md content be saved into the conversation history?** -A: No. The content is injected dynamically during model calls and is not written into `ChatModelAgentState`, so it won’t appear in history. + +A: Yes. 
The state returned by `BeforeModelRewriteState` is persisted by the framework. However, due to the idempotency check (`Extra["__agentsmd_content__"]` marker), content is only injected once on the first model call; subsequent iterations skip it directly. It is recommended to place agentsmd after summarization to avoid the injected content being compressed by summarization. **Q: What happens if an Agents.md file does not exist?** -A: The file is skipped and `OnLoadWarning` is triggered (defaults to `log.Printf`). It does not fail the whole load. + +A: That file is skipped, triggering the `OnLoadWarning` callback (defaults to `log.Printf`), without affecting other files' loading. **Q: What is the base directory for @import paths?** + A: The directory of the current file. For example, `@rules/style.md` in `/project/Agents.md` resolves to `/project/rules/style.md`. -**Q: If multiple files import the same file, will it be loaded multiple times?** -A: No. The loader maintains a global de-duplication map; the same file path is read and injected only once. +**Q: If multiple files @import the same file, will it be loaded multiple times?** + +A: No. The loader maintains a global deduplication map (`seen`); the same path is read and injected only once. + +**Q: Will the @path reference in the original text be replaced?** + +A: No. @imported files are appended as separate paragraphs after the original text; the original content remains unchanged. + +**Q: What is the difference between New and NewTyped?** + +A: `New` returns `ChatModelAgentMiddleware` (i.e., `TypedChatModelAgentMiddleware[*schema.Message]`), suitable for standard Agents. `NewTyped` is the generic version that additionally supports the `*schema.AgenticMessage` type, for Agentic Model scenarios. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md index e7fba84f352..9c900c8d299 100644 --- a/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md +++ b/content/en/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md @@ -1,187 +1,221 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem weight: 2 --- -> 💡 Package: [github.com/cloudwego/eino/adk/middlewares/filesystem](https://github.com/cloudwego/eino/tree/main/adk/middlewares/filesystem) +The FileSystem middleware injects a set of file system operation tools (ls, read\_file, write\_file, edit\_file, glob, grep) and an optional command execution tool (execute) into the Agent, enabling the Agent to interact with local or remote file systems. -## Overview - -The FileSystem middleware provides filesystem access for agents. It operates the filesystem through the [FileSystem Backend](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/filesystem_backend) interface and automatically injects a set of file operation tools and the corresponding system prompt, enabling the agent to read/write/search/edit files directly. 
- -Core capabilities: - -- **Filesystem tool injection** — automatically registers tools such as ls, read_file, write_file, edit_file, glob, grep -- **Shell command execution** — optionally injects the execute tool, supports both sync and streaming execution -- **Per-tool configuration** — each tool can be configured independently (name/description/custom implementation/disable) -- **Multilingual prompts** — tool descriptions and system prompts support Chinese/English switching +``` +import "github.com/cloudwego/eino/adk/middlewares/filesystem" +``` -## Create the Middleware +--- -It is recommended to use `New` to create the middleware (returns `ChatModelAgentMiddleware`): +## Quick Start ```go -import "github.com/cloudwego/eino/adk/middlewares/filesystem" +import ( + "context" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/adk/middlewares/filesystem" +) +// 1. Create middleware middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - // To enable shell command execution, set Shell or StreamingShell - Shell: myShell, + Backend: myBackend, // Implements filesystem.Backend interface }) -if err != nil { - // handle error -} +// 2. Inject into Agent agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ // ... Handlers: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +--- + +## Constructors + + + + + 
    Function SignatureDescription
    New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error)
    Recommended. Returns
    ChatModelAgentMiddleware
    , supports dynamically modifying Instruction and Tools via the
    BeforeAgent
    hook.
    NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error)
    Generic version, type parameter
    M
    supports
    *schema.Message
    and
    *schema.AgenticMessage
    .
    New
    is equivalent to
    NewTyped[*schema.Message]
    .
    + > 💡 -> `New` returns `ChatModelAgentMiddleware` with better context propagation (it can modify the agent’s instruction and tools at runtime via the `BeforeAgent` hook). +> **Deprecated**: `NewMiddleware(ctx, *Config) (AgentMiddleware, error)` is the legacy constructor; new code should use `New`. `NewMiddleware` returns the struct `AgentMiddleware`, which lacks the flexibility of the `BeforeAgent` hook; additionally it enables "large result offloading" by default (see below), which has been removed in the `New` path. + +--- ## MiddlewareConfig -```go -type MiddlewareConfig struct { - // Backend provides filesystem operations - // Required - Backend filesystem.Backend - - // Shell provides synchronous shell command execution - // If set, the execute tool will be registered - // Optional, mutually exclusive with StreamingShell - Shell filesystem.Shell - - // StreamingShell provides streaming shell command execution - // If set, the streaming execute tool will be registered (real-time output) - // Optional, mutually exclusive with Shell - StreamingShell filesystem.StreamingShell - - // Per-tool configuration (all optional) - LsToolConfig *ToolConfig // ls tool config - ReadFileToolConfig *ToolConfig // read_file tool config - WriteFileToolConfig *ToolConfig // write_file tool config - EditFileToolConfig *ToolConfig // edit_file tool config - GlobToolConfig *ToolConfig // glob tool config - GrepToolConfig *ToolConfig // grep tool config - - // CustomSystemPrompt overrides the default system prompt - // Optional, defaults to ToolsSystemPrompt - CustomSystemPrompt *string - - // Deprecated fields, use the corresponding *ToolConfig.Desc instead - // CustomLsToolDesc, CustomReadFileToolDesc, CustomGrepToolDesc, - // CustomGlobToolDesc, CustomWriteFileToolDesc, CustomEditToolDesc -} -``` +`MiddlewareConfig` is the configuration struct used by `New` / `NewTyped`. -### ToolConfig +### Core Fields -Each tool can be configured independently via `ToolConfig`: + + + + + + + +
<table>
<tr><th>Field</th><th>Type</th><th>Description</th></tr>
<tr><td><code>Backend</code></td><td><code>filesystem.Backend</code></td><td>Required. Provides file system operation capabilities, driving the 6 tools: ls, read_file, write_file, edit_file, glob, grep. Interface defined in the <code>github.com/cloudwego/eino/adk/filesystem</code> package.</td></tr>
<tr><td><code>Shell</code></td><td><code>filesystem.Shell</code></td><td>Optional. Provides command execution capability; when set, the <code>execute</code> tool is registered. Mutually exclusive with <code>StreamingShell</code>.</td></tr>
<tr><td><code>StreamingShell</code></td><td><code>filesystem.StreamingShell</code></td><td>Optional. Provides streaming command execution capability; when set, the streaming <code>execute</code> tool is registered. Mutually exclusive with <code>Shell</code>.</td></tr>
<tr><td><code>UseMultiModalRead</code></td><td><code>bool</code></td><td>Optional, defaults to <code>false</code>. When enabled, the <code>read_file</code> tool becomes an <code>EnhancedInvokableTool</code>, supporting multi-modal content such as images/PDFs. Requires the Backend to also implement the <code>filesystem.MultiModalReader</code> interface.</td></tr>
<tr><td><code>CustomSystemPrompt</code></td><td><code>*string</code></td><td>Optional. Overrides the system prompt appended to the Agent Instruction. If <code>nil</code>, no system prompt is appended.</td></tr>
</table>
    + +### Tool Configuration Fields + +Each tool has a corresponding `*ToolConfig` field for customizing the tool name, description, replacing the implementation, or disabling it: + + + + + + + + + +
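<table>
<tr><th>Field</th><th>Corresponding Tool</th></tr>
<tr><td><code>LsToolConfig</code></td><td>ls</td></tr>
<tr><td><code>ReadFileToolConfig</code></td><td>read_file</td></tr>
<tr><td><code>WriteFileToolConfig</code></td><td>write_file</td></tr>
<tr><td><code>EditFileToolConfig</code></td><td>edit_file</td></tr>
<tr><td><code>GlobToolConfig</code></td><td>glob</td></tr>
<tr><td><code>GrepToolConfig</code></td><td>grep</td></tr>
</table>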
    + +> The `execute` tool currently does not support customization via `ToolConfig`; its registration is controlled solely by whether `Shell` / `StreamingShell` is set. + +--- + +## ToolConfig ```go type ToolConfig struct { - // Name overrides the tool name - // Optional. Defaults to the built-in name (e.g. "ls", "read_file") - Name string - - // Desc overrides the tool description - // Optional. Defaults to the built-in description - Desc *string - - // CustomTool provides a custom tool implementation - // If set, it replaces the default implementation built on Backend - // Optional - CustomTool tool.BaseTool - - // Disable disables this tool - // When true, the tool will not be registered - // Optional, defaults to false - Disable bool + Name string // Override tool name; empty string uses default + Desc *string // Override tool description; nil uses default + CustomTool tool.BaseTool // Custom tool implementation; when set, replaces the Backend default implementation + Disable bool // Set to true to not register this tool } ``` -Example — rename a tool and disable write: +**Priority**: `Disable=true` > `CustomTool` > Backend default implementation. + +--- + +## Tool Name Constants ```go -middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - ReadFileToolConfig: &filesystem.ToolConfig{ - Name: "cat_file", // custom name - }, - WriteFileToolConfig: &filesystem.ToolConfig{ - Disable: true, // disable write tool - }, -}) +const ( + ToolNameLs = "ls" + ToolNameReadFile = "read_file" + ToolNameWriteFile = "write_file" + ToolNameEditFile = "edit_file" + ToolNameGlob = "glob" + ToolNameGrep = "grep" + ToolNameExecute = "execute" +) ``` +--- + ## Injected Tools - - - - - - - - + + + + + + + +
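<table>
<tr><th>Tool</th><th>Default name</th><th>Description</th><th>Condition</th></tr>
<tr><td>List directory</td><td><code>ls</code></td><td>List files and directories under the given path</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Read file</td><td><code>read_file</code></td><td>Read file content, supports line-based pagination (offset + limit)</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Write file</td><td><code>write_file</code></td><td>Create or overwrite a file</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Edit file</td><td><code>edit_file</code></td><td>Replace strings in a file</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Glob</td><td><code>glob</code></td><td>Find files by glob pattern</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Search content</td><td><code>grep</code></td><td>Search file content by pattern, supports multiple output modes</td><td>Injected when Backend is not nil</td></tr>
<tr><td>Execute command</td><td><code>execute</code></td><td>Execute shell commands</td><td>Requires Shell or StreamingShell</td></tr>
</table>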
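<table>
<tr><th>Tool</th><th>Default Name</th><th>Registration Condition</th><th>Description</th></tr>
<tr><td>ls</td><td><code>ls</code></td><td>Backend ≠ nil</td><td>List files and subdirectories under a directory</td></tr>
<tr><td>read_file</td><td><code>read_file</code></td><td>Backend ≠ nil</td><td>Read file content, supports offset/limit pagination. When <code>UseMultiModalRead</code> is enabled, can also read images and PDFs</td></tr>
<tr><td>write_file</td><td><code>write_file</code></td><td>Backend ≠ nil</td><td>Create or overwrite a file</td></tr>
<tr><td>edit_file</td><td><code>edit_file</code></td><td>Backend ≠ nil</td><td>Exact string replacement editing, supports <code>replace_all</code></td></tr>
<tr><td>glob</td><td><code>glob</code></td><td>Backend ≠ nil</td><td>Match file paths by glob pattern</td></tr>
<tr><td>grep</td><td><code>grep</code></td><td>Backend ≠ nil</td><td>Regex search file content, supports multiple output modes and pagination</td></tr>
<tr><td>execute</td><td><code>execute</code></td><td>Shell ≠ nil or StreamingShell ≠ nil</td><td>Execute shell commands</td></tr>
</table>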
    -Each tool can be disabled via its corresponding `*ToolConfig` (`Disable: true`) or replaced with a custom implementation (`CustomTool`). +--- -## Multilingual Support +## Backend Interface -Tool descriptions and built-in prompts default to English. To switch to Chinese, use `adk.SetLanguage()`: +`Backend` is defined in the `github.com/cloudwego/eino/adk/filesystem` package. The middleware package re-exports request/response types via type aliases (e.g., `ReadRequest`, `FileContent`), but **the Backend interface itself needs to be referenced from the adk/filesystem package**. ```go -import "github.com/cloudwego/eino/adk" - -adk.SetLanguage(adk.LanguageChinese) // switch to Chinese -adk.SetLanguage(adk.LanguageEnglish) // switch to English (default) +type Backend interface { + LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) + GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) + Write(ctx context.Context, req *WriteRequest) error + Edit(ctx context.Context, req *EditRequest) error +} ``` -You can also customize each tool’s text via `ToolConfig.Desc` or override the system prompt via `CustomSystemPrompt`. +### Shell and StreamingShell -## [deprecated] Large Tool Result Offloading +```go +type Shell interface { + Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error) +} -> 💡 -> This feature will be deprecated in 0.8.0. Please migrate to Middleware: ToolReduction. +type StreamingShell interface { + ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error) +} +``` -> Note: Large tool result offloading is only available in the legacy `Config` + `NewMiddleware` API. The recommended `MiddlewareConfig` + `New` does not include it. If you need it, use the ToolReduction middleware. 
+The two are mutually exclusive; only one can be set. `StreamingShell` supports streaming output, suitable for long-running commands. -When tool call results are too large (e.g. reading large files, grep matching too many lines), keeping the full result in the conversation context can cause: +--- -- token usage to spike -- agent history context pollution -- worse reasoning efficiency +## MultiModalReader Extension Interface -So the legacy middleware (`NewMiddleware`) provides an automatic offloading mechanism: +When `UseMultiModalRead = true`, the Backend needs to additionally implement the `MultiModalReader` interface: -- when the result exceeds a threshold (default 20,000 tokens), it does not return the full content to the LLM -- the actual result is saved to the filesystem (Backend) -- the context contains only a summary and a file path (the agent can call `read_file` again to fetch on demand) +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` -This feature is enabled by default and can be configured via `Config` (not `MiddlewareConfig`): +**Behavior description**: -```go -type Config struct { - // ... Backend, Shell, StreamingShell, ToolConfig fields are the same as MiddlewareConfig +- The `read_file` tool will be upgraded from `InvokableTool` to `EnhancedInvokableTool`, returning multi-modal results via `schema.ToolResult.Parts` +- The default implementation supports reading image files (PNG, JPG, etc.) 
and PDF files (supports the `pages` parameter to specify page ranges, up to 20 pages at a time) +- The tool description will automatically have a multi-modal capability suffix appended; if the description is customized via `ReadFileToolConfig.Desc`, no suffix is appended - // Disable automatic offloading - WithoutLargeToolResultOffloading bool +> 💡 +> When using `ChatModelAgentMiddleware`, the `WrapEnhancedInvokableToolCall` method must be implemented for the multi-modal read\_file tool to take effect. + +```go +// MultiModalReadRequest extends ReadRequest +type MultiModalReadRequest struct { + ReadRequest + Pages string // PDF page range, e.g. "1-5", "3", "10-20" +} - // Custom threshold (default 20000 tokens) - LargeToolResultOffloadingTokenLimit int +// MultiFileContent return result +type MultiFileContent struct { + *FileContent // Plain text result + Parts []FileContentPart // Multi-modal result (mutually exclusive with FileContent; Parts takes precedence when non-empty) +} - // Custom offloading path generator - // Default path format: /large_tool_result/{ToolCallID} - LargeToolResultOffloadingPathGen func(ctx context.Context, input *compose.ToolInput) (string, error) +type FileContentPart struct { + Type FileContentPartType // "image" or "pdf" + MIMEType string // e.g. "image/png", "application/pdf" + Data []byte // Raw binary data } ``` + +--- + +## Deprecated: Legacy Config and Large Result Offloading + +> 💡 +> The following content applies only to the `NewMiddleware` + `Config` legacy path. The `New` / `NewTyped` path **does not include** the large result offloading feature. + +The legacy `Config` provides a "Large Tool Result Offloading" mechanism in addition to `MiddlewareConfig`: + + + + + + +
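<table>
<tr><th>Field</th><th>Description</th></tr>
<tr><td><code>WithoutLargeToolResultOffloading bool</code></td><td>Set to <code>true</code> to disable offloading, defaults to <code>false</code> (enabled)</td></tr>
<tr><td><code>LargeToolResultOffloadingTokenLimit int</code></td><td>Token threshold, default <code>20000</code></td></tr>
<tr><td><code>LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error)</code></td><td>Offload path generation function, default <code>/large_tool_result/{ToolCallID}</code></td></tr>
</table>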
    + +**Trigger condition**: When the character count of a tool result > `tokenLimit × 4`, offloading is triggered. + +**Offload behavior**: The full result is written to a file via `Backend.Write`, and replaced with a summary (first 10 lines + file path hint). The Agent can read the full result in pages via `read_file`. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md b/content/en/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md new file mode 100644 index 00000000000..33b5e901640 --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md @@ -0,0 +1,540 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Agent Cancel and TurnLoop Quick Start +weight: 10 +--- + +A quick start guide for the two core features in Eino ADK: **Agent Cancel** and **TurnLoop**. Introduced in [v0.9.0-alpha.9](https://github.com/cloudwego/eino/releases/tag/v0.9.0-alpha.9). + +## Type Conventions + +All examples in this document use the following generic instantiations: + +- `T = string` (the business item type pushed to TurnLoop) +- `M = *schema.Message` (the Agent message type, i.e., the standard `Message`) + +ADK type aliases: + +```go +type Agent = TypedAgent[*schema.Message] +type AgentInput = TypedAgentInput[*schema.Message] +type AgentEvent = TypedAgentEvent[*schema.Message] +``` + +When using `*schema.AgenticMessage`, simply replace `M` with the corresponding type—all API signatures are completely symmetric. + +--- + +## Part 1: Agent Cancel + +### Scenario + +After a user sends a request to an agent, they may want to cancel the current execution due to long wait times or changed requirements. 
+ +### Core API + +```go +// Create cancel option and cancel function +cancelOpt, cancelFunc := adk.WithCancel() + +// Start the agent, passing in the cancel option +iter := runner.Run(ctx, []*schema.Message{schema.UserMessage("hello")}, cancelOpt) + +// Initiate cancellation (can be called from any goroutine) +handle, contributed := cancelFunc(adk.WithAgentCancelMode(adk.CancelImmediate)) +// contributed == true: this call affected the execution result +// contributed == false: agent already finished or cancellation already completed, this call has no actual effect + +err := handle.Wait() +``` + +Three possible return values from `CancelHandle.Wait()`: + +```go +switch { +case err == nil: + // Cancellation successful +case errors.Is(err, adk.ErrCancelTimeout): + // Safe-point timeout, automatically escalated to immediate cancellation +case errors.Is(err, adk.ErrExecutionEnded): + // Agent finished naturally before cancellation took effect +} +``` + +### Three Cancellation Modes + + + + + + +
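<table>
<tr><th>Mode</th><th>Behavior</th><th>Use Case</th></tr>
<tr><td><code>CancelImmediate</code></td><td>Interrupts immediately without waiting for a safe point</td><td>Emergency stop, timeout fallback</td></tr>
<tr><td><code>CancelAfterChatModel</code></td><td>Cancels after the current ChatModel call completes</td><td>Need complete model response</td></tr>
<tr><td><code>CancelAfterToolCalls</code></td><td>Cancels after all current ToolCalls complete</td><td>Ensure tool side effects are complete</td></tr>
</table>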
    + +> 💡 +> `CancelMode` is a bitmask and can be combined: `CancelAfterChatModel | CancelAfterToolCalls` is equivalent to "cancel at whichever safe point is reached first". + +### Safe-Point Cancellation + +```go +// Cancel after ChatModel completes, with 5-second timeout protection +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithAgentCancelTimeout(5*time.Second), +) +``` + +> 💡 +> Safe-point modes should always be used with `WithAgentCancelTimeout`. If the agent never reaches a safe point, it automatically escalates to immediate cancellation after timeout. + +### Recursive Cancellation + +By default, cancellation only affects the root agent. Use `WithRecursive()` to propagate cancellation to sub-agents nested within AgentTools: + +```go +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithRecursive(), +) +``` + +### Identifying Cancellation on the Consumer Side + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Err != nil { + var cancelErr *adk.CancelError + if errors.As(event.Err, &cancelErr) { + log.Printf("Agent was cancelled (mode=%v, escalated=%v)", + cancelErr.Info.Mode, cancelErr.Info.Escalated) + } + break + } + // Process normal events... +} +``` + +--- + +## Part 2: TurnLoop + +### Scenario + +Build a continuously running agent service: users send messages at any time, the agent processes them in turns; urgent messages can preempt the current execution. 
+ +### Turn Lifecycle + + + +### Basic Usage + +```go +loop := adk.NewTurnLoop(adk.TurnLoopConfig[string, *schema.Message]{ + // GenInput: receives all buffered items, decides which to consume this turn + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(strings.Join(items, "\n"))}}, + Consumed: items, + }, nil + }, + + // PrepareAgent: builds the Agent based on consumed items for this turn + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // OnAgentEvents: processes the agent event stream (optional) + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + if event.Err != nil { + return event.Err + } + log.Printf("Received event: agent=%s", event.AgentName) + } + return nil + }, +}) + +loop.Push("message 1") +loop.Push("message 2") +loop.Run(ctx) // Non-blocking, starts background processing +loop.Push("message 3") // Can still push while running +loop.Stop() +result := loop.Wait() // Blocks until exit +``` + +### Core Callbacks + + + + + + + +
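<table>
<tr><th>Callback</th><th>Required</th><th>Responsibility</th></tr>
<tr><td><code>GenInput</code></td><td></td><td>Receives all buffered items, returns <code>Consumed</code> (processed this turn) and <code>Remaining</code> (kept for subsequent turns). Items not in either will be discarded.</td></tr>
<tr><td><code>PrepareAgent</code></td><td></td><td>Builds the Agent based on Consumed items (sets up prompt, tools, middleware, etc.)</td></tr>
<tr><td><code>OnAgentEvents</code></td><td></td><td>Processes the agent event stream. When not set, defaults to draining events and returning the first error</td></tr>
<tr><td><code>GenResume</code></td><td></td><td>Called when restoring from checkpoint, decides how to merge interrupted/unhandled/new items</td></tr>
</table>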
    + +> 💡 +> **Do not propagate CancelError** in `OnAgentEvents`—the framework handles it automatically. `CancelError` from Stop is propagated as `ExitReason`; `CancelError` from Preempt is swallowed by the framework, and the loop continues to the next turn. The callback should only return a non-nil error when it encounters a fatal error itself. + +### Preemption + +```go +// Push an urgent message, cancel current agent at safe point +accepted, ack := loop.Push("Urgent message!", adk.WithPreempt[string, *schema.Message](adk.AnySafePoint)) + +if accepted { + <-ack // Wait for preemption signal to be committed (current turn is guaranteed to be cancelled) +} +``` + +Preemption is an atomic operation—"push new message" and "cancel current agent" execute as a whole: + +1. Urgent message enters the buffer +2. Current agent is cancelled at the safe point +3. TurnLoop automatically starts a new turn +4. `GenInput` receives all buffered items (including the urgent message), makes decisions again + +> 💡 +> `WithPreempt` always uses safe-point cancellation and **does not automatically set WithRecursive**. However, `WithPreemptTimeout` automatically enables `WithRecursive`—when timeout escalates to immediate cancellation, nested sub-agents are also terminated. 
+ +### Preemption with Timeout / Delay + +```go +// Safe-point wait, escalate to immediate cancellation after 5-second timeout (auto-recursive) +loop.Push("urgent", adk.WithPreemptTimeout[string, *schema.Message](adk.AnySafePoint, 5*time.Second)) + +// 2-second grace period before initiating preemption +loop.Push("new message", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + adk.WithPreemptDelay[string, *schema.Message](2*time.Second), +) +``` + +### Conditional Preemption: WithPushStrategy + +When the preemption decision depends on the current turn state, use `WithPushStrategy` to avoid TOCTOU races: + +```go +loop.Push(urgentItem, adk.WithPushStrategy( + func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message]) []adk.PushOption[string, *schema.Message] { + if tc == nil { + return nil // No active turn, no need to preempt + } + if isLowPriority(tc.Consumed) { + return []adk.PushOption[string, *schema.Message]{ + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + } + } + return nil // Current is a high-priority task, don't preempt + }, +)) +``` + +### Detecting Preemption and Stop in OnAgentEvents + +`TurnContext` provides `Preempted` and `Stopped` signal channels: + +```go +OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + + select { + case <-tc.Preempted: + log.Println("Current turn preempted, wrapping up...") + case <-tc.Stopped: + log.Printf("Loop is stopping, reason: %s", tc.StopCause()) + default: + } + + if event.Err != nil { + return event.Err + } + // Process events... + } + return nil +}, +``` + +> 💡 +> `Preempted` / `Stopped` are only closed when the corresponding cancel call actually "contributes" to the current turn's `CancelError`. If the cancellation has already been finalized by another signal, the channels remain open. 
+ +### Stopping TurnLoop + +```go +// Wait for current turn to complete before exiting (ExitReason is nil) +loop.Stop() + +// Immediately abort current agent (recursively propagates to nested agents) +loop.Stop(adk.WithImmediate()) + +// Safe-point stop (recursively propagates, no timeout) +loop.Stop(adk.WithGraceful()) + +// Safe-point stop with timeout (escalates to immediate cancellation after timeout) +loop.Stop(adk.WithGracefulTimeout(10 * time.Second)) + +// Auto-shutdown after idle (stops after 30 seconds of continuous idle) +loop.Stop(adk.UntilIdleFor(30 * time.Second)) +``` + +> 💡 +> You can call `Stop()` multiple times to escalate the cancellation strategy. Typical pattern: first `WithGraceful()`, then `WithImmediate()` after timeout. + +### Attaching Stop Cause + +```go +loop.Stop( + adk.WithGraceful(), + adk.WithStopCause("quota exceeded"), +) +result := loop.Wait() +log.Printf("Stop cause: %s", result.StopCause) +``` + +--- + +## Part 3: Declarative Checkpoint Recovery + +### Scenario + +After an Agent is cancelled or interrupted, the next startup automatically resumes from the breakpoint rather than starting from scratch. TurnLoop automatically manages input bookkeeping—the application layer only needs to declare how interrupted/unhandled/new items re-enter subsequent turns. 
+ +### Configuring Checkpoint + +Enable by setting both `Store` and `CheckpointID` in `TurnLoopConfig`: + +```go +store := NewMyCheckpointStore() // Implements CheckPointStore interface + +cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(items[0])}}, + Consumed: items[:1], + Remaining: items[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // GenResume: called when restoring from checkpoint + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) + return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + Store: store, + CheckpointID: "session-123", +} +``` + +### Recovery Flow + +`Run()` automatically queries the Store on startup: + + + + + + +
<table>
<tr><th>Checkpoint State</th><th>Behavior</th></tr>
<tr><td>Mid-turn checkpoint exists (agent was interrupted during execution)</td><td>Calls <code>GenResume</code>, passes interrupted/unhandled/new items to the application layer for decision before resuming execution</td></tr>
<tr><td>Between-turns checkpoint exists (stopped between turns)</td><td>Adds buffered items to the buffer, processes normally through <code>GenInput</code></td></tr>
<tr><td>No checkpoint exists</td><td>Starts from scratch</td></tr>
</table>
    + +```go +// First run +loop := adk.NewTurnLoop(cfg) +loop.Push("message 1") +loop.Run(ctx) +loop.Stop(adk.WithGraceful()) +exit := loop.Wait() +log.Printf("checkpoint attempted: %v, err: %v", exit.CheckpointAttempted, exit.CheckpointErr) + +// Second run (same cfg, containing the same CheckpointID) +loop2 := adk.NewTurnLoop(cfg) +loop2.Push("new message") // Passed as newItems to GenResume +loop2.Run(ctx) // Automatically detects checkpoint and resumes +result := loop2.Wait() +``` + +### Skipping Checkpoint + +```go +loop.Stop(adk.WithSkipCheckpoint()) // Don't save checkpoint on this exit +``` + +### Implementing CheckPointStore + +```go +type CheckPointStore interface { + Get(ctx context.Context, checkPointID string) ([]byte, bool, error) + Set(ctx context.Context, checkPointID string, checkPoint []byte) error +} +``` + +Optionally implement `CheckPointDeleter` to support explicit deletion of expired checkpoints: + +```go +type CheckPointDeleter interface { + Delete(ctx context.Context, checkPointID string) error +} +``` + +On normal exit (without saving a new checkpoint), TurnLoop will attempt to delete the previously loaded checkpoint to prevent stale recovery. **Only Stores that implement CheckPointDeleter will perform deletion**; otherwise the Store manages the lifecycle itself. + +> 💡 +> When using `Store`, the generic parameter `T` must support `encoding/gob` encoding/decoding—TurnLoop persists runner checkpoints and item bookkeeping information via gob. 
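The gob requirement on `T` can be checked before wiring an item type into a TurnLoop. The following is a minimal sketch using only the standard library: `ChatItem`, `Attachment`, `ImageRef`, and `roundTrip` are hypothetical names introduced for illustration, and how TurnLoop wraps items internally is not shown in this document, so this only confirms that the type itself survives a gob encode/decode cycle.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Attachment is a hypothetical interface field carried by an item.
type Attachment interface {
	Kind() string
}

// ImageRef is a hypothetical concrete Attachment implementation.
type ImageRef struct {
	URL string
}

func (ImageRef) Kind() string { return "image" }

// ChatItem is a hypothetical business item type that would be used as T.
// All fields are exported so that gob can see them.
type ChatItem struct {
	UserID     string
	Text       string
	Priority   int
	Attachment Attachment // interface field: concrete types must be registered
}

func init() {
	// Without this registration, encoding a ChatItem that holds an ImageRef fails.
	gob.Register(ImageRef{})
}

// roundTrip encodes items with gob and decodes them into a fresh slice.
func roundTrip(items []ChatItem) ([]ChatItem, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(items); err != nil {
		return nil, fmt.Errorf("gob encode: %w", err)
	}
	var out []ChatItem
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return nil, fmt.Errorf("gob decode: %w", err)
	}
	return out, nil
}

func main() {
	in := []ChatItem{
		{UserID: "u1", Text: "hello", Priority: 1},
		{UserID: "u2", Text: "see picture", Priority: 2, Attachment: ImageRef{URL: "https://example.com/a.png"}},
	}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", out)
}
```

Running a check like this in a unit test surfaces unexported fields or unregistered interface types early, rather than at the first checkpoint save.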
+ +--- + +## Part 4: Complete Example + +Simulates a chat service supporting priority scheduling, preemption, and checkpoint recovery: + +```go +package main + +import ( + "context" + "log" + "strings" + "time" + + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/schema" +) + +func main() { + ctx := context.Background() + store := adk.NewInMemoryStore() + + cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + // Sort by priority, consume only the first item, keep the rest for subsequent turns + sorted := sortByPriority(items) + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(sorted[0])}}, + Consumed: sorted[:1], + Remaining: sorted[1:], // Items not in either will be discarded + }, nil + }, + + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) 
+ return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return buildAgent(consumed), nil + }, + + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + // Detect preemption/stop signals for cleanup + select { + case <-tc.Preempted: + log.Println("Preempted by higher priority message") + case <-tc.Stopped: + log.Printf("Service shutting down: %s", tc.StopCause()) + default: + } + if event.Err != nil { + // Don't propagate CancelError, framework handles it automatically + return event.Err + } + log.Printf("[%s] %s", event.AgentName, extractText(event)) + } + return nil + }, + + Store: store, + CheckpointID: "chat-session-001", + } + + loop := adk.NewTurnLoop(cfg) + loop.Push("Hello, help me check the weather") + loop.Run(ctx) + + // Send urgent message to preempt after 1 second + time.AfterFunc(1*time.Second, func() { + loop.Push("Stop! Handle this urgent issue first", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + ) + }) + + // Graceful shutdown after 5 seconds + time.AfterFunc(5*time.Second, func() { + loop.Stop( + adk.WithGracefulTimeout(3*time.Second), + adk.WithStopCause("service shutdown"), + ) + }) + + result := loop.Wait() + log.Printf("Exit reason: %v", result.ExitReason) + log.Printf("Unhandled messages: %v", result.UnhandledItems) + log.Printf("Stop cause: %s", result.StopCause) + log.Printf("checkpoint: attempted=%v, err=%v", result.CheckpointAttempted, result.CheckpointErr) + + // Next startup with the same cfg will automatically resume from checkpoint +} +``` + +--- + +## FAQ + +### Q: Can safe-point cancellation wait forever without reaching a safe point? + +Yes. 
If the agent is stuck in a long-running tool or model call, the safe point may never arrive. **Always use it with WithAgentCancelTimeout**—after timeout it automatically escalates to `CancelImmediate`. + +### Q: When is `WithRecursive` needed? + +By default, cancellation only affects the root agent. It's only needed when the agent hierarchy contains sub-agents nested in AgentTools and you want those sub-agents to respond to cancellation at safe points too. When in doubt, don't add it. + +### Q: What are the requirements for generic parameter T? + +When `Store` is configured, `T` must be encodable/decodable by `encoding/gob`. Primitive types (`string`, `int`, etc.) and structs with all exported fields are supported by default. If `T` contains interface fields, the concrete types need to be registered via `gob.Register`. + +### Q: What happens when `Push` is called after the loop stops? + +`Push` returns `(false, closedCh)`. These "late items" won't enter the checkpoint and can be recovered via `result.TakeLateItems()` after `Wait()` returns. Once `TakeLateItems()` is called, subsequent `Push` calls will panic to prevent silent data loss. + +### Q: What happens when `Stop()` is called multiple times? + +It's safe—each call can escalate the cancellation strategy. Typical pattern: + +```go +loop.Stop(adk.WithGraceful()) // First try graceful stop +time.AfterFunc(3*time.Second, func() { + loop.Stop(adk.WithImmediate()) // Escalate to immediate cancellation after 3 seconds +}) +``` + +### Q: What happens to items returned by `GenInput` that are neither in Consumed nor in Remaining? + +They are discarded. This is by design—it allows `GenInput` to filter out unwanted items during decision-making. 
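The discard rule from the last answer can be sketched with plain slices. This is an illustration only: `partition` is a hypothetical helper, not part of `adk`, and a real `GenInput` callback would return its consumed and remaining slices through `GenInputResult` as shown in the earlier examples.

```go
package main

import (
	"fmt"
	"strings"
)

// partition is a hypothetical helper that mimics the decision a GenInput
// callback makes. The first non-blank item is consumed this turn, later
// non-blank items are kept for subsequent turns, and blank items are placed
// in neither slice, which is what causes the loop to discard them.
func partition(items []string) (consumed, remaining, dropped []string) {
	for _, it := range items {
		switch {
		case strings.TrimSpace(it) == "":
			dropped = append(dropped, it) // tracked here only to make the filtering visible
		case len(consumed) == 0:
			consumed = append(consumed, it)
		default:
			remaining = append(remaining, it)
		}
	}
	return consumed, remaining, dropped
}

func main() {
	consumed, remaining, dropped := partition([]string{"", "check weather", "   ", "book flight"})
	fmt.Printf("consumed=%q remaining=%q dropped=%q\n", consumed, remaining, dropped)
}
```

Returning only `consumed` and `remaining` to the loop is what filters the blank items out; the `dropped` slice exists in this sketch purely for inspection.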
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md b/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md index 5c012b3651f..5e056eb1f9f 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_collaboration.md @@ -1,521 +1,116 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Agent Collaboration' +title: Agent Collaboration weight: 4 --- -# Agent Collaboration +# Multi-Agent Collaboration -The overview document has provided basic explanations of Agent collaboration. Below we will introduce the design and implementation of collaboration and composition primitives in combination with code: +Eino ADK provides two main Agent collaboration approaches: -## Collaboration Primitives +## AgentAsTool (Recommended) -### Inter-Agent Collaboration Methods +Wraps a sub-Agent as a Tool, allowing the parent Agent to autonomously decide when to call it via ToolCall. The sub-Agent executes independently, and results are returned to the parent Agent's context. - - - - -
<table>
<tr><th>Collaboration Method</th><th>Description</th></tr>
<tr><td>Transfer</td><td>Directly transfers the task to another Agent. The current Agent exits after execution completes, without concern for the task execution status of the transferred Agent</td></tr>
<tr><td>ToolCall (AgentAsTool)</td><td>Invokes the Agent as a ToolCall, waits for the Agent's response, and can obtain the output result of the called Agent for the next round of processing</td></tr>
</table>
    - -### AgentInput Context Strategies - - - - - -
<table>
<tr><th>Context Strategy</th><th>Description</th></tr>
<tr><td>Upstream Agent Full Conversation</td><td>Gets the complete conversation history of the current Agent's upstream Agent</td></tr>
<tr><td>Fresh Task Description</td><td>Ignores the complete conversation history of the upstream Agent and provides a completely new task summary as the AgentInput for the sub-Agent</td></tr>
</table>
    - -### Decision Autonomy - - - - - -
<table>
<tr><th>Decision Autonomy</th><th>Description</th></tr>
<tr><td>Autonomous Decision</td><td>Within the Agent, based on its available downstream Agents, when assistance is needed, it autonomously selects a downstream Agent for assistance. Generally, the Agent makes decisions based on LLM internally, but even if selection is based on preset logic, it is still considered autonomous decision from the Agent's external perspective</td></tr>
<tr><td>Preset Decision</td><td>The next Agent after an Agent executes a task is predetermined. The execution order of Agents is predetermined and predictable</td></tr>
</table>
    - -### Composition Primitives - - - - - - - - -
| Type | Description | Running Mode | Collaboration Method | Context Strategy | Decision Autonomy |
| --- | --- | --- | --- | --- | --- |
| SubAgents | Combines the user-provided agent as the parent Agent and the user-provided subAgents list as child Agents to form an Agent capable of autonomous decision-making. The Name and Description serve as the Agent's name identifier and description.<ul><li>Currently, an Agent can only have one parent Agent</li><li>Use the SetSubAgents function to build a "multi-tree" form of Multi-Agent</li><li>Within this "multi-tree", AgentName must remain unique</li></ul> | | Transfer | Upstream Agent Full Conversation | Autonomous Decision |
| Sequential | Combines the user-provided SubAgents list into a Sequential Agent that executes in order. The Name and Description serve as the Sequential Agent's name identifier and description. When the Sequential Agent executes, it runs the SubAgents list in order until all Agents have been executed. | | Transfer | Upstream Agent Full Conversation | Preset Decision |
| Parallel | Combines the user-provided SubAgents list into a Parallel Agent that executes concurrently based on the same context. The Name and Description serve as the Parallel Agent's name identifier and description. When the Parallel Agent executes, it runs the SubAgents list concurrently and finishes when all Agents have completed. | | Transfer | Upstream Agent Full Conversation | Preset Decision |
| Loop | Executes the user-provided SubAgents list in array order sequentially and repeatedly, forming a Loop Agent. The Name and Description serve as the Loop Agent's name identifier and description. When the Loop Agent executes, it runs the SubAgents list in order and finishes when all Agents have completed. | | Transfer | Upstream Agent Full Conversation | Preset Decision |
| AgentAsTool | Converts an Agent into a Tool to be used by other Agents as a regular Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided in ADK supports the AgentAsTool functionality | | ToolCall | Fresh Task Description | Autonomous Decision |
    - -## Context Passing - -When building multi-Agent systems, efficient and accurate sharing of information between different Agents is crucial. Eino ADK provides two core context passing mechanisms to meet different collaboration needs: History and SessionValues. - -### History - -#### Concept - -History corresponds to the [Upstream Agent Full Conversation context strategy]. Every AgentEvent produced by each Agent in a multi-Agent system is saved to History. When calling a new Agent (Workflow/Transfer), the AgentEvents in History are converted and concatenated into the AgentInput. - -By default, Assistant or Tool Messages from other Agents are converted to User Messages. This is equivalent to telling the current LLM: "Just now, Agent_A called some_tool and returned some_result. Now it's your turn to make a decision." - -Through this approach, other Agents' behaviors are treated as "external information" or "factual statements" provided to the current Agent, rather than its own behaviors, thus avoiding LLM context confusion. - - - -In Eino ADK, when building AgentInput for an Agent, the History it can see is "all AgentEvents produced before me". - -Worth mentioning is ParallelWorkflowAgent: two parallel sub-Agents (A, B) cannot see each other's AgentEvents during parallel execution because neither A nor B precedes the other. - -#### RunPath - -Each AgentEvent in History is "produced by a specific Agent in a specific execution sequence", meaning AgentEvent has its own RunPath. The purpose of RunPath is to convey this information; it doesn't carry other functions in the eino framework. - -The table below shows the specific RunPath when Agents execute under various orchestration modes: - - - - - - - -
    ExampleRunPath
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent]
  • Agent (after function call): [Agent]
  • Agent1: [SequentialAgent, LoopAgent, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2]
  • Agent1: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1]
  • Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1, Agent2]
  • Agent3: [SequentialAgent, LoopAgent, Agent3]
  • Agent4: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent4]
  • Agent5: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent5]
  • Agent6: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent6]
  • Agent: [Agent]
  • SubAgent: [Agent, SubAgent]
  • Agent: [Agent, SubAgent, Agent]
  • - -#### Customization - -In some cases, the History content needs to be adjusted before the Agent runs. At this point, you can customize how the Agent generates AgentInput from History using AgentWithOptions: - -```go -// github.com/cloudwego/eino/adk/flow.go - -type HistoryRewriter func(ctx context.Context, entries []*HistoryEntry) ([]Message, error) - -func WithHistoryRewriter(h HistoryRewriter) AgentOption -``` - -### SessionValues - -#### Concept +This is the most flexible and composable collaboration pattern: -SessionValues is a global temporary KV store that persists throughout a single run, used to support cross-Agent state management and data sharing. Any Agent in a single run can read and write SessionValues at any time. - -Eino ADK provides multiple methods for Agents to read and write Session Values in a concurrency-safe manner at runtime: - -```go -// github.com/cloudwego/eino/adk/runctx.go - -// Get all SessionValues -func GetSessionValues(ctx context.Context) map[string]any -// Batch set SessionValues -func AddSessionValues(ctx context.Context, kvs map[string]any) -// Get a value from SessionValues by specified key. Returns false as the second value if key doesn't exist, otherwise true -func GetSessionValue(ctx context.Context, key string) (any, bool) -// Set a single SessionValue -func AddSessionValue(ctx context.Context, key string, value any) -``` - -Note that since the SessionValues mechanism is implemented based on Context, and Runner reinitializes the Context when running, injecting SessionValues via `AddSessionValues` or `AddSessionValue` outside of the Run method will not take effect. - -If you need to inject data into SessionValues before the Agent runs, you need to use a dedicated Option to assist with this. 
Usage is as follows: - -```go -// github.com/cloudwego/eino/adk/call_option.go -// WithSessionValues injects SessionValues before Agent runs -func WithSessionValues(v map[string]any) AgentRunOption - -// Usage: -runner := adk.NewRunner(ctx, adk.RunnerConfig{Agent: agent}) -iterator := runner.Run(ctx, []adk.Message{schema.UserMessage("xxx")}, - adk.WithSessionValues(map[string]any{ - PlanSessionKey: 123, - UserInputSessionKey: []adk.Message{schema.UserMessage("yyy")}, - }), -) -``` - -## Transfer SubAgents - -### Concept - -Transfer corresponds to the [Transfer collaboration method]. When an Agent produces an AgentEvent containing TransferAction during runtime, Eino ADK calls the Agent specified by the Action. The called Agent is called a SubAgent. - -TransferAction can be quickly created using `NewTransferToAgentAction`: - -```go -import "github.com/cloudwego/eino/adk" - -event := adk.NewTransferToAgentAction("dest agent name") -``` - -For Eino ADK to find and run the SubAgent instance upon receiving TransferAction, you need to first call `SetSubAgents` to register possible SubAgents with Eino ADK before running: - -```go -// github.com/cloudwego/eino/adk/flow.go -func SetSubAgents(ctx context.Context, agent Agent, subAgents []Agent) (Agent, error) -``` - -> 💡 -> The meaning of Transfer is to **hand over** the task to the SubAgent, not delegate or assign. Therefore: -> -> 1. Unlike ToolCall, when calling a SubAgent through Transfer, after the SubAgent finishes running, the parent Agent will not be called again to summarize content or perform the next operation. -> 2. When calling a SubAgent, the SubAgent's input is still the original input, and the parent Agent's output serves as context for the SubAgent's reference. - -When triggering SetSubAgents, both parent and child Agents need to process to complete initialization. 
Eino ADK defines the `OnSubAgents` interface to support this functionality: - -```go -// github.com/cloudwego/eino/adk/interface.go -type OnSubAgents interface { - OnSetSubAgents(ctx context.Context, subAgents []Agent) error - OnSetAsSubAgent(ctx context.Context, parent Agent) error - OnDisallowTransferToParent(ctx context.Context) error -} -``` - -If an Agent implements the `OnSubAgents` interface, `SetSubAgents` will call the corresponding methods to register with the Agent. For example, `ChatModelAgent`'s implementation. - -### Example - -Below we demonstrate the Transfer capability with a multi-functional conversation Agent. The goal is to build an Agent that can query weather or chat with users. The Agent structure is as follows: - - - -All three Agents are implemented using ChatModelAgent: +- The parent Agent retains control and can continue reasoning based on sub-Agent results +- The sub-Agent receives an independent task description and does not inherit the parent Agent's full conversation history +- Multiple sub-Agents can be called in parallel ```go import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" ) -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -type GetWeatherInput struct { - City string `json:"city"` -} - -func NewWeatherAgent() adk.Agent { - weatherTool, err := utils.InferTool( - "get_weather", - "Gets the current weather for a specific city.", - func(ctx context.Context, input *GetWeatherInput) (string, error) { - return 
fmt.Sprintf(`the temperature in %s is 25°C`, input.City), nil - }, - ) - if err != nil { - log.Fatal(err) - } - - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WeatherAgent", - Description: "This agent can get the current weather for a given city.", - Instruction: "Your sole purpose is to get the current weather for a given city by using the 'get_weather' tool. After calling the tool, report the result directly to the user.", - Model: newChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{weatherTool}, - }, - }, - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewChatAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ChatAgent", - Description: "A general-purpose agent for handling conversational chat.", - Instruction: "You are a friendly conversational assistant. Your role is to handle general chit-chat and answer questions that are not related to any specific tool-based tasks.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewRouterAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "RouterAgent", - Description: "A manual router that transfers tasks to other expert agents.", - Instruction: `You are an intelligent task router. Your responsibility is to analyze the user's request and delegate it to the most appropriate expert agent.If no Agent can handle the task, simply inform the user it cannot be processed.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} -``` - -Then use Eino ADK's Transfer capability to build a Multi-Agent and run it. ChatModelAgent implements the OnSubAgent interface. 
In the adk.SetSubAgents method, this interface is used to register parent/child Agents with ChatModelAgent, without requiring users to handle TransferAction generation: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" -) - -func main() { - weatherAgent := NewWeatherAgent() - chatAgent := NewChatAgent() - routerAgent := NewRouterAgent() - - ctx := context.Background() - a, err := adk.SetSubAgents(ctx, routerAgent, []adk.Agent{chatAgent, weatherAgent}) - if err != nil { - log.Fatal(err) - } - - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - - // query weather - println("\n\n>>>>>>>>>query weather<<<<<<<<<") - iter := runner.Query(ctx, "What's the weather in Beijing?") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } - - // failed to route - println("\n\n>>>>>>>>>failed to route<<<<<<<<<") - iter = runner.Query(ctx, "Book me a flight from New York to London tomorrow.") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } -} -``` - -Running result: - -```yaml ->>>>>>>>>query weather<<<<<<<<< -Agent[RouterAgent]: -assistant: -tool_calls: -{Index: ID:call_SKNsPwKCTdp1oHxSlAFt8sO6 Type:function Function:{Name:transfer_to_agent Arguments:{"agent_name":"WeatherAgent"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{201 17 218} -====== -Agent[RouterAgent]: transfer to WeatherAgent 
-====== -Agent[WeatherAgent]: -assistant: -tool_calls: -{Index: ID:call_QMBdUwKj84hKDAwMMX1gOiES Type:function Function:{Name:get_weather Arguments:{"city":"Beijing"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{255 15 270} -====== -Agent[WeatherAgent]: -tool: the temperature in Beijing is 25°C -tool_call_id: call_QMBdUwKj84hKDAwMMX1gOiES -tool_call_name: get_weather -====== -Agent[WeatherAgent]: -assistant: The current temperature in Beijing is 25°C. -finish_reason: stop -usage: &{286 11 297} -====== - ->>>>>>>>>failed to route<<<<<<<<< -Agent[RouterAgent]: -assistant: I'm unable to assist with booking flights. Please use a relevant travel service or booking platform to make your reservation. -finish_reason: stop -usage: &{206 23 229} -====== -``` - -The other two methods of OnSubAgents are called when an Agent acts as a SubAgent in SetSubAgents: - -- OnSetAsSubAgent is used to register parent Agent information with the Agent -- OnDisallowTransferToParent is called when the Agent sets the WithDisallowTransferToParent option, to inform the Agent not to produce TransferAction to the parent Agent. 
- -```go -adk.SetSubAgents( - ctx, - Agent1, - []adk.Agent{ - adk.AgentWithOptions(ctx, Agent2, adk.WithDisallowTransferToParent()), +// Create sub-Agent +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "Search and summarize relevant information", + Instruction: "You are a research assistant...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool}, + }, }, -) +}) + +// Wrap as Tool +agentTool := adk.NewAgentTool(ctx, subAgent) + +// Parent Agent registers sub-Agent Tool +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "coordinator", + Description: "Main Agent that coordinates tasks", + Instruction: "You are a task coordinator...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) ``` -### Static Transfer Configuration - -AgentWithDeterministicTransferTo is an Agent Wrapper that generates a preset TransferAction after the original Agent executes, enabling static configuration of Agent jumping: - -```go -// github.com/cloudwego/eino/adk/flow.go +### AgentTool Options -type DeterministicTransferConfig struct { - Agent Agent - ToAgentNames []string -} - -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent -``` - -In Supervisor mode, after a SubAgent finishes execution, it always returns to the Supervisor, which generates the next task objective. AgentWithDeterministicTransferTo can be used here: + + + + +
| Option | Description |
| --- | --- |
| `WithFullChatHistoryAsInput()` | Pass the parent Agent's full conversation history as sub-Agent input (by default only the model-generated request parameters are passed) |
| `WithAgentInputSchema(schema)` | Customize the sub-Agent's input schema |
    - +### Event Stream Pass-through -```go -// github.com/cloudwego/eino/adk/prebuilt/supervisor.go +When `ToolsConfig.EmitInternalEvents = true`, the sub-Agent's events are passed through in real-time to the parent Agent's event stream, allowing end users to see the sub-Agent's intermediate process. -type SupervisorConfig struct { - Supervisor adk.Agent - SubAgents []adk.Agent -} +> 💡 +> Pass-through events do not affect the parent Agent's state or checkpoint, and are only for user display. The only exception is the Interrupted action, which propagates across boundaries via CompositeInterrupt to support interrupt recovery. -func NewSupervisor(ctx context.Context, conf *SupervisorConfig) (adk.Agent, error) { - subAgents := make([]adk.Agent, 0, len(conf.SubAgents)) - supervisorName := conf.Supervisor.Name(ctx) - for _, subAgent := range conf.SubAgents { - subAgents = append(subAgents, adk.AgentWithDeterministicTransferTo(ctx, &adk.DeterministicTransferConfig{ - Agent: subAgent, - ToAgentNames: []string{supervisorName}, - })) - } +### Pre-built Example: DeepAgents - return adk.SetSubAgents(ctx, conf.Supervisor, subAgents) -} -``` +[DeepAgents](/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) is a best practice of the AgentAsTool pattern: the main Agent delegates subtasks to sub-Agents via **TaskTool**, combined with **WriteTodos** for task planning and progress tracking. ## Workflow Agents -WorkflowAgent supports running Agents according to workflows preset in code. Eino ADK provides three basic Workflow Agents: Sequential, Parallel, and Loop. They can be nested within each other to complete more complex tasks. - -By default, the input for each Agent in a Workflow is generated using the method described in the History section. You can customize the AgentInput generation method using WithHistoryRewriter. 
- -When an Agent produces an ExitAction Event, the Workflow Agent will immediately exit, regardless of whether there are other Agents that need to run afterward. - -For detailed explanations and use case references, see: [Eino ADK: Workflow Agents](/docs/eino/core_modules/eino_adk/agent_implementation/workflow) - -### SequentialAgent - -SequentialAgent executes a series of Agents in the order you provide: - - - -```go -type SequentialAgentConfig struct { - Name string - Description string - SubAgents []Agent -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -### LoopAgent - -LoopAgent is implemented based on SequentialAgent. After SequentialAgent completes, it runs from the beginning again: - - - -```go -type LoopAgentConfig struct { - Name string - Description string - SubAgents []Agent - - MaxIterations int // Maximum number of loop iterations -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -### ParallelAgent - -ParallelAgent runs multiple Agents concurrently: +Deterministic orchestration for multi-step tasks with fixed processes: - + + + + + +
| Type | Description | Constructor |
| --- | --- | --- |
| Sequential | Executes sub-Agents sequentially in array order | `adk.NewSequentialAgent` |
| Parallel | Executes all sub-Agents concurrently, finishes when all complete | `adk.NewParallelAgent` |
| Loop | Loops execution of sub-Agent sequence until BreakLoop or MaxIterations exceeded | `adk.NewLoopAgent` |
    -```go -type ParallelAgentConfig struct { - Name string - Description string - SubAgents []Agent -} +Context is passed between Workflow Agents via Transfer: the upstream Agent's output is automatically appended to the downstream Agent's input Messages. -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` +# Context Passing -## AgentAsTool +## SessionValues -When running an Agent requires only clear and explicit instructions rather than a complete running context (History), the Agent can be converted to a Tool for invocation: +Global KV store across Agents, concurrently safe for read/write by any Agent within a single run: ```go -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool +// Read/write API +adk.AddSessionValue(ctx, "key", value) +val, ok := adk.GetSessionValue(ctx, "key") +adk.AddSessionValues(ctx, map[string]any{"k1": v1, "k2": v2}) +all := adk.GetSessionValues(ctx) ``` -After converting to a Tool, the Agent can be called by ChatModels that support function calling, and can also be called by all LLM-driven Agents. The calling method depends on the Agent implementation. - -Message history isolation: An Agent as a Tool does not inherit the message history (History) of the parent Agent. - -SessionValues sharing: However, it shares the SessionValues of the parent Agent, i.e., reads and writes the same KV map. - -Internal event exposure: An Agent as a Tool is still an Agent and produces AgentEvents. By default, these internal AgentEvents are not exposed through the `AsyncIterator` returned by `Runner`. In some business scenarios, if you need to expose the internal AgentTool's AgentEvents to users, you need to add configuration in the parent `ChatModelAgent`'s `ToolsConfig` to enable internal event exposure: +> 💡 +> SessionValues are implemented based on Context, and Runner reinitializes the Context at runtime. 
To inject data before running, use the `WithSessionValues` Option: ```go -// from adk/chatmodel.go - -type ToolsConfig struct { - // other configurations... - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} +iter := runner.Run(ctx, messages, + adk.WithSessionValues(map[string]any{ + "user_id": "123", + }), +) ``` - -These internal events will not enter the parent agent's context (except for the last message which would enter anyway), and various AgentActions will not take effect (except InterruptAction). diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_extension.md b/content/en/docs/eino/core_modules/eino_adk/agent_extension.md index 2532cbd00ec..91877058d25 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_extension.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_extension.md @@ -1,118 +1,133 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Agent Runner and Extension' +title: Agent Runner and Extension weight: 6 --- -# Agent Runner +# Runner -## Definition +Runner is the execution entry for Agents, responsible for managing Agent lifecycle, context initialization, Checkpoint persistence, and interrupt recovery. **Any Agent should be run through Runner.** -Runner is the core engine in Eino ADK responsible for executing Agents. Its main purpose is to manage and control the entire lifecycle of Agents, such as handling multi-Agent collaboration, saving and passing context, etc. Cross-cutting capabilities like interrupt, callback, etc. all rely on Runner for implementation. Any Agent should be run through Runner. +## Basic Usage -## Interrupt & Resume - -Agent Runner provides runtime interrupt and resume functionality. 
This allows a running Agent to proactively interrupt its execution and save the current state, supporting resumption from the interrupt point. This functionality is commonly used in scenarios where the Agent processing flow requires external input, long waits, or pausable operations. - -Below we introduce three key points in an interrupt-to-resume process: +```go +import "github.com/cloudwego/eino/adk" + +// Create Runner +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // Optional, required for interrupt recovery +}) + +// Method 1: Query — directly send a user question +iter := runner.Query(ctx, "Help me search today's news") + +// Method 2: Run — pass in complete Messages +iter := runner.Run(ctx, []*schema.Message{ + schema.UserMessage("Hello"), +}, adk.WithSessionValues(map[string]any{"user": "alice"})) + +// Consume event stream +for { + event, ok := iter.Next() + if !ok { + break + } + // handle event +} +``` -1. Interrupted Action: Thrown by the Agent as an interrupt event, intercepted by Agent Runner -2. Checkpoint: Agent Runner intercepts the event and saves the current running state -3. Resume: After running conditions are ready again, Agent Runner resumes running from the checkpoint +## Generic Support -### Interrupted Action +```go +type TypedRunner[M MessageType] struct { ... } +type Runner = TypedRunner[*schema.Message] -During the Agent's execution, you can proactively interrupt the Runner's operation by producing an AgentEvent containing an Interrupted Action. +func NewTypedRunner[M MessageType](conf TypedRunnerConfig[M]) *TypedRunner[M] +``` -When the Event's Interrupted is not empty, the Agent Runner considers an interrupt to have occurred: +The `*schema.AgenticMessage` path uses `NewTypedRunner` for construction. 
-```go -// github.com/cloudwego/eino/adk/interface.go -type AgentAction struct { - // other actions - Interrupted *InterruptInfo - // other actions -} +## Interrupt & Resume -// github.com/cloudwego/eino/adk/interrupt.go -type InterruptInfo struct { - Data any -} -``` +An Agent can proactively interrupt during execution. Runner automatically saves state (requires `CheckPointStore` configuration), and can resume from the breakpoint later. -When an interrupt occurs, you can attach custom interrupt information through the InterruptInfo structure. This information: +### Interrupt -1. Will be passed to the caller, which can be used to explain the reason for the interrupt, etc. -2. If the Agent run needs to be resumed later, the InterruptInfo will be re-passed to the interrupted Agent upon resumption, and the Agent can use this information to resume running +An Agent triggers an interrupt by producing an event containing `Interrupted`: ```go -// For example, when ChatModelAgent interrupts, it sends the following AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, +gen.Send(&adk.AgentEvent{ + Action: &adk.AgentAction{ + Interrupted: &adk.InterruptInfo{Data: myData}, }, -}}) +}) ``` -### State Persistence (Checkpoint) - -When Runner captures this Event with Interrupted Action, it immediately terminates the current execution flow. If: +### State Persistence -1. 
CheckPointStore is set in Runner +After Runner captures the interrupt, it stores the running state (input, conversation history, InterruptInfo) into `CheckPointStore` keyed by CheckPointID: ```go -// github.com/cloudwego/eino/adk/runner.go -type RunnerConfig struct { - // other fields - CheckPointStore CheckPointStore -} - -// github.com/cloudwego/eino/adk/interrupt.go type CheckPointStore interface { Set(ctx context.Context, key string, value []byte) error Get(ctx context.Context, key string) ([]byte, bool, error) } ``` -1. CheckPointID is passed via AgentRunOption WithCheckPointID when calling Runner +Pass the CheckPointID via Option when calling: ```go -// github.com/cloudwego/eino/adk/interrupt.go -func WithCheckPointID(id string) AgentRunOption +iter := runner.Run(ctx, messages, adk.WithCheckPointID("cp-123")) ``` -After terminating running, Runner persists the current running state (original input, conversation history, etc.) and the InterruptInfo thrown by the Agent to CheckPointStore using CheckPointID as the key. - > 💡 -> To preserve the original types of data in interfaces, Eino ADK uses gob ([https://pkg.go.dev/encoding/gob](https://pkg.go.dev/encoding/gob)) to serialize running state. Therefore, when using custom types, you need to register the types in advance using gob.Register or gob.RegisterName (the latter is more recommended; the former uses path plus type name as the default name, so both the type's location and name cannot change). Eino automatically registers types built into the framework. +> ADK uses gob to serialize running state. Custom types need to be registered in advance via gob.RegisterName. Framework built-in types are automatically registered. 
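The gob note above can be illustrated with the standard library alone. The following is a minimal sketch, not Eino's own serialization code: `ApprovalRequest`, `envelope`, and the wire name `"example.ApprovalRequest"` are hypothetical names chosen for illustration, and only `gob.RegisterName` and the encoder/decoder calls come from `encoding/gob`. It shows why a custom type carried inside an interface value has to be registered before it can be round-tripped.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// ApprovalRequest is a hypothetical custom payload an Agent might attach as
// interrupt data. It is not a type from Eino.
type ApprovalRequest struct {
	Tool   string
	Reason string
}

// envelope stands in for saved state that holds user data behind an interface
// value, which is the case where gob needs the concrete type registered.
type envelope struct {
	Data any
}

func init() {
	// RegisterName fixes the wire name explicitly, so moving or renaming the
	// Go type later does not change the name stored in existing checkpoints.
	gob.RegisterName("example.ApprovalRequest", &ApprovalRequest{})
}

// roundTrip encodes an envelope with gob and decodes it again.
func roundTrip(in envelope) (envelope, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return envelope{}, err
	}
	var out envelope
	if err := gob.NewDecoder(&buf).Decode(&out); err != nil {
		return envelope{}, err
	}
	return out, nil
}

func main() {
	out, err := roundTrip(envelope{Data: &ApprovalRequest{
		Tool:   "delete_file",
		Reason: "needs confirmation",
	}})
	if err != nil {
		panic(err)
	}
	// The concrete pointer type survives the round trip because it was registered.
	req, ok := out.Data.(*ApprovalRequest)
	fmt.Println(ok, req.Tool)
}
```

Without the `init` registration, the encode step would fail because gob cannot name the concrete type stored in the `Data` interface field.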
### Resume -When running is interrupted, calling Runner's Resume interface with the CheckPointID from the interrupt can resume running: - ```go -// github.com/cloudwego/eino/adk/runner.go -func (r *Runner) Resume(ctx context.Context, checkPointID string, opts ...AgentRunOption) (*AsyncIterator[*AgentEvent], error) +// Simple resume: implicitly resumes all interrupt points +iter, err := runner.Resume(ctx, "cp-123") + +// Precise resume: specify target and data +iter, err := runner.ResumeWithParams(ctx, "cp-123", &adk.ResumeParams{ + Targets: map[string]any{ + "agent-address": resumeData, + }, +}) ``` -Resuming Agent running requires the interrupted Agent to implement the ResumableAgent interface. Runner reads the running state from CheckPointerStore and resumes running, where the InterruptInfo and the EnableStreaming configured in the previous run are provided as input to the Agent: +Resume requires the interrupted Agent to implement the `ResumableAgent` interface: ```go -// github.com/cloudwego/eino/adk/interface.go -type ResumableAgent interface { - Agent - - Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] -} - -// github.com/cloudwego/eino/adk/interrupt.go -type ResumeInfo struct { - EnableStreaming bool - *InterruptInfo +type TypedResumableAgent[M MessageType] interface { + TypedAgent[M] + Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] } ``` -To pass new information to the Agent during Resume, you can define an AgentRunOption and pass it when calling Runner.Resume. 
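Both interrupt and resume go through the `CheckPointStore` interface quoted earlier (`Set` and `Get` on byte slices keyed by CheckPointID). As a rough sketch for local experiments, a store with that method set can be backed by a map. `memoryStore`, `newMemoryStore`, and `saveAndLoad` are hypothetical names, the type is deliberately declared without importing Eino, and a production store would more likely persist to a database or object storage.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// memoryStore is a hypothetical in-memory store exposing the same Set/Get
// method set as the CheckPointStore interface shown in this document.
type memoryStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func newMemoryStore() *memoryStore {
	return &memoryStore{data: make(map[string][]byte)}
}

// Set stores a copy of value so later mutation by the caller cannot alter
// the saved checkpoint.
func (s *memoryStore) Set(ctx context.Context, key string, value []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = append([]byte(nil), value...)
	return nil
}

// Get returns a copy of the stored bytes and reports whether the key exists.
func (s *memoryStore) Get(ctx context.Context, key string) ([]byte, bool, error) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	if !ok {
		return nil, false, nil
	}
	return append([]byte(nil), v...), true, nil
}

// saveAndLoad writes state under id and reads it back, mirroring the write on
// interrupt and the read on resume.
func saveAndLoad(store *memoryStore, id string, state []byte) ([]byte, bool, error) {
	ctx := context.Background()
	if err := store.Set(ctx, id, state); err != nil {
		return nil, false, err
	}
	return store.Get(ctx, id)
}

func main() {
	store := newMemoryStore()
	got, ok, err := saveAndLoad(store, "cp-123", []byte("serialized state"))
	if err != nil {
		panic(err)
	}
	fmt.Println(ok, string(got))

	// A key that was never written reports ok == false rather than an error.
	_, ok, _ = store.Get(context.Background(), "missing")
	fmt.Println(ok)
}
```

An in-memory store loses all checkpoints when the process exits, so it only suits tests and demos where interrupt and resume happen in the same process.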
+# Multi-Turn Runtime: TurnLoop + +For scenarios requiring multi-turn interaction (chat applications, continuous conversations), ADK provides the `TurnLoop` runtime: + +- **Push-based event loop**: Push new messages to trigger Agent execution +- **Preempt**: When a user sends a new message while the Agent is running, it can cancel the current run +- **Stop**: Stop the event loop +- **Declarative Checkpoint/Resume**: TurnLoop automatically manages input bookkeeping; the application layer only needs to declare the recovery strategy + +See: [Agent Cancel and TurnLoop Quickstart](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +# Agent Cancel + +Runtime cancellation capability added in v0.9, supporting: + +- **CancelMode bitmask combination**: `CancelModelStream | CancelToolCalls` +- **CancelHandle.Wait()**: Wait for cancellation to complete +- **Integration with TurnLoop**: Automatically triggers Cancel on Preempt + +See: [Agent Cancel and TurnLoop Quickstart](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md deleted file mode 100644 index bfd09a76eb8..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md +++ /dev/null @@ -1,897 +0,0 @@ ---- -Description: "" -date: "2026-03-16" -lastmod: "" -tags: [] -title: 'Eino ADK: ChatModelAgent' -weight: 1 ---- - -# ChatModelAgent Overview - -## Import Path - -`import "github.com/cloudwego/eino/adk"` - -## What is ChatModelAgent - -`ChatModelAgent` is a core prebuilt Agent in Eino ADK that encapsulates the complex logic of interacting with Large Language Models (LLMs) and supports using tools to complete tasks. 
- -## ChatModelAgent ReAct Pattern - -`ChatModelAgent` uses the [ReAct](https://react-lm.github.io/) pattern internally, which is designed to solve complex problems by having the ChatModel perform explicit, step-by-step "thinking". After configuring tools for `ChatModelAgent`, its internal execution flow follows the ReAct pattern: - -- Call ChatModel (Reason) -- LLM returns tool call request (Action) -- ChatModelAgent executes tool (Act) -- It returns the tool result to ChatModel (Observation), then starts a new cycle until ChatModel determines no Tool call is needed and ends. - -When no tools are configured, `ChatModelAgent` degrades to a single ChatModel call. - - - -You can configure Tools for ChatModelAgent through ToolsConfig: - -```go -// github.com/cloudwego/eino/adk/chatmodel.go - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. - ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} -``` - -ToolsConfig reuses Eino Graph ToolsNodeConfig, see [Eino: ToolsNode & Tool Usage Guide](/docs/eino/core_modules/components/tools_node_guide) for details. Additionally, it provides the ReturnDirectly configuration. ChatModelAgent will exit directly after calling a Tool configured in ReturnDirectly. - -## ChatModelAgent Configuration Fields - -> 💡 -> Note: GenModelInput by default renders the Instruction in F-String format using adk.GetSessionValues(). To disable this behavior, customize the GenModelInput method. - -```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. 
- // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int - - // ModelRetryConfig configures retry behavior for the ChatModel. - // When set, the agent will automatically retry failed ChatModel calls - // based on the configured policy. - // Optional. If nil, no retry will be performed. - ModelRetryConfig *ModelRetryConfig -} - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. 
- ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} - -type GenModelInput func(ctx context.Context, instruction string, input *AgentInput) ([]Message, error) -``` - -- `Name`: Agent name -- `Description`: Agent description -- `Instruction`: System Prompt when calling ChatModel, supports f-string rendering -- `Model`: ChatModel used for running, must support tool calling -- `ToolsConfig`: Tool configuration - - ToolsConfig reuses Eino Graph ToolsNodeConfig, see [Eino: ToolsNode & Tool Usage Guide](/docs/eino/core_modules/components/tools_node_guide) for details. - - ReturnDirectly: When ChatModelAgent calls a Tool configured in ReturnDirectly, it will immediately exit with the result, without returning to ChatModel per the react pattern. If multiple Tools are hit, only the first Tool is returned. Map key is the Tool name. - - EmitInternalEvents: When using adk.AgentTool() to treat an Agent as a SubAgent through ToolCall, by default, this SubAgent will not send AgentEvents, only returning the final result as ToolResult. -- `GenModelInput`: When the Agent is called, it uses this method to convert `Instruction` and `AgentInput` into Messages for calling ChatModel. The Agent provides a default GenModelInput method: - 1. Add `Instruction` as `System Message` before `AgentInput.Messages` - 2. Render `SessionValues` as variables into the message list from step 1 - -> 💡 -> The default `GenModelInput` uses pyfmt rendering. Text in the message list is treated as a pyfmt template, meaning '{' and '}' in the text are treated as keywords. If you want to input these two characters directly, they need to be escaped as '{{' and '}}'. 
- -- `OutputKey`: When configured, the last Message produced by ChatModelAgent running will be set in `SessionValues` with `OutputKey` as the key -- `MaxIterations`: Maximum number of ChatModel generations in react mode. Agent will exit with error when exceeded. Default value is 20 -- `Exit`: Exit is a special Tool. When the model calls this tool and executes it, ChatModelAgent will exit directly, with an effect similar to `ToolsConfig.ReturnDirectly`. ADK provides a default ExitTool implementation for users: - -```go -type ExitTool struct{} - -func (et ExitTool) Info(_ context.Context) (*schema.ToolInfo, error) { - return ToolInfoExit, nil -} - -func (et ExitTool) InvokableRun(ctx context.Context, argumentsInJSON string, _ ...tool.Option) (string, error) { - type exitParams struct { - FinalResult string `json:"final_result"` - } - - params := &exitParams{} - err := sonic.UnmarshalString(argumentsInJSON, params) - if err != nil { - return "", err - } - - err = SendToolGenAction(ctx, "exit", NewExitAction()) - if err != nil { - return "", err - } - - return params.FinalResult, nil -} -``` - -- `ModelRetryConfig`: When configured, various errors during ChatModel request (including direct errors and errors during streaming response) will be retried according to the configured policy. If an error occurs during streaming response, the streaming response will still be returned through AgentEvent immediately. If the error during streaming response will be retried according to the configured policy, consuming the message stream in AgentEvent will get `WillRetryError`. Users can handle this error for corresponding display processing. 
Example: - -```go -iterator := agent.Run(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - - if event.Err != nil { - handleFinalError(event.Err) - break - } - - // Process streaming output - if event.Output != nil && event.Output.MessageOutput.IsStreaming { - stream := event.Output.MessageOutput.MessageStream - for { - msg, err := stream.Recv() - if err == io.EOF { - break // Stream completed successfully - } - if err != nil { - // Check if this error will be retried (more streams coming) - var willRetry *adk.WillRetryError - if errors.As(err, &willRetry) { - log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt) - break // Wait for next event with new stream - } - // Original error - won't retry, agent will stop and the next AgentEvent probably will be an error - log.Printf("Final error (no retry): %v", err) - break - } - // Display chunk to user - displayChunk(msg) - } - } -} -``` - -## ChatModelAgent Transfer - -`ChatModelAgent` supports converting other Agents' meta information into its own Tools, achieving dynamic Transfer through ChatModel judgment: - -- `ChatModelAgent` implements the `OnSubAgents` interface. After using `SetSubAgents` to set sub Agents for `ChatModelAgent`, `ChatModelAgent` will add a `Transfer Tool` and instruct ChatModel in the prompt to call this Tool when transfer is needed, using the transfer target AgentName as Tool input. 
- -```go -const ( - TransferToAgentInstruction = `Available other agents: %s - -Decision rule: -- If you're best suited for the question according to your description: ANSWER -- If another agent is better according its description: CALL '%s' function with their agent name - -When transferring: OUTPUT ONLY THE FUNCTION CALL` -) - -func genTransferToAgentInstruction(ctx context.Context, agents []Agent) string { - var sb strings.Builder - for _, agent := range agents { - sb.WriteString(fmt.Sprintf("\n- Agent name: %s\n Agent description: %s", - agent.Name(ctx), agent.Description(ctx))) - } - - return fmt.Sprintf(TransferToAgentInstruction, sb.String(), TransferToAgentToolName) -} -``` - -- `Transfer Tool` running sets a Transfer Event, specifying the jump to the target Agent, and ChatModelAgent exits after completion. -- Agent Runner receives the Transfer Event and jumps to the target Agent for execution, completing the Transfer operation - -## ChatModelAgent AgentAsTool - -When the Agent being called doesn't need a complete running context but only clear and explicit input parameters to run correctly, the Agent can be converted to a Tool for `ChatModelAgent` to judge and call: - -- ADK provides utility methods to conveniently convert Eino ADK Agents to Tools for ChatModelAgent to call: - -```go -// github.com/cloudwego/eino/adk/agent_tool.go - -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool -``` - -- Agents converted to Tools can be registered directly in ChatModelAgent through `ToolsConfig` - -```go -bookRecommender := NewBookRecommendAgent() -bookRecommendeTool := NewAgentTool(ctx, bookRecommender) - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... 
- ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{bookRecommendeTool}, - }, - }, -}) -``` - -## ChatModelAgent Middleware - -`ChatModelAgentMiddleware` is an extension mechanism for `ChatModelAgent` that allows developers to inject custom logic at various stages of Agent execution: - - - -`ChatModelAgentMiddleware` is defined as an interface. Developers can implement this interface and configure it in `ChatModelAgentConfig` to make it effective in `ChatModelAgent`: - -```go -type ChatModelAgentMiddleware interface { - // ... -} - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... - Handlers: []adk.ChatModelAgentMiddleware{ - &MyMiddleware{}, - }, -}) -``` - -**Using BaseChatModelAgentMiddleware** - -`BaseChatModelAgentMiddleware` provides default empty implementations for all methods. By embedding it, you can override only the methods you need: - -```go -type MyMiddleware struct { - *adk.BaseChatModelAgentMiddleware - // Custom fields - logger *log.Logger -} - -// Only override the methods you need -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - m.logger.Printf("Messages count: %d", len(state.Messages)) - return ctx, state, nil -} -``` - -### BeforeAgent - -Called before each Agent run, can be used to modify instructions and tool configuration. ChatModelAgentContext defines the content that can be read and written in BeforeAgent: - -```go -type ChatModelAgentContext struct { - // Instruction is the current Agent's instruction - Instruction string - // Tools is the current configured original tool list - Tools []tool.BaseTool - // ReturnDirectly configures tool name sets that return directly after being called - ReturnDirectly map[string]bool -} - -type ChatModelAgentMiddleware interface { - // ... 
- BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // ... -} -``` - -Example: - -```go -func (m *MyMiddleware) BeforeAgent( - ctx context.Context, - runCtx *adk.ChatModelAgentContext, -) (context.Context, *adk.ChatModelAgentContext, error) { - // Copy runCtx to avoid modifying input - nRunCtx := *runCtx - - // Modify instruction - nRunCtx.Instruction += "\n\nPlease always reply in Chinese." - - // Add tool - nRunCtx.Tools = append(runCtx.Tools, myCustomTool) - - // Set tool to return directly - nRunCtx.ReturnDirectly["my_tool"] = true - - return ctx, &nRunCtx, nil -} -``` - -### BeforeModelRewriteState / AfterModelRewriteState - -Called before/after each model call, can be used to inspect and modify message history. ModelContext defines read-only content, ChatModelAgentState defines read-write content: - -```go -type ModelContext struct { - // Tools contains the list of tools currently configured for the Agent - // Populated at request time, contains tool info that will be sent to the model - Tools []*schema.ToolInfo - - // ModelRetryConfig contains the retry configuration for the model - // Populated from Agent's ModelRetryConfig - ModelRetryConfig *ModelRetryConfig -} - -type ChatModelAgentState struct { - // Messages contains all messages in the current session - Messages []Message -} - -type ChatModelAgentMiddleware interface { - BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // Copy state to avoid modifying input - nState := *state - - // Check message 
history - if len(state.Messages) > 50 { - // Truncate old messages - nState.Messages = state.Messages[len(state.Messages)-50:] - } - return ctx, &nState, nil -} - -func (m *MyMiddleware) AfterModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // Model response is the last message - lastMsg := state.Messages[len(state.Messages)-1] - m.logger.Printf("Model response: %s", lastMsg.Content) - return ctx, state, nil -} -``` - -### WrapModel - -Wraps model calls, can be used to intercept and modify model input and output: - -```go -type ChatModelAgentMiddleware interface { - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) WrapModel( - ctx context.Context, - chatModel model.BaseChatModel, - mc *adk.ModelContext, -) (model.BaseChatModel, error) { - return &loggingModel{ - inner: chatModel, - logger: m.logger, - }, nil -} - -type loggingModel struct { - inner model.BaseChatModel - logger *log.Logger -} - -func (m *loggingModel) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - m.logger.Printf("Input messages: %d", len(msgs)) - resp, err := m.inner.Generate(ctx, msgs, opts...) - m.logger.Printf("Output: %v, error: %v", resp != nil, err) - return resp, err -} - -func (m *loggingModel) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return m.inner.Stream(ctx, msgs, opts...) -} -``` - -### WrapInvokableToolCall / WrapStreamableToolCall - -Wraps tool calls, can be used to intercept and modify tool input and output: - -```go -// InvokableToolCallEndpoint is the function signature for tool calls. -// Middleware developers add custom logic around this Endpoint. 
-type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) - -// StreamableToolCallEndpoint is the function signature for streaming tool calls. -// Middleware developers add custom logic around this Endpoint. -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - -type ToolContext struct { - // Name indicates the name of the tool being called - Name string - // CallID indicates the ToolCallID of this tool call - CallID string -} - -type ChatModelAgentMiddleware interface { - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) - WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) -} -``` - -Example: - -```go -func (m *MyMiddleware) WrapInvokableToolCall( - ctx context.Context, - endpoint adk.InvokableToolCallEndpoint, - tCtx *adk.ToolContext, -) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - m.logger.Printf("Calling tool: %s (ID: %s)", tCtx.Name, tCtx.CallID) - start := time.Now() - - result, err := endpoint(ctx, argumentsInJSON, opts...) - - m.logger.Printf("Tool %s completed in %v", tCtx.Name, time.Since(start)) - return result, err - }, nil -} -``` - -# ChatModelAgent Usage Example - -## Scenario Description - -Create a book recommendation Agent that can recommend relevant books based on user input. - -## Code Implementation - -### Step 1: Define Tools - -The book recommendation Agent needs a `book_search` tool that can search for books based on user requirements (genre, rating, etc.). 
- -Using utility methods provided by Eino makes it easy to create (see [How to create a tool?](/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool)): - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" -) - -type BookSearchInput struct { - Genre string `json:"genre" jsonschema:"description=Preferred book genre,enum=fiction,enum=sci-fi,enum=mystery,enum=biography,enum=business"` - MaxPages int `json:"max_pages" jsonschema:"description=Maximum page length (0 for no limit)"` - MinRating int `json:"min_rating" jsonschema:"description=Minimum user rating (0-5 scale)"` -} - -type BookSearchOutput struct { - Books []string -} - -func NewBookRecommender() tool.InvokableTool { - bookSearchTool, err := utils.InferTool("search_book", "Search books based on user preferences", func(ctx context.Context, input *BookSearchInput) (output *BookSearchOutput, err error) { - // search code - // ... - return &BookSearchOutput{Books: []string{"God's blessing on this wonderful world!"}}, nil - }) - if err != nil { - log.Fatalf("failed to create search book tool: %v", err) - } - return bookSearchTool -} -``` - -### Step 2: Create ChatModel - -Eino provides various ChatModel wrappers (such as openai, gemini, doubao, etc., see [Eino: ChatModel Usage Guide](/docs/eino/core_modules/components/chat_model_guide) for details). 
Here we use openai ChatModel as an example: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/components/model" -) - -func NewChatModel() model.ToolCallingChatModel { - ctx := context.Background() - apiKey := os.Getenv("OPENAI_API_KEY") - openaiModel := os.Getenv("OPENAI_MODEL") - - cm, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: apiKey, - Model: openaiModel, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - return cm -} -``` - -### Step 3: Create ChatModelAgent - -In addition to configuring ChatModel and tools, you need to configure Name and Description describing the Agent's function and purpose, as well as the Instruction that instructs the ChatModel. The Instruction will ultimately be passed to ChatModel as a system message. - -```go -import ( - "context" - "fmt" - "log" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/compose" -) - -func NewBookRecommendAgent() adk.Agent { - ctx := context.Background() - - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "BookRecommender", - Description: "An agent that can recommend books", - Instruction: `You are an expert book recommender. Based on the user's request, use the "search_book" tool to find relevant books. 
Finally, present the results to the user.`, - Model: NewChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender()}, - }, - }, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - - return a -} -``` - -### - -### Step 4: Run via Runner - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" - - "github.com/cloudwego/eino-examples/adk/intro/chatmodel/internal" -) - -func main() { - ctx := context.Background() - a := internal.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - iter := runner.Query(ctx, "recommend a fiction book to me") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======", msg) - } -} -``` - -## Running Result - -```yaml -message: -assistant: -tool_calls: -{Index: ID:call_o2It087hoqj8L7atzr70EnfG Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{140 24 164} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_o2It087hoqj8L7atzr70EnfG -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's blessing on this wonderful world!". It's a great choice for readers looking for an exciting story. Enjoy your reading! -finish_reason: stop -usage: &{185 31 216} -====== -``` - -# ChatModelAgent Interrupt and Resume - -## Introduction - -`ChatModelAgent` is implemented using Eino Graph, so it can reuse Eino Graph's Interrupt&Resume capability in the agent. - -- On Interrupt, return a special error in the tool to make the Graph trigger an interrupt and throw custom information. 
On resume, the Graph will re-run this tool: - -```go -// github.com/cloudwego/eino/adk/interrupt.go - -func NewInterruptAndRerunErr(extra any) error -``` - -- On Resume, custom ToolOptions are supported for passing additional information to the Tool during resume: - -```go -import ( - "github.com/cloudwego/eino/components/tool" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} -``` - -## Example - -Below we will build on the code from the [ChatModelAgent Usage Example] section above to add a tool `ask_for_clarification` to `BookRecommendAgent`. When the user provides insufficient information for recommendations, the Agent will call this tool to ask the user for more information. `ask_for_clarification` uses the Interrupt&Resume capability to implement "asking" the user. - -### Step 1: Add Tool Supporting Interrupt - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" - "github.com/cloudwego/eino/compose" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} - -type AskForClarificationInput struct { - Question string `json:"question" jsonschema:"description=The specific question you want to ask the user to get the missing information"` -} - -func NewAskForClarificationTool() tool.InvokableTool { - t, err := utils.InferOptionableTool( - "ask_for_clarification", - "Call this tool when the user's request is ambiguous or lacks the necessary information to proceed. 
Use it to ask a follow-up question to get the details you need, such as the book's genre, before you can use other tools effectively.", - func(ctx context.Context, input *AskForClarificationInput, opts ...tool.Option) (output string, err error) { - o := tool.GetImplSpecificOptions[askForClarificationOptions](nil, opts...) - if o.NewInput == nil { - return "", compose.NewInterruptAndRerunErr(input.Question) - } - return *o.NewInput, nil - }) - if err != nil { - log.Fatal(err) - } - return t -} -``` - -### Step 2: Add Tool to Agent - -```go -func NewBookRecommendAgent() adk.Agent { - // xxx - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // xxx - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender(), NewAskForClarificationTool()}, - }, - // Whether to output AgentEvents from SubAgent when Tool internally calls SubAgent via AgentTool() - EmitInternalEvents: true, - }, - }) - // xxx -} -``` - -### Step 3: Configure CheckPointStore in Agent Runner - -Configure `CheckPointStore` in Runner (the example uses the simplest InMemoryStore), and pass in `CheckPointID` when calling the Agent for use during resume. 
Also, on interrupt, Graph places `InterruptInfo` in `Interrupted.Data`: - -```go -func newInMemoryStore() compose.CheckPointStore { - return &inMemoryStore{ - mem: map[string][]byte{}, - } -} - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - EnableStreaming: true, // you can disable streaming here - Agent: a, - CheckPointStore: newInMemoryStore(), - }) - iter := runner.Query(ctx, "recommend a book to me", adk.WithCheckPointID("1")) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil && event.Action.Interrupted != nil { - fmt.Printf("\ninterrupt happened, info: %+v\n", event.Action.Interrupted.Data.(*adk.ChatModelAgentInterruptInfo).RerunNodesExtra["ToolNode"]) - continue - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======\n\n", msg) - } - - scanner := bufio.NewScanner(os.Stdin) - fmt.Print("\nyour input here: ") - scanner.Scan() - fmt.Println() - nInput := scanner.Text() - - iter, err := runner.Resume(ctx, "1", adk.WithToolOptions([]tool.Option{subagents.WithNewInput(nInput)})) - if err != nil { - log.Fatal(err) - } - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - prints.Event(event) - } -} -``` - -### Running Result - -An interrupt will occur after running - -``` -message: -assistant: -tool_calls: -{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{219 37 256} -====== - - -interrupt happened, info: &{ToolCalls:[{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification 
Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]}] ExecutedTools:map[] RerunTools:[call_3HAobzkJvW3JsTmSHSBRftaG] RerunExtraMap:map[call_3HAobzkJvW3JsTmSHSBRftaG:Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?]} -your input here: -``` - -After stdin input, retrieve the previous interrupt state from CheckPointStore and continue running with the completed input - -``` -new input is: -recommend me a fiction book - -message: -tool: recommend me a fiction book -tool_call_id: call_3HAobzkJvW3JsTmSHSBRftaG -tool_call_name: ask_for_clarification -====== - - -message: -assistant: -tool_calls: -{Index: ID:call_3fC5OqPZLls11epXMv7sZGAF Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{272 24 296} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_3fC5OqPZLls11epXMv7sZGAF -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's Blessing on This Wonderful World!" Enjoy your reading! -finish_reason: stop -usage: &{317 20 337} -====== -``` - -# Summary - -`ChatModelAgent` is the core Agent implementation in ADK, serving as the "thinking" part of applications. It leverages the powerful capabilities of LLMs for reasoning, understanding natural language, making decisions, generating responses, and interacting with tools. - -`ChatModelAgent`'s behavior is non-deterministic, dynamically deciding which tools to use or transferring control to other Agents through LLM. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md new file mode 100644 index 00000000000..b5fb989f1f0 --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md @@ -0,0 +1,306 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: ChatModelAgent +weight: 1 +--- + +# ChatModelAgent Overview + +`import "github.com/cloudwego/eino/adk"` + +## What is ChatModelAgent + +`ChatModelAgent` is the core Agent implementation of Eino ADK — it uses a ChatModel as the decision-maker, Tools as the action space, and autonomously drives problem-solving through a ReAct Loop. + +For a complete introduction to ChatModelAgent concepts, the ReAct Loop, and the Middleware system, see: [ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) + +## ReAct Loop + +When Tools are configured, ChatModelAgent executes in a ReAct loop: + +1. **Reason**: Call the ChatModel, which decides the next action +2. **Action**: The model returns a ToolCall request +3. **Act**: Execute the corresponding Tool +4. **Observation**: Inject the Tool result into the context and start a new loop iteration + +The loop continues until the model determines no further Tool calls are needed. Without Tools configured, it degrades to a single ChatModel invocation. + +# Configuration + +## TypedChatModelAgentConfig + +```go +type TypedChatModelAgentConfig[M MessageType] struct { + Name string + Description string + Instruction string + + Model model.BaseModel[M] // Required. 
Must support model.WithTools when using Tools + + ToolsConfig ToolsConfig + GenModelInput TypedGenModelInput[M] + + Exit tool.BaseTool // NOT RECOMMENDED + OutputKey string // NOT RECOMMENDED + MaxIterations int // Default 20 + + Handlers []TypedChatModelAgentMiddleware[M] + Middlewares []AgentMiddleware // Legacy compatibility + + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +// Default alias +type ChatModelAgentConfig = TypedChatModelAgentConfig[*schema.Message] +``` + +### Field Descriptions + + + + + + + + + + + + + + +
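<table>
<tr><td>Field</td><td>Description</td></tr>
<tr><td><code>Name</code></td><td>Agent name. Required when used as AgentTool</td></tr>
<tr><td><code>Description</code></td><td>Agent capability description. Required when used as AgentTool</td></tr>
<tr><td><code>Instruction</code></td><td>System Prompt. Supports <code>{Key}</code> placeholders; default <code>GenModelInput</code> renders using SessionValues</td></tr>
<tr><td><code>Model</code></td><td>Required. Type <code>model.BaseModel[M]</code>; must support <code>model.WithTools</code> when using Tools</td></tr>
<tr><td><code>ToolsConfig</code></td><td>Tool configuration, see below</td></tr>
<tr><td><code>GenModelInput</code></td><td>Custom input transformation. Default uses Instruction as System Message + f-string rendering</td></tr>
<tr><td><code>MaxIterations</code></td><td>Maximum ReAct loop iterations; exceeding this causes an error exit. Default 20</td></tr>
<tr><td><code>Handlers</code></td><td>Interface-style Middleware (<code>TypedChatModelAgentMiddleware[M]</code>), recommended</td></tr>
<tr><td><code>Middlewares</code></td><td>Struct-style Middleware (<code>AgentMiddleware</code>), legacy compatibility</td></tr>
<tr><td><code>ModelRetryConfig</code></td><td>Retry strategy for failed model calls</td></tr>
<tr><td><code>ModelFailoverConfig</code></td><td>Switch to backup model on failure. Requires configuring <code>GetFailoverModel</code> and <code>ShouldFailover</code></td></tr>
</table>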
    + +> 💡 +> The default GenModelInput uses pyfmt rendering. `{` and `}` in Messages are treated as placeholders. To output these characters literally, escape them with `{{` and `}}`. + +### ToolsConfig + +```go +type ToolsConfig struct { + compose.ToolsNodeConfig + + ReturnDirectly map[string]bool // Tool names that return directly after invocation + EmitInternalEvents bool // Forward AgentTool internal events +} +``` + +- **ReturnDirectly**: When a matching Tool completes execution, the Agent exits immediately without calling the model again. If multiple match, the first one is used +- **EmitInternalEvents**: When a sub-Agent is invoked via AgentTool, its events are forwarded in real-time to the parent Agent's event stream + +### Constructors + +```go +func NewChatModelAgent(ctx context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) +func NewTypedChatModelAgent[M MessageType](ctx context.Context, config *TypedChatModelAgentConfig[M]) (*TypedChatModelAgent[M], error) +``` + +# Middleware (ChatModelAgentMiddleware) + +## Interface Definition + +```go +type TypedChatModelAgentMiddleware[M MessageType] interface { + BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + AfterAgent(ctx context.Context, state *TypedChatModelAgentState[M]) (context.Context, error) + + BeforeModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + AfterModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + + WrapModel(ctx context.Context, m model.BaseModel[M], mc *TypedModelContext[M]) (model.BaseModel[M], error) + + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + WrapStreamableToolCall(ctx context.Context, endpoint 
StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) +} + +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +``` + +Embed `*BaseChatModelAgentMiddleware` to only override the methods you need: + +```go +type MyMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *MyMiddleware) BeforeModelRewriteState( + ctx context.Context, + state *adk.ChatModelAgentState, + mc *adk.ModelContext, +) (context.Context, *adk.ChatModelAgentState, error) { + // Custom logic + return ctx, state, nil +} +``` + +## Hook Points + + + + + + + + + +
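<table>
<tr><td>Hook</td><td>Timing</td><td>Modifiable Content</td></tr>
<tr><td><code>BeforeAgent</code></td><td>Before Agent runs (once only)</td><td>Instruction, Tools, ReturnDirectly, ToolSearchTool</td></tr>
<tr><td><code>AfterAgent</code></td><td>After Agent completes successfully</td><td>Read final state (no modification)</td></tr>
<tr><td><code>BeforeModelRewriteState</code></td><td>Before each model call</td><td>Messages, ToolInfos, DeferredToolInfos (persisted to state)</td></tr>
<tr><td><code>AfterModelRewriteState</code></td><td>After each model call</td><td>Messages (including model response), ToolInfos (persisted to state)</td></tr>
<tr><td><code>WrapModel</code></td><td>Wraps model invocation</td><td>Retry, failover, event emission (do not modify Messages)</td></tr>
<tr><td><code>WrapToolCall</code></td><td>Wraps tool invocation</td><td>Permission checks, logging, output rewriting</td></tr>
</table>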
    + +> 💡 +> The state returned by `BeforeModelRewriteState` is persisted by the framework to the agent's internal state. Therefore, modifications in this hook (such as compressing Messages or filtering ToolInfos) will affect all subsequent iterations. + +## Core Types + +### ChatModelAgentContext (BeforeAgent Parameter) + +```go +type ChatModelAgentContext struct { + Instruction string + Tools []tool.BaseTool + ReturnDirectly map[string]bool + ToolSearchTool *schema.ToolInfo // Model's native ToolSearch capability +} +``` + +### ChatModelAgentState (BeforeModel/AfterModel Parameter) + +```go +type TypedChatModelAgentState[M MessageType] struct { + Messages []M + ToolInfos []*schema.ToolInfo // Tool list passed to the model + DeferredToolInfos []*schema.ToolInfo // Server-side deferred retrieval tool list +} + +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +``` + +### ModelContext (WrapModel Parameter) + +```go +type TypedModelContext[M MessageType] struct { + Tools []*schema.ToolInfo // Deprecated: use state.ToolInfos + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +type ModelContext = TypedModelContext[*schema.Message] +``` + +## Execution Order + +**Model call chain** (outer to inner): + +1. `AgentMiddleware.BeforeChatModel` +2. **BeforeModelRewriteState** +3. failover wrapper (built-in) +4. retry wrapper (built-in) +5. event sender wrapper (built-in) +6. **WrapModel** (first registered = outermost) +7. callback injection (built-in) +8. Actual model call +9. **AfterModelRewriteState** +10. `AgentMiddleware.AfterChatModel` + +**Tool call chain** (outer to inner): + +1. event sender (built-in) +2. `ToolsConfig.ToolCallMiddlewares` +3. `AgentMiddleware.WrapToolCall` +4. **WrapToolCall** (first registered = outermost) +5. callback injection (built-in) +6. 
Actual tool call + +# AgentAsTool + +Wrap a sub-Agent as a Tool so the parent Agent can invoke it autonomously via ToolCall: + +```go +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "Search and summarize information", + Model: chatModel, + // ... +}) + +agentTool := adk.NewAgentTool(ctx, subAgent) + +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) +``` + +Generic version: `adk.NewTypedAgentTool[M](ctx, agent, options...)` + +Options: `WithFullChatHistoryAsInput()` (pass complete chat history), `WithAgentInputSchema(schema)` (custom input schema) + +# ModelRetry + +When configured, ChatModel calls are automatically retried on failure. When an error occurs during a streaming response, the current stream is still returned via AgentEvent, and consuming the MessageStream yields a `WillRetryError`: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... 
+ ModelRetryConfig: &adk.ModelRetryConfig{ + // Retry strategy configuration + }, +}) + +// Handle WillRetryError when consuming the event stream +stream := event.Output.MessageOutput.MessageStream +for { + msg, err := stream.Recv() + if err == io.EOF { + break + } + if err != nil { + var willRetry *adk.WillRetryError + if errors.As(err, &willRetry) { + log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt) + break // Wait for next event + } + break + } + displayChunk(msg) +} +``` + +# ModelFailover + +When configured, the agent switches to a backup model on failure: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: primaryModel, + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, // 0 means no failover + ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool { + return true // Decide whether to failover based on error type + }, + GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return backupModel, nil, nil // nil messages: use the original input + }, + }, +}) +``` + +# Cancel + +Runtime cancellation was added in v0.9. See [Agent Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) for details. 
+ +```go +cancelOpt, cancelFn := adk.WithCancel() +iter := runner.Run(ctx, messages, cancelOpt) + +// Cancel later (CancelMode supports bitmask combinations) +handle := cancelFn(adk.CancelAfterChatModel | adk.CancelAfterToolCalls) +handle.Wait() // Wait for cancellation to complete +``` diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md new file mode 100644 index 00000000000..f0c2c2b674a --- /dev/null +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md @@ -0,0 +1,173 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: ChatModel Failover Guide +weight: 1 +--- + +## Overview + +`ChatModelAgent` has built-in model failover capability: when the primary model call fails, it automatically switches to a backup model, supporting both Generate (synchronous) and Stream (streaming). Configured via `ModelFailoverConfig[M]`, it composes orthogonally with `TypedModelRetryConfig[M]` (same-model retry). + +> This document uses the default `*schema.Message` type as an example. For generic usage, replace the APIs with their `Typed`-prefixed versions and parameterize the message type as `M MessageType`. + +## Core Data Structures + +### ModelFailoverConfig[M] + +```go +type ModelFailoverConfig[M MessageType] struct { + // Maximum failover attempts. 0 means no failover; + // 1 means GetFailoverModel is called at most once. + // When lastSuccessModel exists, it is tried first before calling GetFailoverModel. + MaxRetries uint + + // Determines whether to trigger failover. When ctx.Err() != nil, stops regardless of return value. + // When combined with ModelRetryConfig, outputErr is *RetryExhaustedError; + // the original error is obtained via RetryExhaustedError.LastErr. 
+ // In streaming scenarios, outputMessage may carry partially received messages. + // This field is required when configuring ModelFailoverConfig. + ShouldFailover func(ctx context.Context, outputMessage M, outputErr error) bool + + // Selects the next model and optionally transforms input messages. + // failoverCtx.FailoverAttempt starts from 1. + // Returning nil failoverModelInputMessages means using the original input. + // Returning non-nil failoverErr immediately terminates failover. + // This field is required when configuring ModelFailoverConfig. + GetFailoverModel func(ctx context.Context, failoverCtx *FailoverContext[M]) ( + failoverModel model.BaseModel[M], + failoverModelInputMessages []M, + failoverErr error, + ) +} +``` + +### FailoverContext[M] + +```go +type FailoverContext[M MessageType] struct { + FailoverAttempt uint // Current attempt number, starting from 1 + InputMessages []M // Original input before transformation + LastOutputMessage M // Output from last failure (partial message in streaming) + // When combined with ModelRetryConfig, this is *RetryExhaustedError + LastErr error // Error from last failure +} +``` + +## Quick Start + +### Basic Usage: Dual-Model Failover + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-agent", + Instruction: "You are a helpful assistant.", + Model: primaryModel, // model.BaseModel[*schema.Message], required + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, // At most 1 failover (2 calls total) + + ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool { + return !errors.Is(err, context.Canceled) && + !errors.Is(err, context.DeadlineExceeded) + }, + + GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil // nil messages → use original input + }, + }, +}) +``` + +> 💡 +> `model.BaseChatModel` is a type alias for 
`model.BaseModel[*schema.Message]`; the two can be used interchangeably. + +### Transforming Input During Failover + +When the backup model doesn't support certain features (e.g., image input): + +```go +ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, _ error) bool { + return true + }, + GetFailoverModel: func(_ context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + // Filter out image content, downgrade to text-only model + return textModel, filterTextOnly(fc.InputMessages), nil + }, +}, +``` + +### Combining with Retry + +Failover and Retry compose orthogonally. Semantics: **each model first retries according to the Retry strategy; after retries are exhausted, Failover switches to a different model**. + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: primaryModel, + // ... + + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 2, + IsRetryAble: func(_ context.Context, err error) bool { + return isTransientError(err) + }, + }, + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, err error) bool { + // err is *RetryExhaustedError at this point + return true + }, + GetFailoverModel: func(_ context.Context, _ *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil + }, + }, +}) +``` + +## Streaming Failover Behavior + + + + + + +
<table>
<tr><td>Scenario</td><td>Behavior</td></tr>
<tr><td><code>Stream()</code> initialization failure</td><td>Same as Generate, directly triggers failover evaluation</td></tr>
<tr><td>Mid-stream error</td><td>Received chunks are concatenated into <code>LastOutputMessage</code> and passed to <code>ShouldFailover</code>; after deciding to failover, the current stream is closed and restarted with the new model</td></tr>
<tr><td>Client impact</td><td>Events already sent during the failed attempt are not retracted. Clients should reset partial results or deduplicate by metadata when receiving a new stream round</td></tr>
</table>
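
The client-side reset described under "Client impact" can be sketched roughly as follows. This is a minimal illustration, not Eino code: the `chunk` type and its `Round` field are hypothetical stand-ins for whatever metadata a client uses to tell one stream round from the next.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a hypothetical client-side view of one streamed piece.
// Round is assumed metadata identifying which model attempt produced
// the chunk; it is not a field defined by Eino.
type chunk struct {
	Round int
	Text  string
}

// assemble concatenates chunk text, discarding everything collected so far
// whenever a chunk from a different round arrives. Partial output from a
// failed attempt therefore never leaks into the final result.
func assemble(chunks []chunk) string {
	var b strings.Builder
	round := -1
	for _, c := range chunks {
		if c.Round != round {
			b.Reset() // new stream round: drop partial results
			round = c.Round
		}
		b.WriteString(c.Text)
	}
	return b.String()
}

func main() {
	got := assemble([]chunk{
		{Round: 0, Text: "Hel"},
		{Round: 0, Text: "lo wo"}, // primary model fails mid-stream here
		{Round: 1, Text: "Hi "},
		{Round: 1, Text: "there"},
	})
	fmt.Println(got) // prints "Hi there"
}
```

Resetting on a round change is the simpler of the two options named in the table. Deduplicating by metadata would instead keep earlier chunks and skip repeats, which only makes sense if the backup model is expected to reproduce the same prefix.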
    + +> 💡 +> `ErrStreamCanceled` (caller actively abandons the stream) does not trigger failover and returns immediately. + +## Model Call Chain Execution Order + +Position of Failover in the wrapper chain (outer to inner): + +``` +1. AgentMiddleware.BeforeChatModel + 2. ChatModelAgentMiddleware.BeforeModelRewriteState + 3. failoverModelWrapper ← failover at this layer + 4. retryModelWrapper ← internal retry within each failover model + 5. eventSenderModelWrapper + 6. ChatModelAgentMiddleware.WrapModel (first registered = outermost) + 7. callbackInjectionModelWrapper (handled internally by failoverProxyModel when failover is enabled) + 8. failoverProxyModel / Model.Generate|Stream + 9. ChatModelAgentMiddleware.AfterModelRewriteState +10. AgentMiddleware.AfterChatModel +``` + +## Important Notes + +- **Required field validation**: Both `ShouldFailover` and `GetFailoverModel` are required when configuring `ModelFailoverConfig`; missing either causes `NewChatModelAgent` to return an error. The `Model` field is always required. +- **Attempt numbering**: `FailoverAttempt` starts from 1. A single Model call executes at most `1 + MaxRetries` times (1 initial + up to MaxRetries failovers). +- **Input messages**: When `GetFailoverModel` returns `nil` messages, the original input is used; when returning non-`nil`, it replaces the original input. +- **Error type when combined with Retry**: `ShouldFailover` and `FailoverContext.LastErr` receive `*RetryExhaustedError`; the original error is obtained via `RetryExhaustedError.LastErr`. 
diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md index 9f7f876db3d..ae1a45189e7 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md @@ -1,192 +1,208 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: DeepAgents' -weight: 5 +title: DeepAgents +weight: 3 --- -## DeepAgents Overview +> 💡 +> This feature requires eino >= v0.5.14. -DeepAgents is an out-of-the-box agent solution built on top of ChatModelAgent (see: [Eino ADK: ChatModelAgent](/docs/eino/core_modules/eino_adk/agent_implementation/chat_model)). You don't need to assemble prompts, tools, or context management yourself - you can immediately get a runnable agent while still using ChatModelAgent's extension capabilities to add business features, such as custom tools and middleware. +## Overview -**Included Features:** +DeepAgents is an out-of-the-box solution built on ChatModelAgent. Without manually assembling prompts, tools, or context management, you can get an Agent with planning, file system, Shell execution, and sub-Agent delegation capabilities, while retaining all of ChatModelAgent's extension capabilities (custom tools, middleware, handlers). 
-- **Planning Capability** — Task decomposition and progress tracking through `write_todos` -- **File System** — Provides `read_file`, `write_file`, `edit_file`, `ls`, `glob`, `grep` for reading and writing context -- **Shell Access** — Use `execute` to run commands -- **Sub-Agents** — Delegate work to sub-agents with independent context windows via `task` -- **Smart Default Configuration** — Built-in prompts that teach the model how to efficiently use these tools -- **Context Management** — Automatic summarization for long conversation history, automatic file saving for large outputs - - SummarizationMiddleware, ReductionMiddleware are under development +**Built-in Capabilities**: -### ImportPath +- **Planning** — `write_todos` tool for task decomposition and progress tracking +- **File System** — `ls`, `read_file`, `write_file`, `edit_file`, `glob`, `grep` +- **Shell** — `execute` (supports streaming) +- **Sub-Agent** — `task` tool delegates tasks to context-isolated sub-agents +- **Smart Defaults** — Built-in Prompts that teach the model to efficiently use tools +- **Context Management** — Large outputs are automatically saved to files -Eino version must be >= v0.5.14 +### Import ```go -import github.com/cloudwego/eino/adk/prebuilt/deep +import "github.com/cloudwego/eino/adk/prebuilt/deep" -agent, err := deep.New(ctx, &deep.Config{}) +agent, err := deep.New(ctx, &deep.Config{ + ChatModel: myModel, +}) ``` -### DeepAgents Structure - -The core concept of DeepAgents is to use a main agent (MainAgent) to coordinate, plan, delegate, or autonomously execute tasks. The main agent uses its built-in ChatModel and a series of tools to interact with the external world or decompose complex tasks to specialized sub-agents (SubAgents). 
- - +--- -The diagram above shows the core components of DeepAgents and their relationships: +## Full Config Definition -- Main Agent: The entry point and commander of the system, receives initial tasks, calls tools in ReAct mode to complete tasks and is responsible for presenting the final results. -- ChatModel (ToolCallingChatModel): Usually a large language model with tool-calling capabilities, responsible for understanding tasks, reasoning, selecting and calling tools. -- Tools: A collection of capabilities available to MainAgent, including: - - WriteTodos: Built-in planning tool for decomposing complex tasks into structured todo lists. - - TaskTool: A special tool that serves as the unified entry point for calling sub-agents. - - BuiltinTools, CustomTools: General tools built into DeepAgents and various tools customized by users according to business needs. -- SubAgents: Responsible for executing specific, independent subtasks, with context isolated from MainAgent. - - GeneralPurpose: A general-purpose sub-agent with the same tools as MainAgent (except TaskTool), used to execute subtasks in a "clean" context. - - CustomSubAgents: Various sub-agents customized by users according to business needs. 
+```go +type Config = TypedConfig[*schema.Message] -### Built-in Capabilities +type TypedConfig[M adk.MessageType] struct { + Name string // Agent identifier name + Description string // Purpose description + ChatModel model.BaseModel[M] // Required; must support model.WithTools + Instruction string // System prompt; uses built-in default Prompt when empty -#### Filesystem + // Sub-Agents (bound to TaskTool) + SubAgents []adk.TypedAgent[M] -> 💡 -> Currently in alpha state + // Custom tools + ToolsConfig adk.ToolsConfig + MaxIteration int // Maximum reasoning iteration count -When creating DeepAgents, configure the relevant Backend, and DeepAgents will automatically load the corresponding tools: + // File system (choose one or combine) + Backend filesystem.Backend // Registers ls/read_file/write_file/edit_file/glob/grep + Shell filesystem.Shell // Registers execute (mutually exclusive with StreamingShell) + StreamingShell filesystem.StreamingShell // Registers execute (streaming, mutually exclusive with Shell) -``` -type Config struct { - // ... - Backend filesystem.Backend - Shell filesystem.Shell - StreamingShell filesystem.StreamingShell - // ... -} -``` + // Built-in feature toggles + WithoutWriteTodos bool // true disables the write_todos tool + WithoutGeneralSubAgent bool // true disables the default general-purpose sub-Agent - - - - - -
<table>
<tr><td>Configuration</td><td>Function</td><td>Added Tools</td></tr>
<tr><td>Backend</td><td>Provides file system access capability, optional</td><td>read_file, write_file, edit_file, glob, grep</td></tr>
<tr><td>Shell</td><td>Provides Shell capability, optional, mutually exclusive with StreamShell</td><td>execute</td></tr>
<tr><td>StreamingShell</td><td>Provides Shell capability with streaming results, optional, mutually exclusive with Shell</td><td>execute(streaming)</td></tr>
</table>
    + // TaskTool description generator (customize the task tool's description) + TaskToolDescriptionGenerator func(ctx context.Context, agents []adk.TypedAgent[M]) (string, error) -DeepAgents implements built-in filesystem by referencing filesystem middleware. For more detailed capability description of this middleware, see: [Middleware: FileSystem](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) + // Extensions + Middlewares []adk.AgentMiddleware // struct-based middleware + Handlers []adk.TypedChatModelAgentMiddleware[M] // interface-based handlers -### Task Decomposition and Planning + // Model fault tolerance + ModelRetryConfig *adk.TypedModelRetryConfig[M] + ModelFailoverConfig *adk.ModelFailoverConfig[M] -The Description of WriteTodos describes the principles of task decomposition and planning. The main agent adds a subtask list to the context by calling the WriteTodos tool to inspire subsequent reasoning and execution processes: + // Output storage (written to session via AddSessionValue) + OutputKey string +} +``` - +### Constructors -1. The model receives user input. -2. The model calls the WriteTodos tool with a task list generated according to the WriteTodos Description. This tool call is added to the context for future reference. -3. The model calls TaskTool according to the todos in the context to complete the first todo. -4. Calls WriteTodos again to update the Todos execution progress. +```go +// Standard version (M = *schema.Message) +func New(ctx context.Context, cfg *Config) (adk.ResumableAgent, error) -> 💡 -> For simple tasks, calling WriteTodos every time may have a negative effect. The WriteTodos Description includes some common positive and negative examples to avoid not calling or over-calling WriteTodos. When using DeepAgents, you can add more prompts according to actual business scenarios to make WriteTodos called at appropriate times. 
+// Generic version (supports *schema.AgenticMessage) +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedResumableAgent[M], error) +``` > 💡 -> WriteTodos will be added to the Agent by default. Configure `WithoutWriteTodos=true` to disable WriteTodos. +> Returns ResumableAgent (includes Resume method), which can be used with Runner's checkpoint/resume mechanism. -### Task Delegation and SubAgents Invocation - -**TaskTool** +--- -All sub-agents are bound to TaskTool. When the main agent assigns subtasks to sub-agents for processing, it calls TaskTool and specifies which sub-agent is needed and the task to execute. TaskTool then routes the task to the specified sub-agent and returns the result to the main agent after execution. The default Description of TaskTool explains the general rules for calling sub-agents and concatenates the Description of each sub-agent. Developers can customize the Description of TaskTool by configuring `TaskToolDescriptionGenerator`. +## Architecture -> When users configure Config.SubAgents, these Agents will be bound to TaskTool based on ChatModelAgent's AgentAsTool capability + -**Context Isolation** +- **Main Agent**: System entry point, completes tasks by calling tools in ReAct mode +- **ChatModel** (`model.BaseModel[M]`): Responsible for reasoning and tool selection +- **Tools**: + - `write_todos`: Built-in planning tool, decomposes tasks into structured TODO lists + - `task`: Sub-Agent invocation entry (routing parameters: `subagent_type`, `description`) + - Built-in tools (file system/Shell) + user-defined tools (`ToolsConfig`) +- **SubAgents**: Context-isolated, execute subtasks independently + - `general-purpose`: Default sub-Agent with the same tools (except task) and configuration as the main Agent + - Custom sub-Agents (`Config.SubAgents`) -Context isolation between Agents: +--- -- Information Transfer: The main agent and sub-agents do not share context. 
Sub-agents only receive the subtask goals assigned by the main agent, not the entire task processing; the main agent only receives the processing results from sub-agents, not the processing of sub-agents. -- Avoid Pollution: This isolation ensures that the execution process of sub-agents (such as numerous tool calls and intermediate steps) does not "pollute" the main agent's context. The main agent only receives concise, clear final answers. +## Built-in File System -**general-purpose** + + + + + +
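<table>
<tr><td>Config Field</td><td>Registered Tools</td><td>Description</td></tr>
<tr><td><code>Backend</code></td><td>ls, read_file, write_file, edit_file, glob, grep</td><td>File system operations</td></tr>
<tr><td><code>Shell</code></td><td>execute</td><td>Non-streaming command execution, mutually exclusive with StreamingShell</td></tr>
<tr><td><code>StreamingShell</code></td><td>execute (streaming)</td><td>Streaming command execution, mutually exclusive with Shell</td></tr>
</table>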
    -DeepAgents adds a sub-agent by default: general-purpose. general-purpose has the same system prompt and tools as the main agent (except TaskTool). When there is no specialized sub-agent to handle a task, the main agent can call general-purpose to isolate context. Developers can remove this agent by configuring `WithoutGeneralSubAgent=true`. +Internally implemented using FileSystem Middleware. -### Comparison with Other Agents +--- -- Compared to ReAct Agent +## Task Planning: write_todos - - Advantages: DeepAgents strengthens task decomposition and planning through built-in WriteTodos; it also isolates multi-agent contexts, usually performing better in large-scale, multi-step tasks. - - Disadvantages: Making plans and calling sub-agents bring additional model requests, increasing latency and token costs; if task decomposition is unreasonable, it may have a negative effect. -- Compared to Plan-and-Execute + - - Advantages: DeepAgents provides Plan/RePlan as tools for the main agent to freely call, allowing unnecessary planning to be skipped during tasks, overall reducing model calls and lowering latency and costs. - - Disadvantages: Task planning and delegation are completed in one model call, requiring higher model capabilities, and prompt tuning is relatively more difficult. +The `write_todos` tool writes a structured TODO list to the session (key: `deep_agent_session_key_todos`) for subsequent reasoning reference. -## DeepAgents Usage Example +**TODO Structure**: -### Scenario Description +```go +type TODO struct { + Content string `json:"content"` + ActiveForm string `json:"activeForm"` + Status string `json:"status"` // "pending" | "in_progress" | "completed" +} +``` -Excel Agent is an "intelligent assistant that understands Excel". It first breaks down the problem into steps, then executes and verifies results step by step. 
It can understand user questions and uploaded file content, propose feasible solutions, and select appropriate tools (system commands, generate and run Python code, web queries, etc.) to complete tasks. +**Workflow**: -In real business, you can think of Excel Agent as an "Excel expert + automation engineer". When you provide a raw spreadsheet and target description, it will propose a solution and complete the execution: +1. Model receives user input +2. Calls `write_todos` to decompose tasks and write to context +3. Executes TODO items one by one (calls task or direct tools) +4. Calls `write_todos` again to update progress -- **Data Cleaning and Formatting**: Complete deduplication, null value handling, and date format standardization from an Excel file containing large amounts of data. -- **Data Analysis and Report Generation**: Extract monthly sales totals from sales data, aggregate statistics, pivot, and finally generate and export chart reports. -- **Automated Budget Calculation**: Automatically calculate total budget based on budget applications from different departments and generate department budget allocation tables. -- **Data Matching and Merging**: Match and merge customer information tables from multiple different sources to generate a complete customer information database. +> 💡 +> For simple tasks, calling write_todos every time may be counterproductive. The built-in Prompt already includes positive and negative examples to guide when to use it. You can further tune this through custom Instructions. Configure WithoutWriteTodos=true to disable it completely. -The structure of Excel Agent built with DeepAgents is as follows: +--- - +## Sub-Agent Delegation: task Tool -1. Add ReadFile tool to the main agent, allowing the main agent to view file content and assist in subtask formulation -2. Add Code and WebSearch sub-agents: Code can write python code to operate excel spreadsheets; WebSearch can search for information and summarize. 
+**TaskTool** is the unified invocation entry for all sub-Agents: -### Code Implementation +- Parameters: `subagent_type` (target sub-Agent name), `description` (task description) +- Internally wraps each sub-Agent as a tool via `adk.NewTypedAgentTool` +- Default Description includes all available sub-Agent names and descriptions; customizable via `TaskToolDescriptionGenerator` -[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) +**Context Isolation**: -### Execution Results +- Sub-Agents only receive the task description assigned by the main Agent, not sharing conversation history +- The main Agent only receives the sub-Agent's final result; intermediate steps are not passed back +- Avoids large numbers of tool calls and intermediate reasoning "polluting" the main Agent's context -```yaml -name: ExcelAgent -path: [{ExcelAgent}] -tool name: task -arguments: {"subagent_type":"CodeAgent","description":"Please help me with a CSV file operation task. Specific requirements: 1. Read the questions.csv file in the current directory; 2. Extract the first column data from the CSV file; 3. Save the extracted first column data to a new CSV file named first_column.csv. 
Please use the pandas library to complete these operations, ensuring the code can execute correctly and handle possible file path issues."} +**general-purpose Sub-Agent**: -name: ExcelAgent -path: [{ExcelAgent}] -tool name: python_runner -arguments: {"code":"```python\nimport pandas as pd\nimport os\n\n# Get current working directory and build file path\nbase_dir = os.getcwd()\ninput_path = os.path.join(base_dir, 'questions.csv')\noutput_path = os.path.join(base_dir, 'first_column.csv')\n\ntry:\n # Read CSV file\n df = pd.read_csv(input_path)\n # Extract first column data\n first_column = df.iloc[:, 0]\n # Save as new CSV file (without index)\n first_column.to_csv(output_path, index=False)\n print(f\"Successfully saved first column data to: {output_path}\")\nexcept FileNotFoundError:\n print(f\"Error: File not found {input_path}\")\nexcept Exception as e:\n print(f\"Error during processing: {str(e)}\")\n```"} +- Created by default, has the same tools (except task), Instruction, and ModelFailoverConfig as the main Agent +- Used to execute general tasks without a specialized sub-Agent in an isolated context +- Configure `WithoutGeneralSubAgent=true` to disable -name: ExcelAgent -path: [{ExcelAgent}] -tool response: Successfully saved first column data to: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv +--- +## Comparison with Other Solutions -name: ExcelAgent -path: [{ExcelAgent}] -answer: Task completed. Successfully read the `questions.csv` file in the current directory, extracted the first column data, and saved the result to `first_column.csv`. The specific output path is: + + + + +
<table>
    <tr><td>Dimension</td><td>DeepAgents vs ReAct</td><td>DeepAgents vs Plan-and-Execute</td></tr>
    <tr><td>Advantages</td><td>Built-in planning + sub-Agent context isolation, better performance in multi-step tasks</td><td>Plan/RePlan invoked as tools on demand, reducing unnecessary planning overhead</td></tr>
    <tr><td>Disadvantages</td><td>Planning + sub-Agent calls increase model requests, latency, and token costs</td><td>Planning and delegation completed in a single call, requiring higher model capability</td></tr>
</table>
    -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +--- -The code handles path concatenation and exception catching (such as file not found or format errors) to ensure execution stability. +## Usage Example -name: ExcelAgent -path: [{ExcelAgent}] -tool response: Task completed. Successfully read the `questions.csv` file in the current directory, extracted the first column data, and saved the result to `first_column.csv`. The specific output path is: +### Excel Agent Scenario -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` + -The code handles path concatenation and exception catching (such as file not found or format errors) to ensure execution stability. +- Main Agent configured with ReadFile tool to assist task formulation +- Code (Python for Excel operations) and WebSearch sub-Agents added -name: ExcelAgent -path: [{ExcelAgent}] -answer: Successfully extracted the first column data from the `questions.csv` spreadsheet to a new file `first_column.csv`, saved at: +### Code -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +Full example: [https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) -The process handled path concatenation and exception catching (such as file not found, format errors, etc.) to ensure data extraction completeness and file generation stability. If you need to adjust the file path or have further requirements for data format, please let me know. 
+```go +agent, err := deep.New(ctx, &deep.Config{ + Name: "ExcelAgent", + ChatModel: myModel, + Backend: localBackend, + SubAgents: []adk.Agent{codeAgent, webSearchAgent}, + ToolsConfig: adk.ToolsConfig{ + InvokableTools: []tool.InvokableTool{readFileTool}, + }, +}) ``` diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md index 76988341bc5..7ca37de8281 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Plan-Execute Agent' -weight: 4 +title: Plan-Execute Agent +weight: 2 --- ## Plan-Execute Agent Overview @@ -275,7 +275,7 @@ func newPlanExecuteAgent(ctx context.Context) adk.Agent { replanner := newReplanner(ctx, model) // Combine into PlanExecuteAgent (fixed execute-replan max iterations 10) - planExecuteAgent, err := planexecute.NewPlanExecuteAgent(ctx, &planexecute.PlanExecuteConfig{ + planExecuteAgent, err := planexecute.New(ctx, &planexecute.PlanExecuteConfig{ Planner: planner, Executor: executor, Replanner: replanner, diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md deleted file mode 100644 index bca088eb1cf..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md +++ /dev/null @@ -1,359 +0,0 @@ ---- -Description: "" -date: "2026-03-02" -lastmod: "" -tags: [] -title: 'Eino ADK: Supervisor Agent' -weight: 3 ---- - -## Supervisor Agent Overview - -### Import Path - -`import github.com/cloudwego/eino/adk/prebuilt/supervisor` - -### What is Supervisor Agent? 
- -Supervisor Agent is a centralized multi-agent collaboration pattern consisting of one Supervisor Agent and multiple SubAgents. The Supervisor is responsible for task allocation, monitoring the execution process of sub-agents, and summarizing results and making decisions after sub-agents complete; sub-agents focus on executing specific tasks and automatically transfer task control back to the Supervisor via WithDeterministicTransferTo after completion. - - - -This pattern is suitable for scenarios that require dynamic coordination of multiple specialized agents to complete complex tasks, such as: - -- Research project management (Supervisor assigns research, experiment, report writing tasks to different sub-agents). -- Customer service processes (Supervisor assigns tasks to technical support, after-sales, sales sub-agents based on user question types). - -### Supervisor Agent Structure - -The core structure of Supervisor pattern is as follows: - -- **Supervisor Agent**: As the collaboration core, has task allocation logic (such as rule-based or LLM decision), can include sub-agents under management via `SetSubAgents`. -- **SubAgents**: Each sub-agent is enhanced with WithDeterministicTransferTo, with `ToAgentNames` preset to the Supervisor name, ensuring automatic transfer back to Supervisor after task completion. - -### Supervisor Agent Features - -1. **Deterministic Callback**: After sub-agent execution completes (not interrupted), WithDeterministicTransferTo automatically triggers Transfer event, transferring task control back to Supervisor, avoiding collaboration flow interruption. -2. **Centralized Control**: Supervisor uniformly manages sub-agents, can dynamically adjust task allocation based on sub-agent execution results (such as assigning to other sub-agents or directly generating final results). -3. 
**Loosely Coupled Extension**: Sub-agents can be independently developed, tested, and replaced; just ensure they implement the Agent interface and bind to Supervisor to join the collaboration flow. -4. **Support for Interrupt and Resume**: If sub-agent or Supervisor supports `ResumableAgent` interface, collaboration flow can resume after interruption, maintaining task context continuity. - -### Supervisor Agent Execution Flow - -The typical collaboration flow of Supervisor pattern is as follows: - -1. **Task Start**: Runner triggers Supervisor to run, inputs initial task (e.g., "Complete a report on LLM development history"). -2. **Task Allocation**: Supervisor transfers task to designated sub-agent (e.g., "Research Agent") via Transfer event based on task requirements. -3. **Sub-Agent Execution**: Sub-agent executes specific task (e.g., researches LLM key milestones) and generates execution result events. -4. **Automatic Callback**: After sub-agent completes, WithDeterministicTransferTo triggers Transfer event, transferring task back to Supervisor. -5. **Result Processing**: Supervisor receives sub-agent results, decides next step (e.g., assign to "Report Writing Agent" to continue processing, or directly output final result). - -## Supervisor Agent Usage Example - -### Scenario Description - -Create a research report generation system: - -- **Supervisor**: Based on user input research topic, assigns tasks to "Research Agent" and "Writer Agent", and summarizes final report. -- **Research Agent**: Responsible for generating research plan (e.g., key stages of LLM development). -- **Writer Agent**: Responsible for writing complete report based on research plan. 
- -### Code Implementation - -#### Step 1: Implement Sub-Agents - -First create two sub-agents, responsible for research and writing tasks respectively: - -```go -// Research Agent: Generates research plan -func NewResearchAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ResearchAgent", - Description: "Generates a detailed research plan for a given topic.", - Instruction: ` -You are a research planner. Given a topic, output a step-by-step research plan with key stages and milestones. -Output ONLY the plan, no extra text.`, - Model: model, - }) - return agent -} - -// Writer Agent: Writes report based on research plan -func NewWriterAgent(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WriterAgent", - Description: "Writes a report based on a research plan.", - Instruction: ` -You are an academic writer. Given a research plan, expand it into a structured report with details and analysis. -Output ONLY the report, no extra text.`, - Model: model, - }) - return agent -} -``` - -#### Step 2: Implement Supervisor Agent - -Create Supervisor Agent, define task allocation logic (simplified here as rule-based: first assign to Research Agent, then assign to Writer Agent): - -```go -// Supervisor Agent: Coordinates research and writing tasks -func NewReportSupervisor(model model.ToolCallingChatModel) adk.Agent { - agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportSupervisor", - Description: "Coordinates research and writing to generate a report.", - Instruction: ` -You are a project supervisor. Your task is to coordinate two sub-agents: -- ResearchAgent: generates a research plan. -- WriterAgent: writes a report based on the plan. - -Workflow: -1. When receiving a topic, first transfer the task to ResearchAgent. -2. 
After ResearchAgent finishes, transfer the task to WriterAgent with the plan as input. -3. After WriterAgent finishes, output the final report.`, - Model: model, - }) - return agent -} -``` - -#### Step 3: Combine Supervisor and Sub-Agents - -Use `NewSupervisor` to combine Supervisor and sub-agents: - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/prebuilt/supervisor" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func main() { - ctx := context.Background() - - // 1. Create LLM model (e.g., GPT-4o) - model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: "YOUR_API_KEY", - Model: "gpt-4o", - }) - - // 2. Create sub-agents and Supervisor - researchAgent := NewResearchAgent(model) - writerAgent := NewWriterAgent(model) - reportSupervisor := NewReportSupervisor(model) - - // 3. Combine Supervisor and sub-agents - supervisorAgent, _ := supervisor.New(ctx, &supervisor.Config{ - Supervisor: reportSupervisor, - SubAgents: []adk.Agent{researchAgent, writerAgent}, - }) - - // 4. Run Supervisor pattern - iter := supervisorAgent.Run(ctx, &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("Write a report on the history of Large Language Models."), - }, - EnableStreaming: true, - }) - - // 5. Consume event stream (print results) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Output != nil && event.Output.MessageOutput != nil { - msg, _ := event.Output.MessageOutput.GetMessage() - println("Agent[" + event.AgentName + "]:\n" + msg.Content + "\n===========") - } - } -} -``` - -### Execution Results - -```markdown -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [ResearchAgent] -=========== -Agent[ResearchAgent]: -1. 
**Scope Definition & Background Research** - - Task: Define "Large Language Model" (LLM) for the report (e.g., size thresholds, key characteristics: transformer-based, large-scale pretraining, general-purpose). - - Task: Identify foundational NLP/AI concepts pre-LLMs (statistical models, early neural networks, word embeddings) to contextualize origins. - - Milestone: 3-day literature review of academic definitions, industry reports, and AI historiographies to finalize scope. - -2. **Chronological Periodization** - - Task: Divide LLM history into distinct eras (e.g., Pre-2017: Pre-transformer foundations; 2017-2020: Transformer revolution & early LLMs; 2020-Present: Scaling & mainstream adoption). - ... - -Agent[ResearchAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [WriterAgent] -=========== -Agent[WriterAgent]: -# The History of Large Language Models: From Foundations to Mainstream Revolution - -## Abstract -Large Language Models (LLMs) represent one of the most transformative technological innovations of the 21st century... - -## 1. Introduction: Defining Large Language Models -A **Large Language Model (LLM)** is a type of machine learning model designed to process and generate human language... - -... - -## 7. Conclusion: A Revolution in Five Years -The history of LLMs is a story of exponential progress: from the transformer's 2017 invention to ChatGPT's 2022 viral explosion... - -## References -- Devlin, J., et al. (2018). *BERT: Pre-training of deep bidirectional transformers for language understanding*. NAACL. -... -=========== -Agent[WriterAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -``` - -## WithDeterministicTransferTo - -### What is WithDeterministicTransferTo? - -`WithDeterministicTransferTo` is an Agent enhancement tool provided by Eino ADK, used to inject task transfer capability into Agents. 
It allows developers to preset fixed task transfer paths for target Agents. When the Agent completes its task (not interrupted), it automatically generates a Transfer event to transfer the task flow to the preset target Agent. - -This capability is the foundation for building the Supervisor Agent collaboration pattern, ensuring sub-agents can reliably transfer task control back to the Supervisor after execution, forming a "allocate-execute-feedback" closed-loop collaboration flow. - -### WithDeterministicTransferTo Core Implementation - -#### Configuration Structure - -Define core task transfer parameters through `DeterministicTransferConfig`: - -```go -// Wrapper method -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent - -// Configuration details -type DeterministicTransferConfig struct { - Agent Agent // Target Agent to be enhanced - ToAgentNames []string // List of target Agent names to transfer to after task completion -} -``` - -- `Agent`: The original Agent that needs transfer capability added. -- `ToAgentNames`: When `Agent` completes task and is not interrupted, automatically transfers task to target Agent name list (transfers in order). - -#### Agent Wrapping - -WithDeterministicTransferTo wraps the original Agent. Based on whether it implements the `ResumableAgent` interface (supports interrupt and resume), it returns `agentWithDeterministicTransferTo` or `resumableAgentWithDeterministicTransferTo` instance respectively, ensuring enhanced capability is compatible with Agent's original functions (such as `Resume` method). 
- -The wrapped Agent overrides the `Run` method (for `ResumableAgent`, also overrides `Resume` method), appending Transfer events to the original Agent's event stream: - -```go -// Wrapper for regular Agent -type agentWithDeterministicTransferTo struct { - agent Agent // Original Agent - toAgentNames []string // Target Agent name list -} - -// Run method: Executes original Agent task, appends Transfer event after task completion -func (a *agentWithDeterministicTransferTo) Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Run(ctx, input, options...) - - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - - // Asynchronously process original event stream and append Transfer event - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - - return iterator -} -``` - -For `ResumableAgent`, additionally implements `Resume` method, ensuring deterministic transfer still triggers after resume execution: - -```go -type resumableAgentWithDeterministicTransferTo struct { - agent ResumableAgent // Original Agent supporting resume - toAgentNames []string // Target Agent name list -} - -// Resume method: Resumes execution of original Agent task, appends Transfer event after completion -func (a *resumableAgentWithDeterministicTransferTo) Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] { - aIter := a.agent.Resume(ctx, info, opts...) - iterator, generator := NewAsyncIteratorPair[*AgentEvent]() - go appendTransferAction(ctx, aIter, generator, a.toAgentNames) - return iterator -} -``` - -#### Event Stream Append Transfer Event - -`appendTransferAction` is the core logic implementing deterministic transfer. 
It consumes the original Agent's event stream and automatically generates and sends Transfer events to target Agents after the Agent task ends normally (not interrupted): - -```go -func appendTransferAction(ctx context.Context, aIter *AsyncIterator[*AgentEvent], generator *AsyncGenerator[*AgentEvent], toAgentNames []string) { - defer func() { - // Exception handling: Capture panic and pass error via event - if panicErr := recover(); panicErr != nil { - generator.Send(&AgentEvent{Err: safe.NewPanicErr(panicErr, debug.Stack())}) - } - generator.Close() // Event stream ends, close generator - }() - - interrupted := false - - // 1. Forward all events from original Agent - for { - event, ok := aIter.Next() - if !ok { // Original event stream ended - break - } - generator.Send(event) // Forward event to caller - - // Check if interruption occurred (e.g., InterruptAction) - if event.Action != nil && event.Action.Interrupted != nil { - interrupted = true - } else { - interrupted = false - } - } - - // 2. If not interrupted and target Agent exists, generate Transfer event - if !interrupted && len(toAgentNames) > 0 { - for _, toAgentName := range toAgentNames { - // Generate transfer message (system prompt + Transfer action) - aMsg, tMsg := GenTransferMessages(ctx, toAgentName) - // Send system prompt event (notify user of task transfer) - aEvent := EventFromMessage(aMsg, nil, schema.Assistant, "") - generator.Send(aEvent) - // Send Transfer action event (trigger task transfer) - tEvent := EventFromMessage(tMsg, nil, schema.Tool, tMsg.ToolName) - tEvent.Action = &AgentAction{ - TransferToAgent: &TransferToAgentAction{ - DestAgentName: toAgentName, // Target Agent name - }, - } - generator.Send(tEvent) - } - } -} -``` - -**Key Logic**: - -- **Event Forwarding**: All events generated by the original Agent (such as thinking, tool calls, output results) are fully forwarded, ensuring business logic is unaffected. 
-- **Interruption Check**: If Agent is interrupted during execution (e.g., `InterruptAction`), Transfer is not triggered (interruption is considered task not completed normally). -- **Transfer Event Generation**: After task ends normally, two events are generated for each `ToAgentNames`: - 1. System prompt event (`schema.Assistant` role): Notifies user that task will be transferred to target Agent. - 2. Transfer action event (`schema.Tool` role): Carries `TransferToAgentAction`, triggers ADK runtime to transfer task to the Agent corresponding to `DestAgentName`. - -## Summary - -WithDeterministicTransferTo provides reliable task transfer capability for Agents, which is the core foundation for building Supervisor pattern; Supervisor pattern achieves efficient collaboration between multiple Agents through centralized coordination and deterministic callbacks, significantly reducing development and maintenance costs for complex tasks. By combining both, developers can quickly build flexible, scalable multi-Agent systems. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md b/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md deleted file mode 100644 index 8581e2d329c..00000000000 --- a/content/en/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md +++ /dev/null @@ -1,1265 +0,0 @@ ---- -Description: "" -date: "2026-01-20" -lastmod: "" -tags: [] -title: 'Eino ADK: Workflow Agents' -weight: 2 ---- - -# Workflow Agents Overview - -## Import Path - -`import "github.com/cloudwego/eino/adk"` - -## What Are Workflow Agents - -Workflow Agents are a special type of Agent in Eino ADK that allows developers to organize and execute multiple sub-agents in a predefined flow. 
- -Unlike the Transfer pattern based on LLM autonomous decision-making, Workflow Agents use **predefined decisions**, running sub-agents according to the execution flow defined in code, providing a more predictable and controllable multi-agent collaboration approach. - -Eino ADK provides three basic Workflow Agent types: - -- **SequentialAgent**: Executes sub-agents sequentially in order -- **LoopAgent**: Loops through a sequence of sub-agents -- **ParallelAgent**: Executes multiple sub-agents concurrently - -These Workflow Agents can be nested with each other to build more complex execution flows, meeting various business scenario requirements. - -# SequentialAgent - -## Features - -SequentialAgent is the most basic Workflow Agent. It executes a series of sub-agents sequentially according to the order provided in the configuration. After each sub-agent completes execution, its output is passed to the next sub-agent through the History mechanism, forming a linear execution chain. - - - -```go -type SequentialAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents, arranged in execution order -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -SequentialAgent execution follows these rules: - -1. **Linear execution**: Strictly follows the order of the SubAgents array -2. **History passing**: Each agent's execution result is added to History, allowing subsequent agents to access the execution history of previous agents -3. 
**Early exit**: If any sub-agent produces an ExitAction / Interrupt, the entire Sequential flow terminates immediately - -SequentialAgent is suitable for the following scenarios: - -- **Multi-step processing flows**: Such as data preprocessing -> analysis -> report generation -- **Pipeline processing**: Each step's output serves as the next step's input -- **Task sequences with dependencies**: Subsequent tasks depend on results from previous tasks - -## Example - -This example demonstrates how to use SequentialAgent to create a three-step document processing pipeline: - -1. **DocumentAnalyzer**: Analyzes document content -2. **ContentSummarizer**: Summarizes analysis results -3. **ReportGenerator**: Generates the final report - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -// Create ChatModel instance -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Document analysis Agent -func NewDocumentAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "DocumentAnalyzer", - Description: "Analyzes document content and extracts key information", - Instruction: "You are a document analysis expert. 
Please carefully analyze the document content provided by the user, extracting key information, main points, and important data.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Content summarization Agent -func NewContentSummarizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ContentSummarizer", - Description: "Summarizes analysis results", - Instruction: "Based on the previous document analysis results, generate a concise and clear summary highlighting the most important findings and conclusions.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Report generation Agent -func NewReportGeneratorAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportGenerator", - Description: "Generates the final analysis report", - Instruction: "Based on the previous analysis and summary, generate a structured analysis report including an executive summary, detailed analysis, and recommendations.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create three processing step Agents - analyzer := NewDocumentAnalyzerAgent() - summarizer := NewContentSummarizerAgent() - generator := NewReportGeneratorAgent() - - // Create SequentialAgent - sequentialAgent, err := adk.NewSequentialAgent(ctx, &adk.SequentialAgentConfig{ - Name: "DocumentProcessingPipeline", - Description: "Document processing pipeline: Analysis → Summary → Report Generation", - SubAgents: []adk.Agent{analyzer, summarizer, generator}, - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: sequentialAgent, - }) - - // Execute document processing flow - input := "Please analyze the following market report: In Q3 2024, company revenue grew 15%, mainly due to the 
successful launch of new product lines. However, operating costs also increased by 8%, requiring efficiency optimization." - - fmt.Println("Starting document processing pipeline...") - iter := runner.Query(ctx, input) - - stepCount := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== Step %d: %s ===\n", stepCount, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - stepCount++ - } - } - - fmt.Println("\nDocument processing pipeline completed!") -} -``` - -Run result: - -```markdown -Starting document processing pipeline... - -=== Step 1: DocumentAnalyzer === -Market Report Key Information Analysis: - -1. Revenue Growth: - - In Q3 2024, company revenue grew 15% year-over-year. - - The main driver of revenue growth was the successful launch of new product lines. - -2. Cost Situation: - - Operating costs increased by 8%. - - The cost increase reminds the company of the need for efficiency optimization. - -Key Points Summary: -- The new product line launch significantly drove revenue growth, showing good results in product innovation. -- Although revenue increased, the rise in operating costs somewhat affected profitability, highlighting the importance of improving operational efficiency. - -Important Data: -- Revenue growth rate: 15% -- Operating cost growth rate: 8% - -=== Step 2: ContentSummarizer === -Summary: In Q3 2024, the company achieved 15% revenue growth, mainly attributed to the successful launch of new product lines, demonstrating significant improvement in product innovation capability. However, operating costs also increased by 8%, putting some pressure on profitability and emphasizing the urgent need for operational efficiency optimization. Overall, the company needs to seek a better balance between growth and cost control to ensure sustainable healthy development. 
- -=== Step 3: ReportGenerator === -Analysis Report - -I. Executive Summary -In Q3 2024, the company achieved 15% year-over-year revenue growth, mainly driven by the successful launch of new product lines, demonstrating strong product innovation capability. However, operating costs also increased 8% year-over-year, putting some pressure on profit margins. To ensure continued profitable growth, focus should be on optimizing operational efficiency and promoting balanced development of cost control and revenue growth. - -II. Detailed Analysis -1. Revenue Growth Analysis -- The company's 15% revenue growth reflects good market acceptance of new product lines, effectively expanding revenue sources. -- The launch of new product lines demonstrates improved R&D and market responsiveness, laying a foundation for future sustained growth. - -2. Operating Cost Situation -- The 8% increase in operating costs may come from various aspects including raw material price increases, decreased production efficiency, or increased sales and promotion expenses. -- This cost increase somewhat offsets the profit gains from revenue growth, affecting overall profitability. - -3. Profitability and Efficiency Considerations -- The mismatch between revenue and cost growth indicates room for improvement in current operational efficiency. -- Optimizing supply chain management, improving production automation, and strengthening cost control will become key measures. - -III. Recommendations -1. Strengthen follow-up support for new product lines, including marketing and customer feedback mechanisms, to continue driving revenue growth. -2. Conduct in-depth analysis of operating cost composition, identify main cost drivers, and develop targeted cost reduction strategies. -3. Promote internal process optimization and technology upgrades to improve production and operational efficiency and alleviate cost pressure. -4. 
Establish a dynamic financial monitoring system to achieve real-time tracking and adjustment of revenue and costs, ensuring company financial health. - -IV. Conclusion -The company demonstrated good growth momentum in Q3 2024 but also faces challenges from rising costs. Through continuous product innovation combined with effective cost management, there is potential to achieve dual improvement in profitability and market competitiveness, driving steady company development. - -Document processing pipeline completed! -``` - -# LoopAgent - -## Features - -LoopAgent is built on SequentialAgent. It repeatedly executes the configured sub-agent sequence until the maximum iteration count is reached or a sub-agent produces an ExitAction. LoopAgent is particularly suitable for scenarios requiring iterative optimization, repeated processing, or continuous monitoring. - - - -```go -type LoopAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents - MaxIterations int // Maximum iteration count, 0 means infinite loop -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -LoopAgent execution follows these rules: - -1. **Loop execution**: Repeatedly executes the SubAgents sequence, with each loop being a complete Sequential execution process -2. **History accumulation**: Results from each iteration accumulate in History, allowing subsequent iterations to access all historical information -3. 
**Conditional exit**: Supports terminating the loop via ExitAction or reaching maximum iteration count; setting `MaxIterations=0` means infinite loop - -LoopAgent is suitable for the following scenarios: - -- **Iterative optimization**: Tasks requiring repeated improvement such as code optimization, parameter tuning -- **Continuous monitoring**: Periodically checking status and executing corresponding operations -- **Repeated processing**: Tasks that need multiple rounds of processing to achieve satisfactory results -- **Self-improvement**: Agent continuously improves its output based on previous execution results - -## Example - -This example demonstrates how to use LoopAgent to create a code optimization loop: - -1. **CodeAnalyzer**: Analyzes code issues -2. **CodeOptimizer**: Optimizes code based on analysis results -3. **ExitController**: Determines whether to exit the loop - -The loop continues until code quality meets standards or maximum iteration count is reached. - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Code analysis Agent -func NewCodeAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeAnalyzer", - Description: "Analyzes code quality and performance issues", - Instruction: `You are a code analysis expert. Please analyze the provided code and identify the following issues: -1. Performance bottlenecks -2. Code duplication -3. Readability issues -4. Potential bugs -5. 
Non-compliance with best practices - -If the code is already excellent, output "EXIT: Code quality has met standards" to end the optimization process.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Code optimization Agent -func NewCodeOptimizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeOptimizer", - Description: "Optimizes code based on analysis results", - Instruction: `Based on the previous code analysis results, optimize and improve the code: -1. Fix identified performance issues -2. Eliminate code duplication -3. Improve code readability -4. Fix potential bugs -5. Apply best practices - -Please provide the complete optimized code.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Create a special Agent to handle exit logic -func NewExitControllerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ExitController", - Description: "Controls the exit of the optimization loop", - Instruction: `Check the previous analysis results. If the code analyst believes the code quality has met standards (contains "EXIT" keyword), -output "TERMINATE" and generate an exit action to end the loop. 
Otherwise continue to the next optimization round.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create optimization flow Agents - analyzer := NewCodeAnalyzerAgent() - optimizer := NewCodeOptimizerAgent() - controller := NewExitControllerAgent() - - // Create LoopAgent, execute up to 5 optimization rounds - loopAgent, err := adk.NewLoopAgent(ctx, &adk.LoopAgentConfig{ - Name: "CodeOptimizationLoop", - Description: "Code optimization loop: Analysis → Optimization → Check exit condition", - SubAgents: []adk.Agent{analyzer, optimizer, controller}, - MaxIterations: 5, // Maximum 5 optimization rounds - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: loopAgent, - }) - - // Code example to optimize - codeToOptimize := ` -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -` - - fmt.Println("Starting code optimization loop...") - iter := runner.Query(ctx, "Please optimize the following Go code:\n"+codeToOptimize) - - iteration := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== Round %d - %s ===\n", iteration, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - - // Check if exit is needed - if event.AgentName == "ExitController" { - if event.Action != nil && event.Action.Exit { - fmt.Println("\nOptimization loop ended early!") - break - } - iteration++ - } - } - } - - fmt.Println("\nCode optimization loop completed!") -} -``` - -Run result: - -```java -Starting code optimization loop... 
- -=== Round 1 - CodeAnalyzer === -Analyzing the provided code: - -```go -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -``` - -### 1. Performance Bottlenecks - -- Double nested loop with O(n²) time complexity, poor performance for large datasets. -- Inner loop breaks immediately when condition is met, reducing some unnecessary comparisons, but overall still quadratic complexity. - -### 2. Code Duplication - -- No obvious code duplication, but logic can be simplified. - -### 3. Readability Issues - -- The code's intent is not very clear. Based on the code logic, the function filters out elements that have at least one element smaller than them in the array. In other words, it removes all elements that are equal to or are the minimum. -- Readability can be improved through comments or renaming. - -### 4. Potential Bugs - -- No obvious bugs currently, but returning an empty array for empty input is expected behavior. - -### 5. Non-compliance with Best Practices - -- The process is relatively inefficient, not using Go language features to optimize performance. -- Missing comments or documentation. - ---- - -## Optimization Suggestions - -If the goal is indeed to return all elements greater than some element in the array (excluding the minimum element), it can be changed to: - -- Find the minimum element minVal in the array. -- Iterate through the array, adding elements greater than minVal to the result. -- This reduces time complexity to O(n). 
- -Optimized code: - -```go -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -### Optimization Points Explained - -- Use one traversal to determine minimum value, O(n). -- Another traversal to construct result, O(n). -- Properly use slice capacity initialization to reduce memory allocation. -- Code intent is clear and performance improvement is significant. - ---- - -If you expect different logic, please let me know for further adjustments. - -=== Round 1 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // Find the minimum value in the slice - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // Collect all elements greater than the minimum value - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 1 - ExitController === -TERMINATE - -=== Round 2 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 2 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. 
-func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // Find the minimum value in the slice - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // Collect all elements greater than the minimum value - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 2 - ExitController === -TERMINATE - -=== Round 3 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 3 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 3 - ExitController === -TERMINATE - -=== Round 4 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 4 - CodeOptimizer === - -```go -// processData returns all elements in the slice that are greater than the minimum element. -// Returns nil if input is empty. -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 4 - ExitController === -TERMINATE - -=== Round 5 - CodeAnalyzer === -EXIT: Code quality has met standards - -=== Round 5 - CodeOptimizer === - -```go -// processData returns all values in the input slice that are greater than the minimum element. -// Returns nil if input is empty. 
-func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== Round 5 - ExitController === -TERMINATE - -Code optimization loop completed! - -``` - - - - -## BreakLoop - - -In a Loop Agent, when an Agent needs to interrupt the loop execution, you can use the corresponding Break Action provided by ADK. - -```go -// BreakLoopAction is a programmatic-only agent action used to prematurely -// terminate the execution of a loop workflow agent. -// When a loop workflow agent receives this action from a sub-agent, it will stop its -// current iteration and will not proceed to the next one. -// It will mark the BreakLoopAction as Done, signalling to any 'upper level' loop agent -// that this action has been processed and should be ignored further up. -// This action is not intended to be used by LLMs. -type BreakLoopAction struct { - // From records the name of the agent that initiated the break loop action. - From string - // Done is a state flag that can be used by the framework to mark when the - // action has been handled. - Done bool - // CurrentIterations is populated by the framework to record at which - // iteration the loop was broken. - CurrentIterations int -} - -// NewBreakLoopAction creates a new BreakLoopAction, signaling a request -// to terminate the current loop. -func NewBreakLoopAction(agentName string) *AgentAction { - return &AgentAction{BreakLoop: &BreakLoopAction{ - From: agentName, - }} -} -``` - -Break Action achieves the interruption purpose without affecting other Agents outside the Loop Agent, while Exit Action immediately interrupts all subsequent Agent execution. 
- -Using the following diagram as an example, where a Sequential agent runs a Loop Agent (Agent1, then Agent2) followed by Agent3: - - - -- When Agent1 issues a BreakLoopAction, the Loop Agent stops, and the Sequential agent continues with Agent3 -- When Agent1 issues an ExitAction, the entire Sequential execution flow terminates, and neither Agent2 nor Agent3 runs - -# ParallelAgent - -## Features - -ParallelAgent runs multiple sub-agents concurrently on the same input context. All sub-agents start at the same time, and ParallelAgent finishes only after every sub-agent has completed. This pattern suits tasks that can be processed independently, where running them in parallel shortens overall execution time. - - - -```go -type ParallelAgentConfig struct { - Name string // Agent name - Description string // Agent description - SubAgents []Agent // List of sub-agents to execute concurrently -} - -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` - -ParallelAgent execution follows these rules: - -1. **Concurrent execution**: All sub-agents start simultaneously, each running in its own goroutine -2. **Shared input**: All sub-agents receive the same initial input and context -3. 
**Wait and result aggregation**: Internally uses sync.WaitGroup to wait for all sub-agents to complete, collects their execution results, and outputs them in the order received - -Additionally, ParallelAgent includes the following exception handling mechanisms by default: - -- **Panic recovery**: Each goroutine has its own panic recovery mechanism -- **Error isolation**: An error from one sub-agent does not affect the execution of the other sub-agents -- **Interrupt handling**: Supports interrupting and resuming sub-agents - -ParallelAgent is suitable for the following scenarios: - -- **Independent task parallel processing**: Multiple unrelated tasks can execute simultaneously -- **Multi-angle analysis**: Analyzing the same problem from different angles simultaneously -- **Performance optimization**: Reducing overall execution time through parallel execution -- **Multi-expert consultation**: Consulting multiple domain-specific Agents simultaneously - -## Example - -This example demonstrates how to use ParallelAgent to analyze a product proposal from four different angles simultaneously: - -1. **TechnicalAnalyst**: Technical feasibility analysis -2. **BusinessAnalyst**: Business value analysis -3. **UXAnalyst**: User experience analysis -4. 
**SecurityAnalyst**: Security risk analysis - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "sync" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// Technical analysis Agent -func NewTechnicalAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "TechnicalAnalyst", - Description: "Analyzes content from a technical perspective", - Instruction: `You are a technical expert. Please analyze the provided content from technical implementation, architecture design, and performance optimization perspectives. -Focus on: -1. Technical feasibility -2. Architecture rationality -3. Performance considerations -4. Technical risks -5. Implementation complexity`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Business analysis Agent -func NewBusinessAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "BusinessAnalyst", - Description: "Analyzes content from a business perspective", - Instruction: `You are a business analysis expert. Please analyze the provided content from business value, market prospects, and cost-effectiveness perspectives. -Focus on: -1. Business value -2. Market demand -3. Competitive advantages -4. Cost analysis -5. 
Revenue model`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// User experience analysis Agent -func NewUXAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "UXAnalyst", - Description: "Analyzes content from a user experience perspective", - Instruction: `You are a user experience expert. Please analyze the provided content from user experience, usability, and user satisfaction perspectives. -Focus on: -1. User friendliness -2. Operational convenience -3. Learning cost -4. User satisfaction -5. Accessibility`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// Security analysis Agent -func NewSecurityAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "SecurityAnalyst", - Description: "Analyzes content from a security perspective", - Instruction: `You are a security expert. Please analyze the provided content from information security, data protection, and privacy compliance perspectives. -Focus on: -1. Data security -2. Privacy protection -3. Access control -4. Security vulnerabilities -5. 
Compliance requirements`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // Create four analysis Agents from different angles - techAnalyst := NewTechnicalAnalystAgent() - bizAnalyst := NewBusinessAnalystAgent() - uxAnalyst := NewUXAnalystAgent() - secAnalyst := NewSecurityAnalystAgent() - - // Create ParallelAgent for simultaneous multi-angle analysis - parallelAgent, err := adk.NewParallelAgent(ctx, &adk.ParallelAgentConfig{ - Name: "MultiPerspectiveAnalyzer", - Description: "Multi-angle parallel analysis: Technical + Business + User Experience + Security", - SubAgents: []adk.Agent{techAnalyst, bizAnalyst, uxAnalyst, secAnalyst}, - }) - if err != nil { - log.Fatal(err) - } - - // Create Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: parallelAgent, - }) - - // Product proposal to analyze - productProposal := ` -Product Proposal: Intelligent Customer Service System - -Overview: Develop an intelligent customer service system based on large language models that can automatically answer user questions, handle common business inquiries, and transfer to human agents when necessary. - -Main Features: -1. Natural language understanding and response -2. Multi-turn conversation management -3. Knowledge base integration -4. Sentiment analysis -5. Human agent transfer -6. Conversation history recording -7. 
Multi-channel access (Web, WeChat, App) - -Technical Architecture: -- Frontend: React + TypeScript -- Backend: Go + Gin framework -- Database: PostgreSQL + Redis -- AI Model: GPT-4 API -- Deployment: Docker + Kubernetes -` - - fmt.Println("Starting multi-angle parallel analysis...") - iter := runner.Query(ctx, "Please analyze the following product proposal:\n"+productProposal) - - // Use map to collect results from different analysts - results := make(map[string]string) - var mu sync.Mutex - - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Printf("Error during analysis: %v", event.Err) - continue - } - - if event.Output != nil && event.Output.MessageOutput != nil { - mu.Lock() - results[event.AgentName] = event.Output.MessageOutput.Message.Content - mu.Unlock() - - fmt.Printf("\n=== %s analysis completed ===\n", event.AgentName) - } - } - - // Output all analysis results - fmt.Println("\n" + "============================================================") - fmt.Println("Multi-angle Analysis Results Summary") - fmt.Println("============================================================") - - analysisOrder := []string{"TechnicalAnalyst", "BusinessAnalyst", "UXAnalyst", "SecurityAnalyst"} - analysisNames := map[string]string{ - "TechnicalAnalyst": "Technical Analysis", - "BusinessAnalyst": "Business Analysis", - "UXAnalyst": "User Experience Analysis", - "SecurityAnalyst": "Security Analysis", - } - - for _, agentName := range analysisOrder { - if result, exists := results[agentName]; exists { - fmt.Printf("\n【%s】\n", analysisNames[agentName]) - fmt.Printf("%s\n", result) - fmt.Println("----------------------------------------") - } - } - - fmt.Println("\nMulti-angle parallel analysis completed!") - fmt.Printf("Received %d analysis results\n", len(results)) -} -``` - -Run result: - -```markdown -Starting multi-angle parallel analysis... 
- -=== BusinessAnalyst analysis completed === - -=== UXAnalyst analysis completed === - -=== SecurityAnalyst analysis completed === - -=== TechnicalAnalyst analysis completed === - -============================================================ -Multi-angle Analysis Results Summary -============================================================ - -【Technical Analysis】 -For this intelligent customer service system proposal, here is a detailed analysis from technical implementation, architecture design, and performance optimization perspectives: - ---- - -### I. Technical Feasibility - -1. **Natural Language Understanding and Response** - - Using GPT-4 API for natural language understanding and automatic response is a mature and feasible solution. GPT-4 has strong language understanding and generation capabilities, suitable for handling complex and diverse questions. - -2. **Multi-turn Conversation Management** - - Relies on backend to maintain context state, combined with GPT-4 model can handle multi-turn interactions well. Need to design reasonable context management mechanism (such as conversation history maintenance, key slot extraction, etc.) to ensure context information integrity. - -3. **Knowledge Base Integration** - - Can add specific knowledge base retrieval results to GPT-4 API (retrieval-augmented generation), or integrate knowledge base through local retrieval interface. Technically feasible, but has high requirements for real-time and accuracy. - -4. **Sentiment Analysis** - - Sentiment analysis function can be implemented with independent lightweight models (such as fine-tuned BERT), or try using GPT-4 output, but cost is higher. Sentiment analysis capability helps intelligent customer service better understand user emotions and improve user experience. - -5. **Human Agent Transfer** - - Technically achievable through establishing event trigger rules (such as turn count, emotion threshold, keyword detection) to implement automatic transfer to human. 
System needs to support ticket or session transfer mechanism and ensure seamless session switching. - -6. **Multi-channel Access** - - Multi-channel access including web, WeChat, App can all be achieved through unified API gateway, technology is mature, while needing to handle channel differences (message format, authentication, push mechanism, etc.). - ---- - -### II. Architecture Rationality - -- **Frontend React + TypeScript** - Very suitable for building responsive customer service interface, mature ecosystem, convenient for multi-channel component sharing. - -- **Backend Go + Gin** - Go language has excellent performance, Gin framework is lightweight and high-performance, suitable for high-concurrency scenarios. Backend handles GPT-4 API integration, state management, multi-channel message forwarding and other responsibilities, reasonable choice. - -- **Database PostgreSQL + Redis** - - PostgreSQL handles structured data storage, such as user information, conversation history, knowledge base metadata. - - Redis handles session state caching, hot knowledge base, rate limiting, etc., improving access performance. - Architecture design follows common large internet product patterns, with clear component division. - -- **AI Model GPT-4 API** - Using mature API reduces development difficulty and model maintenance cost; disadvantage is high dependency on network and API calls. - -- **Deployment Docker + Kubernetes** - Containerization and K8s orchestration ensure system elastic scaling, high availability and canary deployment, suitable for production environment, follows modern microservices architecture trends. - ---- - -### III. Performance Considerations - -1. **Response Time** - - GPT-4 API calls have inherent latency (usually hundreds of milliseconds to 1 second), significantly affecting response time. Need to handle interface asynchronously and design frontend experience well (such as loading animations, partial progressive response). - -2. 
**Concurrent Processing Capability** - - Backend Go has high concurrent processing advantages, combined with Redis caching hot data, can greatly improve overall throughput. - - But GPT-4 API calls are limited by OpenAI service QPS limits and call costs, need to reasonably design call frequency and degradation strategies. - -3. **Caching Strategy** - - Cache user conversation context and common question answers to reduce repeated API calls. - - Match key questions locally first, call GPT-4 only on failure, improving efficiency. - -4. **Multi-channel Load Balancing** - - Need to design unified message bus and reliable async queue to prevent traffic spikes from one channel affecting overall system stability. - ---- - -### IV. Technical Risks - -1. **GPT-4 API Dependency** - - High dependency on third-party API, risks include service interruption, interface changes and cost fluctuations. - - Recommend designing local cache and limited alternative response logic to handle API exceptions. - -2. **Multi-turn Conversation Context Management Difficulty** - - Context too long or complex will reduce answer quality, need to design context length limits and selective important information retention mechanism. - -3. **Knowledge Base Integration Complexity** - - How to achieve knowledge base and... ----------------------------------------- - -【Business Analysis】 -Here is the business perspective analysis of the intelligent customer service system product proposal: - -1. Business Value -- Improve customer service efficiency: Automatically answer user questions and common inquiries, reduce human agent pressure, lower labor costs. -- Improve user experience: Multi-turn conversation and sentiment analysis make interactions more natural, enhance customer satisfaction and stickiness. -- Data-driven decision support: Conversation history and knowledge base integration provide valuable user feedback and behavior data for enterprises, optimizing products and services. 
-- Support business expansion: Multi-channel access (web, WeChat, App) meets different customer access habits, improving coverage. - -2. Market Demand -- Market demand for intelligent customer service continues to grow, especially in e-commerce, finance, healthcare, education and other industries, customer service automation is an important direction for enterprise digital transformation. -- With the maturity of AI technology, enterprises expect to use large language models to improve customer service intelligence level. -- Users' demand for instant response and 24/7 service is increasing, driving widespread adoption of intelligent customer service systems. - -3. Competitive Advantages -- Using advanced GPT-4 large language model, has strong natural language understanding and generation capabilities, improving Q&A accuracy and conversation naturalness. -- Sentiment analysis function helps accurately identify user emotions, dynamically adjust response strategies, improve customer satisfaction. -- Multi-channel access design meets enterprise diversified customer reach needs, enhancing product applicability. -- Technical architecture uses microservices, containerized deployment, convenient for elastic scaling and maintenance, improving system stability and scalability. - -4. Cost Analysis -- AI model call cost is high, depends on GPT-4 API, need to adjust budget based on call volume and response speed. -- Technical R&D investment is large, involving frontend and backend, multi-channel integration, AI and knowledge base management. -- Operation and server costs need to consider multi-channel concurrent access. -- In the long term, human agent count can be significantly reduced, saving labor costs. -- Can reduce initial hardware investment through cloud services, but cloud resource usage needs careful management to control costs. - -5. 
Revenue Model -- SaaS subscription service: Charge monthly/yearly service fees to enterprise customers, tiered pricing based on access channels, concurrency, and feature levels. -- Charge by call count or conversation count, suitable for customers with large business fluctuations. -- Value-added services: Data analysis report customization, industry knowledge base integration, human agent collaboration tools, etc. -- For medium and large customers, can provide custom development and technical support, charging project fees. -- Through continuous model and service optimization, increase customer retention and renewal rates. - -In summary, this intelligent customer service system based on mature technology and AI advantages has good business value and market potential. Its multi-channel access and sentiment analysis features enhance competitiveness, but need to reasonably control AI call costs and operating expenses. Recommend focusing on SaaS subscription and value-added services, combined with marketing, quickly capture customer resources and improve profitability. ----------------------------------------- - -【User Experience Analysis】 -For this intelligent customer service system proposal, I will analyze from user experience, usability, user satisfaction and accessibility perspectives: - -1. User Friendliness -- Natural language understanding and response capability improves user communication experience with the system, allowing users to express needs in natural language, reducing communication barriers. -- Multi-turn conversation management allows the system to understand context, reducing repeated explanations, enhancing conversation coherence, further improving user experience. -- Sentiment analysis function helps the system identify user emotions, making more thoughtful responses, improving interaction personalization and humanization. 
-- Multi-channel access covers users' commonly used access paths, convenient for users to get service anytime anywhere, improving friendliness. - -2. Operational Convenience -- Automatically answering common business inquiries can reduce user waiting time and operational burden, improving response speed. -- Human agent transfer mechanism ensures complex issues can be handled timely, ensuring service continuity and seamless operation handoff. -- Conversation history recording convenient for users to review consultation content, avoiding repeated queries, improving operational convenience. -- Using modern tech stack (React, TypeScript) provides good frontend interaction performance and response speed, indirectly enhancing operational smoothness. - -3. Learning Cost -- Based on natural language processing, users don't need to learn special commands, lowering usage threshold. -- Multi-turn conversation natural connection makes it easier for users to understand system response logic, reducing confusion and frustration. -- Consistent interface across different channels (such as keeping similar experience on web and WeChat) helps users get started quickly. -- More precise feedback provided through sentiment analysis reduces time cost of users frequently trying due to misunderstanding. - -4. User Satisfaction -- Fast and accurate automatic replies and multi-turn conversation reduce user waiting and repeated input, improving satisfaction. -- Sentiment analysis makes the system better understand user emotions, bringing warmer interaction experience, increasing user stickiness. -- Human agent intervention ensures complex issues are properly handled, improving service quality perception. -- Multi-channel coverage meets different users' usage scenarios, enhancing overall satisfaction. - -5. Accessibility -- Multi-channel access covers web, WeChat, App, adapting to different users' devices and environments, improving accessibility. 
-- The proposal doesn't explicitly mention accessibility design (such as screen reader compatibility, high contrast mode, etc.), which may be an area to supplement in the future. -- Frontend using React and TypeScript is conducive to implementing responsive design and accessibility features, but need to ensure development standards are implemented. -- Backend architecture and deployment solution ensure system stability and scalability, indirectly improving user continuous accessibility. - -Summary: -This intelligent customer service system proposal is fairly comprehensive in user experience and usability considerations, using large language models to achieve natural multi-turn conversation, sentiment analysis and knowledge base integration, meeting users' diverse needs. Meanwhile, multi-channel access enhances system coverage. Recommend strengthening accessibility design in specific implementation to achieve more comprehensive accessibility assurance, while continuing to optimize conversation strategies to improve user satisfaction. ----------------------------------------- - -【Security Analysis】 -For this intelligent customer service system proposal, here is the analysis from information security, data protection and privacy compliance perspectives: - -I. Data Security - -1. Data Transmission Security -- Recommend all client-server communications use TLS/SSL encryption to ensure data confidentiality and integrity during transmission. -- Since multi-channel access is supported (web, WeChat, App), need to ensure each entry point strictly implements encrypted transmission. - -2. Data Storage Security -- PostgreSQL stores sensitive information like conversation history and user data, need to enable database encryption (such as transparent data encryption TDE or field-level encryption) to prevent data leakage. -- Redis as cache may store temporary session data, also need to enable access authentication and encrypted transmission. 
-- Implement minimum storage principle for user sensitive data, avoid storing unrelated data beyond scope. -- Data backup process needs encrypted storage, and backup access should also be controlled. - -3. API Call Security -- GPT-4 API calls generate large amounts of user data interaction, should evaluate its data processing and storage policies to ensure compliance with data security requirements. -- Add call permission management, limit API key access scope and permissions to prevent abuse. - -4. Log Security -- System logs should avoid storing plaintext sensitive information, especially personal identity information and conversation content. Log access needs strict control. - -II. Privacy Protection - -1. Personal Data Processing -- Collection and storage of user personal data (name, contact information, account information, etc.) must clearly inform users and obtain user consent. -- Implement data anonymization/de-identification technology, especially for identity information processing in conversation history. - -2. User Privacy Rights -- Meet users' rights to access, correct, and delete data in relevant laws and regulations (such as Personal Information Protection Law, GDPR). -- Provide privacy policy clearly disclosing data collection, use and sharing situations. - -3. Interaction Privacy -- Multi-turn conversation and sentiment analysis features should consider avoiding excessive invasion of user privacy, such as transparent notification and restriction of sensitive emotion data usage. - -4. Third-party Compliance -- GPT-4 API is provided by third party, need to ensure its service complies with relevant privacy compliance requirements and data protection standards. - -III. Access Control - -1. User Identity Verification -- When system involves user identity information query and management, need to establish reliable identity authentication mechanism. -- Support multi-factor authentication to enhance security. - -2. 
Permission Management -- Backend management interface and human agent transfer module need to use role-based access control (RBAC) to ensure minimum operation permissions. -- Operations accessing sensitive data need detailed audit and monitoring. - -3. Session Management -- Need effective session management mechanism for multi-channel sessions to prevent session hijacking. -- Conversation history access permissions should be limited to only relevant users or authorized personnel. - -IV. Security Vulnerabilities - -1. Application Security -- Frontend React+TypeScript should prevent XSS, CSRF attacks, reasonably use Content Security Policy (CSP). -- Backend Go application needs to prevent SQL injection, request forgery and permission deficiency. Gin framework provides middleware support, recommend fully utilizing security modules. - -2. AI Model Risks -- GPT-4 API input/output may have sensitive information leakage or model misuse risks, need to limit input content and filter sensitive information. -- Prevent generating malicious answers or information leakage, establish content review mechanism. - -3. Container and Deployment Security -- Docker containers must use secure images and patch timely. Kubernetes cluster network policies and access control need to be complete. -- Container runtime permissions minimized to avoid container escape risks. - -V. Compliance Requirements - -1. Data Protection Regulations -- Based on operating region, need to comply with Personal Information Protection Law (PIPL), EU General Data Protection Regulation (GDPR) or other relevant legal requirements. -- Clearly define user data collection, processing, transmission and storage processes comply with regulations. - -2. User Privacy Notice and Consent -- Should provide clear privacy policy and terms of use, explaining data purposes and processing methods. -- Implement user consent management mechanism. - -3. 
Cross-border Data Transfer Compliance -- If system involves cross-border data flow, need to assess compliance risks and take corresponding technical... ----------------------------------------- - -Multi-angle parallel analysis completed! -Received 4 analysis results -``` - -# Summary - -Workflow Agents provide powerful multi-agent collaboration capabilities for Eino ADK. By reasonably selecting and combining these Workflow Agents, developers can build efficient and reliable multi-agent collaboration systems to meet various complex business requirements. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_interface.md b/content/en/docs/eino/core_modules/eino_adk/agent_interface.md index 0c57ea1982d..528913a0448 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_interface.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_interface.md @@ -1,390 +1,198 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Agent Abstraction' +title: Agent Abstraction weight: 3 --- -# Agent Definition +# Agent Interface -Eino defines a basic interface for Agents. Any struct implementing this interface can be considered an Agent: +All ADK functionality revolves around the `Agent` interface: ```go -// github.com/cloudwego/eino/adk/interface.go +// github.com/cloudwego/eino/adk -type Agent interface { +type TypedAgent[M MessageType] interface { Name(ctx context.Context) string Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] + Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] } + +// Default type alias (using *schema.Message) +type Agent = TypedAgent[*schema.Message] ``` - - - + + +
| Method | Description |
| --- | --- |
| Name | The name of the Agent, serving as the Agent's identifier |
| Description | Description of the Agent's capabilities, mainly used for other Agents to understand and determine this Agent's responsibilities or functions |
| Run | The core execution method of the Agent, returns an iterator through which callers can continuously receive events produced by the Agent |

| Method | Description |
| --- | --- |
| Name | Agent name identifier |
| Description | Capability description, for other Agents or the framework to understand capabilities |
| Run | Core execution method, asynchronously returns an event stream (Future pattern) |
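The Future pattern named for `Run` can be illustrated with a minimal, self-contained sketch. It does not use the real ADK types: `Event`, `Iterator`, `EchoAgent`, and `Collect` are hypothetical stand-ins, the `context.Context` parameters are omitted for brevity, and a plain channel takes the place of the ADK iterator/generator pair.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for an agent event (not the ADK type).
type Event struct {
	AgentName string
	Text      string
}

// Iterator is a hypothetical stand-in for an async iterator, backed by a channel.
type Iterator struct{ ch <-chan Event }

// Next blocks until an event arrives; ok is false once the producer has closed the stream.
func (it *Iterator) Next() (Event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// EchoAgent is an illustrative agent that emits one event per input message.
type EchoAgent struct{ name string }

func (a *EchoAgent) Name() string        { return a.name }
func (a *EchoAgent) Description() string { return "echoes each input message as an event" }

// Run starts the work in a goroutine and returns the iterator immediately (Future pattern).
func (a *EchoAgent) Run(messages []string) *Iterator {
	ch := make(chan Event)
	go func() {
		defer close(ch) // closing the channel signals the end of iteration
		for _, m := range messages {
			ch <- Event{AgentName: a.name, Text: m}
		}
	}()
	return &Iterator{ch: ch}
}

// Collect drains an iterator and formats each event as "agent: text".
func Collect(it *Iterator) []string {
	var out []string
	for {
		ev, ok := it.Next()
		if !ok {
			return out
		}
		out = append(out, ev.AgentName+": "+ev.Text)
	}
}

func main() {
	agent := &EchoAgent{name: "echo"}
	for _, line := range Collect(agent.Run([]string{"hello", "world"})) {
		fmt.Println(line)
	}
}
```

Because the channel in this sketch is unbuffered, the producer goroutine only makes progress while the caller keeps calling `Next()`, so a caller that abandons the iterator early would leave that goroutine blocked. A real implementation would need to handle that case, for example through context cancellation.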
    -## AgentInput - -The Run method receives AgentInput as the Agent's input: - -```go -type AgentInput struct { - Messages []Message - EnableStreaming bool -} - -type Message = *schema.Message -``` - -Agents typically have ChatModel as their core, so the Agent's input is defined as `Messages`, which is the same type as when calling Eino ChatModel. `Messages` can include user instructions, conversation history, background knowledge, sample data, or any other data you want to pass to the Agent. For example: +## MessageType Constraint ```go -import ( - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/schema" -) - -input := &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("What's the capital of France?"), - schema.AssistantMessage("The capital of France is Paris.", nil), - schema.UserMessage("How far is it from London? "), - }, +type MessageType interface { + *schema.Message | *schema.AgenticMessage } ``` -`EnableStreaming` is used to **suggest** the output mode to the Agent, but it is not a mandatory constraint. Its core idea is to control the behavior of components that support both streaming and non-streaming output, such as ChatModel, while `EnableStreaming` does not affect components that only support one output method. Additionally, the `AgentOutput.IsStreaming` field indicates the actual output type. The runtime behavior is: +All ADK generic types are parameterized with `[M MessageType]`. `*schema.Message` supports full ADK features; `*schema.AgenticMessage` is for the structured content block mode added in v0.9. -- When `EnableStreaming=false`, components that can output both streaming and non-streaming will use the non-streaming mode that returns complete results at once. -- When `EnableStreaming=true`, components within the Agent that can output streaming (such as ChatModel calls) should return results progressively as a stream. 
If a component naturally doesn't support streaming, it can still work in its original non-streaming way. +## Type Alias Quick Reference -As shown in the diagram below, ChatModel can output both streaming and non-streaming, while Tool can only output non-streaming: - -- When `EnableStream=false`, both output non-streaming -- When `EnableStream=true`, ChatModel outputs streaming, Tool still outputs non-streaming because it doesn't have streaming capability. - - - -## AgentRunOption - -`AgentRunOption` is defined by the Agent implementation and can modify Agent configuration or control Agent behavior at the request level. - -Eino ADK provides some commonly defined Options for users: - -- `WithSessionValues`: Set cross-Agent read/write data -- `WithSkipTransferMessages`: When configured, if the Event is a Transfer to SubAgent, the messages in the Event will not be appended to History - -Eino ADK provides `WrapImplSpecificOptFn` and `GetImplSpecificOptions` methods for Agents to wrap and read custom `AgentRunOptions`. - -When using the `GetImplSpecificOptions` method to read `AgentRunOptions`, AgentRunOptions that don't match the required type (like options in the example) will be ignored. + + + + + + + +
| Generic Type | Default Alias |
| --- | --- |
| `TypedAgent[*schema.Message]` | `Agent` |
| `TypedAgentInput[*schema.Message]` | `AgentInput` |
| `TypedAgentEvent[*schema.Message]` | `AgentEvent` |
| `TypedAgentOutput[*schema.Message]` | `AgentOutput` |
| `TypedMessageVariant[*schema.Message]` | `MessageVariant` |
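How a union constraint and a default alias fit together can be sketched without importing Eino. In the example below, `Message`, `AgenticMessage`, and `CountMessages` are hypothetical stand-ins defined locally; only the shape of `MessageType`, `TypedAgentInput`, and the `AgentInput` alias follows the definitions in this document.

```go
package main

import "fmt"

// Message and AgenticMessage are local stand-ins for the schema types (assumption).
type Message struct{ Content string }
type AgenticMessage struct{ Blocks []string }

// MessageType mirrors the union constraint: only these two pointer types are allowed.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedAgentInput mirrors the generic input struct described in this document.
type TypedAgentInput[M MessageType] struct {
	Messages        []M
	EnableStreaming bool
}

// AgentInput is the default alias: the generic type instantiated with *Message.
type AgentInput = TypedAgentInput[*Message]

// CountMessages is a hypothetical helper that works for any allowed message type.
func CountMessages[M MessageType](in *TypedAgentInput[M]) int {
	return len(in.Messages)
}

func main() {
	plain := &AgentInput{Messages: []*Message{{Content: "hi"}, {Content: "there"}}}
	agentic := &TypedAgentInput[*AgenticMessage]{
		Messages: []*AgenticMessage{{Blocks: []string{"text"}}},
	}
	// The type parameter is inferred from the argument in both calls.
	fmt.Println(CountMessages(plain), CountMessages(agentic))
}
```

Since the alias and the instantiated generic type are the same type, code written against `AgentInput` keeps compiling when it is passed to functions that accept `*TypedAgentInput[*Message]`.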
    -For example, you can define `WithModelName` to require the Agent to change the model being called at the request level: +# AgentInput ```go -// github.com/cloudwego/eino/adk/call_option.go -// func WrapImplSpecificOptFn[T any](optFn func(*T)) AgentRunOption -// func GetImplSpecificOptions[T any](base *T, opts ...AgentRunOption) *T - -import "github.com/cloudwego/eino/adk" - -type options struct { - modelName string -} - -func WithModelName(name string) adk.AgentRunOption { - return adk.WrapImplSpecificOptFn(func(t *options) { - t.modelName = name - }) -} - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - o := &options{} - o = adk.GetImplSpecificOptions(o, opts...) - // run code... +type TypedAgentInput[M MessageType] struct { + Messages []M + EnableStreaming bool } ``` -Additionally, AgentRunOption has a `DesignateAgent` method. Calling this method allows you to specify which Agent the Option takes effect on when calling a multi-Agent system: +- **Messages**: User instructions, conversation history, background knowledge, etc., consistent with ChatModel input format +- **EnableStreaming**: Suggests the Agent use streaming output. Components that support streaming (e.g., ChatModel) will return progressively; components that don't support it are unaffected -```go -func genOpt() { - // Specify that the option only takes effect for agent_1 and agent_2 - opt := adk.WithSessionValues(map[string]any{}).DesignateAgent("agent_1", "agent_2") -} -``` - -## AsyncIterator +# AgentEvent -`Agent.Run` returns an iterator `AsyncIterator[*AgentEvent]`: +Events produced during Agent execution: ```go -// github.com/cloudwego/eino/adk/utils.go - -type AsyncIterator[T any] struct { - ... -} - -func (ai *AsyncIterator[T]) Next() (T, bool) { - ... 
+type TypedAgentEvent[M MessageType] struct { + AgentName string + RunPath []RunStep + Output *TypedAgentOutput[M] + Action *AgentAction + Err error } ``` -It represents an asynchronous iterator (asynchronous means there is no synchronization control between production and consumption), allowing callers to consume a series of events produced by the Agent in an ordered, blocking manner. - -- `AsyncIterator` is a generic struct that can be used to iterate over any type of data. Currently in the Agent interface, the iterator type returned by the Run method is fixed as `AsyncIterator[*AgentEvent]`. This means that every element you get from this iterator will be a pointer to an `AgentEvent` object. `AgentEvent` will be explained in detail in the following sections. -- The main way to interact with the iterator is by calling its `Next()` method. This method is **blocking** - each call to `Next()` will pause execution until one of the following two situations occurs: - - Agent produces a new `AgentEvent`: The `Next()` method returns this event, and the caller can immediately process it. - - Agent actively closes the iterator: When the Agent will no longer produce any new events (usually when the Agent finishes running), it closes the iterator. At this point, the `Next()` call will end blocking and return false in the second return value, telling the caller that iteration has ended. 
- -Typically, you need to use a for loop to process `AsyncIterator`: +## AgentOutput ```go -iter := myAgent.Run(xxx) // get AsyncIterator from Agent.Run - -for { - event, ok := iter.Next() - if !ok { - break - } - // handle event +type TypedAgentOutput[M MessageType] struct { + MessageOutput *TypedMessageVariant[M] + CustomizedOutput any } ``` -`AsyncIterator` can be created by `NewAsyncIteratorPair`, which returns another parameter `AsyncGenerator` for producing data: +`MessageVariant` unifies handling of streaming and non-streaming messages: ```go -// github.com/cloudwego/eino/adk/utils.go - -func NewAsyncIteratorPair[T any]() (*AsyncIterator[T], *AsyncGenerator[T]) -``` - -Agent.Run returns AsyncIterator to let callers receive a series of AgentEvents produced by the Agent in real-time. Therefore, Agent.Run typically runs the Agent in a Goroutine to immediately return the AsyncIterator for the caller to listen to: - -```go -import "github.com/cloudwego/eino/adk" - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - // handle input - iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() - go func() { - defer func() { - // recover code - gen.Close() - }() - // agent run code - // gen.Send(event) - }() - return iter +type TypedMessageVariant[M MessageType] struct { + IsStreaming bool + Message M + MessageStream *schema.StreamReader[M] + Role schema.RoleType // *schema.Message path + AgenticRole schema.AgenticRoleType // *schema.AgenticMessage path + ToolName string } ``` -## AgentWithOptions - -Using the `AgentWithOptions` method allows you to make some general configurations in Eino ADK Agents. - -Unlike `AgentRunOption`, `AgentWithOptions` takes effect before running and does not support custom options. 
+- `IsStreaming=true` → Read frame by frame from `MessageStream` +- `IsStreaming=false` → Get at once from `Message` +- `Role`/`ToolName`: Only valid for `*schema.Message` path (Assistant or Tool) +- `AgenticRole`: Only valid for `*schema.AgenticMessage` path -```go -// github.com/cloudwego/eino/adk/flow.go -func AgentWithOptions(ctx context.Context, agent Agent, opts ...AgentOption) Agent -``` - -Currently built-in supported configurations in Eino ADK include: - -- `WithDisallowTransferToParent`: Configures that this SubAgent is not allowed to Transfer to ParentAgent, which will trigger the SubAgent's `OnDisallowTransferToParent` callback method -- `WithHistoryRewriter`: When configured, the Agent will rewrite the received context information through this method before execution - -# AgentEvent +## AgentAction -AgentEvent is the core event data structure produced by the Agent during its run. It contains the Agent's metadata, output, actions, and errors: +Behavior signals that control multi-Agent collaboration: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentEvent struct { - AgentName string - - RunPath []RunStep - - Output *AgentOutput - - Action *AgentAction - - Err error +type AgentAction struct { + Exit bool + Interrupted *InterruptInfo + TransferToAgent *TransferToAgentAction // NOT RECOMMENDED + BreakLoop *BreakLoopAction + CustomizedAction any } - -// EventFromMessage constructs a regular event -func EventFromMessage(msg Message, msgStream MessageStream, role schema.RoleType, toolName string) *AgentEvent ``` -## AgentName & RunPath - -The `AgentName` and `RunPath` fields are automatically filled by the framework. They provide important context information about the event source, which is crucial in complex systems composed of multiple Agents. 
+- **Interrupted**: Interrupts Runner execution, carries custom data, supports subsequent Resume +- **BreakLoop**: Terminates the LoopAgent's loop +- **Exit**: Immediately exits the multi-Agent system +- **TransferToAgent**: (Not recommended) Task transfer, use AgentAsTool instead -```go -type RunStep struct { - agentName string -} -``` +# AgentRunOption -- `AgentName` indicates which Agent instance produced the current AgentEvent. -- `RunPath` records the complete call chain to reach the current Agent. `RunPath` is a slice of `RunStep` that sequentially records all `AgentNames` from the initial entry Agent to the current Agent producing the event. - -## AgentOutput +Request-level Agent configuration. ADK built-in options: -`AgentOutput` encapsulates the output produced by the Agent. +- `WithSessionValues(map[string]any)`: Inject KV data shared across Agents +- `WithCallbacks(...callbacks.Handler)`: Add callback handlers +- `WithCancel()`: Enable Agent Cancel capability (see [Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)) -Message output is set in the MessageOutput field, while other types of custom output are set in the CustomizedOutput field: +Custom Options: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentOutput struct { - MessageOutput *MessageVariant - - CustomizedOutput any +type myOptions struct { + modelName string } -type MessageVariant struct { - IsStreaming bool +func WithModelName(name string) adk.AgentRunOption { + return adk.WrapImplSpecificOptFn(func(t *myOptions) { + t.modelName = name + }) +} - Message Message - MessageStream MessageStream - // message role: Assistant or Tool - Role schema.RoleType - // only used when Role is Tool - ToolName string +// Read in Run +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + o := adk.GetImplSpecificOptions(&myOptions{}, opts...) + // use o.modelName ... 
} ``` -The `MessageVariant` type of the `MessageOutput` field is a core data structure with main functions: - -1. Unified handling of streaming and non-streaming messages: `IsStreaming` is a flag. When true, it indicates the current `MessageVariant` contains a streaming message (read from MessageStream). When false, it indicates a non-streaming message (read from Message): - - - Streaming: Returns a series of message fragments progressively over time, eventually forming a complete message (MessageStream). - - Non-streaming: Returns a complete message at once (Message). -2. Convenient metadata access: The Message struct contains some important metadata, such as the message's Role (Assistant or Tool). To quickly identify message type and source, MessageVariant elevates these commonly used metadata to the top level: - - - `Role`: The role of the message, Assistant / Tool - - `ToolName`: If the message role is Tool, this field directly provides the tool's name. - -The benefit of this is that when code needs to route or make decisions based on message type, it doesn't need to deeply parse the specific content of the Message object - it can directly get the needed information from MessageVariant's top-level fields, simplifying logic and improving code readability and efficiency. 
- -## AgentAction - -When an Agent produces an Event containing AgentAction, it can control multi-Agent collaboration, such as immediate exit, interrupt, jump, etc.: +`DesignateAgent` restricts an Option to a specific Agent: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentAction struct { - Exit bool - - Interrupted *InterruptInfo - - TransferToAgent *TransferToAgentAction - - BreakLoop *BreakLoopAction - - CustomizedAction any -} - -type InterruptInfo struct { - Data any -} - -type TransferToAgentAction struct { - DestAgentName string -} +opt := adk.WithSessionValues(map[string]any{"key": "val"}).DesignateAgent("agent_1") ``` -Eino ADK currently has four preset Actions: +# AsyncIterator -1. Exit: When an Agent produces an Exit Action, the Multi-Agent will exit immediately +The asynchronous event iterator returned by `Run`: ```go -func NewExitAction() *AgentAction { - return &AgentAction{Exit: true} +iter := agent.Run(ctx, input) +for { + event, ok := iter.Next() + if !ok { + break + } + // handle event } ``` -2. Transfer: When an Agent produces a Transfer Action, it will jump to the target Agent to run +`Next()` blocks until a new event arrives or iteration ends. Agent implementations typically write to a Generator in a goroutine and immediately return the Iterator: ```go -func NewTransferToAgentAction(destAgentName string) *AgentAction { - return &AgentAction{TransferToAgent: &TransferToAgentAction{DestAgentName: destAgentName}} +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer gen.Close() + // Execute logic, produce events via gen.Send(event) + }() + return iter } ``` -3. Interrupt: When an Agent produces an Interrupt Action, it will interrupt the Runner's execution. 
Since interrupts can occur at any position and need to pass unique information when interrupting, the Action provides an `Interrupted` field for Agents to set custom data. When the Runner receives an Action with non-empty Interrupted, it considers an interrupt has occurred. The internal mechanism of Interrupt & Resume is relatively complex and will be elaborated in the [Eino ADK: Agent Runner] - [Eino ADK: Interrupt & Resume] section. - -```go -// For example, when ChatModelAgent interrupts, it sends the following AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, - }, -}}) -``` - -4. Break Loop: When a child Agent of LoopAgent emits a BreakLoopAction, the corresponding LoopAgent will stop looping and exit normally. - # Language Settings -ADK provides a `SetLanguage` function to set the language for built-in prompts. This affects the language of prompts generated by all ADK built-in components and middleware. This capability was introduced in [alpha/08](https://github.com/cloudwego/eino/releases/tag/v0.8.0-alpha.13) version. 
- -## API - ```go -// Language represents the language setting for ADK built-in prompts -type Language uint8 - -const ( - // LanguageEnglish represents English (default) - LanguageEnglish Language = iota - // LanguageChinese represents Chinese - LanguageChinese -) - -// SetLanguage sets the language for ADK built-in prompts -// The default language is English (if not explicitly set) -func SetLanguage(lang Language) error +adk.SetLanguage(adk.LanguageChinese) // or adk.LanguageEnglish (default) ``` -## Usage Example - -```go -import "github.com/cloudwego/eino/adk" - -// Set to Chinese -err := adk.SetLanguage(adk.LanguageChinese) -if err != nil { - // Handle error -} - -// Set to English (default) -err = adk.SetLanguage(adk.LanguageEnglish) -``` - -## Scope of Effect - -Language settings affect the built-in prompts of the following components: - - - - - - - -
| Component/Middleware | Affected Prompts |
| --- | --- |
| FileSystem Middleware | File system tool descriptions, system prompts, execution tool prompts |
| Reduction Middleware | Tool result truncation/cleanup prompt text |
| Skill Middleware | Skill system prompts, skill tool descriptions |
| ChatModelAgent | Built-in system prompts |
    - -> 💡 -> It is recommended to set the language during program initialization because the language setting takes effect globally. Changing the language at runtime may result in mixed-language prompts within the same session. +Affects ADK built-in prompts (FileSystem, Reduction, Skill, ChatModelAgent, and other components). Recommended to set during program initialization. > 💡 -> The language setting only affects ADK built-in prompts. Your custom prompts (such as Agent's Instruction) need to handle internationalization on your own. +> The language setting only affects ADK built-in prompts. Custom Instructions need to handle internationalization on their own. diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_preview.md b/content/en/docs/eino/core_modules/eino_adk/agent_preview.md index 4dcd230afb4..83396581f62 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_preview.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_preview.md @@ -1,162 +1,80 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino ADK: Overview' +title: Overview weight: 2 --- # What is Eino ADK? -Eino ADK, inspired by [Google-ADK](https://google.github.io/adk-docs/agents/), provides a flexible composition framework for Agent development in Go, i.e., an Agent and Multi-Agent development framework. Eino ADK has accumulated common capabilities for multi-Agent interaction, including context passing, event stream distribution and conversion, task control transfer, interrupt and resume, and common aspects. It is widely applicable, model-agnostic, and deployment-agnostic, making Agent and Multi-Agent development simpler and more convenient while providing comprehensive production-grade application governance capabilities. +Eino ADK is a Go-based Agent development framework, providing: -Eino ADK aims to help developers develop and manage Agent applications. 
It provides a flexible and robust development environment to help developers build various Agent applications such as conversational agents, non-conversational agents, complex tasks, workflows, and more. +- **ChatModelAgent**: A ReAct Agent with LLM as the decision-maker, supporting tool calls, autonomous reasoning, and runtime enhancement (Middleware) +- **Workflow Agents**: Deterministic orchestration primitives (Sequential / Loop / Parallel) +- **Runner / TurnLoop**: Agent execution entry, supporting event streams, checkpoint/resume, and multi-turn preemption +- **Multi-Agent Collaboration**: AgentAsTool (recommended), Workflow composition -# ADK Framework +Widely applicable, model-agnostic, and deployment-agnostic. -The overall module structure of Eino ADK is shown in the diagram below: - - +# ADK Architecture ## Agent Interface -The core of Eino ADK is the Agent abstraction (Agent Interface). All ADK functionality is designed around the Agent abstraction. For details, see [Eino ADK: Agent Interface](/docs/eino/core_modules/eino_adk/agent_interface) +All ADK functionality revolves around the `Agent` interface: ```go type Agent interface { Name(ctx context.Context) string Description(ctx context.Context) string - - // Run runs the agent. - // The returned AgentEvent within the AsyncIterator must be safe to modify. - // If the returned AgentEvent within the AsyncIterator contains MessageStream, - // the MessageStream MUST be exclusive and safe to be received directly. - // NOTE: it's recommended to use SetAutomaticClose() on the MessageStream of AgentEvents emitted by AsyncIterator, - // so that even the events are not processed, the MessageStream can still be closed. Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] } ``` -The definition of `Agent.Run` is: - -1. Get task details and related data from the input AgentInput, AgentRunOption, and optional Context Session -2. 
Execute the task and write the execution process and results to the AgentEvent Iterator - -`Agent.Run` requires the Agent implementation to execute asynchronously in a Future pattern. The core is divided into three steps. For specifics, refer to the implementation of the Run method in ChatModelAgent: +Semantics of `Run`: -1. Create a pair of Iterator and Generator -2. Start the Agent's asynchronous task and pass in the Generator to process AgentInput. The Agent executes core logic in this asynchronous task (e.g., ChatModelAgent calls LLM) and writes new events to the Generator for the Agent caller to consume from the Iterator -3. Return the Iterator immediately after starting the task in step 2 +1. Get task information from `AgentInput` and Context +2. Execute the task asynchronously, writing produced events to `AsyncIterator` +3. Return the Iterator immediately after starting the async task (Future pattern) -## Multi-Agent Collaboration +## ChatModelAgent -Around the Agent abstraction, Eino ADK provides various simple, easy-to-use composition primitives for rich scenarios, supporting the development of diverse Multi-Agent collaboration strategies such as Supervisor, Plan-Execute, Group-Chat, and other Multi-Agent scenarios. This enables different Agent division of labor and cooperation patterns to handle more complex tasks. For details, see [Eino ADK: Agent Collaboration](/docs/eino/core_modules/eino_adk/agent_collaboration) +The core implementation of ADK. Uses a ChatModel as the decision-maker and autonomously drives problem-solving through a ReAct Loop. -The collaboration primitives defined by Eino ADK during Agent collaboration are as follows: +**ChatModelAgent = ChatModel + Tools + ReAct Loop + Middleware** -- Collaboration methods between Agents +For detailed introduction, see: [Eino ADK: ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) - - - - -
| Collaboration Method | Description |
| --- | --- |
| Transfer | Directly transfer the task to another Agent. The current Agent exits after execution and does not care about the task execution status of the transferred Agent |
| ToolCall(AgentAsTool) | Call an Agent as a ToolCall, wait for the Agent's response, and obtain the output result of the called Agent for the next round of processing |
    +## Multi-Agent Collaboration -- Context strategies for AgentInput +> 💡 +> Recommended approach: **AgentAsTool** — Convert a sub-Agent to a Tool, and the parent Agent calls it via ToolCall and obtains the result. This is the most flexible and composable collaboration pattern. - - - + + +
| Context Strategy | Description |
| --- | --- |
| Upstream Agent Full Dialogue | Get the complete dialogue record of this Agent's upstream Agent |
| New Task Description | Ignore the complete dialogue record of the upstream Agent and provide a new task summary as the sub-Agent's AgentInput |

| Collaboration Approach | Mechanism | Applicable Scenarios |
| --- | --- | --- |
| AgentAsTool (Recommended) | Sub-Agent wrapped as Tool, parent Agent autonomously decides whether to call | Delegating subtasks, capability composition |
| Workflow | Sequential / Loop / Parallel deterministic orchestration | Multi-step tasks with fixed processes |
    -- Decision Autonomy +See: [Agent Collaboration](/docs/eino/core_modules/eino_adk/agent_collaboration) - - - - -
| Decision Autonomy | Description |
| --- | --- |
| Autonomous Decision | Inside the Agent, based on its available downstream Agents, when assistance is needed, autonomously select downstream Agents for assistance. Generally, the Agent makes decisions based on LLM internally, but even if selection is based on preset logic, it is still considered autonomous decision from outside the Agent |
| Preset Decision | Pre-set the next Agent after an Agent executes a task. The execution order of Agents is predetermined and predictable |
    +## Runner -Around the collaboration primitives, Eino ADK provides the following Agent composition primitives: +Runner is the execution entry for Agents. The following features are only available when running through Runner: - - - - - - - -
    TypeDescriptionRun ModeCollaboration MethodContext StrategyDecision Autonomy
    SubAgentsUse the user-provided agent as the Parent Agent and the user-provided subAgents list as Child Agents to form an autonomously deciding Agent, where Name and Description serve as the Agent's name identifier and description.
  • Currently, each Agent can have only one Parent Agent
  • Use the SetSubAgents function to build a Multi-Agent system shaped as a "multi-branch tree"
  • Within this "multi-branch tree", every AgentName must be unique
  • TransferUpstream Agent Full DialogueAutonomous Decision
    SequentialCombine the user-provided SubAgents list into a Sequential Agent that executes in order, where Name and Description serve as the Sequential Agent's name identifier and description. When the Sequential Agent executes, it runs the SubAgents list in order until all Agents have been executed.TransferUpstream Agent Full DialoguePreset Decision
    ParallelCombine the user-provided SubAgents list into a Parallel Agent that executes concurrently based on the same context, where Name and Description serve as the Parallel Agent's name identifier and description. When the Parallel Agent executes, it runs the SubAgents list concurrently and ends after all Agents complete execution.TransferUpstream Agent Full DialoguePreset Decision
    LoopExecute the user-provided SubAgents list in array order, cycling repeatedly, to form a Loop Agent, where Name and Description serve as the Loop Agent's name identifier and description. When the Loop Agent executes, it runs the SubAgents list in sequence and repeats the cycle until a termination condition is met.TransferUpstream Agent Full DialoguePreset Decision
    AgentAsToolConvert an Agent into a Tool to be used by other Agents as a regular Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided in Eino ADK supports the AgentAsTool functionalityToolCallNew Task DescriptionAutonomous Decision
    - -## ChatModelAgent - -`ChatModelAgent` is Eino ADK's key implementation of Agent. It encapsulates the interaction logic with large language models, implements a ReAct paradigm Agent, orchestrates the ReAct Agent control flow based on Graph in Eino, and exports events generated during ReAct Agent execution through callbacks.Handler, converting them to AgentEvent for return. - -To learn more about ChatModelAgent, see: [Eino ADK: ChatModelAgent](/docs/eino/core_modules/eino_adk/agent_implementation/chat_model) +- **Event stream output**: Query/Run → AsyncIterator[AgentEvent] +- **Checkpoint / Resume**: Persist running state, support interrupt recovery +- **TurnLoop**: Multi-turn runtime, Push/Preempt/Stop ```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. 
When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int -} +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // Optional +}) -func NewChatModelAgent(_ context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) { - // omit code -} +iter := runner.Query(ctx, "Your question") ``` -# AgentRunner - -AgentRunner is the executor for Agents, providing support for extended functionality required by Agent execution. For details, see: [Eino ADK: Agent Extension](/docs/eino/core_modules/eino_adk/agent_extension) - -Only when executing agents through Runner can you use the following ADK features: - -- Interrupt & Resume -- Aspect mechanism (supported in 1226 test version, API compatibility not guaranteed before official release) -- Context environment preprocessing - - ```go - type RunnerConfig struct { - Agent Agent - EnableStreaming bool - - CheckPointStore compose.CheckPointStore - } - - func NewRunner(_ context.Context, conf RunnerConfig) *Runner { - // omit code - } - ``` +See: [Agent Runner and Extension](/docs/eino/core_modules/eino_adk/agent_extension) | [Agent Cancel and TurnLoop](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md b/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md index 5560ebacc43..6ce92d1a8dc 100644 --- a/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md +++ b/content/en/docs/eino/core_modules/eino_adk/agent_quickstart.md @@ -1,93 +1,55 @@ --- Description: "" -date: "2026-01-30" +date: "2026-05-19" lastmod: "" tags: [] -title: 'Eino ADK: Quickstart' +title: Quickstart weight: 1 --- # Installation -Eino provides 
ADK from `v0.5.0`. Upgrade your project: +Eino ADK is available since v0.5.0, with v0.9.0 as the current recommended version: ```go -// stable >= eino@v0.5.0 go get github.com/cloudwego/eino@latest ``` -# Agent +# Core Concepts -### What is Eino ADK +**Eino ADK** is a Go-based Agent development framework. The core primitive is **ChatModelAgent** — an intelligent agent that uses a ChatModel as its decision-maker, Tools as its action space, and autonomously drives problem-solving through a ReAct Loop. -Eino ADK, inspired by [Google‑ADK](https://google.github.io/adk-docs/agents/), is a Go framework for building Agent and Multi‑Agent applications. It standardizes context passing, event streaming, task transfer, interrupts/resume, and cross‑cutting features. +> 💡 +> If you only read one document, read: [Eino ADK: ChatModelAgent Introduction](/docs/eino/overview/eino_adk_quickstart) -### What is an Agent - -An Agent is the core of Eino ADK, representing an independent, executable intelligent task unit. You can think of it as an "intelligent entity" that can understand instructions, execute tasks, and provide responses. Each Agent has a clear name and description, making it discoverable and callable by other Agents. - -Any scenario requiring interaction with a Large Language Model (LLM) can be abstracted as an Agent. 
For example: - -- An Agent for querying weather information -- An Agent for booking meetings -- An Agent capable of answering domain‑specific questions - -### Agent in ADK - -All features in Eino ADK are designed around the Agent abstraction: - -```go -type Agent interface { - Name(ctx context.Context) string - Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput) *AsyncIterator[*AgentEvent] -} -``` - -Based on the Agent abstraction, ADK provides three base extension categories: - -- `ChatModel Agent`: The "thinking" part of the application, using LLM as its core to understand natural language, perform reasoning, planning, generate responses, and dynamically decide how to execute or which tools to use. -- `Workflow Agents`: The coordination and management part of the application, controlling sub-Agent execution flow based on predefined logic according to their type (sequential/parallel/loop). Workflow Agents produce deterministic, predictable execution patterns, unlike the dynamic random decisions generated by ChatModel Agent. - - Sequential (Sequential Agent): Execute sub-Agents in order - - Loop (Loop Agent): Repeatedly execute sub-Agents until a specific termination condition is met - - Parallel (Parallel Agent): Execute multiple sub-Agents concurrently -- `Custom Agent`: Implement your own Agent through the interface, allowing highly customized complex Agents - -Based on these base extensions, you can combine these basic Agents according to your needs to build the Multi-Agent system you require. Additionally, Eino provides several out-of-the-box Multi-Agent best practice paradigms based on daily practical experience: - -- Supervisor: Supervisor mode, where the Supervisor Agent controls all communication flows and task delegation, deciding which Agent to call based on current context and task requirements. 
-- Plan-Execute: Plan-Execute mode, where the Plan Agent generates a plan with multiple steps, and the Execute Agent completes tasks based on user query and the plan. After execution, Plan is called again to decide whether to complete the task or replan. - -The table and diagram below provide the characteristics, differences, and relationships of these base extensions and encapsulations. Subsequent chapters will detail the principles and specifics of each type: +## Component Map - - - - + + + + + +
| Category | ChatModel Agent | Workflow Agents | Custom Logic | EinoBuiltInAgent (supervisor, plan‑execute) |
| --- | --- | --- | --- | --- |
| Function | Thinking, generation, tool calls | Control execution flow among agents | Run custom logic | Out‑of‑the‑box multi‑agent pattern encapsulation |
| Core | LLM | Predetermined execution flows (sequential/parallel/loop) | Custom code | High‑level encapsulation based on Eino practical experience |
| Purpose | Generation, dynamic decisions | Structured processing, orchestration | Customization needs | Turnkey solutions for specific scenarios |

| Component | Responsibility | Documentation |
| --- | --- | --- |
| ChatModelAgent | ReAct Loop: Reasoning → Action → Feedback, autonomous decision-making | ChatModelAgent Introduction |
| Middleware | Inject behavior at lifecycle points of the ReAct Loop (compression, search, retry, etc.) | ChatModelAgentMiddleware |
| Runner | Single Agent run entry: Query / Run → event stream | Agent Runner and Extension |
| TurnLoop | Multi-turn runtime: Push / Preempt / Stop + declarative checkpoint/resume | Agent Cancel and TurnLoop |
| DeepAgents | Pre-built Agent: task planning (PlanTask) + subtask delegation (TaskTool) | DeepAgents |
    - +## Other Agent Types -# ADK Examples +Besides ChatModelAgent, ADK also provides deterministic orchestration primitives: -The [Eino‑examples](https://github.com/cloudwego/eino-examples/tree/main/adk) project provides various ADK implementation examples. You can refer to the example code and descriptions to build an initial understanding of ADK capabilities: +- **Workflow Agents**: Sequential / Loop / Parallel Agent, for structured orchestration of predefined processes. +- **Custom Agent**: Implement the `Agent` interface to integrate with the framework. - - - - - - - - - -
    Project PathIntroductionDiagram
    Sequential workflow exampleThis example code demonstrates a sequential multi-agent workflow built using Eino ADK's Workflow paradigm.
  • Sequential workflow construction: Create a sequential execution agent named ResearchAgent via adk.NewSequentialAgent, containing two sub-agents (SubAgents) PlanAgent and WriterAgent, responsible for research plan formulation and report writing respectively.
  • Clear sub-agent responsibilities: PlanAgent receives research topics and generates detailed, logically clear research plans; WriterAgent writes structurally complete academic reports based on the research plan.
  • Chained input/output: PlanAgent's output research plan serves as WriterAgent's input, forming a clear upstream-downstream data flow, reflecting the sequential dependency of business steps.
  • Loop workflow exampleThis example code builds a reflection-iteration agent framework based on Eino ADK's Workflow paradigm using LoopAgent.
  • Iterative reflection framework: Create ReflectionAgent via adk.NewLoopAgent, containing two sub-agents MainAgent and CritiqueAgent, supporting up to 5 iterations, forming a closed loop of main task solving and critical feedback.
  • MainAgent: Responsible for generating initial solutions based on user tasks, pursuing accurate and complete answer output.
  • CritiqueAgent: Performs quality review on MainAgent's output, provides improvement feedback, terminates the loop if results are satisfactory, and provides final summary.
  • Loop mechanism: Utilizes LoopAgent's iteration capability to continuously optimize solutions through multiple rounds of reflection, improving output quality and accuracy.
  • Parallel workflow exampleThis example code builds a concurrent information collection framework based on Eino ADK's Workflow paradigm using ParallelAgent:
  • Concurrent execution framework: Create DataCollectionAgent via adk.NewParallelAgent, containing multiple information collection sub-agents.
  • Sub-agent responsibility allocation: Each sub-agent is responsible for information collection and analysis from one channel, with no interaction needed between them, clear functional boundaries.
  • Concurrent execution: Parallel Agent can simultaneously start information collection tasks from multiple data sources, significantly improving processing efficiency compared to serial approaches.
  • supervisorThis use case employs a single-layer Supervisor managing two relatively comprehensive sub-Agents: Research Agent handles retrieval tasks, Math Agent handles various mathematical operations (add, multiply, divide), but all math operations are uniformly processed within the same Math Agent rather than being split into multiple sub-Agents. This design simplifies the agent hierarchy, suitable for scenarios where tasks are relatively concentrated and don't require excessive decomposition, facilitating rapid deployment and maintenance.
    layered‑supervisorThis use case implements a multi-tier intelligent agent supervision system, where the top-level Supervisor manages Research Agent and Math Agent, and Math Agent is further subdivided into three sub-Agents: Subtract, Multiply, and Divide. The top-level Supervisor is responsible for assigning research tasks and math tasks to lower-level Agents, while Math Agent as a mid-tier supervisor further dispatches specific math operation tasks to its sub-Agents.
  • Multi-tier agent structure: Implements a top-level Supervisor Agent managing two sub-agents — Research Agent (responsible for information retrieval) and Math Agent (responsible for mathematical operations).
  • Math Agent internally subdivides into three sub-agents: Subtract Agent, Multiply Agent, and Divide Agent, handling subtraction, multiplication, and division operations respectively, reflecting multi-level supervision and task delegation.
  • This hierarchical management structure reflects fine-grained decomposition of complex tasks and multi-level task delegation, suitable for scenarios with clear task classification and computational complexity.
    plan‑execute exampleThis example implements a multi-Agent travel planning system using the plan-execute-replan pattern based on Eino ADK. The core function is to process complex user travel requests (such as "3-day Beijing trip, need flights from New York, hotel recommendations, must-see attractions") through a "plan-execute-replan" loop to complete tasks: 1. Plan: `Planner Agent` generates a step-by-step execution plan based on the large model (e.g., "Step 1: check Beijing weather, Step 2: search New York to Beijing flights"); 2. Execute: `Executor Agent` calls mock tools **weather (get_weather), flights (search_flights), hotels (search_hotels), attractions (search_attractions)** to execute each step. If user input information is missing (e.g., budget not specified), it calls `ask_for_clarification` tool to ask follow-up questions; 3. Replan: `Replanner Agent` evaluates whether the plan needs adjustment based on tool execution results (e.g., if no flight tickets available, reselect dates). Execute and Replan continuously loop until all steps in the plan are completed; 4. Supports session trajectory tracking (CozeLoop callback) and state management, ultimately outputting a complete travel plan. Structurally, plan-execute-replan has two layers:
  • Layer 2 is a loop agent composed of execute + replan agent, meaning after replan, re-execution may be needed (after replanning, need to query travel information / request user to continue clarifying questions)
  • Layer 1 is a sequential agent composed of plan agent + Layer 2 loop agent, meaning plan executes only once, then hands over to the loop agent for execution
  • book recommendation agent (interrupt and resume)This code demonstrates a book recommendation chat agent implementation built on the Eino ADK framework, showcasing Agent interrupt and resume functionality.
  • Agent construction: Create a chat agent named BookRecommender via adk.NewChatModelAgent for recommending books based on user requests.
  • Tool integration: Integrates two tools — BookSearch tool for searching books and AskForClarification tool for asking clarifying information, supporting multi-turn interaction and information supplementation.
  • State management: Implements simple in-memory CheckPoint storage, supporting session breakpoint continuation to ensure context continuity.
  • Event-driven: Obtains event streams by iterating runner.Query and runner.Resume, handling various events and errors during execution.
  • Custom input: Supports dynamic user input reception, using tool options to pass new query requests, flexibly driving task flow.
  • +> 💡 +> Graph (deterministic orchestration) and Agent (autonomous decision-making) are two different forms of AI applications. When the core problem is "autonomous decision-making + runtime enhancement", ChatModelAgent is recommended. See "Why not continue using flow/react" in the ChatModelAgent Introduction. -# What's Next +# Examples -After this Quickstart overview, you should have a basic understanding of Eino ADK and Agents. +[eino-examples/adk](https://github.com/cloudwego/eino-examples/tree/main/adk) provides complete ADK example code: -The following articles will dive deep into ADK core concepts to help you understand how Eino ADK works and use it more effectively: +- **ChatModelAgent Intro**: [chatmodel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/chatmodel) — Book recommendation Agent with interrupt and resume +- **DeepAgents**: [deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) — Task planning + subtask delegation +- **Workflow**: [sequential](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/sequential) / [loop](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/loop) / [parallel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/parallel) +- **Multi-Agent**: [supervisor](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/supervisor) / [plan-execute](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/plan-execute-replan) - +# What's Next diff --git a/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md b/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md index b91b261e0ab..af6940703ba 100644 --- a/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md +++ b/content/en/docs/eino/core_modules/flow_integration_components/react_agent_manual.md @@ -1,9 +1,9 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-17" 
lastmod: "" tags: [] -title: 'Eino: ReAct Agent Manual' +title: ReAct Agent Manual weight: 1 --- diff --git a/content/en/docs/eino/ecosystem_integration/_index.md b/content/en/docs/eino/ecosystem_integration/_index.md index 7c831883a7e..c7f22c2e141 100644 --- a/content/en/docs/eino/ecosystem_integration/_index.md +++ b/content/en/docs/eino/ecosystem_integration/_index.md @@ -1,67 +1,8 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: Component Integration' -weight: 6 +title: 'Component Integration' +weight: 5 --- - -## Component Integration - -### ChatModel - -- openai: [OpenAI](/docs/eino/ecosystem_integration/chat_model/agentic_model_openai) -- ark: [ARK](/docs/eino/ecosystem_integration/chat_model/agentic_model_ark) -- More components: [ChatModel component list](/docs/eino/ecosystem_integration/chat_model) - -### Document - -#### Loader - -- file: [Loader - local file](/docs/eino/ecosystem_integration/document/loader_local_file) -- s3: [Loader - amazon s3](/docs/eino/ecosystem_integration/document/loader_amazon_s3) -- web url: [Loader - web url](/docs/eino/ecosystem_integration/document/loader_web_url) - -#### Parser - -- html: [Parser - html](/docs/eino/ecosystem_integration/document/parser_html) -- pdf: [Parser - pdf](/docs/eino/ecosystem_integration/document/parser_pdf) - -#### Transformer - -- markdown splitter: [Splitter - markdown](/docs/eino/ecosystem_integration/document/splitter_markdown) -- recursive splitter: [Splitter - recursive](/docs/eino/ecosystem_integration/document/splitter_recursive) -- semantic splitter: [Splitter - semantic](/docs/eino/ecosystem_integration/document/splitter_semantic) - -### Embedding - -- ark: [Embedding - ARK](/docs/eino/ecosystem_integration/embedding/embedding_ark) -- openai: [Embedding - OpenAI](/docs/eino/ecosystem_integration/embedding/embedding_openai) - -### Indexer - -- volc vikingdb: [Indexer - volc 
VikingDB](/docs/eino/ecosystem_integration/indexer/indexer_volc_vikingdb) -- Milvus 2.5+: [Indexer - Milvus 2 (v2.5+)](/docs/eino/ecosystem_integration/indexer/indexer_milvusv2) -- Milvus 2.4: [Indexer - Milvus](/docs/eino/ecosystem_integration/indexer/indexer_milvus) -- OpenSearch 3: [Indexer - OpenSearch 3](/docs/eino/ecosystem_integration/indexer/indexer_opensearch3) -- OpenSearch 2: [Indexer - OpenSearch 2](/docs/eino/ecosystem_integration/indexer/indexer_opensearch2) -- ElasticSearch 9: [Indexer - Elasticsearch 9](/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch9) -- Elasticsearch 8: [Indexer - ES8](/docs/eino/ecosystem_integration/indexer/indexer_es8) -- ElasticSearch 7: [Indexer - Elasticsearch 7 ](/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch7) - -### Retriever - -- volc vikingdb: [Retriever - volc VikingDB](/docs/eino/ecosystem_integration/retriever/retriever_volc_vikingdb) -- Milvus 2.5+: [Retriever - Milvus 2 (v2.5+) ](/docs/eino/ecosystem_integration/retriever/retriever_milvusv2) -- Milvus 2.4: [Retriever - Milvus](/docs/eino/ecosystem_integration/retriever/retriever_milvus) -- OpenSearch 3: [Retriever - OpenSearch 3](/docs/eino/ecosystem_integration/retriever/retriever_opensearch3) -- OpenSearch 2: [Retriever - OpenSearch 2](/docs/eino/ecosystem_integration/retriever/retriever_opensearch2) -- ElasticSearch 9: [Retriever - Elasticsearch 9](/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch9) -- ElasticSearch 8: [Retriever - ES8](/docs/eino/ecosystem_integration/retriever/retriever_es8) -- ElasticSearch 7: [Retriever - ES 7](/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch7) - -### Tools - -- googlesearch: [Tool - Googlesearch](/docs/eino/ecosystem_integration/tool/tool_googlesearch) -- duckduckgo search: [Tool - DuckDuckGoSearch](/docs/eino/ecosystem_integration/tool/tool_duckduckgo_search) diff --git a/content/en/docs/eino/ecosystem_integration/chat_model/_index.md 
b/content/en/docs/eino/ecosystem_integration/chat_model/_index.md index 869e343e0f9..2324b5ad0f6 100644 --- a/content/en/docs/eino/ecosystem_integration/chat_model/_index.md +++ b/content/en/docs/eino/ecosystem_integration/chat_model/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-19" lastmod: "" tags: [] title: ChatModel @@ -31,3 +31,24 @@ For detailed documentation of each component in this category, please refer to t - The links above point directly to the latest documentation in the GitHub repository - Chinese and English documentation are updated synchronously - To view historical versions or submit documentation suggestions, please visit the GitHub repository + +# AgenticModel Component List + +For detailed documentation of each component in this category, please refer to the GitHub README: + + + + + + + + +
| Component Name | Chinese Docs | English Docs |
| --- | --- | --- |
| AgenticARK | README.zh_CN.md | README.md |
| AgenticDeepSeek | README_zh.md | README.md |
| AgenticOpenAI | README.zh_CN.md | README.md |
| AgenticGemini | README.zh_CN.md | README.md |
| AgenticQwen | README_zh.md | README.md |
    + +--- + +**Notes**: + +- The links above point directly to the latest documentation in the GitHub repository +- AgenticModel is a model interface designed for agentic scenarios, supporting advanced capabilities such as Server Tools, MCP Tools, and prefix caching +- To view historical versions or submit documentation suggestions, please visit the GitHub repository diff --git a/content/en/docs/eino/overview/_index.md b/content/en/docs/eino/overview/_index.md index 452d7af88d5..412ef9e8e2f 100644 --- a/content/en/docs/eino/overview/_index.md +++ b/content/en/docs/eino/overview/_index.md @@ -344,10 +344,13 @@ The Eino framework consists of several parts: - [Eino](https://github.com/cloudwego/eino): Contains type definitions, stream data processing mechanisms, component abstraction definitions, orchestration functionality, callback mechanisms, etc. - [EinoExt](https://github.com/cloudwego/eino-ext): Component implementations, callback handler implementations, component usage examples, and various tools such as evaluators, prompt optimizers, etc. +> 💡 +> For components used internally at ByteDance, there are corresponding internal code repositories: + - [Eino Devops](https://github.com/cloudwego/eino-ext/tree/main/devops): Visual development, visual debugging, etc. - [EinoExamples](https://github.com/cloudwego/eino-examples): A code repository containing example applications and best practices. 
-See: [Eino Framework Structure Description](/docs/eino/overview/Eino Framework Structure Description) +See: [Eino Framework Structure Description](/docs/eino/overview/eino_architecture) ## Detailed Documentation diff --git a/content/en/docs/eino/overview/eino_adk_quickstart.md b/content/en/docs/eino/overview/eino_adk_quickstart.md new file mode 100644 index 00000000000..98386505a5e --- /dev/null +++ b/content/en/docs/eino/overview/eino_adk_quickstart.md @@ -0,0 +1,255 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Get Started with Eino ADK in 5 Minutes +weight: 9 +--- + +This article is for developers already familiar with Eino, focusing on the most important autonomous decision-making primitive in ADK: **ChatModelAgent** and its runtime enhancement mechanism **ChatModelAgentMiddleware**. + +## Understanding ChatModelAgent + +When we talk about "Agent," we almost always mean: an entity powered by a large model as its core, equipped with tools, capable of autonomous decision-making and solving complex real-world problems. `ChatModelAgent` is Eino ADK's direct implementation of this concept. + +**ChatModelAgent = A ReAct Agent that uses ChatModel as the decision-maker, Tools as the action space, and tool feedback plus history as context for the next decision.** + +Four key components: + +1. **ChatModel**: The large model, responsible for reasoning and decision-making. +2. **Tools**: The tool collection, defining the range of actions the Agent can take. +3. **Feedback**: Tool execution results feed back into the model context, becoming the basis for the next decision. +4. **History**: Complete preservation of the reasoning trajectory, tool calls, and tool results throughout the problem-solving process. + +Therefore, `ChatModelAgent` is not a single model call, but a sustained problem-solving process. 
+ +## ChatModelAgent's Execution Structure: ReAct Loop + +`ChatModelAgent`'s core capability is **autonomous decision-making** — within a single `Run`, the model can repeatedly reason, act, and receive feedback until the problem is solved. The execution structure supporting this capability is the ReAct Loop. + +Autonomous decision-making requires four elements to coexist: + +1. **Decision-maker (ChatModel)**: Each round, based on current context, determines what to do next. +2. **Action space (Tools)**: Defines the concrete actions the Agent can take. +3. **Feedback signal (Tool Feedback)**: Action results are injected into the context, becoming the basis for subsequent decisions — this enables the Agent to correct course based on actual execution results rather than guessing everything at once. +4. **Accumulated context (History)**: Complete preservation of reasoning trajectory, tool calls, and tool results. Each round, the model doesn't see an isolated single query but the complete problem-solving process from start to current state. + +All four are indispensable: without a decision-maker there's no reasoning, without action space there's no execution, without feedback there's no correction, without accumulated context there's no informed judgment based on history. + + + +Key characteristic: **Accumulated context-driven progressive decision-making**. Each loop iteration doesn't start from scratch but continues on top of the complete trajectory of all prior reasoning and actions. Every model decision is made based on a continuously growing problem-solving context, enabling the Agent to handle complex tasks requiring multi-step reasoning, trial-and-error, and correction. + +## What Makes Your ChatModelAgent Different + +The ReAct Loop structure is fixed. So what makes **your** ChatModelAgent different from others, tailored to your specific problem? + +Four dimensions: + +1. **ChatModel** — Which model makes decisions. +2. 
**Instruction** — System instructions: role definition, behavioral constraints, few-shot examples. +3. **Tools** — Tool collection: determines what the Agent can do. +4. **Middleware (ChatModelAgentMiddleware)** — Inject behavior at specific lifecycle points of the ReAct Loop: intercept, modify, and enhance inputs and outputs within the loop. + +The first three define what the Agent "is" — decision capability, role constraints, action scope. + +Middleware defines how the Agent "runs" — it doesn't change the Loop's structure (reason → act → feedback always remains), but controls the specific runtime behavior of the loop. For example: compressing context before model calls, dynamically injecting tools before running, performing permission checks during tool calls, retrying or switching to backup models on failure. These are all runtime enhancements at specific loop points. + +## Middleware: Injecting Behavior into the ReAct Loop + +When building a ChatModelAgent, you'll encounter these typical problems: + +- **Agent needs to read/write files, execute commands?** → Need to inject a set of general-purpose tools before running. +- **Agent needs to reuse predefined instructions and knowledge?** → Need to package reusable capabilities as Skills, loaded on demand. +- **Context growing too long, exceeding model window?** → Need to automatically compress history before each model call. +- **Too many tools, stuffing all into prompt dilutes attention?** → Need to search and load tools on demand. +- **Model occasionally fails or returns garbage?** → Need automatic retry or backup model switching. + +The common thread: they don't need to change the ReAct Loop's structure, only intercept and enhance at specific points in the loop. This is what Middleware does. + +Corresponding built-in Middleware: + + + + + + + + +
| Scenario | Middleware | What It Does |
| --- | --- | --- |
| Need filesystem capabilities | FileSystem | Injects ls/read/write/edit/grep/execute tools before running |
| Reuse predefined capabilities | Skill | Packages instructions, knowledge, tools as skill units loadable on demand |
| Context exceeds window | Reduction / Summarization | Compresses messages and tool results before model calls |
| Too many tools | ToolSearch | Searches and loads Tools on demand rather than exposing all at once |
| Unstable model calls | ModelRetry / ModelFailover | Per-model-call retry / failover switching |
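To make the "unstable model calls" scenario concrete, here is a minimal, self-contained sketch of retry and failover expressed as wrappers around a single model call. `ModelFunc`, `WithRetry`, `WithFailover`, and `flaky` are hypothetical names invented for this illustration; they are not Eino types, and the real ModelRetry / ModelFailover behavior is configured through Eino's own APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// ModelFunc is a hypothetical stand-in for one model call (not an Eino type).
type ModelFunc func(prompt string) (string, error)

// WithRetry wraps a model call so it is attempted up to maxAttempts times.
func WithRetry(next ModelFunc, maxAttempts int) ModelFunc {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	return func(prompt string) (string, error) {
		var lastErr error
		for i := 0; i < maxAttempts; i++ {
			out, err := next(prompt)
			if err == nil {
				return out, nil
			}
			lastErr = err
		}
		return "", fmt.Errorf("after %d attempts: %w", maxAttempts, lastErr)
	}
}

// WithFailover calls backup only when primary returns an error.
func WithFailover(primary, backup ModelFunc) ModelFunc {
	return func(prompt string) (string, error) {
		out, err := primary(prompt)
		if err == nil {
			return out, nil
		}
		return backup(prompt)
	}
}

// flaky builds a fake model that fails its first `failures` calls.
func flaky(failures int, reply string) ModelFunc {
	calls := 0
	return func(prompt string) (string, error) {
		calls++
		if calls <= failures {
			return "", errors.New("transient model error")
		}
		return reply, nil
	}
}

func main() {
	// The primary fails more often than the retry budget allows,
	// so the call falls through to the backup model.
	primary := WithRetry(flaky(5, "primary answer"), 3)
	backup := flaky(0, "backup answer")
	model := WithFailover(primary, backup)

	out, err := model("hello")
	fmt.Println(out, err)
}
```

The point of the sketch is the shape: both behaviors wrap the call site and leave the surrounding loop untouched, which is the property the table attributes to Middleware.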
    + +Each Middleware implementation injects at a specific hook point in the ReAct Loop. The diagram below shows where `ChatModelAgentMiddleware` hooks are positioned in the loop: + + + +Hook point summary: + + + + + + + + + +
| Hook Point | Timing | Typical Use |
| --- | --- | --- |
| `BeforeAgent` | Before Agent runs (once only) | Enhance Instruction, inject Tools |
| `BeforeModelRewriteState` | Before each model call | Modify Messages / ToolInfos |
| `AfterModelRewriteState` | After each model call | Modify model response or patch state |
| `WrapModel` | Per-model-call level | Retry, failover, rewrite model returns |
| `WrapToolCall` | Per-tool-call level | Permissions, security, output rewriting |
| `AfterAgent` | After Agent completes successfully | Post-processing, state cleanup |
    + +See the appendix at the end for a complete Middleware quick reference. + +## Quick Start: Create and Run a ChatModelAgent + +`Runner` is the entry point for executing an Agent. It transforms a user request into a single Agent run, handling per-run configuration, event stream output, streaming toggles, and runtime capabilities like checkpoint/resume. The minimal usage is: put a `ChatModelAgent` into `RunnerConfig`, then call `Query` or `Run`. + +The following example shows how to create a minimal ChatModelAgent and execute it via Runner: + +```go +package main + +import ( + "context" + "fmt" + "log" + + "github.com/cloudwego/eino-ext/components/model/ark" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" +) + +func main() { + ctx := context.Background() + + // 1. Create ChatModel + chatModel, err := ark.NewChatModel(ctx, &ark.ChatModelConfig{ + Model: "doubao-seed-1-8-251228", + APIKey: "your_api_key", // Replace with your API Key + }) + if err != nil { + log.Fatal(err) + } + + // 2. Create ChatModelAgent + agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-assistant", + Description: "An assistant that can answer questions using tools.", + Instruction: "You are a helpful assistant. Please answer user questions using available tools.", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{ + // Register your tools, e.g., webSearchTool + }, + }, + }, + // Handlers: []adk.ChatModelAgentMiddleware{...}, // Register Middleware + }) + if err != nil { + log.Fatal(err) + } + + // 3. Execute Agent via Runner + runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + }) + + // 4. 
Send user request and consume event stream + iter := runner.Query(ctx, "Help me search for today's news") + for { + event, ok := iter.Next() + if !ok { + break + } + fmt.Println(event) + } +} +``` + +Core flow: `NewChatModelAgent` → `NewRunner` → `Runner.Query/Run` → consume `AsyncIterator` event stream. + +For more basic examples, see: [Eino: Quick Start](/docs/eino/quick_start). + +## Further Reading: DeepAgents + +DeepAgents is a pre-built ChatModelAgent whose core value lies in two preset Middleware: + +- **WriteTodos (PlanTask)**: Enables the main Agent to explicitly plan a task list before execution and continuously track progress during execution. Complex problems no longer rely on the model "thinking through everything at once" but instead decompose first, then progress step by step. +- **TaskTool**: Enables the main Agent to delegate subtasks to sub-Agents for independent execution, with results summarized back to the main loop. This allows a single Agent's capability boundary to be extended through composition. + +Additionally, DeepAgents comes with preset system prompts and optional FileSystem Middleware, ready out-of-the-box for scenarios requiring task planning and multi-Agent collaboration. + +``` +DeepAgents = ChatModelAgent + + WriteTodos (task planning and tracking) + + TaskTool (subtask delegation) + + Optional FileSystem + + Preset system prompts +``` + +Further reading: + +- Eino ADK Deep Agents complete guide: [Eino ADK: DeepAgents](/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) +- DeepAgents examples: [eino-examples/adk/multiagent/deep at main · cloudwego/eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) + +## Further Reading: Why Not Continue Using flow/react? + +Back to first principles: Graph and Agent are two fundamentally different AI application paradigms. 
+ +- **Graph**'s core is **determinism**: Developers predefine the topology, and node transitions are determined at compile time. Input is structured, output is predictable. +- **Agent**'s core is **autonomy**: The LLM dynamically decides the next action at runtime, execution paths are unpredictable, and output is a full-process event stream. + +`flow/react` is essentially using Graph's approach to "simulate" an Agent — unrolling the ReAct reasoning loop into static nodes and edges. This works, but is fundamentally a mismatch: using deterministic orchestration to carry dynamic decision-making. As Agent complexity grows, this mismatch creates systemic problems: + +1. **Deliverable mismatch**: Graph targets "final results," while Agent's deliverable is the full process (reasoning trajectory, intermediate tool calls, state changes). Using Graph for Agent means intermediate process can only be extracted through side channels like Callbacks — feasible, but a patch. +2. **Execution model mismatch**: Graph is a synchronous execution model, while Agents are naturally asynchronous long-running processes. Event stream output, checkpoint/resume, interrupt recovery, and other runtime capabilities need framework-level unified management at the Agent dimension, not scattered across Graph node callbacks. +3. **Extension point mismatch**: Agent runtime enhancements (context compression, dynamic tool loading, model retry, security control) are fundamentally interception and injection into the decision loop. In Graph, these capabilities have no unified mount point and are scattered across various nodes or edges; in ChatModelAgent, they have clear lifecycle hooks (Middleware). + +Therefore, flow/react isn't deprecated but returns to its best-fit position: **deterministic process orchestration**. When the core problem is "autonomous decision-making + runtime enhancement," the correct abstraction is `ChatModelAgent + ChatModelAgentMiddleware`. + +Further reading: + +- Agent or Graph? 
AI Application Route Analysis: [Agent or Graph? AI Application Route Analysis](/docs/eino/overview/graph_or_agent) + + + +## Appendix: Middleware Quick Reference + +### Instance Overview + + + + + + + + + + + + + + + + + +
| Middleware | Description |
| --- | --- |
| Reduction | Truncates overly long tool output / writes to filesystem, preventing token limit exceeded |
| Summarization | Historical message summary compression |
| Skill | Reusable instructions/knowledge exposed as Tools, Agent loads on demand |
| FileSystem | ls/read/write/edit/glob/grep/execute file operation tool set |
| ToolSearch | `tool_search` meta-tool, searches and loads Tools on demand (reduces resident tool list footprint) |
| PatchToolCall | Patches dangling tool calls in message history (missing tool results) |
| SafeTool | WrapToolCall-level interception of tool execution errors, converting to readable text returned to the model, allowing Agent to self-correct rather than abort |
| ModelRetry | Retries on model call failure per configured strategy [built-in config] |
| ModelFailover | Switches to backup model on model call failure [built-in config] |
| AgentsMD | Injects Agents.md knowledge file into model context, improving context quality |
| PlanTask | Persistent task management tool set (create/get/update/list), supports dependency tracking |
| WriteTodos | Lightweight TODO list tool, Agent can create and track structured to-do items [DeepAgent built-in] |
| TaskTool | Sub-Agent delegation tool, main Agent uses it to dispatch subtasks to sub-Agents for independent execution [DeepAgent built-in] |
| Permission | Tool call permission control [WIP] |
    + +> Note: ModelRetry / ModelFailover are built-in fields of `ChatModelAgentConfig` (`ModelRetryConfig` / `ModelFailoverConfig`) in code, conceptually corresponding to the `WrapModel` hook. SafeTool is an example pattern (see ChatWithEino ch05), implemented as user-defined Middleware. WriteTodos / TaskTool are DeepAgent built-ins, not exported separately. Permission is a planned capability. + +### Categories + + + + + + + + +
| Category | Problem Solved | Includes |
| --- | --- | --- |
| Extend general Tools | Give Agent more capabilities | FileSystem, Skill, ToolSearch, PlanTask, WriteTodos, TaskTool |
| Handle errors during ReAct process | Improve reliability | ModelRetry, ModelFailover, SafeTool, PatchToolCall |
| Keep context window within limits | Prevent token overflow | Reduction, Summarization, ToolSearch |
| Security and permissions | Constrain Agent behavior | Permission |
| Improve context content quality | Help model see better context | Skill, AgentsMD |
    + +ToolSearch spans two categories: it is both "extend Tools" (providing on-demand tool discovery) and "keep context window within limits" (avoiding loading too many tool descriptions at once). + +Further reading: + +- ChatModelAgent Middleware detailed guide: [Eino ADK: ChatModelAgentMiddleware](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) diff --git a/content/en/docs/eino/overview/graph_or_agent.md b/content/en/docs/eino/overview/graph_or_agent.md index 000dcaef6cb..3f1358d9862 100644 --- a/content/en/docs/eino/overview/graph_or_agent.md +++ b/content/en/docs/eino/overview/graph_or_agent.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Agent or Graph? AI Application Path Analysis diff --git a/content/en/docs/eino/quick_start/_index.md b/content/en/docs/eino/quick_start/_index.md index d50018aa4ff..914cd64194d 100644 --- a/content/en/docs/eino/quick_start/_index.md +++ b/content/en/docs/eino/quick_start/_index.md @@ -78,7 +78,8 @@ Notes: Chapter 7Interrupt/Resumehttps://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md Chapter 8Graph Tool (complex workflows)https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md Chapter 9Skill (Console)https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch09_skill.md -FinalA2UI (Web)https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md +Chapter 10A2UI (Web)https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md +Chapter 11TurnLoophttps://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch11_turnloop.md ## Final deliverable: an extensible end-to-end Agent application skeleton diff --git a/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md 
b/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md index bc08710a458..92d5c04fb35 100644 --- a/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md +++ b/content/en/docs/eino/quick_start/chapter_01_chatmodel_and_message.md @@ -1,117 +1,133 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 1: ChatModel and Message (Console)" +title: 'Chapter 1: ChatModel and Message (Console)' weight: 1 --- -## Introduction to the Eino framework +## Introduction to the Eino Framework **What is Eino?** -Eino is an AI application development framework in Go (Agent Development Kit) designed to help developers quickly build scalable and maintainable AI applications. +Eino is an AI application development framework (Agent Development Kit) implemented in Go, designed to help developers quickly build extensible, maintainable AI applications. **What problems does Eino solve?** -1. **Model abstraction**: unify interfaces across different LLM providers (OpenAI, Ark, Claude, etc.), so switching models does not require changing business code -2. **Capability composition**: provide replaceable, composable capability units through the Component interfaces (chat, tools, retrieval, etc.) -3. **Orchestration framework**: offer orchestration abstractions such as Agent, Graph, and Chain to support complex multi-step AI workflows -4. **Runtime support**: built-in streaming output, interrupt/resume, state management, and Callback-based observability +1. **Model abstraction**: Unifies interfaces across different LLM providers (OpenAI, Ark, Claude, etc.), allowing model switching without modifying business code +2. **Capability composition**: Implements replaceable, composable capability units (conversation, tools, retrieval, etc.) through Component interfaces +3. **Orchestration framework**: Provides orchestration abstractions like Agent, Graph, and Chain, supporting complex multi-step AI workflows +4. 
**Runtime support**: Built-in streaming output, interrupt and resume, state management, Callback observability, and more -**Main repositories of Eino:** +**Eino's main repositories:** -- **eino** (this repo): the core library, defining interfaces, orchestration abstractions, and ADK -- **eino-ext**: the extension library, providing concrete implementations of Components (OpenAI, Ark, Milvus, etc.) -- **eino-examples**: the examples repo, including this Quickstart series +- **eino** (this repository): Core library, defines interfaces, orchestration abstractions, and ADK +- **eino-ext**: Extension library, provides concrete implementations of various Components (OpenAI, Ark, Milvus, etc.) +- **eino-examples**: Example code repository, containing this quickstart series --- -## ChatWithEino: an assistant that talks with Eino docs +## ChatWithEino: An Intelligent Assistant for Conversing with Eino Documentation **What is ChatWithEino?** -ChatWithEino is an intelligent assistant built with Eino. It helps developers learn Eino and write Eino code by accessing the Eino repository’s source code, comments, and examples, so it can provide accurate and up-to-date technical help. +ChatWithEino is an intelligent assistant built on the Eino framework that helps developers learn the Eino framework and write Eino code. It provides the most accurate and timely technical support by accessing source code, comments, and examples from the Eino repository. 
**Core capabilities:** -- **Conversational interaction**: understand questions about Eino and respond clearly -- **Code access**: read Eino source code/comments/examples and answer based on real implementations -- **Persistent sessions**: support multi-turn conversations, remember context, and restore sessions across processes -- **Tool calling**: perform operations such as file reading and code search +- **Conversational interaction**: Understands user questions about Eino and provides clear answers +- **Code access**: Directly reads Eino source code, comments, and examples, answering questions based on real implementations +- **Persistent sessions**: Supports multi-turn conversations, remembers context, and can resume sessions across processes +- **Tool calling**: Can perform file reading, code searching, and other operations -**Architecture overview:** +**Technical architecture:** -- **ChatModel**: communicate with LLM providers (OpenAI, Ark, Claude, etc.) -- **Tool**: extend capabilities such as file system access and code search -- **Memory**: persist conversation history -- **Agent**: a unified execution framework that coordinates components +- **ChatModel**: Communicates with large language models (OpenAI, Ark, Claude, etc.) +- **Tool**: Capability extensions like file system access and code search +- **Memory**: Persistent storage of conversation history +- **Agent**: Unified execution framework coordinating components to work together -## Quickstart series: build ChatWithEino from scratch +## Quickstart Documentation Series: Building ChatWithEino from Scratch -This series walks you step by step: starting from the most basic ChatModel call, and progressively building a fully functional ChatWithEino Agent. +This documentation series takes you through a progressive approach, starting from the most basic ChatModel call and gradually building a fully-featured ChatWithEino Agent. **Learning path:** - - - - - - - - - - + + + + + + + + + + +
| Chapter | Topic | Core content | Capability gain |
| --- | --- | --- | --- |
| Chapter 1 | ChatModel and Message | Understand the Component abstraction and implement a single-turn chat | Basic conversation |
| Chapter 2 | Agent and Runner | Introduce execution abstractions and implement multi-turn chat | Session management |
| Chapter 3 | Memory and Session | Persist chat history and support session recovery | Persistence |
| Chapter 4 | Tools and file system | Add file access to read source code | Tool calling |
| Chapter 5 | Middleware | Middleware mechanism and unified cross-cutting concerns | Extensibility |
| Chapter 6 | Callback | Callbacks to observe the Agent execution process | Observability |
| Chapter 7 | Interrupt and Resume | Interrupt and resume to support long-running tasks | Reliability |
| Chapter 8 | Graph and Tool | Use Graph to orchestrate complex workflows | Complex orchestration |
| Chapter 9 | A2UI | Integration from Agent to UI | Production-grade delivery |

| Chapter | Topic | Core Content | Capability Gained |
| --- | --- | --- | --- |
| Chapter 1 | ChatModel and AgenticMessage | Understand Component abstraction, implement single-turn conversation | Basic conversation |
| Chapter 2 | Agent and Runner | Introduce execution abstraction, implement multi-turn conversation | Session management |
| Chapter 3 | Memory and Session | Persist conversation history, support session recovery | Persistence |
| Chapter 4 | Tool and File System | Add file access capability, read source code | Tool calling |
| Chapter 5 | Middleware | Middleware mechanism, unified handling of cross-cutting concerns | Extensibility |
| Chapter 6 | Callback | Callback mechanism, monitor Agent execution | Observability |
| Chapter 7 | Interrupt and Resume | Interrupt and resume, support long-running tasks | Reliability |
| Chapter 8 | Graph and Tool | Use Graph to orchestrate complex workflows | Complex orchestration |
| Chapter 9 | Skill | Use Skill middleware to load and reuse skill documents | Knowledge reuse |
| Final | A2UI | Agent-to-UI integration solution | Production-grade application |
    -**Why design it this way?** +**Why this design?** -Each chapter adds one core capability on top of the previous chapter, so you can: +Each chapter adds one core capability on top of the previous one, allowing you to: -1. **Understand the role of each component**: features are introduced progressively instead of all at once -2. **See the architecture evolve**: from simple to complex, and why each abstraction exists -3. **Build practical skills**: every chapter comes with runnable code you can try hands-on +1. **Understand the role of each component**: Rather than showing all features at once, they are introduced progressively +2. **See the architecture evolution**: From simple to complex, understanding why each abstraction is needed +3. **Master practical development skills**: Each chapter has runnable code for hands-on practice --- -Goal of this chapter: understand Eino’s Component abstraction, call a ChatModel once with minimal code (with streaming output), and learn the basics of `schema.Message`. +This chapter's goal: Understand Eino's Component abstraction, call a ChatModel with minimal code (with streaming output support), and learn how to organize model input and streaming output using `schema.AgenticMessage`. -## Code location +## Code Location - Entry code: [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go) -## Why we need the Component interfaces +## Why Component Interfaces Are Needed -Eino defines a set of Component interfaces (`ChatModel`, `Tool`, `Retriever`, `Loader`, etc.). 
Each interface describes one replaceable capability category: +Eino defines a set of Component interfaces (`ChatModel`, `Tool`, `Retriever`, `Loader`, etc.), each describing a category of replaceable capabilities: ```go -type BaseChatModel interface { - Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error) - Stream(ctx context.Context, input []*schema.Message, opts ...Option) ( - *schema.StreamReader[*schema.Message], error) +type BaseModel[M any] interface { + Generate(ctx context.Context, input []M, opts ...Option) (M, error) + Stream(ctx context.Context, input []M, opts ...Option) (*schema.StreamReader[M], error) } + +type AgenticModel = BaseModel[*schema.AgenticMessage] ``` **Benefits of interfaces:** -1. **Replaceable implementations**: `eino-ext` provides implementations for OpenAI, Ark, Claude, Ollama, and more. Business code depends only on the interface, so switching models only changes construction logic. -2. **Composable orchestration**: orchestration layers such as Agent, Graph, and Chain depend only on Component interfaces, not concrete implementations. You can swap OpenAI for Ark without changing orchestration code. -3. **Mockable in tests**: interfaces make mocking natural; unit tests do not need real model calls. +1. **Replaceable implementations**: `eino-ext` provides multiple implementations including OpenAI, Ark, Claude, Ollama, etc. Business code only depends on the interface; switching models only requires changing the construction logic. +2. **Composable orchestration**: Orchestration layers like Agent, Graph, and Chain only depend on Component interfaces, not caring about specific implementations. You can swap OpenAI for Ark without changing orchestration code. +3. **Mockable for testing**: Interfaces naturally support mocking; unit tests don't need real model calls. + +This chapter only involves `ChatModel`; subsequent chapters will progressively introduce `Tool`, `Retriever`, and other Components. 
-This chapter focuses on `ChatModel`. Later chapters will introduce Components such as `Tool` and `Retriever`. +The example code defaults to using `model.AgenticModel`, which is `model.BaseModel[*schema.AgenticMessage]`. This allows subsequent chapters to express text, reasoning, tool calls, tool results, and more within the same message structure. -## schema.Message: the basic unit of conversation +## schema.AgenticMessage: The Basic Unit of Conversation -`Message` is the basic structure for conversation data in Eino: +`AgenticMessage` is the conversation data structure used in this Quickstart: + +In a single model call, the model may return multiple ordered events — for example, first outputting `reasoning`, then calling a server tool, followed by more `reasoning`, then calling a function tool. `AgenticMessage` stores these structured events in order using `ContentBlock`. ```go -type Message struct { - Role RoleType // system / user / assistant / tool - Content string // text content - ToolCalls []ToolCall // only assistant messages may have this +type AgenticMessage struct { + Role AgenticRoleType + ContentBlocks []*ContentBlock + ResponseMeta *AgenticResponseMeta + Extra map[string]any +} + +type ContentBlock struct { + Type ContentBlockType + Reasoning *Reasoning + UserInputText *UserInputText + AssistantGenText *AssistantGenText + FunctionToolCall *FunctionToolCall + FunctionToolResult *FunctionToolResult // ... 
} ``` @@ -119,22 +135,27 @@ type Message struct { Common constructors: ```go -schema.SystemMessage("You are a helpful assistant.") -schema.UserMessage("What is the weather today?") -schema.AssistantMessage("I don't know.", nil) // second arg is ToolCalls -schema.ToolMessage("tool result", "call_id") +schema.SystemAgenticMessage("You are a helpful assistant.") +schema.UserAgenticMessage("What is the weather today?") + +&schema.AgenticMessage{ + Role: schema.AgenticRoleTypeAssistant, + ContentBlocks: []*schema.ContentBlock{ + schema.NewContentBlock(&schema.AssistantGenText{Text: "I don't know."}), + }, +} ``` **Role semantics:** -- `system`: system instructions, typically placed at the beginning of messages -- `user`: user input -- `assistant`: model response -- `tool`: tool call result (covered in later chapters) +- `system`: System instructions, typically placed at the beginning of the message list +- `user`: User input +- `assistant`: Model response +- Tool calls and tool results are expressed through `function_tool_call` / `function_tool_result` content blocks (covered in later chapters) ## Prerequisites -### Get the code +### Get the Code ```bash git clone https://github.com/cloudwego/eino-examples.git @@ -142,13 +163,13 @@ cd eino-examples/quickstart/chatwitheino ``` - Go version: Go 1.21+ (see `go.mod`) -- A callable ChatModel (OpenAI by default; Ark is also supported) +- A callable ChatModel (defaults to OpenAI; Ark is also supported) -### Option A: OpenAI (default) +### Option A: OpenAI (Default) ```bash export OPENAI_API_KEY="..." -export OPENAI_MODEL="gpt-4.1-mini" # OpenAI 2025 new model; gpt-4o / gpt-4o-mini also work +export OPENAI_MODEL="gpt-4.1-mini" # OpenAI 2025 new model, gpt-4o or gpt-4o-mini also work # Optional: # OPENAI_BASE_URL (proxy or compatible service) # OPENAI_BY_AZURE=true (use Azure OpenAI) @@ -163,60 +184,65 @@ export ARK_MODEL="..." 
# Optional: ARK_BASE_URL ``` -## Run +## Running -In `eino-examples/quickstart/chatwitheino`, run: +In the `eino-examples/quickstart/chatwitheino` directory: ```bash -go run ./cmd/ch01 -- "Explain in one sentence what problem Eino’s Component design solves." +go run ./cmd/ch01 -- "Explain in one sentence what problem Eino's Component design solves." ``` -Example output (printed incrementally as the stream arrives): +Example output (streamed progressively): ``` -[assistant] Eino’s Component design defines unified interfaces... +[assistant] Eino's Component design solves the problem of... ``` -## What the entry code does +## What the Entry Code Does In execution order: -1. **Create a ChatModel**: choose OpenAI or Ark based on the `MODEL_TYPE` environment variable -2. **Build input messages**: `SystemMessage(instruction)` + `UserMessage(query)` -3. **Call Stream**: all ChatModel implementations must support `Stream()`, returning a `StreamReader[*Message]` -4. **Print the result**: iterate `StreamReader` and print the assistant reply chunk by chunk +1. **Create ChatModel**: Select OpenAI or Ark's agentic model based on the `MODEL_TYPE` environment variable +2. **Construct input messages**: Create `AgenticMessage` using `msgops.NewSystem[M]` / `msgops.NewUser[M]` +3. **Call Stream**: Use `model.BaseModel[M].Stream()`, returning a `StreamReader[M]` +4. **Print results**: Iterate the `StreamReader` to print assistant replies frame by frame -Key code snippet (**note: simplified and not directly runnable; for the full code see** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): +Key code snippet (**Note: This is a simplified code snippet that cannot run directly. 
For the complete code, please refer to** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): ```go -// Build input -messages := []*schema.Message{ - schema.SystemMessage(instruction), - schema.UserMessage(query), -} - -// Call Stream (all ChatModels must implement this) -stream, err := cm.Stream(ctx, messages) -if err != nil { - log.Fatal(err) -} -defer stream.Close() +func runTyped[M adk.MessageType](ctx context.Context, instruction, query string) { + cm, err := chatmodel.NewModel[M](ctx) + if err != nil { + log.Fatal(err) + } -for { - chunk, err := stream.Recv() - if errors.Is(err, io.EOF) { - break + messages := []M{ + msgops.NewSystem[M](instruction), + msgops.NewUser[M](query), } + + stream, err := cm.Stream(ctx, messages) if err != nil { log.Fatal(err) } - fmt.Print(chunk.Content) + defer stream.Close() + + for { + frame, err := stream.Recv() + if errors.Is(err, io.EOF) { + break + } + if err != nil { + log.Fatal(err) + } + fmt.Print(msgops.AssistantDeltaText(frame)) + } } ``` -## Summary +## Chapter Summary -- **Component interfaces**: define boundaries for replaceable, composable, and testable capabilities -- **Message**: the basic unit of conversation data, with semantics defined by roles -- **ChatModel**: the most fundamental Component, providing `Generate` and `Stream` -- **Implementation choice**: switch between OpenAI/Ark implementations via env/config without changing business code +- **Component interface**: Defines replaceable, composable, and testable capability boundaries +- **AgenticMessage**: The basic unit of conversation data, distinguishing semantics through roles and content blocks +- **ChatModel**: The most fundamental Component, providing two core methods: `Generate` and `Stream` +- **Implementation selection**: Switch between different implementations like OpenAI/Ark via environment variables or configuration, with no changes needed in business code diff --git 
a/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md b/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md index ad2bd195ee6..bb50addf13e 100644 --- a/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md +++ b/content/en/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md @@ -1,26 +1,320 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 2: ChatModelAgent, Runner, AgentEvent (Console multi-turn)" +title: "Chapter 2: ChatModelAgent, Runner, AgentEvent (Console Multi-Turn)" weight: 2 --- Goal of this chapter: introduce ADK execution abstractions (Agent + Runner) and implement a multi-turn conversation in a Console program. -## Code location +## Code Location - Entry code: [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go) -## Full tutorial +## Prerequisites -This page is a website-friendly overview. For the full runnable walkthrough, see: +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). -- [ch02_chatmodel_agent_runner_console.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch02_chatmodel_agent_runner_console.md) +## Running -## What you learn +In the `eino-examples/quickstart/chatwitheino` directory: -- Why “Agent” is a higher-level abstraction than “ChatModel”: it owns the interaction loop and tool routing. -- What “Runner” does: it provides the runtime (streaming, events, interrupt/resume plumbing) for running an Agent. -- How “AgentEvent” models the execution stream: user input, model output, tool calls, tool results, and lifecycle signals. +```bash +go run ./cmd/ch02 +``` + +After the prompt appears, enter your questions (empty line to exit): + +``` +you> Hi, explain what an Agent is in Eino? +... +you> Summarize that in one sentence +... 
+``` + +## Key Concepts + +### From Component to Agent + +In Chapter 1 we learned about **Components** — the replaceable, composable capability units in Eino: + +- `ChatModel`: calls a large language model +- `Tool`: executes specific tasks +- `Retriever`: retrieves information +- `Loader`: loads data + +**The relationship between Component and Agent:** + +- **Components alone don't form a complete AI application**: they are capability units that need to be organized, orchestrated, and executed +- **An Agent is a complete AI application**: it encapsulates complete business logic and can run directly +- **Agents use Components internally**: most importantly `ChatModel` (conversation) and `Tool` (execution) + +**Why do we need Agent?** + +With Components alone, you would need to handle: + +- Managing conversation history +- Orchestrating the call flow (when to call the model, when to call tools) +- Handling streaming output +- Implementing interrupt and resume +- ... + +**What does Agent provide?** + +- **A complete runtime framework**: unified execution management via `Runner` +- **Standardized event stream output**: `Run() -> AsyncIterator[*AgentEvent]`, supporting streaming, interrupt, and resume +- **Extensibility**: tools, middleware, interrupt, and more can be added +- **Ready to use**: create an Agent and run it directly, no need to worry about internal details + +**This chapter's example:** + +`ChatModelAgent` is the simplest Agent — it only uses a `ChatModel` internally, but already possesses the complete Agent capability framework. Later chapters will demonstrate how to add `Tool` and other capabilities. 
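As a rough illustration of the relationship described above (an Agent uses a replaceable Component internally and reports results as events), here is a minimal stand-alone sketch. Every name in it (`model`, `upperModel`, `failingModel`, `event`, `toyAgent`) is hypothetical and it uses only the Go standard library; Eino's real `Agent` interface and `AgentEvent` type are richer than this.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// model is a hypothetical stand-in for a ChatModel-like component.
type model interface {
	Generate(prompt string) (string, error)
}

// upperModel is a fake implementation that upper-cases the prompt.
type upperModel struct{}

func (upperModel) Generate(prompt string) (string, error) {
	return strings.ToUpper(prompt), nil
}

// failingModel is a fake implementation that always fails.
type failingModel struct{}

func (failingModel) Generate(string) (string, error) {
	return "", errors.New("model unavailable")
}

// event is a hypothetical, simplified event carrying output or an error.
type event struct {
	AgentName string
	Output    string
	Err       error
}

// toyAgent owns the call flow and wraps whichever model it is given.
type toyAgent struct {
	name string
	m    model
}

// Run calls the wrapped component in the background and returns a
// channel of events that is closed once the run is finished.
func (a toyAgent) Run(prompt string) <-chan event {
	ch := make(chan event, 1)
	go func() {
		defer close(ch)
		out, err := a.m.Generate(prompt)
		ch <- event{AgentName: a.name, Output: out, Err: err}
	}()
	return ch
}

func main() {
	// The same agent code works with either component implementation.
	for _, m := range []model{upperModel{}, failingModel{}} {
		agent := toyAgent{name: "demo", m: m}
		for ev := range agent.Run("hello") {
			if ev.Err != nil {
				fmt.Println(ev.AgentName, "error:", ev.Err)
				continue
			}
			fmt.Println(ev.AgentName, "output:", ev.Output)
		}
	}
}
```

The caller only ever sees events, so swapping the component does not change the consuming loop; that is the property the real `Agent` abstraction is meant to provide.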
+ +### Agent Interface + +`Agent` is the core interface in ADK, defining the basic behavior of an intelligent agent: + +```go +type Agent interface { + Name(ctx context.Context) string + Description(ctx context.Context) string + + // Run executes the Agent and returns an event stream + Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] +} +``` + +**Interface responsibilities:** + +- `Name()` / `Description()`: identify the Agent's name and description +- `Run()`: the core method to execute the Agent, accepting input messages and returning an event stream + +**Design philosophy:** + +- **Unified abstraction**: all Agents (ChatModelAgent, WorkflowAgent, SupervisorAgent, etc.) implement this interface +- **Event-driven**: execution is output through an event stream (`AsyncIterator[*AgentEvent]`), supporting streaming responses +- **Extensibility**: when adding tools, middleware, interrupt, etc., the interface remains unchanged + +### ChatModelAgent + +`ChatModelAgent` is an implementation of the Agent interface, built on top of ChatModel: + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "Ch02ChatModelAgent", + Description: "A minimal ChatModelAgent with in-memory multi-turn history.", + Instruction: instruction, + Model: cm, +}) +``` + +**ChatModel vs ChatModelAgent: the essential difference** + + + + + + + + +
+| Dimension | ChatModel | ChatModelAgent |
+| --- | --- | --- |
+| Positioning | Component | Agent |
+| Core Interface | `Generate()` / `Stream()` | `Run() -> AsyncIterator[*AgentEvent]` |
+| Output Form | Returns message content directly | Returns event stream (messages, control actions, etc.) |
+| Core Capabilities | Pure LLM invocation | Supports tools, middleware, interrupt, etc. |
+| Use Cases | Simple conversational interactions | Complex agent application development |
    + +**Why do we need ChatModelAgent?** + +1. **Unified abstraction**: ChatModel is just one kind of Component, while Agent is a higher-level abstraction that can compose multiple Components +2. **Event-driven**: Agent outputs an event stream, supporting streaming responses, interrupt/resume, state transitions, and other complex scenarios +3. **Extensibility**: ChatModelAgent can have tools, middleware, interrupt, etc. added, while ChatModel can only invoke the model +4. **Orchestration-friendly**: Agents can be uniformly managed by Runner, supporting checkpoint, resume, and other runtime capabilities + +**In simple terms:** + +- **ChatModel** = "The component responsible for communicating with the LLM, abstracting away differences between model providers (OpenAI, Ark, Claude, etc.)" +- **ChatModelAgent** = "An agent built on top of the model that can call the model but can also do much more" + +**Analogy:** + +- **ChatModel** is like a "database driver": responsible for communicating with the database, abstracting away MySQL/PostgreSQL differences +- **ChatModelAgent** is like a "business logic layer": built on top of the database driver, but also contains business rules, transaction management, etc. + +**Characteristics:** + +- Encapsulates ChatModel invocation logic +- Provides a unified `Run() -> AgentEvent` output form +- Can have tools, middleware, and other capabilities added later + +### Runner + +`Runner` is the entry point for executing an Agent, responsible for managing the Agent's lifecycle: + +```go +type Runner struct { + a Agent // The Agent to execute + enableStreaming bool + store CheckPointStore // State storage for interrupt/resume +} +``` + +**Why do we need Runner?** + +Although Agent provides a `Run()` method, calling it directly lacks many runtime capabilities: + +1. **Lifecycle management**: Runner manages the Agent's startup, resume, interrupt, and other states +2. 
**Checkpoint support**: works with `CheckPointStore` to implement interrupt/resume (covered in later chapters) +3. **Unified entry point**: provides convenient methods like `Run()` and `Query()` +4. **Event stream encapsulation**: converts the Agent's event stream into a consumable `AsyncIterator[*TypedAgentEvent[M]]` + +**Usage:** + +```go +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, +}) + +// Method 1: pass a message list +events := runner.Run(ctx, history) + +// Method 2: convenience method, pass a single query string +events := runner.Query(ctx, "Hello") +``` + +### AgentEvent + +`AgentEvent` is the event unit returned by Runner: + +```go +type AgentEvent struct { + AgentName string + RunPath []RunStep + + Output *AgentOutput // Output content + Action *AgentAction // Control action + Err error // Execution error +} +``` + +**Main fields:** + +- `event.Err`: execution error +- `event.Output.MessageOutput`: message or message stream (streaming) +- `event.Action`: interrupt/transfer/exit and other control actions (used in later chapters) + +### AsyncIterator: Consuming the Event Stream + +`Runner.Run()` returns an `*AsyncIterator[*AgentEvent]` right away instead of waiting for execution to finish; events are delivered through the iterator as they are produced, and each `Next()` call blocks until the next event is available or the stream ends. + +**Why use AsyncIterator instead of returning results directly?** + +Because Agent execution is **streaming**: the model generates replies token by token, with tool calls interspersed. If we waited for everything to complete before returning, users would have to wait much longer. `AsyncIterator` lets you consume each event in real time. 
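The iterator shape described above can be sketched with a plain Go channel. This is a minimal stand-alone illustration, not Eino's implementation: `toyIterator`, `streamTokens`, and `collect` are hypothetical names, and only the `Next() (value, ok)` calling convention is taken from the text.

```go
package main

import "fmt"

// toyIterator is a hypothetical channel-backed iterator that mimics the
// Next() (value, ok) shape; it is not Eino's AsyncIterator.
type toyIterator[T any] struct {
	ch <-chan T
}

// Next blocks until a value arrives or the producer closes the stream.
// ok is false once the stream is exhausted.
func (it *toyIterator[T]) Next() (T, bool) {
	v, ok := <-it.ch
	return v, ok
}

// streamTokens returns immediately; the tokens are produced by a
// background goroutine, which closes the channel when it is done.
func streamTokens(tokens []string) *toyIterator[string] {
	ch := make(chan string)
	go func() {
		defer close(ch)
		for _, t := range tokens {
			ch <- t
		}
	}()
	return &toyIterator[string]{ch: ch}
}

// collect drains the iterator, handling each token as soon as it arrives.
func collect(it *toyIterator[string]) string {
	out := ""
	for {
		tok, ok := it.Next()
		if !ok {
			return out
		}
		out += tok
	}
}

func main() {
	fmt.Println(collect(streamTokens([]string{"Hel", "lo", "!"}))) // prints "Hello!"
}
```

The producer and consumer run concurrently, so a caller could print each token inside the loop instead of concatenating, which is the real-time behaviour the paragraph above is describing.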
+ +**Consumption pattern:** + +```go +// events is *AsyncIterator[*AgentEvent], returned by runner.Run() +events := runner.Run(ctx, history) + +for { + event, ok := events.Next() // Get next event, blocks until an event is available or stream ends + if !ok { + break // Iterator closed, all events consumed + } + if event.Err != nil { + // Handle error + } + if event.Output != nil && event.Output.MessageOutput != nil { + // Handle message output (may be streaming) + } +} +``` + +**Note:** each `runner.Run()` creates a new iterator; it cannot be reused after consumption. + +## Multi-Turn Conversation Implementation + +This chapter implements simple multi-turn conversation: user input → model reply → user continues → ... + +**Implementation approach:** + +Without tools, `ChatModelAgent` only performs one model invocation per `Run()` call. Multi-turn conversation is achieved by maintaining history on the caller side: + +1. Use `history []M` to accumulate conversation messages (in this example, `M` defaults to `*schema.AgenticMessage`) +2. Each user input: append to history via `msgops.NewUser[M]` +3. Call `runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history))` to get the event stream and consume the assistant text +4. 
Append the assistant text back to history via `msgops.NewAssistant[M]`, then enter the next turn + +**Key code snippet** (Note: this is a simplified snippet that cannot run directly; see [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go) for the full code): + +```go +func runTyped[M adk.MessageType](ctx context.Context, instruction string) { + agent, err := adk.NewTypedChatModelAgent[M](ctx, &adk.TypedChatModelAgentConfig[M]{ + Name: "Ch02Agent", + Instruction: instruction, + Model: cm, + }) + if err != nil { + log.Fatal(err) + } + + runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + }) + + history := make([]M, 0, 16) + + for { + line := readUserInput() + if line == "" { + break + } + + history = append(history, msgops.NewUser[M](line)) + events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) + result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) + if err != nil { + log.Fatal(err) + } + history = append(history, msgops.NewAssistant[M](result.AssistantText, nil)) + } +} +``` + +**Flow diagram:** + +``` +┌─────────────────────────────────────────┐ +│ Initialize history = [] │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ User inputs message │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Append to history │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.Run(history) │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Consume event stream │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Append AssistantMsg │ + └──────────────────────┘ + ↓ + (loop continues) +``` + +## Chapter Summary + +- **Agent interface**: defines the basic behavior of an intelligent agent; the core is `Run() -> AsyncIterator[*AgentEvent]` +- **ChatModelAgent**: an Agent implementation based on ChatModel, providing a unified execution abstraction +- 
**Runner**: the execution entry point for Agents, managing lifecycle, checkpoint, event streams, and other runtime capabilities +- **AgentEvent**: an event-driven output unit supporting streaming responses and control actions +- **Multi-turn conversation**: implemented by maintaining history on the caller side; each `Run()` completes one conversation turn diff --git a/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md b/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md index e90a0a8e990..85dfc2a0ee0 100644 --- a/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md +++ b/content/en/docs/eino/quick_start/chapter_03_memory_and_session.md @@ -1,28 +1,333 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 3: Memory and Session (persistent conversations)" +title: "Chapter 3: Memory and Session (Persistent Conversations)" weight: 3 --- Goal of this chapter: persist conversation history and support session recovery across processes. -> ⚠️ Important note: **Memory, Session, and Store here are business-layer concepts**, not core Eino framework components. +> **⚠️ Important Note: Business-Layer Concepts vs Framework Concepts** > -> Eino focuses on “how to process messages”; “how to store messages” is entirely up to your application (DB/Redis/object storage/etc.). The implementation in this chapter is a simple reference you can replace. +> The **Memory, Session, and Store introduced in this chapter are business-layer concepts**, **not core Eino framework components**. +> +> - **Eino framework layer**: provides base abstractions like `adk.Runner`, `adk.NewTypedRunner[M]`, `schema.AgenticMessage`, etc. 
The framework itself does not concern itself with how conversation history is stored +> - **Business layer**: Memory/Session/Store are business logic designed by this example project to implement persistent conversations, interacting with the Eino framework by assembling inputs for `adk.Runner` +> +> In other words, the Eino framework is only responsible for "how to process messages", while "how to store messages" is entirely decided by the business layer. The implementation in this chapter is just a simple reference example — you can choose a completely different storage solution (database, Redis, cloud storage, etc.) based on your business needs. -## Code location +## Code Location - Entry code: [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go) +- Memory implementation: [mem/store.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/mem/store.go) + +## Prerequisites + +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Create a new session +go run ./cmd/ch03 + +# Resume an existing session +go run ./cmd/ch03 --session +``` + +Example output: + +``` +Created new session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +Session title: New Session +Enter your message (empty line to exit): +you> Hi, my name is Zhang San +[assistant] Hi Zhang San! Nice to meet you... +you> What's my name? +[assistant] Your name is Zhang San... + +Session saved: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +Resume with: go run ./cmd/ch03 --session 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +``` + +## From In-Memory to Persistence: Why We Need Memory + +In Chapter 2 we implemented multi-turn conversation, but there's a problem: **conversation history only exists in memory**. 
+ +**Limitations of in-memory storage:** + +- Conversation history is lost when the process exits +- Cannot resume sessions across devices or processes +- Cannot implement session management (listing, deletion, search, etc.) + +**Memory's role:** + +- **Memory is persistent storage for conversation history**: saving conversations to disk or a database +- **Memory supports Session management**: each Session represents a complete conversation +- **Memory is decoupled from Agent**: the Agent doesn't care about storage details, it only cares about the message list + +**Simple analogy:** + +- **In-memory storage** = "scratch paper" (gone when the process exits) +- **Memory** = "notebook" (permanently saved, accessible anytime) + +## Key Concepts + +> **Reminder**: the Session, Store, and other concepts below are all **business-layer implementations** for managing conversation history storage. The Eino framework itself does not provide these components — the business layer is responsible for managing the message list, then passing messages to `adk.Runner` for processing. + +### Session (Business-Layer Concept) + +`Session` represents a complete conversation: + +```go +type Session struct { + ID string + CreatedAt time.Time + + messages []M // Conversation history; in this example M defaults to *schema.AgenticMessage + // ... 
+} +``` + +**Core methods:** + +- `Append(msg)`: appends a message to the session and persists it +- `GetMessages()`: retrieves all messages +- `Title()`: generates a session title from the first user message + +### Store (Business-Layer Concept) + +`Store` manages persistent storage for multiple Sessions: + +```go +type Store struct { + dir string // Storage directory + cache map[string]*Session // In-memory cache +} +``` + +**Core methods:** + +- `GetOrCreate(id)`: get or create a Session +- `List()`: list all Sessions +- `Delete(id)`: delete a Session + +### JSONL File Format + +Each Session is stored as a `.jsonl` file: + +``` +{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z","message_kind":"agentic"} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"Hello, who am I?"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"Hello! I don't know who you are yet..."}}]} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"My name is Zhang San"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"OK, Zhang San, nice to meet you!"}}]} +``` + +Sessions are saved by default in `./data/sessions_agentic`; to use a different directory, set `SESSION_DIR_AGENTIC`. + +**Why JSONL?** + +- **Simple**: one JSON object per line, easy to read and write +- **Extensible**: new messages can be appended without rewriting the entire file +- **Readable**: can be viewed directly with a text editor +- **Fault-tolerant**: a corrupted line doesn't affect other lines + +## Memory Implementation (Business-Layer Example) + +Below is a simple business-layer implementation example using JSONL files for conversation history storage. This is just one of many possible implementations — you can choose database, Redis, or other storage solutions based on your actual needs. + +### 1. 
Create the Store + +```go +sessionDir := "./data/sessions_agentic" +store, err := mem.NewStore(sessionDir) +if err != nil { + log.Fatal(err) +} +``` + +### 2. Get or Create a Session + +```go +sessionID := "083d16da-6b13-4fe6-afb0-c45d8f490ce1" +session, err := store.GetOrCreate(sessionID) +if err != nil { + log.Fatal(err) +} +``` + +### 3. Append User Message + +```go +userMsg := msgops.NewUser[M]("Hello") +if err := session.Append(userMsg); err != nil { + log.Fatal(err) +} +``` + +### 4. Get History and Call the Agent + +```go +history := session.GetMessages() +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} +``` + +### 5. Append Assistant Message + +```go +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) +if err := session.Append(assistantMsg); err != nil { + log.Fatal(err) +} +``` + +**Key code snippet** (Note: this is a simplified snippet that cannot run directly; see [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go) for the full code): + +```go +store, err := mem.NewStore[M](msgops.DefaultSessionDir(msgops.KindOf[M]())) +if err != nil { + log.Fatal(err) +} + +// Create or resume a Session +session, err := store.GetOrCreate(sessionID) +if err != nil { + log.Fatal(err) +} + +// User input +userMsg := msgops.NewUser[M](line) +if err := session.Append(userMsg); err != nil { + log.Fatal(err) +} + +// Call the Agent +history := session.GetMessages() +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} + +// Save assistant reply +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) +if err := session.Append(assistantMsg); err != nil { + log.Fatal(err) +} +``` + +## Session and Agent Relationship: Business 
Layer and Framework Layer Collaboration + +**Key understanding:** + +- **Session is a business-layer concept**: implemented and managed by business code, responsible for storing and loading conversation history +- **Agent (Runner) is a framework-layer concept**: provided by the Eino framework, responsible for processing messages and generating replies +- **Their interaction point**: the business layer gets the message list via `session.GetMessages()`, then generates model input via `msgops.NormalizeMessagesForModelInput(history)`, and finally passes it to `runner.Run(ctx, messages)` for processing + +**Architecture layers:** + +``` +┌─────────────────────────────────────────────────────────────┐ +│ Business Layer (your code) │ +│ ┌─────────────┐ ┌──────────────┐ ┌───────────────┐ │ +│ │ Session │───→│ GetMessages() │───→│ runner.Run() │ │ +│ │ (storage) │ │ (msg list) │ │ (framework) │ │ +│ └─────────────┘ └──────────────┘ └───────────────┘ │ +│ ↑ │ │ +│ │ ↓ │ +│ ┌─────────────┐ ┌───────────────┐ │ +│ │ Append() │←─────────────────────│ Assistant reply│ │ +│ │ (save msg) │ └───────────────┘ │ +│ └─────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ + ↓ +┌─────────────────────────────────────────────────────────────┐ +│ Framework Layer (Eino framework) │ +│ ┌───────────────────────────────────────────────────────┐ │ +│ │ adk.Runner: receives message list, calls ChatModel, │ │ +│ │ returns reply │ │ +│ └───────────────────────────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ +``` + +**Flow diagram:** + +``` +┌─────────────────────────────────────────┐ +│ User input │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.Append() │ + │ Save user message │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.GetMessages()│ + │ Get full history │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.Run(history) │ + │ 
Agent processes msgs│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Collect assistant │ + │ reply │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ session.Append() │ + │ Save assistant msg │ + └──────────────────────┘ +``` + +## Chapter Summary + +**Framework Layer vs Business Layer:** + +- **Eino framework layer**: provides base abstractions like `adk.Runner`, typed runner, `schema.AgenticMessage`, etc., and does not concern itself with how messages are stored +- **Business layer (this chapter's implementation)**: Memory/Session/Store are business-layer concepts for managing conversation history storage + +**Business-layer concepts:** + +- **Memory**: persistent storage for conversation history, supporting cross-process recovery +- **Session**: a complete conversation, containing ID, creation time, and message list +- **Store**: manages storage for multiple Sessions, supporting create, get, list, and delete +- **JSONL format**: a simple file format, easy to read/write and extend + +**Business layer and framework layer interaction:** + +- The business layer stores messages and retrieves the message list via `session.GetMessages()` +- After normalizing the message list for model input, it passes it to the framework layer's `runner.Run(ctx, messages)` for processing +- The framework layer's reply is collected and saved back to storage by the business layer + +> **💡 Tip**: The implementation in this chapter is just one simple example among many storage solutions. In real projects, you can choose databases, Redis, cloud storage, etc. based on your business needs, and even implement more advanced features like session expiration cleanup, search, sharing, etc. + +## Extended Thinking: Choosing a Business-Layer Storage Solution + +The JSONL file storage approach in this chapter is suitable for simple single-machine applications. 
In real business scenarios, you may want to consider other storage solutions: -## Full tutorial +**Alternative storage implementations:** -- [ch03_memory_session_jsonl.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch03_memory_session_jsonl.md) +- Database storage (MySQL, PostgreSQL, MongoDB) +- Redis storage (supports distributed setups) +- Cloud storage (S3, OSS) -## What you learn +**Advanced features:** -- How to model “Session” as a stable ID and resume a conversation by reloading stored messages. -- A simple storage format (JSONL) as a baseline for implementing your own persistence layer. -- How to integrate persistence with the Agent/Runner loop without coupling it into Eino itself. +- Session expiration cleanup +- Session search +- Session export/import +- Session sharing diff --git a/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md b/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md index aff152a9b87..2b8138d305b 100644 --- a/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md +++ b/content/en/docs/eino/quick_start/chapter_04_tool_and_filesystem.md @@ -1,33 +1,329 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 4: Tools and file system access" +title: "Chapter 4: Tools and File System Access" weight: 4 --- Goal of this chapter: add Tool capabilities so the Agent can access the file system. -## Why Tools +## Why We Need Tools -In Chapters 1–3, the Agent can only chat; it cannot perform real actions. +In the first three chapters, the Agent we built can only chat — it cannot perform real actions. -Typical limitations without tools: +**Agent limitations:** -- Only generates text responses -- Cannot access external resources (files/APIs/databases) -- Cannot execute real tasks (compute/query/modify) +- Can only generate text replies +- Cannot access external resources (files, APIs, databases, etc.) 
+- Cannot execute real tasks (compute, query, modify, etc.) -## Code location +**Tool's role:** + +- **Tool is a capability extension for Agent**: enabling the Agent to perform concrete operations +- **Tool encapsulates specific implementations**: the Agent doesn't care how the Tool works internally, only about inputs and outputs +- **Tools are composable**: an Agent can have multiple Tools and choose which to call as needed + +**Simple analogy:** + +- **Agent** = "intelligent assistant" (understands instructions, but needs tools to act) +- **Tool** = "toolbox" (file operations, network requests, database queries, etc.) + +## Why File System Access + +This example is ChatWithDoc (chat with documentation), aimed at helping users learn the Eino framework and write Eino code. So what's the best documentation? + +**The answer is: the Eino repository's code itself.** + +- **Code**: source code shows the real framework implementation +- **Comments**: code comments provide design rationale and usage instructions +- **Examples**: example code demonstrates best practices + +With file system access, the Agent can directly read Eino source code, comments, and examples, providing users with the most accurate and up-to-date technical support. 
+ +## Key Concepts + +### Tool Interface + +`Tool` is the interface in Eino that defines executable capabilities: + +```go +// BaseTool provides tool metadata that ChatModel uses to decide whether and how to call the tool +type BaseTool interface { + Info(ctx context.Context) (*schema.ToolInfo, error) +} + +// InvokableTool is a tool that can be executed by ToolsNode +type InvokableTool interface { + BaseTool + // InvokableRun executes the tool; arguments are a JSON-encoded string, returns a string result + InvokableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (string, error) +} + +// StreamableTool is the streaming variant of InvokableTool +type StreamableTool interface { + BaseTool + // StreamableRun executes the tool in streaming mode, returns a StreamReader + StreamableRun(ctx context.Context, argumentsInJSON string, opts ...Option) (*schema.StreamReader[string], error) +} +``` + +**Interface hierarchy:** + +- `BaseTool`: base interface, only provides metadata +- `InvokableTool`: executable tool (extends BaseTool) +- `StreamableTool`: streaming tool (extends BaseTool) + +### Backend Interface + +`Backend` is Eino's abstract interface for file system operations: + +```go +type Backend interface { + // List file info in a directory + LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) + + // Read file content, supports line offset and limit + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + + // Search for matching content in files + GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) + + // Match files by glob pattern + GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) + + // Write file content + Write(ctx context.Context, req *WriteRequest) error + + // Edit file content (string replacement) + Edit(ctx context.Context, req *EditRequest) error +} +``` + +### LocalBackend + +`LocalBackend` is the local file system implementation of Backend, directly accessing the OS file 
system: + +```go +import localbk "github.com/cloudwego/eino-ext/adk/backend/local" + +backend, err := localbk.NewBackend(ctx, &localbk.Config{}) +``` + +**Characteristics:** + +- Directly accesses the local file system using Go standard library +- Supports all Backend interface methods +- Supports executing shell commands (ExecuteStreaming) +- Path safety: requires absolute paths to prevent directory traversal attacks +- Zero configuration: works out of the box with no additional setup + +## Implementation: Using DeepAgent + +This chapter uses the DeepAgent prebuilt agent, which provides first-class configuration for Backend and StreamingShell, making it convenient to register file-system-related tools. + +### From ChatModelAgent to DeepAgent: When to Switch? + +Previous chapters used `ChatModelAgent`, which can handle multi-turn conversations. But to access the file system, we need to switch to `DeepAgent`. + +**ChatModelAgent vs DeepAgent comparison:** + + + + + + + + + +
+| Capability | ChatModelAgent | DeepAgent |
+| --- | --- | --- |
+| Multi-turn conversation | ✅ | ✅ |
+| Add custom Tools | ✅ Manual registration of each Tool | ✅ Manual or automatic registration |
+| File system access (Backend) | ❌ Must manually create and register all file tools | ✅ First-class config, auto-registered |
+| Command execution (StreamingShell) | ❌ Must manually create | ✅ First-class config, auto-registered |
+| Built-in task management | ❌ | ✅ `write_todos` tool |
+| Sub-Agent support | ❌ | ✅ |
    + +**Selection guidance:** + +- Pure conversation scenarios (no external access) → use `ChatModelAgent` +- Need file system access or command execution → use `DeepAgent` + +### Why Use DeepAgent? + +Compared to using ChatModelAgent directly, DeepAgent advantages: + +1. **First-class configuration**: Backend and StreamingShell are first-class configs — just pass them in +2. **Automatic tool registration**: configuring Backend automatically registers file system tools, no manual creation needed +3. **Built-in task management**: provides the `write_todos` tool for task planning and tracking +4. **Sub-Agent support**: can configure specialized sub-Agents for specific tasks +5. **More powerful**: integrates file system, command execution, and other capabilities + +### Code Implementation + +```go +import ( + localbk "github.com/cloudwego/eino-ext/adk/backend/local" + "github.com/cloudwego/eino/adk/prebuilt/deep" +) + +// Create LocalBackend +backend, err := localbk.NewBackend(ctx, &localbk.Config{}) + +// Create DeepAgent with automatic file system tool registration +agent, err := deep.New(ctx, &deep.Config{ + Name: "Ch04ToolAgent", + Description: "ChatWithDoc agent with filesystem access via LocalBackend.", + ChatModel: cm, + Instruction: agentInstruction, + Backend: backend, // Provides file system operation capabilities + StreamingShell: backend, // Provides command execution capabilities + MaxIteration: 50, +}) +``` + +### Tools Automatically Registered by DeepAgent + +When `Backend` and `StreamingShell` are configured, DeepAgent automatically registers the following tools: + +- `read_file`: read file content +- `write_file`: write file content +- `edit_file`: edit file content +- `glob`: find files by glob pattern +- `grep`: search content in files +- `execute`: execute shell commands + +## Code Location - Entry code: [cmd/ch04/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch04/main.go) -## Full tutorial +## 
Prerequisites + +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). + +This chapter also requires setting `PROJECT_ROOT` (optional, see run instructions below). + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Optional: set the root directory path of the Eino core library +# When not set, the Agent defaults to using the current working directory (the chatwitheino directory) +# To let the Agent search the full Eino codebase, point to the eino core library root +export PROJECT_ROOT=/path/to/eino + +# Verify the path is correct (you should see directories like adk, components, compose, etc.) +ls $PROJECT_ROOT + +go run ./cmd/ch04 +``` + +**PROJECT_ROOT explanation:** + +- **When not set**: `PROJECT_ROOT` defaults to the current working directory (where `chatwitheino` resides), and the Agent can only access files from this example project. This is sufficient for quick experimentation. +- **When set**: points to the Eino core library root, and the Agent can search the complete Eino framework codebase (core lib, extensions, examples). This is the full ChatWithEino use case. + +**Recommended three-repo directory structure (for the full experience):** + +``` +eino/ # PROJECT_ROOT (Eino core library) +├── adk/ +├── components/ +├── compose/ +├── ext/ # eino-ext (extension components like OpenAI, Ark implementations) +├── examples/ # eino-examples (this repo, where this example lives) +│ └── quickstart/ +│ └── chatwitheino/ +└── ... +``` + +You can use the `dev_setup.sh` script to automatically set up this directory structure: + +```bash +# Run in the eino root directory to auto-clone extensions and examples repos to the correct locations +bash scripts/dev_setup.sh +``` + +Example output: + +``` +you> List the files in the current directory +[assistant] Let me list the files in the current directory... 
+[tool call] glob(pattern: "*") +[tool result] Found 5 files: +- main.go +- go.mod +- go.sum +- README.md +- cmd/ + +you> Read the content of main.go +[assistant] Let me read the main.go file... +[tool call] read_file(file_path: "main.go") +[tool result] File content: +... +``` + +**Note:** if you encounter a Tool error that interrupts the Agent during execution, don't panic — this is normal. Tool errors are common (e.g., wrong arguments, file not found). How to gracefully handle Tool errors will be covered in detail in the next chapter. + +## Tool Call Flow + +When the Agent needs to call a Tool: + +``` +┌─────────────────────────────────────────┐ +│ User: list files in current directory │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call glob │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Generate Tool Call │ + │ {"pattern": "*"} │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Execute Tool │ + │ glob("*") │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return Tool Result │ + │ {"files": [...]} │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent generates reply│ + │ "Found 5 files..." 
│ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Tool**: a capability extension for Agent, enabling it to perform concrete operations +- **Backend**: abstract interface for file system operations, providing unified file operation capabilities +- **LocalBackend**: local file system implementation of Backend, directly accessing the OS file system +- **DeepAgent**: a prebuilt advanced Agent providing first-class Backend and StreamingShell configuration +- **Automatic tool registration**: configuring Backend auto-registers file system tools +- **Tool call flow**: Agent analyzes intent → generates Tool Call → executes Tool → returns result → generates reply + +## Extended Thinking + +**Other Tool types:** + +- HTTP Tool: call external APIs +- Database Tool: query databases +- Calculator Tool: perform calculations +- Code Executor Tool: run code + +**Other Backend implementations:** + +- Other storage backends can be implemented based on the Backend interface +- For example: cloud storage, database storage, etc. +- LocalBackend already provides complete file system operation capabilities -- [ch04_tool_backend_filesystem.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md) +**Custom Tool creation:** -## What you learn +If you need to create custom Tools, you can use `utils.InferTool` to automatically infer from functions. See: -- How to expose file reads as tools and let the model call them through the Agent. -- How to keep tool boundaries explicit (inputs/outputs) so they are testable and observable. 
+- [Tool interface documentation](https://github.com/cloudwego/eino/tree/main/components/tool) +- [Tool creation examples](https://github.com/cloudwego/eino-examples/tree/main/components/tool) diff --git a/content/en/docs/eino/quick_start/chapter_05_middleware.md b/content/en/docs/eino/quick_start/chapter_05_middleware.md index d15b7f96640..912fe4ff0d9 100644 --- a/content/en/docs/eino/quick_start/chapter_05_middleware.md +++ b/content/en/docs/eino/quick_start/chapter_05_middleware.md @@ -1,33 +1,454 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 5: Middleware (cross-cutting concerns)" +title: "Chapter 5: Middleware (Cross-Cutting Concerns)" weight: 5 --- -Goal of this chapter: understand the middleware pattern and implement Tool error handling and ChatModel retry. +Goal of this chapter: understand the Middleware pattern and implement Tool error handling and ChatModel retry. -## Why Middleware +## Why We Need Middleware -Once you add tools (Chapter 4), failures become normal in real-world systems: +In Chapter 4 we added Tool capabilities to the Agent, enabling file system access. But in real-world scenarios, **Tool errors or ChatModel errors are common**, for example: -- Tool failures: file not found, invalid args, missing permissions, etc. -- ChatModel failures: rate limits (429), network timeouts, temporary outages, etc. +- **Tool errors**: file not found, invalid arguments, insufficient permissions, etc. +- **ChatModel errors**: API rate limiting (429), network timeouts, service unavailable, etc. -Middleware provides a single place to handle these cross-cutting concerns without scattering logic throughout your business code. 
+### Problem 1: Tool Errors Interrupt the Entire Flow -## Code location +When a Tool execution fails, the error propagates directly to the Agent, interrupting the entire conversation: + +``` +[tool call] read_file(file_path: "nonexistent.txt") +Error: open nonexistent.txt: no such file or directory +// Conversation interrupted, user must restart +``` + +### Problem 2: Model Calls May Fail Due to Rate Limiting + +When the model API returns a 429 (Too Many Requests) error, the entire conversation also interrupts: + +``` +Error: rate limit exceeded (429) +// Conversation interrupted +``` + +### Desired Behavior + +These errors often **should not directly terminate the Agent flow**. Instead, we want to pass the error information to the model and let it self-correct in the next turn. For example: + +``` +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist. Let me first list the files in the current directory... +[tool call] glob(pattern: "*") +``` + +### Middleware's Role + +The **Middleware pattern** can extend Tool and ChatModel behavior, making it ideal for solving this problem: + +- **Middleware is an Agent interceptor**: inserts custom logic before and after calls +- **Middleware can handle errors**: converts errors into a format the model can understand +- **Middleware can implement retries**: automatically retries failed operations +- **Middleware is composable**: multiple Middlewares can be chained together + +**Simple analogy:** + +- **Agent** = "business logic" +- **Middleware** = "AOP aspect" (logging, retry, error handling, and other cross-cutting concerns) + +## Code Location - Entry code: [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) -## Full tutorial +## Prerequisites + +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). 
Also, like Chapter 4, set `PROJECT_ROOT`: + +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root directory +``` + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch05 +``` + +Example output: + +``` +you> List the files in the current directory +[assistant] Let me list the files... +[tool call] list_files(directory: ".") + +you> Read a non-existent file +[assistant] Trying to read the file... +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist... +``` + +## Key Concepts + +### Middleware Interface + +`ChatModelAgentMiddleware` is the middleware interface for Agents: + +```go +type ChatModelAgentMiddleware interface { + // BeforeAgent is called before each agent run, allowing modification of + // the agent's instruction and tools configuration. + BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + + // BeforeModelRewriteState is called before each model invocation. + // The returned state is persisted to the agent's internal state and passed to the model. + BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) + + // AfterModelRewriteState is called after each model invocation. + // The input state includes the model's response as the last message. + AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) + + // WrapInvokableToolCall wraps a tool's synchronous execution with custom behavior. + // This method is only called for tools that implement InvokableTool. 
+ WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + + // WrapStreamableToolCall wraps a tool's streaming execution with custom behavior. + // This method is only called for tools that implement StreamableTool. + WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + + // WrapEnhancedInvokableToolCall wraps an enhanced tool's synchronous execution. + // This method is only called for tools that implement EnhancedInvokableTool. + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + + // WrapEnhancedStreamableToolCall wraps an enhanced tool's streaming execution. + // This method is only called for tools that implement EnhancedStreamableTool. + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) + + // WrapModel wraps a chat model with custom behavior. + // This method is called at request time when the model is about to be invoked. 
+ WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) +} +``` + +**Design philosophy:** + +- **Decorator pattern**: each Middleware wraps the original call, and can modify inputs, outputs, or errors +- **Onion model**: requests pass through Middlewares from outer to inner; responses return from inner to outer +- **Composable**: multiple Middlewares execute in sequence + +### Middleware Execution Order + +`Handlers` (i.e., Middlewares) wrap in **array order**, forming an onion model: + +```go +Handlers: []adk.ChatModelAgentMiddleware{ + &middlewareA{}, // Outermost: wraps first, intercepts requests first, but WrapModel takes effect last + &middlewareB{}, // Middle layer + &middlewareC{}, // Innermost: wraps last +} +``` + +**Execution order for Tool calls:** + +``` +Request → A.Wrap → B.Wrap → C.Wrap → Actual Tool execution → C returns → B returns → A returns → Response +``` + +**Practical advice:** place `safeToolMiddleware` (error capture) at the innermost position (end of array) to ensure that interrupt errors thrown by other Middlewares propagate correctly outward. + +### SafeToolMiddleware + +`SafeToolMiddleware` converts Tool errors into strings so the model can understand and handle them: + +```go +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + // Convert error to string instead of returning an error + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +**Effect:** + +``` +[tool call] read_file(file_path: "nonexistent.txt") +[tool result] [tool error] open nonexistent.txt: no such file or directory +[assistant] Sorry, the file doesn't exist, please check the file path... +// Conversation continues, model can adjust its strategy based on the error +``` + +### ModelRetryConfig + +`ModelRetryConfig` configures automatic retry for ChatModel: + +```go +type ModelRetryConfig struct { + MaxRetries int // Maximum retry count + IsRetryAble func(ctx context.Context, err error) bool // Determines if an error is retryable +} +``` + +**Usage (using DeepAgent as an example):** + +```go +agent, err := deep.New(ctx, &deep.Config{ + // ... + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + // 429 rate limit errors are retryable + return strings.Contains(err.Error(), "429") || + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") + }, + }, +}) +``` + +**Retry strategy:** + +- Exponential backoff: retry intervals increase with each attempt +- Configurable conditions: `IsRetryAble` determines which errors are retryable +- Automatic recovery: no user intervention needed + +## Middleware Implementation + +### 1. Implement SafeToolMiddleware + +```go +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + // Interrupt errors are not converted; they need to propagate + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + // Other errors are converted to strings + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +### 2. Implement Streaming Tool Error Handling + +```go +func (m *safeToolMiddleware) WrapStreamableToolCall( + _ context.Context, + endpoint adk.StreamableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.StreamableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) { + sr, err := endpoint(ctx, args, opts...) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + // Return a single-chunk stream containing the error message + return singleChunkReader(fmt.Sprintf("[tool error] %v", err)), nil + } + // Wrap the stream to capture errors within the stream + return safeWrapReader(sr), nil + }, nil +} +``` + +### 3. 
Configure the Agent to Use Middleware + +This chapter continues using the `DeepAgent` introduced in Chapter 4, registering Middleware in its `Handlers` field: + +```go +agent, err := deep.New(ctx, &deep.Config{ + Name: "Ch05MiddlewareAgent", + Description: "ChatWithDoc agent with safe tool middleware and retry.", + ChatModel: cm, + Instruction: agentInstruction, + Backend: backend, + StreamingShell: backend, + MaxIteration: 50, + Handlers: []adk.ChatModelAgentMiddleware{ + &safeToolMiddleware{}, // Converts Tool errors to strings + }, + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + return strings.Contains(err.Error(), "429") || + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") + }, + }, +}) +``` + +**Note**: `Handlers` field (in config) and "Middleware" (the concept discussed in docs) are the same thing — `Handlers` is the config field name, while `ChatModelAgentMiddleware` is the interface name. + +**Key code snippet** (Note: this is a simplified snippet that cannot run directly; see [cmd/ch05/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go) for the full code): + +```go +// SafeToolMiddleware captures Tool errors and converts them to strings +type safeToolMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} + +// Configure DeepAgent (same as Chapter 4, adding Handlers and ModelRetryConfig) +agent, _ := deep.New(ctx, &deep.Config{ + ChatModel: cm, + Backend: backend, + StreamingShell: backend, + MaxIteration: 50, + Handlers: []adk.ChatModelAgentMiddleware{ + &safeToolMiddleware{}, + }, + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 5, + IsRetryAble: func(_ context.Context, err error) bool { + return strings.Contains(err.Error(), "429") + }, + }, +}) +``` + +## Middleware Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ User: read a non-existent file │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call │ + │ read_file │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ SafeToolMiddleware │ + │ Intercepts Tool call │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Execute read_file │ + │ Returns error │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ SafeToolMiddleware │ + │ Converts error to │ + │ string │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return Tool Result │ + │ "[tool error] ..." │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent generates reply│ + │ "Sorry, file not │ + │ found..." 
│ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Middleware**: Agent interceptors that insert custom logic before and after calls +- **SafeToolMiddleware**: converts Tool errors to strings so the model can understand and handle them +- **ModelRetryConfig**: configures automatic ChatModel retry for handling temporary errors like rate limiting +- **Decorator pattern**: Middleware wraps the original call and can modify inputs, outputs, or errors +- **Onion model**: requests pass through Middlewares from outer to inner; responses return from inner to outer + +## Extended Thinking + +**Eino built-in Middlewares:** + + + + + + +
| Middleware | Description |
| --- | --- |
| reduction | Tool output reduction: when tool output is too long, automatically truncates it and offloads it to the file system to prevent context overflow |
| summarization | Conversation history auto-summarization: automatically generates summaries to compress history when the token count exceeds a threshold |
| skill | Skill loading middleware: enables the Agent to dynamically load and execute predefined skills |

    + +**Middleware chain example:** + +```go +import ( + "github.com/cloudwego/eino/adk/middlewares/reduction" + "github.com/cloudwego/eino/adk/middlewares/summarization" + "github.com/cloudwego/eino/adk/middlewares/skill" +) -- [ch05_middleware.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch05_middleware.md) +// Create reduction middleware: manages tool output length +reductionMW, _ := reduction.New(ctx, &reduction.Config{ + Backend: filesystemBackend, // Storage backend + MaxLengthForTrunc: 50000, // Max length per tool output + MaxTokensForClear: 30000, // Token threshold to trigger cleanup +}) -## What you learn +// Create summarization middleware: auto-compresses conversation history +summarizationMW, _ := summarization.New(ctx, &summarization.Config{ + Model: chatModel, // Model used for generating summaries + Trigger: &summarization.TriggerCondition{ + ContextTokens: 190000, // Token threshold to trigger summarization + }, +}) -- How to wrap tool execution with consistent error handling. -- How to add retry policies around ChatModel calls in a composable way. -- How middleware keeps the Agent core clean and extensible. 
+// Combine multiple middlewares (conceptual example; when using DeepAgent, replace adk.NewChatModelAgent with deep.New) +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Handlers: []adk.ChatModelAgentMiddleware{ // Note: the config field is named Handlers, conceptually equivalent to Middlewares + summarizationMW, // Outermost: conversation history summarization + reductionMW, // Middle layer: tool output reduction + }, +}) +``` diff --git a/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md b/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md index 397af7dae31..652cec53495 100644 --- a/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md +++ b/content/en/docs/eino/quick_start/chapter_06_callback_and_trace.md @@ -1,23 +1,347 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 6: Callback and Trace (observability)" +title: "Chapter 6: Callback and Trace (Observability)" weight: 6 --- -Goal of this chapter: understand the Callback mechanism and integrate tracing/observability for the Agent execution. +Goal of this chapter: understand the Callback mechanism and integrate CozeLoop for tracing and observability. -## Code location +## Code Location - Entry code: [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go) -## Full tutorial +## Prerequisites -- [ch06_callback.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch06_callback.md) +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). Also, like Chapter 4, set `PROJECT_ROOT`: -## What you learn +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root directory (defaults to current directory if not set) +``` -- How callbacks expose lifecycle hooks for key execution points (model calls, tool calls, streaming chunks). 
-- How to build logging/metrics/tracing without coupling instrumentation into core logic. +Optional: configure CozeLoop for tracing: + +```bash +export COZELOOP_WORKSPACE_ID=your_workspace_id +export COZELOOP_API_TOKEN=your_token +``` + +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set project root directory +export PROJECT_ROOT=/path/to/your/project + +# Optional: configure CozeLoop +export COZELOOP_WORKSPACE_ID=your_workspace_id +export COZELOOP_API_TOKEN=your_token + +go run ./cmd/ch06 +``` + +Example output: + +``` +[trace] starting session: 083d16da-6b13-4fe6-afb0-c45d8f490ce1 +you> Hello +[trace] chat_model_generate: model=gpt-4.1-mini tokens=150 +[trace] tool_call: name=list_files duration=23ms +[assistant] Hello! How can I help you? +``` + +## From Black Box to White Box: Why We Need Callbacks + +In previous chapters, the Agent we built was a "black box": questions go in, answers come out, but what happened in between was opaque. + +**Problems with a black box:** + +- Don't know how many times the model was called +- Don't know how long Tool execution took +- Don't know how many tokens were consumed +- Hard to locate the root cause when issues arise + +**Callback's role:** + +- **Callback is Eino's sidecar mechanism**: consistent from component to compose (discussed below) to ADK +- **Callback triggers at fixed points**: 5 key moments in a component's lifecycle +- **Callback extracts real-time information**: inputs, outputs, errors, streaming data, etc. +- **Callback has broad applications**: observability, logging, metrics, tracing, debugging, auditing, etc. 
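As a rough illustration of the sidecar idea, the standalone sketch below fires hooks at fixed points around a component call without changing what the component returns. It uses only the Go standard library and is not Eino's API: `hooks`, `newRecordingHooks`, `runWithHooks`, and `errBoom` are hypothetical names invented for this example, and Eino's real callback types differ.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errBoom is a hypothetical error used to exercise the error hook.
var errBoom = errors.New("boom")

// hooks is a hypothetical stand-in for a callback handler; it is not an Eino type.
type hooks struct {
	onStart func(name, input string)
	onEnd   func(name, output string, elapsed time.Duration)
	onError func(name string, err error)
}

// newRecordingHooks returns hooks that append one event per trigger point.
func newRecordingHooks(events *[]string) hooks {
	return hooks{
		onStart: func(name, _ string) { *events = append(*events, "start:"+name) },
		onEnd: func(name, _ string, _ time.Duration) {
			*events = append(*events, "end:"+name)
		},
		onError: func(name string, _ error) { *events = append(*events, "error:"+name) },
	}
}

// runWithHooks calls the component and fires hooks at fixed points around it.
// The hooks only observe; the component's result and error pass through unchanged.
func runWithHooks(h hooks, name, input string, component func(string) (string, error)) (string, error) {
	h.onStart(name, input)
	begin := time.Now()
	out, err := component(input)
	if err != nil {
		h.onError(name, err)
		return "", err
	}
	h.onEnd(name, out, time.Since(begin))
	return out, nil
}

func main() {
	var events []string
	h := newRecordingHooks(&events)

	// Successful call: start and end hooks fire.
	out, err := runWithHooks(h, "echo", "hi", func(s string) (string, error) {
		return s + "!", nil
	})
	fmt.Println(out, err)

	// Failing call: start and error hooks fire.
	_, err = runWithHooks(h, "fail", "hi", func(string) (string, error) {
		return "", errBoom
	})
	fmt.Println(err, events)
}
```

The component function has no knowledge of the hooks, which is the property that lets observability be added or removed without touching business logic.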
+ +**Simple analogy:** + +- **Agent** = "business logic" (main path) +- **Callback** = "sidecar hooks" (extract information at fixed points) + +## Key Concepts + +### Handler Interface + +`Handler` is the core interface in Eino for defining callback handlers: + +```go +type Handler interface { + // Non-streaming input (before component starts processing) + OnStart(ctx context.Context, info *RunInfo, input CallbackInput) context.Context + + // Non-streaming output (after component returns successfully) + OnEnd(ctx context.Context, info *RunInfo, output CallbackOutput) context.Context + + // Error (when component returns an error) + OnError(ctx context.Context, info *RunInfo, err error) context.Context + + // Streaming input (when component receives streaming input) + OnStartWithStreamInput(ctx context.Context, info *RunInfo, + input *schema.StreamReader[CallbackInput]) context.Context + + // Streaming output (when component returns streaming output) + OnEndWithStreamOutput(ctx context.Context, info *RunInfo, + output *schema.StreamReader[CallbackOutput]) context.Context +} +``` + +**Design philosophy:** + +- **Sidecar mechanism**: does not interfere with the main flow, extracts information at fixed points +- **Full coverage**: all components from component to compose to ADK support callbacks +- **State passing**: the same Handler's OnStart→OnEnd can pass state via context +- **Performance optimization**: implement the `TimingChecker` interface to skip unnecessary timings + +**RunInfo structure:** + +```go +type RunInfo struct { + Name string // Business name (node name or user-specified) + Type string // Implementation type (e.g., "OpenAI") + Component string // Component type (e.g., "ChatModel") +} +``` + +**Important notes:** + +- Streaming callbacks must close the StreamReader, otherwise goroutine leaks will occur +- Do not modify Input/Output — they are shared with all downstream consumers +- RunInfo may be nil; check before use + +### CozeLoop + +CozeLoop is 
ByteDance's open-source AI application observability platform, providing: + +- **Tracing**: complete call chain visualization +- **Metrics monitoring**: latency, token consumption, error rates, etc. +- **Log aggregation**: centralized log management +- **Debug support**: online viewing and debugging + +**Integration:** + +```go +import ( + clc "github.com/cloudwego/eino-ext/callbacks/cozeloop" + "github.com/cloudwego/eino/callbacks" + "github.com/coze-dev/cozeloop-go" +) + +// Create CozeLoop client +client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(apiToken), + cozeloop.WithWorkspaceID(workspaceID), +) + +// Register as global Callback +callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) +``` + +### Callback Trigger Timings + +Callbacks trigger at 5 key moments in a component's lifecycle. The `Timing*` constants in the table below are Eino internal constant names (used for the `TimingChecker` interface); the corresponding Handler interface methods are shown on the right: + + + + + + + + +
| Timing Constant | Handler Method | Trigger Point | Input / Output |
| --- | --- | --- | --- |
| `TimingOnStart` | `OnStart` | Before component starts processing | `CallbackInput` |
| `TimingOnEnd` | `OnEnd` | After component returns successfully | `CallbackOutput` |
| `TimingOnError` | `OnError` | When component returns an error | `error` |
| `TimingOnStartWithStreamInput` | `OnStartWithStreamInput` | When component receives streaming input | `StreamReader[CallbackInput]` |
| `TimingOnEndWithStreamOutput` | `OnEndWithStreamOutput` | When component returns streaming output | `StreamReader[CallbackOutput]` |

    + +**Example: ChatModel call flow** + +``` +┌─────────────────────────────────────────┐ +│ ChatModel.Generate(ctx, messages) │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnStart │ ← Input: CallbackInput (messages) + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Model processing │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnEnd │ ← Output: CallbackOutput (response) + └──────────────────────┘ +``` + +**Example: Streaming output flow** + +``` +┌─────────────────────────────────────────┐ +│ ChatModel.Stream(ctx, messages) │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ OnStart │ ← Input: CallbackInput (messages) + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Model processing │ + │ (streaming) │ + └──────────────────────┘ + ↓ + ┌──────────────────────────┐ + │ OnEndWithStreamOutput │ ← Output: StreamReader[CallbackOutput] + └──────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Chunks returned │ + │ one by one │ + └──────────────────────┘ +``` + +**Notes:** + +- Streaming errors (errors mid-stream) do not trigger OnError; they are returned within the StreamReader +- The same Handler's OnStart→OnEnd can pass state via context +- There is no guaranteed execution order between different Handlers + +## Callback Implementation + +### 1. Implement a Custom Callback Handler + +Fully implementing the `Handler` interface requires implementing all 5 methods, which is verbose. Eino provides the `callbacks.HandlerHelper` utility class to simplify implementation: + +```go +import "github.com/cloudwego/eino/callbacks" + +// Use NewHandlerHelper to register callbacks of interest +handler := callbacks.NewHandlerHelper(). + OnStart(func(ctx context.Context, info *callbacks.RunInfo, input callbacks.CallbackInput) context.Context { + log.Printf("[trace] %s/%s start", info.Component, info.Name) + return ctx + }). 
+ OnEnd(func(ctx context.Context, info *callbacks.RunInfo, output callbacks.CallbackOutput) context.Context { + log.Printf("[trace] %s/%s end", info.Component, info.Name) + return ctx + }). + OnError(func(ctx context.Context, info *callbacks.RunInfo, err error) context.Context { + log.Printf("[trace] %s/%s error: %v", info.Component, info.Name, err) + return ctx + }). + Handler() + +// Register as global Callback +callbacks.AppendGlobalHandlers(handler) +``` + +**Note**: `RunInfo` may be `nil` (e.g., top-level calls without RunInfo); check before use. + +### 2. Integrate CozeLoop + +```go +// Setup CozeLoop tracing (optional) +// Set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { + client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), + ) + if err != nil { + log.Fatalf("cozeloop.NewClient failed: %v", err) + } + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() + callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) + log.Println("CozeLoop tracing enabled") +} else { + log.Println("CozeLoop tracing disabled (set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable)") +} +``` + +**Key code snippet** (Note: this is a simplified snippet that cannot run directly; see [cmd/ch06/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch06/main.go) for the full code): + +```go +// Setup CozeLoop tracing +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { + client, err := cozeloop.NewClient( + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), + ) + if err != nil { + log.Fatalf("cozeloop.NewClient failed: %v", 
err) + } + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() + callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) +} +``` + +## The Value of Observability + +### 1. Performance Analysis + +With data collected via Callbacks, you can analyze: + +- Model call latency distribution +- Tool execution time rankings +- Token consumption trends + +### 2. Error Tracing + +When the Agent has problems: + +- View the complete call chain +- Identify which step failed +- Analyze the root cause + +### 3. Cost Optimization + +Through token consumption data: + +- Identify high-consumption conversations +- Optimize prompts to reduce tokens +- Choose more economical models + +## Chapter Summary + +- **Callback**: Eino's observability hooks, triggered at key lifecycle points +- **CozeLoop**: ByteDance's AI application observability platform +- **Global registration**: register global Callbacks via `callbacks.AppendGlobalHandlers` +- **Non-invasive**: business code doesn't need modification; Callbacks trigger automatically +- **Observability value**: performance analysis, error tracing, cost optimization + +## Extended Thinking + +**Other Callback implementations:** + +- OpenTelemetry Callback: integrate with standard observability protocols +- Custom logging Callback: log to local files +- Metrics Callback: integrate with monitoring systems like Prometheus + +**Advanced usage:** + +- Implement sampling in Callbacks (only record a subset of requests) +- Implement rate limiting in Callbacks (based on token consumption) +- Implement alerting in Callbacks (notify when error rate is too high) diff --git a/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md b/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md index fe5fdfa2924..17bf9d43af6 100644 --- a/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md +++ b/content/en/docs/eino/quick_start/chapter_07_interrupt_resume.md @@ -1,23 +1,367 @@ --- Description: "" -date: 
"2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 7: Interrupt/Resume (human-in-the-loop)" +title: "Chapter 7: Interrupt/Resume (Human-in-the-Loop)" weight: 7 --- -Goal of this chapter: understand Interrupt/Resume and implement an approval flow so users can confirm before sensitive tool operations. +Goal of this chapter: understand the Interrupt/Resume mechanism and implement a Tool approval flow so users can confirm before sensitive operations. -## Code location +## Code Location - Entry code: [cmd/ch07/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch07/main.go) -## Full tutorial +## Prerequisites -- [ch07_interrupt_resume.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md) +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). Also, like Chapter 4, set `PROJECT_ROOT`: -## What you learn +```bash +export PROJECT_ROOT=/path/to/eino # Eino core library root directory (defaults to current directory if not set) +``` -- How to pause an execution at a safe boundary and request user input. -- How to resume from checkpoints to support long-running or approval-gated tasks. +## Running + +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch07 +``` + +Example output: + +``` +you> Please execute the command echo hello + +⚠️ Approval Required ⚠️ +Tool: execute +Arguments: {"command":"echo hello"} + +Approve this action? 
(y/n): y +[tool result] hello + +hello +``` + +## From Automatic Execution to Human Approval: Why We Need Interrupt + +In previous chapters, the Agent automatically executed all Tool calls, but in certain scenarios this is dangerous: + +**Risks of automatic execution:** + +- Deleting files: accidentally deleting important data +- Sending emails: sending incorrect content +- Executing commands: running dangerous operations +- Modifying config: breaking system settings + +**Interrupt's role:** + +- **Interrupt is the Agent's pause mechanism**: pauses before critical operations, waiting for user confirmation +- **Interrupt carries information**: shows the user the operation about to be executed +- **Interrupt is resumable**: execution continues after user confirmation, or the Tool returns a rejection message if the user declines + +**Simple analogy:** + +- **Automatic execution** = "autopilot" (fully trusting the system) +- **Interrupt** = "manual override" (critical decisions made by humans) + +## Key Concepts + +### Interrupt Mechanism + +`Interrupt` is the core mechanism in Eino for implementing human-agent collaboration. + +**Core idea: pause before executing critical operations, wait for user confirmation, then continue.** + +A Tool that requires approval has its execution split into **two phases**: + +1. **First call (triggers interrupt)**: the Tool saves the current arguments, then returns an interrupt signal. Runner pauses execution and returns an Interrupt event to the caller. +2. **Second call (after Resume)**: once the user has answered, Runner re-invokes the Tool; this time the Tool detects that it was previously interrupted, reads the user's approval result, and either executes the operation or returns a rejection message. 
+ +**Simplified pseudocode:** + +``` +func myTool(ctx, args): + if first_call: + save args + return interrupt_signal // Runner pauses, shows approval prompt + else: // Second call after Resume + if user_approved: + return execute_operation(saved_args) + else: + return "Operation rejected by user" +``` + +**Full code with key field explanations:** + +```go +// Trigger interrupt in a Tool +func myTool(ctx context.Context, args string) (string, error) { + // wasInterrupted: whether this is the second call after Resume (false on first call, true after Resume) + // storedArgs: arguments saved via StatefulInterrupt on first call, retrievable after Resume + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + + if !wasInterrupted { + // First call: trigger interrupt, saving args for use after Resume + return "", tool.StatefulInterrupt(ctx, &ApprovalInfo{ + ToolName: "my_tool", + ArgumentsInJSON: args, + }, args) // Third parameter is the state to save (retrieved via storedArgs after Resume) + } + + // Second call after Resume: read user's approval result + // isTarget: whether this Resume targets the current Tool (each Resume targets only one Tool) + // hasData: whether Resume carried approval result data + // data: the user's approval result + isTarget, hasData, data := tool.GetResumeContext[*ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return doSomething(storedArgs) // Execute actual operation with saved args + } + return "Operation rejected by user", nil + } + + // Other cases (isTarget=false means this Resume's target is not the current Tool): re-interrupt + return "", tool.StatefulInterrupt(ctx, &ApprovalInfo{ + ToolName: "my_tool", + ArgumentsInJSON: storedArgs, + }, storedArgs) +} +``` + +### ApprovalMiddleware + +`ApprovalMiddleware` is a generic approval middleware that intercepts specific Tool calls: + +```go +type approvalMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *approvalMiddleware) 
WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + tCtx *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + // Only intercept Tools that require approval + if tCtx.Name != "execute" { + return endpoint, nil + } + + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + + if !wasInterrupted { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: args, + }, args) + } + + isTarget, hasData, data := tool.GetResumeContext[*commontool.ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return endpoint(ctx, storedArgs, opts...) + } + if data.DisapproveReason != nil { + return fmt.Sprintf("tool '%s' disapproved: %s", tCtx.Name, *data.DisapproveReason), nil + } + return fmt.Sprintf("tool '%s' disapproved", tCtx.Name), nil + } + + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) 
+ }, nil +} + +func (m *approvalMiddleware) WrapStreamableToolCall( + _ context.Context, + endpoint adk.StreamableToolCallEndpoint, + tCtx *adk.ToolContext, +) (adk.StreamableToolCallEndpoint, error) { + // If the agent is configured with StreamingShell, execute goes through streaming calls; + // this method must be implemented to intercept it + if tCtx.Name != "execute" { + return endpoint, nil + } + return func(ctx context.Context, args string, opts ...tool.Option) (*schema.StreamReader[string], error) { + wasInterrupted, _, storedArgs := tool.GetInterruptState[string](ctx) + if !wasInterrupted { + return nil, tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: args, + }, args) + } + + isTarget, hasData, data := tool.GetResumeContext[*commontool.ApprovalResult](ctx) + if isTarget && hasData { + if data.Approved { + return endpoint(ctx, storedArgs, opts...) + } + if data.DisapproveReason != nil { + return singleChunkReader(fmt.Sprintf("tool '%s' disapproved: %s", tCtx.Name, *data.DisapproveReason)), nil + } + return singleChunkReader(fmt.Sprintf("tool '%s' disapproved", tCtx.Name)), nil + } + + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return nil, tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) + }, nil +} +``` + +### CheckPointStore + +`CheckPointStore` is a key component for implementing interrupt/resume: + +```go +type CheckPointStore interface { + // Save checkpoint + Put(ctx context.Context, key string, checkpoint *Checkpoint) error + + // Get checkpoint + Get(ctx context.Context, key string) (*Checkpoint, error) +} +``` + +**Why do we need CheckPointStore?** + +- Save state on interrupt: Tool arguments, execution position, etc. 
+- Load state on resume: continue execution from the interrupt point +- Support cross-process resume: can still resume after process restart + +## Interrupt/Resume Implementation + +### 1. Configure Runner with CheckPointStore + +```go +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: adkstore.NewInMemoryStore(), // In-memory storage +}) +``` + +### 2. Configure Agent with ApprovalMiddleware + +```go +agent, err := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ + // ... other config + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ + newApprovalMiddleware[M](), // Add approval middleware + newSafeToolMiddleware[M](), // Convert Tool errors to strings (interrupt errors propagate upward) + }, +}) +``` + +### 3. Handle Interrupt Events + +```go +checkPointID := sessionID + +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history), adk.WithCheckPointID(checkPointID)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{ + ShowToolCalls: true, + ShowToolResults: true, + CaptureInterrupt: true, +}) +if err != nil { + return err +} + +assistantText := result.AssistantText +if result.InterruptInfo != nil { + // Note: it's recommended to use the same stdin reader for both "user input" and "approval y/n" + // to avoid approval input being treated as the next round's you> message + assistantText, err = handleInterrupt[M](ctx, runner, checkPointID, result.InterruptInfo, reader) + if err != nil { + return err + } +} + +_ = session.Append(msgops.NewAssistant[M](assistantText, nil)) +``` + +## Interrupt/Resume Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ User: execute command echo hello │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent analyzes intent│ + │ Decides to call │ + │ execute │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ ApprovalMiddleware │ + │ Intercepts Tool call │ + 
└──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Trigger Interrupt │ + │ Save state to Store │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return Interrupt │ + │ event; await user │ + │ approval │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ User inputs y/n │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ runner.ResumeWith... │ + │ Resume execution │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Execute command │ + │ or return rejection │ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Interrupt**: the Agent's pause mechanism, pausing before critical operations to await confirmation +- **Resume**: resume execution — continue after user confirmation, or return an error on rejection +- **ApprovalMiddleware**: a generic approval middleware that intercepts specific Tool calls +- **CheckPointStore**: saves interrupt state, supporting cross-process resume +- **Human-agent collaboration**: critical decisions are confirmed by humans, improving safety + +## Extended Thinking + +**Other Interrupt scenarios:** + +- Multi-option approval: user selects one of several options +- Parameter completion: user provides missing parameters +- Conditional branching: user decides the execution path + +**Approval strategies:** + +- Allowlist: only approve sensitive operations +- Blocklist: approve all operations except safe ones +- Dynamic rules: decide whether to require approval based on argument content diff --git a/content/en/docs/eino/quick_start/chapter_08_graph_tool.md b/content/en/docs/eino/quick_start/chapter_08_graph_tool.md index e842eadb624..85d8cb984ce 100644 --- a/content/en/docs/eino/quick_start/chapter_08_graph_tool.md +++ b/content/en/docs/eino/quick_start/chapter_08_graph_tool.md @@ -1,24 +1,330 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 8: Graph Tool (complex workflows)" +title: "Chapter 8: Graph Tool (Complex Workflows)" 
weight: 8 --- -Goal of this chapter: understand the Graph Tool concept and build more complex workflows using the compose package. +Goal of this chapter: understand the Graph Tool concept, implement parallel chunk retrieval for large files, and introduce the compose package for building complex workflows. -## Code location +## Code Location - Entry code: [cmd/ch08/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch08/main.go) - RAG implementation: [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go) -## Full tutorial +## Prerequisites -- [ch08_graph_tool.md](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md) +Same as Chapter 1: you need a configured and available ChatModel (OpenAI or Ark). -## What you learn +## Running -- How to decompose a complex task into a deterministic execution graph. -- How to parallelize “chunking + retrieval” for large files and aggregate results back into a final answer. +In the `examples/quickstart/chatwitheino` directory: + +```bash +# Set project root directory +export PROJECT_ROOT=/path/to/your/project + +go run ./cmd/ch08 +``` + +Example output: + +``` +you> Please analyze the WebSocket handshake section in the RFC6455 document +[assistant] Let me analyze the document... +[tool call] answer_from_document(file_path: "rfc6455.txt", question: "WebSocket handshake process") +[tool result] Found 3 relevant passages, generating answer... +[assistant] According to the RFC6455 document, the WebSocket handshake process is as follows... +``` + +## From Simple Tools to Graph Tools: Why We Need Complex Workflows + +In Chapter 4, we created simple Tools where each Tool executes a single task. But in real scenarios, many tasks require multiple steps working together. 
+ +**Limitations of simple Tools:** + +- Single responsibility: each Tool only does one thing +- No parallelism: multiple independent tasks cannot execute simultaneously +- Hard to reuse: complex logic is difficult to split and compose + +**Important note: this chapter only demonstrates a small part of compose/graph/workflow capabilities.** + +From a broader perspective, Eino's `compose` package provides very general, deterministic orchestration capabilities: you can organize any system that needs "deterministic business flows" into executable pipelines using `compose`'s Graph/Chain/Workflow. It can **natively orchestrate all Eino components** (ChatModel, Prompt, Tools, Retriever, Embedding, Indexer, etc.), with a complete **callback** system and **interrupt/resume + checkpoint** support. + +**Graph Tool's role:** + +- **Graph Tool is a Tool-wrapped compose workflow**: wraps `compose.Graph / compose.Chain / compose.Workflow` compilable orchestration artifacts into a Tool callable by Agent +- **Supports parallelism/branching/composition**: provided by compose (parallelism, branching, field mapping, subgraphs, etc.); Graph Tool simply exposes them as a Tool entry point +- **Supports state management and persistence**: passes data between nodes, and saves/restores running state via checkpoints +- **Supports interrupt/resume**: both workflow-internal interrupts (triggered within nodes) and tool-level interrupt wrapping (nested interrupt scenarios) + +**Simple analogy:** + +- **Simple Tool** = "single-step operation" (read a file) +- **Graph Tool** = "pipeline" (read → chunk → score → filter → generate answer) + +## Key Concepts + +### compose.Workflow + +`compose.Workflow` is the core component for building workflows in Eino: + +```go +wf := compose.NewWorkflow[Input, Output]() + +// Add nodes +wf.AddLambdaNode("load", loadFunc).AddInput(compose.START) +wf.AddLambdaNode("chunk", chunkFunc).AddInput("load") +wf.AddLambdaNode("score", scoreFunc).AddInput("chunk") 
+wf.AddLambdaNode("answer", answerFunc).AddInput("score") + +// Connect to end node +wf.End().AddInput("answer") +``` + +**Core concepts:** + +- **Node**: a processing unit in the workflow +- **Edge**: data flow direction between nodes +- **START**: the workflow entry point +- **END**: the workflow exit point + +### BatchNode + +`BatchNode` is used for parallel processing of multiple tasks: + +```go +scorer := batch.NewBatchNode(&batch.NodeConfig[Task, Result]{ + Name: "ChunkScorer", + InnerTask: scoreOneChunk, // Processing function for a single task + MaxConcurrency: 5, // Maximum concurrency +}) +``` + +**How it works:** + +1. Receives a task list as input +2. Executes each task in parallel (bounded by MaxConcurrency) +3. Collects and returns all results + +### FieldMapping + +`FieldMapping` is used to pass data across nodes: + +```go +wf.AddLambdaNode("answer", answerFunc). + AddInputWithOptions("filter", // Get data from filter node + []*compose.FieldMapping{compose.ToField("TopK")}, + compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, // Get data from START node + []*compose.FieldMapping{compose.MapFields("Question", "Question")}, + compose.WithNoDirectDependency()) +``` + +**Why do we need FieldMapping?** + +- Pass data between non-adjacent nodes +- Merge multiple data sources into a single node +- Rename data fields + +## Graph Tool Implementation + +### 1. Define Input/Output Structures + +```go +type Input struct { + FilePath string `json:"file_path" jsonschema:"description=Absolute path to the uploaded document file"` + Question string `json:"question" jsonschema:"description=The question to answer from the document"` +} + +type Output struct { + Answer string `json:"answer"` + Sources []string `json:"sources"` +} +``` + +### 2. 
Build the Workflow + +```go +func buildWorkflow(cm model.BaseChatModel) *compose.Workflow[Input, Output] { + wf := compose.NewWorkflow[Input, Output]() + + // load: read file + wf.AddLambdaNode("load", compose.InvokableLambda( + func(ctx context.Context, in Input) ([]*schema.Document, error) { + data, err := os.ReadFile(in.FilePath) + if err != nil { + return nil, err + } + return []*schema.Document{{Content: string(data)}}, nil + }, + )).AddInput(compose.START) + + // chunk: split into chunks + wf.AddLambdaNode("chunk", compose.InvokableLambda( + func(ctx context.Context, docs []*schema.Document) ([]*schema.Document, error) { + var out []*schema.Document + for _, d := range docs { + out = append(out, splitIntoChunks(d.Content, 800)...) + } + return out, nil + }, + )).AddInput("load") + + // score: parallel scoring + scorer := batch.NewBatchNode(&batch.NodeConfig[scoreTask, scoredChunk]{ + Name: "ChunkScorer", + InnerTask: newScoreWorkflow(cm), + MaxConcurrency: 5, + }) + + wf.AddLambdaNode("score", compose.InvokableLambda( + func(ctx context.Context, in scoreIn) ([]scoredChunk, error) { + tasks := make([]scoreTask, len(in.Chunks)) + for i, c := range in.Chunks { + tasks[i] = scoreTask{Text: c.Content, Question: in.Question} + } + return scorer.Invoke(ctx, tasks) + }, + )). + AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + + // filter: sort descending by score, keep up to top-3 chunks with score ≥ 3. 
+ wf.AddLambdaNode("filter", compose.InvokableLambda( + func(ctx context.Context, scored []scoredChunk) ([]scoredChunk, error) { + sort.Slice(scored, func(i, j int) bool { + return scored[i].Score > scored[j].Score + }) + const maxK = 3 + var top []scoredChunk + for _, c := range scored { + if c.Score < 3 { + break + } + top = append(top, c) + if len(top) == maxK { + break + } + } + return top, nil + }, + )).AddInput("score") + + // answer: synthesize a response from top-k chunks, or return a not-found message if empty. + wf.AddLambdaNode("answer", compose.InvokableLambda( + func(ctx context.Context, in synthIn) (Output, error) { + if len(in.TopK) == 0 { + return Output{ + Answer: fmt.Sprintf("No relevant content found in the document for: %q", in.Question), + }, nil + } + return synthesize(ctx, cm, in) + }, + )). + AddInputWithOptions("filter", []*compose.FieldMapping{compose.ToField("TopK")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + + wf.End().AddInput("answer") + + return wf +} +``` + +### 3. Wrap as a Tool + +```go +func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, error) { + wf := buildWorkflow(cm) + return graphtool.NewInvokableGraphTool[Input, Output]( + wf, + "answer_from_document", + "Search a large uploaded document for content relevant to a question and synthesize a "+ + "cited answer from the most relevant passages. 
"+ + "Use this instead of read_file when the document may be too large to fit in context.", + ) +} +``` + +**Key code snippet** (Note: this is a simplified snippet that cannot run directly; see [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go) for the full code): + +```go +func BuildTool[M adk.MessageType](ctx context.Context, cm model.BaseModel[M]) (tool.BaseTool, error) { +// Build workflow +wf := compose.NewWorkflow[Input, Output]() + +// Add nodes +wf.AddLambdaNode("load", loadFunc).AddInput(compose.START) +wf.AddLambdaNode("chunk", chunkFunc).AddInput("load") +wf.AddLambdaNode("score", scoreFunc). + AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). + AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) + +// Wrap as Tool +return graphtool.NewInvokableGraphTool[Input, Output](wf, "answer_from_document", "...") +} +``` + +## Graph Tool Execution Flow + +``` +┌─────────────────────────────────────────┐ +│ Input: file_path, question │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ load: read file │ + │ Output: []*Document │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ chunk: split │ + │ Output: []*Document │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ score: parallel │ + │ scoring │ + │ (MaxConcurrency=5) │ + │ Output: []scoredChunk│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ filter: select top-k │ + │ Output: []scoredChunk│ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ answer: generate │ + │ Output: Output │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Return result │ + │ {answer, sources} │ + └──────────────────────┘ +``` + +## Chapter Summary + +- **Graph Tool**: wraps complex workflows as a Tool, supporting multi-step collaboration +- 
**compose.Workflow**: the core component for building workflows +- **BatchNode**: parallel processing of multiple tasks +- **FieldMapping**: passing data across nodes +- **Interrupt/Resume support**: Graph Tool supports the Checkpoint mechanism + +## Extended Thinking + +**Other Graph Tool applications:** + +- Multi-document RAG: parallel processing of multiple documents +- Multi-model collaboration: different models handling different tasks +- Complex decision trees: choosing different branches based on conditions + +**Performance optimization:** + +- Adjust MaxConcurrency to control parallelism +- Use caching to avoid redundant computation +- Stream output to improve user experience diff --git a/content/en/docs/eino/quick_start/chapter_09_a2ui_protocol.md b/content/en/docs/eino/quick_start/chapter_09_a2ui_protocol.md deleted file mode 100644 index b5a9204f732..00000000000 --- a/content/en/docs/eino/quick_start/chapter_09_a2ui_protocol.md +++ /dev/null @@ -1,252 +0,0 @@ ---- -Description: "" -date: "2026-03-16" -lastmod: "" -tags: [] -title: "Chapter 10: A2UI Protocol (Streaming UI Components)" -weight: 10 ---- - -Goal of this chapter: implement the A2UI protocol and render agent output as streaming UI components. - -## Important: A2UI’s boundary - -A2UI is not part of the Eino framework itself. It is a business-layer UI protocol/rendering approach. This chapter integrates A2UI into the agent built in earlier chapters to provide an end-to-end, production-ready example: model calls, tool calls, workflow orchestration, and finally presenting results in a more user-friendly UI. - -In real-world products, you can choose different UI forms depending on your product: - -- Web/App: custom components, tables, cards, charts -- IM/office suite: message cards, interactive forms -- CLI: plain text or TUI (terminal UI) - -Eino focuses on “composable intelligent execution and orchestration.” “How to present to users” is a business-layer concern you can extend freely. 
- -## Code location - -- Entry: [main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) -- Agent construction: [agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) -- Server routes: [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) -- A2UI subset types: [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go) -- A2UI event streamer: [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go) -- Frontend page: [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html) - -## Prerequisites - -Same as Chapter 1: configure a ChatModel (OpenAI or Ark). - -## Run - -In `quickstart/chatwitheino`, run: - -```bash -go run . -``` - -Output example: - -``` -starting server on http://localhost:8080 -``` - -### (Optional) Enable ch09 skills - -The final web agent aligns with Chapter 9 logic: when `EINO_EXT_SKILLS_DIR` points to a valid skills directory, it registers the `skill` middleware so the model can load `eino-guide` / `eino-component` / `eino-compose` / `eino-agent` on demand. - -```bash -go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/eino-ext -clean -EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . -``` - -## From text to UI: why A2UI - -The agents we built in the first eight chapters only output text, but modern AI applications need richer interaction. 
- -**Limitations of pure text:** - -- Cannot display structured data (tables, lists, cards) -- Cannot update in real time (progress, status changes) -- Cannot embed interactive elements (buttons, forms, links) -- Cannot support multimedia (images, video, audio) - -**A2UI’s positioning:** - -- **A2UI is a protocol from agent to UI**: defines how agent outputs map to UI components -- **A2UI supports streaming rendering**: components update in real time without waiting for the full response -- **A2UI is declarative**: the agent declares “what to show” and the UI handles rendering - -**Simple analogy:** - -- **Plain text output** = “terminal CLI” (text only) -- **A2UI** = “web app” (can render any UI components) - -## Key concepts - -### A2UI v0.8 subset (scope of this example) - -This quickstart does not implement a “full A2UI standard library.” Instead, it implements a **subset of A2UI v0.8**: the goal is to push the agent event stream to the browser as a stable, incrementally renderable UI component tree. - -The current supported A2UI message types and component types are defined in [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go). - -### A2UI messages: BeginRendering / SurfaceUpdate / DataModelUpdate / InterruptRequest - -Each SSE line (`data: {...}`) carries one A2UI Message. 
The Message is an “envelope” that contains exactly one field: - -**Key snippet (simplified; see a2ui/types.go for the full code):** - -```go -type Message struct { - BeginRendering *BeginRenderingMsg - SurfaceUpdate *SurfaceUpdateMsg - DataModelUpdate *DataModelUpdateMsg - DeleteSurface *DeleteSurfaceMsg - InterruptRequest *InterruptRequestMsg -} -``` - -Where: - -- `BeginRendering`: tells the frontend “start rendering a surface (session)” and provides the root node ID -- `SurfaceUpdate`: adds/updates a batch of components (components form a tree and reference each other by id) -- `DataModelUpdate`: updates data bindings (for streaming incremental text into a Text component) -- `InterruptRequest`: when the agent triggers an interrupt (e.g. approval), asks the frontend to render approve/reject entry points - -### A2UI components: Text / Column / Card / Row - -This example implements only 4 UI components (see [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go)): - -- `Text`: text rendering (supports `usageHint` to distinguish caption/body/title); when `dataKey` exists, text comes from `DataModelUpdate` -- `Column` / `Row`: layout (children is a list of component IDs) -- `Card`: card container (children is a list of component IDs) - -## A2UI implementation: converting AgentEvent to A2UI SSE - -The core web pipeline is: - -- Run the agent to get `*adk.AsyncIterator[*adk.AgentEvent]` -- Convert the event stream to A2UI JSONL/SSE for the browser (see [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go)) -- Frontend parses SSE `data:` lines and renders the component tree (see [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)) - -### Server routes (high level) - -Key endpoints related to A2UI (see 
[server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)): - -- `GET /`: serves the frontend page `static/index.html` -- `POST /sessions/:id/chat`: returns SSE stream (A2UI messages), renders the agent output as it runs -- `GET /sessions/:id/render`: returns JSONL (A2UI messages) for replaying history -- `POST /sessions/:id/approve`: handles interrupt approval/rejection and continues streaming - -### Event streaming (high level) - -The server passes `Runner.Run(...)` events to `a2ui.StreamToWriter(...)`, which: - -- splits user/assistant/tool outputs -- renders tool call / tool result as “chip cards” -- turns assistant streaming tokens into `DataModelUpdate` for incremental rendering -- sends `InterruptRequest` when an interrupt happens, and waits for human approval - -## Frontend integration: fetch + SSE (not WebSocket) - -- The frontend calls `fetch('/sessions/:id/chat')`, then reads `res.body` as a stream, splits by lines, and parses `data: {...}` JSON (see [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)). 
- -**Key snippet (simplified; see static/index.html for full code):** - -```javascript -const res = await fetch(`/sessions/${id}/chat`, { - method: 'POST', - headers: {'Content-Type': 'application/json'}, - body: JSON.stringify({message}), -}); - -const reader = res.body.getReader(); -const decoder = new TextDecoder(); -let buffer = ''; -while (true) { - const {done, value} = await reader.read(); - if (done) break; - buffer += decoder.decode(value, {stream: true}); - const lines = buffer.split('\n'); - buffer = lines.pop(); - for (const line of lines) { - const trimmed = line.trim(); - if (trimmed.startsWith('data:')) { - const jsonStr = trimmed.slice(5).trimStart(); - processA2UIMessage(JSON.parse(jsonStr)); - } - } -} -``` - -## A2UI streaming flow (overview) - -``` -┌─────────────────────────────────────────┐ -│ User: Analyze this file │ -└─────────────────────────────────────────┘ - ↓ - ┌──────────────────────┐ - │ Agent starts │ - │ A2UI: AddText │ - │ "Analyzing..." │ - └──────────────────────┘ - ↓ - ┌──────────────────────┐ - │ Tool call │ - │ A2UI: AddProgress │ - │ Progress: 0% │ - └──────────────────────┘ - ↓ - ┌──────────────────────┐ - │ Tool running │ - │ A2UI: UpdateProgress│ - │ Progress: 50% │ - └──────────────────────┘ - ↓ - ┌──────────────────────┐ - │ Tool finished │ - │ A2UI: tool result │ - └──────────────────────┘ - ↓ - ┌──────────────────────┐ - │ Show result │ - │ A2UI: DataModelUpdate│ - │ (stream assistant) │ - └──────────────────────┘ -``` - -## Chapter summary - -- **A2UI**: a protocol from agent to UI defining how agent output maps to UI components -- **Subset implementation**: this example only implements Text/Column/Card/Row plus data binding -- **Streaming output**: the backend streams A2UI JSONL over SSE, the frontend renders incrementally -- **Events to UI**: convert `AgentEvent` into visual outputs for tool calls, tool results, and assistant streams - -## Series wrap-up: the full vision of this Quickstart Agent - -By the end of 
this chapter, we have an agent that ties together Eino’s core capabilities. Think of it as an extensible “end-to-end agent application skeleton”: - -- Runtime: Runner-driven execution with streaming output and event model -- Tooling: filesystem/shell tools with safe error handling -- Middleware: pluggable middleware/handlers for errors, retries, approvals, and more -- Observability: callbacks/trace to connect key pipelines for debugging and production monitoring -- Human-in-the-loop: interrupt/resume + checkpoint for approvals, parameter requests, branch choices -- Deterministic orchestration: compose (graph/chain/workflow) organizes complex business flows -- Delivery: UI integration like A2UI is business-layer — pick what fits your product - -You can gradually replace/extend any part: models, tools, storage, workflows, frontend protocol — without starting over. - -## Further exploration - -**Other component types:** - -- Chart components (line, bar, pie) -- Map components -- Timeline components -- Tree components -- Tabs components - -**Advanced features:** - -- Component interactions (click, drag, input) -- Conditional rendering -- Component animations -- Responsive layout diff --git a/content/en/docs/eino/quick_start/chapter_09_skill_console.md b/content/en/docs/eino/quick_start/chapter_09_skill_console.md index ee7a57add54..7a7170783df 100644 --- a/content/en/docs/eino/quick_start/chapter_09_skill_console.md +++ b/content/en/docs/eino/quick_start/chapter_09_skill_console.md @@ -1,59 +1,56 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] -title: "Chapter 9: Skill (Console)" +title: "Chapter 9: Skill Middleware" weight: 9 --- -Goal of this chapter: on top of Chapter 8 (RAG + Interrupt/Resume + Checkpoint), introduce the `skill` middleware so the agent can discover and load reusable skill documents (`SKILL.md`) and invoke them via tool calls. 
+Goal of this chapter: on top of Chapter 8 (RAG + Interrupt/Resume + Checkpoint), introduce the `skill` package, use `skill middleware` to inject and manage skills, so the Agent can discover and load a set of reusable skill documents (`SKILL.md`) and invoke them via tool calls when needed. -## Code location +## Code Location - Entry: [cmd/ch09/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch09/main.go) - Sync script: [scripts/sync_eino_ext_skills.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/scripts/sync_eino_ext_skills.go) ## Prerequisites -- Same as Chapter 1: configure a ChatModel (OpenAI or Ark) -- Prepare skills provided by the eino-ext PR (`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) +- Same as Chapter 1: configure a working ChatModel (OpenAI or Ark) +- Prepare the skills documents provided by the `eino-ext` PR (`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) -Why these four? +`skill middleware` supports plugging in various skills. This chapter only uses the four eino-related skills as an example to demonstrate how to use `skill middleware` to integrate skills. Why these four? -ChatWithEino is positioned as “help users learn Eino and assist with Eino coding using AI.” These four skills cover the key knowledge areas: +ChatWithEino is positioned as "help users learn the Eino framework and assist with writing Eino code using AI." These four skill documents cover exactly the key knowledge areas needed for this goal. -- `eino-guide`: entry point and navigation (where to start, how to run quickly) -- `eino-component`: component interfaces and implementation references (Model/Embedding/Retriever/Tool/Callback, etc.) -- `eino-compose`: orchestration and deterministic workflow references (Graph/Chain/Workflow, etc.) -- `eino-agent`: ADK/Agent references (Agent/Runner/Middleware/Filesystem/Human-in-the-loop, etc.) 
+Skill sources can be: -Skill sources: +- The local path to the `eino-ext` repository (the script reads `/skills/...` automatically) +- Or a directory where skills are already installed (containing the above subdirectories) -- Local path to the `eino-ext` repository (the script reads `/skills/...`) -- Or any directory where skills are already installed (containing the above subdirectories) +## From Graph Tool to Skill: Why "Skill Docs" -## From Graph Tool to Skill: why “skill docs” - -Chapter 8 solves “how to make a complex workflow callable as a Tool” (Graph Tool). But for a framework-learning/development assistant agent, there is another problem: **how to inject stable, reusable knowledge and instructions into the agent, and let it load them on demand at runtime**. +Chapter 8 solves "how to make a complex workflow callable as a Tool" (Graph Tool). But when building an agent for framework-learning/development assistance, you encounter another type of problem: **how to inject stable, reusable knowledge and instructions into the Agent, and let it load them on demand at runtime**. That is the role of Skills: -- **Tool** is more like an “action/capability”: read files, run workflows, call external systems -- **Skill** is more like a “reusable knowledge/instruction pack”: a set of markdown files (`SKILL.md` + `reference/*.md`) that describe “how to do something” +- **Tool** is more like an "action/capability": read files, run workflows, call external systems +- **Skill** is more like a "reusable knowledge/instruction pack": a set of markdown files (`SKILL.md` + `reference/*.md`) describing "how to do something" + +And `Skill middleware` is responsible for integrating skills into the agent. After registering the skill middleware, the Agent can read a specific Skill on demand via the `skill` tool. 
Simple analogy: -- **Tool** = “what you can do” (function/interface) -- **Skill** = “how to do it” (reusable handbook/manual) +- **Tool** = "what you can do" (function/interface) +- **Skill** = "how to do it" (reusable handbook/manual) ## Run -In `quickstart/chatwitheino`, do: +In the `quickstart/chatwitheino` directory: -### 1) Sync eino-ext skills into a local directory +### 1) Sync eino-ext skills to a local directory -To let the `skill` middleware discover skills, place them under a single directory and follow the scan convention: +To let the `skill` middleware "discover" these skills, place them under a unified directory following the scan convention: - `EINO_EXT_SKILLS_DIR//SKILL.md` @@ -66,14 +63,15 @@ go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/e Notes: - `-src` supports two forms: - - The root of the `eino-ext` repo (the script reads `/skills/...`) + - The root of the `eino-ext` repository (the script reads `/skills/...` automatically) - A directory where skills are already installed (should contain `eino-guide/`, `eino-component/`, etc.) - `-dest` defaults to `./skills/eino-ext` (can be omitted) ### 2) Start Chapter 9 ```bash -EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext go run ./cmd/ch09 +export EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext +go run ./cmd/ch09 ``` Output example (snippet): @@ -85,13 +83,13 @@ Enter your message (empty line to exit): ## Enable Skill in DeepAgent -Skill invocation is not automatic. You must register the `skill` middleware when building the agent. It’s a three-step setup: +Skill invocation does not happen automatically. You must register the `Skill middleware` when building the Agent. The core setup is three steps: -1. Use a local filesystem backend (this chapter uses `eino-ext/adk/backend/local`) to provide file reading/Glob -2. Use `skill.NewBackendFromFilesystem` to turn `EINO_EXT_SKILLS_DIR` into a skill backend -3. 
Use `skill.NewMiddleware` to create the middleware and attach it to DeepAgent’s `Handlers` +1. Use a local filesystem backend (this chapter uses `eino-ext/adk/backend/local`) to provide file reading/Glob capability +2. Use `skill.NewBackendFromFilesystem` to turn `EINO_EXT_SKILLS_DIR` into a Skill Backend +3. Use `skill.NewTyped[M]` to create a generic `Skill middleware` and attach it to DeepAgent's `Handlers` -**Key snippet (simplified; see cmd/ch09/main.go for full code):** +**Key code snippet (note: this is simplified and not directly runnable; see cmd/ch09/main.go for full code):** ```go backend, _ := localbk.NewBackend(ctx, &localbk.Config{}) @@ -100,29 +98,29 @@ skillBackend, _ := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesys Backend: backend, BaseDir: skillsDir, // = $EINO_EXT_SKILLS_DIR }) -skillMiddleware, _ := skill.NewMiddleware(ctx, &skill.Config{ +skillMiddleware, _ := skill.NewTyped[M](ctx, &skill.TypedConfig[M]{ Backend: skillBackend, }) -agent, _ := deep.New(ctx, &deep.Config{ +agent, _ := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ ChatModel: cm, Backend: backend, StreamingShell: backend, - Handlers: []adk.ChatModelAgentMiddleware{ + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ skillMiddleware, // ... other middlewares like approval/safeTool/retry }, }) ``` -Notes: +Additional notes: -- This quickstart checks `EINO_EXT_SKILLS_DIR` existence at runtime: if it exists, it registers `skillMiddleware`; otherwise it skips it (the agent still runs and can use RAG tools). -- Skill tool input is JSON: `{"skill": ""}`, e.g. `{"skill":"eino-guide"}`. +- This quickstart checks `EINO_EXT_SKILLS_DIR` existence at runtime to ensure "it runs even without skills configured": if the directory exists, it registers `skillMiddleware`; otherwise it skips it (the agent still works and can use RAG tools). +- The Skill tool's input is JSON: `{"skill": ""}`, e.g. `{"skill":"eino-guide"}`. 
-## Quick verification (recommended) +## Quick Verification (Recommended) -After startup, send a prompt that forces a skill tool call to verify that skills are discovered and loadable: +After startup, send a prompt that explicitly asks the model to call the skill tool (to verify skills are discovered and loadable): ``` Use the skill tool with skill="eino-guide" and tell me what the entry point is for getting started. @@ -133,10 +131,10 @@ You should see output similar to: - `[tool result] Launching skill: eino-guide` - Tool result includes `Base directory for this skill: .../eino-guide` -## What you will see +## What You Will See - When the model calls the skill tool, the console prints: - `[tool call] ...` - - `[tool result] ...` (truncated) -- Sessions are stored under `SESSION_DIR` (default `./data/sessions`) and can be resumed: + - `[tool result] ...` (truncated for display) +- Sessions are saved to `./data/sessions_agentic` by default and support resumption: - `go run ./cmd/ch09 --session ` diff --git a/content/en/docs/eino/quick_start/chapter_10_a2ui_protocol.md b/content/en/docs/eino/quick_start/chapter_10_a2ui_protocol.md new file mode 100644 index 00000000000..1ed1b7d9b91 --- /dev/null +++ b/content/en/docs/eino/quick_start/chapter_10_a2ui_protocol.md @@ -0,0 +1,227 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: "Chapter 10: A2UI Protocol (Streaming UI Components)" +weight: 10 +--- + +Goal of this chapter: Implement the A2UI protocol to render Agent output as streaming UI components. + +## Important Note: The Scope of A2UI + +A2UI does not belong to the Eino framework itself — it is a business-layer UI protocol/rendering solution. This chapter integrates A2UI into the Agent built progressively in previous chapters to provide an end-to-end, production-ready complete example: from model calls, tool calls, workflow orchestration, to finally presenting results in a more user-friendly UI form. 
+ +In real business scenarios, you can choose different UI forms depending on the product: + +- Web / App: Custom components, tables, cards, charts, etc. +- IM/Office suites: Message cards, interactive forms +- Command line: Plain text or TUI (Terminal UI) + +Eino focuses on "composable intelligent execution and orchestration capabilities." How to present results to users is a business-layer concern that can be freely extended. + +## Code Locations + +- Entry code (Runner version): [cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go) +- A2UI subset implementation: [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go) +- A2UI event stream conversion: [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go) +- Frontend page: [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html) + +## Prerequisites + +Same as Chapter 1: You need to configure an available ChatModel (OpenAI or Ark) + +## Running + +In the `quickstart/chatwitheino` directory, execute: + +```bash +go run ./cmd/ch10/ +``` + +Example output: + +``` +starting server on http://localhost:8080 +``` + +### (Optional) Enable ch09 skills capability + +The final Web version uses Agent construction logic aligned with Chapter 9: when `EINO_EXT_SKILLS_DIR` points to a valid skills directory, the `skill` middleware is automatically registered, allowing the model to load `eino-guide` / `eino-component` / `eino-compose` / `eino-agent` via the `skill` tool on demand. + +```bash +go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/eino-ext -clean +EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run ./cmd/ch10/ +``` + +Sessions are saved by default in `./data/sessions_agentic`. 
+ +## From Text to UI: Why A2UI is Needed + +In the first eight chapters, our Agent only outputs text, but modern AI applications need richer interactions. + +**Limitations of plain text output:** + +- Cannot display structured data (tables, lists, cards, etc.) +- Cannot update in real-time (progress bars, status changes, etc.) +- Cannot embed interactive elements (buttons, forms, links, etc.) +- Cannot support multimedia (images, video, audio, etc.) + +**A2UI's positioning:** + +- **A2UI is a protocol from Agent to UI**: Defines how Agent output maps to UI components +- **A2UI supports streaming rendering**: Components can update in real-time without waiting for a complete response +- **A2UI is declarative**: The Agent only needs to declare "what to display," and the UI handles rendering + +**Simple analogy:** + +- **Plain text output** = "Terminal command line" (can only display text) +- **A2UI** = "Web application" (can display any UI component) + +## Key Concepts + +### A2UI v0.8 Subset (Scope of This Example) + +This quickstart does not implement a "complete A2UI standard library." Instead, it implements an **A2UI v0.8 subset**: the goal is to push the Agent's event stream to the browser as a stable, incrementally renderable UI component tree. + +The currently implemented A2UI message types and component types are defined in [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go). + +### A2UI Messages: BeginRendering / SurfaceUpdate / DataModelUpdate / InterruptRequest + +Each SSE line (`data: {...}`) carries one A2UI Message. A Message is an "envelope structure" where only one field is present at a time: + +**Key code snippet (Note: this is a simplified code snippet that cannot run directly. 
See ****a2ui/types.go**** for complete code):** + +```go +type Message struct { + BeginRendering *BeginRenderingMsg + SurfaceUpdate *SurfaceUpdateMsg + DataModelUpdate *DataModelUpdateMsg + DeleteSurface *DeleteSurfaceMsg + InterruptRequest *InterruptRequestMsg +} +``` + +Where: + +- `BeginRendering`: Tells the frontend to "start rendering a surface (session)" and specifies the root node ID +- `SurfaceUpdate`: Adds/updates a batch of components (components form a tree, referencing each other by `id`) +- `DataModelUpdate`: Updates data bindings (used to incrementally update streaming text to a Text component) +- `InterruptRequest`: When the Agent triggers an interrupt (e.g., approval), notifies the frontend to display an approve/reject entry + +### A2UI Components: Text / Column / Card / Row + +This example implements only 4 UI components (see [a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go)): + +- `Text`: Text rendering (supports `usageHint` to distinguish caption/body/title); when `dataKey` is present, text comes from `DataModelUpdate` +- `Column` / `Row`: Layout (children are component ID lists) +- `Card`: Card container (children are component ID lists) + +## A2UI Implementation: Converting AgentEvent to A2UI SSE + +The core pipeline of the final Web version is: + +- The backend runs the Agent, obtaining `*adk.AsyncIterator[*adk.TypedAgentEvent[M]]` +- The event stream is converted to A2UI JSONL/SSE output for the browser (see [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go)) +- The frontend parses SSE `data:` lines and renders the component tree (see [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)) + +### Server Routes (High Level) + +Key interfaces related to A2UI (see 
[cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go)): + +- `GET /`: Returns the frontend page `static/index.html` +- `POST /sessions/:id/chat`: Returns an SSE stream (A2UI messages), rendering Agent results to the UI as they execute +- `GET /sessions/:id/render`: Returns JSONL (A2UI messages) for "replaying history when selecting a session" +- `POST /sessions/:id/approve`: Handles interrupt approval/rejection and continues returning the SSE stream + +### Event Stream Conversion (High Level) + +The server passes the `Runner.Run(...)` event stream to `a2ui.StreamToWriter[M](...)`, which is responsible for: + +- Splitting user/assistant/tool output +- Rendering tool call / tool result as "chip cards" +- Converting the assistant's streaming tokens into `DataModelUpdate` for "render while generating" +- Sending `InterruptRequest` when encountering an interrupt, and pausing to wait for human approval + +## Frontend Integration: fetch + SSE (Not WebSocket) + +- The frontend initiates a request via `fetch('/sessions/:id/chat')`, then reads streaming bytes from `res.body`, splits by line, and parses `data: {...}` JSON (see [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)). + +**Key code snippet (Note: this is a simplified code snippet that cannot run directly. 
See ****static/index.html**** for complete code):** + +```javascript +const res = await fetch(`/sessions/${id}/chat`, { + method: 'POST', + headers: {'Content-Type': 'application/json'}, + body: JSON.stringify({message}), +}); + +const reader = res.body.getReader(); +const decoder = new TextDecoder(); +let buffer = ''; +while (true) { + const {done, value} = await reader.read(); + if (done) break; + buffer += decoder.decode(value, {stream: true}); + const lines = buffer.split('\n'); + buffer = lines.pop(); + for (const line of lines) { + const trimmed = line.trim(); + if (trimmed.startsWith('data:')) { + const jsonStr = trimmed.slice(5).trimStart(); + processA2UIMessage(JSON.parse(jsonStr)); + } + } +} +``` + +## A2UI Streaming Rendering Flow (Overview) + +``` +┌─────────────────────────────────────────┐ +│ User: Analyze this file │ +└─────────────────────────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Agent starts │ + │ A2UI: AddText │ + │ "Analyzing..." │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Call Tool │ + │ A2UI: AddProgress │ + │ Progress: 0% │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Tool executing │ + │ A2UI: UpdateProgress│ + │ Progress: 50% │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Tool complete │ + │ A2UI: tool result │ + └──────────────────────┘ + ↓ + ┌──────────────────────┐ + │ Display results │ + │ A2UI: DataModelUpdate│ + │ (streaming assistant)│ + └──────────────────────┘ +``` + +## Chapter Summary + +- **A2UI**: A protocol from Agent to UI, defining how Agent output maps to UI components +- **Subset implementation**: This example only implements Text/Column/Card/Row and data binding +- **Streaming output**: The backend pushes A2UI JSONL via SSE; the frontend incrementally renders the component tree +- **Events to UI**: Converts `AgentEvent` into visualized output of `tool call / tool result / assistant stream` + +## Next Steps + +This chapter's `cmd/ch10` uses `adk.Runner` 
to implement a complete Web application. However, Runner is a "one-shot" model — if a user sends a new question while the Agent is still answering, Runner has no built-in mechanism to cancel the current execution and switch to the new input. + +The next chapter introduces `adk.TurnLoop`, adding **Preempt** and **Abort** capabilities to the Agent. diff --git a/content/en/docs/eino/quick_start/chapter_11_turnloop.md b/content/en/docs/eino/quick_start/chapter_11_turnloop.md new file mode 100644 index 00000000000..1c9d3ffffe5 --- /dev/null +++ b/content/en/docs/eino/quick_start/chapter_11_turnloop.md @@ -0,0 +1,247 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: "Chapter 11: TurnLoop — Preemption, Abort, and Multi-Turn Lifecycle" +weight: 11 +--- + +In the previous chapter, we used `adk.Runner` to implement a complete A2UI Web application. It works fine, but try this scenario: + +> You ask the Agent a complex question. It starts calling tools, generating a long answer... but you suddenly realize you asked the wrong thing and want to switch to a different question. + +In the previous chapter's Runner mode, you can only wait for it to finish or refresh the page and lose everything. + +This chapter introduces `adk.TurnLoop`, enabling two new user-facing capabilities for the Agent: **Preemption** and **Abort**. + +## Prerequisites + +Same as Chapter 1: You need to configure an available ChatModel (OpenAI or Ark). See the "Prerequisites" section in Chapter 1 for details. + +## Running & Trying It Out + +In the `quickstart/chatwitheino` directory, execute: + +```bash +go run . +``` + +Open your browser to `http://localhost:8080`, then try the following: + +### Experience Preemption + +1. Send a question that triggers a long answer, e.g., "Explain all of Eino's components in detail" +2. **While the Agent is still answering**, send a new message directly, e.g., "Never mind, just tell me what ChatModel is" +3. 
Observe: The old answer stops immediately, and the Agent begins answering the new question + +### Experience Abort + +1. Send a question +2. **While the Agent is answering**, click the **Abort button** in the top-right corner +3. Observe: The Agent stops immediately and produces no further output + +Neither of these capabilities existed in the previous chapter's Runner version. Below explains how they are implemented. + +## Code Locations + +- Entry code: [main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) +- Agent construction: [agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) +- TurnLoop server: [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## Why Runner Can't Do This + +In the previous chapter's `cmd/ch10`, each `/sessions/:id/chat` request calls `runner.Run(ctx, messages)` once. Runner is a **single-turn** model — call once, execute once, finish. If a user sends another message while the Agent is executing, Runner has no "running loop" to receive it. + +TurnLoop is a **persistent multi-turn execution loop**. It remains idle between turns, ready to receive new input via `Push()` and respond immediately. Because there is a continuously running loop, preemption and abort become possible — you can interrupt an ongoing turn or stop the entire loop. + + + + + + + + + +
| Capability | Ch10 (Runner, single-turn) | Ch11 (TurnLoop, multi-turn) |
| --- | --- | --- |
| Streaming output | ✅ | ✅ |
| Approval / interrupt | ✅ | ✅ |
| Persistent cross-turn execution, real-time response to new input | ❌ Each `Run()` is independent | ✅ `Push()` anytime |
| Preempt an ongoing answer | ❌ | ✅ `Push(item, WithPreempt(...))` |
| Abort Agent | ❌ | ✅ `loop.Stop(WithImmediate())` |
| Flexible per-turn input construction | ❌ Business layer assembles input manually | ✅ `GenInput` callback |
    + +## TurnLoop's Core Model + +TurnLoop is a **push-based event loop that manages Agent execution in units of turns**. Unlike Runner's "call once, execute once" model, TurnLoop runs continuously: after a turn ends, it enters idle wait; when a new item arrives, it immediately starts the next turn. + +``` +Push(item) → [queue] → GenInput(items) → Agent.Run() → OnAgentEvents(events) + ↑ │ + └──── idle wait / next turn ←──┘ +``` + +Key concepts: + +- **Item**: The carrier of user input. This example defines it as `ChatItem`, which can carry user messages or approval decisions +- **GenInput**: Builds Agent input from items in the queue (decides which items to consume and which to retain for the next turn) +- **OnAgentEvents**: Receives the Agent's output event stream, responsible for rendering and persistence +- **Push**: Pushes a new item to the queue, with optional preemption options + +## One Session Corresponds to One TurnLoop + +In this example's Web scenario, each chat session corresponds to one TurnLoop instance. When a user sends their first message, the server creates a TurnLoop for that session and calls `Run()` to start it; subsequent messages are fed into the same loop via `Push()`. The loop remains idle between turns until the session is deleted or the user aborts. + +This is TurnLoop's most typical usage pattern: **the loop's lifecycle is bound to the user session**. A long-running TurnLoop makes preemption and abort natural operations — because the "running loop" always exists, new input can be fed in at any time. 
+ +## Normal Flow: idle → new message → answer → idle + +The simplest scenario is the user asking questions sequentially, waiting for answers, then asking the next: + +```go +// When the user sends the first message, create and start TurnLoop +loop := adk.NewTurnLoop(cfg) +loop.Push(&ChatItem{Query: "hello"}) +loop.Run(ctx) +// → GenInput builds input → Agent executes → OnAgentEvents streams output +// → Turn ends, TurnLoop enters idle wait + +// User sends second message (loop is idle at this point) +loop.Push(&ChatItem{Query: "explain Eino's architecture"}) +// → TurnLoop wakes up, starts new turn: GenInput → Agent → OnAgentEvents → idle +``` + +This flow is no different from the previous chapter's Runner in terms of user experience — the difference is that TurnLoop's loop **persists** and doesn't need to be recreated each time. Once a user sends a new message while the Agent is still answering, we enter the "preemption" scenario below. + +## How Preemption Works + +When a user sends a new message while the Agent is answering, the business layer only needs one line of code to trigger preemption: + +```go +loop.Push(item, adk.WithPreempt[*ChatItem, M](adk.AfterToolCalls)) +``` + +After TurnLoop receives this instruction: + +1. Waits for the current tool call to complete (`AfterToolCalls` means don't interrupt an executing tool to avoid inconsistent state) +2. Cancels the current turn — OnAgentEvents' context is cancelled, the old turn exits +3. Takes the new item from the queue, builds input via GenInput, starts a new turn + +Preemption mode allows choosing different safe points based on business needs: + + + + + + +
| Mode | Specific Behavior |
| --- | --- |
| `AfterToolCalls` | Waits for the currently executing tool calls to complete, then cancels the current turn and starts a new turn |
| `AfterChatModel` | Waits for the current LLM call to complete, then cancels the current turn and starts a new turn |
| `AnySafePoint` | Immediately cancels the current turn at any safe point (e.g., between tool calls, between model calls) and starts a new turn |
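
The safe-point rule in the table above can be illustrated with a small, self-contained sketch. It does not use the Eino API: `SafePoint`, `StepKind`, `canPreempt`, and `runTurn` are hypothetical names defined only for this illustration, and a turn is modeled as a plain list of finished steps. The assumption is that a preempt request never interrupts a step midway and only takes effect at a boundary that matches the requested mode.

```go
package main

import "fmt"

// SafePoint is a hypothetical local stand-in for the preemption modes
// listed in the table; it is not the adk type.
type SafePoint int

const (
	AfterToolCalls SafePoint = iota
	AfterChatModel
	AnySafePoint
)

// StepKind labels what the turn has just finished.
type StepKind int

const (
	ChatModelDone StepKind = iota
	ToolCallsDone
)

// canPreempt reports whether a pending preempt request with the given mode
// may cancel the turn right after a step of kind `finished`.
func canPreempt(mode SafePoint, finished StepKind) bool {
	switch mode {
	case AfterToolCalls:
		return finished == ToolCallsDone
	case AfterChatModel:
		return finished == ChatModelDone
	case AnySafePoint:
		return true
	}
	return false
}

// runTurn walks the steps of one turn and returns how many steps completed
// before a preempt request, arriving while step `requestedAt` runs, took effect.
func runTurn(steps []StepKind, mode SafePoint, requestedAt int) int {
	for i, s := range steps {
		// Step i runs to completion here; it is never interrupted midway.
		if i >= requestedAt && canPreempt(mode, s) {
			return i + 1
		}
	}
	return len(steps) // no matching safe point: the turn finished normally
}

func main() {
	steps := []StepKind{ChatModelDone, ToolCallsDone, ChatModelDone, ToolCallsDone}
	fmt.Println(runTurn(steps, AfterToolCalls, 0)) // 2
	fmt.Println(runTurn(steps, AfterChatModel, 0)) // 1
	fmt.Println(runTurn(steps, AnySafePoint, 2))   // 3
}
```

In this model, `AfterToolCalls` lets the in-flight tool calls finish before the old turn is dropped, which is the consistency argument given above for choosing it as the default in this example.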
    + +> In this example, TurnLoop runs in a separate goroutine while the HTTP handler needs to write the event stream to the SSE response. The two coordinate via channels (see `iterEnvelope`/`iterResult` and the `handlerDone` signaling mechanism in [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)). These are HTTP adaptation layer details, not part of the TurnLoop API itself. + +## How Abort Works + +Abort is simpler — directly stop the entire TurnLoop: + +```go +loop.Stop(adk.WithImmediate()) // Cancel immediately without waiting for current turn +loop.Wait() // Wait for complete exit +``` + +### Three Modes of Stop + + + + + + +
| Mode | Specific Behavior |
| --- | --- |
| `loop.Stop()` | Turn-boundary exit: waits for the current turn to complete before exiting |
| `loop.Stop(WithImmediate())` | Immediate exit: cancels the current turn's context |
| `loop.Stop(WithGraceful())` | Safe-point exit: exits at the next safe point (e.g., between tool calls) |
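
To make the difference between a turn-boundary exit and an immediate exit concrete, here is a minimal sketch built only on the Go standard library. `miniLoop`, its methods, and `demo` are hypothetical names, not the `adk.TurnLoop` implementation, and the safe-point (graceful) mode is left out to keep the sketch short. It assumes that an immediate stop is realized by cancelling the context handed to the running turn, as the table describes.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// miniLoop is a hypothetical stand-in for a turn loop; it is NOT the adk.TurnLoop API.
type miniLoop struct {
	ctx    context.Context
	cancel context.CancelFunc
	stopCh chan struct{} // closed once a stop has been requested
	once   sync.Once
	items  chan string
	wg     sync.WaitGroup
	turn   func(ctx context.Context, item string)
}

func newMiniLoop(turn func(ctx context.Context, item string)) *miniLoop {
	ctx, cancel := context.WithCancel(context.Background())
	l := &miniLoop{
		ctx: ctx, cancel: cancel,
		stopCh: make(chan struct{}),
		items:  make(chan string, 8),
		turn:   turn,
	}
	l.wg.Add(1)
	go l.run()
	return l
}

func (l *miniLoop) run() {
	defer l.wg.Done()
	for {
		// A stop request always wins at a turn boundary.
		select {
		case <-l.stopCh:
			return
		default:
		}
		select {
		case <-l.stopCh:
			return
		case item := <-l.items:
			l.turn(l.ctx, item)
		}
	}
}

func (l *miniLoop) Push(item string) { l.items <- item }

// Stop requests exit. With immediate=true the running turn's context is
// cancelled; otherwise the running turn is allowed to finish first.
func (l *miniLoop) Stop(immediate bool) {
	l.once.Do(func() { close(l.stopCh) })
	if immediate {
		l.cancel()
	}
}

// Wait blocks until the loop goroutine has exited, then releases the context.
func (l *miniLoop) Wait() { l.wg.Wait(); l.cancel() }

// demo starts one turn, stops the loop while that turn is in flight,
// and reports how the turn ended.
func demo(immediate bool) string {
	started := make(chan struct{})
	release := make(chan struct{})
	var outcome string
	l := newMiniLoop(func(ctx context.Context, item string) {
		close(started)
		select {
		case <-ctx.Done():
			outcome = item + ":cancelled"
		case <-release:
			outcome = item + ":done"
		}
	})
	l.Push("a")
	<-started
	l.Push("b") // queued, but never runs because the loop is stopped first
	l.Stop(immediate)
	if !immediate {
		close(release) // let the in-flight turn finish normally
	}
	l.Wait()
	return outcome
}

func main() {
	fmt.Println(demo(false)) // a:done
	fmt.Println(demo(true))  // a:cancelled
}
```

In both cases the queued item `b` is never processed: a stop request is checked before the next item is taken from the queue, so only the in-flight turn is affected by the choice of mode.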
    + +## TurnLoop Configuration + +When creating a TurnLoop, specify callbacks and options via `TurnLoopConfig`: + +```go +cfg := adk.TurnLoopConfig[*ChatItem, M]{ + // GenInput: Called at the start of each turn, decides "what the Agent sees this turn" + // Selects items from the queue to build Agent input, returns Consumed (processed this turn) and Remaining (kept for later turns) + GenInput: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], items []*ChatItem) (*adk.GenInputResult[*ChatItem, M], error) { + // ...build AgentInput, persist user messages... + }, + + // PrepareAgent: Called once per turn, returns the Agent to use for this turn + // This example returns the same Agent, but you can dynamically select different Agents based on items + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], consumed []*ChatItem) (adk.TypedAgent[M], error) { + return agent, nil + }, + + // OnAgentEvents: Receives the Agent's event stream, responsible for rendering output and persisting intermediate messages + // This example transfers the event stream to the HTTP handler via channel for SSE output + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[*ChatItem, M], events *adk.AsyncIterator[*adk.TypedAgentEvent[M]]) error { + // ...pass events to HTTP handler, wait for consumption to complete... + }, + + // The following three fields are for declarative checkpoint (approval recovery), detailed in the next section + GenResume: makeGenResume(), + Store: checkpointStore, + CheckpointID: sessionID, +} + +loop := adk.NewTurnLoop(cfg) +``` + + + + + + + + +
| Callback | Invocation Timing | Responsibility |
| --- | --- | --- |
| `GenInput` | When items exist in the queue | Select which items to consume, build Agent input (can decide which items to retain for the next turn) |
| `PrepareAgent` | After `GenInput` | Return the Agent instance for this turn, supports dynamic Agent configuration adjustment |
| `OnAgentEvents` | When Agent produces event stream | Consume events, render output, persist results; the core entry point for business-layer Agent output processing |
| `GenResume` | When resuming from checkpoint | Extract approval results from newly Pushed items, construct `ResumeParams`, automating approval recovery |
| `Store` + `CheckpointID` | | Enable declarative checkpoint; TurnLoop automatically handles saving and restoring execution state |
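
The "consumed versus remaining" split that `GenInput` performs can be sketched without any Eino types. In the example below, `queuedItem`, `genResult`, and `genInput` are hypothetical names, and the policy (merge every queued user query into one turn, keep everything else for a later turn) is only one possible choice picked for illustration; the real callback is free to apply any selection rule.

```go
package main

import "fmt"

// queuedItem is a hypothetical queue entry: either a user query or an approval decision.
type queuedItem struct {
	Query    string // empty for approval items
	Approval bool
}

// genResult mirrors the idea of "consumed this turn" versus "kept for later".
type genResult struct {
	Input     []string     // messages handed to the agent this turn
	Consumed  []queuedItem // items processed in this turn
	Remaining []queuedItem // items left in the queue for a later turn
}

// genInput merges all queued user queries into one turn and retains the rest.
func genInput(items []queuedItem) genResult {
	var r genResult
	for _, it := range items {
		if it.Query != "" {
			r.Input = append(r.Input, it.Query)
			r.Consumed = append(r.Consumed, it)
		} else {
			r.Remaining = append(r.Remaining, it)
		}
	}
	return r
}

func main() {
	r := genInput([]queuedItem{{Query: "a"}, {Approval: true}, {Query: "b"}})
	fmt.Println(r.Input, len(r.Consumed), len(r.Remaining)) // [a b] 2 1
}
```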
    + +> For complete callback implementations, see [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go). + +## Declarative Checkpoint: Automating Approval Recovery + +In Chapter 7 (Runner mode), approval recovery required the business layer to manually call `runner.ResumeWithParams()` and determine whether "this is a normal execution or a recovery execution." TurnLoop provides a more concise approach — declare `Store` and `CheckpointID` in the configuration (see previous section), and TurnLoop automatically handles saving and restoration: + +1. When Agent execution reaches an approval interrupt, TurnLoop automatically saves execution state to `Store` (keyed by `CheckpointID`) +2. After the user makes an approval decision, the business layer creates a new TurnLoop (using the **same** `CheckpointID`) and Pushes the approval item +3. When the new TurnLoop `Run()`s, it detects the checkpoint exists and **automatically calls `GenResume`** (instead of `GenInput`) to obtain recovery parameters +4. The Agent resumes execution from the interrupt point + +`GenResume`'s responsibility is to extract approval results from newly Pushed items and construct `ResumeParams`: + +```go +GenResume: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], + canceledItems, unhandledItems, newItems []*ChatItem, +) (*adk.GenResumeResult[*ChatItem, M], error) { + // newItems contains the item Pushed during approval recovery + item := newItems[0] + return &adk.GenResumeResult[*ChatItem, M]{ + ResumeParams: &adk.ResumeParams{ + InterruptID: item.InterruptID, + ApprovalResult: item.ApprovalResult, + }, + }, nil +} +``` + +Compared to Runner's `ResumeWithParams()`, declarative checkpoint frees the business layer from managing the "normal execution vs. recovery execution" branching — TurnLoop automatically chooses between `GenInput` and `GenResume` based on whether a checkpoint exists. 
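
The branch that declarative checkpoint takes over from the business layer can be reduced to a single lookup. The sketch below is an assumption-laden illustration, not the TurnLoop implementation: `checkpointStore` and `nextCallback` are hypothetical names, and the store is a plain in-memory map keyed by checkpoint ID.

```go
package main

import "fmt"

// checkpointStore is a hypothetical in-memory store keyed by checkpoint ID.
type checkpointStore map[string][]byte

// nextCallback names the callback a loop would invoke when it starts:
// the resume path if a checkpoint exists for this ID, the normal input path otherwise.
func nextCallback(store checkpointStore, checkpointID string) string {
	if _, ok := store[checkpointID]; ok {
		return "GenResume"
	}
	return "GenInput"
}

func main() {
	store := checkpointStore{}
	fmt.Println(nextCallback(store, "session-1")) // GenInput

	// An approval interrupt saves the execution state under the session's ID.
	store["session-1"] = []byte("state saved at the approval interrupt")
	fmt.Println(nextCallback(store, "session-1")) // GenResume

	// Once the resumed turn completes, the checkpoint is no longer needed.
	delete(store, "session-1")
	fmt.Println(nextCallback(store, "session-1")) // GenInput
}
```

Because the decision depends only on the store contents, reusing the same `CheckpointID` when recreating the loop is what routes the approval item to the resume path.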
+ +## Chapter Summary + +- **TurnLoop** is a persistent multi-turn execution loop whose lifecycle is bound to the user session +- **Normal flow**: `Push(item)` → GenInput → Agent → OnAgentEvents → idle → wait for next Push +- **Preemption**: `Push(item, WithPreempt(AfterToolCalls))` — one line of code cancels the current turn and starts a new one +- **Abort**: `loop.Stop(WithImmediate())` — one line of code terminates the entire loop +- **Declarative checkpoint**: Configure `Store` + `CheckpointID`, and TurnLoop automatically handles interrupt saving and restoration +- For specific callback implementations, see [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## Series Conclusion: Complete Agent Application Skeleton + +By this chapter, we've used a runnable Agent to connect Eino's core capabilities: + +- **Runtime**: Runner / TurnLoop drives execution, supporting streaming output, preemption, and abort +- **Tool layer**: Filesystem / Shell and other Tool capabilities integrated, tool errors handled safely +- **Middleware**: Pluggable middleware/handlers for cross-cutting capabilities like error handling, retry, and approval +- **Observability**: callbacks/trace capabilities connecting key paths for debugging and production observability +- **Human-AI collaboration**: interrupt/resume + checkpoint supporting approval, parameter completion, branch selection, and other interactive flows +- **Deterministic orchestration**: compose (graph/chain/workflow) organizes complex business flows into maintainable, reusable execution graphs +- **Business delivery**: A2UI protocol presents Agent capabilities to users as streaming UI +- **Execution control**: TurnLoop provides preemption, abort, and multi-turn lifecycle management, adapting to complex interaction needs in real business scenarios + +You can progressively replace/extend any component on this skeleton: model, tools, storage, workflows, frontend 
rendering protocol — without starting from scratch. diff --git a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md index e00ac16fbe2..89d3c9978ed 100644 --- a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md +++ b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_Breaking_Changes.md @@ -9,9 +9,9 @@ weight: 1 ## 1. API Breaking Changes -### 1.1 filesystem Shell Interface Renamed +### 1.1 Filesystem Shell Interface Renamed -**Location**: `adk/filesystem/backend.go` **Change Description**: Shell-related interfaces have been renamed and no longer embed the `Backend` interface. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Description**: Shell-related interfaces have been renamed and no longer embed the `Backend` interface. **Before (v0.7.x)**: ```go type ShellBackend interface { @@ -41,38 +41,69 @@ type StreamingShell interface { - `ShellBackend` renamed to `Shell` - `StreamingShellBackend` renamed to `StreamingShell` -- Interfaces no longer embed `Backend`. If your implementation depends on the composite interface, you need to implement them separately **Migration Guide**: +- Interfaces no longer embed `Backend`; if your implementation relies on the combined interface, you need to implement them separately. **Migration Guide**: ```go // Before type MyBackend struct {} func (b *MyBackend) Execute(...) {...} -// MyBackend implementing ShellBackend needed to implement all Backend methods +// MyBackend implementing ShellBackend required implementing all Backend methods // After type MyShell struct {} func (s *MyShell) Execute(...) 
{...} -// MyShell only needs to implement Shell interface methods -// If you also need Backend functionality, implement both interfaces separately +// MyShell only needs to implement the Shell interface methods +// If Backend functionality is also needed, implement both interfaces separately ``` --- +### 1.2 Filesystem Backend: Read Return Value Breaking Change + +- **Location**: `adk/filesystem/backend.go` +- **Description**: The return value of `Backend.Read` has been incompatibly changed from returning `string` to returning a `*FileContent` struct. + +**Before (v0.7.x)**: + +```go +type Backend interface { + ... + Read(ctx context.Context, req *ReadRequest) (string, error) + ... + } +``` + +**After (v0.8.0)**: + +```go +type Backend interface { + ... + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + ... + } +``` + +**Impact:** + +- In v0.7.x, `Backend.Read` returned `string`; in v0.8.0 it returns `*FileContent`, which is a breaking change. +- For Backend implementors: update the `Read` method implementation to return `*FileContent` instead of `string`. +- For Backend consumers: upgrade to a Backend implementation that supports v0.8, and update `Backend.Read` call sites to use the new `*FileContent` return value. + ## 2. Behavioral Breaking Changes ### 2.1 AgentEvent Sending Mechanism Change -**Location**: `adk/chatmodel.go` **Change Description**: `ChatModelAgent`'s `AgentEvent` sending mechanism changed from eino callback mechanism to Middleware mechanism. **Before (v0.7.x)**: +**Location**: `adk/chatmodel.go` **Description**: The `AgentEvent` sending mechanism in `ChatModelAgent` has been changed from the eino callback mechanism to the Middleware mechanism. 
**Before (v0.7.x)**: - `AgentEvent` was sent through eino's callback mechanism -- If users customized ChatModel or Tool Decorator/Wrapper, and the original ChatModel/Tool had embedded Callback points, `AgentEvent` would be sent **inside** the Decorator/Wrapper -- This applied to all ChatModels implemented in eino-ext, but may not apply to most user-implemented Tools and Tools provided by eino **After (v0.8.0)**: -- `AgentEvent` is sent through Middleware mechanism -- `AgentEvent` is sent **outside** user-customized Decorator/Wrapper **Impact**: +- If users customized a ChatModel or Tool Decorator/Wrapper, and the original ChatModel/Tool had embedded Callback hooks, the `AgentEvent` would be sent **inside** the Decorator/Wrapper +- This applied to all ChatModels implemented in eino-ext, but might not apply to most user-implemented Tools and Tools provided by eino directly **After (v0.8.0)**: +- `AgentEvent` is sent through the Middleware mechanism +- `AgentEvent` is sent **outside** the user's custom Decorator/Wrapper **Impact**: - Under normal circumstances, users won't notice this change -- If users previously implemented their own ChatModel or Tool Decorator/Wrapper, the relative position of event sending will change -- Position change may cause `AgentEvent` content to change: previous events didn't include Decorator/Wrapper modifications, current events will include them **Reason for Change**: -- In normal business scenarios, we want emitted events to include Decorator/Wrapper modifications **Migration Guide**: If you previously wrapped ChatModel or Tool through Decorator/Wrapper, you need to implement the `ChatModelAgentMiddleware` interface instead: +- If users previously implemented a ChatModel or Tool Decorator/Wrapper, the relative position of event sending changes +- Position change may cause the content of `AgentEvent` to change: previously events did not include changes made by Decorator/Wrapper, now events will include them **Rationale**: +- In normal 
business scenarios, we want the emitted events to include changes made by Decorator/Wrapper **Migration Guide**: If you previously wrapped ChatModel or Tool through Decorator/Wrapper, switch to implementing the `ChatModelAgentMiddleware` interface: ```go // Before: Wrapping ChatModel through Decorator/Wrapper @@ -85,19 +116,19 @@ func (w *MyModelWrapper) Generate(ctx context.Context, input []*schema.Message, return w.inner.Generate(ctx, input, opts...) } -// After: Implement WrapModel method of ChatModelAgentMiddleware +// After: Implement the WrapModel method of ChatModelAgentMiddleware type MyMiddleware struct{} func (m *MyMiddleware) WrapModel(ctx context.Context, chatModel model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) { return &myWrappedModel{inner: chatModel}, nil } -// For Tool Wrappers, implement WrapInvokableToolCall / WrapStreamableToolCall methods instead +// For Tool Wrappers, switch to implementing WrapInvokableToolCall / WrapStreamableToolCall methods ``` ### 2.2 filesystem.ReadRequest.Offset Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `Offset` field changed from 0-based to 1-based. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Description**: The `Offset` field has been changed from 0-based to 1-based. **Before (v0.7.x)**: ```go type ReadRequest struct { @@ -112,6 +143,7 @@ type ReadRequest struct { ```go type ReadRequest struct { + FilePath string // Offset specifies the starting line number (1-based) for reading. // Line 1 is the first line of the file. 
@@ -124,10 +156,10 @@ type ReadRequest struct { **Migration Guide**: ```go -// Before: Read from line 0 (i.e., first line) +// Before: Reading from line 0 (i.e., the first line) req := &ReadRequest{Offset: 0, Limit: 100} -// After: Read from line 1 (i.e., first line) +// After: Reading from line 1 (i.e., the first line) req := &ReadRequest{Offset: 1, Limit: 100} // If you previously used Offset: 10 to mean starting from line 11 @@ -138,7 +170,7 @@ req := &ReadRequest{Offset: 1, Limit: 100} ### 2.3 filesystem.FileInfo.Path Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `FileInfo.Path` field is no longer guaranteed to be an absolute path. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Description**: The `FileInfo.Path` field no longer guarantees an absolute path. **Before (v0.7.x)**: ```go type FileInfo struct { @@ -160,18 +192,25 @@ type FileInfo struct { **Impact**: -- Code that depends on `Path` being an absolute path may have issues +- Code that depends on `Path` being an absolute path may encounter issues - Need to check and handle relative path cases --- ### 2.4 filesystem.WriteRequest Behavior Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `WriteRequest` write behavior changed from "error if file exists" to "overwrite if file exists". **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Description**: The write behavior of `WriteRequest` has been changed from "error if file exists" to "overwrite if file exists". **Before (v0.7.x)**: ```go // WriteRequest comment: // The file will be created if it does not exist, or error if file exists. +type WriteRequest struct { + // FilePath is the absolute path of the file to write. Must start with '/'. + // The file will be created if it does not exist, or error if file exists. + FilePath string + + ... 
+} ``` **After (v0.8.0)**: @@ -179,19 +218,26 @@ type FileInfo struct { ```go // WriteRequest comment: // Creates the file if it does not exist, overwrites if it exists. +type WriteRequest struct { + // FilePath is the path of the file to write. + FilePath string + + .... +} ``` **Impact**: -- Code that previously relied on "error if file exists" behavior will no longer error, but directly overwrite -- May cause unexpected data loss **Migration Guide**: -- If you need to preserve the original behavior, check if the file exists before writing +- Code that previously relied on "error if file exists" behavior will no longer error, but instead overwrite directly +- May lead to unexpected data loss **Migration Guide**: +- If you need to preserve the original behavior, check whether the file exists before writing +- Previously FilePath represented an absolute path; the new version does not stipulate that FilePath must be an absolute path. Scenarios that depended on absolute paths need to adapt accordingly --- ### 2.5 GrepRequest.Pattern Semantic Change -**Location**: `adk/filesystem/backend.go` **Change Description**: `GrepRequest.Pattern` changed from literal matching to regular expression matching. **Before (v0.7.x)**: +**Location**: `adk/filesystem/backend.go` **Description**: `GrepRequest.Pattern` has been changed from literal matching to regular expression matching. **Before (v0.7.x)**: ```go // Pattern is the literal string to search for. This is not a regular expression. 
@@ -208,16 +254,16 @@ type FileInfo struct { **Impact**: - Search patterns containing regex special characters will behave differently -- For example, searching for `interface{}` now needs to be escaped as `interface\{\}` **Migration Guide**: +- For example, searching for `interface{}` now requires escaping to `interface\{\}` **Migration Guide**: ```go // Before: Literal search req := &GrepRequest{Pattern: "interface{}"} -// After: Regex search, need to escape special characters +// After: Regex search, special characters need escaping req := &GrepRequest{Pattern: "interface\\{\\}"} -// Or if searching for literals containing . * + ?, also need to escape +// Or if searching for literals containing . * + ?, they also need escaping // Before req := &GrepRequest{Pattern: "config.json"} // After @@ -226,11 +272,37 @@ req := &GrepRequest{Pattern: "config\\.json"} --- +### 2.6 EditRequest.FilePath Semantic Change + +**Location**: `adk/filesystem/backend.go` **Description**: The absolute-path requirement has been removed from the `EditRequest.FilePath` comment. **Before (v0.7.x)**: + +```go +type EditRequest struct { + // FilePath is the absolute path of the file to edit. Must start with '/'. + FilePath string + + ... +} +``` + +**After (v0.8.0)**: + +```go +type EditRequest struct { + // FilePath is the path of the file to edit. + FilePath string +} +``` + +**Impact**: + +- In v0.7.x, `FilePath` was documented as an absolute path; v0.8.0 no longer guarantees that `FilePath` is absolute. Logic that previously relied on `FilePath` being an absolute path needs to be adapted accordingly. + ## Migration Recommendations -1. **Handle compile errors first**: Type changes (like Shell interface renaming) will cause compilation failures, need to fix first -2. 
**Pay attention to semantic changes**: `ReadRequest.Offset` changed from 0-based to 1-based, `Pattern` changed from literal to regex - these won't cause compile errors but will change runtime behavior -3. **Check file operations**: `WriteRequest` overwrite behavior change may cause data loss, requires additional checks -4. **Migrate Decorator/Wrapper**: If you have custom ChatModel/Tool Decorator/Wrapper, change to implement `ChatModelAgentMiddleware` -5. Upgrade backend implementations as needed: If using local/ark agentkit backend provided by eino-ext, upgrade to corresponding alpha versions: [local backend v0.2.0-alpha](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.0-alpha.1), [ark agentkit backend v0.2.0-alpha](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Fagentkit%2Fv0.2.0-alpha.1) -6. **Test verification**: After migration, perform comprehensive testing, especially for code involving file operations and search functionality +1. **Fix compilation errors first**: Type changes (such as Shell interface renaming) will cause compilation failures and need to be fixed first +2. **Pay attention to semantic changes**: `ReadRequest.Offset` changing from 0-based to 1-based, `Pattern` changing from literal to regex—these won't cause compilation errors but will change runtime behavior +3. **Check file operations**: The overwrite behavior change in `WriteRequest` may lead to data loss and requires additional checking +4. **Migrate Decorator/Wrapper**: If you have custom ChatModel/Tool Decorator/Wrappers, switch to implementing `ChatModelAgentMiddleware` +5. 
**Upgrade backend implementations as needed**: If using the local/ark agentkit backend provided by eino-ext, upgrade to the corresponding latest version: [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1) [adk/backend/agentkit/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Fagentkit%2Fv0.2.1) +6. **Test verification**: After migration, conduct comprehensive testing, especially code involving file operations and search functionality diff --git a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md index 24e6d20ae02..173f137c89c 100644 --- a/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md +++ b/content/en/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md @@ -1,17 +1,14 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] -title: 'Eino: v0.8.*-adk middlewares' +title: v0.8.*-adk middlewares weight: 8 --- This document introduces the main new features and improvements in Eino ADK v0.8.*. -> 💡 -> Currently in the v0.8.0.Beta version stage: [https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1](https://github.com/cloudwego/eino/releases/tag/v0.8.0-beta.1) - ## Version Highlights v0.8 is a significant feature enhancement release that introduces a new middleware interface architecture, adds multiple practical middlewares, and provides enhanced observability support. 
diff --git a/content/en/docs/eino/release_notes_and_migration/_index.md b/content/en/docs/eino/release_notes_and_migration/_index.md index 3a2fed802c3..7017a891d4f 100644 --- a/content/en/docs/eino/release_notes_and_migration/_index.md +++ b/content/en/docs/eino/release_notes_and_migration/_index.md @@ -3,8 +3,8 @@ Description: "" date: "2026-03-02" lastmod: "" tags: [] -title: 'Eino: Release Notes & Migration Guide' -weight: 8 +title: Release Notes & Migration Guide +weight: 7 --- # Version Management Guidelines diff --git a/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md new file mode 100644 index 00000000000..2fb03ec295a --- /dev/null +++ b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md @@ -0,0 +1,97 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: v0.9.* agentic-runtime +weight: 9 +--- + +The theme of V0.9 is `agentic-runtime`. This release focuses on ADK's message protocol, Agent execution control, and multi-turn runtime capabilities. While preserving the default `*schema.Message` path, it introduces `AgenticMessage` along with generic abstractions, laying the foundation for richer model-native Agent protocols, server-side tool calling, execution interruption and resumption. + +## 1. AgenticMessage and ADK Support + +V0.9 introduces `schema.AgenticMessage` to express a more complete Agentic message structure compared to the traditional `schema.Message`. + +- `AgenticMessage` adopts a content block model, supporting structured fragments such as text, reasoning content, tool calls, tool results, server-side tools, MCP tools, and multimodal content. 
+- `[]ContentBlock` preserves the block ordering from different model protocol responses more completely; the new block types are also better suited for structures like tool use, reasoning, and streaming metadata in protocols such as OpenAI Responses API, Claude, and Gemini. +- `components/model` introduces the `AgenticModel` component for integrating model implementations that use `AgenticMessage` as input/output. +- ADK provides typed agent, typed event, typed runner, and typed `ChatModelAgent` support for the `AgenticMessage` path, enabling AgenticModel to participate in the ADK Agent lifecycle. +- [Eino: Quick Start](/docs/eino/quick_start): The entire series has been rewritten based on AgenticMessage. + +## 2. ChatModelAgent Capability Enhancements + +V0.9 systematically enhances `ChatModelAgent`'s execution control, model call reliability, and middleware extension points. + +### Cancel + +- Introduces Agent Cancel capability for externally terminating a running Agent. +- Supports safe-point cancellation, recursive cancellation, cancel timeout escalation, and checkpoint persistence during cancellation. +- Interrupts that occur during cancellation are unified under cancel semantics; callers can distinguish active cancellation from normal business failures via `CancelError`. +- [Eino ADK: Agent Cancel and TurnLoop Quick Start](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +### Model Retry + +- Retry has been expanded from simple error retry to `ShouldRetry(ctx, RetryContext) -> RetryDecision`. +- Retry decisions can read model output, reject outputs that don't meet conditions, modify the next input, append model options, and override backoff. + +### Model Failover + +- Introduces Model Failover capability for switching to backup models after a model call failure. +- Failover decisions can read the failed attempt's output, error, original input, and attempt number, then select which model to use next. 
+- Supports rewriting input for backup models; also supports prioritizing the last successfully called model to reduce the cost of starting from the fixed primary model each time. +- [ChatModel Failover Guide](/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide) + +### Middleware Enhancements + +- `ChatModelAgentMiddleware` adds `AfterAgent` for executing cleanup logic after an Agent completes successfully. +- Summarization, reduction, skill, filesystem, plan-task, patch-tool-calls and other middlewares have been genericized to support the `AgenticMessage` path. +- Summarization middleware adds `TypedMiddleware.Summarize`, transitioning synchronous summarization from a standalone function to a cohesive middleware capability. +- Filesystem middleware enhances multimodal reading capabilities and adds PDF pages validation. +- Introduces `agentsmd` middleware for loading and injecting `AGENTS.md` style project instructions. +- `ChatModelAgentState` adds `ToolInfos` and `DeferredToolInfos` as the primary path for middlewares to adjust the tool set visible to the model. +- `ToolInfos` represents tools directly visible to the current model call; `DeferredToolInfos` represents candidate tools that can be discovered by the model on demand through tool search mechanisms. +- Tool search middleware supports three tool loading approaches: using the model-side native tool search capability to load from deferred tools on demand; providing a fixed-schema `ToolSearchTool` per model protocol requirements, allowing the model to search deferred tools through this entry point; without relying on model-side protocol, using Eino's custom `tool_search` tool to retrieve tools and append matches to regular `ToolInfos`. +- Compose adds `AgenticToolsNode`; `ToolsNode` adds tool name and argument alias support. +- [Eino ADK: ChatModelAgentMiddleware](/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware) + +## 3. 
TurnLoop + +V0.9 introduces `TurnLoop` to elevate a one-shot Agent run into a continuously running, externally-driven turn-level runtime. + +- Designed for multi-turn execution: `TurnLoop` continuously receives external input, with each turn independently planning input, constructing the Agent, and consuming events—suitable for long-running interactive Agents. +- Supports input merging: `GenInput` decides at the turn boundary which inputs to consume in this turn and which to continue waiting for, enabling applications to implement batching, deduplication, merging of consecutive user inputs, and other strategies. +- Supports preemption: `Push` with a preempt option atomically writes new input and requests cancellation of the current turn, allowing high-priority input to interrupt a running Agent. +- Supports declarative checkpoint/resume: on recovery, applications don't need to manually restore the input queue; `TurnLoop` distinguishes between interrupted inputs, unprocessed inputs, and newly arrived inputs after recovery—applications only need to declare how these inputs re-enter subsequent turns. +- [Eino ADK: Agent Cancel and TurnLoop Quick Start](/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +## Upgrade Guide + +> 💡 +> As of now (5.19), the latest version is v0.9.0-beta.1, with the stable release expected in about one week. Before the stable release, always use the latest beta version; after the stable release, upgrade to the latest stable version. 
+ +```bash +# Upgrade to the latest beta (use before stable release) +go get github.com/cloudwego/eino@v0.9.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticark@v0.2.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticclaude@v0.1.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticgemini@v0.2.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticdeepseek@v0.1.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticqwen@v0.1.0-beta.1 +go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0-beta.1 +go get github.com/cloudwego/eino-ext/callbacks/cozeloop@v0.3.0-beta.1 + +# After stable release, replace with the latest stable version numbers +go get github.com/cloudwego/eino@v0.9.0 +go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0 +go get github.com/cloudwego/eino-ext/components/model/agenticark@v0.2.0 +go get github.com/cloudwego/eino-ext/components/model/agenticclaude@v0.1.0 +go get github.com/cloudwego/eino-ext/components/model/agenticgemini@v0.2.0 +go get github.com/cloudwego/eino-ext/components/model/agenticdeepseek@v0.1.0 +go get github.com/cloudwego/eino-ext/components/model/agenticqwen@v0.1.0 +go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0 +go get github.com/cloudwego/eino-ext/callbacks/cozeloop@v0.3.0 +``` + +Check the latest version numbers: [github.com/cloudwego/eino/tags](https://github.com/cloudwego/eino/tags) diff --git a/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md new file mode 100644 index 00000000000..c91fd5b3b17 --- /dev/null +++ b/content/en/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md @@ -0,0 
+1,198 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: Eino V0.9 Migration Notes +weight: 1 +--- + +This document lists the API and semantic changes that existing users need to be aware of when upgrading from V0.8.x to V0.9 `agentic-runtime`. New capabilities not listed here generally do not affect the existing `*schema.Message` path. + +## Explicit API Changes + +### Agent Transfer / Workflow Agent / Supervisor Marked as NOT RECOMMENDED + +V0.9 marks the multi-Agent collaboration model based on Agent Transfer (full context sharing) as **NOT RECOMMENDED**. Affected public APIs include: + +**Agent Transfer related**: + +- `SetSubAgents` +- `AgentWithOptions` / `WithDisallowTransferToParent` / `WithHistoryRewriter` +- `ChatModelAgentConfig.Exit` / `ChatModelAgentConfig.OutputKey` +- `AgentWithDeterministicTransferTo` +- `OnSetSubAgents` / `OnSetAsSubAgent` / `OnDisallowTransferToParent` + +**Workflow Agent**: + +- `NewSequentialAgent` / `SequentialAgentConfig` +- `NewParallelAgent` / `ParallelAgentConfig` +- `NewLoopAgent` / `LoopAgentConfig` + +**Supervisor**: + +- `supervisor.New` / `supervisor.Config` + +> 💡 +> These APIs can still be used and will not cause compilation failures, but they are not recommended for new projects. Experience has shown that the transfer model where Agents share full conversation context does not perform better than the tool-calling model in practice. + +Recommended migration directions: + +- Use `ChatModelAgent` + `AgentTool` (wrap sub-Agents as tools, call on demand). +- Use `DeepAgent` (structured sub-task delegation). +- Both approaches provide better controllability, observability, and prompt cache efficiency. + +### ChatModelAgentMiddleware Adds AfterAgent + +`ChatModelAgentMiddleware` adds the `AfterAgent` method. Types that manually implement this interface need to add this method, otherwise compilation will fail. 
+ +Recommended approach: + +- If the middleware doesn't need special cleanup logic, embed `*adk.BaseChatModelAgentMiddleware`. +- If the middleware needs to clean up state, record events, or add statistics after the Agent completes successfully, implement `AfterAgent(ctx, state)`. + +Impact: + +- Only affects user code that explicitly implements `ChatModelAgentMiddleware`. +- Code that extends via `BaseChatModelAgentMiddleware` composition remains compatible. + +### AgentMiddleware Struct Deprecated + +The `AgentMiddleware` struct and the `ChatModelAgentConfig.Middlewares` field have been marked as **Deprecated** and will be removed in a future version. + +> 💡 +> Both AgentMiddleware and the Middlewares field are deprecated. Please migrate to the interface-based Handlers (ChatModelAgentMiddleware) approach. + +Migration steps: + +- Migrate the logic from `Middlewares []AgentMiddleware` to `Handlers []ChatModelAgentMiddleware`. +- `AgentMiddleware.BeforeChatModel` → implement `ChatModelAgentMiddleware.BeforeModelRewriteState`. +- `AgentMiddleware.AfterChatModel` → implement `ChatModelAgentMiddleware.AfterModelRewriteState`. +- `AgentMiddleware.WrapToolCall` → implement `ChatModelAgentMiddleware.WrapToolCall`. +- `AgentMiddleware.AdditionalInstruction` → modify `state.Instruction` in `BeforeModelRewriteState`. +- `AgentMiddleware.AdditionalTools` → modify `state.ToolInfos` in `BeforeModelRewriteState`. +- If the middleware doesn't need special logic, embed `*adk.BaseChatModelAgentMiddleware` for a default no-op implementation. + +Impact: + +- All code using `AgentMiddleware` in `ChatModelAgentConfig.Middlewares` needs to be migrated. +- In the current version, both approaches can coexist (Handlers execute after Middlewares), but early migration is recommended to avoid compilation failures when removed in future versions. + +### summarization.SummarizeMessages Removed + +`summarization.SummarizeMessages` and `summarization.SummarizeOutput` are no longer exported. 
+ +Migration steps: + +- Continue using `summarization.New` or `summarization.NewTyped` when constructing the summarization middleware. +- When needing to trigger synchronous summarization manually, use `TypedMiddleware.Summarize`. + +This change converges summarization's configuration, state reading, and execution logic into the middleware, avoiding semantic divergence between standalone functions and runtime state. + +## Semantic Changes to Be Aware Of + +### Summarization Finalize Post-Processing Semantic Change + +In V0.8.x, the summarization middleware would first execute default summary post-processing, then call the user-configured `Finalize`. Therefore, the custom `Finalize` received a `summary` that already included `PreserveUserMessages` replacement, `TranscriptFilePath` injection, and summary preamble. + +In V0.9, if `Config.Finalize` is set, the middleware passes the raw summary generated by the model directly to `Finalize`, no longer automatically executing default post-processing. Affected configurations include: + +- `PreserveUserMessages` +- `TranscriptFilePath` + +Migration steps: + +- If you want to retain default post-processing, don't set `Finalize`—let the middleware use the default finalization path. +- If you must customize `Finalize` but still want to retain default post-processing, first construct a default finalizer via `DefaultFinalizer`, then explicitly compose it in your custom logic. +- `DefaultFinalizer` does not automatically read the outer `Config.PreserveUserMessages` and `Config.TranscriptFilePath`; they need to be explicitly passed via `DefaultFinalizerConfig`. +- Code using `NewFinalizer().PreserveSkills(...).Build()` needs special attention: this finalizer only handles preserve skills and does not automatically add `PreserveUserMessages` and `TranscriptFilePath`. + +### Tool List Modification Path Adjustment + +`ModelContext.Tools` is no longer the recommended entry point for tool list modification. 
+ +Upgrade recommendations: + +- Modify `state.ToolInfos` in `BeforeModelRewriteState`. +- For model-native deferred tool search, modify `state.DeferredToolInfos`. +- Modifying the tool list in `WrapModel` is not recommended; such modifications only affect the current model call—subsequent middlewares, subsequent turns, or checkpoint/resume will not inherit this modification. + +### ToolSearch / AgentsMD Middleware Internal Implementation Migration + +The internal implementations of ToolSearch and AgentsMD middlewares have been migrated from `WrapModel` (v0.8.x) to `BeforeModelRewriteState` (v0.9). + +> 💡 +> For users who only use `toolsearch.New()` / `agentsmd.New()`, the public API (Config struct, constructor) has not changed—no code modifications needed. + +Semantic changes: + +- **v0.8.x**: Middleware temporarily injected tool lists via `model.Option` through `WrapModel` during model calls—changes were not persisted and did not enter agent state. +- **v0.9**: Middleware directly modifies `state.ToolInfos` / `state.DeferredToolInfos` and `state.Messages` (injecting reminder messages) in `BeforeModelRewriteState`—changes persist with state. + +Impact: + +- **Checkpoint/Resume**: Reminder messages and dynamic tool search results injected by ToolSearch now persist with checkpoints and are correctly reconstructed on restoration; in v0.8.x this information was lost after restoration. +- **Visibility to Other Middlewares**: Subsequent middlewares' `BeforeModelRewriteState` / `AfterModelRewriteState` can now see the `state.ToolInfos` modified by ToolSearch; in v0.8.x these modifications were invisible to other middlewares. +- **Prompt Cache**: Since tool list changes are now reflected in state (rather than temporarily injected during each model call), the model's KV-cache behavior may differ. 
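
The difference between the two behaviors described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the eino implementation: `ToolInfo`, `AgentState`, `toolsForSingleCall`, and `rewriteState` are hypothetical stand-ins invented for this example, and the real `WrapModel` / `BeforeModelRewriteState` signatures are not reproduced here.

```go
package main

import "fmt"

// ToolInfo and AgentState are simplified, hypothetical stand-ins used only for
// this illustration; they are not the real eino schema.ToolInfo or ADK state.
type ToolInfo struct{ Name string }

type AgentState struct{ ToolInfos []ToolInfo }

// toolsForSingleCall models the v0.8.x behavior: the extra tool is appended to
// a per-call copy, so the agent state never learns about it.
func toolsForSingleCall(state *AgentState, extra ToolInfo) []ToolInfo {
	perCall := make([]ToolInfo, 0, len(state.ToolInfos)+1)
	perCall = append(perCall, state.ToolInfos...)
	return append(perCall, extra)
}

// rewriteState models the v0.9 behavior: the extra tool is written into the
// state, so later middlewares, later turns, and checkpoints can see it.
func rewriteState(state *AgentState, extra ToolInfo) {
	state.ToolInfos = append(state.ToolInfos, extra)
}

func main() {
	state := &AgentState{ToolInfos: []ToolInfo{{Name: "search"}}}

	// Per-call injection: the model call sees two tools, the state still has one.
	seen := toolsForSingleCall(state, ToolInfo{Name: "weather"})
	fmt.Println(len(seen), len(state.ToolInfos)) // prints "2 1"

	// State rewrite: the change is part of the state from now on.
	rewriteState(state, ToolInfo{Name: "weather"})
	fmt.Println(len(state.ToolInfos)) // prints "2"
}
```

In this sketch, anything that serializes `AgentState` after `rewriteState` would carry the added tool along, which is the property the checkpoint/resume and middleware-visibility points rely on; the per-call copy disappears once the call returns.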
+ +Note: + +- If you have custom middlewares that rely on `ModelContext.Tools` in `WrapModel` to read/modify the tool list, they should be migrated to reading `state.ToolInfos` in `BeforeModelRewriteState`. + +### Model Retry Decision Semantic Enhancement + +`ModelRetryConfig` adds `ShouldRetry`. When `ShouldRetry` is non-nil, `IsRetryAble` is ignored. + +Note: + +- The old `IsRetryAble` can still be used for simple error-dimension retries. +- When using `ShouldRetry`, explicitly handle scenarios where output is successful but not accepted by the business. +- Interrupt and `ErrStreamCanceled` are not treated as normal retry errors. + +### Cancel Error Semantics + +After V0.9 introduces active cancellation semantics, applications need to distinguish between active cancellation, normal errors, and business interrupts. + +Upgrade recommendations: + +- The upper layer should distinguish between `CancelError`, normal error, and business interrupt. +- If the application actively integrates `WithCancel`, do not treat `CancelError` as a normal business failure. + +### AgenticMessage Migration Requires Understanding the New Message Structure + +`TypedChatModelAgent[*schema.AgenticMessage]` is the new path targeting model-native Agentic protocols. Migrating to this path isn't just changing the generic parameter from `*schema.Message` to `*schema.AgenticMessage`—it also requires processing message content according to `AgenticMessage`'s content block structure. + +Note: + +- The AgenticMessage path uses `AgenticModel` and `AgenticToolsNode` to handle tool calls. +- Tool calls and tool results are expressed through `AgenticMessage` content blocks; correct handling of tool call / tool result content blocks is particularly important. +- Agent transfer capability does not apply to the AgenticMessage path. 
+- Existing applications that don't need model-native Agentic protocols should continue using the default `*schema.Message` path; only migrate when explicitly needing to integrate `AgenticModel` protocols. + +### Model Adapters Need to Recognize New Options + +After V0.9 introduces `AgenticModel`, model adapters need to handle call-time options more strictly. `AgenticModel` is an alias for `BaseModel[*schema.AgenticMessage]` and no longer provides enhanced interfaces like `ToolCallingChatModel.WithTools`; tool binding is uniformly passed as a `model.Option` via `model.WithTools`. + +Note: + +- All model adapters supporting AgenticMessage should read `Options.Tools` and map them to the provider's tool calling protocol. +- `AgenticModel` should not require users to first call some `WithTools` method to get a "model instance with tools"; ADK passes the current tool list via `model.WithTools` on each model call. +- If the adapter only reads tools from its own config while ignoring `model.WithTools`, in the ChatModelAgent / AgenticToolsNode path the model won't see tools or the tool list won't change with runtime state. + +V0.9 also adds the following to `model.Options`: + +- `DeferredTools` +- `ToolSearchTool` +- `AgenticToolChoice` + +Existing model adapters ignoring these options generally won't cause compilation failures, but will result in deferred tool search, model-native tool search, or agentic tool choice not taking effect. Adapter maintainers should add conversion logic according to the target provider's protocol. + +### ToolInfo Serialization Format Change + +`ToolInfo` adds explicit JSON/Gob encoding/decoding to preserve `ParamsOneOf`. + +Impact: + +- `ToolInfo` is now part of `ChatModelAgentState.ToolInfos` / `DeferredToolInfos`, and may therefore enter checkpoints along with Agent state. +- Explicit JSON/Gob encoding/decoding ensures that `ParamsOneOf` is not lost during checkpoint, deep copy, and restoration. 
+- If external systems directly depend on the old `ToolInfo` JSON format, serialization compatibility needs to be re-verified. diff --git a/content/en/docs/eino/release_notes_and_migration/v02_second_release.md b/content/en/docs/eino/release_notes_and_migration/v02_second_release.md index e2f407a6dc0..68219f02ca6 100644 --- a/content/en/docs/eino/release_notes_and_migration/v02_second_release.md +++ b/content/en/docs/eino/release_notes_and_migration/v02_second_release.md @@ -3,7 +3,7 @@ Description: "" date: "2026-03-02" lastmod: "" tags: [] -title: 'Eino: v0.2.*-second release' +title: v0.2.*-second release weight: 2 --- diff --git a/content/zh/docs/eino/Cookbook.md b/content/zh/docs/eino/Cookbook.md index db3b6c8f616..df2aaeebce8 100644 --- a/content/zh/docs/eino/Cookbook.md +++ b/content/zh/docs/eino/Cookbook.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: Cookbook @@ -63,6 +63,27 @@ weight: 3 adk/multiagent/integration-excel-agentExcel Agent (ADK 集成版)ADK 集成版 Excel Agent,包含 Planner、Executor、Replanner、Reporter +### Agent + + + + +
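| Directory | Name | Description |
|---|---|---|
| adk/agent/ralph-loop | Ralph Loop | Autonomous iteration pattern: an external `for` loop calls `Runner.Run` for each single-round iteration; the Agent perceives prior work through the filesystem, and a validation gate checks for BUG markers before accepting a completion claim |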
    + +### Cancel (取消) + + + + +
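| Directory | Name | Description |
|---|---|---|
| adk/cancel/graceful-exit | Graceful Exit | Demonstrates Agent Cancel + Resume: captures the terminal signal, cancels the nested Agent in `CancelAfterChatModel` + `WithRecursive` mode, waits for a safe point to save a Checkpoint, then resumes execution |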
    + +### Middlewares (中间件) + + + + +
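| Directory | Name | Description |
|---|---|---|
| adk/middlewares/skill | Skill Middleware | Loads Agent skills from the filesystem (e.g., log_analyzer), demonstrating how to use the skill middleware |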
    + ### GraphTool (图工具) @@ -209,6 +230,7 @@ weight: 3 +
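| quickstart/chat | Chat Quick Start | The most basic LLM conversation example, covering templates, generation, and streaming output |
| quickstart/eino_assistant | Eino Assistant | A complete RAG application example, covering knowledge indexing, Agent service, and Web UI |
| quickstart/todoagent | Todo Agent | A simple Todo management Agent example |
| quickstart/chatwitheino | Chat with Eino (Tutorial) | A 9-chapter progressive tutorial, from ChatModel → Runner → Session → Tool → Middleware → Callback → Interrupt → GraphTool → Skill, building a complete Agent step by step |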
    --- diff --git a/content/zh/docs/eino/FAQ.md b/content/zh/docs/eino/FAQ.md index dc39656ace1..cd603427a5c 100644 --- a/content/zh/docs/eino/FAQ.md +++ b/content/zh/docs/eino/FAQ.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-21" lastmod: "" tags: [] title: FAQ -weight: 11 +weight: 10 --- # Q: cannot use openapi3.TypeObject (untyped string constant "object") as *openapi3.Types value in struct literal,cannot use types (variable of type string) as *openapi3.Types value in struct literal @@ -13,11 +13,7 @@ weight: 11 # Q: Agent 流式调用时不会进入 ToolsNode 节点。或流式效果丢失,表现为非流式。 -- 先更新 eino 版本到最新 - -不同的模型在流式模式下输出工具调用的方式可能不同: 某些模型(如 OpenAI) 会直接输出工具调用;某些模型 (如 Claude) 会先输出文本,然后再输出工具调用。因此需要使用不同的方法来判断,这个字段用来指定判断模型流式输出中是否包含工具调用的函数。 - -ReAct Agent 的 Config 中有一个 StreamToolCallChecker 字段,如未填写,Agent 会使用“非空包”是否包含工具调用判断: +- 先更新 eino 版本到最新不同的模型在流式模式下输出工具调用的方式可能不同: 某些模型(如 OpenAI) 会直接输出工具调用;某些模型 (如 Claude) 会先输出文本,然后再输出工具调用。因此需要使用不同的方法来判断,这个字段用来指定判断模型流式输出中是否包含工具调用的函数。ReAct Agent 的 Config 中有一个 StreamToolCallChecker 字段,如未填写,Agent 会使用“非空包”是否包含工具调用判断: ```go func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -45,9 +41,7 @@ func firstChunkStreamToolCallChecker(_ context.Context, sr *schema.StreamReader[ } ``` -上述默认实现适用于:模型输出的 Tool Call Message 中只有 Tool Call。 - -默认实现不适用的情况:在输出 Tool Call 前,有非空的 content chunk。此时,需要自定义 tool Call checker 如下: +上述默认实现适用于:模型输出的 Tool Call Message 中只有 Tool Call。默认实现不适用的情况:在输出 Tool Call 前,有非空的 content chunk。此时,需要自定义 tool Call checker 如下: ```go toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Message]) (bool, error) { @@ -74,9 +68,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes 上面这个自定义 StreamToolCallChecker,在模型常规输出 answer 时,需要判断**所有包**是否包含 ToolCall,从而导致“流式判断”的效果丢失。如果希望尽可能保留“流式判断”效果,解决这一问题的建议是: > 💡 -> 尝试添加 prompt 来约束模型在工具调用时不额外输出文本,例如:“如果需要调用 tool,直接输出 tool,不要输出文本”。 -> -> 不同模型受 prompt 影响可能不同,实际使用时需要自行调整 prompt 并验证效果。 +> 
尝试添加 prompt 来约束模型在工具调用时不额外输出文本,例如:“如果需要调用 tool,直接输出 tool,不要输出文本”。不同模型受 prompt 影响可能不同,实际使用时需要自行调整 prompt 并验证效果。 # Q: [github.com/bytedance/sonic/loader](http://github.com/bytedance/sonic/loader): invalid reference to runtime.lastmoduledatap @@ -91,9 +83,7 @@ toolCallChecker := func(ctx context.Context, sr *schema.StreamReader[*schema.Mes Eino 目前不支持批处理,可选方法有两种 1. 每次请求按需动态构建 graph,额外成本不高。 这种方法需要注意 Chain Parallel 要求其中并行节点数量大于一, -2. 自定义批处理节点,节点内自行批处理任务 - -代码示例:[https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) +2. 自定义批处理节点,节点内自行批处理任务代码示例:[https://github.com/cloudwego/eino-examples/tree/main/compose/batch](https://github.com/cloudwego/eino-examples/tree/main/compose/batch) # Q: eino 支持把模型结构化输出吗 @@ -101,9 +91,7 @@ Eino 目前不支持批处理,可选方法有两种 1. 部分模型支持直接配置(比如 openai 的 response format),可以看下模型配置里有没有。 2. 通过 tool call 功能获得 -3. 写 prompt 要求模型 - -得到模型结构化输出后,可以用 schema.NewMessageJSONParser 把 message 转换成你需要的 struct +3. 写 prompt 要求模型得到模型结构化输出后,可以用 schema.NewMessageJSONParser 把 message 转换成你需要的 struct # Q: 如何获取模型(chat model)输出的 Reasoning Content/推理/深度思考 内容: @@ -115,14 +103,8 @@ Eino 目前不支持批处理,可选方法有两种 1. context.canceled: 在执行 graph 或者 agent 时,用户侧传入了一个可以 cancel 的 context,并发起了取消。排查应用层代码的 context cancel 操作。此报错与 eino 框架无关。 2. Context deadline exceeded: 可能是两种情况: - 1. 在执行 graph 或者 agent 时,用户侧传入了一个带 timeout 的 context,触发了超时。 - 2. 给 ChatModel 或者其他外部资源配置了 timeout 或带 timeout 的 httpclient,触发了超时。 - -查看抛出的 error 中的 `node path: [node name x]`,如果 node name 不是 ChatModel 等带外部调用的节点,大概率是 2-a 这种情况,反之大概率是 2-b 这种情况。 - -如果怀疑是 2-a 这种情况,自行排查下上游链路那个环节给 context 设置了 timeout,常见的可能性如 faas 平台等。 - -如果怀疑是 2-b 这种情况,看下节点是否自行配置了超时,比如 Ark ChatModel 配置了 Timeout,或者 OpenAI ChatModel 配置了 HttpClient(内部配置了 Timeout)。如果都没有配置,但依然超时了,可能是模型侧 SDK 的默认超时。已知 Ark SDK 默认超时 10 分钟,Deepseek SDK 默认超时 5 分钟。 +3. 在执行 graph 或者 agent 时,用户侧传入了一个带 timeout 的 context,触发了超时。 +4. 
给 ChatModel 或者其他外部资源配置了 timeout 或带 timeout 的 httpclient,触发了超时。查看抛出的 error 中的 `node path: [node name x]`,如果 node name 不是 ChatModel 等带外部调用的节点,大概率是 2-a 这种情况,反之大概率是 2-b 这种情况。如果怀疑是 2-a 这种情况,自行排查下上游链路那个环节给 context 设置了 timeout,常见的可能性如 faas 平台等。如果怀疑是 2-b 这种情况,看下节点是否自行配置了超时,比如 Ark ChatModel 配置了 Timeout,或者 OpenAI ChatModel 配置了 HttpClient(内部配置了 Timeout)。如果都没有配置,但依然超时了,可能是模型侧 SDK 的默认超时。已知 Ark SDK 默认超时 10 分钟,Deepseek SDK 默认超时 5 分钟。 # Q:想要在子图中获取父图的 State 怎么做 @@ -138,37 +120,268 @@ eino-ext 支持的多模态输入输出场景,可以查阅 [https://www.cloudw # Q: 升级到 0.6.x 版本后,有不兼容问题 -根据先前社区公告规划 [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397),已发布 eino V0.6.1 版本。重要更新内容为移除了 getkin/kin-openapi 依赖以及所有 OpenAPI 3.0 相关代码。 +根据先前社区公告规划 [Migration from OpenAPI 3.0 Schema Object to JSONSchema in Eino · cloudwego/eino · Discussion #397](https://github.com/cloudwego/eino/discussions/397),已发布 eino V0.6.1 版本。重要更新内容为移除了 getkin/kin-openapi 依赖以及所有 OpenAPI 3.0 相关代码。eino-ext 部分 module 报错 undefined: schema.NewParamsOneOfByOpenAPIV3 等问题,升级报错的 eino-ext module 到最新版本即可。如果 schema 改造比较复杂,可以使用 [JSONSchema 转换方法](https://bytedance.larkoffice.com/wiki/ZMaawoQC4iIjNykzahwc6YOknXf)文档中的工具方法辅助转换。 -eino-ext 部分 module 报错 undefined: schema.NewParamsOneOfByOpenAPIV3 等问题,升级报错的 eino-ext module 到最新版本即可。 - -如果 schema 改造比较复杂,可以使用 JSONSchema 转换工具方法辅助转换。 +> 💡 -# Q: Eino-ext 提供的 ChatModel 有哪些模型是支持 Response API 形式调用嘛? 
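+# Q: After I create a model, calling it fails with: 400 Bad Request, message: code: missing_required_parameter; message: Missing required parameter:'input. -- In Eino-Ext, currently only the ARK Chat Model can create a ResponsesAPI ChatModel, via **NewResponsesAPIChatModel**; other models do not yet support creating or using the ResponsesAPI, - - If you hit this error, check whether the base URL you set when creating the chat model is the Chat Completion URL or the Response API URL; in the vast majority of cases the Response API base URL was passed by mistake +- If you hit this error, check whether the base URL you set when creating the chat model is the Chat Completion URL or the Response API URL; in the vast majority of cases the Response API base URL was passed by mistake # Q: How do I troubleshoot ChatModel call errors? For example: [NodeRunError] failed to create chat completion: error, status code: 400, status: 400 Bad Request. -These errors come from the model API (GPT, Ark, Gemini, etc.). The general approach is to check whether the actual HTTP request sent to the model API is missing fields, has wrong field values, uses a wrong BaseURL, and so on. Log the actual HTTP request, then verify and adjust it by sending it directly over HTTP (for example with curl on the command line, or with Postman). Once the problem is located, fix the corresponding Eino code. - -To log the actual HTTP request sent to the model API, see this code sample: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) +These errors come from the model API (GPT, Ark, Gemini, etc.). The general approach is to check whether the actual HTTP request sent to the model API is missing fields, has wrong field values, uses a wrong BaseURL, and so on. Log the actual HTTP request, then verify and adjust it by sending it directly over HTTP (for example with curl on the command line, or with Postman). Once the problem is located, fix the corresponding Eino code. To log the actual HTTP request sent to the model API, see this code sample: [https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport](https://github.com/cloudwego/eino-examples/tree/main/components/model/httptransport) # Q: The gemini chat model created from the eino-ext repository does not support passing multimodal input via Image URL? How do I adapt? The gemini Chat model in the Eino-ext repository now supports URL-type input. Run go get github.com/cloudwego/eino-ext/components/model/gemini to update to [components/model/gemini/v0.1.22](https://github.com/cloudwego/eino-ext/releases/tag/components%2Fmodel%2Fgemini%2Fv0.1.22), currently the latest version, then test whether passing an Image URL meets your business needs -# Q: Before calling a tool (including MCP tools), a JSON Unmarshal failure is reported. How do I fix it? +# Q: The Tool Call produced by the model is faulty (arguments are invalid JSON, a nonexistent tool is called, parameter names have changed, etc.). How do I handle it? 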
+ +模型(LLM)产生的 Tool Call 可能存在多种问题,Eino 提供了多层防御机制来应对。以下按问题类型分别介绍: + +## 1. Tool Call 参数不是合法 JSON(Unmarshal 失败) + +**典型报错:** `failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` **根因:** ChatModel 产生的 Tool Call 中,Argument 字段是 string。Eino 在调用工具前会做 JSON Unmarshal,如果模型输出的 JSON 不合法(多余前缀/后缀、特殊字符转义、缺失大括号、超长截断等),则会报错。**方案 A:ToolArgumentsHandler(推荐)**在 `ToolsNodeConfig`(或 ADK 的 `ToolsConfig`)中配置 `ToolArgumentsHandler`,在工具执行前对参数进行预处理和修复: + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: tools, + ToolArgumentsHandler: func(ctx context.Context, name, arguments string) (string, error) { + // 在此修复常见 JSON 格式问题,如缺失大括号、多余前缀等 + return fixJSON(arguments), nil + }, + }, + }, +}) +``` + +一个 JSON 修复的参考实现:[eino-examples/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix)**执行顺序:** `ArgumentsAliases 别名替换 → ToolArgumentsHandler → 工具执行` + +## 2. 模型调用了不存在的工具(Tool Name 幻觉) + +**典型报错:** `tool xxx not found in toolsNode indexes` **根因:** 模型可能"幻觉"出不存在的工具名称。**方案:UnknownToolsHandler** 配置后,当模型调用不存在的工具时,不会直接报错,而是由 Handler 返回一段提示文本,让模型自行纠正: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + UnknownToolsHandler: func(ctx context.Context, name, input string) (string, error) { + return fmt.Sprintf("Tool '%s' does not exist. Available tools: %s. Please retry.", name, availableToolNames), nil + }, +} +``` + +## 3. 
工具名称或参数名称发生变化(Schema 迁移导致的兼容性问题) + +**场景:** 工具重命名(如 `search` → `web_search`),或参数字段重命名(如 `q` → `query`),但模型可能仍使用旧名称。这在使用 LLM Cache 或对话历史中记录了旧工具 Schema 时尤为常见。**方案:ToolAliases** 为工具配置名称别名和参数别名,框架在调度时自动解析: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolAliases: map[string]compose.ToolAliasConfig{ + "web_search": { + NameAliases: []string{"search", "web-search"}, // 旧工具名 → 当前工具名 + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, // 旧参数名 → 当前参数名 + }, + }, + }, +} +``` + +> 💡 +> ToolAliases 的参数别名替换发生在 ToolArgumentsHandler 之前。完整的执行顺序为:Name Alias 解析 → Arguments Alias 替换 → ToolArgumentsHandler → 工具执行。 + +## 4. 工具执行失败后,让模型自行纠错(而非中断流程) + +**场景:** Tool 执行报错(如文件不存在、权限不足、API 调用失败)时,默认会中断 Agent 流程。但通常更好的做法是将错误信息作为正常的 Tool Result 返回给模型,由模型自动纠错重试。**方案 A:ADK Middleware(WrapInvokableToolCall)**在 ADK Agent 中,通过 `ChatModelAgentMiddleware` 的 `WrapInvokableToolCall` 方法将错误转换为字符串结果: + +```go +func (m *safeToolMiddleware) WrapInvokableToolCall( + _ context.Context, + endpoint adk.InvokableToolCallEndpoint, + _ *adk.ToolContext, +) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) 
+ if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err // 中断错误不转换 + } + return fmt.Sprintf("[tool error] %v", err), nil + } + return result, nil + }, nil +} +``` + +参考:[quickstart/chatwitheino Ch05 Middleware](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch05/main.go)**方案 B:compose 层 ToolCallMiddlewares** 在 compose 层直接使用 `ToolCallMiddlewares`,适用于直接使用 Graph/ToolsNode 的场景: + +```go +compose.ToolsNodeConfig{ + Tools: tools, + ToolCallMiddlewares: []compose.ToolMiddleware{ + { + Invokable: func(next compose.InvokableToolEndpoint) compose.InvokableToolEndpoint { + return func(ctx context.Context, in *compose.ToolInput) (*compose.ToolOutput, error) { + output, err := next(ctx, in) + if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return nil, err + } + return &compose.ToolOutput{Result: fmt.Sprintf("[tool error] %v", err)}, nil + } + return output, nil + } + }, + }, + }, +} +``` + +参考:[eino-examples/components/tool/middlewares/errorremover](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/errorremover) -ChatModel 产生的 Tool Call 中,Argument 字段是 string。Eino 框架在根据这个 Argument string 调用工具时,会先做 JSON Unmarshal。这时,如果 Argument string 不是合法的 JSON,则 JSON Unmarshal 会失败,报出类似这样的错误:`failed to call mcp tool: failed to marshal request: json: error calling MarshalJSON for type json.RawMessage: unexpected end of JSON input` +> 💡 +> 注意:在转换错误时,必须先检查 `compose.IsInterruptRerunError`。InterruptRerun 错误是框架用于 Human-in-the-loop 等场景的控制流信号,不应被吞掉。 + +## 总结 -解决这个问题的根本途径是依靠模型输出合法的 Tool Call Argument。在工程方面,我们可以尝试修复一些常见的 JSON 格式问题,如多余的前缀、后缀,特殊字符转义问题,缺失的大括号等,但无法保证 100% 的修正。一个类似的修复实现可以参考代码样例:[https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix](https://github.com/cloudwego/eino-examples/tree/main/components/tool/middlewares/jsonfix) + + + + + + +
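| Problem | Mechanism | Where to configure |
|---|---|---|
| Arguments are not valid JSON | `ToolArgumentsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
| A nonexistent tool is called | `UnknownToolsHandler` | `ToolsNodeConfig` / `ToolsConfig` |
| Tool name or parameter name has changed | `ToolAliases` | `ToolsNodeConfig` / `ToolsConfig` |
| Tool execution fails and the model should self-correct | Middleware error conversion | ADK `Handlers` / `ToolCallMiddlewares` |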
    # Q:如何可视化一个 graph/chain/workflow 的拓扑结构? 利用 `GraphCompileCallback` 机制在 `graph.Compile` 的过程中将拓扑结构导出。一个导出为 mermaid 图的代码样例:[https://github.com/cloudwego/eino-examples/tree/main/devops/visualize](https://github.com/cloudwego/eino-examples/tree/main/devops/visualize) +# Q: Eino 中使用 Flow/react Agent 场景下如何获取工具调用的 Tool Call Message 以及本次调用工具的 Tool Result 结果? + - Flow/React Agent 场景下获取中间结构参考文档 [Eino: ReAct Agent 使用手册](/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual) +- 此外还可以将 Flow/React Agent 替换成 ADK 的 ChatModel Agent 具体可参考 [Eino ADK: 概述](/zh/docs/eino/core_modules/eino_adk/agent_preview) + +# Q: 在使用 Eino 开发 Agent 时,定义了一个不需要任何参数的工具(Tool)。为什么在调用部分大模型时,会遇到类似 JSON Schema 校验失败(如 `unknown msg type` 或格式不支持)的报错?该如何规范解决? + +**A: 问题根因:**在 Function Calling / 工具调用的生态中,许多大模型厂商对下发的 JSON Schema 都有着严格的格式校验逻辑。如果在定义无参工具时,开发者错误地传入了空的参数映射或空结构体(例如导致框架生成 `{"type": "object", "properties": {}}` 这样虽然语法合法但无实际意义的 Schema),部分模型的校验引擎会将其判定为不符合预期的异常格式,进而直接拒绝请求。**框架机制与代码行为:** + +- 在 Eino 框架的核心定义(`eino/schema/tool.go`)中,`schema.ToolInfo` 结构体专门使用 `ParamsOneOf` 字段来描述参数。 +- 框架设计上明确允许:对于不需要参数的工具,`ParamsOneOf` 应当为 `nil`。 +- 当 `ParamsOneOf` 为 `nil` 时,Eino 的底层组件在向各类模型 Provider 构建请求时,会直接省略工具的 `parameters` 字段,从而从根本上避免触发模型的强校验规则。**最佳实践:**在 Eino 中构造无参工具时,**切勿使用空结构体或空 Map 去初始化参数描述**,应直接让 `ParamsOneOf` 保持默认的 `nil` 状态。 + +```go +tool := &schema.ToolInfo{ + Name: "fetch_current_time", + Desc: "获取当前系统时间,无需任何参数", + // 最佳实践:明确置为 nil,或直接不声明该字段 + ParamsOneOf: nil, +} +``` + +**(注:如果使用的是 **utils.InferTool** 等反射推导工具,且入参为空结构体时,需注意确保使用的 Eino 扩展版本已正确处理了空属性的过滤,或考虑根据需要手动覆盖其参数定义。)** + +# Q: 如何在 Agent 外部获取 Session Values(如 deep agent 的 TODOs)? 
+ +在 ADK 中,`adk.GetSessionValues(ctx)` 和 `adk.AddSessionValue(ctx, key, value)` 依赖 Agent 运行期间注入到 context 中的 `runSession`。这意味着它们**只能在 Agent 的执行上下文内使用**——例如在 Middleware、Handler 或 Tool 回调函数中。当用户通过 Runner 的 `Run` 方法获取到 `AsyncIterator` 并在外部消费 `AgentEvent` 时,此时已经不在 Agent 的执行上下文中,因此无法通过 `adk.GetSessionValues` 获取到 Session Values。如果需要在 Agent 运行过程中实时获取 Session Values(例如在消费流式事件的同时),可以考虑使用 Middleware/Callback Handler 的回调将所需数据通过其他渠道(如 channel)传递出来。 + +# Q: 多个同名 SubAgent 并发执行时,如何区分它们发出的 AgentEvent? + +**场景:** 使用 DeepAgent 时,多个同名 SubAgent(如 `general-purpose`)可能并发执行。在通过 Runner 消费 `AsyncIterator[*AgentEvent]` 时,不同实例发出的事件难以区分。**方案:包装 Agent,通过 CustomizedOutput 注入标识符** `AgentOutput` 提供了 `CustomizedOutput any` 字段,可以用于承载自定义数据。通过包装 Agent 的 `Run` 方法,在每个发出的事件上注入唯一标识: + +```go +type wrappedAgent struct { + adk.Agent + identifier int +} + +func (w *wrappedAgent) Run(ctx context.Context, input *adk.AgentInput, options ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter := w.Agent.Run(ctx, input, options...) + newIter, newGen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer newGen.Close() + for { + event, ok := iter.Next() + if !ok { + break + } + // 注意:event.Output 可能为 nil(如错误事件、action-only 事件) + if event.Output == nil { + event.Output = &adk.AgentOutput{} + } + event.Output.CustomizedOutput = w.identifier + newGen.Send(event) + } + }() + return newIter +} +``` + +**使用方式:** + +```go +agent1 := &wrappedAgent{Agent: generalAgent, identifier: 1} +agent2 := &wrappedAgent{Agent: generalAgent, identifier: 2} +// 将 agent1、agent2 作为 SubAgent 传入 DeepAgent +``` + +**消费端区分:** + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Output != nil && event.Output.CustomizedOutput != nil { + id := event.Output.CustomizedOutput.(int) + fmt.Printf("Event from agent instance %d\n", id) + } +} +``` + +> 💡 +> 注意事项: +> +> 1. event.Output 可能为 nil,设置 CustomizedOutput 前必须做 nil 检查。 +> 2. 
此包装仅覆盖 Run 方法。如果 Agent 实现了 ResumableAgent 接口(如 DeepAgent 创建的 Agent),Resume 方法通过嵌入的 Agent 直接调用,其事件不会被注入标识符。如需完整覆盖,需要同时包装 Resume 方法。 +> 3. 此方案是 workaround,适合快速解决区分问题。CustomizedOutput 不会被持久化到 Checkpoint。 + +# Q: 如何在某个 Skill 被触发时才加载对应的 ToolInfo?/ 如何用 Skill 强制模型调用指定工具? + +这两个问题的根源在于对 Skill 和 Tool 概念的混淆。**Skill 的本质是 Prompt。** Skill 中间件在触发时,会向对话中插入一条新的 UserMessage,其内容就是该 Skill 的 Prompt 文本。你可以在 Skill Prompt 中写明"请调用 xxx 工具,参数为 yyy",但这仍然只是提示词——模型是否遵循,取决于 Prompt Engineering 的质量和模型本身的随机性。**Tool(ToolInfo)的本质是请求参数。** ToolInfo 列表作为 ChatModel 请求的 `tools` 参数发送给模型,告诉模型"你可以调用哪些工具"。除非使用 ToolSearch 动态加载(Claude、GPT 5.4+ 等支持),否则 ToolInfo 必须在请求时一并传递。**关于"Skill 触发时动态加载 ToolInfo":** 要实现这个效果,意味着当 Skill Prompt 被插入对话时,同时往本次请求的 `[]ToolInfo` 中追加该 Skill 所需的工具定义。这完全是用户侧的自定义行为——你需要:1) 识别当前轮次是否触发了 Skill;2) 确定该 Skill 需要哪些 Tool;3) 在构造 ChatModel 请求前,将对应的 ToolInfo 追加到 `[]ToolInfo`。需要注意,`[]ToolInfo` 位于 Prompt Cache 的前部,动态追加新工具极大概率会破坏 Prompt Cache,导致缓存命中率下降和延迟增加。如果在意缓存效率,应在初始化时就把所有可能用到的工具一次性传入。**关于"用 Skill 强制模型调用指定工具":** Skill 只是向模型发送了一段文字提示,模型是否严格遵循取决于 Prompt 的清晰度、模型自身的 instruction-following 能力以及上下文干扰。这本质上是 Prompt Engineering 问题,存在固有的不确定性。如果业务要求 100% 确定调用某个工具,可以在 LLM 请求中指定 ToolChoice 强制模型选择该工具,或在应用层代码中直接调用该工具而非依赖模型决策。 + +> 💡 +> 推荐做法:Skill 触发时希望模型"大概率"调用某工具 → 在 Skill Prompt 中明确写出工具名称、参数格式和调用指令;需要动态控制可用工具集 → 使用 ToolSearch 或在 ChatModel 中间件中根据上下文动态修改 `[]ToolInfo`;必须 100% 调用某工具 → 在应用层代码中直接调用,不依赖模型决策;担心 Prompt Cache 失效 → 初始化时传入所有可能用到的 ToolInfo,避免动态增删。 + +# Q: Supervisor 子 Agent 转回主 Agent 报错 / transfer_to_agent 转发后子 Agent 收到的用户内容变更 + +这些问题均与 ADK 的 AgentTransfer 机制有关。Supervisor 是基于 AgentTransfer 实现的多 Agent 协作模式。AgentTransfer 机制存在以下已知局限: + +- **上下文全量共享**:Supervisor 与 SubAgent 之间、SubAgent 之间强制共享完整上下文,导致 token 开销大、延迟高。 +- **注意力稀释**:全量共享的上下文对子 Agent 而言往往冗余,稀释了子 Agent 对其真正任务的关注度,降低执行质量。 +- **上下文污染**:转发过程中产生的 "Successfully transferred to xxx" 消息会残留在上下文中,可能误导后续 Agent 的 Tool Call 决策(形成错误的 few-shot 示例)。 +- **强制注入工具**:机制要求注入 Transfer Tool(以及可能的 Exit Tool),增加了 ToolInfo 列表的复杂度。 + +> 💡 +> 基于上述原因,ADK 中的 AgentTransfer / 
Supervisor 模式目前标记为「不推荐使用」。 + +**推荐替代方案:** 使用 DeepAgent 或 ChatModelAgent + AgentTool 组合。这种模式下: + +- 每个 AgentTool 拥有独立封装的上下文,不会相互污染,速度更快、成本更低,通常效果更好。 +- 不会产生 "Successfully transferred to xxx" 等干扰消息,避免对模型决策形成误导。 + +# Q: DeepSeek V4 模型 tool call 场景下 reason content 回传有问题,如何解决? + +DeepSeek V4 模型在 tool call 场景下,reason content 的回传存在已知问题,多位业务同学反馈遇到此情况。 + +**解决方式:** 升级对应的 eino-ext deepseek 模块到最新版本即可修复。 + +```shell +go get github.com/cloudwego/eino-ext/components/model/deepseek@latest +``` -# Q: Gemini 模型报错 missing a `thought_signature` +升级后重新运行,确认 reason content 回传是否恢复正常。 diff --git a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md index 731c634fbdf..1b6055116be 100644 --- a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md +++ b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: 编排的设计理念 diff --git a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md index 39ef62359a2..9c7025c0804 100644 --- a/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md +++ b/content/zh/docs/eino/core_modules/chain_and_graph_orchestration/stream_programming_essentials.md @@ -101,7 +101,7 @@ Collect 和 Transform 两种流式范式,目前只在编排场景有用到。 上面的 Concat message stream 是 Eino 框架自动提供的能力,即使不是 message,是任意的 T,只要满足特定的条件,Eino 框架都会自动去做这个 StreamReader[T] 到 T 的转化,这个条件是:**在编排中,当一个组件的上游输出是 StreamReader[T],但是组件只提供了 T 作为输入的业务接口时,框架会自动将 StreamReader[T] concat 成 T,再输入给这个组件。** > 💡 -> 框架自动将 StreamReader[T] concat 成 T 的过程,可能需要用户提供一个 Concat function。详见 [Eino: 
编排的设计理念](/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles) 中关于“合并帧”的章节。 +> 框架自动将 StreamReader[T] concat 成 T 的过程,可能需要用户提供一个 Concat function。详见 [Eino: 编排的设计理念](/zh/docs/eino/core_modules/chain_and_graph_orchestration/orchestration_design_principles#share-FaVnd9E2foy4fAxtbTqcsgq3n5f) 中关于“合并帧”的章节。 另一方面,考虑一个相反的例子。还是 React Agent,这次是一个更完整的编排示意图: diff --git a/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md b/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md index 67f1559297a..a9f6438c4c0 100644 --- a/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md +++ b/content/zh/docs/eino/core_modules/components/agentic_tools_node_guide.md @@ -371,8 +371,8 @@ result, err := runnable.Invoke(ctx, input, compose.WithCallbacks(helper)) 工具的实现方式有多种,可以参考如下方式: -- 基于 HTTP API 的 tool 实现: [如何使用 openapi 创建 tool/function call ?](/zh/docs/eino/usage_guide/how_to_guide/openapi_tool_creation) -- 基于 gRPC 的 tool 实现: [如何使用 proto3 创建 tool/function call ? ](/zh/docs/eino/usage_guide/how_to_guide/proto3_tool_creation) -- 基于 thrift 的 tool 实现: [如何使用 thrift idl 创建 tool/function call ? ](/zh/docs/eino/usage_guide/how_to_guide/thrift_idl_tool_creation) +- 基于 HTTP API 的 tool 实现: [如何使用 openapi 创建 tool/function call ?](https://bytedance.larkoffice.com/wiki/FjXzwf3exijtKyk2hh7cAmnZn1g) +- 基于 gRPC 的 tool 实现: [如何使用 proto3 创建 tool/function call ? ](https://bytedance.larkoffice.com/wiki/EPkawUVbdiGwxCkWCJTcAMQonbh) +- 基于 thrift 的 tool 实现: [如何使用 thrift idl 创建 tool/function call ? 
](https://bytedance.larkoffice.com/wiki/PcHfwo6x0iOrXxkIjJecez8xnNg) - 基于本地函数的工具实现: [如何创建一个 tool ?](/zh/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool) - …… diff --git a/content/zh/docs/eino/core_modules/components/document_transformer_guide.md b/content/zh/docs/eino/core_modules/components/document_transformer_guide.md index fa0d2764a98..b9a2a6ae272 100644 --- a/content/zh/docs/eino/core_modules/components/document_transformer_guide.md +++ b/content/zh/docs/eino/core_modules/components/document_transformer_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: Document Transformer 使用说明 @@ -160,9 +160,11 @@ for idx, doc := range outDocs { ## **已有实现** -1. Markdown Header Splitter: 基于 Markdown 标题进行文档分割 [Splitter - markdown](/zh/docs/eino/ecosystem_integration/document/splitter_markdown) -2. Text Splitter: 基于文本长度或分隔符进行文档分割 [Splitter - semantic](/zh/docs/eino/ecosystem_integration/document/splitter_semantic) -3. Document Filter: 基于规则过滤文档内容 [Splitter - recursive](/zh/docs/eino/ecosystem_integration/document/splitter_recursive) + + + + +
- markdown: README_zh.md, README.md
- recursive: README_zh.md, README.md
- semantic: README_zh.md, README.md
    ## **自行实现参考** diff --git a/content/zh/docs/eino/core_modules/components/embedding_guide.md b/content/zh/docs/eino/core_modules/components/embedding_guide.md index 329129f3d87..029bbe92266 100644 --- a/content/zh/docs/eino/core_modules/components/embedding_guide.md +++ b/content/zh/docs/eino/core_modules/components/embedding_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-07-21" +date: "2026-05-17" lastmod: "" tags: [] title: Embedding 使用说明 diff --git a/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md b/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md index 2561cabb5b5..8660b0f0abb 100644 --- a/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md +++ b/content/zh/docs/eino/core_modules/components/tools_node_guide/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-03" +date: "2026-05-17" lastmod: "" tags: [] title: ToolsNode&Tool 使用说明 @@ -282,6 +282,82 @@ type ToolInfo struct { Tool 组件使用 ToolOption 来定义可选参数, ToolsNode 没有抽象公共的 option。每个具体的实现可以定义自己的特定 Option,通过 WrapToolImplSpecificOptFn 函数包装成统一的 ToolOption 类型。 +## Tool 别名(Alias)🏷️ alpha/09 + +Tool 别名功能允许为工具配置**名称别名**和**参数别名**,使 LLM 使用别名调用工具时能自动解析到真实工具和规范参数。 + +### 配置结构 + +```go +// ToolAliasConfig 配置单个工具的名称和参数别名 +type ToolAliasConfig struct { + // NameAliases 是工具的替代名称列表 + // 如果模型返回这些名称中的任何一个,将解析为规范工具名 + NameAliases []string + + // ArgumentsAliases 将规范参数 key 映射到其别名列表 + // key=规范名, value=[]别名 + // 例: {"query": ["q", "search_term"], "limit": ["max_results", "count"]} + ArgumentsAliases map[string][]string +} +``` + +在 `ToolsNodeConfig` 中通过 `ToolAliases` 字段配置: + +```go +config := &compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool, weatherTool}, + ToolAliases: map[string]ToolAliasConfig{ + "search": { + NameAliases: []string{"find", "query", "search_v1"}, + ArgumentsAliases: map[string][]string{ + "query": {"q", "search_term"}, + "limit": {"max_results", "count"}, + }, + }, + }, +} +toolsNode, err := compose.NewToolNode(ctx, 
config) +``` + +### 动态覆盖 + +通过 `WithToolAliases()` 调用选项可在运行时覆盖全局别名配置: + +```go +// 覆盖别名配置(保留原工具列表) +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolAliases(map[string]compose.ToolAliasConfig{ + "search": { + NameAliases: []string{"new_alias"}, + }, + }), +) + +// 同时覆盖工具列表和别名 +result, err := toolsNode.Invoke(ctx, input, + compose.WithToolList(newSearchTool), + compose.WithToolAliases(map[string]compose.ToolAliasConfig{...}), +) +``` + +### 执行流程 + +工具调用时的处理顺序: + +1. **名称解析**:LLM 返回的工具名(可能是别名)通过 indexes 查找解析为规范工具名 +2. **参数重映射**:JSON 参数中的别名 key 自动替换为规范 key +3. **ToolArgumentsHandler**(如已配置):接收规范工具名和已重映射的参数 +4. **工具执行**:使用规范名称和参数调用工具 + +### 注意事项 + +- 名称别名**不能**与其他工具的规范名或已注册的别名冲突 +- 参数别名**不能**与工具 JSON Schema 中已有的属性名冲突 +- 当别名 key 和规范 key **同时存在**于参数 JSON 中时,规范 key 优先,别名 key 保持原样 +- 为不存在的工具名配置别名会被**静默忽略** +- 别名功能同时支持**标准工具**和**增强型工具** + ## **使用方式** ### **标准工具使用** diff --git a/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md b/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md index f195e80e1d6..ef9539c7302 100644 --- a/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md +++ b/content/zh/docs/eino/core_modules/devops/visual_debug_plugin_guide.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] title: Eino Dev 可视化调试插件功能指南 @@ -166,6 +166,7 @@ go mod tidy > 1. 确保目标调试的编排产物至少执行过一次 `Compile()`。 > 2. `devops.Init()` 的执行必须要在调用 `Compile()` 之前。 > 3. 用户需要保证 `devops.Init()` 执行后主进程不能退出。 +> 4. 
v0.1.9 起,调试服务默认监听地址由 `0.0.0.0` 变更为 `127.0.0.1`(仅允许本地连接)。如需远程调试,请通过 `WithDevServerIP` 显式指定监听 IP,例如:`devops.Init(ctx, devops.WithDevServerIP("0.0.0.0"))`。 如在 `main()` 函数中增加调试服务启动代码 @@ -223,7 +224,7 @@ func main() { > 注意事项 > > - 本地电脑调试:系统可能会弹出网络接入警告,允许接入即可。 -> - 远程服务器调试:需要你保证端口可访问。 +> - 远程服务器调试:需要保证端口可访问。此外,v0.1.9 起默认仅监听 `127.0.0.1`,远程调试必须在 `devops.Init()` 时通过 `WithDevServerIP` 指定可被远端访问的 IP(如 `0.0.0.0`)。 IP 和 Port 配置完成后,点击确认,调试插件会自动连接到目标调试服务器。如果成功连接,连接状态指示器会变成绿色。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md index bd3e960647d..6c0fe4d8656 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PatchToolCalls.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: PatchToolCalls @@ -10,19 +10,15 @@ weight: 8 adk/middlewares/patchtoolcalls > 💡 -> PatchToolCalls 中间件用于修复消息历史中「悬空的工具调用」(dangling tool calls)问题。本中间件在 v0.8.0 版本引入。 +> PatchToolCalls 中间件用于修复消息历史中「悬空的工具调用」(dangling tool calls)问题。在 v0.8.0 版本引入。同时支持 `*schema.Message` 和 `*schema.AgenticMessage` 两种消息类型。 ## 概述 -在多轮对话场景中,可能会出现 Assistant 消息包含工具调用(ToolCalls),但对话历史中缺少对应的 Tool 消息响应的情况。这种「悬空的工具调用」会导致某些模型 API 报错或产生异常行为。 - -**常见场景:** +在多轮对话场景中,可能出现 Assistant 消息包含工具调用(ToolCalls),但对话历史中缺少对应 Tool 响应的情况。这种「悬空的工具调用」会导致某些模型 API 报错或产生异常行为。**常见场景:** - 用户在工具执行完成前发送了新消息,导致工具调用被中断 - 会话恢复时,部分工具调用结果丢失 -- Human-in-the-loop 场景下,用户取消了工具执行 - -PatchToolCalls 中间件会在每次模型调用前扫描消息历史,为缺少响应的工具调用自动插入占位符消息。 +- Human-in-the-loop 场景下,用户取消了工具执行 PatchToolCalls 中间件会在每次模型调用前(`BeforeModelRewriteState` 钩子)扫描消息历史,为缺少响应的工具调用自动插入占位符消息。 ## 快速开始 @@ -33,48 +29,64 @@ import ( "github.com/cloudwego/eino/adk/middlewares/patchtoolcalls" ) -// 使用默认配置创建中间件 +// 使用默认配置(cfg 可传 nil) mw, err := 
patchtoolcalls.New(ctx, nil) if err != nil { // 处理错误 } -// 与 ChatModelAgent 一起使用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, Middlewares: []adk.ChatModelAgentMiddleware{mw}, }) ``` -## 配置项 +## API 参考 + +### Config ```go type Config struct { - // PatchedContentGenerator 自定义生成占位符消息内容的函数 - // 可选,不设置时使用默认消息 PatchedContentGenerator func(ctx context.Context, toolName, toolCallID string) (string, error) } ``` - +
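 | Field | Type | Required | Description |
 |---|---|---|---|
-| `PatchedContentGenerator` | `func(ctx, toolName, toolCallID string) (string, error)` | No | Custom function that generates the placeholder message content. It receives the tool name and the call ID and returns the content to fill in |
+| `PatchedContentGenerator` | `func(ctx context.Context, toolName, toolCallID string) (string, error)` | No | Custom function that generates the placeholder message content. When unset, the built-in default message template is used |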
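To make the patching behaviour concrete, here is a minimal sketch under stated assumptions. `Message`, `ToolCall`, and `patchToolCalls` are hypothetical local stand-ins, not the eino types or the middleware's actual code. The placeholder text reuses the English default template quoted in this document, and the placement of the placeholder (after the tool messages that directly follow the assistant message) follows the document's before/after example.

```go
package main

import "fmt"

// ToolCall and Message are simplified stand-ins for the schema types.
type ToolCall struct {
	ID   string
	Name string
}

type Message struct {
	Role       string // "user", "assistant" or "tool"
	Content    string
	ToolCallID string     // set on tool messages
	ToolCalls  []ToolCall // set on assistant messages
}

// Default English template as quoted in the document.
const placeholderTemplate = "Tool call %s with id %s was canceled - another message came in before it could be completed."

// patchToolCalls returns a copy of msgs in which every tool call that has no
// matching tool message later in the history gets a placeholder tool message.
func patchToolCalls(msgs []Message) []Message {
	out := make([]Message, 0, len(msgs))
	for i := 0; i < len(msgs); i++ {
		m := msgs[i]
		out = append(out, m)
		if m.Role != "assistant" || len(m.ToolCalls) == 0 {
			continue
		}
		// Record every tool call ID that is answered somewhere later.
		answered := map[string]bool{}
		for _, later := range msgs[i+1:] {
			if later.Role == "tool" {
				answered[later.ToolCallID] = true
			}
		}
		// Keep the tool messages that directly follow the assistant message.
		j := i + 1
		for j < len(msgs) && msgs[j].Role == "tool" {
			out = append(out, msgs[j])
			j++
		}
		// Append a placeholder for each dangling tool call.
		for _, tc := range m.ToolCalls {
			if !answered[tc.ID] {
				out = append(out, Message{
					Role:       "tool",
					ToolCallID: tc.ID,
					Content:    fmt.Sprintf(placeholderTemplate, tc.Name, tc.ID),
				})
			}
		}
		i = j - 1 // resume after the tool messages already copied
	}
	return out
}
```

Calling `patchToolCalls` on a history of user, assistant (two tool calls), one tool reply, and a new user message yields five messages, with the placeholder in fourth position.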
    -### 默认占位符消息 +### New + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +创建 PatchToolCalls 中间件。`cfg` 可为 `nil`,此时使用默认配置。内部调用 `NewTyped[*schema.Message]`。 + +### NewTyped + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +泛型版本构造函数,支持 `*schema.Message` 和 `*schema.AgenticMessage`。`cfg` 可为 `nil`。 + +- 当 `M = *schema.Message` 时,通过 `ToolCallID` 字段匹配 Tool 消息 +- 当 `M = *schema.AgenticMessage` 时,通过 `ContentBlock.FunctionToolResult.CallID` 匹配 -如果不设置 `PatchedContentGenerator`,中间件会使用默认的占位符消息: +### 默认占位符消息 -**英文(默认):** +如果不设置 `PatchedContentGenerator`,中间件使用内置模板(通过 `fmt.Sprintf` 格式化,`%s` 依次对应 toolName 和 toolCallID):**英文(默认):** ``` -Tool call {toolName} with id {toolCallID} was cancelled - another message came in before it could be completed. +Tool call %s with id %s was canceled - another message came in before it could be completed. ``` **中文:** ``` -工具调用 {toolName}(ID 为 {toolCallID})已被取消——在其完成之前收到了另一条消息。 +工具调用 %s(ID 为 %s)已被取消——在其完成之前收到了另一条消息。 ``` 可通过 `adk.SetLanguage()` 切换语言。 @@ -91,10 +103,24 @@ mw, err := patchtoolcalls.New(ctx, &patchtoolcalls.Config{ }) ``` -### 结合其他中间件使用 +### 泛型用法(AgenticMessage) + +```go +mw, err := patchtoolcalls.NewTyped[*schema.AgenticMessage](ctx, nil) +if err != nil { + // 处理错误 +} + +agent, err := adk.NewTypedChatModelAgent[*schema.AgenticMessage](ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: yourChatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{mw}, +}) +``` + +### 结合其他中间件 ```go -// PatchToolCalls 通常应该放在中间件链的前面 +// PatchToolCalls 通常放在中间件链的前面 // 确保在其他中间件处理消息之前修复悬空的工具调用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ Model: yourChatModel, @@ -108,35 +134,28 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 工作原理 - - -**处理逻辑:** - -1. 在 `BeforeModelRewriteState` 钩子中执行 -2. 遍历所有消息,查找包含 `ToolCalls` 的 Assistant 消息 -3. 
对于每个 ToolCall,检查后续消息中是否存在对应的 Tool 消息(通过 `ToolCallID` 匹配) -4. 如果找不到对应的 Tool 消息,则插入一个占位符消息 -5. 返回修复后的消息列表 +> 💡 +> 对于 `*schema.Message`,通过 `msg.Role == schema.Tool && msg.ToolCallID` 匹配;对于 `*schema.AgenticMessage`,通过 `ContentBlock.FunctionToolResult.CallID` 匹配。 -## 示例场景 +### 示例场景 -### 修复前的消息历史 +**修复前:** ``` -[User] "帮我查询天气" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: 晴天,25°C" -[User] "不用查位置了,直接告诉我北京的天气" <- 用户中断 +[User] "帮我查询天气" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: 晴天,25°C" +[User] "不用查位置了,直接告诉我北京的天气" <- 用户中断 ``` -### 修复后的消息历史 +**修复后:** ``` -[User] "帮我查询天气" -[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] -[Tool] "call_1: 晴天,25°C" -[Tool] "call_2: 工具调用 get_location(ID 为 call_2)已被取消..." <- 自动插入 -[User] "不用查位置了,直接告诉我北京的天气" +[User] "帮我查询天气" +[Assistant] ToolCalls: [{id: "call_1", name: "get_weather"}, {id: "call_2", name: "get_location"}] +[Tool] "call_1: 晴天,25°C" +[Tool] "call_2: 工具调用 get_location(ID 为 call_2)已被取消..." 
<- 自动插入 +[User] "不用查位置了,直接告诉我北京的天气" ``` ## 多语言支持 @@ -153,7 +172,8 @@ adk.SetLanguage(adk.LanguageEnglish) // 英文(默认) ## 注意事项 > 💡 -> 此中间件仅在 `BeforeModelRewriteState` 钩子中修改本次运行的历史消息,不会影响实际存储的消息历史。修复只是临时的,仅用于本轮 agent 调用。 +> `BeforeModelRewriteState` 返回的 state 会被框架持久化到 agent 内部状态(参见 `wrappers.go` 中的 `ProcessState` 调用)。因此 PatchToolCalls 插入的占位符消息**会保留在后续迭代中**,不需要每轮重复修补。 - 建议将此中间件放在中间件链的**前面**,确保其他中间件处理的是完整的消息历史 -- 如果你的场景需要持久化修复后的消息,请在 `PatchedContentGenerator` 中实现相应逻辑 +- `cfg` 参数可传 `nil`,等价于 `&Config{}` +- 如果消息列表为空(`len(state.Messages) == 0`),中间件直接返回,不做任何处理 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md index daf5662c286..1622e947a52 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_PlanTask.md @@ -1,33 +1,28 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: PlanTask weight: 6 --- -# PlanTask 中间件 - -adk/middlewares/plantask - > 💡 -> 本中间件在 v0.8.0 版本引入。 +> 本中间件在 v0.8.0 版本引入。包路径:`github.com/cloudwego/eino/adk/middlewares/plantask` ## 概述 -`plantask` 是一个任务管理中间件,让 Agent 可以创建和管理任务列表。中间件通过 `BeforeAgent` 钩子注入四个工具: - -- **TaskCreate**: 创建任务 -- **TaskGet**: 查看任务详情 -- **TaskUpdate**: 更新任务 -- **TaskList**: 列出所有任务 +`plantask` 是一个任务管理中间件,通过 `BeforeAgent` 钩子向 Agent 注入四个工具,使其具备结构化任务规划能力: -主要用途: + + + + + + +
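+| Tool | Function |
+|---|---|
+| `TaskCreate` | Create a task |
+| `TaskGet` | Get the details of a single task |
+| `TaskUpdate` | Update task status or fields, set dependencies, delete a task |
+| `TaskList` | List summaries of all tasks |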
    -- 跟踪复杂任务的进度 -- 把大任务拆成小步骤 -- 管理任务间的依赖关系 +核心用途:将复杂请求拆解为可跟踪的小任务,管理任务间依赖关系,让用户看到执行进度。 --- @@ -38,7 +33,7 @@ adk/middlewares/plantask │ Agent │ │ │ │ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ BeforeAgent: 注入任务工具 │ │ +│ │ BeforeAgent: 注入任务工具 (带 sync.Mutex 保证并发安全) │ │ │ │ - TaskCreate │ │ │ │ - TaskGet │ │ │ │ - TaskUpdate │ │ @@ -53,7 +48,7 @@ adk/middlewares/plantask │ │ │ 存储结构: │ │ baseDir/ │ -│ ├── .highwatermark # ID 计数器 │ +│ ├── .highwatermark # 已分配的最大 ID(纯数字文本) │ │ ├── 1.json # 任务 #1 │ │ ├── 2.json # 任务 #2 │ │ └── ... │ @@ -63,126 +58,151 @@ adk/middlewares/plantask --- -## 配置 +## API + +### 构造函数 + +```go +// 泛型版本,支持 *schema.Message 和 *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) + +// 非泛型版本,等价于 NewTyped[*schema.Message] +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` + +### Config ```go type Config struct { Backend Backend // 存储后端,必填 - BaseDir string // 任务文件目录,必填 + BaseDir string // 任务文件存储目录,必填 } ``` -- 注意这个 Backend 的实现,应该是 session 维度隔离的,不同的 session 对应不同的 Backend(任务列表) +> 💡 +> Backend 应该是 session 维度隔离的——不同会话对应不同的 Backend 实例(即不同的任务列表)。 ---- +### Backend 接口 -## Backend 接口 +`Backend` 定义在 `plantask` 包内,是 `filesystem.Backend` 的精简子集,仅保留任务存储所需的四个方法: ```go type Backend interface { LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - Read(ctx context.Context, req *ReadRequest) (string, error) + Read(ctx context.Context, req *ReadRequest) (*filesystem.FileContent, error) Write(ctx context.Context, req *WriteRequest) error Delete(ctx context.Context, req *DeleteRequest) error } ``` +其中类型别名关系: + +```go +type FileInfo = filesystem.FileInfo // Path, IsDir, Size, ModifiedAt +type LsInfoRequest = filesystem.LsInfoRequest // Path string +type ReadRequest = filesystem.ReadRequest // FilePath, Offset, Limit +type WriteRequest = filesystem.WriteRequest // FilePath, Content string + +// 
DeleteRequest 是 plantask 包自定义的(filesystem 包无此类型) +type DeleteRequest struct { + FilePath string +} +``` + +> 💡 +> 注意 `Read` 返回 `*filesystem.FileContent`(含 `Content string` 字段),不是裸 string。导入路径为 `github.com/cloudwego/eino/adk/filesystem`。 + --- ## 任务结构 ```go type task struct { - ID string `json:"id"` // 任务 ID - Subject string `json:"subject"` // 标题 - Description string `json:"description"` // 描述 - Status string `json:"status"` // 状态 - Blocks []string `json:"blocks"` // 阻塞哪些任务 - BlockedBy []string `json:"blockedBy"` // 被哪些任务阻塞 - ActiveForm string `json:"activeForm"` // 进行时文案 - Owner string `json:"owner"` // 负责 agent - Metadata map[string]any `json:"metadata"` // 自定义数据 + ID string `json:"id"` + Subject string `json:"subject"` + Description string `json:"description"` + Status string `json:"status"` + Blocks []string `json:"blocks"` + BlockedBy []string `json:"blockedBy"` + ActiveForm string `json:"activeForm,omitempty"` + Owner string `json:"owner,omitempty"` + Metadata map[string]any `json:"metadata,omitempty"` } ``` ### 状态 - - + + - +
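-| Status | Description |
+| Status value | Description |
 |---|---|
-| `pending` | Pending (default) |
+| `pending` | Pending (default on creation) |
 | `in_progress` | In progress |
 | `completed` | Completed |
-| `deleted` | Deleted (the task file is removed) |
+| `deleted` | Deleted (the JSON file is physically removed and the ID is removed from other tasks' dependency lists) |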
    -状态流转:`pending` → `in_progress` → `completed`,任何状态都可以直接 `deleted`。 +状态流转:`pending` → `in_progress` → `completed`;任何状态均可直接设为 `deleted`。 --- -## 工具 +## 工具参数 ### TaskCreate -创建任务。 +工具名常量:`TaskCreateToolName = "TaskCreate"` - - - - + + + +
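 | Parameter | Type | Required | Description |
 |---|---|---|---|
-| `subject` | string | Yes | Title |
-| `description` | string | Yes | Description |
-| `activeForm` | string | No | Present-progressive text, for example "Running tests" |
-| `metadata` | object | No | Custom data |
+| `subject` | string | Yes | Task title (imperative form) |
+| `description` | string | Yes | Detailed task description, including context and acceptance criteria |
+| `activeForm` | string | No | Present-progressive text (e.g. "Running tests"), shown to the user while the task is `in_progress` |
+| `metadata` | object | No | Custom key-value pairs |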
    -什么时候用: - -- 任务比较复杂,有 3 步以上 -- 用户给了一堆事情要做 -- 需要让用户看到进度 - -什么时候不用: - -- 就一个简单任务 -- 三两下就能搞定的事 +创建后任务 ID 自动递增(基于 `.highwatermark` 文件),状态初始为 `pending`。 ### TaskGet -查看任务详情。 +工具名常量:`TaskGetToolName = "TaskGet"` - +
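 | Parameter | Type | Required | Description |
 |---|---|---|---|
-| `taskId` | string | Yes | Task ID |
+| `taskId` | string | Yes | Task ID (a string of digits only) |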
    -返回任务的完整信息:标题、描述、状态、依赖关系等。 +返回任务的完整信息:subject、description、status、blocks、blockedBy、owner。 ### TaskUpdate -更新任务。 +工具名常量:`TaskUpdateToolName = "TaskUpdate"` - - - - - - + + + + + +
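 | Parameter | Type | Required | Description |
 |---|---|---|---|
 | `taskId` | string | Yes | Task ID |
 | `subject` | string | No | New title |
 | `description` | string | No | New description |
-| `activeForm` | string | No | The new present-progressive text |
-| `status` | string | No | New status |
-| `addBlocks` | []string | No | Add tasks that are blocked by this task |
-| `addBlockedBy` | []string | No | Add tasks that block this task |
-| `owner` | string | No | Responsible agent |
-| `metadata` | object | No | Custom data (set to null to delete) |
+| `activeForm` | string | No | New present-progressive text |
+| `status` | string | No | New status; enum: `pending` / `in_progress` / `completed` / `deleted` |
+| `addBlocks` | []string | No | IDs of tasks blocked by the current task (written in both directions) |
+| `addBlockedBy` | []string | No | IDs of tasks that block the current task (written in both directions) |
+| `owner` | string | No | Name of the responsible agent |
+| `metadata` | object | No | Merged into the existing metadata; setting a key to null deletes that key |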
    -注意: +关键行为: -- `status: "deleted"` 会直接删掉任务文件 -- 加依赖时会检查循环依赖 -- 所有任务都完成后会自动清理 +- `status: "deleted"` 会物理删除任务文件,并从所有其他任务的 blocks/blockedBy 中移除该 ID +- 添加依赖时会进行**循环依赖检测**,若形成环则报错 +- 当**所有任务均为 completed** 时,自动删除全部任务文件(清理机制) ### TaskList -列出所有任务,不需要参数。 +工具名常量:`TaskListToolName = "TaskList"` -返回每个任务的摘要:ID、状态、标题、负责 agent、依赖关系。 +无参数。返回所有任务的摘要列表(按 ID 排序),每条格式为: + +``` +#ID [status] subject [owner: xxx] [blocked by #x, #y] +``` --- @@ -191,8 +211,7 @@ type task struct { ```go ctx := context.Background() -// plantask middleware 正常情况下应该 session 维度的 -// 不同的 session 对应不同的任务列表 +// Backend 应该是 session 维度隔离的 middleware, err := plantask.New(ctx, &plantask.Config{ Backend: myBackend, BaseDir: "/tasks", @@ -213,39 +232,40 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ 1. 收到复杂任务 │ ▼ -2. TaskCreate 创建任务 +2. TaskCreate 创建多个子任务 - #1: 分析需求 - - #2: 写代码 + - #2: 实现代码 + - #3: 编写测试 │ ▼ 3. TaskUpdate 设置依赖 - - #2 依赖 #1 - - #3 依赖 #2 + - #2 addBlockedBy: ["1"] + - #3 addBlockedBy: ["2"] │ ▼ -4. TaskList 看看有啥任务 +4. TaskList 查看可用任务 │ ▼ -5. TaskUpdate 开始干活 - - #1 改成 in_progress +5. TaskUpdate #1 → in_progress │ ▼ -6. 干完了 TaskUpdate - - #1 改成 completed +6. 完成后 TaskUpdate #1 → completed │ ▼ 7. 循环 4-6 直到全部完成 │ ▼ -8. 自动清理 +8. 
全部 completed → 自动清理所有文件 ``` --- ## 依赖管理 -- **blocks**: 我完成了,这些任务才能开始 -- **blockedBy**: 这些任务完成了,我才能开始 +- **blocks**: "我完成了,这些任务才能开始" +- **blockedBy**: "这些任务完成了,我才能开始" + +依赖写入是**双向**的:对 Task A 执行 `addBlocks: ["2"]`,会同时在 Task #2 的 `blockedBy` 中写入 A 的 ID。 ``` Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) @@ -253,33 +273,32 @@ Task #1 (blocks: ["2"]) ────► Task #2 (blockedBy: ["1"]) #1 完成后 #2 才能开始 ``` -循环依赖会报错: +循环依赖检测通过 DFS 可达性判断实现: ``` #1 blocks #2 -#2 blocks #1 ← 不行,循环了 +#2 blocks #1 ← 报错:would create a cyclic dependency ``` --- -## 自动清理 - -所有任务都 `completed` 后,会自动把任务文件都删掉。 - ---- - -## 注意事项 +## 实现细节 -- 任务文件以 JSON 格式存储在 `BaseDir` 目录下,文件名为 `{id}.json` -- `.highwatermark` 文件用于记录已分配的最大任务 ID,确保 ID 不重复 -- 所有工具操作都有互斥锁保护,并发安全 -- 工具的 description 里已经包含了详细的使用指南,Agent 会根据这些指南来使用工具 + + + + + + + + +
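+| Mechanism | Description |
+|---|---|
+| ID allocation | The `.highwatermark` file stores the current maximum ID; creating a task increments it by 1 |
+| Concurrency safety | The four tools share one `sync.Mutex`, so calls on the same middleware instance run serially |
+| File format | One `{id}.json` file per task; JSON serialization uses `sonic` |
+| Auto cleanup | After TaskUpdate marks a task as completed, it checks whether all tasks are completed and, if so, deletes them in bulk |
+| ID validation | Digits-only regular expression `^\d+$` |
+| Delete cascade | Deleting a task walks all task files and removes references to that ID |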
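The ID rules in the table can be illustrated with a short sketch. It rests on assumptions: `isValidTaskID` and `nextTaskID` are hypothetical helpers, the `.highwatermark` content is passed in as a string instead of being read through a Backend, and treating empty content as "no IDs allocated yet" is a guess rather than documented behaviour.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// taskIDPattern is the digits-only pattern named in the table.
var taskIDPattern = regexp.MustCompile(`^\d+$`)

// isValidTaskID reports whether id consists of digits only.
func isValidTaskID(id string) bool {
	return taskIDPattern.MatchString(id)
}

// nextTaskID takes the text stored in .highwatermark (the largest ID
// allocated so far) and returns the next ID to assign.
func nextTaskID(highwatermark string) (string, error) {
	s := strings.TrimSpace(highwatermark)
	if s == "" {
		// Assumption: empty content means no task has been created yet.
		return "1", nil
	}
	n, err := strconv.Atoi(s)
	if err != nil {
		return "", fmt.Errorf("invalid highwatermark %q: %w", s, err)
	}
	return strconv.Itoa(n + 1), nil
}
```

Because the counter only moves forward, an ID is not handed out twice even after the task file with the highest ID has been deleted.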
    --- ## 多语言支持 -工具的 description 支持中英文切换,通过 `adk.SetLanguage()` 设置: +工具的 description 支持中英文双语,通过全局设置切换: ```go // 使用中文 description @@ -289,4 +308,4 @@ adk.SetLanguage(adk.LanguageChinese) adk.SetLanguage(adk.LanguageEnglish) ``` -这个设置是全局的,会影响所有 ADK 内置的 prompt 和工具 description。 +此设置影响所有 ADK 内置的 prompt 和工具 description。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md index 2753ec9ae44..275cfb3f391 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Skill.md @@ -1,17 +1,17 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Skill weight: 3 --- -Skill Middleware 为 Eino ADK Agent 提供了 Skill 支持,使 Agent 能够动态发现和使用预定义的技能来更准确、高效地完成任务。 +Skill Middleware 为 Eino ADK Agent 提供 Skill 支持,使 Agent 能够动态发现和使用预定义的技能来完成任务。 # 什么是 Skill -Skill 是包含指令、脚本和资源的文件夹,Agent 可以按需发现和使用这些 Skill 来扩展自身能力。 Skill 的核心是一个 `SKILL.md` 文件,包含元数据(至少需要 name 和 description)和指导 Agent 执行特定任务的说明。 +Skill 是包含指令、脚本和资源的文件夹,Agent 可以按需发现和使用这些 Skill 来扩展自身能力。核心是 `SKILL.md` 文件,包含元数据(至少需要 name 和 description)和指导 Agent 执行任务的说明。 ``` my-skill/ @@ -23,9 +23,11 @@ my-skill/ Skill 使用**渐进式展示(Progressive Disclosure)**来高效管理上下文: -1. **发现(Discovery)**:启动时,Agent 仅加载每个可用 Skill 的名称和描述,足以判断何时可能需要使用该 Skill -2. **激活****(Activation)**:当任务匹配某个 Skill 的描述时,Agent 将完整的 `SKILL.md` 内容读入上下文 -3. **执行(Execution)**:Agent 遵循指令执行任务,也可以根据需要加载其他文件或执行捆绑的代码这种方式让 Agent 保持快速响应,同时能够按需访问更多上下文。 + + +1. **发现(Discovery)**:Agent 仅加载每个可用 Skill 的 name 和 description,足以判断何时可能需要使用该 Skill +2. **激活(Activation)**:当任务匹配某个 Skill 时,Agent 将完整的 `SKILL.md` 内容读入上下文 +3. 
**执行(Execution)**:Agent 遵循指令执行任务,按需加载其他文件或执行捆绑代码 > 💡 > Ref: [https://agentskills.io/home](https://agentskills.io/home) @@ -34,7 +36,7 @@ Skill 使用**渐进式展示(Progressive Disclosure)**来高效管理上下 ## FrontMatter -Skill 的元数据结构,用于在发现阶段快速展示 Skill 信息,避免加载完整内容: +Skill 的元数据结构,从 SKILL.md 的 YAML frontmatter 中解析。用于在发现阶段快速展示 Skill 信息: ```go type FrontMatter struct { @@ -48,14 +50,14 @@ type FrontMatter struct { - - - - - + + + + +
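 | Field | Type | Description |
 |---|---|---|
-| `Name` | `string` | Unique identifier of the Skill. The Agent calls the Skill by this name; a short, meaningful name is recommended (e.g. `pdf-processing`, `web-research`). Corresponds to the `name` field in the SKILL.md frontmatter |
-| `Description` | `string` | Description of what the Skill does. This is the key basis on which the Agent decides whether to use the Skill, so it should clearly state the scenarios and capabilities the Skill applies to. Corresponds to the `description` field in the SKILL.md frontmatter |
-| `Context` | `ContextMode` | Context mode. Allowed values: `fork_with_context` (copies the message history and creates a new Agent to execute), `fork` (creates a new Agent with an isolated context to execute). Leave empty for inline mode (the Skill content is returned directly) |
-| `Agent` | `string` | Name of the Agent to use. Used together with the `Context` field; the corresponding Agent factory function is obtained through `AgentHub`. When empty, the default Agent is used |
-| `Model` | `string` | Name of the model to use. The model instance is obtained through `ModelHub`. In Context mode it is passed to the Agent factory; in inline mode it switches the model used by subsequent ChatModel calls |
+| `Name` | `string` | Unique identifier of the Skill. A short, meaningful name is recommended (e.g. `pdf-processing`, `web-research`) |
+| `Description` | `string` | Description of what the Skill does. It is the key basis on which the Agent decides whether to use the Skill, so it should clearly state the applicable scenarios and capabilities |
+| `Context` | `ContextMode` | Context mode. Allowed values: `fork` (isolated context), `fork_with_context` (copies the message history). Leave empty for inline mode |
+| `Agent` | `string` | Name of the Agent to use, together with `Context`; the corresponding Agent is obtained through `AgentHub`. When empty, the default Agent is used |
+| `Model` | `string` | Name of the model to use; the corresponding model instance is obtained through `ModelHub` |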
    -### ContextMode 上下文模式 +### ContextMode ```go const ( @@ -67,13 +69,13 @@ const ( - - + +
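 | Mode | Description |
 |---|---|
 | Inline (default) | The Skill content is returned directly as the tool result, and the current Agent continues processing |
-| ForkWithContext | Creates a new Agent, copies the current conversation history, runs the Skill task independently, then returns the result |
-| Fork | Creates a new Agent with an isolated context (containing only the Skill content), runs independently, then returns the result |
+| `fork_with_context` | Creates a new Agent, copies the current conversation history, runs the Skill task independently, then returns the result |
+| `fork` | Creates a new Agent with an isolated context (only the Skill content), runs independently, then returns the result |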
    ## Skill -完整的 Skill 结构,包含元数据和实际指令内容: +完整的 Skill 结构,包含元数据和指令内容: ```go type Skill struct { @@ -85,18 +87,14 @@ type Skill struct { - - - + + +
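 | Field | Type | Description |
 |---|---|---|
-| `FrontMatter` | `FrontMatter` | Embedded metadata struct containing `Name`, `Description`, `Context`, `Agent`, `Model` |
-| `Content` | `string` | Body of the SKILL.md file after the frontmatter. It contains the Skill's detailed instructions, workflows, examples, and so on; the Agent reads it after activating the Skill |
-| `BaseDirectory` | `string` | Absolute path of the Skill directory. The Agent can use this path to access other resource files in the Skill directory (scripts, templates, reference documents, etc.) |
+| `FrontMatter` | `FrontMatter` | Embedded metadata struct |
+| `Content` | `string` | Body of SKILL.md after the frontmatter, containing detailed instructions, workflows, examples, and so on |
+| `BaseDirectory` | `string` | Absolute path of the Skill directory; the Agent can use it to access other resource files in that directory |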
    ## Backend -Skill 后端接口,定义了技能的检索方式。Backend 接口将技能的存储与使用解耦,提供以下优势: - -- **灵活的存储方式**:技能可以存储在本地文件系统、数据库、远程服务、云存储等任意位置 -- **可扩展性**:团队可以根据需求实现自定义 Backend,如从 Git 仓库动态加载、从配置中心获取等 -- **测试友好**:可以轻松创建 Mock Backend 进行单元测试 +Skill 后端接口,将技能的存储与使用解耦: ```go type Backend interface { @@ -107,13 +105,13 @@ type Backend interface { - - + +
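 | Method | Description |
 |---|---|
-| `List` | Lists the metadata of all available skills. Called when the Agent starts, to build the skill tool's description so that the Agent knows which skills are available |
-| `Get` | Returns the full skill content by name. Called when the Agent decides to use a skill; returns the complete Skill struct including the detailed instructions |
+| `List` | Lists the metadata of all available skills. Called at Agent startup to build the skill tool's description |
+| `Get` | Returns the full skill content by name. Called when the Agent decides to use a skill |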
    -### **NewBackendFromFilesystem** +### NewBackendFromFilesystem -基于 `filesystem.Backend` 接口的后端实现,在指定的目录下读取技能: +基于 `filesystem.Backend` 接口的后端实现,扫描指定目录下的一级子目录读取技能: ```go type BackendFromFilesystemConfig struct { @@ -126,8 +124,8 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - - + +
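 | Field | Type | Required | Description |
 |---|---|---|---|
-| `Backend` | `filesystem.Backend` | Yes | Filesystem backend implementation, used for file operations |
-| `BaseDir` | `string` | Yes | Path of the skills root directory. All first-level subdirectories under it are scanned, and those containing a `SKILL.md` file are treated as skills |
+| `Backend` | `filesystem.Backend` | Yes | Filesystem backend implementation, used for file operations |
+| `BaseDir` | `string` | Yes | Path of the skills root directory. First-level subdirectories under it are scanned for directories containing a `SKILL.md` file |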
    工作方式: @@ -137,303 +135,143 @@ func NewBackendFromFilesystem(ctx context.Context, config *BackendFromFilesystem - 解析 YAML frontmatter 获取元数据 - 深层嵌套的 `SKILL.md` 文件会被忽略 -### **filesystem.Backend 实现** - -`filesystem.Backend` 接口有以下两种实现可供选择,详见 [Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) +`filesystem.Backend` 接口有两种实现可供选择,详见 FileSystem Backend 文档。 ## AgentHub 和 ModelHub -当 Skill 使用 Context 模式(fork/isolate)时,需要配置 AgentHub 和 ModelHub: +当 Skill 使用 Context 模式(fork / fork\_with\_context)时,需要通过 AgentHub 和 ModelHub 提供 Agent 实例和模型实例。 + +> 💡 +> 以下展示非泛型别名类型(即 `*schema.Message` 特化)。泛型版本 `TypedAgentHub[M]`、`TypedModelHub[M]` 可用于 `*schema.AgenticMessage` 场景,接口签名一致,仅消息类型参数不同。 ```go -// AgentHubOptions contains options passed to AgentHub.Get when creating an agent for skill execution. -type AgentHubOptions struct { - // Model is the resolved model instance when a skill specifies a "model" field in frontmatter. - // nil means the skill did not specify a model override; implementations should use their default. - Model model.ToolCallingChatModel +// AgentHubOptions 传递给 AgentHub.Get 的选项 +type AgentHubOptions = TypedAgentHubOptions[*schema.Message] + +type TypedAgentHubOptions[M adk.MessageType] struct { + // Model 为技能 frontmatter 中指定的模型实例(通过 ModelHub 解析)。 + // nil 表示技能未指定模型覆盖,实现方应使用默认模型。 + Model model.BaseModel[M] } -// AgentHub provides agent instances for context mode (fork/fork_with_context) execution. -type AgentHub interface { - // Get returns an Agent by name. When name is empty, implementations should return a default agent. - // The opts parameter carries skill-level overrides (e.g., model) resolved by the framework. 
- Get(ctx context.Context, name string, opts *AgentHubOptions) (adk.Agent, error) +// AgentHub 为 Context 模式提供 Agent 实例 +type AgentHub = TypedAgentHub[*schema.Message] + +type TypedAgentHub[M adk.MessageType] interface { + // Get 根据名称返回 Agent。name 为空时应返回默认 Agent。 + Get(ctx context.Context, name string, opts *TypedAgentHubOptions[M]) (adk.TypedAgent[M], error) } -// ModelHub 提供模型实例 -type ModelHub interface { - Get(ctx context.Context, name string) (model.ToolCallingChatModel, error) +// ModelHub 根据名称解析模型实例 +type ModelHub = TypedModelHub[*schema.Message] + +type TypedModelHub[M adk.MessageType] interface { + Get(ctx context.Context, name string) (model.BaseModel[M], error) } ``` -### +> 💡 +> 注意:`AgentHubOptions.Model` 和 `ModelHub.Get` 的返回类型为 `model.BaseModel[M]`,而非旧版文档中的 `model.ToolCallingChatModel`。 -## 初始化 +## SubAgentInput 和 SubAgentOutput -创建 Skill Middleware(推荐使用 `NewMiddleware`): +这两个结构体在自定义 fork 模式行为时使用: ```go -func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +type SubAgentInput = TypedSubAgentInput[*schema.Message] + +type TypedSubAgentInput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string // 原始 JSON 参数 + SkillContent string // 构建好的 Skill 内容 + History []M // 对话历史(仅 fork_with_context 模式) + ToolCallID string // 工具调用 ID(仅 fork_with_context 模式) +} + +type SubAgentOutput = TypedSubAgentOutput[*schema.Message] + +type TypedSubAgentOutput[M adk.MessageType] struct { + Skill Skill + Mode ContextMode + RawArguments string + Messages []M // 子 Agent 产生的所有消息 + Results []string // 提取的 assistant 消息文本内容 +} ``` -Config 中配置为: +# 初始化 + +## Config ```go -type Config struct { - // Backend 技能后端实现,必填 - Backend Backend - - // SkillToolName 技能工具名称,默认 "skill" - SkillToolName *string - - // AgentHub 提供 Agent 工厂函数,用于 Context 模式 - // 当 Skill 使用 "context: fork" 或 "context: isolate" 时必填 - AgentHub AgentHub - - // ModelHub 提供模型实例,用于 Skill 指定模型 - ModelHub ModelHub - - // CustomSystemPrompt 自定义系统提示词 - 
CustomSystemPrompt SystemPromptFunc - - // CustomToolDescription 自定义工具描述 +type Config = TypedConfig[*schema.Message] + +type TypedConfig[M adk.MessageType] struct { + Backend Backend + SkillToolName *string + AgentHub TypedAgentHub[M] + ModelHub TypedModelHub[M] + + CustomSystemPrompt SystemPromptFunc CustomToolDescription ToolDescriptionFunc + CustomToolParams func(ctx context.Context, defaults map[string]*schema.ParameterInfo) (map[string]*schema.ParameterInfo, error) + BuildContent func(ctx context.Context, skill Skill, rawArgs string) (string, error) + BuildForkMessages func(ctx context.Context, in TypedSubAgentInput[M]) ([]M, error) + FormatForkResult func(ctx context.Context, in TypedSubAgentOutput[M]) (string, error) } ``` - - - - - - + + + + + + + + + +
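 | Field | Type | Required | Default | Description |
 |---|---|---|---|---|
-| `Backend` | `Backend` | Yes | - | Skill backend implementation. Responsible for storing and retrieving skills; the built-in `LocalBackend` or a custom implementation can be used |
-| `SkillToolName` | `*string` | No | `"skill"` | Name of the skill tool. The Agent calls the skill tool by this name. If your Agent already has a tool with the same name, set a custom name here to avoid the conflict |
-| `AgentHub` | `AgentHub` | No | - | Provides Agent factory functions. Required when a Skill uses `context: fork` or `context: isolate` |
-| `ModelHub` | `ModelHub` | No | - | Provides model instances. Used when a Skill specifies the `model` field |
-| `CustomSystemPrompt` | `SystemPromptFunc` | No | Built-in prompt | Custom system prompt function |
-| `CustomToolDescription` | `ToolDescriptionFunc` | No | Built-in description | Custom tool description function |
+| `Backend` | `Backend` | Yes | - | Skill backend implementation, responsible for storing and retrieving skills |
+| `SkillToolName` | `*string` | No | `"skill"` | Name of the skill tool. If a tool with the same name already exists, set a custom name to avoid the conflict |
+| `AgentHub` | `TypedAgentHub[M]` | No | - | Provides Agent instances. Required when using `context: fork` or `fork_with_context` |
+| `ModelHub` | `TypedModelHub[M]` | No | - | Provides model instances. In Context mode the model is passed to AgentHub; in inline mode WrapModel switches the model used by subsequent ChatModel calls |
+| `CustomSystemPrompt` | `SystemPromptFunc` | No | Built-in prompt | Custom system prompt. Signature: `func(ctx, toolName) string` |
+| `CustomToolDescription` | `ToolDescriptionFunc` | No | Built-in description | Custom tool description. Signature: `func(ctx, skills []FrontMatter) string` |
+| `CustomToolParams` | `func` | No | `skill` parameter | Custom tool parameter schema. Receives the default parameters and returns the customized ones; `skill` is always kept as required |
+| `BuildContent` | `func` | No | Default formatting | Custom Skill content generation; extra context can be injected into the content |
+| `BuildForkMessages` | `func` | No | See below | Customizes the initial messages passed to the sub-Agent in fork mode. Defaults: `fork` uses `[UserMessage(content)]`, `fork_with_context` uses `[history..., ToolMessage(content, callID)]` |
+| `FormatForkResult` | `func` | No | Concatenated content | Customizes how the sub-Agent result is formatted. By default, the contents of the assistant messages are concatenated and returned |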
    -# 快速开始 - -以从本地加载 pdf skill 为例, 完整代码见 [https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill](https://github.com/cloudwego/eino-examples/tree/main/adk/middlewares/skill)。 - -- 在工作目录中创建 skills 目录: +## NewMiddleware ```go -workdir/ -├── skills/ -│ └── pdf/ -│ ├── scripts -│ │ └── analyze.py -│ └── SKILL.md -└── other files +func NewMiddleware(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) ``` -- 创建本地 filesystem backend,基于 backend 创建 Skill middleware: +创建 Skill Middleware,返回 `adk.ChatModelAgentMiddleware`,传入 `ChatModelAgentConfig.Handlers` 使用。 -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/skill" - "github.com/cloudwego/eino-ext/adk/backend/local" -) +> 💡 +> 泛型版本 `NewTyped[M](ctx, config)` 返回 `adk.TypedChatModelAgentMiddleware[M]`,可用于 `*schema.AgenticMessage` 类型的 Agent。 -ctx := context.Background() +## 使用示例 -be, err := local.NewBackend(ctx, &local.Config{}) +```go +// 1. 创建 Backend +backend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ + Backend: fsBackend, + BaseDir: "/path/to/skills", +}) if err != nil { - log.Fatal(err) + return err } -skillBackend, err := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesystemConfig{ - Backend: be, - BaseDir: skillsDir, +// 2. 创建 Middleware +handler, err := skill.NewMiddleware(ctx, &skill.Config{ + Backend: backend, + AgentHub: myAgentHub, // 可选,仅 fork 模式需要 + ModelHub: myModelHub, // 可选,仅使用 model 字段时需要 }) if err != nil { - log.Fatalf("Failed to create skill backend: %v", err) + return err } -sm, err := skill.NewMiddleware(ctx, &skill.Config{ - Backend: skillBackend, -}) -``` - -- 基于 backend 创建本地 Filesystem Middleware,供 agent 读取 skill 其他文件以及执行脚本: - -```go -import ( - "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -fsm, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: be, - StreamingShell: be, -}) -``` - -- 创建 Agent 并配置 middlewares - -```go +// 3. 
传入 Agent 的 Handlers agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LogAnalysisAgent", - Description: "An agent that can analyze logs", - Instruction: "You are a helpful assistant.", - Model: cm, - Handlers: []adk.ChatModelAgentMiddleware{fsm, sm}, + // ... 其他配置 + Handlers: []adk.ChatModelAgentMiddleware{handler}, }) ``` - -- 调用 Agent,观察结果 - -```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: agent, -}) - -input := fmt.Sprintf("Analyze the %s file", filepath.Join(workDir, "test.log")) -log.Println("User: ", input) - -iterator := runner.Query(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - if event.Err != nil { - log.Printf("Error: %v\n", event.Err) - break - } - - prints.Event(event) -} -``` - -agent 输出: - -```yaml -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: skill -arguments: {"skill":"log_analyzer"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Launching skill: log_analyzer -Base directory for this skill: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer -# SKILL.md content - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool name: execute -arguments: {"command": "python3 /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/skills/log_analyzer/scripts/analyze.py /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log"} - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -tool response: Analysis Result for /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/middlewares/skill/workdir/test.log: -Total Errors: 2 -Total Warnings: 1 - -Error Details: -Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -Line 5: [2024-05-20 10:03:05] ERROR: Connection timed out. - -Warning Details: -Line 2: [2024-05-20 10:01:23] WARNING: High memory usage detected. 
- - -name: LogAnalysisAgent -path: [{LogAnalysisAgent}] -answer: Here's the analysis result of the log file: - -### Summary -- **Total Errors**: 2 -- **Total Warnings**: 1 - -### Detailed Entries -#### Errors: -1. Line 3: [2024-05-20 10:02:15] ERROR: Database connection failed. -2. Line5: [2024-05-2010:03:05] ERROR: Connection timed out. - -#### Warnings: -1. Line2: [2024-05-2010:01:23] WARNING: High memory usage detected. - -The log file contains critical issues related to database connectivity and a warning about memory usage. Let me know if you need further analysis! -``` - -# 原理 - -Skill middleware 向 Agent 增加 system prompt 与 skill tool,system prompt 内容如下,{tool_name} 为 skill 工具的工具名: - -```python -# Skills System - -**How to Use Skills (Progressive Disclosure):** - -Skills follow a **progressive disclosure** pattern - you see their name and description above, but only read full instructions when needed: - -1. **Recognize when a skill applies**: Check if the user's task matches a skill's description -2. **Read the skill's full instructions**: Use the '{tool_name}' tool to load skill -3. **Follow the skill's instructions**: tool result contains step-by-step workflows, best practices, and examples -4. **Access supporting files**: Skills may include helper scripts, configs, or reference docs - use absolute paths - -**When to Use Skills:** -- User's request matches a skill's domain (e.g., "research X" -> web-research skill) -- You need specialized knowledge or structured workflows -- A skill provides proven patterns for complex tasks - -**Executing Skill Scripts:** -Skills may contain Python scripts or other executable files. Always use absolute paths. - -**Example Workflow:** - -User: "Can you research the latest developments in quantum computing?" - -1. Check available skills -> See "web-research" skill -2. Call '{tool_name}' tool to read the full skill instructions -3. Follow the skill's research workflow (search -> organize -> synthesize) -4. 
Use any helper scripts with absolute paths - -Remember: Skills make you more capable and consistent. When in doubt, check if a skill exists for the task! -``` - -Skill 工具接收需要加载 skill name,返回对应 SKILL.md 中的完整内容,在工具描述中告知 agent 所有可使用的 skill 的 name 和 description: - -```sql -Execute a skill within the main conversation - - -When users ask you to perform tasks, check if any of the available skills below can help complete the task more effectively. Skills provide specialized capabilities and domain knowledge. - -How to invoke: -- Use this tool with the skill name only (no arguments) -- Examples: - - `skill: pdf` - invoke the pdf skill - - `skill: xlsx` - invoke the xlsx skill - - `skill: ms-office-suite:pdf` - invoke using fully qualified name - -Important: -- When a skill is relevant, you must invoke this tool IMMEDIATELY as your first action -- NEVER just announce or mention a skill in your text response without actually calling this tool -- This is a BLOCKING REQUIREMENT: invoke the relevant Skill tool BEFORE generating any other response about the task -- Only use skills listed in below -- Do not invoke a skill that is already running -- Do not use this tool for built-in CLI commands (like /help, /clear, etc.) 
- - - -{{- range .Matters }} - - -{{ .Name }} - - -{{ .Description }} - - -{{- end }} - -``` - -运行举例: - - - -> 💡 -> Skill Middleware 仅提供了如上图所示的加载 SKILL.md 能力,如果 Skill 需要 agent 具备读取文件、执行脚本等能力,需要用户另外为 agent 配置。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md index f59fe4dbc17..e8b5fce7d83 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_Summarization.md @@ -1,210 +1,343 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: Summarization weight: 4 --- +> 💡 +> 本中间件在 v0.8.0 版本引入。包路径:`github.com/cloudwego/eino/adk/middlewares/summarization` + ## 概述 -Summarization 中间件会在对话的 token 数量超过配置阈值时,自动压缩对话历史。这有助于在长对话中保持上下文连续性,同时控制在模型的 token 限制范围内。 +Summarization 中间件在对话 token 数超过阈值时自动调用摘要模型压缩对话历史,使长对话在模型上下文窗口内保持连贯。中间件挂载在 `BeforeModelRewriteState` 钩子上,每轮模型调用前检查触发条件,触发后执行:计数 → 摘要生成(含重试/降级)→ 后处理 → 替换 state。 -> 💡 -> 本中间件在 v0.8.0 版本引入。 +## 泛型体系 -## 快速开始 +本包全部核心类型和函数均提供 **Typed 泛型版本**(`M adk.MessageType`)与 **非泛型别名**(固定为 `*schema.Message`)。 -```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/summarization" -) + + + + + + + + + + + + + + + +
| Generic version | Non-generic alias (= `Typed[*schema.Message]`) |
| --- | --- |
| `TypedConfig[M]` | `Config` |
| `NewTyped[M](ctx, *TypedConfig[M])` | `New(ctx, *Config)` |
| `TypedTokenCounterFunc[M]` | `TokenCounterFunc` |
| `TypedGenModelInputFunc[M]` | `GenModelInputFunc` |
| `TypedGetFailoverModelFunc[M]` | `GetFailoverModelFunc` |
| `TypedFinalizeFunc[M]` | `FinalizeFunc` |
| `TypedCallbackFunc[M]` | `CallbackFunc` |
| `TypedUserMessageFilterFunc[M]` | `UserMessageFilterFunc` |
| `TypedPreserveUserMessages[M]` | `PreserveUserMessages` |
| `TypedRetryConfig[M]` | `RetryConfig` |
| `TypedFailoverConfig[M]` | `FailoverConfig` |
| `TypedFailoverContext[M]` | `FailoverContext` |
| `TypedFinalizerBuilder[M]` | `FinalizerBuilder` |
    -// 使用最小配置创建中间件 -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, // 必填:用于生成摘要的模型 -}) -if err != nil { - // 处理错误 -} +以下文档中如无特别说明,类型签名使用泛型形式 `M`。使用非泛型别名时 `M` = `*schema.Message`。 -// 与 ChatModelAgent 一起使用 -agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, - Middlewares: []adk.ChatModelAgentMiddleware{mw}, -}) +### 构造函数 + +```go +// 泛型版本 — 支持 *schema.Message 和 *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) + +// 非泛型版本 — 等价于 NewTyped[*schema.Message] +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) ``` -## 配置项 +## TypedConfig[M] 配置项 - - - - - - - - - - - + + + + + + + + + + + + +
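| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| Model | model.BaseChatModel | Yes | - | Chat model used to generate the summary |
| ModelOptions | []model.Option | No | - | Options passed to the model when generating the summary |
| TokenCounter | TokenCounterFunc | No | ~4 characters per token | Custom token counting function |
| Trigger | *TriggerCondition | No | 190,000 tokens | Condition that triggers summarization |
| UserInstruction | string | No | Built-in prompt | Custom summarization instruction |
| TranscriptFilePath | string | No | - | Path to the full conversation transcript file |
| GenModelInput | GenModelInputFunc | No | - | Custom preprocessing function for the summary model input |
| Finalize | FinalizeFunc | No | - | Custom post-processing function for the final messages |
| Callback | CallbackFunc | No | - | Called after Finalize to observe state changes (read-only) |
| EmitInternalEvents | bool | No | false | Whether to emit internal events |
| PreserveUserMessages | *PreserveUserMessages | No | Enabled: true | Whether to preserve the original user messages in the summary |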
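
| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `Model` | `model.BaseModel[M]` | | Model used to generate the summary |
| `ModelOptions` | `[]model.Option` | | Options passed to the summary model |
| `TokenCounter` | `TypedTokenCounterFunc[M]` | Uses the `total_tokens` of the most recent assistant message as the baseline and estimates later messages at ~4 characters per token | Custom token counting function |
| `Trigger` | `*TriggerCondition` | `ContextTokens=160,000` | Condition that triggers summarization |
| `UserInstruction` | `string` | Built-in prompt | Custom user-level summarization instruction; overrides the default instruction |
| `TranscriptFilePath` | `string` | | Path to the full conversation transcript file, appended to the summary to remind the model where the original context lives. Only takes effect when `Finalize` is not set |
| `GenModelInput` | `TypedGenModelInputFunc[M]` | sysInstruction → contextMsgs → userInstruction | Full control over how the summary model input is built |
| `Finalize` | `TypedFinalizeFunc[M]` | Built-in post-processing | Custom post-processing of the summary. Once set, the middleware no longer runs any default post-processing |
| `Callback` | `TypedCallbackFunc[M]` | | Called after Finalize with `before, after adk.TypedChatModelAgentState[M]` (passed by value); read-only |
| `EmitInternalEvents` | `bool` | `false` | Whether to emit internal events at key points |
| `PreserveUserMessages` | `*TypedPreserveUserMessages[M]` | `Enabled: true` | Preserves the original user messages in the summary. Only takes effect when `Finalize` is not set |
| `Retry` | `*TypedRetryConfig[M]` | `nil` (no retry) | Retry policy for summary generation with the primary model |
| `Failover` | `*TypedFailoverConfig[M]` | `nil` | Failover policy used after the primary model fails |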
    -### TriggerCondition 结构 +> 💡 +> **Finalize 覆盖语义**:一旦设置了自定义 `Finalize`,中间件将**跳过所有默认后处理**——`PreserveUserMessages` 和 `TranscriptFilePath` 均不再生效。如需在自定义 Finalize 中复用默认后处理逻辑,请使用 `DefaultFinalizer` 函数。 + +## 子配置结构体 + +### TriggerCondition + +满足**任一**条件即触发摘要。 ```go type TriggerCondition struct { - // ContextTokens 当总 token 数量超过此阈值时触发摘要 - ContextTokens int + ContextTokens int // token 数超过此阈值时触发 + ContextMessages int // 消息数超过此阈值时触发 } ``` -### PreserveUserMessages 结构 +### TypedPreserveUserMessages\[M\] + +启用后,将摘要中 `...` 区段替换为最近的原始用户消息。 ```go -type PreserveUserMessages struct { - // Enabled 是否启用保留用户消息功能 - Enabled bool - - // MaxTokens 保留用户消息的最大 token 数 - // 只保留最近的用户消息,直到达到此限制 - // 默认为 TriggerCondition.ContextTokens 的 1/3 - MaxTokens int +type TypedPreserveUserMessages[M adk.MessageType] struct { + Enabled bool + MaxTokens int // 保留用户消息的最大 token 数;默认为 TriggerCondition.ContextTokens / 3 + Filter TypedUserMessageFilterFunc[M] // 过滤函数,返回 false 则不保留该消息 } ``` -### 配置示例 +### TypedRetryConfig[M] -**自定义 Token 阈值** +```go +type TypedRetryConfig[M adk.MessageType] struct { + MaxRetries *int // 默认 3 + ShouldRetry func(ctx context.Context, resp M, err error) bool // 默认 err != nil 时重试 + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration // 默认指数退避 + 抖动 +} +``` + +### TypedFailoverConfig[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Trigger: &summarization.TriggerCondition{ - ContextTokens: 100000, // 在 100k tokens 时触发 - }, -}) +type TypedFailoverConfig[M adk.MessageType] struct { + MaxRetries *int // 默认 3 + ShouldFailover func(ctx context.Context, resp M, err error) bool // 默认 err != nil 时降级 + BackoffFunc func(ctx context.Context, attempt int, resp M, err error) time.Duration + GetFailoverModel TypedGetFailoverModelFunc[M] // 返回 (failoverModel model.BaseModel[M], failoverModelInputMsgs []M, failoverErr error) +} ``` -**自定义 Token 计数器** +### TypedFailoverContext[M] + +传递给 `GetFailoverModel` 回调的上下文。 ```go -mw, 
err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - TokenCounter: func(ctx context.Context, input *summarization.TokenCounterInput) (int, error) { - // 使用你的 tokenizer - return yourTokenizer.Count(input.Messages) - }, -}) +type TypedFailoverContext[M adk.MessageType] struct { + Attempt int // 当前降级尝试次数,从 1 开始 + SystemInstruction M // 系统指令(中间件内部设置,不可配置) + UserInstruction M // 用户指令 + OriginalMessages []M // 原始完整对话 + LastModelResponse M // 上次尝试的模型响应 + LastErr error +} ``` -**设置对话记录文件路径** +### TypedTokenCounterInput[M] ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, +type TypedTokenCounterInput[M adk.MessageType] struct { + Messages []M + Tools []*schema.ToolInfo +} +``` + +## 函数类型签名速查 + +```go +type TypedTokenCounterFunc[M] func(ctx context.Context, input *TypedTokenCounterInput[M]) (int, error) +type TypedGenModelInputFunc[M] func(ctx context.Context, sysInstruction, userInstruction M, originalMsgs []M) ([]M, error) +type TypedGetFailoverModelFunc[M] func(ctx context.Context, failoverCtx *TypedFailoverContext[M]) (model.BaseModel[M], []M, error) +type TypedFinalizeFunc[M] func(ctx context.Context, originalMessages []M, summary M) ([]M, error) +type TypedCallbackFunc[M] func(ctx context.Context, before, after adk.TypedChatModelAgentState[M]) error +type TypedUserMessageFilterFunc[M] func(ctx context.Context, msg M) (bool, error) +``` + +## DefaultFinalizer + +`DefaultFinalizer` 是一个独立的工厂函数,返回与中间件默认后处理逻辑一致的 `TypedFinalizeFunc[M]`。当你需要在自定义 `Finalize` 中复用默认逻辑(保留用户消息、附加 transcript 路径等)时使用。 + +```go +func DefaultFinalizer[M adk.MessageType](cfg *DefaultFinalizerConfig[M]) (TypedFinalizeFunc[M], error) +``` + +### DefaultFinalizerConfig[M] + +```go +type DefaultFinalizerConfig[M adk.MessageType] struct { + PreserveUserMessages *TypedPreserveUserMessages[M] // 默认 Enabled=true,MaxTokens=30000 + TranscriptFilePath string +} +``` + +**示例**:在自定义 Finalize 中先执行默认后处理,再添加系统消息: + +```go +defaultFinalize, err := 
summarization.DefaultFinalizer[*schema.Message](&summarization.DefaultFinalizerConfig[*schema.Message]{ TranscriptFilePath: "/path/to/transcript.txt", }) +if err != nil { + // handle error +} + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: func(ctx context.Context, originalMessages []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, originalMessages, summary) + if err != nil { + return nil, err + } + // 在摘要前添加系统消息 + return append([]*schema.Message{schema.SystemMessage("your system prompt")}, msgs...), nil + }, +} ``` -**自定义 Finalize 函数** +## FinalizerBuilder + +`TypedFinalizerBuilder[M]` 提供链式 API 构建 `TypedFinalizeFunc[M]`,支持链接多个处理器(Handler)和一个可选的自定义终结器(Custom)。 ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Finalize: func(ctx context.Context, originalMessages []adk.Message, summary adk.Message) ([]adk.Message, error) { - // 自定义逻辑构建最终消息 - return []adk.Message{ - schema.SystemMessage("你的系统提示词"), - summary, - }, nil - }, -}) +func NewTypedFinalizer[M adk.MessageType]() *TypedFinalizerBuilder[M] +func NewFinalizer() *FinalizerBuilder // = NewTypedFinalizer[*schema.Message] + +func (b *TypedFinalizerBuilder[M]) PreserveSkills(config *PreserveSkillsConfig) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Custom(fn TypedFinalizeFunc[M]) *TypedFinalizerBuilder[M] +func (b *TypedFinalizerBuilder[M]) Build() (TypedFinalizeFunc[M], error) ``` -**使用 Callback 观察状态变化****/存储** +执行顺序:Handler 按注册顺序依次对 summary 进行变换 → Custom 确定最终输出消息列表。若未设置 Custom,则返回 `[]M{summary}`。 + +### PreserveSkills + +在摘要压缩后保留 Skill 中间件加载过的技能内容,确保 agent 在上下文窗口压缩后仍保留技能知识。 ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - Callback: func(ctx context.Context, before, after adk.ChatModelAgentState) error { - log.Printf("Summarization completed: %d messages -> %d messages", - len(before.Messages), len(after.Messages)) - return nil - }, -}) +type 
PreserveSkillsConfig struct { + SkillToolName string // 技能工具名,需与 Skill 中间件一致。默认 "skill" + MaxSkills *int // 最多保留技能数。默认 5;0 表示禁用 + MaxTokensPerSkill *int // 单个技能最大 token 数,超出截断。默认 5000 + SkillsTokenBudget *int // 所有技能总 token 预算。默认 25000 +} ``` -**控制用户消息保留** +**示例**: ```go -mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - PreserveUserMessages: &summarization.PreserveUserMessages{ - Enabled: true, - MaxTokens: 50000, // 保留最多 50k tokens 的用户消息 - }, -}) +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{}). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + return []*schema.Message{schema.SystemMessage("system prompt"), summary}, nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## 工作原理 +## Summarize 方法 -```mermaid -flowchart TD - A[BeforeModelRewriteState] --> B{Token 数量超过阈值?} - B -->|否| C[返回原始状态] - B -->|是| D[发送 BeforeSummarize 事件] - D --> E{有自定义 GenModelInput?} - E -->|是| F[调用 GenModelInput] - E -->|否| G[调用模型生成摘要] - F --> G - G --> H{有自定义 Finalize?} - H -->|是| I[调用 Finalize] - H -->|否| L{有自定义 Callback?} - I --> L - L -->|是| M[调用 Callback] - L -->|否| J[发送 AfterSummarize 事件] - M --> J - J --> K[返回新状态] - - style A fill:#e3f2fd - style G fill:#fff3e0 - style D fill:#e8f5e9 - style J fill:#e8f5e9 - style K fill:#c8e6c9 - style C fill:#f5f5f5 - style M fill:#fce4ec - style F fill:#fff3e0 - style I fill:#fff3e0 +`TypedMiddleware[M]` 暴露 `Summarize` 方法,可在中间件自动触发之外手动执行一次摘要: + +```go +func (m *TypedMiddleware[M]) Summarize(ctx context.Context, state *adk.TypedChatModelAgentState[M]) ([]M, error) ``` +该方法执行完整的摘要流程(生成 → 后处理 → Callback → 事件),但**不检查触发条件**。返回替换后的消息列表。 + +## 工作原理 + + + +**触发条件检查**:先检查 `ContextMessages`(消息数),再通过 `TokenCounter` 计算 token 数与 `ContextTokens` 对比。满足任一即触发。 + +**默认后处理**(未设置 Finalize 时): + +1. 将摘要中 `...` 替换为最近的原始用户消息(受 `PreserveUserMessages` 控制) +2. 
附加 `TranscriptFilePath` 提示 +3. 添加摘要前言和继续指令 + ## 内部事件 -当 EmitInternalEvents 设置为 true 时,中间件会在关键节点发送事件: +当 `EmitInternalEvents = true` 时,中间件通过 `adk.TypedSendEvent` 发送事件: - - + + +
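| Event type | When it fires | Payload |
| --- | --- | --- |
| ActionTypeBeforeSummarize | Before the summary is generated | Original message list |
| ActionTypeAfterSummarize | After summarization completes | Final message list |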
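
| Event type | When it fires | Payload |
| --- | --- | --- |
| `ActionTypeBeforeSummarize` | After the trigger condition is met, before the model is called | `TypedBeforeSummarizeAction[M]{Messages}`: original message list |
| `ActionTypeGenerateSummary` | After each model generation attempt (including retries and failover) | `TypedGenerateSummaryAction[M]{Attempt, Phase, ModelResponse, GetError()}` |
| `ActionTypeAfterSummarize` | After the summary is complete and Finalize has run | `TypedAfterSummarizeAction[M]{Messages}`: final message list |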
    -**使用示例** +事件通过 `TypedCustomizedAction[M]` 包装,放在 `adk.AgentAction.CustomizedAction` 字段中。`GenerateSummaryPhase` 有两个值:`GenerateSummaryPhasePrimary`(主模型/重试)和 `GenerateSummaryPhaseFailover`(降级)。 + +## 使用示例 + +### 最小配置 ```go mw, err := summarization.New(ctx, &summarization.Config{ - Model: yourChatModel, - EmitInternalEvents: true, + Model: yourChatModel, }) -// 在你的事件处理器中监听事件 +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: yourChatModel, + Middlewares: []adk.ChatModelAgentMiddleware{mw}, +}) +``` + +### 自定义触发条件 + 重试 + 降级 + +```go +mw, err := summarization.New(ctx, &summarization.Config{ + Model: yourChatModel, + Trigger: &summarization.TriggerCondition{ + ContextTokens: 100000, + ContextMessages: 80, + }, + TranscriptFilePath: "/path/to/transcript.txt", + Retry: &summarization.RetryConfig{ + MaxRetries: ptrOf(2), + }, + Failover: &summarization.FailoverConfig{ + MaxRetries: ptrOf(3), + GetFailoverModel: func(ctx context.Context, fctx *summarization.FailoverContext) (model.BaseModel[*schema.Message], []*schema.Message, error) { + return backupModel, nil, nil // 返回 nil input 将复用默认输入 + }, + }, +}) +``` + +### FinalizerBuilder + PreserveSkills + DefaultFinalizer + +```go +defaultFinalize, _ := summarization.DefaultFinalizer[*schema.Message]( + &summarization.DefaultFinalizerConfig[*schema.Message]{ + TranscriptFilePath: "/path/to/transcript.txt", + }, +) + +finalizer, err := summarization.NewFinalizer(). + PreserveSkills(&summarization.PreserveSkillsConfig{ + MaxSkills: ptrOf(3), + }). + Custom(func(ctx context.Context, origMsgs []*schema.Message, summary *schema.Message) ([]*schema.Message, error) { + msgs, err := defaultFinalize(ctx, origMsgs, summary) + if err != nil { + return nil, err + } + return append([]*schema.Message{schema.SystemMessage("system prompt")}, msgs...), nil + }). + Build() + +cfg := &summarization.Config{ + Model: yourModel, + Finalize: finalizer, +} ``` -## 最佳实践 +## 注意事项 -1. 
**设置 TranscriptFilePath**:建议始终提供对话记录文件路径,以便模型在需要时可以参考原始对话。 -2. **调整 Token 阈值**:根据模型的上下文窗口大小调整 `Trigger.MaxTokens`。一般建议设置为模型限制的 80-90%。 -3. **自定义 Token 计数器**:在生产环境中,建议实现与模型 tokenizer 匹配的自定义 `TokenCounter`,以获得准确的计数。 +1. **设置 TranscriptFilePath**:强烈建议提供对话记录文件路径,摘要后模型可从原始记录中回溯细节。 +2. **调整触发阈值**:`Trigger.ContextTokens` 建议设为模型上下文窗口的 80-90%。默认值 160,000 适用于 200k 窗口的模型。 +3. **自定义 TokenCounter**:生产环境建议实现与模型 tokenizer 精确匹配的计数器。默认估算器以最近 assistant 消息的 `ResponseMeta.Usage.TotalTokens` 为基线,增量消息按 ~4 字符/token 估算。 +4. **Finalize 覆盖**:设置 `Finalize` 后,`PreserveUserMessages` 和 `TranscriptFilePath` 不再自动生效。如需复用,使用 `DefaultFinalizer` 或 `FinalizerBuilder`。 +5. **GetFailoverModel 约束**:回调必须返回非 nil 的 model 和非空的 input 消息列表。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md index db4d192d2e6..96a47b99fc3 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolReduction.md @@ -1,25 +1,23 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-17" lastmod: "" tags: [] title: Reduction weight: 5 --- -# Reduction 中间件 - -adk/middlewares/reduction +`adk/middlewares/reduction` > 💡 > 本中间件在 v0.8.0 版本引入。 ## 概述 -`reduction` 中间件用来控制工具结果占用的 token 数量,提供两种策略: +`reduction` 中间件管理 Agent 对话中工具输出占用的 token 数量,分为两个阶段: -1. **截断 (Truncation)**:工具返回时立即截断过长的输出,将完整内容保存到 Backend -2. **清理 (Clear)**:总 token 超过阈值时,把旧的工具结果存到文件系统 +1. **截断(Truncation)**:工具调用返回时立即触发。单次输出超过 `MaxLengthForTrunc` 时,完整内容存入 Backend,消息替换为截断摘要。 +2. 
**清理(Clear)**:模型调用前触发(`BeforeModelRewriteState`)。总 token 超过 `MaxTokensForClear` 时,遍历历史消息,将旧的工具参数和结果卸载到 Backend。 --- @@ -30,9 +28,10 @@ Tool 调用返回结果 │ ▼ ┌─────────────────────────────────────────────────────────────┐ -│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapInvokableToolCall / WrapStreamableToolCall │ +│ WrapEnhancedInvokableToolCall / WrapEnhancedStreamable │ │ │ -│ Truncation 策略(可跳过) │ +│ Truncation(可通过 SkipTruncation 跳过) │ │ 结果长度 > MaxLengthForTrunc? │ │ 是 → 截断内容,完整内容存到 Backend │ │ 否 → 原样返回 │ @@ -45,9 +44,12 @@ Tool 调用返回结果 ┌─────────────────────────────────────────────────────────────┐ │ BeforeModelRewriteState │ │ │ -│ Clear 策略(可跳过) │ +│ Clear(可通过 SkipClear 跳过) │ │ 总 token > MaxTokensForClear? │ -│ 是 → 把旧的工具结果存到 Backend,替换成文件路径 │ +│ 是 → ClearMessageRewriter 预处理 │ +│ → 旧工具结果存到 Backend,替换为文件路径 │ +│ → ClearAtLeastTokens 最小释放量检查 │ +│ → ClearPostProcess 回调 │ │ 否 → 不处理 │ └─────────────────────────────────────────────────────────────┘ │ @@ -57,95 +59,75 @@ Tool 调用返回结果 --- -## 配置 +## 泛型体系 -### Config 主配置 +本中间件采用 ADK 标准泛型模式,同时支持 `*schema.Message` 和 `*schema.AgenticMessage`: ```go -type Config struct { - // Backend 存储后端,用于保存截断/清理的内容 - // 当 SkipTruncation 为 false 时必填 - Backend Backend - - // SkipTruncation 跳过截断阶段 - SkipTruncation bool - - // SkipClear 跳过清理阶段 - SkipClear bool - - // ReadFileToolName 读取文件的工具名 - // 内容卸载到文件后,agent 需要使用此工具读取 - // 默认 "read_file" - ReadFileToolName string +// 泛型配置,M 约束为 adk.MessageType +type TypedConfig[M adk.MessageType] struct { ... 
} - // RootDir 保存内容的根目录 - // 默认 "/tmp" - // 截断内容保存到 {RootDir}/trunc/{tool_call_id} - // 清理内容保存到 {RootDir}/clear/{tool_call_id} - RootDir string - - // MaxLengthForTrunc 触发截断的最大长度 - // 默认 50000 - MaxLengthForTrunc int +// 向后兼容别名 +type Config = TypedConfig[*schema.Message] +``` - // TokenCounter token 计数器 - // 用于判断是否需要触发清理 - // 默认使用 字符数/4 估算 - TokenCounter func(ctx context.Context, msg []adk.Message, tools []*schema.ToolInfo) (int64, error) +构造函数同样提供泛型和非泛型两种: - // MaxTokensForClear 触发清理的 token 阈值 - // 默认 30000 - MaxTokensForClear int64 +```go +func NewTyped[M adk.MessageType](ctx context.Context, config *TypedConfig[M]) (adk.TypedChatModelAgentMiddleware[M], error) +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) +``` - // ClearRetentionSuffixLimit 保留最近多少轮对话不清理 - // 默认 1 - ClearRetentionSuffixLimit int +--- - // ClearPostProcess 清理完成后的回调 - // 可用于保存或通知当前状态 - ClearPostProcess func(ctx context.Context, state *adk.ChatModelAgentState) context.Context +## 配置 - // ToolConfig 针对特定工具的配置 - // 优先级高于全局配置 - ToolConfig map[string]*ToolReductionConfig -} -``` +### TypedConfig[M] 主配置 + + + + + + + + + + + + + + + + + + + + +
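| Field | Type | Description |
| --- | --- | --- |
| `Backend` | `Backend` | Storage backend. Required when `SkipTruncation` is false; may be nil when only Clear is used and no offload is needed. |
| `SkipTruncation` | `bool` | Skips the truncation phase. |
| `SkipClear` | `bool` | Skips the clear phase. |
| `ReadFileToolName` | `string` | Name of the tool used to read offloaded content. Defaults to `"read_file"`. |
| `RootDir` | `string` | Root directory for saved content. Defaults to `"/tmp"`. Truncated content is stored at `{RootDir}/trunc/{tool_call_id}` and cleared content at `{RootDir}/clear/{tool_call_id}`. |
| `GenTruncOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom path generator for truncation offload files. When set, `RootDir` no longer applies to truncation. Useful when tool_call_id is not unique. |
| `GenClearOffloadFilePath` | `func(ctx, *ToolDetail) (string, error)` | Custom path generator for clear offload files. When set, `RootDir` no longer applies to clearing. |
| `MaxLengthForTrunc` | `int` | Maximum character length before truncation is triggered. Defaults to `50000`. |
| `TruncExcludeTools` | `[]string` | Names of tools that are never truncated. |
| `TokenCounter` | `func(ctx, []M, []*schema.ToolInfo) (int64, error)` | Token counting function. Defaults to an estimate of character count / 4. Replacing it with tiktoken-go/tokenizer is recommended. |
| `MaxTokensForClear` | `int64` | Token threshold that triggers clearing. Defaults to `160000`. |
| `ClearRetentionSuffixLimit` | `int` | Keeps the most recent N rounds of assistant messages out of clearing. Defaults to `1`. |
| `ClearAtLeastTokens` | `int64` | Minimum number of tokens a clear must free. If the target is not met, the clear is not applied (to avoid needlessly invalidating the prompt cache). Defaults to `0`. |
| `ClearExcludeTools` | `[]string` | Names of tools that are never cleared. |
| `ClearMessageRewriter` | `func(ctx, M, []M) ([]M, error)` | Message rewrite callback that runs before clearing. Its arguments are the toolCallMsg and the corresponding toolResponseMsgs. It can be used to rewrite write_file/edit_file calls into a system-reminder. Returning nil removes that group of messages. |
| `ClearPostProcess` | `func(ctx, *adk.TypedChatModelAgentState[M]) context.Context` | Callback invoked after clearing completes; can save state or send notifications. Returns the possibly updated context. |
| `ToolConfig` | `map[string]*ToolReductionConfig` | Per-tool configuration keyed by tool name; takes precedence over the global settings. |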
    ### ToolReductionConfig 工具级配置 ```go type ToolReductionConfig struct { - // Backend 此工具使用的存储后端 - Backend Backend - - // SkipTruncation 跳过此工具的截断 + Backend Backend SkipTruncation bool - - // TruncHandler 自定义截断处理器 - // 不设置时使用默认处理器 - TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) - - // SkipClear 跳过此工具的清理 - SkipClear bool - - // ClearHandler 自定义清理处理器 - // 不设置时使用默认处理器 - ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) + TruncHandler func(ctx context.Context, detail *ToolDetail) (*TruncResult, error) + SkipClear bool + ClearHandler func(ctx context.Context, detail *ToolDetail) (*ClearResult, error) } ``` +- `TruncHandler` / `ClearHandler` 为 nil 且未跳过时,使用全局默认 handler。 +- `Backend` 为该工具独立的存储后端,可覆盖全局 Backend。 + ### ToolDetail 工具详情 ```go type ToolDetail struct { - // ToolContext 工具元信息(工具名、调用 ID) - ToolContext *adk.ToolContext - - // ToolArgument 输入参数 - ToolArgument *schema.ToolArgument - - // ToolResult 输出结果 - ToolResult *schema.ToolResult + ToolContext *adk.ToolContext + ToolArgument *schema.ToolArgument + ToolResult *schema.ToolResult // 非流式 + StreamToolResult *schema.StreamReader[*schema.ToolResult] // 流式 } ``` @@ -153,23 +135,12 @@ type ToolDetail struct { ```go type TruncResult struct { - // NeedTrunc 是否需要截断 - NeedTrunc bool - - // ToolResult 截断后的工具结果 - // NeedTrunc 为 true 时必填 - ToolResult *schema.ToolResult - - // NeedOffload 是否需要卸载到存储 - NeedOffload bool - - // OffloadFilePath 卸载文件路径 - // NeedOffload 为 true 时必填 - OffloadFilePath string - - // OffloadContent 卸载内容 - // NeedOffload 为 true 时必填 - OffloadContent string + NeedTrunc bool + ToolResult *schema.ToolResult // NeedTrunc && 非流式时必填 + StreamToolResult *schema.StreamReader[*schema.ToolResult] // NeedTrunc && 流式时必填 + NeedOffload bool + OffloadFilePath string // NeedOffload 时必填 + OffloadContent string // NeedOffload 时必填 } ``` @@ -177,30 +148,26 @@ type TruncResult struct { ```go type ClearResult struct { - // NeedClear 是否需要清理 - NeedClear bool - - // 
ToolArgument 清理后的工具参数 - // NeedClear 为 true 时必填 - ToolArgument *schema.ToolArgument - - // ToolResult 清理后的工具结果 - // NeedClear 为 true 时必填 - ToolResult *schema.ToolResult - - // NeedOffload 是否需要卸载到存储 - NeedOffload bool + NeedClear bool + ToolArgument *schema.ToolArgument // NeedClear 时必填 + ToolResult *schema.ToolResult // NeedClear 时必填 + NeedOffload bool + OffloadFilePath string // NeedOffload 时必填 + OffloadContent string // NeedOffload 时必填 +} +``` - // OffloadFilePath 卸载文件路径 - // NeedOffload 为 true 时必填 - OffloadFilePath string +### Backend 接口 - // OffloadContent 卸载内容 - // NeedOffload 为 true 时必填 - OffloadContent string +```go +// 定义于 reduction/internal,通过类型别名导出 +type Backend interface { + Write(context.Context, *filesystem.WriteRequest) error } ``` +`filesystem.WriteRequest` 包含 `FilePath string` 和 `Content string` 两个字段。 + --- ## 创建中间件 @@ -208,67 +175,75 @@ type ClearResult struct { ### 基本用法 ```go -import ( - "context" - "github.com/cloudwego/eino/adk/middlewares/reduction" -) +import "github.com/cloudwego/eino/adk/middlewares/reduction" -// 使用默认配置 middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, // 必填:存储后端 + Backend: myBackend, }) -// 与 ChatModelAgent 一起使用 agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Model: yourChatModel, + Model: chatModel, Middlewares: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +### 泛型用法(AgenticMessage) + +```go +middleware, err := reduction.NewTyped[*schema.AgenticMessage](ctx, &reduction.TypedConfig[*schema.AgenticMessage]{ + Backend: myBackend, + TokenCounter: myAgenticTokenCounter, +}) + +agent, err := adk.NewTypedChatModelAgent(ctx, &adk.TypedChatModelAgentConfig[*schema.AgenticMessage]{ + Model: chatModel, + Middlewares: []adk.TypedChatModelAgentMiddleware[*schema.AgenticMessage]{middleware}, +}) +``` + ### 自定义配置 ```go -config := &reduction.Config{ +middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, RootDir: "/data/agent", MaxLengthForTrunc: 30000, 
MaxTokensForClear: 100000, ClearRetentionSuffixLimit: 2, - TokenCounter: myTokenCounter, + ClearAtLeastTokens: 10000, + TruncExcludeTools: []string{"search_tool"}, + ClearExcludeTools: []string{"read_file"}, + ClearMessageRewriter: func(ctx context.Context, toolCallMsg *schema.Message, toolResponseMsgs []*schema.Message) ([]*schema.Message, error) { + // 将 write_file 调用重写为 system-reminder + return []*schema.Message{schema.UserMessage("file written")}, nil + }, ClearPostProcess: func(ctx context.Context, state *adk.ChatModelAgentState) context.Context { log.Printf("Clear completed, messages: %d", len(state.Messages)) return ctx }, ToolConfig: map[string]*reduction.ToolReductionConfig{ - "grep": { - Backend: grepBackend, - SkipTruncation: false, - }, - "read_file": { - Backend: readFileBackend, - SkipClear: true, // 读文件工具不需要清理 - }, + "grep": {Backend: grepBackend}, + "read_file": {SkipClear: true}, }, -} - -middleware, err := reduction.New(ctx, config) +}) ``` -### 仅使用截断策略 +### 仅截断 ```go middleware, err := reduction.New(ctx, &reduction.Config{ Backend: myBackend, - SkipClear: true, // 跳过清理阶段 + SkipClear: true, }) ``` -### 仅使用清理策略 +### 仅清理 ```go middleware, err := reduction.New(ctx, &reduction.Config{ - Backend: myBackend, - SkipTruncation: true, // 跳过截断阶段 + SkipTruncation: true, + MaxTokensForClear: 100000, + // Backend 为 nil 时,清理仍会替换内容为占位符,但不执行 offload }) ``` @@ -278,29 +253,37 @@ middleware, err := reduction.New(ctx, &reduction.Config{ ### Truncation(截断) -在 `WrapInvokableToolCall` / `WrapStreamableToolCall` 中处理: +在 `WrapInvokableToolCall` / `WrapStreamableToolCall` / `WrapEnhancedInvokableToolCall` / `WrapEnhancedStreamableToolCall` 中处理: 1. 工具返回结果 -2. 调用 TruncHandler 判断是否需要截断 -3. 如需截断,将完整内容存到 Backend -4. 返回截断后的内容,包含提示文字告知 agent 完整内容的位置 +2. 检查 `TruncExcludeTools`,命中则跳过 +3. 查找 ToolConfig → 全局 defaultConfig,获取 TruncHandler +4. TruncHandler 判定:读取完整输出,检查所有 text 部分总长度是否超过 `MaxLengthForTrunc` +5. 超过则:保留首尾各 `MaxLengthForTrunc/(textParts*2)` 字符作为预览,完整内容存到 Backend +6. 
返回截断通知,告知 agent 完整内容的文件路径 + +> 💡 +> 对于流式工具,默认 TruncHandler 会等待完整流读取完毕后再决定是否截断。若需严格增量流式行为,请为该工具提供自定义 TruncHandler。 ### Clear(清理) 在 `BeforeModelRewriteState` 中处理: -1. 用 TokenCounter 计算总 token -2. 超过 MaxTokensForClear 才处理 -3. 从旧消息开始遍历,跳过已处理的和最近 ClearRetentionSuffixLimit 轮 -4. 对范围内的每个工具调用,调用 ClearHandler -5. 需要清理的,写入 Backend,把消息里的结果替换成文件路径 -6. 调用 ClearPostProcess 回调 +1. 用 `TokenCounter` 计算总 token +2. 未超过 `MaxTokensForClear` 则跳过 +3. 确定清理范围:从第一条未处理的 assistant 消息开始,到 `len(messages) - ClearRetentionSuffixLimit` 轮结束 +4. 若配置了 `ClearMessageRewriter`,先对范围内消息执行重写预处理 +5. 遍历范围内的 tool call 消息,跳过 `ClearExcludeTools` +6. 对每个 tool call 调用 ClearHandler,替换参数和结果 +7. 如设置了 `ClearAtLeastTokens`:先在副本上操作,对比清理前后 token 差值,不达标则放弃本次清理 +8. 达标后执行实际 offload 写入,更新 state.Messages +9. 调用 `ClearPostProcess` --- ## 多语言支持 -截断和清理的提示文字支持中英文,通过 `adk.SetLanguage()` 切换: +截断和清理的提示文字支持中英文自动切换: ```go adk.SetLanguage(adk.LanguageChinese) // 中文 @@ -311,7 +294,11 @@ adk.SetLanguage(adk.LanguageEnglish) // 英文(默认) ## 注意事项 -- 当 `SkipTruncation` 为 false 时,`Backend` 必须设置 -- 默认 TokenCounter 用 `字符数 / 4` 估算,对于中文不精准,建议使用 `github.com/tiktoken-go/tokenizer` 替换 -- 已处理过的消息会打标记,不会重复处理 -- `ToolConfig` 中的配置优先级高于全局配置 +- `SkipTruncation` 为 false 时,`Backend` **必须**设置 +- 默认 TokenCounter 用字符数/4 估算,建议使用 `github.com/tiktoken-go/tokenizer` 替换 +- 已处理过的消息通过 Extra 字段打标记 `_reduction_mw_processed`,不会重复处理 +- `ToolConfig` 中配置优先级高于全局;若 ToolConfig 中仅设置了 `SkipTruncation: false` 但未提供 `TruncHandler`,则回退到默认 handler +- `GenTruncOffloadFilePath` / `GenClearOffloadFilePath` 适用于 tool_call_id 不唯一的场景(如 retry),防止文件覆盖 +- `ClearMessageRewriter` 在清理范围确定后、逐工具清理前执行,适合将 write/edit 类调用压缩为简短提示 +- `ClearAtLeastTokens` 设为 0 表示只要超阈值就执行清理;大于 0 时可避免微量清理破坏 prompt cache +- Legacy API(`NewClearToolResult`、`NewToolResultMiddleware`)已废弃,建议迁移到 `New` / `NewTyped` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md 
index de9bf1ac22a..7fac9f60cb2 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/Middleware_ToolSearch.md @@ -1,26 +1,26 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-17" lastmod: "" tags: [] title: ToolSearch weight: 7 --- -# ToolSearch 中间件 - -adk/middlewares/dynamictool/toolsearch - -> 💡 -> 本中间件在 v0.8.0 版本引入。 - ## 概述 `toolsearch` 中间件实现动态工具选择。当工具库很大时,把所有工具都传给模型会撑爆上下文。这个中间件的做法是: -1. 添加一个 `tool_search` 元工具,接受正则表达式搜索工具名 +1. 添加一个 `tool_search` 元工具,接受关键字查询或直接选择来搜索工具 2. 初始时隐藏所有动态工具 -3. 模型调用 `tool_search` 后,匹配的工具才会出现在后续调用中 +3. 模型调用 `tool_search` 后,匹配的工具才会出现在后续调用中支持三种运行模式(配置层面为两个值,但 `UseModelToolSearch=true` 存在两种端到端行为): + +- **默认模式**(`UseModelToolSearch=false`):中间件自行管理工具可见性。在每次 Model 调用前通过 `BeforeModelRewriteState` 根据 `tool_search` 的调用结果过滤 `state.ToolInfos`,逐步将选中的动态工具加回模型可见列表 +- **模型原生模式 — 纯服务端检索**(`UseModelToolSearch=true`,模型自行检索 DeferredTools):中间件把动态工具移入 `state.DeferredToolInfos`,通过 `model.WithDeferredTools` 传递给模型。如果模型原生支持 server-side 工具检索(如 Claude 的 tool search),模型直接从 DeferredTools 中搜索和选择,**无需调用 tool_search tool** +- **模型原生模式 — 客户端代理检索**(`UseModelToolSearch=true`,模型通过调用 `tool_search` 发现工具):与上一模式相同的中间件配置,但模型不具备自主检索 DeferredTools 的能力,而是通过调用 `tool_search` 工具(由 `model.WithToolSearchTool` 注册),客户端的 `modelToolSearchTool` 执行搜索并返回结构化的 `ToolSearchResult`(含匹配工具的完整 ToolInfo),模型据此选择工具 + +> 💡 +> 包路径:github.com/cloudwego/eino/adk/middlewares/dynamictool/toolsearch --- @@ -31,17 +31,33 @@ Agent 初始化 │ ▼ ┌───────────────────────────────────────────┐ -│ BeforeAgent │ -│ - 注入 tool_search 工具 │ -│ - 把 DynamicTools 加到 Tools 列表 │ +│ BeforeAgent │ +│ - 注入 tool_search 工具 │ +│ - 把 DynamicTools 加到 Tools 列表 │ +│ - 模型原生模式下设置 │ +│ runCtx.ToolSearchTool │ └───────────────────────────────────────────┘ │ ▼ ┌────────────────────────────────────────────┐ -│ WrapModel │ -│ 每次 Model 调用前: │ -│ 1. 扫描消息历史,找到历史中所有 tool_search 的返回结果。 │ -│ 2. 
全量 Tools 减去未被选中的 DynamicTools,作为本次 Model 调用的工具列表。 │ +│ BeforeModelRewriteState │ +│ (每次 Model 调用前执行) │ +│ │ +│ 1. 插入 │ +│ User 消息,列出所有可搜索的工具名 │ +│ │ +│ 首次调用时(初始化): │ +│ 默认模式: │ +│ 从 ToolInfos 中移除 DynamicTools │ +│ 模型原生模式: │ +│ DynamicTools → DeferredToolInfos │ +│ ToolInfos 中移除 DynamicTools │ +│ 和 tool_search │ +│ │ +│ 后续调用(默认模式-前向选择): │ +│ 扫描消息历史,收集 tool_search 返回的 │ +│ matches,把匹配的 DynamicTools 加回 │ +│ ToolInfos │ └────────────────────────────────────────────┘ │ ▼ @@ -56,34 +72,81 @@ Agent 初始化 type Config struct { // 可动态搜索和加载的工具列表 DynamicTools []tool.BaseTool + + // 是否使用模型原生的工具搜索能力 + // + // 为 true 时,中间件将工具搜索委托给模型的原生能力。 + // + // 为 false 时(默认),中间件通过在每次 Model 调用前 + // 根据 tool_search 结果过滤工具列表来管理工具可见性。 + // 注意:这种方式可能会使模型的 KV-cache 失效 + // (因为工具列表在调用之间会变化)。 + UseModelToolSearch bool } ``` --- -## tool_search 工具 +## 构造函数 -中间件注入的工具。 +```go +// 标准构造函数,使用 *schema.Message +func New(ctx context.Context, config *Config) (adk.ChatModelAgentMiddleware, error) -**参数:** +// 泛型构造函数,支持 *schema.Message 和 *schema.AgenticMessage +func NewTyped[M adk.MessageType](ctx context.Context, config *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +## `New` 内部调用 `NewTyped[*schema.Message]`。如果你使用 `TypedChatModelAgent`(如 Agentic 模式),请直接使用 `NewTyped`。 + +## tool_search 工具 + +中间件注入的元工具。**参数:** - + +
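 | Parameter | Type | Required | Description |
 |---|---|---|---|
-| `regex_pattern` | string | Yes | Regular expression matched against tool names |
+| `query` | string | Yes | Query string used to find tools. Supports three modes: keyword search, `select:` direct selection, and `+keyword` required match |
+| `max_results` | integer | No | Maximum number of results to return (default: 5). Applies only to keyword search; direct selection is not limited by it |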
-**Returns:**
+**Query modes:**
+
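+| Mode | Syntax | Description |
+|---|---|---|
+| Keyword search | `"weather forecast"` | Matches keywords against tool names and descriptions and ranks results by relevance score. Tool names are split on camelCase boundaries and on the `_` / `__` (MCP) separators |
+| Direct selection | `"select:tool_a,tool_b"` | Selects one or more tools by exact name, comma-separated. Not limited by `max_results` |
+| Required match | `"+slack send message"` | Keywords prefixed with `+` are required: tools that do not contain them are filtered out. The remaining keywords are used for ranking |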
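
The three query modes can be sketched as a small classifier. This is a minimal illustration under assumptions, not the toolsearch implementation: `parsedQuery` and `parseQuery` are hypothetical names, and lowercasing the keywords reflects the documented case-insensitive matching.

```go
package main

import (
	"fmt"
	"strings"
)

// parsedQuery is a hypothetical representation of a tool_search query;
// it is not a type from the eino toolsearch package.
type parsedQuery struct {
	Selected []string // exact tool names from "select:a,b"
	Required []string // keywords that carried a "+" prefix
	Optional []string // remaining keywords, used only for ranking
}

// parseQuery classifies a query string into the three documented modes.
func parseQuery(q string) parsedQuery {
	var p parsedQuery
	q = strings.TrimSpace(q)

	// Direct selection: everything after "select:" is a comma-separated name list.
	if strings.HasPrefix(q, "select:") {
		for _, name := range strings.Split(strings.TrimPrefix(q, "select:"), ",") {
			if name = strings.TrimSpace(name); name != "" {
				p.Selected = append(p.Selected, name)
			}
		}
		return p
	}

	// Keyword search: "+" marks a required keyword, the rest only affect ranking.
	for _, term := range strings.Fields(strings.ToLower(q)) {
		if strings.HasPrefix(term, "+") {
			if kw := strings.TrimPrefix(term, "+"); kw != "" {
				p.Required = append(p.Required, kw)
			}
			continue
		}
		p.Optional = append(p.Optional, term)
	}
	return p
}

func main() {
	for _, q := range []string{"weather forecast", "select:tool_a,tool_b", "+slack send message"} {
		fmt.Printf("%q -> %+v\n", q, parseQuery(q))
	}
}
```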
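+
+**Return value (default mode):**
 
 ```json
-{
-  "selectedTools": ["tool_a", "tool_b"]
-}
+{"matches": ["tool_a", "tool_b"]}
 ```
 
----
+**Return value (model-native mode):** returns a structured `schema.ToolResult` that contains the full `ToolInfo` of each matching tool, for the model to process natively.
+
+## Keyword search scoring
+
+Keyword search uses a multi-tier scoring system: for each keyword it computes the highest applicable score, then sums the scores across keywords:
+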
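+| Matching rule | Score |
+|---|---|
+| A part of the split tool name equals the keyword | 10 |
+| A part of the split tool name contains the keyword (substring) | 5 |
+| The full tool name contains the keyword | 3 |
+| The tool description contains the keyword | 2 |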
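
The scoring rules in this section can be sketched as follows. This is an illustration under assumptions rather than the library's code: `toolMeta`, `splitName`, `scoreKeyword`, `score`, and `rank` are hypothetical names, and dropping tools whose total score is zero is an assumption the source does not state.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// toolMeta is a hypothetical stand-in for a tool's name and description.
type toolMeta struct {
	Name string
	Desc string
}

// splitName breaks a tool name into lowercase parts at "_" / "__" and at
// lower-to-upper camelCase boundaries (matching is case-insensitive).
func splitName(name string) []string {
	var parts []string
	var cur []rune
	flush := func() {
		if len(cur) > 0 {
			parts = append(parts, strings.ToLower(string(cur)))
			cur = cur[:0]
		}
	}
	var prev rune
	for _, r := range name {
		switch {
		case r == '_':
			flush()
		case unicode.IsUpper(r) && unicode.IsLower(prev):
			flush()
			cur = append(cur, r)
		default:
			cur = append(cur, r)
		}
		prev = r
	}
	flush()
	return parts
}

func maxInt(a, b int) int {
	if a > b {
		return a
	}
	return b
}

// scoreKeyword returns the highest score any single rule gives this keyword.
func scoreKeyword(t toolMeta, kw string) int {
	kw = strings.ToLower(kw)
	best := 0
	for _, part := range splitName(t.Name) {
		if part == kw {
			best = maxInt(best, 10) // exact part match
		} else if strings.Contains(part, kw) {
			best = maxInt(best, 5) // substring of a part
		}
	}
	if strings.Contains(strings.ToLower(t.Name), kw) {
		best = maxInt(best, 3) // full name contains keyword
	}
	if strings.Contains(strings.ToLower(t.Desc), kw) {
		best = maxInt(best, 2) // description contains keyword
	}
	return best
}

// score sums the per-keyword maxima.
func score(t toolMeta, keywords []string) int {
	total := 0
	for _, kw := range keywords {
		total += scoreKeyword(t, kw)
	}
	return total
}

// rank orders tools by score (descending), breaking ties by name.
// Assumption: tools with a total score of zero are dropped.
func rank(tools []toolMeta, keywords []string) []toolMeta {
	var hits []toolMeta
	for _, t := range tools {
		if score(t, keywords) > 0 {
			hits = append(hits, t)
		}
	}
	sort.Slice(hits, func(i, j int) bool {
		si, sj := score(hits[i], keywords), score(hits[j], keywords)
		if si != sj {
			return si > sj
		}
		return hits[i].Name < hits[j].Name
	})
	return hits
}

func main() {
	tools := []toolMeta{
		{Name: "weather_forecast", Desc: "Get the weather forecast"},
		{Name: "mcp__slack__send_message", Desc: "Send a message to a Slack channel"},
	}
	for _, t := range rank(tools, []string{"send", "message"}) {
		fmt.Println(t.Name, score(t, []string{"send", "message"}))
	}
}
```

Taking the maximum per keyword, rather than adding every rule that fires, keeps a tool whose name and description both mention a keyword from outranking a tool that matches more keywords.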
    + +> 💡 +> 每个关键字对每个规则取最高分(intMax),不会叠加同一工具内多个 part 的匹配分数。多个关键字的得分相加为总分。得分相同时按工具名字典序排列。 + +工具名会按 `_`(下划线)、`__`(MCP 服务器与工具分隔符)和 camelCase 边界拆分为多个部分进行匹配。例如 `mcp__slack__send_message` 会拆分为 `["mcp", "slack", "send", "message"]`,`NotebookEdit` 会拆分为 `["Notebook", "Edit"]`。匹配不区分大小写。 ## 使用示例 +### 默认模式(中间件管理工具可见性) + ```go middleware, err := toolsearch.New(ctx, &toolsearch.Config{ DynamicTools: []tool.BaseTool{ @@ -103,35 +166,70 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ }) ``` +### 模型原生模式 + +```go +middleware, err := toolsearch.New(ctx, &toolsearch.Config{ + DynamicTools: []tool.BaseTool{ + weatherTool, + stockTool, + currencyTool, + }, + UseModelToolSearch: true, +}) +if err != nil { + return err +} + +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: myModel, // 需要模型支持原生 tool search + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +配置完全相同,但端到端行为取决于模型适配器的实现: + +- 如果模型原生支持 server-side 检索(如 Claude):模型直接从 `DeferredToolInfos` 中搜索和选择工具,`tool_search` 工具不会被调用 +- 如果模型通过客户端代理检索:模型发起 `tool_search` 调用 → 客户端 `modelToolSearchTool` 执行搜索 → 返回结构化 `ToolSearchResult`(含完整 ToolInfo)→ 模型据此选择工具 + --- ## 工作原理 ### BeforeAgent -1. 获取所有 DynamicTool -2. 使用 DynamicTools 创建 `tool_search` 工具 -3. 把 `tool_search` 和所有 DynamicTools 加到 `runCtx.Tools`,此时 Agent 中的 Tools 为全量 +1. 获取所有 DynamicTool 的 ToolInfo,校验无重复工具名 +2. 根据 `UseModelToolSearch` 创建对应类型的 `tool_search` 工具 +3. 把 `tool_search` 和所有 DynamicTools 加到 `runCtx.Tools`(此时 Agent 中为全量工具) +4. 模型原生模式下,设置 `runCtx.ToolSearchTool`,框架会通过 `model.WithToolSearchTool` 传递给模型 + +### BeforeModelRewriteState(每次 Model 调用前) + +**通用逻辑:** + +- 确保消息列表中存在 `` 提醒(以 User 消息插入,列出所有可搜索的工具名)**首次调用 — 初始化(两种模式):** -### WrapModel + +
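+| Mode | Initialization |
+|---|---|
+| Default mode | Removes all DynamicTools from `state.ToolInfos`, so the model initially sees only the static tools and `tool_search` |
+| Model-native mode | 1. Moves the DynamicTools from `state.ToolInfos` into `state.DeferredToolInfos`. 2. Removes `tool_search` from `state.ToolInfos` (the model handles it natively) |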
    -每次 Model 调用前: +**后续调用 — 前向选择(仅默认模式):** -1. 遍历消息历史,找所有 `tool_search` 的返回结果 +1. 遍历消息历史,找所有 `tool_search` 返回结果中 JSON `matches` 字段 2. 收集已选中的工具名 -3. 从全量工具中过滤掉未选中的 DynamicTools -4. 用过滤后的工具列表调用 Model +3. 把匹配的 DynamicTools 加回 `state.ToolInfos`(累加,不会移除已添加的工具) -### 工具选择流程 +### 工具选择流程(默认模式) ``` 第一轮: - Model 只能看到 tool_search - Model 调用 tool_search(regex_pattern="weather.*") - 返回 {"selectedTools": ["weather_forecast", "weather_history"]} + Model 只能看到 tool_search + 静态工具 + Model 调用 tool_search(query="weather forecast") + 返回 {"matches": ["weather_forecast", "weather_history"]} 第二轮: - Model 能看到 tool_search + weather_forecast + weather_history + Model 能看到 tool_search + 静态工具 + weather_forecast + weather_history Model 调用 weather_forecast(...) ``` @@ -139,7 +237,10 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 注意事项 -- DynamicTools 不能为空 -- 正则匹配的是工具名,不是描述 -- 选中的工具会一直保持可用,除非 tool_search 调用结果被删除或修改 -- 可以多次调用 tool_search,结果会累加 +- `DynamicTools` 不能为空,且工具名不能重复 +- 关键字搜索匹配工具名和描述,不区分大小写 +- 在默认模式下,选中的工具会一直保持可用(基于消息历史中 `tool_search` 结果累加) +- 可以多次调用 `tool_search`,结果会累加 +- 默认模式下,每次 Model 调用前工具列表可能变化,这可能导致模型 KV-cache 失效 +- 模型原生模式需要 ChatModel 支持 `model.WithToolSearchTool` 和/或 `model.WithDeferredTools` 选项。具体走哪条路径(纯服务端检索 vs 客户端代理检索)取决于模型适配器的实现 +- `` 提醒以 **User 消息**(而非 System 消息)插入到消息列表中,位于第一条非 System 消息之前 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md index 878423730f0..5b63e795b0c 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/_index.md @@ -1,298 +1,259 @@ --- Description: "" -date: "2026-03-09" +date: "2026-05-21" lastmod: "" tags: [] title: ChatModelAgentMiddleware weight: 8 --- -## 概述 +`ChatModelAgentMiddleware` 是自定义 `ChatModelAgent`(及基于它的 `DeepAgent`)行为的核心接口。自 v0.8.0 引入,在后续版本持续演进。 -## 
ChatModelAgentMiddleware 接口 +## 类型约定 -`ChatModelAgentMiddleware` 定义了自定义 `ChatModelAgent` 行为的接口。 +本文使用默认 `M = *schema.Message` 的别名。泛型原始类型以 `Typed` 前缀命名: -**重要说明:** 此接口专为 `ChatModelAgent` 及基于它构建的 Agent(如 `DeepAgent`)设计。 - -> 💡 -> ChatModelAgentMiddleware 接口在 v0.8.0 版本引入 +```go +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +type BaseChatModelAgentMiddleware = TypedBaseChatModelAgentMiddleware[*schema.Message] +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +type ModelContext = TypedModelContext[*schema.Message] +``` -### 为什么使用 ChatModelAgentMiddleware 而非 AgentMiddleware? +当需使用 `*schema.AgenticMessage` 时,直接使用 `Typed` 泛型版本即可。 - - - - - -
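-| Feature | AgentMiddleware (struct) | ChatModelAgentMiddleware (interface) |
-|---|---|---|
-| Extensibility | Closed: users cannot add new methods | Open: users can implement custom handlers |
-| Context propagation | Callbacks return only `error` | All methods return `(context.Context, ..., error)` |
-| Configuration management | Scattered across closures | Centralized in struct fields |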
    +--- -### 接口定义 +## 接口定义 ```go type ChatModelAgentMiddleware interface { - // BeforeAgent 在每次 agent 运行前调用,允许修改 instruction 和 tools 配置 + // ── 生命周期 Hook ── + + // BeforeAgent:agent 运行前调用一次,可修改 instruction、tools 配置 BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // BeforeModelRewriteState 在每次模型调用前调用 - // 返回的 state 会被持久化到 agent 内部状态并传递给模型 - // 返回的 context 会传播到模型调用和后续 handler + // AfterAgent:agent 成功终止后调用(最终回答或 return-directly 工具结果) + // 错误终止(超迭代、context 取消、model 错误)时不调用 + AfterAgent(ctx context.Context, state *ChatModelAgentState) (context.Context, error) + + // BeforeModelRewriteState:每次模型调用前调用 + // 返回的 state 被持久化,可修改 Messages、ToolInfos、DeferredToolInfos BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // AfterModelRewriteState 在每次模型调用后调用 - // 输入的 state 包含模型响应作为最后一条消息 + // AfterModelRewriteState:每次模型调用后调用 + // 输入 state 包含模型响应作为最后一条消息 AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - // WrapInvokableToolCall 用自定义行为包装工具的同步执行 - // 如果不需要包装,返回原始 endpoint 和 nil error - // 仅对实现了 InvokableTool 的工具调用此方法 - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + // ── Wrapper ── - // WrapStreamableToolCall 用自定义行为包装工具的流式执行 - // 如果不需要包装,返回原始 endpoint 和 nil error - // 仅对实现了 StreamableTool 的工具调用此方法 + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) - - // WrapEnhancedInvokableToolCall 用自定义行为包装增强型工具的同步执行 WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) 
(EnhancedInvokableToolCallEndpoint, error) - - // WrapEnhancedStreamableToolCall 用自定义行为包装增强型工具的流式执行 WrapEnhancedStreamableToolCall(ctx context.Context, endpoint EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) - // WrapModel 用自定义行为包装聊天模型 - // 如果不需要包装,返回原始 model 和 nil error - // 在请求时调用,每次模型调用前都会执行 - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -### 使用 BaseChatModelAgentMiddleware - -嵌入 `*BaseChatModelAgentMiddleware` 以获得默认的空操作实现: - -```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware -} - -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - return ctx, state, nil + // WrapModel:包装 ChatModel,参数类型为 model.BaseModel[M](非 ToolCallingChatModel) + // 框架单独处理 WithTools 绑定,不经过用户 wrapper + WrapModel(ctx context.Context, m model.BaseModel[M], mc *ModelContext) (model.BaseModel[M], error) } ``` ---- - -## 工具调用端点类型 - -工具包装使用函数类型而非接口,更清晰地表达了包装的意图: +> 💡 +> 嵌入 `*BaseChatModelAgentMiddleware` 可获得所有方法的空操作默认实现,只需覆盖关心的方法。 -```go -// InvokableToolCallEndpoint 是同步工具调用的函数签名 -type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +### AgentMiddleware 已废弃 -// StreamableToolCallEndpoint 是流式工具调用的函数签名 -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) +> 💡 +> `AgentMiddleware` 结构体及 `ChatModelAgentConfig.Middlewares` 字段已标记为 Deprecated,将在未来版本中移除。所有新代码应使用 `ChatModelAgentMiddleware`(interface-based Handlers)。 -// EnhancedInvokableToolCallEndpoint 是增强型同步工具调用的函数签名 -type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +`AgentMiddleware` 是结构体,有固有局限——用户无法扩展方法,回调仅返回 error 无法传播 context。`ChatModelAgentMiddleware` 
是接口: -// EnhancedStreamableToolCallEndpoint 是增强型流式工具调用的函数签名 -type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) -``` +- Hook 方法返回 `(context.Context, ..., error)`,支持 context 传播 +- Wrapper 方法通过 endpoint 链传播修改后的 context +- 自定义 handler 可携带任意内部状态 -### 为什么使用分离的端点类型? +迁移映射: -之前的 `ToolCall` 接口同时包含 `InvokableRun` 和 `StreamableRun`,但大多数工具只实现其中一个。 -分离的端点类型使得: + + + + + + + +
+| AgentMiddleware field | ChatModelAgentMiddleware replacement |
+|---|---|
+| `AdditionalInstruction` | Modify `runCtx.Instruction` in `BeforeAgent` |
+| `AdditionalTools` | Modify `runCtx.Tools` in `BeforeAgent` |
+| `BeforeChatModel` | `BeforeModelRewriteState` |
+| `AfterChatModel` | `AfterModelRewriteState` |
+| `WrapToolCall` | `WrapInvokableToolCall` / `WrapStreamableToolCall` |
    -- 只有当工具实现相应接口时才调用对应的包装方法 -- wrapper 作者更清晰的契约 -- 关于实现哪个方法没有歧义 +当前版本两者可共存(Handlers 在 Middlewares 之后执行),但应尽早迁移。 --- -## ChatModelAgentContext +## 上下文类型 + +### ChatModelAgentContext -`ChatModelAgentContext` 包含在每次 `ChatModelAgent` 运行前传递给 handler 的运行时信息。 +`BeforeAgent` 的输入,每次 Run 前调用一次: ```go type ChatModelAgentContext struct { - // Instruction 是当前 Agent 执行的指令 - // 包括 agent 配置的指令、框架和 AgentMiddleware 追加的额外指令, - // 以及之前 BeforeAgent handler 应用的修改 + // 当前 instruction(含 agent 配置 + 框架追加 + 前序 handler 修改) Instruction string - // Tools 是当前为 Agent 执行配置的原始工具(无任何 wrapper 或 tool middleware) - // 包括 AgentConfig 中传入的工具、框架隐式添加的工具(如 transfer/exit 工具), - // 以及 middleware 已添加的其他工具 + // 原始工具列表(含框架隐式工具如 transfer/exit) Tools []tool.BaseTool - // ReturnDirectly 是当前配置为使 Agent 直接返回的工具名称集合 + // 配置为"直接返回"的工具名集合 ReturnDirectly map[string]bool + + // 模型原生工具搜索能力的 ToolInfo + // 由 handler 设置后,框架通过 model.WithToolSearchTool 传递给模型 + ToolSearchTool *schema.ToolInfo } ``` ---- - -## ChatModelAgentState +### ChatModelAgentState -`ChatModelAgentState` 表示对话过程中聊天模型 agent 的状态。这是 `ChatModelAgentMiddleware` 和 `AgentMiddleware` 回调的主要状态类型。 +每次模型调用前后传递的**持久化状态**(跨 iteration 保持): ```go type ChatModelAgentState struct { - // Messages 包含当前对话会话中的所有消息 - Messages []Message + // 当前会话的所有消息 + Messages []*schema.Message + + // 传递给模型的工具定义(via model.WithTools),可在 BeforeModelRewriteState 中修改 + ToolInfos []*schema.ToolInfo + + // 延迟检索工具定义(via model.WithDeferredTools),用于模型原生搜索能力 + // 未使用时为 nil + DeferredToolInfos []*schema.ToolInfo } ``` ---- +> 💡 +> 修改 `ToolInfos` / `DeferredToolInfos` 的推荐位置是 `BeforeModelRewriteState`——这是工具配置的 source of truth。不要在 `WrapModel` 中修改工具列表。 -## ToolContext +### ModelContext -`ToolContext` 提供被包装工具的元数据。在请求时创建,包含当前工具调用的信息。 +`WrapModel` 和 `Before/AfterModelRewriteState` 的上下文: ```go -type ToolContext struct { - // Name 是工具名称 - Name string +type ModelContext struct { + // Deprecated: 使用 ChatModelAgentState.ToolInfos 替代 + Tools []*schema.ToolInfo + + // 模型重试配置 + ModelRetryConfig *ModelRetryConfig - // 
CallID 是此特定工具调用的唯一标识符 - CallID string + // 模型容灾切换配置 + ModelFailoverConfig *ModelFailoverConfig[*schema.Message] } ``` -### 使用示例:工具调用包装 +### ToolContext + +工具包装的元数据: ```go -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - log.Printf("Tool %s (call %s) starting with args: %s", tCtx.Name, tCtx.CallID, argumentsInJSON) - - result, err := endpoint(ctx, argumentsInJSON, opts...) - - if err != nil { - log.Printf("Tool %s failed: %v", tCtx.Name, err) - return "", err - } - - log.Printf("Tool %s completed with result: %s", tCtx.Name, result) - return result, nil - }, nil +type ToolContext struct { + Name string // 工具名称 + CallID string // 本次调用唯一标识 } ``` --- -## ModelContext +## 工具调用端点类型 -`ModelContext` 包含传递给 `WrapModel` 的上下文信息。在请求时创建,包含当前模型调用的工具配置。 +工具包装使用函数类型而非接口。根据工具实现的接口,框架调用对应的 Wrap 方法: ```go -type ModelContext struct { - // Tools 是当前配置给 agent 的工具列表 - // 在请求时填充,包含将发送给模型的工具 - Tools []*schema.ToolInfo +// 标准工具 +type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) +type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - // ModelRetryConfig 包含模型的重试配置 - // 在请求时从 agent 的 ModelRetryConfig 填充 - // 用于 EventSenderModelWrapper 适当地包装流错误 - ModelRetryConfig *ModelRetryConfig -} +// 增强型工具(使用 ToolArgument/ToolResult) +type EnhancedInvokableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.ToolResult, error) +type EnhancedStreamableToolCallEndpoint func(ctx context.Context, toolArgument *schema.ToolArgument, opts ...tool.Option) (*schema.StreamReader[*schema.ToolResult], error) ``` -### 使用示例:模型包装 - -```go -func (h *MyHandler) WrapModel(ctx context.Context, m 
model.BaseChatModel, mc *adk.ModelContext) (model.BaseChatModel, error) { - return &myModelWrapper{ - inner: m, - tools: mc.Tools, - }, nil -} - -type myModelWrapper struct { - inner model.BaseChatModel - tools []*schema.ToolInfo -} +> 💡 +> 每个 Wrap 方法**仅在工具实现了对应接口时才被调用**。例如,工具只实现了 `InvokableTool`,则只会调用 `WrapInvokableToolCall`,不会调用 `WrapStreamableToolCall`。 -func (w *myModelWrapper) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - log.Printf("Model called with %d tools", len(w.tools)) - return w.inner.Generate(ctx, msgs, opts...) -} +--- -func (w *myModelWrapper) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return w.inner.Stream(ctx, msgs, opts...) -} -``` +## 执行顺序 ---- +### Model 调用生命周期(由外到内) + +1. ~~AgentMiddleware.BeforeChatModel~~(**Deprecated**,将移除) +2. **ChatModelAgentMiddleware.BeforeModelRewriteState** +3. `failoverModelWrapper`(内部 — 模型容灾切换,如配置) +4. `retryModelWrapper`(内部 — 失败重试) +5. `eventSenderModelWrapper` 预处理(内部 — 准备事件发送) +6. **ChatModelAgentMiddleware.WrapModel** 预处理(先注册 → 先执行) +7. `callbackInjectionModelWrapper`(内部) +8. **Model.Generate / Stream** +9. `callbackInjectionModelWrapper` 后处理 +10. **ChatModelAgentMiddleware.WrapModel** 后处理(先注册 → 后执行) +11. `eventSenderModelWrapper` 后处理 +12. `retryModelWrapper` 后处理 +13. `failoverModelWrapper` 后处理 +14. **ChatModelAgentMiddleware.AfterModelRewriteState** +15. ~~AgentMiddleware.AfterChatModel~~(**Deprecated**,将移除) + +### Tool 调用生命周期(由外到内) + +1. `eventSenderToolHandler`(内部 — 发送工具结果事件) +2. `ToolsConfig.ToolCallMiddlewares` +3. ~~AgentMiddleware.WrapToolCall~~(**Deprecated**,将移除) +4. `cancelMonitoredToolHandler`(内部 — 取消监控,仅 Streamable/EnhancedStreamable 工具) +5. **ChatModelAgentMiddleware.WrapXxxToolCall**(先注册 → 最外层) +6. `callbackInjectedToolCall`(内部 — 注入 callback) +7. 
**Tool.InvokableRun / StreamableRun** ## 运行时本地存储 API -`SetRunLocalValue`、`GetRunLocalValue` 和 `DeleteRunLocalValue` 提供在当前 agent Run() 调用期间存储、获取和删除值的能力。 +在当前 agent `Run()` 期间存取键值对。值与中断/恢复兼容——序列化后随 checkpoint 持久化。 ```go -// SetRunLocalValue 设置一个在当前 agent Run() 调用期间持久化的键值对 -// 值的作用域限于此特定执行,不会在不同的 Run() 调用或 agent 实例之间共享 -// -// 存储在这里的值与中断/恢复周期兼容 - 它们会被序列化并在 agent 恢复时还原 -// 对于自定义类型,必须在 init() 函数中使用 schema.RegisterName[T]() 注册以确保正确序列化 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func SetRunLocalValue(ctx context.Context, key string, value any) error - -// GetRunLocalValue 获取在当前 agent Run() 调用期间设置的值 -// 值的作用域限于此特定执行,不会在不同的 Run() 调用或 agent 实例之间共享 -// -// 通过 SetRunLocalValue 存储的值与中断/恢复周期兼容 - 它们会被序列化并在 agent 恢复时还原 -// 对于自定义类型,必须在 init() 函数中使用 schema.RegisterName[T]() 注册以确保正确序列化 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果找到值返回 (value, true, nil),如果未找到返回 (nil, false, nil), -// 如果在 agent 执行上下文之外调用返回错误 func GetRunLocalValue(ctx context.Context, key string) (any, bool, error) - -// DeleteRunLocalValue 删除在当前 agent Run() 调用期间设置的值 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func DeleteRunLocalValue(ctx context.Context, key string) error ``` -### 使用示例:跨 handler 点共享数据 +> 💡 +> 自定义类型必须在 `init()` 中通过 `schema.RegisterName[T]()` 注册,以确保 gob 序列化正确。这些函数只能在 `ChatModelAgentMiddleware` 回调内调用。 + +### 示例:跨回调共享状态 ```go func init() { - schema.RegisterName[*MyCustomData]("my_package.MyCustomData") + schema.RegisterName[*ToolStats]("mypackage.ToolStats") } -type MyCustomData struct { +type ToolStats struct { Count int Name string } -type MyHandler struct { +type MyMiddleware struct { *adk.BaseChatModelAgentMiddleware } -func (h *MyHandler) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - result, err := 
endpoint(ctx, argumentsInJSON, opts...) - - data := &MyCustomData{Count: 1, Name: tCtx.Name} - if err := adk.SetRunLocalValue(ctx, "my_handler.last_tool", data); err != nil { - log.Printf("Failed to set run local value: %v", err) - } - +// 在工具调用后记录统计 +func (m *MyMiddleware) WrapInvokableToolCall(ctx context.Context, endpoint adk.InvokableToolCallEndpoint, tCtx *adk.ToolContext) (adk.InvokableToolCallEndpoint, error) { + return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { + result, err := endpoint(ctx, args, opts...) + + _ = adk.SetRunLocalValue(ctx, "last_tool", &ToolStats{Count: 1, Name: tCtx.Name}) return result, err }, nil } -func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - if val, found, err := adk.GetRunLocalValue(ctx, "my_handler.last_tool"); err == nil && found { - if data, ok := val.(*MyCustomData); ok { - log.Printf("Last tool was: %s (count: %d)", data.Name, data.Count) +// 在模型调用后读取统计 +func (m *MyMiddleware) AfterModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + if val, found, _ := adk.GetRunLocalValue(ctx, "last_tool"); found { + if stats, ok := val.(*ToolStats); ok { + log.Printf("上一次工具: %s (count=%d)", stats.Name, stats.Count) } } return ctx, state, nil @@ -303,219 +264,79 @@ func (h *MyHandler) AfterModelRewriteState(ctx context.Context, state *adk.ChatM ## SendEvent API -`SendEvent` 允许在 agent 执行期间向事件流发送自定义 `AgentEvent`。 +在 agent 执行期间向事件流发送自定义 `AgentEvent`,调用方遍历事件流时可收到: ```go -// SendEvent 在 agent 执行期间向事件流发送自定义 AgentEvent -// 允许 ChatModelAgentMiddleware 实现发出自定义事件, -// 这些事件将被遍历 agent 事件流的调用者接收 -// -// 此函数只能在 agent 执行期间从 ChatModelAgentMiddleware 内部调用 -// 如果在 agent 执行上下文之外调用,返回错误 func SendEvent(ctx context.Context, event *AgentEvent) error ``` ---- - -## State 类型(即将弃用) - -`State` 保存 agent 
运行时状态,包括消息和用户可扩展存储。 - -**⚠️ 弃用警告:** 此类型将在 v1.0.0 中设为未导出。请在 `ChatModelAgentMiddleware` 和 `AgentMiddleware` 回调中使用 `ChatModelAgentState`。不建议直接使用 `compose.ProcessState[*State]`,该用法将在 v1.0.0 中停止工作;请使用 handler API。 - -```go -type State struct { - Messages []Message - extra map[string]any // 未导出,通过 SetRunLocalValue/GetRunLocalValue 访问 - - // 以下为内部字段 - 请勿直接访问 - // 为与现有 checkpoint 向后兼容而保持导出 - ReturnDirectlyToolCallID string - ToolGenActions map[string]*AgentAction - AgentName string - RemainingIterations int - - internals map[string]any -} -``` - ---- - -## 架构图 - -下图展示了 `ChatModelAgentMiddleware` 在 `ChatModelAgent` 执行过程中的工作原理: - -``` -Agent.Run(input) - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ BeforeAgent(ctx, *ChatModelAgentContext) │ -│ 输入: 当前 Instruction、Tools 等 Agent 运行环境 │ -│ 输出: 修改后的 Agent 运行环境 │ -│ 作用: Run 开始时调用一次,修改整个 Run 生命周期的配置 │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ -┌─────────────────────────────────────────────────────────────────────────┐ -│ ReAct Loop │ -│ ┌───────────────────────────────────────────────────────────────────┐ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ BeforeModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ 输入: 消息历史等持久化状态,以及 Model 运行环境 │ │ │ -│ │ │ 输出: 修改后的持久化状态,返回新 ctx │ │ │ -│ │ │ 作用: 修改跨 iteration 的持久化状态(主要是消息列表) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ WrapModel(ctx, BaseChatModel, *ModelContext) │ │ │ -│ │ │ 输入: 被 wrap 的 ChatModel,以及 Model 运行环境 │ │ │ -│ │ │ 输出: 包装后的 Model (洋葱模型) │ │ │ -│ │ │ 作用: 修改单次 Model 请求的输入、输出和配置 │ │ │ -│ │ │ │ │ │ │ -│ │ │ ▼ │ │ │ -│ │ │ ┌───────────────┐ │ │ │ -│ │ │ │ Model │ │ │ │ -│ │ │ │ Generate/Stream│ │ │ │ -│ │ │ └───────────────┘ │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ 
-│ │ ▼ │ │ -│ │ ┌─────────────────────────────────────────────────────────────┐ │ │ -│ │ │ AfterModelRewriteState(ctx, *ChatModelAgentState, *MC) │ │ │ -│ │ │ 输入: 消息历史等持久化状态(含 Model 响应), │ │ │ -│ │ │ 以及 Model 运行环境 │ │ │ -│ │ │ 输出: 修改后的持久化状态 │ │ │ -│ │ │ 作用: 修改跨 iteration 的持久化状态(主要是消息列表) │ │ │ -│ │ └─────────────────────────────────────────────────────────────┘ │ │ -│ │ │ │ │ -│ │ ▼ │ │ -│ │ ┌──────────────────┐ │ │ -│ │ │ Model 返回内容? │ │ │ -│ │ └──────────────────┘ │ │ -│ │ │ │ │ │ -│ │ 最终响应 │ │ ToolCalls │ │ -│ │ │ ▼ │ │ -│ │ │ ┌─────────────────────────────────────┐ │ │ -│ │ │ │ WrapInvokableToolCall / WrapStream │ │ │ -│ │ │ │ ableToolCall(ctx, endpoint, *TC) │ │ │ -│ │ │ │ 输入: 被 wrap 的 Tool 以及 │ │ │ -│ │ │ │ Tool 运行环境 │ │ │ -│ │ │ │ 输出: 包装后的 endpoint (洋葱模型)│ │ │ -│ │ │ │ 作用: 修改单次 Tool 请求的 │ │ │ -│ │ │ │ 输入、输出和配置 │ │ │ -│ │ │ │ │ │ │ │ -│ │ │ │ ▼ │ │ │ -│ │ │ │ ┌─────────────┐ │ │ │ -│ │ │ │ │ Tool.Run() │ │ │ │ -│ │ │ │ └─────────────┘ │ │ │ -│ │ │ └─────────────────────────────────────┘ │ │ -│ │ │ │ │ │ -│ │ │ │ (结果加入 Messages) │ │ -│ │ │ │ │ │ -│ │ │ ┌─────────┘ │ │ -│ │ │ │ │ │ -│ │ │ └──────────► 继续循环 │ │ -│ │ │ │ │ -│ └─────────────────────┼─────────────────────────────────────────────┘ │ -│ │ │ -│ ▼ │ -│ 循环直到完成或达到 maxIterations │ -└─────────────────────────────────────────────────────────────────────────┘ - │ - ▼ - Agent.Run() 结束 -``` - -### Handler 方法说明 - - - - - - - - - -
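-| Method | Input | Output | Scope |
-|---|---|---|---|
-| `BeforeAgent` | Agent runtime environment (`*ChatModelAgentContext`) | Modified Agent runtime environment | The whole Run lifecycle; called only once |
-| `BeforeModelRewriteState` | Persistent state + Model runtime environment | Modified persistent state | Persistent state across iterations (message list) |
-| `WrapModel` | The wrapped ChatModel + Model runtime environment | Wrapped Model | Input, output, and configuration of a single Model request |
-| `AfterModelRewriteState` | Persistent state (including the response) + Model runtime environment | Modified persistent state | Persistent state across iterations (message list) |
-| `WrapInvokableToolCall` | The wrapped Tool + Tool runtime environment | Wrapped endpoint | Input, output, and configuration of a single Tool request |
-| `WrapStreamableToolCall` | The wrapped Tool + Tool runtime environment | Wrapped endpoint | Input, output, and configuration of a single Tool request |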
    +仅能在 `ChatModelAgentMiddleware` 回调内调用。 --- -## 执行顺序 +## State 类型 -### Model 调用生命周期(从外到内的 wrapper 链) - -1. `AgentMiddleware.BeforeChatModel`(hook,在模型调用前运行) -2. `ChatModelAgentMiddleware.BeforeModelRewriteState`(hook,可在模型调用前修改状态) -3. `retryModelWrapper`(内部 - 失败时重试,如已配置) -4. `eventSenderModelWrapper` 预处理(内部 - 准备事件发送) -5. `ChatModelAgentMiddleware.WrapModel` 预处理(wrapper,在请求时包装,先注册的先运行) -6. `callbackInjectionModelWrapper`(内部 - 如未启用则注入回调) -7. `Model.Generate/Stream` -8. `callbackInjectionModelWrapper` 后处理 -9. `ChatModelAgentMiddleware.WrapModel` 后处理(wrapper,先注册的后运行) -10. `eventSenderModelWrapper` 后处理(内部 - 发送模型响应事件) -11. `retryModelWrapper` 后处理(内部 - 处理重试逻辑) -12. `ChatModelAgentMiddleware.AfterModelRewriteState`(hook,可在模型调用后修改状态) -13. `AgentMiddleware.AfterChatModel`(hook,在模型调用后运行) - -### Tool 调用生命周期(从外到内) - -1. `eventSenderToolHandler`(内部 ToolMiddleware - 在所有处理后发送工具结果事件) -2. `ToolsConfig.ToolCallMiddlewares`(ToolMiddleware) -3. `AgentMiddleware.WrapToolCall`(ToolMiddleware) -4. `ChatModelAgentMiddleware.WrapInvokableToolCall/WrapStreamableToolCall`(在请求时包装,先注册的在最外层) -5. 
`Tool.InvokableRun/StreamableRun` +> 💡 +> `State` 仅为 checkpoint 向后兼容而保持导出。**不要直接使用**——请在 `ChatModelAgentMiddleware` 回调中使用 `ChatModelAgentState`,用 `SetRunLocalValue/GetRunLocalValue` 替代原 `State.Extra`。`compose.ProcessState[*State]` 用法将在 v1.0.0 中停止工作。 --- ## 迁移指南 -### 从 AgentMiddleware 迁移到 ChatModelAgentMiddleware +### 从 compose.ProcessState[*State] 迁移 -**之前(AgentMiddleware):** +**之前:** ```go -middleware := adk.AgentMiddleware{ - BeforeChatModel: func(ctx context.Context, state *adk.ChatModelAgentState) error { - return nil - }, -} +compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { + st.Extra["myKey"] = myValue + return nil +}) ``` -**之后(ChatModelAgentMiddleware):** +**之后:** ```go -type MyHandler struct { - *adk.BaseChatModelAgentMiddleware +// 写入 +if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { + return ctx, state, err } -func (h *MyHandler) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { - newCtx := context.WithValue(ctx, myKey, myValue) - return newCtx, state, nil +// 读取 +if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { + // use val } ``` -### 从 compose.ProcessState[*State] 迁移 +### 适配 AfterAgent(v0.9 新增) -**之前:** +`AfterAgent` 在 agent **成功终止**后调用(最终回答或 return-directly 工具结果),可用于后处理: ```go -compose.ProcessState(ctx, func(_ context.Context, st *adk.State) error { - st.Extra["myKey"] = myValue - return nil -}) +func (m *MyMiddleware) AfterAgent(ctx context.Context, state *adk.ChatModelAgentState) (context.Context, error) { + log.Printf("Agent 完成,共 %d 条消息", len(state.Messages)) + // 可在此做审计、统计、清理等 + return ctx, nil +} ``` -**之后(使用 SetRunLocalValue/GetRunLocalValue):** +> 💡 +> `AfterAgent` 按注册顺序调用(与 `BeforeAgent` 一致)。任一 handler 返回 error 后,后续 handler 不再调用(fail-fast),错误发送到事件流。 -```go -if err := adk.SetRunLocalValue(ctx, "myKey", myValue); err != nil { - return ctx, state, err -} +### 适配 ToolInfos / 
DeferredToolInfos(v0.9 新增) -if val, found, err := adk.GetRunLocalValue(ctx, "myKey"); err == nil && found { +`ChatModelAgentState` 新增了 `ToolInfos` 和 `DeferredToolInfos` 字段,取代 `ModelContext.Tools` 成为工具配置的 source of truth: + +```go +func (m *MyMiddleware) BeforeModelRewriteState(ctx context.Context, state *adk.ChatModelAgentState, mc *adk.ModelContext) (context.Context, *adk.ChatModelAgentState, error) { + // 动态过滤工具 + filtered := make([]*schema.ToolInfo, 0, len(state.ToolInfos)) + for _, t := range state.ToolInfos { + if shouldInclude(t.Name) { + filtered = append(filtered, t) + } + } + state.ToolInfos = filtered + return ctx, state, nil } ``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md index 5291c0c17df..936376cfe4a 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/_index.md @@ -1,125 +1,162 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem Backend weight: 1 --- -> 💡 -> Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) +> 💡Package: [github.com/cloudwego/eino/adk/filesystem](https://github.com/cloudwego/eino/tree/main/adk/filesystem) ## 背景与目的 -在 AI Agent 场景中,Agent 往往需要与文件系统交互——读取文件内容、搜索代码、编辑配置、执行命令等。然而,不同的运行环境对文件系统的访问方式差异很大: +AI Agent 需要与文件系统交互(读取、搜索、编辑、执行命令),但不同运行环境的访问方式差异很大:本地磁盘、远程沙箱、内存模拟、对象存储等。若每种环境单独实现文件操作逻辑,会导致 Middleware/Agent 代码与底层存储耦合。 -- **本地开发环境**:直接操作本机文件系统,零配置即可使用 -- **云端沙箱环境**:通过远程 API 操作隔离的沙箱文件系统,需要认证和网络通信 -- **测试环境**:需要内存级别的模拟文件系统,无需真实磁盘 I/O -- **自定义存储**:可能需要对接 OSS、数据库等非传统文件系统 +`filesystem.Backend` 接口解决这一问题——作为**统一文件系统操作协议**: -如果每种环境都各自实现一套文件操作逻辑,会导致 Middleware 和 Agent 代码与底层存储实现耦合,难以复用和测试。 - 
-为了解决这一问题,Eino ADK 抽象出 `filesystem.Backend` 接口,作为**统一的文件系统操作协议**。它的设计目标是: - -1. **解耦存储与业务**:Middleware 只依赖 Backend 接口,不关心底层是本地磁盘、远程沙箱还是内存模拟 -2. **可插拔替换**:通过切换 Backend 实现,同一个 Agent 可以在不同环境中运行,无需修改任何业务代码 -3. **易于测试**:内置 `InMemoryBackend` 实现,方便在单元测试中模拟文件系统行为 -4. **可扩展性**:所有方法使用结构体参数,未来新增字段不会破坏已有实现的兼容性 +1. **解耦存储与业务** — Middleware 只依赖接口,不关心底层实现 +2. **可插拔替换** — 切换 Backend 即可在不同环境运行,无需修改业务代码 +3. **易于测试** — 内置 `InMemoryBackend`,无需真实磁盘 I/O +4. **向前兼容** — 所有方法使用结构体参数,新增字段不破坏已有实现 ## Backend 接口 ```go type Backend interface { - // 列出指定路径下的文件和目录信息 LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) - // 读取文件内容,支持按行分页(offset + limit) Read(ctx context.Context, req *ReadRequest) (*FileContent, error) - // 在指定路径中搜索匹配 pattern 的内容,返回匹配列表 GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) - // 根据 glob pattern 和路径查找匹配的文件 GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) - // 写入或创建文件 Write(ctx context.Context, req *WriteRequest) error - // 替换文件中的字符串内容 Edit(ctx context.Context, req *EditRequest) error } ``` -### 扩展接口 + + + + + + + + +
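+| Method | Purpose | Returns |
+|---|---|---|
+| `LsInfo` | Lists file and directory information under the given path | `[]FileInfo` |
+| `Read` | Reads file content, with line-based pagination (offset + limit) | `*FileContent` |
+| `GrepRaw` | Searches files for content matching a pattern | `[]GrepMatch` |
+| `GlobInfo` | Finds files matching a glob pattern | `[]FileInfo` |
+| `Write` | Writes or creates a file | `error` |
+| `Edit` | Replaces string content inside a file | `error` |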
    -除核心文件操作外,Backend 还可以选择性地实现 Shell 命令执行能力: +## 扩展接口 + +### Shell / StreamingShell + +Backend 可选择性实现命令执行能力。当 Backend 同时实现 `Shell` 或 `StreamingShell` 时,Filesystem Middleware 会额外注册 `execute` 工具。两者**互斥**,不可同时配置。 ```go -// Shell 提供同步命令执行能力 type Shell interface { Execute(ctx context.Context, input *ExecuteRequest) (result *ExecuteResponse, err error) } -// StreamingShell 提供流式命令执行能力,适用于长时间运行的命令 type StreamingShell interface { ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (result *schema.StreamReader[*ExecuteResponse], err error) } ``` -当 Backend 同时实现了 `Shell` 或 `StreamingShell` 接口时,Filesystem Middleware 会额外注册 `execute` 工具,允许 Agent 执行 shell 命令。 +### MultiModalReader + +可选扩展接口,支持多模态文件读取(图片、PDF 等),返回结构化的 `MultiFileContent`。 + +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, error) +} +``` + +当 Backend 实现此接口且 Middleware 配置 `UseMultiModalRead = true` 时,`read_file` 工具将使用多模态读取。 + +## 核心数据类型 -### 核心数据类型 +### 请求类型 - - - - - - - - - - + + + + + + + + +
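-| Type | Description |
-| --- | --- |
-| `FileInfo` | File or directory info: path, whether it is a directory, size, modification time |
-| `FileContent` | File content plus line-number information |
-| `GrepMatch` | Search match: content, path, line number |
-| `ReadRequest` | Read request: path, offset (starting line, 1-based), limit (number of lines to read) |
-| `GrepRequest` | Search request: pattern (regex supported), path, glob filter, file-type filter, and more |
-| `WriteRequest` | Write request: path, content |
-| `EditRequest` | Edit request: path, old string, new string, whether to replace all occurrences |
-| `ExecuteRequest` | Command execution request: command string, whether to run in the background |
-| `ExecuteResponse` | Command execution result: output, exit code, whether the output was truncated |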
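+| Type | Fields | Description |
+| --- | --- | --- |
+| `LsInfoRequest` | `Path string` | Directory path to list |
+| `ReadRequest` | `FilePath string`, `Offset int`, `Limit int` | File path; starting line number (1-based, values below 1 are treated as 1); maximum number of lines to read (0 = all) |
+| `MultiModalReadRequest` | embeds `ReadRequest`, `Pages string` | Inherits every `ReadRequest` field; `Pages` selects a PDF page range (e.g. "1-5", "3") |
+| `GrepRequest` | `Pattern string`, `Path string`, `Glob string`, `FileType string`, `CaseInsensitive bool`, `EnableMultiline bool`, `AfterLines int`, `BeforeLines int` | Regex search pattern (ripgrep syntax); directory to search; glob file filter; file-type filter (e.g. "go", "py"); case-insensitive matching; multiline matching; N lines of context after each match; N lines of context before each match |
+| `GlobInfoRequest` | `Pattern string`, `Path string` | Glob expression (supports `*`, `**`, `?`, `[abc]`); directory where the search starts |
+| `WriteRequest` | `FilePath string`, `Content string` | Target file path; content to write |
+| `EditRequest` | `FilePath string`, `OldString string`, `NewString string`, `ReplaceAll bool` | File path; exact string to replace (non-empty); replacement string; when false, `OldString` must occur exactly once in the file |
+| `ExecuteRequest` | `Command string`, `RunInBackendGround bool` | Command string to execute; whether to run in the background |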
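
The paging rule stated for `ReadRequest` (1-based `Offset`, values below 1 treated as 1, `Limit` of 0 meaning all lines) can be sketched as below. This is a minimal illustration, not the eino implementation: `pageLines` is a hypothetical helper, and returning an empty string for an offset past the last line is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// pageLines is a hypothetical helper that applies ReadRequest-style paging.
// offset is 1-based (values < 1 are treated as 1); limit <= 0 means
// "read to the end". An offset past the last line yields "" (an assumption).
func pageLines(content string, offset, limit int) string {
	lines := strings.Split(content, "\n")
	if offset < 1 {
		offset = 1
	}
	start := offset - 1
	if start >= len(lines) {
		return ""
	}
	end := len(lines)
	if limit > 0 && start+limit < end {
		end = start + limit
	}
	return strings.Join(lines[start:end], "\n")
}

func main() {
	text := "Hello, World!\nLine 2\nLine 3"
	fmt.Println(pageLines(text, 2, 1)) // one line, starting at line 2
	fmt.Println(pageLines(text, 0, 0)) // offset clamped to 1, no limit
}
```

A real Backend may treat out-of-range offsets differently (for example by returning an error), so check the concrete implementation before relying on this behavior.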
+### Response types
+
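+| Type | Fields | Description |
+| --- | --- | --- |
+| `FileInfo` | `Path string`, `IsDir bool`, `Size int64`, `ModifiedAt string` | File or directory path; whether it is a directory; file size in bytes; last modification time (ISO 8601 format) |
+| `FileContent` | `Content string` | Plain-text content of the file |
+| `MultiFileContent` | `*FileContent`, `Parts []FileContentPart` | Embeds `FileContent`; multimodal output parts. `Parts` and `FileContent` are mutually exclusive: when `Parts` is non-empty, `FileContent` is ignored |
+| `FileContentPart` | `Type FileContentPartType`, `MIMEType string`, `Data []byte` | Content type (`"image"` or `"pdf"`); MIME type (e.g. "image/png"); raw binary data |
+| `GrepMatch` | `Content string`, `Path string`, `Line int` | Content of the matched line; file path; 1-based line number |
+| `ExecuteResponse` | `Output string`, `ExitCode *int`, `Truncated bool` | Command output; exit code (a pointer, may be nil); whether the output was truncated |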
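
The rule that `Parts` takes precedence over the embedded `FileContent` can be sketched from the consumer side. The types below are local stand-ins that mirror the field lists in the table, not the eino types themselves, and `describe` is a hypothetical consumer written only for this illustration.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented fields (not the eino types).
type FileContentPartType string

type FileContent struct{ Content string }

type FileContentPart struct {
	Type     FileContentPartType
	MIMEType string
	Data     []byte
}

type MultiFileContent struct {
	*FileContent
	Parts []FileContentPart
}

// describe is a hypothetical consumer: non-empty Parts win, and the embedded
// FileContent is consulted only when there are no parts.
func describe(m *MultiFileContent) string {
	if len(m.Parts) > 0 {
		return fmt.Sprintf("%d part(s), first is %s", len(m.Parts), m.Parts[0].MIMEType)
	}
	if m.FileContent != nil {
		return "text: " + m.Content
	}
	return "empty"
}

func main() {
	text := &MultiFileContent{FileContent: &FileContent{Content: "hello"}}
	img := &MultiFileContent{
		FileContent: &FileContent{Content: "ignored"},
		Parts:       []FileContentPart{{Type: "image", MIMEType: "image/png", Data: []byte{0x89}}},
	}
	fmt.Println(describe(text))
	fmt.Println(describe(img))
}
```

Checking the embedded pointer for nil before reading `Content` matters here, because a multimodal result may carry only `Parts`.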
    + +### 常量 + +```go +type FileContentPartType string + +const ( + FileContentPartTypeImage FileContentPartType = "image" + FileContentPartTypePDF FileContentPartType = "pdf" +) +``` + ## 内置实现:InMemoryBackend -`InMemoryBackend` 是框架内置的 Backend 实现,将文件存储在内存 map 中,主要用于: +`InMemoryBackend` 将文件存储在内存 map 中,主要用于: -- **单元测试**:无需真实文件系统即可测试 Agent 和 Middleware 的文件操作逻辑 -- **轻量场景**:不需要持久化的临时文件操作 -- **工具结果卸载**:Filesystem Middleware 的大型工具结果卸载功能默认使用 InMemoryBackend 存储 +- **单元测试** — 无需真实文件系统即可测试 Agent/Middleware 的文件操作逻辑 +- **轻量场景** — 不需要持久化的临时文件操作 +- **工具结果卸载** — Filesystem Middleware 的大型工具结果卸载功能默认使用 InMemoryBackend + +### 构造函数 ```go -import "github.com/cloudwego/eino/adk/filesystem" +func NewInMemoryBackend() *InMemoryBackend +``` -ctx := context.Background() +零参数构造,返回空的内存文件系统。 + +### 使用示例 + +```go backend := filesystem.NewInMemoryBackend() +ctx := context.Background() -// 写入文件 -err := backend.Write(ctx, &filesystem.WriteRequest{ +// 写入 +_ = backend.Write(ctx, &filesystem.WriteRequest{ FilePath: "/example/test.txt", Content: "Hello, World!\nLine 2\nLine 3", }) -// 读取文件(支持分页) -content, err := backend.Read(ctx, &filesystem.ReadRequest{ +// 读取(分页) +content, _ := backend.Read(ctx, &filesystem.ReadRequest{ FilePath: "/example/test.txt", Offset: 1, Limit: 10, }) -// 列出目录 -files, err := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/example", -}) +// 列目录 +files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{Path: "/example"}) -// 搜索内容(支持正则) -matches, err := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Pattern: "Hello", - Path: "/example", +// 搜索(正则) +matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ + Pattern: "Hello", + Path: "/example", + CaseInsensitive: true, }) -// 编辑文件 -err = backend.Edit(ctx, &filesystem.EditRequest{ +// 编辑 +_ = backend.Edit(ctx, &filesystem.EditRequest{ FilePath: "/example/test.txt", OldString: "Hello", NewString: "Hi", @@ -127,18 +164,22 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ }) ``` -特性: +### 实现特性 -- 线程安全(基于 
`sync.RWMutex`) -- GrepRaw 支持正则匹配、大小写不敏感、上下文行数等高级选项 -- GrepRaw 内部采用并行处理(最多 10 个 worker) +- **线程安全** — 基于 `sync.RWMutex`,读操作使用读锁,写操作使用写锁 +- **GrepRaw 并行处理** — 多文件搜索时最多启动 10 个 worker 并行匹配 +- **正则支持** — 支持完整正则、大小写不敏感 (`(?i)` 前缀)、多行模式 +- **上下文行** — GrepRaw 支持 BeforeLines/AfterLines 显示匹配行前后的上下文 +- **Glob 匹配** — 使用 `doublestar` 库支持 `**` 递归匹配 +- **FileType 映射** — 内置 70+ 种文件类型到扩展名的映射表(go、py、ts、rust 等) +- **不实现 Shell** — InMemoryBackend 不实现 Shell/StreamingShell 接口 ## 外部实现 以下 Backend 实现位于 [eino-ext](https://github.com/cloudwego/eino-ext) 仓库: -- **Local Backend** — 本地文件系统实现,直接操作本机磁盘,零配置开箱即用 -- **Ark Agentkit Sandbox Backend** — 火山引擎 Agentkit 远程沙箱实现,在隔离的云端环境中执行文件操作 +- **Local Backend** (`github.com/cloudwego/eino-ext/adk/backend/local`) — 本地文件系统实现,直接操作本机磁盘 +- **Ark Agentkit Sandbox** (`github.com/cloudwego/eino-ext/adk/backend/agentkit`) — 火山引擎 Agentkit 远程沙箱实现 ### 实现对比 @@ -148,18 +189,17 @@ err = backend.Edit(ctx, &filesystem.EditRequest{ 网络依赖无无需要 配置复杂度零配置零配置需要凭证 持久化否是是 -Shell 支持否支持(含流式)支持 -适用场景测试/临时开发/本地环境多租户/生产环境 +Shell 支持否Shell + StreamingShellShell +MultiModalReader否视实现而定视实现而定 +适用场景测试 / 临时存储开发 / 本地环境多租户 / 生产环境 ## 自定义实现 -如需对接自定义存储(如 OSS、数据库等),只需实现 `Backend` 接口即可: +实现 `Backend` 接口即可对接自定义存储。如需命令执行,额外实现 `Shell` 或 `StreamingShell`;如需多模态读取,实现 `MultiModalReader`。 ```go -type MyBackend struct { - // ... -} +type MyBackend struct { /* ... */ } func (b *MyBackend) LsInfo(ctx context.Context, req *filesystem.LsInfoRequest) ([]filesystem.FileInfo, error) { // 自定义实现 @@ -169,7 +209,29 @@ func (b *MyBackend) Read(ctx context.Context, req *filesystem.ReadRequest) (*fil // 自定义实现 } -// ... 
实现其余方法 -``` +func (b *MyBackend) GrepRaw(ctx context.Context, req *filesystem.GrepRequest) ([]filesystem.GrepMatch, error) { + // 自定义实现 +} + +func (b *MyBackend) GlobInfo(ctx context.Context, req *filesystem.GlobInfoRequest) ([]filesystem.FileInfo, error) { + // 自定义实现 +} -如果需要支持命令执行,还可以额外实现 `Shell` 或 `StreamingShell` 接口。 +func (b *MyBackend) Write(ctx context.Context, req *filesystem.WriteRequest) error { + // 自定义实现 +} + +func (b *MyBackend) Edit(ctx context.Context, req *filesystem.EditRequest) error { + // 自定义实现 +} + +// 可选:实现 Shell +func (b *MyBackend) Execute(ctx context.Context, input *filesystem.ExecuteRequest) (*filesystem.ExecuteResponse, error) { + // 自定义实现 +} + +// 可选:实现 MultiModalReader +func (b *MyBackend) MultiModalRead(ctx context.Context, req *filesystem.MultiModalReadRequest) (*filesystem.MultiFileContent, error) { + // 自定义实现 +} +``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md index 47767e5af6c..956755f238f 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_ark_agentkit_sandbox.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: Ark Agentkit Sandbox diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md new file mode 100644 index 00000000000..9bdd0a6bb62 --- /dev/null +++ 
b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_local_filesystem.md @@ -0,0 +1,201 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: 本地文件系统 +weight: 2 +--- + +## Local Backend + +**Package**: `github.com/cloudwego/eino-ext/adk/backend/local` + +> 💡 +> eino v0.8.0+ 需使用 local backend v0.2.1 及以上版本。 + +Local Backend 是 Eino ADK FileSystem 的本地实现,直接操作本机文件系统。实现了 `filesystem.Backend`(文件操作)和 `filesystem.StreamingShell`(流式命令执行)两个接口。 + +**核心特性**:零配置、原生性能、强制绝对路径、流式命令执行、可选命令验证。 + +--- + +## 安装 + +```bash +go get github.com/cloudwego/eino-ext/adk/backend/local +``` + +## 配置 + +```go +type Config struct { + // 可选:命令验证函数,用于 ExecuteStreaming 的安全控制。 + // 返回 non-nil error 时拒绝执行。 + ValidateCommand func(string) error +} +``` + +## 快速开始 + +```go +backend, err := local.NewBackend(ctx, &local.Config{}) + +// 写入文件(必须绝对路径;文件已存在则覆盖) +err = backend.Write(ctx, &filesystem.WriteRequest{ + FilePath: "/tmp/hello.txt", + Content: "Hello, Local Backend!", +}) + +// 读取文件(支持行级分页) +fc, err := backend.Read(ctx, &filesystem.ReadRequest{ + FilePath: "/tmp/hello.txt", + Offset: 1, // 起始行号(1-based) + Limit: 50, // 最大行数,0 表示全部 +}) +``` + +### 与 Agent 集成 + +```go +import ( + "github.com/cloudwego/eino/adk" + fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" + "github.com/cloudwego/eino-ext/adk/backend/local" +) + +backend, _ := local.NewBackend(ctx, &local.Config{}) + +middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ + Backend: backend, // 必填:注册 ls/read/write/edit/glob/grep 工具 + StreamingShell: backend, // 可选:注册流式 execute 工具 +}) + +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{middleware}, +}) +``` + +> 💡 +> 中间件 Config 中 `Shell` 与 `StreamingShell` 互斥。Local Backend 仅实现 `StreamingShell`(流式命令执行),不实现非流式 `Shell`。 + +--- + +## 实现的接口与方法 + +### filesystem.Backend + + + + + + + + + +
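+| Method | Signature | Description |
+| --- | --- | --- |
+| `LsInfo` | `(ctx, *LsInfoRequest) ([]FileInfo, error)` | Lists directory contents |
+| `Read` | `(ctx, *ReadRequest) (*FileContent, error)` | Reads a file with line-based paging (`Offset` is 1-based, `Limit` 0 = all) |
+| `Write` | `(ctx, *WriteRequest) error` | Writes a file; creates parent directories automatically; overwrites the file if it already exists |
+| `Edit` | `(ctx, *EditRequest) error` | String replacement; supports `ReplaceAll`; returns an error when `OldString` is not unique (unless `ReplaceAll` is set) |
+| `GrepRaw` | `(ctx, *GrepRequest) ([]GrepMatch, error)` | Search backed by ripgrep with full regex syntax; supports case-insensitive matching, multiline matching, and context lines |
+| `GlobInfo` | `(ctx, *GlobInfoRequest) ([]FileInfo, error)` | Matches files by glob pattern; supports `*` / `**` / `?` / `[abc]` |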
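
The uniqueness rule for `Edit` (a non-empty `OldString` that must occur exactly once unless `ReplaceAll` is set) can be sketched on a plain string. `applyEdit` is a hypothetical helper, not the Local Backend's code; treating a missing `OldString` as an error is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// applyEdit is a hypothetical helper illustrating the documented Edit rule.
// oldStr must be non-empty; without replaceAll it must occur exactly once.
// Returning an error when oldStr is absent is an assumption of this sketch.
func applyEdit(content, oldStr, newStr string, replaceAll bool) (string, error) {
	if oldStr == "" {
		return "", errors.New("old string must not be empty")
	}
	n := strings.Count(content, oldStr)
	if n == 0 {
		return "", errors.New("old string not found")
	}
	if replaceAll {
		return strings.ReplaceAll(content, oldStr, newStr), nil
	}
	if n > 1 {
		return "", fmt.Errorf("old string occurs %d times; set ReplaceAll or use a longer string", n)
	}
	return strings.Replace(content, oldStr, newStr, 1), nil
}

func main() {
	src := "Hello, World!\nHello again"
	out, err := applyEdit(src, "Hello", "Hi", false) // rejected: two occurrences
	fmt.Printf("%q %v\n", out, err)
	out, err = applyEdit(src, "Hello", "Hi", true) // replaces both
	fmt.Printf("%q %v\n", out, err)
}
```

Requiring a unique match in the default mode keeps an Agent from silently rewriting the wrong occurrence; a longer `OldString` with surrounding context usually resolves the ambiguity.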
    + +### filesystem.StreamingShell + + + + +
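+| Method | Signature | Description |
+| --- | --- | --- |
+| `ExecuteStreaming` | `(ctx, *ExecuteRequest) (*StreamReader[*ExecuteResponse], error)` | Runs a shell command with streamed, real-time output; supports background execution (`RunInBackendGround`) |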
+
+---
+
+## Usage examples
+
+### Searching content (regex)
+
+```go
+matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{
+    Path:            "/home/user/project",
+    Pattern:         "TODO|FIXME", // ripgrep regex syntax
+    Glob:            "*.go",
+    CaseInsensitive: true,
+})
+```
+
+### Editing a file
+
+```go
+_ = backend.Edit(ctx, &filesystem.EditRequest{
+    FilePath:   "/tmp/file.txt",
+    OldString:  "old text",
+    NewString:  "new text",
+    ReplaceAll: true,
+})
+```
+
+### Streaming command execution
+
+```go
+reader, err := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{
+    Command: "tail -f /var/log/app.log",
+})
+if err != nil {
+    return err
+}
+defer reader.Close()
+for {
+    resp, err := reader.Recv()
+    if errors.Is(err, io.EOF) {
+        break
+    }
+    if err != nil {
+        return err // resp is not valid on a non-EOF error
+    }
+    fmt.Print(resp.Output)
+}
+```
+
+### With command validation
+
+```go
+backend, _ := local.NewBackend(ctx, &local.Config{
+    ValidateCommand: func(cmd string) error {
+        allowed := map[string]bool{"ls": true, "cat": true, "grep": true}
+        parts := strings.Fields(cmd)
+        if len(parts) == 0 {
+            // Check for an empty command first; indexing parts[0] would panic.
+            return fmt.Errorf("empty command")
+        }
+        if !allowed[parts[0]] {
+            return fmt.Errorf("command not allowed: %s", parts[0])
+        }
+        return nil
+    },
+})
+```
+
+---
+
+## Path requirements
+
+All file paths must be absolute (starting with `/`). A relative path can be converted with `filepath.Abs()`.
+
+---
+
+## Comparison with the Agentkit Backend
+
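+| Feature | Local | Agentkit |
+| --- | --- | --- |
+| Execution model | Direct, local | Remote sandbox |
+| Network dependency | None | Required |
+| Configuration complexity | Zero configuration | Credentials required |
+| Security model | OS permissions + ValidateCommand | Isolated sandbox |
+| Streaming output | Supported (StreamingShell) | Not supported |
+| Platform support | Unix/Linux/macOS | Any |
+| Typical use | Development / local environments | Multi-tenant / production environments |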
    + +--- + +## FAQ + +**Q: GrepRaw 支持正则吗?** + +A: 支持。底层使用 ripgrep(`rg`),支持完整正则语法。系统需安装 ripgrep,否则报错 `ripgrep (rg) is not installed or not in PATH`。安装方式见 [https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation) 。 + +**Q: Write 是创建还是覆盖?** + +A: 覆盖。`Write` 使用 `O_CREATE|O_TRUNC` 标志,文件已存在则覆盖内容,不存在则创建(含自动创建父目录)。 + +**Q: Windows 支持吗?** + +A: 不支持。`ExecuteStreaming` 依赖 `/bin/sh`。文件操作本身可在任意平台运行,但命令执行仅限 Unix 系。 + +**Q: Local Backend 支持非流式 Execute 吗?** + +A: 不支持。Local 仅实现 `StreamingShell`(`ExecuteStreaming`),未实现 `Shell`(`Execute`)。中间件 Config 中 `Shell` 与 `StreamingShell` 互斥,选其一即可。 diff --git "a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" "b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" deleted file mode 100644 index 00a3dff1644..00000000000 --- "a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/filesystem_backend/backend_\346\234\254\345\234\260\346\226\207\344\273\266\347\263\273\347\273\237.md" +++ /dev/null @@ -1,231 +0,0 @@ ---- -Description: "" -date: "2026-03-24" -lastmod: "" -tags: [] -title: 本地文件系统 -weight: 2 ---- - -## Local Backend - -Package: `github.com/cloudwego/eino-ext/adk/backend/local` - -注意:如果 eino 版本是 v0.8.0 及以上,需要使用 local backend 的 [adk/backend/local/v0.2.1](https://github.com/cloudwego/eino-ext/releases/tag/adk%2Fbackend%2Flocal%2Fv0.2.1) 版本。 - -### 概述 - -Local Backend 是 EINO ADK FileSystem 的本地文件系统实现,直接操作本机文件系统,提供原生性能和零配置体验。 - -#### 核心特性 - -- 零配置 - 开箱即用 -- 原生性能 - 直接文件系统访问,无网络开销 -- 路径安全 - 强制使用绝对路径 -- 流式执行 - 支持命令输出实时流 -- 命令验证 - 可选的安全验证钩子 - -### 安装 - -```bash -go get github.com/cloudwego/eino-ext/adk/backend/local -``` - -### 配置 - -```go -type Config struct { - // 可选: 命令验证函数,用于 Execute() 安全控制 - ValidateCommand 
func(string) error -} -``` - -### 快速开始 - -#### 基本用法 - -```go -import ( - "context" - - "github.com/cloudwego/eino-ext/adk/backend/local" - "github.com/cloudwego/eino/adk/filesystem" -) - -func main() { - ctx := context.Background() - - backend, err := local.NewBackend(ctx, &local.Config{}) - if err != nil { - panic(err) - } - - // 写入文件(必须是绝对路径) - err = backend.Write(ctx, &filesystem.WriteRequest{ - FilePath: "/tmp/hello.txt", - Content: "Hello, Local Backend!", - }) - - // 读取文件 - fcontent, err := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/tmp/hello.txt", - }) - fmt.Println(fcontent.Content) -} -``` - -#### 带命令验证 - -```go -func validateCommand(cmd string) error { - allowed := map[string]bool{"ls": true, "cat": true, "grep": true} - parts := strings.Fields(cmd) - if len(parts) == 0 || !allowed[parts[0]] { - return fmt.Errorf("command not allowed: %s", parts[0]) - } - return nil -} - -backend, _ := local.NewBackend(ctx, &local.Config{ - ValidateCommand: validateCommand, -}) -``` - -#### 与 Agent 集成 - -```go -import ( - "github.com/cloudwego/eino/adk" - fsMiddleware "github.com/cloudwego/eino/adk/middlewares/filesystem" -) - -// 创建 Backend -backend, _ := local.NewBackend(ctx, &local.Config{}) - -// 创建 Middleware -middleware, _ := fsMiddleware.New(ctx, &fsMiddleware.Config{ - Backend: backend, - StreamingShell: backend, -}) - -// 创建 Agent -agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "LocalFileAgent", - Description: "具有本地文件系统访问能力的 AI Agent", - Model: chatModel, - Handlers: []adk.ChatModelAgentMiddleware{middleware}, -}) -``` - -### API 参考 - - - - - - - - - - - -
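-| Method | Description |
-| --- | --- |
-| `LsInfo` | Lists directory contents |
-| `Read` | Reads file content (paging supported, 200 lines by default) |
-| `Write` | Creates a new file (returns an error if it already exists) |
-| `Edit` | Replaces file content |
-| `GrepRaw` | Searches file content (literal match) |
-| `GlobInfo` | Finds files by pattern |
-| `Execute` | Runs a shell command |
-| `ExecuteStreaming` | Runs a command with streamed output |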
    - -#### 示例 - -```go -// 列出目录 -files, _ := backend.LsInfo(ctx, &filesystem.LsInfoRequest{ - Path: "/home/user", -}) - -// 读取文件(分页) -fcontent, _ := backend.Read(ctx, &filesystem.ReadRequest{ - FilePath: "/path/to/file.txt", - Offset: 0, - Limit: 50, -}) - -// 搜索内容(字面量匹配,非正则) -matches, _ := backend.GrepRaw(ctx, &filesystem.GrepRequest{ - Path: "/home/user/project", - Pattern: "TODO", - Glob: "*.go", -}) - -// 查找文件 -files, _ := backend.GlobInfo(ctx, &filesystem.GlobInfoRequest{ - Path: "/home/user", - Pattern: "**/*.go", -}) - -// 编辑文件 -backend.Edit(ctx, &filesystem.EditRequest{ - FilePath: "/tmp/file.txt", - OldString: "old", - NewString: "new", - ReplaceAll: true, -}) - -// 执行命令 -result, _ := backend.Execute(ctx, &filesystem.ExecuteRequest{ - Command: "ls -la /tmp", -}) - -// 流式执行 -reader, _ := backend.ExecuteStreaming(ctx, &filesystem.ExecuteRequest{ - Command: "tail -f /var/log/app.log", -}) -for { - resp, err := reader.Recv() - if err == io.EOF { - break - } - fmt.Print(resp.Stdout) -} -``` - -### 路径要求 - -所有路径必须是绝对路径(以 `/` 开头): - -```go -// 正确 -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "/home/user/file.txt"}) - -// 错误 -backend.Read(ctx, &filesystem.ReadRequest{FilePath: "./file.txt"}) -``` - -转换相对路径: - -```go -absPath, _ := filepath.Abs("./relative/path") -``` - -### 与 Agentkit Backend 对比 - - - - - - - - - - -
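-| Feature | Local | Agentkit |
-| --- | --- | --- |
-| Execution model | Direct, local | Remote sandbox |
-| Network dependency | None | Required |
-| Configuration complexity | Zero configuration | Credentials required |
-| Security model | OS permissions | Isolated sandbox |
-| Streaming output | Supported | Not supported |
-| Platform support | Unix/Linux/macOS | Any |
-| Typical use | Development / local environments | Multi-tenant / production environments |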
    - -### 常见问题 - -**Q: 为什么运行 grep 命令报错 ripgrep (rg) is not installed or not in PATH. Please install it: ****[https://github.com/BurntSushi/ripgrep#installation](https://github.com/BurntSushi/ripgrep#installation)** - -local 的 Grep 命令默认依赖** ripgrep **指令,如系统没有预装 ripgrep 则需要通过文档安装 ripgrep - -**Q: GrepRaw 支持正则吗?** - -支持正则匹配,GrepRaw 底层使用的是 ripgrep 命令做的 Grep 操作 - -**Q: Windows 支持吗?** - -不支持,依赖 `/bin/sh`。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md index c4edcf4e9d1..ce12d13f1c4 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_agentsmd.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: AgentsMD @@ -9,51 +9,27 @@ weight: 9 ## 概述 -`agentsmd` 是 Eino ADK 提供的一个中间件,用于在每次模型调用时**自动将 Agents.md 文件内容注入到模型输入消息中**。注入是瞬态的——内容在模型调用时动态添加,不会持久化到会话状态中,因此**不会被摘要/压缩中间件处理**。 - -**核心价值**:通过 Agents.md 文件为 Agent 定义系统级的行为指令和上下文信息(类似 Claude Code 的 CLAUDE.md),无需手动管理 system prompt 的拼接。 - -**包路径**:`github.com/cloudwego/eino/adk/middlewares/agentsmd` - ---- +`agentsmd` 是 Eino ADK 的中间件,在每次模型调用时**自动将 Agents.md 文件内容注入到消息序列中**。注入的消息会被框架持久化到 agent 内部状态,但通过**幂等性检查**(`Extra["__agentsmd_content__"]` 标记)确保不会重复注入。由于注入内容在首次出现时即固定,**不会随后续摘要/压缩而变化**。**核心价值**:通过 Agents.md 文件为 Agent 定义系统级行为指令与上下文(类似 Claude Code 的 CLAUDE.md),无需手动管理 system prompt 拼接。**包路径**:`github.com/cloudwego/eino/adk/middlewares/agentsmd` ## 快速开始 -### 最小化示例 - ```go -package main - -import ( - "context" - "fmt" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/adk/middlewares/agentsmd" -) - -func main() { - ctx := context.Background() - - // 1. 准备 Backend(文件读取后端) - backend := NewLocalFileBackend("/path/to/project") - - // 2. 
创建 agentsmd 中间件 - mw, err := agentsmd.New(ctx, &agentsmd.Config{ - Backend: backend, - AgentsMDFiles: []string{"/home/user/project/agents.md"}, - }) - if err != nil { - panic(err) - } - - // 3. 将中间件配置到 Agent - // agent := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // Middlewares: []adk.ChatModelAgentMiddleware{mw}, - // }) - _ = mw - fmt.Println("agentsmd middleware created successfully") +ctx := context.Background() + +// 1. 创建 agentsmd 中间件 +mw, err := agentsmd.New(ctx, &agentsmd.Config{ + Backend: myBackend, // 实现 agentsmd.Backend 接口 + AgentsMDFiles: []string{"/project/agents.md"}, +}) +if err != nil { + panic(err) } + +// 2. 配置到 Agent +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: chatModel, + Handlers: []adk.ChatModelAgentMiddleware{mw}, +}) ``` --- @@ -64,84 +40,86 @@ func main() { ```go type Config struct { - // Backend 提供文件访问能力,用于加载 Agents.md 文件。 - // 可以使用本地文件系统、远程存储或任何其他后端实现。 - // 必填。 - Backend Backend - - // AgentsMDFiles 指定要加载的 Agents.md 文件路径的有序列表。 - // 文件按照给定顺序加载和注入。 - // 文件内部支持 @import 语法进行递归引入(最大深度 5)。 - AgentsMDFiles []string - - // AllAgentsMDMaxBytes 限制所有加载的 Agents.md 内容的总字节大小。 - // 文件按顺序加载;一旦累计大小超过此限制,剩余文件将被跳过。 - // 每个单独的文件始终完整加载。 - // 0 表示无限制。 + Backend Backend + AgentsMDFiles []string AllAgentsMDMaxBytes int - - // OnLoadWarning 是一个可选的回调函数,在加载过程中发生非致命错误时调用 - // (如文件未找到、循环 @import、深度超限等)。 - // 如果为 nil,警告通过 log.Printf 输出。 - // - // 注意:Backend.Read 的非 os.ErrNotExist 错误(如权限被拒、I/O 错误) - // 不会被视为警告,而是会中止加载过程。 - OnLoadWarning func(filePath string, err error) + OnLoadWarning func(filePath string, err error) } ``` -### 配置参数说明 +### 参数说明 - - - - + + + +
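 | Parameter | Type | Required | Default | Description |
 | --- | --- | --- | --- | --- |
-| `Backend` | `Backend` | Yes | - | File-reading backend that performs the actual file I/O |
-| `AgentsMDFiles` | `[]string` | Yes | - | List of Agents.md file paths to load (at least one) |
-| `AllAgentsMDMaxBytes` | `int` | No | `0` (unlimited) | Upper bound on the total bytes of all files |
-| `OnLoadWarning` | `func(string, error)` | No | `log.Printf` | Callback for non-fatal errors |
+| `Backend` | `Backend` | Yes | | File-reading backend that performs the actual file I/O |
+| `AgentsMDFiles` | `[]string` | Yes | | List of Agents.md file paths to load (at least one); loaded and injected in order |
+| `AllAgentsMDMaxBytes` | `int` | No | `0` (unlimited) | Upper bound on the total bytes of all files; once exceeded, later files are skipped, but each individual file is always loaded in full |
+| `OnLoadWarning` | `func(string, error)` | No | `log.Printf` | Callback for non-fatal errors (missing file, circular @import, depth limit exceeded, etc.) |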
    +### 校验规则 + +`New` / `NewTyped` 在创建时会校验 Config: + +- `Config` 不能为 nil +- `Backend` 不能为 nil +- `AgentsMDFiles` 至少包含一个路径 +- `AllAgentsMDMaxBytes` 不能为负数 + --- +## 构造函数 + +### New — 标准构造 + +```go +func New(ctx context.Context, cfg *Config) (adk.ChatModelAgentMiddleware, error) +``` + +返回 `ChatModelAgentMiddleware`(即 `TypedChatModelAgentMiddleware[*schema.Message]`),适用于标准 `ChatModelAgent`。 + +### NewTyped — 泛型构造 + +```go +func NewTyped[M adk.MessageType](_ context.Context, cfg *Config) (adk.TypedChatModelAgentMiddleware[M], error) +``` + +泛型版本,支持 `*schema.Message` 和 `*schema.AgenticMessage` 两种消息类型。`New` 内部调用 `NewTyped[*schema.Message]`。 + ## Backend 接口 ### 接口定义 ```go type Backend interface { - // Read 读取文件内容。 - // 如果文件不存在,实现应返回包装了 os.ErrNotExist 的 error - // (以便 errors.Is(err, os.ErrNotExist) 返回 true)。 - // 这样 loader 可以静默跳过缺失文件并通过 OnLoadWarning 通知。 - // 其他错误(如权限被拒、I/O 错误)会中止加载过程。 Read(ctx context.Context, req *ReadRequest) (*FileContent, error) } ``` ### 类型定义 -```go -// ReadRequest 定义读取文件的请求参数 -type ReadRequest struct { - FilePath string // 文件路径 - Offset int // 起始行号(1-based) -} +`ReadRequest` 和 `FileContent` 是 `github.com/cloudwego/eino/adk/filesystem` 包中同名类型的别名: -// FileContent 定义文件内容的返回结构 -type FileContent struct { - Content string // 文件的文本内容 -} +```go +type ReadRequest = filesystem.ReadRequest +type FileContent = filesystem.FileContent ``` +> 💡 +> **Backend 实现要求** +> +> - 文件不存在时**必须**返回包裹 `os.ErrNotExist` 的错误(使 `errors.Is(err, os.ErrNotExist)` 为 `true`),loader 据此区分"文件缺失"和"真正的 I/O 错误" +> - 其他错误(权限被拒、I/O 错误)会**中止整个加载过程**,不视为警告 +> - `Read` 方法应当是并发安全的 + --- ## @import 语法 -Agents.md 文件支持 `@import` 语法,可以递归引入其他文件。 +Agents.md 文件支持 `@路径` 语法递归引入其他文件。 ### 语法格式 -在 Agents.md 文件中,使用 `@路径/文件名` 引用其他文件: - ```markdown # 项目指令 @@ -152,68 +130,66 @@ Agents.md 文件支持 `@import` 语法,可以递归引入其他文件。 @rules/api-conventions.md ``` -### 规则 +### 匹配规则 + +loader 使用正则 `@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)` 扫描文件内容,并结合以下过滤逻辑: + +- **含 / 的路径**:直接视为 @import(如 `@rules/style.md`) +- **不含 / 
的路径**:仅当扩展名在允许列表内时视为 @import,否则忽略**允许的扩展名**:`.md`、`.txt`、`.mdx`、`.yaml`、`.yml`、`.json`、`.toml` 这一设计避免将 `@someone`、`@example.com` 等误识为导入目标。 + +### 解析行为 -1. **路径解析**:相对路径基于当前文件所在目录解析,绝对路径直接使用 -2. **最大递归深度**:5 层(超过后跳过并触发 `OnLoadWarning`) -3. **循环引用检测**:自动检测并跳过循环引用(触发 `OnLoadWarning`) -4. **全局去重**:同一文件不会被重复加载 -5. **支持的文件扩展名**(路径中不含 `/` 时):`.md`, `.txt`, `.mdx`, `.yaml`, `.yml`, `.json`, `.toml` -6. **误报过滤**:不含 `/` 且扩展名不在允许列表中的 `@引用` 会被忽略(避免将 `@someone` 或 `@example.com` 识别为导入) + + + + + + + + +
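+| Rule | Description |
+| --- | --- |
+| Path resolution | Relative paths are resolved against the directory of the current file; absolute paths are used as-is |
+| Maximum recursion depth | 5 levels (deeper imports are skipped and trigger `OnLoadWarning`) |
+| Cycle detection | A path that already appears in the current ancestor chain is skipped (triggers `OnLoadWarning`) |
+| Global de-duplication | Within one load, a given file path is read and injected only once |
+| Original text preserved | Files pulled in by @import are appended as separate sections; the `@path` text in the original is not removed |
+| Byte budget | Once the accumulated byte count exceeds `AllAgentsMDMaxBytes`, later imports are skipped |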
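
The matching and path-resolution rules can be sketched together. The regular expression and the extension allow-list are taken from the text; `findImports` itself is a hypothetical helper, and the sketch leaves out de-duplication, cycle detection, the depth limit, the byte budget, and any special handling of `~`.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
	"strings"
)

// importRe is the pattern quoted in the matching rules.
var importRe = regexp.MustCompile(`@([a-zA-Z0-9_.~/][a-zA-Z0-9_.~/\-]*)`)

// allowedExt lists the extensions accepted for references without a slash.
var allowedExt = map[string]bool{
	".md": true, ".txt": true, ".mdx": true, ".yaml": true,
	".yml": true, ".json": true, ".toml": true,
}

// findImports is a hypothetical helper: it returns the import targets found
// in content, resolved against the directory of currentFile.
func findImports(currentFile, content string) []string {
	var out []string
	for _, m := range importRe.FindAllStringSubmatch(content, -1) {
		ref := m[1]
		// A reference without "/" counts only if its extension is allowed,
		// which filters out mentions such as @someone or @example.com.
		if !strings.Contains(ref, "/") && !allowedExt[path.Ext(ref)] {
			continue
		}
		if !path.IsAbs(ref) {
			ref = path.Join(path.Dir(currentFile), ref)
		}
		out = append(out, ref)
	}
	return out
}

func main() {
	content := "See @rules/style.md and @notes.md\nping @someone or mail a@example.com"
	for _, p := range findImports("/project/Agents.md", content) {
		fmt.Println(p)
	}
}
```

Because `.` is part of the character class, a reference followed directly by sentence punctuation (for example `@notes.md.`) would capture the trailing dot, so references are best kept on their own line.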
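-### @import directory structure example
+### Directory structure example
 
 ```
 project/
 ├── Agents.md                 # main entry file
 ├── rules/
-│   ├── code-style.md         # code style rules
-│   ├── api-conventions.md    # API conventions
-│   └── testing.md            # testing rules
+│   ├── code-style.md         # @rules/code-style.md
+│   ├── api-conventions.md    # @rules/api-conventions.md
+│   └── testing.md
 └── context/
-    └── architecture.md       # architecture notes
+    └── architecture.md
 ```
 
 ---
 
 ## How it works
 
+### Implementation hook
+
+The middleware implements the `BeforeModelRewriteState` method of the `TypedChatModelAgentMiddleware` interface (**not** WrapModel). This hook fires before every model call, at the point where the state is rewritten.
+
 ### Injection flow
 
+### Message sequence after injection
+
 ```
-user message + history messages
-          │
-          ▼
-┌──────────────────────┐
-│ agentsmd middleware  │
-│ (WrapModel)          │
-│                      │
-│ 1. Load Agents.md    │
-│ 2. Cache in RunLocal │
-│ 3. Build the message │
-└──────────────────────┘
-          │
-          ▼
-┌────────────────────────────────────────┐
-│ Message sequence after injection       │
-│                                        │
-│ [System] system prompt                 │
-│ [User] ← injected Agents.md content    │ ← inserted before the first User message
-│ [User] earlier user message 1          │
-│ [Assistant] assistant reply 1          │
-│ [User] current user message            │
-└────────────────────────────────────────┘
-          │
-          ▼
-   model call (Generate / Stream)
+[System] system prompt
+[User] ← Agents.md content (carries the Extra marker)
+[User] earlier user message 1
+[Assistant] assistant reply 1
+[User] current user message
 ```
 
 ### Key mechanisms
 
-1. **Transient injection**: the Agents.md content is inserted only temporarily at model-call time and is not written to `ChatModelAgentState`, so summarization/compression middlewares never process it
-2. **Run-level cache**: within one Agent `Run()`, the loaded Agents.md content is cached in `RunLocalValue`; later model calls (such as multi-round tool calls) reuse the cache and avoid re-reading the files
-3. **Insertion position**: the content is inserted as a `User` role message before the first User message; if there is no User message, it is appended at the end
-4. **Internationalization**: the formatted output adapts to Chinese or English automatically (based on the system locale)
+**1. Persistent injection with an idempotency guarantee**
+
+The framework persists the state returned by `BeforeModelRewriteState` into the agent's internal state (`st.Messages = state.Messages`). The injected message is tagged with `Extra["__agentsmd_content__"]`. Each time the hook runs it first scans for this marker; if the marker is already present, the original state is returned unchanged, which prevents a second injection. In effect, the content is injected and persisted on the first model call, and later iterations do not insert it again.
+
+**2. Run-level cache**
+
+Within one `Run()`, the content loaded the first time is cached in RunLocal storage through `adk.SetRunLocalValue`. Later model calls (such as multi-round tool calls) reuse the cache through `adk.GetRunLocalValue`. Every new `Run()` reloads the files, so file changes take effect on the next Run.
+
+**3. Insertion position**
+
+The content is inserted as a `User` role message **before the first User message**. If the message sequence contains no User message, it is appended at the end.
+
+**4. Content formatting**
+
+The loaded file content is formatted before injection:
+
+- Wrapped in an outer `` tag
+- Includes an i18n header (telling the model to follow the instructions) and footer (noting that the context may not be relevant)
+- Each file is shown separately, prefixed with `文件内容:{路径}(指令):` (the Chinese variant; roughly "File content: {path} (instructions):")
+- The language (Chinese or English) is controlled globally through `adk.SetLanguage`
 
 ---
 
@@ -221,13 +197,11 @@ project/
 
 ### Middleware order
 
-**We recommend placing the agentsmd middleware after the summarization/compression middleware.** This ensures that the Agents.md content:
-
-- is not compressed away by the summarization middleware
-- is available in full on every model call
+> 💡
+> **We recommend placing the agentsmd middleware after the summarization/compression middleware.** That way the Agents.md content is not compressed by summarization, and every model call receives the full instructions.
 
 ```go
-Middlewares: []adk.ChatModelAgentMiddleware{
+Handlers: []adk.ChatModelAgentMiddleware{
     summarizationMiddleware, // summarize first
     agentsMDMiddleware,      // then inject Agents.md
 }
@@ -237,44 +211,51 @@ Middlewares: []adk.ChatModelAgentMiddleware{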
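 | Scenario | Behavior |
 | --- | --- |
-| File does not exist ( `os.ErrNotExist` ) | Skips the file and triggers `OnLoadWarning` |
-| Circular `@import` | Skips the circular file and triggers `OnLoadWarning` |
-| `@import` depth exceeds 5 levels | Skips it and triggers `OnLoadWarning` |
+| File does not exist (`os.ErrNotExist`) | Skips the file and triggers `OnLoadWarning` |
+| Circular @import | Skips the circular file and triggers `OnLoadWarning` |
+| @import depth exceeds 5 levels | Skips it and triggers `OnLoadWarning` |
 | Accumulated size exceeds `AllAgentsMDMaxBytes` | Skips the remaining files and triggers `OnLoadWarning` (the first file is always loaded in full) |
 | Permission denied / I/O error | Aborts loading and returns an error |
-| All files are empty | Nothing is injected; the input messages are passed through unchanged |
+| All files are empty | Nothing is injected; the messages are passed through unchanged |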
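
The split between "skip and warn" and "abort" depends on the Backend wrapping `os.ErrNotExist`, so that `errors.Is` can recognize a missing file. The sketch below shows that policy with stand-ins: `readFunc`, `mapReader`, and `loadAll` are hypothetical names reduced to path in and content out, not the agentsmd loader.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// readFunc is a stand-in for Backend.Read, reduced to path in / content out.
type readFunc func(path string) (string, error)

// mapReader builds a hypothetical in-memory readFunc. A missing path yields
// an error wrapping os.ErrNotExist; the denied path yields one wrapping
// os.ErrPermission.
func mapReader(files map[string]string, denied string) readFunc {
	return func(path string) (string, error) {
		if path == denied {
			return "", fmt.Errorf("read %s: %w", path, os.ErrPermission)
		}
		content, ok := files[path]
		if !ok {
			return "", fmt.Errorf("read %s: %w", path, os.ErrNotExist)
		}
		return content, nil
	}
}

// loadAll mimics the documented policy: a missing file is skipped and
// reported through warn; any other error aborts the whole load.
func loadAll(read readFunc, paths []string, warn func(string, error)) ([]string, error) {
	var out []string
	for _, p := range paths {
		content, err := read(p)
		if errors.Is(err, os.ErrNotExist) {
			warn(p, err)
			continue
		}
		if err != nil {
			return nil, err
		}
		out = append(out, content)
	}
	return out, nil
}

func main() {
	read := mapReader(map[string]string{"/p/agents.md": "rules"}, "/p/secret.md")
	warn := func(p string, err error) { fmt.Println("warning:", err) }

	got, err := loadAll(read, []string{"/p/missing.md", "/p/agents.md"}, warn)
	fmt.Println(got, err)

	_, err = loadAll(read, []string{"/p/secret.md"}, warn)
	fmt.Println(err)
}
```

A Backend that returns a plain error string for a missing file would fall into the abort branch, which is why the `%w` wrapping is required.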
    -### Backend 实现要求 - -- 文件不存在时**必须**返回 `os.ErrNotExist` 包裹的错误(`fmt.Errorf("... : %w", os.ErrNotExist)`),否则 loader 无法区分"文件缺失"和"真正的 I/O 错误" -- `Read` 方法应当是并发安全的 - ### 性能考虑 -- 合理设置 `AllAgentsMDMaxBytes`,避免注入过多内容占用模型上下文窗口 -- Agents.md 内容在每次 `Run()` 中只加载一次(Run 级别缓存),但**每次新的 ****Run()**** 都会重新加载**,因此文件内容的修改会在下次 Run 时生效 -- 避免在 Agents.md 中 `@import` 过多文件,递归深度上限为 5 层 +- 合理设置 `AllAgentsMDMaxBytes`,避免注入过多内容占用上下文窗口 +- Agents.md 内容在每次 `Run()` 中只加载一次(Run 级别缓存),但**每次新 Run() 都会重新加载** +- 避免 @import 过多文件,递归深度上限为 5 层 ### Agents.md 编写建议 - 保持内容精炼,只包含对模型行为真正有影响的指令 -- 使用 `@import` 拆分关注点(代码规范、API 规范、架构说明等) -- 避免在 Agents.md 中包含大量代码示例或数据,以免浪费上下文窗口 -- 文件内容会被包裹在 `` 标签中传递给模型,模型会将其视为系统级指令 +- 使用 @import 按关注点拆分(代码规范、API 规范、架构说明等) +- 避免包含大量代码示例或数据,以免浪费上下文窗口 +- 文件内容会被包裹在 `` 标签中传递给模型 --- ## FAQ **Q: Agents.md 的内容会被保存到对话历史中吗?** -A: 不会。内容是在模型调用时动态注入的,不会写入 `ChatModelAgentState`,因此对话历史中不会出现 Agents.md 的内容。 + +A: 会。`BeforeModelRewriteState` 返回的 state 会被框架持久化。但由于幂等性检查(`Extra["__agentsmd_content__"]` 标记),内容只在首次 model call 时注入一次,后续迭代直接跳过。建议将 agentsmd 放在 summarization 之后,避免注入内容被摘要压缩。 **Q: 如果某个 Agents.md 文件不存在会怎样?** -A: 该文件会被跳过,触发 `OnLoadWarning` 回调(默认 `log.Printf`),不会导致整体加载失败。 + +A: 该文件被跳过,触发 `OnLoadWarning` 回调(默认 `log.Printf`),不影响其他文件的加载。 **Q: @import 的路径是相对于什么目录?** -A: 相对于当前文件所在目录。例如 `/project/Agents.md` 中的 `@rules/style.md` 会解析为 `/project/rules/style.md`。 + +A: 相对于当前文件所在目录。例如 `/project/Agents.md` 中的 `@rules/style.md` 解析为 `/project/rules/style.md`。 **Q: 多个文件中 @import 了同一个文件会重复加载吗?** -A: 不会。loader 维护了全局去重 map,同一个文件路径只会被读取和注入一次。 + +A: 不会。loader 维护全局去重 map(`seen`),同一路径只会被读取和注入一次。 + +**Q: 原文中的 @path 引用会被替换掉吗?** + +A: 不会。@import 的文件作为独立段落追加在原文之后,原文内容保持不变。 + +**Q: New 和 NewTyped 有什么区别?** + +A: `New` 返回 `ChatModelAgentMiddleware`(即 `TypedChatModelAgentMiddleware[*schema.Message]`),适用于标准 Agent。`NewTyped` 是泛型版本,额外支持 `*schema.AgenticMessage` 类型,用于 Agentic Model 场景。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md 
b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md index e96a2709d8c..906affe379e 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md +++ b/content/zh/docs/eino/core_modules/eino_adk/Eino_ADK_ChatModelAgentMiddleware/middleware_filesystem.md @@ -1,187 +1,221 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: FileSystem weight: 2 --- -> 💡 Package: [github.com/cloudwego/eino/adk/middlewares/filesystem](https://github.com/cloudwego/eino/tree/main/adk/middlewares/filesystem) +FileSystem 中间件为 Agent 注入一组文件系统操作工具(ls、read\_file、write\_file、edit\_file、glob、grep)以及可选的命令执行工具(execute),使 Agent 具备与本地或远程文件系统交互的能力。 -## 概述 - -FileSystem Middleware 为 Agent 提供文件系统访问能力。它通过 [FileSystem Backend](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/filesystem_backend) 接口操作文件系统,自动向 Agent 注入一组文件操作工具及对应的 system prompt,使 Agent 能够直接进行文件读写、搜索、编辑等操作。 - -核心功能: - -- **文件系统工具注入** — 自动注册 ls、read_file、write_file、edit_file、glob、grep 等工具 -- **Shell 命令执行** — 可选注入 execute 工具,支持同步和流式命令执行 -- **工具级别配置** — 每个工具均可独立配置名称、描述、自定义实现或禁用 -- **多语言提示词** — 工具描述和 system prompt 支持中英文切换 +``` +import "github.com/cloudwego/eino/adk/middlewares/filesystem" +``` -## 创建中间件 +--- -推荐使用 `New` 函数创建中间件(返回 `ChatModelAgentMiddleware`): +## 快速开始 ```go -import "github.com/cloudwego/eino/adk/middlewares/filesystem" +import ( + "context" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/adk/middlewares/filesystem" +) +// 1. 创建 middleware middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - // 如果需要 shell 命令执行能力,设置 Shell 或 StreamingShell - Shell: myShell, + Backend: myBackend, // 实现 filesystem.Backend 接口 }) -if err != nil { - // handle error -} +// 2. 注入 Agent agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ // ... 
Middlewares: []adk.ChatModelAgentMiddleware{middleware}, }) ``` +--- + +## 构造函数 + + + + + +
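+| Function signature | Description |
+| --- | --- |
+| `New(ctx, *MiddlewareConfig) (ChatModelAgentMiddleware, error)` | Recommended. Returns a `ChatModelAgentMiddleware`, which can modify Instruction and Tools dynamically through the `BeforeAgent` hook. |
+| `NewTyped[M MessageType](ctx, *MiddlewareConfig) (TypedChatModelAgentMiddleware[M], error)` | Generic version; the type parameter `M` supports `*schema.Message` and `*schema.AgenticMessage`. `New` is equivalent to `NewTyped[*schema.Message]`. |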
    + > 💡 -> `New` 返回 `ChatModelAgentMiddleware`,提供更好的上下文传播能力(通过 `BeforeAgent` hook 在运行时修改 Agent 的 instruction 和 tools)。 +> **Deprecated**: `NewMiddleware(ctx, *Config) (AgentMiddleware, error)` 为旧版构造函数,新代码请使用 `New`。`NewMiddleware` 返回结构体 `AgentMiddleware`,缺少 `BeforeAgent` 钩子的灵活性;此外它默认启用「大结果卸载」功能(见下文),在 `New` 路径中该功能已被移除。 -## MiddlewareConfig 配置项 +--- -```go -type MiddlewareConfig struct { - // Backend 提供文件系统操作 - // 必填 - Backend filesystem.Backend - - // Shell 提供 shell 命令执行能力 - // 如果设置,会注册 execute 工具 - // 可选,与 StreamingShell 互斥 - Shell filesystem.Shell - - // StreamingShell 提供流式 shell 命令执行能力 - // 如果设置,会注册流式 execute 工具(支持实时输出) - // 可选,与 Shell 互斥 - StreamingShell filesystem.StreamingShell - - // 以下为各工具的独立配置,均为可选 - LsToolConfig *ToolConfig // ls 工具配置 - ReadFileToolConfig *ToolConfig // read_file 工具配置 - WriteFileToolConfig *ToolConfig // write_file 工具配置 - EditFileToolConfig *ToolConfig // edit_file 工具配置 - GlobToolConfig *ToolConfig // glob 工具配置 - GrepToolConfig *ToolConfig // grep 工具配置 - - // CustomSystemPrompt 覆盖默认的系统提示词 - // 可选,默认 ToolsSystemPrompt - CustomSystemPrompt *string - - // 以下字段已 Deprecated,请使用对应的 *ToolConfig.Desc 替代 - // CustomLsToolDesc, CustomReadFileToolDesc, CustomGrepToolDesc, - // CustomGlobToolDesc, CustomWriteFileToolDesc, CustomEditToolDesc -} -``` +## MiddlewareConfig + +`MiddlewareConfig` 是 `New` / `NewTyped` 使用的配置结构体。 -### ToolConfig +### 核心字段 -每个工具均可通过 `ToolConfig` 独立配置: + + + + + + + +
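+| Field | Type | Description |
+| --- | --- | --- |
+| `Backend` | `filesystem.Backend` | Required. Provides the file system operations that drive the six tools ls, read_file, write_file, edit_file, glob, and grep. The interface is defined in the `github.com/cloudwego/eino/adk/filesystem` package. |
+| `Shell` | `filesystem.Shell` | Optional. Provides command execution; when set, the `execute` tool is registered. Mutually exclusive with `StreamingShell`. |
+| `StreamingShell` | `filesystem.StreamingShell` | Optional. Provides streaming command execution; when set, the streaming `execute` tool is registered. Mutually exclusive with `Shell`. |
+| `UseMultiModalRead` | `bool` | Optional, default `false`. When enabled, the `read_file` tool becomes an `EnhancedInvokableTool` that can return multimodal content such as images and PDFs. The Backend must also implement the filesystem.MultiModalReader interface. |
+| `CustomSystemPrompt` | `*string` | Optional. Overrides the system prompt appended to the Agent Instruction. If `nil`, no system prompt is appended. |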
    + +### 工具配置字段 + +每个工具均有对应的 `*ToolConfig` 字段,用于自定义工具名称、描述、替换实现或禁用: + + + + + + + + + +
+| Field | Tool |
+| --- | --- |
+| `LsToolConfig` | ls |
+| `ReadFileToolConfig` | read_file |
+| `WriteFileToolConfig` | write_file |
+| `EditFileToolConfig` | edit_file |
+| `GlobToolConfig` | glob |
+| `GrepToolConfig` | grep |
    + +> `execute` 工具当前不支持通过 `ToolConfig` 自定义,其注册仅由 `Shell` / `StreamingShell` 是否设置来控制。 + +--- + +## ToolConfig ```go type ToolConfig struct { - // Name 覆盖工具名称 - // 可选,不设置则使用默认名称(如 "ls"、"read_file" 等) - Name string - - // Desc 覆盖工具描述 - // 可选,不设置则使用默认描述 - Desc *string - - // CustomTool 提供自定义工具实现 - // 如果设置,将使用此自定义实现替代基于 Backend 的默认实现 - // 可选 - CustomTool tool.BaseTool - - // Disable 禁用此工具 - // 如果为 true,该工具将不会被注册 - // 可选,默认 false - Disable bool + Name string // 覆盖工具名称,空串使用默认值 + Desc *string // 覆盖工具描述,nil 使用默认值 + CustomTool tool.BaseTool // 自定义工具实现,设置后替代 Backend 默认实现 + Disable bool // 设为 true 则不注册该工具 } ``` -示例 — 自定义工具名称并禁用写入: +**优先级**:`Disable=true` > `CustomTool` > Backend 默认实现。 + +--- + +## 工具名称常量 ```go -middleware, err := filesystem.New(ctx, &filesystem.MiddlewareConfig{ - Backend: myBackend, - ReadFileToolConfig: &filesystem.ToolConfig{ - Name: "cat_file", // 自定义名称 - }, - WriteFileToolConfig: &filesystem.ToolConfig{ - Disable: true, // 禁用写入工具 - }, -}) +const ( + ToolNameLs = "ls" + ToolNameReadFile = "read_file" + ToolNameWriteFile = "write_file" + ToolNameEditFile = "edit_file" + ToolNameGlob = "glob" + ToolNameGrep = "grep" + ToolNameExecute = "execute" +) ``` +--- + ## 注入的工具 - - - - - - - - + + + + + + + +
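-| Tool | Default name | Description | Condition |
-| --- | --- | --- | --- |
-| List directory | `ls` | Lists files and directories under the given path | Injected when Backend is not nil |
-| Read file | `read_file` | Reads file content, with line-based paging (offset + limit) | Injected when Backend is not nil |
-| Write file | `write_file` | Creates or overwrites a file | Injected when Backend is not nil |
-| Edit file | `edit_file` | Replaces a string in a file | Injected when Backend is not nil |
-| Glob search | `glob` | Finds files by glob pattern | Injected when Backend is not nil |
-| Content search | `grep` | Searches file content by pattern, with several output modes | Injected when Backend is not nil |
-| Command execution | `execute` | Runs a shell command | Requires Shell or StreamingShell |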
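+| Tool | Default name | Registration condition | Function |
+| --- | --- | --- | --- |
+| ls | `ls` | Backend ≠ nil | Lists the files and subdirectories in a directory |
+| read_file | `read_file` | Backend ≠ nil | Reads file content with offset/limit paging. With `UseMultiModalRead` enabled, it can also read images and PDFs |
+| write_file | `write_file` | Backend ≠ nil | Creates a file or overwrites an existing one |
+| edit_file | `edit_file` | Backend ≠ nil | Edits by exact string replacement; supports `replace_all` |
+| glob | `glob` | Backend ≠ nil | Matches file paths against a glob pattern |
+| grep | `grep` | Backend ≠ nil | Searches file content by regular expression, with several output modes and paging |
+| execute | `execute` | Shell ≠ nil or StreamingShell ≠ nil | Runs a shell command |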
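The registration rules in the table can be modelled in a few lines. The sketch below is a stdlib-only simplification, not the middleware's actual code: `toolConfig` and `registeredNames` are hypothetical names, and only the `Name` and `Disable` fields of `ToolConfig` are modelled (a `CustomTool` replacement is left out).

```go
package main

import "fmt"

// toolConfig is a trimmed, hypothetical copy of ToolConfig holding only the
// two fields this sketch needs.
type toolConfig struct {
	Name    string // overrides the default name when non-empty
	Disable bool   // true means the tool is not registered
}

// defaultTools lists the six Backend-driven tools in table order.
var defaultTools = []string{"ls", "read_file", "write_file", "edit_file", "glob", "grep"}

// registeredNames models the table: Backend-driven tools need a Backend,
// execute needs a Shell or StreamingShell, and Disable removes a tool.
func registeredNames(hasBackend, hasShell bool, cfgs map[string]toolConfig) []string {
	var names []string
	if hasBackend {
		for _, def := range defaultTools {
			cfg := cfgs[def] // zero value when no config was supplied
			if cfg.Disable {
				continue
			}
			if cfg.Name != "" {
				names = append(names, cfg.Name)
			} else {
				names = append(names, def)
			}
		}
	}
	if hasShell {
		names = append(names, "execute")
	}
	return names
}

func main() {
	names := registeredNames(true, false, map[string]toolConfig{
		"read_file":  {Name: "cat_file"},
		"write_file": {Disable: true},
	})
	fmt.Println(names) // [ls cat_file edit_file glob grep]
}
```

Keying the configuration by default tool name keeps the sketch short; the real `MiddlewareConfig` uses one named field per tool instead.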
    -每个工具均可通过对应的 `*ToolConfig` 禁用(`Disable: true`)或提供自定义实现(`CustomTool`)。 +--- -## 多语言支持 +## Backend 接口 -工具描述和内置提示词默认为英文。如需切换为中文,可通过 `adk.SetLanguage()` 设置: +`Backend` 定义在 `github.com/cloudwego/eino/adk/filesystem` 包中。middleware 包通过类型别名重导出了请求/响应类型(如 `ReadRequest`、`FileContent` 等),但 **Backend 接口本身需要从 adk/filesystem 包引用**。 ```go -import "github.com/cloudwego/eino/adk" - -adk.SetLanguage(adk.LanguageChinese) // 切换为中文 -adk.SetLanguage(adk.LanguageEnglish) // 切换为英文(默认) +type Backend interface { + LsInfo(ctx context.Context, req *LsInfoRequest) ([]FileInfo, error) + Read(ctx context.Context, req *ReadRequest) (*FileContent, error) + GrepRaw(ctx context.Context, req *GrepRequest) ([]GrepMatch, error) + GlobInfo(ctx context.Context, req *GlobInfoRequest) ([]FileInfo, error) + Write(ctx context.Context, req *WriteRequest) error + Edit(ctx context.Context, req *EditRequest) error +} ``` -也可以通过 `ToolConfig.Desc` 或 `CustomSystemPrompt` 自定义各工具的说明文本。 +### Shell 与 StreamingShell -## [deprecated] 工具结果卸载 +```go +type Shell interface { + Execute(ctx context.Context, input *ExecuteRequest) (*ExecuteResponse, error) +} -> 💡 -> 该功能即将在 0.8.0 中 deprecate。请迁移到 Middleware: ToolReduction +type StreamingShell interface { + ExecuteStreaming(ctx context.Context, input *ExecuteRequest) (*schema.StreamReader[*ExecuteResponse], error) +} +``` -> 注意:工具结果卸载仅在旧的 `Config` + `NewMiddleware` 函数中可用。推荐的 `MiddlewareConfig` + `New` 不包含此功能,如需要请配合 ToolReduction middleware 使用。 +二者互斥,只能设置其中一个。`StreamingShell` 支持流式输出,适合长时间运行的命令。 -当工具调用结果过大(例如读取大文件、grep 命中大量内容),如果继续将完整结果放入对话上下文,会导致: +--- -- token 急剧增加 -- Agent 历史上下文污染 -- 推理效率变差 +## MultiModalReader 扩展接口 -为此,旧版 Middleware(`NewMiddleware`)提供了自动卸载机制: +当 `UseMultiModalRead = true` 时,Backend 需要额外实现 `MultiModalReader` 接口: -- 当结果大小超过阈值(默认 20,000 tokens)时,不直接返回全部内容给 LLM -- 实际结果会保存到文件系统(Backend) -- 上下文中仅包含摘要和文件路径(Agent 可再次调用 `read_file` 工具按需读取) +```go +type MultiModalReader interface { + MultiModalRead(ctx context.Context, req *MultiModalReadRequest) (*MultiFileContent, 
error) +} +``` -该功能默认启用,可通过 `Config`(非 `MiddlewareConfig`)配置: +**行为说明**: -```go -type Config struct { - // ... Backend, Shell, StreamingShell, ToolConfig 等字段同 MiddlewareConfig +- `read_file` 工具将从 `InvokableTool` 升级为 `EnhancedInvokableTool`,通过 `schema.ToolResult.Parts` 返回多模态结果 +- 默认实现支持读取图片文件(PNG、JPG 等)和 PDF 文件(支持 `pages` 参数指定页面范围,每次最多 20 页) +- 工具描述会自动追加多模态能力后缀;若通过 `ReadFileToolConfig.Desc` 自定义了描述,则不会追加 - // 关闭自动卸载 - WithoutLargeToolResultOffloading bool +> 💡 +> 使用 `ChatModelAgentMiddleware` 时,需要实现 `WrapEnhancedInvokableToolCall` 方法,多模态 read\_file 工具才能生效。 + +```go +// MultiModalReadRequest 扩展了 ReadRequest +type MultiModalReadRequest struct { + ReadRequest + Pages string // PDF 页面范围,如 "1-5"、"3"、"10-20" +} - // 自定义触发阈值(默认 20000 tokens) - LargeToolResultOffloadingTokenLimit int +// MultiFileContent 返回结果 +type MultiFileContent struct { + *FileContent // 纯文本结果 + Parts []FileContentPart // 多模态结果(与 FileContent 互斥,Parts 非空时忽略 FileContent) +} - // 自定义卸载文件生成路径 - // 默认路径格式: /large_tool_result/{ToolCallID} - LargeToolResultOffloadingPathGen func(ctx context.Context, input *compose.ToolInput) (string, error) +type FileContentPart struct { + Type FileContentPartType // "image" 或 "pdf" + MIMEType string // 如 "image/png"、"application/pdf" + Data []byte // 原始二进制数据 } ``` + +--- + +## Deprecated: 旧版 Config 与大结果卸载 + +> 💡 +> 以下内容仅适用于 `NewMiddleware` + `Config` 旧版路径。`New` / `NewTyped` 路径**不包含**大结果卸载功能。 + +旧版 `Config` 在 `MiddlewareConfig` 的基础上额外提供了「大工具结果卸载」(Large Tool Result Offloading) 机制: + + + + + + +
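+| Field | Description |
+| --- | --- |
+| `WithoutLargeToolResultOffloading bool` | Set to `true` to disable offloading. Default `false` (enabled). |
+| `LargeToolResultOffloadingTokenLimit int` | Token threshold. Default `20000`. |
+| `LargeToolResultOffloadingPathGen func(ctx, *compose.ToolInput) (string, error)` | Function that generates the offload path. Default `/large_tool_result/{ToolCallID}`. |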
    + +**触发条件**:当工具返回结果的字符数 > `tokenLimit × 4` 时触发卸载。 + +**卸载行为**:将完整结果通过 `Backend.Write` 写入文件,并用摘要(前 10 行 + 文件路径提示)替换原始返回。Agent 可通过 `read_file` 分页读取完整结果。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md b/content/zh/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md new file mode 100644 index 00000000000..04a64e15dac --- /dev/null +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_cancel_and_turnloop_quickstart.md @@ -0,0 +1,540 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: Agent Cancel 与 TurnLoop 快速入门 +weight: 10 +--- + +Eino ADK 中 **Agent 取消** 和 **TurnLoop** 两项核心特性的快速入门指南。自 [v0.9.0-alpha.9](https://github.com/cloudwego/eino/releases/tag/v0.9.0-alpha.9) 版本引入。 + +## 类型约定 + +本文示例统一使用以下泛型实例化: + +- `T = string`(推送给 TurnLoop 的业务项类型) +- `M = *schema.Message`(Agent 消息类型,即标准 `Message`) + +ADK 中相关类型别名: + +```go +type Agent = TypedAgent[*schema.Message] +type AgentInput = TypedAgentInput[*schema.Message] +type AgentEvent = TypedAgentEvent[*schema.Message] +``` + +当需要使用 `*schema.AgenticMessage` 时,将 `M` 替换为对应类型即可,所有 API 签名完全对称。 + +--- + +## 第一部分:Agent 取消 + +### 场景 + +用户向 agent 发送请求后,因等待过长或需求变更,希望取消当前执行。 + +### 核心 API + +```go +// 创建取消选项和取消函数 +cancelOpt, cancelFunc := adk.WithCancel() + +// 启动 agent,传入取消选项 +iter := runner.Run(ctx, []*schema.Message{schema.UserMessage("你好")}, cancelOpt) + +// 发起取消(可在任意 goroutine 调用) +handle, contributed := cancelFunc(adk.WithAgentCancelMode(adk.CancelImmediate)) +// contributed == true: 本次调用影响了执行结果 +// contributed == false: agent 已结束或取消已完成,本次调用无实际效果 + +err := handle.Wait() +``` + +`CancelHandle.Wait()` 的三种返回值: + +```go +switch { +case err == nil: + // 取消成功 +case errors.Is(err, adk.ErrCancelTimeout): + // 安全点超时,已自动升级为立即取消 +case errors.Is(err, adk.ErrExecutionEnded): + // agent 在取消生效前已自然结束 +} +``` + +### 三种取消模式 + + + + + + +
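+| Mode | Behavior | Use case |
+| --- | --- | --- |
+| `CancelImmediate` | Interrupts immediately without waiting for a safe point | Emergency stop, timeout fallback |
+| `CancelAfterChatModel` | Cancels after the current ChatModel call completes | A complete model response is needed |
+| `CancelAfterToolCalls` | Cancels after all current ToolCalls complete | Tool side effects must finish |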
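This guide describes `CancelMode` as a bitmask, so the two safe-point modes can be OR-ed together. The sketch below is a stdlib-only model of that idea, not eino's definition: the `cancelMode` type, its bit values, and `shouldCancelAt` are hypothetical and exist only to show why a combined mode cancels at whichever safe point is reached first.

```go
package main

import "fmt"

// cancelMode is a hypothetical stand-in for adk.CancelMode. The bit values
// are illustrative and are not taken from the eino source.
type cancelMode uint8

const (
	afterChatModel cancelMode = 1 << iota // safe point: current model call finished
	afterToolCalls                        // safe point: current tool calls finished
)

// shouldCancelAt reports whether a pending cancel request with the given mode
// takes effect at the safe point that was just reached.
func shouldCancelAt(mode, reached cancelMode) bool {
	return mode&reached != 0
}

func main() {
	either := afterChatModel | afterToolCalls
	fmt.Println(shouldCancelAt(either, afterChatModel))         // true
	fmt.Println(shouldCancelAt(either, afterToolCalls))         // true
	fmt.Println(shouldCancelAt(afterToolCalls, afterChatModel)) // false
}
```

Immediate cancellation is deliberately left out of the model, since it does not wait for any safe point.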
    + +> 💡 +> `CancelMode` 是位掩码,可组合使用:`CancelAfterChatModel | CancelAfterToolCalls` 等价于"哪个安全点先到达就取消"。 + +### 安全点取消 + +```go +// 等 ChatModel 完成后取消,5 秒超时保护 +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithAgentCancelTimeout(5*time.Second), +) +``` + +> 💡 +> 安全点模式务必配合 `WithAgentCancelTimeout`。若 agent 永远不到达安全点,超时后自动升级为立即取消。 + +### 递归取消 + +默认取消仅影响根 agent。使用 `WithRecursive()` 将取消传播到 AgentTool 内嵌套的子 agent: + +```go +handle, _ := cancelFunc( + adk.WithAgentCancelMode(adk.CancelAfterChatModel), + adk.WithRecursive(), +) +``` + +### 消费端识别取消 + +```go +for { + event, ok := iter.Next() + if !ok { + break + } + if event.Err != nil { + var cancelErr *adk.CancelError + if errors.As(event.Err, &cancelErr) { + log.Printf("Agent 被取消 (mode=%v, escalated=%v)", + cancelErr.Info.Mode, cancelErr.Info.Escalated) + } + break + } + // 处理正常事件... +} +``` + +--- + +## 第二部分:TurnLoop + +### 场景 + +构建一个持续运行的 agent 服务:用户随时发送消息,agent 按轮次处理;紧急消息可抢占当前执行。 + +### Turn 生命周期 + + + +### 基本用法 + +```go +loop := adk.NewTurnLoop(adk.TurnLoopConfig[string, *schema.Message]{ + // GenInput:接收缓冲区所有项目,决定本轮消费哪些 + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(strings.Join(items, "\n"))}}, + Consumed: items, + }, nil + }, + + // PrepareAgent:根据本轮消费项构建 Agent + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // OnAgentEvents:处理 agent 事件流(可选) + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + if event.Err != nil { + return event.Err + } + log.Printf("收到事件: agent=%s", event.AgentName) + } + return 
nil + }, +}) + +loop.Push("消息 1") +loop.Push("消息 2") +loop.Run(ctx) // 非阻塞,启动后台处理 +loop.Push("消息 3") // 运行中仍可推入 +loop.Stop() +result := loop.Wait() // 阻塞至退出 +``` + +### 核心回调 + + + + + + + +
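+| Callback | Required | Responsibility |
+| --- | --- | --- |
+| `GenInput` | | Receives every item in the buffer and returns `Consumed` (handled this turn) and `Remaining` (left for later turns). Items in neither list are dropped. |
+| `PrepareAgent` | | Builds the Agent from the Consumed items (prompt, tools, middleware, and so on). |
+| `OnAgentEvents` | | Handles the agent event stream. When unset, events are drained by default and the first error is returned. |
+| `GenResume` | | Called when resuming from a checkpoint; decides how to merge interrupted/unhandled/new items. |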
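The Consumed / Remaining / dropped split that `GenInput` performs can be illustrated without the framework. The sketch below uses a hypothetical `partition` type and `takeFirstNonEmpty` policy; neither is part of eino, and a real `GenInput` would also build the agent input.

```go
package main

import "fmt"

// partition is a hypothetical helper type mirroring the Consumed / Remaining
// split that GenInput returns.
type partition struct {
	Consumed  []string // items handled in this turn
	Remaining []string // items kept in the buffer for later turns
}

// takeFirstNonEmpty consumes the first non-empty item, keeps later non-empty
// items for future turns, and drops blank items by leaving them out of both
// slices.
func takeFirstNonEmpty(items []string) partition {
	var p partition
	for _, it := range items {
		switch {
		case it == "":
			// In neither slice, so the item is dropped.
		case len(p.Consumed) == 0:
			p.Consumed = append(p.Consumed, it)
		default:
			p.Remaining = append(p.Remaining, it)
		}
	}
	return p
}

func main() {
	p := takeFirstNonEmpty([]string{"", "a", "b", "", "c"})
	fmt.Println(p.Consumed, p.Remaining) // [a] [b c]
}
```

Because dropping is implicit, a policy that wants to keep every item has to place each one in exactly one of the two slices.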
    + +> 💡 +> `OnAgentEvents` 中**不要传播 CancelError**——框架会自动处理。Stop 导致的 `CancelError` 作为 `ExitReason` 传播;Preempt 导致的 `CancelError` 被框架吞掉,循环继续下一轮。回调仅在自身出现致命错误时才应返回 non-nil error。 + +### 抢占(Preempt) + +```go +// 推送紧急消息,在安全点取消当前 agent +accepted, ack := loop.Push("紧急消息!", adk.WithPreempt[string, *schema.Message](adk.AnySafePoint)) + +if accepted { + <-ack // 等待抢占信号被提交(当前 turn 保证会被取消) +} +``` + +抢占是原子操作——"推入新消息"和"取消当前 agent"作为整体执行: + +1. 紧急消息入缓冲区 +2. 当前 agent 在安全点被取消 +3. TurnLoop 自动开始新 turn +4. `GenInput` 收到所有缓冲项目(含紧急消息),重新决策 + +> 💡 +> `WithPreempt` 始终使用安全点取消,**不自动设置 WithRecursive**。而 `WithPreemptTimeout` 会自动启用 `WithRecursive`——超时升级为立即取消时,嵌套子 agent 也会被终止。 + +### 带超时 / 带延迟的抢占 + +```go +// 安全点等待,5 秒超时后升级为立即取消(自动递归) +loop.Push("紧急", adk.WithPreemptTimeout[string, *schema.Message](adk.AnySafePoint, 5*time.Second)) + +// 2 秒宽限期后再发起抢占 +loop.Push("新消息", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + adk.WithPreemptDelay[string, *schema.Message](2*time.Second), +) +``` + +### 条件抢占:WithPushStrategy + +当抢占决策依赖当前 turn 状态时,使用 `WithPushStrategy` 避免 TOCTOU 竞态: + +```go +loop.Push(urgentItem, adk.WithPushStrategy( + func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message]) []adk.PushOption[string, *schema.Message] { + if tc == nil { + return nil // 当前无活跃 turn,无需抢占 + } + if isLowPriority(tc.Consumed) { + return []adk.PushOption[string, *schema.Message]{ + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + } + } + return nil // 当前是高优先级任务,不抢占 + }, +)) +``` + +### 在 OnAgentEvents 中感知抢占和停止 + +`TurnContext` 提供 `Preempted` 和 `Stopped` 两个信号通道: + +```go +OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + + select { + case <-tc.Preempted: + log.Println("当前 turn 被抢占,正在收尾...") + case <-tc.Stopped: + log.Printf("循环正在停止,原因: %s", tc.StopCause()) + default: + } + + if event.Err != nil { + return event.Err + } + // 
处理事件... + } + return nil +}, +``` + +> 💡 +> `Preempted` / `Stopped` 仅在对应的取消调用实际 "contribute" 到当前 turn 的 `CancelError` 时才关闭。如果取消已被其他信号最终确定,通道保持打开。 + +### 停止 TurnLoop + +```go +// 等当前 turn 完成后退出(ExitReason 为 nil) +loop.Stop() + +// 立即中止当前 agent(递归传播到嵌套 agent) +loop.Stop(adk.WithImmediate()) + +// 安全点停止(递归传播,无超时) +loop.Stop(adk.WithGraceful()) + +// 带超时的安全点停止(超时后升级为立即取消) +loop.Stop(adk.WithGracefulTimeout(10 * time.Second)) + +// 空闲后自动关停(持续空闲 30 秒后停止) +loop.Stop(adk.UntilIdleFor(30 * time.Second)) +``` + +> 💡 +> 可多次调用 `Stop()` 升级取消策略。典型模式:先 `WithGraceful()`,超时后再 `WithImmediate()`。 + +### 附带停止原因 + +```go +loop.Stop( + adk.WithGraceful(), + adk.WithStopCause("quota exceeded"), +) +result := loop.Wait() +log.Printf("停止原因: %s", result.StopCause) +``` + +--- + +## 第三部分:声明式 Checkpoint 恢复 + +### 场景 + +Agent 被取消或中断后,下次启动时自动从断点恢复,而非从头开始。TurnLoop 自动管理输入簿记(bookkeeping),应用层只需声明 interrupted/unhandled/new items 如何重入后续 turn。 + +### 配置 Checkpoint + +在 `TurnLoopConfig` 中同时设置 `Store` 和 `CheckpointID` 即可启用: + +```go +store := NewMyCheckpointStore() // 实现 CheckPointStore 接口 + +cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(items[0])}}, + Consumed: items[:1], + Remaining: items[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return myAgent, nil + }, + + // GenResume:从 checkpoint 恢复时调用 + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) 
+ return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + Store: store, + CheckpointID: "session-123", +} +``` + +### 恢复流程 + +`Run()` 启动时自动查询 Store: + + + + + + +
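+| Checkpoint state | Behavior |
+| --- | --- |
+| A mid-turn checkpoint exists (the agent was interrupted while executing) | Calls `GenResume`, hands the interrupted/unhandled/new items to the application to decide, then resumes execution. |
+| A between-turns checkpoint exists (the loop was stopped between turns) | Puts the previously buffered items back into the buffer and processes them normally through `GenInput`. |
+| No checkpoint exists | Starts from scratch. |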
    + +```go +// 第一次运行 +loop := adk.NewTurnLoop(cfg) +loop.Push("消息 1") +loop.Run(ctx) +loop.Stop(adk.WithGraceful()) +exit := loop.Wait() +log.Printf("checkpoint 尝试: %v, err: %v", exit.CheckpointAttempted, exit.CheckpointErr) + +// 第二次运行(相同 cfg,包含相同 CheckpointID) +loop2 := adk.NewTurnLoop(cfg) +loop2.Push("新消息") // 作为 newItems 传入 GenResume +loop2.Run(ctx) // 自动检测 checkpoint 并恢复 +result := loop2.Wait() +``` + +### 跳过 Checkpoint + +```go +loop.Stop(adk.WithSkipCheckpoint()) // 本次退出不保存 checkpoint +``` + +### 实现 CheckPointStore + +```go +type CheckPointStore interface { + Get(ctx context.Context, checkPointID string) ([]byte, bool, error) + Set(ctx context.Context, checkPointID string, checkPoint []byte) error +} +``` + +可选实现 `CheckPointDeleter` 以支持显式删除过期 checkpoint: + +```go +type CheckPointDeleter interface { + Delete(ctx context.Context, checkPointID string) error +} +``` + +正常退出(未保存新 checkpoint)时,TurnLoop 会尝试删除先前加载的 checkpoint 以防过期恢复。**只有实现了 CheckPointDeleter 的 Store 才会执行删除**;否则由 Store 自身管理生命周期。 + +> 💡 +> 使用 `Store` 时,泛型参数 `T` 必须支持 `encoding/gob` 编解码——TurnLoop 通过 gob 持久化 runner checkpoint 和 item 簿记信息。 + +--- + +## 第四部分:完整示例 + +模拟一个支持优先级调度、抢占和 checkpoint 恢复的聊天服务: + +```go +package main + +import ( + "context" + "log" + "strings" + "time" + + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/schema" +) + +func main() { + ctx := context.Background() + store := adk.NewInMemoryStore() + + cfg := adk.TurnLoopConfig[string, *schema.Message]{ + GenInput: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], items []string) (*adk.GenInputResult[string, *schema.Message], error) { + // 按优先级排序后,只消费第一条,其余留给后续轮次 + sorted := sortByPriority(items) + return &adk.GenInputResult[string, *schema.Message]{ + Input: &adk.AgentInput{Messages: []*schema.Message{schema.UserMessage(sorted[0])}}, + Consumed: sorted[:1], + Remaining: sorted[1:], // 不在两者中的项目会被丢弃 + }, nil + }, + + GenResume: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], 
interruptedItems, unhandledItems, newItems []string) (*adk.GenResumeResult[string, *schema.Message], error) { + all := append(append(interruptedItems, unhandledItems...), newItems...) + return &adk.GenResumeResult[string, *schema.Message]{ + Consumed: all[:1], + Remaining: all[1:], + }, nil + }, + + PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[string, *schema.Message], consumed []string) (adk.Agent, error) { + return buildAgent(consumed), nil + }, + + OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[string, *schema.Message], events *adk.AsyncIterator[*adk.AgentEvent]) error { + for { + event, ok := events.Next() + if !ok { + break + } + // 感知抢占/停止信号,做收尾处理 + select { + case <-tc.Preempted: + log.Println("被更高优先级消息抢占") + case <-tc.Stopped: + log.Printf("服务关停: %s", tc.StopCause()) + default: + } + if event.Err != nil { + // 不传播 CancelError,框架自动处理 + return event.Err + } + log.Printf("[%s] %s", event.AgentName, extractText(event)) + } + return nil + }, + + Store: store, + CheckpointID: "chat-session-001", + } + + loop := adk.NewTurnLoop(cfg) + loop.Push("你好,帮我查一下天气") + loop.Run(ctx) + + // 1 秒后发送紧急消息抢占 + time.AfterFunc(1*time.Second, func() { + loop.Push("停!先帮我处理这个紧急问题", + adk.WithPreempt[string, *schema.Message](adk.AnySafePoint), + ) + }) + + // 5 秒后优雅关停 + time.AfterFunc(5*time.Second, func() { + loop.Stop( + adk.WithGracefulTimeout(3*time.Second), + adk.WithStopCause("service shutdown"), + ) + }) + + result := loop.Wait() + log.Printf("退出原因: %v", result.ExitReason) + log.Printf("未处理消息: %v", result.UnhandledItems) + log.Printf("停止原因: %s", result.StopCause) + log.Printf("checkpoint: attempted=%v, err=%v", result.CheckpointAttempted, result.CheckpointErr) + + // 下次以相同 cfg 启动将自动从 checkpoint 恢复 +} +``` + +--- + +## 常见问题 + +### Q: 安全点取消会不会永远等不到安全点? + +会。如果 agent 陷入长时间运行的 tool 或 model 调用,安全点可能迟迟不到。**务必配合 WithAgentCancelTimeout 使用**,超时后自动升级为 `CancelImmediate`。 + +### Q: `WithRecursive` 什么时候需要? 
+ +默认取消仅影响根 agent。当 agent 层级中包含 AgentTool 嵌套的子 agent,且你希望子 agent 也在安全点响应取消时,才需要。不确定时,先不加。 + +### Q: 泛型参数 T 有什么要求? + +当配置了 `Store` 时,`T` 必须可被 `encoding/gob` 编解码。基础类型(`string`、`int` 等)和全导出字段的 struct 默认支持。若 `T` 包含 interface 字段,需通过 `gob.Register` 注册具体类型。 + +### Q: `Push` 在 loop 停止后会怎样? + +`Push` 返回 `(false, closedCh)`。这些 "late items" 不会进入 checkpoint,可在 `Wait()` 返回后通过 `result.TakeLateItems()` 回收。一旦调用 `TakeLateItems()`,后续 `Push` 会 panic 以防数据静默丢失。 + +### Q: 多次调用 `Stop()` 会怎样? + +安全——每次调用可以升级取消策略。典型模式: + +```go +loop.Stop(adk.WithGraceful()) // 先尝试优雅停止 +time.AfterFunc(3*time.Second, func() { + loop.Stop(adk.WithImmediate()) // 3 秒后升级为立即取消 +}) +``` + +### Q: `GenInput` 返回的 items 不在 Consumed 也不在 Remaining 会怎样? + +会被丢弃。这是刻意设计——允许 `GenInput` 在决策时过滤掉不需要的项目。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md b/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md index 05419d80660..2bc6c0a9782 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_collaboration.md @@ -1,521 +1,116 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Agent 协作 weight: 4 --- -# Agent 协作 +# 多 Agent 协作 -概述文档已经对 Agent 协作提供了基础的说明,下面将结合代码,对协作与组合原语的设计与实现进行介绍: +Eino ADK 提供两种主要的 Agent 协作方式: -## 协作原语 +## AgentAsTool(推荐) -### Agent 间协作方式 +将子 Agent 包装为 Tool,父 Agent 通过 ToolCall 自主决定何时调用。子 Agent 独立执行,结果返回父 Agent 的上下文。 - - - - -
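-| Collaboration mode | Description |
-| --- | --- |
-| Transfer | Hands the task over to another Agent. The current Agent exits once it finishes and does not track the task status of the Agent it transferred to. |
-| ToolCall (AgentAsTool) | Invokes an Agent as a ToolCall, waits for its response, and can use the called Agent's output in the next round of processing. |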
    - -### AgentInput 的上下文策略 - - - - - -
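-| Context strategy | Description |
-| --- | --- |
-| Full upstream Agent conversation | Receives the complete conversation history of this Agent's upstream Agents. |
-| New task description | Ignores the upstream Agents' conversation history and passes a fresh task summary as the child Agent's AgentInput. |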
    - -### 决策自主性 - - - - - -
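-| Decision autonomy | Description |
-| --- | --- |
-| Autonomous | Inside the Agent, when help is needed, it chooses among its available downstream Agents on its own. The choice is usually made by an LLM, but even a choice made by preset logic still counts as autonomous when seen from outside the Agent. |
-| Preset | The next Agent to run after an Agent finishes is fixed in advance, so the execution order is predetermined and predictable. |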
    - -### 组合原语 - - - - - - - - -
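-| Type | Description | Run mode | Collaboration mode | Context strategy | Decision autonomy |
-| --- | --- | --- | --- | --- | --- |
-| SubAgents | Combines the user-provided agent as the parent Agent and the user-provided subAgents list as child Agents into an Agent that makes its own decisions; Name and Description serve as that Agent's name and description. Currently an Agent can have only one parent Agent. The SetSubAgents function can be used to build a Multi-Agent shaped as a multi-way tree. AgentName must be unique within that tree. | | Transfer | Full upstream Agent conversation | Autonomous |
-| Sequential | Combines the user-provided SubAgents list into a Sequential Agent that runs them in order; Name and Description serve as the Sequential Agent's name and description. When it runs, the SubAgents execute one after another, and it ends once every Agent has run once. | | Transfer | Full upstream Agent conversation | Preset |
-| Parallel | Combines the user-provided SubAgents list into a Parallel Agent that runs them concurrently on the same context; Name and Description serve as the Parallel Agent's name and description. When it runs, the SubAgents execute concurrently, and it ends once all of them have finished. | | Transfer | Full upstream Agent conversation | Preset |
-| Loop | Combines the user-provided SubAgents list into a Loop Agent that runs them in array order, over and over; Name and Description serve as the Loop Agent's name and description. When it runs, the SubAgents execute in order, and it ends once all Agents have finished. | | Transfer | Full upstream Agent conversation | Preset |
-| AgentAsTool | Converts an Agent into a Tool that other Agents use like any ordinary Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided by adk supports AgentAsTool. | | ToolCall | New task description | Autonomous |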
    - -## 上下文传递 - -在构建多 Agent 系统时,让不同 Agent 之间高效、准确地共享信息至关重要。Eino ADK 提供了两种核心的上下文传递机制,以满足不同的协作需求: History 和 SessionValues。 - -### History - -#### 概念 - -History 对应【上游 Agent 全对话上下文策略】,多 Agent 系统中每一个 Agent 产生的 AgentEvent 都会被保存到 History 中,调用一个新 Agent 时 (Workflow/ Transfer) History 中的 AgentEvent 会被转换并拼接到 AgentInput 中。 - -默认情况下,其他 Agent 的 Assistant 或 Tool Message,被转换为 User Message。这相当于在告诉当前的 LLM:“刚才, Agent_A 调用了 some_tool ,返回了 some_result 。现在,轮到你来决策了。” - -通过这种方式,其他 Agent 的行为被当作了提供给当前 Agent 的“外部信息”或“事实陈述”,而不是它自己的行为,从而避免了 LLM 的上下文混乱。 - - - -在 Eino ADK 中,当为一个 Agent 构建 AgentInput 时,它能看到的 History 是“所有在我之前产生的 AgentEvent”。 - -值得一提的是 ParallelWorkflowAgent:并行的两个子 Agent(A,B),在并行执行过程中,相互不可见对方产生的 AgentEvent,因为并行的 A、B 没有谁是在另一个之前。 - -#### RunPath - -History 中每个 AgentEvent 都是由“特定 Agent 在特定的执行序列中产生的”,也就是 AgentEvent 有自身的 RunPath。RunPath 的作用是传递出这个信息,在 eino 框架中不乘载其他功能。 - -下面表格中给出各种编排模式下,Agent 执行时的具体 RunPath: - - - - - - - -
-| Example | RunPath |
-| --- | --- |
-| | Agent: [Agent]<br>SubAgent: [Agent, SubAgent] |
-| | Agent: [Agent]<br>Agent(after function call): [Agent] |
-| | Agent1: [SequentialAgent, LoopAgent, Agent1]<br>Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2]<br>Agent1: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1]<br>Agent2: [SequentialAgent, LoopAgent, Agent1, Agent2, Agent1, Agent2]<br>Agent3: [SequentialAgent, LoopAgent, Agent3]<br>Agent4: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent4]<br>Agent5: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent5]<br>Agent6: [SequentialAgent, LoopAgent, Agent3, ParallelAgent, Agent6] |
-| | Agent: [Agent]<br>SubAgent: [Agent, SubAgent]<br>Agent: [Agent, SubAgent, Agent] |
  • - -#### 自定义 - -有些情况下在 Agent 运行前需要对 History 的内容进行调整,此时通过 AgentWithOptions 可以自定义 Agent 从 History 中生成 AgentInput 的方式: - -```go -// github.com/cloudwego/eino/adk/flow.go - -type HistoryRewriter func(ctx context.Context, entries []*HistoryEntry) ([]Message, error) - -func WithHistoryRewriter(h HistoryRewriter) AgentOption -``` - -### SessionValues - -#### 概念 +这是最灵活、最可组合的协作模式: -SessionValues 是在一次运行中持续存在的全局临时 KV 存储,用于支持跨 Agent 的状态管理和数据共享,一次运行中的任何 Agent 可以在任何时间读写 SessionValues。 - -Eino ADK 提供了多种方法供 Agent 运行时内部并发安全的读写 Session Values: - -```go -// github.com/cloudwego/eino/adk/runctx.go - -// 获取全部 SessionValues -func GetSessionValues(ctx context.Context) map[string]any -// 批量设置 SessionValues -func AddSessionValues(ctx context.Context, kvs map[string]any) -// 指定 key 获取 SessionValues 中的一个值,key 不存在时第二个返回值为 false,否则为 true -func GetSessionValue(ctx context.Context, key string) (any, bool) -// 设置单个 SessionValues -func AddSessionValue(ctx context.Context, key string, value any) -``` - -需要注意的是,由于 SessionValues 机制基于 Context 来实现,而 Runner 运行会对 Context 重新初始化,因此在 Run 方法外通过 `AddSessionValues` 或 `AddSessionValue` 注入 SessionValues 是不生效的。 - -如果您需要在 Agent 运行前就注入数据到 SessionValues 中,需要使用专用的 Option 来协助实现,用法如下: - -```go -// github.com/cloudwego/eino/adk/call_option.go -// WithSessionValues 在 Agent 运行前注入 SessionValues -func WithSessionValues(v map[string]any) AgentRunOption - -// 用法: -runner := adk.NewRunner(ctx, adk.RunnerConfig{Agent: agent}) -iterator := runner.Run(ctx, []adk.Message{schema.UserMessage("xxx")}, - adk.WithSessionValues(map[string]any{ - PlanSessionKey: 123, - UserInputSessionKey: []adk.Message{schema.UserMessage("yyy")}, - }), -) -``` - -## Transfer SubAgents - -### 概念 - -Transfer 对应【Transfer 协作方式】,Agent 运行时产生带有包含 TransferAction 的 AgentEvent 后,Eino ADK 会调用 Action 指定的 Agent,被调用的 Agent 被称为子 Agent(SubAgent)。 - -TransferAction 可以使用 `NewTransferToAgentAction` 快速创建: - -```go -import "github.com/cloudwego/eino/adk" - -event := adk.NewTransferToAgentAction("dest agent name") -``` - 
-为了让 Eino ADK 在接受到 TransferAction 可以找到子 Agent 实例并运行,在运行前需要先调用 `SetSubAgents` 将可能的子 Agent 注册到 Eino ADK 中: - -```go -// github.com/cloudwego/eino/adk/flow.go -func SetSubAgents(ctx context.Context, agent Agent, subAgents []Agent) (Agent, error) -``` - -> 💡 -> Transfer 的含义是将任务**移交**给子 Agent,而不是委托或者分配,因此: -> -> 1. 区别于 ToolCall,通过 Transfer 调用子 Agent,子 Agent 运行结束后,不会再调用父 Agent 总结内容或进行下一步操作。 -> 2. 调用子 Agent 时,子 Agent 的输入仍然是原始输入,父 Agent 的输出会作为上下文供子 Agent 参考。 - -在触发 SetSubAgents 时,父子 Agent 双方都需要进行处理来完成初始化操作,Eino ADK 定义了 `OnSubAgents` 接口用于支持此功能: - -```go -// github.com/cloudwego/eino/adk/interface.go -type OnSubAgents interface { - OnSetSubAgents(ctx context.Context, subAgents []Agent) error - OnSetAsSubAgent(ctx context.Context, parent Agent) error - OnDisallowTransferToParent(ctx context.Context) error -} -``` - -如果 Agent 实现了 `OnSubAgents` 接口,`SetSubAgents` 中会调用相应的方法向 Agent 注册,例如 `ChatModelAgent` 的实现 - -### 示例 - -接下来以一个多功能对话 Agent 演示 Transfer 能力,目标是搭建一个可以查询天气或者与用户对话的 Agent,Agent 结构如下: - - - -三个 Agent 均使用 ChatModelAgent 实现: +- 父 Agent 保持控制权,可基于子 Agent 结果继续推理 +- 子 Agent 接收独立的任务描述,不继承父 Agent 的完整对话历史 +- 多个子 Agent 可并行调用 ```go import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" ) -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -type GetWeatherInput struct { - City string `json:"city"` -} - -func NewWeatherAgent() adk.Agent { - weatherTool, err := utils.InferTool( - "get_weather", - "Gets the current weather for a specific city.", // English description - 
func(ctx context.Context, input *GetWeatherInput) (string, error) { - return fmt.Sprintf(`the temperature in %s is 25°C`, input.City), nil - }, - ) - if err != nil { - log.Fatal(err) - } - - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "WeatherAgent", - Description: "This agent can get the current weather for a given city.", - Instruction: "Your sole purpose is to get the current weather for a given city by using the 'get_weather' tool. After calling the tool, report the result directly to the user.", - Model: newChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{weatherTool}, - }, - }, - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewChatAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ChatAgent", - Description: "A general-purpose agent for handling conversational chat.", // English description - Instruction: "You are a friendly conversational assistant. Your role is to handle general chit-chat and answer questions that are not related to any specific tool-based tasks.", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func NewRouterAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "RouterAgent", - Description: "A manual router that transfers tasks to other expert agents.", - Instruction: `You are an intelligent task router. 
Your responsibility is to analyze the user's request and delegate it to the most appropriate expert agent.If no Agent can handle the task, simply inform the user it cannot be processed.`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} -``` - -之后使用 Eino ADK 的 Transfer 能力搭建 Multi-Agent 并运行,ChatModelAgent 实现了 OnSubAgent 接口,在 adk.SetSubAgents 方法中会使用此接口向 ChatModelAgent 注册父/子 Agent,不需要用户处理 TransferAction 生成问题: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" -) - -func main() { - weatherAgent := NewWeatherAgent() - chatAgent := NewChatAgent() - routerAgent := NewRouterAgent() - - ctx := context.Background() - a, err := adk.SetSubAgents(ctx, routerAgent, []adk.Agent{chatAgent, weatherAgent}) - if err != nil { - log.Fatal(err) - } - - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - - // query weather - println("\n\n>>>>>>>>>query weather<<<<<<<<<") - iter := runner.Query(ctx, "What's the weather in Beijing?") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } - - // failed to route - println("\n\n>>>>>>>>>failed to route<<<<<<<<<") - iter = runner.Query(ctx, "Book me a flight from New York to London tomorrow.") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil { - fmt.Printf("\nAgent[%s]: transfer to %+v\n\n======\n", event.AgentName, event.Action.TransferToAgent.DestAgentName) - } else { - fmt.Printf("\nAgent[%s]:\n%+v\n\n======\n", event.AgentName, event.Output.MessageOutput.Message) - } - } -} -``` - -运行结果: - -```yaml ->>>>>>>>>query weather<<<<<<<<< -Agent[RouterAgent]: 
-assistant: -tool_calls: -{Index: ID:call_SKNsPwKCTdp1oHxSlAFt8sO6 Type:function Function:{Name:transfer_to_agent Arguments:{"agent_name":"WeatherAgent"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{201 17 218} -====== -Agent[RouterAgent]: transfer to WeatherAgent -====== -Agent[WeatherAgent]: -assistant: -tool_calls: -{Index: ID:call_QMBdUwKj84hKDAwMMX1gOiES Type:function Function:{Name:get_weather Arguments:{"city":"Beijing"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{255 15 270} -====== -Agent[WeatherAgent]: -tool: the temperature in Beijing is 25°C -tool_call_id: call_QMBdUwKj84hKDAwMMX1gOiES -tool_call_name: get_weather -====== -Agent[WeatherAgent]: -assistant: The current temperature in Beijing is 25°C. -finish_reason: stop -usage: &{286 11 297} -====== - ->>>>>>>>>failed to route<<<<<<<<< -Agent[RouterAgent]: -assistant: I'm unable to assist with booking flights. Please use a relevant travel service or booking platform to make your reservation. -finish_reason: stop -usage: &{206 23 229} -====== -``` - -OnSubAgents 的另外两个方法在 Agent 作为 SetSubAgents 中的子 Agent 时被调用: - -- OnSetAsSubAgent 用来注册向 Agent 注册其父 Agent 信息 -- OnDisallowTransferToParent 在 Agent 设置 WithDisallowTransferToParent option 时会被调用,用来告知 Agent 不要产生向父 Agent 的 TransferAction。 - -```go -adk.SetSubAgents( - ctx, - Agent1, - []adk.Agent{ - adk.AgentWithOptions(ctx, Agent2, adk.WithDisallowTransferToParent()), +// 创建子 Agent +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "搜索并总结相关信息", + Instruction: "你是一个研究助手...", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{searchTool}, + }, }, -) +}) + +// 包装为 Tool +agentTool := adk.NewAgentTool(ctx, subAgent) + +// 父 Agent 注册子 Agent Tool +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "coordinator", + Description: "协调任务的主 Agent", + Instruction: "你是一个任务协调者...", + Model: chatModel, + 
ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{agentTool}, + }, + }, +}) ``` -### Static Transfer Configuration - -AgentWithDeterministicTransferTo is an Agent wrapper that emits a preset TransferAction after the wrapped Agent finishes, which makes Agent-to-Agent jumps statically configurable: - -```go -// github.com/cloudwego/eino/adk/flow.go +### AgentTool Options -type DeterministicTransferConfig struct { - Agent Agent - ToAgentNames []string -} - -func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent -``` - -In the Supervisor pattern, each sub-Agent always returns to the Supervisor when it finishes, and the Supervisor decides the next task. AgentWithDeterministicTransferTo covers this case: + + + + +
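<table>
<tr><td>Option</td><td>Description</td></tr>
<tr><td><code>WithFullChatHistoryAsInput()</code></td><td>Passes the parent Agent's full chat history to the sub-Agent as input (by default only the request arguments generated by the model are passed)</td></tr>
<tr><td><code>WithAgentInputSchema(schema)</code></td><td>Customizes the sub-Agent's input schema</td></tr>
</table>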
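The difference between the default input and the full-history input can be sketched with a plain functional-option pattern. This is only an illustration of the behaviour the table describes: `toolOptions`, `withFullChatHistory`, and `subAgentInput` are hypothetical names, messages are reduced to strings, and none of this is the ADK implementation.

```go
package main

import "fmt"

// toolOptions is a hypothetical option holder; it only mirrors the
// behaviour described in the options table above.
type toolOptions struct {
	fullChatHistory bool
}

// ToolOption mutates toolOptions, in the usual functional-option style.
type ToolOption func(*toolOptions)

// withFullChatHistory stands in for the role WithFullChatHistoryAsInput() plays.
func withFullChatHistory() ToolOption {
	return func(o *toolOptions) { o.fullChatHistory = true }
}

// subAgentInput decides what the sub-Agent receives: by default only the
// request generated by the model, or a copy of the parent's history when
// the full-history option is set.
func subAgentInput(history []string, request string, opts ...ToolOption) []string {
	o := &toolOptions{}
	for _, opt := range opts {
		opt(o)
	}
	if o.fullChatHistory {
		return append([]string{}, history...)
	}
	return []string{request}
}

func main() {
	history := []string{"user: compare two papers", "assistant: calling researcher"}
	request := "summarize paper A"

	fmt.Println(len(subAgentInput(history, request)))                        // default: request only
	fmt.Println(len(subAgentInput(history, request, withFullChatHistory()))) // full parent history
}
```

Keeping the default narrow means the sub-Agent sees only what the model chose to hand over; the full-history option trades that isolation for more context.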
- +### Event Passthrough -```go -// github.com/cloudwego/eino/adk/prebuilt/supervisor.go +When `ToolsConfig.EmitInternalEvents = true`, the sub-Agent's events are passed through to the parent Agent's event stream in real time, so end users can see the sub-Agent's intermediate steps. -type SupervisorConfig struct { - Supervisor adk.Agent - SubAgents []adk.Agent -} +> 💡 +> Passed-through events do not affect the parent Agent's state or checkpoint; they are for user display only. The one exception is the Interrupted action, which propagates across the boundary through CompositeInterrupt to support interrupt and resume. -func NewSupervisor(ctx context.Context, conf *SupervisorConfig) (adk.Agent, error) { - subAgents := make([]adk.Agent, 0, len(conf.SubAgents)) - supervisorName := conf.Supervisor.Name(ctx) - for _, subAgent := range conf.SubAgents { - subAgents = append(subAgents, adk.AgentWithDeterministicTransferTo(ctx, &adk.DeterministicTransferConfig{ - Agent: subAgent, - ToAgentNames: []string{supervisorName}, - })) - } +### Prebuilt Example: DeepAgents - return adk.SetSubAgents(ctx, conf.Supervisor, subAgents) -} -``` +[DeepAgents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) is a best-practice example of the AgentAsTool pattern: the main Agent delegates subtasks to sub-Agents through **TaskTool** and uses **WriteTodos** for task planning and progress tracking. ## Workflow Agents -WorkflowAgent runs Agents according to a flow preset in code. Eino ADK provides three basic Workflow Agents (Sequential, Parallel, and Loop), which can be nested within one another to handle more complex tasks. - -By default, the input of each Agent in a Workflow is generated as described in the History section; use WithHistoryRewriter to customize how AgentInput is generated. - -Once an Agent produces an ExitAction Event, the Workflow Agent exits immediately, regardless of whether other Agents are still waiting to run. - -For details and examples, see: [Eino ADK: Workflow Agents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow) - -### SequentialAgent - -SequentialAgent runs a series of Agents one by one, in the order you provide: - - - -```go -type SequentialAgentConfig struct { - Name string - Description string - SubAgents []Agent -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -### LoopAgent - -LoopAgent is built on SequentialAgent: after the SequentialAgent finishes, it runs again from the start: - - - -```go -type LoopAgentConfig struct { - Name string - Description string - SubAgents []Agent - - MaxIterations int // maximum number of iterations -} - -func NewLoopAgent(ctx context.Context, 
config *LoopAgentConfig) (Agent, error) -``` - -### ParallelAgent - -ParallelAgent runs several Agents concurrently: +Deterministic orchestration for multi-step tasks whose flow is fixed: - + + + + + +
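<table>
<tr><td>Type</td><td>Description</td><td>Constructor</td></tr>
<tr><td>Sequential</td><td>Runs the sub-Agents one after another, in slice order</td><td><code>adk.NewSequentialAgent</code></td></tr>
<tr><td>Parallel</td><td>Runs all sub-Agents concurrently and finishes once every one has completed</td><td><code>adk.NewParallelAgent</code></td></tr>
<tr><td>Loop</td><td>Runs the sub-Agent sequence repeatedly until BreakLoop is triggered or MaxIterations is exceeded</td><td><code>adk.NewLoopAgent</code></td></tr>
</table>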
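The three orchestration shapes in the table can be sketched with the standard library alone. This is a conceptual model under stated assumptions, not the ADK API: `Step`, `runSequential`, `runParallel`, and `runLoop` are hypothetical names, and a sub-Agent is reduced to a function from string to string.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Step is a hypothetical stand-in for a sub-Agent: it receives the text
// produced so far and returns its own output.
type Step func(input string) string

// runSequential runs the steps in slice order, feeding each output to the next step.
func runSequential(input string, steps []Step) string {
	out := input
	for _, s := range steps {
		out = s(out)
	}
	return out
}

// runParallel runs all steps concurrently on the same input and returns once
// every step has finished. Results keep the order of the steps slice.
func runParallel(input string, steps []Step) []string {
	results := make([]string, len(steps))
	var wg sync.WaitGroup
	for i, s := range steps {
		wg.Add(1)
		go func(i int, s Step) {
			defer wg.Done()
			results[i] = s(input) // each goroutine writes only its own slot
		}(i, s)
	}
	wg.Wait()
	return results
}

// runLoop repeats the sequence until done reports true or maxIterations is
// reached. It returns the final output and the number of iterations used.
func runLoop(input string, steps []Step, maxIterations int, done func(string) bool) (string, int) {
	out := input
	for i := 1; i <= maxIterations; i++ {
		out = runSequential(out, steps)
		if done(out) {
			return out, i
		}
	}
	return out, maxIterations
}

func main() {
	upper := Step(strings.ToUpper)
	exclaim := Step(func(s string) string { return s + "!" })

	fmt.Println(runSequential("plan", []Step{upper, exclaim}))
	fmt.Println(runParallel("plan", []Step{upper, exclaim}))

	out, n := runLoop("a", []Step{exclaim}, 5, func(s string) bool { return len(s) >= 3 })
	fmt.Println(out, n)
}
```

The `done` callback plays the part of a break condition and `maxIterations` is the safety cap, which is the same pair of exit conditions the Loop row describes.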
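-```go -type ParallelAgentConfig struct { - Name string - Description string - SubAgents []Agent -} +Workflow Agents pass context to one another through Transfer: the upstream Agent's output is automatically appended to the downstream Agent's input Messages. -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` +# Context Passing -## AgentAsTool +## SessionValues -When an Agent needs only a clear, explicit instruction to run, rather than the full run context (History), it can be converted into a Tool and invoked that way: +A global key-value store shared across Agents; within a single run, any Agent can read and write it in a concurrency-safe way: ```go -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool +// Read/write API +adk.AddSessionValue(ctx, "key", value) +val, ok := adk.GetSessionValue(ctx, "key") +adk.AddSessionValues(ctx, map[string]any{"k1": v1, "k2": v2}) +all := adk.GetSessionValues(ctx) ``` -Once converted into a Tool, the Agent can be called by any ChatModel that supports function calling, and by any LLM-driven Agent; how it is called depends on the Agent implementation. - -Message history isolation: an Agent used as a Tool does not inherit the parent Agent's message history (History). - -Shared SessionValues: it does, however, share the parent Agent's SessionValues, reading and writing the same KV map. - -Internal event exposure: an Agent used as a Tool is still an Agent and produces AgentEvents. By default, these internal AgentEvents are not exposed through the `AsyncIterator` returned by `Runner`. In business scenarios that need to expose the internal AgentTool's AgentEvents to the user, enable internal event exposure in the `ToolsConfig` of the `ChatModelAgent` that owns the AgentTool: +> 💡 +> SessionValues is implemented on top of Context, and the Runner reinitializes the Context when a run starts. To inject data before the run, use the `WithSessionValues` option: ```go -// from adk/chatmodel.go - -**type **ToolsConfig **struct **{ - // other configurations... 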
- - _// EmitInternalEvents indicates whether internal events from agentTool should be emitted_ -_ // to the parent generator via a tool option injection at run-time._ -_ _EmitInternalEvents bool -} +iter := runner.Run(ctx, messages, + adk.WithSessionValues(map[string]any{ + "user_id": "123", + }), +) ``` - -These internal events do not enter the parent agent's context (apart from the final message, which would enter it anyway), and the various AgentActions do not take effect (InterruptAction excepted). diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md b/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md index 1a87c5db017..cf7b28dad77 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_extension.md @@ -1,118 +1,133 @@ --- Description: "" -date: "2025-11-20" +date: "2026-05-17" lastmod: "" tags: [] title: Agent Runner and Extensions weight: 6 --- -# Agent Runner +# Runner -## Definition +Runner is the execution entry point for an Agent. It manages the Agent lifecycle, context initialization, Checkpoint persistence, and interrupt recovery. **Every Agent should be run through a Runner.** -Runner is the core engine in Eino ADK responsible for executing Agents. Its main job is to manage and control the whole Agent lifecycle, such as handling multi-Agent collaboration and saving and passing context; cross-cutting capabilities such as interrupt and callback also rely on Runner. Every Agent should be run through a Runner. +## Basic Usage -## Interrupt & Resume - -Agent Runner provides runtime interrupt and resume: a running Agent can actively interrupt its execution, save its current state, and later resume from the interruption point. This is commonly used when the Agent's flow needs external input, involves long waits, or can be paused. - -The three key points in one interrupt-to-resume cycle are described below: +```go +import ( + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/schema" +) + +// Create the Runner +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // optional; required for interrupt and resume +}) + +// Option 1: Query, which sends the user's question directly +iter := runner.Query(ctx, "Search today's news for me") + +// Option 2: Run, which takes the full Messages +iter = runner.Run(ctx, []*schema.Message{ + schema.UserMessage("Hello"), +}, adk.WithSessionValues(map[string]any{"user": "alice"})) + +// Consume the event stream +for { + event, ok := iter.Next() + if !ok { + break + } + // handle event +} +``` -1. Interrupted Action: the Agent raises an interrupt event and the Agent Runner intercepts it -2. Checkpoint: after intercepting the event, the Agent Runner saves the current run state -3. 
Resume: once the run conditions are ready again, the Agent Runner resumes from the checkpoint +## Generics Support -### Interrupted Action +```go +type TypedRunner[M MessageType] struct { ... } +type Runner = TypedRunner[*schema.Message] -During execution, an Agent can actively interrupt the Runner by producing an AgentEvent that carries an Interrupted Action. +func NewTypedRunner[M MessageType](conf TypedRunnerConfig[M]) *TypedRunner[M] +``` -When Interrupted in the Event is non-nil, the Agent Runner treats it as an interrupt: +The `*schema.AgenticMessage` path is constructed with `NewTypedRunner`. -```go -// github.com/cloudwego/eino/adk/interface.go -type AgentAction struct { - // other actions - Interrupted *InterruptInfo - // other actions -} +## Interrupt & Resume -// github.com/cloudwego/eino/adk/interrupt.go -type InterruptInfo struct { - Data any -} -``` +An Agent can actively interrupt itself while running. The Runner saves the state automatically (this requires `CheckPointStore` to be configured), and the run can later resume from the checkpoint. -When an interrupt occurs, custom interrupt information can be attached through the InterruptInfo struct. This information: +### Interrupt -1. is passed to the caller and can be used to explain, for example, why the interrupt happened -2. is passed back to the interrupted Agent on resume, if the Agent run needs to be resumed later, so the Agent can resume based on it +An Agent triggers an interrupt by emitting an event that contains `Interrupted`: ```go -// For example, when ChatModelAgent is interrupted, it sends the following AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, +gen.Send(&adk.AgentEvent{ + Action: &adk.AgentAction{ + Interrupted: &adk.InterruptInfo{Data: myData}, }, -}}) +}) ``` -### State Persistence (Checkpoint) - -When the Runner catches an Event with an Interrupted Action, it terminates the current execution flow immediately. If: +### State Persistence -1. CheckPointStore is set in the Runner +After catching the interrupt, the Runner stores the run state (input, chat history, InterruptInfo) in `CheckPointStore`, keyed by CheckPointID: ```go -// github.com/cloudwego/eino/adk/runner.go -type RunnerConfig struct { - // other fields - CheckPointStore CheckPointStore -} - -// github.com/cloudwego/eino/adk/interrupt.go type CheckPointStore interface { Set(ctx context.Context, key string, value []byte) error Get(ctx context.Context, key string) ([]byte, bool, error) } ``` -1. 
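a CheckPointID is passed in through the AgentRunOption WithCheckPointID when the Runner is called +Pass the CheckPointID through an option when calling: ```go -// github.com/cloudwego/eino/adk/interrupt.go -func WithCheckPointID(id string) AgentRunOption +iter := runner.Run(ctx, messages, adk.WithCheckPointID("cp-123")) ``` -then, after terminating the run, the Runner persists the current run state (original input, chat history, and so on) together with the InterruptInfo raised by the Agent into the CheckPointStore, keyed by CheckPointID. - > 💡 -> To preserve the original type of data held in interfaces, Eino ADK serializes the run state with gob ([https://pkg.go.dev/encoding/gob](https://pkg.go.dev/encoding/gob)). Custom types must therefore be registered in advance with gob.Register or gob.RegisterName (the latter is recommended; the former uses the package path plus the type name as the default name, so neither the type's location nor its name may change). Eino registers the framework's built-in types automatically. +> ADK serializes the run state with gob. Custom types must be registered in advance with gob.RegisterName. The framework's built-in types are registered automatically. -### Resume - -After a run is interrupted, call the Runner's Resume method with the CheckPointID used at the interrupt to resume the run: +### Resume ```go -// github.com/cloudwego/eino/adk/runner.go -func (r *Runner) Resume(ctx context.Context, checkPointID string, opts ...AgentRunOption) (*AsyncIterator[*AgentEvent], error) +// Simple resume: implicitly resumes all interruption points +iter, err := runner.Resume(ctx, "cp-123") + +// Targeted resume: specify the targets and their data +iter, err = runner.ResumeWithParams(ctx, "cp-123", &adk.ResumeParams{ + Targets: map[string]any{ + "agent-address": resumeData, + }, +}) ``` -Resuming requires the interrupted Agent to implement the ResumableAgent interface. The Runner reads the run state from the CheckPointStore and resumes the run; the InterruptInfo and the EnableStreaming value configured for the previous run are provided to the Agent as input: +Resuming requires the interrupted Agent to implement the `ResumableAgent` interface: ```go -// github.com/cloudwego/eino/adk/interface.go -type ResumableAgent interface { - Agent - - Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] -} - -// github.com/cloudwego/eino/adk/interrupt.go -type ResumeInfo struct { - EnableStreaming bool - *InterruptInfo +type TypedResumableAgent[M MessageType] interface { + TypedAgent[M] + Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]] } ``` -To pass new information to the Agent on resume, define an AgentRunOption and pass it in when calling Runner.Resume. +# Multi-Turn Runtime: TurnLoop + +For scenarios that need multi-turn interaction (chat applications, ongoing conversations), ADK provides the `TurnLoop` runtime: + +- **Push-based 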
event loop**: pushing a new message triggers an Agent run +- **Preempt**: when the user sends a new message while the Agent is running, the current run can be cancelled +- **Stop**: stops the event loop +- **Declarative Checkpoint/Resume**: TurnLoop manages input bookkeeping automatically; the application only declares its resume strategy + +See: [Agent Cancel and TurnLoop Quickstart](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +# Agent Cancel + +A runtime cancellation capability added in v0.9. It supports: + +- **CancelMode bitmask combinations**: `CancelModelStream | CancelToolCalls` +- **CancelHandle.Wait()**: waits for cancellation to complete +- **TurnLoop integration**: Preempt triggers Cancel automatically + +See: [Agent Cancel and TurnLoop Quickstart](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md deleted file mode 100644 index 6c4a682268c..00000000000 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model.md +++ /dev/null @@ -1,897 +0,0 @@ ---- -Description: "" -date: "2026-03-24" -lastmod: "" -tags: [] -title: ChatModelAgent -weight: 1 ---- - -# ChatModelAgent Overview - -## Import Path - -`import "github.com/cloudwego/eino/adk"` - -## What Is ChatModelAgent - -`ChatModelAgent` is a core prebuilt Agent in Eino ADK. It encapsulates the logic of interacting with a large language model (LLM) and using tools to complete tasks. - -## ChatModelAgent ReAct Pattern - -`ChatModelAgent` uses the [ReAct](https://react-lm.github.io/) pattern internally, which aims to solve complex problems by having the ChatModel "think" explicitly, step by step. Once tools are configured for `ChatModelAgent`, its internal execution flow follows the ReAct pattern: - -- Call the ChatModel (Reason) -- The LLM returns a tool call request (Action) -- ChatModelAgent executes the tool (Act) -- It returns the tool result to the ChatModel (Observation) and starts a new loop, until the ChatModel decides that no further Tool call is needed. - -When no tools are configured, `ChatModelAgent` degrades to a single ChatModel call. - - - -Tools can be configured for ChatModelAgent through ToolsConfig: - -```go -// github.com/cloudwego/eino/adk/chatmodel.go - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. 
- ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} -``` - -ToolsConfig 复用了 Eino Graph ToolsNodeConfig,详细参考:[Eino: ToolsNode&Tool 使用说明](/zh/docs/eino/core_modules/components/tools_node_guide)。额外提供了 ReturnDirectly 配置,ChatModelAgent 调用配置在 ReturnDirectly 中的 Tool 后会直接退出。 - -## ChatModelAgent 配置字段 - -> 💡 -> 注意:GenModelInput 默认情况下,会通过 adk.GetSessionValues() 并以 F-String 的格式渲染 Instruction,如需关闭此行为,可定制 GenModelInput 方法。 - -```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. 
- // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. - MaxIterations int - - // ModelRetryConfig configures retry behavior for the ChatModel. - // When set, the agent will automatically retry failed ChatModel calls - // based on the configured policy. - // Optional. If nil, no retry will be performed. - ModelRetryConfig *ModelRetryConfig -} - -type ToolsConfig struct { - compose.ToolsNodeConfig - - // Names of the tools that will make agent return directly when the tool is called. - // When multiple tools are called and more than one tool is in the return directly list, only the first one will be returned. - ReturnDirectly map[string]bool - - // EmitInternalEvents indicates whether internal events from agentTool should be emitted - // to the parent generator via a tool option injection at run-time. - EmitInternalEvents bool -} - -type GenModelInput func(ctx context.Context, instruction string, input *AgentInput) ([]Message, error) -``` - -- `Name`:Agent 名称 -- `Description`:Agent 描述 -- `Instruction`:调用 ChatModel 时的 System Prompt,支持 f-string 渲染 -- `Model`:运行所使用的 ChatModel,要求支持工具调用 -- `ToolsConfig`:工具配置 - - ToolsConfig 复用了 Eino Graph ToolsNodeConfig,详细参考:[Eino: ToolsNode&Tool 使用说明](/zh/docs/eino/core_modules/components/tools_node_guide)。 - - ReturnDirectly:当 ChatModelAgent 调用配置在 ReturnDirectly 中的 Tool 后,将携带结果立刻退出,不会按照 react 模式返回 ChatModel。如果命中了多个 Tool,只有首个 Tool 会返回。Map key 为 Tool 名称。 - - EmitInternalEvents:当通过 adk.AgentTool() 将一个 Agent 通过 ToolCall 的形式当成 SubAgent 时,默认情况下,这个 SubAgent 不会发送 AgentEvent,只将最终结果作为 ToolResult 返回。 -- `GenModelInput`:Agent 被调用时会使用该方法将 `Instruction` 和 `AgentInput` 转换为调用 ChatModel 的 Messages。Agent 提供了默认的 GenModelInput 方法: - 1. 将 `Instruction` 作为 `System Message` 加到 `AgentInput.Messages` 前 - 2. 
将 `SessionValues` 为 variables 渲染到步骤 1 的 message list 中 - -> 💡 -> 默认的 `GenModelInput` 使用 pyfmt 渲染,message list 中的文本会被作为 pyfmt 模板,这意味着文本中的 '{' 与 '}' 都会被视为关键字,如果希望直接输入这两个字符,需要进行转义 '{{'、'}}' - -- `OutputKey`:配置后,ChatModelAgent 运行产生的最后一条 Message 将会以 `OutputKey` 为 key 设置到 `SessionValues` 中 -- `MaxIterations`:react 模式下 ChatModel 最大生成次数,超过时 Agent 会报错退出,默认值为 20 -- `Exit`:Exit 是一个特殊的 Tool,当模型调用这个工具并执行后,ChatModelAgent 将直接退出,效果与 `ToolsConfig.ReturnDirectly` 类似。ADK 提供了一个默认 ExitTool 实现供用户使用: - -```go -type ExitTool struct{} - -func (et ExitTool) Info(_ context.Context) (*schema.ToolInfo, error) { - return ToolInfoExit, nil -} - -func (et ExitTool) InvokableRun(ctx context.Context, argumentsInJSON string, _ ...tool.Option) (string, error) { - type exitParams struct { - FinalResult string `json:"final_result"` - } - - params := &exitParams{} - err := sonic.UnmarshalString(argumentsInJSON, params) - if err != nil { - return "", err - } - - err = SendToolGenAction(ctx, "exit", NewExitAction()) - if err != nil { - return "", err - } - - return params.FinalResult, nil -} -``` - -- `ModelRetryConfig`: 配置后,ChatModel 请求过程中发生的各种错误(包括直接返回错误、流式响应过程中发生错误等),都会按照配置的策略选择是否以及何时进行重试。如果是流式响应过程中发生错误,则这一次流式响应依然会第一时间通过 AgentEvent 的形式返回出去。如果这次流式响应过程中的错误,按照配置的策略,会进行重试,则消费 AgentEvent 中的 message stream,会得到 `WillRetryError`。用户可以处理这个 error,做对应的上屏展示等处理,示例如下: - -```go -iterator := agent.Run(ctx, input) -for { - event, ok := iterator.Next() - if !ok { - break - } - - if event.Err != nil { - handleFinalError(event.Err) - break - } - - // Process streaming output - if event.Output != nil && event.Output.MessageOutput.IsStreaming { - stream := event.Output.MessageOutput.MessageStream - for { - msg, err := stream.Recv() - if err == io.EOF { - break // Stream completed successfully - } - if err != nil { - // Check if this error will be retried (more streams coming) - var willRetry *adk.WillRetryError - if errors.As(err, &willRetry) { - log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt) - break // 
Wait for next event with new stream - } - // Original error - won't retry, agent will stop and the next AgentEvent probably will be an error - log.Printf("Final error (no retry): %v", err) - break - } - // Display chunk to user - displayChunk(msg) - } - } -} -``` - -## ChatModelAgent Transfer - -`ChatModelAgent` 支持将其他 Agent 的元信息转为自身的 Tool ,经由 ChatModel 判断实现动态 Transfer: - -- `ChatModelAgent` 实现了 `OnSubAgents` 接口,使用 `SetSubAgents` 为 `ChatModelAgent` 设置子 Agents 后,`ChatModelAgent` 会增加一个 `Transfer Tool`,并且在 prompt 中指示 ChatModel 在需要 transfer 时调用这个 Tool 并以 transfer 目标 AgentName 作为 Tool 输入。 - -```go -const ( - TransferToAgentInstruction = `Available other agents: %s - -Decision rule: -- If you're best suited for the question according to your description: ANSWER -- If another agent is better according its description: CALL '%s' function with their agent name - -When transferring: OUTPUT ONLY THE FUNCTION CALL` -) - -func genTransferToAgentInstruction(ctx context.Context, agents []Agent) string { - var sb strings.Builder - for _, agent := range agents { - sb.WriteString(fmt.Sprintf("\n- Agent name: %s\n Agent description: %s", - agent.Name(ctx), agent.Description(ctx))) - } - - return fmt.Sprintf(TransferToAgentInstruction, sb.String(), TransferToAgentToolName) -} -``` - -- `Transfer Tool` 运行会设置 Transfer Event,指定跳转到目标 Agent 上,完成后 ChatModelAgent 退出。 -- Agent Runner 接收到 Transfer Event 后,跳转到目标 Agent 上执行,完成 Transfer 操作 - -## ChatModelAgent AgentAsTool - -当需要被调用的 Agent 不需要完整的运行上下文,仅需要明确清晰的入参即可正确运行时,该 Agent 可以转换为 Tool 交由 `ChatModelAgent` 判断调用: - -- ADK 中提供了工具方法,可以方便地将 Eino ADK Agent 转化为 Tool 供 ChatModelAgent 调用: - -```go -// github.com/cloudwego/eino/adk/agent_tool.go - -func NewAgentTool(_ context.Context, agent Agent, options ...AgentToolOption) tool.BaseTool -``` - -- 被转换为 Tool 后的 Agent 可以通过 `ToolsConfig` 直接注册在 ChatModelAgent 中 - -```go -bookRecommender := NewBookRecommendAgent() -bookRecommendeTool := NewAgentTool(ctx, bookRecommender) - -a, err := adk.NewChatModelAgent(ctx, 
&adk.ChatModelAgentConfig{ - // ... - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{bookRecommendeTool}, - }, - }, -}) -``` - -## ChatModelAgent Middleware - -`ChatModelAgentMiddleware` 是 `ChatModelAgent` 的扩展机制,允许开发者在 Agent 执行的各个阶段注入自定义逻辑: - - - -`ChatModelAgentMiddleware` 定义为 interface,开发者可以实现此 interface 并通过配置到 `ChatModelAgentConfig` 使其在 `ChatModelAgent` 中生效: - -```go -type ChatModelAgentMiddleware interface { - // ... -} - -a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // ... - Handlers: []adk.ChatModelAgentMiddleware{ - &MyMiddleware{}, - }, -}) -``` - -**使用 BaseChatModelAgentMiddleware** - -`BaseChatModelAgentMiddleware` 提供所有方法的默认空实现。通过嵌入它,可以只覆盖需要的方法: - -```go -type MyMiddleware struct { - *adk.BaseChatModelAgentMiddleware - // 自定义字段 - logger *log.Logger -} - -// 只需覆盖需要的方法 -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - m.logger.Printf("Messages count: %d", len(state.Messages)) - return ctx, state, nil -} -``` - -### BeforeAgent - -在每次 Agent 运行前调用,可用于修改指令和工具配置。ChatModelAgentContext 定义了 BeforeAgent 中可读写的内容: - -```go -type ChatModelAgentContext struct { - // InstructionAgent 是当前 Agent 的指令 - Instruction string - // Tools 是当前配置的原始工具列表 - Tools []tool.BaseTool - // ReturnDirectly 配置调用后直接返回的工具名称集合 - ReturnDirectly map[string]bool -} - -type ChatModelAgentMiddleware interface { - // ... - BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) - // ... 
-} -``` - -例子: - -```go -func (m *MyMiddleware) BeforeAgent( - ctx context.Context, - runCtx *adk.ChatModelAgentContext, -) (context.Context, *adk.ChatModelAgentContext, error) { - // 拷贝 runCtx,避免修改输入 - nRunCtx := *runCtx - - // 修改指令 - nRunCtx.Instruction += "\n\n请始终使用中文回复。" - - // 添加工具 - nRunCtx.Tools = append(runCtx.Tools, myCustomTool) - - // 设置工具直接返回 - nRunCtx.ReturnDirectly["my_tool"] = true - - return ctx, &nRunCtx, nil -} -``` - -### BeforeModelRewriteState / AfterModelRewriteState - -在每次模型调用前/后调用,可用于检查和修改消息历史。ModelContext 定义了只读内容,ChatModelAgentState 定义了可读写内容: - -```go -type ModelContext struct { - // Tools 包含当前配置给 Agent 的工具列表 - // 在请求时填充,包含将要发送给模型的工具信息 - Tools []*schema.ToolInfo - - // ModelRetryConfig 包含模型的重试配置 - // 从 Agent 的 ModelRetryConfig 填充 - ModelRetryConfig *ModelRetryConfig -} - -type ChatModelAgentState struct { - // Messages 包含当前会话中的所有消息 - Messages []Message -} - -type ChatModelAgentMiddleware interface { - BeforeModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) - AfterModelRewriteState(ctx context.Context, state *ChatModelAgentState, mc *ModelContext) (context.Context, *ChatModelAgentState, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) BeforeModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // 拷贝 state,避免修改入参 - nState := *state - - // 检查消息历史 - if len(state.Messages) > 50 { - // 截断过旧的消息 - nState.Messages = state.Messages[len(state.Messages)-50:] - } - return ctx, &nState, nil -} - -func (m *MyMiddleware) AfterModelRewriteState( - ctx context.Context, - state *adk.ChatModelAgentState, - mc *adk.ModelContext, -) (context.Context, *adk.ChatModelAgentState, error) { - // 模型响应是最后一条消息 - lastMsg := state.Messages[len(state.Messages)-1] - m.logger.Printf("Model response: %s", lastMsg.Content) - return ctx, state, nil -} -``` - -### WrapModel - 
-包装模型调用,可用于拦截和修改模型的输入输出: - -```go -type ChatModelAgentMiddleware interface { - WrapModel(ctx context.Context, m model.BaseChatModel, mc *ModelContext) (model.BaseChatModel, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) WrapModel( - ctx context.Context, - chatModel model.BaseChatModel, - mc *adk.ModelContext, -) (model.BaseChatModel, error) { - return &loggingModel{ - inner: chatModel, - logger: m.logger, - }, nil -} - -type loggingModel struct { - inner model.BaseChatModel - logger *log.Logger -} - -func (m *loggingModel) Generate(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.Message, error) { - m.logger.Printf("Input messages: %d", len(msgs)) - resp, err := m.inner.Generate(ctx, msgs, opts...) - m.logger.Printf("Output: %v, error: %v", resp != nil, err) - return resp, err -} - -func (m *loggingModel) Stream(ctx context.Context, msgs []*schema.Message, opts ...model.Option) (*schema.StreamReader[*schema.Message], error) { - return m.inner.Stream(ctx, msgs, opts...) 
-} -``` - -### WrapInvokableToolCall / WrapStreamableToolCall - -包装工具调用,可用于拦截和修改工具的输入输出: - -```go -// InvokableToolCallEndpoint 是工具调用的函数签名。 -// Middleware 开发者围绕这个 Endpoint 添加自定义逻辑。 -type InvokableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) - -// StreamableToolCallEndpoint 是流式工具调用的函数签名。 -// Middleware 开发者围绕这个 Endpoint 添加自定义逻辑。 -type StreamableToolCallEndpoint func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (*schema.StreamReader[string], error) - -type ToolContext struct { - // Name 说明了本次调用工具的名称 - Name string - // CallID 说明了本次调用工具的 ToolCallID - CallID string -} - -type ChatModelAgentMiddleware interface { - WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) - WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) -} -``` - -例子: - -```go -func (m *MyMiddleware) WrapInvokableToolCall( - ctx context.Context, - endpoint adk.InvokableToolCallEndpoint, - tCtx *adk.ToolContext, -) (adk.InvokableToolCallEndpoint, error) { - return func(ctx context.Context, argumentsInJSON string, opts ...tool.Option) (string, error) { - m.logger.Printf("Calling tool: %s (ID: %s)", tCtx.Name, tCtx.CallID) - start := time.Now() - - result, err := endpoint(ctx, argumentsInJSON, opts...) 
- - m.logger.Printf("Tool %s completed in %v", tCtx.Name, time.Since(start)) - return result, err - }, nil -} -``` - -# ChatModelAgent 使用示例 - -## 场景说明 - -创建一个图书推荐 Agent,Agent 将能够根据用户的输入推荐相关图书。 - -## 代码实现 - -### 步骤 1: 定义工具 - -图书推荐 Agent 需要一个根据能够根据用户要求(题材、评分等)检索图书的工具 `book_search` 。 - -利用 Eino 提供的工具方法可以方便地创建(可参考[如何创建一个 tool ?](/zh/docs/eino/core_modules/components/tools_node_guide/how_to_create_a_tool)): - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" -) - -type BookSearchInput struct { - Genre string `json:"genre" jsonschema:"description=Preferred book genre,enum=fiction,enum=sci-fi,enum=mystery,enum=biography,enum=business"` - MaxPages int `json:"max_pages" jsonschema:"description=Maximum page length (0 for no limit)"` - MinRating int `json:"min_rating" jsonschema:"description=Minimum user rating (0-5 scale)"` -} - -type BookSearchOutput struct { - Books []string -} - -func NewBookRecommender() tool.InvokableTool { - bookSearchTool, err := utils.InferTool("search_book", "Search books based on user preferences", func(ctx context.Context, input *BookSearchInput) (output *BookSearchOutput, err error) { - // search code - // ... 
- return &BookSearchOutput{Books: []string{"God's blessing on this wonderful world!"}}, nil - }) - if err != nil { - log.Fatalf("failed to create search book tool: %v", err) - } - return bookSearchTool -} -``` - -### 步骤 2: 创建 ChatModel - -Eino 提供了多种 ChatModel 封装(如 openai、gemini、doubao 等,详见 [Eino: ChatModel 使用说明](/zh/docs/eino/core_modules/components/chat_model_guide)),这里以 openai ChatModel 为例: - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/components/model" -) - -func NewChatModel() model.ToolCallingChatModel { - ctx := context.Background() - apiKey := os.Getenv("OPENAI_API_KEY") - openaiModel := os.Getenv("OPENAI_MODEL") - - cm, err := openai.NewChatModel(ctx, &openai.ChatModelConfig{ - APIKey: apiKey, - Model: openaiModel, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - return cm -} -``` - -### 步骤 3: 创建 ChatModelAgent - -除了配置 ChatModel 和工具外,还需要配置描述 Agent 功能用途的 Name 和 Description,以及指示 ChatModel 的 Instruction,Instruction 最终会作为 system message 被传递给 ChatModel。 - -```go -import ( - "context" - "fmt" - "log" - - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/compose" -) - -func NewBookRecommendAgent() adk.Agent { - ctx := context.Background() - - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - Name: "BookRecommender", - Description: "An agent that can recommend books", - Instruction: `You are an expert book recommender. Based on the user's request, use the "search_book" tool to find relevant books. 
Finally, present the results to the user.`, - Model: NewChatModel(), - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender()}, - }, - }, - }) - if err != nil { - log.Fatal(fmt.Errorf("failed to create chatmodel: %w", err)) - } - - return a -} -``` - -### - -### 步骤 4: 通过 Runner 运行 - -```go -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino/adk" - - "github.com/cloudwego/eino-examples/adk/intro/chatmodel/subagents" -) - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: a, - }) - iter := runner.Query(ctx, "recommend a fiction book to me") - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - msg, err := event.Output.MessageOutput.GetMessage() - if err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======", msg) - } -} -``` - -## 运行结果 - -```yaml -message: -assistant: -tool_calls: -{Index: ID:call_o2It087hoqj8L7atzr70EnfG Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{140 24 164} -====== - - -message: -tool: {"Books":["God's blessing on this wonderful world!"]} -tool_call_id: call_o2It087hoqj8L7atzr70EnfG -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's blessing on this wonderful world!". It's a great choice for readers looking for an exciting story. Enjoy your reading! 
-finish_reason: stop -usage: &{185 31 216} -====== -``` - -# ChatModelAgent 中断与恢复 - -## 介绍 - -`ChatModelAgent` 使用了 Eino Graph 实现,因此在 agent 中可以复用 Eino Graph 的 Interrupt&Resume 能力。 - -- Interrupt 时,通过在工具中返回特殊错误使 Graph 触发中断并向外抛出自定义信息,在恢复时 Graph 会重新运行此工具: - -```go -// github.com/cloudwego/eino/adk/interrupt.go - -func NewInterruptAndRerunErr(extra any) error -``` - -- Resume 时,支持自定义 ToolOption,用于在恢复时传递额外信息到 Tool 中: - -```go -import ( - "github.com/cloudwego/eino/components/tool" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} -``` - -## 示例 - -下面我们将基于上面【ChatModelAgent 使用示例】小节中的代码,为 `BookRecommendAgent` 增加一个工具 `ask_for_clarification`,当用户提供的信息不足以支持推荐时,Agent 将调用这个工具向用户询问更多信息,`ask_for_clarification` 使用了 Interrupt&Resume 能力来实现向用户“询问”。 - -### 步骤 1 : 新增 Tool 支持中断 - -```go -import ( - "context" - "log" - - "github.com/cloudwego/eino/components/tool" - "github.com/cloudwego/eino/components/tool/utils" - "github.com/cloudwego/eino/compose" -) - -type askForClarificationOptions struct { - NewInput *string -} - -func WithNewInput(input string) tool.Option { - return tool.WrapImplSpecificOptFn(func(t *askForClarificationOptions) { - t.NewInput = &input - }) -} - -type AskForClarificationInput struct { - Question string `json:"question" jsonschema:"description=The specific question you want to ask the user to get the missing information"` -} - -func NewAskForClarificationTool() tool.InvokableTool { - t, err := utils.InferOptionableTool( - "ask_for_clarification", - "Call this tool when the user's request is ambiguous or lacks the necessary information to proceed. 
Use it to ask a follow-up question to get the details you need, such as the book's genre, before you can use other tools effectively.", - func(ctx context.Context, input *AskForClarificationInput, opts ...tool.Option) (output string, err error) { - o := tool.GetImplSpecificOptions[askForClarificationOptions](nil, opts...) - if o.NewInput == nil { - return "", compose.NewInterruptAndRerunErr(input.Question) - } - return *o.NewInput, nil - }) - if err != nil { - log.Fatal(err) - } - return t -} -``` - -### 步骤 2: 添加 Tool 到 Agent 中 - -```go -func NewBookRecommendAgent() adk.Agent { - // xxx - a, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ - // xxx - ToolsConfig: adk.ToolsConfig{ - ToolsNodeConfig: compose.ToolsNodeConfig{ - Tools: []tool.BaseTool{NewBookRecommender(), NewAskForClarificationTool()}, - }, - // Tool 内部通过 AgentTool() 调用 SubAgent 时,是否将这个 SubAgent 的 AgentEvent 输出 - EmitInternalEvents: true, - }, - }) - // xxx -} -``` - -### 步骤 3: Agent Runner 配置 CheckPointStore - -在 Runner 中配置 `CheckPointStore`(例子中使用最简单的 InMemoryStore),并在调用 Agent 时传入 `CheckPointID`,用于在恢复时使用。另外,在中断时,Graph 会将 `InterruptInfo` 放入 `Interrupted.Data` 中: - -```go -func newInMemoryStore() compose.CheckPointStore { - return &inMemoryStore{ - mem: map[string][]byte{}, - } -} - -func main() { - ctx := context.Background() - a := subagents.NewBookRecommendAgent() - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - EnableStreaming: true, // you can disable streaming here - Agent: a, - CheckPointStore: newInMemoryStore(), - }) - iter := runner.Query(ctx, "recommend a book to me", adk.WithCheckPointID("1")) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Err != nil { - log.Fatal(event.Err) - } - if event.Action != nil && event.Action.Interrupted != nil { - fmt.Printf("\ninterrupt happened, info: %+v\n", event.Action.Interrupted.Data.(*adk.ChatModelAgentInterruptInfo).RerunNodesExtra["ToolNode"]) - continue - } - msg, err := event.Output.MessageOutput.GetMessage() - if 
err != nil { - log.Fatal(err) - } - fmt.Printf("\nmessage:\n%v\n======\n\n", msg) - } - - scanner := bufio.NewScanner(os.Stdin) - fmt.Print("\nyour input here: ") - scanner.Scan() - fmt.Println() - nInput := scanner.Text() - - iter, err := runner.Resume(ctx, "1", adk.WithToolOptions([]tool.Option{subagents.WithNewInput(nInput)})) - if err != nil { - log.Fatal(err) - } - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - prints.Event(event) - } -} -``` - -### 运行结果 - -运行后会发生中断 - -``` -message: -assistant: -tool_calls: -{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]} - -finish_reason: tool_calls -usage: &{219 37 256} -====== - - -interrupt happened, info: &{ToolCalls:[{Index: ID:call_3HAobzkJvW3JsTmSHSBRftaG Type:function Function:{Name:ask_for_clarification Arguments:{"question":"Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?"}} Extra:map[]}] ExecutedTools:map[] RerunTools:[call_3HAobzkJvW3JsTmSHSBRftaG] RerunExtraMap:map[call_3HAobzkJvW3JsTmSHSBRftaG:Could you please specify the genre you're interested in and any preferences like maximum page length or minimum user rating?]} -your input here: -``` - -stdin 输入后,从 CheckPointStore 取出之前中断状态,结合补全的输入,继续运行 - -``` -new input is: -recommend me a fiction book - -message: -tool: recommend me a fiction book -tool_call_id: call_3HAobzkJvW3JsTmSHSBRftaG -tool_call_name: ask_for_clarification -====== - - -message: -assistant: -tool_calls: -{Index: ID:call_3fC5OqPZLls11epXMv7sZGAF Type:function Function:{Name:search_book Arguments:{"genre":"fiction","max_pages":0,"min_rating":0}} Extra:map[]} - -finish_reason: tool_calls -usage: &{272 24 296} -====== - - -message: -tool: {"Books":["God's 
blessing on this wonderful world!"]} -tool_call_id: call_3fC5OqPZLls11epXMv7sZGAF -tool_call_name: search_book -====== - - -message: -assistant: I recommend the fiction book "God's Blessing on This Wonderful World!" Enjoy your reading! -finish_reason: stop -usage: &{317 20 337} -====== -``` - -# 总结 - -`ChatModelAgent` 是 ADK 核心 Agent 实现,充当应用程序 "思考" 的部分,利用 LLM 强大的功能进行推理、理解自然语言、作出决策、生成相应、进行工具交互。 - -`ChatModelAgent` 的行为是非确定性的,通过 LLM 来动态的决定使用哪些工具,或转交控制权到其他 Agent 上。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md new file mode 100644 index 00000000000..f740fa3fdc4 --- /dev/null +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/_index.md @@ -0,0 +1,306 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: ChatModelAgent +weight: 1 +--- + +# ChatModelAgent 概述 + +`import "github.com/cloudwego/eino/adk"` + +## 什么是 ChatModelAgent + +`ChatModelAgent` 是 Eino ADK 的核心 Agent 实现——以 ChatModel 为决策器、以 Tools 为行动空间、通过 ReAct Loop 自主推进问题求解。 + +关于 ChatModelAgent 的概念、ReAct Loop、Middleware 体系的完整介绍,见:[ChatModelAgent 介绍](/zh/docs/eino/overview/eino_adk_quickstart) + +## ReAct Loop + +当配置了 Tools 时,ChatModelAgent 按 ReAct 模式循环执行: + +1. **Reason**:调用 ChatModel,模型决定下一步行动 +2. **Action**:模型返回 ToolCall 请求 +3. **Act**:执行对应 Tool +4. 
**Observation**:将 Tool 结果注入上下文,开始新一轮循环 + +循环持续直到模型判断无需再调用 Tool。未配置 Tools 时退化为单次 ChatModel 调用。 + +# 配置 + +## TypedChatModelAgentConfig + +```go +type TypedChatModelAgentConfig[M MessageType] struct { + Name string + Description string + Instruction string + + Model model.BaseModel[M] // 必填。使用 Tools 时须支持 model.WithTools + + ToolsConfig ToolsConfig + GenModelInput TypedGenModelInput[M] + + Exit tool.BaseTool // NOT RECOMMENDED + OutputKey string // NOT RECOMMENDED + MaxIterations int // 默认 20 + + Handlers []TypedChatModelAgentMiddleware[M] + Middlewares []AgentMiddleware // 旧版兼容 + + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +// 默认别名 +type ChatModelAgentConfig = TypedChatModelAgentConfig[*schema.Message] +``` + +### 字段说明 + + + + + + + + + + + + + + +
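+| Field | Description |
+| --- | --- |
+| `Name` | Agent name. Required when the agent is used as an AgentTool |
+| `Description` | Description of the agent's capabilities. Required when the agent is used as an AgentTool |
+| `Instruction` | System prompt. Supports `{Key}` placeholders, which the default `GenModelInput` renders from SessionValues |
+| `Model` | Required. Of type `model.BaseModel[M]`; must support `model.WithTools` when Tools are used |
+| `ToolsConfig` | Tool configuration, described below |
+| `GenModelInput` | Custom input transformation. By default, Instruction is used as the System Message, with f-string rendering |
+| `MaxIterations` | Maximum number of ReAct loop iterations; exceeding it exits with an error. Defaults to 20 |
+| `Handlers` | Interface-style middleware (`TypedChatModelAgentMiddleware[M]`); recommended |
+| `Middlewares` | Struct-style middleware (`AgentMiddleware`); kept for backward compatibility |
+| `ModelRetryConfig` | Retry policy applied when a model call fails |
+| `ModelFailoverConfig` | Switches to a backup model when a model call fails. Requires `GetFailoverModel` and `ShouldFailover` |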
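The `MaxIterations` cap can be illustrated without the framework. The following is a minimal sketch using only the standard library: `modelStep`, `runReAct`, and `errMaxIterations` are hypothetical names invented for this illustration, not eino identifiers, and how eino itself counts an iteration is an assumption here.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultMaxIterations mirrors the documented default of 20.
const defaultMaxIterations = 20

// errMaxIterations is a hypothetical sentinel error, not an eino identifier.
var errMaxIterations = errors.New("exceeded max iterations")

// modelStep stands in for one ChatModel call. It returns the name of the tool
// the model wants to call, or an empty name plus a final answer when it is done.
type modelStep func(observations []string) (toolName, answer string)

// runReAct repeats Reason -> Act -> Observe until the model stops requesting
// tools, and fails once maxIterations model calls have been used up.
func runReAct(model modelStep, tools map[string]func() string, maxIterations int) (string, error) {
	if maxIterations <= 0 {
		maxIterations = defaultMaxIterations
	}
	var observations []string
	for i := 0; i < maxIterations; i++ {
		toolName, answer := model(observations)
		if toolName == "" {
			return answer, nil // no tool requested: the loop ends
		}
		run, ok := tools[toolName]
		if !ok {
			return "", fmt.Errorf("unknown tool %q", toolName)
		}
		observations = append(observations, run()) // feed the result back
	}
	return "", errMaxIterations
}

func main() {
	tools := map[string]func() string{
		"search_book": func() string { return "Dune" },
	}

	// A model that calls one tool, then answers.
	polite := func(obs []string) (string, string) {
		if len(obs) == 0 {
			return "search_book", ""
		}
		return "", "I recommend " + obs[0]
	}
	answer, err := runReAct(polite, tools, 0)
	fmt.Println(answer, err)

	// A model that never stops asking for tools hits the cap.
	greedy := func([]string) (string, string) { return "search_book", "" }
	_, err = runReAct(greedy, tools, 3)
	fmt.Println(err)
}
```

Passing `0` falls back to the default, which matches the way the table describes `MaxIterations`; the error returned by the real agent may differ in type and wording.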
    + +> 💡 +> 默认 GenModelInput 使用 pyfmt 渲染,Messages 中的 `{` 和 `}` 会被视为占位符。如需直接输出这两个字符,用 `{{` 和 `}}` 转义。 + +### ToolsConfig + +```go +type ToolsConfig struct { + compose.ToolsNodeConfig + + ReturnDirectly map[string]bool // 调用后直接返回的 Tool 名称 + EmitInternalEvents bool // 透传 AgentTool 内部事件 +} +``` + +- **ReturnDirectly**:命中的 Tool 执行后 Agent 立即退出,不再回调模型。多个命中时取首个 +- **EmitInternalEvents**:当子 Agent 通过 AgentTool 调用时,将子 Agent 事件实时透传到父 Agent 事件流 + +### 构造函数 + +```go +func NewChatModelAgent(ctx context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) +func NewTypedChatModelAgent[M MessageType](ctx context.Context, config *TypedChatModelAgentConfig[M]) (*TypedChatModelAgent[M], error) +``` + +# Middleware(ChatModelAgentMiddleware) + +## 接口定义 + +```go +type TypedChatModelAgentMiddleware[M MessageType] interface { + BeforeAgent(ctx context.Context, runCtx *ChatModelAgentContext) (context.Context, *ChatModelAgentContext, error) + AfterAgent(ctx context.Context, state *TypedChatModelAgentState[M]) (context.Context, error) + + BeforeModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + AfterModelRewriteState(ctx context.Context, state *TypedChatModelAgentState[M], mc *TypedModelContext[M]) (context.Context, *TypedChatModelAgentState[M], error) + + WrapModel(ctx context.Context, m model.BaseModel[M], mc *TypedModelContext[M]) (model.BaseModel[M], error) + + WrapInvokableToolCall(ctx context.Context, endpoint InvokableToolCallEndpoint, tCtx *ToolContext) (InvokableToolCallEndpoint, error) + WrapStreamableToolCall(ctx context.Context, endpoint StreamableToolCallEndpoint, tCtx *ToolContext) (StreamableToolCallEndpoint, error) + WrapEnhancedInvokableToolCall(ctx context.Context, endpoint EnhancedInvokableToolCallEndpoint, tCtx *ToolContext) (EnhancedInvokableToolCallEndpoint, error) + WrapEnhancedStreamableToolCall(ctx context.Context, endpoint 
EnhancedStreamableToolCallEndpoint, tCtx *ToolContext) (EnhancedStreamableToolCallEndpoint, error) +} + +type ChatModelAgentMiddleware = TypedChatModelAgentMiddleware[*schema.Message] +``` + +使用 `*BaseChatModelAgentMiddleware` 嵌入可只覆盖需要的方法: + +```go +type MyMiddleware struct { + *adk.BaseChatModelAgentMiddleware +} + +func (m *MyMiddleware) BeforeModelRewriteState( + ctx context.Context, + state *adk.ChatModelAgentState, + mc *adk.ModelContext, +) (context.Context, *adk.ChatModelAgentState, error) { + // 自定义逻辑 + return ctx, state, nil +} +``` + +## 钩子点位 + + + + + + + + + +
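+| Hook | When | What can be modified |
+| --- | --- | --- |
+| `BeforeAgent` | Before the agent runs (once only) | Instruction, Tools, ReturnDirectly, ToolSearchTool |
+| `AfterAgent` | After the agent finishes successfully | Reads the final state (no modification) |
+| `BeforeModelRewriteState` | Before every model call | Messages, ToolInfos, DeferredToolInfos (persisted to state) |
+| `AfterModelRewriteState` | After every model call | Messages (including the model response), ToolInfos (persisted to state) |
+| `WrapModel` | Wraps the model call | Retry, failover, event sending (do not modify Messages) |
+| `WrapToolCall` | Wraps the tool call | Permission checks, logging, output rewriting |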
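The `Wrap*` hooks in the table follow the usual decorator pattern: each hook receives an endpoint and returns a new one. The sketch below shows that pattern with the standard library only, including the "first registered is outermost" ordering this document states for `WrapModel` and `WrapToolCall`. The names `toolCall`, `wrapper`, `chain`, `denyEmptyArgs`, and `upperCaseOutput` are hypothetical and exist only for this illustration; the real endpoint types carry a context and richer arguments.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// toolCall is a hypothetical stand-in for a tool endpoint.
type toolCall func(args string) (string, error)

// wrapper decorates a toolCall, the way a Wrap* hook receives an endpoint
// and returns a new one.
type wrapper func(next toolCall) toolCall

// chain applies wrappers so that the first registered one ends up outermost.
func chain(endpoint toolCall, wrappers ...wrapper) toolCall {
	for i := len(wrappers) - 1; i >= 0; i-- {
		endpoint = wrappers[i](endpoint)
	}
	return endpoint
}

// denyEmptyArgs is a permission-check style wrapper: it rejects the call
// before it reaches the tool.
func denyEmptyArgs(next toolCall) toolCall {
	return func(args string) (string, error) {
		if strings.TrimSpace(args) == "" {
			return "", errors.New("permission check: empty arguments")
		}
		return next(args)
	}
}

// upperCaseOutput is an output-rewriting wrapper.
func upperCaseOutput(next toolCall) toolCall {
	return func(args string) (string, error) {
		out, err := next(args)
		if err != nil {
			return "", err
		}
		return strings.ToUpper(out), nil
	}
}

func main() {
	echo := func(args string) (string, error) { return "echo:" + args, nil }
	call := chain(echo, denyEmptyArgs, upperCaseOutput)

	fmt.Println(call("hi")) // passes the check, output is rewritten
	fmt.Println(call(" "))  // rejected by the outermost wrapper
}
```

Putting the permission check first means a rejected call never reaches the inner wrappers or the tool itself, which is the practical reason registration order matters.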
    + +> 💡 +> `BeforeModelRewriteState` 返回的 state 会被框架持久化到 agent 内部状态。因此该钩子中的修改(如压缩 Messages、过滤 ToolInfos)会影响后续所有迭代。 + +## 核心类型 + +### ChatModelAgentContext(BeforeAgent 参数) + +```go +type ChatModelAgentContext struct { + Instruction string + Tools []tool.BaseTool + ReturnDirectly map[string]bool + ToolSearchTool *schema.ToolInfo // 模型原生 ToolSearch 能力 +} +``` + +### ChatModelAgentState(BeforeModel/AfterModel 参数) + +```go +type TypedChatModelAgentState[M MessageType] struct { + Messages []M + ToolInfos []*schema.ToolInfo // 传给模型的工具列表 + DeferredToolInfos []*schema.ToolInfo // 服务端延迟检索的工具列表 +} + +type ChatModelAgentState = TypedChatModelAgentState[*schema.Message] +``` + +### ModelContext(WrapModel 参数) + +```go +type TypedModelContext[M MessageType] struct { + Tools []*schema.ToolInfo // Deprecated: 用 state.ToolInfos + ModelRetryConfig *TypedModelRetryConfig[M] + ModelFailoverConfig *ModelFailoverConfig[M] +} + +type ModelContext = TypedModelContext[*schema.Message] +``` + +## 执行顺序 + +**模型调用链**(外到内): + +1. `AgentMiddleware.BeforeChatModel` +2. **BeforeModelRewriteState** +3. failover wrapper(内置) +4. retry wrapper(内置) +5. event sender wrapper(内置) +6. **WrapModel**(先注册 = 最外层) +7. callback injection(内置) +8. 实际模型调用 +9. **AfterModelRewriteState** +10. `AgentMiddleware.AfterChatModel` + +**工具调用链**(外到内): + +1. event sender(内置) +2. `ToolsConfig.ToolCallMiddlewares` +3. `AgentMiddleware.WrapToolCall` +4. **WrapToolCall**(先注册 = 最外层) +5. callback injection(内置) +6. 实际工具调用 + +# AgentAsTool + +将子 Agent 包装为 Tool,父 Agent 通过 ToolCall 自主调用: + +```go +subAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "researcher", + Description: "搜索并总结信息", + Model: chatModel, + // ... +}) + +agentTool := adk.NewAgentTool(ctx, subAgent) + +parentAgent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + // ... 
+	ToolsConfig: adk.ToolsConfig{
+		ToolsNodeConfig: compose.ToolsNodeConfig{
+			Tools: []tool.BaseTool{agentTool},
+		},
+	},
+})
+```
+
+Generic version: `adk.NewTypedAgentTool[M](ctx, agent, options...)`
+
+Options: `WithFullChatHistoryAsInput()` (pass the full conversation history) and `WithAgentInputSchema(schema)` (custom input schema).
+
+# ModelRetry
+
+When configured, failed ChatModel calls are retried automatically. If an error occurs during a streaming response, the current stream is still returned through an AgentEvent, and consuming the MessageStream yields a `WillRetryError`:
+
+```go
+agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+	// ...
+	ModelRetryConfig: &adk.ModelRetryConfig{
+		// retry policy configuration
+	},
+})
+
+// Handle WillRetryError while consuming the event stream
+stream := event.Output.MessageOutput.MessageStream
+for {
+	msg, err := stream.Recv()
+	if err == io.EOF {
+		break
+	}
+	if err != nil {
+		var willRetry *adk.WillRetryError
+		if errors.As(err, &willRetry) {
+			log.Printf("Attempt %d failed, retrying...", willRetry.RetryAttempt)
+			break // wait for the next event
+		}
+		break
+	}
+	displayChunk(msg)
+}
+```
+
+# ModelFailover
+
+When configured, the agent switches to a backup model when a model call fails. The callback signatures below follow the `ModelFailoverConfig` definition in the ChatModel Failover guide:
+
+```go
+agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
+	Model: primaryModel,
+	ModelFailoverConfig: &adk.ModelFailoverConfig{
+		MaxRetries: 1, // 0 means no failover
+		ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool {
+			return true // decide based on the error type
+		},
+		GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) (
+			model.BaseChatModel, []*schema.Message, error,
+		) {
+			return backupModel, nil, nil // nil messages: reuse the original input
+		},
+	},
+})
+```
+
+# Cancel
+
+Runtime cancellation, added in v0.9. See [Agent Cancel and TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart).
+
+```go
+cancelOpt, cancelFn := adk.WithCancel()
+iter := runner.Run(ctx, messages, cancelOpt)
+
+// Cancel later (CancelMode values can be combined as a bitmask)
+handle := cancelFn(adk.CancelAfterChatModel | adk.CancelAfterToolCalls)
+handle.Wait() // wait for the cancellation to complete
+```
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md
new file mode 100644
index 00000000000..404dab7b805
--- /dev/null
+++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide.md @@ -0,0 +1,173 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: ChatModel Failover 功能文档 +weight: 1 +--- + +## 概述 + +`ChatModelAgent` 内置模型故障转移(Failover)能力:主模型调用失败时自动切换备用模型,支持 Generate(同步)和 Stream(流式)。通过 `ModelFailoverConfig[M]` 配置,与 `TypedModelRetryConfig[M]`(同模型重试)正交组合。 + +> 本文以默认 `*schema.Message` 类型为例。泛型用法请将 API 替换为对应的 `Typed` 前缀版本,消息类型参数化为 `M MessageType`。 + +## 核心数据结构 + +### ModelFailoverConfig[M] + +```go +type ModelFailoverConfig[M MessageType] struct { + // 最大故障转移次数。0 表示不 failover; + // 1 表示 GetFailoverModel 最多被调用 1 次。 + // 含 lastSuccessModel 时先尝试它,再调用 GetFailoverModel。 + MaxRetries uint + + // 判断是否触发 failover。ctx.Err() != nil 时不论返回值均停止。 + // 与 ModelRetryConfig 组合时,outputErr 为 *RetryExhaustedError; + // 原始错误通过 RetryExhaustedError.LastErr 获取。 + // 流式场景下 outputMessage 可能携带已接收的部分消息。 + // 配置 ModelFailoverConfig 时此字段必填。 + ShouldFailover func(ctx context.Context, outputMessage M, outputErr error) bool + + // 选择下一个模型并可选地转换输入消息。 + // failoverCtx.FailoverAttempt 从 1 开始。 + // 返回 nil failoverModelInputMessages 表示沿用原始输入。 + // 返回非 nil failoverErr 立即终止 failover。 + // 配置 ModelFailoverConfig 时此字段必填。 + GetFailoverModel func(ctx context.Context, failoverCtx *FailoverContext[M]) ( + failoverModel model.BaseModel[M], + failoverModelInputMessages []M, + failoverErr error, + ) +} +``` + +### FailoverContext[M] + +```go +type FailoverContext[M MessageType] struct { + FailoverAttempt uint // 当前尝试编号,从 1 开始 + InputMessages []M // 转换前的原始输入 + LastOutputMessage M // 上次失败的输出(流式下为部分消息) + // 与 ModelRetryConfig 组合时为 *RetryExhaustedError + LastErr error // 上次失败的错误 +} +``` + +## 快速接入 + +### 基础用法:双模型故障转移 + +```go +agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-agent", + Instruction: "You are a helpful assistant.", + Model: primaryModel, // model.BaseModel[*schema.Message],必填 + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + 
MaxRetries: 1, // 最多 1 次 failover(共 2 次调用) + + ShouldFailover: func(ctx context.Context, msg *schema.Message, err error) bool { + return !errors.Is(err, context.Canceled) && + !errors.Is(err, context.DeadlineExceeded) + }, + + GetFailoverModel: func(ctx context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil // nil 消息 → 沿用原始输入 + }, + }, +}) +``` + +> 💡 +> `model.BaseChatModel` 是 `model.BaseModel[*schema.Message]` 的类型别名,两者可互换使用。 + +### 故障转移时转换输入 + +当备用模型不支持某些功能(如图片输入)时: + +```go +ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, _ error) bool { + return true + }, + GetFailoverModel: func(_ context.Context, fc *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + // 过滤掉图片内容,降级到纯文本模型 + return textModel, filterTextOnly(fc.InputMessages), nil + }, +}, +``` + +### 结合 Retry + +Failover 与 Retry 正交组合。语义:**每个模型先按 Retry 策略重试,重试耗尽后触发 Failover 切换**。 + +```go +agent, _ := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Model: primaryModel, + // ... + + ModelRetryConfig: &adk.ModelRetryConfig{ + MaxRetries: 2, + IsRetryAble: func(_ context.Context, err error) bool { + return isTransientError(err) + }, + }, + + ModelFailoverConfig: &adk.ModelFailoverConfig{ + MaxRetries: 1, + ShouldFailover: func(_ context.Context, _ *schema.Message, err error) bool { + // err 此时为 *RetryExhaustedError + return true + }, + GetFailoverModel: func(_ context.Context, _ *adk.FailoverContext) ( + model.BaseChatModel, []*schema.Message, error, + ) { + return fallbackModel, nil, nil + }, + }, +}) +``` + +## 流式 Failover 行为 + + + + + + +
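+| Scenario | Behavior |
+| --- | --- |
+| `Stream()` initialization fails | Same as Generate: the failover decision is triggered directly |
+| Error in the middle of a stream | Chunks received so far are concatenated into `LastOutputMessage` and passed to `ShouldFailover`; once failover is decided, the current stream is closed and restarted with the new model |
+| Client impact | Events already sent during a failed attempt are not retracted. Clients should reset partial results when a new stream round arrives, or deduplicate by metadata |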
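The mid-stream case in the table is the one that most affects client code, so a small simulation may help. This is a sketch under stated assumptions, using only the standard library: `streamFn`, `client`, `streamWithFailover`, and `errDropped` are hypothetical names, a single failover is modeled (as with `MaxRetries: 1`), and the real framework delivers chunks as events rather than through a callback.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errDropped simulates a connection error in the middle of a stream.
var errDropped = errors.New("stream dropped")

// streamFn is a hypothetical model stream: it pushes chunks to emit and may
// fail part-way through.
type streamFn func(emit func(chunk string)) error

// client is a hypothetical consumer that discards its partial text whenever
// a new stream round starts.
type client struct{ text strings.Builder }

func (c *client) onChunk(chunk string) { c.text.WriteString(chunk) }
func (c *client) onNewRound()          { c.text.Reset() }

// streamWithFailover streams from primary. On error, the chunks received so
// far are joined into one partial message and handed to shouldFailover. If it
// returns true, the client is told a new round starts and backup is streamed.
func streamWithFailover(primary, backup streamFn, shouldFailover func(partial string, err error) bool, c *client) error {
	var received []string
	err := primary(func(chunk string) {
		received = append(received, chunk)
		c.onChunk(chunk) // already delivered; it cannot be retracted later
	})
	if err == nil {
		return nil
	}
	if !shouldFailover(strings.Join(received, ""), err) {
		return err
	}
	c.onNewRound()
	return backup(c.onChunk)
}

func main() {
	primary := func(emit func(string)) error {
		emit("Hel")
		emit("lo")
		return errDropped
	}
	backup := func(emit func(string)) error {
		emit("Hi")
		emit(" there")
		return nil
	}

	c := &client{}
	err := streamWithFailover(primary, backup, func(partial string, err error) bool {
		fmt.Printf("partial=%q err=%v\n", partial, err)
		return true
	}, c)
	fmt.Println(c.text.String(), err)
}
```

Without the reset in `onNewRound`, the client would end up with the failed attempt's chunks followed by the backup model's full answer, which is the duplication the table warns about.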
    + +> 💡 +> `ErrStreamCanceled`(调用方主动放弃流)不触发 failover,直接返回。 + +## Model 调用链执行顺序 + +Failover 在包装链中的位置(从外到内): + +``` +1. AgentMiddleware.BeforeChatModel + 2. ChatModelAgentMiddleware.BeforeModelRewriteState + 3. failoverModelWrapper ← failover 在此层 + 4. retryModelWrapper ← 每个 failover 模型内部重试 + 5. eventSenderModelWrapper + 6. ChatModelAgentMiddleware.WrapModel(先注册的在最外层) + 7. callbackInjectionModelWrapper(failover 启用时由 failoverProxyModel 内部处理) + 8. failoverProxyModel / Model.Generate|Stream + 9. ChatModelAgentMiddleware.AfterModelRewriteState +10. AgentMiddleware.AfterChatModel +``` + +## 注意事项 + +- **必填校验**:`ShouldFailover` 和 `GetFailoverModel` 在配置 `ModelFailoverConfig` 时均为必填,缺少任一在 `NewChatModelAgent` 时返回错误。`Model` 字段始终必填。 +- **Attempt 编号**:`FailoverAttempt` 从 1 开始。单次 Model 调用最多执行 `1 + MaxRetries` 次(初始 1 次 + failover 最多 MaxRetries 次)。 +- **输入消息**:`GetFailoverModel` 返回 `nil` 消息时沿用原始输入;返回非 `nil` 时替代原始输入。 +- **与 Retry 组合时的错误类型**:`ShouldFailover` 和 `FailoverContext.LastErr` 收到的是 `*RetryExhaustedError`,原始错误通过 `RetryExhaustedError.LastErr` 获取。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md index 3fcfb76163f..c412653db50 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents.md @@ -1,196 +1,208 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: DeepAgents -weight: 5 +weight: 3 --- -## DeepAgents 概述 +> 💡 +> 本功能要求 eino >= v0.5.14。 -DeepAgents 是在 ChatModelAgent (详见:[Eino ADK: ChatModelAgent](/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model))的基础上实现的一种开箱即用的 agent 方案。你无需自己去拼装提示词、工具或上下文管理,就可以立即获得一个可运行的 agent,并仍可使用 ChatModelAgent 的扩展能力来为 agent 增加业务功能,如添加自定义 tools 和 middleware 等。 +## 概述 -**包含内容:** +DeepAgents 是基于 ChatModelAgent 的开箱即用方案。无需手动拼装提示词、工具或上下文管理,即可获得具备规划、文件系统、Shell 执行和子 Agent 
委派能力的 Agent,同时保留 ChatModelAgent 的全部扩展能力(自定义 tools、middleware、handlers)。 -- **规划能力** —— 通过 `write_todos` 进行任务拆解与进度跟踪 -- **文件系统** —— 提供 `read_file`、`write_file`、`edit_file`、`ls`、`glob`、`grep`,用于读取和写入上下文 -- **Shell 访问** —— 使用 `execute` 运行命令 -- **子 Agent** —— 通过 `task` 将工作委派给拥有独立上下文窗口的子智能体 -- **智能默认配置** —— 内置 Prompt,教模型如何高效使用这些工具 -- **上下文管理** —— 长对话历史自动摘要,大体量输出自动保存到文件 - - SummarizationMiddleware、ReductionMiddleware 正在建设中 +**内置能力**: -### ImportPath +- **规划** — `write_todos` 工具进行任务拆解与进度跟踪 +- **文件系统** — `ls`、`read_file`、`write_file`、`edit_file`、`glob`、`grep` +- **Shell** — `execute`(支持流式) +- **子 Agent** — `task` 工具将任务委派到上下文隔离的子智能体 +- **智能默认** — 内置 Prompt 教模型高效使用工具 +- **上下文管理** — 大体量输出自动保存到文件 -Eino 版本需大于等于 v0.5.14 +### Import ```go -import github.com/cloudwego/eino/adk/prebuilt/deep +import "github.com/cloudwego/eino/adk/prebuilt/deep" -agent, err := deep.New(ctx, &deep.Config{}) +agent, err := deep.New(ctx, &deep.Config{ + ChatModel: myModel, +}) ``` -### DeepAgents 结构 - -DeepAgents 核心思想在于通过一个主 agent(MainAgent)来协调、规划、委派或自主执行任务。主 agent 利用其内置的 ChatModel 和一系列工具来与外部世界交互或将复杂任务分解给专门的子 agents(SubAgents)。 - - +--- -上图展示了 DeepAgents 的核心组件与它们之间的调用关系: +## Config 完整定义 -- 主 Agent: 系统的入口和总指挥,接收初始任务,以 ReAct 方式调用工具完成任务并负责最终结果的呈现。 -- ChatModel (ToolCallingChatModel): 通常是一个具备工具调用能力的大语言模型,负责理解任务、推理、选择并调用工具。 -- Tools: MainAgent 可用的一系列能力的集合,包括: - - WriteTodos: 内置的规划工具,用于将复杂任务拆解为结构化的待办事项列表。 - - TaskTool: 一个特殊的工具,作为调用子 Agent 的统一入口。 - - BuiltinTools、CustomTools: DeepAgents 内置的通用工具以及用户根据业务需求自定义的各类工具。 -- SubAgents: 负责执行具体、独立的子任务,与 MainAgent 上下文独立。 - - GeneralPurpose: 通用子 Agent,具有与 MainAgent 相同的 Tools(除了 TaskTool),用于在“干净”的上下文中执行子任务。 - - CustomSubAgents: 用户根据业务需求自定义的各种子 Agent。 +```go +type Config = TypedConfig[*schema.Message] -### 内置能力 +type TypedConfig[M adk.MessageType] struct { + Name string // Agent 标识名 + Description string // 用途描述 + ChatModel model.BaseModel[M] // 必填;需支持 model.WithTools + Instruction string // 系统提示词;为空时使用内置默认 Prompt -#### Filesystem + // 子 Agent(绑定到 TaskTool) + SubAgents 
[]adk.TypedAgent[M] -> 💡 -> 目前处于 alpha 状态 + // 自定义工具 + ToolsConfig adk.ToolsConfig + MaxIteration int // 最大推理迭代次数 -创建 DeepAgents 时配置相关 Backend,DeepAgents 会自动加载相应工具: + // 文件系统(三选一或组合) + Backend filesystem.Backend // 注册 ls/read_file/write_file/edit_file/glob/grep + Shell filesystem.Shell // 注册 execute(与 StreamingShell 互斥) + StreamingShell filesystem.StreamingShell // 注册 execute(流式,与 Shell 互斥) -``` -type Config struct { - // ... - Backend filesystem.Backend - Shell filesystem.Shell - StreamingShell filesystem.StreamingShell - // ... -} -``` + // 内置功能开关 + WithoutWriteTodos bool // true 时关闭 write_todos 工具 + WithoutGeneralSubAgent bool // true 时关闭默认 general-purpose 子 Agent - - - - - -
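-| Config | Purpose | Tools added |
-| --- | --- | --- |
-| Backend | Provides file system access; optional | read_file, write_file, edit_file, glob, grep |
-| Shell | Provides Shell capability; optional, mutually exclusive with StreamingShell | execute |
-| StreamingShell | Provides Shell capability with streamed results; optional, mutually exclusive with Shell | execute(streaming) |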
    + // TaskTool 描述生成器(自定义 task 工具的 description) + TaskToolDescriptionGenerator func(ctx context.Context, agents []adk.TypedAgent[M]) (string, error) -DeepAgents 内引用 filesystem middleware 来实现内置 filesystem,此 middleware 更详细的能力说明见:[Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) + // 扩展 + Middlewares []adk.AgentMiddleware // struct-based 中间件 + Handlers []adk.TypedChatModelAgentMiddleware[M] // interface-based handlers -### 任务拆解与规划 + // 模型容错 + ModelRetryConfig *adk.TypedModelRetryConfig[M] + ModelFailoverConfig *adk.ModelFailoverConfig[M] -WriteTodos 的 Description 描述了任务拆解、规划的原则,主 Agent 通过调用 WriteTodos 工具,在上下文中添加子任务列表来启发后续推理、执行过程: + // 输出存储(通过 AddSessionValue 写入会话) + OutputKey string +} +``` - +### 构造函数 -1. 模型接收用户输入。 -2. 模型调用 WriteTodos 工具,参数为依照 WriteTodos Description 产生的任务列表。这次工具调用被添加到上下文中,供后续参考。 -3. 模型依照上下文中的 todos,调用 TaskTool 完成第一个 todo。 -4. 再次调用 WriteTodos ,更新 Todos 执行进度。 +```go +// 标准版(M = *schema.Message) +func New(ctx context.Context, cfg *Config) (adk.ResumableAgent, error) -> 💡 -> 对简单任务来说,每次都调用 WriteTodos 可能会起到反效果。WriteTodos Description 中添加了一些比较通用的正反例子来避免不调用或过度调用 WriteTodos。使用 DeepAgents 时,可以根据实际业务场景添加更多 prompt 来让 WriteTodos 在合适的时候被调用。 +// 泛型版(支持 *schema.AgenticMessage) +func NewTyped[M adk.MessageType](ctx context.Context, cfg *TypedConfig[M]) (adk.TypedResumableAgent[M], error) +``` > 💡 -> WriteTodos 会被默认添加到 Agent 中,配置 `WithoutWriteTodos=true` 可以关闭 WriteTodos。 +> 返回 ResumableAgent(包含 Resume 方法),可与 Runner 的 checkpoint/resume 机制配合使用。 -### 任务委派与 SubAgents 调用 - -**TaskTool** +--- -所有子 Agent 会被绑定到 TaskTool 上,当主 Agent 分配子任务给子 Agent 处理时,它会调用 TaskTool,并指明需要哪个子代理及执行的任务。TaskTool 随后将任务路由到指定的子代理,并在其执行完毕后,将结果返回给主 Agent。TaskTool 的默认 Description 会说明调用子 Agent 的通用规则并拼接每个子 Agent 的 Description,开发者可以通过配置 `TaskToolDescriptionGenerator` 来自定义 TaskTool 的 Description。 +## 架构 -> 当用户配置了 Config.SubAgents 时,这些 Agent 会基于 ChatModelAgent AgentAsTool 的能力绑定到 TaskTool 上 + -**上下文隔离** +- **主 Agent**:系统入口,以 ReAct 方式调用工具完成任务 +- 
**ChatModel**(`model.BaseModel[M]`):负责推理与工具选择 +- **Tools**: + - `write_todos`:内置规划工具,将任务拆解为结构化 TODO 列表 + - `task`:子 Agent 调用入口(路由参数:`subagent_type`、`description`) + - 内置工具(文件系统/Shell)+ 用户自定义工具(`ToolsConfig`) +- **SubAgents**:上下文隔离,独立执行子任务 + - `general-purpose`:默认子 Agent,拥有与主 Agent 相同的工具(除 task)和配置 + - 自定义子 Agent(`Config.SubAgents`) -Agent 之间的上下文隔离: +--- -- 信息传递: 主 Agent 与子 Agent 之间不共享上下文。子 Agent 仅接收主 Agent 分配的子任务目标,不会接收整个任务的处理过程;主 Agent 仅接收子 Agent 的处理结果,不会接受子 Agent 的处理过程。 -- 避免污染: 这种隔离确保了子 Agent 的执行过程(如大量的工具调用和中间步骤)不会“污染”主代理的上下文,主代理只接收简洁、明确的最终答案。 +## 内置文件系统 -**general-purpose** + + + + + +
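+| Config field | Registered tools | Description |
+| --- | --- | --- |
+| `Backend` | ls, read_file, write_file, edit_file, glob, grep | File system operations |
+| `Shell` | execute | Non-streaming command execution; mutually exclusive with StreamingShell |
+| `StreamingShell` | execute (streaming) | Streaming command execution; mutually exclusive with Shell |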
    -DeepAgents 会默认增加一个子 Agent:general-purpose。general-purpose 具有和主 Agent 相同的 system prompt 和工具(除了 TaskTool),当任务没有专门的子 Agent 来解决时,主 Agent 可以调用 general-purpose 来隔离上下文。开发者可以通过配置 `WithoutGeneralSubAgent=true` 去掉此 Agent。 +内部使用 FileSystem Middleware 实现。 -### 与其他 Agent 对比 +--- -- 对比 ReAct Agent +## 任务规划:write_todos - - 优势:DeepAgents 通过内置 WriteTodos 强化任务拆解与规划;同时隔离多 Agents 上下文,在大规模、多步骤任务中通常效果更优。 - - 劣势:制定计划与调用子 Agent 会带来额外的模型请求,增加耗时与 token 成本;若任务拆分不合理,可能对效果产生反作用。 -- 对比 Plan-and-Execute + - - 优势:DeepAgents 将 Plan/RePlan 作为工具供主 Agent 自由调用,可以在任务中跳过不必要的规划,整体上减少模型调用次数、降低耗时与成本。 - - 劣势:任务规划与委派由一次模型调用完成,对模型能力要求更高,提示词调优也相对更困难。 +`write_todos` 工具将结构化 TODO 列表写入会话(key: `deep_agent_session_key_todos`),供后续推理参考。 -## DeepAgents 使用示例 +**TODO 结构**: -### 场景说明 +```go +type TODO struct { + Content string `json:"content"` + ActiveForm string `json:"activeForm"` + Status string `json:"status"` // "pending" | "in_progress" | "completed" +} +``` -Excel Agent 是一个“看得懂 Excel 的智能助手”,它先把问题拆解成步骤,再一步步执行并校验结果。它能理解用户问题与上传的文件内容,提出可行的解决方案,并选择合适的工具(系统命令、生成并运行 Python 代码、网络查询等等)完成任务。 +**工作流程**: -在真实业务里,你可以把 Excel Agent 当成一位“Excel 专家 + 自动化工程师”。当你交付一个原始表格和目标描述,它会给出方案并完成执行: +1. 模型接收用户输入 +2. 调用 `write_todos` 拆解任务,写入上下文 +3. 按 TODO 逐项执行(调用 task 或直接工具) +4. 再次调用 `write_todos` 更新进度 -- **数据清理与格式化**:从一个包含大量数据的 Excel 文件中完成去重、空值处理、日期格式标准化操作。 -- **数据分析与报告生成**:从销售数据中提取每月的销售总额,聚合统计、透视,最终生成并导出图表报告。 -- **自动化预算计算**:根据不同部门的预算申请,自动计算总预算并生成部门预算分配表。 -- **数据匹配与合并**:将多个不同来源的客户信息表进行匹配合并,生成完整的客户信息数据库。 +> 💡 +> 对简单任务,每次都调用 write_todos 可能适得其反。内置 Prompt 已包含正反例指导何时使用。可通过自定义 Instruction 进一步调优。配置 WithoutWriteTodos=true 可完全关闭。 -用 DeepAgents 搭建的 Excel Agent 结构如下: +--- - +## 子 Agent 委派:task 工具 -1. 在主 Agent 添加 ReadFile 工具,允许主 Agent 查看文件内容,辅助子任务制定 -2. 
添加 Code 和 WebSearch 两个子 Agent:Code 可以编写 python 代码来操作 excel 表格;WebSearch 可以搜索信息并总结。 +**TaskTool** 是所有子 Agent 的统一调用入口: -### 代码实现 +- 参数:`subagent_type`(目标子 Agent 名称)、`description`(任务描述) +- 内部通过 `adk.NewTypedAgentTool` 将每个子 Agent 包装为工具 +- 默认 Description 包含所有可用子 Agent 的名称与说明;可通过 `TaskToolDescriptionGenerator` 自定义 -[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) +**上下文隔离**: -### 运行结果 +- 子 Agent 仅接收主 Agent 分配的任务描述,不共享对话历史 +- 主 Agent 仅接收子 Agent 的最终结果,中间步骤不回传 +- 避免大量工具调用和中间推理"污染"主 Agent 上下文 -```yaml -name: ExcelAgent -path: [{ExcelAgent}] -tool name: task -arguments: {"subagent_type":"CodeAgent","description":"请帮我处理一个CSV文件操作任务。具体要求如下:1. 读取当前目录下的questions.csv文件;2. 提取该CSV文件中的第一列数据;3. 将提取出的第一列数据保存到一个新的CSV文件中,新文件命名为first_column.csv。请使用pandas库来完成这些操作,确保代码能够正确执行并处理可能出现的文件路径问题。"} +**general-purpose 子 Agent**: -name: ExcelAgent -path: [{ExcelAgent}] -tool name: python_runner -arguments: {"code":"```python\nimport pandas as pd\nimport os\n\n# 获取当前工作目录并构建文件路径\nbase_dir = os.getcwd()\ninput_path = os.path.join(base_dir, 'questions.csv')\noutput_path = os.path.join(base_dir, 'first_column.csv')\n\ntry:\n # 读取CSV文件\n df = pd.read_csv(input_path)\n # 提取第一列数据\n first_column = df.iloc[:, 0]\n # 保存为新CSV文件(不包含索引)\n first_column.to_csv(output_path, index=False)\n print(f\"成功将第一列数据保存至: {output_path}\")\nexcept FileNotFoundError:\n print(f\"错误: 未找到文件 {input_path}\")\nexcept Exception as e:\n print(f\"处理过程中发生错误: {str(e)}\")\n```"} +- 默认创建,拥有与主 Agent 相同的工具(除 task)、Instruction 和 ModelFailoverConfig +- 用于在隔离上下文中执行无专门子 Agent 的通用任务 +- 配置 `WithoutGeneralSubAgent=true` 可关闭 -name: ExcelAgent -path: [{ExcelAgent}] -tool response: 成功将第一列数据保存至: /Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv +--- +## 与其他方案对比 -name: ExcelAgent -path: [{ExcelAgent}] -answer: 任务已完成。已成功读取当前目录下的 `questions.csv` 文件,提取第一列数据,并将结果保存至 
`first_column.csv`。具体输出路径如下: + + + + +
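+| Dimension | DeepAgents vs ReAct | DeepAgents vs Plan-and-Execute |
+| --- | --- | --- |
+| Strengths | Built-in planning plus sub-agent context isolation; usually works better on multi-step tasks | Plan/RePlan are tools called on demand, which avoids unnecessary planning overhead |
+| Weaknesses | Planning and sub-agent calls add model requests, latency, and token cost | Planning and delegation happen in a single call, which demands more of the model |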
    -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` +--- -代码已处理路径拼接和异常捕获(如文件不存在或格式错误),确保执行稳定性。 +## 使用示例 -name: ExcelAgent -path: [{ExcelAgent}] -tool response: 任务已完成。已成功读取当前目录下的 `questions.csv` 文件,提取第一列数据,并将结果保存至 `first_column.csv`。具体输出路径如下: +### Excel Agent 场景 -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c44b131973/first_column.csv` + -代码已处理路径拼接和异常捕获(如文件不存在或格式错误),确保执行稳定性。 +- 主 Agent 配置 ReadFile 工具辅助任务制定 +- 添加 Code(Python 操作 Excel)和 WebSearch 两个子 Agent -name: ExcelAgent -path: [{ExcelAgent}] -answer: 已成功将 `questions.csv` 表格中的第一列数据提取至新文件 `first_column.csv`,文件保存路径为 -: +### 代码 -`/Users/bytedance/go/src/github.com/cloudwego/eino-examples/adk/multiagent/deep/playground/262be931-532c-4d83-8cff-96c4 -4b131973/first_column.csv` +完整示例:[https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) -操作过程中已处理路径拼接和异常捕获(如文件不存在、格式错误等问题),确保数据 -提取完整性和文件生成稳定性。若需要调整文件路径或对数据格式有进一步要求,请随时告知 -。 +```go +agent, err := deep.New(ctx, &deep.Config{ + Name: "ExcelAgent", + ChatModel: myModel, + Backend: localBackend, + SubAgents: []adk.Agent{codeAgent, webSearchAgent}, + ToolsConfig: adk.ToolsConfig{ + InvokableTools: []tool.InvokableTool{readFileTool}, + }, +}) ``` diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md index 1f3ba69429e..e90c0266c67 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/plan_execute.md @@ -1,10 +1,10 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-17" lastmod: "" tags: [] title: Plan-Execute Agent -weight: 4 +weight: 2 --- ## Plan-Execute Agent 概述 @@ -275,7 +275,7 @@ func 
newPlanExecuteAgent(ctx context.Context) adk.Agent { replanner := newReplanner(ctx, model) // 组合为 PlanExecuteAgent(固定 execute - replan 最大迭代 10 次) - planExecuteAgent, err := planexecute.NewPlanExecuteAgent(ctx, &planexecute.PlanExecuteConfig{ + planExecuteAgent, err := planexecute.New(ctx, &planexecute.PlanExecuteConfig{ Planner: planner, Executor: executor, Replanner: replanner, diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md deleted file mode 100644 index f583790aa52..00000000000 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/supervisor.md +++ /dev/null @@ -1,499 +0,0 @@ ---- -Description: "" -date: "2026-03-02" -lastmod: "" -tags: [] -title: Supervisor Agent -weight: 3 ---- - -## Supervisor Agent 概述 - -### Import Path - -`import ``github.com/cloudwego/eino/adk/prebuilt/supervisor` - -### 什么是 Supervisor Agent? - -Supervisor Agent 是一种中心化多 Agent 协作模式,由一个监督者(Supervisor Agent) 和多个子 Agent(SubAgents)组成。Supervisor 负责任务的分配、子 Agent 执行过程的监控,以及子 Agent 完成后的结果汇总与下一步决策;子 Agent 则专注于执行具体任务,并在完成后通过 WithDeterministicTransferTo 自动将任务控制权交回 Supervisor。 - - - -该模式适用于需要动态协调多个专业 Agent 完成复杂任务的场景,例如: - -- 科研项目管理(Supervisor 分配调研、实验、报告撰写任务给不同子 Agent)。 -- 客户服务流程(Supervisor 根据用户问题类型,分配给技术支持、售后、销售等子 Agent)。 - -### Supervisor Agent 结构 - -Supervisor 模式的核心结构如下: - -- **Supervisor Agent**:作为协作核心,具备任务分配逻辑(如基于规则或 LLM 决策),可通过 `SetSubAgents` 将子 Agent 纳入管理。 -- **SubAgents**:每个子 Agent 被 WithDeterministicTransferTo 增强,预设 `ToAgentNames` 为 Supervisor 名称,确保任务完成后自动转让回 Supervisor。 - -### Supervisor Agent 特点 - -1. **确定性回调**:子 Agent 执行完毕(未中断)后,通过 WithDeterministicTransferTo 自动触发 Transfer 事件,将任务控制权交回 Supervisor,避免协作流程中断。 -2. **中心化控制**:Supervisor 统一管理子 Agent,可根据子 Agent 的执行结果动态调整任务分配(如分配给其他子 Agent 或直接生成最终结果)。 -3. **松耦合扩展**:子 Agent 可独立开发、测试和替换,只需确保实现 Agent 接口并绑定到 Supervisor,即可接入协作流程。 -4. 
**支持中断与恢复**:若子 Agent 或 Supervisor 支持 `ResumableAgent` 接口,协作流程可在中断后恢复,保持任务上下文连续性。
-
-### Supervisor Agent 运行流程
-
-Supervisor 模式的典型协作流程如下:
-
-1. **任务启动**:Runner 触发 Supervisor 运行,输入初始任务(如“完成一份 LLM 发展历史报告”)。
-2. **任务分配**:Supervisor 根据任务需求,通过 Transfer 事件将任务转让给指定子 Agent(如“调研 Agent”)。
-3. **子 Agent 执行**:子 Agent 执行具体任务(如调研 LLM 关键里程碑),并生成执行结果事件。
-4. **自动回调**:子 Agent 完成后,WithDeterministicTransferTo 触发 Transfer 事件,将任务转让回 Supervisor。
-5. **结果处理**:Supervisor 接收子 Agent 的结果,决定下一步(如分配给“报告撰写 Agent”继续处理,或直接输出最终结果)。
-
-## Supervisor Agent 使用示例
-
-### 场景说明
-
-创建一个科研报告生成系统:
-
-- **Supervisor**:基于用户输入的研究主题,分配任务给“调研 Agent”和“撰写 Agent”,并汇总最终报告。
-- **调研 Agent**:负责生成研究计划(如 LLM 发展的关键阶段)。
-- **撰写 Agent**:负责根据调研计划撰写完整报告。
-
-### 代码实现
-
-#### 步骤 1:实现子 Agent
-
-首先创建两个子 Agent,分别负责调研和撰写任务:
-
-```go
-// 调研 Agent:生成研究计划
-func NewResearchAgent(model model.ToolCallingChatModel) adk.Agent {
-    agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{
-        Name:        "ResearchAgent",
-        Description: "Generates a detailed research plan for a given topic.",
-        Instruction: `
-You are a research planner. Given a topic, output a step-by-step research plan with key stages and milestones.
-Output ONLY the plan, no extra text.`,
-        Model: model,
-    })
-    return agent
-}
-
-// 撰写 Agent:根据研究计划撰写报告
-func NewWriterAgent(model model.ToolCallingChatModel) adk.Agent {
-    agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{
-        Name:        "WriterAgent",
-        Description: "Writes a report based on a research plan.",
-        Instruction: `
-You are an academic writer. Given a research plan, expand it into a structured report with details and analysis.
-Output ONLY the report, no extra text.`,
-        Model: model,
-    })
-    return agent
-}
-```
-
-#### 步骤 2:实现 Supervisor Agent
-
-创建 Supervisor Agent,定义任务分配逻辑(此处简化为基于规则:先分配给调研 Agent,再分配给撰写 Agent):
-
-```go
-// Supervisor Agent:协调调研和撰写任务
-func NewReportSupervisor(model model.ToolCallingChatModel) adk.Agent {
-    agent, _ := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{
-        Name:        "ReportSupervisor",
-        Description: "Coordinates research and writing to generate a report.",
-        Instruction: `
-You are a project supervisor. Your task is to coordinate two sub-agents:
-- ResearchAgent: generates a research plan.
-- WriterAgent: writes a report based on the plan.
-
-Workflow:
-1. When receiving a topic, first transfer the task to ResearchAgent.
-2. After ResearchAgent finishes, transfer the task to WriterAgent with the plan as input.
-3. After WriterAgent finishes, output the final report.`,
-        Model: model,
-    })
-    return agent
-}
-```
-
-#### 步骤 3:组合 Supervisor 与子 Agent
-
-使用 `NewSupervisor` 将 Supervisor 和子 Agent 组合:
-
-```go
-import (
-    "context"
-
-    "github.com/cloudwego/eino-ext/components/model/openai"
-    "github.com/cloudwego/eino/adk"
-    "github.com/cloudwego/eino/adk/prebuilt/supervisor"
-    "github.com/cloudwego/eino/components/model"
-    "github.com/cloudwego/eino/schema"
-)
-
-func main() {
-    ctx := context.Background()
-
-    // 1. 创建 LLM 模型(如 GPT-4o)
-    model, _ := openai.NewChatModel(ctx, &openai.ChatModelConfig{
-        APIKey: "YOUR_API_KEY",
-        Model:  "gpt-4o",
-    })
-
-    // 2. 创建子 Agent 和 Supervisor
-    researchAgent := NewResearchAgent(model)
-    writerAgent := NewWriterAgent(model)
-    reportSupervisor := NewReportSupervisor(model)
-
-    // 3. 组合 Supervisor 与子 Agent
-    supervisorAgent, _ := supervisor.New(ctx, &supervisor.Config{
-        Supervisor: reportSupervisor,
-        SubAgents:  []adk.Agent{researchAgent, writerAgent},
-    })
-
-    // 4. 
运行 Supervisor 模式 - iter := supervisorAgent.Run(ctx, &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("Write a report on the history of Large Language Models."), - }, - EnableStreaming: true, - }) - - // 5. 消费事件流(打印结果) - for { - event, ok := iter.Next() - if !ok { - break - } - if event.Output != nil && event.Output.MessageOutput != nil { - msg, _ := event.Output.MessageOutput.GetMessage() - println("Agent[" + event.AgentName + "]:\n" + msg.Content + "\n===========") - } - } -} -``` - -### 运行结果 - -```markdown -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [ResearchAgent] -=========== -Agent[ResearchAgent]: -1. **Scope Definition & Background Research** - - Task: Define "Large Language Model" (LLM) for the report (e.g., size thresholds, key characteristics: transformer-based, large-scale pretraining, general-purpose). - - Task: Identify foundational NLP/AI concepts pre-LLMs (statistical models, early neural networks, word embeddings) to contextualize origins. - - Milestone: 3-day literature review of academic definitions, industry reports, and AI historiographies to finalize scope. - -2. **Chronological Periodization** - - Task: Divide LLM history into distinct eras (e.g., Pre-2017: Pre-transformer foundations; 2017-2020: Transformer revolution & early LLMs; 2020-Present: Scaling & mainstream adoption). - - Task: Map key events, models, and breakthroughs per era (e.g., 2017: "Attention Is All You Need"; 2018: GPT-1/BERT; 2020: GPT-3; 2022: ChatGPT; 2023: Llama 2). - - Milestone: 10-day timeline draft with annotated model releases, research papers, and technological shifts. - -3. **Key Technical Milestones** - - Task: Deep-dive into critical innovations (transformer architecture, pretraining-fine-tuning paradigm, scaling laws, in-context learning). - - Task: Extract details from seminal papers (authors, institutions, methodologies, performance benchmarks). 
- - Milestone: 1-week analysis of 5-7 foundational papers (e.g., Vaswani et al. 2017; Radford et al. 2018; Devlin et al. 2018) with technical summaries. - -4. **Stakeholder Mapping** - - Task: Identify key organizations (OpenAI, Google DeepMind, Meta AI, Microsoft Research) and academic labs (Stanford, Berkeley) driving LLM development. - - Task: Document institutional contributions (e.g., OpenAI’s GPT series, Google’s BERT/PaLM, Meta’s Llama) and research priorities (open vs. closed models). - - Milestone: 5-day stakeholder profile draft with org-specific timelines and model lineages. - -5. **Technical Evolution & Innovation Trajectory** - - Task: Analyze shifts in architecture (from RNNs/LSTMs to transformers), training paradigms (pretraining + fine-tuning → instruction tuning → RLHF), and compute scaling (parameters, data size, GPU usage over time). - - Task: Link technical changes to performance improvements (e.g., GPT-1 (124M params) vs. GPT-4 (100B+ params): task generalization, emergent abilities). - - Milestone: 1-week technical trajectory report with data visualizations (param scaling, benchmark scores over time). - -6. **Impact & Societal Context** - - Task: Research LLM impact on NLP tasks (translation, summarization, QA) and beyond (education, content creation, policy). - - Task: Document cultural/industry shifts (rise of prompt engineering, "AI-native" products, public perception post-ChatGPT). - - Milestone: 5-day impact analysis integrating case studies (e.g., GitHub Copilot, healthcare LLMs) and media/scholarly discourse. - -7. **Challenges & Critiques (Historical Perspective)** - - Task: Track historical limitations (pre-2020: data sparsity, task specificity; post-2020: bias, misinformation, energy use) and responses (e.g., 2019: BERT bias audits; 2023: EU AI Act). - - Task: Cite key critiques (e.g., "On the Dangers of Stochastic Parrots," 2021) and industry/academic reactions. 
- - Milestone: 5-day challenge timeline linking issues to their emergence and mitigation efforts. - -8. **Synthesis & Narrative Drafting** - - Task: Integrate chronological, technical, and societal data into a coherent narrative (origins → revolution → scaling → mainstream impact). - - Task: Outline report structure (Abstract, Introduction, Era-by-Era Analysis, Key Innovations, Stakeholders, Impact, Challenges, Conclusion). - - Milestone: 1-week first draft of full report (8,000–10,000 words). - -9. **Validation & Fact-Checking** - - Task: Verify model release dates, paper citations, parameter counts, and stakeholder claims via primary sources (original papers, official press releases, archived GitHub repos). - - Task: Cross-check with secondary sources (AI history books, expert interviews, peer-reviewed historiographies). - - Milestone: 3-day validation report flagging/correcting inaccuracies. - -10. **Finalization & Revision** - - Task: Edit for clarity, narrative flow, and consistency; refine visuals (timelines, param scaling charts). - - Task: Format references (APA/MLA) and appendices (model comparison table, key paper list). - - Milestone: 2-day final report submission. -=========== -Agent[ResearchAgent]: - -=========== -Agent[ResearchAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -Agent[ReportSupervisor]: - -=========== -Agent[ReportSupervisor]: -successfully transferred to agent [WriterAgent] -=========== -Agent[WriterAgent]: -# The History of Large Language Models: From Foundations to Mainstream Revolution - - -## Abstract -Large Language Models (LLMs) represent one of the most transformative technological innovations of the 21st century, enabling machines to understand, generate, and manipulate human language with unprecedented fluency. This report traces the historical trajectory of LLMs, from their conceptual roots in early natural language processing (NLP) to their current status as mainstream tools. 
It examines key technical milestones—including the invention of the transformer architecture, the rise of pretraining-fine-tuning paradigms, and the scaling of model parameters—and contextualizes these within the contributions of academic labs and tech giants. The report also analyzes societal impacts, from revolutionizing NLP tasks to sparking debates over bias, misinformation, and AI regulation. By synthesizing chronological, technical, and cultural data, this history reveals how LLMs evolved from niche research experiments to agents of global change. - - -## 1. Introduction: Defining Large Language Models -A **Large Language Model (LLM)** is a type of machine learning model designed to process and generate human language by learning patterns from massive text datasets. Key characteristics include: (1) a transformer-based architecture, enabling parallel processing of text sequences; (2) large-scale pretraining on diverse corpora (e.g., books, websites, articles); (3) general-purpose functionality, allowing adaptation to tasks like translation, summarization, or dialogue without task-specific engineering; and (4) scale, typically defined by billions (or tens of billions) of parameters (adjustable weights that capture linguistic patterns). - -LLMs emerged from decades of NLP research, building on foundational concepts like statistical models (e.g., n-grams), early neural networks (e.g., recurrent neural networks [RNNs]), and word embeddings (e.g., Word2Vec, GloVe). By the 2010s, these predecessors had laid groundwork for "language understanding," but were limited by task specificity (e.g., a model trained for translation could not summarize text) and data sparsity. LLMs addressed these gaps by prioritizing scale, generality, and architectural innovation—ultimately redefining the boundaries of machine language capability. - - -## 2. 
Era-by-Era Analysis: The Evolution of LLMs - -### 2.1 Pre-2017: Pre-Transformer Foundations (1950s–2016) -The roots of LLMs lie in mid-20th-century NLP, when researchers first sought to automate language tasks. Early efforts relied on rule-based systems (e.g., 1950s machine translation using syntax rules) and statistical methods (e.g., 1990s n-gram models for speech recognition). By the 2010s, neural networks gained traction: RNNs and long short-term memory (LSTM) models (Hochreiter & Schmidhuber, 1997) enabled sequence modeling, while word embeddings (Mikolov et al., 2013) represented words as dense vectors, capturing semantic relationships. - -Despite progress, pre-2017 models faced critical limitations: RNNs/LSTMs processed text sequentially, making them slow to train and unable to handle long-range dependencies (e.g., linking "it" in a sentence to a noun paragraphs earlier). Data was also constrained: models like Word2Vec trained on millions, not billions, of tokens. These bottlenecks set the stage for a paradigm shift. - - -### 2.2 2017–2020: The Transformer Revolution and Early LLMs -The year 2017 marked the dawn of the LLM era with the publication of *"Attention Is All You Need"* (Vaswani et al.), which introduced the **transformer architecture**. Unlike RNNs, transformers use "self-attention" mechanisms to weigh the importance of different words in a sequence simultaneously, enabling parallel computation and capturing long-range dependencies. This breakthrough reduced training time and improved performance on language tasks. - -#### Key Models and Breakthroughs: -- **2018**: OpenAI released **GPT-1** (Radford et al.), the first transformer-based LLM. With 124 million parameters, it introduced the "pretraining-fine-tuning" paradigm: pretraining on a large unlabeled corpus (BooksCorpus) to learn general language patterns, then fine-tuning on task-specific labeled data (e.g., sentiment analysis). 
-- **2018**: Google published **BERT** (Devlin et al.), a bidirectional transformer that processed text from left-to-right *and* right-to-left, outperforming GPT-1 on context-dependent tasks like question answering. BERT’s success popularized "contextual embeddings," where word meaning depends on surrounding text (e.g., "bank" as a financial institution vs. a riverbank). -- **2019**: OpenAI scaled up with **GPT-2** (1.5 billion parameters), demonstrating improved text generation but sparking early concerns about misuse (OpenAI initially delayed full release over fears of disinformation). -- **2020**: Google’s **T5** (Text-to-Text Transfer Transformer) unified NLP tasks under a single "text-to-text" framework (e.g., translating "translate English to French: Hello" to "Bonjour"), simplifying model adaptation. - - -### 2.3 2020–Present: Scaling, Emergence, and Mainstream Adoption -The 2020s saw LLMs transition from research curiosities to global phenomena, driven by exponential scaling of parameters, data, and compute. - -#### Key Developments: -- **2020**: OpenAI’s **GPT-3** (175 billion parameters) marked a turning point. Trained on 45 terabytes of text, it exhibited "few-shot" and "zero-shot" learning—adapting to tasks with minimal examples (e.g., "Write a poem about AI" with no prior poetry training). GPT-3’s release via API (OpenAI Playground) introduced LLMs to developers, enabling early applications like chatbots and code generation. -- **2022**: **ChatGPT** (based on GPT-3.5) brought LLMs to the public. Launched in November, its user-friendly interface and conversational ability sparked a viral explosion (100 million users by January 2023). ChatGPT refined training with **Reinforcement Learning from Human Feedback (RLHF)**, aligning outputs with human preferences (e.g., helpfulness, safety). 
-- **2023**: Meta released **Llama 2** (7B–70B parameters), an open-source LLM that lowered barriers to entry, allowing researchers and startups to fine-tune models without proprietary access. Meanwhile, OpenAI’s **GPT-4** (100B+ parameters) expanded multimodality (text + images) and improved reasoning (e.g., solving math problems, coding). -- **2023–2024**: The "race to scale" continued with models like Google’s **PaLM 2** (540B parameters), Anthropic’s **Claude 2** (200B+ parameters), and open-source alternatives (e.g., Mistral, Falcon). Compute usage skyrocketed: training GPT-3 required ~3.14e23 floating-point operations (FLOPs), equivalent to 355 years of a single GPU’s work. - - -## 3. Key Technical Milestones -### 3.1 The Transformer Architecture (2017) -Vaswani et al.’s *"Attention Is All You Need"* (Google, University of Toronto) replaced RNNs with self-attention, a mechanism that computes "attention scores" between every pair of words in a sequence. For example, in "The cat sat on the mat; it purred," self-attention links "it" to "cat." This parallel processing reduced training time from weeks (for RNNs) to days, enabling larger models. - -### 3.2 Pretraining-Fine-Tuning Paradigm (2018) -GPT-1 and BERT established the now-standard workflow: (1) Pretrain on a large, unlabeled corpus (e.g., Common Crawl, a web scrape of 1.1 trillion tokens) to learn syntax, semantics, and world knowledge; (2) Fine-tune on task-specific data (e.g., GLUE, a benchmark of 10 NLP tasks). This decoupled language learning from task engineering, enabling generalization. - -### 3.3 Scaling Laws and Emergent Abilities (2020s) -In 2020, OpenAI researchers articulated **scaling laws**: model performance improves predictably with increased parameters, data, and compute. By 2022, this led to "emergent abilities"—skills not present in smaller models, such as GPT-3’s in-context learning or GPT-4’s multi-step reasoning. 
- -### 3.4 Instruction Tuning and RLHF (2022) -Post-2020, training shifted from task-specific fine-tuning to **instruction tuning** (training on natural language instructions like "Summarize this article") and **RLHF** (rewarding models for human-preferred outputs). These methods made LLMs more usable: ChatGPT, for instance, follows prompts like "Explain quantum physics like I’m 5" without explicit fine-tuning. - - -## 4. Stakeholders: The Ecosystem of LLM Development -LLM evolution has been driven by a mix of tech giants, academic labs, and startups, each with distinct priorities: - -### 4.1 Tech Giants: Closed vs. Open Models -- **OpenAI** (founded 2015, backed by Microsoft): Pioneered the GPT series, prioritizing commercialization via closed APIs (e.g., ChatGPT Plus, GPT-4 API). Focus: user-friendliness and safety (via RLHF). -- **Google DeepMind**: Developed BERT, T5, and PaLM, integrating LLMs into products like Google Search (via BERT) and Bard. Balances closed (PaLM) and open (T5) models. -- **Meta AI**: Advocated for open science with Llama 1/2 (2023), releasing weights for research and commercial use. Meta’s "open" approach aims to democratize LLM access and accelerate safety research. -- **Microsoft**: Partnered with OpenAI (2019–present), providing Azure compute and integrating GPT into Bing (search), Office (Copilot), and GitHub (Copilot X for coding). - -### 4.2 Academic Labs -- **Stanford NLP**: Contributed to BERT and T5 research; developed HELM (Holistic Evaluation of Language Models), a benchmark for LLM safety and fairness. -- **UC Berkeley**: Studied LLM bias (e.g., 2021 paper "On the Dangers of Stochastic Parrots," critiquing LLMs as "statistical mimics" lacking true understanding). - - -## 5. Impact & Societal Context -### 5.1 Transforming NLP and Beyond -LLMs have redefined NLP performance: By 2023, GPT-4 outperformed humans on the MMLU benchmark (a test of 57 subjects, including math, law, and biology), scoring 86.4% vs. 86.5% for humans. 
Beyond NLP, they have revolutionized: -- **Content Creation**: Tools like Jasper and Copy.ai automate marketing copy; artists use DALL-E (paired with LLMs) for text-to-image generation. -- **Education**: Khan Academy’s Khanmigo tutors students; Coursera uses LLMs for personalized feedback. -- **Coding**: GitHub Copilot (2021) generates code from comments, boosting developer productivity by 55% (Microsoft, 2023). - -### 5.2 Cultural Shifts -- **Prompt Engineering**: The rise of "prompt engineers"—professionals skilled in crafting text inputs to elicit desired LLM outputs—became a new career path. -- **AI-Native Products**: Startups like Character.AI (chatbots with distinct personalities) and Perplexity (AI-powered search) emerged as "LLM-first" services. -- **Public Perception**: Post-ChatGPT, LLMs shifted from "AI hype" to tangible utility, though skepticism persists (e.g., 62% of U.S. adults worry about job displacement, Pew Research, 2023). - - -## 6. Challenges & Critiques: A Historical Perspective -### 6.1 Technical Limitations -- **Pre-2020**: Data sparsity (small corpora limited generalization); task specificity (models like BERT required retraining for new tasks). -- **Post-2020**: **Hallucinations** (fabricating facts, e.g., GPT-3 citing fake research papers); **energy use** (training GPT-3 emitted ~500 tons of CO₂, equivalent to 125 round-trip flights from NYC to London); **computational inequality** (only tech giants can afford 100B+ parameter models). - -### 6.2 Societal Risks -- **Bias**: Early LLMs mirrored training data biases (e.g., BERT associated "doctor" with "male" in 2019 audits). Responses included bias mitigation datasets (e.g., WinoBias) and audits (e.g., Stanford’s Gender Shades). -- **Misinformation**: GPT-2’s realistic text generation prompted calls for regulation; by 2023, deepfakes (e.g., AI-generated political speeches) became a policy focus. 
-- **Regulation**: The EU AI Act (2024) classified LLMs as "high-risk," requiring transparency (e.g., disclosing AI-generated content) and safety testing. - - -## 7. Conclusion: A Revolution in Five Years -The history of LLMs is a story of exponential progress: from the transformer’s 2017 invention to ChatGPT’s 2022 viral explosion, a mere five years. What began as an academic breakthrough—parallelizing text processing with self-attention—evolved into a technology that writes code, tutors students, and shapes global policy. - -Yet challenges persist: scaling has outpaced our understanding of how LLMs "think," and debates over bias, energy use, and access (closed vs. open models) intensify. As we look to the future, this history reminds us that LLMs are not just technical achievements, but mirrors of society—reflecting both our ingenuity and our flaws. Their next chapter will depend on balancing innovation with responsibility, ensuring these models serve as tools for collective progress. - - -## References -- Devlin, J., et al. (2018). *BERT: Pre-training of deep bidirectional transformers for language understanding*. NAACL. -- Hochreiter, S., & Schmidhuber, J. (1997). *Long short-term memory*. Neural Computation. -- Mikolov, T., et al. (2013). *Efficient estimation of word representations in vector space*. ICLR. -- Radford, A., et al. (2018). *Improving language understanding by generative pre-training*. OpenAI. -- Vaswani, A., et al. (2017). *Attention is all you need*. NeurIPS. -- Weidinger, L., et al. (2021). *On the dangers of stochastic parrots: Can language models be too big?*. ACM FAccT. -=========== -Agent[WriterAgent]: - -=========== -Agent[WriterAgent]: -successfully transferred to agent [ReportSupervisor] -=========== -``` - -## WithDeterministicTransferTo - -### 什么是 WithDeterministicTransferTo? 
-
-`WithDeterministicTransferTo` 是 Eino ADK 提供的 Agent 增强工具,用于为 Agent 注入任务转让(Transfer)能力 。它允许开发者为目标 Agent 预设固定的任务转让路径,当该 Agent 完成任务(未被中断)时,会自动生成 Transfer 事件,将任务流转到预设的目标 Agent。
-
-这一能力是构建 Supervisor Agent 协作模式的基础,确保子 Agent 在执行完毕后能可靠地将任务控制权交回监督者(Supervisor),形成“分配-执行-反馈”的闭环协作流程。
-
-### WithDeterministicTransferTo 核心实现
-
-#### 配置结构
-
-通过 `DeterministicTransferConfig` 定义任务转让的核心参数:
-
-```go
-// 包装方法
-func AgentWithDeterministicTransferTo(_ context.Context, config *DeterministicTransferConfig) Agent
-
-// 配置详情
-type DeterministicTransferConfig struct {
-    Agent        Agent    // 被增强的目标 Agent
-    ToAgentNames []string // 任务完成后转让的目标 Agent 名称列表
-}
-```
-
-- `Agent`:需要添加转让能力的原始 Agent。
-- `ToAgentNames`:当 `Agent` 完成任务且未中断时,自动转让任务的目标 Agent 名称列表(按顺序转让)。
-
-#### Agent 包装
-
-WithDeterministicTransferTo 会对原始 Agent 进行包装,根据其是否实现 `ResumableAgent` 接口(支持中断与恢复),分别返回 `agentWithDeterministicTransferTo` 或 `resumableAgentWithDeterministicTransferTo` 实例,确保增强能力与 Agent 原有功能(如 `Resume` 方法)兼容。
-
-包装后的 Agent 会覆盖 `Run` 方法(对 `ResumableAgent` 还会覆盖 `Resume` 方法),在原始 Agent 的事件流基础上追加 Transfer 事件:
-
-```go
-// 对普通 Agent 的包装
-type agentWithDeterministicTransferTo struct {
-    agent        Agent    // 原始 Agent
-    toAgentNames []string // 目标 Agent 名称列表
-}
-
-// Run 方法:执行原始 Agent 任务,并在任务完成后追加 Transfer 事件
-func (a *agentWithDeterministicTransferTo) Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] {
-    aIter := a.agent.Run(ctx, input, options...)
-
-    iterator, generator := NewAsyncIteratorPair[*AgentEvent]()
-
-    // 异步处理原始事件流,并追加 Transfer 事件
-    go appendTransferAction(ctx, aIter, generator, a.toAgentNames)
-
-    return iterator
-}
-```
-
-对于 `ResumableAgent`,额外实现 `Resume` 方法,确保恢复执行后仍能触发确定性转让:
-
-```go
-type resumableAgentWithDeterministicTransferTo struct {
-    agent        ResumableAgent // 支持恢复的原始 Agent
-    toAgentNames []string       // 目标 Agent 名称列表
-}
-
-// Resume 方法:恢复执行原始 Agent 任务,并在完成后追加 Transfer 事件
-func (a *resumableAgentWithDeterministicTransferTo) Resume(ctx context.Context, info *ResumeInfo, opts ...AgentRunOption) *AsyncIterator[*AgentEvent] {
-    aIter := a.agent.Resume(ctx, info, opts...)
-    iterator, generator := NewAsyncIteratorPair[*AgentEvent]()
-    go appendTransferAction(ctx, aIter, generator, a.toAgentNames)
-    return iterator
-}
-```
-
-#### 事件流追加 Transfer 事件
-
-`appendTransferAction` 是实现确定性转让的核心逻辑,它会消费原始 Agent 的事件流,在 Agent 任务正常结束(未中断)后,自动生成并发送 Transfer 事件到目标 Agent:
-
-```go
-func appendTransferAction(ctx context.Context, aIter *AsyncIterator[*AgentEvent], generator *AsyncGenerator[*AgentEvent], toAgentNames []string) {
-    defer func() {
-        // 异常处理:捕获 panic 并通过事件传递错误
-        if panicErr := recover(); panicErr != nil {
-            generator.Send(&AgentEvent{Err: safe.NewPanicErr(panicErr, debug.Stack())})
-        }
-        generator.Close() // 事件流结束,关闭生成器
-    }()
-
-    interrupted := false
-
-    // 1. 转发原始 Agent 的所有事件
-    for {
-        event, ok := aIter.Next()
-        if !ok { // 原始事件流结束
-            break
-        }
-        generator.Send(event) // 转发事件给调用方
-
-        // 检查是否发生中断(如 InterruptAction)
-        if event.Action != nil && event.Action.Interrupted != nil {
-            interrupted = true
-        } else {
-            interrupted = false
-        }
-    }
-
-    // 2. 若未中断且存在目标 Agent,生成 Transfer 事件
-    if !interrupted && len(toAgentNames) > 0 {
-        for _, toAgentName := range toAgentNames {
-            // 生成转让消息(系统提示 + Transfer 动作)
-            aMsg, tMsg := GenTransferMessages(ctx, toAgentName)
-            // 发送系统提示事件(告知用户任务转让)
-            aEvent := EventFromMessage(aMsg, nil, schema.Assistant, "")
-            generator.Send(aEvent)
-            // 发送 Transfer 动作事件(触发任务转让)
-            tEvent := EventFromMessage(tMsg, nil, schema.Tool, tMsg.ToolName)
-            tEvent.Action = &AgentAction{
-                TransferToAgent: &TransferToAgentAction{
-                    DestAgentName: toAgentName, // 目标 Agent 名称
-                },
-            }
-            generator.Send(tEvent)
-        }
-    }
-}
-```
-
-**关键逻辑**:
-
-- **事件转发**:原始 Agent 产生的所有事件(如思考、工具调用、输出结果)会被完整转发,确保业务逻辑不受影响。
-- **中断检查**:若 Agent 执行过程中被中断(如 `InterruptAction`),则不触发 Transfer(中断视为任务未正常完成)。
-- **Transfer 事件生成**:任务正常结束后,为每个 `ToAgentNames` 生成两条事件:
-  1. 系统提示事件(`schema.Assistant` 角色):告知用户任务将转让给目标 Agent。
-  2. Transfer 动作事件(`schema.Tool` 角色):携带 `TransferToAgentAction`,触发 ADK 运行时将任务转让给 `DestAgentName` 对应的 Agent。
-
-## 总结
-
-WithDeterministicTransferTo 为 Agent 提供了可靠的任务转让能力,是构建 Supervisor 模式的核心基石;而 Supervisor 模式通过中心化协调与确定性回调,实现了多 Agent 之间的高效协作,显著降低了复杂任务的开发与维护成本。结合两者,开发者可快速搭建灵活、可扩展的多 Agent 系统。
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md b/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md
deleted file mode 100644
index f03c7c390c9..00000000000
--- a/content/zh/docs/eino/core_modules/eino_adk/agent_implementation/workflow.md
+++ /dev/null
@@ -1,1265 +0,0 @@
----
-Description: ""
-date: "2026-03-09"
-lastmod: ""
-tags: []
-title: Workflow Agents
-weight: 2
----
-
-# Workflow Agents 概述
-
-## 导入路径
-
-`import ``github.com/cloudwego/eino/adk`
-
-## 什么是 Workflow Agents
-
-Workflow Agents 是 eino ADK 中的一种特殊 Agent 类型,它允许开发者以预设的流程来组织和执行多个子 Agent。
-
-与基于 LLM 自主决策的 Transfer 模式不同,Workflow Agents 采用**预设决策**的方式,按照代码中定义好的执行流程来运行子 Agent,提供了更可预测和可控的多 Agent 协作方式。
-
-Eino ADK 提供了三种基础的 Workflow Agent 类型:
-
-- **SequentialAgent**:按顺序依次执行子 Agent
-- **LoopAgent**:循环执行子 Agent 序列
-- 
**ParallelAgent**:并发执行多个子 Agent - -这些 Workflow Agent 可以相互嵌套,构建更复杂的执行流程,满足各种业务场景需求。 - -# SequentialAgent - -## 功能 - -SequentialAgent 是最基础的 Workflow Agent,它按照配置中提供的顺序,依次执行一系列子 Agent。每个子 Agent 执行完成后,其输出会通过 History 机制传递给下一个子 Agent,形成一个线性的执行链。 - - - -```go -type SequentialAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 子 Agent 列表,按执行顺序排列 -} - -func NewSequentialAgent(ctx context.Context, config *SequentialAgentConfig) (Agent, error) -``` - -SequentialAgent 的执行遵循以下设定: - -1. **线性执行**:严格按照 SubAgents 数组的顺序执行 -2. **History 传递**:每个 Agent 的执行结果都会被添加到 History 中,后续 Agent 可以访问前面 Agent 的执行历史 -3. **提前退出**:如果任何一个子 Agent 产生 ExitAction / Interrupt,整个 Sequential 流程会立即终止 - -SequentialAgent 适用于以下场景: - -- **多步骤处理流程**:如数据预处理 -> 分析 -> 生成报告 -- **管道式处理**:每个步骤的输出作为下个步骤的输入 -- **有依赖关系的任务序列**:后续任务依赖前面任务的结果 - -## 示例 - -示例展示了如何使用 SequentialAgent 创建一个三步骤的文档处理流水线: - -1. **DocumentAnalyzer**:分析文档内容 -2. **ContentSummarizer**:总结分析结果 -3. **ReportGenerator**:生成最终报告 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -// 创建 ChatModel 实例 -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 文档分析 Agent -func NewDocumentAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "DocumentAnalyzer", - Description: "分析文档内容并提取关键信息", - Instruction: "你是一个文档分析专家。请仔细分析用户提供的文档内容,提取其中的关键信息、主要观点和重要数据。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 内容总结 Agent -func NewContentSummarizerAgent() adk.Agent { - a, err := 
adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ContentSummarizer", - Description: "对分析结果进行总结", - Instruction: "基于前面的文档分析结果,生成一个简洁明了的总结,突出最重要的发现和结论。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 报告生成 Agent -func NewReportGeneratorAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ReportGenerator", - Description: "生成最终的分析报告", - Instruction: "基于前面的分析和总结,生成一份结构化的分析报告,包含执行摘要、详细分析和建议。", - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建三个处理步骤的 Agent - analyzer := NewDocumentAnalyzerAgent() - summarizer := NewContentSummarizerAgent() - generator := NewReportGeneratorAgent() - - // 创建 SequentialAgent - sequentialAgent, err := adk.NewSequentialAgent(ctx, &adk.SequentialAgentConfig{ - Name: "DocumentProcessingPipeline", - Description: "文档处理流水线:分析 → 总结 → 报告生成", - SubAgents: []adk.Agent{analyzer, summarizer, generator}, - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: sequentialAgent, - }) - - // 执行文档处理流程 - input := "请分析以下市场报告:2024年第三季度,公司营收增长15%,主要得益于新产品线的成功推出。但运营成本也上升了8%,需要优化效率。" - - fmt.Println("开始执行文档处理流水线...") - iter := runner.Query(ctx, input) - - stepCount := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== 步骤 %d: %s ===\n", stepCount, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - stepCount++ - } - } - - fmt.Println("\n文档处理流水线执行完成!") -} -``` - -运行结果为: - -```markdown -开始执行文档处理流水线... - -=== 步骤 1: DocumentAnalyzer === -市场报告关键信息分析: - -1. 营收增长情况: - - 2024年第三季度,公司营收同比增长15%。 - - 营收增长的主要驱动力是新产品线的成功推出。 - -2. 
成本情况: - - 运营成本上涨了8%。 - - 成本上升提醒公司需要进行效率优化。 - -主要观点总结: -- 新产品线推出显著推动了营收增长,显示公司在产品创新方面取得良好成果。 -- 虽然营收提升,但运营成本的增加在一定程度上影响了盈利能力,指出了提升运营效率的重要性。 - -重要数据: -- 营收增长率:15% -- 运营成本增长率:8% - -=== 步骤 2: ContentSummarizer === -总结:2024年第三季度,公司实现了15%的营收增长,主要归功于新产品线的成功推出,体现了公司产品创新能力的显著提升。然而,运营成本同时上涨了8%,对盈利能力构成一定压力,强调了优化运营效率的迫切需求。整体来看,公司在增长与成本控制之间需寻求更好的平衡以保障持续健康发展。 - -=== 步骤 3: ReportGenerator === -分析报告 - -一、执行摘要 -2024年第三季度,公司实现营收同比增长15%,主要得益于新产品线的成功推出,展现了强劲的产品创新能力。然而,运营成本也同比提升了8%,对利润空间形成一定压力。为确保持续的盈利增长,需重点关注运营效率的优化,推动成本控制与收入增长的平衡发展。 - -二、详细分析 -1. 营收增长分析 -- 公司营收增长15%,反映出新产品线市场接受度良好,有效拓展了收入来源。 -- 新产品线的推出体现了公司研发及市场响应能力的提升,为未来持续增长奠定基础。 - -2. 运营成本情况 -- 运营成本上升8%,可能来自原材料价格上涨、生产效率下降或销售推广费用增加等多个方面。 -- 该成本提升在一定程度上抵消了收入增长带来的利润增益,影响整体盈利能力。 - -3. 盈利能力及效率考量 -- 营收与成本增长的不匹配显示出当前运营效率存在改进空间。 -- 优化供应链管理、提升生产自动化及加强成本控制将成为关键措施。 - -三、建议 -1. 加强新产品线后续支持,包括市场推广和客户反馈机制,持续推动营收增长。 -2. 深入分析运营成本构成,识别主要成本驱动因素,制定针对性降低成本的策略。 -3. 推动内部流程优化与技术升级,提升生产及运营效率,缓解成本压力。 -4. 建立动态的财务监控体系,实现对营收与成本的实时跟踪与调整,确保公司财务健康。 - -四、结论 -公司在2024年第三季度展现出了良好的增长动力,但同时面临成本上升带来的挑战。通过持续的产品创新结合有效的成本管理,未来有望实现盈利能力和市场竞争力的双重提升,推动公司稳健发展。 - -文档处理流水线执行完成! -``` - -# LoopAgent - -## 功能 - -LoopAgent 基于 SequentialAgent 实现,它会重复执行配置的子 Agent 序列,直到达到最大迭代次数或某个子 Agent 产生 ExitAction。LoopAgent 特别适用于需要迭代优化、反复处理或持续监控的场景。 - - - -```go -type LoopAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 子 Agent 列表 - MaxIterations int // 最大迭代次数,0 表示无限循环 -} - -func NewLoopAgent(ctx context.Context, config *LoopAgentConfig) (Agent, error) -``` - -LoopAgent 的执行遵循以下设定: - -1. **循环执行**:重复执行 SubAgents 序列,每次循环都是一个完整的 Sequential 执行过程 -2. **History 累积**:每次迭代的结果都会累积到 History 中,后续迭代可以访问所有历史信息 -3. **条件退出**:支持通过 ExitAction 或达到最大迭代次数来终止循环,配置 `MaxIterations=0` 时表示无限循环 - -LoopAgent 适用于以下场景: - -- **迭代优化**:如代码优化、参数调优等需要反复改进的任务 -- **持续监控**:定期检查状态并执行相应操作 -- **反复处理**:需要多轮处理才能达到满意结果的任务 -- **自我改进**:Agent 根据前面的执行结果不断改进自己的输出 - -## 示例 - -示例展示了如何使用 LoopAgent 创建一个代码优化循环: - -1. **CodeAnalyzer**:分析代码问题 -2. **CodeOptimizer**:根据分析结果优化代码 -3. 
**ExitController**:判断是否需要退出循环 - -循环会持续执行直到代码质量达到标准或达到最大迭代次数。 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" - "github.com/cloudwego/eino/schema" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 代码分析 Agent -func NewCodeAnalyzerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeAnalyzer", - Description: "分析代码质量和性能问题", - Instruction: `你是一个代码分析专家。请分析提供的代码,识别以下问题: -1. 性能瓶颈 -2. 代码重复 -3. 可读性问题 -4. 潜在的 bug -5. 不符合最佳实践的地方 - -如果代码已经足够优秀,请输出 "EXIT: 代码质量已达到标准" 来结束优化流程。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 代码优化 Agent -func NewCodeOptimizerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "CodeOptimizer", - Description: "根据分析结果优化代码", - Instruction: `基于前面的代码分析结果,对代码进行优化改进: -1. 修复识别出的性能问题 -2. 消除代码重复 -3. 提高代码可读性 -4. 修复潜在 bug -5. 
应用最佳实践 - -请提供优化后的完整代码。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 创建一个特殊的 Agent 来处理退出逻辑 -func NewExitControllerAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "ExitController", - Description: "控制优化循环的退出", - Instruction: `检查前面的分析结果,如果代码分析师认为代码质量已达到标准(包含"EXIT"关键词), -则输出 "TERMINATE" 并生成退出动作来结束循环。否则继续下一轮优化。`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建优化流程的 Agent - analyzer := NewCodeAnalyzerAgent() - optimizer := NewCodeOptimizerAgent() - controller := NewExitControllerAgent() - - // 创建 LoopAgent,最多执行 5 轮优化 - loopAgent, err := adk.NewLoopAgent(ctx, &adk.LoopAgentConfig{ - Name: "CodeOptimizationLoop", - Description: "代码优化循环:分析 → 优化 → 检查退出条件", - SubAgents: []adk.Agent{analyzer, optimizer, controller}, - MaxIterations: 5, // 最多 5 轮优化 - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: loopAgent, - }) - - // 待优化的代码示例 - codeToOptimize := ` -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -` - - fmt.Println("开始代码优化循环...") - iter := runner.Query(ctx, "请优化以下 Go 代码:\n"+codeToOptimize) - - iteration := 1 - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Fatal(event.Err) - } - - if event.Output != nil && event.Output.MessageOutput != nil { - fmt.Printf("\n=== 第 %d 轮 - %s ===\n", iteration, event.AgentName) - fmt.Printf("%s\n", event.Output.MessageOutput.Message.Content) - - // 检查是否需要退出 - if event.AgentName == "ExitController" { - if event.Action != nil && event.Action.Exit { - fmt.Println("\n优化循环提前结束!") - break - } - iteration++ - } - } - } - - fmt.Println("\n代码优化循环执行完成!") -} -``` - -运行结果为: - 
-```java -开始代码优化循环... - -=== 第 1 轮 - CodeAnalyzer === -分析提供的代码: - -```go -func processData(data []int) []int { - result := []int{} - for i := 0; i < len(data); i++ { - for j := 0; j < len(data); j++ { - if data[i] > data[j] { - result = append(result, data[i]) - break - } - } - } - return result -} -``` - -### 1. 性能瓶颈 - -- 双层循环,时间复杂度为 O(n²),对于较大的数据量,性能不佳。 -- 内层循环当条件满足时立即 break,减少了部分不必要的比较,但整体仍然是二次复杂度。 - -### 2. 代码重复 - -- 代码没有明显重复,但逻辑可以简化。 - -### 3. 可读性问题 - -- 代码的意图不太明确。根据代码逻辑,函数筛选出那些在数组里至少有一个元素比它小的元素。换句话说,剔除了数组里所有等于或最小的数字。 -- 可以通过注释或重命名来增加可读性。 - -### 4. 潜在的 bug - -- 目前没有明显的 bug,但输入为空数组时,返回空数组符合预期。 - -### 5. 不符合最佳实践的地方 - -- 过程较为低效,未使用 Go 语言的特性优化性能。 -- 缺少注释或文档说明。 - ---- - -## 优化建议 - -如果目的确实是要返回所有比数组中某些元素大的元素(排除数组中的最小元素),可以改为: - -- 找出数组中的最小元素 minVal。 -- 遍历数组,将大于 minVal 的元素加入结果。 -- 这样时间复杂度降为 O(n)。 - -优化代码: - -```go -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -### 优化点说明 - -- 使用一次遍历确定最小值,O(n)。 -- 再次遍历构造结果,O(n)。 -- 适当使用切片容量初始化,减少内存分配。 -- 代码意图明确且性能提升明显。 - ---- - -如果你期望的是其他逻辑,请告知我进一步调整。 - -=== 第 1 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // 找出切片中的最小值 - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // 收集所有大于最小值的元素 - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 1 轮 - ExitController === -TERMINATE - -=== 第 2 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 2 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - // 找出切片中的最小值 
- minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - // 收集所有大于最小值的元素 - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 2 轮 - ExitController === -TERMINATE - -=== 第 3 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 3 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 3 轮 - ExitController === -TERMINATE - -=== 第 4 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 4 轮 - CodeOptimizer === - -```go -// processData 返回切片中所有大于最小元素的元素。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 4 轮 - ExitController === -TERMINATE - -=== 第 5 轮 - CodeAnalyzer === -EXIT: 代码质量已达到标准 - -=== 第 5 轮 - CodeOptimizer === - -```go -// processData 返回输入切片中所有大于最小元素的值。 -// 如果输入为空,则返回 nil。 -func processData(data []int) []int { - if len(data) == 0 { - return nil - } - - minVal := data[0] - for _, v := range data[1:] { - if v < minVal { - minVal = v - } - } - - result := make([]int, 0, len(data)) - for _, v := range data { - if v > minVal { - result = append(result, v) - } - } - - return result -} -``` - -=== 第 5 轮 - ExitController === -TERMINATE - -代码优化循环执行完成! 
- -``` - - - - -## BreakLoop - - -在 Loop Agent 中,当某个 Agent 需要中断循环运行时,您可以使用 ADK 提供的对应 Break Action。 - -```go -// BreakLoopAction is a programmatic-only agent action used to prematurely -// terminate the execution of a loop workflow agent. -// When a loop workflow agent receives this action from a sub-agent, it will stop its -// current iteration and will not proceed to the next one. -// It will mark the BreakLoopAction as Done, signalling to any 'upper level' loop agent -// that this action has been processed and should be ignored further up. -// This action is not intended to be used by LLMs. -type BreakLoopAction struct { - // From records the name of the agent that initiated the break loop action. - From string - // Done is a state flag that can be used by the framework to mark when the - // action has been handled. - Done bool - // CurrentIterations is populated by the framework to record at which - // iteration the loop was broken. - CurrentIterations int -} - -// NewBreakLoopAction creates a new BreakLoopAction, signaling a request -// to terminate the current loop. -func NewBreakLoopAction(agentName string) *AgentAction { - return &AgentAction{BreakLoop: &BreakLoopAction{ - From: agentName, - }} -} -``` - -Break Action 在达到中断目的的同时不影响 Loop Agent 外的其他 Agent 运行,而 Exit Action 会立刻中断所有后续的 Agent 运行。 - -以下图为例: - - - -- 当 Agent1 发出 BreakAction 时,Loop Agent 将中断,Sequential 继续运行 Agent3 -- 当 Agent1 发出 ExitAction 时,Sequential 运行流程整体终止,Agent2 / Agent3 均不会运行 - -# ParallelAgent - -## 功能 - -ParallelAgent 允许多个子 Agent 基于相同的输入上下文并发执行,所有子 Agent 同时开始执行,并等待全部完成后结束。这种模式特别适用于可以独立并行处理的任务,能够显著提高执行效率。 - - - -```go -type ParallelAgentConfig struct { - Name string // Agent 名称 - Description string // Agent 描述 - SubAgents []Agent // 并发执行的子 Agent 列表 -} - -func NewParallelAgent(ctx context.Context, config *ParallelAgentConfig) (Agent, error) -``` - -ParallelAgent 的执行遵循以下设定: - -1. **并发执行**:所有子 Agent 同时启动,在独立的 goroutine 中并行执行 -2. **共享输入**:所有子 Agent 接收相同的初始输入和上下文 -3. 
**等待与结果聚合**:内部使用 sync.WaitGroup 等待所有子 Agent 执行完成,收集所有子 Agent 的执行结果并按接收顺序输出 - -另外 Parallel 内部默认包含异常处理机制: - -- **Panic 恢复**:每个 goroutine 都有独立的 panic 恢复机制 -- **错误隔离**:单个子 Agent 的错误不会影响其他子 Agent 的执行 -- **中断处理**:支持子 Agent 的中断和恢复机制 - -ParallelAgent 适用于以下场景: - -- **独立任务并行处理**:多个不相关的任务可以同时执行 -- **多角度分析**:从不同角度同时分析同一个问题 -- **性能优化**:通过并行执行减少总体执行时间 -- **多专家咨询**:同时咨询多个专业领域的 Agent - -## 示例 - -示例展示了如何使用 ParallelAgent 同时从四个不同角度分析产品方案: - -1. **TechnicalAnalyst**:技术可行性分析 -2. **BusinessAnalyst**:商业价值分析 -3. **UXAnalyst**:用户体验分析 -4. **SecurityAnalyst**:安全风险分析 - -```go -package main - -import ( - "context" - "fmt" - "log" - "os" - "sync" - - "github.com/cloudwego/eino-ext/components/model/openai" - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/components/model" -) - -func newChatModel() model.ToolCallingChatModel { - cm, err := openai.NewChatModel(context.Background(), &openai.ChatModelConfig{ - APIKey: os.Getenv("OPENAI_API_KEY"), - Model: os.Getenv("OPENAI_MODEL"), - }) - if err != nil { - log.Fatal(err) - } - return cm -} - -// 技术分析 Agent -func NewTechnicalAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "TechnicalAnalyst", - Description: "从技术角度分析内容", - Instruction: `你是一个技术专家。请从技术实现、架构设计、性能优化等技术角度分析提供的内容。 -重点关注: -1. 技术可行性 -2. 架构合理性 -3. 性能考量 -4. 技术风险 -5. 实现复杂度`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 商业分析 Agent -func NewBusinessAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "BusinessAnalyst", - Description: "从商业角度分析内容", - Instruction: `你是一个商业分析专家。请从商业价值、市场前景、成本效益等商业角度分析提供的内容。 -重点关注: -1. 商业价值 -2. 市场需求 -3. 竞争优势 -4. 成本分析 -5. 
盈利模式`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 用户体验分析 Agent -func NewUXAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "UXAnalyst", - Description: "从用户体验角度分析内容", - Instruction: `你是一个用户体验专家。请从用户体验、易用性、用户满意度等角度分析提供的内容。 -重点关注: -1. 用户友好性 -2. 操作便利性 -3. 学习成本 -4. 用户满意度 -5. 可访问性`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -// 安全分析 Agent -func NewSecurityAnalystAgent() adk.Agent { - a, err := adk.NewChatModelAgent(context.Background(), &adk.ChatModelAgentConfig{ - Name: "SecurityAnalyst", - Description: "从安全角度分析内容", - Instruction: `你是一个安全专家。请从信息安全、数据保护、隐私合规等安全角度分析提供的内容。 -重点关注: -1. 数据安全 -2. 隐私保护 -3. 访问控制 -4. 安全漏洞 -5. 合规要求`, - Model: newChatModel(), - }) - if err != nil { - log.Fatal(err) - } - return a -} - -func main() { - ctx := context.Background() - - // 创建四个不同角度的分析 Agent - techAnalyst := NewTechnicalAnalystAgent() - bizAnalyst := NewBusinessAnalystAgent() - uxAnalyst := NewUXAnalystAgent() - secAnalyst := NewSecurityAnalystAgent() - - // 创建 ParallelAgent,同时进行多角度分析 - parallelAgent, err := adk.NewParallelAgent(ctx, &adk.ParallelAgentConfig{ - Name: "MultiPerspectiveAnalyzer", - Description: "多角度并行分析:技术 + 商业 + 用户体验 + 安全", - SubAgents: []adk.Agent{techAnalyst, bizAnalyst, uxAnalyst, secAnalyst}, - }) - if err != nil { - log.Fatal(err) - } - - // 创建 Runner - runner := adk.NewRunner(ctx, adk.RunnerConfig{ - Agent: parallelAgent, - }) - - // 要分析的产品方案 - productProposal := ` -产品方案:智能客服系统 - -概述:开发一个基于大语言模型的智能客服系统,能够自动回答用户问题,处理常见业务咨询,并在必要时转接人工客服。 - -主要功能: -1. 自然语言理解和回复 -2. 多轮对话管理 -3. 知识库集成 -4. 情感分析 -5. 人工客服转接 -6. 对话历史记录 -7. 
多渠道接入(网页、微信、APP) - -技术架构: -- 前端:React + TypeScript -- 后端:Go + Gin 框架 -- 数据库:PostgreSQL + Redis -- AI模型:GPT-4 API -- 部署:Docker + Kubernetes -` - - fmt.Println("开始多角度并行分析...") - iter := runner.Query(ctx, "请分析以下产品方案:\n"+productProposal) - - // 使用 map 来收集不同分析师的结果 - results := make(map[string]string) - var mu sync.Mutex - - for { - event, ok := iter.Next() - if !ok { - break - } - - if event.Err != nil { - log.Printf("分析过程中出现错误: %v", event.Err) - continue - } - - if event.Output != nil && event.Output.MessageOutput != nil { - mu.Lock() - results[event.AgentName] = event.Output.MessageOutput.Message.Content - mu.Unlock() - - fmt.Printf("\n=== %s 分析完成 ===\n", event.AgentName) - } - } - - // 输出所有分析结果 - fmt.Println("\n" + "============================================================") - fmt.Println("多角度分析结果汇总") - fmt.Println("============================================================") - - analysisOrder := []string{"TechnicalAnalyst", "BusinessAnalyst", "UXAnalyst", "SecurityAnalyst"} - analysisNames := map[string]string{ - "TechnicalAnalyst": "技术分析", - "BusinessAnalyst": "商业分析", - "UXAnalyst": "用户体验分析", - "SecurityAnalyst": "安全分析", - } - - for _, agentName := range analysisOrder { - if result, exists := results[agentName]; exists { - fmt.Printf("\n【%s】\n", analysisNames[agentName]) - fmt.Printf("%s\n", result) - fmt.Println("----------------------------------------") - } - } - - fmt.Println("\n多角度并行分析完成!") - fmt.Printf("共收到 %d 个分析结果\n", len(results)) -} -``` - -运行结果为: - -```markdown -开始多角度并行分析... - -=== BusinessAnalyst 分析完成 === - -=== UXAnalyst 分析完成 === - -=== SecurityAnalyst 分析完成 === - -=== TechnicalAnalyst 分析完成 === - -============================================================ -多角度分析结果汇总 -============================================================ - -【技术分析】 -针对该智能客服系统方案,下面从技术实现、架构设计及性能优化等角度进行详细分析: - ---- - -### 一、技术可行性 - -1. **自然语言理解和回复** - - 利用 GPT-4 API 实现自然语言理解和自动回复是当前成熟且可行的方案。GPT-4具备强大的语言理解和生成能力,适合处理复杂、多样的问题。 - -2. 
**多轮对话管理** - - 依赖后端维护上下文状态,结合GPT-4模型能够较好处理多轮交互。需要设计合理的上下文管理机制(例如对话历史维护、关键槽位抽取等),确保上下文信息完整性。 - -3. **知识库集成** - - 可通过向GPT-4 API添加特定的知识库检索结果(检索增强生成),或者通过本地检索接口集成知识库。技术上可行,但对于实时性和准确性有较高要求。 - -4. **情感分析** - - 情感分析功能可以用独立的轻量模型实现(例如基于BERT微调),也可尝试利用GPT-4输出,但成本较高。情感分析能力帮助智能客服更好地理解用户情绪,提升用户体验。 - -5. **人工客服转接** - - 技术上通过建立事件触发规则(如轮次数、情绪阈值、关键词检测)实现自动转人工。系统需支持工单或会话传递机制,并保障会话无缝切换。 - -6. **多渠道接入** - - 网页、微信、App等多渠道接入均可通过统一API网关实现,技术成熟,同时需要处理渠道差异性(消息格式、认证、推送机制等)。 - ---- - -### 二、架构合理性 - -- **前端 React + TypeScript** - 非常适合搭建响应式客服界面,生态成熟,方便多渠道共享组件。 - -- **后端 Go + Gin** - Go语言性能优异,Gin框架轻量且性能高,适合高并发场景。后端承担对接 GPT-4 API、管理状态、多渠道消息转发等职责,选择合理。 - -- **数据库 PostgreSQL + Redis** - - PostgreSQL 负责存储结构化数据,如用户信息、对话历史、知识库元数据。 - - Redis 负责缓存会话状态、热点知识库、限流等,提升访问性能。 - 架构设计符合常见大型互联网产品模式,组件分工明确。 - -- **AI模型 GPT-4 API** - 使用成熟API降低开发难度和模型维护成本;缺点是对网络和API调用依赖度高。 - -- **部署 Docker + Kubernetes** - 容器化和K8s编排能保证系统弹性伸缩、高可用和灰度发布,适合生产环境,符合现代微服务架构趋势。 - ---- - -### 三、性能考量 - -1. **响应时间** - - GPT-4 API调用本身有一定延迟(通常几百毫秒到1秒不等),对响应时间影响较大。需要做好接口异步处理与前端体验设计(如加载动画、部分渐进响应)。 - -2. **并发处理能力** - - 后端Go具有高并发处理优势,配合Redis缓存热点数据,能大幅提升整体吞吐能力。 - - 但GPT-4 API调用受限于OpenAI服务的QPS限制与调用成本,需合理设计调用频率与降级策略。 - -3. **缓存策略** - - 对用户对话上下文和常见问题答案进行缓存,减少重复API调用。 - - 如关键问题先做本地匹配,失败后才调用GPT-4,提升效率。 - -4. **多渠道负载均衡** - - 需要设计统一消息总线和可靠的异步队列,防止某渠道流量突增影响整体系统稳定。 - ---- - -### 四、技术风险 - -1. **GPT-4 API依赖** - - 高度依赖第三方API,风险包括服务中断、接口变更及成本波动。 - - 建议设计本地缓存和有限的替代回答逻辑以应对API异常。 - -2. **多轮对话上下文管理难度** - - 上下文过长或复杂会导致回答质量降低,需要设计限制上下文长度、选择性保留重要信息机制。 - -3. **知识库集成复杂度** - - 如何做到知识库与 ----------------------------------------- - -【商业分析】 -以下是对智能客服系统产品方案的商业角度分析: - -1. 商业价值 -- 提升客户服务效率:自动解答用户问题和常见咨询,减少人工客服压力,降低用人成本。 -- 提升用户体验:多轮对话和情感分析使交互更自然,增强客户满意度和粘性。 -- 数据驱动决策支持:对话历史与知识库集成为企业提供宝贵的用户反馈和行为数据,优化产品和服务。 -- 支持业务扩展:多渠道接入(网页、微信、APP)满足不同客户接入习惯,提升覆盖率。 - -2. 市场需求 -- 市场对智能客服的需求持续增长,特别是在电商、金融、医疗、教育等行业,客户服务自动化是企业数字化转型的重要方向。 -- 随着AI技术的成熟,企业期望借助大语言模型提升客服智能化水平。 -- 用户对即时响应、全天候服务的需求增加,推动智能客服系统的广泛采用。 - -3. 
竞争优势 -- 采用先进的GPT-4大语言模型,拥有较强的自然语言理解与生成能力,提升问答准确率和对话自然度。 -- 情感分析功能有助于精准识别用户情绪,动态调整回复策略,提高客户满意度。 -- 多渠道接入设计满足企业多元化客户触达需求,增强产品适用性。 -- 技术架构采用微服务、容器化部署,便于弹性扩展和维护,提升系统稳定性和扩展能力。 - -4. 成本分析 -- AI模型调用成本较高,依赖GPT-4 API,需根据调用量和响应速度调整预算。 -- 技术研发投入较大,涉及前后端、多渠道融合、AI和知识库管理。 -- 运维和服务器成本需考虑多渠道并发访问。 -- 长期来看,人工客服人数可显著减少,节省人力成本。 -- 可通过云服务降低硬件初期投入,但云资源使用需精细管理以控制费用。 - -5. 盈利模式 -- SaaS订阅服务:按月/年向企业客户收取服务费,基于接入渠道数、并发量和功能级别分层定价。 -- 按调用次数或对话数收费,适合业务波动较大的客户。 -- 增值服务:数据分析报告定制、行业知识库集成、人工客服协同工具等收费。 -- 中大型客户可提供定制开发和技术支持,收取项目费用。 -- 通过持续优化模型和服务,增加客户留存和续费率。 - -综上,该智能客服系统基于成熟技术与AI优势,具备良好的商业价值和市场潜力。其多渠道接入和情感分析等功能增强竞争力,但需合理控制AI调用成本和运营费用。建议重点推进SaaS订阅和增值服务,结合市场推广,快速占领客户资源,提升盈利能力。 ----------------------------------------- - -【用户体验分析】 -针对该智能客服系统方案,我将从用户体验、易用性、用户满意度及可访问性等角度进行分析: - -1. 用户友好性 -- 自然语言理解和回复能力提升了用户与系统的沟通体验,使用户能够用自然话语表达需求,降低交流障碍。 -- 多轮对话管理允许系统理解上下文,减少重复解释,增强对话连贯性,进一步提升用户体验。 -- 情感分析功能有助于系统识别用户情绪,做出更贴心的回应,提高互动的个性化和人性化。 -- 多渠道接入覆盖用户常用的访问途径,方便用户随时随地获取服务,提升友好度。 - -2. 操作便利性 -- 自动回答常见业务咨询能够减轻用户等待时间和操作负担,提高响应速度。 -- 人工客服转接机制确保复杂问题可被及时处理,保障服务连续性和操作的无缝衔接。 -- 对话历史记录方便用户回顾咨询内容,避免重复查询,提升操作便利。 -- 使用现代技术栈(React、TypeScript)为前端交互提供良好性能和响应速度,间接增强操作流畅性。 - -3. 学习成本 -- 基于自然语言处理,用户无需学习特殊指令,降低使用门槛。 -- 多轮对话自然衔接,让用户更易理解系统响应逻辑,减少迷惑和挫败感。 -- 不同渠道的一致性界面(如在网页和微信中保持类似体验)有助于用户迅速上手。 -- 通过情感分析提供的更精准反馈,减少用户因误解而频繁尝试的时间成本。 - -4. 用户满意度 -- 快速准确的自动回复和多轮对话减少用户等待和重复输入,提升满意度。 -- 情感分析让系统更懂用户情绪,带来更温暖的交互体验,增加用户粘性。 -- 人工客服介入保障复杂问题得到妥善处理,提高服务质量感知。 -- 多渠道覆盖满足不同用户的使用场景,增强整体满意度。 - -5. 可访问性 -- 多渠道接入覆盖网页、微信、APP,适应不同用户的设备和环境,提升可访问性。 -- 方案未明确提及无障碍设计(如屏幕阅读器兼容、高对比度模式等),这可能是未来需要补充的部分。 -- 前端采用React和TypeScript,有利于实现响应式设计和无障碍功能,但需确保开发规范落地。 -- 后端架构和部署方案保证系统的稳定性和扩展性,间接提升用户持续可访问性。 - -总结: -该智能客服系统方案在用户体验和易用性方面考虑较为充分,利用大语言模型实现自然多轮对话、情感分析和知识库集成,满足用户多样化需求。同时,多渠道接入增强了系统的覆盖能力。建议在具体落地时,强化无障碍设计,实现更全面的可访问性保障,同时继续优化对话策略以提升用户满意度。 ----------------------------------------- - -【安全分析】 -针对该智能客服系统方案,结合信息安全、数据保护及隐私合规等方面,展开如下分析: - -一、数据安全 - -1. 数据传输安全 -- 建议系统所有客户端与服务器间通信均采用TLS/SSL加密,保障数据在传输过程中的机密性与完整性。 -- 由于支持多渠道接入(网页、微信、APP),需确保每个入口均严格实施加密传输。 - -2. 
数据存储安全 -- PostgreSQL存储对话历史、用户资料等敏感信息,需启用数据库加密(如透明数据加密TDE或字段级加密),防止数据泄露。 -- Redis作为缓存,可能存储临时会话数据,也需开启访问认证与加密传输。 -- 对用户敏感数据实行最小存储原则,避免无关数据超范围保存。 -- 数据备份过程中需加密保存,且备份访问同样受控。 - -3. API调用安全 -- GPT-4 API调用产生大量用户数据交互,应评估其数据处理及存储政策,确保符合数据安全要求。 -- 增加调用权限管理,限制API密钥访问范围和权限,避免被滥用。 - -4. 日志安全 -- 系统日志中避免存储明文敏感信息,尤其是个人身份信息、对话内容。日志访问需严格控制。 - -二、隐私保护 - -1. 个人数据处理 -- 采集和存储用户个人数据(姓名、联系方式、账务信息等)必须明确告知用户,并征得用户同意。 -- 实施数据匿名化/去标识化技术,尤其是对话历史中的身份信息处理。 - -2. 用户隐私权利 -- 满足相关法律法规(例如《个人信息保护法》、《GDPR》)中用户的访问、更正、删除数据的权利。 -- 提供隐私政策明确披露数据收集、使用和共享情况。 - -3. 交互隐私 -- 多轮对话和情感分析等功能应考虑避免过度侵犯用户隐私,例如敏感情绪数据的使用透明告知和限制。 - -4. 第三方合规 -- GPT-4 API由第三方提供,需确保其服务符合相关隐私合规要求及数据保护标准。 - -三、访问控制 - -1. 用户身份验证 -- 系统中涉及用户身份信息查询和管理时,需建立可靠的身份认证机制。 -- 支持多因素认证增强安全性。 - -2. 权限管理 -- 后端管理接口及人工客服转接模块需采用基于角色的访问控制(RBAC),确保操作权限最小化。 -- 对访问敏感数据的操作需有详细审计和监控。 - -3. 会话管理 -- 对多渠道的会话要有有效的会话管理机制,防止会话劫持。 -- 对话历史访问权限应限制仅允许相关用户或授权人员访问。 - -四、安全漏洞 - -1. 应用安全 -- 前端React+TypeScript应防止XSS、CSRF攻击,合理使用Content Security Policy(CSP)。 -- 后端Go应用需防止SQL注入、请求伪造和权限缺失。Gin框架提供中间件支持,建议充分利用安全模块。 - -2. AI模型风险 -- GPT-4 API本身输入输出可能存在敏感信息泄露或模型误用风险,需限制输入内容、过滤敏感信息。 -- 防止生成恶意回答或信息泄露,建立内容审核机制。 - -3. 容器和部署安全 -- Docker容器须采用安全镜像,及时打补丁。Kubernetes集群网络策略和访问控制需完善。 -- 容器运行权限最小化,避免容器逃逸风险。 - -五、合规要求 - -1. 数据保护法规 -- 根据运营地域,需符合《个人信息保护法》(PIPL)、《欧盟通用数据保护条例》(GDPR)或其他相关法律要求。 -- 明确用户数据的采集、处理、传输和存储流程符合法规。 - -2. 用户隐私告知及同意 -- 应提供清晰的隐私政策和使用条款,说明数据用途及处理方式。 -- 实现用户同意管理(Consent Management)机制。 - -3. 数据跨境传输合规 -- 若系统涉及跨境数据流,需评估合规风险和采取相应技术 ----------------------------------------- - -多角度并行分析完成! 
-共收到 4 个分析结果
-```
-
-# Summary
-
-Workflow Agents give Eino ADK its multi-Agent collaboration capability. By choosing and combining these Workflow Agents appropriately, developers can build efficient, reliable multi-Agent systems that meet complex business requirements.
diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md b/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md
index a3629db244d..67f85bcf201 100644
--- a/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md
+++ b/content/zh/docs/eino/core_modules/eino_adk/agent_interface.md
@@ -1,390 +1,198 @@
 ---
 Description: ""
-date: "2026-03-02"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: Agent Abstraction
 weight: 3
 ---
 
-# Agent Definition
+# Agent Interface
 
-Eino defines the basic Agent interface; any struct that implements it can be treated as an Agent:
+All ADK functionality is built around the `Agent` interface:
 
 ```go
-// github.com/cloudwego/eino/adk/interface.go
+// github.com/cloudwego/eino/adk
 
-type Agent interface {
+type TypedAgent[M MessageType] interface {
     Name(ctx context.Context) string
     Description(ctx context.Context) string
-    Run(ctx context.Context, input *AgentInput, opts ...AgentRunOption) *AsyncIterator[*AgentEvent]
+    Run(ctx context.Context, input *TypedAgentInput[M], options ...AgentRunOption) *AsyncIterator[*TypedAgentEvent[M]]
 }
+
+// Default type alias (uses *schema.Message)
+type Agent = TypedAgent[*schema.Message]
 ```
 
-
-
-
-
+
+
+
+
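| Method | Description |
| --- | --- |
| Name | The Agent's name, which serves as its identifier |
| Description | A description of the Agent's function, mainly used to let other Agents understand and judge this Agent's responsibilities or capabilities |
| Run | The Agent's core execution method. It returns an iterator through which the caller can continuously receive the events the Agent produces |

| Method | Description |
| --- | --- |
| Name | Agent name identifier |
| Description | Description of responsibilities, letting other Agents or the framework understand the Agent's capabilities |
| Run | Core execution method; asynchronously returns an event stream (Future pattern) |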
    -## AgentInput - -Run 方法接收 AgentInput 作为 Agent 的输入: - -```go -type AgentInput struct { - Messages []Message - EnableStreaming bool -} - -type Message = *schema.Message -``` - -Agent 通常以 ChatModel 为核心,因此规定 Agent 的输入为 `Messages`, 与调用 Eino ChatModel 的类型相同。`Messages` 中可以包括用户指令、对话历史、背景知识、样例数据等任何你希望传递给 Agent 的数据。例如: +## MessageType 约束 ```go -import ( - "github.com/cloudwego/eino/adk" - "github.com/cloudwego/eino/schema" -) - -input := &adk.AgentInput{ - Messages: []adk.Message{ - schema.UserMessage("What's the capital of France?"), - schema.AssistantMessage("The capital of France is Paris.", nil), - schema.UserMessage("How far is it from London? "), - }, +type MessageType interface { + *schema.Message | *schema.AgenticMessage } ``` -`EnableStreaming` 用于向 Agent **建议**其输出模式,但它并非一个强制性约束。它的核心思想是控制那些同时支持流式和非流式输出的组件的行为,例如 ChatModel,而仅支持一种输出方式的组件,`EnableStreaming` 不会影响他们的行为。另外在 `AgentOutput.IsStreaming` 字段会标明实际输出类型。运行表现为: +所有 ADK 泛型类型使用 `[M MessageType]` 参数化。`*schema.Message` 支持完整 ADK 特性;`*schema.AgenticMessage` 用于 v0.9 新增的结构化内容块模式。 -- 当 `EnableStreaming=false` 时,对于那些既能流式也能非流式输出的组件,此时会使用一次性返回完整结果的非流式模式。 -- 当 `EnableStreaming=true` 时,对于 Agent 内部能够流式输出的组件(如 ChatModel 调用),应以流的形式逐步返回结果。如果某个组件天然不支持流式,它仍然可以按其原有的非流式方式工作。 +## 类型别名速查 -如下图所示,ChatModel 既可以输出非流也可以输出流,Tool 只能输出非流,即: - -- 当 `EnableStream=false` 时,二者均输出非流 -- 当 `EnableStream=true` 时,ChatModel 输出流,Tool 因为不具备输出流的能力,仍然输出非流。 - - - -## AgentRunOption - -`AgentRunOption` 由 Agent 实现定义,可以在请求维度修改 Agent 配置或者控制 Agent 行为。 - -Eino ADK 提供了一些通用定义的 Option,供用户使用: - -- `WithSessionValues`:设置跨 Agent 读写数据 -- `WithSkipTransferMessages`:配置后,当 Event 为 Transfer SubAgent 时,Event 中的消息不会追加到 History 中 - -Eino ADK 提供了 `WrapImplSpecificOptFn` 和 `GetImplSpecificOptions` 两个方法,供 Agent 包装与读取自定义的 `AgentRunOption`。 - -当使用 `GetImplSpecificOptions` 方法读取 `AgentRunOptions` 时,与所需类型(如例子中的 options)不符的 AgentRunOption 会被忽略。 + + + + + + + +
| Generic type | Default alias |
| --- | --- |
| `TypedAgent[*schema.Message]` | `Agent` |
| `TypedAgentInput[*schema.Message]` | `AgentInput` |
| `TypedAgentEvent[*schema.Message]` | `AgentEvent` |
| `TypedAgentOutput[*schema.Message]` | `AgentOutput` |
| `TypedMessageVariant[*schema.Message]` | `MessageVariant` |
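
The relationship between the generic types and their default aliases can be sketched in plain Go. This is an illustration only: `Message`, `AgenticMessage`, `TypedInput`, `Input`, and `count` are hypothetical stand-ins declared locally, not the ADK or `schema` definitions.

```go
package main

import "fmt"

// Message and AgenticMessage are hypothetical stand-ins for the two message types.
type Message struct{ Content string }
type AgenticMessage struct{ Blocks []string }

// MessageType has the same shape as the constraint: a union of two pointer types.
type MessageType interface {
	*Message | *AgenticMessage
}

// TypedInput is parameterized by the message type, like the Typed* types in the table.
type TypedInput[M MessageType] struct {
	Messages        []M
	EnableStreaming bool
}

// Input is an alias, so it is the same type as TypedInput[*Message], not a new one.
type Input = TypedInput[*Message]

// count accepts an input of either message type.
func count[M MessageType](in *TypedInput[M]) int { return len(in.Messages) }

func main() {
	// A value of the generic instantiation is assignable to the alias without conversion.
	var in Input = TypedInput[*Message]{Messages: []*Message{{Content: "hi"}}}
	agentic := &TypedInput[*AgenticMessage]{
		Messages: []*AgenticMessage{{Blocks: []string{"a"}}, {Blocks: []string{"b"}}},
	}
	fmt.Println(count(&in), count(agentic))
}
```

Because the default names are aliases rather than new types, code written against the short names and code written against the generic instantiation can be mixed freely.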
    -例如可以定义 `WithModelName`,在请求维度要求 Agent 修改调用的模型: +# AgentInput ```go -// github.com/cloudwego/eino/adk/call_option.go -// func WrapImplSpecificOptFn[T any](optFn func(*T)) AgentRunOption -// func GetImplSpecificOptions[T any](base *T, opts ...AgentRunOption) *T - -import "github.com/cloudwego/eino/adk" - -type options struct { - modelName string -} - -func WithModelName(name string) adk.AgentRunOption { - return adk.WrapImplSpecificOptFn(func(t *options) { - t.modelName = name - }) -} - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - o := &options{} - o = adk.GetImplSpecificOptions(o, opts...) - // run code... +type TypedAgentInput[M MessageType] struct { + Messages []M + EnableStreaming bool } ``` -除此之外,AgentRunOption 具有一个 `DesignateAgent` 方法,调用该方法可以在调用多 Agent 系统时指定 Option 生效的 Agent: - -```go -func genOpt() { - // 指定 option 仅对 agent_1 和 agent_2 生效 - opt := adk.WithSessionValues(map[string]any{}).DesignateAgent("agent_1", "agent_2") -} -``` +- **Messages**:用户指令、对话历史、背景知识等,与 ChatModel 输入格式一致 +- **EnableStreaming**:建议 Agent 使用流式输出。支持流式的组件(如 ChatModel)会逐步返回;不支持的组件不受影响 -## AsyncIterator +# AgentEvent -`Agent.Run` 返回了一个迭代器 `AsyncIterator[*AgentEvent]`: +Agent 运行过程中产出的事件: ```go -// github.com/cloudwego/eino/adk/utils.go - -type AsyncIterator[T any] struct { - ... -} - -func (ai *AsyncIterator[T]) Next() (T, bool) { - ... 
+type TypedAgentEvent[M MessageType] struct { + AgentName string + RunPath []RunStep + Output *TypedAgentOutput[M] + Action *AgentAction + Err error } ``` -它代表一个异步迭代器(异步指生产与消费之间没有同步控制),允许调用者以一种有序、阻塞的方式消费 Agent 在运行过程中产生的一系列事件。 - -- `AsyncIterator` 是一个泛型结构体,可以用于迭代任何类型的数据。当前在 Agent 接口中, Run 方法返回的迭代器类型被固定为 `AsyncIterator[*AgentEvent]` 。这意味着,你从这个迭代器中获取的每一个元素,都将是一个指向 `AgentEvent` 对象的指针。`AgentEvent` 会在后续章节中详细说明。 -- 迭代器的主要交互方式是通过调用其 `Next()` 方法。这个方法的行为是 阻塞式 的,每次调用 `Next()` ,程序会暂停执行,直到以下两种情况之一发生: - - Agent 产生了一个新的 `AgentEvent` : `Next()` 方法会返回这个事件,调用者可以立即对其进行处理。 - - Agent 主动关闭了迭代器 : 当 Agent 不会再产生任何新的事件时(通常是 Agent 运行结束),它会关闭这个迭代器。此时 `Next()` 调用会结束阻塞并在第二个返回值返回 false,告知调用者迭代已经结束。 - -通常情况下,你需要使用 for 循环处理 `AsyncIterator`: +## AgentOutput ```go -iter := myAgent.Run(xxx) // get AsyncIterator from Agent.Run - -for { - event, ok := iter.Next() - if !ok { - break - } - // handle event +type TypedAgentOutput[M MessageType] struct { + MessageOutput *TypedMessageVariant[M] + CustomizedOutput any } ``` -`AsyncIterator` 可以由 `NewAsyncIteratorPair` 创建,该函数返回的另一个参数 `AsyncGenerator` 用来生产数据: +`MessageVariant` 统一处理流式与非流式消息: ```go -// github.com/cloudwego/eino/adk/utils.go - -func NewAsyncIteratorPair[T any]() (*AsyncIterator[T], *AsyncGenerator[T]) -``` - -Agent.Run 返回 AsyncIterator 旨在让调用者实时地接收到 Agent 产生的一系列 AgentEvent,因此 Agent.Run 通常会在 Goroutine 中运行 Agent 从而立刻返回 AsyncIterator 供调用者监听: - -```go -import "github.com/cloudwego/eino/adk" - -func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { - // handle input - iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() - go func() { - defer func() { - // recover code - gen.Close() - }() - // agent run code - // gen.Send(event) - }() - return iter +type TypedMessageVariant[M MessageType] struct { + IsStreaming bool + Message M + MessageStream *schema.StreamReader[M] + Role schema.RoleType // *schema.Message 路径 + AgenticRole schema.AgenticRoleType // *schema.AgenticMessage 路径 
+ ToolName string } ``` -## AgentWithOptions - -使用 `AgentWithOptions` 方法可以在 Eino ADK Agent 中进行一些通用配置。 - -与 `AgentRunOption` 不同的是,`AgentWithOptions` 在运行前生效,并且不支持自定义 option。 +- `IsStreaming=true` → 从 `MessageStream` 逐帧读取 +- `IsStreaming=false` → 从 `Message` 一次性获取 +- `Role`/`ToolName`:仅 `*schema.Message` 路径有效(Assistant 或 Tool) +- `AgenticRole`:仅 `*schema.AgenticMessage` 路径有效 -```go -// github.com/cloudwego/eino/adk/flow.go -func AgentWithOptions(ctx context.Context, agent Agent, opts ...AgentOption) Agent -``` - -Eino ADK 当前内置支持的配置有: - -- `WithDisallowTransferToParent`:配置该 SubAgent 不允许 Transfer 到 ParentAgent,会触发该 SubAgent 的 `OnDisallowTransferToParent` 回调方法 -- `WithHistoryRewriter`:配置后该 Agent 在执行前会通过该方法重写接收到的上下文信息 - -# AgentEvent +## AgentAction -AgentEvent 是 Agent 在其运行过程中产生的核心事件数据结构。其中包含了 Agent 的元信息、输出、行为和报错: +控制多 Agent 协作的行为信号: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentEvent struct { - AgentName string - - RunPath []RunStep - - Output *AgentOutput - - Action *AgentAction - - Err error +type AgentAction struct { + Exit bool + Interrupted *InterruptInfo + TransferToAgent *TransferToAgentAction // NOT RECOMMENDED + BreakLoop *BreakLoopAction + CustomizedAction any } - -// EventFromMessage 构建普通 event -func EventFromMessage(msg Message, msgStream MessageStream, role schema.RoleType, toolName string) *AgentEvent ``` -## AgentName & RunPath - -`AgentName` 和 `RunPath` 字段是由框架自动进行填充,它们提供了关于事件来源的重要上下文信息,在复杂的、由多个 Agent 构成的系统中至关重要。 +- **Interrupted**:中断 Runner 运行,携带自定义数据,支持后续 Resume +- **BreakLoop**:中止 LoopAgent 的循环 +- **Exit**:立即退出多 Agent 系统 +- **TransferToAgent**:(不推荐)任务转让,建议使用 AgentAsTool 替代 -```go -type RunStep struct { - agentName string -} -``` +# AgentRunOption -- `AgentName` 标明了是哪一个 Agent 实例产生了当前的 AgentEvent 。 -- `RunPath` 记录了到达当前 Agent 的完整调用链路。`RunPath` 是一个 `RunStep` 切片,它按顺序记录了从最初的入口 Agent 到当前产生事件的 Agent 的所有 `AgentName`。 - -## AgentOutput +请求维度的 Agent 配置。ADK 内置: -`AgentOutput` 封装了 Agent 产生的输出。 +- `WithSessionValues(map[string]any)`:注入跨 Agent 共享的 
KV 数据 +- `WithCallbacks(...callbacks.Handler)`:添加回调处理器 +- `WithCancel()`:启用 Agent Cancel 能力(详见 [Cancel 与 TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)) -Message 输出设置在 MessageOutput 字段中,其他类型的自定义输出设置在 CustomizedOutput 字段中: +自定义 Option: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentOutput struct { - MessageOutput *MessageVariant - - CustomizedOutput any +type myOptions struct { + modelName string } -type MessageVariant struct { - IsStreaming bool +func WithModelName(name string) adk.AgentRunOption { + return adk.WrapImplSpecificOptFn(func(t *myOptions) { + t.modelName = name + }) +} - Message Message - MessageStream MessageStream - // message role: Assistant or Tool - Role schema.RoleType - // only used when Role is Tool - ToolName string +// 在 Run 中读取 +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + o := adk.GetImplSpecificOptions(&myOptions{}, opts...) + // 使用 o.modelName ... } ``` -`MessageOutput` 字段的类型 `MessageVariant` 是一个核心数据结构,主要功能为: - -1. 统一处理流式与非流式消息:`IsStreaming` 是一个标志位。值为 true 表示当前 `MessageVariant` 包含的是一个流式消息(从 MessageStream 读取),为 false 则表示包含的是一个非流式消息(从 Message 读取): - - - 流式 : 随着时间的推移,逐步返回一系列消息片段,最终构成一个完整的消息(MessageStream)。 - - 非流式 : 一次性返回一个完整的消息(Message)。 -2. 
提供便捷的元数据访问:Message 结构体内部包含了一些重要的元信息,如消息的 Role(Assistant 或 Tool),为了方便快速地识别消息类型和来源, MessageVariant 将这些常用的元数据提升到了顶层: - - - `Role`:消息的角色,Assistant / Tool - - `ToolName`:如果消息角色是 Tool ,这个字段会直接提供工具的名称。 - -这样做的好处是,代码在需要根据消息类型进行路由或决策时, 无需深入解析 Message 对象的具体内容 ,可以直接从 MessageVariant 的顶层字段获取所需信息,从而简化了逻辑,提高了代码的可读性和效率。 - -## AgentAction - -Agent 产生包含 AgentAction 的 Event 可以控制多 Agent 协作,比如立刻退出、中断、跳转等: +`DesignateAgent` 可将 Option 限定到指定 Agent: ```go -// github.com/cloudwego/eino/adk/interface.go - -type AgentAction struct { - Exit bool - - Interrupted *InterruptInfo - - TransferToAgent *TransferToAgentAction - - BreakLoop *BreakLoopAction - - CustomizedAction any -} - -type InterruptInfo struct { - Data any -} - -type TransferToAgentAction struct { - DestAgentName string -} +opt := adk.WithSessionValues(map[string]any{"key": "val"}).DesignateAgent("agent_1") ``` -Eino ADK 当前预设 Action 有四种: +# AsyncIterator -1. 退出:当 Agent 产生 Exit Action 时,Multi-Agent 会立刻退出 +`Run` 返回的异步事件迭代器: ```go -func NewExitAction() *AgentAction { - return &AgentAction{Exit: true} +iter := agent.Run(ctx, input) +for { + event, ok := iter.Next() + if !ok { + break + } + // 处理 event } ``` -1. 跳转:当 Agent 产生 Transfer Action 时,会跳转到目标 Agent 运行 +`Next()` 阻塞直到有新事件或迭代结束。Agent 实现通常在 goroutine 中写入 Generator,立即返回 Iterator: ```go -func NewTransferToAgentAction(destAgentName string) *AgentAction { - return &AgentAction{TransferToAgent: &TransferToAgentAction{DestAgentName: destAgentName}} +func (m *MyAgent) Run(ctx context.Context, input *adk.AgentInput, opts ...adk.AgentRunOption) *adk.AsyncIterator[*adk.AgentEvent] { + iter, gen := adk.NewAsyncIteratorPair[*adk.AgentEvent]() + go func() { + defer gen.Close() + // 执行逻辑,通过 gen.Send(event) 产出事件 + }() + return iter } ``` -1. 
中断:当 Agent 产生 Interrupt Action 时,会中断 Runner 的运行。由于中断可能发生在任何位置,同时中断时需要向外传递独特的信息,Action 中提供了 `Interrupted` 字段供 Agent 设置自定义数据,Runner 接收到 Interrupted 不为空的 Action 时则认为产生了中断。Interrupt & Resume 内部机制较为复杂,在 【Eino ADK: Agent Runner】-【Eino ADK: Interrupt & Resume】章节会展开详述。 - -```go -// 例如 ChatModelAgent 中断时,会发送如下的 AgentEvent: -h.Send(&AgentEvent{AgentName: h.agentName, Action: &AgentAction{ - Interrupted: &InterruptInfo{ - Data: &ChatModelAgentInterruptInfo{Data: data, Info: info}, - }, -}}) -``` - -4. 中止循环:当 LoopAgent 的一个子 Agent 发出 BreakLoopAction 时,对应的 LoopAgent 会停止循环并正常退出。 - # 语言设置 -ADK 提供了 `SetLanguage` 函数用于设置内置提示词(prompt)的语言。这影响所有 ADK 内置组件和中间件生成的提示词语言。本能力在 [alpha/08](https://github.com/cloudwego/eino/releases/tag/v0.8.0-alpha.13) 版本引入。 - -## API - ```go -// Language 表示 ADK 内置提示词的语言设置 -type Language uint8 - -const ( - // LanguageEnglish 表示英文(默认) - LanguageEnglish Language = iota - // LanguageChinese 表示中文 - LanguageChinese -) - -// SetLanguage 设置 ADK 内置提示词的语言 -// 默认语言是英文(如果未显式设置) -func SetLanguage(lang Language) error +adk.SetLanguage(adk.LanguageChinese) // 或 adk.LanguageEnglish(默认) ``` -## 使用示例 - -```go -import "github.com/cloudwego/eino/adk" - -// 设置为中文 -err := adk.SetLanguage(adk.LanguageChinese) -if err != nil { - // 处理错误 -} - -// 设置为英文(默认) -err = adk.SetLanguage(adk.LanguageEnglish) -``` - -## 影响范围 - -语言设置会影响以下组件的内置提示词: - - - - - - - -
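
| Component / Middleware | Affected prompts |
| --- | --- |
| FileSystem Middleware | File system tool descriptions, system prompt, execution tool prompts |
| Reduction Middleware | Prompt text for truncating / cleaning up tool results |
| Skill Middleware | Skill system prompt, skill tool descriptions |
| ChatModelAgent | Built-in system prompt |
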
    - -> 💡 -> 建议在程序初始化时设置语言,因为语言设置是全局生效的。在运行时更改语言可能导致同一会话中出现混合语言的提示词。 +影响 ADK 内置提示词(FileSystem、Reduction、Skill、ChatModelAgent 等组件)。建议在程序初始化时设置。 > 💡 -> 语言设置仅影响 ADK 内置的提示词。你自定义的提示词(如 Agent 的 Instruction)需要自行处理国际化。 +> 语言设置仅影响 ADK 内置提示词。自定义 Instruction 需自行处理国际化。 diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md b/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md index d32061c34a9..1afa28ea851 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_preview.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-01-20" +date: "2026-05-17" lastmod: "" tags: [] title: 概述 @@ -9,154 +9,72 @@ weight: 2 # 什么是 Eino ADK? -Eino ADK 参考 [Google-ADK](https://google.github.io/adk-docs/agents/) 的设计,提供了 Go 语言 的 Agents 开发的灵活组合框架,即 Agent、Multi-Agent 开发框架。Eino ADK 为多 Agent 交互时,沉淀了通用的 上下文传递、事件流分发和转换、任务控制权转让、中断与恢复、通用切面等能力。 适用场景广泛、模型无关、部署无关,让 Agent、Multi-Agent 开发更加简单、便利,并提供完善的生产级应用的治理能力。 +Eino ADK 是 Go 语言的 Agent 开发框架,提供: -Eino ADK 旨在帮助开发者开发、管理 Agent 应用。提供灵活且鲁棒的开发环境,助力开发者搭建 对话智能体、非对话智能体、复杂任务、工作流等多种多样的 Agent 应用。 +- **ChatModelAgent**:以 LLM 为决策器的 ReAct Agent,支持工具调用、自主推理、运行时增强(Middleware) +- **Workflow Agents**:确定性编排原语(Sequential / Loop / Parallel) +- **Runner / TurnLoop**:Agent 执行入口,支持事件流、checkpoint/resume、多轮抢占 +- **多 Agent 协作**:AgentAsTool(推荐)、Workflow 组合 -# ADK 框架 +适用场景广泛、模型无关、部署无关。 -Eino ADK 的整体模块构成,如下图所示: - - +# ADK 架构 ## Agent Interface -Eino ADK 的核心是 Agent 抽象(Agent Interface),ADK 的所有功能设计均围绕 Agent 抽象展开。详解请见 [Eino ADK: Agent 抽象 [New]](/zh/docs/eino/core_modules/eino_adk/agent_interface) +ADK 的所有功能围绕 `Agent` 接口展开: ```go type Agent interface { Name(ctx context.Context) string Description(ctx context.Context) string - - // Run runs the agent. - // The returned AgentEvent within the AsyncIterator must be safe to modify. - // If the returned AgentEvent within the AsyncIterator contains MessageStream, - // the MessageStream MUST be exclusive and safe to be received directly. 
- // NOTE: it's recommended to use SetAutomaticClose() on the MessageStream of AgentEvents emitted by AsyncIterator, - // so that even the events are not processed, the MessageStream can still be closed. Run(ctx context.Context, input *AgentInput, options ...AgentRunOption) *AsyncIterator[*AgentEvent] } ``` -`Agent.Run` 的定义为: - -1. 从入参 AgentInput、AgentRunOption 和可选的 Context Session 中获取任务详情及相关数据 -2. 执行任务,并将执行过程、执行结果写入到 AgentEvent Iterator - -`Agent.Run` 要求 Agent 的实现以 Future 模式异步执行,核心分成三步,具体可参考 ChatModelAgent 中 Run 方法的实现: +`Run` 的语义: -1. 创建一对 Iterator、Generator -2. 启动 Agent 的异步任务,并传入 Generator,处理 AgentInput。Agent 在这个异步任务执行核心逻辑(例如 ChatModelAgent 调用 LLM),并在产生新的事件时写入到 Generator 中,供 Agent 调用方在 Iterator 中消费 -3. 启动 2 中的任务后立即返回 Iterator +1. 从 `AgentInput` 和 Context 中获取任务信息 +2. 异步执行任务,产出的事件写入 `AsyncIterator` +3. 启动异步任务后立即返回 Iterator(Future 模式) -## 多 Agent 协作 +## ChatModelAgent -围绕 Agent 抽象,Eino ADK 提供多种简单易用、场景丰富的组合原语,可支撑开发丰富多样的 Multi-Agent 协同策略,比如 Supervisor、Plan-Execute、Group-Chat 等 Multi-Agent 场景。从而实现不同的 Agent 分工合作模式,处理更复杂的任务。详解请见 [Eino ADK: Agent 组合](/zh/docs/eino/core_modules/eino_adk/agent_collaboration) +ADK 的核心实现。以 ChatModel 为决策器,通过 ReAct Loop 自主推进问题求解。 -Eino ADK 定义的 Agent 协作过程中的协作原语如下: +**ChatModelAgent = ChatModel + Tools + ReAct Loop + Middleware** -- Agent 间协作方式 +详细介绍见:[Eino ADK: ChatModelAgent 介绍](/zh/docs/eino/overview/eino_adk_quickstart) - - - - -
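
| Collaboration mode | Description |
| --- | --- |
| Transfer | Hands the task directly to another Agent; the current Agent then finishes and exits, and does not track the execution status of the Agent it transferred to |
| ToolCall (AgentAsTool) | Invokes an Agent as a ToolCall, waits for its response, and can use the called Agent's output for the next round of processing |
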
    +## 多 Agent 协作 -- AgentInput 的上下文策略 +> 💡 +> 推荐方式:**AgentAsTool** — 将子 Agent 转为 Tool,父 Agent 通过 ToolCall 调用并获取结果。这是最灵活、最可组合的协作模式。 - - - + + +
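
| Context strategy | Description |
| --- | --- |
| Full upstream conversation | Receives the complete conversation history of this Agent's upstream Agent |
| New task description | Ignores the upstream Agent's full conversation history and provides a fresh task summary as the sub-Agent's AgentInput |

| Collaboration mode | Mechanism | Applicable scenarios |
| --- | --- | --- |
| AgentAsTool (recommended) | The sub-Agent is wrapped as a Tool; the parent Agent decides on its own whether to call it | Delegating subtasks, combining capabilities |
| Workflow | Deterministic orchestration with Sequential / Loop / Parallel | Multi-step tasks with a fixed flow |
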
    -- 决策自主性 +详见:[Agent 协作](/zh/docs/eino/core_modules/eino_adk/agent_collaboration) - - - - -
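
| Decision autonomy | Description |
| --- | --- |
| Autonomous decision | Inside the Agent, when help is needed, it chooses one of its available downstream Agents on its own. The decision is usually made by an LLM, but even when it follows preset logic it still counts as autonomous from outside the Agent |
| Preset decision | The Agent that runs after a given Agent finishes its task is fixed in advance. The execution order of Agents is predetermined and predictable |
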
    +## Runner -围绕协作原语,Eino ADK 提供了如下的几种 Agent 组合原语: +Runner 是 Agent 的执行入口。只有通过 Runner 执行时才能使用: - - - - - - - -
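
| Type | Description | Run mode | Collaboration mode | Context strategy | Decision autonomy |
| --- | --- | --- | --- | --- | --- |
| SubAgents | Combines a user-supplied agent as the parent Agent with a user-supplied list of subAgents as child Agents into an Agent that decides autonomously; its Name and Description identify and describe the Agent. Currently an Agent can have only one parent Agent; the SetSubAgents function can be used to build a multi-way tree of Agents; within that tree every AgentName must be unique. | | Transfer | Full upstream conversation | Autonomous decision |
| Sequential | Combines a user-supplied list of SubAgents into a Sequential Agent that runs them one after another in order; its Name and Description identify and describe the Sequential Agent. When it runs, the SubAgents execute in list order, and it finishes once every Agent has run once. | | Transfer | Full upstream conversation | Preset decision |
| Parallel | Combines a user-supplied list of SubAgents into a Parallel Agent that runs them concurrently on the same context; its Name and Description identify and describe the Parallel Agent. When it runs, the SubAgents execute concurrently, and it finishes once all of them have completed. | | Transfer | Full upstream conversation | Preset decision |
| Loop | Combines a user-supplied list of SubAgents into a Loop Agent that runs them in array order, over and over; its Name and Description identify and describe the Loop Agent. When it runs, the SubAgents execute in order, and it finishes once all Agents have completed. | | Transfer | Full upstream conversation | Preset decision |
| AgentAsTool | Converts an Agent into a Tool that other Agents use like an ordinary Tool. Whether an Agent can call other Agents as Tools depends on its own implementation. The ChatModelAgent provided by Eino ADK supports AgentAsTool. | | ToolCall | New task description | Autonomous decision |
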
    - -## ChatModelAgent - -`ChatModelAgent` 是 Eino ADK 对 Agent 的关键实现,它封装了与大语言模型的交互逻辑,实现了 ReAct 范式的 Agent,基于 Eino 中的 Graph 编排出 ReAct Agent 控制流,通过 callbacks.Handler 导出 ReAct Agent 运行过程中产生的事件,转换成 AgentEvent 返回。 - -想要进一步了解 ChatModelAgent,请见:[Eino ADK: ChatModelAgent [New]](/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model) +- **事件流输出**:Query/Run → AsyncIterator[AgentEvent] +- **Checkpoint / Resume**:持久化运行状态,支持中断恢复 +- **TurnLoop**:多轮运行时,Push/Preempt/Stop ```go -type ChatModelAgentConfig struct { - // Name of the agent. Better be unique across all agents. - Name string - // Description of the agent's capabilities. - // Helps other agents determine whether to transfer tasks to this agent. - Description string - // Instruction used as the system prompt for this agent. - // Optional. If empty, no system prompt will be used. - // Supports f-string placeholders for session values in default GenModelInput, for example: - // "You are a helpful assistant. The current time is {Time}. The current user is {User}." - // These placeholders will be replaced with session values for "Time" and "User". - Instruction string - - Model model.ToolCallingChatModel - - ToolsConfig ToolsConfig - - // GenModelInput transforms instructions and input messages into the model's input format. - // Optional. Defaults to defaultGenModelInput which combines instruction and messages. - GenModelInput GenModelInput - - // Exit defines the tool used to terminate the agent process. - // Optional. If nil, no Exit Action will be generated. - // You can use the provided 'ExitTool' implementation directly. - Exit tool.BaseTool - - // OutputKey stores the agent's response in the session. - // Optional. When set, stores output via AddSessionValue(ctx, outputKey, msg.Content). - OutputKey string - - // MaxIterations defines the upper limit of ChatModel generation cycles. - // The agent will terminate with an error if this limit is exceeded. - // Optional. Defaults to 20. 
- MaxIterations int -} +runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + CheckPointStore: store, // 可选 +}) -func NewChatModelAgent(_ context.Context, config *ChatModelAgentConfig) (*ChatModelAgent, error) { - // omit code -} +iter := runner.Query(ctx, "你的问题") ``` -# AgentRunner - -AgentRunner 是 Agent 的执行器,为 Agent 运行所需要的拓展功能加以支持,详解请见:[Eino ADK: Agent 扩展](/zh/docs/eino/core_modules/eino_adk/agent_extension) - -只有通过 Runner 执行 agent 时,才可以使用 ADK 的如下功能: - -- Interrupt & Resume -- 切面机制 -- Context 环境的预处理 - - ```go - type RunnerConfig struct { - Agent Agent - EnableStreaming bool - - CheckPointStore compose.CheckPointStore - } - - func NewRunner(_ context.Context, conf RunnerConfig) *Runner { - // omit code - } - ``` +详见:[Agent Runner 与扩展](/zh/docs/eino/core_modules/eino_adk/agent_extension) | [Agent Cancel 与 TurnLoop](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) diff --git a/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md b/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md index 1b833a6b6e4..f36e3f913d2 100644 --- a/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md +++ b/content/zh/docs/eino/core_modules/eino_adk/agent_quickstart.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-01-30" +date: "2026-05-19" lastmod: "" tags: [] title: Quickstart @@ -9,85 +9,47 @@ weight: 1 # Installation -Eino 自 0.5.0 版本正式提供 ADK 功能供用户使用,您可以在项目中输入下面命令来升级 Eino: +Eino ADK 自 v0.5.0 起可用,v0.9.0 为当前推荐版本: ```go -// stable >= eino@v0.5.0 go get github.com/cloudwego/eino@latest ``` -# Agent +# 核心概念 -### 什么是 Eino ADK +**Eino ADK** 是 Go 语言的 Agent 开发框架。核心原语是 **ChatModelAgent**——以 ChatModel 为决策器、以 Tools 为行动空间、通过 ReAct Loop 自主推进问题求解的智能体。 -Eino ADK 参考 [Google-ADK](https://google.github.io/adk-docs/agents/) 的设计,提供了 Go 语言 的 Agents 开发的灵活组合框架,即 Agent、Multi-Agent 开发框架,并为多 Agent 交互场景沉淀了通用的上下文传递、事件流分发和转换、任务控制权转让、中断与恢复、通用切面等能力。 +> 💡 +> 如果你只读一篇文档,请读:[Eino ADK: ChatModelAgent 
介绍](/zh/docs/eino/overview/eino_adk_quickstart) -### 什么是 Agent - -Agent 是 Eino ADK 的核心,它代表一个独立的、可执行的智能任务单元。你可以把它想象成一个能够理解指令、执行任务并给出回应的“智能体”。每个 Agent 都有明确的名称和描述,使其可以被其他 Agent 发现和调用。 - -任何需要与大语言模型(LLM)交互的场景都可以抽象为一个 Agent。例如: - -- 一个用于查询天气信息的 Agent。 -- 一个用于预定会议的 Agent。 -- 一个能够回答特定领域知识的 Agent。 - -### Eino ADK 中的 Agent - -Eino ADK 中的所有功能设计均围绕 Agent 抽象设计展开: - -```go -type Agent interface { - Name(ctx context.Context) string - Description(ctx context.Context) string - Run(ctx context.Context, input *AgentInput) *AsyncIterator[*AgentEvent] -} -``` - -基于 Agent 抽象,ADK 提供了三大类基础拓展: - -- `ChatModel Agent`: 应用程序的“思考”部分,利用 LLM 作为核心,理解自然语言,进行推理、规划、生成响应,并动态决定如何执行或使用哪些工具。 -- `Workflow Agents`:应用程序的协调管理部分,基于预定义的逻辑,按照自身类型(顺序 / 并发 / 循环)控制子 Agent 执行流程。Workflow Agents 产生确定性的,可预测的执行模式,不同于 ChatModel Agent 生成的动态随机的决策。 - - 顺序 (Sequential Agent):按顺序依次执行子 Agents - - 循环 (Loop Agent):重复执行子 Agents,直至满足特定的终止条件 - - 并行 (Parallel Agent):并行执行多个子 Agents -- `Custom Agent`:通过接口实现自己的 Agent,允许定义高度定制的复杂 Agent - -基于基础扩展,您可以针对自己的需求排列组合这些基础 Agents,构建所需要的 Multi-Agent 系统。另外,Eino 从日常实践经验出发,内置提供了几种开箱即用的 Multi-Agent 最佳范式: - -- Supervisor: 监督者模式,监督者 Agent 控制所有通信流程和任务委托,并根据当前上下文和任务需求决定调用哪个 Agent。 -- Plan-Execute:计划-执行模式,Plan Agent 生成含多个步骤的计划,Execute Agent 根据用户 query 和计划来完成任务。Execute 后会再次调用 Plan,决定完成任务 / 重新进行规划。 - -下方表格和图提供了这些基础拓展与封装的特点,区别,与关系。后续章节中将展开介绍每种类型的原理与细节: +## 组件地图 - - - - + + + + + +
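
| Category | ChatModel Agent | Workflow Agents | Custom Logic | EinoBuiltInAgent (supervisor, plan-execute) |
| --- | --- | --- | --- | --- |
| Function | Thinking, generation, tool calling | Controls the execution flow between Agents | Runs custom logic | Ready-to-use wrappers for Multi-agent patterns |
| Core | LLM | Predetermined execution flow (sequential, parallel, loop) | Custom code | A high-level wrapper over the first three, based on experience from Eino practice |
| Purpose | Generation, dynamic decision-making | Structured processing, orchestration | Custom requirements | Out-of-the-box use in specific scenarios |

| Component | Responsibility | Documentation |
| --- | --- | --- |
| ChatModelAgent | ReAct Loop: reason → act → feedback, autonomous decision-making | ChatModelAgent Introduction |
| Middleware | Injects behavior at lifecycle points of the ReAct Loop (compression, search, retry, etc.) | ChatModelAgentMiddleware |
| Runner | Entry point for a single Agent run: Query / Run → event stream | Agent Runner and Extensions |
| TurnLoop | Multi-turn runtime: Push / Preempt / Stop + declarative checkpoint/resume | Agent Cancel and TurnLoop |
| DeepAgents | Prebuilt Agent: task planning (PlanTask) + subtask delegation (TaskTool) | DeepAgents |
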
    - +## 其他 Agent 类型 -# ADK Examples +除 ChatModelAgent 外,ADK 还提供确定性编排原语: -[Eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk) 项目中提供了多种 ADK 的实施样例,您可以参考样例代码与简介,对 adk 能力构建初步的认知: +- **Workflow Agents**:Sequential / Loop / Parallel Agent,用于预定义流程的结构化编排。 +- **Custom Agent**:实现 `Agent` 接口即可接入框架。 - - - - - - - - - -
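
**Sequential workflow example**: shows a sequentially executed multi-agent workflow built on the Workflow mode of eino adk.

- Sequential workflow construction: adk.NewSequentialAgent creates a sequential agent named ResearchAgent that contains two sub-agents (SubAgents), PlanAgent and WriterAgent, responsible for drawing up the research plan and writing the report respectively.
- Clear sub-agent responsibilities: PlanAgent receives the research topic and produces a detailed, logically clear research plan; WriterAgent writes a well-structured academic report from that plan.
- Chained input and output: the research plan output by PlanAgent is the input to WriterAgent, forming a clear upstream-to-downstream data flow that reflects the sequential dependency between business steps.

**Loop workflow example**: uses LoopAgent from the Workflow mode of eino adk to build a reflective, iterative agent framework.

- Iterative reflection framework: adk.NewLoopAgent creates ReflectionAgent, which contains two sub-agents, MainAgent and CritiqueAgent, and supports up to 5 iterations, forming a closed loop of task solving and critical feedback.
- Main agent (MainAgent): generates an initial solution for the user's task, aiming for an accurate and complete answer.
- Critique agent (CritiqueAgent): reviews the quality of the main agent's output and feeds back suggestions for improvement; if the result is satisfactory, it ends the loop and provides a final summary.
- Loop mechanism: uses LoopAgent's iteration to refine the solution over multiple rounds of reflection, improving output quality and accuracy.

**Parallel workflow example**: uses ParallelAgent from the Workflow mode of eino adk to build a concurrent information-gathering framework.

- Concurrent execution framework: adk.NewParallelAgent creates DataCollectionAgent, which contains several information-collection sub-agents.
- Sub-agent responsibilities: each sub-agent collects and analyzes information from one channel; they do not need to interact, and their functional boundaries are clear.
- Concurrent execution: the Parallel Agent can start collection tasks from multiple data sources at once, which is significantly more efficient than serial processing.

**supervisor**: uses a single-layer Supervisor to manage two fairly general-purpose sub-Agents. Research Agent handles retrieval tasks and Math Agent handles several math operations (addition, multiplication, division); all math operations are handled inside the same Math Agent rather than being split across multiple sub-Agents. This design simplifies the agent hierarchy and suits scenarios where tasks are fairly concentrated and do not need fine-grained decomposition, making it quick to deploy and maintain.

**layered-supervisor**: implements a multi-level agent supervision hierarchy. The top-level Supervisor manages Research Agent and Math Agent, and Math Agent is further divided into three sub-Agents: Subtract, Multiply and Divide. The top-level Supervisor assigns research and math tasks to the lower-level Agents, and Math Agent, acting as a mid-level supervisor, dispatches specific math operations to its sub-Agents.

- Multi-level agent structure: a top-level Supervisor Agent manages two sub-agents, Research Agent (information retrieval) and Math Agent (math operations).
- Math Agent is further divided into three sub-agents, Subtract Agent, Multiply Agent and Divide Agent, which handle subtraction, multiplication and division respectively, demonstrating multi-level supervision and task delegation.
- This layered management structure reflects fine-grained decomposition of complex tasks and multi-level delegation, and suits scenarios with clearly categorized tasks and complex computation.

**plan-execute example**: uses eino adk to implement a multi-Agent travel planning system in the plan-execute-replan pattern. Its core function is to handle complex travel requests (for example, "a 3-day trip to Beijing, with flights from New York, hotel recommendations and must-see attractions") through a plan, execute, replan loop:

1. Plan: `Planner Agent` generates a step-by-step plan with a large model (for example, "step 1: check the weather in Beijing; step 2: search for flights from New York to Beijing").
2. Execute: `Executor Agent` calls mock tools such as weather (get_weather), flights (search_flights), hotels (search_hotels) and attractions (search_attractions) to carry out each step; if the user's input is missing information (for example, no budget is given), it calls the `ask_for_clarification` tool to ask a follow-up question.
3. Replan: `Replanner Agent` evaluates from the tool results whether the plan needs adjusting (for example, choosing another date if no flight tickets are available). Execute and Replan keep looping until every step of the plan is complete.
4. Session trace tracking (CozeLoop callbacks) and state management are supported, and the final output is a complete travel plan.

Structurally, plan-execute-replan has two layers:

- The second layer is a loop agent made up of the execute and replan agents: after a replan, execution may be needed again (querying travel information again, or asking the user for further clarification).
- The first layer is a sequential agent made up of the plan agent and the second-layer loop agent: plan runs only once and then hands over to the loop agent.

**Book recommendation agent (interrupt and resume)**: shows a book recommendation chat agent built with the eino adk framework, demonstrating Agent interrupt and resume.

- Agent construction: adk.NewChatModelAgent creates a chat agent named BookRecommender that recommends books based on user requests.
- Tool integration: two tools are integrated, BookSearch for searching books and AskForClarification for asking clarifying questions, supporting multi-turn interaction and supplementing information.
- State management: a simple in-memory CheckPoint store supports resuming a session from a checkpoint, preserving context continuity.
- Event-driven: iterates over the event streams from runner.Query and runner.Resume to handle the events and errors produced during execution.
- Custom input: accepts user input dynamically and passes the new query in through tool options, driving the task flow flexibly.
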
  • +> 💡 +> Graph(确定性编排)与 Agent(自主决策)是两种不同的 AI 应用形态。当核心问题是"自主决策 + 运行时增强"时,推荐使用 ChatModelAgent。详见 ChatModelAgent 介绍中的"为什么不继续使用 flow/react"。 -# What's Next +# 示例 -经过 Quickstart 概览,您应该对 Eino ADK 与 Agent 有了基础的认知。 +[eino-examples/adk](https://github.com/cloudwego/eino-examples/tree/main/adk) 提供了完整的 ADK 示例代码: -接下来的文章将深入介绍 ADK 的核心概念,助您理解 Eino ADK 的工作原理并更好的使用它: +- **ChatModelAgent 入门**:[chatmodel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/chatmodel) — 书籍推荐 Agent,含中断与恢复 +- **DeepAgents**:[deep](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) — 任务规划 + 子任务委派 +- **Workflow**:[sequential](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/sequential) / [loop](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/loop) / [parallel](https://github.com/cloudwego/eino-examples/tree/main/adk/intro/workflow/parallel) +- **Multi-Agent**:[supervisor](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/supervisor) / [plan-execute](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/plan-execute-replan) - +# What's Next diff --git a/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md b/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md index 761f9bf5116..b1686c4ed49 100644 --- a/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md +++ b/content/zh/docs/eino/core_modules/flow_integration_components/react_agent_manual.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-17" lastmod: "" tags: [] title: ReAct Agent 使用手册 diff --git a/content/zh/docs/eino/ecosystem_integration/_index.md b/content/zh/docs/eino/ecosystem_integration/_index.md index dd106a516b6..07e32c60ada 100644 --- a/content/zh/docs/eino/ecosystem_integration/_index.md +++ b/content/zh/docs/eino/ecosystem_integration/_index.md @@ -1,67 +1,10 @@ --- Description: "" -date: "2026-01-20" +date: 
"2026-05-17" lastmod: "" tags: [] title: 组件集成 -weight: 6 +weight: 5 --- -## 组件集成 -### ChatModel - -- openai: [ChatModel - OpenAI](https://github.com/cloudwego/eino-ext/blob/main/components/model/openai/README.md) -- ark: [ChatModel - ARK](https://github.com/cloudwego/eino-ext/blob/main/components/model/ark/README.md) -- ollama: [ChatModel - Ollama](https://github.com/cloudwego/eino-ext/blob/main/components/model/ollama/README.md) - -### Document - -#### Loader - -- file: [Loader - local file](/zh/docs/eino/ecosystem_integration/document/loader_local_file) -- s3: [Loader - amazon s3](/zh/docs/eino/ecosystem_integration/document/loader_amazon_s3) -- web url: [Loader - web url](/zh/docs/eino/ecosystem_integration/document/loader_web_url) - -#### Parser - -- html: [Parser - html](/zh/docs/eino/ecosystem_integration/document/parser_html) -- pdf: [Parser - pdf](/zh/docs/eino/ecosystem_integration/document/parser_pdf) - -#### Transformer - -- markdown splitter: [Splitter - markdown](/zh/docs/eino/ecosystem_integration/document/splitter_markdown) -- recursive splitter: [Splitter - recursive](/zh/docs/eino/ecosystem_integration/document/splitter_recursive) -- semantic splitter: [Splitter - semantic](/zh/docs/eino/ecosystem_integration/document/splitter_semantic) - -### Embedding - -- ark: [Embedding - ARK](/zh/docs/eino/ecosystem_integration/embedding/embedding_ark) -- openai: [Embedding - OpenAI](/zh/docs/eino/ecosystem_integration/embedding/embedding_openai) - -### Indexer - -- volc vikingdb: [Indexer - volc VikingDB](/zh/docs/eino/ecosystem_integration/indexer/indexer_volc_vikingdb) -- Milvus 2.5+: [Indexer - Milvus 2 (v2.5+)](/zh/docs/eino/ecosystem_integration/indexer/indexer_milvusv2) -- Milvus 2.4: [Indexer - Milvus](/zh/docs/eino/ecosystem_integration/indexer/indexer_milvus) -- OpenSearch 3: [Indexer - OpenSearch 3](/zh/docs/eino/ecosystem_integration/indexer/indexer_opensearch3) -- OpenSearch 2: [Indexer - OpenSearch 
2](/zh/docs/eino/ecosystem_integration/indexer/indexer_opensearch2) -- ElasticSearch 9: [Indexer - Elasticsearch 9](/zh/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch9) -- Elasticsearch 8: [Indexer - ES8](/zh/docs/eino/ecosystem_integration/indexer/indexer_es8) -- ElasticSearch 7: [Indexer - Elasticsearch 7 ](/zh/docs/eino/ecosystem_integration/indexer/indexer_elasticsearch7) - -### Retriever - -- volc vikingdb: [Retriever - volc VikingDB](/zh/docs/eino/ecosystem_integration/retriever/retriever_volc_vikingdb) -- Milvus 2.5+: [Retriever - Milvus 2 (v2.5+) ](/zh/docs/eino/ecosystem_integration/retriever/retriever_milvusv2) -- Milvus 2.4: [Retriever - Milvus](/zh/docs/eino/ecosystem_integration/retriever/retriever_milvus) -- OpenSearch 3: [Retriever - OpenSearch 3](/zh/docs/eino/ecosystem_integration/retriever/retriever_opensearch3) -- OpenSearch 2: [Retriever - OpenSearch 2](/zh/docs/eino/ecosystem_integration/retriever/retriever_opensearch2) -- ElasticSearch 9: [Retriever - Elasticsearch 9](/zh/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch9) -- ElasticSearch 8: [Retriever - ES8](/zh/docs/eino/ecosystem_integration/retriever/retriever_es8) -- ElasticSearch 7: [Retriever - ES 7](/zh/docs/eino/ecosystem_integration/retriever/retriever_elasticsearch7) - -### Tools - -- googlesearch: [Tool - Googlesearch](/zh/docs/eino/ecosystem_integration/tool/tool_googlesearch) -- duckduckgo search: [Tool - DuckDuckGoSearch](/zh/docs/eino/ecosystem_integration/tool/tool_duckduckgo_search) diff --git a/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md b/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md index 429009520f3..1189ebfa9fa 100644 --- a/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md +++ b/content/zh/docs/eino/ecosystem_integration/chat_model/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-02" +date: "2026-05-19" lastmod: "" tags: [] title: ChatModel @@ -31,3 +31,24 @@ weight: 1 - 
上述链接直接指向 GitHub 仓库的最新文档 - 中文文档和英文文档内容同步更新 - 如需查看历史版本或提交文档修改建议,请访问 GitHub 仓库 + +# AgenticModel 组件列表 + +本分类的各组件详细文档请参考 GitHub README: + + + + + + + + +

| Component | Chinese docs | English docs |
| --- | --- | --- |
| AgenticARK | README.zh_CN.md | README.md |
| AgenticDeepSeek | README_zh.md | README.md |
| AgenticOpenAI | README.zh_CN.md | README.md |
| AgenticGemini | README.zh_CN.md | README.md |
| AgenticQwen | README_zh.md | README.md |

    + +--- + +**说明**: + +- 上述链接直接指向 GitHub 仓库的最新文档 +- AgenticModel 是面向 Agentic 场景的模型接口,支持 Server Tools、MCP Tools、前缀缓存等高级能力 +- 如需查看历史版本或提交文档修改建议,请访问 GitHub 仓库 diff --git a/content/zh/docs/eino/overview/_index.md b/content/zh/docs/eino/overview/_index.md index 186b15725bf..88b5fc6ae0d 100644 --- a/content/zh/docs/eino/overview/_index.md +++ b/content/zh/docs/eino/overview/_index.md @@ -350,7 +350,7 @@ Eino 框架由几个部分组成: - [Eino Devops](https://github.com/cloudwego/eino-ext/tree/main/devops):可视化开发、可视化调试等。 - [EinoExamples](https://github.com/cloudwego/eino-examples):是包含示例应用程序和最佳实践的代码仓库。 -详见:[Eino 框架结构说明](/zh/docs/eino/overview/eino_框架结构说明) +详见:[Eino 框架结构说明](/zh/docs/eino/overview/eino_architecture) ## 详细文档 diff --git a/content/zh/docs/eino/overview/eino_adk_quickstart.md b/content/zh/docs/eino/overview/eino_adk_quickstart.md new file mode 100644 index 00000000000..0409bda79b6 --- /dev/null +++ b/content/zh/docs/eino/overview/eino_adk_quickstart.md @@ -0,0 +1,255 @@ +--- +Description: "" +date: "2026-05-17" +lastmod: "" +tags: [] +title: 五分钟上手 Eino ADK +weight: 9 +--- + +本文面向已了解 Eino 的开发者,聚焦 ADK 中最重要的自主决策原语:**ChatModelAgent** 及其运行时增强机制 **ChatModelAgentMiddleware**。 + +## 先认识 ChatModelAgent + +当我们谈论 "Agent" 时,绝大多数时候指的是:以大模型为核心,配备工具,能够自主决策并解决复杂现实问题的实体。`ChatModelAgent` 就是 Eino ADK 对这一概念的直接实现。 + +**ChatModelAgent = 以 ChatModel 作为决策器、以 Tools 作为行动空间、以工具反馈和历史记录作为下一轮决策上下文的 ReAct Agent。** + +四个关键部分: + +1. **ChatModel**:大模型,负责推理与决策。 +2. **Tools**:工具集合,定义 Agent 可执行的行动范围。 +3. **反馈**:工具执行结果回到模型上下文,成为下一轮决策的依据。 +4. **历史记录**:完整保留问题求解过程中的推理轨迹、工具调用和工具结果。 + +因此,`ChatModelAgent` 不是一次模型调用,而是一次可持续推进的问题求解过程。 + +## ChatModelAgent 的执行结构:ReAct Loop + +`ChatModelAgent` 的核心能力是**自主决策**——在一次 `Run` 中,模型可以反复推理、行动、获取反馈,直到问题被解决。支撑这种能力的执行结构就是 ReAct Loop。 + +自主决策需要四个要素同时存在: + +1. **决策器(ChatModel)**:每一轮根据当前上下文,判断下一步该做什么。 +2. **行动空间(Tools)**:定义 Agent 能采取的具体行动。 +3. **反馈信号(Tool Feedback)**:行动的结果被注入上下文,成为后续决策的依据——这使 Agent 能根据真实执行结果修正方向,而不是一次猜测到底。 +4. 
**累积上下文(History)**:完整保留推理轨迹、工具调用与工具结果。每一轮模型看到的不是独立的单次提问,而是从问题开始到当前为止的完整求解过程。 + +这四者缺一不可:没有决策器就无法推理,没有行动空间就无法执行,没有反馈就无法修正,没有累积上下文就无法基于历史做出更好的判断。 + + + +关键特征:**累积上下文驱动的渐进式决策**。每一轮循环不是从零开始,而是在此前所有推理与行动的完整轨迹之上继续推进。模型的每一次决策都基于不断增长的问题求解上下文做出,这让 Agent 能处理需要多步推理、试错、修正的复杂任务。 + +## 什么让你的 ChatModelAgent 不同 + +ReAct Loop 的结构是固定的。那什么让**你的** ChatModelAgent 有别于其他人的,能针对你的具体问题? + +四个维度: + +1. **ChatModel** — 选择哪个模型做决策。 +2. **Instruction** — 系统指令:角色定义、行为约束、少样本示例。 +3. **Tools** — 工具集合:决定 Agent 可以做什么。 +4. **Middleware(ChatModelAgentMiddleware)** — 在 ReAct Loop 的特定生命周期点位上注入行为:拦截、修改、增强循环中的输入和输出。 + +前三者定义了 Agent "是什么"——决策能力、角色约束、行动范围。 + +Middleware 定义了 Agent "怎么跑"——它不改变 Loop 的结构(推理 → 行动 → 反馈始终不变),而是控制循环运行时的具体行为。例如:模型调用前压缩上下文、运行前动态注入工具、工具调用时做权限检查、模型失败时重试或切换备用模型。这些都是在 Loop 的特定点位上做的运行时增强。 + +## Middleware:在 ReAct Loop 中注入行为 + +构建 ChatModelAgent 时,你会遇到这些典型问题: + +- **Agent 需要读写文件、执行命令?** → 需要在运行前注入一组通用工具。 +- **Agent 需要复用一组预定义的指令和知识?** → 需要把可复用能力打包成 Skill,按需加载。 +- **上下文越来越长,超出模型窗口怎么办?** → 需要在每次模型调用前自动压缩历史。 +- **工具太多,全部塞进 prompt 会稀释注意力?** → 需要按需搜索和加载工具。 +- **模型偶尔调用失败或返回垃圾?** → 需要自动重试或切换备用模型。 + +这些需求的共同点:它们不需要改变 ReAct Loop 的结构,只需要在循环的特定点位上拦截和增强。这就是 Middleware 做的事。 + +对应的内置 Middleware: + + + + + + + + +
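
| Scenario | Middleware | What it does |
| --- | --- | --- |
| File system capabilities are needed | FileSystem | Injects tools such as ls/read/write/edit/grep/execute before the run |
| Reusing predefined capabilities | Skill | Packages instructions, knowledge and tools into skill units loaded on demand |
| Context exceeds the window | Reduction / Summarization | Compresses messages and tool results before each model call |
| Too many tools | ToolSearch | Searches for and loads Tools on demand instead of exposing all of them at once |
| Unstable model calls | ModelRetry / ModelFailover | Retries / fails over at the level of a single model call |
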
    + +每个 Middleware 的实现,都是在 ReAct Loop 的某个钩子点位上做注入。下图展示了 `ChatModelAgentMiddleware` 的各个钩子在循环中的位置: + + + +对应的钩子点位总结: + + + + + + + + + +
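
| Hook | Timing | Typical use |
| --- | --- | --- |
| `BeforeAgent` | Before the Agent runs (once only) | Enhance the Instruction, inject Tools |
| `BeforeModelRewriteState` | Before every model call | Modify Messages / ToolInfos |
| `AfterModelRewriteState` | After every model call | Modify the model response or patch state |
| `WrapModel` | Around a single model call | Retry, failover, rewriting the model's return value |
| `WrapToolCall` | Around a single tool call | Permissions, safety, output rewriting |
| `AfterAgent` | After the Agent finishes successfully | Post-processing, state cleanup |
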
    + +完整 Middleware 速查见文末附录。 + +## 快速上手:创建并运行 ChatModelAgent + +`Runner` 是执行 Agent 的入口。它把一次用户请求转化为一次 Agent 运行,负责单次运行配置、事件流输出、流式开关,以及 checkpoint / resume 等运行期能力。最小用法是:把 `ChatModelAgent` 放进 `RunnerConfig`,然后调用 `Query` 或 `Run`。 + +以下示例展示了如何创建一个最简 ChatModelAgent,并通过 Runner 执行: + +```go +package main + +import ( + "context" + "fmt" + "log" + + "github.com/cloudwego/eino-ext/components/model/ark" + "github.com/cloudwego/eino/adk" + "github.com/cloudwego/eino/compose" + "github.com/cloudwego/eino/components/tool" +) + +func main() { + ctx := context.Background() + + // 1. 创建 ChatModel + chatModel, err := ark.NewChatModel(ctx, &ark.ChatModelConfig{ + Model: "doubao-seed-1-8-251228", + APIKey: "your_api_key", // 替换为你的 API Key + }) + if err != nil { + log.Fatal(err) + } + + // 2. 创建 ChatModelAgent + agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ + Name: "my-assistant", + Description: "一个可以使用工具回答问题的助手。", + Instruction: "你是一个有帮助的助手。请根据可用工具回答用户问题。", + Model: chatModel, + ToolsConfig: adk.ToolsConfig{ + ToolsNodeConfig: compose.ToolsNodeConfig{ + Tools: []tool.BaseTool{ + // 注册你的工具,例如 webSearchTool + }, + }, + }, + // Handlers: []adk.ChatModelAgentMiddleware{...}, // 注册 Middleware + }) + if err != nil { + log.Fatal(err) + } + + // 3. 通过 Runner 执行 Agent + runner := adk.NewRunner(ctx, adk.RunnerConfig{ + Agent: agent, + EnableStreaming: true, + }) + + // 4. 
发送用户请求并消费事件流 + iter := runner.Query(ctx, "帮我搜索一下今天的新闻") + for { + event, ok := iter.Next() + if !ok { + break + } + fmt.Println(event) + } +} +``` + +核心流程:`NewChatModelAgent` → `NewRunner` → `Runner.Query/Run` → 消费 `AsyncIterator` 事件流。 + +更多基础示例可参考:[Eino: 快速开始](/zh/docs/eino/quick_start)。 + +## 延伸阅读:DeepAgents + +DeepAgents 是一个预构建的 ChatModelAgent,核心价值在于两个预置 Middleware: + +- **WriteTodos(PlanTask)**:让主 Agent 在执行前显式规划任务列表,并在执行过程中持续追踪进度。复杂问题不再靠模型"一口气想完",而是先拆解、再逐步推进。 +- **TaskTool**:让主 Agent 把子任务委派给子 Agent 执行,子 Agent 独立完成后将结果汇总回主循环。这使得单个 Agent 的能力边界可以通过组合来扩展。 + +此外,DeepAgents 还预置了系统提示词和可选的 FileSystem Middleware,开箱即可处理需要任务规划和多 Agent 协作的场景。 + +``` +DeepAgents = ChatModelAgent + + WriteTodos(任务规划与追踪) + + TaskTool(子任务委派) + + 可选 FileSystem + + 预置系统提示词 +``` + +进一步阅读: + +- Eino ADK Deep Agents 完整指南:[Eino ADK: DeepAgents](/zh/docs/eino/core_modules/eino_adk/agent_implementation/deepagents) +- DeepAgents 示例:[eino-examples/adk/multiagent/deep at main · cloudwego/eino-examples](https://github.com/cloudwego/eino-examples/tree/main/adk/multiagent/deep) + +## 延伸阅读:为什么不继续使用 flow/react? + +回到第一性原理:Graph 和 Agent 是两种本质不同的 AI 应用形态。 + +- **Graph** 的核心是**确定性**:开发者预定义拓扑结构,节点间的流转关系在编译期就已确定。输入是结构化的,输出是可预测的。 +- **Agent** 的核心是**自主性**:LLM 在运行时动态决定下一步行动,执行路径不可预知,输出是全过程事件流。 + +`flow/react` 本质上是用 Graph 的方式来"模拟" Agent——把 ReAct 的推理循环展开为静态的节点和边。这可行,但本质上是一种错位:用确定性编排来承载动态决策。当 Agent 的复杂度增长时,这个错位会产生系统性问题: + +1. **交付物不匹配**:Graph 面向"最终结果",而 Agent 的交付物是全过程(推理轨迹、中间工具调用、状态变化)。用 Graph 做 Agent 时,中间过程只能通过 Callback 等旁路抽取——可行,但属于补丁。 +2. **运行模式不匹配**:Graph 是同步执行模型,而 Agent 天然是异步的长时运行。事件流输出、checkpoint / resume、中断恢复等运行期能力,需要框架在 Agent 维度统一管理,而非散落在 Graph 节点的回调中。 +3. 
**扩展点不匹配**:Agent 的运行时增强(上下文压缩、工具动态加载、模型重试、安全控制)本质上是对决策循环的拦截和注入。在 Graph 中,这些能力没有统一的挂载点,只能散落在各个节点或边上;在 ChatModelAgent 中,它们有明确的生命周期钩子(Middleware)。 + +因此,flow/react 不是被废弃,而是回到它最匹配的位置:**确定性流程编排**。当核心问题是"自主决策 + 运行时增强"时,正确的抽象是 `ChatModelAgent + ChatModelAgentMiddleware`。 + +进一步阅读: + +- Agent 还是 Graph?AI 应用路线辨析:[Agent 还是 Graph?AI 应用路线辨析](/zh/docs/eino/overview/graph_or_agent) + + + +## 附录:Middleware 速查 + +### 实例一览 + + + + + + + + + + + + + + + + + +
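
| Middleware | Description |
| --- | --- |
| Reduction | Truncates overly long tool output or writes it to the file system, preventing token limit overruns |
| Summarization | Compresses message history by summarization |
| Skill | Exposes reusable instructions/knowledge as a Tool that the Agent loads on demand |
| FileSystem | File operation toolset: ls/read/write/edit/glob/grep/execute |
| ToolSearch | `tool_search` meta-tool that searches for and loads tools on demand (reducing the footprint of the resident tool list) |
| PatchToolCall | Patches dangling tool calls in the message history (those missing a tool result) |
| SafeTool | Intercepts tool execution errors at the WrapToolCall level and returns them to the model as readable text, so the Agent can correct itself instead of aborting |
| ModelRetry | Retries according to a policy when a model call fails [built-in config] |
| ModelFailover | Switches to a backup model when a model call fails [built-in config] |
| AgentsMD | Injects the Agents.md knowledge file into the model context to improve context quality |
| PlanTask | Persistent task management toolset (create/get/update/list) with dependency tracking |
| WriteTodos | Lightweight TODO list tool; the Agent can create and track structured to-do items [built into DeepAgent] |
| TaskTool | Sub-Agent delegation tool; the main Agent uses it to hand subtasks to a sub-Agent for independent execution [built into DeepAgent] |
| Permission | Permission control for tool calls [WIP] |
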
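+
+> Note: in code, ModelRetry / ModelFailover are built-in fields of `ChatModelAgentConfig` (`ModelRetryConfig` / `ModelFailoverConfig`) that conceptually correspond to the `WrapModel` hook. SafeTool is an example pattern (see ChatWithEino ch05) implemented as a user-defined Middleware. WriteTodos / TaskTool are built into DeepAgent and are not exported separately. Permission is a planned capability.
+
+### Categories
+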
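+<table>
+<tr><td>Category</td><td>Problem solved</td><td>Includes</td></tr>
+<tr><td>Extending general-purpose Tools</td><td>Gives the Agent more capabilities</td><td>FileSystem, Skill, ToolSearch, PlanTask, WriteTodos, TaskTool</td></tr>
+<tr><td>Handling errors in the ReAct loop</td><td>Improves reliability</td><td>ModelRetry, ModelFailover, SafeTool, PatchToolCall</td></tr>
+<tr><td>Keeping the context window within its limit</td><td>Prevents token overflow</td><td>Reduction, Summarization, ToolSearch</td></tr>
+<tr><td>Security and permissions</td><td>Constrains Agent behavior</td><td>Permission</td></tr>
+<tr><td>Improving context quality</td><td>Lets the model see better context</td><td>Skill, AgentsMD</td></tr>
+</table>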
+
+ToolSearch spans two categories: it both "extends Tools" (providing on-demand tool discovery) and "keeps the context window within its limit" (avoiding loading too many tool descriptions at once).
+
+Further reading:
+
+- ChatModelAgent Middleware in detail: [Eino ADK: ChatModelAgentMiddleware](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware)
diff --git a/content/zh/docs/eino/overview/graph_or_agent.md b/content/zh/docs/eino/overview/graph_or_agent.md
index 96d9c21c323..5aca3312163 100644
--- a/content/zh/docs/eino/overview/graph_or_agent.md
+++ b/content/zh/docs/eino/overview/graph_or_agent.md
@@ -1,6 +1,6 @@
 ---
 Description: ""
-date: "2026-03-24"
+date: "2026-05-17"
 lastmod: ""
 tags: []
 title: Agent 还是 Graph?AI 应用路线辨析
diff --git a/content/zh/docs/eino/quick_start/_index.md b/content/zh/docs/eino/quick_start/_index.md
index 9333db3afd1..a08ae5fe2fb 100644
--- a/content/zh/docs/eino/quick_start/_index.md
+++ b/content/zh/docs/eino/quick_start/_index.md
@@ -1,6 +1,6 @@
 ---
 Description: ""
-date: "2026-03-16"
+date: "2026-05-19"
 lastmod: ""
 tags: []
 title: 快速开始
@@ -69,7 +69,7 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run .
 <table>
 <tr><td>Chapter</td><td>Topic</td><td>Entry</td></tr>
-<tr><td>Chapter 1</td><td>ChatModel and Message (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md</td></tr>
+<tr><td>Chapter 1</td><td>ChatModel and AgenticMessage (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch01_chatmodel_agent_console.md</td></tr>
 <tr><td>Chapter 2</td><td>Agent and Runner (Console, multi-turn)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch02_chatmodel_agent_runner_console.md</td></tr>
 <tr><td>Chapter 3</td><td>Memory and Session (persistent conversations)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch03_memory_session_jsonl.md</td></tr>
 <tr><td>Chapter 4</td><td>Tool and file system access</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch04_tool_backend_filesystem.md</td></tr>
@@ -78,7 +78,8 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run .
 <tr><td>Chapter 7</td><td>Interrupt/Resume (interruption and recovery)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch07_interrupt_resume.md</td></tr>
 <tr><td>Chapter 8</td><td>Graph Tool (complex workflows)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch08_graph_tool.md</td></tr>
 <tr><td>Chapter 9</td><td>Skill (Console)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch09_skill.md</td></tr>
-<tr><td>Final chapter</td><td>A2UI (Web)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md</td></tr>
+<tr><td>Chapter 10</td><td>A2UI (Web)</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch10_a2ui.md</td></tr>
+<tr><td>Chapter 11</td><td>TurnLoop</td><td>https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/docs/ch11_turnloop.md</td></tr>
 </table>
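 ## Final deliverable: an extensible end-to-end Agent application skeleton
diff --git a/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md b/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md
index 4dcb14ee4eb..d3e3bf27276 100644
--- a/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md
+++ b/content/zh/docs/eino/quick_start/chapter_01_chatmodel_and_message.md
@@ -1,6 +1,6 @@
 ---
 Description: ""
-date: "2026-03-24"
+date: "2026-05-19"
 lastmod: ""
 tags: []
 title: 第一章:ChatModel 与 Message(Console)
@@ -56,15 +56,16 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开
 <table>
 <tr><td>Chapter</td><td>Topic</td><td>Core content</td><td>Capability gained</td></tr>
-<tr><td>Chapter 1</td><td>ChatModel and Message</td><td>Understand the Component abstraction and implement a single-turn conversation</td><td>Basic conversation</td></tr>
-<tr><td>Chapter 2</td><td>Agent and Runner</td><td>Introduce the execution abstraction and implement multi-turn conversation</td><td>Conversation management</td></tr>
-<tr><td>Chapter 3</td><td>Memory and Session</td><td>Persist conversation history and support session recovery</td><td>Persistence</td></tr>
-<tr><td>Chapter 4</td><td>Tool and file system</td><td>Add file access and read source code</td><td>Tool calling</td></tr>
-<tr><td>Chapter 5</td><td>Middleware</td><td>Middleware mechanism that handles cross-cutting concerns uniformly</td><td>Better extensibility</td></tr>
-<tr><td>Chapter 6</td><td>Callback</td><td>Callback mechanism that monitors Agent execution</td><td>Observability</td></tr>
-<tr><td>Chapter 7</td><td>Interrupt and Resume</td><td>Interruption and recovery, supporting long-running tasks</td><td>Better reliability</td></tr>
-<tr><td>Chapter 8</td><td>Graph and Tool</td><td>Orchestrate complex workflows with Graph</td><td>Complex orchestration</td></tr>
-<tr><td>Chapter 9</td><td>A2UI</td><td>Agent-to-UI integration</td><td>Production-grade application</td></tr>
+<tr><td>Chapter 1</td><td>ChatModel and AgenticMessage</td><td>Understand the Component abstraction and implement a single-turn conversation</td><td>Basic conversation</td></tr>
+<tr><td>Chapter 2</td><td>Agent and Runner</td><td>Introduce the execution abstraction and implement multi-turn conversation</td><td>Conversation management</td></tr>
+<tr><td>Chapter 3</td><td>Memory and Session</td><td>Persist conversation history and support session recovery</td><td>Persistence</td></tr>
+<tr><td>Chapter 4</td><td>Tool and file system</td><td>Add file access and read source code</td><td>Tool calling</td></tr>
+<tr><td>Chapter 5</td><td>Middleware</td><td>Middleware mechanism that handles cross-cutting concerns uniformly</td><td>Better extensibility</td></tr>
+<tr><td>Chapter 6</td><td>Callback</td><td>Callback mechanism that monitors Agent execution</td><td>Observability</td></tr>
+<tr><td>Chapter 7</td><td>Interrupt and Resume</td><td>Interruption and recovery, supporting long-running tasks</td><td>Better reliability</td></tr>
+<tr><td>Chapter 8</td><td>Graph and Tool</td><td>Orchestrate complex workflows with Graph</td><td>Complex orchestration</td></tr>
+<tr><td>Chapter 9</td><td>Skill</td><td>Load and reuse skill documents with the Skill middleware</td><td>Knowledge reuse</td></tr>
+<tr><td>Final chapter</td><td>A2UI</td><td>Agent-to-UI integration</td><td>Production-grade application</td></tr>
 </table>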
    **为什么这样设计?** @@ -77,7 +78,7 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开 --- -本章目标:理解 Eino 的 Component 抽象,用最小代码调用一次 ChatModel(支持流式输出),并掌握 `schema.Message` 的基本用法。 +本章目标:理解 Eino 的 Component 抽象,用最小代码调用一次 ChatModel(支持流式输出),并掌握如何用 `schema.AgenticMessage` 组织模型输入和流式输出。 ## 代码位置 @@ -88,11 +89,12 @@ ChatWithEino 是一个基于 Eino 框架构建的智能助手,能够帮助开 Eino 定义了一组 Component 接口(`ChatModel`、`Tool`、`Retriever`、`Loader` 等),每个接口描述一类可替换的能力: ```go -type BaseChatModel interface { - Generate(ctx context.Context, input []*schema.Message, opts ...Option) (*schema.Message, error) - Stream(ctx context.Context, input []*schema.Message, opts ...Option) ( - *schema.StreamReader[*schema.Message], error) +type BaseModel[M any] interface { + Generate(ctx context.Context, input []M, opts ...Option) (M, error) + Stream(ctx context.Context, input []M, opts ...Option) (*schema.StreamReader[M], error) } + +type AgenticModel = BaseModel[*schema.AgenticMessage] ``` **接口带来的好处:** @@ -103,15 +105,29 @@ type BaseChatModel interface { 本章只涉及 `ChatModel`,后续章节会逐步引入 `Tool`、`Retriever` 等 Component。 -## schema.Message:对话的基本单位 +本示例代码默认使用 `model.AgenticModel`,也就是 `model.BaseModel[*schema.AgenticMessage]`。这样后续章节可以在同一套消息结构里表达文本、reasoning、工具调用、工具结果等内容。 + +## schema.AgenticMessage:对话的基本单位 -`Message` 是 Eino 里对话数据的基本结构: +`AgenticMessage` 是本 Quickstart 使用的对话数据结构: + +在一次模型调用中,模型可能会返回多个有序事件,例如先输出 `reasoning`,再调用 server tool,随后继续 `reasoning`,接着调用 function tool。`AgenticMessage` 会用 `ContentBlock` 按顺序保存这些结构化事件。 ```go -type Message struct { - Role RoleType // system / user / assistant / tool - Content string // 文本内容 - ToolCalls []ToolCall // 仅 assistant 消息可能有 +type AgenticMessage struct { + Role AgenticRoleType + ContentBlocks []*ContentBlock + ResponseMeta *AgenticResponseMeta + Extra map[string]any +} + +type ContentBlock struct { + Type ContentBlockType + Reasoning *Reasoning + UserInputText *UserInputText + AssistantGenText *AssistantGenText + FunctionToolCall *FunctionToolCall + FunctionToolResult *FunctionToolResult // ... 
} ``` @@ -119,18 +135,23 @@ type Message struct { 常用构造函数: ```go -schema.SystemMessage("You are a helpful assistant.") -schema.UserMessage("What is the weather today?") -schema.AssistantMessage("I don't know.", nil) // 第二个参数是 ToolCalls -schema.ToolMessage("tool result", "call_id") +schema.SystemAgenticMessage("You are a helpful assistant.") +schema.UserAgenticMessage("What is the weather today?") + +&schema.AgenticMessage{ + Role: schema.AgenticRoleTypeAssistant, + ContentBlocks: []*schema.ContentBlock{ + schema.NewContentBlock(&schema.AssistantGenText{Text: "I don't know."}), + }, +} ``` **角色语义:** -- `system`:系统指令,通常放在 messages 最前面 +- `system`:系统指令,通常放在消息列表最前面 - `user`:用户输入 - `assistant`:模型回复 -- `tool`:工具调用结果(后续章节涉及) +- 工具调用和工具结果通过 `function_tool_call` / `function_tool_result` content block 表达(后续章节涉及) ## 前置条件 @@ -181,42 +202,47 @@ go run ./cmd/ch01 -- "用一句话解释 Eino 的 Component 设计解决了什 按执行顺序: -1. **创建 ChatModel**:根据 `MODEL_TYPE` 环境变量选择 OpenAI 或 Ark 实现 -2. **构造输入 messages**:`SystemMessage(instruction)` + `UserMessage(query)` -3. **调用 Stream**:所有 ChatModel 实现都必须支持 `Stream()`,返回 `StreamReader[*Message]` +1. **创建 ChatModel**:根据 `MODEL_TYPE` 环境变量选择 OpenAI 或 Ark 的 agentic model +2. **构造输入 messages**:通过 `msgops.NewSystem[M]` / `msgops.NewUser[M]` 创建 `AgenticMessage` +3. **调用 Stream**:使用 `model.BaseModel[M].Stream()`,返回 `StreamReader[M]` 4. 
**打印结果**:迭代 `StreamReader` 逐帧打印 assistant 回复 -关键代码片段(**注意:这是简化后的代码片段,不能直接运行****,完整代码请参考** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): +关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch01/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch01/main.go)): ```go -// 构造输入 -messages := []*schema.Message{ - schema.SystemMessage(instruction), - schema.UserMessage(query), -} - -// 调用 Stream(所有 ChatModel 都必须实现) -stream, err := cm.Stream(ctx, messages) -if err != nil { - log.Fatal(err) -} -defer stream.Close() +func runTyped[M adk.MessageType](ctx context.Context, instruction, query string) { + cm, err := chatmodel.NewModel[M](ctx) + if err != nil { + log.Fatal(err) + } -for { - chunk, err := stream.Recv() - if errors.Is(err, io.EOF) { - break + messages := []M{ + msgops.NewSystem[M](instruction), + msgops.NewUser[M](query), } + + stream, err := cm.Stream(ctx, messages) if err != nil { log.Fatal(err) } - fmt.Print(chunk.Content) + defer stream.Close() + + for { + frame, err := stream.Recv() + if errors.Is(err, io.EOF) { + break + } + if err != nil { + log.Fatal(err) + } + fmt.Print(msgops.AssistantDeltaText(frame)) + } } ``` ## 本章小结 - **Component 接口**:定义可替换、可组合、可测试的能力边界 -- **Message**:对话数据的基本单位,通过角色区分语义 +- **AgenticMessage**:对话数据的基本单位,通过角色和 content block 区分语义 - **ChatModel**:最基础的 Component,提供 `Generate` 和 `Stream` 两个核心方法 - **实现选择**:通过环境变量或配置切换 OpenAI/Ark 等不同实现,业务代码无需改动 diff --git a/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md b/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md index e7360cf64db..6cf34722257 100644 --- a/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md +++ b/content/zh/docs/eino/quick_start/chapter_02_chatmodelagent_runner_agentevent.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 
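第二章:ChatModelAgent、Runner、AgentEvent(Console 多轮)
@@ -113,12 +113,12 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{
 **ChatModel vs ChatModelAgent: the essential difference**
 
 <table>
-<tr><td>Dimension</td><td>ChatModel</td><td>ChatModelAgent</td></tr>
+<tr><td>Dimension</td><td>ChatModel</td><td>ChatModelAgent</td></tr>
 <tr><td>Positioning</td><td>Component</td><td>Agent</td></tr>
-<tr><td>Interface</td><td><pre>Generate() / Stream()</pre></td><td><pre>Run() -> AsyncIterator[*AgentEvent]</pre></td></tr>
-<tr><td>Output</td><td>Returns the message content directly</td><td>Returns an event stream (messages, control actions, and so on)</td></tr>
-<tr><td>Capability</td><td>Plain model invocation</td><td>Extensible with tools, middleware, interrupt, and so on</td></tr>
-<tr><td>Use case</td><td>Simple conversation scenarios</td><td>Complex agent applications</td></tr>
+<tr><td>Core interface</td><td><pre>Generate()</pre> / <pre>Stream()</pre></td><td><pre>Run() -> AsyncIterator[*AgentEvent]</pre></td></tr>
+<tr><td>Output form</td><td>Returns the message content directly</td><td>Returns an event stream (messages, control actions, and so on)</td></tr>
+<tr><td>Core capability</td><td>Plain large-language-model invocation</td><td>Supports extension with tools, middleware, interrupt, and other capabilities</td></tr>
+<tr><td>Use case</td><td>Simple conversational interaction</td><td>Complex agent application development</td></tr>
 </table>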
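
To make the interface row of the comparison above concrete, here is a minimal standard-library sketch of the two calling styles. Everything in it (`Event`, `Iterator`, `generate`, `run`, `collect`) is a hypothetical stand-in invented for illustration and is not the Eino API; only the `Next()` shape that returns a value and an `ok` flag is borrowed from the iterator loop shown in this document.

```go
package main

import "fmt"

// Event is a hypothetical stand-in for an agent event: either a message
// or a control action. It is not the real adk.AgentEvent type.
type Event struct {
	Kind    string // "message" or "action"
	Payload string
}

// Iterator is a hypothetical pull-style iterator with a Next() (value, ok)
// shape, assumed here to resemble how an event stream is consumed.
type Iterator struct {
	ch <-chan Event
}

// Next blocks until the next event arrives; ok is false once the stream ends.
func (it *Iterator) Next() (Event, bool) {
	ev, ok := <-it.ch
	return ev, ok
}

// generate mimics the component style: one call, one final answer.
func generate(prompt string) string {
	return "echo: " + prompt
}

// run mimics the agent style: the caller receives every intermediate event,
// not only the final answer.
func run(prompt string) *Iterator {
	ch := make(chan Event, 3)
	go func() {
		defer close(ch)
		ch <- Event{Kind: "message", Payload: "thinking about " + prompt}
		ch <- Event{Kind: "message", Payload: generate(prompt)}
		ch <- Event{Kind: "action", Payload: "exit"}
	}()
	return &Iterator{ch: ch}
}

// collect drains an iterator into a slice so the events can be inspected.
func collect(it *Iterator) []Event {
	var out []Event
	for {
		ev, ok := it.Next()
		if !ok {
			return out
		}
		out = append(out, ev)
	}
}

func main() {
	// Component style: the return value is the whole result.
	fmt.Println(generate("hi"))

	// Agent style: the result is the sequence of events.
	for _, ev := range collect(run("hi")) {
		fmt.Printf("%s: %s\n", ev.Kind, ev.Payload)
	}
}
```

The point of the sketch is the shape of the return value: the component call hands back one message, while the agent call hands back a stream in which messages and control actions are interleaved.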
    **为什么需要 ChatModelAgent?** @@ -163,12 +163,12 @@ type Runner struct { 1. **生命周期管理**:Runner 管理 Agent 的启动、恢复、中断等状态 2. **Checkpoint 支持**:配合 `CheckPointStore` 实现中断恢复(后续章节涉及) 3. **统一入口**:提供 `Run()` 和 `Query()` 等便捷方法 -4. **事件流封装**:将 Agent 的事件流转换为可消费的 `AsyncIterator[*AgentEvent]` +4. **事件流封装**:将 Agent 的事件流转换为可消费的 `AsyncIterator[*TypedAgentEvent[M]]` **使用方式:** ```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ Agent: agent, EnableStreaming: true, }) @@ -239,34 +239,45 @@ for { 没有 tools 时,`ChatModelAgent` 在一次 `Run()` 里只会完成一轮模型调用。多轮对话是通过调用侧维护 history 实现的: -1. 用 `history []*schema.Message` 保存累计对话 -2. 每次用户输入:把 `UserMessage` 追加到 history -3. 调用 `runner.Run(ctx, history)` 得到事件流,消费得到 assistant 文本 -4. 把本轮 assistant 文本追加回 history,进入下一轮 +1. 用 `history []M` 保存累计对话,本示例默认 `M` 为 `*schema.AgenticMessage` +2. 每次用户输入:通过 `msgops.NewUser[M]` 追加到 history +3. 调用 `runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history))` 得到事件流,消费得到 assistant 文本 +4. 通过 `msgops.NewAssistant[M]` 把本轮 assistant 文本追加回 history,进入下一轮 **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch02/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch02/main.go)): ```go -history := make([]*schema.Message, 0, 16) +func runTyped[M adk.MessageType](ctx context.Context, instruction string) { + agent, err := adk.NewTypedChatModelAgent[M](ctx, &adk.TypedChatModelAgentConfig[M]{ + Name: "Ch02Agent", + Instruction: instruction, + Model: cm, + }) + if err != nil { + log.Fatal(err) + } -for { - // 1. 
读取用户输入 - line := readUserInput() - if line == "" { - break + runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ + Agent: agent, + EnableStreaming: true, + }) + + history := make([]M, 0, 16) + + for { + line := readUserInput() + if line == "" { + break + } + + history = append(history, msgops.NewUser[M](line)) + events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) + result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) + if err != nil { + log.Fatal(err) + } + history = append(history, msgops.NewAssistant[M](result.AssistantText, nil)) } - - // 2. 追加用户消息到 history - history = append(history, schema.UserMessage(line)) - - // 3. 调用 Runner 执行 Agent - events := runner.Run(ctx, history) - - // 4. 消费事件流,收集 assistant 回复 - content := collectAssistantFromEvents(events) - - // 5. 追加 assistant 消息到 history - history = append(history, schema.AssistantMessage(content, nil)) } ``` diff --git a/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md b/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md index 0eba8441743..9968f9d5fab 100644 --- a/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md +++ b/content/zh/docs/eino/quick_start/chapter_03_memory_and_session.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第三章:Memory 与 Session(持久化对话) @@ -10,11 +10,12 @@ weight: 3 本章目标:实现对话历史的持久化存储,支持跨进程恢复会话。 > **⚠️ 重要说明:业务层概念 vs 框架概念** - +> > 本章介绍的 **Memory、Session、Store 是业务层概念**,**不是 Eino 框架的核心组件**。 - > - +> - **Eino 框架层面**:提供 `adk.Runner`、`adk.NewTypedRunner[M]`、`schema.AgenticMessage` 等基础抽象,框架本身不关心对话历史的存储方式 +> - **业务层层面**:Memory/Session/Store 是本示例项目为了实现持久化对话而设计的业务逻辑,通过组装给 `adk.Runner` 的输入来与 Eino 框架交互 +> > 换句话说,Eino 框架只负责"如何处理消息",而"如何存储消息"完全由业务层决定。本章提供的实现只是一个简单的参考示例,你可以根据自己的业务需求选择完全不同的存储方案(数据库、Redis、云存储等)。 ## 代码位置 @@ -87,7 +88,7 @@ type Session struct { ID string CreatedAt time.Time - messages []*schema.Message // 对话历史 + messages []M // 对话历史,示例默认 M 为 
*schema.AgenticMessage // ... } ``` @@ -120,13 +121,15 @@ type Store struct { 每个 Session 存储为一个 `.jsonl` 文件: ``` -{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z"} -{"role":"user","content":"你好,我是谁?"} -{"role":"assistant","content":"你好!我暂时不知道你是谁..."} -{"role":"user","content":"我叫张三"} -{"role":"assistant","content":"好的,张三,很高兴认识你!"} +{"type":"session","id":"083d16da-...","created_at":"2026-03-11T10:00:00Z","message_kind":"agentic"} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"你好,我是谁?"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"你好!我暂时不知道你是谁..."}}]} +{"role":"user","content_blocks":[{"type":"user_input_text","user_input_text":{"text":"我叫张三"}}]} +{"role":"assistant","content_blocks":[{"type":"assistant_gen_text","assistant_gen_text":{"text":"好的,张三,很高兴认识你!"}}]} ``` +会话默认保存在 `./data/sessions_agentic`;如果需要放到其他目录,可以设置 `SESSION_DIR_AGENTIC`。 + **为什么用 JSONL?** - **简单**:每行一个 JSON 对象,易于读写 @@ -141,7 +144,7 @@ type Store struct { ### 1. 创建 Store ```go -sessionDir := "./data/sessions" +sessionDir := "./data/sessions_agentic" store, err := mem.NewStore(sessionDir) if err != nil { log.Fatal(err) @@ -161,7 +164,7 @@ if err != nil { ### 3. 追加用户消息 ```go -userMsg := schema.UserMessage("你好") +userMsg := msgops.NewUser[M]("你好") if err := session.Append(userMsg); err != nil { log.Fatal(err) } @@ -171,14 +174,17 @@ if err := session.Append(userMsg); err != nil { ```go history := session.GetMessages() -events := runner.Run(ctx, history) -content := collectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} ``` ### 5. 
追加助手消息 ```go -assistantMsg := schema.AssistantMessage(content, nil) +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) if err := session.Append(assistantMsg); err != nil { log.Fatal(err) } @@ -187,6 +193,11 @@ if err := session.Append(assistantMsg); err != nil { **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [cmd/ch03/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch03/main.go)): ```go +store, err := mem.NewStore[M](msgops.DefaultSessionDir(msgops.KindOf[M]())) +if err != nil { + log.Fatal(err) +} + // 创建或恢复 Session session, err := store.GetOrCreate(sessionID) if err != nil { @@ -194,18 +205,21 @@ if err != nil { } // 用户输入 -userMsg := schema.UserMessage(line) +userMsg := msgops.NewUser[M](line) if err := session.Append(userMsg); err != nil { log.Fatal(err) } // 调用 Agent history := session.GetMessages() -events := runner.Run(ctx, history) -content := collectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{}) +if err != nil { + log.Fatal(err) +} // 保存助手回复 -assistantMsg := schema.AssistantMessage(content, nil) +assistantMsg := msgops.NewAssistant[M](result.AssistantText, nil) if err := session.Append(assistantMsg); err != nil { log.Fatal(err) } @@ -217,7 +231,7 @@ if err := session.Append(assistantMsg); err != nil { - **Session 是业务层概念**:由业务代码实现和管理,负责存储和加载对话历史 - **Agent(Runner)是框架层概念**:由 Eino 框架提供,负责处理消息并生成回复 -- **两者的交互点**:业务层通过 `session.GetMessages()` 获取消息列表,传递给 `runner.Run(ctx, history)` 进行处理 +- **两者的交互点**:业务层通过 `session.GetMessages()` 获取消息列表,再通过 `msgops.NormalizeMessagesForModelInput(history)` 生成模型输入,最后传递给 `runner.Run(ctx, messages)` 进行处理 **架构分层:** @@ -281,7 +295,7 @@ if err := session.Append(assistantMsg); err != nil { **框架层 vs 业务层:** -- **Eino 框架层**:提供 `adk.Runner`、`schema.Message` 等基础抽象,不关心消息如何存储 +- **Eino 框架层**:提供 `adk.Runner`、typed runner、`schema.AgenticMessage` 等基础抽象,不关心消息如何存储 - 
**业务层(本章实现)**:Memory/Session/Store 是业务层概念,用于管理对话历史的存储 **业务层概念:** @@ -294,7 +308,7 @@ if err := session.Append(assistantMsg); err != nil { **业务层与框架层的交互:** - 业务层负责存储消息,通过 `session.GetMessages()` 获取消息列表 -- 将消息列表传递给框架层的 `runner.Run(ctx, history)` 进行处理 +- 将消息列表规整为模型输入后,传递给框架层的 `runner.Run(ctx, messages)` 进行处理 - 收集框架层返回的回复,再由业务层保存到存储中 > **💡 提示**:本章的实现只是众多存储方案中的一种简单示例。在实际项目中,你可以根据业务需求选择数据库、Redis、云存储等方案,甚至可以实现更复杂的功能如会话过期清理、搜索、分享等。 diff --git a/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md b/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md index 9f0e38919c1..5db71a20e09 100644 --- a/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md +++ b/content/zh/docs/eino/quick_start/chapter_04_tool_and_filesystem.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第四章:Tool 与文件系统访问 @@ -135,7 +135,7 @@ backend, err := localbk.NewBackend(ctx, &localbk.Config{}) 添加自定义 Tool✅ 手动注册每个 Tool✅ 手动注册或自动注册 文件系统访问(Backend)❌ 需手动创建并注册所有文件工具✅ 一级配置,自动注册 命令执行(StreamingShell)❌ 需手动创建✅ 一级配置,自动注册 -内置任务管理❌✅
    write_todos
    工具 +内置任务管理❌✅ write_todos 工具 支持子 Agent❌✅ @@ -170,7 +170,7 @@ agent, err := deep.New(ctx, &deep.Config{ Name: "Ch04ToolAgent", Description: "ChatWithDoc agent with filesystem access via LocalBackend.", ChatModel: cm, - Instruction: instruction, + Instruction: agentInstruction, Backend: backend, // 提供文件系统操作能力 StreamingShell: backend, // 提供命令执行能力 MaxIteration: 50, @@ -214,7 +214,7 @@ ls $PROJECT_ROOT go run ./cmd/ch04 ``` -**PROJECT_ROOT 说明:** +**PROJECT_ROOT**** 说明:** - **不设置时**:`PROJECT_ROOT` 默认为当前工作目录(`chatwitheino` 所在目录),Agent 只能访问本示例项目的文件。这对于快速试验已足够。 - **设置后**:指向 Eino 核心库根目录,Agent 可以检索 Eino 框架的完整代码库(核心库、扩展库、示例库)。这是 ChatWithEino 的完整使用场景。 diff --git a/content/zh/docs/eino/quick_start/chapter_05_middleware.md b/content/zh/docs/eino/quick_start/chapter_05_middleware.md index 0bae0820e06..2dac5a0fa76 100644 --- a/content/zh/docs/eino/quick_start/chapter_05_middleware.md +++ b/content/zh/docs/eino/quick_start/chapter_05_middleware.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第五章:Middleware(中间件模式) @@ -182,6 +182,9 @@ func (m *safeToolMiddleware) WrapInvokableToolCall( return func(ctx context.Context, args string, opts ...tool.Option) (string, error) { result, err := endpoint(ctx, args, opts...) if err != nil { + if _, ok := compose.IsInterruptRerunError(err); ok { + return "", err + } // 将错误转换为字符串,而不是返回错误 return fmt.Sprintf("[tool error] %v", err), nil } @@ -305,7 +308,8 @@ agent, err := deep.New(ctx, &deep.Config{ MaxRetries: 5, IsRetryAble: func(_ context.Context, err error) bool { return strings.Contains(err.Error(), "429") || - strings.Contains(err.Error(), "Too Many Requests") + strings.Contains(err.Error(), "Too Many Requests") || + strings.Contains(err.Error(), "qpm limit") }, }, }) @@ -409,9 +413,9 @@ agent, _ := deep.New(ctx, &deep.Config{ - - - + + +
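 <table>
 <tr><td>Middleware</td><td>Description</td></tr>
-<tr><td>reduction</td><td>Tool-output reduction: when a tool returns overly long content, it is automatically truncated and offloaded to the file system to prevent context overflow</td></tr>
-<tr><td>summarization</td><td>Automatic conversation-history summarization: when the token count exceeds a threshold, a summary is generated to compress the history</td></tr>
-<tr><td>skill</td><td>Skill-loading middleware that lets the Agent dynamically load and execute predefined skills</td></tr>
+<tr><td>reduction</td><td>Tool-output reduction: when a tool returns overly long content, it is automatically truncated and offloaded to the file system to prevent context overflow</td></tr>
+<tr><td>summarization</td><td>Automatic conversation-history summarization: when the token count exceeds a threshold, a summary is generated to compress the history</td></tr>
+<tr><td>skill</td><td>Skill-loading middleware that lets the Agent dynamically load and execute predefined skills</td></tr>
 </table>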
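 **Middleware chain example:**
diff --git a/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md b/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md
index 0ff7dfadc2d..236bac07914 100644
--- a/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md
+++ b/content/zh/docs/eino/quick_start/chapter_06_callback_and_trace.md
@@ -1,6 +1,6 @@
 ---
 Description: ""
-date: "2026-03-12"
+date: "2026-05-19"
 lastmod: ""
 tags: []
 title: 第六章:Callback 与 Trace(可观测性)
@@ -159,12 +159,12 @@ callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client))
 Callbacks fire at 5 key moments in a component's lifecycle. In the table below, `Timing*` is the Eino-internal constant name (used by the `TimingChecker` interface), and the corresponding Handler interface method is shown to its right:
 
 <table>
-<tr><td>Timing constant</td><td>Handler method</td><td>Trigger point</td><td>Input/Output</td></tr>
-<tr><td><pre>TimingOnStart</pre></td><td><pre>OnStart</pre></td><td>Before the component starts processing</td><td>CallbackInput</td></tr>
-<tr><td><pre>TimingOnEnd</pre></td><td><pre>OnEnd</pre></td><td>After the component returns successfully</td><td>CallbackOutput</td></tr>
-<tr><td><pre>TimingOnError</pre></td><td><pre>OnError</pre></td><td>When the component returns an error</td><td>error</td></tr>
-<tr><td><pre>TimingOnStartWithStreamInput</pre></td><td><pre>OnStartWithStreamInput</pre></td><td>When the component receives streaming input</td><td>StreamReader[CallbackInput]</td></tr>
-<tr><td><pre>TimingOnEndWithStreamOutput</pre></td><td><pre>OnEndWithStreamOutput</pre></td><td>When the component returns streaming output</td><td>StreamReader[CallbackOutput]</td></tr>
+<tr><td>Timing constant</td><td>Handler method</td><td>Trigger point</td><td>Input / Output</td></tr>
+<tr><td>TimingOnStart</td><td>OnStart</td><td>Before the component starts processing</td><td>CallbackInput</td></tr>
+<tr><td>TimingOnEnd</td><td>OnEnd</td><td>After the component returns successfully</td><td>CallbackOutput</td></tr>
+<tr><td>TimingOnError</td><td>OnError</td><td>When the component returns an error</td><td>error</td></tr>
+<tr><td>TimingOnStartWithStreamInput</td><td>OnStartWithStreamInput</td><td>When the component receives streaming input</td><td>StreamReader[CallbackInput]</td></tr>
+<tr><td>TimingOnEndWithStreamOutput</td><td>OnEndWithStreamOutput</td><td>When the component returns streaming output</td><td>StreamReader[CallbackOutput]</td></tr>
 </table>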
    **示例:ChatModel 调用流程** @@ -251,48 +251,26 @@ callbacks.AppendGlobalHandlers(handler) ### 2. 集成 CozeLoop ```go -func setupCozeLoop(ctx context.Context) (*cozeloop.Client, error) { - apiToken := os.Getenv("COZELOOP_API_TOKEN") - workspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") - - if apiToken == "" || workspaceID == "" { - return nil, nil // 未配置则跳过 - } - +// Setup CozeLoop tracing (optional) +// Set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable +cozeloopApiToken := os.Getenv("COZELOOP_API_TOKEN") +cozeloopWorkspaceID := os.Getenv("COZELOOP_WORKSPACE_ID") +if cozeloopApiToken != "" && cozeloopWorkspaceID != "" { client, err := cozeloop.NewClient( - cozeloop.WithAPIToken(apiToken), - cozeloop.WithWorkspaceID(workspaceID), + cozeloop.WithAPIToken(cozeloopApiToken), + cozeloop.WithWorkspaceID(cozeloopWorkspaceID), ) if err != nil { - return nil, err + log.Fatalf("cozeloop.NewClient failed: %v", err) } - - // 注册为全局 Callback + defer func() { + time.Sleep(5 * time.Second) + client.Close(ctx) + }() callbacks.AppendGlobalHandlers(clc.NewLoopHandler(client)) - - return client, nil -} -``` - -### 3. 在 main 中使用 - -```go -func main() { - ctx := context.Background() - - // 设置 CozeLoop(可选) - client, err := setupCozeLoop(ctx) - if err != nil { - log.Printf("cozeloop setup failed: %v", err) - } - if client != nil { - defer func() { - time.Sleep(5 * time.Second) // 等待数据上报 - client.Close(ctx) - }() - } - - // 创建 Agent 并运行... 
+ log.Println("CozeLoop tracing enabled") +} else { + log.Println("CozeLoop tracing disabled (set COZELOOP_API_TOKEN and COZELOOP_WORKSPACE_ID to enable)") } ``` diff --git a/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md b/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md index 87b6bd4b55d..74c12919b19 100644 --- a/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md +++ b/content/zh/docs/eino/quick_start/chapter_07_interrupt_resume.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第七章:Interrupt/Resume(中断与恢复) @@ -173,11 +173,15 @@ func (m *approvalMiddleware) WrapInvokableToolCall( return fmt.Sprintf("tool '%s' disapproved", tCtx.Name), nil } - // 重新中断 - return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ - ToolName: tCtx.Name, - ArgumentsInJSON: storedArgs, - }, storedArgs) + isTarget, _, _ = tool.GetResumeContext[any](ctx) + if !isTarget { + return "", tool.StatefulInterrupt(ctx, &commontool.ApprovalInfo{ + ToolName: tCtx.Name, + ArgumentsInJSON: storedArgs, + }, storedArgs) + } + + return endpoint(ctx, storedArgs, opts...) }, nil } @@ -248,7 +252,7 @@ type CheckPointStore interface { ### 1. 配置 Runner 使用 CheckPointStore ```go -runner := adk.NewRunner(ctx, adk.RunnerConfig{ +runner := adk.NewTypedRunner[M](adk.TypedRunnerConfig[M]{ Agent: agent, EnableStreaming: true, CheckPointStore: adkstore.NewInMemoryStore(), // 内存存储 @@ -258,11 +262,11 @@ runner := adk.NewRunner(ctx, adk.RunnerConfig{ ### 2. 配置 Agent 使用 ApprovalMiddleware ```go -agent, err := deep.New(ctx, &deep.Config{ +agent, err := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ // ... 
其他配置 - Handlers: []adk.ChatModelAgentMiddleware{ - &approvalMiddleware{}, // 添加审批中间件 - &safeToolMiddleware{}, // 将 Tool 错误转换为字符串(中断类错误会继续向上抛出) + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ + newApprovalMiddleware[M](), // 添加审批中间件 + newSafeToolMiddleware[M](), // 将 Tool 错误转换为字符串(中断类错误会继续向上抛出) }, }) ``` @@ -272,22 +276,27 @@ agent, err := deep.New(ctx, &deep.Config{ ```go checkPointID := sessionID -events := runner.Run(ctx, history, adk.WithCheckPointID(checkPointID)) -content, interruptInfo, err := printAndCollectAssistantFromEvents(events) +events := runner.Run(ctx, msgops.NormalizeMessagesForModelInput(history), adk.WithCheckPointID(checkPointID)) +result, err := helpers.PrintAndCollect[M](events, helpers.PrintOptions{ + ShowToolCalls: true, + ShowToolResults: true, + CaptureInterrupt: true, +}) if err != nil { return err } -if interruptInfo != nil { +assistantText := result.AssistantText +if result.InterruptInfo != nil { // 注意:建议使用同一个 stdin reader 同时读取「用户输入」与「审批 y/n」 // 避免审批输入被当成下一轮 you> 的消息 - content, err = handleInterrupt(ctx, runner, checkPointID, interruptInfo, reader) + assistantText, err = handleInterrupt[M](ctx, runner, checkPointID, result.InterruptInfo, reader) if err != nil { return err } } -_ = session.Append(schema.AssistantMessage(content, nil)) +_ = session.Append(msgops.NewAssistant[M](assistantText, nil)) ``` ## Interrupt/Resume 执行流程 diff --git a/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md b/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md index c7562e9d062..2126cb87769 100644 --- a/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md +++ b/content/zh/docs/eino/quick_start/chapter_08_graph_tool.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-12" +date: "2026-05-19" lastmod: "" tags: [] title: 第八章:Graph Tool(复杂工作流) @@ -51,7 +51,7 @@ you> 请帮我分析 RFC6455 文档中关于 WebSocket 握手的部分 **重要说明:本章只是展示 compose/graph/workflow 能力的一角。** -从更大的视角看,Eino 的 `compose` 包提供了非常通用、确定性的编排能力:你可以把任何需要"确定性业务流程"的系统,用 `compose` 的 
Graph/Chain/Workflow 组织成可执行的流水线,并且它能够**原生编排 Eino 的所有 component**(如 ChatModel、Prompt、Tools、Retriever、Embedding、Indexer 等),同时具备完整的 **callback** 体系,以及 **interrupt/resume + checkpoint** 支持。 +从更大的视角看,Eino 的 `compose` 包提供了非常通用、确定性的编排能力:你可以把任何需要“确定性业务流程”的系统,用 `compose` 的 Graph/Chain/Workflow 组织成可执行的流水线,并且它能够**原生编排 Eino 的所有 component**(如 ChatModel、Prompt、Tools、Retriever、Embedding、Indexer 等),同时具备完整的 **callback** 体系,以及 **interrupt/resume + checkpoint** 支持。 **Graph Tool 的定位:** @@ -135,8 +135,8 @@ wf.AddLambdaNode("answer", answerFunc). ```go type Input struct { - FilePath string `json:"file_path" jsonschema:"description=Absolute path to the document"` - Question string `json:"question" jsonschema:"description=The question to answer"` + FilePath string `json:"file_path" jsonschema:"description=Absolute path to the uploaded document file"` + Question string `json:"question" jsonschema:"description=The question to answer from the document"` } type Output struct { @@ -192,23 +192,35 @@ func buildWorkflow(cm model.BaseChatModel) *compose.Workflow[Input, Output] { AddInputWithOptions("chunk", []*compose.FieldMapping{compose.ToField("Chunks")}, compose.WithNoDirectDependency()). AddInputWithOptions(compose.START, []*compose.FieldMapping{compose.MapFields("Question", "Question")}, compose.WithNoDirectDependency()) - // filter: 筛选 top-k + // filter: sort descending by score, keep up to top-3 chunks with score ≥ 3. 
wf.AddLambdaNode("filter", compose.InvokableLambda( func(ctx context.Context, scored []scoredChunk) ([]scoredChunk, error) { sort.Slice(scored, func(i, j int) bool { return scored[i].Score > scored[j].Score }) - // 返回 top-3 - if len(scored) > 3 { - scored = scored[:3] + const maxK = 3 + var top []scoredChunk + for _, c := range scored { + if c.Score < 3 { + break + } + top = append(top, c) + if len(top) == maxK { + break + } } - return scored, nil + return top, nil }, )).AddInput("score") - // answer: 生成答案 + // answer: synthesize a response from top-k chunks, or return a not-found message if empty. wf.AddLambdaNode("answer", compose.InvokableLambda( func(ctx context.Context, in synthIn) (Output, error) { + if len(in.TopK) == 0 { + return Output{ + Answer: fmt.Sprintf("No relevant content found in the document for: %q", in.Question), + }, nil + } return synthesize(ctx, cm, in) }, )). @@ -229,7 +241,9 @@ func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, erro return graphtool.NewInvokableGraphTool[Input, Output]( wf, "answer_from_document", - "Search a large document for relevant content and synthesize an answer.", + "Search a large uploaded document for content relevant to a question and synthesize a "+ + "cited answer from the most relevant passages. "+ + "Use this instead of read_file when the document may be too large to fit in context.", ) } ``` @@ -237,6 +251,7 @@ func BuildTool(ctx context.Context, cm model.BaseChatModel) (tool.BaseTool, erro **关键代码片段(**注意:这是简化后的代码片段,不能直接运行,完整代码请参考** [rag/rag.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/rag/rag.go)): ```go +func BuildTool[M adk.MessageType](ctx context.Context, cm model.BaseModel[M]) (tool.BaseTool, error) { // 构建工作流 wf := compose.NewWorkflow[Input, Output]() @@ -249,6 +264,7 @@ wf.AddLambdaNode("score", scoreFunc). 
// 封装为 Tool return graphtool.NewInvokableGraphTool[Input, Output](wf, "answer_from_document", "...") +} ``` ## Graph Tool 执行流程 diff --git a/content/zh/docs/eino/quick_start/chapter_09_skill_console.md b/content/zh/docs/eino/quick_start/chapter_09_skill_console.md index e66120d0562..aa65ef1b735 100644 --- a/content/zh/docs/eino/quick_start/chapter_09_skill_console.md +++ b/content/zh/docs/eino/quick_start/chapter_09_skill_console.md @@ -1,13 +1,13 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-19" lastmod: "" tags: [] -title: 第九章:Skill(Console) +title: 第九章:Skill Middleware weight: 9 --- -本章目标:在第八章(RAG + Interrupt/Resume + Checkpoint)基础上,引入 `skill` 中间件,让 Agent 可以发现并加载一组可复用的技能文档(`SKILL.md`),并在需要时通过工具调用使用它们。 +本章目标:在第八章(RAG + Interrupt/Resume + Checkpoint)基础上,引入 `skill` 技能包,采用 `skill middleware` 注入和管理 skills,让 Agent 可以发现并加载一组可复用的技能文档(`SKILL.md`),并在需要时通过工具调用使用它们。 ## 代码位置 @@ -17,21 +17,16 @@ weight: 9 ## 前置条件 - 与第一章一致:需要配置一个可用的 ChatModel(OpenAI 或 Ark) -- 准备好 `eino-ext` PR 提供的 skills(`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) +- 准备好 `eino-ext` PR 提供的 skills 文档(`eino-guide` / `eino-component` / `eino-compose` / `eino-agent`) -为什么是这四个? +`skill middleware` 支持各种 skills 的接入。本章仅以 eino 相关的四个 skills 作为示例,演示如何使用 `skill middleware` 接入 skills。为什么是这四个? 
-ChatWithEino 的定位是“帮用户学习 Eino 框架、并尝试用 AI 辅助写 Eino 代码”。这四个 skills 正好覆盖了这个目标所需的关键知识面: - - `eino-guide`:学习入口与导航(从哪里开始、怎么快速跑起来) - `eino-component`:Component 接口与各类实现参考(Model/Embedding/Retriever/Tool/Callback 等) - `eino-compose`:编排与确定性工作流参考(Graph/Chain/Workflow 等) - `eino-agent`:ADK/Agent 相关参考(Agent、Runner、Middleware、Filesystem、Human-in-the-loop 等) +ChatWithEino 的定位是“帮用户学习 Eino 框架、并尝试用 AI 辅助写 Eino 代码”。这四个 skills 文档正好覆盖了这个目标所需的关键知识面。 skills 的来源可以是: - `eino-ext` 仓库本地路径(脚本会自动读取 `/skills/...`) - 或你已安装 skills 的目录(目录下能看到上述四个子目录) ## 从 Graph Tool 到 Skill:为什么需要“技能文档” @@ -42,6 +37,8 @@ skills 的来源可以是: - **Tool** 更像“动作/能力”:读文件、跑 workflow、调用外部系统 - **Skill** 更像“可复用的知识/指令包”:用一组 markdown(`SKILL.md` + `reference/*.md`)描述“如何做某类事” +而 `Skill middleware` 就是负责把 skills 接入 agent。注册 skill middleware 后,Agent 才能通过 `skill` 工具按需读取某个 Skill。 + 简单类比: - **Tool** = “能做什么”(函数/接口) @@ -53,7 +50,7 @@ skills 的来源可以是: ### 1) 同步 eino-ext skills 到本地目录 -为了让 `skill` 中间件可以“发现”这些 skills,需要把它们放到一个统一目录下,并满足扫描约定: +为了让 `skill` middleware 可以“发现”这些 skills,需要把它们放到一个统一目录下,并满足扫描约定: - `EINO_EXT_SKILLS_DIR//SKILL.md` @@ -73,7 +70,8 @@ go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/e ### 2) 启动 Chapter 9 ```bash -EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext go run ./cmd/ch09 +export EINO_EXT_SKILLS_DIR=/absolute/path/to/chatwitheino/skills/eino-ext +go run ./cmd/ch09 ``` 输出示例(节选): @@ -85,11 +83,11 @@ Enter your message (empty line to exit): ## 在 DeepAgent 中启用 Skill -本章的 “Skill 可被调用” 不是自动发生的,你需要在 Agent 构建时把 `skill` 中间件注册进去。核心就是三步: +本章的 “Skill 可被调用” 不是自动发生的,你需要在 Agent 构建时把 `Skill middleware` 注册进去。核心就是三步: 1. 用本地 filesystem backend(本章用 `eino-ext/adk/backend/local`)提供文件读取/Glob 能力 2. 用 `skill.NewBackendFromFilesystem` 把 `EINO_EXT_SKILLS_DIR` 变成一个 Skill Backend -3. 用 `skill.NewMiddleware` 生成中间件,并把它塞进 DeepAgent 的 `Handlers` +3. 
用 `skill.NewTyped[M]` 生成泛型 `Skill middleware`,并把它塞进 DeepAgent 的 `Handlers` **关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****cmd/ch09/main.go****):** @@ -100,15 +98,15 @@ skillBackend, _ := skill.NewBackendFromFilesystem(ctx, &skill.BackendFromFilesys Backend: backend, BaseDir: skillsDir, // = $EINO_EXT_SKILLS_DIR }) -skillMiddleware, _ := skill.NewMiddleware(ctx, &skill.Config{ +skillMiddleware, _ := skill.NewTyped[M](ctx, &skill.TypedConfig[M]{ Backend: skillBackend, }) -agent, _ := deep.New(ctx, &deep.Config{ +agent, _ := deep.NewTyped[M](ctx, &deep.TypedConfig[M]{ ChatModel: cm, Backend: backend, StreamingShell: backend, - Handlers: []adk.ChatModelAgentMiddleware{ + Handlers: []adk.TypedChatModelAgentMiddleware[M]{ skillMiddleware, // ... 其他中间件,比如 approval/safeTool/retry 等 }, @@ -138,5 +136,5 @@ Use the skill tool with skill="eino-guide" and tell me what the entry point is f - 当模型调用 skill 工具时,控制台会打印: - `[tool call] ...` - `[tool result] ...`(对结果做了截断展示) -- 会话保存在 `SESSION_DIR`(默认 `./data/sessions`),支持恢复: +- 会话默认保存在 `./data/sessions_agentic`,支持恢复: - `go run ./cmd/ch09 --session ` diff --git a/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md b/content/zh/docs/eino/quick_start/chapter_10_a2ui_protocol.md similarity index 81% rename from content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md rename to content/zh/docs/eino/quick_start/chapter_10_a2ui_protocol.md index b6ff586670f..346f5a5f8a5 100644 --- a/content/zh/docs/eino/quick_start/chapter_09_a2ui_protocol.md +++ b/content/zh/docs/eino/quick_start/chapter_10_a2ui_protocol.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-16" +date: "2026-05-19" lastmod: "" tags: [] title: 第十章:A2UI 协议(流式 UI 组件) @@ -23,9 +23,7 @@ Eino 更关注“可组合的智能执行与编排能力”,至于“如何呈 ## 代码位置 -- 入口代码:[main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go) -- Agent 构建:[agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go) -- 
服务端路由:[server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) +- 入口代码(Runner 版):[cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go) - A2UI 子集实现:[a2ui/types.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/types.go) - A2UI 事件流转换:[a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go) - 前端页面:[static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html) @@ -39,7 +37,7 @@ Eino 更关注“可组合的智能执行与编排能力”,至于“如何呈 在 `quickstart/chatwitheino` 目录下执行: ```bash -go run . +go run ./cmd/ch10/ ``` 输出示例: @@ -54,9 +52,11 @@ starting server on http://localhost:8080 ```bash go run ./scripts/sync_eino_ext_skills.go -src /path/to/eino-ext -dest ./skills/eino-ext -clean -EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . +EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run ./cmd/ch10/ ``` +会话默认保存在 `./data/sessions_agentic`。 + ## 从文本到 UI:为什么需要 A2UI 前八章我们实现的 Agent 只输出文本,但现代 AI 应用需要更丰富的交互。 @@ -91,7 +91,7 @@ EINO_EXT_SKILLS_DIR="$(pwd)/skills/eino-ext" go run . 
每一行 SSE(`data: {...}`)承载一个 A2UI Message,Message 是一个“信封结构”,每次只会出现一个字段: -**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 a2ui/types.go):** +**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****a2ui/types.go****):** ```go type Message struct { @@ -122,13 +122,13 @@ type Message struct { 最终 Web 版的核心链路是: -- 后端运行 Agent,得到 `*adk.AsyncIterator[*adk.AgentEvent]` +- 后端运行 Agent,得到 `*adk.AsyncIterator[*adk.TypedAgentEvent[M]]` - 把事件流转换为 A2UI JSONL/SSE 流输出给浏览器(见 [a2ui/streamer.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/a2ui/streamer.go)) - 前端解析 SSE 的 `data:` 行并渲染组件树(见 [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html)) ### 服务端路由(高层) -与 A2UI 相关的关键接口(见 [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)): +与 A2UI 相关的关键接口(见 [cmd/ch10/main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/cmd/ch10/main.go)): - `GET /`:返回前端页面 `static/index.html` - `POST /sessions/:id/chat`:返回 SSE 流(A2UI messages),把 Agent 运行结果边跑边渲染到 UI @@ -137,7 +137,7 @@ type Message struct { ### 事件流转换(高层) -服务端把 `Runner.Run(...)` 的事件流交给 `a2ui.StreamToWriter(...)`,后者负责: +服务端把 `Runner.Run(...)` 的事件流交给 `a2ui.StreamToWriter[M](...)`,后者负责: - 对 user/assistant/tool 的输出做拆分 - 把 tool call / tool result 渲染成 “chip 卡片” @@ -148,7 +148,7 @@ type Message struct { - 前端通过 `fetch('/sessions/:id/chat')` 发起请求,然后从 `res.body` 读取流式字节,按行切分并解析 `data: {...}` 的 JSON(见 [static/index.html](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/static/index.html))。 -**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 static/index.html):** +**关键代码片段(注意:这是简化后的代码片段,不能直接运行,完整代码请参考 ****static/index.html****):** ```javascript const res = await fetch(`/sessions/${id}/chat`, { @@ -220,33 +220,8 @@ while (true) { - **流式输出**:后端以 SSE 推送 A2UI JSONL,前端增量渲染组件树 - **事件到 UI**:把 `AgentEvent` 转为 `tool call / tool result / assistant stream` 的可视化输出 -## 系列收尾:这个 Quickstart Agent 的完整愿景 - 
-到本章为止,我们用一个可以实际运行的 Agent 串起了 Eino 的核心能力。你可以把它理解为一个可扩展的“端到端 Agent 应用骨架”: - -- 运行时:Runner 驱动执行,支持流式输出与事件模型 -- 工具层:Filesystem / Shell 等 Tool 能力接入,工具错误可被安全处理 -- 中间件:可插拔的 middleware/handler,用于错误处理、重试、审批等横切能力 -- 可观测:callbacks/trace 能力把关键链路打通,便于调试与线上观测 -- 人机协作:interrupt/resume + checkpoint 支持审批、补参、分支选择等交互式流程 -- 确定性编排:compose(graph/chain/workflow)把复杂业务流程组织为可维护、可复用的执行图 -- 业务交付:像 A2UI 这样的 UI 集成,属于业务层自由选择的一环,用来把 Agent 能力以合适的产品形态呈现给用户 - -你可以在这个骨架上逐步替换/扩展任意环节:模型、工具、存储、工作流、前端渲染协议,而不需要推倒重来。 - -## 扩展思考 - -**其他组件类型:** - -- 图表组件(折线图、柱状图、饼图) -- 地图组件 -- 时间线组件 -- 树形组件 -- 标签页组件 +## 下一步 -**高级功能:** +本章的 `cmd/ch10` 使用 `adk.Runner` 实现了完整的 Web 应用。但 Runner 是"一次性"模型——如果用户在 Agent 回答到一半时发出新问题,Runner 没有内置机制来取消当前执行并切换到新输入。 -- 组件交互(点击、拖拽、输入) -- 条件渲染 -- 组件动画 -- 响应式布局 +下一章将引入 `adk.TurnLoop`,为 Agent 增加 **抢占(Preempt)** 和 **中止(Abort)** 能力。 diff --git a/content/zh/docs/eino/quick_start/chapter_11_turnloop.md b/content/zh/docs/eino/quick_start/chapter_11_turnloop.md new file mode 100644 index 00000000000..f59f6083833 --- /dev/null +++ b/content/zh/docs/eino/quick_start/chapter_11_turnloop.md @@ -0,0 +1,247 @@ +--- +Description: "" +date: "2026-05-19" +lastmod: "" +tags: [] +title: 第十一章:TurnLoop — 抢占、中止与多轮生命周期 +weight: 11 +--- + +上一章我们用 `adk.Runner` 实现了完整的 A2UI Web 应用。它能正常工作,但试试这个场景: + +> 你问 Agent 一个复杂问题,它开始调用工具、生成长回答……但你忽然意识到问错了,想换一个问题。 + +在上一章的 Runner 模式下,你只能等它说完,或者刷新页面丢弃一切。 + +本章引入 `adk.TurnLoop`,让 Agent 支持两个用户侧可感知的新能力:**抢占**和**中止**。 + +## 前置条件 + +与第一章一致:需要配置一个可用的 ChatModel(OpenAI 或 Ark),详见第一章的"前置条件"部分。 + +## 运行 & 体验 + +在 `quickstart/chatwitheino` 目录下执行: + +```bash +go run . +``` + +打开浏览器访问 `http://localhost:8080`,然后试试以下操作: + +### 体验抢占(Preempt) + +1. 发送一个会触发长回答的问题,例如"详细解释一下 Eino 的所有组件" +2. **在 Agent 还在回答时**,直接发送一条新消息,例如"算了,就告诉我 ChatModel 是什么" +3. 观察:旧回答立即停止,Agent 开始回答新问题 + +### 体验中止(Abort) + +1. 发送一个问题 +2. **在 Agent 回答过程中**,点击右上角的 **Abort 按钮** +3. 
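Observe: the Agent stops immediately and produces no further output
+
+Neither capability exists in the previous chapter's Runner version. The sections below explain how they are implemented.
+
+## Code locations
+
+- Entry point: [main.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/main.go)
+- Agent construction: [agent.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/agent.go)
+- TurnLoop server: [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)
+
+## Why Runner cannot do this
+
+In the previous chapter's `cmd/ch10`, each `/sessions/:id/chat` request calls `runner.Run(ctx, messages)` once. Runner is a **single-turn** model: one call, one execution, done. If the user sends another message while the Agent is still executing, Runner has no "running loop" that could receive it.
+
+TurnLoop, by contrast, is a **persistent multi-turn execution loop**. It stays idle between turns and can accept new input through `Push()` at any time and respond immediately. Because a loop is always running, preemption and abort become possible: you can interrupt a turn in progress, or stop the whole loop.
+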
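| Capability | Ch10 (Runner, single-turn) | Ch11 (TurnLoop, multi-turn) |
| --- | --- | --- |
| Streaming output | ✅ | ✅ |
| Approval / interrupt | ✅ | ✅ |
| Persistent across turns, responds to new input in real time | ❌ each `Run()` is independent | ✅ `Push()` delivers input at any time |
| Preempt an answer in progress | ❌ | ✅ `Push(item, WithPreempt(...))` |
| Abort the Agent | ❌ | ✅ `loop.Stop(WithImmediate())` |
| Flexible per-turn input construction | ❌ assembled manually by the business layer | ✅ `GenInput` callback |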
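+
+## The core model of TurnLoop
+
+TurnLoop is a **push-based event loop that manages Agent execution in units of turns**. Unlike Runner's "one call, one execution", TurnLoop keeps running: after a turn ends it goes idle and waits, and when a new item arrives it immediately starts the next turn.
+
+```
+Push(item) → [queue] → GenInput(items) → Agent.Run() → OnAgentEvents(events)
+                ↑                                          │
+                └──── idle wait / next turn ←──────────────┘
+```
+
+Key concepts:
+
+- **Item**: the carrier of user input. This example defines it as `ChatItem`, which can carry a user message or an approval decision
+- **GenInput**: builds the Agent input from the queued items (choosing which items to consume and which to keep for the next turn)
+- **OnAgentEvents**: receives the event stream the Agent emits and is responsible for rendering and persistence
+- **Push**: pushes a new item into the queue, optionally with a preempt option
+
+## One session, one TurnLoop
+
+In this example's web scenario, each chat session corresponds to one TurnLoop instance. When the user sends the first message, the server creates a TurnLoop for the session and calls `Run()` to start it; later messages are delivered into the same loop through `Push()`. The loop stays idle between turns until the session is deleted or the user aborts.
+
+This is the most typical way to use TurnLoop: **the loop's lifetime is bound to the user session**. A long-running TurnLoop makes preemption and abort natural operations, because a "running loop" always exists and new input can be delivered at any time.
+
+## Regular flow: idle → new message → answer → idle
+
+The simplest scenario is a user who asks a question, waits for the answer, then asks the next one:
+
+```go
+// When the user sends the first message, create and start the TurnLoop
+loop := adk.NewTurnLoop(cfg)
+loop.Push(&ChatItem{Query: "hello"})
+loop.Run(ctx)
+// → GenInput builds the input → Agent executes → OnAgentEvents streams the output
+// → the turn ends and TurnLoop goes idle
+
+// The user sends a second message (the loop is idle at this point)
+loop.Push(&ChatItem{Query: "explain Eino's architecture"})
+// → TurnLoop wakes up and starts a new turn: GenInput → Agent → OnAgentEvents → idle
+```
+
+From the user's perspective this flow is no different from the previous chapter's Runner. The difference is that the TurnLoop **persists** and does not need to be recreated each time. As soon as the user sends a new message while the Agent is still answering, we enter the "preemption" scenario below.
+
+## How preemption is implemented
+
+When the user sends a new message while the Agent is answering, the business layer needs only one line of code to trigger preemption:
+
+```go
+loop.Push(item, adk.WithPreempt[*ChatItem, M](adk.AfterToolCalls))
+```
+
+When TurnLoop receives this instruction, it:
+
+1. Waits for the current tool call to finish (`AfterToolCalls` means a running tool is not interrupted, which avoids inconsistent state)
+2. Cancels the current turn: the context of OnAgentEvents is canceled and the old turn exits
+3. Takes the new item from the queue, builds the input through GenInput, and starts a new turn
+
+The preempt mode lets you choose a different safe point depending on business needs:
+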
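| Mode | Behavior |
| --- | --- |
| `AfterToolCalls` | Waits for the tool calls currently executing to finish, then cancels the current turn and starts a new one |
| `AfterChatModel` | Waits for the current model call to finish, then cancels the current turn and starts a new one |
| `AnySafePoint` | Cancels the current turn immediately at any safe point (for example between tool calls or between model calls) and starts a new one |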
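
The difference between the three modes can be illustrated with a toy model. The sketch below is not the Eino implementation and uses no Eino API; `safePoint`, `isSafe`, and `runTurn` are hypothetical names, and a turn is assumed to be a plain list of `"model"` and `"tool"` steps. It only shows the idea that a preempt request is honored at the first safe point that matches the chosen mode, never in the middle of a step.

```go
package main

import "fmt"

// safePoint is a hypothetical stand-in for a preempt mode; it is not the Eino type.
type safePoint int

const (
	afterToolCalls safePoint = iota // honor preemption only after a tool step
	afterChatModel                  // honor preemption only after a model step
	anySafePoint                    // honor preemption after any step
)

// isSafe reports whether a finished step of the given kind is a safe point for mode.
func isSafe(mode safePoint, step string) bool {
	switch mode {
	case afterToolCalls:
		return step == "tool"
	case afterChatModel:
		return step == "model"
	default:
		return true
	}
}

// runTurn executes steps in order. preemptAt is the index of the step during
// which a preempt request arrives (-1 means it never arrives). It returns the
// steps that ran before the turn was cancelled at the first matching safe point.
func runTurn(steps []string, mode safePoint, preemptAt int) []string {
	var ran []string
	for i, s := range steps {
		ran = append(ran, s) // the step in progress always completes
		requested := preemptAt >= 0 && i >= preemptAt
		if requested && isSafe(mode, s) {
			break
		}
	}
	return ran
}

func main() {
	steps := []string{"model", "tool", "model", "tool"}
	fmt.Println(runTurn(steps, afterToolCalls, 0))
	fmt.Println(runTurn(steps, afterChatModel, 1))
	fmt.Println(runTurn(steps, anySafePoint, 1))
}
```

In this model a stricter mode simply lets more steps finish before the turn is cancelled, which is the trade-off the table describes: less latency before the new turn starts versus less risk of leaving a tool call half done.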
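+
+> In this example TurnLoop runs in its own goroutine, while the HTTP handler has to write the event stream into the SSE response. The two are coordinated through channels (see `iterEnvelope`/`iterResult` and the `handlerDone` signaling mechanism in [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)). These are details of the HTTP adaptation layer and are not part of the TurnLoop API itself.
+
+## How abort is implemented
+
+Abort is simpler: stop the whole TurnLoop directly:
+
+```go
+loop.Stop(adk.WithImmediate()) // cancel immediately, without waiting for the current turn
+loop.Wait()                    // wait for the loop to exit completely
+```
+
+### The three Stop modes
+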
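| Mode | Behavior |
| --- | --- |
| `loop.Stop()` | Exit at the turn boundary: waits for the current turn to finish, then exits |
| `loop.Stop(WithImmediate())` | Exit immediately: cancels the current turn's context |
| `loop.Stop(WithGraceful())` | Exit at a safe point: exits at the next safe point (for example between tool calls) |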
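
Why `Stop` is followed by `Wait` can be seen in a miniature loop built only from the Go standard library. This is a sketch under stated assumptions, not `adk.TurnLoop`: `toyLoop`, `startToyLoop`, and its methods are hypothetical, and only the "cancel the context now" flavor of stopping is modeled. `Stop` merely requests the exit; `Wait` blocks until the goroutine has really finished, after which its results are safe to read.

```go
package main

import (
	"context"
	"fmt"
)

// toyLoop is a hypothetical miniature of a long-running loop; it is not adk.TurnLoop.
type toyLoop struct {
	items  chan string
	cancel context.CancelFunc
	done   chan struct{}
	served []string // written only by the loop goroutine, read only after Wait
}

// startToyLoop launches the loop goroutine, which idles until an item or a stop request arrives.
func startToyLoop() *toyLoop {
	ctx, cancel := context.WithCancel(context.Background())
	l := &toyLoop{items: make(chan string), cancel: cancel, done: make(chan struct{})}
	go func() {
		defer close(l.done) // signals Wait that the goroutine has exited
		for {
			select {
			case <-ctx.Done():
				return
			case it := <-l.items:
				l.served = append(l.served, it) // stands in for running one turn
			}
		}
	}()
	return l
}

// Push hands one item to the loop. It must not be called after Stop,
// because nobody would receive from the channel any more.
func (l *toyLoop) Push(item string) { l.items <- item }

// Stop requests the exit and returns without waiting.
func (l *toyLoop) Stop() { l.cancel() }

// Wait blocks until the loop goroutine has exited.
func (l *toyLoop) Wait() { <-l.done }

func main() {
	l := startToyLoop()
	l.Push("first question")
	l.Push("second question")
	l.Stop()
	l.Wait()
	fmt.Println(l.served)
}
```

Reading `served` only after `Wait` returns is what keeps the example free of data races: closing `done` happens after the last append in the loop goroutine.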
+
+## Configuring TurnLoop
+
+When creating a TurnLoop, callbacks and options are specified through `TurnLoopConfig`:
+
+```go
+cfg := adk.TurnLoopConfig[*ChatItem, M]{
+    // GenInput: called at the start of each turn; decides "what the Agent sees this turn".
+    // Selects items from the queue to build the Agent input, returning Consumed (handled this turn) and Remaining (kept for later turns)
+    GenInput: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], items []*ChatItem) (*adk.GenInputResult[*ChatItem, M], error) {
+        // ...build AgentInput, persist the user message...
+    },
+
+    // PrepareAgent: called once per turn; returns the Agent used for this turn.
+    // This example always returns the same Agent, but you can pick a different Agent dynamically based on the items
+    PrepareAgent: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], consumed []*ChatItem) (adk.TypedAgent[M], error) {
+        return agent, nil
+    },
+
+    // OnAgentEvents: receives the Agent's event stream; responsible for rendering output and persisting intermediate messages.
+    // This example hands the event stream to the HTTP handler through a channel for SSE output
+    OnAgentEvents: func(ctx context.Context, tc *adk.TurnContext[*ChatItem, M], events *adk.AsyncIterator[*adk.TypedAgentEvent[M]]) error {
+        // ...hand events to the HTTP handler and wait until they are consumed...
+    },
+
+    // The three fields below enable declarative checkpoints (approval resume); the next section covers them in detail
+    GenResume:    makeGenResume(),
+    Store:        checkpointStore,
+    CheckpointID: sessionID,
+}
+
+loop := adk.NewTurnLoop(cfg)
+```
+
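| Callback | When it is called | Responsibility |
| --- | --- | --- |
| `GenInput` | When the queue contains items | Chooses which items to consume and builds the Agent input (can decide which items are kept for the next turn) |
| `PrepareAgent` | After GenInput | Returns the Agent instance used for this turn; supports adjusting the Agent configuration dynamically |
| `OnAgentEvents` | When the Agent produces an event stream | Consumes events, renders output, and persists results; the main entry point where the business layer handles Agent output |
| `GenResume` | When resuming from a checkpoint | Extracts the approval result from newly pushed items and builds `ResumeParams`, automating approval resume |
| `Store` + `CheckpointID` | | Enables declarative checkpoints; TurnLoop saves and restores execution state automatically |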
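
The "consumed versus remaining" decision that a GenInput-style callback makes can be sketched without any Eino types. Everything below is hypothetical: `chatItem`, `splitItems`, and the policy itself (consume the leading run of user queries, leave everything from the first approval item onward in the queue) are assumptions chosen for illustration, not the signature or behavior of the real callback.

```go
package main

import (
	"fmt"
	"strings"
)

// chatItem is a hypothetical queue item: either a user query or an approval decision.
type chatItem struct {
	Query    string
	Approval bool
}

// splitItems models the decision a GenInput-style callback makes: which queued
// items this turn consumes and which remain for a later turn. The policy here
// (an assumption for illustration) consumes the leading run of query items,
// merges them into one input, and leaves the rest in the queue.
func splitItems(items []chatItem) (input string, consumed, remaining []chatItem) {
	i := 0
	for i < len(items) && !items[i].Approval {
		i++
	}
	consumed, remaining = items[:i], items[i:]
	queries := make([]string, 0, len(consumed))
	for _, it := range consumed {
		queries = append(queries, it.Query)
	}
	return strings.Join(queries, "\n"), consumed, remaining
}

func main() {
	queue := []chatItem{
		{Query: "what is Eino?"},
		{Query: "focus on ChatModel"},
		{Approval: true},
	}
	input, consumed, remaining := splitItems(queue)
	fmt.Printf("%q consumed=%d remaining=%d\n", input, len(consumed), len(remaining))
}
```

Other policies fit the same shape, for example consuming only the newest query and dropping older ones; the point is that the callback, not the loop, owns this choice.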
    + +> 完整的回调实现请参考 [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go)。 + +## 声明式 Checkpoint:审批恢复的自动化 + +在第七章(Runner 模式)中,审批恢复需要业务层手动调用 `runner.ResumeWithParams()`,自己判断"这次是正常执行还是恢复执行"。TurnLoop 提供了更简洁的方式——在配置中声明 `Store` 和 `CheckpointID`(见上一节),TurnLoop 会自动处理保存与恢复: + +1. Agent 执行到审批 interrupt 时,TurnLoop 自动将执行状态保存到 `Store`(以 `CheckpointID` 为 key) +2. 用户做出审批决定后,业务层创建一个新的 TurnLoop(使用**相同的** `CheckpointID`),并 Push 审批 item +3. 新 TurnLoop `Run()` 时,检测到 checkpoint 存在,**自动调用 `GenResume`**(而非 `GenInput`)获取恢复参数 +4. Agent 从 interrupt 点继续执行 + +`GenResume` 的职责就是从新 Push 进来的 items 中提取审批结果,构建 `ResumeParams`: + +```go +GenResume: func(ctx context.Context, loop *adk.TurnLoop[*ChatItem, M], + canceledItems, unhandledItems, newItems []*ChatItem, +) (*adk.GenResumeResult[*ChatItem, M], error) { + // newItems 包含审批恢复时 Push 的 item + item := newItems[0] + return &adk.GenResumeResult[*ChatItem, M]{ + ResumeParams: &adk.ResumeParams{ + InterruptID: item.InterruptID, + ApprovalResult: item.ApprovalResult, + }, + }, nil +} +``` + +相比 Runner 的 `ResumeWithParams()`,声明式 checkpoint 让业务层不需要管理"正常执行 vs 恢复执行"的分支——TurnLoop 根据 checkpoint 是否存在自动选择走 `GenInput` 还是 `GenResume`。 + +## 本章小结 + +- **TurnLoop** 是一个持久运行的多轮执行循环,生命周期与用户会话绑定 +- **常规流程**:`Push(item)` → GenInput → Agent → OnAgentEvents → idle → 等待下一个 Push +- **抢占**:`Push(item, WithPreempt(AfterToolCalls))` 一行代码取消当前轮次并开始新一轮 +- **中止**:`loop.Stop(WithImmediate())` 一行代码终止整个循环 +- **声明式 checkpoint**:配置 `Store` + `CheckpointID`,TurnLoop 自动处理 interrupt 的保存与恢复 +- 回调的具体实现请参考 [server/server.go](https://github.com/cloudwego/eino-examples/blob/main/quickstart/chatwitheino/server/server.go) + +## 系列收尾:完整 Agent 应用骨架 + +到本章为止,我们用一个可以实际运行的 Agent 串起了 Eino 的核心能力: + +- **运行时**:Runner / TurnLoop 驱动执行,支持流式输出、抢占与中止 +- **工具层**:Filesystem / Shell 等 Tool 能力接入,工具错误可被安全处理 +- **中间件**:可插拔的 middleware/handler,用于错误处理、重试、审批等横切能力 +- **可观测**:callbacks/trace 能力把关键链路打通,便于调试与线上观测 +- **人机协作**:interrupt/resume + checkpoint 
支持审批、补参、分支选择等交互式流程 +- **确定性编排**:compose(graph/chain/workflow)把复杂业务流程组织为可维护、可复用的执行图 +- **业务交付**:A2UI 协议把 Agent 能力以流式 UI 的形式呈现给用户 +- **执行控制**:TurnLoop 提供抢占、中止、多轮生命周期管理,适配真实业务场景的复杂交互需求 + +你可以在这个骨架上逐步替换/扩展任意环节:模型、工具、存储、工作流、前端渲染协议,而不需要推倒重来。 diff --git a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md index 5224228048c..9a409f96d0e 100644 --- a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md +++ b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/_index.md @@ -1,6 +1,6 @@ --- Description: "" -date: "2026-03-24" +date: "2026-05-17" lastmod: "" tags: [] title: v0.8.*-adk middlewares @@ -65,7 +65,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ > 💡 > **功能**: 自动对话历史摘要,防止超出模型上下文窗口限制 -📚 **详细文档**: [Middleware: FileSystem](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_filesystem) +📚 **详细文档**: [Middleware: Summarization](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware/middleware_summarization) 当对话历史的 Token 数量超过阈值时,自动调用 LLM 生成摘要,压缩上下文。 @@ -247,7 +247,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ > 💡 > 升级到 v0.8 前,请查阅 Breaking Changes 文档了解所有不兼容变更 -📚 **完整文档**: [Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_不兼容更新) +📚 **完整文档**: [Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) **变更概览**: @@ -263,7 +263,7 @@ agent, err := adk.NewChatModelAgent(ctx, &adk.ChatModelAgentConfig{ ## 升级指南 -详细的迁移步骤和代码示例请参考:[Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_不兼容更新) +详细的迁移步骤和代码示例请参考:[Eino v0.8 不兼容更新](/zh/docs/eino/release_notes_and_migration/eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes) **快速检查清单**: diff --git 
"a/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_\344\270\215\345\205\274\345\256\271\346\233\264\346\226\260.md" b/content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes.md similarity index 100% rename from "content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/Eino_v0.8_\344\270\215\345\205\274\345\256\271\346\233\264\346\226\260.md" rename to content/zh/docs/eino/release_notes_and_migration/Eino_v0.8._-adk_middlewares/eino_v0.8_breaking_changes.md diff --git a/content/zh/docs/eino/release_notes_and_migration/_index.md b/content/zh/docs/eino/release_notes_and_migration/_index.md index 5bfe102c396..07624cb5f08 100644 --- a/content/zh/docs/eino/release_notes_and_migration/_index.md +++ b/content/zh/docs/eino/release_notes_and_migration/_index.md @@ -4,7 +4,7 @@ date: "2026-03-02" lastmod: "" tags: [] title: 发布记录 & 迁移指引 -weight: 8 +weight: 7 --- # 版本管理规范 diff --git a/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md new file mode 100644 index 00000000000..c4f5fe9dd5b --- /dev/null +++ b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/_index.md @@ -0,0 +1,97 @@ +--- +Description: "" +date: "2026-05-21" +lastmod: "" +tags: [] +title: v0.9.* agentic-runtime +weight: 9 +--- + +V0.9 的版本主题是 `agentic-runtime`。该版本主要围绕 ADK 的消息协议、Agent 运行控制和多轮运行时能力展开,在保留 `*schema.Message` 默认路径的同时,引入 `AgenticMessage` 及配套泛型抽象,为更丰富的模型原生 Agent 协议、服务端工具调用、运行中断与恢复打下基础。 + +## 1. 
AgenticMessage 与 ADK 支持 + +V0.9 新增 `schema.AgenticMessage`,用于表达比传统 `schema.Message` 更完整的 Agentic 消息结构。 + +- `AgenticMessage` 采用 content block 模型,支持文本、推理内容、工具调用、工具结果、服务端工具、MCP 工具和多模态内容等结构化片段。 +- `[]ContentBlock` 能更完整地保留不同模型协议响应中的 block 时序;新增 block 类型也更适配 OpenAI Responses API、Claude、Gemini 等协议中的 tool use、reasoning、streaming metadata 等结构。 +- `components/model` 新增 `AgenticModel` 组件,用于接入以 `AgenticMessage` 为输入输出的模型实现。 +- ADK 对 `AgenticMessage` 路径提供 typed agent、typed event、typed runner 和 typed `ChatModelAgent` 支持,使 AgenticModel 能进入 ADK 的 Agent 生命周期。 +- [Eino: 快速开始](/zh/docs/eino/quick_start):整个系列基于 AgenticMessage 重写。 + +## 2. ChatModelAgent 能力扩展 + +V0.9 对 `ChatModelAgent` 的运行控制、模型调用可靠性和 middleware 扩展点进行了系统增强。 + +### Cancel + +- 新增 Agent Cancel 能力,用于从外部主动终止正在运行的 Agent。 +- 支持安全点取消、递归取消、取消超时升级,以及取消过程中的 checkpoint 持久化。 +- 取消期间发生的 interrupt 会统一进入取消语义,调用方可以通过 `CancelError` 区分主动取消与普通业务失败。 +- [Eino ADK: Agent Cancel 与 TurnLoop 快速入门](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart) + +### Model Retry + +- Retry 从简单的 error retry 扩展为 `ShouldRetry(ctx, RetryContext) -> RetryDecision`。 +- Retry 决策可以读取模型输出、拒绝不满足条件的输出、修改下一次输入、追加模型 option,并覆盖 backoff。 + +### Model Failover + +- 新增 Model Failover 能力,用于在模型调用失败后切换到备用模型。 +- Failover 决策可以读取失败 attempt 的输出、错误、原始输入和 attempt 序号,并选择下一次使用的模型。 +- 支持为备用模型改写输入;也支持优先复用上一次调用成功的模型,降低每次从固定主模型开始试错的成本。 +- [ChatModel Failover 功能文档](/zh/docs/eino/core_modules/eino_adk/agent_implementation/chat_model/chatmodel_failover_guide) + +### Middleware 增强 + +- `ChatModelAgentMiddleware` 新增 `AfterAgent`,用于在 Agent 成功结束后执行收尾逻辑。 +- Summarization、reduction、skill、filesystem、plan-task、patch-tool-calls 等 middleware 完成泛型化,支持 `AgenticMessage` 路径。 +- Summarization middleware 新增 `TypedMiddleware.Summarize`,同步 summarization 能力从独立函数转为 middleware 内聚能力。 +- Filesystem middleware 增强多模态读取能力,并增加 PDF pages 校验。 +- 新增 `agentsmd` middleware,用于加载和注入 `AGENTS.md` 风格的项目指令。 +- `ChatModelAgentState` 增加 `ToolInfos` 和 `DeferredToolInfos`,作为 middleware 调整模型可见工具集合的主路径。 
+- `ToolInfos` holds the tools directly visible to the current model call; `DeferredToolInfos` holds candidate tools that the model can discover on demand through a tool search mechanism.
+- The tool search middleware supports three ways of loading tools: loading on demand from the deferred tools using the model's native tool search capability; providing a `ToolSearchTool` with a fixed schema as required by the model protocol, through which the model searches the deferred tools; and, without relying on any model-side protocol, using the custom `tool_search` tool provided by Eino to retrieve tools and append the matches to the regular `ToolInfos`.
+- Compose adds `AgenticToolsNode`, and `ToolsNode` gains support for tool name and argument aliases.
+- [Eino ADK: ChatModelAgentMiddleware](/zh/docs/eino/core_modules/eino_adk/eino_adk_chatmodelagentmiddleware)
+
+## 3. TurnLoop
+
+V0.9 adds `TurnLoop`, which promotes a one-off Agent run into a turn-level runtime that keeps running and can be driven from outside.
+
+- Built for multi-turn operation: `TurnLoop` continuously accepts external input; each turn independently plans its input, constructs the Agent, and consumes events, which suits long-lived interactive Agents.
+- Input merging: at the turn boundary, `GenInput` decides which inputs this turn consumes and which keep waiting, so applications can implement strategies such as batching, deduplication, and merging consecutive user inputs.
+- Preemption: a `Push` with a preempt option atomically writes the new input and requests cancellation of the current turn, so high-priority input can interrupt a running Agent.
+- Declarative checkpoint/resume: on resume, the application does not need to restore the input queue itself; `TurnLoop` distinguishes interrupted inputs, inputs not yet processed, and inputs that arrived after the resume, and the application only declares how these inputs re-enter later turns.
+- [Eino ADK: Agent Cancel and TurnLoop Quick Start](/zh/docs/eino/core_modules/eino_adk/eino_adk_agent_cancel_and_turnloop_quickstart)
+
+## How to upgrade
+
+> 💡
+> At the time of writing (5.19), the latest version is v0.9.0-beta.1, and the stable release is expected about a week later. Until the stable release is out, always use the latest beta; once it is out, upgrade to the latest stable version.
+
+```bash
+# Upgrade to the latest beta (use before the stable release)
+go get github.com/cloudwego/eino@v0.9.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticark@v0.2.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticclaude@v0.1.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticgemini@v0.2.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticdeepseek@v0.1.0-beta.1
+go get github.com/cloudwego/eino-ext/components/model/agenticqwen@v0.1.0-beta.1
+go get github.com/cloudwego/eino-ext/callbacks/cozeloop@v0.3.0-beta.1
+
+# After the stable release, switch to the latest stable version numbers
+go get github.com/cloudwego/eino@v0.9.0
+go get github.com/cloudwego/eino-ext/components/model/agenticopenai@v0.2.0
+go get github.com/cloudwego/eino-ext/components/model/agenticark@v0.2.0
+go get github.com/cloudwego/eino-ext/components/model/agenticclaude@v0.1.0
+go get github.com/cloudwego/eino-ext/components/model/agenticgemini@v0.2.0
+go get github.com/cloudwego/eino-ext/components/model/agenticdeepseek@v0.1.0
+go get github.com/cloudwego/eino-ext/components/model/agenticqwen@v0.1.0
+go get github.com/cloudwego/eino-ext/callbacks/cozeloop@v0.3.0
+```
+
+Latest version numbers: [github.com/cloudwego/eino/tags](https://github.com/cloudwego/eino/tags)
diff --git a/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md
new file mode 100644
index 00000000000..ab08c27725c
--- /dev/null
+++ b/content/zh/docs/eino/release_notes_and_migration/eino_v0.9._agentic-runtime/eino_v0.9_migration_notes.md
@@ -0,0 +1,198 @@
+---
+Description: ""
+date: "2026-05-19"
+lastmod: ""
+tags: []
+title: Eino V0.9 Upgrade Notes
+weight: 1
+---
+
+This document lists the API and semantic changes that existing users should watch for when upgrading from V0.8.x to V0.9 `agentic-runtime`. New capabilities not listed here usually do not affect the existing `*schema.Message` path.
+
+## Explicit API changes
+
+### Agent Transfer / Workflow Agent / Supervisor marked NOT RECOMMENDED
+
+V0.9 marks the whole family of multi-Agent collaboration patterns based on Agent Transfer (full context sharing) as **NOT RECOMMENDED**. The affected public APIs include:
+
+**Agent Transfer**:
+
+- `SetSubAgents`
+- `AgentWithOptions` / `WithDisallowTransferToParent` / `WithHistoryRewriter`
+- `ChatModelAgentConfig.Exit` / `ChatModelAgentConfig.OutputKey`
+- `AgentWithDeterministicTransferTo`
+- `OnSetSubAgents` / `OnSetAsSubAgent` / `OnDisallowTransferToParent`
+
+**Workflow Agent**:
+
+- `NewSequentialAgent` / `SequentialAgentConfig`
+- `NewParallelAgent` / `ParallelAgentConfig`
+- `NewLoopAgent` / `LoopAgentConfig`
+
+**Supervisor**:
+
+- `supervisor.New` / `supervisor.Config`
+ +> 💡 +> 这些 API 仍然可以使用,不会编译失败,但不建议在新项目中采用。经验表明,Agent 之间共享完整对话上下文的 transfer 模式在实际效果上并不优于工具调用模式。 + +推荐迁移方向: + +- 使用 `ChatModelAgent` + `AgentTool`(将子 Agent 封装为工具,按需调用)。 +- 使用 `DeepAgent`(结构化子任务委派)。 +- 上述两种方式均可获得更好的可控性、可观测性和 prompt cache 效率。 + +### ChatModelAgentMiddleware 新增 AfterAgent + +`ChatModelAgentMiddleware` 新增 `AfterAgent` 方法。手写实现该接口的类型需要补充该方法,否则会编译失败。 + +推荐做法: + +- 如果 middleware 不需要特殊收尾逻辑,嵌入 `*adk.BaseChatModelAgentMiddleware`。 +- 如果 middleware 需要在 Agent 成功结束后清理状态、记录事件或补充统计,实现 `AfterAgent(ctx, state)`。 + +影响范围: + +- 仅影响显式实现 `ChatModelAgentMiddleware` 的用户代码。 +- 通过 `BaseChatModelAgentMiddleware` 组合扩展的代码可保持兼容。 + +### AgentMiddleware 结构体废弃 + +`AgentMiddleware` 结构体及 `ChatModelAgentConfig.Middlewares` 字段已标记为 **Deprecated**,将在未来版本中移除。 + +> 💡 +> AgentMiddleware 和 Middlewares 字段均已废弃。请迁移至 interface-based 的 Handlers(ChatModelAgentMiddleware)方式。 + +迁移方式: + +- 将 `Middlewares []AgentMiddleware` 中的各项逻辑迁移到 `Handlers []ChatModelAgentMiddleware`。 +- `AgentMiddleware.BeforeChatModel` → 实现 `ChatModelAgentMiddleware.BeforeModelRewriteState`。 +- `AgentMiddleware.AfterChatModel` → 实现 `ChatModelAgentMiddleware.AfterModelRewriteState`。 +- `AgentMiddleware.WrapToolCall` → 实现 `ChatModelAgentMiddleware.WrapToolCall`。 +- `AgentMiddleware.AdditionalInstruction` → 在 `BeforeModelRewriteState` 中修改 `state.Instruction`。 +- `AgentMiddleware.AdditionalTools` → 在 `BeforeModelRewriteState` 中修改 `state.ToolInfos`。 +- 如果 middleware 不需要特殊逻辑,嵌入 `*adk.BaseChatModelAgentMiddleware` 以获得默认空实现。 + +影响范围: + +- 所有在 `ChatModelAgentConfig.Middlewares` 中使用 `AgentMiddleware` 的代码需要迁移。 +- 当前版本两种方式可共存(Handlers 在 Middlewares 之后执行),但建议尽早迁移以避免未来版本移除时的编译失败。 + +### summarization.SummarizeMessages 被移除 + +`summarization.SummarizeMessages` 和 `summarization.SummarizeOutput` 不再导出。 + +迁移方式: + +- 构造 summarization middleware 时继续使用 `summarization.New` 或 `summarization.NewTyped`。 +- 需要主动触发同步 summarization 时,使用 `TypedMiddleware.Summarize`。 + +该调整将 summarization 的配置、状态读取和执行逻辑收敛到 middleware 内部,避免独立函数与运行时状态语义分叉。 + +## 需要关注语义变化的能力 + 
+### Summarization Finalize 后处理语义变化 + +V0.8.x 中,summarization middleware 会先执行默认 summary 后处理,再调用用户配置的 `Finalize`。因此自定义 `Finalize` 收到的 `summary` 已经包含 `PreserveUserMessages` 替换、`TranscriptFilePath` 注入和 summary preamble。 + +V0.9 中,如果设置了 `Config.Finalize`,middleware 会直接把模型生成的 raw summary 传给 `Finalize`,不再自动执行默认后处理。受影响的配置包括: + +- `PreserveUserMessages` +- `TranscriptFilePath` + +迁移方式: + +- 如果希望保留默认后处理,不要设置 `Finalize`,让 middleware 使用默认 finalization 路径。 +- 如果必须自定义 `Finalize`,但仍希望保留默认后处理,先通过 `DefaultFinalizer` 构造默认 finalizer,再在自定义逻辑中显式组合。 +- `DefaultFinalizer` 不会自动读取外层 `Config.PreserveUserMessages` 和 `Config.TranscriptFilePath`;需要通过 `DefaultFinalizerConfig` 显式传入。 +- 使用 `NewFinalizer().PreserveSkills(...).Build()` 的代码需要特别检查:该 finalizer 只负责 preserve skills,不会自动补上 `PreserveUserMessages` 和 `TranscriptFilePath`。 + +### 工具列表修改路径调整 + +`ModelContext.Tools` 不再是推荐的工具列表修改入口。 + +升级建议: + +- 在 `BeforeModelRewriteState` 中修改 `state.ToolInfos`。 +- 如需模型原生 deferred tool search,修改 `state.DeferredToolInfos`。 +- 不建议在 `WrapModel` 中修改工具列表;该修改只影响当前模型调用,后续 middleware、后续 turn 或 checkpoint/resume 不会继承这次修改。 + +### ToolSearch / AgentsMD Middleware 内部实现迁移 + +ToolSearch 和 AgentsMD middleware 的内部实现从 `WrapModel`(v0.8.x)迁移至 `BeforeModelRewriteState`(v0.9)。 + +> 💡 +> 对仅使用 `toolsearch.New()` / `agentsmd.New()` 的用户,公开 API(Config 结构体、构造函数)未变化,无需修改代码。 + +语义变化: + +- **v0.8.x**:middleware 通过 `WrapModel` 在模型调用时临时注入工具列表(via `model.Option`),变更不持久化,不进入 agent state。 +- **v0.9**:middleware 在 `BeforeModelRewriteState` 中直接修改 `state.ToolInfos` / `state.DeferredToolInfos` 和 `state.Messages`(注入提醒消息),变更随 state 持久化。 + +影响: + +- **Checkpoint/Resume**:ToolSearch 注入的提醒消息和动态工具搜索结果现在会随 checkpoint 持久化并在恢复时正确重建,v0.8.x 中这些信息会在恢复后丢失。 +- **其他 Middleware 可见性**:后续 middleware 的 `BeforeModelRewriteState` / `AfterModelRewriteState` 现在能看到 ToolSearch 修改后的 `state.ToolInfos`,而 v0.8.x 中这些修改对其他 middleware 不可见。 +- **Prompt Cache**:由于工具列表变更现在反映在 state 中(而非每次模型调用时临时注入),模型的 KV-cache 行为可能有差异。 + +需要注意: + +- 如果有自定义 middleware 依赖 `WrapModel` 中的 
`ModelContext.Tools` 来读取/修改工具列表,应迁移至 `BeforeModelRewriteState` 中读取 `state.ToolInfos`。 + +### Model Retry 决策语义增强 + +`ModelRetryConfig` 新增 `ShouldRetry`。当 `ShouldRetry` 非空时,`IsRetryAble` 会被忽略。 + +需要注意: + +- 旧的 `IsRetryAble` 仍可用于错误维度的简单重试。 +- 使用 `ShouldRetry` 后,应显式处理成功输出但业务不接受的场景。 +- Interrupt 和 `ErrStreamCanceled` 不作为普通 retry error 处理。 + +### Cancel 错误语义 + +V0.9 引入主动取消语义后,应用需要区分主动取消、普通错误和业务 interrupt。 + +升级建议: + +- 上层应区分 `CancelError`、普通 error 和业务 interrupt。 +- 如果应用主动接入 `WithCancel`,不要把 `CancelError` 当作普通业务失败处理。 + +### AgenticMessage 迁移需要理解新的消息结构 + +`TypedChatModelAgent[*schema.AgenticMessage]` 是面向模型原生 Agentic 协议的新路径。迁移到该路径不只是把泛型参数从 `*schema.Message` 改成 `*schema.AgenticMessage`,还需要按 `AgenticMessage` 的 content block 结构处理消息内容。 + +需要注意: + +- AgenticMessage 路径使用 `AgenticModel` 与 `AgenticToolsNode` 处理工具调用。 +- 工具调用和工具结果通过 `AgenticMessage` content block 表达,尤其需要正确处理 tool call / tool result content block。 +- Agent transfer 能力不适用于 AgenticMessage 路径。 +- 既有应用如果不需要模型原生 Agentic 协议,建议继续使用默认 `*schema.Message` 路径;只有在明确要接入 `AgenticModel` 协议时再迁移。 + +### 模型适配器需要识别新增 option + +V0.9 引入 `AgenticModel` 后,模型适配器需要更严格地处理 call-time options。`AgenticModel` 是 `BaseModel[*schema.AgenticMessage]` 的别名,不再提供类似 `ToolCallingChatModel.WithTools` 的增强接口;工具绑定统一通过 `model.WithTools` 作为 `model.Option` 传入。 + +需要注意: + +- 所有支持 AgenticMessage 的模型适配器都应读取 `Options.Tools`,并将其映射到 provider 的 tool calling 协议。 +- `AgenticModel` 不应要求用户先调用某个 `WithTools` 方法得到“带工具的模型实例”;ADK 会在每次模型调用时通过 `model.WithTools` 传递当前工具列表。 +- 如果适配器只从自身 config 读取工具,而忽略 `model.WithTools`,在 ChatModelAgent / AgenticToolsNode 路径下会出现模型看不到工具或工具列表不随运行态变化的问题。 + +V0.9 还在 `model.Options` 中新增: + +- `DeferredTools` +- `ToolSearchTool` +- `AgenticToolChoice` + +现有模型适配器忽略这些 option 通常不会导致编译失败,但会导致 deferred tool search、模型原生 tool search 或 agentic tool choice 不生效。适配器维护者应按目标 provider 的协议补齐转换逻辑。 + +### ToolInfo 序列化形态变化 + +`ToolInfo` 增加显式 JSON/Gob 编解码,以保留 `ParamsOneOf`。 + +影响: + +- `ToolInfo` 进入了 `ChatModelAgentState.ToolInfos` / `DeferredToolInfos`,因此可能随 Agent state 一起进入 
checkpoint。 +- 显式 JSON/Gob 编解码用于保证 `ParamsOneOf` 在 checkpoint、deep copy 和恢复过程中不会丢失。 +- 如果外部系统直接依赖旧版 `ToolInfo` JSON 形态,需要重新确认序列化兼容性。 diff --git a/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md b/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md index fbda1b7ed12..4c26f5728b5 100644 --- a/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md +++ b/content/zh/docs/eino/release_notes_and_migration/v02_second_release.md @@ -74,7 +74,7 @@ weight: 2 ### BugFix -- Fixed the SSTI vulnerability in the Jinja chat template(langchaingo 存在 gonja 模板注入) +- Fixed the SSTI vulnerability in the Jinja chat template [langchaingo 存在 gonja 模板注入](https://bytedance.larkoffice.com/docx/UvqxdlFfSoTIr1xtsQ5cIZTVn2b) ## v0.2.0 diff --git a/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png b/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png new file mode 100644 index 00000000000..9752e8ef3bc Binary files /dev/null and b/static/img/eino/DwTrwyD1eh2DqNbsGE8cfdTNnYb.png differ diff --git a/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png b/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png deleted file mode 100644 index d0994449c34..00000000000 Binary files a/static/img/eino/GzIObeN6roy2SAxpEXBcMqrRnYb.png and /dev/null differ diff --git a/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png b/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png new file mode 100644 index 00000000000..31a535951a7 Binary files /dev/null and b/static/img/eino/HAz4wb8f6h4XSOb7yUVc2CkUnAg.png differ diff --git a/static/img/eino/eino_adk_write_todos.png b/static/img/eino/HOJtbxNKWoibi2xzXrAcx0BUndb.png similarity index 100% rename from static/img/eino/eino_adk_write_todos.png rename to static/img/eino/HOJtbxNKWoibi2xzXrAcx0BUndb.png diff --git a/static/img/eino/A737bctqLoOzNrxbK8Hc5ccmnEb.png b/static/img/eino/Ifu5bvB6conps5xBH5fcFdiCnCW.png similarity index 100% rename from static/img/eino/A737bctqLoOzNrxbK8Hc5ccmnEb.png rename to 
static/img/eino/Ifu5bvB6conps5xBH5fcFdiCnCW.png diff --git a/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png b/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png deleted file mode 100644 index 997eeaf21aa..00000000000 Binary files a/static/img/eino/N9ZzwvvuWhya0vbIzLEcMx6DnMP.png and /dev/null differ diff --git a/static/img/eino/eino_adk_excel_using_deep.png b/static/img/eino/PhKjbQyKZoqaM9xyxptcceM9nsg.png similarity index 100% rename from static/img/eino/eino_adk_excel_using_deep.png rename to static/img/eino/PhKjbQyKZoqaM9xyxptcceM9nsg.png diff --git a/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png b/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png new file mode 100644 index 00000000000..332a1f260b8 Binary files /dev/null and b/static/img/eino/RlIuwflSQh1gzlb7eMkcarFenbe.png differ diff --git a/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png b/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png deleted file mode 100644 index 0d005ed243f..00000000000 Binary files a/static/img/eino/TXVlwT7Iohh1EtbEeC6cIptxnZd.png and /dev/null differ diff --git a/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png b/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png new file mode 100644 index 00000000000..75c1f0d62be Binary files /dev/null and b/static/img/eino/X9I4wGCprhpho7bXk6icMHmwnRb.png differ diff --git a/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png b/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png new file mode 100644 index 00000000000..018eb4f5742 Binary files /dev/null and b/static/img/eino/XrWqwC669hGGoibW1q3c2ToTnvf.png differ diff --git a/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png b/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png new file mode 100644 index 00000000000..6fe56677c4b Binary files /dev/null and b/static/img/eino/Xs38beDNAobevkx0epfcjkCnnFb.png differ diff --git a/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png b/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png deleted file mode 100644 index 8022d0dc902..00000000000 Binary files 
a/static/img/eino/eino_adk_agent_as_tool_sequence_diagram_1.png and /dev/null differ diff --git a/static/img/eino/eino_adk_chat_model_agent_view.png b/static/img/eino/eino_adk_chat_model_agent_view.png deleted file mode 100644 index 4480f271c5c..00000000000 Binary files a/static/img/eino/eino_adk_chat_model_agent_view.png and /dev/null differ diff --git a/static/img/eino/eino_adk_collaboration_example.png b/static/img/eino/eino_adk_collaboration_example.png deleted file mode 100644 index d4b4f93b456..00000000000 Binary files a/static/img/eino/eino_adk_collaboration_example.png and /dev/null differ diff --git a/static/img/eino/eino_adk_collaboration_run_path_sequential.png b/static/img/eino/eino_adk_collaboration_run_path_sequential.png deleted file mode 100644 index d75265eab0f..00000000000 Binary files a/static/img/eino/eino_adk_collaboration_run_path_sequential.png and /dev/null differ diff --git a/static/img/eino/eino_adk_deterministic_transfer.png b/static/img/eino/eino_adk_deterministic_transfer.png deleted file mode 100644 index ac1e2f9e20d..00000000000 Binary files a/static/img/eino/eino_adk_deterministic_transfer.png and /dev/null differ diff --git a/static/img/eino/eino_adk_directory_structure.png b/static/img/eino/eino_adk_directory_structure.png deleted file mode 100644 index 3bb9b51236d..00000000000 Binary files a/static/img/eino/eino_adk_directory_structure.png and /dev/null differ diff --git a/static/img/eino/eino_adk_implementation_nested_loop_sequential.png b/static/img/eino/eino_adk_implementation_nested_loop_sequential.png deleted file mode 100644 index b8e4e0ced2b..00000000000 Binary files a/static/img/eino/eino_adk_implementation_nested_loop_sequential.png and /dev/null differ diff --git a/static/img/eino/eino_adk_loop_agent.png b/static/img/eino/eino_adk_loop_agent.png deleted file mode 100644 index c0037634621..00000000000 Binary files a/static/img/eino/eino_adk_loop_agent.png and /dev/null differ diff --git 
a/static/img/eino/eino_adk_loop_definition.png b/static/img/eino/eino_adk_loop_definition.png deleted file mode 100644 index 61d49ad8595..00000000000 Binary files a/static/img/eino/eino_adk_loop_definition.png and /dev/null differ diff --git a/static/img/eino/eino_adk_loop_exit.png b/static/img/eino/eino_adk_loop_exit.png deleted file mode 100644 index b2846e59866..00000000000 Binary files a/static/img/eino/eino_adk_loop_exit.png and /dev/null differ diff --git a/static/img/eino/eino_adk_message_event.png b/static/img/eino/eino_adk_message_event.png deleted file mode 100644 index 432abe89ee7..00000000000 Binary files a/static/img/eino/eino_adk_message_event.png and /dev/null differ diff --git a/static/img/eino/eino_adk_module_architecture.png b/static/img/eino/eino_adk_module_architecture.png deleted file mode 100644 index 5f90a1cb074..00000000000 Binary files a/static/img/eino/eino_adk_module_architecture.png and /dev/null differ diff --git a/static/img/eino/eino_adk_overview_sequential.png b/static/img/eino/eino_adk_overview_sequential.png deleted file mode 100644 index ec96a47852c..00000000000 Binary files a/static/img/eino/eino_adk_overview_sequential.png and /dev/null differ diff --git a/static/img/eino/eino_adk_parallel_agent.png b/static/img/eino/eino_adk_parallel_agent.png deleted file mode 100644 index e46c2031c91..00000000000 Binary files a/static/img/eino/eino_adk_parallel_agent.png and /dev/null differ diff --git a/static/img/eino/eino_adk_parallel_controller_overview.png b/static/img/eino/eino_adk_parallel_controller_overview.png deleted file mode 100644 index 934ef4de58d..00000000000 Binary files a/static/img/eino/eino_adk_parallel_controller_overview.png and /dev/null differ diff --git a/static/img/eino/eino_adk_parallel_definition.png b/static/img/eino/eino_adk_parallel_definition.png deleted file mode 100644 index e46c2031c91..00000000000 Binary files a/static/img/eino/eino_adk_parallel_definition.png and /dev/null differ diff --git 
a/static/img/eino/eino_adk_parallel_use_case.png b/static/img/eino/eino_adk_parallel_use_case.png deleted file mode 100644 index 8d4fcf8bda2..00000000000 Binary files a/static/img/eino/eino_adk_parallel_use_case.png and /dev/null differ diff --git a/static/img/eino/eino_adk_parallel_yet_another_2.png b/static/img/eino/eino_adk_parallel_yet_another_2.png deleted file mode 100644 index 934ef4de58d..00000000000 Binary files a/static/img/eino/eino_adk_parallel_yet_another_2.png and /dev/null differ diff --git a/static/img/eino/eino_adk_plan_execute_replan.png b/static/img/eino/eino_adk_plan_execute_replan.png deleted file mode 100644 index e55f6b66e51..00000000000 Binary files a/static/img/eino/eino_adk_plan_execute_replan.png and /dev/null differ diff --git a/static/img/eino/eino_adk_preview_tree.png b/static/img/eino/eino_adk_preview_tree.png deleted file mode 100644 index 3193c0ec254..00000000000 Binary files a/static/img/eino/eino_adk_preview_tree.png and /dev/null differ diff --git a/static/img/eino/eino_adk_quick_start_agent_types.png b/static/img/eino/eino_adk_quick_start_agent_types.png deleted file mode 100644 index de96fda45b2..00000000000 Binary files a/static/img/eino/eino_adk_quick_start_agent_types.png and /dev/null differ diff --git a/static/img/eino/eino_adk_run_path.png b/static/img/eino/eino_adk_run_path.png deleted file mode 100644 index 860b928c2d8..00000000000 Binary files a/static/img/eino/eino_adk_run_path.png and /dev/null differ diff --git a/static/img/eino/eino_adk_run_path_deterministic.png b/static/img/eino/eino_adk_run_path_deterministic.png deleted file mode 100644 index 36154fa1fad..00000000000 Binary files a/static/img/eino/eino_adk_run_path_deterministic.png and /dev/null differ diff --git a/static/img/eino/eino_adk_run_path_sub_agent.png b/static/img/eino/eino_adk_run_path_sub_agent.png deleted file mode 100644 index 6e44d1197a9..00000000000 Binary files a/static/img/eino/eino_adk_run_path_sub_agent.png and /dev/null differ diff --git 
a/static/img/eino/eino_adk_self_driving.png b/static/img/eino/eino_adk_self_driving.png deleted file mode 100644 index 3193c0ec254..00000000000 Binary files a/static/img/eino/eino_adk_self_driving.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequence_diagram.png b/static/img/eino/eino_adk_sequence_diagram.png deleted file mode 100644 index 9e9d14dd810..00000000000 Binary files a/static/img/eino/eino_adk_sequence_diagram.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequential_agent.png b/static/img/eino/eino_adk_sequential_agent.png deleted file mode 100644 index 99d71862ccf..00000000000 Binary files a/static/img/eino/eino_adk_sequential_agent.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequential_controller.png b/static/img/eino/eino_adk_sequential_controller.png deleted file mode 100644 index ff9458c4f31..00000000000 Binary files a/static/img/eino/eino_adk_sequential_controller.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequential_definition.png b/static/img/eino/eino_adk_sequential_definition.png deleted file mode 100644 index 99d71862ccf..00000000000 Binary files a/static/img/eino/eino_adk_sequential_definition.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequential_quickstart.png b/static/img/eino/eino_adk_sequential_quickstart.png deleted file mode 100644 index 86c80eecb80..00000000000 Binary files a/static/img/eino/eino_adk_sequential_quickstart.png and /dev/null differ diff --git a/static/img/eino/eino_adk_sequential_with_loop.png b/static/img/eino/eino_adk_sequential_with_loop.png deleted file mode 100644 index 71feae7155d..00000000000 Binary files a/static/img/eino/eino_adk_sequential_with_loop.png and /dev/null differ diff --git a/static/img/eino/eino_adk_streaming.png b/static/img/eino/eino_adk_streaming.png deleted file mode 100644 index 1ecc91272ce..00000000000 Binary files a/static/img/eino/eino_adk_streaming.png and /dev/null differ diff --git 
a/static/img/eino/eino_adk_supervisor.png b/static/img/eino/eino_adk_supervisor.png deleted file mode 100644 index 5e10e2abb64..00000000000 Binary files a/static/img/eino/eino_adk_supervisor.png and /dev/null differ diff --git a/static/img/eino/eino_adk_supervisor_definition.png b/static/img/eino/eino_adk_supervisor_definition.png deleted file mode 100644 index b733e9ba381..00000000000 Binary files a/static/img/eino/eino_adk_supervisor_definition.png and /dev/null differ diff --git a/static/img/eino/eino_adk_supervisor_example.png b/static/img/eino/eino_adk_supervisor_example.png deleted file mode 100644 index 042240dc3c2..00000000000 Binary files a/static/img/eino/eino_adk_supervisor_example.png and /dev/null differ diff --git a/static/img/eino/eino_adk_yet_another_loop.png b/static/img/eino/eino_adk_yet_another_loop.png deleted file mode 100644 index b2846e59866..00000000000 Binary files a/static/img/eino/eino_adk_yet_another_loop.png and /dev/null differ diff --git a/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png b/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png deleted file mode 100644 index 8022d0dc902..00000000000 Binary files a/static/img/eino/eino_collaboration_agent_as_tool_thumbnail.png and /dev/null differ