1 change: 1 addition & 0 deletions .gitignore
@@ -36,6 +36,7 @@ yarn.lock
.claude

CLAUDE.md
.omc

test-results
playwright-report
6 changes: 3 additions & 3 deletions package.json
@@ -65,8 +65,8 @@
"@eslint/compat": "^1.4.1",
"@eslint/js": "9.39.2",
"@playwright/test": "^1.58.2",
"@rspack/cli": "^1.7.6",
"@rspack/core": "^1.6.8",
"@rspack/cli": "^1.7.11",
"@rspack/core": "^1.7.11",
"@swc/helpers": "^0.5.17",
"@testing-library/jest-dom": "^6.9.1",
"@testing-library/react": "^16.3.0",
@@ -107,7 +107,7 @@
"unocss": "66.5.4",
"vitest": "^4.0.18"
},
"packageManager": "pnpm@10.12.4",
"packageManager": "pnpm@10.33.0",
"sideEffects": [
"**/*.css",
"**/*.scss",
1,453 changes: 430 additions & 1,023 deletions pnpm-lock.yaml

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions pnpm-workspace.yaml
@@ -0,0 +1 @@
minimumReleaseAge: 10080
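
Reviewer note: pnpm documents `minimumReleaseAge` as a number of minutes, so `10080` works out to a seven-day quarantine for newly published versions. The sketch below only illustrates that arithmetic; `minutesToDays` and `isOldEnough` are hypothetical helpers, not pnpm APIs, and the actual resolution logic inside pnpm may differ.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// Convert a minimumReleaseAge value (assumed to be in minutes) to days.
function minutesToDays(minutes: number): number {
  return minutes / MINUTES_PER_DAY;
}

// Hypothetical check: has a version published at `publishedAt`
// aged at least `minimumReleaseAgeMinutes` by the time `now`?
function isOldEnough(publishedAt: Date, now: Date, minimumReleaseAgeMinutes: number): boolean {
  const ageMinutes = (now.getTime() - publishedAt.getTime()) / 60000;
  return ageMinutes >= minimumReleaseAgeMinutes;
}

console.log(minutesToDays(10080)); // 7
```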
6 changes: 3 additions & 3 deletions src/app/service/agent/core/compact_prompt.ts
@@ -15,9 +15,9 @@ Include the following sections in your <summary>:
- Key outputs or artifacts produced

3. **User Messages**
- List ALL user messages that are not tool results
- These are critical for understanding the user's feedback and changing intent
- Include any mid-conversation corrections or preference changes
- List ALL user messages that are not tool results, in order
- **Mid-task corrections are highest priority** — if the user interrupted an ongoing operation with a correction (e.g. "stop", "do it differently", "that's wrong"), record these verbatim. These messages are the most commonly lost in long conversations and the most damaging to skip: a resumed agent will repeat the exact mistake that was already corrected.
- Include preference changes, clarifications, and any instruction that overrides an earlier one

4. **Errors and Fixes**
- All errors encountered and how they were resolved
9 changes: 6 additions & 3 deletions src/app/service/agent/core/sub_agent_types.ts
@@ -45,7 +45,8 @@ You are a research-focused sub-agent. Your job is to search, fetch, read, and su
- Synthesize information from multiple sources when possible.
- Close tabs you no longer need to avoid clutter.
- Return structured, concise results that the parent agent can act on.
- If you cannot find the information, say so clearly rather than guessing.`,
- If you cannot find the information, say so clearly rather than guessing.
- **Distinguish confidence levels in your output**: prefix confirmed facts with the source ("Source X states…"), flag inferences explicitly ("Based on the above, it appears…"), and call out gaps ("I could not confirm…"). Do not blend all three into a single narrative — the parent agent cannot act correctly on untagged uncertainty.`,
},

page_operator: {
@@ -77,7 +78,8 @@ You are a page interaction sub-agent. Your job is to navigate web pages, interac
- Always read the page content (get_tab_content) before interacting to understand the current state.
- Verify page state after each interaction — never assume an action succeeded.
- For form filling, check that inputs exist and are visible before attempting to fill them.
- Return extracted data in a structured format.`,
- Return extracted data in a structured format.
- **Separate action from outcome**: "I clicked the submit button" and "the form was submitted successfully" are two different facts. Always verify the outcome with \`get_tab_content\` or a targeted \`execute_script\` check before reporting success. If you cannot confirm the outcome, say so.`,
},

general: {
@@ -90,7 +92,8 @@ You are a page interaction sub-agent. Your job is to navigate web pages, interac

You are a general-purpose sub-agent with access to all tools except user interaction and nested sub-agents.

**Limitations:** You cannot ask the user questions and cannot spawn nested sub-agents. If you encounter a situation that requires user input, describe the situation clearly in your response so the parent agent can handle it.`,
**Limitations:** You cannot ask the user questions and cannot spawn nested sub-agents. If you encounter a situation that requires user input, describe the situation clearly in your response so the parent agent can handle it.
- When multiple approaches are possible, briefly note the tradeoff rather than silently picking one. If an approach fails, report it as failure — do not reframe it as partial success.`,
},
};
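
Reviewer note: the researcher prompt asks for three kinds of statement (sourced facts, inferences, gaps) to stay distinguishable. One way a caller could keep them apart is a tagged union rendered with the prefixes the prompt quotes. This is a sketch under that assumption; the `Finding` type and `renderFinding` are hypothetical and do not exist in this PR.

```typescript
// Hypothetical structured form of a researcher sub-agent statement.
type Finding =
  | { kind: "confirmed"; source: string; text: string }
  | { kind: "inference"; text: string }
  | { kind: "gap"; text: string };

// Render each finding with the prefix style quoted in the prompt,
// so the parent agent can tell the confidence level from the wording.
function renderFinding(finding: Finding): string {
  switch (finding.kind) {
    case "confirmed":
      return `${finding.source} states: ${finding.text}`;
    case "inference":
      return `Based on the above, it appears ${finding.text}`;
    case "gap":
      return `I could not confirm ${finding.text}`;
  }
}

const report: Finding[] = [
  { kind: "confirmed", source: "Source X", text: "the API is rate limited" },
  { kind: "inference", text: "the limit applies per account" },
  { kind: "gap", text: "the exact quota" },
];
console.log(report.map(renderFinding).join("\n"));
```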

65 changes: 65 additions & 0 deletions src/app/service/agent/core/system_prompt.test.ts
@@ -83,6 +83,31 @@ describe("buildSystemPrompt", () => {
// Three consecutive newlines (i.e. an empty paragraph) must not appear
expect(result).not.toContain("\n\n\n");
});

// P1: sub-agent result validation section
it("P1: Sub-Agent section includes result validation guidance", () => {
const result = buildSystemPrompt({});
expect(result).toContain("### Receiving Sub-Agent Results");
expect(result).toContain("Check for issues first");
expect(result).toContain("Partial results are not successes");
expect(result).toContain("validate each result independently");
});

// P3: userscript in irreversible actions
it("P3: Safety section requires confirmation before installing userscripts", () => {
const result = buildSystemPrompt({});
expect(result).toContain("installing or modifying userscripts");
expect(result).toContain("@match");
expect(result).toContain("runs on every matching page");
});

// P6: parallel agent fallback
it("P6: Sub-Agent section includes fallback guidance for parallel tasks", () => {
const result = buildSystemPrompt({});
expect(result).toContain("fallback instructions for dependent tasks");
expect(result).toContain("If the file does not exist, report that clearly and do not proceed");
expect(result).toContain("Never assume upstream succeeded");
});
});

describe("buildSubAgentSystemPrompt", () => {
@@ -186,4 +211,44 @@ describe("buildSubAgentSystemPrompt", () => {

expect(result).toContain("## OPFS Workspace");
});

it.concurrent("sub-agent Tool Usage section: budget wording emphasizes subtask independence", () => {
const config = SUB_AGENT_TYPES.general;
const result = buildSubAgentSystemPrompt(config, allTools);

expect(result).toContain("covers this subtask only");
expect(result).toContain("independent of the parent agent");
expect(result).toContain("giving up prematurely");
// The old "limited number of tool calls. Use them wisely" wording must no longer appear
expect(result).not.toContain("Use them wisely");
});

it.concurrent("researcher role includes the confidence-level requirement", () => {
const config = SUB_AGENT_TYPES.researcher;
const tools = config.allowedTools || [];
const result = buildSubAgentSystemPrompt(config, tools);

expect(result).toContain("Distinguish confidence levels");
expect(result).toContain("Source X states");
expect(result).toContain("it appears");
expect(result).toContain("I could not confirm");
});

it.concurrent("page_operator role includes the action/outcome separation requirement", () => {
const config = SUB_AGENT_TYPES.page_operator;
const tools = config.allowedTools || [];
const result = buildSubAgentSystemPrompt(config, tools);

expect(result).toContain("Separate action from outcome");
expect(result).toContain("verify the outcome");
expect(result).toContain("get_tab_content");
});

it.concurrent("general role includes tradeoff reporting and honest failure reporting", () => {
const config = SUB_AGENT_TYPES.general;
const result = buildSubAgentSystemPrompt(config, allTools);

expect(result).toContain("briefly note the tradeoff");
expect(result).toContain("do not reframe it as partial success");
});
});
12 changes: 10 additions & 2 deletions src/app/service/agent/core/system_prompt.ts
@@ -61,7 +61,7 @@ When stopped due to failures:

const SECTION_SAFETY = `## Safety

- **Confirm before irreversible actions**: submitting forms, making purchases, deleting data, posting content.
- **Confirm before irreversible actions**: submitting forms, making purchases, deleting data, posting content, installing or modifying userscripts. For userscripts specifically, show the script's \`@match\` patterns and a summary of what it does before installing — a userscript runs on every matching page after installation and cannot be easily recalled.
- **Proceed freely on read-only actions**: navigating, reading content, taking screenshots, extracting data.
- **Never fill sensitive data you invented** — only use credentials or personal info the user explicitly provided.
- **Never bypass site security** — do not attempt to circumvent CAPTCHAs, rate limits, or access controls. If blocked, inform the user.
@@ -123,13 +123,21 @@ The sub-agent starts fresh — it has zero context from this conversation. Brief
- **Include what you already know** — relevant data, URLs, selectors, constraints. Don't make it re-discover things you already found.
- **Describe what you've ruled out** — so it doesn't repeat failed approaches.
- **Never delegate understanding** — don't write "based on the research, do X". Digest the information yourself first, then write specific instructions with concrete details (file paths, selectors, exact data to fill).
- **Include fallback instructions for dependent tasks** — if a sub-agent depends on output from a previous step (e.g. a file in OPFS), tell it explicitly what to do if that input is missing: \`"If the file does not exist, report that clearly and do not proceed."\` Never assume upstream succeeded.

### Anti-Patterns

- **Don't predict sub-agent results** — after launching, you know nothing about what it found. If the user asks before results arrive, tell them the sub-agent is still running — give status, not a guess.
- **Don't duplicate work** — if you delegated research to a sub-agent, do not also perform the same searches yourself.
- **Don't chain blindly** — if sub-agent A's result feeds into sub-agent B, wait for A to finish and digest its output before writing B's prompt.

### Receiving Sub-Agent Results

When a sub-agent returns, always inspect the result before using it:
- **Check for issues first** — if the result contains an \`Issues\` section or describes a failure, do not silently incorporate it. Decide explicitly: retry with a corrected prompt, spawn a different sub-agent, or surface the problem to the user.
- **Partial results are not successes** — if a sub-agent completed 3 of 5 requested items and noted failures for the other 2, treat it as a partial failure, not a success.
- **When merging multiple sub-agent results** — validate each result independently before combining. A merged output that contains one silent failure is harder to debug than an early report.

### Usage Notes

- **Always include a short description** (3-5 words) summarizing what the sub-agent will do.
@@ -227,7 +235,7 @@ const SUB_AGENT_SECTION_TOOL_USAGE = `## Tool Usage

Read each tool's description before calling — it defines behavior, parameters, and constraints. When a tool returns an error, read the error message and adapt — do not blindly retry.

**Tool call budget**: You have a limited number of tool calls. Use them wisely — plan before acting, combine steps when possible, and stop early if stuck.
**Tool call budget**: Your budget covers this subtask only — it is independent of the parent agent's budget. Use calls purposefully: plan before acting, combine steps where possible. Do not conserve budget by skipping verification steps or giving up prematurely. If you are genuinely stuck after trying different approaches, report the failure clearly instead of continuing to burn calls on a dead end.

### Failure Detection — Stop Early
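
Reviewer note: the new "Receiving Sub-Agent Results" section is prose-only guidance for the model. If the same rule were enforced in code, it might look like the sketch below, which treats "3 of 5 completed" as a partial failure and validates each result before merging. The `SubAgentResult` shape, `classify`, and `triage` are hypothetical names introduced for illustration; the PR defines no structured result type.

```typescript
// Hypothetical structured view of what a sub-agent reported.
interface SubAgentResult {
  requested: number; // items the parent asked for
  completed: number; // items the sub-agent reports as done
  issues: string[];  // contents of an "Issues" section, if any
}

type Verdict = "success" | "partial_failure" | "failure";

// Only a complete, issue-free result counts as success.
function classify(result: SubAgentResult): Verdict {
  if (result.issues.length === 0 && result.completed >= result.requested) {
    return "success";
  }
  return result.completed === 0 ? "failure" : "partial_failure";
}

// Validate each result independently before combining them.
function triage(results: SubAgentResult[]): { usable: SubAgentResult[]; needsAttention: SubAgentResult[] } {
  const usable = results.filter((r) => classify(r) === "success");
  const needsAttention = results.filter((r) => classify(r) !== "success");
  return { usable, needsAttention };
}

console.log(classify({ requested: 5, completed: 3, issues: ["item 4 failed", "item 5 failed"] })); // partial_failure
```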

98 changes: 98 additions & 0 deletions src/app/service/service_worker/resource.test.ts
@@ -0,0 +1,98 @@
import { initTestEnv } from "@Tests/utils";
import { ResourceService } from "./resource";
import { vi, describe, it, expect, beforeEach } from "vitest";
import type { Group } from "@Packages/message/server";
import type { IMessageQueue } from "@Packages/message/message_queue";

initTestEnv();

// mock fetch
const mockFetch = vi.fn();
vi.stubGlobal("fetch", mockFetch);

// Helper functions for creating text blobs and binary blobs
function textBlob(content: string, contentType = "text/plain") {
return new Blob([content], { type: contentType });
}

function binaryBlob(bytes: number[]) {
return new Blob([new Uint8Array(bytes)], { type: "application/octet-stream" });
}

function mockResponse(blob: Blob, status = 200, contentType?: string) {
return {
status,
blob: () => Promise.resolve(blob),
headers: new Headers(contentType ? { "content-type": contentType } : {}),
} as unknown as Response;
}

describe("ResourceService - loadByUrl", () => {
let service: ResourceService;

beforeEach(() => {
vi.clearAllMocks();
const mockGroup = {} as Group;
const mockMQ = {} as IMessageQueue;
service = new ResourceService(mockGroup, mockMQ);
// calculateHash does not affect the core logic, so mock it directly
vi.spyOn(service, "calculateHash").mockResolvedValue({
md5: "mock-md5",
sha1: "",
sha256: "",
sha384: "",
sha512: "",
});
});

it("should set content when loading a text resource (require)", async () => {
const jsCode = "console.log('hello');";
mockFetch.mockResolvedValue(mockResponse(textBlob(jsCode), 200, "application/javascript; charset=utf-8"));

const res = await service.loadByUrl("https://example.com/lib.js", "require");

expect(res.url).toBe("https://example.com/lib.js");
expect(res.content).toBeTruthy();
expect(res.contentType).toBe("application/javascript");
expect(res.base64).toBeTruthy();
expect(res.type).toBe("require");
});

it("should set content via blob.text() when loading a text resource (resource)", async () => {
const text = "plain text content";
mockFetch.mockResolvedValue(mockResponse(textBlob(text), 200, "text/plain"));

const res = await service.loadByUrl("https://example.com/data.txt", "resource");

expect(res.content).toBe(text);
expect(res.type).toBe("resource");
});

it("should leave content empty when loading a binary resource", async () => {
// Binary data containing null bytes; isText will return false
const bytes = [0x89, 0x50, 0x4e, 0x47, 0x00, 0x00, 0x00, 0x00];
mockFetch.mockResolvedValue(mockResponse(binaryBlob(bytes), 200, "image/png"));

const res = await service.loadByUrl("https://example.com/img.png", "resource");

expect(res.content).toBe("");
expect(res.base64).toBeTruthy();
expect(res.contentType).toBe("image/png");
});

it("should throw when the response status is not 200", async () => {
mockFetch.mockResolvedValue(mockResponse(textBlob(""), 404));

await expect(service.loadByUrl("https://example.com/404", "require")).rejects.toThrow(
"resource response status not 200: 404"
);
});

it("should default to application/octet-stream when there is no content-type", async () => {
mockFetch.mockResolvedValue(mockResponse(textBlob("data"), 200));

const res = await service.loadByUrl("https://example.com/noct", "resource");

expect(res.contentType).toBe("application/octet-stream");
});
});
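
Reviewer note: the assertions above imply two behaviours that the diff does not show directly: a binary payload containing NUL bytes is not treated as text, and the content type is reduced to its base media type with `application/octet-stream` as the fallback. The sketch below is one way to get those outcomes. `looksLikeText` and `baseContentType` are hypothetical stand-ins; the repository's `isText` and its header handling are not part of this diff and may work differently.

```typescript
// Hypothetical stand-in for isText: treat any NUL byte as a binary marker.
function looksLikeText(bytes: Uint8Array): boolean {
  return !bytes.includes(0x00);
}

// Hypothetical header handling: drop parameters such as "; charset=utf-8"
// and fall back to a generic binary type when the header is missing.
function baseContentType(header: string | null): string {
  if (!header) return "application/octet-stream";
  return header.split(";")[0].trim();
}

const pngLike = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x00, 0x00, 0x00, 0x00]);
const asciiHi = new Uint8Array([0x68, 0x69]); // "hi"

console.log(looksLikeText(pngLike)); // false
console.log(looksLikeText(asciiHi)); // true
console.log(baseContentType("application/javascript; charset=utf-8")); // application/javascript
```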
3 changes: 1 addition & 2 deletions src/app/service/service_worker/resource.ts
@@ -264,7 +264,7 @@ export class ResourceService {
throw new Error(`resource response status not 200: ${resp.status}`);
}
const data = await resp.blob();
const [hash, arrayBuffer, base64] = await Promise.all([
const [hash, uint8Array, base64] = await Promise.all([
this.calculateHash(data),
blobToUint8Array(data),
blobToBase64(data),
@@ -280,7 +280,6 @@ export class ResourceService {
type,
createtime: Date.now(),
};
const uint8Array = new Uint8Array(arrayBuffer);
if (isText(uint8Array)) {
if (type === "require" || type === "require-css") {
resource.content = await readBlobContent(data, contentType); // @require and @require-css are converted into code and executed, so they can be decoded
20 changes: 10 additions & 10 deletions src/pkg/utils/monaco-editor/utils.test.ts
@@ -42,7 +42,7 @@ describe("findGlobalInsertionInfo", () => {
});

it("should handle a multi-line block comment containing the global keyword", () => {
const model = createMockModel(["/* global jQuery,", " axios */", "const x = 1;"]);
const model = createMockModel(["/* global jQuery,", " moment */", "const x = 1;"]);
const result = findGlobalInsertionInfo(model);
expect(result).toEqual({ insertLine: 3, globalLine: 1 });
});
@@ -106,8 +106,8 @@ describe("updateGlobalCommentLine", () => {

it("should append a new global variable at the end of the comment", () => {
const line = "/* global jQuery */";
const result = updateGlobalCommentLine(line, "axios");
expect(result).toBe("/* global jQuery, axios */");
const result = updateGlobalCommentLine(line, "moment");
expect(result).toBe("/* global jQuery, moment */");
});

it("should add a variable after a comment that contains only the global keyword", () => {
@@ -118,8 +118,8 @@ describe("updateGlobalCommentLine", () => {

it("should handle a comment that ends with a comma", () => {
const line = "/* global jQuery, */";
const result = updateGlobalCommentLine(line, "axios");
expect(result).toBe("/* global jQuery, axios */");
const result = updateGlobalCommentLine(line, "moment");
expect(result).toBe("/* global jQuery, moment */");
});

it("should handle multiple existing global variables", () => {
@@ -130,18 +130,18 @@ describe("updateGlobalCommentLine", () => {

it("should handle extra content after the comment", () => {
const line = "/* global jQuery */ // some comment";
const result = updateGlobalCommentLine(line, "axios");
expect(result).toBe("/* global jQuery, axios */ // some comment");
const result = updateGlobalCommentLine(line, "moment");
expect(result).toBe("/* global jQuery, moment */ // some comment");
});

it("should handle a malformed comment (missing */)", () => {
const line = "/* global jQuery";
const result = updateGlobalCommentLine(line, "axios");
expect(result).toBe("/* global jQuery, axios");
const result = updateGlobalCommentLine(line, "moment");
expect(result).toBe("/* global jQuery, moment");
});

it("should avoid adding the same global variable twice", () => {
const line = "/* global jQuery, axios */";
const line = "/* global jQuery, moment */";
const result = updateGlobalCommentLine(line, "jQuery");
expect(result).toBe(line);
});
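
Reviewer note: this file only swaps `axios` for `moment` in the fixtures, but the visible cases pin down the expected behaviour fairly tightly. For reference, a minimal single-line sketch that satisfies the assertions shown in these hunks is given below. `appendGlobalName` is a hypothetical name; this is not the repository's `updateGlobalCommentLine`, whose implementation is not part of the diff.

```typescript
// Hypothetical helper: add `name` to a single-line `/* global ... */` comment.
function appendGlobalName(line: string, name: string): string {
  // head: "/* global", body: the comma-separated names,
  // tail: the closing "*/" plus anything after it (absent if the comment is unclosed).
  const match = /^(\s*\/\*\s*global\b)(.*?)(\s*\*\/.*)?$/.exec(line);
  if (!match) return line;
  const [, head, body, tail = ""] = match;

  // Drop empty entries so a trailing comma does not produce ", ,".
  const names = body
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);

  if (names.includes(name)) return line; // already declared: leave untouched
  names.push(name);
  return `${head} ${names.join(", ")}${tail}`;
}

console.log(appendGlobalName("/* global jQuery, */", "moment")); // /* global jQuery, moment */
```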