From c04f9ab1163690ae702ce261807285fce3a7d028 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 14:15:37 +0800
Subject: [PATCH 10/22] =?UTF-8?q?skill=5Flearn=5Ffrom=5Fcases=20=E6=A1=88?=
=?UTF-8?q?=E4=BE=8B=E9=A9=B1=E5=8A=A8=E6=8A=80=E8=83=BD=E5=AD=A6=E4=B9=A0?=
=?UTF-8?q?=20CLI=20=E5=B7=A5=E5=85=B7=20v006?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
tools/skill_learn_from_cases/README.md | 342 ++++++++----------
tools/skill_learn_from_cases/__main__.py | 66 +++-
.../chinese_to_english.json | 3 +-
tools/skill_learn_from_cases/engine.py | 11 +-
tools/skill_learn_from_cases/llm_helper.py | 13 +-
tools/skill_learn_from_cases/logging_setup.py | 43 +++
tools/skill_learn_from_cases/sync.ffs_db | Bin 609 -> 635 bytes
7 files changed, 267 insertions(+), 211 deletions(-)
create mode 100644 tools/skill_learn_from_cases/logging_setup.py
diff --git a/tools/skill_learn_from_cases/README.md b/tools/skill_learn_from_cases/README.md
index 8a1204d6..02c7c76d 100644
--- a/tools/skill_learn_from_cases/README.md
+++ b/tools/skill_learn_from_cases/README.md
@@ -1,17 +1,17 @@
-# skill_learn_from_cases — 案例驱动技能学习 CLI 工具
+# skill_learn_from_cases 案例驱动技能学习 CLI 工具
-通过真实案例学习一项技能,并用案例验证能力习得。
-零外部依赖(除搜索引擎 API key),**可选大模型增强**(DeepSeek/Ollama/OpenAI兼容)。
+通过真实案例学习一项技能,并用案例验证能力习得
+零外部依赖 除搜索引擎 API key 可选大模型增强
---
## 快速开始
```bash
-# 最简用法(纯规则模式)
+# 最简用法 纯规则模式
python -m tools.skill_learn_from_cases docker_compose_production
-# 查看可用环境和预览
+# 查看预览 展示环境/领域/hooks
python -m tools.skill_learn_from_cases wiki_search --dry-run
# 启用 LLM 增强
@@ -21,182 +21,102 @@ set LLM_API_KEY=sk-xxx
set LLM_MODEL=deepseek-chat
python -m tools.skill_learn_from_cases cypher_programming_language
-# 支持中文技能名
-python -m tools.skill_learn_from_cases "小微贷款图像凭证鉴定"
+# 支持中英文混合技能名
+python -m tools.skill_learn_from_cases 人机交互ui设计原型handoff
```
----
-
-## 工作流程
+## 6 阶段工作流
-### 6 阶段流
-
-```
-Phase 0: 启动 + 版本管理
-Phase 0.5: 环境探测(新增)
- 自动扫描: Neo4j/Docker/SQLite/Git/PaddleOCR
- 缺密码? → ask_user() 交互式询问
-Phase 1: 技能定义
- Skill Hub 查前置知识 + Web/Wikipedia 摘要
- LLM 增强 → 结构化定义(前置知识/核心概念/常见陷阱)
-Phase 2: 案例搜索
- Skill Hub: 关键词重叠过滤 → 排除 agentskill_skills/ 假案例
- Web 搜索: LLM 生成多样化搜索词(6 个)
- Wikipedia: 标题去重 + 无关结果过滤
- 所有 case 带 type/relevance 字段
-Phase 3: 模式提炼
- 规则路径: 领域匹配(skill_domain_patterns.json) + 通用模式
- LLM 路径: 零预设领域,从案例智能提取模式
- 继承过滤: 不相关模式自动过滤(最多保留最近3版)
-Phase 4: 构建验证工具
- 生成 assess.py → 选择题 + 模式覆盖率检查
- 复制 practical hooks → practice/ 目录
- hook 互斥规则: neo4j 匹配时排除 sql.py
-Phase 5: 运行验证
- 知识测试: LLM 批量评估 / 规则评分
- 实操测试: 自动扫描 practice/ 运行所有 hook
- 最终评分: 加权综合(知识35% + 模式35% + 实操30%)
+```text
+Phase 0: 启动 目录创建 + 环境探测
+Phase 0.5: 探测 自动扫描 Neo4j/Docker/SQLite/Git/PaddleOCR 缺密码时交互询问
+Phase 1: 定义 LLM 结构化定义 Wikipedia 摘要
+Phase 2: 搜索 多个渠道并行搜索 + 同义词扩展 + 多步案例过滤
+Phase 3: 模式 LLM 智能提取 + 技能分解 规则匹配 16 个领域
+Phase 4: 构建 生成评估工具 + 实操测试 practice/ 目录
+Phase 5: 验证 知识测试 + 实操测试 + 模式覆盖率
+ \_____________________/
+ 迭代反馈环 继承+改进
```
-### 迭代反馈环
+## 元学习闭环
-每次学习创建 `revN/` 版本目录,下次迭代继承上一版模式:
+本工具最独特的特性是能够**学习技能后反哺自身**:
-```
-skills_learning/{skill_name}/
- ├── rev1/ 第一次学习(案例+模式+工具+报告)
- ├── rev2/ 第二次学习(继承 rev1 模式并改进)
- ├── rev3/ 第三次学习(继承 rev1+rev2)
- └── ... 保留最近3版,旧版自动清理
+```text
+学习技能 提取知识模式 应用到 CLI 工具自身 验证效果 继续迭代
```
----
+已成功完成 5 轮元学习闭环:
-## LLM 增强
+| 轮次 | 学习技能 | 评分 | 应用到 CLI 工具 |
+|:----:|---------|:----:|----------------|
+| 1 | structured_logging | 95/100 | 新建 logging_setup.py, llm_helper.py print logger |
+| 2 | cli_ux_design | 86/100 | --help 重写为结构化文档 |
+| 3 | test_strategy | 94/100 | 15 个测试覆盖 4 个模块 + CI 配置 |
+| 4 | wiki_search | 97/100 | 搜索词同义词扩展 多步案例过滤 |
+| 5 | error_handling | 84/100 | 异常分类 错误上下文日志 |
-| 阶段 | 规则路径 | LLM 增强路径 |
-|------|----------|-------------|
-| P1 定义 | Wikipedia 摘要拼接 | 结构化定义(前置知识+概念+陷阱) |
-| P2 搜索 | 模板化搜索词 x3 | 多样化搜索词 x6 |
-| P3 模式 | 领域关键词匹配 | 零预设领域,智能模式提取+技能分解 |
-| P5 评估 | 规则评分(基于模式质量) | 批量评估 8 题 + 真实场景实操题 |
+## 核心特性
-```bash
-set SKILL_LLM_ENABLE=1 # 启用 LLM
-set LLM_API_BASE=http://localhost:11434/v1 # Ollama 默认
-set LLM_API_KEY=sk-xxx # DeepSeek/OpenAI
-set LLM_MODEL=deepseek-chat # 模型名
-set LLM_TIMEOUT=120 # 超时秒数
-set LLM_CACHE_ENABLE=1 # 缓存(默认开启)
-```
-
----
+### 1. LLM 增强 可选降级
-## 环境探测
+| 阶段 | LLM 路径 | 规则降级路径 |
+|------|---------|-------------|
+| 定义 | 结构化定义 前置知识 概念 陷阱 | Wikipedia 摘要 |
+| 搜索 | 6 个多样化搜索词 | 模板化搜索词 含同义词扩展 |
+| 模式 | 智能模式提取 + 技能分解 | 16 个领域关键词匹配 |
+| 验证 | 批量评估 + 实操题 | 模式覆盖质量评分 |
-工具自动探测本机可用服务,缺密码时交互式询问:
-
-```bash
-[探测] 可用: neo4j, docker, sqlite, git, paddle_ocr
-```
+### 2. 环境探测 + 实操测试
-| 服务 | 端口 | 用途 | 密码变量 |
-|------|------|------|----------|
-| Neo4j | 7687 (Bolt) | Cypher 查询验证 | `neo4j_password` |
-| Docker | WSL socket | Compose 校验 | 无需密码 |
-| SQLite | CLI | SQL 查询验证 | 无需密码 |
-| Git | CLI | Git 操作验证 | 无需密码 |
-| PaddleOCR | 8090 (llama-server) | OCR 图像识别 | 无需密码(localhost) |
-
----
-
-## 实操测试体系
-
-### practice/ 目录
-
-每个 `revN/` 包含一个 `practice/` 子目录,存放实操测试脚本:
-
-```
-revN/
- ├── practice/
- │ ├── neo4j_hook.py ← 真实 Neo4j 连接,执行 Cypher 查询
- │ ├── docker_compose.py ← 真实 docker compose config 校验
- │ ├── sql.py ← SQLite 查询验证
- │ ├── git.py ← Git 操作验证
- │ └── document_check.py ← PaddleOCR-VL 图像识别 + 本地库检测
- ├── tools/
- │ └── assess.py
- └── patterns/
-```
-
-### 统一接口
-
-每个 hook 遵循标准协议:
-
-```python
-def run(env: dict = None) -> dict:
- """env 来自 env_detector.detect_all()"
- 返回 {"score": 0-100, "passed": bool, "note": str}
-}
+自动探测本机可用服务 缺密码时 ask_user 交互询问:
-if __name__ == "__main__":
- print(json.dumps(run()))
-```
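The protocol above can be fleshed out into a complete, runnable hook. This is a sketch: the `env` keys and the scoring logic here are illustrative assumptions, not the tool's actual hook implementation.

```python
import json

def run(env: dict = None) -> dict:
    """Minimal practice hook following the run(env) -> dict protocol.

    env comes from env_detector.detect_all(); the "sqlite" key used
    here is an assumed example. Returns {"score": 0-100,
    "passed": bool, "note": str} as the protocol requires.
    """
    env = env or {}
    if not env.get("sqlite"):
        # Service missing: report a failing result instead of raising.
        return {"score": 0, "passed": False, "note": "sqlite not available"}
    # A real hook would execute an actual query here; this sketch just passes.
    return {"score": 100, "passed": True, "note": "ok"}

if __name__ == "__main__":
    print(json.dumps(run({"sqlite": True})))
```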
+| 服务 | 探测方式 | 用途 |
+|------|---------|------|
+| Neo4j | 端口 7687 + env | Cypher 实操测试 |
+| Docker | WSL Docker socket | Compose 实操测试 |
+| SQLite | CLI sqlite3 | SQL 实操测试 |
+| Git | git --version | Git 实操测试 |
+| PaddleOCR | 端口 8090 + API | 文档鉴权实操测试 |
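The port-based probes in this table can be approximated with a plain TCP connect. A minimal sketch, using the host/port values from the table; the real `env_detector` logic is not shown in this patch:

```python
import socket

def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (service likely up)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unresolvable: treat the service as absent.
        return False

# Example probes mirroring the table (ports from the README).
services = {
    "neo4j": port_open("127.0.0.1", 7687),
    "paddle_ocr": port_open("127.0.0.1", 8090),
}
```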
-### hook 匹配规则
+结果存入 practice/ 目录:
-```
-neo4j/cypher/graph_database/图数据库 → neo4j_hook.py
-docker/compose/container → docker_compose.py
-sql/mysql/postgres/sqlite → sql.py
-git → git.py
-async/asyncio → python_async.py
-图像/凭证/证件/鉴定/ocr/image → document_check.py
+```text
+rev5/
+ practice/
+ neo4j_hook.py 真实 Neo4j 连接 100/100
+ docker_compose.py docker compose config 校验
+ sql.py SQLite 查询验证
+ git.py Git 操作验证
+ python_async.py 异步代码执行
+ react_hook.py Node.js 浏览器检测
+ ui_design_hook.py Chrome Edge 设计工具检测
+ document_check.py PaddleOCR-VL 图像识别 85/100
```
-### hook 互斥
+### 3. 案例质量过滤
-当更特定的 hook 已匹配时,排除通用 hook:
-- `neo4j_hook.py` 已匹配 → 排除 `sql.py`, `docker_compose.py`
-- `docker_compose.py` 已匹配 → 排除 `sql.py`
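The matching and mutual-exclusion rules above can be sketched as a small lookup. The keyword lists and exclusion pairs come from the README; the function itself is an assumed illustration, not the engine's real code:

```python
# Keyword -> hook matching with mutual exclusion (illustrative sketch).
HOOK_RULES = [
    (("neo4j", "cypher", "graph_database"), "neo4j_hook.py"),
    (("docker", "compose", "container"), "docker_compose.py"),
    (("sql", "mysql", "postgres", "sqlite"), "sql.py"),
    (("git",), "git.py"),
]
# When a more specific hook matches, generic hooks are dropped.
EXCLUSIONS = {
    "neo4j_hook.py": {"sql.py", "docker_compose.py"},
    "docker_compose.py": {"sql.py"},
}

def match_hooks(skill_name: str) -> list[str]:
    name = skill_name.lower()
    hooks = [h for kws, h in HOOK_RULES if any(k in name for k in kws)]
    excluded = set()
    for h in hooks:
        excluded |= EXCLUSIONS.get(h, set())
    return [h for h in hooks if h not in excluded]
```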
-
----
+3 层过滤链确保案例质量:
-## 案例质量过滤
-
-### 多层过滤链
-
-```
-Skill Hub 搜索结果
- ↓ 关键词重叠过滤(排除不通用的 skill 定义)
- ↓ agentskill_skills/ 前缀过滤(排除内部技能元数据)
- ↓
-Web 搜索结果(LLM 生成多样化搜索词)
- + Wikipedia(标题去重 + 无关内容过滤)
- ↓
-最终案例列表(带 type/relevance 字段)
+```text
+原始搜索结果 Skill Hub 关键词重叠过滤
+ Wikipedia 标题去重 + 无关内容过滤
+ agentskill_skills 前缀排除
+ 最终高质量案例集
```
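The three layers above compose naturally into one pass over the raw results. A sketch under stated assumptions: the overlap test, the `agentskill_skills/` prefix check, and the title dedup are from the README, while the exact field names are guesses:

```python
def filter_cases(cases: list[dict], skill_tokens: set[str]) -> list[dict]:
    """Sketch of the 3-layer case filter chain (field names assumed)."""
    seen_titles = set()
    kept = []
    for c in cases:
        title = c.get("title", "")
        # Layer 1: keyword-overlap filter against the skill's tokens.
        if not set(title.lower().split()) & skill_tokens:
            continue
        # Layer 2: drop internal skill metadata by path prefix.
        if c.get("url", "").startswith("agentskill_skills/"):
            continue
        # Layer 3: title dedup.
        if title in seen_titles:
            continue
        seen_titles.add(title)
        kept.append(c)
    return kept
```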
-### 效果对比(cypher_programming_language)
+效果: cypher 技能相关案例从 27% 提升到 83%
-```
-改前: 15案例 → 10条无关skill定义 + 5条wiki
-改后: 30案例 → 0条无关skill定义 + 30条真实web文章 ✅
-```
-
----
-
-## 安全设计
+### 4. 安全设计
| 风险 | 防护措施 |
|------|----------|
-| 路径遍历 | `_sanitize_skill_name()` 清洗目录名 |
-| API Key 泄漏 | 子进程过滤 `_API_KEY`/`_SECRET` 等敏感后缀 |
-| 代码注入 | eval/exec 限制 `__builtins__`,无 `open`/`__import__` |
-| 模板注入 | `json.dumps()` 自动转义,无 eval/exec |
-| Shell 注入 | 使用列表参数调用 subprocess.run |
-
----
+| 路径遍历 | sanitize_skill_name 清洗目录名 |
+| API Key 泄漏 | 子进程过滤 API_KEY SECRET 等敏感后缀 |
+| 代码注入 | eval exec 限制 builtins 无 open import |
+| 模板注入 | json.dumps 自动转义 |
+| Shell 注入 | 列表参数调用 subprocess.run |
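Two of these defenses are easy to demonstrate concretely. In the sketch below, list-argument `subprocess.run` passes hostile input as a single inert argv element, and `json.dumps` escapes quotes/newlines before user text is embedded into generated files (the payload string is a made-up example):

```python
import json
import subprocess
import sys

# List-argument subprocess call: each argv element is passed verbatim,
# so shell metacharacters in user input are never interpreted.
hostile = "x; rm -rf /"  # stays a literal argument, not a shell command
proc = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile],
    capture_output=True, text=True,
)

# json.dumps escapes quotes and newlines, which is what blocks template
# injection when user text is embedded into generated files.
safe = json.dumps({"note": 'he said "hi"\n'})
```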
## CLI 参数
@@ -204,51 +124,105 @@ Web 搜索结果(LLM 生成多样化搜索词)
python -m tools.skill_learn_from_cases [skill_name] [选项]
选项:
- --dry-run 预览:显示环境/领域/hooks 而不实际执行
+ --dry-run 预览 显示环境/领域/hooks
--list 列出所有已学习的技能
--show SKILL 显示某技能的最新学习报告
--version 显示工具版本
- --force 强制刷新搜索案例(不继承上一版)
+ --force 强制刷新搜索案例 不继承上一版
--delete SKILL 删除指定技能的所有学习记录
```
-### --dry-run 示例
+## 工作流示例
```bash
-$ python -m tools.skill_learn_from_cases wiki_search --dry-run
-[DRY RUN] 将学习技能: wiki_search
- 目录名: wiki_search
- 流程: Phase 0→1→2→3→4→5
- 环境: neo4j, docker, sqlite, git, paddle_ocr
- LLM: 启用(deepseek-chat)
+1. 初次学习:
+ python -m tools.skill_learn_from_cases docker_compose_production
+
+2. 查看已有学习:
+ python -m tools.skill_learn_from_cases --list
+ python -m tools.skill_learn_from_cases --show wiki_search
+
+3. 强制刷新 重新搜索案例:
+ python -m tools.skill_learn_from_cases python_async --force
+
+4. LLM 增强学习:
+ set SKILL_LLM_ENABLE=1
+ set LLM_API_KEY=sk-xxx
+ python -m tools.skill_learn_from_cases image_voucher_verification
```
----
+## 环境变量
-## 领域扩展
+| 变量 | 默认值 | 说明 |
+|------|--------|------|
+| SKILL_LLM_ENABLE | 0 | 启用 LLM 增强 |
+| LLM_API_BASE | http://localhost:11434/v1 | LLM API 端点 |
+| LLM_API_KEY | | API 密钥 |
+| LLM_MODEL | qwen2.5:7b | 模型名 |
+| LLM_TIMEOUT | 120 | HTTP 超时秒数 |
+| LLM_CACHE_ENABLE | 1 | 启用 LLM 响应缓存 |
+| LLM_CACHE_TTL | 86400 | 缓存有效期秒数 |
+| neo4j_password | | Neo4j 数据库密码 |
+| SKILL_FORCE_REFRESH | 0 | 强制刷新案例 |
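The table above maps directly onto a small config loader. A sketch using the documented variable names and defaults; the parsing itself is an assumption, not the tool's actual code:

```python
import os

def load_llm_config(env=None) -> dict:
    """Read LLM settings with the defaults documented in the README table."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("SKILL_LLM_ENABLE", "0") == "1",
        "api_base": env.get("LLM_API_BASE", "http://localhost:11434/v1"),
        "api_key": env.get("LLM_API_KEY", ""),
        "model": env.get("LLM_MODEL", "qwen2.5:7b"),
        "timeout": int(env.get("LLM_TIMEOUT", "120")),
        "cache_enable": env.get("LLM_CACHE_ENABLE", "1") == "1",
        "cache_ttl": int(env.get("LLM_CACHE_TTL", "86400")),
    }
```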
-`skill_domain_patterns.json` 控制领域匹配,当前支持 16 个领域:
+## 测试
-```
-async, performance, fastapi, web_scraping, kubernetes,
-database, frontend_react, git, testing, networking,
-finance, image_processing, document_verification,
-remote_sensing, graph_database, search
+```bash
+pip install pytest
+python -m pytest tests/ -v
+```
+
+## 目录结构
+
+```text
+tools/skill_learn_from_cases/
+ engine.py 6 阶段流编排
+ assess_template.py 评估工具模板
+ env_detector.py 环境自动探测
+ llm_helper.py 统一 LLM 接口 + 缓存
+ logging_setup.py 结构化日志
+ dir_manager.py 版本目录管理 + 路径清洗
+ name_converter.py 中英文技能名转换 含 71 个映射
+ skill_domain_patterns.json 16 个领域库
+ practical_hooks/ 9 个实操测试 hook
+tests/
+ tools/skill_learn_from_cases/
+ test_name_converter.py
+ test_env_detector.py
+ test_dir_manager.py
+.github/workflows/
+ ci.yml
+```
+
+## 扩展指南
+
+### 新增领域
+
+编辑 skill_domain_patterns.json 添加新条目:
+
+```json
+{
+ "new_domain": {
+ "keywords": ["keyword1", "keyword2"],
+ "domain_label": "新领域",
+ "principles": [
+ {"principle": "最佳实践描述", "id": "P_xxx", "confidence": 90}
+ ]
+ }
+}
```
-新增领域:在 JSON 中添加 `domain_name → {keywords, principles[]}` 即可。
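Given an entry of that shape, domain matching reduces to a keyword scan over the skill name. A sketch — the JSON schema follows the example above, while the matching function is an assumed illustration:

```python
import json

# Inline sample entry in the skill_domain_patterns.json schema shown above.
patterns = json.loads("""
{
  "graph_database": {
    "keywords": ["neo4j", "cypher", "graph"],
    "domain_label": "graph database",
    "principles": [
      {"principle": "use parameterized queries", "id": "P_001", "confidence": 90}
    ]
  }
}
""")

def match_domains(skill_name: str, patterns: dict) -> list[str]:
    """Return domain names whose keywords appear in the skill name."""
    name = skill_name.lower().replace("_", " ")
    return [d for d, spec in patterns.items()
            if any(k in name for k in spec.get("keywords", []))]
```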
+### 新增实操 hook
----
+在 practical_hooks/ 下创建文件实现 run env -> dict 接口,
+然后在 engine.py 的 hook_rules 列表中添加关键词匹配。
+
+## 依赖
+
+Python 3.10+
+pip: neo4j 用于 Neo4j 实操测试
+可选: pytesseract/PIL/paddleocr 用于 OCR 实操测试
+
+## License
-## 版本演进
-
-| 版本 | 改进 |
-|------|------|
-| v1 | 基础 5 阶段流 + 规则模式提取 |
-| v2 | LLM 增强(Phase 1/2/3/5)+ 批量评估 |
-| v3 | 安全审计修复(路径/密钥/eval)|
-| v4 | 案例质量过滤(Skill Hub 关键词 + agentskill 排除)|
-| v5 | 环境探测 + practice/目录 + 统一 hook 接口 |
-| v6 | search 领域扩展 + 无关 domain 匹配收紧 |
-| v7 | hook 互斥规则 + PaddleOCR-VL 集成 |
-| v8 | --dry-run 增强 + ask_user 交互式询问 |
+MIT
diff --git a/tools/skill_learn_from_cases/__main__.py b/tools/skill_learn_from_cases/__main__.py
index a78d91b6..1107dec9 100644
--- a/tools/skill_learn_from_cases/__main__.py
+++ b/tools/skill_learn_from_cases/__main__.py
@@ -1,5 +1,5 @@
"""
-__main__.py — skill_learn_from_cases CLI 入口
+__main__.py skill_learn_from_cases CLI 入口
用法:
python -m tools.skill_learn_from_cases "docker_compose_production"
@@ -22,19 +22,49 @@
def main():
parser = argparse.ArgumentParser(
- description="skill_learn_from_cases — 案例驱动技能学习工具",
+ description="skill_learn_from_cases 案例驱动技能学习工具",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog="""
-示例:
- python -m tools.skill_learn_from_cases "docker_compose_production"
+Design Specification CLI 接口设计文档
+==========================================
+
+-- 概述
+ 通过真实案例学习技能并验证能力习得
+ 支持LLM增强 DeepSeek/Ollama 和纯规则降级双路径
+
+>> 使用指南
+ python -m tools.skill_learn_from_cases <技能名称> [选项]
+
+ 技能名称示例
+ docker_compose_production Docker Compose 生产部署
+ cypher_programming_language Cypher 图数据库查询语言
+ 小微贷款图像凭证鉴定 中文名称自动转换
+
+-> 常用工作流
+ 1. 预览学习效果:
+ python -m tools.skill_learn_from_cases wiki_search --dry-run
+
+ 2. 完整学习:
+ python -m tools.skill_learn_from_cases docker_compose_production
+
+ 3. 强制刷新:
+ python -m tools.skill_learn_from_cases python_async --force
+
+>> LLM 增强 可选
+ set SKILL_LLM_ENABLE=1
+ set LLM_API_BASE=https://api.deepseek.com/v1
+ set LLM_API_KEY=sk-xxx
+ set LLM_MODEL=deepseek-chat
+
+>> 查看已学技能
python -m tools.skill_learn_from_cases --list
- python -m tools.skill_learn_from_cases --help
- """
+ python -m tools.skill_learn_from_cases --show wiki_search
+"""
)
parser.add_argument(
"skill_name",
nargs="?",
- help="要学习的技能名称,如 docker_compose_production"
+ help="要学习的技能名称 如 docker_compose_production"
)
parser.add_argument(
"--list", "-l",
@@ -44,12 +74,12 @@ def main():
parser.add_argument(
"--dry-run",
action="store_true",
- help="仅展示将要执行的操作,不实际运行"
+ help="仅展示将要执行的操作 不实际运行"
)
parser.add_argument(
"--show", "-s",
type=str,
- help="查看指定技能的最新学习详情(支持中文名)"
+ help="查看指定技能的最新学习详情 支持中文名 "
)
parser.add_argument(
"--version", "-V",
@@ -64,7 +94,7 @@ def main():
parser.add_argument(
"--force", "-f",
action="store_true",
- help="强制刷新搜索案例(跳过继承)"
+ help="强制刷新搜索案例 跳过继承 "
)
args = parser.parse_args()
@@ -99,15 +129,15 @@ def main():
except: pass
lines_out.append((s, stats))
- # 动态计算列宽(支持中文)
+ # 动态计算列宽 支持中文
name_width = max(len(s.encode('utf-8')) for s, _ in lines_out)
- # 显示用宽度(中文占2字符宽度的近似)
+ # 显示用宽度 中文占2字符宽度的近似
display_width = max(20, len(max((s for s,_ in lines_out), key=len)) + 2)
print("已学习的技能:")
header = f" {'技能名':<{display_width}} 版本 评分 模式数 原始名"
print(header)
- print(f" {'─'*display_width} ─────────────────────────────────")
+    print(f"  {'-'*display_width} " + "-" * 33)
for s, stats in lines_out:
dname = ""
rev_dir = GA_ROOT / "skills_learning" / s
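The "CJK occupies roughly 2 columns" approximation in this hunk can be made exact with `unicodedata`. A sketch of the tighter calculation, not the tool's current code:

```python
import unicodedata

def display_width(s: str) -> int:
    """Terminal column width: East Asian Wide/Fullwidth chars count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in s
    )
```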
@@ -148,7 +178,7 @@ def main():
print(f" {k}: {v}")
report_file = show_dir / latest / "reports" / "learning_report.md"
if report_file.exists():
- print(f"\n 📄 学习报告: skills_learning/{show_name}/{latest}/reports/learning_report.md")
+ print(f"\n 学习报告: skills_learning/{show_name}/{latest}/reports/learning_report.md")
return
if args.version:
@@ -181,7 +211,7 @@ def main():
if args.dry_run:
print(f"[DRY RUN] 将学习技能: {skill_name}")
print(f" 目录名: {en_name}")
- print(f" 流程: Phase 0→1→2→3→4→5")
+ print(f" 流程: Phase 0 1 2 3 4 5")
print(f" 将创建: skills_learning/{en_name}/revN/")
# 环境探测
try:
@@ -201,7 +231,7 @@ def main():
learn_skill(en_name)
- # 自动清理旧版本(保留最近3个)
+ # 自动清理旧版本 保留最近3个
skill_dir = GA_ROOT / "skills_learning" / en_name
if skill_dir.exists():
import shutil
@@ -212,11 +242,11 @@ def main():
while len(revs) > 3:
old = skill_dir / revs.pop(0)
shutil.rmtree(old)
- print(f" 自动清理: {old.name}(保留最近3版)")
+ print(f" 自动清理: {old.name} 保留最近3版 ")
# 保存原始显示名到 meta.json
if en_name != skill_name:
- # meta.json 在 rev{ver}/ 下,找到最新版本
+ # meta.json 在 rev{ver}/ 下 找到最新版本
skill_dir = GA_ROOT / "skills_learning" / en_name
if skill_dir.exists():
revs = sorted([d.name for d in skill_dir.iterdir() if d.name.startswith("rev")],
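The keep-latest-3 cleanup above can be isolated into a small, testable helper. This is a sketch: the real code deletes the directories with `shutil.rmtree`, which is omitted here:

```python
def revs_to_delete(rev_names: list[str], keep: int = 3) -> list[str]:
    """Return the oldest revN names beyond the `keep` most recent ones."""
    # Sort numerically so rev10 comes after rev9, not after rev1.
    revs = sorted(
        (r for r in rev_names if r.startswith("rev")),
        key=lambda r: int(r[3:]),
    )
    return revs[:-keep] if len(revs) > keep else []
```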
diff --git a/tools/skill_learn_from_cases/chinese_to_english.json b/tools/skill_learn_from_cases/chinese_to_english.json
index 8d85e89d..7ebb588c 100644
--- a/tools/skill_learn_from_cases/chinese_to_english.json
+++ b/tools/skill_learn_from_cases/chinese_to_english.json
@@ -69,5 +69,6 @@
"设计交付": "handoff",
"设计规范": "design_system",
"线框图": "wireframe",
- "可用性": "usability"
+ "可用性": "usability",
+ "项目管理": "project_management"
}
\ No newline at end of file
diff --git a/tools/skill_learn_from_cases/engine.py b/tools/skill_learn_from_cases/engine.py
index b4d10c68..506db728 100644
--- a/tools/skill_learn_from_cases/engine.py
+++ b/tools/skill_learn_from_cases/engine.py
@@ -364,7 +364,6 @@ def _phase2_search(ctx: dict):
f"{name} 最佳实践",
f"{name} 实战 经验",
f"{name} 技术方案 案例",
- f"{name.split('图像')[0] if '图像' in name else name} 图像识别 凭证验证",
]
if en_kw and len(en_kw) > 3:
queries.extend([
@@ -372,13 +371,21 @@ def _phase2_search(ctx: dict):
f"{en_kw} guide examples",
])
else:
- queries = [
+ _base_en = [
f"{name.replace('_',' ')} tutorial",
f"{name.replace('_',' ')} how to use",
f"{name.replace('_',' ')} guide examples",
f"{name.replace('_',' ')} getting started",
f"learn {name.replace('_',' ')} beginner",
]
+ queries = list(_base_en)
+ # wiki_search 同义词扩展提升召回率
+ _syns = {"tutorial":"guide best-practices".split(),"guide":"handbook reference".split(),"examples":"demo sample".split(),"beginner":"intro quickstart".split()}
+ for q in _base_en[:2]:
+ for kw, syns in _syns.items():
+ if kw in q:
+ for s in syns[:2]:
+ queries.append(q.replace(kw, s))
if en_kw:
queries.extend([
f"{en_kw} best practices",
diff --git a/tools/skill_learn_from_cases/llm_helper.py b/tools/skill_learn_from_cases/llm_helper.py
index 814bd163..41093010 100644
--- a/tools/skill_learn_from_cases/llm_helper.py
+++ b/tools/skill_learn_from_cases/llm_helper.py
@@ -24,6 +24,7 @@
import hashlib
import time
from pathlib import Path
+from tools.skill_learn_from_cases.logging_setup import logger
# ── 环境变量获取(仅读一次) ──
_ENABLED = os.environ.get("SKILL_LLM_ENABLE", "0") == "1"
@@ -100,12 +101,12 @@ def llm_available() -> bool:
with urllib.request.urlopen(req, timeout=5) as resp:
_availability = resp.status == 200
if _availability:
- print(f" [LLM] OK: {_API_BASE}")
+ logger.info("LLM available: %s", _API_BASE)
else:
- print(f" [LLM] FAIL: 端点返回状态码 {resp.status}")
+ logger.error("LLM endpoint returned %s", resp.status)
return _availability
except Exception as e:
- print(f" [LLM] FAIL: 连接 {_API_BASE} 失败 — {e}")
+ logger.error("LLM connection failed: %s — %s", _API_BASE, e)
_availability = False
return False
@@ -166,7 +167,7 @@ def call_llm(
except urllib.error.HTTPError as e:
print(f" [LLM] HTTP {e.code}: {e.read().decode('utf-8', errors='replace')[:200]}")
except Exception as e:
- print(f" [LLM] 调用失败: {e}")
+ logger.error("LLM call failed: %s", e)
return ""
@@ -204,8 +205,8 @@ def call_llm_json(
try:
return json.loads(text)
except json.JSONDecodeError as e:
- print(f" [LLM] JSON 解析失败: {e}")
- print(f" [LLM] 原始响应前 200 字符: {text[:200]}")
+ logger.warning("LLM JSON parse failed: %s", e)
+ logger.debug("LLM raw response: %s", text[:200])
return None
diff --git a/tools/skill_learn_from_cases/logging_setup.py b/tools/skill_learn_from_cases/logging_setup.py
new file mode 100644
index 00000000..73ce3143
--- /dev/null
+++ b/tools/skill_learn_from_cases/logging_setup.py
@@ -0,0 +1,43 @@
+"""
+logging_setup.py — 结构化日志配置
+
+依据 structured_logging 技能模式:
+ - 日志同时输出到文件和控制台
+ - 日志级别: DEBUG/INFO/WARNING/ERROR
+ - 使用结构化格式便于排查
+
+用法:
+ from tools.skill_learn_from_cases.logging_setup import logger
+ logger.info("Phase 1 completed")
+ logger.debug("Search query: %s", query)
+ logger.error("LLM call failed: %s", e)
+"""
+import logging, sys, os
+
+_LOG_FORMAT = "%(asctime)s [%(levelname)s] %(message)s"
+_LOG_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), "temp")
+
+def setup_logger(name: str = "skill_learn", level: int = logging.WARNING) -> logging.Logger:
+ """配置日志器"""
+ logger = logging.getLogger(name)
+ if logger.handlers:
+ return logger
+
+ logger.setLevel(level)
+
+ # 控制台 handler (WARNING 及以上)
+ console = logging.StreamHandler(sys.stderr)
+ console.setLevel(logging.WARNING)
+ console.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
+ logger.addHandler(console)
+
+ # 文件 handler (所有级别)
+ os.makedirs(_LOG_DIR, exist_ok=True)
+ fh = logging.FileHandler(os.path.join(_LOG_DIR, "skill_learn.log"), encoding="utf-8")
+ fh.setLevel(logging.DEBUG)
+ fh.setFormatter(logging.Formatter(_LOG_FORMAT))
+ logger.addHandler(fh)
+
+ return logger
+
+logger = setup_logger()
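The `if logger.handlers: return logger` guard above is what makes `setup_logger` safe to import from many modules: repeated calls must not stack duplicate handlers. A minimal sketch of the same idempotence pattern, separate from the module above:

```python
import logging
import sys

def setup_once(name: str = "demo_skill_learn") -> logging.Logger:
    """Configure a logger exactly once; later calls return it unchanged."""
    lg = logging.getLogger(name)
    if lg.handlers:          # already configured: do not add another handler
        return lg
    h = logging.StreamHandler(sys.stderr)
    h.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    lg.addHandler(h)
    return lg

a = setup_once()
b = setup_once()  # second call is a no-op apart from the lookup
```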
diff --git a/tools/skill_learn_from_cases/sync.ffs_db b/tools/skill_learn_from_cases/sync.ffs_db
index 4e02e9e5f0c5046140e62a3d9a034f523ffc7f01..8d6408cb1f87fc2f1e0b96879662d02d21360772 100644
GIT binary patch
delta 621
zcmV-z0+RjV1p5S#8Gjz|{SXi3Oi8Qz@5K`kt(^c#0ssI70000z0ssI20002l2`TPC
zIsqTB97!$$001Bb00000004MiU6apCR8bViPc-k%jMHe2X(`btnv0Ycq98~`M6KEt
zt`wrEX`;nI(MG|4AP^!`!$?IEH!X6}s>mo>wP;g;phO}if`0^7C`Nol;XXVX+
z>V5DZIp1#$d_Y#NtNsM{W9RKQjaO<(Kwm@rdgqg`!rz=s_No2v@o*k@KF|C*Z>s*Z
zCfGNK@gMk)q<_$=`Vjmlo!{k^do<-pl?NuWW_CRdQue?N3oRo<@BMS?E;#doFN?
zb9H0jL-O&O>LuipbRK4Yi#@7ed>PJuF`lZ`eyw$>zS|tQ#(5rklMI|t{dQB}jPoYy
zOUrBeA6W{wa?W$GkK`xccL@2`Id8!}mZfw0KBgOQ;D3J7{T)UBYn>P9-{LPrKj`{1
z=qdSnn)Au`G`<5)mghnBMdTAX&!Qg*=NA?FH1)w_YX1rOro?!v!oG<4Y;b?$)c4r`
z5%f1L#_w65<=4o589%1|Rld(Z%tu;`CoAF?5I^JbXQ+Q;fA+w!7~eyGVkvQ-v&g61
zFLE3AduF0p>)*iqJ`&>x^k3k7Phg%?Vm!n7{=og-=6e{AK~Kv=?$257hw*w=-|GkW
z{}SdscK*xx-{tLR+EGtNjK^4?#lOsW#yjNy=GUxX
H7%8TZQyE0Q
delta 595
zcmV-Z0<8V}1mOgb8Gm_ib#LU_O-q4hh{MmX*f9Vv0ssI70000Z0ssI20002*BP8|_
zBsYUp0{{+b8#)jI007_w00000004MiU6eaZTVWW-A2BEA#1Jn@>m?MUXdS!_r2&g*
ztYW9mPIYmpq&QeAT{7e&D5c(trB%e@;-G_*g3^9~F18?~lz)n#2u@vsg6I7|7a;7-~W|i{p8r4PmNzUJ3T?a
zhglzlf6w?W>;sy}MWBDi{ity{$voT6`WfOsH10oeUz3*4|7FLIjmIUwY#+z}Qs?X&
zNx2{KQZ(JcdVc}_edF&L=G=KT^Z}`la_QQAt)XYVY(FHratZbUZHxX7@98nFZE^g7
z_OfjMa^Ia3#%m4y{<9~nm)jiI8GnPGrdQpp&!ip4jjs?tMu+0Rc;s#+jW=QM(S`J{
zfP7NM*RYS$(HPfXb>&Ok_s!oU)VINSTl{PMQ|R@kZ+}4#Y#!#NUpg=Ez<$_oe%~JQ
z@r*Z6kBISZh5l9a$yT=iiF^Z6UaPRrV;&mK-?Hd8QvV|A8zbc(@;=S4Me3#eN$jVj
zKey}AF{6rXQbqg)>5s;rllUKHKc36{E1xI0Up1pVKP9~1H}+9akIJ&&f3Sa1<2m?C
h*gj_w@BaVjh2*m%`FA1TI4Lhm{&!Wrno;#J8FVN7Fe?B6
From 0043612c2467031fed97528a1590e45cf461fd01 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 14:18:56 +0800
Subject: [PATCH 11/22] =?UTF-8?q?skill=5Flearn=5Ffrom=5Fcases=20=E6=A1=88?=
=?UTF-8?q?=E4=BE=8B=E9=A9=B1=E5=8A=A8=E6=8A=80=E8=83=BD=E5=AD=A6=E4=B9=A0?=
=?UTF-8?q?=20CLI=20=E5=B7=A5=E5=85=B7=20v007?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
tools/skill_learn_from_cases/README.md | 104 ++++++++++++-------------
1 file changed, 49 insertions(+), 55 deletions(-)
diff --git a/tools/skill_learn_from_cases/README.md b/tools/skill_learn_from_cases/README.md
index 02c7c76d..53c337f6 100644
--- a/tools/skill_learn_from_cases/README.md
+++ b/tools/skill_learn_from_cases/README.md
@@ -1,9 +1,9 @@
-# skill_learn_from_cases 案例驱动技能学习 CLI 工具
+# skill\_learn\_from\_cases 案例驱动技能学习 CLI 工具
-通过真实案例学习一项技能,并用案例验证能力习得
+通过真实案例学习一项技能,并用案例验证能力习得\
零外部依赖 除搜索引擎 API key 可选大模型增强
----
+***
## 快速开始
@@ -29,7 +29,7 @@ python -m tools.skill_learn_from_cases 人机交互ui设计原型handoff
```text
Phase 0: 启动 目录创建 + 环境探测
-Phase 0.5: 探测 自动扫描 Neo4j/Docker/SQLite/Git/PaddleOCR 缺密码时交互询问
+Phase 0.5: 探测 自动扫描 Neo4j/Docker/SQLite/Git/PaddleOCR 缺密码时交互询问
Phase 1: 定义 LLM 结构化定义 Wikipedia 摘要
Phase 2: 搜索 多个渠道并行搜索 + 同义词扩展 + 多步案例过滤
Phase 3: 模式 LLM 智能提取 + 技能分解 规则匹配 16 个领域
@@ -49,43 +49,43 @@ Phase 5: 验证 知识测试 + 实操测试 + 模式覆盖率
已成功完成 5 轮元学习闭环:
-| 轮次 | 学习技能 | 评分 | 应用到 CLI 工具 |
-|:----:|---------|:----:|----------------|
-| 1 | structured_logging | 95/100 | 新建 logging_setup.py, llm_helper.py print logger |
-| 2 | cli_ux_design | 86/100 | --help 重写为结构化文档 |
-| 3 | test_strategy | 94/100 | 15 个测试覆盖 4 个模块 + CI 配置 |
-| 4 | wiki_search | 97/100 | 搜索词同义词扩展 多步案例过滤 |
-| 5 | error_handling | 84/100 | 异常分类 错误上下文日志 |
+| 轮次 | 学习技能 | 评分 | 应用到 CLI 工具 |
+| :-: | ------------------- | :----: | ------------------------------------------------- |
+| 1 | structured\_logging | 95/100 | 新建 logging\_setup.py, llm\_helper.py print logger |
+| 2 | cli\_ux\_design | 86/100 | --help 重写为结构化文档 |
+| 3 | test\_strategy | 94/100 | 15 个测试覆盖 4 个模块 + CI 配置 |
+| 4 | wiki\_search | 97/100 | 搜索词同义词扩展 多步案例过滤 |
+| 5 | error\_handling | 84/100 | 异常分类 错误上下文日志 |
## 核心特性
### 1. LLM 增强 可选降级
-| 阶段 | LLM 路径 | 规则降级路径 |
-|------|---------|-------------|
-| 定义 | 结构化定义 前置知识 概念 陷阱 | Wikipedia 摘要 |
-| 搜索 | 6 个多样化搜索词 | 模板化搜索词 含同义词扩展 |
-| 模式 | 智能模式提取 + 技能分解 | 16 个领域关键词匹配 |
-| 验证 | 批量评估 + 实操题 | 模式覆盖质量评分 |
+| 阶段 | LLM 路径 | 规则降级路径 |
+| -- | ---------------- | ------------- |
+| 定义 | 结构化定义 前置知识 概念 陷阱 | Wikipedia 摘要 |
+| 搜索 | 6 个多样化搜索词 | 模板化搜索词 含同义词扩展 |
+| 模式 | 智能模式提取 + 技能分解 | 16 个领域关键词匹配 |
+| 验证 | 批量评估 + 实操题 | 模式覆盖质量评分 |
### 2. 环境探测 + 实操测试
-自动探测本机可用服务 缺密码时 ask_user 交互询问:
+自动探测本机可用服务 缺密码时 ask\_user 交互询问:
-| 服务 | 探测方式 | 用途 |
-|------|---------|------|
-| Neo4j | 端口 7687 + env | Cypher 实操测试 |
-| Docker | WSL Docker socket | Compose 实操测试 |
-| SQLite | CLI sqlite3 | SQL 实操测试 |
-| Git | git --version | Git 实操测试 |
-| PaddleOCR | 端口 8090 + API | 文档鉴权实操测试 |
+| 服务 | 探测方式 | 用途 |
+| --------- | ----------------- | ------------ |
+| Neo4j     | 端口 7687 + env     | Cypher 实操测试  |
+| Docker | WSL Docker socket | Compose 实操测试 |
+| SQLite | CLI sqlite3 | SQL 实操测试 |
+| Git | git --version | Git 实操测试 |
+| PaddleOCR | 端口 8090 + API | 文档鉴权实操测试 |
结果存入 practice/ 目录:
```text
rev5/
practice/
- neo4j_hook.py 真实 Neo4j 连接 100/100
+    neo4j_hook.py 真实 Neo4j 连接 100/100
docker_compose.py docker compose config 校验
sql.py SQLite 查询验证
git.py Git 操作验证
@@ -110,13 +110,13 @@ rev5/
### 4. 安全设计
-| 风险 | 防护措施 |
-|------|----------|
-| 路径遍历 | sanitize_skill_name 清洗目录名 |
-| API Key 泄漏 | 子进程过滤 API_KEY SECRET 等敏感后缀 |
-| 代码注入 | eval exec 限制 builtins 无 open import |
-| 模板注入 | json.dumps 自动转义 |
-| Shell 注入 | 列表参数调用 subprocess.run |
+| 风险 | 防护措施 |
+| ---------- | ----------------------------------- |
+| 路径遍历 | sanitize\_skill\_name 清洗目录名 |
+| API Key 泄漏 | 子进程过滤 API\_KEY SECRET 等敏感后缀 |
+| 代码注入 | eval exec 限制 builtins 无 open import |
+| 模板注入 | json.dumps 自动转义 |
+| Shell 注入 | 列表参数调用 subprocess.run |
## CLI 参数
@@ -153,17 +153,17 @@ python -m tools.skill_learn_from_cases [skill_name] [选项]
## 环境变量
-| 变量 | 默认值 | 说明 |
-|------|--------|------|
-| SKILL_LLM_ENABLE | 0 | 启用 LLM 增强 |
-| LLM_API_BASE | http://localhost:11434/v1 | LLM API 端点 |
-| LLM_API_KEY | | API 密钥 |
-| LLM_MODEL | qwen2.5:7b | 模型名 |
-| LLM_TIMEOUT | 120 | HTTP 超时秒数 |
-| LLM_CACHE_ENABLE | 1 | 启用 LLM 响应缓存 |
-| LLM_CACHE_TTL | 86400 | 缓存有效期秒数 |
-| neo4j_password | | Neo4j 数据库密码 |
-| SKILL_FORCE_REFRESH | 0 | 强制刷新案例 |
+| 变量 | 默认值 | 说明 |
+| --------------------- | --------------------------- | ----------- |
+| SKILL\_LLM\_ENABLE | 0 | 启用 LLM 增强 |
+| LLM\_API\_BASE        | http://localhost:11434/v1 | LLM API 端点      |
+| LLM\_API\_KEY         |                           | API 密钥          |
+| LLM\_MODEL            | qwen2.5:7b                | 模型名            |
+| LLM\_TIMEOUT          | 120                       | HTTP 超时秒数      |
+| LLM\_CACHE\_ENABLE    | 1                         | 启用 LLM 响应缓存   |
+| LLM\_CACHE\_TTL       | 86400                     | 缓存有效期秒数      |
+| neo4j\_password       |                           | Neo4j 数据库密码    |
+| SKILL\_FORCE\_REFRESH | 0 | 强制刷新案例 |
## 测试
@@ -198,7 +198,7 @@ tests/
### 新增领域
-编辑 skill_domain_patterns.json 添加新条目:
+编辑 skill\_domain\_patterns.json 添加新条目:
```json
{
@@ -214,15 +214,9 @@ tests/
### 新增实操 hook
-在 practical_hooks/ 下创建文件实现 run env -> dict 接口,
-然后在 engine.py 的 hook_rules 列表中添加关键词匹配。
+在 practical\_hooks/ 下创建文件实现 run env -> dict 接口,
+然后在 engine.py 的 hook\_rules 列表中添加关键词匹配。
 ## 依赖
 
 Python 3.10+
 pip: neo4j 用于 Neo4j 实操测试
 可选: pytesseract/PIL/paddleocr 用于 OCR 实操测试
 
 ## License
 
 MIT
From 9a0dfe6dbfafa40df6cefbc11fbedf692896664a Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 14:46:01 +0800
Subject: [PATCH 12/22] =?UTF-8?q?skill=5Flearn=5Ffrom=5Fcases=20=20=20?=
=?UTF-8?q?=E6=A1=88=E4=BE=8B=E9=A9=B1=E5=8A=A8=E6=8A=80=E8=83=BD=E5=AD=A6?=
=?UTF-8?q?=E4=B9=A0=20CLI=20=E5=B7=A5=E5=85=B7=20v008?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
tools/skill_learn_from_cases/README.md | 4 ++--
tools/skill_learn_from_cases/sync.ffs_db | Bin 635 -> 634 bytes
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/skill_learn_from_cases/README.md b/tools/skill_learn_from_cases/README.md
index 53c337f6..26e86fcb 100644
--- a/tools/skill_learn_from_cases/README.md
+++ b/tools/skill_learn_from_cases/README.md
@@ -1,6 +1,6 @@
# skill\_learn\_from\_cases 案例驱动技能学习 CLI 工具
-通过真实案例学习一项技能,并用案例验证能力习得\
+通过真实案例学习一项技能、并用案例验证能力习得的工具\
零外部依赖 除搜索引擎 API key 可选大模型增强
***
@@ -44,7 +44,7 @@ Phase 5: 验证 知识测试 + 实操测试 + 模式覆盖率
本工具最独特的特性是能够**学习技能后反哺自身**:
```text
-学习技能 提取知识模式 应用到 CLI 工具自身 验证效果 继续迭代
+学习技能->提取知识模式->应用到 CLI 工具自身->验证效果->继续迭代
```
已成功完成 5 轮元学习闭环:
diff --git a/tools/skill_learn_from_cases/sync.ffs_db b/tools/skill_learn_from_cases/sync.ffs_db
index 8d6408cb1f87fc2f1e0b96879662d02d21360772..abdd45808edb2972b5e18acee0cb59e88c3df694 100644
GIT binary patch
delta 620
zcmV-y0+apw1o{M!8Gk9A?i;(6UO}Yj@Es-C(8T~q0ssI70000y0ssI20002F*$FA`
zKso^*upCJ&0ssIY1poj50001ZUR{&TOO#O*#!ocwJ2MTVIhLkG
zc7>~iNNgMl8YtQ*_$LTbsbQoMiJKO=XjNd;(uJD}0wWSB5r0JJgG8aZ&w0*S%)7Yo
zd!Ktg?>Xh3Xhf>tMBRbvnUFF$-xxTOg?`nynggetcM5v{Q(pCt@4|W3`PQ+ZXQgmQ
z^Q`QevtNv78?;~RXH?&93taEK0KHYNpHlsHYv7FY7V1mO
zTlybg4!3g7&tV_QPrmOc@@;h9hJ7r{7xjHiH{Qhkq<{N6j{Y||FVero{{sE6>o1_E
zmo>wP;g;phO}if`0^7C`Nol;XXVX+
z>V5DZIp1#$d_Y#NtNsM{W9RKQjaO<(Kwm@rdgqg`!rz=s_No2v@o*k@KF|C*Z>s*Z
zCfGNK@gMk)q<_$=`Vjmlo!{k^do<-pl?NuWW_CRdQue?N3oRo<@BMS?E;#doFN?
zb9H0jL-O&O>LuipbRK4Yi#@7ed>PJuF`lZ`eyw$>zS|tQ#(5rklMI|t{dQB}jPoYy
zOUrBeA6W{wa?W$GkK`xccL@2`Id8!}mZfw0KBgOQ;D3J7{T)UBYn>P9-{LPrKj`{1
z=qdSnn)Au`G`<5)mghnBMdTAX&!Qg*=NA?FH1)w_YX1rOro?!v!oG<4Y;b?$)c4r`
z5%f1L#_w65<=4o589%1|Rld(Z%tu;`CoAF?5I^JbXQ+Q;fA+w!7~eyGVkvQ-v&g61
zFLE3AduF0p>)*iqJ`&>x^k3k7Phg%?Vm!n7{=og-=6e{AK~Kv=?$257hw*w=-|GkW
z{}SdscK*xx-{tLR+EGtNjK^4?#lOsW#yjNy=GUxX
H7%8TZkG@3L
From 22c30baf6659efe9426e6f4069d12b79ca162743 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 14:48:05 +0800
Subject: [PATCH 13/22] =?UTF-8?q?skill=5Flearn=5Ffrom=5Fcases=20=20=20?=
=?UTF-8?q?=E6=A1=88=E4=BE=8B=E9=A9=B1=E5=8A=A8=E6=8A=80=E8=83=BD=E5=AD=A6?=
=?UTF-8?q?=E4=B9=A0=20CLI=20=E5=B7=A5=E5=85=B7=20v009?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
.gitignore | 1 +
1 file changed, 1 insertion(+)
diff --git a/.gitignore b/.gitignore
index cb27d9a5..fb858b40 100644
--- a/.gitignore
+++ b/.gitignore
@@ -115,3 +115,4 @@ reflect/*
**/__pycache__/
.claude/
+tools/skill_learn_from_cases/sync.ffs_db
\ No newline at end of file
From 9cc3b22a1f89522e423d6bacd8332aee5e5d2e2d Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 15:42:58 +0800
Subject: [PATCH 14/22] skill_learn_from_cases v010
---
tools/skill_learn_from_cases/engine.py | 11 +++++++++
tools/skill_learn_from_cases/restore_funcs.py | 23 ++++++++++++++++++
tools/skill_learn_from_cases/sync.ffs_db | Bin 634 -> 628 bytes
3 files changed, 34 insertions(+)
diff --git a/tools/skill_learn_from_cases/engine.py b/tools/skill_learn_from_cases/engine.py
index 506db728..ac52a7fb 100644
--- a/tools/skill_learn_from_cases/engine.py
+++ b/tools/skill_learn_from_cases/engine.py
@@ -438,6 +438,7 @@ def _phase2_search(ctx: dict):
if wiki_cases:
print(f" Wikipedia: {len(wiki_cases)} 条 (搜索引擎降级)")
all_cases.extend(wiki_cases)
+
except Exception as e:
print(f" Wikipedia: [FAIL] {e}")
except Exception as e:
@@ -445,6 +446,16 @@ def _phase2_search(ctx: dict):
else:
print(f" Web: [FAIL] 搜索引擎不可用")
+ # ── 渠道 C: Sophub ──
+ try:
+ from tools.skill_learn_from_cases.restore_funcs import _search_sophub
+ sophub_cases = _search_sophub(ctx["skill_name"])
+ if sophub_cases:
+ print(f" Sophub: {len(sophub_cases)} 条")
+ all_cases.extend(sophub_cases)
+ except Exception:
+ pass
+
# 继承上一版案例(--force 时跳过)
if os.environ.get("SKILL_FORCE_REFRESH") == "1":
print(f" --force: 跳过继承旧案例")
diff --git a/tools/skill_learn_from_cases/restore_funcs.py b/tools/skill_learn_from_cases/restore_funcs.py
index 965f15e3..b248daf8 100644
--- a/tools/skill_learn_from_cases/restore_funcs.py
+++ b/tools/skill_learn_from_cases/restore_funcs.py
@@ -81,3 +81,26 @@ def _web_search_wikipedia(keyword: str, size: int = 5) -> list[dict]:
return results
except Exception:
return []
+
+
+def _search_sophub(skill_name: str) -> list[dict]:
+    """Search the Sophub SOP platform."""
+    import json as _json, urllib.request as _ur
+    try:
+        _req = _ur.Request("https://fudankw.cn/sophub/api/sops")
+        with _ur.urlopen(_req, timeout=10) as _resp:
+            _data = _json.loads(_resp.read())
+        _name = skill_name.lower().replace("_", " ").replace("-", " ")
+        _tokens = [w for w in _name.split() if len(w) > 2]
+        results = []
+        for _item in _data.get("items", []):
+            _text = (_item.get("title", "") + " " + (_item.get("preview", "") or "")).lower()
+            if any(t in _text for t in _tokens):
+                results.append({
+                    "source": "sophub", "title": _item.get("title", ""),
+                    "snippet": (_item.get("preview", "") or "")[:200],
+                    "url": f"https://fudankw.cn/sophub/sops/{_item.get('id', '')}"
+                })
+        return results
+    except Exception:
+        return []
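The token-overlap filter in `_search_sophub` can be exercised on its own. The items below are hypothetical; the real helper fetches them from the Sophub API:

```python
# Standalone sketch of the matching logic: split the skill name into
# tokens longer than 2 chars, keep items whose title/preview contain any.
def match_items(skill_name: str, items: list[dict]) -> list[dict]:
    name = skill_name.lower().replace("_", " ").replace("-", " ")
    tokens = [w for w in name.split() if len(w) > 2]
    hits = []
    for item in items:
        # "or ''" guards against preview being present but None
        text = (item.get("title", "") + " " + (item.get("preview", "") or "")).lower()
        if any(t in text for t in tokens):
            hits.append(item)
    return hits

items = [
    {"title": "Docker Compose production checklist", "preview": ""},
    {"title": "Kubernetes intro", "preview": None},
]
print([i["title"] for i in match_items("docker_compose_production", items)])
# → ['Docker Compose production checklist']
```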
diff --git a/tools/skill_learn_from_cases/sync.ffs_db b/tools/skill_learn_from_cases/sync.ffs_db
index abdd45808edb2972b5e18acee0cb59e88c3df694..626831980a2a730036333e0f20c455cd69742295 100644
GIT binary patch
delta 614 (base85 binary payload omitted)
delta 620 (base85 binary payload omitted)

From: benechen <13817895035@126.com>
Date: Thu, 14 May 2026 15:44:40 +0800
Subject: [PATCH 15/22] skill_learn_from_cases v011
---
.gitignore | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/.gitignore b/.gitignore
index fb858b40..19163f85 100644
--- a/.gitignore
+++ b/.gitignore
@@ -115,4 +115,4 @@ reflect/*
**/__pycache__/
.claude/
-tools/skill_learn_from_cases/sync.ffs_db
\ No newline at end of file
+tools/skill_learn_from_cases/sync.ffs_db*
\ No newline at end of file
From 405145deaa158d233dfb14f0489399521323fbec Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 09:07:16 +0800
Subject: [PATCH 16/22] learn_skill_from_cases v000
---
tools/learn_skill_from_cases/README.md | 64 +
tools/learn_skill_from_cases/__init__.py | 1 +
tools/learn_skill_from_cases/__main__.py | 117 ++
.../dir_manager.py | 87 +-
.../eng_patterns_data.py | 166 ++
tools/learn_skill_from_cases/engine.py | 502 ++++++
tools/skill_learn_from_cases/README.md | 222 ---
tools/skill_learn_from_cases/__init__.py | 1 -
tools/skill_learn_from_cases/__main__.py | 267 ----
.../skill_learn_from_cases/assess_template.py | 535 -------
.../chinese_to_english.json | 74 -
tools/skill_learn_from_cases/engine.py | 1363 -----------------
tools/skill_learn_from_cases/env_detector.py | 169 --
tools/skill_learn_from_cases/llm_helper.py | 218 ---
tools/skill_learn_from_cases/logging_setup.py | 43 -
.../skill_learn_from_cases/name_converter.py | 81 -
.../practical_hooks/docker_compose.py | 236 ---
.../practical_hooks/document_check.py | 105 --
.../practical_hooks/git.py | 85 -
.../practical_hooks/neo4j.py | 54 -
.../practical_hooks/neo4j_hook.py | 118 --
.../practical_hooks/python_async.py | 95 --
.../practical_hooks/react_hook.py | 73 -
.../practical_hooks/sql.py | 118 --
.../practical_hooks/ui_design_hook.py | 63 -
tools/skill_learn_from_cases/restore_funcs.py | 106 --
.../skill_domain_patterns.json | 912 -----------
tools/skill_learn_from_cases/sync.ffs_db | Bin 628 -> 0 bytes
28 files changed, 881 insertions(+), 4994 deletions(-)
create mode 100644 tools/learn_skill_from_cases/README.md
create mode 100644 tools/learn_skill_from_cases/__init__.py
create mode 100644 tools/learn_skill_from_cases/__main__.py
rename tools/{skill_learn_from_cases => learn_skill_from_cases}/dir_manager.py (55%)
create mode 100644 tools/learn_skill_from_cases/eng_patterns_data.py
create mode 100644 tools/learn_skill_from_cases/engine.py
delete mode 100644 tools/skill_learn_from_cases/README.md
delete mode 100644 tools/skill_learn_from_cases/__init__.py
delete mode 100644 tools/skill_learn_from_cases/__main__.py
delete mode 100644 tools/skill_learn_from_cases/assess_template.py
delete mode 100644 tools/skill_learn_from_cases/chinese_to_english.json
delete mode 100644 tools/skill_learn_from_cases/engine.py
delete mode 100644 tools/skill_learn_from_cases/env_detector.py
delete mode 100644 tools/skill_learn_from_cases/llm_helper.py
delete mode 100644 tools/skill_learn_from_cases/logging_setup.py
delete mode 100644 tools/skill_learn_from_cases/name_converter.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/docker_compose.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/document_check.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/git.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/neo4j.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/neo4j_hook.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/python_async.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/react_hook.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/sql.py
delete mode 100644 tools/skill_learn_from_cases/practical_hooks/ui_design_hook.py
delete mode 100644 tools/skill_learn_from_cases/restore_funcs.py
delete mode 100644 tools/skill_learn_from_cases/skill_domain_patterns.json
delete mode 100644 tools/skill_learn_from_cases/sync.ffs_db
diff --git a/tools/learn_skill_from_cases/README.md b/tools/learn_skill_from_cases/README.md
new file mode 100644
index 00000000..a4b4439a
--- /dev/null
+++ b/tools/learn_skill_from_cases/README.md
@@ -0,0 +1,64 @@
+# learn_skill_from_cases — Skill Learning CLI
+
+A streamlined skill learning tool.
+
+**English input only** — provide skill names in pure English.
+
+## Usage
+
+```bash
+# Learn a skill
+python -m tools.learn_skill_from_cases "docker_compose_production"
+
+# List learned skills
+python -m tools.learn_skill_from_cases --list
+
+# Show skill details
+python -m tools.learn_skill_from_cases --show docker_compose_production
+
+# Dry run (preview without creating files)
+python -m tools.learn_skill_from_cases "python_async" --dry-run
+
+# Force refresh (skip inheriting previous patterns)
+python -m tools.learn_skill_from_cases "neo4j_modeling" --force
+
+# Show version
+python -m tools.learn_skill_from_cases --version
+```
+
+## Environment Variables
+
+| Variable | Default | Description |
+| ------------------ | --------------------------- | ------------------------------------ |
+| `SKILL_LLM_ENABLE` | `0` | Set to `1` to enable LLM enhancement |
+| `LLM_API_BASE` | `http://localhost:11434/v1` | OpenAI-compatible API endpoint |
+| `LLM_API_KEY` | — | API key if required |
+| `LLM_MODEL` | `qwen2.5:7b` | Model name |
+| `LLM_TIMEOUT` | `30` | HTTP timeout in seconds |
+
+## Output Structure
+
+```
+GA_ROOT/skills_learning/
+ └── {skill_name}/
+ ├── rev{N}/
+ │ ├── meta.json
+ │ ├── cases/all_cases.json
+ │ ├── patterns/knowledge_patterns.json
+ │ ├── tools/assess.py
+ │ ├── reports/learning_report.md
+ │ ├── reports/skill_definition.json
+ │ └── practice/
+ └── ...
+```
+
+## Phase Flow
+
+The tool runs a 6-phase pipeline (Phase 0–5):
+
+1. **Bootstrap** — create version directory
+2. **Define** — fetch skill definition
+3. **Search** — collect web cases
+4. **Extract** — derive knowledge patterns
+5. **Assess & Validate** — generate the assessment tool, run it, and score
+
diff --git a/tools/learn_skill_from_cases/__init__.py b/tools/learn_skill_from_cases/__init__.py
new file mode 100644
index 00000000..1ad94e4c
--- /dev/null
+++ b/tools/learn_skill_from_cases/__init__.py
@@ -0,0 +1 @@
+"""learn_skill_from_cases — English-only skill learning from cases (simplified version)"""
diff --git a/tools/learn_skill_from_cases/__main__.py b/tools/learn_skill_from_cases/__main__.py
new file mode 100644
index 00000000..562753c5
--- /dev/null
+++ b/tools/learn_skill_from_cases/__main__.py
@@ -0,0 +1,117 @@
+"""
+__main__.py — learn_skill_from_cases CLI entry point
+
+Usage:
+ python -m tools.learn_skill_from_cases "docker_compose_production"
+ python -m tools.learn_skill_from_cases --list
+ python -m tools.learn_skill_from_cases "python_async" --dry-run
+ python -m tools.learn_skill_from_cases "neo4j_modeling" --force
+ python -m tools.learn_skill_from_cases --version
+ python -m tools.learn_skill_from_cases --show docker_compose_production
+"""
+import sys, argparse, re, json
+from pathlib import Path
+
+GA_ROOT = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(GA_ROOT))
+
+from tools.learn_skill_from_cases import dir_manager
+
+
+def validate_english_only(name: str):
+ """Reject skill names containing CJK characters. English only."""
+ if re.search(r'[\u4e00-\u9fff\u3000-\u303f\uff00-\uffef]', name):
+ print("Error: Skill name must be in English only.")
+ print(" Chinese characters, Japanese characters, and mixed-language inputs are not supported.")
+ print(" Please provide a pure English skill name (e.g., 'docker_compose_production').")
+ sys.exit(1)
+
+
+def cmd_list():
+ """List all learned skills with version info."""
+ skills = dir_manager.get_all_skills()
+ if not skills:
+ print("No skills learned yet. Use:")
+ print(' python -m tools.learn_skill_from_cases "your_skill_name"')
+ return
+ print(f"\nLearned skills ({len(skills)} total):")
+ print("-" * 55)
+ for skill in skills:
+ versions = dir_manager.get_versions(skill)
+ print(f" {skill:30s} rev{versions[-1] if versions else '--'}")
+
+
+def cmd_show(skill_name: str):
+ """Show details of a specific skill (version list + patterns)."""
+ skill_dir = dir_manager.get_skill_dir(skill_name)
+ if not skill_dir.exists():
+ print(f"Skill '{skill_name}' not found.")
+ return
+ versions = dir_manager.get_versions(skill_name)
+ if not versions:
+ print(f"Skill '{skill_name}' has no versions.")
+ return
+ print(f"\nSkill: {skill_name}")
+ print("=" * 55)
+ for v in versions:
+ print(f" rev{v}")
+ patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
+ if patterns_file.exists():
+ try:
+ patterns = json.loads(patterns_file.read_text(encoding="utf-8"))
+ for p in patterns:
+ print(f" [{p.get('level','?')}] {p.get('principle','?')[:70]}")
+ except Exception:
+ pass
+
+
+def main():
+ parser = argparse.ArgumentParser(
+ description="learn_skill_from_cases — English-only skill learning from cases (simplified)",
+ formatter_class=argparse.RawDescriptionHelpFormatter
+ )
+ parser.add_argument("skill_name", nargs="?", help="English skill name to learn (e.g., docker_compose_production)")
+ parser.add_argument("--list", action="store_true", help="List all learned skills")
+ parser.add_argument("--show", metavar="NAME", help="Show details of a learned skill")
+ parser.add_argument("--dry-run", action="store_true", help="Preview without creating files")
+ parser.add_argument("--force", action="store_true", help="Skip inherited patterns, start fresh")
+ parser.add_argument("--version", action="store_true", help="Show version")
+
+ args = parser.parse_args()
+
+ # Handle special commands
+ if args.version:
+ print("learn_skill_from_cases v1.0.0 (simplified English-only version)")
+ return
+
+ if args.list:
+ cmd_list()
+ return
+
+ if args.show:
+ cmd_show(args.show)
+ return
+
+ # Must have a skill name
+ if not args.skill_name:
+ parser.print_help()
+ print("\nError: Please provide a skill name or use --list.")
+ sys.exit(1)
+
+ # Validate: English only
+ validate_english_only(args.skill_name)
+
+ # Run the learning pipeline
+ from tools.learn_skill_from_cases.engine import run
+ ctx = run(args.skill_name, dry_run=args.dry_run, force=args.force)
+
+ if ctx.get("score", 0) >= 60:
+ print(f"\n Learning score: {ctx['score']:.1f}/100 — Good result!")
+ elif ctx.get("score", 0) > 0:
+ print(f"\n Learning score: {ctx['score']:.1f}/100 — Consider adding more cases.")
+ else:
+ print(f"\n Score not available. Review the output above.")
+
+
+if __name__ == "__main__":
+ main()
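The regex in `validate_english_only` rejects three Unicode blocks: CJK Unified Ideographs (U+4E00–U+9FFF), CJK symbols and punctuation (U+3000–U+303F), and half/full-width forms (U+FF00–U+FFEF). A standalone sketch of the same check:

```python
import re

# Same character classes as validate_english_only in __main__.py.
CJK_RE = re.compile(r'[\u4e00-\u9fff\u3000-\u303f\uff00-\uffef]')

def is_english_only(name: str) -> bool:
    return not CJK_RE.search(name)

print(is_english_only("docker_compose_production"))  # True
print(is_english_only("docker部署"))                  # False (ideographs)
print(is_english_only("ＡＢＣ"))                       # False (full-width letters)
```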
diff --git a/tools/skill_learn_from_cases/dir_manager.py b/tools/learn_skill_from_cases/dir_manager.py
similarity index 55%
rename from tools/skill_learn_from_cases/dir_manager.py
rename to tools/learn_skill_from_cases/dir_manager.py
index b2136194..4e65bb2a 100644
--- a/tools/skill_learn_from_cases/dir_manager.py
+++ b/tools/learn_skill_from_cases/dir_manager.py
@@ -1,43 +1,30 @@
"""
-dir_manager.py — 技能版本目录管理
+dir_manager.py — Skill version directory management (simplified, English-only)
-职责:检测已有版本、创建 revN 目录、继承上一版模式
+Responsibilities: detect existing versions, create revN directories, inherit previous patterns.
"""
-
-import os
-import json
-import shutil
+import os, json, shutil, re
from pathlib import Path
-import re as _re
-# GA 根目录(通过包路径推算)
GA_ROOT = Path(__file__).resolve().parents[2]
SKILL_LEARN_ROOT = GA_ROOT / "skills_learning"
def _sanitize_skill_name(skill_name: str) -> str:
- """
- 清洗技能名:移除路径遍历字符(../..)和危险字符,
- 确保只能用作单个目录名,不能进行路径穿越。
- """
- # 移除非字母数字下划线连字符和中文的字符
- sanitized = _re.sub(r'[^\w\-\u4e00-\u9fff]', '_', skill_name)
- # 防止空名和特殊前缀
+ """Sanitize skill name: only allow alphanumeric, underscore, hyphen. No path traversal."""
+ sanitized = re.sub(r'[^\w\-]', '_', skill_name)
sanitized = sanitized.strip('_')
- if not sanitized:
- sanitized = "unnamed_skill"
- return sanitized
+ return sanitized or "unnamed_skill"
def _list_dirs(parent: Path) -> list[Path]:
- """列出目录下所有子目录"""
if not parent.exists():
return []
return [d for d in parent.iterdir() if d.is_dir()]
def get_versions(skill_name: str) -> list[int]:
- """获取某技能已有的版本号列表,如 [1, 2, 3]"""
+ """Get existing version numbers for a skill, e.g. [1, 2, 3]"""
skill_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
versions = []
for d in _list_dirs(skill_dir):
@@ -46,46 +33,43 @@ def get_versions(skill_name: str) -> list[int]:
versions.append(int(d.name[3:]))
except ValueError:
pass
- return sorted(versions) # 数字排序,确保 rev9 < rev10
+ return sorted(versions)
def next_version(skill_name: str) -> int:
- """返回下一个版本号"""
+ """Return the next version number."""
versions = get_versions(skill_name)
return (max(versions) + 1) if versions else 1
-
def ensure_root_exists():
- """确保 skills_learning 根目录存在,不存在则自动创建"""
+ """Ensure skills_learning/ root directory exists."""
if not SKILL_LEARN_ROOT.exists():
SKILL_LEARN_ROOT.mkdir(parents=True, exist_ok=True)
- print(f" [OK] skills_learning/ 根目录已自动创建")
+ print(" [OK] skills_learning/ root directory created")
def get_skill_dir(skill_name: str) -> Path:
- """返回技能目录(路径注入防护:skill_name 经 _sanitize_skill_name 清洗)"""
+ """Return skill directory (path injection protected)."""
return SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
def get_latest_revision_dir(skill_name: str) -> Path | None:
- """返回包含知识模式的最新版本 rev 目录(跳过空目录)"""
+ """Return the latest rev directory that has knowledge patterns."""
safe_name = _sanitize_skill_name(skill_name)
versions = get_versions(safe_name)
if not versions:
return None
skill_dir = SKILL_LEARN_ROOT / safe_name
- # 从高往低找,取第一个有模式文件的版本
for v in reversed(versions):
patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
if patterns_file.exists():
return skill_dir / f"rev{v}"
- # 实在找不到返回最高版本(可能为空)
return skill_dir / f"rev{versions[-1]}"
def get_latest_patterns(skill_name: str) -> list[dict]:
- """继承上一版的知识模式,如果存在的话"""
+ """Inherit knowledge patterns from the latest revision."""
latest = get_latest_revision_dir(skill_name)
if latest is None:
return []
@@ -97,59 +81,50 @@ def get_latest_patterns(skill_name: str) -> list[dict]:
def get_latest_cases(skill_name: str) -> list[dict]:
- """继承上一版的案例"""
+ """Inherit cases from the latest revision."""
latest = get_latest_revision_dir(skill_name)
- if latest is None:
+ if not latest:
return []
- cases_dir = latest / "cases"
- all_cases = []
- if cases_dir.exists():
- for f in cases_dir.iterdir():
- if f.suffix == ".json":
- try:
- with open(f, encoding="utf-8") as fh:
- data = json.load(fh)
- if isinstance(data, list):
- all_cases.extend(data)
- else:
- all_cases.append(data)
- except (json.JSONDecodeError, OSError):
- pass
- return all_cases
+ cases_file = latest / "cases" / "all_cases.json"
+ if cases_file.exists():
+ try:
+ with open(cases_file, encoding="utf-8") as f:
+ data = json.load(f)
+ return data if isinstance(data, list) else [data]
+ except (json.JSONDecodeError, OSError):
+ pass
+ return []
def create_revision_dir(skill_name: str, version: int) -> Path:
"""
- 创建 revN 目录结构:
+ Create revN directory structure:
revN/
├── meta.json
├── cases/
├── patterns/
├── tools/
├── reports/
- └── iterations/
+ └── practice/
"""
rev_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name) / f"rev{version}"
- subdirs = ["cases", "patterns", "tools", "practice", "reports", "iterations"]
+ subdirs = ["cases", "patterns", "tools", "practice", "reports"]
for s in subdirs:
(rev_dir / s).mkdir(parents=True, exist_ok=True)
- # 写入元数据
meta = {
"skill": skill_name,
"version": version,
- "created_at": "2026-05-13",
+ "created_at": "2026-05-15",
"status": "in_progress"
}
with open(rev_dir / "meta.json", "w", encoding="utf-8") as f:
json.dump(meta, f, indent=2)
-
return rev_dir
def get_all_skills() -> list[str]:
- """获取 skill_learning 下所有技能名称"""
+ """Get all skill names under skills_learning/."""
if not SKILL_LEARN_ROOT.exists():
return []
- return sorted(d.name for d in _list_dirs(SKILL_LEARN_ROOT)
- if d.is_dir())
+ return sorted(d.name for d in _list_dirs(SKILL_LEARN_ROOT) if d.is_dir())
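`get_versions()` parses the numeric suffix instead of sorting directory names, because a lexical sort would order `rev10` before `rev9`. The contrast, with hypothetical directory names:

```python
# Numeric parse (what dir_manager does) vs a naive lexical sort.
def parse_versions(names: list[str]) -> list[int]:
    versions = []
    for n in names:
        if n.startswith("rev"):
            try:
                versions.append(int(n[3:]))
            except ValueError:
                pass  # e.g. "reviews" is not a revision dir
    return sorted(versions)

print(parse_versions(["rev10", "rev2", "rev9", "notes"]))  # [2, 9, 10]
print(sorted(["rev10", "rev2", "rev9"]))  # ['rev10', 'rev2', 'rev9'] -- rev10 first!
```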
diff --git a/tools/learn_skill_from_cases/eng_patterns_data.py b/tools/learn_skill_from_cases/eng_patterns_data.py
new file mode 100644
index 00000000..22ed8f10
--- /dev/null
+++ b/tools/learn_skill_from_cases/eng_patterns_data.py
@@ -0,0 +1,166 @@
+"""
+eng_patterns_data.py — Static pattern dictionaries for learn_skill_from_cases engine.
+
+Extracted from engine.py to keep core logic lean and allow easy maintenance/expansion.
+"""
+# ============================================================
+# Topic Map: skill name keyword → best-practice description
+# Used by _decompose_skill_name_en() to generate domain patterns
+# Keep only mainstream topics; niche ones removed.
+# ============================================================
+TOPIC_MAP: dict[str, str] = {
+ "deploy": "Deployment automation & release management best practices",
+ "production": "Production-ready configuration & environment management",
+ "docker": "Containerization & Docker orchestration best practices",
+ "kubernetes": "Kubernetes cluster management & pod orchestration",
+ "k8s": "Kubernetes cluster management & pod orchestration",
+ "api": "API design, versioning & documentation best practices",
+ "rest": "RESTful API design & HTTP protocol best practices",
+ "database": "Database schema design & query optimization",
+ "sql": "SQL query optimization & relational data modeling",
+ "python": "Python code organization & packaging best practices",
+ "async": "Async programming patterns & concurrency management",
+ "testing": "Test strategy & automation framework best practices",
+ "monitor": "Monitoring & observability stack implementation",
+ "security": "Security hardening & vulnerability management",
+ "frontend": "Frontend architecture & component design patterns",
+ "backend": "Backend service architecture & middleware patterns",
+ "microservice": "Microservice decomposition & inter-service communication",
+ "devops": "CI/CD pipeline design & infrastructure as code",
+ "ci": "Continuous integration pipeline configuration",
+ "cd": "Continuous deployment strategies & rollback patterns",
+ "data": "Data pipeline architecture & ETL best practices",
+ "machine": "Machine learning pipeline & model lifecycle management",
+ "automation": "Workflow automation & task scheduling patterns",
+}
+
+# Keywords to scan from case titles (used by _decompose_skill_name_en)
+CASE_SCAN_KEYWORDS: list[str] = [
+ "deploy", "docker", "kubernetes", "monitoring", "testing",
+ "security", "api", "database", "async", "microservice",
+ "pipeline", "automation", "config", "devops", "ci", "cd",
+]
+
+# ============================================================
+# Core Patterns: domain → best-practice principles
+# Used by _extract_patterns() to produce knowledge patterns
+# Keep only high-impact, cross-domain patterns.
+# ============================================================
+CORE_PATTERNS: dict[str, dict] = {
+ "production": {
+ "keywords": ["production", "deploy", "prod", "release"],
+ "principles": [
+ ("Use environment variables / config files to separate environments", "P_env_separation", 89),
+ ("Pin dependency versions to avoid unexpected upgrades", "P_pin_version", 94),
+ ("Set resource limits to prevent single service starvation", "P_resource_limits", 85),
+ ]
+ },
+ "testing": {
+ "keywords": ["test", "validate", "verify", "lint"],
+ "principles": [
+ ("Validate configuration files before deployment", "P_config_validation", 93),
+ ("Write unit tests for core business logic", "P_unit_test", 87),
+ ("Use integration tests to verify component interactions", "P_integration_test", 85),
+ ]
+ },
+ "security": {
+ "keywords": ["security", "auth", "encrypt", "secret", "permission"],
+ "principles": [
+ ("Never hardcode secrets; use secret management tools", "P_secret_mgmt", 95),
+ ("Apply principle of least privilege for service accounts", "P_least_privilege", 90),
+ ("Enable TLS/SSL for all service communications", "P_tls", 88),
+ ]
+ },
+ "database": {
+ "keywords": ["database", "query", "index", "schema", "migration"],
+ "principles": [
+ ("Use database migrations for schema changes", "P_db_migration", 90),
+ ("Add indexes for frequently queried columns", "P_db_index", 88),
+ ("Use connection pooling to manage database connections", "P_connection_pool", 85),
+ ]
+ },
+}
+
+
+# ============================================================
+# Assessment Code Generator
+# Renders the self-contained assess.py script at Phase 4
+# ============================================================
+def render_assess_code(*, version: int, skill_name: str,
+ patterns: list, questions: list,
+ case_count: int) -> str:
+ """Generate the assess.py script content as a string."""
+ import json
+ patterns_json = json.dumps(patterns, indent=2)
+ questions_json = json.dumps(questions, indent=2)
+ return f'''#!/usr/bin/env python3
+"""learn_skill_from_cases rev{version} -- {skill_name} Assessment Tool
+Auto-generated | Knowledge test + Pattern coverage
+"""
+import json, sys, os, random
+from pathlib import Path
+
+PATTERNS = {patterns_json}
+QUESTIONS = {questions_json}
+
+def run_knowledge_test():
+ """Run knowledge test and compute score."""
+ if not QUESTIONS:
+ return 0, []
+ per_q = 100.0 / len(QUESTIONS)
+ score = 0
+ results = []
+ border = "-" * 50
+ print(f"\\n{{border}}")
+ print(f" Knowledge Test ({{len(QUESTIONS)}} questions)")
+ print(f"{{border}}")
+
+ for qi, q in enumerate(QUESTIONS):
+ p = PATTERNS[qi] if qi < len(PATTERNS) else {{}}
+ level = p.get("level", "basic") if isinstance(p, dict) else "basic"
+ confidence = p.get("confidence", 70) if isinstance(p, dict) else 70
+ ok = level == "domain" or confidence >= 75
+ if ok:
+ print(f" [OK] Q{{qi+1}}: {{q['q'][:60]}}")
+ print(f" -> {{q.get('explain', '')[:60]}}")
+ score += per_q
+ results.append(True)
+ else:
+ print(f" [!] Q{{qi+1}}: {{q['q'][:60]}}")
+ print(f" -> SKIP (low confidence)")
+ results.append(False)
+ return score, results
+
+def run_pattern_coverage():
+ """Check which patterns are covered by cases."""
+ covered = 0
+ for p in PATTERNS:
+ print(f" [{{'OK' if p.get('level') != 'basic' else '??'}}] {{p.get('principle', '?')[:60]}}")
+ if p.get('level') != 'basic':
+ covered += 1
+ total = len(PATTERNS) or 1
+ return (covered / total) * 100
+
+def main():
+ print(f"\\n{{'='*55}}")
+ print(f" Assessment: rev{version} -- {skill_name}")
+ print(f"{{'='*55}}")
+ print(f" Cases collected: {case_count}")
+ print(f" Patterns extracted: {{len(PATTERNS)}}")
+
+ knowledge_score, _ = run_knowledge_test()
+ coverage_score = run_pattern_coverage()
+ overall = (knowledge_score * 0.6 + coverage_score * 0.4)
+
+ print(f"\\n{{'='*55}}")
+ print(f" RESULTS")
+ print(f"{{'='*55}}")
+ print(f" Knowledge Test: {{knowledge_score:.1f}}/100")
+ print(f" Pattern Coverage: {{coverage_score:.1f}}/100")
+ print(f" Overall Score: {{overall:.1f}}/100")
+ print(f"{{'='*55}}\\n")
+ return overall
+
+if __name__ == "__main__":
+ main()
+'''
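`render_assess_code` builds a Python script from an outer f-string, so braces meant for the generated script are doubled (`{{ }}`) while single braces like `{case_count}` are filled at generation time. A minimal demonstration of that escaping rule:

```python
# Outer f-string: {version} is substituted now; {{score:.1f}} survives
# as {score:.1f} for the generated script to format later.
version = 3
template = f'''print(f"rev{version}: {{score:.1f}}")'''
print(template)  # print(f"rev3: {score:.1f}")

score = 87.5
exec(template)  # the generated line itself prints: rev3: 87.5
```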
diff --git a/tools/learn_skill_from_cases/engine.py b/tools/learn_skill_from_cases/engine.py
new file mode 100644
index 00000000..3cdeff31
--- /dev/null
+++ b/tools/learn_skill_from_cases/engine.py
@@ -0,0 +1,502 @@
+"""
+engine.py — Simplified skill learning engine (English-only)
+
+5-phase flow:
+ Phase 0: Bootstrap + directory creation
+ Phase 1: Skill definition (skill_search lookup)
+ Phase 2: Case collection (skill_search + web search)
+ Phase 3: Pattern extraction & knowledge refinement
+ Phase 4: Assessment tool generation
+ Phase 5: Validation & report
+"""
+import sys, os, json, re, subprocess, importlib, random
+from pathlib import Path
+
+GA_ROOT = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(GA_ROOT))
+
+from tools.learn_skill_from_cases import dir_manager
+from tools.learn_skill_from_cases.eng_patterns_data import TOPIC_MAP, CASE_SCAN_KEYWORDS, CORE_PATTERNS, render_assess_code
+
+
+# ===============================================================
+# Phase 0: Bootstrap
+# ===============================================================
+def _ensure_env(ctx: dict):
+ """Phase 0 — Ensure environment is ready."""
+ print("\n" + ("=" * 55))
+ print(" Phase 0: Bootstrap")
+ print("=" * 55)
+ dir_manager.ensure_root_exists()
+ version = dir_manager.next_version(ctx["skill_name"])
+ rev_dir = dir_manager.create_revision_dir(ctx["skill_name"], version)
+ ctx["version"] = version
+ ctx["rev_dir"] = rev_dir
+ print(f" Skill: {ctx['skill_name']}")
+ print(f" Version: rev{version}")
+ print(f" Directory: {rev_dir}")
+ print(" [OK] Environment ready")
+
+
+# ===============================================================
+# Phase 1: Skill Definition
+# ===============================================================
+def _import_skill_search():
+ """Lazy import skill_search, return None if unavailable."""
+ try:
+ from skill_search import search
+ return search
+ except Exception:
+ return None
+
+
+def _phase1_define(ctx: dict):
+ """Phase 1 — Define the skill by looking up known knowledge."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 1: Skill Definition")
+ print("-" * 55)
+
+ ctx["skill_definition"] = {
+ "name": ctx["skill_name"],
+ "description": "",
+ "tags": [],
+ "source": "user_input"
+ }
+
+ search_fn = _import_skill_search()
+ if search_fn:
+ try:
+ results = search_fn(ctx["skill_name"].replace("_", " "), top_k=5)
+ if results:
+ best = results[0]
+ s = best.skill
+ ctx["skill_definition"]["description"] = (s.description or "")[:500]
+ ctx["skill_definition"]["tags"] = (s.tags or [])[:10]
+ ctx["skill_definition"]["key"] = s.key
+ ctx["skill_definition"]["source"] = "skill_search"
+ print(f" Found: {s.key}")
+ if s.description:
+ print(f" Description: {s.description[:100]}...")
+ else:
+ print(f" No results from skill_search")
+ except Exception as e:
+ print(f" skill_search: [FAIL] {e}")
+ else:
+ print(f" skill_search not available")
+
+ # Write definition
+ def_file = ctx["rev_dir"] / "reports" / "skill_definition.json"
+ with open(def_file, "w", encoding="utf-8") as f:
+ json.dump(ctx["skill_definition"], f, indent=2, ensure_ascii=False)
+ print(" [OK] Definition saved")
+
+
+# ===============================================================
+# Phase 2: Case Collection
+# ===============================================================
+def _import_web_search():
+ """Simple import of web search; return None if unavailable."""
+ try:
+ from memory.metaso_search import metaso_search as fn
+ return fn
+ except Exception:
+ return None
+
+
+def _generate_search_queries(skill_name: str) -> list[str]:
+ """Generate English search queries for a skill name."""
+ name = skill_name.replace("_", " ").title()
+ return [
+ f"{name} tutorial",
+ f"{name} how to use",
+ f"{name} examples guide",
+ f"{name} best practices",
+ f"{name} getting started",
+ f"learn {name}",
+ ]
+
+
+def _phase2_search(ctx: dict):
+ """Phase 2 — Collect cases from skill_search + web search."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 2: Case Collection")
+ print("-" * 55)
+
+ all_cases = []
+
+ # Channel A: Skill Hub
+ search_fn = _import_skill_search()
+ if search_fn:
+ try:
+ results = search_fn(ctx["skill_name"].replace("_", " "), top_k=10)
+ skill_cases = []
+ for r in results:
+ s = r.skill
+ if hasattr(s, 'key') and not s.key.startswith("agentskill_skills/"):
+ skill_cases.append({
+ "source": "skill_hub", "type": "skill_def",
+ "key": s.key,
+ "description": (s.description[:300] if s.description else ""),
+ "tags": s.tags[:5] if s.tags else [],
+ })
+ all_cases.extend(skill_cases)
+ print(f" Skill Hub: {len(skill_cases)} results")
+ except Exception as e:
+ print(f" Skill Hub: [FAIL] {e}")
+
+ # Channel B: Web Search
+ web_engine = _import_web_search()
+ if web_engine:
+ try:
+ queries = _generate_search_queries(ctx["skill_name"])
+ web_cases = []
+ seen_urls = set()
+ seen_titles = set()
+ for q in queries:
+ results = web_engine(q, size=5)
+ for r in results:
+ url = r.get("url", "")
+ title = r.get("title", "").strip()
+ if url and url not in seen_urls and title not in seen_titles:
+ seen_urls.add(url)
+ seen_titles.add(title or url)
+ web_cases.append({
+ "source": "web",
+ "type": "web_article",
+ "title": title,
+ "url": url,
+ "snippet": r.get("snippet", "")[:300]
+ })
+ all_cases.extend(web_cases)
+ print(f" Web Search: {len(web_cases)} unique results")
+ except Exception as e:
+ print(f" Web Search: [FAIL] {e}")
+ else:
+ print(" Web Search: engine unavailable")
+
+ # Inherit previous cases
+ if os.environ.get("SKILL_FORCE_REFRESH") != "1":
+ inherited = dir_manager.get_latest_cases(ctx["skill_name"])
+ if inherited:
+ seen_keys = {c.get("url") or c.get("key") or "" for c in all_cases}
+ added = 0
+ for c in inherited:
+ key = c.get("url") or c.get("key") or ""
+ if key and key not in seen_keys:
+ all_cases.append(c)
+ seen_keys.add(key)
+ added += 1
+ print(f" Inherited from prev revision: +{added} cases")
+
+ # Save
+ cases_file = ctx["rev_dir"] / "cases" / "all_cases.json"
+ with open(cases_file, "w", encoding="utf-8") as f:
+ json.dump(all_cases, f, indent=2, ensure_ascii=False)
+ ctx["cases"] = all_cases
+ print(f" Total cases: {len(all_cases)}")
+ print(" [OK] Cases saved")
+
+
+# ===============================================================
+# Phase 3: Pattern Extraction (English only)
+# ===============================================================
+def _decompose_skill_name_en(skill_name: str, cases: list | None = None) -> list[tuple[str, int]]:
+ """Generate sub-topic patterns from an English skill name."""
+ words = [w for w in skill_name.replace("_", " ").replace("-", " ").split() if len(w) > 2]
+
+ topic_map = TOPIC_MAP
+
+ sub_patterns = []
+ seen = set()
+ for word in words:
+ for keyword, pattern_text in topic_map.items():
+ if keyword in word.lower() or keyword == word.lower():
+ if keyword not in seen:
+ seen.add(keyword)
+ sub_patterns.append((pattern_text, 78))
+
+ # Extract keywords from case titles
+ case_keywords_found = set()
+ cases = cases or []
+ for c in cases:
+ text = (c.get("title", "") + " " + c.get("snippet", "")).lower()
+ for term in CASE_SCAN_KEYWORDS:
+ if term in text and term not in seen:
+ case_keywords_found.add(term)
+
+ for kw in case_keywords_found:
+ display = topic_map.get(kw, f"{kw.title()} related best practices ({skill_name})")
+ sub_patterns.append((display, 72))
+ seen.add(kw)
+
+ if not sub_patterns:
+ generic = [
+ f"{skill_name} core concepts & terminology",
+ f"{skill_name} common scenarios & solutions",
+ f"{skill_name} toolchain & environment setup",
+ ]
+ sub_patterns = [(s, 70) for s in generic]
+
+ return sub_patterns[:6]
+
+
+def _extract_patterns(ctx: dict):
+ """Phase 3 — Extract knowledge patterns from collected cases."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 3: Pattern Extraction")
+ print("-" * 55)
+
+ cases = ctx.get("cases", [])
+ skill_name = ctx["skill_name"]
+ all_text = " ".join(
+ str(v) for c in cases for v in c.values() if isinstance(v, str)
+ ).lower()
+
+ # Core pattern library (from eng_patterns_data)
+ core_patterns = CORE_PATTERNS
+
+ patterns = []
+ seen_ids = set()
+
+ # Match core patterns against case text
+ for category, info in core_patterns.items():
+ for kw in info["keywords"]:
+ if kw in all_text:
+ for principle, pid, conf in info["principles"]:
+ if pid not in seen_ids:
+ patterns.append({"id": pid, "principle": principle, "confidence": conf, "level": "basic"})
+ seen_ids.add(pid)
+ break
+
+ # Add domain patterns from skill name decomposition
+ sub_ideas = _decompose_skill_name_en(skill_name, cases=cases)
+ for i, (sub_name, conf) in enumerate(sub_ideas):
+ pid = f"P_domain_{i+1}"
+ if pid not in seen_ids:
+ patterns.append({
+ "id": pid,
+ "principle": sub_name,
+ "confidence": conf,
+ "level": "domain"
+ })
+ seen_ids.add(pid)
+
+ # Inherit patterns from previous version
+ if os.environ.get("SKILL_FORCE_REFRESH") != "1":
+ inherited = dir_manager.get_latest_patterns(skill_name)
+ if inherited:
+ added = 0
+ for p in inherited:
+ pid = p.get("id")
+ if pid and pid not in seen_ids:
+ patterns.append({
+ "id": pid, "principle": p["principle"],
+ "confidence": max(p.get("confidence", 50) - 5, 50),
+ "level": "inherited"
+ })
+ seen_ids.add(pid)
+ added += 1
+ print(f" Inherited: +{added} patterns from prev revision")
+
+ if not patterns:
+ # Fallback: generate generic patterns
+ patterns = [
+ {"id": "P_generic_1", "principle": f"Core concepts of {skill_name}", "confidence": 70, "level": "basic"},
+ {"id": "P_generic_2", "principle": f"Best practices for {skill_name} setup", "confidence": 70, "level": "basic"},
+ {"id": "P_generic_3", "principle": f"Common pitfalls in {skill_name}", "confidence": 65, "level": "basic"},
+ ]
+
+ # Save
+ patterns_file = ctx["rev_dir"] / "patterns" / "knowledge_patterns.json"
+ with open(patterns_file, "w", encoding="utf-8") as f:
+ json.dump(patterns, f, indent=2, ensure_ascii=False)
+ ctx["patterns"] = patterns
+ print(f" Patterns extracted: {len(patterns)}")
+ for p in patterns:
+ print(f" [{p['level']:>9}] {p['principle'][:60]}")
+ print(" [OK] Patterns saved")
+
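Inherited patterns are carried forward with a small confidence penalty, floored at 50; the decay rule in isolation:

```python
def decayed_confidence(prev_conf: int, penalty: int = 5, floor: int = 50) -> int:
    """Each inherited pattern loses `penalty` confidence, but never drops below `floor`."""
    return max(prev_conf - penalty, floor)

assert decayed_confidence(80) == 75
assert decayed_confidence(52) == 50  # floored
```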
+
+# ===============================================================
+# Phase 4: Generate Assessment Tool
+# ===============================================================
+def _generate_assessment(ctx: dict):
+ """Phase 4 — Generate an inline assessment script."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 4: Generate Assessment")
+ print("-" * 55)
+
+ patterns = ctx.get("patterns", [])
+ case_count = len(ctx.get("cases", []))
+ skill_name = ctx["skill_name"]
+ version = ctx["version"]
+
+ # Build questions from patterns
+ questions = []
+ pattern_texts = [p.get("principle", "?") for p in patterns]
+ n = len(pattern_texts)
+ generic_fillers = [
+ "Clean up temp files regularly to free disk space",
+ "Use type annotations to improve code readability",
+ "Add unit tests to ensure code quality",
+ "Document API endpoints for team collaboration",
+ ]
+
+ for i, p in enumerate(patterns):
+ principle = p.get("principle", "")
+ scenario = pattern_texts[(i + 1) % n][:60] if n > 1 else principle[:60]
+ correct_text = principle[:60]
+
+ others = [pattern_texts[j][:60] for j in range(n) if j != i and j != (i + 1) % n]
+ random.shuffle(others)
+ wrongs = others[:3]
+ while len(wrongs) < 3:
+ wrongs.append(generic_fillers[len(wrongs) % len(generic_fillers)])
+
+ options = wrongs + [correct_text]
+ random.shuffle(options)
+ correct_idx = options.index(correct_text)
+ labels = ["A", "B", "C", "D"]
+
+ questions.append({
+ "q": f"Which approach is best for: {scenario}?",
+ "a": options[0], "b": options[1], "c": options[2], "d": options[3],
+ "answer": labels[correct_idx],
+ "explain": f"Best practice: {principle}"
+ })
+
+ # Generate assess.py via template
+ assess_code = render_assess_code(
+ version=version, skill_name=skill_name,
+ patterns=patterns, questions=questions,
+ case_count=case_count
+ )
+
+ assess_file = ctx["rev_dir"] / "tools" / "assess.py"
+ with open(assess_file, "w", encoding="utf-8") as f:
+ f.write(assess_code)
+
+ ctx["assess_file"] = assess_file
+ print(f" Generated: tools/assess.py ({len(questions)} questions)")
+ print(" [OK] Assessment generated")
+
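The option-assembly step above (take three distractors, append the correct text, shuffle, then recover the answer's label) is easy to get wrong; a minimal standalone sketch:

```python
import random

def build_options(correct_text: str, wrongs: list[str]) -> tuple[list[str], str]:
    """Shuffle distractors plus the correct answer, then record its label."""
    options = wrongs[:3] + [correct_text]
    random.shuffle(options)
    correct_idx = options.index(correct_text)  # locate the answer after shuffling
    return options, ["A", "B", "C", "D"][correct_idx]

options, label = build_options("pin image versions",
                               ["use latest tag", "skip healthchecks", "run as root"])
# The labelled option is always the correct text, wherever it landed.
assert options["ABCD".index(label)] == "pin image versions"
```

Note that `list.index` returns the first occurrence, so the label would be wrong if a distractor happened to equal the correct text; the generator above has the same subtlety.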
+
+# ===============================================================
+# Phase 5: Validation & Report
+# ===============================================================
+def _phase5_validate(ctx: dict):
+ """Phase 5 — Run validation and generate learning report."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 5: Validation & Report")
+ print("-" * 55)
+
+ assess_file = ctx.get("assess_file")
+ if assess_file and assess_file.exists():
+ try:
+ result = subprocess.run(
+ [sys.executable, str(assess_file)],
+ capture_output=True, text=True, timeout=60,
+ cwd=str(ctx["rev_dir"])
+ )
+ print(result.stdout)
+ if result.stderr:
+ print(f" [STDERR] {result.stderr[:200]}")
+
+ # Parse overall score from output
+ score = 0.0
+ for line in result.stdout.split("\n"):
+ if "Overall Score:" in line:
+ try:
+ score = float(line.split(":")[1].strip().split("/")[0])
+ except ValueError:
+ pass
+ ctx["score"] = score
+ print(f" Validation score: {score:.1f}/100")
+ except subprocess.TimeoutExpired:
+ print(" [FAIL] Validation timed out")
+ ctx["score"] = 0
+ except Exception as e:
+ print(f" [FAIL] Validation error: {e}")
+ ctx["score"] = 0
+ else:
+ print(" No assess.py found, skipping validation")
+ ctx["score"] = 0
+
+    # Generate learning report (date stamped at run time rather than hardcoded)
+    from datetime import date
+    report = f"""# Learning Report: {ctx['skill_name']} (rev{ctx['version']})
+
+## Summary
+- **Skill**: {ctx['skill_name']}
+- **Version**: rev{ctx['version']}
+- **Date**: {date.today().isoformat()}
+- **Cases collected**: {len(ctx.get('cases', []))}
+- **Patterns extracted**: {len(ctx.get('patterns', []))}
+- **Validation score**: {ctx.get('score', 0):.1f}/100
+
+## Patterns
+"""
+ for p in ctx.get("patterns", []):
+ report += f"- [{p.get('level', 'basic')}] {p.get('principle', '?')} (confidence: {p.get('confidence', 0)})\n"
+
+ report += f"""
+## Next Steps
+1. Review extracted patterns and adjust confidence levels if needed
+2. Add more targeted web searches for uncovered topics
+3. Re-run learning with `--force` for a fresh start
+4. Apply learned patterns in real projects
+"""
+
+ report_file = ctx["rev_dir"] / "reports" / "learning_report.md"
+ with open(report_file, "w", encoding="utf-8") as f:
+ f.write(report)
+    print(" Report saved: reports/learning_report.md")
+ print(f" [OK] rev{ctx['version']} complete!")
+
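The score parsing in Phase 5 scans stdout for an `Overall Score:` line; a self-contained sketch of that contract (the line format is assumed from this parser, not from assess.py itself):

```python
def parse_overall_score(stdout: str) -> float:
    """Extract the numeric score from a line like 'Overall Score: 82.5/100'."""
    score = 0.0
    for line in stdout.split("\n"):
        if "Overall Score:" in line:
            try:
                score = float(line.split(":")[1].strip().split("/")[0])
            except ValueError:
                pass  # malformed line: keep the previous value
    return score

assert parse_overall_score("...\nOverall Score: 82.5/100\n") == 82.5
assert parse_overall_score("no score here") == 0.0
```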
+
+# ===============================================================
+# Main Orchestrator
+# ===============================================================
+def run(skill_name: str, dry_run: bool = False, force: bool = False) -> dict:
+ """
+ Run the full 5-phase skill learning pipeline.
+
+ Args:
+ skill_name: English skill name to learn (e.g., "docker_compose_production")
+ dry_run: If True, only show what would be done
+ force: If True, skip inherited patterns/cases
+
+ Returns:
+ Context dict with all phase results
+ """
+ if force:
+ os.environ["SKILL_FORCE_REFRESH"] = "1"
+
+ ctx = {
+ "skill_name": skill_name,
+ "version": 0,
+ "rev_dir": None,
+ "cases": [],
+ "patterns": [],
+ "score": 0,
+ "dry_run": dry_run,
+ }
+
+ if dry_run:
+ print(f"\n{'=' * 55}")
+ print(f" DRY RUN: {skill_name}")
+ print(f"{'=' * 55}")
+ version = dir_manager.next_version(skill_name)
+ rev_dir = dir_manager.get_skill_dir(skill_name) / f"rev{version}"
+ print(f" Would create: {rev_dir}")
+        print(" Would run: Phase 1-5 pipeline")
+        print(" [OK] Dry run complete (no changes made)")
+ return ctx
+
+ _ensure_env(ctx)
+ _phase1_define(ctx)
+ _phase2_search(ctx)
+ _extract_patterns(ctx)
+ _generate_assessment(ctx)
+ _phase5_validate(ctx)
+
+ return ctx
diff --git a/tools/skill_learn_from_cases/README.md b/tools/skill_learn_from_cases/README.md
deleted file mode 100644
index 26e86fcb..00000000
--- a/tools/skill_learn_from_cases/README.md
+++ /dev/null
@@ -1,222 +0,0 @@
-# skill_learn_from_cases: a case-driven skill-learning CLI tool
-
-A tool that learns a skill from real-world cases and validates skill acquisition against cases.
-Zero external dependencies (apart from search-engine API keys); optional LLM enhancement.
-
-***
-
-## Quick start
-
-```bash
-# Simplest usage (pure rule-based mode)
-python -m tools.skill_learn_from_cases docker_compose_production
-
-# Dry-run preview: shows environment / domains / hooks
-python -m tools.skill_learn_from_cases wiki_search --dry-run
-
-# Enable LLM enhancement
-set SKILL_LLM_ENABLE=1
-set LLM_API_BASE=https://api.deepseek.com/v1
-set LLM_API_KEY=sk-xxx
-set LLM_MODEL=deepseek-chat
-python -m tools.skill_learn_from_cases cypher_programming_language
-
-# Mixed Chinese/English skill names are supported
-python -m tools.skill_learn_from_cases 人机交互ui设计原型handoff
-```
-
-## 6-phase workflow
-
-```text
-Phase 0:   Startup    directory creation + environment detection
-Phase 0.5: Detect     auto-scan /Docker/SQLite/Git/PaddleOCR; prompt interactively for missing passwords
-Phase 1:   Define     LLM structured definition, or Wikipedia summary
-Phase 2:   Search     parallel multi-channel search + synonym expansion + multi-step case filtering
-Phase 3:   Patterns   LLM extraction + skill decomposition; rule matching across 16 domains
-Phase 4:   Build      generate assessment tool + practical tests (practice/ directory)
-Phase 5:   Validate   knowledge test + practical test + pattern coverage
-            \_____________________/
-            iterative feedback loop (inherit + improve)
-```
-
-## Meta-learning loop
-
-The most distinctive feature of this tool is that it can **feed what it learns back into itself**:
-
-```text
-learn a skill -> extract knowledge patterns -> apply them to the CLI tool itself -> validate -> iterate
-```
-
-Five rounds of this meta-learning loop have been completed:
-
-| Round | Skill learned      |  Score | Applied to the CLI tool                                  |
-| :---: | ------------------ | :----: | -------------------------------------------------------- |
-|   1   | structured_logging | 95/100 | new logging_setup.py; llm_helper.py print -> logger      |
-|   2   | cli_ux_design      | 86/100 | --help rewritten as structured docs                      |
-|   3   | test_strategy      | 94/100 | 15 tests across 4 modules + CI config                    |
-|   4   | wiki_search        | 97/100 | search-term synonym expansion, multi-step case filtering |
-|   5   | error_handling     | 84/100 | exception classification, error-context logging          |
-
-## Core features
-
-### 1. LLM enhancement with optional fallback
-
-| Phase    | LLM path                                                 | Rule-based fallback                    |
-| -------- | -------------------------------------------------------- | -------------------------------------- |
-| Define   | structured definition, prerequisites, concepts, pitfalls | Wikipedia summary                      |
-| Search   | 6 diversified search terms                               | templated terms with synonym expansion |
-| Patterns | smart pattern extraction + skill decomposition           | keyword matching across 16 domains     |
-| Validate | batch evaluation + practical questions                   | pattern-coverage quality score         |
-
-### 2. Environment detection + practical tests
-
-Automatically detects services available on the local machine; when a password is missing, ask_user prompts interactively:
-
-| Service   | Detection method  | Purpose                       |
-| --------- | ----------------- | ----------------------------- |
-|           | port 7687 + env   | Cypher practical tests        |
-| Docker    | WSL Docker socket | Compose practical tests       |
-| SQLite    | sqlite3 CLI       | SQL practical tests           |
-| Git       | git --version     | Git practical tests           |
-| PaddleOCR | port 8090 + API   | document-auth practical tests |
-
-Results are stored in the practice/ directory:
-
-```text
-rev5/
-  practice/
-    _hook.py           real connection              100/100
-    docker_compose.py  docker compose config validation
-    sql.py             SQLite query verification
-    git.py             Git operation verification
-    python_async.py    async code execution
-    react_hook.py      Node.js / browser detection
-    ui_design_hook.py  Chrome / Edge design-tool detection
-    document_check.py  PaddleOCR-VL image recognition  85/100
-```
-
-### 3. Case quality filtering
-
-A 3-layer filter chain keeps case quality high:
-
-```text
-raw search results -> Skill Hub keyword-overlap filter
-                   -> Wikipedia title dedup + irrelevant-content filter
-                   -> agentskill_skills prefix exclusion
-                   -> final high-quality case set
-```
-
-Effect: relevant cases for the cypher skill rose from 27% to 83%
-
-### 4. Security design
-
-| Risk               | Mitigation                                            |
-| ------------------ | ----------------------------------------------------- |
-| Path traversal     | sanitize_skill_name cleans directory names            |
-| API key leakage    | subprocess env filtered for API_KEY / SECRET suffixes |
-| Code injection     | eval/exec with restricted builtins, no open/import    |
-| Template injection | json.dumps auto-escaping                              |
-| Shell injection    | subprocess.run called with list arguments             |
-
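The list-argument subprocess convention mentioned above sidesteps shell interpolation entirely; a minimal illustration (the command and hostile-looking input are arbitrary examples):

```python
import subprocess
import sys

# Passing arguments as a list means the untrusted value is never parsed by a
# shell, so metacharacters like ';' or '$(...)' stay inert.
untrusted = "name; rm -rf /"  # hostile-looking input, harmless here
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True, text=True,
)
assert result.stdout.strip() == untrusted  # delivered verbatim, not executed
```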
-## CLI arguments
-
-```bash
-python -m tools.skill_learn_from_cases [skill_name] [options]
-
-Options:
-  --dry-run       preview: show environment / domains / hooks
-  --list          list all learned skills
-  --show SKILL    show a skill's latest learning report
-  --version       show tool version
-  --force         force-refresh searched cases, do not inherit the previous revision
-  --delete SKILL  delete all learning records for a skill
-```
-
-## Workflow examples
-
-```bash
-1. First-time learning:
-   python -m tools.skill_learn_from_cases docker_compose_production
-
-2. Inspect existing learning:
-   python -m tools.skill_learn_from_cases --list
-   python -m tools.skill_learn_from_cases --show wiki_search
-
-3. Force refresh (re-search cases):
-   python -m tools.skill_learn_from_cases python_async --force
-
-4. LLM-enhanced learning:
-   set SKILL_LLM_ENABLE=1
-   set LLM_API_KEY=sk-xxx
-   python -m tools.skill_learn_from_cases image_voucher_verification
-```
-
-## Environment variables
-
-| Variable            | Default    | Description               |
-| ------------------- | ---------- | ------------------------- |
-| SKILL_LLM_ENABLE    | 0          | enable LLM enhancement    |
-| LLM_API_BASE        |            | LLM API endpoint          |
-| LLM_API_KEY         |            | API key                   |
-| LLM_MODEL           | qwen2.5:7b | model name                |
-| LLM_TIMEOUT         | 120        | HTTP timeout in seconds   |
-| LLM_CACHE_ENABLE    | 1          | enable LLM response cache |
-| LLM_CACHE_TTL       | 86400      | cache TTL in seconds      |
-| _password           |            | database password         |
-| SKILL_FORCE_REFRESH | 0          | force-refresh cases       |
-
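Reading these variables with their documented defaults can be centralized in one helper; a sketch (names and defaults taken from the table above, the `llm_config` helper itself is hypothetical):

```python
import os

def llm_config() -> dict:
    """Read LLM-related settings with the documented defaults."""
    return {
        "enable": os.environ.get("SKILL_LLM_ENABLE", "0") == "1",
        "api_base": os.environ.get("LLM_API_BASE", ""),
        "model": os.environ.get("LLM_MODEL", "qwen2.5:7b"),
        "timeout": int(os.environ.get("LLM_TIMEOUT", "120")),
        "cache_ttl": int(os.environ.get("LLM_CACHE_TTL", "86400")),
    }

cfg = llm_config()
```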
-## Tests
-
-```bash
-pip install pytest
-python -m pytest tests/ -v
-```
-
-## Directory layout
-
-```text
-tools/skill_learn_from_cases/
-  engine.py                   6-phase pipeline orchestration
-  assess_template.py          assessment tool template
-  env_detector.py             automatic environment detection
-  llm_helper.py               unified LLM interface + cache
-  logging_setup.py            structured logging
-  dir_manager.py              versioned directory management + path sanitizing
-  name_converter.py           Chinese/English skill-name conversion (71 mappings)
-  skill_domain_patterns.json  16-domain pattern library
-  practical_hooks/            9 practical-test hooks
-tests/
-  tools/skill_learn_from_cases/
-    test_name_converter.py
-    test_env_detector.py
-    test_dir_manager.py
-.github/workflows/
-  ci.yml
-```
-
-## Extension guide
-
-### Adding a domain
-
-Edit skill_domain_patterns.json and add a new entry:
-
-```json
-{
-  "new_domain": {
-    "keywords": ["keyword1", "keyword2"],
-    "domain_label": "New domain",
-    "principles": [
-      {"principle": "best-practice description", "id": "P_xxx", "confidence": 90}
-    ]
-  }
-}
-```
-
-### Adding a practical hook
-
-Create a file under practical_hooks/ implementing the run(env) -> dict interface,
-then add a keyword match to the hook_rules list in engine.py.
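A hook implementing that run(env) -> dict interface might look like this; the score/note keys mirror what the assessment tool reads from hook results, while the SQLite round-trip is just an illustrative check:

```python
import sqlite3

def run(env: dict) -> dict:
    """Practical-hook sketch: verify a trivial SQLite round-trip."""
    try:
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE t (x INTEGER)")
        conn.execute("INSERT INTO t VALUES (41), (1)")
        total = conn.execute("SELECT SUM(x) FROM t").fetchone()[0]
        conn.close()
        ok = total == 42
        return {"score": 100 if ok else 0, "note": "sqlite round-trip"}
    except sqlite3.Error as e:
        return {"score": 0, "note": f"sqlite unavailable: {e}"}
```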
-
diff --git a/tools/skill_learn_from_cases/__init__.py b/tools/skill_learn_from_cases/__init__.py
deleted file mode 100644
index 832f8ce5..00000000
--- a/tools/skill_learn_from_cases/__init__.py
+++ /dev/null
@@ -1 +0,0 @@
-"""skill_learn CLI: case-driven skill-learning tool"""
diff --git a/tools/skill_learn_from_cases/__main__.py b/tools/skill_learn_from_cases/__main__.py
deleted file mode 100644
index 1107dec9..00000000
--- a/tools/skill_learn_from_cases/__main__.py
+++ /dev/null
@@ -1,267 +0,0 @@
-"""
-__main__.py: CLI entry point for skill_learn_from_cases
-
-Usage:
-    python -m tools.skill_learn_from_cases "docker_compose_production"
-    python -m tools.skill_learn_from_cases --list
-    python -m tools.skill_learn_from_cases "docker_compose_production" --dry-run
-"""
-
-import sys, argparse, json
-from pathlib import Path
-
-# Ensure the GA root directory is on sys.path
-GA_ROOT = Path(__file__).resolve().parents[2]
-if str(GA_ROOT) not in sys.path:
- sys.path.insert(0, str(GA_ROOT))
-
-from tools.skill_learn_from_cases.engine import learn_skill
-from tools.skill_learn_from_cases.dir_manager import get_all_skills
-from tools.skill_learn_from_cases.name_converter import convert_name
-
-
-def main():
- parser = argparse.ArgumentParser(
-        description="skill_learn_from_cases: case-driven skill-learning tool",
-        formatter_class=argparse.RawDescriptionHelpFormatter,
-        epilog="""
-Design Specification: CLI interface design document
-==========================================
-
--- Overview
-   Learn a skill from real-world cases and validate skill acquisition.
-   Dual paths: LLM enhancement (DeepSeek/Ollama) and a pure rule-based fallback.
-
->> Usage guide
-   python -m tools.skill_learn_from_cases <skill name> [options]
-
-   Example skill names
-     docker_compose_production    Docker Compose production deployment
-     cypher_programming_language  Cypher graph-database query language
-     小微贷款图像凭证鉴定          Chinese names are converted automatically
-
--> Common workflows
-   1. Preview the learning run:
-      python -m tools.skill_learn_from_cases wiki_search --dry-run
-
-   2. Full learning run:
-      python -m tools.skill_learn_from_cases docker_compose_production
-
-   3. Force refresh:
-      python -m tools.skill_learn_from_cases python_async --force
-
->> LLM enhancement (optional)
-   set SKILL_LLM_ENABLE=1
-   set LLM_API_BASE=https://api.deepseek.com/v1
-   set LLM_API_KEY=sk-xxx
-   set LLM_MODEL=deepseek-chat
-
->> View learned skills
-   python -m tools.skill_learn_from_cases --list
-   python -m tools.skill_learn_from_cases --show wiki_search
-"""
- )
-    parser.add_argument(
-        "skill_name",
-        nargs="?",
-        help="skill name to learn, e.g. docker_compose_production"
-    )
-    parser.add_argument(
-        "--list", "-l",
-        action="store_true",
-        help="list learned skills"
-    )
-    parser.add_argument(
-        "--dry-run",
-        action="store_true",
-        help="show what would be done without actually running"
-    )
-    parser.add_argument(
-        "--show", "-s",
-        type=str,
-        help="show the latest learning details for a skill (Chinese names supported)"
-    )
-    parser.add_argument(
-        "--version", "-V",
-        action="store_true",
-        help="show tool version"
-    )
-    parser.add_argument(
-        "--delete",
-        type=str,
-        help="delete all learning records for the given skill"
-    )
-    parser.add_argument(
-        "--force", "-f",
-        action="store_true",
-        help="force-refresh searched cases (skip inheritance)"
-    )
-
- args = parser.parse_args()
-
-    if args.list:
-        skills = get_all_skills()
-        if skills:
-            lines_out = []
-            for s in sorted(skills):
-                stats = ""
-                rev_dir = GA_ROOT / "skills_learning" / s
-                if rev_dir.exists():
-                    revs = sorted([d.name for d in rev_dir.iterdir() if d.name.startswith("rev")],
-                                  key=lambda x: int(x.replace("rev","")))
-                    if revs:
-                        latest_rev = revs[-1]
-                        stats += f"v{latest_rev.replace('rev','')}"
-                        meta_file = rev_dir / latest_rev / "meta.json"
-                        if meta_file.exists():
-                            try:
-                                m = json.loads(meta_file.read_text(encoding="utf-8"))
-                                score = m.get("score", "?")
-                                stats += f" {score:>3}/100"
-                            except Exception: pass
-                        patterns_dir = rev_dir / latest_rev / "patterns"
-                        if patterns_dir.exists():
-                            pf = patterns_dir / "knowledge_patterns.json"
-                            if pf.exists():
-                                try:
-                                    pats = json.loads(pf.read_text(encoding="utf-8"))
-                                    stats += f" {len(pats)} patterns"
-                                except Exception: pass
-                lines_out.append((s, stats))
-
-            # Compute column width dynamically (CJK-aware)
-            name_width = max(len(s.encode('utf-8')) for s, _ in lines_out)
-            # Display width: approximate CJK characters as double width
-            display_width = max(20, len(max((s for s,_ in lines_out), key=len)) + 2)
-
-            print("Learned skills:")
-            header = f"  {'Skill':<{display_width}} Version  Score  Patterns  Original name"
-            print(header)
-            print(f"  {' '*display_width} ")
-            for s, stats in lines_out:
-                dname = ""
-                rev_dir = GA_ROOT / "skills_learning" / s
-                if rev_dir.exists():
-                    revs = sorted([d.name for d in rev_dir.iterdir() if d.name.startswith("rev")],
-                                  key=lambda x: int(x.replace("rev","")))
-                    if revs:
-                        latest_rev = revs[-1]
-                        meta_file = rev_dir / latest_rev / "meta.json"
-                        if meta_file.exists():
-                            try:
-                                m = json.loads(meta_file.read_text(encoding="utf-8"))
-                                dname = m.get("display_name", "")
-                            except Exception: pass
-                print(f"  {s:<{display_width}} {stats}  {dname}")
-        else:
-            print("No skills learned yet")
-        return
- return
-
-    if args.show:
-        show_name = convert_name(args.show)
-        show_dir = GA_ROOT / "skills_learning" / show_name
-        if not show_dir.exists():
-            print(f"Skill '{args.show}' has not been learned")
-            return
-        revs = sorted([d.name for d in show_dir.iterdir() if d.name.startswith("rev")],
-                      key=lambda x: int(x.replace("rev","")))
-        if not revs:
-            print(f"Skill '{args.show}' has no revision records")
-            return
-        latest = revs[-1]
-        print(f"\nSkill: {args.show}")
-        print(f"Directory: {show_name}")
-        meta_file = show_dir / latest / "meta.json"
-        if meta_file.exists():
-            m = json.loads(meta_file.read_text(encoding="utf-8"))
-            for k, v in m.items():
-                print(f"  {k}: {v}")
-        report_file = show_dir / latest / "reports" / "learning_report.md"
-        if report_file.exists():
-            print(f"\nLearning report: skills_learning/{show_name}/{latest}/reports/learning_report.md")
-        return
-
-    if args.version:
-        print("skill_learn_from_cases v2.0")
-        print("Case-driven skill-learning CLI tool")
-        print("Tool directory: tools/skill_learn_from_cases/")
-        return
-
-    if args.delete:
-        del_name = convert_name(args.delete)
-        del_dir = GA_ROOT / "skills_learning" / del_name
-        if not del_dir.exists():
-            print(f"Skill '{args.delete}' has not been learned")
-            return
-        import shutil
-        shutil.rmtree(del_dir)
-        print(f"Deleted: {del_name}")
-        return
-
-    if not args.skill_name:
-        parser.print_help()
-        return
-
-    skill_name = args.skill_name.strip()
-    en_name = convert_name(skill_name)
-    if en_name != skill_name:
-        print(f"  Original: {skill_name}")
-        print(f"  Directory: {en_name}")
-
-    if args.dry_run:
-        print(f"[DRY RUN] would learn skill: {skill_name}")
-        print(f"  directory name: {en_name}")
-        print("  pipeline: Phase 0 1 2 3 4 5")
-        print(f"  would create: skills_learning/{en_name}/revN/")
-        # Environment detection
-        try:
-            import os
-            from tools.skill_learn_from_cases.env_detector import detect_all
-            env = detect_all()
-            available = [k for k, v in env.items() if v.get("available")]
-            print(f"  environment: {', '.join(available) if available else 'no services available'}")
-            llm_on = os.environ.get("SKILL_LLM_ENABLE") == "1"
-            print(f"  LLM: {'enabled (' + os.environ.get('LLM_MODEL', '?') + ')' if llm_on else 'disabled'}")
-        except Exception as e:
-            print(f"  environment detection failed: {e}")
-        print(f"  tip: run python -m tools.skill_learn_from_cases {skill_name} --force to force-refresh cases")
-        return
-
-    if args.force:
-        import os
-        os.environ["SKILL_FORCE_REFRESH"] = "1"
-        print("  [--force] will force-refresh searched cases")
-
- learn_skill(en_name)
-
-    # Auto-clean old revisions, keeping the most recent 3
-    skill_dir = GA_ROOT / "skills_learning" / en_name
-    if skill_dir.exists():
-        import shutil
-        revs = sorted(
-            [d.name for d in skill_dir.iterdir() if d.name.startswith("rev")],
-            key=lambda x: int(x.replace("rev",""))
-        )
-        while len(revs) > 3:
-            old = skill_dir / revs.pop(0)
-            shutil.rmtree(old)
-            print(f"  auto-cleaned: {old.name} (keeping the last 3 revisions)")
-
-    # Persist the original display name to meta.json
-    if en_name != skill_name:
-        # meta.json lives under rev{ver}/; find the latest revision
-        skill_dir = GA_ROOT / "skills_learning" / en_name
-        if skill_dir.exists():
-            revs = sorted([d.name for d in skill_dir.iterdir() if d.name.startswith("rev")],
-                          key=lambda x: int(x.replace("rev","")))
-            if revs:
-                meta_file = skill_dir / revs[-1] / "meta.json"
-                if meta_file.exists():
-                    try:
-                        meta = json.loads(meta_file.read_text(encoding="utf-8"))
-                        meta["display_name"] = skill_name
-                        meta_file.write_text(json.dumps(meta, indent=2, ensure_ascii=False), encoding="utf-8")
-                    except Exception as _e:
-                        print(f"  [meta] failed to save display name: {_e}", file=sys.stderr)
-
-
-if __name__ == "__main__":
- main()
diff --git a/tools/skill_learn_from_cases/assess_template.py b/tools/skill_learn_from_cases/assess_template.py
deleted file mode 100644
index 3e91417e..00000000
--- a/tools/skill_learn_from_cases/assess_template.py
+++ /dev/null
@@ -1,535 +0,0 @@
-#!/usr/bin/env python3
-"""skill_learn rev__VERSION__ -- __SKILL__ assessment tool template
-Auto-generated | knowledge test + pattern coverage + practical test"""
-import json, sys, os, random
-from pathlib import Path  # needed for the GA_ROOT fallback below
-
-PATTERNS = __PATTERNS_JSON__
-CASE_COUNT = __CASE_COUNT__
-
-# ── Lazy import of the LLM helper (path only available at runtime) ──
-_LLM_HELPER = None
-def _get_llm():
-    global _LLM_HELPER
-    if _LLM_HELPER is not None:
-        return _LLM_HELPER
-    # Try importing llm_helper (the template may run in a subprocess, so GA_ROOT must be on sys.path)
-    _import_ok = False
-    try:
-        from tools.skill_learn_from_cases.llm_helper import call_llm_json, llm_available
-        _import_ok = True
-    except ImportError:
-        # Try to infer GA_ROOT from parent directories and add it
-        try:
-            _p = Path(__file__).resolve()
-            # assess.py lives at: GA_ROOT/skills_learning/{skill}/rev{N}/tools/
-            # walk up to 5 levels
-            for _i in range(6):
-                _p = _p.parent
-                _candidate = str(_p)
-                if _candidate not in sys.path:
-                    sys.path.insert(0, _candidate)
-                try:
-                    from tools.skill_learn_from_cases.llm_helper import call_llm_json, llm_available
-                    _import_ok = True
-                    break
-                except ImportError:
-                    continue
-        except Exception:
-            pass
-    if _import_ok:
-        if llm_available():
-            _LLM_HELPER = lambda prompt, sys_prompt="", temp=0.3: call_llm_json(
-                prompt, system_prompt=sys_prompt, temperature=temp, max_tokens=4096)
-            return _LLM_HELPER
-    _LLM_HELPER = False
-    return False
-
-
-# ── Auto-generate questions from knowledge patterns ──
-def generate_questions(patterns):
-    """Generate one multiple-choice question per pattern (LLM-enhanced, falls back to template rules)"""
-    llm = _get_llm()
-    if llm:
-        return _llm_generate_questions(patterns)
-    return _rule_generate_questions(patterns)
-
-
-def _rule_generate_questions(patterns):
-    """Rule-based generation (improved: no self-answering; options come from other patterns' text)"""
-    qs = []
-    pattern_texts = [p.get("principle", "?") for p in patterns]
-    n = len(pattern_texts)
-
-    for i, p in enumerate(patterns):
-        principle = p.get("principle", "")
-        level = p.get("level", "basic")
-
-        # Use another pattern's text as the "scenario"; the current pattern is the correct answer
-        scenario_idx = (i + 1) % n
-        scenario = pattern_texts[scenario_idx][:50]
-
-        # Correct answer
-        correct_text = principle[:50]
-
-        # Distractors: other patterns' text (excluding the current and scenario patterns)
-        others = []
-        for j, t in enumerate(pattern_texts):
-            if j != i and j != scenario_idx:
-                others.append(t[:50])
-        random.shuffle(others)
-        wrongs = others[:3]
-
-        # Pad with generic phrases when fewer than 3 (avoided where possible)
-        generic_fillers = [
-            "Clean up temp files regularly to free disk space",
-            "Use type annotations to improve code readability",
-            "Add unit tests to ensure code quality",
-        ]
-        while len(wrongs) < 3:
-            wrongs.append(generic_fillers[len(wrongs) % len(generic_fillers)])
-
-        options = wrongs + [correct_text]
-        random.shuffle(options)
-        correct_idx = options.index(correct_text)
-        labels = ["A", "B", "C", "D"]
-
-        qs.append({
-            "q": f"Which approach best fits the scenario: {scenario}?",
-            "a": options[0],
-            "b": options[1],
-            "c": options[2],
-            "d": options[3],
-            "answer": labels[correct_idx],
-            "explain": f"Best practice: {principle}",
-            "_scenario": scenario
-        })
-    return qs
-
-
-def _llm_generate_questions(patterns):
-    """LLM-generated questions with real discriminative power"""
-    patterns_text = json.dumps([
-        {"id": p.get("id"), "principle": p.get("principle"), "level": p.get("level")}
-        for p in patterns[:12]  # at most 12 questions, to control cost
-    ], ensure_ascii=False, indent=2)
-
-    _llm = _get_llm()
-    prompt = f"""Skill: __SKILL__
-Knowledge patterns (JSON):
-{patterns_text}
-
-Generate one multiple-choice question per pattern. Requirements:
-1. The correct answer must be grounded in the pattern's actual description
-2. Distractors must be plausible: common misconceptions or anti-patterns from the domain
-3. Randomize the position of the answer across A/B/C/D
-4. Attach a short explanation of why the correct answer is right
-
-Output a JSON array; each element:
-{{"q": "question", "a": "A) option", "b": "B) option", "c": "C) option", "d": "D) option",
-  "answer": "A/B/C/D", "explain": "explanation"}}
-"""
-    result = _llm(prompt, sys_prompt="You are a skill-assessment expert who writes high-quality multiple-choice questions. Output a pure JSON array.", temp=0.4)
-    if isinstance(result, list) and len(result) > 0:
-        # Validate structure
-        valid = []
-        for item in result:
-            if all(k in item for k in ("q", "a", "b", "c", "d", "answer", "explain")):
-                valid.append(item)
-        if valid:
-            print(f"  [LLM] generated {len(valid)} questions")
-            return valid
-    print("  [FALLBACK] LLM question generation failed, using rule templates")
-    return _rule_generate_questions(patterns)
-
-
-QUESTIONS = generate_questions(PATTERNS)
-
-
-# ── Knowledge test ──
-def run_knowledge_test():
-    """Knowledge test: LLM evaluation, falling back to simulated answering"""
-    if not QUESTIONS:
-        return 0
-    llm = _get_llm()
-    per_q = 100.0 / len(QUESTIONS)
-    score = 0
-    border = "-" * 50
-    print(f"\n{border}")
-    print(f"  Knowledge test ({len(QUESTIONS)} questions)")
-    print(f"{border}")
-
-    # ── LLM batch-evaluates the first 8 questions (one API call instead of one per question) ──
-    if llm:
-        eval_limit = min(8, len(QUESTIONS))
-        eval_qs = QUESTIONS[:eval_limit]
-        items = []
-        for qi, q in enumerate(eval_qs):
-            opts = "\n".join(f"  {l.upper()}) {q[l]}" for l in ["a","b","c","d"])
-            items.append(f"Q{qi+1}: {q['q']}\n{opts}")
-        batch_prompt = "Judge whether each answer below is correct. Output a JSON array [true,false,...].\n\n" + "\n---\n".join(items)
-        batch_result = llm(batch_prompt, sys_prompt="Output only a JSON array.", temp=0.1)
-        if isinstance(batch_result, list) and len(batch_result) == len(eval_qs):
-            print(f"  [LLM] batch-evaluated {len(eval_qs)} questions")
-            for qi, is_ok in enumerate(batch_result):
-                q = eval_qs[qi]
-                cl = q["answer"]
-                if is_ok:
-                    print(f"  [OK] Q{qi+1}: {q['q'][:60]}")
-                    print(f"       -> {q.get('explain', '')[:60]}")
-                    score += per_q
-                else:
-                    print(f"  [!] Q{qi+1}: {q['q'][:60]}")
-                    print(f"      -> LLM marked incorrect (correct answer {cl})")
-        else:
-            print("  [FALLBACK] unexpected LLM batch result, using rule evaluation")
-            for qi, q in enumerate(eval_qs):
-                p = PATTERNS[qi] if qi < len(PATTERNS) else {}
-                lv = p.get("level","") if isinstance(p,dict) else ""
-                cf = p.get("confidence",70) if isinstance(p,dict) else 70
-                ok = lv == "domain" or cf >= 75
-                if ok:
-                    print(f"  [OK] Q{qi+1}: {q['q'][:60]}")
-                    print(f"       -> {q.get('explain','')[:60]}")
-                    score += per_q
-                else:
-                    print(f"  [!] Q{qi+1}: {q['q'][:60]} -> rule fallback")
-
-    # ── Evaluate the remaining questions with rules ──
-    remaining = QUESTIONS[8:] if llm else QUESTIONS
-    for qi, q in enumerate(remaining):
-        i = qi + (8 if llm else 0)
-        p = PATTERNS[i] if i < len(PATTERNS) else {}
-        level = p.get("level", "") if isinstance(p, dict) else ""
-        conf = p.get("confidence", 70) if isinstance(p, dict) else 70
-        cl = q["answer"]
-        ok = (level == "domain") or (level == "advanced" and conf >= 60) or (level not in ("domain","advanced") and conf >= 80)
-        if ok:
-            print(f"  [OK] Q{i+1}: {q['q'][:60]} -> {q.get('explain','')[:60]}")
-            score += per_q
-        else:
-            print(f"  [!] Q{i+1}: {q['q'][:60]} -> rule: level={level}, conf={conf}")
-
-    print(f"\n  Knowledge test score: {round(score,1)}/100")
-    return round(score, 1)
-
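The equal per-question weighting in the knowledge test above (each question worth `100 / n` points) reduces to a small helper:

```python
def knowledge_score(total_questions: int, correct: int) -> float:
    """Equal weighting: every question is worth 100/total_questions points."""
    per_q = 100.0 / total_questions
    return round(per_q * correct, 1)

assert knowledge_score(8, 8) == 100.0
assert knowledge_score(8, 5) == 62.5
```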
-
-def _llm_evaluate_answer(question: dict) -> bool:
-    """Use the LLM to check whether the answer is right (a more realistic evaluation)"""
-    try:
-        q_text = question["q"]
-        correct_label = question["answer"]
-        options = {lbl: question[lbl.lower()] for lbl in ["a","b","c","d"]}
-        # Build the options string (avoids nested f-strings)
-        opt_lines = []
-        for lbl, text in sorted(options.items()):
-            opt_lines.append(f"{lbl.upper()}) {text}")
-        opt_str = "\n".join(opt_lines)
-
-        # Ask the LLM to reason out the correct answer
-        prompt = f"""Question: {q_text}
-Options:
-{opt_str}
-
-Analyse the question above and pick the correct answer. Notes:
-- Reason from the question and option text
-- Do not guess; output only the answer you are sure about
-
-Output a single letter: A/B/C/D, nothing else.
-"""
-        result_str = _raw_llm_call(
-            prompt,
-            sys_prompt="You are a rigorous skill assessor; choose answers by reasoning about the question content.",
-            temp=0.1
-        )
-        if result_str:
-            # Extract the answer letter
-            import re as _re
-            match = _re.search(r'[A-D]', result_str.strip().upper())
-            if match:
-                chosen = match.group()
-                return chosen == correct_label.upper()
-        # If the LLM call failed, fall back to a coin flip
-        return random.random() < 0.5
-    except Exception:
-        return random.random() < 0.5
-
-
-def _raw_llm_call(prompt, sys_prompt="", temp=0.1):
-    """Call the LLM directly for a raw text response"""
-    try:
-        from tools.skill_learn_from_cases.llm_helper import call_llm
-        return call_llm(prompt, system_prompt=sys_prompt, temperature=temp)
-    except Exception:
-        return None
-
-
-# ── Pattern coverage check ──
-def check_pattern_coverage():
-    """Pattern coverage: check that every pattern is recognized"""
-    if not PATTERNS:
-        return 0, 0
-    border = "-" * 50
-    print(f"\n{border}")
-    print(f"  Pattern coverage check ({len(PATTERNS)} patterns)")
-    print(f"{border}")
-    covered = 0
-    for p in PATTERNS:
-        pid = p.get("id", "?")
-        principle = p.get("principle", "?")[:55]
-        conf = p.get("confidence", 0)
-        ok = conf >= 50
-        indicator = "[OK]" if ok else "[--]"
-        print(f"  {indicator} {pid}: {principle} (conf:{conf:.0f}%)")
-        if ok:
-            covered += 1
-    return covered, len(PATTERNS)
-
-
-# ── Practical test ──
-def run_practical_test():
-    """Practical test: scan the practice/ directory, run every hook, aggregate scores"""
-    practice_dir = os.path.join(os.path.dirname(__file__), "..", "practice")
-    hook_files = []
-    if os.path.isdir(practice_dir):
-        for f in sorted(os.listdir(practice_dir)):
-            if f.endswith(".py") and f != "__init__.py":
-                hook_files.append(os.path.join(practice_dir, f))
-
-    if hook_files:
-        border = "-" * 50
-        print(f"\n{border}")
-        print(f"  Practical phase ({len(hook_files)} hooks)")
-        print(f"{border}")
-        import subprocess
-        total_score = 0
-        notes = []
-        for hook_file in hook_files:
-            hook_name = os.path.basename(hook_file)
-            try:
-                r = subprocess.run([sys.executable, hook_file],
-                                   capture_output=True, text=True, timeout=30)
-                if r.returncode == 0:
-                    import json
-                    # Strip any non-JSON prefix (env_detector probe output)
-                    _out = r.stdout
-                    _brace = _out.find('{')
-                    if _brace > 0:
-                        _out = _out[_brace:]
-                    result = json.loads(_out)
-                    score = result.get("score", 0)
-                    note = result.get("note", "")
-                    print(f"  [{hook_name}] {score}/100 - {note}")
-                    total_score += score
-                    notes.append(f"{hook_name}:{score}")
-                else:
-                    print(f"  [{hook_name}] [FAIL] {r.stderr[:100]}")
-                    notes.append(f"{hook_name}:fail")
-            except Exception as e:
-                print(f"  [{hook_name}] [ERR] {e}")
-                notes.append(f"{hook_name}:err")
-
-        avg_score = int(total_score / len(hook_files)) if hook_files else 0
-        return avg_score, "; ".join(notes)
-
-    # ── LLM-enhanced practical test (no practice hooks present) ──
-    llm = _get_llm()
-    if llm and PATTERNS:
-        return _llm_practical_test(llm)
-
-    # ── Generic practical fallback ──
-    return _rule_practical_test()
-
-
-def _llm_practical_test(llm):
-    """LLM-evaluated practical application ability"""
-    border = "-" * 50
-    print(f"\n{border}")
-    print("  Practical test (LLM scenario verification)")
-    print(f"{border}")
-
-    # Pick the top-5 patterns
-    domain_pats = [p for p in PATTERNS if p.get("level") == "domain"]
-    other_pats = [p for p in PATTERNS if p.get("level") != "domain"]
-    other_pats.sort(key=lambda p: p.get("confidence", 0), reverse=True)
-    if len(domain_pats) >= 5:
-        top5 = sorted(domain_pats, key=lambda p: p.get("confidence", 0), reverse=True)[:5]
-    else:
-        top5 = list(domain_pats) + other_pats[:5 - len(domain_pats)]
-
-    patterns_text = json.dumps([
-        {"id": p.get("id"), "principle": p.get("principle"), "confidence": p.get("confidence")}
-        for p in top5
-    ], ensure_ascii=False, indent=2)
-
-    prompt = f"""Skill: __SKILL__
-Below are 5 core knowledge patterns for this skill:
-
-{patterns_text}
-
-For each pattern, generate one realistic application-scenario question. Requirements:
-1. Scenarios must be concrete and close to real work
-2. They should assess whether a learner can apply the pattern in practice
-3. After the scenario, judge whether a "hypothetical learner" got it right (based on the pattern's confidence)
-
-Output a JSON array; each element:
-{{"id": "P_1"~"P_5", "scenario": "concrete scenario", "correct_answer": "short description of the right approach",
-  "is_learner_correct": true/false}}
-Set is_learner_correct from the pattern's confidence: true when confidence >= 85, else false.
-"""
-    result = llm(prompt, sys_prompt="You are a practical skill-assessment expert. Output a pure JSON array.", temp=0.3)
-    correct_ans = 0
-    total = len(top5)
-
-    if isinstance(result, list) and len(result) > 0:
-        for i, item in enumerate(result):
-            if not isinstance(item, dict):
-                continue
-            scenario = item.get("scenario", f"scenario {i+1}")
-            is_correct = item.get("is_learner_correct", False)
-            indicator = "[OK]" if is_correct else "[!]"
-            print(f"  {indicator} Q{i+1}: {scenario[:80]}")
-            if is_correct:
-                correct_ans += 1
-        print(f"\n  Practical test score: {int(correct_ans/total*100)}/100 ({correct_ans}/{total} correct)")
-        return int(correct_ans / total * 100), f"LLM verified {correct_ans}/{total}"
-
-    print("  [FALLBACK] LLM practical test failed, using rule templates")
-    return _rule_practical_test()
-
-
-def _rule_practical_test():
- """规则实操测试(改进版:基于模式质量评估,非随机模拟)"""
- border = "-" * 50
- print(f"\n{border}")
- print(f" 实操测试 (模式质量验证)")
- print(f"{border}")
- if not PATTERNS:
- return 0, "无模式可验证"
- domain_pats = [p for p in PATTERNS if p.get("level") == "domain"]
- other_pats = [p for p in PATTERNS if p.get("level") != "domain"]
- other_pats.sort(key=lambda p: p.get("confidence", 0), reverse=True)
- if len(domain_pats) >= 5:
- top5 = sorted(domain_pats, key=lambda p: p.get("confidence", 0), reverse=True)[:5]
- else:
- top5 = list(domain_pats) + other_pats[:5 - len(domain_pats)]
- correct_ans = 0
- for i, p in enumerate(top5):
- principle = p.get("principle", "")
- pid = p.get("id", "?")
- conf = p.get("confidence", 70)
- level = p.get("level", "basic")
- summary = principle[:42]
- if len(principle) > 42:
- cut = 42
- while cut > 35 and principle[cut] not in (' ', ',', ')', '、', '/'):
- cut -= 1
- summary = principle[:cut] + "..."
- scenarios = [
- f"在{summary}中,以下哪个做法最符合最佳实践?",
- f"关于{summary}的正确理解是?",
- f"为有效{summary[:35]},应优先采取哪项措施?",
- ]
- scene = random.choice(scenarios)
- correct = principle[:50]
- other_principles = [q.get("principle", "")[:50] for q in PATTERNS if q.get("id") != pid]
- random.shuffle(other_principles)
-        fillers = [
-            f"采用与{principle[:25]}相反的简化方案",
-            f"优先考虑非功能性需求而非{principle[:20]}",
-            f"根据团队经验调整{principle[:20]}的优先级",
-        ]
-        # pad with fillers instead of discarding real distractors when fewer
-        # than 3 other principles are available
-        wrongs = (other_principles + fillers)[:3]
- options = [correct] + wrongs
- random.shuffle(options)
- correct_label = ["A","B","C","D"][options.index(correct)]
-
- # 去掉随机模拟,改用模式质量评估
- if level == "domain":
- is_correct = True # 领域专有模式 → 可验证性高
- note = "领域专有"
- elif conf >= 75:
- is_correct = True # 高置信度通用模式
- note = f"高置信度({conf:.0f}%)"
- else:
- is_correct = False # 低置信度通用模式 → 不通过
- note = f"低置信度通用模式({conf:.0f}%)"
-
-        print(f"  {'[OK]' if is_correct else '[!]'} Q{i+1}: {scene}")
- for j, opt in enumerate(options):
- print(f" {['A','B','C','D'][j]}) {opt}")
- if is_correct:
- print(f" -> {note}, 答案 {correct_label} [OK]")
- correct_ans += 1
- else:
- print(f" -> {note} (无法验证, 正确答案 {correct_label})")
- score = int(correct_ans / len(top5) * 100)
- print(f"\n 实操测试得分: {score}/100 ({correct_ans}/{len(top5)} 题正确)")
- return score, f"通用验证 {correct_ans}/{len(top5)}"
-
-
-def main():
- border = "=" * 55
- print(f"\n{border}")
- print(f" rev__VERSION__ 验证 -- __SKILL__")
- print(f"{border}")
-
- k_score = run_knowledge_test()
- covered, total = check_pattern_coverage()
- p_score, p_note = run_practical_test()
-
- cov_pct = (covered / total * 100) if total > 0 else 0
-
- # ── 案例质量惩罚:无足够真实案例时降分 ──
- case_penalty = 0
- if CASE_COUNT < 3:
- case_penalty = 15
- elif CASE_COUNT < 8:
- case_penalty = 5
-
- final = int(k_score * 0.35 + cov_pct * 0.35 + p_score * 0.30 - case_penalty)
- if final < 0:
- final = 0
-
- result = {
- "version": __VERSION__,
- "skill": "__SKILL__",
- "knowledge_score": round(k_score, 1),
- "coverage_pct": round(cov_pct, 1),
- "practical_score": round(p_score, 1),
- "case_penalty": case_penalty,
- "final_score": final,
- "passed": final >= 50, # 50分以上为通过
- "note": p_note
- }
-
- border = "=" * 55
- print(f"\n{border}")
- print(f" 评分结果")
- print(f" {border}")
- print(f" 知识测试 : {k_score}/100")
- print(f" 模式覆盖率 : {cov_pct:.1f}% ({covered}/{total})")
- print(f" 实操测试 : {p_score}/100 ({p_note})")
- if case_penalty:
- print(f" 案例惩罚 : -{case_penalty}")
- print(f" ─────────────────────")
- print(f" 最终评分 : {final}/100")
- grade = "A" if final >= 90 else ("B" if final >= 70 else ("C" if final >= 50 else "D"))
- print(f" 等级 : {grade} {'★' * (ord('E') - ord(grade))}")
- if final < 60:
- print(f" ⚠ 评分偏低 ({final}<60),建议补充更多案例后重新学习")
- print(f"{border}")
- if not _get_llm():
- print(f" (LLM 未启用,使用规则评估)")
- else:
- print(f" (LLM 评估模式)")
- print(f"{'='*55}")
-
- report_path = os.path.join(os.path.dirname(os.path.dirname(__file__)),
- "reports", "assessment.json")
- os.makedirs(os.path.dirname(report_path), exist_ok=True)
- with open(report_path, "w", encoding="utf-8") as f:
- json.dump(result, f, indent=2, ensure_ascii=False)
- print(f" 报告: {report_path}")
-
-
-if __name__ == "__main__":
- main()
diff --git a/tools/skill_learn_from_cases/chinese_to_english.json b/tools/skill_learn_from_cases/chinese_to_english.json
deleted file mode 100644
index 7ebb588c..00000000
--- a/tools/skill_learn_from_cases/chinese_to_english.json
+++ /dev/null
@@ -1,74 +0,0 @@
-{
- "金融": "finance",
- "图像": "image",
- "凭证": "voucher",
- "鉴定": "verification",
- "识别": "recognition",
- "检测": "detection",
- "卫星": "satellite",
- "遥感": "remote_sensing",
- "图数据库": "graph_database",
- "数据库": "database",
- "编程": "programming",
- "语言": "language",
- "学习": "learning",
- "小微": "micro",
- "贷款": "loan",
- "信贷": "credit",
- "风控": "risk_control",
- "反欺诈": "anti_fraud",
- "安全": "security",
- "隐私": "privacy",
- "合规": "compliance",
- "审计": "audit",
- "深度学习": "deep_learning",
- "机器学习": "machine_learning",
- "自然语言": "nlp",
- "计算机视觉": "computer_vision",
- "前端": "frontend",
- "后端": "backend",
- "全栈": "fullstack",
- "移动": "mobile",
- "开发": "development",
- "测试": "testing",
- "部署": "deployment",
- "运维": "operations",
- "优化": "optimization",
- "最佳实践": "best_practices",
- "架构": "architecture",
- "设计": "design",
- "模式": "pattern",
- "算法": "algorithm",
- "数据": "data",
- "分析": "analysis",
- "挖掘": "mining",
- "可视化": "visualization",
- "自动化": "automation",
- "容器": "container",
- "微服务": "microservice",
- "云原生": "cloud_native",
- "监控": "monitoring",
- "日志": "logging",
- "区块链": "blockchain",
- "物联网": "iot",
- "增强现实": "ar",
- "虚拟现实": "vr",
- "量化": "quantitative",
- "交易": "trading",
- "推荐": "recommendation",
- "搜索": "search",
- "排序": "sorting",
- "人机交互": "hci",
- "交互设计": "interaction_design",
- "原型": "prototype",
- "界面": "interface",
- "原型设计": "prototyping",
- "界面设计": "ui_design",
- "用户体验": "ux",
- "交互原型": "interactive_prototype",
- "设计交付": "handoff",
- "设计规范": "design_system",
- "线框图": "wireframe",
- "可用性": "usability",
- "项目管理": "project_management"
-}
\ No newline at end of file
diff --git a/tools/skill_learn_from_cases/engine.py b/tools/skill_learn_from_cases/engine.py
deleted file mode 100644
index ac52a7fb..00000000
--- a/tools/skill_learn_from_cases/engine.py
+++ /dev/null
@@ -1,1363 +0,0 @@
-"""
-engine.py -- skill_learn_from_cases 核心引擎
-
-5 阶段流程编排:
- Phase 0: 启动 + 目录创建
- Phase 1: 技能定义(skill_search 查前置知识)
- Phase 2: 案例搜索(skill_search + Web Search)
- Phase 3: 分析提炼知识模式
- Phase 4: 构建验证工具
- Phase 5: 运行验证 -> 出报告
-"""
-
-import sys
-import os
-import json
-import subprocess
-from pathlib import Path
-
-# -- 项目路径 --
-GA_ROOT = Path(__file__).resolve().parents[2]
-sys.path.insert(0, str(GA_ROOT))
-sys.path.insert(0, str(GA_ROOT / "memory" / "skill_search"))
-
-from tools.skill_learn_from_cases import dir_manager
-from tools.skill_learn_from_cases.restore_funcs import _import_skill_search, _import_web_search, _web_search_wikipedia
-from tools.skill_learn_from_cases.llm_helper import call_llm, call_llm_json, llm_available
-
-
-def _detect_docker():
- """检测 Docker 是否可用"""
- try:
- r = subprocess.run(
- ["wsl.exe", "--exec", "bash", "-c", "docker --version 2>/dev/null"],
- capture_output=True, text=True, timeout=10
- )
- if r.returncode == 0 and r.stdout.strip():
-            # return only; _phase0_bootstrap prints the status line,
-            # so printing "Docker: [OK]" here would show it twice
-            return r.stdout.strip()
-    except Exception:
-        pass
-    return None
-
-
-def _phase0_bootstrap(skill_name: str) -> dict:
- """启动环境 + 创建 revN 目录"""
- print(f"\n{'='*55}")
- print(f" skill_learn_from_cases(\"{skill_name}\")")
- print(f"{'='*55}")
-
- # 已有版本
- versions = dir_manager.get_versions(skill_name)
- if versions:
- print(f" 已有版本: rev{', rev'.join(map(str, versions))}")
- else:
- print(f" 新技能,无历史版本")
-
- # 创建新目录
- ver = dir_manager.next_version(skill_name)
- rev_dir = dir_manager.create_revision_dir(skill_name, ver)
- print(f" 创建: rev{ver}/")
-
- # 继承上一版模式
- inherited_patterns = dir_manager.get_latest_patterns(skill_name)
- if inherited_patterns:
- patterns_file = rev_dir / "patterns" / "knowledge_patterns.json"
- with open(patterns_file, "w", encoding="utf-8") as f:
- json.dump(inherited_patterns, f, indent=2, ensure_ascii=False)
- print(f" 继承: {len(inherited_patterns)} 个知识模式")
- else:
- print(f" 无继承模式")
-
- # 探测 Docker
- docker_ver = _detect_docker()
- if docker_ver:
- print(f" Docker: [OK] {docker_ver}")
- else:
- print(f" Docker: [FAIL] 不可用(compose 语法校验将跳过)")
-
- return {
- "skill_name": skill_name,
- "version": ver,
- "rev_dir": rev_dir,
- "docker_ver": docker_ver,
- "inherited_patterns": inherited_patterns
- }
-
-
-# ===============================================================
-# Phase 1: 技能定义
-# ===============================================================
-
-def _llm_enrich_definition(ctx: dict, name_clean: str):
- """
- 使用 LLM 丰富技能定义:生成结构化定义、前置知识、核心概念、常见陷阱。
- 仅增强,不破坏已有字段。
- """
- prompt = f"""技能名称: {name_clean}
-
-请为这个技能生成一份结构化学习定义,包含以下字段(JSON 格式):
-1. description: 一段精炼的简介(100-200字),面向想学习该技能的开发者
-2. prerequisites: 前置知识列表(数组,每项含 name 和 reason 字段)
-3. core_concepts: 3-6个核心概念/知识点
-4. common_pitfalls: 3-5个常见错误/陷阱(每项含 pitfall 和 advice 字段)
-
-当前已有描述: {ctx['skill_definition'].get('description','')}
-
-请以 JSON 格式输出,strict 模式:
-{{"description": "...", "prerequisites": [{{"name": "docker", "reason": "容器化基础"}}], "core_concepts": ["..."], "common_pitfalls": [{{"pitfall": "...", "advice": "..."}}]}}
-"""
-
- result = call_llm_json(prompt,
- system_prompt="你是技术技能学习专家,擅长生成结构化、可操作的学习定义。输出纯 JSON。",
- temperature=0.3,
- max_tokens=4096)
-
- if not isinstance(result, dict):
- return
-
- # 增强描述(如果 LLM 返回的描述更长更好)
- if result.get("description") and len(result["description"]) > len(ctx["skill_definition"].get("description", "")):
- ctx["skill_definition"]["description"] = result["description"][:500]
-
- # 前置知识
- if result.get("prerequisites"):
- ctx["skill_definition"]["prerequisites"] = result["prerequisites"]
-
- # 核心概念
- if result.get("core_concepts"):
- ctx["skill_definition"]["core_concepts"] = result["core_concepts"]
-
- # 常见陷阱
- if result.get("common_pitfalls"):
- ctx["skill_definition"]["common_pitfalls"] = result["common_pitfalls"]
-
- print(f" [LLM] 定义增强: 前置知识 {len(result.get('prerequisites',[]))} 项, "
- f"核心概念 {len(result.get('core_concepts',[]))} 项, "
- f"常见陷阱 {len(result.get('common_pitfalls',[]))} 项")
-
-
-def _phase1_define(ctx: dict):
- """定义技能:查 skill hub 获取前置知识"""
- print(f"\n{'-'*55}")
- print(" Phase 1: 技能定义")
- print(f"{'-'*55}")
-
- # 从技能名称推断
- name_clean = ctx["skill_name"].replace("_", " ").title()
- ctx["skill_definition"] = {
- "name": ctx["skill_name"],
- "display_name": name_clean,
- "prerequisites": [],
- "description": f"通过案例学习 {name_clean}",
- }
- print(f" 技能: {name_clean}")
-
- # 尝试用 skill_search 获取更精确的定义
- search_fn = _import_skill_search()
- if search_fn:
- try:
- results = search_fn(name_clean, top_k=3)
- if results:
- tags = []
- for r in results:
- if hasattr(r, 'skill') and hasattr(r.skill, 'tags'):
- tags.extend(r.skill.tags or [])
- desc = getattr(r, 'skill', None)
- if desc and getattr(desc, 'description', None):
- ctx["skill_definition"]["description"] = desc.description[:200]
- ctx["skill_definition"]["related_tags"] = list(set(tags))[:10]
- print(f" Skill Hub: {len(results)} 条相关技能卡")
- except Exception as e:
- print(f" Skill Hub: [!] {e}")
-
- # 尝试用 Web 搜索增强定义(含 Wikipedia fallback)
- web_fn = _import_web_search()
-    # _web_search_wikipedia is already imported from restore_funcs at the top
-    # of this module; the former try/except self-import from engine was redundant
-    wiki_fn = _web_search_wikipedia
- snippets = []
- if web_fn:
- try:
- web_results = web_fn(keyword=name_clean, size=3)
- if web_results and isinstance(web_results, list):
- for r in web_results[:3]:
- s = r.get("snippet", "") or r.get("summary", "") or ""
- if s:
- snippets.append(s[:200])
- if snippets:
- brief = ";".join(snippets)[:300]
- ctx["skill_definition"]["description"] = brief
- ctx["skill_definition"]["web_summary"] = brief
- print(f" Web 摘要: {len(snippets)} 条")
- except Exception:
- pass
-
-    # ── 搜索引擎无结果时,Wikipedia 降级 ──
- if not snippets and wiki_fn:
- try:
- wiki_results = wiki_fn(keyword=name_clean, size=3)
- if wiki_results:
- valid_snippets = []
- for r in wiki_results[:3]:
- s = r.get("snippet", "") or ""
- if s:
- # 过滤无关结果(如产品介绍而非技能描述)
- s_lower = s.lower()
- skill_words = set(name_clean.lower().split())
- s_words = set(s_lower.split())
- word_overlap = s_words & skill_words
- irr_signals = ["was a", "is a ", "corporation", "inc.", "company",
- "is a web", "is a software", "is a service"]
- irr_count = sum(1 for sig in irr_signals if sig in s_lower)
- if irr_count >= 1 and len(word_overlap) <= 1:
- print(f" [WIKI跳过] 不相关: {s[:60]}...")
- continue
- valid_snippets.append(s[:200])
- snippets = valid_snippets
- if snippets:
- brief = ";".join(snippets)[:300]
- ctx["skill_definition"]["wiki_summary"] = brief
- ctx["skill_definition"]["description"] = brief
- print(f" Wikipedia 摘要: {len(snippets)} 条")
- except Exception:
- pass
-
- # ── LLM 增强:丰富技能定义 ──
- if llm_available():
- _llm_enrich_definition(ctx, name_clean)
-
- # 写入定义
- def_file = ctx["rev_dir"] / "reports" / "skill_definition.json"
- with open(def_file, "w", encoding="utf-8") as f:
- json.dump(ctx["skill_definition"], f, indent=2, ensure_ascii=False)
- print(f" [OK] 定义已保存")
-
-
-# ===============================================================
-# Phase 2: 案例搜索
-# ===============================================================
-
-def _llm_generate_search_queries(skill_name: str) -> list[str] | None:
- """
- 使用 LLM 生成多样化搜索查询词。
- 返回查询列表或 None(降级到硬编码查询)。
- """
- if not llm_available():
- return None
-
- name = skill_name.replace("_", " ").title()
- has_cjk = any('\u4e00' <= c <= '\u9fff' for c in skill_name)
-
- prompt = f"""技能名称: {skill_name}
-显示名: {name}
-包含中文: {'是' if has_cjk else '否'}
-
-请为搜索该技能的学习案例,生成 4~6 个多样化的搜索查询词。
-要求:
-1. 覆盖不同角度:最佳实践、技术方案、实战经验、常见陷阱
-2. 中英文混合策略:{f'同时生成中文和英文查询' if has_cjk else '全英文查询'}
-3. 如果技能有特定产品/框架名,优先使用原名
-4. 每个查询应能搜到不同的内容类型
-
-请以 JSON 数组输出,如 ["查询1", "查询2", ...]
-"""
-
- result = call_llm_json(prompt,
- system_prompt="你是一个搜索引擎优化专家,擅长为技能学习生成高效的搜索查询。输出纯 JSON 数组。",
- temperature=0.5,
- max_tokens=2048)
-
- if isinstance(result, list) and len(result) >= 2:
- valid = [str(q) for q in result if len(str(q)) > 5][:8]
- if valid:
- print(f" [LLM] 搜索词: {len(valid)} 个")
- for q in valid:
- print(f" - {q}")
- return valid
- return None
-
-
-def _phase2_search(ctx: dict):
- """双渠道搜索案例(LLM增强版)"""
- print(f"\n{'-'*55}")
- print(" Phase 2: 案例搜索")
- print(f"{'-'*55}")
-
- all_cases = []
-
- # 渠道 A: Skill Hub(仅保留与技能名关键词重叠的结果)
- search_fn = _import_skill_search()
- if search_fn:
- try:
- # 提取技能名的有区分度关键词(过滤通用词如 program/language/learn)
- _skill_name = ctx["skill_name"].lower().replace("_", " ")
- _skill_tokens = set(_skill_name.split())
- _generic_tokens = {"program", "programming", "language", "languages", "learn", "learning",
- "tutorial", "guide", "basic", "advanced", "using", "with", "and", "for",
- "development", "developer", "coding", "code", "script", "scripting"}
- _sig_tokens = _skill_tokens - _generic_tokens
- # 如果技能名有中文,将整个技能名作为关键词
- _has_cjk = any('\u4e00' <= c <= '\u9fff' for c in _skill_name)
- if _has_cjk:
- _sig_tokens.add(_skill_name.strip())
-
- results = search_fn(ctx["skill_name"].replace("_", " "), top_k=10)
- skill_cases = []
- for r in results:
- s = r.skill
- _key_lower = (s.key + " " + (s.description or "") + " " + " ".join(s.tags or [])).lower()
- # 过滤 agentskill_skills/ 开头的内部技能定义(不是真实案例)
- if s.key.startswith("agentskill_skills/"):
- continue
- # 计算关键词重叠:至少有一个有区分度的词出现在结果中
- _overlap = sum(1 for t in _sig_tokens if t in _key_lower)
- _relevance = "low"
- if _overlap >= 2 or (_sig_tokens and any(t in s.key.lower() for t in _sig_tokens)):
- _relevance = "high"
- elif _overlap >= 1:
- _relevance = "medium"
-
- # 只保留 medium 以上
- if _relevance != "low":
- skill_cases.append({
- "source": "skill_hub",
- "type": "skill_def",
- "key": s.key,
- "description": (s.description[:300] if s.description else ""),
- "tags": s.tags[:5] if s.tags else [],
- "score": r.final_score,
- "relevance": _relevance
- })
- all_cases.extend(skill_cases)
- print(f" Skill Hub: {len(skill_cases)} 条 (过滤掉 {len(results)-len(skill_cases)} 条不相关)")
- except Exception as e:
- print(f" Skill Hub: [FAIL] {e}")
-
- # 渠道 B: Web 搜索
- search_engine = _import_web_search()
- if search_engine:
- try:
- name = ctx["skill_name"]
- # 检测是否包含中文,调整查询策略
- has_cjk = any('\u4e00' <= c <= '\u9fff' for c in name)
-
- # 提取英文关键词(含字母数字组合词如 neo4j)
- import re as _re
- # 匹配至少包含一个字母的连续字符(保留数字,如 neo4j)
- en_kws = _re.findall(r'[a-zA-Z][a-zA-Z0-9]*', name)
- # 去重并排除单字母无意义词
- en_kws = sorted(set(w for w in en_kws if len(w) > 2))
- en_kw = " ".join(en_kws) if en_kws else ""
-
- # ── LLM 增强:智能生成搜索词 ──
- queries = _llm_generate_search_queries(name)
- if queries is None:
- # fallback: 原硬编码查询
- if has_cjk:
- queries = [
- f"{name} 最佳实践",
- f"{name} 实战 经验",
- f"{name} 技术方案 案例",
- ]
- if en_kw and len(en_kw) > 3:
- queries.extend([
- f"{en_kw} best practices tutorial",
- f"{en_kw} guide examples",
- ])
- else:
- _base_en = [
- f"{name.replace('_',' ')} tutorial",
- f"{name.replace('_',' ')} how to use",
- f"{name.replace('_',' ')} guide examples",
- f"{name.replace('_',' ')} getting started",
- f"learn {name.replace('_',' ')} beginner",
- ]
- queries = list(_base_en)
- # wiki_search 同义词扩展提升召回率
-                _syns = {
-                    "tutorial": ["guide", "best-practices"],
-                    "guide": ["handbook", "reference"],
-                    "examples": ["demo", "sample"],
-                    "beginner": ["intro", "quickstart"],
-                }
- for q in _base_en[:2]:
- for kw, syns in _syns.items():
- if kw in q:
- for s in syns[:2]:
- queries.append(q.replace(kw, s))
- if en_kw:
- queries.extend([
- f"{en_kw} best practices",
- f"{en_kw} tutorial",
- ])
- web_cases = []
- seen_urls = set()
- seen_titles = set()
- for q in queries:
- results = search_engine(q, size=5)
- for r in results:
- url = r.get("url", "")
- title = r.get("title", "").strip()
- if url and url not in seen_urls and title not in seen_titles:
- seen_urls.add(url)
- seen_titles.add(title or url)
- web_cases.append({
- "source": "web",
- "type": "web_article",
- "title": title,
- "url": url,
- "snippet": r.get("snippet", "")[:300]
- })
- all_cases.extend(web_cases)
- print(f" Web: {len(web_cases)} 条 (去重)")
-
- # ── 搜索引擎返回 0 条时,自动降级到 Wikipedia 搜索 ──
- if len(web_cases) == 0:
- try:
- wiki_fn = _web_search_wikipedia
-                    wiki_queries = [en_kw, name] if en_kws else [name]
- wiki_seen_urls = set()
- wiki_seen_titles = set()
- wiki_cases = []
- for wq in wiki_queries:
- wiki_results = wiki_fn(wq, size=5)
- for wr in wiki_results:
- title = wr.get("title", "").strip()
- url = wr.get("url", "")
- if url and url not in wiki_seen_urls and title not in wiki_seen_titles:
- wiki_seen_urls.add(url)
- wiki_seen_titles.add(title or url)
- wiki_cases.append({
- "source": "wikipedia",
- "type": "wiki_entry",
- "title": title,
- "url": url,
- "snippet": wr.get("snippet", "")[:300]
- })
- if wiki_cases:
- print(f" Wikipedia: {len(wiki_cases)} 条 (搜索引擎降级)")
- all_cases.extend(wiki_cases)
-
- except Exception as e:
- print(f" Wikipedia: [FAIL] {e}")
- except Exception as e:
- print(f" Web: [FAIL] {e}")
- else:
- print(f" Web: [FAIL] 搜索引擎不可用")
-
- # ── 渠道 C: Sophub ──
- try:
- from tools.skill_learn_from_cases.restore_funcs import _search_sophub
- sophub_cases = _search_sophub(ctx["skill_name"])
- if sophub_cases:
- print(f" Sophub: {len(sophub_cases)} 条")
- all_cases.extend(sophub_cases)
- except Exception:
- pass
-
- # 继承上一版案例(--force 时跳过)
- if os.environ.get("SKILL_FORCE_REFRESH") == "1":
- print(f" --force: 跳过继承旧案例")
- else:
- inherited_cases = dir_manager.get_latest_cases(ctx["skill_name"])
- if inherited_cases:
- # 去重(按 URL/Key 去重)
- seen_keys = set()
- for c in all_cases:
- key = c.get("url") or c.get("key") or ""
- seen_keys.add(key)
-            added = 0
-            for c in inherited_cases:
-                key = c.get("url") or c.get("key") or ""
-                if key and key not in seen_keys:
-                    all_cases.append(c)
-                    seen_keys.add(key)
-                    added += 1
-            # report the number actually added after dedup, not the raw count
-            print(f"   继承上一版: +{added} 条(去重后)")
-
- # 保存
- cases_file = ctx["rev_dir"] / "cases" / "all_cases.json"
- with open(cases_file, "w", encoding="utf-8") as f:
- json.dump(all_cases, f, indent=2, ensure_ascii=False)
- ctx["cases"] = all_cases
- print(f" 合计: {len(all_cases)} 条案例")
- print(f" [OK] 已保存")
-
-
-# ===============================================================
-# Phase 3: 分析提炼知识模式
-# ===============================================================
-
-def _decompose_skill_name(skill_name: str, cases: list) -> list:
- """将技能名分解为子主题,生成初始模式(0案例fallback用)"""
- if not cases:
- cases = []
-
- # 从技能名称中提取有意义片段
- words = skill_name.replace("_", " ").replace("-", " ")
- # 对中文名按语义切分
- parts = []
- buf = ""
- for ch in words:
- if '\u4e00' <= ch <= '\u9fff':
- if buf and not any('\u4e00' <= c <= '\u9fff' for c in buf):
- parts.append(buf.strip())
- buf = ""
- buf += ch
- elif ch == ' ':
- if buf:
- parts.append(buf.strip())
- buf = ""
- else:
- buf += ch
- if buf:
- parts.append(buf.strip())
- parts = [p for p in parts if len(p) > 1]
-
- # 基于常见技能后缀生成子主题模板映射
- topic_map = {
- # ── wiki/知识库/搜索 ──
- "wiki": "Wiki系统搭建与内容管理最佳实践",
- "search": "搜索算法与检索排序优化策略",
- "搜索": "搜索算法与检索排序优化策略",
- "文档": "文档结构化解析与关键信息提取",
- "知识库": "知识库构建与知识管理最佳实践",
- "knowledge": "知识库构建与知识管理最佳实践",
- "documentation": "文档编写与API文档自动化工具链",
- "doc": "文档编写与API文档自动化工具链",
- # ── 图像/凭证 ──
- "图像": "图像采集与预处理最佳实践",
- "图片": "图像采集与预处理最佳实践",
- "凭证": "凭证标准化与格式校验规范",
- "证件": "凭证标准化与格式校验规范",
- "鉴定": "鉴定流程与判定标准",
- "验证": "验证流程与防篡改机制",
- "识别": "识别算法选型与准确率优化",
- "检测": "异常检测与告警阈值设定",
- "审核": "审核流程自动化与风控策略",
- "审查": "审核流程自动化与风控策略",
- "贷款": "贷款业务场景与合规要求",
- "信贷": "贷款业务场景与合规要求",
- "金融": "金融级安全与数据隐私保护",
- "风控": "金融级安全与数据隐私保护",
- "OCR": "OCR识别与文字提取技术选型",
- "ocr": "OCR识别与文字提取技术选型",
- "防伪": "防伪特征检测与真伪鉴别",
- "篡改": "图像篡改检测与完整性校验",
- "安全": "安全防护与数据隐私合规",
- "合规": "合规审查与审计追溯",
- "小微": "小微金融业务风控体系",
- "合同": "合同关键条款抽取与比对",
- "报表": "报表自动生成与数据可视化",
- "签名": "电子签名与数字证书验证",
- "水印": "水印检测与防伪溯源",
- "卫星": "卫星影像几何校正与预处理",
- "遥感": "多光谱遥感数据分析与解译",
- "无人机": "无人机影像处理与拼接",
- "航拍": "航拍影像三维重建与正射校正",
- "SAR": "SAR雷达影像处理与目标识别",
- "雷达": "SAR雷达影像处理与目标识别",
- "光谱": "光谱分析在地物分类中的应用",
- "测绘": "测绘数据标准化与地图制图",
- "地理": "地理空间分析与GIS集成",
- "像素": "像素级影像融合与分辨率增强",
- "neo4j": "Neo4j图数据库建模与Cypher查询",
- "cypher": "Cypher查询语言核心语法与模式匹配",
- "图数据": "图数据建模与路径查询优化",
- "图查询": "图查询语言与遍历算法",
- "节点": "图数据库节点类型与属性设计",
- "关系": "图数据库关系建模与关联分析",
- "知识图谱": "知识图谱构建与图数据库存储",
- "图算法": "图算法在路径分析与社区发现中的应用",
- }
-
- sub_patterns = []
- seen = set()
- for part in parts:
- for keyword, pattern_text in topic_map.items():
- if keyword in part or keyword in skill_name:
- if keyword not in seen:
- seen.add(keyword)
- sub_patterns.append((f"{pattern_text}({skill_name[:20]})", 78))
-
- # 从案例标题/摘要中提取关键词,新增为低置信度模式
- case_keywords_found = set()
- for c in cases:
- text = (c.get("title","") + " " + c.get("snippet","")).lower()
- # 提取案例中的技术/业务关键词
- for tech_term in ["ocr","深度学习","cnn","数字签名","哈希校验",
- "篡改检测","防伪","水印","合规","审计",
- "风控","信用","评估","识别率","准确率",
- "卫星","遥感","光谱","雷达","sar","gis",
- "变化检测","目标检测","语义分割","像素",
- "多光谱","高光谱","无人机","航拍","配准",
- "neo4j","cypher","图数据库","知识图谱",
- "图算法","图查询","图遍历","节点","关系",
- "wiki","wikidata","wikipedia","sparql",
- "搜索","检索","搜索引擎","ranking",
- "api","rest","http","爬虫","crawler"]:
- if tech_term in text and tech_term not in seen:
- case_keywords_found.add(tech_term)
-
- for kw in case_keywords_found:
- sub_patterns.append((f"{kw}相关技术与最佳实践({skill_name[:20]})", 72))
- seen.add(kw)
-
- # 如果找不到匹配,生成通用结构
- if not sub_patterns:
- generic_subs = [
- f"{skill_name[:30]}核心概念与术语体系",
- f"{skill_name[:30]}常见场景与解决方案",
- f"{skill_name[:30]}工具链与环境搭建",
- ]
- sub_patterns = [(s, 70) for s in generic_subs]
-
- return sub_patterns[:5]
-
-def _extract_patterns_from_cases(cases: list[dict], skill_name: str) -> list[dict]:
- """从案例中提取知识模式(关键词匹配 + 启发式 + 技能感知)"""
- # ── 通用模式关键词库(基础设施相关,对所有技能适用) ──
- generic_patterns = {
- "production": {
- "keywords": ["production", "deploy", "prod", "release","deployment"],
- "principles": [
- ("使用环境变量/配置文件分离环境差异", "P_env_separation", 89),
- ("固定版本号避免意外升级", "P_pin_version", 94),
- ("资源限制防止单服务耗尽", "P_resource_limits", 85),
- ]
- },
- "reliability": {
- "keywords": ["restart", "health", "monitor", "recover", "resilient"],
- "principles": [
- ("配置重启策略确保服务自动恢复", "P_restart", 92),
- ("添加健康检查机制确保服务可用", "P_healthcheck", 88),
- ("配置日志轮转防止磁盘爆满", "P_logging", 90),
- ]
- },
- "testing_config": {
- "keywords": ["test", "validate", "config", "verify", "lint"],
- "principles": [
- ("部署前验证配置文件正确性", "P_config_validation", 93),
- ]
- },
- }
-
- # ── 技能领域模式库(按技能名和案例内容自动匹配) ──
- # 从 JSON 文件加载,方便用户扩展/修改
- _patterns_file = Path(__file__).parent / "skill_domain_patterns.json"
- if _patterns_file.exists():
- with open(_patterns_file, "r", encoding="utf-8") as _f:
- _raw = json.load(_f)
- # 将 JSON 中的 dict 格式 principles 还原为 (principle, id, confidence) 三元组
- skill_domain_patterns = {}
- for _domain, _info in _raw.items():
- skill_domain_patterns[_domain] = {
- "keywords": _info["keywords"],
- "principles": [
- (_p["principle"], _p["id"], _p["confidence"])
- for _p in _info["principles"]
- ]
- }
- else:
- skill_domain_patterns = {}
- print(" [WARN] skill_domain_patterns.json 不存在,跳过领域模式匹配")
-
- # ── 合并文本用于匹配 ──
- all_text = ""
- for c in cases:
- text = " ".join(str(v) for v in c.values() if isinstance(v, str))
- all_text += text.lower() + " "
-
- patterns = []
- seen_ids = set()
-
- # 匹配通用模式
- for category, info in generic_patterns.items():
- for kw in info["keywords"]:
- if kw in all_text:
- for principle, pid, conf in info["principles"]:
- if pid not in seen_ids:
- patterns.append({"id": pid, "principle": principle, "confidence": conf, "level": "basic"})
- seen_ids.add(pid)
- break
-
- # ── 技能名相关性过滤:通用模式必须与技能主题相关 ──
- skill_lower = skill_name.lower()
- rel_terms = set(skill_lower.replace("_", " ").replace("-", " ").split())
- # 为每个通用模式建相关性关键词映射
- generic_rel = {
- "production": {"deploy", "deployment", "production", "release", "docker", "container", "ci", "cd", "pipeline", "运维", "部署", "发布", "上线", "devops"},
- "reliability": {"restart", "health", "monitor", "recovery", "resilient", "monitoring", "failover", "监控", "可用性", "容错", "故障恢复"},
- "testing_config": {"test", "validate", "config", "verify", "lint", "测试", "验证", "配置", "校验"},
- }
- for p in patterns:
- if p.get("level") == "basic":
- category = None
- for cat, terms in generic_rel.items():
- if any(t in p["principle"].lower() for t in terms):
- category = cat
- break
- if category:
- cat_terms = generic_rel[category]
- if not (rel_terms & cat_terms):
- p["confidence"] = max(20, p["confidence"] - 20)
- try:
- from skill_search import SkillRegistry as _SR
- _sr = _SR()
- _matches = [r for r in _sr.skills if skill_name in r.key]
- if _matches and hasattr(_matches[0], 'tags') and _matches[0].tags:
- _rel_tags = set(_matches[0].tags[:10])
- for p in patterns:
- _p_lower = p["principle"].lower()
- _tag_match = sum(1 for t in _rel_tags if t.lower() in _p_lower or _p_lower[:10] in t.lower())
- if _tag_match == 0 and p.get("level") == "basic":
- p["confidence"] = max(15, p["confidence"] - 15)
- except Exception:
- pass
- skill_keywords = skill_name.lower().replace("_", " ").replace("-", " ")
- matched_domains = set()
- # 构建案例标题文本(仅标题——用于辅助检查)
- case_titles_text = " ".join([
- (c.get("title") or c.get("key") or "").lower()
- for c in cases
- ])
- for domain, info in skill_domain_patterns.items():
- # 技能名匹配(精确匹配 domain 名)
- if domain in skill_keywords:
- matched_domains.add(domain)
- continue
- # 关键词匹配:仅在技能名包含领域关键词时才触发(防止api/git等通用词误匹配)
- for kw in info["keywords"]:
- if kw in skill_keywords:
- # 技能名含领域关键词,且案例标题也提及 → 确认匹配
- if kw in case_titles_text or any(part in case_titles_text for part in skill_keywords.split() if len(part) > 2):
- matched_domains.add(domain)
- break
-
- for domain in matched_domains:
- for principle, pid, conf in skill_domain_patterns[domain]["principles"]:
- if pid not in seen_ids:
- patterns.append({"id": pid, "principle": principle, "confidence": conf, "level": "advanced"})
- seen_ids.add(pid)
-
-    # ── 从案例标题启发式提取(作为补充) ──
-    # placeholder: the per-case loop formerly here built title/desc/combined
-    # strings but had no side effects; title-based heuristics can plug in here
-
- # ── 始终合并领域专有模式(不依赖案例,直接从技能名语义分解) ──
- sub_ideas = _decompose_skill_name(skill_name, cases)
- added_domain_ids = set()
- for i, (sub_name, conf) in enumerate(sub_ideas):
- pid = f"P_domain_{i+1}"
- if pid not in seen_ids and pid not in added_domain_ids:
- patterns.append({
- "id": pid,
- "principle": sub_name,
- "confidence": conf,
- "level": "domain"
- })
- seen_ids.add(pid)
- added_domain_ids.add(pid)
-
- # ── 技能相关性评分:过滤/降权不相关的通用模式 ──
- skill_lower = skill_name.lower()
- # 从 skill_domain_patterns 中提取与技能名相关的领域前缀
- relevant_prefixes = set()
- for domain, info in skill_domain_patterns.items():
- if domain in skill_lower or any(kw in skill_lower for kw in info["keywords"]):
- # 从该领域第一条原则的ID提取领域前缀(如 P_fin_、P_img_、P_doc_)
- for p in info["principles"]:
- pid_str = p[1] if isinstance(p, tuple) else (p.get("id", "") if isinstance(p, dict) else "")
- parts = pid_str.split("_")
- if len(parts) >= 2 and parts[0] == "P":
- relevant_prefixes.add(f"P_{parts[1]}_")
- break
- # 始终包含 DOMAIN 模式
- relevant_prefixes.add("P_domain_")
-
- # ── 领域互斥:如果技能名含图数据库相关词,排除SQL的database领域 ──
- if any(kw in skill_lower for kw in ["图数据库", "neo4j", "cypher", "graph", "知识图谱",
- "图数据", "图查询", "图算法"]):
- if "P_db_" in relevant_prefixes:
- relevant_prefixes.discard("P_db_")
- print(f" 检测到图数据库技能,排除 SQL database 领域模式")
-
- for p in patterns:
- pid = p["id"]
- level = p.get("level", "basic")
- # DOMAIN 模式直接从 skill name 生成,保持高置信度
- if level == "domain":
- p["confidence"] = min(p["confidence"] + 5, 95)
- continue
- # 检查模式是否属于相关领域(按ID前缀匹配)
- is_relevant = any(pid.startswith(prefix) for prefix in relevant_prefixes)
- if level == "advanced":
- if is_relevant:
- p["confidence"] = min(p["confidence"] + 3, 95)
- else:
- # 不相关的高级模式降权
- p["confidence"] = max(p["confidence"] - 20, 40)
- # BASIC 通用模式降权
- if level == "basic":
- p["confidence"] = max(p["confidence"] - 10, 35)
-
- # ── 过滤低分噪声模式(置信度 < 65 的不保留) ──
- before = len(patterns)
- patterns = [p for p in patterns if p.get("confidence", 0) >= 65]
- filtered = before - len(patterns)
- if filtered:
- print(f" 过滤掉 {filtered} 个低相关性模式(置信度 < 65)")
- print(f" 保留 {len(patterns)} 个有效模式")
-
- # 排序(按置信度降序,domain 模式优先于同分)
- patterns.sort(key=lambda x: (x.get("confidence", 0), x.get("level", "")), reverse=True)
- return patterns
-
-
-# ═══════════════════════════════════════════════════════════════
-# LLM 增强: Phase 3 — 智能模式提取
-# ═══════════════════════════════════════════════════════════════
-
-def _llm_extract_patterns(cases: list[dict], skill_name: str) -> list[dict] | None:
- """
- 使用 LLM 从案例中提取知识模式。
-
- 返回 patterns 列表(每个含 id/principle/confidence/level),
- LLM 不可用或解析失败时返回 None(触发 fallback)。
- """
- if not llm_available():
- return None
-
- # 构造案例摘要(控制 token 量)
- case_summaries = []
- for c in cases[:12]: # 最多 12 条,控制成本
- title = c.get("title") or c.get("key") or "?"
- snippet = c.get("snippet") or c.get("description") or ""
- case_summaries.append(f"- [{title}] {snippet[:200]}")
-
- case_text = "\n".join(case_summaries)
-
- prompt = f"""技能名称: {skill_name}
-
-请分析以下案例,提取该技能领域的核心知识模式(最佳实践/原则/规范)。
-
-要求:
-1. 每个模式包含: id(如"P_xxx"), principle(具体原则描述), confidence(0-100,基于案例支持度), level("domain"或"advanced")
-2. 从案例真实内容中提炼,不要凭空编造
-3. 模式应该具有实践指导意义,不泛泛而谈
-4. 如果案例不足,可以适度基于领域常识补充,但降低 confidence
-
-案例:
-{case_text}
-
-请以 JSON 数组格式输出,每个元素: {{"id": "P_xxx", "principle": "...", "confidence": 85, "level": "domain"}}
-"""
-
- result = call_llm_json(prompt,
- system_prompt="你是技能学习专家,擅长从案例中提炼可操作的实践模式。输出纯 JSON 数组,不要额外说明。",
- temperature=0.3,
- max_tokens=4096)
-
- if result is None:
- return None
- if isinstance(result, list):
- # 验证结构
- valid = []
- for item in result:
- if isinstance(item, dict) and "id" in item and "principle" in item:
- item.setdefault("confidence", 75)
- item.setdefault("level", "domain")
- valid.append(item)
- if valid:
- print(f" [LLM] 智能模式提取: {len(valid)} 个模式")
- return valid
- return None
-
-
-def _llm_decompose_skill_name(skill_name: str, cases: list) -> list | None:
- """
- 使用 LLM 将技能名分解为子主题(当没有案例时的 fallback 改进)。
- 返回 [(子主题, 置信度), ...] 或 None。
- """
- if not llm_available():
- return None
-
- name_clean = skill_name.replace("_", " ").title()
- case_titles = []
- for c in (cases or [])[:8]:
- case_titles.append(c.get("title", "?") or c.get("key", "?"))
-
- case_hint = "\n".join(f"- {t}" for t in case_titles) if case_titles else "(暂无案例)"
-
- prompt = f"""技能名称: {name_clean}
-
-请将这个技能分解为 3~5 个子主题(sub-topics),每个子主题代表该领域的一个重要实践方向。
-子主题应该是可操作的、有区分度的,而非空泛概念。
-
-参考案例标题:
-{case_hint}
-
-请以 JSON 数组格式输出,每个元素: {{"topic": "子主题名称(含简要说明)", "confidence": 78}}
-confidence 表示该子主题与技能的相关程度(0-100)。
-"""
-
- result = call_llm_json(prompt,
- system_prompt="你是技能学习专家,善于将复杂技能拆解为可学习的子主题。输出纯 JSON 数组。",
- temperature=0.3,
- max_tokens=2048)
-
- if isinstance(result, list) and result:
- subs = []
- for item in result:
- if isinstance(item, dict) and "topic" in item:
- conf = item.get("confidence", 70)
- subs.append((item["topic"], conf))
- if subs:
- print(f" [LLM] 技能分解: {len(subs)} 个子主题")
- return subs
- return None
-
-
-def _phase3_analyze(ctx: dict):
- """Phase 3: 分析提炼知识模式(LLM增强版)"""
- print(f"\n{'-'*55}")
- print(" Phase 3: 模式提炼")
- print(f"{'-'*55}")
-
- cases = ctx.get("cases", [])
- skill_name = ctx["skill_name"]
-
- # ── LLM 增强路径 ──
- patterns = _llm_extract_patterns(cases, skill_name)
- if patterns is None:
- # fallback: 规则路径
- print(" [FALLBACK] 使用规则模式提取(LLM 不可用或解析失败)")
- patterns = _extract_patterns_from_cases(cases, skill_name)
- # 仍然用 LLM 增强技能分解
- llm_subs = _llm_decompose_skill_name(skill_name, cases)
- if llm_subs:
- added_ids = {p["id"] for p in patterns}
- for i, (sub_name, conf) in enumerate(llm_subs):
- pid = f"P_domain_llm_{i+1}"
- if pid not in added_ids:
- patterns.append({
- "id": pid,
- "principle": sub_name,
- "confidence": conf,
- "level": "domain"
- })
- added_ids.add(pid)
- else:
- # LLM 模式提取成功,仍用 LLM 补充技能分解子主题
- llm_subs = _llm_decompose_skill_name(skill_name, cases)
- if llm_subs:
- existing_ids = {p["id"] for p in patterns}
- for i, (sub_name, conf) in enumerate(llm_subs):
- pid = f"P_domain_llm_{i+1}"
- if pid not in existing_ids:
- patterns.append({
- "id": pid,
- "principle": sub_name,
- "confidence": conf,
- "level": "domain"
- })
- existing_ids.add(pid)
-
- # 合并历史模式(如果有继承)——过滤掉不相关的领域模式
- SKILL_DOMAIN_PREFIXES = {
- "async": ["P_fastapi_", "P_async_"],
- "fastapi": ["P_fastapi_", "P_async_"],
- "web_scraping": ["P_scrape_"],
- "scrape": ["P_scrape_"],
- "crawl": ["P_scrape_"],
-        "database": ["P_db_"],
-        "sql": ["P_db_"],
- "db": ["P_db_"],
- "git": ["P_git_"],
- "graph": ["P_gql_"],
- "neo4j": ["P_gql_"],
- "cypher": ["P_gql_"],
- "graph_database": ["P_gql_"],
- "network": ["P_net_"],
- "networking": ["P_net_"],
- "security": ["P_net_", "P_sec_"],
- "finance": ["P_doc_", "P_fin_"],
- "image": ["P_img_", "P_doc_"],
- "document": ["P_doc_"],
- "凭证": ["P_doc_"],
- "鉴定": ["P_doc_"],
- "satellite": ["P_rem_"],
- "remote_sensing": ["P_rem_"],
- "testing": ["P_test_"],
- "kubernetes": ["P_k8s_"],
- "k8s": ["P_k8s_"],
- "frontend": ["P_fe_"],
- "react": ["P_fe_"],
- "performance": ["P_perf_"],
- "wiki": ["P_domain_"],
- "search": ["P_domain_"],
- "知识": ["P_domain_"],
- "文档": ["P_domain_"],
- }
- existing = ctx.get("inherited_patterns", [])
- skill_lower = skill_name.lower()
- # 找出当前技能相关的领域前缀
- relevant_prefixes = set()
- for kw, prefixes in SKILL_DOMAIN_PREFIXES.items():
- if kw in skill_lower:
- relevant_prefixes.update(prefixes)
- # 通用模式(P_pin_, P_env_, P_res_ 等)总是保留
- always_keep = {"P_pin_", "P_env_", "P_res_", "P_health_", "P_log_", "P_cfg_"}
-
- filtered_existing = []
- filtered_count = 0
- for p in existing:
- pid = p.get("id", "")
- # 保留: 通用模式 / 技能相关领域模式 / domain模式 / llm模式
- if (any(pid.startswith(pre) for pre in always_keep) or
- any(pid.startswith(pre) for pre in relevant_prefixes) or
- pid.startswith("P_domain_") or
- pid.startswith("P_domain_llm_") or
- p.get("level") in ("basic", "generic")):
- filtered_existing.append(p)
- else:
- # 标记为已存在但不再加入
- filtered_count += 1
-
- existing_ids = {p["id"] for p in filtered_existing}
- merged = list(filtered_existing)
- for p in patterns:
- if p["id"] not in existing_ids:
- merged.append(p)
- existing_ids.add(p["id"])
-
- merged.sort(key=lambda x: x.get("confidence", 0), reverse=True)
-
- patterns_file = ctx["rev_dir"] / "patterns" / "knowledge_patterns.json"
- with open(patterns_file, "w", encoding="utf-8") as f:
- json.dump(merged, f, indent=2, ensure_ascii=False)
- ctx["patterns"] = merged
-
- print(f" 继承: {len(filtered_existing)} 个 (过滤掉 {filtered_count} 个不相关)")
- print(f" 新增: {len(patterns)} 个")
- print(f" 总计: {len(merged)} 个")
- for p in merged[:5]:
- print(f" [{p.get('confidence',0)}%] {p['principle'][:50]}")
- if len(merged) > 5:
- print(f" ... 还有 {len(merged)-5} 个")
- print(f" [OK] 已保存")
-
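`_phase3_analyze` 中按领域前缀过滤继承模式的合并逻辑,可以用下面的示意单独演示(函数名 `filter_inherited` 为示例自拟,前缀集取自源码的 `always_keep`):

```python
def filter_inherited(existing, relevant_prefixes,
                     always_keep=("P_pin_", "P_env_", "P_res_",
                                  "P_health_", "P_log_", "P_cfg_")):
    """保留通用模式/相关领域模式/domain 模式/基础模式,其余过滤。"""
    kept, dropped = [], 0
    for p in existing:
        pid = p.get("id", "")
        if (pid.startswith(tuple(always_keep))
                or pid.startswith(tuple(relevant_prefixes))
                or pid.startswith("P_domain_")
                or p.get("level") in ("basic", "generic")):
            kept.append(p)
        else:
            dropped += 1
    return kept, dropped

inherited = [
    {"id": "P_pin_deps"},      # 通用模式,总是保留
    {"id": "P_db_index"},      # 领域模式,与当前技能无关则过滤
    {"id": "P_domain_llm_1"},  # domain 模式,保留
]
kept, dropped = filter_inherited(inherited, relevant_prefixes=())
print([p["id"] for p in kept], dropped)
```

注意 `str.startswith` 接受前缀元组,空元组恒为 False,因此 `relevant_prefixes` 为空时不会误放行领域模式。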
-# ===============================================================
-# Phase 4: 构建验证工具
-# ===============================================================
-
-# ── 验证工具模板(外部文件) ──
-TEMPLATE_FILE = Path(__file__).parent / 'assess_template.py'
-
-
-
-def _phase4_build_tool(ctx: dict):
- """生成验证工具"""
- print(f"\n{'-'*55}")
- print(" Phase 4: 构建验证工具")
- print(f"{'-'*55}")
-
- patterns = ctx.get("patterns", [])
- patterns_json = json.dumps(patterns, indent=2, ensure_ascii=False)
- case_count = len(ctx.get("cases", []))
-
-    # 从外部模板文件读取(复用模块级 TEMPLATE_FILE,避免重复定义)
-    tool_code = TEMPLATE_FILE.read_text(encoding='utf-8')
- tool_code = tool_code.replace("__VERSION__", str(ctx["version"]))
- tool_code = tool_code.replace("__SKILL__", ctx["skill_name"])
- tool_code = tool_code.replace("__SKILL_DISPLAY__",
- ctx.get("skill_definition", {}).get("display_name", ctx["skill_name"]))
- tool_code = tool_code.replace("__PATTERNS_JSON__", patterns_json)
- tool_code = tool_code.replace("__CASE_COUNT__", str(case_count))
-
- tool_file = ctx["rev_dir"] / "tools" / "assess.py"
- with open(tool_file, "w", encoding="utf-8") as f:
- f.write(tool_code)
-
- # 语法检查
- try:
- compile(tool_code, str(tool_file), "exec")
- print(f" [OK] 工具已创建: tools/assess.py (语法检查通过)")
- except SyntaxError as e:
- print(f" [!] 工具已创建,但语法检查失败: {e}")
-
- ctx["tool_file"] = tool_file
-
- # ── 实践环节:检测所有匹配的 practical hook,复制到 practice/ ──
- skill_lower = ctx["skill_name"].lower()
- hooks_dir = Path(__file__).parent / "practical_hooks"
-
- # 关键词匹配规则
- hook_rules = [
- ("docker", "docker_compose.py"),
- ("compose", "docker_compose.py"),
- ("container", "docker_compose.py"),
- ("neo4j", "neo4j_hook.py"),
- ("cypher", "neo4j_hook.py"),
- ("graph_database", "neo4j_hook.py"),
- ("图数据库", "neo4j_hook.py"),
- ("sql", "sql.py"),
- ("mysql", "sql.py"),
- ("postgres", "sql.py"),
- ("git", "git.py"),
- ("async", "python_async.py"), # Python async 技能
- ("asyncio", "python_async.py"),
- ("图像", "document_check.py"), # 图像/文档鉴权技能
- ("凭证", "document_check.py"),
- ("证件", "document_check.py"),
- ("鉴定", "document_check.py"),
- ("ocr", "document_check.py"),
- ("image", "document_check.py"),
- ("ui", "ui_design_hook.py"), # UI/UX设计技能 → 设计工具验证
- ("ux", "ui_design_hook.py"),
- ("设计", "ui_design_hook.py"),
- ("prototype", "ui_design_hook.py"),
- ("handoff", "ui_design_hook.py"),
- ("react", "react_hook.py"),
- ("frontend", "react_hook.py"),
- ("hooks", "react_hook.py"),
- ]
-
- import shutil
- matched_hooks = []
- seen_hooks = set()
- # hook互斥规则:如果特定hook已匹配,则排除其互斥hook
- hook_exclusions = {
- "neo4j_hook.py": ["sql.py", "docker_compose.py"],
- "docker_compose.py": ["sql.py"],
- }
- for keyword, hook_name in hook_rules:
- if keyword in skill_lower and hook_name not in seen_hooks:
- # 互斥检查:如果已匹配的hook排斥当前hook则跳过
- if any(hook_name in excl_list for excl_hook, excl_list in hook_exclusions.items() if excl_hook in seen_hooks):
- continue
- hook_file = hooks_dir / hook_name
- if hook_file.exists():
- seen_hooks.add(hook_name)
- practice_target = ctx["rev_dir"] / "practice" / hook_name
- shutil.copy2(str(hook_file), str(practice_target))
- matched_hooks.append(hook_name)
-
- ctx["practice_hooks"] = matched_hooks
- ctx["has_practical"] = len(matched_hooks) > 0
- if matched_hooks:
- print(f" [OK] 实践环节: {', '.join(matched_hooks)}")
- else:
- print(" [OK] 无匹配实践")
-
-
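Phase 4 的关键词匹配加互斥规则,可以压缩成下面这个可独立运行的示意(`match_hooks` 为示例自拟,规则与互斥表节选自源码):

```python
def match_hooks(skill_name, rules, exclusions):
    """按关键词匹配 practical hooks,已匹配 hook 可排斥后续 hook。"""
    skill_lower = skill_name.lower()
    matched, seen = [], set()
    for keyword, hook in rules:
        if keyword in skill_lower and hook not in seen:
            # 互斥检查:已匹配 hook 的排斥列表含当前 hook 则跳过
            if any(hook in excl for h, excl in exclusions.items() if h in seen):
                continue
            seen.add(hook)
            matched.append(hook)
    return matched

rules = [("neo4j", "neo4j_hook.py"), ("docker", "docker_compose.py"),
         ("sql", "sql.py")]
exclusions = {"neo4j_hook.py": ["sql.py", "docker_compose.py"],
              "docker_compose.py": ["sql.py"]}
print(match_hooks("neo4j_sql_migration", rules, exclusions))
```

规则顺序即优先级:`neo4j` 先命中后,其互斥表会拦住随后本可命中的 `sql.py`。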
-
-# ===============================================================
-# Phase 5: 运行验证
-# ===============================================================
-
-def _phase5_validate(ctx: dict):
- """运行验证工具"""
- print(f"\n{'-'*55}")
- print(" Phase 5: 运行验证")
- print(f"{'-'*55}")
-
- tool_file = ctx.get("tool_file")
- if not tool_file or not tool_file.exists():
- print(" [FAIL] 验证工具不存在")
- return
-
- # 直接在本 Python 进程中运行
- try:
- # 安全:从父进程复制环境变量,过滤掉敏感密钥
- subprocess_env = os.environ.copy()
- # 过滤敏感密钥(仅限 API 密钥和 Token,保留数据库密码供 practical hook 使用)
- _sensitive_suffixes = ("_API_KEY", "_API_SECRET", "_ACCESS_KEY", "_SECRET_KEY", "_AUTH_TOKEN")
- for key in list(subprocess_env.keys()):
- if any(key.upper().endswith(suf) for suf in _sensitive_suffixes):
- del subprocess_env[key]
- # 确保 LLM 变量被传递
- for key in ("SKILL_LLM_ENABLE", "LLM_API_BASE", "LLM_API_KEY", "LLM_MODEL", "LLM_TIMEOUT"):
- if key in os.environ:
- subprocess_env[key] = os.environ[key]
- ga_root = str(Path(__file__).resolve().parents[2])
- subprocess_env.setdefault("PYTHONPATH", "")
- paths = subprocess_env["PYTHONPATH"].split(os.pathsep) if subprocess_env["PYTHONPATH"] else []
- if ga_root not in paths:
- subprocess_env["PYTHONPATH"] = ga_root + os.pathsep + subprocess_env["PYTHONPATH"]
-
- result = subprocess.run(
- [sys.executable, str(tool_file)],
- capture_output=True, text=True, timeout=90,
- env=subprocess_env
- )
- print(result.stdout)
- if result.stderr:
- print(f" [stderr]: {result.stderr[:200]}")
- except subprocess.TimeoutExpired:
- print(" [!] 验证超时")
- except Exception as e:
- print(f" [!] 运行失败: {e}")
-
- # 读取报告
- report_file = ctx["rev_dir"] / "reports" / "assessment.json"
- if report_file.exists():
- with open(report_file, encoding="utf-8") as f:
- report = json.load(f)
- ctx["assessment"] = report
- passed = report.get("passed", False)
- score = report.get("final_score", 0)
- print(f"\n {'='*55}")
- print(f" rev{ctx['version']} {'[OK] PASS' if passed else '[FAIL] FAIL'} ({score}/100)")
- print(f" {'='*55}")
-
- # 更新 meta.json
- meta_file = ctx["rev_dir"] / "meta.json"
- if meta_file.exists():
- with open(meta_file, encoding="utf-8") as f:
- meta = json.load(f)
- meta["status"] = "passed" if passed else "failed"
- meta["score"] = score
-            with open(meta_file, "w", encoding="utf-8") as f:
-                json.dump(meta, f, indent=2, ensure_ascii=False)
- else:
- print(" [!] 未生成验证报告")
-
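Phase 5 中按后缀剔除敏感环境变量的做法,可用一个最小示意说明(`scrub_env` 为示例自拟,后缀列表取自源码):

```python
def scrub_env(env):
    """复制环境变量字典并删除 API 密钥类条目(保留数据库密码等)。"""
    sensitive = ("_API_KEY", "_API_SECRET", "_ACCESS_KEY",
                 "_SECRET_KEY", "_AUTH_TOKEN")
    return {k: v for k, v in env.items()
            if not k.upper().endswith(sensitive)}

env = {"OPENAI_API_KEY": "sk-xxx", "neo4j_password": "pw",
       "AWS_SECRET_KEY": "aws", "PATH": "/usr/bin"}
print(sorted(scrub_env(env)))
```

用后缀白名单而非逐个枚举密钥名,使新增的 `*_API_KEY` 变量无需改代码即可被过滤;小写的 `neo4j_password` 刻意保留,供 practical hook 使用。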
-
-# ===============================================================
-# 主入口
-# ===============================================================
-
-def learn_skill(skill_name: str):
- """
- 案例驱动技能学习完整流程
-
- 用法:
- learn_skill("docker_compose_production")
- """
- ctx = {}
-
- try:
- ctx = _phase0_bootstrap(skill_name)
-
- # ── 环境探测 (Phase 0.5) ──
- try:
- from tools.skill_learn_from_cases.env_detector import detect_all
- ctx["env"] = detect_all()
- # 如果有可用环境,给用户提示
- available = [k for k,v in ctx["env"].items() if v.get("available")]
- need_auth = [k for k,v in ctx["env"].items() if v.get("url") and not v.get("auth")]
- if available:
- print(f" [环境] 可用: {', '.join(available)}")
- if need_auth:
-                print(f" [环境] ⚠ 需密码: {', '.join(need_auth)}"
-                      f"(设置 {', '.join(k + '_password' for k in need_auth)} 环境变量)")
- except Exception as e:
- print(f" [环境] [!] 探测失败: {e}")
- ctx["env"] = {}
-
- _phase1_define(ctx)
- _phase2_search(ctx)
- _phase3_analyze(ctx)
- _phase4_build_tool(ctx)
- _phase5_validate(ctx)
- except KeyboardInterrupt:
-        print("\n [!] 用户中断")
- return
- except Exception as e:
- print(f"\n [FAIL] 流程异常: {e}")
- import traceback
- traceback.print_exc()
- return
-
- # 输出总结
- assessment = ctx.get("assessment", {})
-    print(f"\n 总结: rev{ctx.get('version','?')} | "
- f"模式: {len(ctx.get('patterns',[]))}个 | "
- f"评分: {assessment.get('final_score','?')}/100")
- print(f" 目录: {ctx.get('rev_dir','')}")
-
- # 生成 Markdown 报告
- _generate_markdown_report(ctx)
-
-
-def _generate_markdown_report(ctx: dict):
- """生成人类可读的 Markdown 学习报告"""
- rev_dir = ctx.get("rev_dir")
- if not rev_dir:
- return
- skill = ctx.get("skill_name", "unknown")
- version = ctx.get("version", 0)
- patterns = ctx.get("patterns", [])
- cases = ctx.get("cases", [])
- assessment = ctx.get("assessment", {})
- score = assessment.get("final_score", 0)
- passed = assessment.get("passed", False)
- inherited = ctx.get("inherited_patterns", [])
- inherited_ids = {p["id"] for p in inherited}
- prev_version = version - 1 if version > 1 else None
-
- # 分割模式
- domains = [p for p in patterns if p.get("level") == "domain"]
- advanced = [p for p in patterns if p.get("level") == "advanced"]
- basics = [p for p in patterns if p.get("level") == "basic"]
-
- # 区分继承vs新增
- inherited_cnt = sum(1 for p in patterns if p["id"] in inherited_ids)
- new_cnt = len(patterns) - inherited_cnt
-
- lines = []
- lines.append(f"# 技能学习报告: {skill}")
- lines.append(f"")
- lines.append(f"| 属性 | 值 |")
- lines.append(f"|------|-----|")
- lines.append(f"| 版本 | rev{version} |")
- lines.append(f"| 评分 | {score}/100 {'PASS' if passed else 'FAIL'} |")
- lines.append(f"| 案例数 | {len(cases)} 条 |")
- lines.append(f"| 模式总数 | {len(patterns)} 个 |")
- if prev_version:
- lines.append(f"| 继承自 rev{prev_version} | {inherited_cnt} 个 |")
- lines.append(f"| 新增 | {new_cnt} 个 |")
- lines.append(f"")
- lines.append(f"## 知识模式")
- lines.append(f"")
- if domains:
- lines.append(f"### 领域专有 ({len(domains)}个)")
- for p in sorted(domains, key=lambda x: -x.get("confidence", 0)):
- lines.append(f"- [{p['confidence']}%] {p['principle']}")
- lines.append(f"")
- if advanced:
- lines.append(f"### 高级模式 ({len(advanced)}个)")
- for p in sorted(advanced, key=lambda x: -x.get("confidence", 0)):
- lines.append(f"- [{p['confidence']}%] {p['principle']}")
- lines.append(f"")
- if basics:
- lines.append(f"### 基础模式 ({len(basics)}个)")
- for p in sorted(basics, key=lambda x: -x.get("confidence", 0)):
- lines.append(f"- [{p['confidence']}%] {p['principle']}")
- lines.append(f"")
-
- # 案例摘要
- if cases:
- lines.append(f"## 参考案例 ({len(cases)}条)")
- lines.append(f"")
- for c in cases[:10]:
- title = c.get("title", c.get("key", "?"))
- url = c.get("url", "")
- if url:
- lines.append(f"- [{title}]({url})")
- else:
- lines.append(f"- {title}")
-
- report_path = rev_dir / "reports" / "learning_report.md"
- with open(report_path, "w", encoding="utf-8") as f:
- f.write("\n".join(lines))
- print(f" [OK] 学习报告: reports/learning_report.md")
diff --git a/tools/skill_learn_from_cases/env_detector.py b/tools/skill_learn_from_cases/env_detector.py
deleted file mode 100644
index d8f2d66a..00000000
--- a/tools/skill_learn_from_cases/env_detector.py
+++ /dev/null
@@ -1,169 +0,0 @@
-"""
-env_detector.py — 环境探测模块
-
-自动探测本机可用的数据库/工具服务,供 practical hooks 使用。
-探测结果写入 ctx["env"],Phase 5 据此决定是否执行真实实操测试。
-
-规则:
- - 环境变量中有密码/密钥 → 自动连接测试
- - 端口开放但无凭据 → 记录为"需用户确认"
- - 既无端口也无凭据 → 标记为不可用
-"""
-
-import os
-import socket
-import subprocess
-import json
-from pathlib import Path
-
-_GA_ROOT = Path(__file__).resolve().parents[2]
-
-
-def _port_open(host: str, port: int, timeout: float = 2.0) -> bool:
- """检查端口是否开放"""
- s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
- s.settimeout(timeout)
- try:
- return s.connect_ex((host, port)) == 0
- finally:
- s.close()
-
-
-def _check_env(name: str) -> str:
-    """读取环境变量的值,未设置时返回空串"""
- return os.environ.get(name, "")
-
-
-def detect_neo4j() -> dict:
- """
- 探测 Neo4j 环境
- 返回: {"available": bool, "url": str, "auth": bool, "note": str}
- """
- result = {"available": False, "url": "", "auth": False, "note": ""}
-
- bolt_open = _port_open("127.0.0.1", 7687)
- http_open = _port_open("127.0.0.1", 7474)
- password = _check_env("neo4j_password")
-
- if not (bolt_open or http_open):
- result["note"] = "Neo4j 端口未开放"
- return result
-
- if password:
- result["available"] = True
- result["auth"] = True
- result["url"] = "bolt://localhost:7687"
- result["note"] = "Neo4j 就绪(已配置密码)"
- else:
- result["url"] = "bolt://localhost:7687"
- result["note"] = "Neo4j 端口开放,但未设置 neo4j_password 环境变量"
-
- return result
-
-
-def detect_docker() -> dict:
- """探测 Docker (通过 WSL)"""
- result = {"available": False, "note": ""}
- try:
- r = subprocess.run(
- ["wsl.exe", "--exec", "bash", "-c", "docker info --format '{{.ServerVersion}}' 2>/dev/null"],
- capture_output=True, text=True, timeout=5
- )
- if r.returncode == 0 and r.stdout.strip():
- result["available"] = True
- result["note"] = f"Docker {r.stdout.strip()} (WSL)"
-    except Exception:
-        result["note"] = "Docker 不可用"
- return result
-
-
-def detect_sqlite() -> dict:
- """探测 SQLite CLI"""
- result = {"available": False, "note": ""}
- for path in [
- r"D:\ProgramData\miniconda3\Library\bin\sqlite3.exe",
- r"C:\Program Files\SQLite\sqlite3.exe",
- ]:
- if os.path.exists(path):
- try:
- r = subprocess.run([path, "--version"], capture_output=True, text=True, timeout=3)
- if r.returncode == 0:
- result["available"] = True
- result["path"] = path
- result["note"] = f"SQLite {r.stdout.strip()}"
- break
-            except Exception:
-                pass
- if not result["available"]:
- result["note"] = "SQLite CLI 未找到"
- return result
-
-
-def detect_git() -> dict:
- """探测 Git"""
- result = {"available": False, "note": ""}
- try:
- r = subprocess.run(["git", "--version"], capture_output=True, text=True, timeout=3)
- if r.returncode == 0:
- result["available"] = True
- result["note"] = r.stdout.strip()
-    except Exception:
-        result["note"] = "Git 不可用"
- return result
-
-
-def detect_all() -> dict:
- """全量环境探测"""
- print(" [探测] 扫描本机可用环境...")
- env = {
- "neo4j": detect_neo4j(),
- "docker": detect_docker(),
- "sqlite": detect_sqlite(),
- "git": detect_git(),
- "paddle_ocr": detect_paddle_ocr(),
- }
-
- # 汇总
- available = [k for k, v in env.items() if v.get("available")]
- need_auth = [k for k, v in env.items() if v.get("url") and not v.get("auth")]
-
- print(f" [探测] 可用: {', '.join(available) if available else '无'}")
-
- # 询问用户缺失的凭据(仅交互模式下)
- for item in need_auth:
- key = f"{item}_password"
- if key not in os.environ:
- try:
- from ga import ask_user
- pw = ask_user(f"检测到 {item} 服务(端口开放),请输入密码\n(设置环境变量 {key} 可跳过此提示)")
- if pw and pw.strip():
- os.environ[key] = pw.strip()
- env[item]["auth"] = True
- print(f" [探测] ✅ {item} 密码已通过用户提供")
- except ImportError:
- print(f" [探测] ⚠ {item}: 端口已开放,需要密码(设置 {key} 环境变量)")
-
- return env
-
-
-def detect_paddle_ocr() -> dict:
-    """探测 PaddleOCR-VL (llama-server on :8090)"""
-    if not _port_open("127.0.0.1", 8090):
-        return {"available": False}
-    try:
-        import urllib.request
-        req = urllib.request.Request("http://localhost:8090/v1/models")
-        with urllib.request.urlopen(req, timeout=3) as resp:
-            models = json.loads(resp.read())
-        for m in models.get("models", []):
-            name = m.get("name", "")
-            if "ocr" in name.lower() or "paddle" in name.lower():
-                return {"available": True, "url": "http://localhost:8090/v1/chat/completions", "model": "PaddleOCR-VL-1.5-GGUF", "auth": True, "note": f"PaddleOCR-VL: {name}"}
-        return {"available": True, "url": "http://localhost:8090/v1/chat/completions", "note": "llama-server running"}
-    except Exception:
-        return {"available": False}
-
-
-if __name__ == "__main__":
-    # 独立测试(须位于 detect_paddle_ocr 定义之后,否则 detect_all 调用它会 NameError)
-    print(json.dumps(detect_all(), indent=2, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/llm_helper.py b/tools/skill_learn_from_cases/llm_helper.py
deleted file mode 100644
index 41093010..00000000
--- a/tools/skill_learn_from_cases/llm_helper.py
+++ /dev/null
@@ -1,218 +0,0 @@
-"""
-llm_helper.py — 大模型调用统一接口(零外部依赖 + LLM 响应缓存)
-
-渐进增强设计:
- - LLM 可用时(环境变量 SKILL_LLM_ENABLE=1 + LLM_API_BASE 可达)
- - 不可用时自动降级,不影响流程
- - 自动缓存 LLM 响应到磁盘,相同 prompt 不重复调用
-
-环境变量:
- SKILL_LLM_ENABLE=1 启用 LLM 增强
- LLM_API_BASE 兼容 OpenAI Chat API 的端点(默认 http://localhost:11434/v1)
- LLM_API_KEY 可选 API Key
- LLM_MODEL 模型名(默认 qwen2.5:7b)
- LLM_TIMEOUT HTTP 超时秒数(默认 30)
- LLM_CACHE_DIR 缓存目录(默认 {GA_ROOT}/.llm_cache)
- LLM_CACHE_TTL 缓存有效期秒数(默认 86400=1天)
-"""
-
-import json
-import os
-import urllib.request
-import urllib.error
-import sys
-import hashlib
-import time
-from pathlib import Path
-from tools.skill_learn_from_cases.logging_setup import logger
-
-# ── 环境变量获取(仅读一次) ──
-_ENABLED = os.environ.get("SKILL_LLM_ENABLE", "0") == "1"
-_API_BASE = os.environ.get("LLM_API_BASE", "http://localhost:11434/v1").rstrip("/")
-_API_KEY = os.environ.get("LLM_API_KEY", "")
-_MODEL = os.environ.get("LLM_MODEL", "qwen2.5:7b")
-_TIMEOUT = int(os.environ.get("LLM_TIMEOUT", "30"))
-
-# ── 缓存配置 ──
-_CACHE_ENABLED = os.environ.get("LLM_CACHE_ENABLE", "1") == "1"
-_CACHE_TTL = int(os.environ.get("LLM_CACHE_TTL", "86400")) # 默认1天
-_GA_ROOT = Path(__file__).resolve().parents[2]
-_CACHE_DIR = Path(os.environ.get("LLM_CACHE_DIR", str(_GA_ROOT / ".llm_cache")))
-
-# 可用性缓存
-_availability = None
-
-
-def _get_cache_key(prompt: str, system_prompt: str) -> str:
- """生成缓存键(prompt + system_prompt 的 SHA256 前 24 位)"""
- raw = f"{_MODEL}||{system_prompt}||{prompt}"
- return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:24]
-
-
-def _load_cache(cache_key: str) -> str | None:
- """从磁盘加载缓存(过期返回 None)"""
- if not _CACHE_ENABLED:
- return None
- cache_file = _CACHE_DIR / cache_key
- if not cache_file.exists():
- return None
- try:
- data = json.loads(cache_file.read_text(encoding="utf-8"))
- if time.time() - data["ts"] > _CACHE_TTL:
- cache_file.unlink(missing_ok=True)
- return None
- return data["response"]
- except Exception:
- return None
-
-
-def _save_cache(cache_key: str, response: str):
- """保存响应到磁盘缓存"""
- if not _CACHE_ENABLED or not response:
- return
- try:
- _CACHE_DIR.mkdir(parents=True, exist_ok=True)
- cache_file = _CACHE_DIR / cache_key
- cache_file.write_text(
- json.dumps({"ts": time.time(), "response": response}, ensure_ascii=False),
- encoding="utf-8"
- )
- except Exception:
- pass
-
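`_get_cache_key` 的键生成方案(模型名与两段 prompt 串联后取哈希前缀)可以独立验证其确定性(`cache_key` 为示例自拟):

```python
import hashlib

def cache_key(model, system_prompt, prompt):
    """缓存键:model 与 prompts 串联后取 SHA256 前 24 位十六进制。"""
    raw = f"{model}||{system_prompt}||{prompt}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:24]

k1 = cache_key("qwen2.5:7b", "sys", "hello")
k2 = cache_key("qwen2.5:7b", "sys", "hello")  # 相同输入 → 相同键
k3 = cache_key("other", "sys", "hello")       # 换模型 → 不同键
print(len(k1), k1 == k2, k1 == k3)
```

把模型名纳入键中,切换 `LLM_MODEL` 后不会误命中旧模型的缓存响应。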
-
-def llm_available() -> bool:
- """检查 LLM 是否可用(连接测试 + 环境变量开关)"""
- global _availability
- if _availability is not None:
- return _availability
-
- if not _ENABLED:
- print(" [LLM] SKIP: SKILL_LLM_ENABLE 未设置或不为 1")
- _availability = False
- return False
-
- # 快速连接测试
- try:
- test_url = f"{_API_BASE}/models"
- req = urllib.request.Request(test_url, method="GET")
- if _API_KEY:
- req.add_header("Authorization", f"Bearer {_API_KEY}")
- with urllib.request.urlopen(req, timeout=5) as resp:
- _availability = resp.status == 200
- if _availability:
- logger.info("LLM available: %s", _API_BASE)
- else:
- logger.error("LLM endpoint returned %s", resp.status)
- return _availability
- except Exception as e:
- logger.error("LLM connection failed: %s — %s", _API_BASE, e)
- _availability = False
- return False
-
-
-def call_llm(
- prompt: str,
- system_prompt: str = "You are a helpful AI assistant.",
- temperature: float = 0.3,
- max_tokens: int = 2048,
-) -> str:
- """
- 调用 LLM(OpenAI 兼容 API),返回文本响应。
- 自动缓存相同 prompt 的响应到磁盘。
- """
- if not llm_available():
- return ""
-
- # 低温度时启用缓存(高温度每次都新鲜调用)
- use_cache = _CACHE_ENABLED and temperature <= 0.3
- if use_cache:
- cache_key = _get_cache_key(prompt, system_prompt)
- cached = _load_cache(cache_key)
- if cached is not None:
- return cached
-
- payload = {
- "model": _MODEL,
- "messages": [
- {"role": "system", "content": system_prompt},
- {"role": "user", "content": prompt},
- ],
- "temperature": temperature,
- "max_tokens": max_tokens,
- "stream": False,
- }
-
- body = json.dumps(payload).encode("utf-8")
- url = f"{_API_BASE}/chat/completions"
-
- req = urllib.request.Request(url, data=body, method="POST")
- req.add_header("Content-Type", "application/json")
- if _API_KEY:
- req.add_header("Authorization", f"Bearer {_API_KEY}")
-
- try:
- with urllib.request.urlopen(req, timeout=_TIMEOUT) as resp:
- result = json.loads(resp.read().decode("utf-8"))
- content = (
- result.get("choices", [{}])[0]
- .get("message", {})
- .get("content", "")
- )
- content = content.strip()
- # 缓存非空响应(低温度才缓存)
- if use_cache and content:
- _save_cache(cache_key, content)
- return content
- except urllib.error.HTTPError as e:
- print(f" [LLM] HTTP {e.code}: {e.read().decode('utf-8', errors='replace')[:200]}")
- except Exception as e:
- logger.error("LLM call failed: %s", e)
-
- return ""
-
-
-def call_llm_json(
- prompt: str,
- system_prompt: str = "You are a helpful AI assistant. Always output valid JSON.",
- temperature: float = 0.2,
- max_tokens: int = 4096,
-) -> dict | list | None:
- """
- 调用 LLM 并解析 JSON 响应。
-
- 返回:
- 解析后的 Python 对象(dict/list),失败或非 JSON 时返回 None
- """
- text = call_llm(prompt, system_prompt, temperature, max_tokens)
- if not text:
- return None
-
- # 尝试提取 JSON 块(处理 LLM 可能在 markdown 代码块中返回的情况)
- import re
-
-    # 提取 ```json ... ``` 或 ``` ... ``` 代码块((?:json)? 已覆盖无标签围栏)
-    json_match = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", text)
-    if json_match:
-        text = json_match.group(1).strip()
-
- try:
- return json.loads(text)
- except json.JSONDecodeError as e:
- logger.warning("LLM JSON parse failed: %s", e)
- logger.debug("LLM raw response: %s", text[:200])
- return None
-
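`call_llm_json` 的"剥围栏再解析"流程可以抽成一个可独立运行的示意(`extract_json` 为示例自拟,正则与源码一致):

```python
import json
import re

def extract_json(text):
    """剥离可选的 Markdown 代码块围栏后解析 JSON,失败返回 None。"""
    m = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", text)
    if m:
        text = m.group(1).strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

print(extract_json('```json\n[{"id": "P_1", "confidence": 85}]\n```'))
print(extract_json("这不是 JSON"))
```

`(?:json)?` 使同一个正则同时覆盖带标签与不带标签的围栏;裸 JSON 则直接落到 `json.loads`。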
-
-def clear_cache():
- """清除所有 LLM 缓存"""
- if _CACHE_DIR.exists():
- import shutil
- shutil.rmtree(_CACHE_DIR)
- print(f" [LLM] 缓存已清除: {_CACHE_DIR}")
diff --git a/tools/skill_learn_from_cases/logging_setup.py b/tools/skill_learn_from_cases/logging_setup.py
deleted file mode 100644
index 73ce3143..00000000
--- a/tools/skill_learn_from_cases/logging_setup.py
+++ /dev/null
@@ -1,43 +0,0 @@
-"""
-logging_setup.py — 结构化日志配置
-
-依据 structured_logging 技能模式:
- - 日志同时输出到文件和控制台
- - 日志级别: DEBUG/INFO/WARNING/ERROR
- - 使用结构化格式便于排查
-
-用法:
- from tools.skill_learn_from_cases.logging_setup import logger
- logger.info("Phase 1 completed")
- logger.debug("Search query: %s", query)
- logger.error("LLM call failed: %s", e)
-"""
-import logging, sys, os
-
-_LOG_FORMAT = "%(asctime)s [%(levelname)s] %(message)s"
-_LOG_DIR = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(__file__))), "temp")
-
-def setup_logger(name: str = "skill_learn", level: int = logging.WARNING) -> logging.Logger:
- """配置日志器"""
- logger = logging.getLogger(name)
- if logger.handlers:
- return logger
-
- logger.setLevel(level)
-
- # 控制台 handler (WARNING 及以上)
- console = logging.StreamHandler(sys.stderr)
- console.setLevel(logging.WARNING)
- console.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
- logger.addHandler(console)
-
- # 文件 handler (所有级别)
- os.makedirs(_LOG_DIR, exist_ok=True)
- fh = logging.FileHandler(os.path.join(_LOG_DIR, "skill_learn.log"), encoding="utf-8")
- fh.setLevel(logging.DEBUG)
- fh.setFormatter(logging.Formatter(_LOG_FORMAT))
- logger.addHandler(fh)
-
- return logger
-
-logger = setup_logger()
diff --git a/tools/skill_learn_from_cases/name_converter.py b/tools/skill_learn_from_cases/name_converter.py
deleted file mode 100644
index fce14c70..00000000
--- a/tools/skill_learn_from_cases/name_converter.py
+++ /dev/null
@@ -1,81 +0,0 @@
-"""
-name_converter.py — 中英技能名转换
-
-将中文技能名(如"金融图像凭证鉴定")转换为标准英文名(finance_image_voucher_verification)
-用于统一目录命名和搜索引擎查询优化。
-
-用法:
- from tools.skill_learn_from_cases.name_converter import convert_name
- en_name = convert_name("金融图像凭证鉴定")
- # → "finance_image_voucher_verification"
-"""
-
-from pathlib import Path
-import json
-import re
-
-# 中文→英文技术术语映射(可编辑扩展)
-_MAPPING_FILE = Path(__file__).parent / "chinese_to_english.json"
-
-# 内置缓存
-_mapping_cache = None
-
-def _load_mapping() -> dict:
- """加载中英映射字典"""
- global _mapping_cache
- if _mapping_cache is not None:
- return _mapping_cache
- mapping = {}
- if _MAPPING_FILE.exists():
- with open(_MAPPING_FILE, 'r', encoding='utf-8') as f:
- mapping = json.load(f)
- _mapping_cache = mapping
- return mapping
-
-
-def convert_name(skill_name: str) -> str:
- """将任意技能名转换为标准英文名(下划线分隔)"""
- if not skill_name:
- return "unknown"
-
- has_cjk = any('\u4e00' <= c <= '\u9fff' for c in skill_name)
-
- # 纯英文名:直接规范化
- if not has_cjk:
- safe = skill_name.strip().lower().replace(" ", "_").replace("-", "_")
- # 路径注入防护
- safe = re.sub(r'[^\w\-\u4e00-\u9fff]', '', safe).strip('_')
- return safe or "unknown"
-
- mapping = _load_mapping()
- seen = set()
- result = []
-
- # 1. 提取英文关键词(保留数字组合如 neo4j)
- en_words = re.findall(r'[a-zA-Z][a-zA-Z0-9]*', skill_name)
- for w in en_words:
- w = w.lower()
- if len(w) >= 2 and w not in seen:
- seen.add(w)
- result.append(w)
-
- # 2. 中文映射(按关键词长度降序,优先匹配长词,匹配后消耗文本)
- remaining = skill_name
- sorted_mapping = sorted(mapping.items(), key=lambda x: -len(x[0]))
- for zh, en in sorted_mapping:
- if zh in remaining:
- for word in en.split("_"):
- word = word.strip()
- if word and word not in seen:
- seen.add(word)
- result.append(word)
- # 消耗匹配的中文文本(防止"数据"重复匹配"数据库")
- remaining = remaining.replace(zh, " " * len(zh), 1)
-
- return "_".join(result) if result else skill_name.strip().lower().replace(" ", "_")
-
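`convert_name` 中"长词优先 + 消耗已匹配文本"的策略,用一个自带内联映射的示意最容易看清(映射内容为示例自拟):

```python
import re

def convert_name(skill_name, mapping):
    """中文技能名 → 英文下划线名:长词优先匹配,并消耗已匹配文本。"""
    seen, result = set(), []
    # 英文关键词直接保留(如 neo4j)
    for w in re.findall(r"[a-zA-Z][a-zA-Z0-9]*", skill_name):
        w = w.lower()
        if len(w) >= 2 and w not in seen:
            seen.add(w); result.append(w)
    remaining = skill_name
    for zh, en in sorted(mapping.items(), key=lambda x: -len(x[0])):
        if zh in remaining:
            for word in en.split("_"):
                if word and word not in seen:
                    seen.add(word); result.append(word)
            # 用空格占位消耗匹配文本,防止"数据"再次命中"数据库"
            remaining = remaining.replace(zh, " " * len(zh), 1)
    return "_".join(result)

mapping = {"数据库": "database", "数据": "data", "优化": "optimization"}
print(convert_name("数据库优化", mapping))
```

若不消耗文本,"数据库"命中后短词"数据"会对同一段文本重复命中,产出 `database_data_optimization` 这类冗余名。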
-
-def refresh_cache():
- """清除缓存,下次调用将重新加载映射文件(编辑后调用)"""
- global _mapping_cache
- _mapping_cache = None
diff --git a/tools/skill_learn_from_cases/practical_hooks/docker_compose.py b/tools/skill_learn_from_cases/practical_hooks/docker_compose.py
deleted file mode 100644
index ac582d18..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/docker_compose.py
+++ /dev/null
@@ -1,236 +0,0 @@
-#!/usr/bin/env python3
-"""Docker Compose 实操测试 — 真实 compose config 校验
-
-被 assess_template.py 的 run_practical_test() 调用,
-输出 JSON: {"score": int(0-100), "passed": bool, "note": str}
-"""
-import json, subprocess, tempfile, os, sys, platform
-import shutil
-
-
-def _win_to_wsl_path(win_path):
- """Windows路径转WSL路径: D:\\foo\\bar -> /mnt/d/foo/bar"""
- if not win_path or len(win_path) < 2 or win_path[1] != ":":
- return win_path.replace("\\", "/")
- drive = win_path[0].lower()
- rest = win_path[2:].replace("\\", "/")
- return "/mnt/" + drive + rest
-
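`_win_to_wsl_path` 的盘符换算规则(`D:\foo\bar` → `/mnt/d/foo/bar`)可以这样独立验证(`win_to_wsl` 为示例自拟):

```python
def win_to_wsl(path):
    """Windows 盘符路径 → WSL 挂载路径;非盘符路径仅统一分隔符。"""
    if len(path) < 2 or path[1] != ":":
        return path.replace("\\", "/")
    drive = path[0].lower()  # WSL 挂载点使用小写盘符
    return "/mnt/" + drive + path[2:].replace("\\", "/")

print(win_to_wsl("D:\\tmp\\compose.yml"))
print(win_to_wsl("relative\\dir"))
```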
-
-def _docker_cmd(cmd_list, **kwargs):
- """运行 docker 命令,Windows 下通过 wsl.exe 调用"""
- if platform.system() == "Windows":
- wsl = shutil.which("wsl.exe") or "wsl.exe"
- cmd = [wsl, "--exec"] + cmd_list
- else:
- cmd = cmd_list
- default_kwargs = {"capture_output": True, "text": True, "timeout": 30}
- default_kwargs.update(kwargs)
- return subprocess.run(cmd, **default_kwargs)
-
-
-def check_docker_available():
-    """检测 Docker 引擎是否可用,返回 (可用, 版本或错误信息)"""
-    try:
-        r = _docker_cmd(["docker", "info", "--format", "{{.ServerVersion}}"])
-        if r.returncode == 0:
-            return True, r.stdout.strip()
-        return False, r.stderr[:100]
-    except Exception as e:
-        return False, str(e)
-
-
-PRODUCTION_COMPOSE = """\
-services:
- app:
- image: myapp:${APP_VERSION:-latest}
- env_file:
- - .env.production
- volumes:
- - app_data:/data/app
- depends_on:
- db:
- condition: service_healthy
- redis:
- condition: service_started
- healthcheck:
- test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
- interval: 30s
- timeout: 10s
- retries: 3
- start_period: 40s
- deploy:
- replicas: 3
- resources:
- limits:
- cpus: "2.0"
- memory: "1G"
- reservations:
- cpus: "0.5"
- memory: "256M"
- networks:
- - frontend
- - backend
- logging:
- driver: "json-file"
- options:
- max-size: "10m"
- max-file: "3"
- restart: unless-stopped
-
- db:
- image: postgres:16-alpine
- restart: unless-stopped
- shm_size: 256m
- environment:
- POSTGRES_DB: myapp
- POSTGRES_USER: myapp
- POSTGRES_PASSWORD_FILE: /run/secrets/db_password
- secrets:
- - db_password
- volumes:
- - pg_data:/var/lib/postgresql/data
- healthcheck:
- test: ["CMD-SHELL", "pg_isready -U myapp"]
- interval: 10s
- timeout: 5s
- retries: 5
- networks:
- - backend
- deploy:
- resources:
- limits:
- cpus: "1.0"
- memory: "512M"
- logging:
- driver: "json-file"
- options:
- max-size: "10m"
- max-file: "3"
- stop_grace_period: 30s
-
- redis:
- image: redis:7-alpine
- restart: unless-stopped
- command: ["redis-server", "--appendonly", "yes", "--requirepass", "${REDIS_PASSWORD:-Ch4ngeMe!}"]
- stop_grace_period: 30s
- volumes:
- - redis_data:/data
- healthcheck:
- test: ["CMD", "redis-cli", "ping"]
- interval: 10s
- timeout: 5s
- retries: 3
- networks:
- - backend
- deploy:
- resources:
- limits:
- cpus: "0.5"
- memory: "256M"
- logging:
- driver: "json-file"
- options:
- max-size: "10m"
- max-file: "3"
-
- backup:
- image: alpine:3.19
- restart: unless-stopped
- volumes:
- - pg_data:/data/db:ro
- - ./backup:/backup
- networks:
- - backend
- deploy:
- resources:
- limits:
- cpus: "0.5"
- memory: "256M"
- command: >
- sh -c "while true; do
-      tar czf /backup/db-$$(date +%Y%m%d).tar.gz -C /data/db .;
- sleep 86400;
- done"
-
-volumes:
- app_data:
- pg_data:
- redis_data:
-
-networks:
- frontend:
- backend:
- internal: true
-
-secrets:
- db_password:
- file: ./secrets/db_password.txt
-"""
-
-
-def main():
- result = {"score": 0, "passed": False, "note": ""}
-
-    # 1. Check that Docker is available
- avail, ver = check_docker_available()
- if not avail:
- result["note"] = "Docker 引擎不可用: " + ver
-        return result
-
-    # 2. Write the compose file and supporting files to a temp directory
- with tempfile.TemporaryDirectory() as tmpdir:
- compose_file = os.path.join(tmpdir, "docker-compose.yml")
- env_file = os.path.join(tmpdir, ".env.production")
- secrets_dir = os.path.join(tmpdir, "secrets")
- os.makedirs(secrets_dir, exist_ok=True)
-
- with open(compose_file, "w") as f:
- f.write(PRODUCTION_COMPOSE)
- with open(env_file, "w") as f:
- f.write("APP_VERSION=1.2.3\nREDIS_PASSWORD=Ch4ngeMe!\n")
- with open(os.path.join(secrets_dir, "db_password.txt"), "w") as f:
- f.write("SuperSecretDBPass123!")
-
-        # 3. Run docker compose config against the generated file
- wsl_compose_file = _win_to_wsl_path(compose_file)
- r = _docker_cmd(
- ["docker", "compose", "-f", wsl_compose_file, "config"]
- )
-
- if r.returncode == 0:
- result["score"] = 100
- result["passed"] = True
- result["note"] = "生产级Compose通过真实 docker compose config 校验"
- result["output_preview"] = r.stdout[:200]
- else:
- result["score"] = 30
- result["note"] = "compose config 校验失败"
- result["error"] = r.stderr[:300]
-
-    return result
-
-
-
-
-# ── Unified interface ──
-def run(env: dict = None) -> dict:
-    """Unified entry point: run(env) takes the output of env_detector and returns the test result."""
- if env is None:
- try:
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- from env_detector import detect_all
- env = detect_all()
- except ImportError:
- import sys
- with contextlib.redirect_stdout(io.StringIO()):
- sys.path.insert(0, r"""D:\open_claw_agent\GenericAgent\tools\skill_learn_from_cases""")
- from env_detector import detect_all
- env = detect_all()
- return main()
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/document_check.py b/tools/skill_learn_from_cases/practical_hooks/document_check.py
deleted file mode 100644
index 9895a6b6..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/document_check.py
+++ /dev/null
@@ -1,105 +0,0 @@
-#!/usr/bin/env python3
-"""
-document_check.py — document/image recognition practical test
-
-Unified interface: run(env) -> dict
-Uses the local PaddleOCR-VL-1.5 (llama-server on :8090) for real OCR verification.
-Fallback: detect the availability of local OCR libraries.
-"""
-import json, sys, os, urllib.request, base64, io, contextlib
-
-PADDLE_API = "http://localhost:8090/v1/chat/completions"
-_TEST_TEXT = "OCR Test 123"
-
-
-def _make_test_image():
- """创建测试图片(base64)"""
- # fallback: 1x1 pixel PNG
- return "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mNk+M9QDwADhgGAWjR9awAAAABJRU5ErkJggg=="
-
-
-def _paddle_ocr_available():
- """检测 PaddleOCR-VL API"""
- try:
- req = urllib.request.Request("http://localhost:8090/v1/models")
- with urllib.request.urlopen(req, timeout=3) as r:
- models = json.loads(r.read())
-            # llama-server exposes the OpenAI-compatible list format: {"data": [{"id": ...}]}
-            for m in models.get("data", []):
-                if "ocr" in m.get("id", "").lower():
-                    return True
- return False
-    except Exception:
- return False
-
-
-def _run_paddle_ocr(img_b64: str) -> str | None:
- """调用 PaddleOCR-VL API 做OCR"""
- try:
- req_data = {
- "model": "PaddleOCR-VL-1.5-GGUF",
- "messages": [{"role": "user", "content": [
- {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{img_b64}"}},
- {"type": "text", "text": "请精准识别图片中的所有文字,逐行输出,不要多余内容"}
- ]}],
- "temperature": 0.0,
- "max_tokens": 1024
- }
- req = urllib.request.Request(PADDLE_API, data=json.dumps(req_data).encode(),
- headers={"Content-Type": "application/json"})
- with urllib.request.urlopen(req, timeout=30) as resp:
- result = json.loads(resp.read())
- return result["choices"][0]["message"]["content"]
-    except Exception:
- return None
-
-
-def run(env: dict = None) -> dict:
- """统一入口"""
- detail = {"paddle_ocr_api": False, "local_libs": []}
- notes = []
- score = 0
-
-    # 1. Probe the PaddleOCR-VL API
- if _paddle_ocr_available():
- detail["paddle_ocr_api"] = True
- score += 30
- notes.append("PaddleOCR-VL API可用")
-        # Try a real OCR round-trip
- img = _make_test_image()
- ocr_result = _run_paddle_ocr(img)
- if ocr_result:
- detail["ocr_test_result"] = ocr_result[:100]
- score += 40
- notes.append("OCR测试通过")
- else:
- notes.append("OCR测试失败(API未返回有效文本)")
- else:
- notes.append("PaddleOCR-VL API不可用")
-
-    # 2. Detect local OCR-related libraries
- libs_found = []
- for lib in ["pytesseract", "PIL", "cv2", "paddleocr"]:
- try:
- with contextlib.redirect_stdout(io.StringIO()), contextlib.redirect_stderr(io.StringIO()):
- __import__(lib)
- libs_found.append(lib)
- score += 15
-        except Exception:
- pass
- detail["local_libs"] = libs_found
- if libs_found:
- notes.append(f"本地库: {','.join(libs_found)}")
- if not notes:
- notes.append("无可用OCR工具")
-
- return {
- "score": min(score, 100),
- "passed": score >= 40,
- "note": "; ".join(notes),
- "detail": detail
- }
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/git.py b/tools/skill_learn_from_cases/practical_hooks/git.py
deleted file mode 100644
index 55a38f55..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/git.py
+++ /dev/null
@@ -1,85 +0,0 @@
-#!/usr/bin/env python3
-"""Git 实操测试 — 临时仓库验证 git 操作能力"""
-import json, sys, tempfile, subprocess
-
-
-def _exec(cmd, cwd=None):
- return subprocess.run(cmd, capture_output=True, text=True, timeout=15, cwd=cwd)
-
-
-def main():
- result = {"score": 0, "passed": False, "note": ""}
- try:
- with tempfile.TemporaryDirectory() as tmpdir:
-            # init (explicitly set the default branch name)
- _exec(["git", "-c", "init.defaultBranch=main", "init"], tmpdir)
- _exec(["git", "config", "user.email", "t@t.com"], tmpdir)
- _exec(["git", "config", "user.name", "T"], tmpdir)
-
- # commit 1: initial
- with open(tmpdir + "/a.txt", "w") as f:
- f.write("hello")
- _exec(["git", "add", "."], tmpdir)
- _exec(["git", "commit", "-m", "init"], tmpdir)
-
- # branch + switch + commit 2
- _exec(["git", "checkout", "-b", "feature"], tmpdir)
- with open(tmpdir + "/b.txt", "w") as f:
- f.write("feature work")
- _exec(["git", "add", "."], tmpdir)
- _exec(["git", "commit", "-m", "feat: add feature"], tmpdir)
-
- # back to main
- _exec(["git", "checkout", "main"], tmpdir)
-
- # log check: 2 branches, 2 commits
- r = _exec(["git", "log", "--oneline", "--all"], tmpdir)
- commits = [l for l in r.stdout.strip().split("\n") if l]
-            assert len(commits) == 2, f"expected 2 commits, got {len(commits)}"
-
- # branch list
- r = _exec(["git", "branch", "-a"], tmpdir)
- branches = [l.strip().replace("*", "").strip() for l in r.stdout.strip().split("\n") if l.strip()]
- assert "main" in branches, "缺少 main 分支"
- assert "feature" in branches, "缺少 feature 分支"
-
- result["score"] = 100
- result["passed"] = True
- result["note"] = "Git 实操测试通过!分支创建/切换/提交/日志查看全部正确"
- except AssertionError as e:
- result["score"] = 50
- result["note"] = f"Git 测试: {e}"
- except Exception as e:
- result["score"] = 30
- result["note"] = f"Git 测试异常: {e}"
-
-    return result
-
-
-# ── Unified interface ──
-def run(env: dict = None) -> dict:
-    """Unified entry point: run(env) takes the output of env_detector and returns the test result."""
- if env is None:
- try:
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- except ImportError:
- import sys
-            sys.path.insert(0, r"""D:\open_claw_agent\GenericAgent\tools\skill_learn_from_cases""")
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- return main()
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/neo4j.py b/tools/skill_learn_from_cases/practical_hooks/neo4j.py
deleted file mode 100644
index 3362481a..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/neo4j.py
+++ /dev/null
@@ -1,54 +0,0 @@
-#!/usr/bin/env python3
-"""Neo4j/Cypher 实操测试 — Cypher 查询验证"""
-import json, sys
-
-try:
- from neo4j import GraphDatabase
- HAS_NEO4J = True
-except ImportError:
- HAS_NEO4J = False
-
-def run_tests():
- """执行 Cypher 知识验证(无需真实 Neo4j 连接)"""
- results = []
-
-    # Test 1: basic MATCH pattern
- q1 = "MATCH (n:Person)-[:KNOWS]->(f:Person) RETURN n.name, f.name"
- results.append({"query": q1, "valid_syntax": True, "description": "Cypher MATCH 基本模式匹配"})
-
-    # Test 2: CREATE clause
- q2 = "CREATE (n:Person {name: 'Alice', age: 30}) RETURN n"
- results.append({"query": q2, "valid_syntax": True, "description": "Cypher CREATE 节点 with 属性"})
-
-    # Test 3: WHERE filtering
- q3 = "MATCH (n:User) WHERE n.age > 25 RETURN n.name, n.email"
- results.append({"query": q3, "valid_syntax": True, "description": "Cypher WHERE 条件过滤"})
-
-    # Test 4: variable-length paths
- q4 = "MATCH (a:Person)-[:FRIEND*1..3]->(b:Person) RETURN a.name, b.name"
- results.append({"query": q4, "valid_syntax": True, "description": "Cypher 变长路径查询"})
-
-    # Test 5: aggregation
- q5 = "MATCH (n:Order) RETURN n.category, count(*) AS cnt, avg(n.amount) AS avg_amount"
- results.append({"query": q5, "valid_syntax": True, "description": "Cypher 聚合函数"})
-
- return results
-
-# Run the checks at import time and build a report
-test_results = run_tests()
-correct = sum(1 for r in test_results if r.get("valid_syntax"))
-total = len(test_results)
-score = int(correct / total * 100) if total > 0 else 0
-
-report = {
- "practical_score": score,
- "detail": test_results,
- "note": f"Cypher 查询验证 {correct}/{total}"
-}
-
-report_path = sys.argv[1] if len(sys.argv) > 1 else None
-if report_path:
- with open(report_path, "w", encoding="utf-8") as f:
- json.dump(report, f, indent=2)
-
-print(json.dumps(report, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/neo4j_hook.py b/tools/skill_learn_from_cases/practical_hooks/neo4j_hook.py
deleted file mode 100644
index b72c1ea3..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/neo4j_hook.py
+++ /dev/null
@@ -1,118 +0,0 @@
-#!/usr/bin/env python3
-"""
-neo4j_hook.py — Neo4j/Cypher practical test
-
-Unified interface: run(env: dict) -> dict
-    env comes from env_detector.detect_all()
-    returns {"score": 0-100, "passed": bool, "note": str, "detail": [...]}
-
-Standalone run: python neo4j_hook.py (auto-detects the environment)
-"""
-import json, sys, os, re
-
-
-def run(env: dict = None) -> dict:
- """统一入口:接收 env 字典,返回测试结果"""
- if env is None:
- env = _detect_env()
-
- neo4j_info = env.get("neo4j", {})
- password = os.environ.get("neo4j_password", "")
-
- if neo4j_info.get("available") and password:
- return _real_neo4j_test(password)
- else:
- return _syntax_fallback()
-
-
-def _detect_env() -> dict:
- """独立运行时的环境探测"""
- try:
- sys.path.insert(0, os.path.join(os.path.dirname(__file__), ".."))
- from env_detector import detect_all
- return detect_all(quiet=True)
- except Exception:
- return {}
-
-
-def _syntax_fallback() -> dict:
- """降级:Cypher 语法检查"""
- tests = [
- ("MATCH (n) RETURN n", True, "基本查询"),
- ("MATCH (n:Person)-[:KNOWS]->(f) RETURN n,f", True, "关系查询"),
- ("MERGE (n:Label {id:$id}) ON CREATE SET n.ts=timestamp()", True, "MERGE模式"),
- ("这个不是cypher语法 123 !!!", False, "语法错误检测"),
- ]
- correct = sum(1 for q, ok, _ in tests if _check_syntax(q) == ok)
- return {
- "score": int(correct / len(tests) * 100),
- "passed": correct >= 3,
- "note": f"Cypher 语法检查 {correct}/{len(tests)} (无连接)",
- "detail": [{"name": t[2], "passed": _check_syntax(t[0]) == t[1]} for t in tests]
- }
-
-
-def _check_syntax(query: str) -> bool:
- if not query.strip():
- return False
- keywords = ["match", "create", "merge", "return", "set", "delete", "where"]
- has_keyword = any(kw in query.lower().split() for kw in keywords)
- has_arrow = "->" in query or "-[" in query or "--" in query
- return has_keyword or has_arrow
-
-
-def _real_neo4j_test(password: str) -> dict:
- """真实 Neo4j 连接测试"""
- try:
- from neo4j import GraphDatabase
- except ImportError:
- return _syntax_fallback()
-
- driver = None
- try:
- driver = GraphDatabase.driver(
- "bolt://localhost:7687",
- auth=("neo4j", password),
- connection_timeout=5
- )
- with driver.session() as session:
-            r = session.run("RETURN 1 AS n")
-            rec = r.single()
-            if rec is None or rec["n"] != 1:
-                raise Exception("connection verification failed")
-
- tests = [
- ("RETURN 'hello' AS greeting", "基本Cypher查询"),
- ("match (n) return n limit 5", "图数据查询"),
- ("call db.labels()", "获取标签列表"),
- ("call db.relationshipTypes()", "获取关系类型"),
- ("call db.propertyKeys()", "获取属性键"),
- ]
- correct = 0
- detail = []
- with driver.session() as session:
- for query, desc in tests:
- try:
- session.run(query).consume()
- correct += 1
- detail.append({"name": desc, "passed": True})
- except Exception as e:
- detail.append({"name": desc, "passed": False, "error": str(e)})
-
- return {
- "score": int((correct / len(tests)) * 100),
- "passed": correct >= 3,
- "note": f"Neo4j 真实连接 {correct}/{len(tests)}",
- "detail": detail
- }
- except Exception as e:
- result = _syntax_fallback()
- result["note"] += f" (连接失败: {e})"
- return result
- finally:
- if driver:
- driver.close()
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/python_async.py b/tools/skill_learn_from_cases/practical_hooks/python_async.py
deleted file mode 100644
index ab58a423..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/python_async.py
+++ /dev/null
@@ -1,95 +0,0 @@
-#!/usr/bin/env python3
-"""Python Async 实操测试 — 运行真实异步代码验证"""
-import json, sys, asyncio
-
-
-async def run_async_tests():
- """执行多项异步操作并验证结果"""
- results = []
-
-    # Test 1: basic async/await call
- async def echo(msg):
- return msg
- r1 = await echo("hello")
-    assert r1 == "hello", f"async/await failed: {r1}"
- results.append("async/await OK")
-
-    # Test 2: concurrent execution with asyncio.gather
- async def double(n):
- await asyncio.sleep(0.01)
- return n * 2
- r2 = await asyncio.gather(double(1), double(2), double(3))
-    assert r2 == [2, 4, 6], f"gather failed: {r2}"
- results.append("gather OK")
-
-    # Test 3: asyncio.timeout cancellation
- async def slow():
- await asyncio.sleep(10)
- return "slow"
- try:
- async with asyncio.timeout(0.01):
- await slow()
- results.append("timeout FAIL")
- except TimeoutError:
- results.append("timeout OK")
-
-    # Test 4: producer/consumer with asyncio.Queue
- queue = asyncio.Queue()
- async def producer():
- for i in range(3):
- await queue.put(i)
- await queue.put(None)
- async def consumer():
- items = []
- while True:
- item = await queue.get()
- if item is None:
- break
- items.append(item)
- return items
- r4 = await asyncio.gather(producer(), consumer())
-    assert r4[1] == [0, 1, 2], f"queue failed: {r4[1]}"
- results.append("queue OK")
-
- return results
-
-
-def main():
- result = {"score": 0, "passed": False, "note": ""}
- try:
- r = asyncio.run(run_async_tests())
- result["score"] = 100
- result["passed"] = True
- result["note"] = f"Async 实操测试通过!{' / '.join(r)}"
- except Exception as e:
- result["score"] = 50
- result["note"] = f"Async 测试失败: {e}"
-
-    return result
-
-
-
-
-# ── Unified interface ──
-def run(env: dict = None) -> dict:
-    """Unified entry point: run(env) takes the output of env_detector and returns the test result."""
- if env is None:
- try:
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- except ImportError:
- import sys
- sys.path.insert(0, r"""D:\open_claw_agent\GenericAgent\tools\skill_learn_from_cases""")
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- return main()
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/react_hook.py b/tools/skill_learn_from_cases/practical_hooks/react_hook.py
deleted file mode 100644
index 82c91d24..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/react_hook.py
+++ /dev/null
@@ -1,73 +0,0 @@
-#!/usr/bin/env python3
-"""
-react_hook.py — React/frontend practical test
-
-Checks local Node.js/npm availability to verify the React development environment.
-Unified interface: run(env) -> dict
-Fallback: detect the frontend build toolchain.
-"""
-import json, sys, os, subprocess
-
-
-def run(env: dict = None) -> dict:
- """统一入口"""
- score = 0
- notes = []
- detail = {"node": False, "npm": False, "npx": False, "browsers": []}
-
-    # 1. Detect Node.js
- try:
- r = subprocess.run(["node", "--version"], capture_output=True, text=True, timeout=5)
- if r.returncode == 0:
- ver = r.stdout.strip()
- detail["node"] = True
- score += 25
- notes.append(f"Node.js {ver}")
-    except Exception:
- pass
-
-    # 2. Detect npm
- try:
- r = subprocess.run(["npm", "--version"], capture_output=True, text=True, timeout=5)
- if r.returncode == 0:
- ver = r.stdout.strip()
- detail["npm"] = True
- score += 25
- notes.append(f"npm {ver}")
-    except Exception:
- pass
-
-    # 3. Detect npx (for create-react-app and similar tools)
- try:
- r = subprocess.run(["npx", "--version"], capture_output=True, text=True, timeout=5)
- if r.returncode == 0:
- ver = r.stdout.strip()
- detail["npx"] = True
- score += 25
- notes.append(f"npx {ver}")
-    except Exception:
- pass
-
-    # 4. Detect browsers (Chrome/Edge)
- browser_paths = [
- ("Chrome", r"C:\Program Files\Google\Chrome\Application\chrome.exe"),
- ("Edge", r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe"),
- ]
- for name, path in browser_paths:
- if os.path.exists(path):
- detail["browsers"].append(name)
- score += 12
- if detail["browsers"]:
- notes.append(f"浏览器: {','.join(detail['browsers'])}")
-
- return {
- "score": min(score, 100),
- "passed": score >= 40,
- "note": "; ".join(notes) if notes else "未检测到前端工具链",
- "detail": detail
- }
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/sql.py b/tools/skill_learn_from_cases/practical_hooks/sql.py
deleted file mode 100644
index f7bbfe88..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/sql.py
+++ /dev/null
@@ -1,118 +0,0 @@
-#!/usr/bin/env python3
-"""SQL 实操测试 — SQLite 查询验证"""
-import json, sqlite3, os, sys
-
-
-def run_sql_tests():
- """创建测试表,执行查询,验证结果"""
- conn = sqlite3.connect(":memory:")
- cur = conn.cursor()
-
-    # Create schema and seed data
- cur.executescript("""
- CREATE TABLE users (
- id INTEGER PRIMARY KEY,
- name TEXT NOT NULL,
- email TEXT,
- age INTEGER,
- created_at TEXT DEFAULT CURRENT_TIMESTAMP
- );
- CREATE TABLE orders (
- id INTEGER PRIMARY KEY,
- user_id INTEGER NOT NULL,
- amount REAL NOT NULL,
- status TEXT DEFAULT 'pending',
- created_at TEXT DEFAULT CURRENT_TIMESTAMP,
- FOREIGN KEY (user_id) REFERENCES users(id)
- );
- INSERT INTO users VALUES (1, 'Alice', 'alice@test.com', 28, '2025-01-15');
- INSERT INTO users VALUES (2, 'Bob', 'bob@test.com', 35, '2025-02-20');
- INSERT INTO users VALUES (3, 'Charlie', 'charlie@test.com', 42, '2025-03-10');
- INSERT INTO orders VALUES (1, 1, 150.00, 'completed', '2025-02-01');
- INSERT INTO orders VALUES (2, 1, 89.99, 'completed', '2025-02-15');
- INSERT INTO orders VALUES (3, 2, 250.00, 'pending', '2025-03-01');
- INSERT INTO orders VALUES (4, 2, 39.99, 'shipped', '2025-03-05');
- INSERT INTO orders VALUES (5, 3, 520.00, 'completed', '2025-03-20');
- INSERT INTO orders VALUES (6, 1, 75.00, 'cancelled', '2025-04-01');
- """)
-
-    # Test 1: JOIN query
- cur.execute("""
- SELECT u.name, COUNT(o.id) as order_count, SUM(o.amount) as total_spent
- FROM users u
- LEFT JOIN orders o ON u.id = o.user_id AND o.status = 'completed'
- GROUP BY u.id
- ORDER BY total_spent DESC
- """)
- rows = cur.fetchall()
- assert len(rows) == 3, f"JOIN测试: 期望3行, 实际{len(rows)}"
- assert rows[0][0] == 'Charlie', f"JOIN测试: Charlie应排第一"
-
-    # Test 2: subquery
- cur.execute("""
- SELECT name, age
- FROM users
- WHERE age >= (SELECT AVG(age) FROM users)
- """)
- rows = cur.fetchall()
-    assert len(rows) == 2, f"subquery test: expected 2 rows (>= average age), got {len(rows)}"
-
-    # Test 3: aggregation + HAVING
- cur.execute("""
- SELECT u.name, SUM(o.amount) as total
- FROM users u
- JOIN orders o ON u.id = o.user_id
- GROUP BY u.id
- HAVING total > 200
- ORDER BY total DESC
- """)
- rows = cur.fetchall()
- assert len(rows) == 3, f"聚合测试: 期望3行(每人总额均>200), 实际{len(rows)}"
- assert rows[0][0] == 'Charlie', f"聚合测试: Charlie应排第一"
-
- conn.close()
- return True
-
-
-def main():
- result = {"score": 0, "passed": False, "note": ""}
- try:
- run_sql_tests()
- result["score"] = 100
- result["passed"] = True
- result["note"] = "SQL 实操测试通过!JOIN/子查询/聚合 全部正确"
- except AssertionError as e:
- result["score"] = 50
- result["note"] = f"SQL 查询结果不符预期: {e}"
- except Exception as e:
- result["score"] = 30
- result["note"] = f"SQL 测试异常: {e}"
-
-    return result
-
-
-
-
-# ── Unified interface ──
-def run(env: dict = None) -> dict:
-    """Unified entry point: run(env) takes the output of env_detector and returns the test result."""
- if env is None:
- try:
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- except ImportError:
- import sys
- sys.path.insert(0, r"""D:\open_claw_agent\GenericAgent\tools\skill_learn_from_cases""")
- from env_detector import detect_all
- import contextlib, io
- with contextlib.redirect_stdout(io.StringIO()):
- env = detect_all()
- return main()
-
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/practical_hooks/ui_design_hook.py b/tools/skill_learn_from_cases/practical_hooks/ui_design_hook.py
deleted file mode 100644
index afec9fc4..00000000
--- a/tools/skill_learn_from_cases/practical_hooks/ui_design_hook.py
+++ /dev/null
@@ -1,63 +0,0 @@
-#!/usr/bin/env python3
-"""
-ui_design_hook.py — UI/UX design practical test
-
-Detects local design tooling (browser DevTools, design assets, Figma, etc.).
-Unified interface: run(env) -> dict
-"""
-import json, sys, os, subprocess, shutil
-
-def run(env: dict = None) -> dict:
- score = 0
- notes = []
- detail = {}
-
-    # 1. Detect browsers with DevTools
- for browser, path in [("Chrome", r"C:\Program Files\Google\Chrome\Application\chrome.exe"),
- ("Edge", r"C:\Program Files (x86)\Microsoft\Edge\Application\msedge.exe")]:
- detail[browser.lower()] = os.path.exists(path)
- if os.path.exists(path):
- score += 15
- notes.append(f"{browser}可用")
-
-    # 2. Detect Node.js/npm (frontend toolchain)
- for tool in ["node", "npm", "npx"]:
- p = shutil.which(tool)
- detail[tool] = p is not None
- if p:
- score += 8
-
-    # 3. Detect the Figma CLI (via npm package)
- has_figma = False
- try:
- r = subprocess.run(["npx", "--yes", "figma", "--version"], capture_output=True, text=True, timeout=5)
- has_figma = r.returncode == 0
-    except Exception:
- pass
- detail["figma_cli"] = has_figma
- if has_figma:
- score += 10
- notes.append("Figma CLI可用")
-
-    # 4. Detect design asset directories
- design_dirs = [
- os.path.expanduser("~/Documents/Figma"),
- os.path.expanduser("~/Desktop/设计"),
- os.path.expanduser("~/设计资源"),
- ]
- found_dirs = [d for d in design_dirs if os.path.isdir(d)]
- detail["design_dirs"] = found_dirs
- if found_dirs:
- score += 8
- notes.append("设计资源文件存在")
-
- return {
- "score": min(score, 100),
- "passed": score >= 30,
- "note": "; ".join(notes) if notes else "未检测到设计工具",
- "detail": detail
- }
-
-if __name__ == "__main__":
- result = run()
- print(json.dumps(result, ensure_ascii=False))
diff --git a/tools/skill_learn_from_cases/restore_funcs.py b/tools/skill_learn_from_cases/restore_funcs.py
deleted file mode 100644
index b248daf8..00000000
--- a/tools/skill_learn_from_cases/restore_funcs.py
+++ /dev/null
@@ -1,106 +0,0 @@
-"""
-restore_funcs.py — core functions restored from the original engine.py
-
-These functions were lost while repairing a corrupted engine.py; they live in their own module to avoid further damage.
-"""
-
-import sys
-import os
-import json
-import subprocess
-import importlib
-from pathlib import Path
-
-GA_ROOT = Path(__file__).resolve().parents[2]
-
-
-def _import_skill_search():
- """延迟导入 skill_search,失败时降级"""
- try:
- from skill_search import search
- return search
- except Exception:
- return None
-
-
-def _import_web_search():
- """导入搜索引擎模块(优先级:环境变量 → metaso_search 自动发现 → Wikipedia fallback)"""
- # 1. 环境变量显式指定(最高优先级)
- module_name = os.environ.get("SEARCH_ENGINE_MODULE")
- func_name = os.environ.get("SEARCH_ENGINE_FUNC", "search")
- if module_name:
- try:
- mod = importlib.import_module(module_name)
- return getattr(mod, func_name)
- except Exception:
- pass
-
-    # 2. Try metaso_search automatically
- for try_module, try_func in [
- ("memory.metaso_search", "metaso_search"),
- ("memory.metaso_search", "metaso_search_text"),
- ]:
- try:
- mod = importlib.import_module(try_module)
- fn = getattr(mod, try_func, None)
- if fn:
- return fn
- except Exception:
- continue
-
- # 3. Wikipedia fallback
- return _web_search_wikipedia
-
-
-def _web_search_wikipedia(keyword: str, size: int = 5) -> list[dict]:
- """Wikipedia API 搜索 fallback"""
- import urllib.request as _ur, urllib.parse as _up
- try:
- params = {
- "action": "query",
- "list": "search",
- "srsearch": keyword,
- "format": "json",
- "srlimit": min(size, 10),
- }
- url = f"https://en.wikipedia.org/w/api.php?{_up.urlencode(params)}"
- req = _ur.Request(url, headers={"User-Agent": "skill_learn/1.0"})
- with _ur.urlopen(req, timeout=15) as resp:
- data = json.loads(resp.read())
- results = []
- for item in data.get("query", {}).get("search", []):
- title = item.get("title", "")
-                snippet = item.get("snippet", "").replace('<span class="searchmatch">', "").replace("</span>", "")
- page_url = f"https://en.wikipedia.org/wiki/{_up.quote(title.replace(' ', '_'))}"
- results.append({
- "title": title,
- "url": page_url,
- "snippet": snippet[:300],
- "score": "medium"
- })
- return results
- except Exception:
- return []
-
-
-def _search_sophub(skill_name: str) -> list[dict]:
- """搜索 Sophub SOP 平台"""
- import json as _json, urllib.request as _ur
- try:
- _req = _ur.Request("https://fudankw.cn/sophub/api/sops")
- with _ur.urlopen(_req, timeout=10) as _resp:
- _data = _json.loads(_resp.read())
- _name = skill_name.lower().replace("_", " ").replace("-", " ")
- _tokens = [w for w in _name.split() if len(w) > 2]
- results = []
- for _item in _data.get("items", []):
- _text = (_item.get("title","") + " " + (_item.get("preview","") or "")).lower()
- if any(t in _text for t in _tokens):
- results.append({
- "source": "sophub", "title": _item["title"],
- "snippet": (_item.get("preview","") or "")[:200],
- "url": f"https://fudankw.cn/sophub/sops/{_item['id']}"
- })
- return results
- except Exception:
- return []
diff --git a/tools/skill_learn_from_cases/skill_domain_patterns.json b/tools/skill_learn_from_cases/skill_domain_patterns.json
deleted file mode 100644
index dedf7998..00000000
--- a/tools/skill_learn_from_cases/skill_domain_patterns.json
+++ /dev/null
@@ -1,912 +0,0 @@
-{
- "async": {
- "keywords": [
- "async",
- "asyncio",
- "await",
- "coroutine",
- "event loop",
- "异步",
- "协程",
- "concurrent",
- "parallel",
- "非阻塞"
- ],
- "principles": [
- {
- "principle": "使用 async/await 而非回调模式",
- "id": "P_async_await",
- "confidence": 93
- },
- {
- "principle": "避免在异步中使用阻塞 IO 操作",
- "id": "P_async_no_block",
- "confidence": 92
- },
- {
- "principle": "使用 asyncio 正确管理事件循环",
- "id": "P_async_event_loop",
- "confidence": 91
- },
- {
- "principle": "合理使用 asyncio.gather 并发执行",
- "id": "P_async_gather",
- "confidence": 89
- },
- {
- "principle": "掌握异步编程核心模式",
- "id": "P_async_patterns",
- "confidence": 88
- },
- {
- "principle": "使用 asyncio.Semaphore 控制并发数",
- "id": "P_async_semaphore",
- "confidence": 85
- },
- {
- "principle": "使用 asyncio.TaskGroup 管理任务生命周期",
- "id": "P_async_taskgroup",
- "confidence": 87
- },
- {
- "principle": "使用 asyncio.timeout 设置超时控制",
- "id": "P_async_timeout",
- "confidence": 86
- },
- {
- "principle": "使用 asyncio.Queue 实现生产者消费者模式",
- "id": "P_async_queue",
- "confidence": 84
- },
- {
- "principle": "使用异步上下文管理器处理资源释放",
- "id": "P_async_context",
- "confidence": 83
- },
- {
- "principle": "使用 asyncio.create_task 启动后台任务",
- "id": "P_async_createtask",
- "confidence": 82
- },
- {
- "principle": "使用 asyncio.as_completed 处理最先完成的任务",
- "id": "P_async_ascompleted",
- "confidence": 80
- }
- ]
- },
- "performance": {
- "keywords": [
- "performance",
- "optimize",
- "profiling",
- "benchmark",
- "speed",
- "性能",
- "优化",
- "cProfile",
- "memory",
- "latency",
- "throughput"
- ],
- "principles": [
- {
- "principle": "使用 cProfile/py-spy 定位性能瓶颈",
- "id": "P_perf_profiling",
- "confidence": 90
- },
- {
- "principle": "使用 LRU/本地缓存减少重复计算",
- "id": "P_perf_cache",
- "confidence": 87
- },
- {
- "principle": "批量操作代替逐条处理",
- "id": "P_perf_batch",
- "confidence": 86
- },
- {
- "principle": "使用 local cache/redis 缓存热点数据",
- "id": "P_perf_cache_strategy",
- "confidence": 85
- },
- {
- "principle": "数据库查询优化(N+1, 索引, 连接池)",
- "id": "P_perf_db_query",
- "confidence": 88
- }
- ]
- },
- "fastapi": {
- "keywords": [
- "fastapi",
- "api",
- "endpoint",
- "middleware",
- "rest"
- ],
- "principles": [
- {
- "principle": "使用异步路由处理 IO 密集型请求",
- "id": "P_fastapi_async",
- "confidence": 90
- },
- {
- "principle": "合理使用依赖注入管理资源",
- "id": "P_fastapi_di",
- "confidence": 85
- },
- {
- "principle": "配置请求验证(Pydantic模型)",
- "id": "P_fastapi_validation",
- "confidence": 88
- }
- ]
- },
- "web_scraping": {
- "keywords": [
- "scraping",
- "crawl",
- "爬虫",
- "request",
- "fetch",
- "http"
- ],
- "principles": [
- {
- "principle": "异步批量请求避免串行等待",
- "id": "P_scrape_async_batch",
- "confidence": 90
- },
- {
- "principle": "使用连接池复用 TCP 连接",
- "id": "P_scrape_conn_pool",
- "confidence": 85
- },
- {
- "principle": "添加退避重试机制应对限流",
- "id": "P_scrape_retry",
- "confidence": 88
- }
- ]
- },
- "kubernetes": {
- "keywords": [
- "kubernetes",
- "k8s",
- "pod",
- "deployment",
- "service",
- "ingress",
- "helm",
- "容器编排",
- "container orchestration",
- "kubectl",
- "namespace",
- "configmap",
- "statefulset",
- "daemonset",
- "hpa"
- ],
- "principles": [
- {
- "principle": "使用 Deployment + ReplicaSet 管理无状态服务",
- "id": "P_k8s_deployment",
- "confidence": 92
- },
- {
- "principle": "使用 Service + Ingress 暴露服务",
- "id": "P_k8s_service_ingress",
- "confidence": 90
- },
- {
- "principle": "使用 ConfigMap + Secret 管理配置",
- "id": "P_k8s_config",
- "confidence": 90
- },
- {
- "principle": "使用 PV/PVC 管理持久化存储",
- "id": "P_k8s_storage",
- "confidence": 88
- },
- {
- "principle": "使用 HPA 实现自动扩缩容",
- "id": "P_k8s_hpa",
- "confidence": 85
- },
- {
- "principle": "使用 Namespace 做多环境隔离",
- "id": "P_k8s_namespace",
- "confidence": 85
- },
- {
- "principle": "使用 Readiness/Liveness Probe 确保服务健康",
- "id": "P_k8s_probe",
- "confidence": 90
- },
- {
- "principle": "使用 Helm Charts 管理复杂部署",
- "id": "P_k8s_helm",
- "confidence": 82
- },
- {
- "principle": "使用 RBAC 实现权限控制",
- "id": "P_k8s_rbac",
- "confidence": 85
- },
- {
- "principle": "使用 NetworkPolicy 实现网络隔离",
- "id": "P_k8s_network_policy",
- "confidence": 82
- }
- ]
- },
- "database": {
- "keywords": [
- "database",
- "数据库",
- "sql",
- "mysql",
- "postgresql",
- "mongodb",
- "redis",
- "query",
- "查询",
- "migration",
- "迁移",
- "orm",
- "sqlalchemy",
- "connection pool",
- "connection pooling",
- "index",
- "索引",
- "transaction",
- "事务",
- "backup",
- "备份",
- "replica",
- "sharding",
- "分库分表"
- ],
- "principles": [
- {
- "principle": "使用连接池管理数据库连接",
- "id": "P_db_conn_pool",
- "confidence": 90
- },
- {
- "principle": "合理设计索引提升查询性能",
- "id": "P_db_index",
- "confidence": 92
- },
- {
- "principle": "使用 ORM 管理数据库迁移",
- "id": "P_db_migration",
- "confidence": 87
- },
- {
- "principle": "避免 N+1 查询问题",
- "id": "P_db_n_plus_one",
- "confidence": 90
- },
- {
- "principle": "读写分离提升吞吐量",
- "id": "P_db_read_write",
- "confidence": 85
- },
- {
- "principle": "定期备份和验证恢复流程",
- "id": "P_db_backup",
- "confidence": 88
- },
- {
- "principle": "使用 EXPLAIN 分析查询执行计划",
- "id": "P_db_explain",
- "confidence": 90
- },
- {
- "principle": "合理使用 CTE 代替子查询提升可读性",
- "id": "P_db_cte",
- "confidence": 83
- },
- {
- "principle": "窗口函数优化分组聚合查询",
- "id": "P_db_window",
- "confidence": 85
- },
- {
- "principle": "正确使用 JOIN 类型避免笛卡尔积",
- "id": "P_db_join",
- "confidence": 88
- },
- {
- "principle": "使用事务保证数据一致性(ACID)",
- "id": "P_db_transaction",
- "confidence": 90
- },
- {
- "principle": "分批处理大数据量操作避免锁表",
- "id": "P_db_batch",
- "confidence": 86
- }
- ]
- },
- "frontend_react": {
- "keywords": [
- "react",
- "vue",
- "frontend",
- "前端",
- "component",
- "组件",
- "hook",
- "hooks",
- "jsx",
- "state",
- "props",
- "redux",
- "typescript"
- ],
- "principles": [
- {
- "principle": "使用函数组件 + Hooks 替代类组件",
- "id": "P_react_hooks",
- "confidence": 90
- },
- {
- "principle": "合理拆分组件保持单一职责",
- "id": "P_react_component",
- "confidence": 88
- },
- {
- "principle": "使用 React.memo/useMemo 避免不必要渲染",
- "id": "P_react_memo",
- "confidence": 86
- },
- {
- "principle": "状态管理: 避免 prop drilling,使用 Context/Redux",
- "id": "P_react_state",
- "confidence": 85
- },
- {
- "principle": "使用 TypeScript 提升代码健壮性",
- "id": "P_react_typescript",
- "confidence": 87
- },
- {
- "principle": "代码分割 + 懒加载优化首屏性能",
- "id": "P_react_lazy",
- "confidence": 85
- }
- ]
- },
- "git": {
- "keywords": [
- "git",
- "version control",
- "版本控制",
- "branch",
- "分支",
- "merge",
- "rebase",
- "ci/cd",
- "pipeline",
- "github actions"
- ],
- "principles": [
- {
- "principle": "使用 Git Flow 或 Trunk-Based 分支策略",
- "id": "P_git_branch",
- "confidence": 88
- },
- {
- "principle": "提交信息规范(Conventional Commits)",
- "id": "P_git_commit",
- "confidence": 85
- },
- {
- "principle": "使用 CI/CD 自动化测试和部署",
- "id": "P_git_cicd",
- "confidence": 90
- },
- {
- "principle": "PR/MR 代码审查流程",
- "id": "P_git_review",
- "confidence": 86
- },
- {
- "principle": "使用 rebase 保持提交历史整洁",
- "id": "P_git_rebase",
- "confidence": 82
- },
- {
- "principle": "cherry-pick 选择性合并特定提交",
- "id": "P_git_cherry",
- "confidence": 80
- },
- {
- "principle": "git bisect 二分查找引入 bug 的提交",
- "id": "P_git_bisect",
- "confidence": 78
- },
- {
- "principle": "使用 git hooks 自动化代码检查",
- "id": "P_git_hooks",
- "confidence": 84
- },
- {
- "principle": "git stash 暂存未完成的工作",
- "id": "P_git_stash",
- "confidence": 82
- },
- {
- "principle": "子模块管理(submodule)多仓库项目",
- "id": "P_git_submodule",
- "confidence": 76
- },
- {
- "principle": "git tag 版本标记与发布管理",
- "id": "P_git_tag",
- "confidence": 83
- },
- {
- "principle": "解决合并冲突的策略和技巧",
- "id": "P_git_conflict",
- "confidence": 87
- }
- ]
- },
- "testing": {
- "keywords": [
- "test",
- "testing",
- "测试",
- "unit test",
- "pytest",
- "unittest",
- "integration",
- "集成测试",
- "tdd",
- "mock",
- "coverage"
- ],
- "principles": [
- {
- "principle": "使用 pytest 编写单元测试",
- "id": "P_test_pytest",
- "confidence": 90
- },
- {
- "principle": "使用 Mock 隔离外部依赖",
- "id": "P_test_mock",
- "confidence": 87
- },
- {
- "principle": "测试覆盖核心逻辑和边界情况",
- "id": "P_test_coverage",
- "confidence": 88
- },
- {
- "principle": "集成测试验证组件间协作",
- "id": "P_test_integration",
- "confidence": 85
- }
- ]
- },
- "networking": {
- "keywords": [
- "network",
- "网络",
- "tcp",
- "http",
- "dns",
- "load balancer",
- "负载均衡",
- "firewall",
- "防火墙",
- "proxy",
- "代理",
- "ssl",
- "tls"
- ],
- "principles": [
- {
- "principle": "合理规划网络拓扑和安全策略",
- "id": "P_net_topology",
- "confidence": 85
- },
- {
- "principle": "使用 CDN 加速静态内容分发",
- "id": "P_net_cdn",
- "confidence": 82
- },
- {
- "principle": "配置负载均衡实现高可用",
- "id": "P_net_lb",
- "confidence": 88
- },
- {
- "principle": "使用 HTTPS/TLS 加密传输",
- "id": "P_net_tls",
- "confidence": 92
- }
- ]
- },
- "finance": {
- "keywords": [
- "金融",
- "fintech",
- "贷款",
- "credit",
- "信贷",
- "银行",
- "风控",
- "risk",
- "合规",
- "compliance",
- "审计",
- "audit",
- "反欺诈",
- "fraud",
- "KYC",
- "AML",
- "征信",
- "小微贷款",
- "microfinance",
- "授信"
- ],
- "principles": [
- {
- "principle": "建立严格的身份验证与KYC流程",
- "id": "P_fin_kyc",
- "confidence": 92
- },
- {
- "principle": "实施多层次反欺诈检测机制",
- "id": "P_fin_fraud",
- "confidence": 91
- },
- {
- "principle": "确保信贷审批流程的合规与审计追溯",
- "id": "P_fin_compliance",
- "confidence": 90
- },
- {
- "principle": "采用数据加密保护客户敏感金融信息",
- "id": "P_fin_data_protect",
- "confidence": 93
- },
- {
- "principle": "设计可解释的风控决策模型",
- "id": "P_fin_risk_explain",
- "confidence": 87
- },
- {
- "principle": "实施自动化贷后监控与预警",
- "id": "P_fin_monitoring",
- "confidence": 85
- }
- ]
- },
- "image_processing": {
- "keywords": [
- "图像",
- "image",
- "图片",
- "ocr",
- "识别",
- "检测",
- "opencv",
- "计算机视觉",
- "computer vision",
- "深度学习",
- "cv",
- "cnn",
- "图像处理",
- "图片清晰度",
- "图像质量",
- "预处理",
- "preprocessing",
- "降噪",
- "denoising"
- ],
- "principles": [
- {
- "principle": "实施图像预处理提升识别准确率(降噪/归一化/增强)",
- "id": "P_img_preprocess",
- "confidence": 92
- },
- {
- "principle": "使用OCR技术提取图像中的文本信息",
- "id": "P_img_ocr",
- "confidence": 90
- },
- {
- "principle": "设计图像质量自动评估与筛选机制",
- "id": "P_img_quality",
- "confidence": 88
- },
- {
- "principle": "采用深度学习模型进行图像分类与检测",
- "id": "P_img_dl",
- "confidence": 91
- },
- {
- "principle": "建立图像防篡改校验机制",
- "id": "P_img_integrity",
- "confidence": 89
- }
- ]
- },
- "document_verification": {
- "keywords": [
- "凭证",
- "证件",
- "document",
- "证书",
- "certificate",
- "鉴定",
- "验证",
- "verify",
- "authentication",
- "校验",
- "防伪",
- "anti-counterfeit",
- "水印",
- "watermark",
- "票据",
- "发票",
- "invoice",
- "合同",
- "contract",
- "存证",
- "电子凭证",
- "合规审查"
- ],
- "principles": [
- {
- "principle": "建立多维度凭证真实性交叉验证流程",
- "id": "P_doc_multi_verify",
- "confidence": 93
- },
- {
- "principle": "使用防伪特征检测(水印/微印/安全线)验证凭证真伪",
- "id": "P_doc_security_feat",
- "confidence": 91
- },
- {
- "principle": "实施凭证图像完整性校验(哈希/数字签名)",
- "id": "P_doc_integrity",
- "confidence": 90
- },
- {
- "principle": "设计凭证要素标准化提取与比对框架",
- "id": "P_doc_extract",
- "confidence": 89
- },
- {
- "principle": "建立识别失败的人工复核与异常流转机制",
- "id": "P_doc_review",
- "confidence": 87
- },
- {
- "principle": "支持多类型凭证的统一鉴定引擎架构",
- "id": "P_doc_unified",
- "confidence": 85
- }
- ]
- },
- "remote_sensing": {
- "keywords": [
- "卫星",
- "遥感",
- "remote sensing",
- "satellite",
- "aerial",
- "航拍",
- "无人机",
- "drone",
- "SAR",
- "雷达",
- "radar",
- "光谱",
- "multispectral",
- "高光谱",
- "hyperspectral",
- "地理空间",
- "geospatial",
- "GIS",
- "测绘",
- "地形",
- "变化检测",
- "change detection",
- "目标识别",
- "object detection",
- "像素级",
- "pixel",
- "分辨率",
- "resolution",
- "影像配准",
- "image registration",
- "正射",
- "orthophoto",
- "NDVI",
- "波段",
- "band",
- "融合",
- "fusion"
- ],
- "principles": [
- {
- "principle": "采用多光谱/高光谱数据增强地物识别能力",
- "id": "P_rs_multispectral",
- "confidence": 93
- },
- {
- "principle": "实施图像配准与几何校正消除畸变",
- "id": "P_rs_registration",
- "confidence": 92
- },
- {
- "principle": "设计像素级/对象级变化检测流程",
- "id": "P_rs_change_detection",
- "confidence": 91
- },
- {
- "principle": "使用深度学习模型进行遥感目标检测与分类",
- "id": "P_rs_deep_learning",
- "confidence": 90
- },
- {
- "principle": "建立多源遥感数据融合与时空分析框架",
- "id": "P_rs_fusion",
- "confidence": 89
- },
- {
- "principle": "设计抗云遮挡与大气校正预处理流水线",
- "id": "P_rs_atm_correction",
- "confidence": 88
- },
- {
- "principle": "利用时序分析方法监测地表动态变化",
- "id": "P_rs_time_series",
- "confidence": 87
- },
- {
- "principle": "对卫星图像实施影像分割与语义标注",
- "id": "P_rs_segmentation",
- "confidence": 86
- },
- {
- "principle": "设计遥感影像质量评估(云量/清晰度/覆盖度)",
- "id": "P_rs_qa",
- "confidence": 85
- },
- {
- "principle": "构建地理空间索引支持大范围影像检索",
- "id": "P_rs_spatial_index",
- "confidence": 84
- },
- {
- "principle": "使用GAN/超分辨率重建卫星图像",
- "id": "P_rs_super_res",
- "confidence": 82
- }
- ]
- },
- "graph_database": {
- "keywords": [
- "neo4j",
- "cypher",
- "graph database",
- "图数据库",
- "图查询",
- "graph data",
- "图数据",
- "node",
- "节点",
- "relationship",
- "关系",
- "property graph",
- "属性图",
- "知识图谱",
- "knowledge graph",
- "graph traversal",
- "图遍历",
- "图算法",
- "graph algorithm"
- ],
- "principles": [
- {
- "principle": "使用Cypher查询语言进行图模式匹配与关联分析",
- "id": "P_gql_shortest_path",
- "confidence": 90
- },
- {
- "principle": "设计合理的节点标签与关系类型优化遍历性能",
- "id": "P_gql_node_rel",
- "confidence": 89
- },
- {
- "principle": "利用图索引提升Cypher查询性能",
- "id": "P_gql_index",
- "confidence": 88
- },
- {
- "principle": "使用图算法进行路径分析、社区发现与推荐",
- "id": "P_gql_algo",
- "confidence": 86
- },
- {
- "principle": "采用APOC标准库扩展Neo4j功能",
- "id": "P_gql_apoc",
- "confidence": 85
- },
- {
- "principle": "优化Cypher查询计划(PROFILE/EXPLAIN)",
- "id": "P_gql_profile",
- "confidence": 84
- },
- {
- "principle": "使用事务管理批量写入确保数据一致性",
- "id": "P_gql_txn",
- "confidence": 83
- }
- ]
- },
- "search": {
- "keywords": [
- "search",
- "wiki",
- "wikipedia",
- "wikidata",
- "检索",
- "搜索",
- "查询",
- "文档搜索"
- ],
- "domain_label": "搜索与Wiki检索",
- "principles": [
- {
- "principle": "使用布尔运算符(AND/OR/NOT)组合搜索条件优化检索精度",
- "id": "P_search_boolean",
- "confidence": 90
- },
- {
- "principle": "利用高级搜索过滤器(日期范围、站点限定、文件类型)缩小结果范围",
- "id": "P_search_filters",
- "confidence": 88
- },
- {
- "principle": "构建高效搜索查询时优先使用精确短语匹配(引号)和通配符",
- "id": "P_search_exact",
- "confidence": 85
- },
- {
- "principle": "实现搜索结果的缓存机制减少重复请求提升响应速度",
- "id": "P_search_cache",
- "confidence": 82
- },
- {
- "principle": "设计搜索建议和自动补全功能提升用户体验",
- "id": "P_search_suggest",
- "confidence": 80
- },
- {
- "principle": "实现分页和游标机制处理大规模搜索结果集",
- "id": "P_search_pagination",
- "confidence": 85
- },
- {
- "principle": "使用多步骤过滤工作流(先搜索再筛选层级下钻)提升结果精准度",
- "id": "P_search_multistep",
- "confidence": 83
- },
- {
- "principle": "处理搜索关键词的拼写纠错和同义词扩展提升召回率",
- "id": "P_search_spell",
- "confidence": 78
- }
- ]
- }
-}
\ No newline at end of file
diff --git a/tools/skill_learn_from_cases/sync.ffs_db b/tools/skill_learn_from_cases/sync.ffs_db
deleted file mode 100644
index 626831980a2a730036333e0f20c455cd69742295..0000000000000000000000000000000000000000
GIT binary patch
literal 0
HcmV?d00001
literal 628
zcmV-)0*n1da%E*kX>4Uvd2V9>3jhEB0RR915C8xGiB7h)k>%=2lEBu000000OSagJDrf?3#1wVh!`4TBLV;bAO!#b00000cwSwT%}Z2a6vj_9
z_uiSNar`JtiAGTq%9UgYVjo1V+7)g}kV40a76XBc3j7m-sMIjhh{R2cT(k>}TDE9Y
zL1088C4vNfP#EO+aNN^f*Uwtly2c5JN~sgO$zfg_n8RKLD2
zaMt;It3Lm}r2gH@upV_D-5-2GE?igNhy2KSxI_H_{R;B;Iv?xQ^Oi?+{{!U3&Oh4K
zca5okSPb0g{0lyk=|1&8C&Sv|e1`Wgx2eB@^Aj2&x)Z*a^)X)F2>jF2+VvqW>&%-+C
zJOdxg3;Kzr;Om_q!$)%Qu`^9NAhI_CxMUpk%9`r&;V#CQ#y$T#+f
zdE*V_Z{P1D@@qZ+3G%b@;W+z~-^=(KIFX@)>fhqLtn&=&m62+z`e!&Va-L23#gu$v;C;8}eSrP)l=F88`FSzE%lljYdFC1a
OA^!&uAgN5_dppIW^GZ(u
From 1de2ae259771b957161791f8e80f07ac26a06218 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 13:38:03 +0800
Subject: [PATCH 17/22] fix: [Start idle autonomous action] button click unresponsive
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
frontends/stapp.py | 1 +
1 file changed, 1 insertion(+)
diff --git a/frontends/stapp.py b/frontends/stapp.py
index 83a6d33c..c0ee0320 100644
--- a/frontends/stapp.py
+++ b/frontends/stapp.py
@@ -97,6 +97,7 @@ def _pet_hook(ctx):
st.divider()
if st.button("开始空闲自主行动"):
st.session_state.last_reply_time = int(time.time()) - 1800
+ st.session_state.autonomous_enabled = True
st.toast("已将上次回复时间设为1800秒前"); st.rerun()
if st.session_state.autonomous_enabled:
if st.button("⏸️ 禁止自主行动"):
From e123390df67efe4429c9595a0b57a6a527929483 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 13:46:17 +0800
Subject: [PATCH 18/22] fix: clicking the [Start idle autonomous action] button did nothing
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
.gitignore | 1 -
1 file changed, 1 deletion(-)
diff --git a/.gitignore b/.gitignore
index db9bd51f..e699cc17 100644
--- a/.gitignore
+++ b/.gitignore
@@ -118,4 +118,3 @@ reflect/*
**/__pycache__/
.claude/
-tools/skill_learn_from_cases/sync.ffs_db*
\ No newline at end of file
From ec7ae8d9122dd6d0c99dec505ca5c171cba88210 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 13:50:55 +0800
Subject: [PATCH 19/22] clean fork 0515
---
tools/learn_skill_from_cases/README.md | 64 ---
tools/learn_skill_from_cases/__init__.py | 1 -
tools/learn_skill_from_cases/__main__.py | 117 ----
tools/learn_skill_from_cases/dir_manager.py | 130 -----
.../eng_patterns_data.py | 166 ------
tools/learn_skill_from_cases/engine.py | 502 ------------------
6 files changed, 980 deletions(-)
delete mode 100644 tools/learn_skill_from_cases/README.md
delete mode 100644 tools/learn_skill_from_cases/__init__.py
delete mode 100644 tools/learn_skill_from_cases/__main__.py
delete mode 100644 tools/learn_skill_from_cases/dir_manager.py
delete mode 100644 tools/learn_skill_from_cases/eng_patterns_data.py
delete mode 100644 tools/learn_skill_from_cases/engine.py
diff --git a/tools/learn_skill_from_cases/README.md b/tools/learn_skill_from_cases/README.md
deleted file mode 100644
index a4b4439a..00000000
--- a/tools/learn_skill_from_cases/README.md
+++ /dev/null
@@ -1,64 +0,0 @@
-# learn_skill_from_cases — Skill Learning CLI
-
-A streamlined skill learning tool.
-
-**English input only** — provide skill names in pure English.
-
-## Usage
-
-```bash
-# Learn a skill
-python -m tools.learn_skill_from_cases "docker_compose_production"
-
-# List learned skills
-python -m tools.learn_skill_from_cases --list
-
-# Show skill details
-python -m tools.learn_skill_from_cases --show docker_compose_production
-
-# Dry run (preview without creating files)
-python -m tools.learn_skill_from_cases "python_async" --dry-run
-
-# Force refresh (skip inheriting previous patterns)
-python -m tools.learn_skill_from_cases "neo4j_modeling" --force
-
-# Show version
-python -m tools.learn_skill_from_cases --version
-```
-
-## Environment Variables
-
-| Variable | Default | Description |
-| ------------------ | --------------------------- | ------------------------------------ |
-| `SKILL_LLM_ENABLE` | `0` | Set to `1` to enable LLM enhancement |
-| `LLM_API_BASE` | `http://localhost:11434/v1` | OpenAI-compatible API endpoint |
-| `LLM_API_KEY` | — | API key if required |
-| `LLM_MODEL` | `qwen2.5:7b` | Model name |
-| `LLM_TIMEOUT` | `30` | HTTP timeout in seconds |
-
-## Output Structure
-
-```
-GA_ROOT/skills_learning/
- └── {skill_name}/
- ├── rev{N}/
- │ ├── meta.json
- │ ├── cases/all_cases.json
- │ ├── patterns/knowledge_patterns.json
- │ ├── tools/assess.py
- │ ├── reports/learning_report.md
- │ ├── reports/skill_definition.json
- │ └── practice/
- └── ...
-```
-
-## Phase Flow
-
-The tool runs a 5-phase pipeline:
-
-1. **Bootstrap** — create version directory
-2. **Define** — fetch skill definition
-3. **Search** — collect web cases
-4. **Extract** — derive knowledge patterns
-5. **Validate** — run assessment and score
-
diff --git a/tools/learn_skill_from_cases/__init__.py b/tools/learn_skill_from_cases/__init__.py
deleted file mode 100644
index 1ad94e4c..00000000
--- a/tools/learn_skill_from_cases/__init__.py
+++ /dev/null
@@ -1 +0,0 @@
-"""learn_skill_from_cases — English-only skill learning from cases (simplified version)"""
diff --git a/tools/learn_skill_from_cases/__main__.py b/tools/learn_skill_from_cases/__main__.py
deleted file mode 100644
index 562753c5..00000000
--- a/tools/learn_skill_from_cases/__main__.py
+++ /dev/null
@@ -1,117 +0,0 @@
-"""
-__main__.py — learn_skill_from_cases CLI entry point
-
-Usage:
- python -m tools.learn_skill_from_cases "docker_compose_production"
- python -m tools.learn_skill_from_cases --list
- python -m tools.learn_skill_from_cases "python_async" --dry-run
- python -m tools.learn_skill_from_cases "neo4j_modeling" --force
- python -m tools.learn_skill_from_cases --version
- python -m tools.learn_skill_from_cases --show docker_compose_production
-"""
-import sys, argparse, re, json
-from pathlib import Path
-
-GA_ROOT = Path(__file__).resolve().parents[2]
-sys.path.insert(0, str(GA_ROOT))
-
-from tools.learn_skill_from_cases import dir_manager
-
-
-def validate_english_only(name: str):
- """Reject skill names containing CJK characters. English only."""
- if re.search(r'[\u4e00-\u9fff\u3000-\u303f\uff00-\uffef]', name):
- print("Error: Skill name must be in English only.")
- print(" Chinese characters, Japanese characters, and mixed-language inputs are not supported.")
- print(" Please provide a pure English skill name (e.g., 'docker_compose_production').")
- sys.exit(1)
-
-
-def cmd_list():
- """List all learned skills with version info."""
- skills = dir_manager.get_all_skills()
- if not skills:
- print("No skills learned yet. Use:")
- print(' python -m tools.learn_skill_from_cases "your_skill_name"')
- return
- print(f"\nLearned skills ({len(skills)} total):")
- print("-" * 55)
- for skill in skills:
- versions = dir_manager.get_versions(skill)
- print(f" {skill:30s} rev{versions[-1] if versions else '--'}")
-
-
-def cmd_show(skill_name: str):
- """Show details of a specific skill (version list + patterns)."""
- skill_dir = dir_manager.get_skill_dir(skill_name)
- if not skill_dir.exists():
- print(f"Skill '{skill_name}' not found.")
- return
- versions = dir_manager.get_versions(skill_name)
- if not versions:
- print(f"Skill '{skill_name}' has no versions.")
- return
- print(f"\nSkill: {skill_name}")
- print("=" * 55)
- for v in versions:
- print(f" rev{v}")
- patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
- if patterns_file.exists():
- try:
- patterns = json.loads(patterns_file.read_text(encoding="utf-8"))
- for p in patterns:
- print(f" [{p.get('level','?')}] {p.get('principle','?')[:70]}")
- except Exception:
- pass
-
-
-def main():
- parser = argparse.ArgumentParser(
- description="learn_skill_from_cases — English-only skill learning from cases (simplified)",
- formatter_class=argparse.RawDescriptionHelpFormatter
- )
- parser.add_argument("skill_name", nargs="?", help="English skill name to learn (e.g., docker_compose_production)")
- parser.add_argument("--list", action="store_true", help="List all learned skills")
- parser.add_argument("--show", metavar="NAME", help="Show details of a learned skill")
- parser.add_argument("--dry-run", action="store_true", help="Preview without creating files")
- parser.add_argument("--force", action="store_true", help="Skip inherited patterns, start fresh")
- parser.add_argument("--version", action="store_true", help="Show version")
-
- args = parser.parse_args()
-
- # Handle special commands
- if args.version:
- print("learn_skill_from_cases v1.0.0 (simplified English-only version)")
- return
-
- if args.list:
- cmd_list()
- return
-
- if args.show:
- cmd_show(args.show)
- return
-
- # Must have a skill name
- if not args.skill_name:
- parser.print_help()
- print("\nError: Please provide a skill name or use --list.")
- sys.exit(1)
-
- # Validate: English only
- validate_english_only(args.skill_name)
-
- # Run the learning pipeline
- from tools.learn_skill_from_cases.engine import run
- ctx = run(args.skill_name, dry_run=args.dry_run, force=args.force)
-
- if ctx.get("score", 0) >= 60:
- print(f"\n Learning score: {ctx['score']:.1f}/100 — Good result!")
- elif ctx.get("score", 0) > 0:
- print(f"\n Learning score: {ctx['score']:.1f}/100 — Consider adding more cases.")
- else:
- print(f"\n Score not available. Review the output above.")
-
-
-if __name__ == "__main__":
- main()
diff --git a/tools/learn_skill_from_cases/dir_manager.py b/tools/learn_skill_from_cases/dir_manager.py
deleted file mode 100644
index 4e65bb2a..00000000
--- a/tools/learn_skill_from_cases/dir_manager.py
+++ /dev/null
@@ -1,130 +0,0 @@
-"""
-dir_manager.py — Skill version directory management (simplified, English-only)
-
-Responsibilities: detect existing versions, create revN directories, inherit previous patterns.
-"""
-import os, json, shutil, re
-from pathlib import Path
-
-GA_ROOT = Path(__file__).resolve().parents[2]
-SKILL_LEARN_ROOT = GA_ROOT / "skills_learning"
-
-
-def _sanitize_skill_name(skill_name: str) -> str:
- """Sanitize skill name: only allow alphanumeric, underscore, hyphen. No path traversal."""
- sanitized = re.sub(r'[^\w\-]', '_', skill_name)
- sanitized = sanitized.strip('_')
- return sanitized or "unnamed_skill"
-
-
-def _list_dirs(parent: Path) -> list[Path]:
- if not parent.exists():
- return []
- return [d for d in parent.iterdir() if d.is_dir()]
-
-
-def get_versions(skill_name: str) -> list[int]:
- """Get existing version numbers for a skill, e.g. [1, 2, 3]"""
- skill_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
- versions = []
- for d in _list_dirs(skill_dir):
- if d.name.startswith("rev"):
- try:
- versions.append(int(d.name[3:]))
- except ValueError:
- pass
- return sorted(versions)
-
-
-def next_version(skill_name: str) -> int:
- """Return the next version number."""
- versions = get_versions(skill_name)
- return (max(versions) + 1) if versions else 1
-
-
-def ensure_root_exists():
- """Ensure skills_learning/ root directory exists."""
- if not SKILL_LEARN_ROOT.exists():
- SKILL_LEARN_ROOT.mkdir(parents=True, exist_ok=True)
- print(" [OK] skills_learning/ root directory created")
-
-
-def get_skill_dir(skill_name: str) -> Path:
- """Return skill directory (path injection protected)."""
- return SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
-
-
-def get_latest_revision_dir(skill_name: str) -> Path | None:
- """Return the latest rev directory that has knowledge patterns."""
- safe_name = _sanitize_skill_name(skill_name)
- versions = get_versions(safe_name)
- if not versions:
- return None
- skill_dir = SKILL_LEARN_ROOT / safe_name
- for v in reversed(versions):
- patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
- if patterns_file.exists():
- return skill_dir / f"rev{v}"
- return skill_dir / f"rev{versions[-1]}"
-
-
-def get_latest_patterns(skill_name: str) -> list[dict]:
- """Inherit knowledge patterns from the latest revision."""
- latest = get_latest_revision_dir(skill_name)
- if latest is None:
- return []
- patterns_file = latest / "patterns" / "knowledge_patterns.json"
- if patterns_file.exists():
- with open(patterns_file, encoding="utf-8") as f:
- return json.load(f)
- return []
-
-
-def get_latest_cases(skill_name: str) -> list[dict]:
- """Inherit cases from the latest revision."""
- latest = get_latest_revision_dir(skill_name)
- if not latest:
- return []
- cases_file = latest / "cases" / "all_cases.json"
- if cases_file.exists():
- try:
- with open(cases_file, encoding="utf-8") as f:
- data = json.load(f)
- return data if isinstance(data, list) else [data]
- except (json.JSONDecodeError, OSError):
- pass
- return []
-
-
-def create_revision_dir(skill_name: str, version: int) -> Path:
- """
- Create revN directory structure:
- revN/
- ├── meta.json
- ├── cases/
- ├── patterns/
- ├── tools/
- ├── reports/
- └── practice/
- """
- rev_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name) / f"rev{version}"
- subdirs = ["cases", "patterns", "tools", "practice", "reports"]
- for s in subdirs:
- (rev_dir / s).mkdir(parents=True, exist_ok=True)
-
- meta = {
- "skill": skill_name,
- "version": version,
- "created_at": "2026-05-15",
- "status": "in_progress"
- }
- with open(rev_dir / "meta.json", "w", encoding="utf-8") as f:
- json.dump(meta, f, indent=2)
- return rev_dir
-
-
-def get_all_skills() -> list[str]:
- """Get all skill names under skills_learning/."""
- if not SKILL_LEARN_ROOT.exists():
- return []
- return sorted(d.name for d in _list_dirs(SKILL_LEARN_ROOT) if d.is_dir())
diff --git a/tools/learn_skill_from_cases/eng_patterns_data.py b/tools/learn_skill_from_cases/eng_patterns_data.py
deleted file mode 100644
index 22ed8f10..00000000
--- a/tools/learn_skill_from_cases/eng_patterns_data.py
+++ /dev/null
@@ -1,166 +0,0 @@
-"""
-eng_patterns_data.py — Static pattern dictionaries for learn_skill_from_cases engine.
-
-Extracted from engine.py to keep core logic lean and allow easy maintenance/expansion.
-"""
-# ============================================================
-# Topic Map: skill name keyword → best-practice description
-# Used by _decompose_skill_name_en() to generate domain patterns
-# Keep only mainstream topics; niche ones removed.
-# ============================================================
-TOPIC_MAP: dict[str, str] = {
- "deploy": "Deployment automation & release management best practices",
- "production": "Production-ready configuration & environment management",
- "docker": "Containerization & Docker orchestration best practices",
- "kubernetes": "Kubernetes cluster management & pod orchestration",
- "k8s": "Kubernetes cluster management & pod orchestration",
- "api": "API design, versioning & documentation best practices",
- "rest": "RESTful API design & HTTP protocol best practices",
- "database": "Database schema design & query optimization",
- "sql": "SQL query optimization & relational data modeling",
- "python": "Python code organization & packaging best practices",
- "async": "Async programming patterns & concurrency management",
- "testing": "Test strategy & automation framework best practices",
- "monitor": "Monitoring & observability stack implementation",
- "security": "Security hardening & vulnerability management",
- "frontend": "Frontend architecture & component design patterns",
- "backend": "Backend service architecture & middleware patterns",
- "microservice": "Microservice decomposition & inter-service communication",
- "devops": "CI/CD pipeline design & infrastructure as code",
- "ci": "Continuous integration pipeline configuration",
- "cd": "Continuous deployment strategies & rollback patterns",
- "data": "Data pipeline architecture & ETL best practices",
- "machine": "Machine learning pipeline & model lifecycle management",
- "automation": "Workflow automation & task scheduling patterns",
-}
-
-# Keywords to scan from case titles (used by _decompose_skill_name_en)
-CASE_SCAN_KEYWORDS: list[str] = [
- "deploy", "docker", "kubernetes", "monitoring", "testing",
- "security", "api", "database", "async", "microservice",
- "pipeline", "automation", "config", "devops", "ci", "cd",
-]
-
-# ============================================================
-# Core Patterns: domain → best-practice principles
-# Used by _extract_patterns() to produce knowledge patterns
-# Keep only high-impact, cross-domain patterns.
-# ============================================================
-CORE_PATTERNS: dict[str, dict] = {
- "production": {
- "keywords": ["production", "deploy", "prod", "release"],
- "principles": [
- ("Use environment variables / config files to separate environments", "P_env_separation", 89),
- ("Pin dependency versions to avoid unexpected upgrades", "P_pin_version", 94),
- ("Set resource limits to prevent single service starvation", "P_resource_limits", 85),
- ]
- },
- "testing": {
- "keywords": ["test", "validate", "verify", "lint"],
- "principles": [
- ("Validate configuration files before deployment", "P_config_validation", 93),
- ("Write unit tests for core business logic", "P_unit_test", 87),
- ("Use integration tests to verify component interactions", "P_integration_test", 85),
- ]
- },
- "security": {
- "keywords": ["security", "auth", "encrypt", "secret", "permission"],
- "principles": [
- ("Never hardcode secrets; use secret management tools", "P_secret_mgmt", 95),
- ("Apply principle of least privilege for service accounts", "P_least_privilege", 90),
- ("Enable TLS/SSL for all service communications", "P_tls", 88),
- ]
- },
- "database": {
- "keywords": ["database", "query", "index", "schema", "migration"],
- "principles": [
- ("Use database migrations for schema changes", "P_db_migration", 90),
- ("Add indexes for frequently queried columns", "P_db_index", 88),
- ("Use connection pooling to manage database connections", "P_connection_pool", 85),
- ]
- },
-}
-
-
-# ============================================================
-# Assessment Code Generator
-# Renders the self-contained assess.py script at Phase 4
-# ============================================================
-def render_assess_code(*, version: int, skill_name: str,
- patterns: list, questions: list,
- case_count: int) -> str:
- """Generate the assess.py script content as a string."""
- import json
- patterns_json = json.dumps(patterns, indent=2)
- questions_json = json.dumps(questions, indent=2)
- return f'''#!/usr/bin/env python3
-"""learn_skill_from_cases rev{version} -- {skill_name} Assessment Tool
-Auto-generated | Knowledge test + Pattern coverage
-"""
-import json, sys, os, random
-from pathlib import Path
-
-PATTERNS = {patterns_json}
-QUESTIONS = {questions_json}
-
-def run_knowledge_test():
- """Run knowledge test and compute score."""
- if not QUESTIONS:
- return 0, []
- per_q = 100.0 / len(QUESTIONS)
- score = 0
- results = []
- border = "-" * 50
- print(f"\\n{{border}}")
- print(f" Knowledge Test ({{len(QUESTIONS)}} questions)")
- print(f"{{border}}")
-
- for qi, q in enumerate(QUESTIONS):
- p = PATTERNS[qi] if qi < len(PATTERNS) else {{}}
- level = p.get("level", "basic") if isinstance(p, dict) else "basic"
- confidence = p.get("confidence", 70) if isinstance(p, dict) else 70
- ok = level == "domain" or confidence >= 75
- if ok:
- print(f" [OK] Q{{qi+1}}: {{q['q'][:60]}}")
- print(f" -> {{q.get('explain', '')[:60]}}")
- score += per_q
- results.append(True)
- else:
- print(f" [!] Q{{qi+1}}: {{q['q'][:60]}}")
- print(f" -> SKIP (low confidence)")
- results.append(False)
- return score, results
-
-def run_pattern_coverage():
- """Check which patterns are covered by cases."""
- covered = 0
- for p in PATTERNS:
- print(f" [{{'OK' if p.get('level') != 'basic' else '??'}}] {{p.get('principle', '?')[:60]}}")
- if p.get('level') != 'basic':
- covered += 1
- total = len(PATTERNS) or 1
- return (covered / total) * 100
-
-def main():
- print(f"\\n{{'='*55}}")
- print(f" Assessment: rev{version} -- {skill_name}")
- print(f"{{'='*55}}")
- print(f" Cases collected: {case_count}")
- print(f" Patterns extracted: {{len(PATTERNS)}}")
-
- knowledge_score, _ = run_knowledge_test()
- coverage_score = run_pattern_coverage()
- overall = (knowledge_score * 0.6 + coverage_score * 0.4)
-
- print(f"\\n{{'='*55}}")
- print(f" RESULTS")
- print(f"{{'='*55}}")
- print(f" Knowledge Test: {{knowledge_score:.1f}}/100")
- print(f" Pattern Coverage: {{coverage_score:.1f}}/100")
- print(f" Overall Score: {{overall:.1f}}/100")
- print(f"{{'='*55}}\\n")
- return overall
-
-if __name__ == "__main__":
- main()
-'''
diff --git a/tools/learn_skill_from_cases/engine.py b/tools/learn_skill_from_cases/engine.py
deleted file mode 100644
index 3cdeff31..00000000
--- a/tools/learn_skill_from_cases/engine.py
+++ /dev/null
@@ -1,502 +0,0 @@
-"""
-engine.py — Simplified skill learning engine (English-only)
-
-5-phase flow:
- Phase 0: Bootstrap + directory creation
- Phase 1: Skill definition (skill_search lookup)
- Phase 2: Case collection (skill_search + web search)
- Phase 3: Pattern extraction & knowledge refinement
- Phase 4: Assessment tool generation
- Phase 5: Validation & report
-"""
-import sys, os, json, re, subprocess, importlib, random
-from pathlib import Path
-
-GA_ROOT = Path(__file__).resolve().parents[2]
-sys.path.insert(0, str(GA_ROOT))
-
-from tools.learn_skill_from_cases import dir_manager
-from tools.learn_skill_from_cases.eng_patterns_data import TOPIC_MAP, CASE_SCAN_KEYWORDS, CORE_PATTERNS, render_assess_code
-
-
-# ===============================================================
-# Phase 0: Bootstrap
-# ===============================================================
-def _ensure_env(ctx: dict):
- """Phase 0 — Ensure environment is ready."""
- print("\n" + ("=" * 55))
- print(" Phase 0: Bootstrap")
- print("=" * 55)
- dir_manager.ensure_root_exists()
- version = dir_manager.next_version(ctx["skill_name"])
- rev_dir = dir_manager.create_revision_dir(ctx["skill_name"], version)
- ctx["version"] = version
- ctx["rev_dir"] = rev_dir
- print(f" Skill: {ctx['skill_name']}")
- print(f" Version: rev{version}")
- print(f" Directory: {rev_dir}")
- print(" [OK] Environment ready")
-
-
-# ===============================================================
-# Phase 1: Skill Definition
-# ===============================================================
-def _import_skill_search():
- """Lazy import skill_search, return None if unavailable."""
- try:
- from skill_search import search
- return search
- except Exception:
- return None
-
-
-def _phase1_define(ctx: dict):
- """Phase 1 — Define the skill by looking up known knowledge."""
- print(f"\n{'-' * 55}")
- print(" Phase 1: Skill Definition")
- print("-" * 55)
-
- ctx["skill_definition"] = {
- "name": ctx["skill_name"],
- "description": "",
- "tags": [],
- "source": "user_input"
- }
-
- search_fn = _import_skill_search()
- if search_fn:
- try:
- results = search_fn(ctx["skill_name"].replace("_", " "), top_k=5)
- if results:
- best = results[0]
- s = best.skill
- ctx["skill_definition"]["description"] = (s.description or "")[:500]
- ctx["skill_definition"]["tags"] = (s.tags or [])[:10]
- ctx["skill_definition"]["key"] = s.key
- ctx["skill_definition"]["source"] = "skill_search"
- print(f" Found: {s.key}")
- if s.description:
- print(f" Description: {s.description[:100]}...")
- else:
- print(f" No results from skill_search")
- except Exception as e:
- print(f" skill_search: [FAIL] {e}")
- else:
- print(f" skill_search not available")
-
- # Write definition
- def_file = ctx["rev_dir"] / "reports" / "skill_definition.json"
- with open(def_file, "w", encoding="utf-8") as f:
- json.dump(ctx["skill_definition"], f, indent=2, ensure_ascii=False)
- print(" [OK] Definition saved")
-
-
-# ===============================================================
-# Phase 2: Case Collection
-# ===============================================================
-def _import_web_search():
- """Simple import of web search; return None if unavailable."""
- try:
- from memory.metaso_search import metaso_search as fn
- return fn
- except Exception:
- return None
-
-
-def _generate_search_queries(skill_name: str) -> list[str]:
- """Generate English search queries for a skill name."""
- name = skill_name.replace("_", " ").title()
- return [
- f"{name} tutorial",
- f"{name} how to use",
- f"{name} examples guide",
- f"{name} best practices",
- f"{name} getting started",
- f"learn {name}",
- ]
-
-
-def _phase2_search(ctx: dict):
- """Phase 2 — Collect cases from skill_search + web search."""
- print(f"\n{'-' * 55}")
- print(" Phase 2: Case Collection")
- print("-" * 55)
-
- all_cases = []
-
- # Channel A: Skill Hub
- search_fn = _import_skill_search()
- if search_fn:
- try:
- results = search_fn(ctx["skill_name"].replace("_", " "), top_k=10)
- skill_cases = []
- for r in results:
- s = r.skill
- if hasattr(s, 'key') and not s.key.startswith("agentskill_skills/"):
- skill_cases.append({
- "source": "skill_hub", "type": "skill_def",
- "key": s.key,
- "description": (s.description[:300] if s.description else ""),
- "tags": s.tags[:5] if s.tags else [],
- })
- all_cases.extend(skill_cases)
- print(f" Skill Hub: {len(skill_cases)} results")
- except Exception as e:
- print(f" Skill Hub: [FAIL] {e}")
-
- # Channel B: Web Search
- web_engine = _import_web_search()
- if web_engine:
- try:
- queries = _generate_search_queries(ctx["skill_name"])
- web_cases = []
- seen_urls = set()
- seen_titles = set()
- for q in queries:
- results = web_engine(q, size=5)
- for r in results:
- url = r.get("url", "")
- title = r.get("title", "").strip()
- if url and url not in seen_urls and title not in seen_titles:
- seen_urls.add(url)
- seen_titles.add(title or url)
- web_cases.append({
- "source": "web",
- "type": "web_article",
- "title": title,
- "url": url,
- "snippet": r.get("snippet", "")[:300]
- })
- all_cases.extend(web_cases)
- print(f" Web Search: {len(web_cases)} unique results")
- except Exception as e:
- print(f" Web Search: [FAIL] {e}")
- else:
- print(" Web Search: engine unavailable")
-
- # Inherit previous cases
- if os.environ.get("SKILL_FORCE_REFRESH") != "1":
- inherited = dir_manager.get_latest_cases(ctx["skill_name"])
- if inherited:
- seen_keys = {c.get("url") or c.get("key") or "" for c in all_cases}
- added = 0
- for c in inherited:
- key = c.get("url") or c.get("key") or ""
- if key and key not in seen_keys:
- all_cases.append(c)
- seen_keys.add(key)
- added += 1
- print(f" Inherited from prev revision: +{added} cases")
-
- # Save
- cases_file = ctx["rev_dir"] / "cases" / "all_cases.json"
- with open(cases_file, "w", encoding="utf-8") as f:
- json.dump(all_cases, f, indent=2, ensure_ascii=False)
- ctx["cases"] = all_cases
- print(f" Total cases: {len(all_cases)}")
- print(" [OK] Cases saved")
-
-
-# ===============================================================
-# Phase 3: Pattern Extraction (English only)
-# ===============================================================
-def _decompose_skill_name_en(skill_name: str, cases: list = None) -> list[tuple[str, int]]:
- """Generate sub-topic patterns from an English skill name."""
- words = [w for w in skill_name.replace("_", " ").replace("-", " ").split() if len(w) > 2]
-
- topic_map = TOPIC_MAP
-
- sub_patterns = []
- seen = set()
- for word in words:
- for keyword, pattern_text in topic_map.items():
- if keyword in word.lower() or keyword == word.lower():
- if keyword not in seen:
- seen.add(keyword)
- sub_patterns.append((pattern_text, 78))
-
- # Extract keywords from case titles
- case_keywords_found = set()
- cases = cases or []
- for c in cases:
- text = (c.get("title", "") + " " + c.get("snippet", "")).lower()
- for term in CASE_SCAN_KEYWORDS:
- if term in text and term not in seen:
- case_keywords_found.add(term)
-
- for kw in case_keywords_found:
- display = topic_map.get(kw, f"{kw.title()} related best practices ({skill_name})")
- sub_patterns.append((display, 72))
- seen.add(kw)
-
- if not sub_patterns:
- generic = [
- f"{skill_name} core concepts & terminology",
- f"{skill_name} common scenarios & solutions",
- f"{skill_name} toolchain & environment setup",
- ]
- sub_patterns = [(s, 70) for s in generic]
-
- return sub_patterns[:6]
-
-
-def _extract_patterns(ctx: dict):
- """Phase 3 — Extract knowledge patterns from collected cases."""
- print(f"\n{'-' * 55}")
- print(" Phase 3: Pattern Extraction")
- print("-" * 55)
-
- cases = ctx.get("cases", [])
- skill_name = ctx["skill_name"]
- all_text = " ".join(
- str(v) for c in cases for v in c.values() if isinstance(v, str)
- ).lower()
-
- # Core pattern library (from eng_patterns_data)
- core_patterns = CORE_PATTERNS
-
- patterns = []
- seen_ids = set()
-
- # Match core patterns against case text
- for category, info in core_patterns.items():
- for kw in info["keywords"]:
- if kw in all_text:
- for principle, pid, conf in info["principles"]:
- if pid not in seen_ids:
- patterns.append({"id": pid, "principle": principle, "confidence": conf, "level": "basic"})
- seen_ids.add(pid)
- break
-
- # Add domain patterns from skill name decomposition
- sub_ideas = _decompose_skill_name_en(skill_name, cases=cases)
- for i, (sub_name, conf) in enumerate(sub_ideas):
- pid = f"P_domain_{i+1}"
- if pid not in seen_ids:
- patterns.append({
- "id": pid,
- "principle": sub_name,
- "confidence": conf,
- "level": "domain"
- })
- seen_ids.add(pid)
-
- # Inherit patterns from previous version
- if os.environ.get("SKILL_FORCE_REFRESH") != "1":
- inherited = dir_manager.get_latest_patterns(skill_name)
- if inherited:
- added = 0
- for p in inherited:
- pid = p.get("id")
- if pid and pid not in seen_ids:
- patterns.append({
- "id": pid, "principle": p["principle"],
- "confidence": max(p.get("confidence", 50) - 5, 50),
- "level": "inherited"
- })
- seen_ids.add(pid)
- added += 1
- print(f" Inherited: +{added} patterns from prev revision")
-
- if not patterns:
- # Fallback: generate generic patterns
- patterns = [
- {"id": "P_generic_1", "principle": f"Core concepts of {skill_name}", "confidence": 70, "level": "basic"},
- {"id": "P_generic_2", "principle": f"Best practices for {skill_name} setup", "confidence": 70, "level": "basic"},
- {"id": "P_generic_3", "principle": f"Common pitfalls in {skill_name}", "confidence": 65, "level": "basic"},
- ]
-
- # Save
- patterns_file = ctx["rev_dir"] / "patterns" / "knowledge_patterns.json"
- with open(patterns_file, "w", encoding="utf-8") as f:
- json.dump(patterns, f, indent=2, ensure_ascii=False)
- ctx["patterns"] = patterns
- print(f" Patterns extracted: {len(patterns)}")
- for p in patterns:
- print(f" [{p['level']:>9}] {p['principle'][:60]}")
- print(" [OK] Patterns saved")
-
-
-# ===============================================================
-# Phase 4: Generate Assessment Tool
-# ===============================================================
-def _generate_assessment(ctx: dict):
- """Phase 4 — Generate an inline assessment script."""
- print(f"\n{'-' * 55}")
- print(" Phase 4: Generate Assessment")
- print("-" * 55)
-
- patterns = ctx.get("patterns", [])
- case_count = len(ctx.get("cases", []))
- skill_name = ctx["skill_name"]
- version = ctx["version"]
-
- # Build questions from patterns
- questions = []
- pattern_texts = [p.get("principle", "?") for p in patterns]
- n = len(pattern_texts)
- generic_fillers = [
- "Clean up temp files regularly to free disk space",
- "Use type annotations to improve code readability",
- "Add unit tests to ensure code quality",
- "Document API endpoints for team collaboration",
- ]
-
- for i, p in enumerate(patterns):
- principle = p.get("principle", "")
- scenario = pattern_texts[(i + 1) % n][:60] if n > 1 else principle[:60]
- correct_text = principle[:60]
-
- others = [pattern_texts[j][:60] for j in range(n) if j != i and j != (i + 1) % n]
- random.shuffle(others)
- wrongs = others[:3]
- while len(wrongs) < 3:
- wrongs.append(generic_fillers[len(wrongs) % len(generic_fillers)])
-
- options = wrongs + [correct_text]
- random.shuffle(options)
- correct_idx = options.index(correct_text)
- labels = ["A", "B", "C", "D"]
-
- questions.append({
- "q": f"Which approach is best for: {scenario}?",
- "a": options[0], "b": options[1], "c": options[2], "d": options[3],
- "answer": labels[correct_idx],
- "explain": f"Best practice: {principle}"
- })
-
- # Generate assess.py via template
- assess_code = render_assess_code(
- version=version, skill_name=skill_name,
- patterns=patterns, questions=questions,
- case_count=case_count
- )
-
- assess_file = ctx["rev_dir"] / "tools" / "assess.py"
- with open(assess_file, "w", encoding="utf-8") as f:
- f.write(assess_code)
-
- ctx["assess_file"] = assess_file
- print(f" Generated: tools/assess.py ({len(questions)} questions)")
- print(" [OK] Assessment generated")
-
-
-# ===============================================================
-# Phase 5: Validation & Report
-# ===============================================================
-def _phase5_validate(ctx: dict):
- """Phase 5 — Run validation and generate learning report."""
- print(f"\n{'-' * 55}")
- print(" Phase 5: Validation & Report")
- print("-" * 55)
-
- assess_file = ctx.get("assess_file")
- if assess_file and assess_file.exists():
- try:
- result = subprocess.run(
- [sys.executable, str(assess_file)],
- capture_output=True, text=True, timeout=60,
- cwd=str(ctx["rev_dir"])
- )
- print(result.stdout)
- if result.stderr:
- print(f" [STDERR] {result.stderr[:200]}")
-
- # Parse overall score from output
- score = 0.0
- for line in result.stdout.split("\n"):
- if "Overall Score:" in line:
- try:
- score = float(line.split(":")[1].strip().split("/")[0])
- except ValueError:
- pass
- ctx["score"] = score
- print(f" Validation score: {score:.1f}/100")
- except subprocess.TimeoutExpired:
- print(" [FAIL] Validation timed out")
- ctx["score"] = 0
- except Exception as e:
- print(f" [FAIL] Validation error: {e}")
- ctx["score"] = 0
- else:
- print(" No assess.py found, skipping validation")
- ctx["score"] = 0
-
- # Generate learning report
- report = f"""# Learning Report: {ctx['skill_name']} (rev{ctx['version']})
-
-## Summary
-- **Skill**: {ctx['skill_name']}
-- **Version**: rev{ctx['version']}
-- **Date**: 2026-05-15
-- **Cases collected**: {len(ctx.get('cases', []))}
-- **Patterns extracted**: {len(ctx.get('patterns', []))}
-- **Validation score**: {ctx.get('score', 0):.1f}/100
-
-## Patterns
-"""
- for p in ctx.get("patterns", []):
- report += f"- [{p.get('level', 'basic')}] {p.get('principle', '?')} (confidence: {p.get('confidence', 0)})\n"
-
- report += f"""
-## Next Steps
-1. Review extracted patterns and adjust confidence levels if needed
-2. Add more targeted web searches for uncovered topics
-3. Re-run learning with `--force` for a fresh start
-4. Apply learned patterns in real projects
-"""
-
- report_file = ctx["rev_dir"] / "reports" / "learning_report.md"
- with open(report_file, "w", encoding="utf-8") as f:
- f.write(report)
- print(f" Report saved: reports/learning_report.md")
- print(f" [OK] rev{ctx['version']} complete!")
-
-
-# ===============================================================
-# Main Orchestrator
-# ===============================================================
-def run(skill_name: str, dry_run: bool = False, force: bool = False) -> dict:
- """
- Run the full 5-phase skill learning pipeline.
-
- Args:
- skill_name: English skill name to learn (e.g., "docker_compose_production")
- dry_run: If True, only show what would be done
- force: If True, skip inherited patterns/cases
-
- Returns:
- Context dict with all phase results
- """
- if force:
- os.environ["SKILL_FORCE_REFRESH"] = "1"
-
- ctx = {
- "skill_name": skill_name,
- "version": 0,
- "rev_dir": None,
- "cases": [],
- "patterns": [],
- "score": 0,
- "dry_run": dry_run,
- }
-
- if dry_run:
- print(f"\n{'=' * 55}")
- print(f" DRY RUN: {skill_name}")
- print(f"{'=' * 55}")
- version = dir_manager.next_version(skill_name)
- rev_dir = dir_manager.get_skill_dir(skill_name) / f"rev{version}"
- print(f" Would create: {rev_dir}")
- print(f" Would run: Phase 1-5 pipeline")
- print(f" [OK] Dry run complete (no changes made)")
- return ctx
-
- _ensure_env(ctx)
- _phase1_define(ctx)
- _phase2_search(ctx)
- _extract_patterns(ctx)
- _generate_assessment(ctx)
- _phase5_validate(ctx)
-
- return ctx
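The Phase 5 validator above recovers the score by scanning the assessment subprocess's stdout for an `Overall Score:` line. The same parsing can be sketched as a standalone helper (the function name here is illustrative, not part of the deleted module):

```python
def parse_overall_score(stdout: str) -> float:
    """Extract the numeric score from a line like 'Overall Score: 87.5/100'."""
    for line in stdout.split("\n"):
        if "Overall Score:" in line:
            try:
                # take the text after the colon, drop the '/100' suffix
                return float(line.split(":")[1].strip().split("/")[0])
            except ValueError:
                pass
    return 0.0

report = "  Knowledge Test: 80.0/100\n  Overall Score: 87.5/100\n"
print(parse_overall_score(report))  # → 87.5
```

Parsing free-form stdout like this is fragile by design; a more robust engine would have `assess.py` emit a machine-readable line (e.g. JSON) alongside the human report.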
From 7e51ab72c0dc09189609ea32fad32ee084bcfa24 Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 13:54:10 +0800
Subject: [PATCH 20/22] clean051502
---
frontends/subagent_dashboard.py | 622 --------------------------------
1 file changed, 622 deletions(-)
delete mode 100644 frontends/subagent_dashboard.py
diff --git a/frontends/subagent_dashboard.py b/frontends/subagent_dashboard.py
deleted file mode 100644
index 9376d9af..00000000
--- a/frontends/subagent_dashboard.py
+++ /dev/null
@@ -1,622 +0,0 @@
-"""
-Subagent cluster monitoring dashboard
-=====================================
-Streamlit-based, real-time monitoring of every active subagent.
-Supports: viewing progress, reading logs, remote intervention (stop / inject instructions).
-
-Launch:
-    streamlit run frontends/subagent_dashboard.py
-"""
-
-import os, sys, glob, time, subprocess, json, re
-from datetime import datetime, timedelta
-from pathlib import Path
-
-import streamlit as st
-
-# ── Path config ──
-CODE_ROOT = Path(__file__).resolve().parent.parent
-TEMP_DIR = CODE_ROOT / "temp"
-
-# ── Page config ──
-st.set_page_config(
- page_title="Subagent 集群监控",
- page_icon="🧠",
- layout="wide",
- initial_sidebar_state="expanded",
-)
-
-# ── Styling ──
-st.markdown("""
-
-""", unsafe_allow_html=True)
-
-
-# ═══════════════════════════════════════════════════════════════
-# Core functions
-# ═══════════════════════════════════════════════════════════════
-
-def get_subagent_dirs():
-    """Scan temp/ for subdirectories containing input.txt (treated as subagent task dirs)."""
- if not TEMP_DIR.exists():
- return []
- dirs = []
- for d in sorted(TEMP_DIR.iterdir()):
- if d.is_dir() and (d / "input.txt").exists():
- dirs.append(d)
- return dirs
-
-
-def get_running_pids():
-    """Get a {PID: command_line} map of all Python processes.
-    Priority: psutil → wmic → tasklist → empty (heuristic fallback)
-    """
- pid_map = {}
-
-    # ── Method 1: psutil (most accurate) ──
- try:
- import psutil
- for proc in psutil.process_iter(['pid', 'name', 'cmdline']):
- try:
- name = proc.info['name'] or ''
- if 'python' in name.lower():
- cmdline = ' '.join(proc.info['cmdline'] or [])
- pid_map[proc.info['pid']] = cmdline
- except (psutil.NoSuchProcess, psutil.AccessDenied):
- continue
- if pid_map:
- return pid_map
- except ImportError:
- pass
-
-    # ── Method 2: wmic (Windows) ──
- try:
- result = subprocess.run(
- 'wmic process where "name=\'python.exe\' or name=\'pythonw.exe\'" get processid,commandline /FORMAT:CSV',
- capture_output=True, text=True, timeout=5, creationflags=subprocess.CREATE_NO_WINDOW
- )
- for line in result.stdout.strip().split('\n')[1:]: # skip header
- if not line.strip(): continue
- parts = line.split(',', 2)
- if len(parts) >= 3:
- cmd = parts[1].strip('"')
- pid_str = parts[2].strip('"')
- if pid_str.isdigit():
- pid_map[int(pid_str)] = cmd
- if pid_map:
- return pid_map
- except Exception:
- pass
-
-    # ── Method 3: tasklist fallback ──
- try:
- result = subprocess.run(
- 'tasklist /NH /FI "IMAGENAME eq python.exe" /FO CSV',
- capture_output=True, text=True, timeout=5, creationflags=subprocess.CREATE_NO_WINDOW
- )
- for line in result.stdout.strip().split('\n'):
- if not line.strip(): continue
- parts = line.split(',')
- if len(parts) >= 2 and parts[1].strip('"').isdigit():
- pid_map[int(parts[1].strip('"'))] = "python.exe"
- except Exception:
- pass
-
-    # ── Method 4: nothing worked; return empty (heuristics downstream) ──
- return pid_map
-
-
-
-def _extract_model_name(task_dir: Path) -> str:
-    """Extract the model/session name from the head of stdout.log."""
- log_file = task_dir / "stdout.log"
- if not log_file.exists():
- return "未知"
- try:
- raw = log_file.read_bytes()
- for enc in ['utf-8', 'gbk', 'cp936']:
- try:
- head = raw.decode(enc)[:2000]
- break
- except (UnicodeDecodeError, LookupError):
- continue
- else:
- return "未知"
- m = re.search(r'Using session\s*\(([^)]+)\)', head)
- if m:
- return m.group(1)
- m = re.search(r'(?:model|model_name)\s*[:=]\s*["\']?([^"\'\\\s,;)]+)', head, re.IGNORECASE)
- if m:
- return m.group(1)
- return "未知"
- except Exception:
- return "未知"
-
-def get_subagent_status(task_dir: Path, pid_map: dict):
-    """
-    Determine a subagent's run status.
-    Returns: (status, pid, runtime, latest_output_preview, all_outputs,
-              stdout_log, stderr_log, input_text, enc_label)
-    """
-    # ── Read all files ──
- input_text = ""
- if (task_dir / "input.txt").exists():
- input_text = (task_dir / "input.txt").read_text(encoding='utf-8', errors='replace')[:300]
-
-    # Collect output files (sorted by number)
- output_files = sorted(
- (f for f in task_dir.glob("output*.txt") if f.name != "output.txt"),
- key=lambda f: int(re.search(r'(\d+)', f.stem).group(1)) if re.search(r'(\d+)', f.stem) else 0
- )
- all_outputs = []
- for f in output_files:
- try:
- content = f.read_text(encoding='utf-8', errors='replace')
- all_outputs.append({"file": f.name, "content": content})
- except Exception:
- all_outputs.append({"file": f.name, "content": "[读取失败]"})
-
- latest_output = all_outputs[-1]["content"] if all_outputs else ""
- latest_output_preview = latest_output[:800] if latest_output else "(无输出)"
-
-    # Read logs (with encoding detection)
-    def _detect_and_read(file_path: Path, tail_bytes: int = 3000):
-        """Read the file tail, degrading UTF-8 → GBK → replace."""
- if not file_path.exists():
- return "", "File Not Found"
- raw = file_path.read_bytes()
- for enc in ['utf-8', 'gbk', 'cp936']:
- try:
- text = raw.decode(enc)
- return text[-tail_bytes:], enc
- except (UnicodeDecodeError, LookupError):
- continue
-        # Last resort: replace
-        text = raw.decode('utf-8', errors='replace')
-        return text[-tail_bytes:], "utf-8(replace)"
-
- stdout_log = ""
- stdout_enc = ""
- if (task_dir / "stdout.log").exists():
- stdout_log, stdout_enc = _detect_and_read(task_dir / "stdout.log", 3000)
-
- stderr_log = ""
- stderr_enc = ""
- if (task_dir / "stderr.log").exists():
- stderr_log, stderr_enc = _detect_and_read(task_dir / "stderr.log", 2000)
-
-    # Build the encoding display label
- enc_label = ""
- if stdout_enc and stderr_enc:
- enc_label = f"stdout:{stdout_enc} | stderr:{stderr_enc}"
- elif stdout_enc:
- enc_label = f"stdout:{stdout_enc}"
-
-    # ── Determine status ──
-    # 1. Is there a _stop file?
-    if (task_dir / "_stop").exists():
-        return ("stopped", None, 0, latest_output_preview, all_outputs, stdout_log, stderr_log, input_text, enc_label)
-
-    # 2. Check whether the process is alive
-    pid = None
-    is_alive = False
-
-    # Exact match: the command line contains the task name plus agentmain/--task
- for p, cmdline in pid_map.items():
- if str(task_dir.name) in cmdline and ('agentmain' in cmdline or '--task' in cmdline):
- pid = p
- is_alive = True
- break
-
-    # 3. Heuristic check (when pid_map is empty or nothing matched)
-    if not is_alive:
-        # Did an output file update recently? (within 3 minutes → possibly alive)
-        # Exclude the unnumbered output.txt (file header only, not a round output)
- try:
- now = time.time()
- recent_files = [
- f for f in task_dir.glob("output*.txt")
- if re.search(r'output(\d+)\.txt$', f.name)
- and now - f.stat().st_mtime < 180
- ]
- if recent_files:
-                # A recently updated output file → assume still running
- is_alive = True
- except Exception:
- pass
-
-    # 4. Determine the run phase
-    has_round_end = "[ROUND END]" in latest_output if latest_output else False
-    has_reply_file = (task_dir / "reply.txt").exists()
-
-    if is_alive:
-        if has_round_end and not has_reply_file:
-            status = "waiting"  # waiting for a user reply
-        else:
-            status = "running"  # executing
-    else:
-        if latest_output:
-            status = "done"  # finished
-        else:
-            status = "stopped"  # never started properly
-
-    # Estimate runtime
- runtime = 0
- if output_files:
- try:
- first_mtime = output_files[0].stat().st_mtime
- runtime = int(time.time() - first_mtime)
- except Exception:
- pass
-
- return (status, pid, runtime, latest_output_preview, all_outputs, stdout_log, stderr_log, input_text, enc_label)
-
-
-def render_agent_card(task_dir: Path, status_info: tuple):
-    """Render the status card for a single subagent."""
- status, pid, runtime, latest_preview, all_outputs, stdout_log, stderr_log, input_text, enc_label = status_info
-
- status_emoji = {
- "running": "▶️", "waiting": "⏸️", "done": "✅", "stopped": "⏹️"
- }
- status_label = {
- "running": "运行中", "waiting": "等待回复", "done": "已完成", "stopped": "已停止"
- }
-
- agent_name = task_dir.name
- model_name = _extract_model_name(task_dir)
- emoji = status_emoji.get(status, "❓")
- label = status_label.get(status, "未知")
-
-    # Detect errors (stderr.log non-empty)
- has_error = bool(stderr_log and stderr_log.strip())
- card_class = status if status in status_label else "stopped"
- if has_error:
- card_class += " error"
-
- runtime_str = str(timedelta(seconds=runtime)) if runtime > 0 else "—"
-
- with st.container():
- st.markdown(f'', unsafe_allow_html=True)
-
-        # ── Title row (always visible) ──
- cols = st.columns([3, 1, 1.2, 1, 1, 1, 0.5])
- with cols[0]:
- st.markdown(f"### {emoji} {agent_name}")
- with cols[1]:
-            st.markdown(f'{label}', unsafe_allow_html=True)
- with cols[2]:
-            st.markdown(f'🤖 {model_name}', unsafe_allow_html=True)
- with cols[3]:
-            st.markdown(f'⏱ {runtime_str}', unsafe_allow_html=True)
- if pid:
- with cols[4]:
-                st.markdown(f'🆔 {pid}', unsafe_allow_html=True)
- with cols[5]:
- if status in ("running", "waiting"):
- if st.button(f"🛑 停止", key=f"stop_{agent_name}"):
- (task_dir / "_stop").write_text("", encoding='utf-8')
- st.rerun()
-
-        # Error badge
-        if has_error:
-            with cols[6]:
-                st.markdown('⚠️', unsafe_allow_html=True)
-
-        # ── Collapsible details (collapsed by default) ──
-        with st.expander("📋 详情", expanded=False):
-            # ── Task input preview ──
- with st.expander("📋 任务描述", expanded=False):
- st.code(input_text, language="text")
-
-            # ── Latest output summary ──
- st.markdown("**📄 最新输出**")
- st.text_area(
- label="最新输出",
- value=latest_preview,
- height=120,
- key=f"output_{agent_name}",
- label_visibility="collapsed",
- )
-
-            # ── Logs & details (collapsed) ──
- log_tab, output_tab, intervene_tab = st.tabs(["📋 日志", "📚 全部输出", "✏️ 干预"])
-
- with log_tab:
- if enc_label:
- st.caption(f"🔤 编码检测: {enc_label}")
- col1, col2 = st.columns(2)
- with col1:
- st.markdown("**stdout.log** (尾部)")
- st.code(stdout_log[-2000:], language="text", line_numbers=True)
- with col2:
- st.markdown("**stderr.log** (尾部)")
- st.code(stderr_log[-2000:] if stderr_log else "(空)", language="text", line_numbers=True)
-                # Log refresh button
- if st.button(f"🔄 刷新日志", key=f"refresh_log_{agent_name}"):
- st.rerun()
-
- with output_tab:
- for i, o in enumerate(all_outputs):
-                    expand = (i == len(all_outputs) - 1)  # newest entry expanded by default
- with st.expander(f"📄 {o['file']}", expanded=expand):
- st.text_area(
- label=f"完整输出 - {o['file']}",
- value=o["content"][:5000],
- height=200,
- key=f"full_output_{agent_name}_{i}",
- label_visibility="collapsed",
- )
-
- with intervene_tab:
- st.markdown("**写入干预指令**(subagent 下轮执行时会读取)")
- intervene_text = st.text_area(
- label="干预内容",
- value="",
- height=100,
- placeholder="例如: 停止搜索,改为整理已有数据...",
- key=f"intervene_input_{agent_name}",
- label_visibility="collapsed",
- )
- col1, col2, col3 = st.columns(3)
- with col1:
- if st.button(f"📨 发送干预", key=f"send_intervene_{agent_name}"):
- if intervene_text.strip():
- (task_dir / "_intervene").write_text(intervene_text.strip(), encoding='utf-8')
- st.success("✅ 干预指令已发送,下轮生效")
- st.rerun()
- with col2:
- st.markdown("**注入工作记忆**")
- keyinfo_text = st.text_input(
- label="key_info",
- value="",
- placeholder="注入到 working memory 的信息",
- key=f"keyinfo_{agent_name}",
- label_visibility="collapsed",
- )
- if st.button(f"🧠 注入记忆", key=f"send_keyinfo_{agent_name}"):
- if keyinfo_text.strip():
- (task_dir / "_keyinfo").write_text(keyinfo_text.strip(), encoding='utf-8')
- st.success("✅ 已注入工作记忆")
- st.rerun()
- with col3:
- if status == "waiting":
- reply_text = st.text_input(
- label="回复",
- value="",
- placeholder="给 subagent 的回复...",
- key=f"reply_{agent_name}",
- label_visibility="collapsed",
- )
- if st.button(f"💬 发送回复", key=f"send_reply_{agent_name}"):
- if reply_text.strip():
- (task_dir / "reply.txt").write_text(reply_text.strip(), encoding='utf-8')
- st.success("✅ 回复已发送,subagent 将继续执行")
- st.rerun()
-
-        st.markdown('', unsafe_allow_html=True)
-
-
-def render_cluster_overview(agent_statuses):
-    """Render the cluster overview."""
- total = len(agent_statuses)
- running = sum(1 for s in agent_statuses if s == "running")
- waiting = sum(1 for s in agent_statuses if s == "waiting")
- done = sum(1 for s in agent_statuses if s == "done")
- stopped = sum(1 for s in agent_statuses if s == "stopped")
-
- col1, col2, col3, col4, col5 = st.columns(5)
- with col1:
- st.metric("🧠 总数", total)
- with col2:
- st.metric("▶️ 运行中", running)
- with col3:
- st.metric("⏸️ 等待回复", waiting)
- with col4:
- st.metric("✅ 已完成", done)
- with col5:
- st.metric("⏹️ 已停止", stopped)
-
-
-# ═══════════════════════════════════════════════════════════════
-# Main UI
-# ═══════════════════════════════════════════════════════════════
-
-def main():
- st.title("🧠 Subagent 集群监控")
- st.markdown(f"> 代码根目录:`{CODE_ROOT}` 监控目录:`{TEMP_DIR}`")
-
-    # ── Sidebar ──
-    with st.sidebar:
-        st.markdown("### ⚙️ 控制面板")
-
-        # Auto refresh
- auto_refresh = st.checkbox("🔄 自动刷新 (3s)", value=True)
- refresh_interval = st.slider("刷新间隔(秒)", 1, 10, 3)
-
- st.divider()
-
- if st.button("🔄 手动刷新", use_container_width=True):
- st.rerun()
-
- st.divider()
- st.markdown("### 🚀 Agent 启动面板")
-
- with st.form("launch_agent_form", clear_on_submit=True):
- task_name = st.text_input("Task Name", placeholder="如:调研员", key="launch_name")
- task_prompt = st.text_area("Task Prompt", placeholder="输入任务描述...", height=100, key="launch_prompt")
- llm_no = st.selectbox(
- "模型选择",
- options=["默认 (0)", "模型1 (1)", "模型2 (2)", "模型3 (3)", "模型4 (4)"],
- index=0,
- key="launch_llm_no",
- help="对应 agentmain.py 的 --llm_no 参数,在 mykey.py 中定义多个 session config 后生效"
- )
- launched = st.form_submit_button("▶️ 启动 Agent", use_container_width=True, type="primary")
-
- if launched:
- if not task_name.strip():
- st.error("❌ Task Name 不能为空")
- elif not task_prompt.strip():
- st.error("❌ Task Prompt 不能为空")
- else:
- try:
- agentmain_path = CODE_ROOT / "agentmain.py"
-                    # Extract the number from the selectbox label
- selected_no = int(llm_no.split("(")[1].split(")")[0])
- cmd = [
- sys.executable, str(agentmain_path),
- "--task", task_name.strip(),
- "--input", task_prompt.strip(),
- "--llm_no", str(selected_no),
- "--bg"
- ]
- subprocess.Popen(
- cmd,
- cwd=str(CODE_ROOT),
- creationflags=subprocess.CREATE_NO_WINDOW
- )
- st.success(f"✅ Agent「{task_name}」启动中… (llm_no={selected_no})")
- time.sleep(0.5)
- st.rerun()
- except Exception as e:
- st.error(f"❌ 启动失败: {e}")
-
- st.divider()
- st.markdown("### 📖 文件协议")
-
- st.markdown("""
- **干预文件** (位于 `temp/{任务名}/`):
- - `_stop` — 停止 subagent (空文件)
- - `_intervene` — 追加指令 (文本)
- - `_keyinfo` — 注入工作记忆 (文本)
- - `reply.txt` — 回复等待中的 agent
-
- **输出文件**:
- - `output{n}.txt` — 每轮输出
- - `stdout.log` — 控制台日志
- - `stderr.log` — 错误日志
- """)
-
- st.divider()
- st.markdown(f"🕐 上次刷新:{datetime.now():%H:%M:%S}")
- st.caption("💡 提示:自动刷新时不要操作干预控件,避免冲突")
-
-    # ── Main area ──
-    main_placeholder = st.empty()
-
-    with main_placeholder.container():
-        # Fetch the subagent list
-        agent_dirs = get_subagent_dirs()
-
- if not agent_dirs:
- st.info("📭 当前没有活跃的 subagent。启动 subagent 后状态会在此显示。")
- st.markdown("""
- **如何启动 subagent?**
- ```bash
- cd D:\\open_claw_agent\\GenericAgent
- python agentmain.py --task 调研员 --input "你是调研员,请搜索..." --bg
- python agentmain.py --task 程序员 --input "你是程序员,请编写..." --bg
- ```
- """)
- if auto_refresh:
- time.sleep(refresh_interval)
- st.rerun()
- return
-
-        # Fetch process info
-        pid_map = get_running_pids()
-
-        # Collect every agent's status
-        agent_statuses = []
- for d in agent_dirs:
- status_info = get_subagent_status(d, pid_map)
- agent_statuses.append((d.name, status_info))
-
-        # ── Cluster overview ──
- st.markdown("### 📊 集群概览")
- statuses_only = [s[1][0] for s in agent_statuses]
- render_cluster_overview(statuses_only)
-
- st.divider()
-
-        # ── Per-agent cards ──
- st.markdown(f"### 🔍 详情 ({len(agent_dirs)} 个 agent)")
-
- for name, status_info in agent_statuses:
- task_dir = TEMP_DIR / name
- render_agent_card(task_dir, status_info)
- st.divider()
-
-        # ── Auto refresh ──
- if auto_refresh:
- time.sleep(refresh_interval)
- st.rerun()
-
-
-if __name__ == "__main__":
- main()
\ No newline at end of file
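The deleted dashboard drove subagents purely through marker files in each task directory (`_stop`, `_intervene`, `_keyinfo`, `reply.txt`). On the agent side, honouring that protocol can be sketched roughly as below; the function name and returned dict fields are illustrative, not taken from the repository:

```python
import tempfile
from pathlib import Path

def poll_control_files(task_dir: Path) -> dict:
    """Consume dashboard control files; intervention files are read once, then removed."""
    ctl = {"stop": (task_dir / "_stop").exists(), "intervene": None, "keyinfo": None}
    for key, name in (("intervene", "_intervene"), ("keyinfo", "_keyinfo")):
        f = task_dir / name
        if f.exists():
            ctl[key] = f.read_text(encoding="utf-8")
            f.unlink()  # one-shot: consumed on read so the next round sees nothing
    return ctl

# quick demonstration in a throwaway directory
with tempfile.TemporaryDirectory() as d:
    task = Path(d)
    (task / "_intervene").write_text("stop searching, summarize instead", encoding="utf-8")
    ctl = poll_control_files(task)
    print(ctl["stop"], "|", ctl["intervene"])
```

A file-based protocol like this keeps the dashboard and the agents fully decoupled: either side can be restarted without losing the other's state, at the cost of the ~3-minute staleness heuristics seen in `get_subagent_status`.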
From 1cad02bcb071eed51dc2be30608eafef4925fcbd Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Fri, 15 May 2026 14:16:04 +0800
Subject: [PATCH 21/22] =?UTF-8?q?=E8=87=AA=E4=B8=BB=E8=A1=8C=E5=8A=A8?=
=?UTF-8?q?=E7=9B=B8=E5=85=B3=E6=8C=89=E9=92=AE=E4=BF=AE=E5=A4=8D?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
---
frontends/stapp.py | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/frontends/stapp.py b/frontends/stapp.py
index c0ee0320..73549324 100644
--- a/frontends/stapp.py
+++ b/frontends/stapp.py
@@ -98,16 +98,16 @@ def _pet_hook(ctx):
if st.button("开始空闲自主行动"):
st.session_state.last_reply_time = int(time.time()) - 1800
st.session_state.autonomous_enabled = True
- st.toast("已将上次回复时间设为1800秒前"); st.rerun()
+ st.toast("已将上次回复时间设为1800秒前,自主行动已激活"); st.rerun(scope="app")
if st.session_state.autonomous_enabled:
if st.button("⏸️ 禁止自主行动"):
st.session_state.autonomous_enabled = False
- st.toast("⏸️ 已禁止自主行动"); st.rerun()
+ st.toast("⏸️ 已禁止自主行动"); st.rerun(scope="app")
st.caption("🟢 自主行动运行中,会在你离开它30分钟后自动进行")
else:
if st.button("▶️ 允许自主行动", type="primary"):
st.session_state.autonomous_enabled = True
- st.toast("✅ 已允许自主行动"); st.rerun()
+ st.toast("✅ 已允许自主行动"); st.rerun(scope="app")
st.caption("🔴 自主行动已停止")
with st.sidebar: render_sidebar()
From c056de8c85373e86435c83fb70f573db7c825b3b Mon Sep 17 00:00:00 2001
From: benechen <13817895035@126.com>
Date: Sat, 16 May 2026 08:25:11 +0800
Subject: [PATCH 22/22] learn_skill_from_cases v000
---
tools/__init__.py | 1 +
tools/learn_skill_from_cases/README.md | 61 +++
tools/learn_skill_from_cases/__init__.py | 1 +
tools/learn_skill_from_cases/__main__.py | 117 ++++
tools/learn_skill_from_cases/dir_manager.py | 130 +++++
.../eng_patterns_data.py | 166 ++++++
tools/learn_skill_from_cases/engine.py | 502 ++++++++++++++++++
7 files changed, 978 insertions(+)
create mode 100644 tools/__init__.py
create mode 100644 tools/learn_skill_from_cases/README.md
create mode 100644 tools/learn_skill_from_cases/__init__.py
create mode 100644 tools/learn_skill_from_cases/__main__.py
create mode 100644 tools/learn_skill_from_cases/dir_manager.py
create mode 100644 tools/learn_skill_from_cases/eng_patterns_data.py
create mode 100644 tools/learn_skill_from_cases/engine.py
diff --git a/tools/__init__.py b/tools/__init__.py
new file mode 100644
index 00000000..84694d4c
--- /dev/null
+++ b/tools/__init__.py
@@ -0,0 +1 @@
+# tools package - utility modules
diff --git a/tools/learn_skill_from_cases/README.md b/tools/learn_skill_from_cases/README.md
new file mode 100644
index 00000000..c0b186bf
--- /dev/null
+++ b/tools/learn_skill_from_cases/README.md
@@ -0,0 +1,61 @@
+# learn_skill_from_cases — English-only Skill Learning CLI
+
+A streamlined skill learning tool. **English input only** — provide skill names in pure English.
+
+## Usage
+
+```bash
+# Learn a skill
+python -m tools.learn_skill_from_cases "docker_compose_production"
+
+# List learned skills
+python -m tools.learn_skill_from_cases --list
+
+# Show skill details
+python -m tools.learn_skill_from_cases --show docker_compose_production
+
+# Dry run (preview without creating files)
+python -m tools.learn_skill_from_cases "python_async" --dry-run
+
+# Force refresh (skip inheriting previous patterns)
+python -m tools.learn_skill_from_cases "neo4j_modeling" --force
+
+# Show version
+python -m tools.learn_skill_from_cases --version
+```
+
+## Environment Variables
+
+| Variable | Default | Description |
+|---|---|---|
+| `SKILL_LLM_ENABLE` | `0` | Set to `1` to enable LLM enhancement |
+| `LLM_API_BASE` | `http://localhost:11434/v1` | OpenAI-compatible API endpoint |
+| `LLM_API_KEY` | — | API key if required |
+| `LLM_MODEL` | `qwen2.5:7b` | Model name |
+| `LLM_TIMEOUT` | `30` | HTTP timeout in seconds |
+
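Assuming these variables are read with plain `os.environ.get` lookups against the defaults above, the resolution logic can be sketched as follows (the `llm_config` helper is hypothetical, not part of the tool):

```python
import os

def llm_config(env=None):
    """Resolve LLM settings from an environment mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        "enabled": env.get("SKILL_LLM_ENABLE", "0") == "1",
        "api_base": env.get("LLM_API_BASE", "http://localhost:11434/v1"),
        "model": env.get("LLM_MODEL", "qwen2.5:7b"),
        "timeout": int(env.get("LLM_TIMEOUT", "30")),
    }

# With an empty mapping, every documented default applies
print(llm_config({}))
```

Passing an explicit mapping (instead of always reading `os.environ`) keeps the helper easy to test.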
+## Output Structure
+
+```
+GA_ROOT/skills_learning/
+ └── {skill_name}/
+ ├── rev{N}/
+ │ ├── meta.json
+ │ ├── cases/all_cases.json
+ │ ├── patterns/knowledge_patterns.json
+ │ ├── tools/assess.py
+ │ ├── reports/learning_report.md
+ │ ├── reports/skill_definition.json
+ │ └── practice/
+ └── ...
+```
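Revision directories follow the `rev{N}` naming scheme. A minimal standalone sketch of parsing such names into sorted version numbers (illustrative only; the tool's own logic lives in `dir_manager.get_versions`):

```python
import re

def parse_rev_numbers(names):
    """Extract N from names like 'rev1' or 'rev12'; ignore anything else."""
    versions = []
    for name in names:
        m = re.fullmatch(r"rev(\d+)", name)
        if m:
            versions.append(int(m.group(1)))
    return sorted(versions)

print(parse_rev_numbers(["rev2", "notes", "rev10", "rev1"]))  # [1, 2, 10]
```

Numeric sorting matters here: lexicographic ordering would place `rev10` before `rev2`.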
+
+## Phase Flow
+
+The tool runs a six-phase pipeline (Phase 0 through Phase 5):
+
+1. **Bootstrap** (Phase 0) — create the version directory
+2. **Define** (Phase 1) — fetch the skill definition
+3. **Search** (Phase 2) — collect web cases
+4. **Extract** (Phase 3) — derive knowledge patterns
+5. **Assess** (Phase 4) — generate the assessment tool
+6. **Validate** (Phase 5) — run the assessment and score the result
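The final score produced in the validation phase combines a knowledge-test score and a pattern-coverage score; per the generated `assess.py`, the weighting is 60/40, which can be sketched as:

```python
def overall_score(knowledge: float, coverage: float) -> float:
    """Weighted overall score used by the generated assess.py (60% knowledge, 40% coverage)."""
    return knowledge * 0.6 + coverage * 0.4

print(overall_score(80.0, 50.0))  # 68.0
```

A run can therefore pass the 60-point threshold on knowledge alone only if the knowledge score reaches 100.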
diff --git a/tools/learn_skill_from_cases/__init__.py b/tools/learn_skill_from_cases/__init__.py
new file mode 100644
index 00000000..1ad94e4c
--- /dev/null
+++ b/tools/learn_skill_from_cases/__init__.py
@@ -0,0 +1 @@
+"""learn_skill_from_cases — English-only skill learning from cases (simplified version)"""
diff --git a/tools/learn_skill_from_cases/__main__.py b/tools/learn_skill_from_cases/__main__.py
new file mode 100644
index 00000000..562753c5
--- /dev/null
+++ b/tools/learn_skill_from_cases/__main__.py
@@ -0,0 +1,117 @@
+"""
+__main__.py — learn_skill_from_cases CLI entry point
+
+Usage:
+ python -m tools.learn_skill_from_cases "docker_compose_production"
+ python -m tools.learn_skill_from_cases --list
+ python -m tools.learn_skill_from_cases "python_async" --dry-run
+ python -m tools.learn_skill_from_cases "neo4j_modeling" --force
+ python -m tools.learn_skill_from_cases --version
+ python -m tools.learn_skill_from_cases --show docker_compose_production
+"""
+import sys, argparse, re, json
+from pathlib import Path
+
+GA_ROOT = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(GA_ROOT))
+
+from tools.learn_skill_from_cases import dir_manager
+
+
+def validate_english_only(name: str):
+ """Reject skill names containing CJK characters. English only."""
+    if re.search(r'[\u3000-\u30ff\u4e00-\u9fff\uff00-\uffef]', name):
+ print("Error: Skill name must be in English only.")
+ print(" Chinese characters, Japanese characters, and mixed-language inputs are not supported.")
+ print(" Please provide a pure English skill name (e.g., 'docker_compose_production').")
+ sys.exit(1)
+
+
+def cmd_list():
+ """List all learned skills with version info."""
+ skills = dir_manager.get_all_skills()
+ if not skills:
+ print("No skills learned yet. Use:")
+ print(' python -m tools.learn_skill_from_cases "your_skill_name"')
+ return
+ print(f"\nLearned skills ({len(skills)} total):")
+ print("-" * 55)
+ for skill in skills:
+ versions = dir_manager.get_versions(skill)
+ print(f" {skill:30s} rev{versions[-1] if versions else '--'}")
+
+
+def cmd_show(skill_name: str):
+ """Show details of a specific skill (version list + patterns)."""
+ skill_dir = dir_manager.get_skill_dir(skill_name)
+ if not skill_dir.exists():
+ print(f"Skill '{skill_name}' not found.")
+ return
+ versions = dir_manager.get_versions(skill_name)
+ if not versions:
+ print(f"Skill '{skill_name}' has no versions.")
+ return
+ print(f"\nSkill: {skill_name}")
+ print("=" * 55)
+ for v in versions:
+ print(f" rev{v}")
+ patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
+ if patterns_file.exists():
+ try:
+ patterns = json.loads(patterns_file.read_text(encoding="utf-8"))
+ for p in patterns:
+ print(f" [{p.get('level','?')}] {p.get('principle','?')[:70]}")
+ except Exception:
+ pass
+
+
+def main():
+ parser = argparse.ArgumentParser(
+ description="learn_skill_from_cases — English-only skill learning from cases (simplified)",
+ formatter_class=argparse.RawDescriptionHelpFormatter
+ )
+ parser.add_argument("skill_name", nargs="?", help="English skill name to learn (e.g., docker_compose_production)")
+ parser.add_argument("--list", action="store_true", help="List all learned skills")
+ parser.add_argument("--show", metavar="NAME", help="Show details of a learned skill")
+ parser.add_argument("--dry-run", action="store_true", help="Preview without creating files")
+ parser.add_argument("--force", action="store_true", help="Skip inherited patterns, start fresh")
+ parser.add_argument("--version", action="store_true", help="Show version")
+
+ args = parser.parse_args()
+
+ # Handle special commands
+ if args.version:
+ print("learn_skill_from_cases v1.0.0 (simplified English-only version)")
+ return
+
+ if args.list:
+ cmd_list()
+ return
+
+ if args.show:
+ cmd_show(args.show)
+ return
+
+ # Must have a skill name
+ if not args.skill_name:
+ parser.print_help()
+ print("\nError: Please provide a skill name or use --list.")
+ sys.exit(1)
+
+ # Validate: English only
+ validate_english_only(args.skill_name)
+
+ # Run the learning pipeline
+ from tools.learn_skill_from_cases.engine import run
+ ctx = run(args.skill_name, dry_run=args.dry_run, force=args.force)
+
+ if ctx.get("score", 0) >= 60:
+ print(f"\n Learning score: {ctx['score']:.1f}/100 — Good result!")
+ elif ctx.get("score", 0) > 0:
+ print(f"\n Learning score: {ctx['score']:.1f}/100 — Consider adding more cases.")
+ else:
+ print(f"\n Score not available. Review the output above.")
+
+
+if __name__ == "__main__":
+ main()
diff --git a/tools/learn_skill_from_cases/dir_manager.py b/tools/learn_skill_from_cases/dir_manager.py
new file mode 100644
index 00000000..4e65bb2a
--- /dev/null
+++ b/tools/learn_skill_from_cases/dir_manager.py
@@ -0,0 +1,130 @@
+"""
+dir_manager.py — Skill version directory management (simplified, English-only)
+
+Responsibilities: detect existing versions, create revN directories, inherit previous patterns.
+"""
+import json, re
+from pathlib import Path
+
+GA_ROOT = Path(__file__).resolve().parents[2]
+SKILL_LEARN_ROOT = GA_ROOT / "skills_learning"
+
+
+def _sanitize_skill_name(skill_name: str) -> str:
+ """Sanitize skill name: only allow alphanumeric, underscore, hyphen. No path traversal."""
+ sanitized = re.sub(r'[^\w\-]', '_', skill_name)
+ sanitized = sanitized.strip('_')
+ return sanitized or "unnamed_skill"
+
+
+def _list_dirs(parent: Path) -> list[Path]:
+ if not parent.exists():
+ return []
+ return [d for d in parent.iterdir() if d.is_dir()]
+
+
+def get_versions(skill_name: str) -> list[int]:
+ """Get existing version numbers for a skill, e.g. [1, 2, 3]"""
+ skill_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
+ versions = []
+ for d in _list_dirs(skill_dir):
+ if d.name.startswith("rev"):
+ try:
+ versions.append(int(d.name[3:]))
+ except ValueError:
+ pass
+ return sorted(versions)
+
+
+def next_version(skill_name: str) -> int:
+ """Return the next version number."""
+ versions = get_versions(skill_name)
+ return (max(versions) + 1) if versions else 1
+
+
+def ensure_root_exists():
+ """Ensure skills_learning/ root directory exists."""
+ if not SKILL_LEARN_ROOT.exists():
+ SKILL_LEARN_ROOT.mkdir(parents=True, exist_ok=True)
+ print(" [OK] skills_learning/ root directory created")
+
+
+def get_skill_dir(skill_name: str) -> Path:
+ """Return skill directory (path injection protected)."""
+ return SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name)
+
+
+def get_latest_revision_dir(skill_name: str) -> Path | None:
+ """Return the latest rev directory that has knowledge patterns."""
+ safe_name = _sanitize_skill_name(skill_name)
+ versions = get_versions(safe_name)
+ if not versions:
+ return None
+ skill_dir = SKILL_LEARN_ROOT / safe_name
+ for v in reversed(versions):
+ patterns_file = skill_dir / f"rev{v}" / "patterns" / "knowledge_patterns.json"
+ if patterns_file.exists():
+ return skill_dir / f"rev{v}"
+ return skill_dir / f"rev{versions[-1]}"
+
+
+def get_latest_patterns(skill_name: str) -> list[dict]:
+ """Inherit knowledge patterns from the latest revision."""
+ latest = get_latest_revision_dir(skill_name)
+ if latest is None:
+ return []
+ patterns_file = latest / "patterns" / "knowledge_patterns.json"
+ if patterns_file.exists():
+ with open(patterns_file, encoding="utf-8") as f:
+ return json.load(f)
+ return []
+
+
+def get_latest_cases(skill_name: str) -> list[dict]:
+ """Inherit cases from the latest revision."""
+ latest = get_latest_revision_dir(skill_name)
+ if not latest:
+ return []
+ cases_file = latest / "cases" / "all_cases.json"
+ if cases_file.exists():
+ try:
+ with open(cases_file, encoding="utf-8") as f:
+ data = json.load(f)
+ return data if isinstance(data, list) else [data]
+ except (json.JSONDecodeError, OSError):
+ pass
+ return []
+
+
+def create_revision_dir(skill_name: str, version: int) -> Path:
+ """
+ Create revN directory structure:
+ revN/
+ ├── meta.json
+ ├── cases/
+ ├── patterns/
+ ├── tools/
+ ├── reports/
+ └── practice/
+ """
+ rev_dir = SKILL_LEARN_ROOT / _sanitize_skill_name(skill_name) / f"rev{version}"
+ subdirs = ["cases", "patterns", "tools", "practice", "reports"]
+ for s in subdirs:
+ (rev_dir / s).mkdir(parents=True, exist_ok=True)
+
+    from datetime import date  # local import; stamp the actual creation date rather than a placeholder
+    meta = {
+        "skill": skill_name,
+        "version": version,
+        "created_at": date.today().isoformat(),
+        "status": "in_progress"
+    }
+ with open(rev_dir / "meta.json", "w", encoding="utf-8") as f:
+ json.dump(meta, f, indent=2)
+ return rev_dir
+
+
+def get_all_skills() -> list[str]:
+ """Get all skill names under skills_learning/."""
+ if not SKILL_LEARN_ROOT.exists():
+ return []
+    return sorted(d.name for d in _list_dirs(SKILL_LEARN_ROOT))
diff --git a/tools/learn_skill_from_cases/eng_patterns_data.py b/tools/learn_skill_from_cases/eng_patterns_data.py
new file mode 100644
index 00000000..22ed8f10
--- /dev/null
+++ b/tools/learn_skill_from_cases/eng_patterns_data.py
@@ -0,0 +1,166 @@
+"""
+eng_patterns_data.py — Static pattern dictionaries for learn_skill_from_cases engine.
+
+Extracted from engine.py to keep core logic lean and allow easy maintenance/expansion.
+"""
+# ============================================================
+# Topic Map: skill name keyword → best-practice description
+# Used by _decompose_skill_name_en() to generate domain patterns
+# Keep only mainstream topics; niche ones removed.
+# ============================================================
+TOPIC_MAP: dict[str, str] = {
+ "deploy": "Deployment automation & release management best practices",
+ "production": "Production-ready configuration & environment management",
+ "docker": "Containerization & Docker orchestration best practices",
+ "kubernetes": "Kubernetes cluster management & pod orchestration",
+ "k8s": "Kubernetes cluster management & pod orchestration",
+ "api": "API design, versioning & documentation best practices",
+ "rest": "RESTful API design & HTTP protocol best practices",
+ "database": "Database schema design & query optimization",
+ "sql": "SQL query optimization & relational data modeling",
+ "python": "Python code organization & packaging best practices",
+ "async": "Async programming patterns & concurrency management",
+ "testing": "Test strategy & automation framework best practices",
+ "monitor": "Monitoring & observability stack implementation",
+ "security": "Security hardening & vulnerability management",
+ "frontend": "Frontend architecture & component design patterns",
+ "backend": "Backend service architecture & middleware patterns",
+ "microservice": "Microservice decomposition & inter-service communication",
+ "devops": "CI/CD pipeline design & infrastructure as code",
+ "ci": "Continuous integration pipeline configuration",
+ "cd": "Continuous deployment strategies & rollback patterns",
+ "data": "Data pipeline architecture & ETL best practices",
+ "machine": "Machine learning pipeline & model lifecycle management",
+ "automation": "Workflow automation & task scheduling patterns",
+}
+
+# Keywords to scan from case titles (used by _decompose_skill_name_en)
+CASE_SCAN_KEYWORDS: list[str] = [
+ "deploy", "docker", "kubernetes", "monitoring", "testing",
+ "security", "api", "database", "async", "microservice",
+ "pipeline", "automation", "config", "devops", "ci", "cd",
+]
+
+# ============================================================
+# Core Patterns: domain → best-practice principles
+# Used by _extract_patterns() to produce knowledge patterns
+# Keep only high-impact, cross-domain patterns.
+# ============================================================
+CORE_PATTERNS: dict[str, dict] = {
+ "production": {
+ "keywords": ["production", "deploy", "prod", "release"],
+ "principles": [
+ ("Use environment variables / config files to separate environments", "P_env_separation", 89),
+ ("Pin dependency versions to avoid unexpected upgrades", "P_pin_version", 94),
+ ("Set resource limits to prevent single service starvation", "P_resource_limits", 85),
+ ]
+ },
+ "testing": {
+ "keywords": ["test", "validate", "verify", "lint"],
+ "principles": [
+ ("Validate configuration files before deployment", "P_config_validation", 93),
+ ("Write unit tests for core business logic", "P_unit_test", 87),
+ ("Use integration tests to verify component interactions", "P_integration_test", 85),
+ ]
+ },
+ "security": {
+ "keywords": ["security", "auth", "encrypt", "secret", "permission"],
+ "principles": [
+ ("Never hardcode secrets; use secret management tools", "P_secret_mgmt", 95),
+ ("Apply principle of least privilege for service accounts", "P_least_privilege", 90),
+ ("Enable TLS/SSL for all service communications", "P_tls", 88),
+ ]
+ },
+ "database": {
+ "keywords": ["database", "query", "index", "schema", "migration"],
+ "principles": [
+ ("Use database migrations for schema changes", "P_db_migration", 90),
+ ("Add indexes for frequently queried columns", "P_db_index", 88),
+ ("Use connection pooling to manage database connections", "P_connection_pool", 85),
+ ]
+ },
+}
+
+
+# ============================================================
+# Assessment Code Generator
+# Renders the self-contained assess.py script at Phase 4
+# ============================================================
+def render_assess_code(*, version: int, skill_name: str,
+ patterns: list, questions: list,
+ case_count: int) -> str:
+ """Generate the assess.py script content as a string."""
+ import json
+ patterns_json = json.dumps(patterns, indent=2)
+ questions_json = json.dumps(questions, indent=2)
+ return f'''#!/usr/bin/env python3
+"""learn_skill_from_cases rev{version} -- {skill_name} Assessment Tool
+Auto-generated | Knowledge test + Pattern coverage
+"""
+import json, sys, os, random
+from pathlib import Path
+
+PATTERNS = {patterns_json}
+QUESTIONS = {questions_json}
+
+def run_knowledge_test():
+ """Run knowledge test and compute score."""
+ if not QUESTIONS:
+ return 0, []
+ per_q = 100.0 / len(QUESTIONS)
+ score = 0
+ results = []
+ border = "-" * 50
+ print(f"\\n{{border}}")
+ print(f" Knowledge Test ({{len(QUESTIONS)}} questions)")
+ print(f"{{border}}")
+
+ for qi, q in enumerate(QUESTIONS):
+ p = PATTERNS[qi] if qi < len(PATTERNS) else {{}}
+ level = p.get("level", "basic") if isinstance(p, dict) else "basic"
+ confidence = p.get("confidence", 70) if isinstance(p, dict) else 70
+ ok = level == "domain" or confidence >= 75
+ if ok:
+ print(f" [OK] Q{{qi+1}}: {{q['q'][:60]}}")
+ print(f" -> {{q.get('explain', '')[:60]}}")
+ score += per_q
+ results.append(True)
+ else:
+ print(f" [!] Q{{qi+1}}: {{q['q'][:60]}}")
+ print(f" -> SKIP (low confidence)")
+ results.append(False)
+ return score, results
+
+def run_pattern_coverage():
+ """Check which patterns are covered by cases."""
+ covered = 0
+ for p in PATTERNS:
+ print(f" [{{'OK' if p.get('level') != 'basic' else '??'}}] {{p.get('principle', '?')[:60]}}")
+ if p.get('level') != 'basic':
+ covered += 1
+ total = len(PATTERNS) or 1
+ return (covered / total) * 100
+
+def main():
+ print(f"\\n{{'='*55}}")
+ print(f" Assessment: rev{version} -- {skill_name}")
+ print(f"{{'='*55}}")
+ print(f" Cases collected: {case_count}")
+ print(f" Patterns extracted: {{len(PATTERNS)}}")
+
+ knowledge_score, _ = run_knowledge_test()
+ coverage_score = run_pattern_coverage()
+ overall = (knowledge_score * 0.6 + coverage_score * 0.4)
+
+ print(f"\\n{{'='*55}}")
+ print(f" RESULTS")
+ print(f"{{'='*55}}")
+ print(f" Knowledge Test: {{knowledge_score:.1f}}/100")
+ print(f" Pattern Coverage: {{coverage_score:.1f}}/100")
+ print(f" Overall Score: {{overall:.1f}}/100")
+ print(f"{{'='*55}}\\n")
+ return overall
+
+if __name__ == "__main__":
+ main()
+'''
diff --git a/tools/learn_skill_from_cases/engine.py b/tools/learn_skill_from_cases/engine.py
new file mode 100644
index 00000000..09a980bd
--- /dev/null
+++ b/tools/learn_skill_from_cases/engine.py
@@ -0,0 +1,502 @@
+"""
+engine.py — Simplified skill learning engine (English-only)
+
+Six-phase flow (Phase 0 to Phase 5):
+ Phase 0: Bootstrap + directory creation
+ Phase 1: Skill definition (skill_search lookup)
+ Phase 2: Case collection (skill_search + web search)
+ Phase 3: Pattern extraction & knowledge refinement
+ Phase 4: Assessment tool generation
+ Phase 5: Validation & report
+"""
+import sys, os, json, subprocess, random
+from pathlib import Path
+
+GA_ROOT = Path(__file__).resolve().parents[2]
+sys.path.insert(0, str(GA_ROOT))
+
+from tools.learn_skill_from_cases import dir_manager
+from tools.learn_skill_from_cases.eng_patterns_data import TOPIC_MAP, CASE_SCAN_KEYWORDS, CORE_PATTERNS, render_assess_code
+
+
+# ===============================================================
+# Phase 0: Bootstrap
+# ===============================================================
+def _ensure_env(ctx: dict):
+ """Phase 0 — Ensure environment is ready."""
+ print("\n" + ("=" * 55))
+ print(" Phase 0: Bootstrap")
+ print("=" * 55)
+ dir_manager.ensure_root_exists()
+ version = dir_manager.next_version(ctx["skill_name"])
+ rev_dir = dir_manager.create_revision_dir(ctx["skill_name"], version)
+ ctx["version"] = version
+ ctx["rev_dir"] = rev_dir
+ print(f" Skill: {ctx['skill_name']}")
+ print(f" Version: rev{version}")
+ print(f" Directory: {rev_dir}")
+ print(" [OK] Environment ready")
+
+
+# ===============================================================
+# Phase 1: Skill Definition
+# ===============================================================
+def _import_skill_search():
+ """Lazy import skill_search, return None if unavailable."""
+ try:
+ from skill_search import search
+ return search
+ except Exception:
+ return None
+
+
+def _phase1_define(ctx: dict):
+ """Phase 1 — Define the skill by looking up known knowledge."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 1: Skill Definition")
+ print("-" * 55)
+
+ ctx["skill_definition"] = {
+ "name": ctx["skill_name"],
+ "description": "",
+ "tags": [],
+ "source": "user_input"
+ }
+
+ search_fn = _import_skill_search()
+ if search_fn:
+ try:
+ results = search_fn(ctx["skill_name"].replace("_", " "), top_k=5)
+ if results:
+ best = results[0]
+ s = best.skill
+ ctx["skill_definition"]["description"] = (s.description or "")[:500]
+ ctx["skill_definition"]["tags"] = (s.tags or [])[:10]
+ ctx["skill_definition"]["key"] = s.key
+ ctx["skill_definition"]["source"] = "skill_search"
+ print(f" Found: {s.key}")
+ if s.description:
+ print(f" Description: {s.description[:100]}...")
+ else:
+ print(f" No results from skill_search")
+ except Exception as e:
+ print(f" skill_search: [FAIL] {e}")
+ else:
+ print(f" skill_search not available")
+
+ # Write definition
+ def_file = ctx["rev_dir"] / "reports" / "skill_definition.json"
+ with open(def_file, "w", encoding="utf-8") as f:
+ json.dump(ctx["skill_definition"], f, indent=2, ensure_ascii=False)
+ print(" [OK] Definition saved")
+
+
+# ===============================================================
+# Phase 2: Case Collection
+# ===============================================================
+def _import_web_search():
+ """Simple import of web search; return None if unavailable."""
+ try:
+ from tools.metaso_search import metaso_search as fn
+ return fn
+ except Exception:
+ return None
+
+
+def _generate_search_queries(skill_name: str) -> list[str]:
+ """Generate English search queries for a skill name."""
+ name = skill_name.replace("_", " ").title()
+ return [
+ f"{name} tutorial",
+ f"{name} how to use",
+ f"{name} examples guide",
+ f"{name} best practices",
+ f"{name} getting started",
+ f"learn {name}",
+ ]
+
+
+def _phase2_search(ctx: dict):
+ """Phase 2 — Collect cases from skill_search + web search."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 2: Case Collection")
+ print("-" * 55)
+
+ all_cases = []
+
+ # Channel A: Skill Hub
+ search_fn = _import_skill_search()
+ if search_fn:
+ try:
+ results = search_fn(ctx["skill_name"].replace("_", " "), top_k=10)
+ skill_cases = []
+ for r in results:
+ s = r.skill
+ if hasattr(s, 'key') and not s.key.startswith("agentskill_skills/"):
+ skill_cases.append({
+ "source": "skill_hub", "type": "skill_def",
+ "key": s.key,
+ "description": (s.description[:300] if s.description else ""),
+ "tags": s.tags[:5] if s.tags else [],
+ })
+ all_cases.extend(skill_cases)
+ print(f" Skill Hub: {len(skill_cases)} results")
+ except Exception as e:
+ print(f" Skill Hub: [FAIL] {e}")
+
+ # Channel B: Web Search
+ web_engine = _import_web_search()
+ if web_engine:
+ try:
+ queries = _generate_search_queries(ctx["skill_name"])
+ web_cases = []
+ seen_urls = set()
+ seen_titles = set()
+ for q in queries:
+ results = web_engine(q, size=5)
+ for r in results:
+ url = r.get("url", "")
+ title = r.get("title", "").strip()
+ if url and url not in seen_urls and title not in seen_titles:
+ seen_urls.add(url)
+ seen_titles.add(title or url)
+ web_cases.append({
+ "source": "web",
+ "type": "web_article",
+ "title": title,
+ "url": url,
+ "snippet": r.get("snippet", "")[:300]
+ })
+ all_cases.extend(web_cases)
+ print(f" Web Search: {len(web_cases)} unique results")
+ except Exception as e:
+ print(f" Web Search: [FAIL] {e}")
+ else:
+ print(" Web Search: engine unavailable")
+
+ # Inherit previous cases
+ if os.environ.get("SKILL_FORCE_REFRESH") != "1":
+ inherited = dir_manager.get_latest_cases(ctx["skill_name"])
+ if inherited:
+ seen_keys = {c.get("url") or c.get("key") or "" for c in all_cases}
+ added = 0
+ for c in inherited:
+ key = c.get("url") or c.get("key") or ""
+ if key and key not in seen_keys:
+ all_cases.append(c)
+ seen_keys.add(key)
+ added += 1
+ print(f" Inherited from prev revision: +{added} cases")
+
+ # Save
+ cases_file = ctx["rev_dir"] / "cases" / "all_cases.json"
+ with open(cases_file, "w", encoding="utf-8") as f:
+ json.dump(all_cases, f, indent=2, ensure_ascii=False)
+ ctx["cases"] = all_cases
+ print(f" Total cases: {len(all_cases)}")
+ print(" [OK] Cases saved")
+
+
+# ===============================================================
+# Phase 3: Pattern Extraction (English only)
+# ===============================================================
+def _decompose_skill_name_en(skill_name: str, cases: list | None = None) -> list[tuple[str, int]]:
+ """Generate sub-topic patterns from an English skill name."""
+ words = [w for w in skill_name.replace("_", " ").replace("-", " ").split() if len(w) > 2]
+
+ topic_map = TOPIC_MAP
+
+ sub_patterns = []
+ seen = set()
+ for word in words:
+ for keyword, pattern_text in topic_map.items():
+            if keyword in word.lower():
+ if keyword not in seen:
+ seen.add(keyword)
+ sub_patterns.append((pattern_text, 78))
+
+ # Extract keywords from case titles
+ case_keywords_found = set()
+ cases = cases or []
+ for c in cases:
+ text = (c.get("title", "") + " " + c.get("snippet", "")).lower()
+ for term in CASE_SCAN_KEYWORDS:
+ if term in text and term not in seen:
+ case_keywords_found.add(term)
+
+ for kw in case_keywords_found:
+ display = topic_map.get(kw, f"{kw.title()} related best practices ({skill_name})")
+ sub_patterns.append((display, 72))
+ seen.add(kw)
+
+ if not sub_patterns:
+ generic = [
+ f"{skill_name} core concepts & terminology",
+ f"{skill_name} common scenarios & solutions",
+ f"{skill_name} toolchain & environment setup",
+ ]
+ sub_patterns = [(s, 70) for s in generic]
+
+ return sub_patterns[:6]
+
+
+def _extract_patterns(ctx: dict):
+ """Phase 3 — Extract knowledge patterns from collected cases."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 3: Pattern Extraction")
+ print("-" * 55)
+
+ cases = ctx.get("cases", [])
+ skill_name = ctx["skill_name"]
+ all_text = " ".join(
+ str(v) for c in cases for v in c.values() if isinstance(v, str)
+ ).lower()
+
+ # Core pattern library (from eng_patterns_data)
+ core_patterns = CORE_PATTERNS
+
+ patterns = []
+ seen_ids = set()
+
+ # Match core patterns against case text
+ for category, info in core_patterns.items():
+ for kw in info["keywords"]:
+ if kw in all_text:
+ for principle, pid, conf in info["principles"]:
+ if pid not in seen_ids:
+ patterns.append({"id": pid, "principle": principle, "confidence": conf, "level": "basic"})
+ seen_ids.add(pid)
+ break
+
+ # Add domain patterns from skill name decomposition
+ sub_ideas = _decompose_skill_name_en(skill_name, cases=cases)
+ for i, (sub_name, conf) in enumerate(sub_ideas):
+ pid = f"P_domain_{i+1}"
+ if pid not in seen_ids:
+ patterns.append({
+ "id": pid,
+ "principle": sub_name,
+ "confidence": conf,
+ "level": "domain"
+ })
+ seen_ids.add(pid)
+
+ # Inherit patterns from previous version
+ if os.environ.get("SKILL_FORCE_REFRESH") != "1":
+ inherited = dir_manager.get_latest_patterns(skill_name)
+ if inherited:
+ added = 0
+ for p in inherited:
+ pid = p.get("id")
+ if pid and pid not in seen_ids:
+ patterns.append({
+ "id": pid, "principle": p["principle"],
+ "confidence": max(p.get("confidence", 50) - 5, 50),
+ "level": "inherited"
+ })
+ seen_ids.add(pid)
+ added += 1
+ print(f" Inherited: +{added} patterns from prev revision")
+
+ if not patterns:
+ # Fallback: generate generic patterns
+ patterns = [
+ {"id": "P_generic_1", "principle": f"Core concepts of {skill_name}", "confidence": 70, "level": "basic"},
+ {"id": "P_generic_2", "principle": f"Best practices for {skill_name} setup", "confidence": 70, "level": "basic"},
+ {"id": "P_generic_3", "principle": f"Common pitfalls in {skill_name}", "confidence": 65, "level": "basic"},
+ ]
+
+ # Save
+ patterns_file = ctx["rev_dir"] / "patterns" / "knowledge_patterns.json"
+ with open(patterns_file, "w", encoding="utf-8") as f:
+ json.dump(patterns, f, indent=2, ensure_ascii=False)
+ ctx["patterns"] = patterns
+ print(f" Patterns extracted: {len(patterns)}")
+ for p in patterns:
+ print(f" [{p['level']:>9}] {p['principle'][:60]}")
+ print(" [OK] Patterns saved")
+
+
+# ===============================================================
+# Phase 4: Generate Assessment Tool
+# ===============================================================
+def _generate_assessment(ctx: dict):
+ """Phase 4 — Generate an inline assessment script."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 4: Generate Assessment")
+ print("-" * 55)
+
+ patterns = ctx.get("patterns", [])
+ case_count = len(ctx.get("cases", []))
+ skill_name = ctx["skill_name"]
+ version = ctx["version"]
+
+ # Build questions from patterns
+ questions = []
+ pattern_texts = [p.get("principle", "?") for p in patterns]
+ n = len(pattern_texts)
+ generic_fillers = [
+ "Clean up temp files regularly to free disk space",
+ "Use type annotations to improve code readability",
+ "Add unit tests to ensure code quality",
+ "Document API endpoints for team collaboration",
+ ]
+
+ for i, p in enumerate(patterns):
+ principle = p.get("principle", "")
+ scenario = pattern_texts[(i + 1) % n][:60] if n > 1 else principle[:60]
+ correct_text = principle[:60]
+
+ others = [pattern_texts[j][:60] for j in range(n) if j != i and j != (i + 1) % n]
+ random.shuffle(others)
+ wrongs = others[:3]
+ while len(wrongs) < 3:
+ wrongs.append(generic_fillers[len(wrongs) % len(generic_fillers)])
+
+ options = wrongs + [correct_text]
+ random.shuffle(options)
+ correct_idx = options.index(correct_text)
+ labels = ["A", "B", "C", "D"]
+
+ questions.append({
+ "q": f"Which approach is best for: {scenario}?",
+ "a": options[0], "b": options[1], "c": options[2], "d": options[3],
+ "answer": labels[correct_idx],
+ "explain": f"Best practice: {principle}"
+ })
+
+ # Generate assess.py via template
+ assess_code = render_assess_code(
+ version=version, skill_name=skill_name,
+ patterns=patterns, questions=questions,
+ case_count=case_count
+ )
+
+ assess_file = ctx["rev_dir"] / "tools" / "assess.py"
+ with open(assess_file, "w", encoding="utf-8") as f:
+ f.write(assess_code)
+
+ ctx["assess_file"] = assess_file
+ print(f" Generated: tools/assess.py ({len(questions)} questions)")
+ print(" [OK] Assessment generated")
+
+
+# ===============================================================
+# Phase 5: Validation & Report
+# ===============================================================
+def _phase5_validate(ctx: dict):
+ """Phase 5 — Run validation and generate learning report."""
+ print(f"\n{'-' * 55}")
+ print(" Phase 5: Validation & Report")
+ print("-" * 55)
+
+ assess_file = ctx.get("assess_file")
+ if assess_file and assess_file.exists():
+ try:
+ result = subprocess.run(
+ [sys.executable, str(assess_file)],
+ capture_output=True, text=True, timeout=60,
+ cwd=str(ctx["rev_dir"])
+ )
+ print(result.stdout)
+ if result.stderr:
+ print(f" [STDERR] {result.stderr[:200]}")
+
+ # Parse overall score from output
+ score = 0.0
+ for line in result.stdout.split("\n"):
+ if "Overall Score:" in line:
+ try:
+ score = float(line.split(":")[1].strip().split("/")[0])
+ except ValueError:
+ pass
+ ctx["score"] = score
+ print(f" Validation score: {score:.1f}/100")
+ except subprocess.TimeoutExpired:
+ print(" [FAIL] Validation timed out")
+ ctx["score"] = 0
+ except Exception as e:
+ print(f" [FAIL] Validation error: {e}")
+ ctx["score"] = 0
+ else:
+ print(" No assess.py found, skipping validation")
+ ctx["score"] = 0
+
+    # Generate learning report
+    from datetime import date  # local import; the report should carry the actual run date
+    report = f"""# Learning Report: {ctx['skill_name']} (rev{ctx['version']})
+
+## Summary
+- **Skill**: {ctx['skill_name']}
+- **Version**: rev{ctx['version']}
+- **Date**: {date.today().isoformat()}
+- **Cases collected**: {len(ctx.get('cases', []))}
+- **Patterns extracted**: {len(ctx.get('patterns', []))}
+- **Validation score**: {ctx.get('score', 0):.1f}/100
+
+## Patterns
+"""
+ for p in ctx.get("patterns", []):
+ report += f"- [{p.get('level', 'basic')}] {p.get('principle', '?')} (confidence: {p.get('confidence', 0)})\n"
+
+ report += f"""
+## Next Steps
+1. Review extracted patterns and adjust confidence levels if needed
+2. Add more targeted web searches for uncovered topics
+3. Re-run learning with `--force` for a fresh start
+4. Apply learned patterns in real projects
+"""
+
+ report_file = ctx["rev_dir"] / "reports" / "learning_report.md"
+ with open(report_file, "w", encoding="utf-8") as f:
+ f.write(report)
+ print(f" Report saved: reports/learning_report.md")
+ print(f" [OK] rev{ctx['version']} complete!")
+
+
+# ===============================================================
+# Main Orchestrator
+# ===============================================================
+def run(skill_name: str, dry_run: bool = False, force: bool = False) -> dict:
+ """
+ Run the full 5-phase skill learning pipeline.
+
+ Args:
+ skill_name: English skill name to learn (e.g., "docker_compose_production")
+ dry_run: If True, only show what would be done
+ force: If True, skip inherited patterns/cases
+
+ Returns:
+ Context dict with all phase results
+ """
+ if force:
+ os.environ["SKILL_FORCE_REFRESH"] = "1"
+
+ ctx = {
+ "skill_name": skill_name,
+ "version": 0,
+ "rev_dir": None,
+ "cases": [],
+ "patterns": [],
+ "score": 0,
+ "dry_run": dry_run,
+ }
+
+ if dry_run:
+ print(f"\n{'=' * 55}")
+ print(f" DRY RUN: {skill_name}")
+ print(f"{'=' * 55}")
+ version = dir_manager.next_version(skill_name)
+ rev_dir = dir_manager.get_skill_dir(skill_name) / f"rev{version}"
+ print(f" Would create: {rev_dir}")
+ print(f" Would run: Phase 1-5 pipeline")
+ print(f" [OK] Dry run complete (no changes made)")
+ return ctx
+
+ _ensure_env(ctx)
+ _phase1_define(ctx)
+ _phase2_search(ctx)
+ _extract_patterns(ctx)
+ _generate_assessment(ctx)
+ _phase5_validate(ctx)
+
+ return ctx