feat(p2p,realtime,functions,model): major update for A-level tasks - p2p node snapshot / reasoning parser / realtime ephemeral key / distributed cache #10311

Open
ghshhf wants to merge 9 commits into mudler:master from ghshhf:master

ghshhf commented Jun 13, 2026

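Change summary

P2P

  • Add a NodeConfig struct that centralizes P2P configuration (replacing scattered environment-variable reads)
  • discoveryTunnels returns a full node snapshot (chan []schema.NodeData)
  • Add ReplaceNodes, which replaces the local node view wholesale with the latest snapshot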
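
To make the snapshot semantics concrete, here is a minimal sketch of a wholesale replace. It is an illustration under assumptions, not the code in this PR: `NodeData` is a hypothetical stand-in for `schema.NodeData` (its real fields may differ), and `nodeView`, `newNodeView`, and `GetNode` are invented names.

```go
package main

import "sync"

// NodeData is a hypothetical stand-in for schema.NodeData.
type NodeData struct {
	ID            string
	TunnelAddress string
}

// nodeView holds the local view of known nodes, keyed by node ID.
type nodeView struct {
	mu    sync.RWMutex
	nodes map[string]NodeData
}

func newNodeView() *nodeView {
	return &nodeView{nodes: map[string]NodeData{}}
}

// ReplaceNodes builds a fresh map from the snapshot and swaps it in under
// the write lock, so readers see either the old view or the new one.
// Nodes absent from the snapshot disappear from the view.
func (v *nodeView) ReplaceNodes(snapshot []NodeData) {
	next := make(map[string]NodeData, len(snapshot))
	for _, n := range snapshot {
		next[n.ID] = n
	}
	v.mu.Lock()
	v.nodes = next
	v.mu.Unlock()
}

// GetNode looks up a single node in the current view.
func (v *nodeView) GetNode(id string) (NodeData, bool) {
	v.mu.RLock()
	defer v.mu.RUnlock()
	n, ok := v.nodes[id]
	return n, ok
}
```

Unlike an append-only add, a node that is missing from a single snapshot is dropped immediately, so the behavior depends on every snapshot being complete.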

Reasoning parser

  • Extend XMLToolCallFormat in parse.go with thinking/reasoning-related fields
  • iterative_parser.go supports stripping reasoning content and an initial reasoning block

Realtime API

  • RealtimeSessions issues 60s short-lived HMAC tokens (lai-sess prefix)
  • The Realtime handshake is validated via Authorization Bearer or the ?session= parameter
  • The transcription model is optional; any-to-any is detected automatically

Distributed cache

  • replicaCache caches the most recently used model replica per modelID
  • Fewer repeated FindAndLockNodeWithModel DB calls

Also included

  • Related improvements to metrics, monitoring, worker, templates, and the mcp httpapi

Testing

  • Local go vet passes
  • Build passes (go build ./...)

…altime ephemeral key/distributed cache

- A1: NodeConfig struct + discoveryTunnels full node snapshot + ReplaceNodes
- A2: three new XMLToolCallFormat fields + reasoning support in ParseMsgWithXMLToolCalls
- A3: RealtimeSessions 60s short-lived HMAC tokens + optional transcription model + any-to-any detection
- A4: replicaCache distributed cache layer, fewer repeated FindAndLockNodeWithModel DB calls
- Also: related improvements to metrics/monitoring/worker/templates/mcp httpapi
@localai-bot

Collaborator

Thanks @ghshhf. There's useful work here (the GetConfigEndpoint/MCP wiring, OTel Shutdown, and context plumbing are fine), but I can't merge this as-is.

Please open an issue first for changes this large. This bundles ~7 unrelated features into one squash. Anything touching P2P, auth, or the distributed hot path needs a design discussion before the code. Split it into one PR per concern.

Blockers:

  1. replicaCache is dead code. StartReplicaCache() is never called, so the ~150-line cache and LoadModel rewrite are inert. If wired up, caching one *Model per modelID for 5s re-pins all traffic to a single replica — the exact bug the comment I deleted warned about. The comments also describe machinery that doesn't exist (in-flight counters, a refresher goroutine). Remove it or implement it properly with tests.

  2. Reasoning-parser fields are no-ops. In iterative_parser.go the if format.ReasoningInContent {…} else {…} branches are byte-identical. ReasoningInContent/ReasoningFormat get added to the public schema but do nothing. Implement or drop them. (ThinkingForcedOpen is actually wired.)

  3. P2P cancellation can flap. Phase 3 cancels+deletes any service missing from the current ledger snapshot, and ReplaceNodes wholesale-replaces the view — a transient ledger gap will tear down a live tunnel. The old append-only AddNode was resilient to this; needs testing under churn. (The muservice.Lock()/Unlock() there also guards only comments.)

  4. No tests. The HMAC handshake and P2P snapshot/cancel ship with zero tests — this trips the coverage gate, and the sign/validate round-trip is trivially testable.

Minor on the realtime keys (HMAC itself is sound): not actually single-use despite the comment, the signed userID is discarded so it isn't bound to a user, and several response fields are fabricated vs the OpenAI schema.

Last thing: the comment style suggests AI assistance — fine, but per .agents/ai-coding-assistants.md it needs an Assisted-by: trailer, and every line should be human-reviewed/tested. The dead code suggests that didn't happen.

If you want to proceed: open an issue, then start with the uncontroversial subset (config-yaml endpoint, OTel shutdown, context plumbing) as a small first PR.

ghshhf and others added 8 commits June 14, 2026 05:49
… error

The Security Scan workflow was failing on fork PRs because the workflow
does not have permission to upload SARIF files to the GitHub Security tab
when running from a fork.

This change adds '!github.repository.fork' checks to all steps
to prevent the workflow from running on fork repositories.

This fix should be applied to the main repository so that
all forks inherit the correct configuration.

Fixes mudler#10322, mudler#10318, mudler#10320, mudler#10321
- Replace 18 event_TODO placeholders with uuid.New().String() in realtime endpoints
- Mount TokenMetricsEndpoint at POST /v1/tokenMetrics route
- Add PCM format support to AudioConvert (ffmpeg)
- Implement capability filtering in HTTP MCP client ListInstalledModels
- Use standard TraceMiddleware on backend monitor routes
- get_token_metrics.go: endpoint is actually registered in routes
- dashboard.go: token validation is already implemented via base64 decode
…odeConfig tests

- realtime.go: remove `PreviousItemID: "TODO"` placeholder that leaked
  into the API response, matching the client-initiated commit behavior
- p2p_test.go: add 17 tests covering NewNodeConfigFromEnv, GenerateToken,
  generateNewConnectionData, nodeID, stringsToMultiAddr, AddNode, GetNode,
  GetAvailableNodes, ReplaceNodes, and default service ID handling
Covers isTemplateFile, NewTemplateLoader, ListTemplates (empty dir,
multiple files, hidden files, directories, non-tmpl files, nonexistent
dir, caching, invalidate, concurrent access), Resolve (existing,
with/without .tmpl suffix, non-existent, empty dir, path traversal),
and Invalidate (cache clear, concurrent).
Before committing each VAD-detected utterance, trim audio samples
before the speech onset (from segments[0].Start) down to at most
the configured prefix_padding_ms. This prevents long stretches of
silence from being included in the transcribed audio.
- realtime.go: replace XXX comment with design note; add
  conversation.LastItemID() helper to replace hardcoded 'TODO'
  PreviousItemID; messageItemID() helper for ID extraction
- evaluator.go: remove stale TODO comment about continue statement
- video.go: implement content-type-based extension detection
  instead of hardcoding .mp4; added videoExtFromContentType helper
- chat.go: clarify TODO about jinja template parsing as design note
- schema/request.go: remove dead code and unnecessary TODO
# Conflicts:
#	core/http/endpoints/openai/realtime.go
