
feat: Support fetching embedding model lists from OpenAI- and Gemini-compatible providers #7736

Open

Sisyphbaous-DT-Project wants to merge 4 commits into AstrBotDevs:master from Sisyphbaous-DT-Project:feat/openai-embedding-model-discovery

Conversation

Contributor

@Sisyphbaous-DT-Project commented Apr 23, 2026

Background

Previously, when configuring an embedding model, users could only type the embedding_model value by hand.
My idea is to make this work like the chat providers: let users pick from a fetched model list.

This PR fills that gap. The dashboard can now pull the list of available models based on the currently entered API Key and Base URL,
so users can select a model directly when adding or editing a provider.

Changes

This PR does three things:

  • openai_embedding can fetch the model list via a custom embedding_api_base and embedding_api_key
  • gemini_embedding gains the same model-list discovery capability
  • The dashboard reuses the existing generic config interaction: embedding_model is now a dropdown selection, with manual input still available

The compatibility logic includes a few simple safeguards:

  • For OpenAI-compatible providers, items that are clearly not embedding models are filtered out first
  • For Gemini-compatible providers, model capabilities are checked first to see whether embedContent is supported
  • If the returned metadata is incomplete, it falls back to a looser name-based check
  • If a model still cannot be classified, all models are listed, so usable models are never hidden by mistake

The original manual entry path is preserved:
if a compatible platform does not support the model-list endpoint, or returns an incomplete format, the model ID can still be entered manually.
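The fallback chain described above can be sketched roughly like this. Note this is an illustrative sketch, not the PR's actual code: the helper name `filter_embedding_models` and the hint lists are assumptions.

```python
# Illustrative sketch of the OpenAI-compatible fallback filtering;
# hint lists and helper names are assumptions, not the PR's real code.
EMBEDDING_HINTS = ("embed", "embedding")
NON_EMBEDDING_HINTS = ("gpt", "chat", "whisper", "tts", "dall-e")


def filter_embedding_models(model_ids: list[str]) -> list[str]:
    # First pass: drop items that are clearly not embedding models.
    candidates = [
        m for m in model_ids
        if not any(h in m.lower() for h in NON_EMBEDDING_HINTS)
    ]
    # Looser name-based pass: keep models whose name hints at embeddings.
    named = [
        m for m in candidates
        if any(h in m.lower() for h in EMBEDDING_HINTS)
    ]
    if named:
        return named
    # Last resort: return everything left, so usable models are not hidden.
    return candidates or model_ids
```

With this shape, a list like `["text-embedding-3-small", "gpt-4o"]` keeps only the embedding model, while a list of unrecognizable names passes through unchanged.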

测试

这次补了对应测试,并且已经实际跑过

  • OpenAI 嵌入模型列表获取测试
  • Gemini 嵌入模型列表获取测试
  • Dashboard 通用接口测试

我本地验证结果是

  • uv run ruff check 通过
  • uv run python -m pytest tests/test_openai_embedding_source.py tests/test_gemini_embedding_source.py tests/test_dashboard.py -k "get_embedding_models or openai_embedding_get_models or gemini_embedding_get_models" -q 通过
  • 结果为 5 passed, 22 deselected

The web UI looks like this:
IMG_20260423_141504.png

Summary by Sourcery

Add embedding model discovery support for embedding providers and expose it through the dashboard configuration API and UI.

New Features:

  • Expose a new dashboard API endpoint to fetch available embedding models for a given embedding provider configuration.
  • Enable OpenAI-compatible embedding providers to list and filter available embedding models based on model naming hints.
  • Enable Gemini-compatible embedding providers to list and filter available embedding models based on model capabilities and naming patterns.
  • Add a configurable dashboard control that lets users auto-detect and select embedding models from a fetched list while still allowing manual input.

Enhancements:

  • Ensure embedding providers used for dimension and model discovery are properly terminated to release resources.
  • Extend the embedding provider base class with an optional model discovery interface for implementations to override.
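A minimal sketch of what such an optional discovery hook might look like. The class and method names follow the PR description; the bodies here are assumptions for illustration:

```python
class EmbeddingProvider:
    """Base class sketch; concrete providers may override get_models()."""

    async def get_models(self) -> list[str]:
        # Providers without discovery support raise NotImplementedError,
        # which the dashboard maps to a "please enter the model ID
        # manually" error response.
        raise NotImplementedError


class StaticEmbeddingProvider(EmbeddingProvider):
    """Hypothetical provider that does support discovery."""

    async def get_models(self) -> list[str]:
        return ["text-embedding-3-small", "text-embedding-3-large"]
```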

Tests:

  • Add unit tests for OpenAI embedding providers' model discovery behavior and fallbacks.
  • Add unit tests for Gemini embedding providers' model discovery behavior and fallbacks.
  • Add dashboard API tests covering successful, unsupported, and error cases when fetching embedding models.

@dosubot Bot added the size:L (This PR changes 100-499 lines, ignoring generated files) and area:provider (The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner) labels on Apr 23, 2026
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Hey - I've found 2 issues, and left some high level feedback:

  • In the dashboard route handlers (get_embedding_dim and get_embedding_models), the terminate logic in the finally blocks is duplicated and only handles async terminate; consider extracting a shared helper that also safely calls synchronous terminate implementations if any providers use them.
  • In AstrBotConfig.vue#getEmbeddingModels, errors are only logged to the console; it would be more consistent with the rest of the UI to surface a toast error on request failure instead of failing silently for the user.
  • The success toast in getEmbeddingModels uses a hard-coded English message (Fetched: n); consider using the existing i18n utilities so the new embedding-model discovery flow is localized like the rest of the dashboard.
## Individual Comments

### Comment 1
<location path="tests/test_dashboard.py" line_range="1246-1255" />
<code_context>
+    assert _DiscoverableEmbeddingProvider.terminate_calls == 1
+
+
+@pytest.mark.asyncio
+async def test_get_embedding_models_unsupported_returns_error(
+    app: Quart,
+    authenticated_header: dict,
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test where `get_models` raises a generic exception to exercise the `获取嵌入模型列表失败` error path.

Right now we only cover a successful provider and one that raises `NotImplementedError`. There’s also a generic `except Exception as e` branch in `ConfigRoute.get_embedding_models` that logs and returns `获取嵌入模型列表失败: {e}`. Please add a test using a provider whose `get_models()` raises a different exception (e.g. `RuntimeError("boom")`), and assert that:

- The response status is `"error"`,
- The message includes `获取嵌入模型列表失败`, and
- The provider’s `terminate` method is still called once.

This will exercise the generic error path and verify cleanup on runtime failures.

Suggested implementation:

```python
@pytest.mark.asyncio
async def test_get_embedding_models_success_and_terminate(
    app: Quart,
    authenticated_header: dict,
    monkeypatch,
):
    from astrbot.core.provider.register import provider_cls_map

    _DiscoverableEmbeddingProvider.terminate_calls = 0
    monkeypatch.setitem(
        provider_cls_map,
        "test_embedding_discovery",
        SimpleNamespace(cls_type=_DiscoverableEmbeddingProvider),
    )
    # ... (the success-case request and assertions from the existing test continue here)


class _ErrorEmbeddingProvider(_DiscoverableEmbeddingProvider):
    async def get_models(self) -> list[str]:
        raise RuntimeError("boom")


@pytest.mark.asyncio
async def test_get_embedding_models_runtime_error_returns_error_and_terminate(
    app: Quart,
    authenticated_header: dict,
    monkeypatch,
):
    from astrbot.core.provider.register import provider_cls_map

    _ErrorEmbeddingProvider.terminate_calls = 0
    monkeypatch.setitem(
        provider_cls_map,
        "test_embedding_runtime_error",
        SimpleNamespace(cls_type=_ErrorEmbeddingProvider),
    )

    test_client = app.test_client()
    resp = await test_client.get(
        "/api/config/embedding_models",  # adjust path if different in existing tests
        headers=authenticated_header,
    )

    assert resp.status_code == 200
    data = await resp.get_json()
    assert data["status"] == "error"
    assert "获取嵌入模型列表失败" in data["message"]
    assert _ErrorEmbeddingProvider.terminate_calls == 1

```

1. Ensure the URL used in `test_client.get(...)` matches the actual route for `ConfigRoute.get_embedding_models`; update `"/api/config/embedding_models"` to the correct path if your existing tests use a different one.
2. Align the JSON key assertions (`status`, `message`) with the response schema used elsewhere in `test_dashboard.py` (e.g. if responses are nested like `data["data"]["status"]`, adjust accordingly).
3. If other tests create the Quart test client differently (e.g. via a fixture), adapt `test_client` acquisition to match the existing pattern.
</issue_to_address>

### Comment 2
<location path="astrbot/dashboard/routes/config.py" line_range="923" />
<code_context>
+                except Exception:
+                    logger.warning("释放嵌入 provider 资源失败")
+
+    async def get_embedding_models(self):
+        """根据临时 provider_config 获取可用嵌入模型列表"""
+        post_data = await request.json
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the common embedding-provider setup, validation, and teardown into a shared helper so these handlers only implement their specific operations.

You can reduce duplication and make both methods clearer by extracting the shared “resolve + init + teardown” lifecycle into a small helper, then passing in the operation to perform.

For example:

```python
async def _with_embedding_provider(self, provider_config, fn):
    from astrbot.core.provider.provider import EmbeddingProvider
    from astrbot.core.provider.register import provider_cls_map

    provider_type = provider_config.get("type")
    if not provider_type:
        return Response().error("provider_config 缺少 type 字段").__dict__

    if provider_type not in provider_cls_map:
        try:
            self.core_lifecycle.provider_manager.dynamic_import_provider(provider_type)
        except ImportError:
            logger.error(traceback.format_exc())
            return Response().error("提供商适配器加载失败").__dict__

    if provider_type not in provider_cls_map:
        return (
            Response()
            .error(f"未找到适用于 {provider_type} 的提供商适配器")
            .__dict__
        )

    provider_metadata = provider_cls_map[provider_type]
    cls_type = provider_metadata.cls_type
    if not cls_type:
        return Response().error(f"无法找到 {provider_type} 的类").__dict__

    inst = cls_type(provider_config, {})
    if not isinstance(inst, EmbeddingProvider):
        return Response().error("提供商不是 EmbeddingProvider 类型").__dict__

    init_fn = getattr(inst, "initialize", None)
    if inspect.iscoroutinefunction(init_fn):
        await init_fn()

    try:
        return await fn(inst)
    finally:
        terminate_fn = getattr(inst, "terminate", None)
        if inspect.iscoroutinefunction(terminate_fn):
            try:
                await terminate_fn()
            except Exception:
                logger.warning("释放嵌入 provider 资源失败")
```

Then `get_embedding_dim` and `get_embedding_models` become small and focused:

```python
async def get_embedding_dim(self):
    post_data = await request.json
    provider_config = post_data.get("provider_config", None)
    if not provider_config:
        return Response().error("缺少参数 provider_config").__dict__

    try:
        async def _op(inst):
            # existing logic to compute dim (including get_dim / embedding call)
            dim = await inst.get_dim()
            return Response().ok({"embedding_dimensions": dim}).__dict__

        return await self._with_embedding_provider(provider_config, _op)
    except Exception as e:
        logger.error(traceback.format_exc())
        return Response().error(f"获取嵌入维度失败: {e!s}").__dict__
```

```python
async def get_embedding_models(self):
    post_data = await request.json
    provider_config = post_data.get("provider_config", None)
    if not provider_config:
        return Response().error("缺少参数 provider_config").__dict__

    try:
        async def _op(inst):
            try:
                models = await inst.get_models()
            except NotImplementedError:
                return (
                    Response()
                    .error("当前提供商暂不支持自动获取模型列表,请手动填写模型 ID")
                    .__dict__
                )

            models = sorted(dict.fromkeys(models or []))
            return Response().ok({"models": models}).__dict__

        return await self._with_embedding_provider(provider_config, _op)
    except Exception as e:
        logger.error(traceback.format_exc())
        return Response().error(f"获取嵌入模型列表失败: {e!s}").__dict__
```

This keeps all behavior (including dynamic import, type validation, async initialize/terminate, and per-method error messages) while removing the duplicated control flow and `try/except/finally` boilerplate from the two public handlers.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@Sisyphbaous-DT-Project force-pushed the feat/openai-embedding-model-discovery branch from c66b9b0 to 647c798 on April 23, 2026 03:06
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment
Code Review

This pull request introduces automatic discovery of embedding models for Gemini and OpenAI providers, along with a new dashboard UI component that uses a combobox for model selection. It also implements error handling in the OpenAI provider to automatically downgrade tool_choice from required to auto when it conflicts with thinking mode. A critical issue was identified in the Gemini embedding provider where the model list iteration was implemented synchronously instead of using an async pager, which would likely result in runtime errors.

Contributor Author @Sisyphbaous-DT-Project:

@sourcery-ai review

Contributor Author @Sisyphbaous-DT-Project:

/gemini review

Contributor

@sourcery-ai sourcery-ai Bot left a comment
Hey - I've found 2 issues, and left some high level feedback:

  • In GeminiEmbeddingProvider.get_models, using async for model in await self.client.models.list() tightly couples to your test fake; consider iterating directly over self.client.models.list() or handling both async-iterable and awaited-return cases to better match the real Gemini client API.
  • The logic for dynamically instantiating an EmbeddingProvider in get_embedding_dim and get_embedding_models is nearly identical; extracting this into a shared helper would reduce duplication and keep future changes (e.g., provider loading/initialization rules) consistent.
  • In AstrBotConfig.vue, the success toast message in getEmbeddingModels is hard-coded and not localized (Fetched: ${...}); consider using your i18n utilities (and possibly a more user-friendly message than just a count) to keep the dashboard UI consistent with the rest of the app.
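The first point above can be addressed defensively by handling both an async-iterable and an awaited-return `models.list()`. The following is a sketch under that assumption; the fake client shape is illustrative, not the real Gemini SDK:

```python
import inspect


async def list_models(client) -> list[str]:
    # models.list() may return an async pager directly (as the real
    # Gemini client does) or an awaitable (as in some test fakes);
    # handle both cases rather than coupling to one shape.
    result = client.models.list()
    if inspect.isawaitable(result):
        result = await result
    names: list[str] = []
    if hasattr(result, "__aiter__"):
        async for model in result:
            names.append(model.name)
    else:
        for model in result:
            names.append(model.name)
    return names
```

This keeps the test fake working while also matching a client whose `list()` returns an async pager directly.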
## Individual Comments

### Comment 1
<location path="tests/test_dashboard.py" line_range="1230-1239" />
<code_context>
+class _UnsupportedEmbeddingProvider(EmbeddingProvider):
</code_context>
<issue_to_address>
**suggestion (testing):** Consider asserting the user-facing error message for unsupported providers

In the unsupported-provider case, the test only asserts `data["status"] == "error"`. Since the implementation returns a specific message when `get_models` is not implemented (e.g. `"当前提供商暂不支持自动获取模型列表,请手动填写模型 ID"`), please also assert on `data["message"]` (or the appropriate field) so changes to the error wording or branch are caught by tests.

Suggested implementation:

```python
    assert data["status"] == "error"
    assert data["message"] == "当前提供商暂不支持自动获取模型列表,请手动填写模型 ID"

```

I assumed the test that uses `_UnsupportedEmbeddingProvider` currently only asserts `data["status"] == "error"` and that the error message is exposed as `data["message"]`. If your response payload uses a different key (e.g. `data["detail"]` or is nested under another field), adjust the added assertion accordingly to target the correct field.
</issue_to_address>

### Comment 2
<location path="astrbot/dashboard/routes/config.py" line_range="923" />
<code_context>
+                except Exception:
+                    logger.warning("释放嵌入 provider 资源失败")
+
+    async def get_embedding_models(self):
+        """根据临时 provider_config 获取可用嵌入模型列表"""
+        post_data = await request.json
</code_context>
<issue_to_address>
**issue (complexity):** Consider extracting the shared embedding provider setup/teardown logic into a reusable helper to simplify both endpoints and avoid duplication.

You can eliminate most of the new complexity by centralizing the provider lifecycle into a small helper and reusing it in both endpoints.

For example, extract the common bootstrap + init + terminate logic:

```python
async def _with_embedding_provider(self, provider_config, handler):
    from astrbot.core.provider.provider import EmbeddingProvider
    from astrbot.core.provider.register import provider_cls_map

    provider_type = provider_config.get("type")
    if not provider_type:
        return Response().error("provider_config 缺少 type 字段").__dict__

    if provider_type not in provider_cls_map:
        try:
            self.core_lifecycle.provider_manager.dynamic_import_provider(provider_type)
        except ImportError:
            logger.error(traceback.format_exc())
            return Response().error("提供商适配器加载失败").__dict__

    if provider_type not in provider_cls_map:
        return (
            Response()
            .error(f"未找到适用于 {provider_type} 的提供商适配器")
            .__dict__
        )

    metadata = provider_cls_map[provider_type]
    cls_type = metadata.cls_type
    if not cls_type:
        return Response().error(f"无法找到 {provider_type} 的类").__dict__

    inst = cls_type(provider_config, {})
    if not isinstance(inst, EmbeddingProvider):
        return Response().error("提供商不是 EmbeddingProvider 类型").__dict__

    init_fn = getattr(inst, "initialize", None)
    if inspect.iscoroutinefunction(init_fn):
        await init_fn()

    try:
        return await handler(inst)
    finally:
        terminate_fn = getattr(inst, "terminate", None)
        if inspect.iscoroutinefunction(terminate_fn):
            try:
                await terminate_fn()
            except Exception:
                logger.warning("释放嵌入 provider 资源失败")
```

Then `get_embedding_dim` becomes focused only on its core logic:

```python
async def get_embedding_dim(self):
    post_data = await request.json
    provider_config = post_data.get("provider_config")
    if not provider_config:
        return Response().error("缺少参数 provider_config").__dict__

    async def _dim_handler(inst):
        text = "test"
        vec = await inst.get_embedding(text)
        dim = len(vec)
        logger.info(
            f"检测到 {provider_config.get('id', 'unknown')} 的嵌入向量维度为 {dim}",
        )
        return Response().ok({"embedding_dimensions": dim}).__dict__

    try:
        return await self._with_embedding_provider(provider_config, _dim_handler)
    except Exception as e:
        logger.error(traceback.format_exc())
        return Response().error(f"获取嵌入维度失败: {e!s}").__dict__
```

And `get_embedding_models` reuses the same helper:

```python
async def get_embedding_models(self):
    post_data = await request.json
    provider_config = post_data.get("provider_config")
    if not provider_config:
        return Response().error("缺少参数 provider_config").__dict__

    async def _models_handler(inst):
        try:
            models = await inst.get_models()
        except NotImplementedError:
            return (
                Response()
                .error("当前提供商暂不支持自动获取模型列表,请手动填写模型 ID")
                .__dict__
            )
        models = sorted(dict.fromkeys(models or []))
        return Response().ok({"models": models}).__dict__

    try:
        return await self._with_embedding_provider(provider_config, _models_handler)
    except Exception as e:
        logger.error(traceback.format_exc())
        return Response().error(f"获取嵌入模型列表失败: {e!s}").__dict__
```

This keeps all current behavior (dynamic import, type checks, initialize/terminate, error messages) but removes duplicated lifecycle/plumbing code, making future embedding endpoints easier to add and read.
</issue_to_address>


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment
Code Review

This pull request implements automatic model discovery for embedding providers, allowing users to fetch available models directly from the dashboard. Key changes include the addition of a get_models method to the EmbeddingProvider base class, specific implementations for Gemini and OpenAI, and a new dashboard API endpoint. The UI is updated with a combobox and an "Auto Detect" button for model selection. Review feedback suggests refactoring duplicated provider initialization logic in the dashboard routes and using more idiomatic Python sets for list deduplication.

Contributor Author

I have reviewed and addressed this round of bot review comments item by item.

Fixed, with added tests:

  • Added a user-facing error message assertion for the case where automatic model-list fetching is unsupported
  • Corresponding commit: 1ac60e87

Evaluated but deliberately not fixed in this PR:

  • Extracting the shared lifecycle of get_embedding_dim / get_embedding_models into a helper
  • Switching deduplication from dict.fromkeys to set

Rationale:

  • Both items are structural or stylistic optimizations and do not affect functional correctness
  • The goal of this PR is to ship the feature and strengthen tests with minimal changes, avoiding an extra refactoring surface
  • These optimizations can be handled in a separate follow-up refactoring PR
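For reference, the two deduplication styles mentioned above produce the same result once sorted; the practical difference is that `dict.fromkeys` preserves first-seen order before sorting:

```python
models = ["text-embedding-b", "text-embedding-a", "text-embedding-b"]

# dict.fromkeys dedups while preserving first-seen insertion order.
ordered = list(dict.fromkeys(models))

# A set dedups too but discards order; after sorted() both are identical.
assert sorted(dict.fromkeys(models)) == sorted(set(models))
print(ordered)  # ['text-embedding-b', 'text-embedding-a']
```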

Contributor Author

Beginner-friendly (regarding my friend who can't tell a model's display name from its model ID): Screenshot_2026-04-23-11-46-09-364_com.tencent.mobileqq-edit.jpg
