feat(apionly): add Alibaba Cloud Models#9055

Merged
lstein merged 69 commits into invoke-ai:main from Pfannkuchensack:alibabacloud/dashscope
May 5, 2026

@Pfannkuchensack
Collaborator

Summary

Adds Alibaba Cloud DashScope as an external image generation provider on top of the external-models base (PR #8884). Introduces five starter models covering Qwen Image 2.0 Pro / 2.0 / Max / Edit Max and Wan 2.6 Text-to-Image.

What:

  • New provider: invokeai/app/services/external_generation/providers/alibabacloud.py
  • Starter models with capability metadata (aspect ratios, max images per request, negative-prompt & seed support)
  • Provider registration in ExternalProvidersForm.tsx and providers/__init__.py
  • Config entry for DashScope API credentials (in api_keys.yaml)

Why:
DashScope exposes Alibaba's Qwen Image family (strong bilingual text rendering) and Wan 2.6 (photorealistic T2I), expanding the set of external providers users can pick from.

How:
Implements the external-provider interface from PR #8884. All five models share similar capability shapes; Qwen Image Edit Max is the only img2img entry.

Related Issues / Discussions

QA Instructions

  1. Configure an Alibaba Cloud DashScope API key in api_keys.yaml.
  2. Install one or more DashScope starter models.
  3. Verify txt2img on Qwen Image 2.0 Pro / 2.0 / Max and Wan 2.6 T2I across the supported aspect ratios.
  4. Verify img2img on Qwen Image Edit Max.
  5. Confirm negative prompt and seed fields are forwarded to the API and respected.
  6. Confirm batch generation up to max_images_per_request=4 works for each model.

Merge Plan

Merge after #8884 lands. No DB migrations. Config-default changes are additive.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

CypherNaught-0x and others added 30 commits February 27, 2026 11:12
…pi_keys.yaml

Add 'external', 'external_image_generator', and 'external_api' to Zod
enum schemas (zBaseModelType, zModelType, zModelFormat) to match the
generated OpenAPI types. Remove redundant union workarounds from
component prop types and Record definitions.

Fix type errors in ModelEdit (react-hook-form Control invariance),
parsing.tsx (model identifier narrowing), buildExternalGraph (edge
typing), and ModelSettings import/export buttons.

Move external_gemini_base_url and external_openai_base_url into
api_keys.yaml alongside the API keys so all external provider config
lives in one dedicated file, separate from invokeai.yaml.
Add combined resolution preset selector for external models that maps
aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash
now send imageConfig (aspectRatio + imageSize) via generationConfig instead
of text-based aspect ratio hints used by Gemini 2.5 Flash.

Backend: ExternalResolutionPreset model, resolution_presets capability field,
image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic.

Frontend: ExternalSettingsAccordion with combo resolution select, dimension
slider disabling for fixed-size models, and panel schema constraint wiring
for Steps/Guidance/Seed controls.
- Remove negative_prompt, steps, guidance, reference_image_weights,
  reference_image_modes from external model nodes (unused by any provider)
- Remove supports_negative_prompt, supports_steps, supports_guidance
  from ExternalModelCapabilities
- Add provider_options dict to ExternalGenerationRequest for
  provider-specific parameters
- Add OpenAI-specific fields: quality, background, input_fidelity
- Add Gemini-specific fields: temperature, thinking_level
- Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini,
  DALL-E 3, DALL-E 2
- Fix OpenAI provider to use output_format (GPT Image) vs
  response_format (DALL-E) and send model ID in requests
- Add fixed aspect ratio sizes for OpenAI models (bucketing)
- Add ExternalProviderRateLimitError with retry logic for 429 responses
- Add provider-specific UI components in ExternalSettingsAccordion
- Simplify ParamSteps/ParamGuidance by removing dead external overrides
- Update all backend and frontend tests
Add AlibabaCloudProvider supporting Qwen Image and Wan model families
via the DashScope API. Includes sync (multimodal-generation) and async
(image-generation with task polling) request modes, five starter models
(Qwen Image 2.0 Pro, 2.0, Max, Wan 2.6 T2I, Qwen Image Edit Max),
config fields for API key and base URL, and frontend registration.
…rnal graph

- Export imageSizeChanged from paramsSlice (required by the new ImageSize
  recall handler).
- Emit the external graph's metadata model entry via zModelIdentifierField
  since ExternalApiModelConfig is not part of the AnyModelConfig union.
@lstein lstein self-assigned this Apr 17, 2026
@lstein lstein added the v6.13.x label Apr 17, 2026
@lstein lstein moved this to 6.13.x Theme: MODELS in Invoke - Community Roadmap Apr 17, 2026
@lstein
Collaborator

lstein commented Apr 19, 2026

Functional testing only so far. I registered on the free tier, so perhaps some of the issues I've encountered are due to that.

Editing External Model Configuration

In the model manager, when I select any of the external models (including Seedream, GPT, Gemini, etc.), click the "Edit" button, and then try to change the model configuration, I get "Model Update Failed". I tried to use this to enable reference images in Qwen Image Edit Max.

Recall not working

The "Remix" button is restoring the prompt and dimensions, but doesn't restore the model or any other parameters.

Wan 2.6 text-to-image, Qwen Image 2.0, Qwen Image 2.0 Pro, Qwen Image Max

When I try to generate with any of these models, I get: AttributeError: 'ExternalGenerationRequest' object has no attribute 'negative_prompt'

To continue testing, I patched the code using the patch attached below. I did not add the negative prompt to the linear view.

Qwen Image 2.0

  • txt2img working
  • No reference image support as expected.
  • In the invocation node, the img2img and inpaint are both provided as mode options in the pulldown menu, but only txt2img is supported.
  • In the invocation node, providing an init image is silently ignored (the generation does a txt2img). However, when I provide a reference image I get the message "Reference images are not supported." Should we get a similar message when the init image or mask image are provided?
  • When I set the node's height and width to 1024x1024 I get a 2048x2048 image. Are the dimension parameters being ignored? What do I enter for the "Image Size" string?
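
One way to make ignored inputs fail loudly, as the init-image question above suggests, is a pre-flight check that reports every input the selected model cannot use. This is a minimal sketch only: `Capabilities`, `Request`, and `unsupported_inputs` are hypothetical names invented for illustration and are not InvokeAI's actual classes.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass(frozen=True)
class Capabilities:
    """Hypothetical capability record (not InvokeAI's ExternalModelCapabilities)."""
    modes: frozenset
    supports_reference_images: bool = False


@dataclass
class Request:
    """Hypothetical, trimmed-down stand-in for a generation request."""
    mode: str
    init_image: Optional[Any] = None
    mask_image: Optional[Any] = None
    reference_images: List[Any] = field(default_factory=list)


def unsupported_inputs(request: Request, caps: Capabilities, model_name: str) -> List[str]:
    """Return one message per input that the model would otherwise silently ignore."""
    problems: List[str] = []
    if request.mode not in caps.modes:
        problems.append(f"Mode '{request.mode}' is not supported by {model_name}.")
    if request.init_image is not None and "img2img" not in caps.modes:
        problems.append(f"Init images are not supported by {model_name}.")
    if request.mask_image is not None and "inpaint" not in caps.modes:
        problems.append(f"Mask images are not supported by {model_name}.")
    if request.reference_images and not caps.supports_reference_images:
        problems.append(f"Reference images are not supported by {model_name}.")
    return problems


# A txt2img-only model that is handed an init image.
txt2img_only = Capabilities(modes=frozenset({"txt2img"}))
request = Request(mode="txt2img", init_image=object())
print(unsupported_inputs(request, txt2img_only, "Qwen Image 2.0"))
```

The same capability record could also drive which modes the node's pulldown offers, so unsupported options never appear in the first place.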

Qwen Image 2.0 Pro

Same comments as Qwen Image 2.0

Qwen Image Edit Max

  • In linear mode this model is grayed out for me, but I seem to be able to access it via the Alibaba invocation node. I think this is happening because this model is designated img2img only.
  • The model is marked as not supporting reference images. This isn't right. To the contrary, it doesn't support raster-based img2img.
  • When I try to run the model in img2img mode it rejects images put into the "Reference Images" field with "Reference images are not supported by Qwen Image Edit Max." Image editing does work when I put the input image into Init Image. However, this limits me to one image only.
  • The Mask Image field has no effect. (I think as a general rule we should avoid user interface elements that don't do anything).

Wan 2.6 Text-to-Image

  • Linear mode txt2img is working as expected.
  • Generating into an empty canvas does not work. Invoke calls out to the remote server, but the image never appears.
  • The node works as expected with the same caveats as Qwen Image.

Negative prompt patch

I used this just to get the invocation minimally working. It's not fully wired up to the user interface, and I'm not sure which models use negative prompts. The negative prompt seems to work with Qwen Image Max but not with Qwen Image 2.0.

diff --git a/invokeai/app/invocations/external_image_generation.py b/invokeai/app/invocations/external_image_generation.py
index 07a74ffdde..407ed7bf7c 100644
--- a/invokeai/app/invocations/external_image_generation.py
+++ b/invokeai/app/invocations/external_image_generation.py
@@ -37,6 +37,7 @@ class BaseExternalImageGenerationInvocation(BaseInvocation, WithMetadata, WithBo
         description="Generation mode. Not all modes are supported by every model; unsupported modes raise at runtime.",
     )
     prompt: str = InputField(description="Prompt")
+    negative_prompt: str = InputField(description="Negative Prompt", default='')
     seed: int | None = InputField(default=None, description=FieldDescriptions.seed)
     num_images: int = InputField(default=1, gt=0, description="Number of images to generate")
     width: int = InputField(default=1024, gt=0, description=FieldDescriptions.width)
@@ -77,6 +78,7 @@ class BaseExternalImageGenerationInvocation(BaseInvocation, WithMetadata, WithBo
             model=model_config,
             mode=self.mode,
             prompt=self.prompt,
+            negative_prompt=self.negative_prompt,
             seed=self.seed,
             num_images=self.num_images,
             width=self.width,
diff --git a/invokeai/app/services/external_generation/external_generation_common.py b/invokeai/app/services/external_generation/external_generation_common.py
index f14bff52dd..69a63daa5c 100644
--- a/invokeai/app/services/external_generation/external_generation_common.py
+++ b/invokeai/app/services/external_generation/external_generation_common.py
@@ -18,6 +18,7 @@ class ExternalGenerationRequest:
     model: ExternalApiModelConfig
     mode: ExternalGenerationMode
     prompt: str
+    negative_prompt: str
     seed: int | None
     num_images: int
     width: int
diff --git a/invokeai/app/services/external_generation/external_generation_default.py b/invokeai/app/services/external_generation/external_generation_default.py
index d6a266753b..441fd12ede 100644
--- a/invokeai/app/services/external_generation/external_generation_default.py
+++ b/invokeai/app/services/external_generation/external_generation_default.py
@@ -195,6 +195,7 @@ class ExternalGenerationService(ExternalGenerationServiceBase):
             model=record,
             mode=request.mode,
             prompt=request.prompt,
+            negative_prompt=request.negative_prompt,
             seed=request.seed,
             num_images=request.num_images,
             width=request.width,
@@ -264,6 +265,7 @@ class ExternalGenerationService(ExternalGenerationServiceBase):
             model=request.model,
             mode=request.mode,
             prompt=request.prompt,
+            negative_prompt=request.negative_prompt,
             seed=request.seed,
             num_images=request.num_images,
             width=width,

@lstein
Collaborator

lstein commented Apr 21, 2026

@Pfannkuchensack Lots of conflicts I'm afraid. I'll review your other PRs in the meantime.

@lstein
Collaborator

lstein commented Apr 25, 2026

@Pfannkuchensack With a fresh pull, I'm still getting this crash when generating with any of the Alibaba Qwen Image models or with Wan 2.6 Text-to-Image:

  File "/home/lstein/Projects/InvokeAI/invokeai/app/services/external_generation/external_generation_default.py", line 59, in generate
    result = self._generate_with_retry(provider, request)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI/invokeai/app/services/external_generation/external_generation_default.py", line 76, in _generate_with_retry
    return provider.generate(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI/invokeai/app/services/external_generation/providers/alibabacloud.py", line 74, in generate
    return self._generate_sync(request, base_url, headers, model_id, size)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/lstein/Projects/InvokeAI/invokeai/app/services/external_generation/providers/alibabacloud.py", line 107, in _generate_sync
    if request.negative_prompt:
       ^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'ExternalGenerationRequest' object has no attribute 'negative_prompt'

In addition, the menu item for Qwen Image Edit Max is persistently grayed out. I don't know why.

@Pfannkuchensack
Collaborator Author

The Qwen Image Edit Max model can only do img2img.
Fixed the negative_prompt issue.

@lstein
Collaborator

lstein commented Apr 26, 2026

Thanks. Code is running well now and all models check out. I have some comments:

Qwen edit model init image vs reference image. The input to Qwen Image Edit Max is an init_image, and you can only provide one of these. However, the Alibaba documentation indicates that the Qwen image editing models support up to three input images. From the user's point of view these input images act more like reference images than init images, seeing as you can't add mask regions to them or adjust the denoising level. So is it possible to accept the images as reference images rather than init images?

Location of the Alibaba API key. When using the BytePlus (Seedream) API, the API key is stored in INVOKEAI_ROOT/invokeai.yaml, but when using the Alibaba provider, the API key is stored in INVOKEAI_ROOT/api_keys.yaml. The location of the API keys should be consistent across all external providers. In addition, the API key hint in the model manager install dialog says that the Alibaba key will be stored in invokeai.yaml, which isn't currently the case.

Here are some issues detected by Claude code review, ordered by descending importance:

  • Negative prompt is advertised but never sent. Commit e92155bd8a removed all request.negative_prompt references because ExternalGenerationRequest has no such field — but every starter model still sets supports_negative_prompt=True (starter_models.py:1165, 1189, 1213, 1237, 1261) and the docs say "All models support negative prompt and seed" (docs/features/external-models/alibabacloud.md:42). Either the request schema needs a negative_prompt field plumbed through, or the capability flags + docs need to be set to false. Right now users will set negative prompts that get silently dropped.

  • Async parser double-counts images. alibabacloud.py:275-282 uses two consecutive if blocks, not if/elif. If a result dict ever contains both url and b64_image, the same logical image gets appended twice (once from URL download, once from base64). Should be elif.

  • Sync routing fallthrough hides bugs. alibabacloud.py:73:

    if model_id in _SYNC_MODELS or model_id not in _ASYNC_MODELS:
        return self._generate_sync(...)

    This silently routes any unknown model_id to the sync endpoint. If a future starter model is added and forgotten in the sets, it won't fail loudly — it'll hit the wrong endpoint and produce a confusing API error. Prefer an explicit lookup with a clear ExternalProviderRequestError for unrecognized models.

  • First poll is delayed by 5s. alibabacloud.py:192-197 sleeps before the first GET. Fast-completing tasks pay an unnecessary 5 s. Move time.sleep to the end of the loop, or check first then sleep.

  • Dead/unused model entries in routing tables. _SYNC_MODELS and _ASYNC_MODELS reference models (qwen-image-plus, wan2.6-image, wan2.5-t2i-preview, wanx2.0-t2i-turbo, etc.) that have no starter-model entry. Either drop them or document why they're reserved.
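
The routing and polling points above could be addressed along these lines. This is a sketch under stated assumptions, not the provider's actual code: the model IDs in the table, `UnknownModelError` (a stand-in for ExternalProviderRequestError), `endpoint_for`, and `poll_until_done` are all hypothetical names. The terminal states are the ones named in the review.

```python
import time
from typing import Callable, Dict

# Hypothetical routing table: every supported model is listed exactly once.
_ENDPOINT_BY_MODEL: Dict[str, str] = {
    "example-sync-model": "sync",
    "example-async-model": "async",
}

_TERMINAL_STATES = ("SUCCEEDED", "FAILED", "UNKNOWN")


class UnknownModelError(ValueError):
    """Stand-in for ExternalProviderRequestError in this sketch."""


def endpoint_for(model_id: str) -> str:
    """Explicit lookup: an unlisted model fails loudly instead of defaulting to sync."""
    try:
        return _ENDPOINT_BY_MODEL[model_id]
    except KeyError:
        raise UnknownModelError(f"No endpoint configured for model '{model_id}'") from None


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Check the task first, then sleep, so fast tasks do not pay the initial delay."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in _TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Task still '{status}' after {timeout} seconds")
        sleep(interval)  # sleep only after a non-terminal check
```

Passing `sleep` and `clock` as parameters is a design choice for testability: a test can supply fakes and exercise the timeout path without real waiting.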

Robustness

  • No retry / no 429 handling. Transient 5xx and rate-limit responses fail the whole request. DashScope publishes per-model RPM limits — at minimum, parse Retry-After on 429 and back off once or twice.

  • Raw network errors bubble up. requests.post / requests.get calls aren't wrapped in try/except requests.RequestException — DNS errors, connection resets, and timeouts will surface as bare requests exceptions instead of ExternalProviderRequestError. Inconsistent with how HTTP-status failures are handled.

  • _download_image has no size cap. alibabacloud.py:294-301 reads the full response body into memory and feeds it to PIL with no length check. The URL is from DashScope, so this isn't an SSRF risk, but a malformed/oversized response could cause OOM. Consider stream=True + a max-bytes guard, or at least a content-length sanity check.

  • _logger.debug after the success branch is unreachable for terminal states. alibabacloud.py:215 fires on every poll but only when status is not SUCCEEDED/FAILED/UNKNOWN — that's fine, but worth an info log on first poll so operators can see the task ID.
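
The retry, error-wrapping, and size-cap suggestions above might look roughly like the following. It is a sketch written against plain callables and iterables so that it stays self-contained. `ProviderRequestError` is a stand-in for ExternalProviderRequestError; `retry_delay`, `send_with_retry`, `read_capped`, and the 32 MiB value are illustrative assumptions. Only the delta-seconds form of Retry-After is parsed; anything else falls back to exponential backoff.

```python
import time
from typing import Callable, Iterable, Mapping, Optional, Tuple

MAX_IMAGE_BYTES = 32 * 1024 * 1024  # illustrative cap

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


class ProviderRequestError(RuntimeError):
    """Stand-in for ExternalProviderRequestError in this sketch."""


def retry_delay(status: int, headers: Mapping[str, str], attempt: int, base: float = 1.0) -> Optional[float]:
    """Seconds to wait before retrying, or None if the status is not retryable."""
    if status != 429 and not 500 <= status < 600:
        return None
    retry_after = headers.get("Retry-After", "").strip()
    if retry_after.isdigit():
        return float(retry_after)
    return base * (2 ** attempt)  # exponential backoff fallback


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call `send`, retrying on 429/5xx and wrapping network-level failures."""
    for attempt in range(max_attempts):
        try:
            status, headers, body = send()
        except OSError as exc:  # requests.RequestException derives from OSError
            raise ProviderRequestError(f"Network error: {exc}") from exc
        if 200 <= status < 300:
            return body
        delay = retry_delay(status, headers, attempt)
        if delay is None or attempt == max_attempts - 1:
            raise ProviderRequestError(f"Request failed with status {status}")
        sleep(delay)
    raise ProviderRequestError("max_attempts must be at least 1")


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_IMAGE_BYTES) -> bytes:
    """Accumulate streamed chunks, aborting as soon as the cap would be exceeded."""
    buf = bytearray()
    for chunk in chunks:
        if len(buf) + len(chunk) > max_bytes:
            raise ProviderRequestError(f"Response exceeded {max_bytes} bytes")
        buf.extend(chunk)
    return bytes(buf)
```

With the requests library, the chunks passed to `read_capped` could come from `response.iter_content(chunk_size=65536)` on a request made with `stream=True`, so an oversized body is rejected before it is fully read into memory.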

Test Coverage

  • None. Given the sync/async branching, the polling loop, and two distinct response shapes, this provider is the most logic-heavy of the three external providers and deserves at least:
    • sync response parsing (multimodal choices[].message.content[].image)
    • async response parsing (results[].url and results[].b64_image)
    • poll-timeout behavior
    • the if/elif double-image case noted above
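
As a starting point for the double-image case, a parser test could be sketched as below. The `url` and `b64_image` keys come from the response shape described above. `parse_async_results` and the injected `download` callable are hypothetical, and preferring `url` when both keys are present is an assumption.

```python
import base64
from typing import Any, Callable, Dict, List


def parse_async_results(results: List[Dict[str, Any]], download: Callable[[str], bytes]) -> List[bytes]:
    """Return one image payload per result entry, never two for the same entry."""
    images: List[bytes] = []
    for item in results:
        if item.get("url"):
            images.append(download(item["url"]))
        elif item.get("b64_image"):  # elif: skipped when a URL was already used
            images.append(base64.b64decode(item["b64_image"]))
    return images


def test_result_with_url_and_b64_yields_one_image() -> None:
    encoded = base64.b64encode(b"from-b64").decode("ascii")
    results = [{"url": "https://example.invalid/a.png", "b64_image": encoded}]
    images = parse_async_results(results, download=lambda url: b"from-url")
    assert images == [b"from-url"]


def test_b64_only_result_is_decoded() -> None:
    encoded = base64.b64encode(b"from-b64").decode("ascii")
    images = parse_async_results([{"b64_image": encoded}], download=lambda url: b"unused")
    assert images == [b"from-b64"]


test_result_with_url_and_b64_yields_one_image()
test_b64_only_result_is_decoded()
```

Injecting `download` keeps the test free of network access; the real provider would pass its own image-download helper.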

@lstein lstein mentioned this pull request Apr 26, 2026
5 tasks
Collaborator

@lstein lstein left a comment


See above for comments.

Pfannkuchensack and others added 4 commits May 1, 2026 02:07
… double-counting, Qwen Edit Max as ref-image model

- Explicit sync/async lookup, raise on unknown model_id
- Move poll sleep to end of loop, info-log on first poll
- if/elif in async parser to prevent url+b64_image double-count
- 429/5xx retry with Retry-After, wrap RequestException into ExternalProviderRequestError
- 32 MiB streaming cap on image downloads
- Drop dead routing-table entries and the init_image edit path
- Disable supports_negative_prompt on all Alibaba starter models (request schema has no negative_prompt field)
- Switch Qwen Image Edit Max to txt2img + reference_images panel (up to 3 inputs)
- Update docs
- Add 8 unit tests covering parser, routing, retries, polling, and download cap
Collaborator

@lstein lstein left a comment


Working as advertised. I made a small commit to change the info text in the model installer to indicate that the API keys are stored in api_keys.yaml rather than in the main Invoke config file.

@lstein lstein enabled auto-merge (squash) May 5, 2026 02:31
@lstein lstein merged commit 73d4633 into invoke-ai:main May 5, 2026
16 checks passed
@Pfannkuchensack Pfannkuchensack deleted the alibabacloud/dashscope branch May 5, 2026 10:22

Labels

  • api
  • backend: PRs that change backend files
  • docs: PRs that change docs
  • frontend: PRs that change frontend files
  • invocations: PRs that change invocations
  • python: PRs that change python files
  • python-tests: PRs that change python tests
  • Root
  • services: PRs that change app services
  • v6.13.x

Projects

Status: 6.13.x Theme: MODELS

3 participants