feat(apionly): add Alibaba Cloud Models #9055
…pi_keys.yaml Add 'external', 'external_image_generator', and 'external_api' to Zod enum schemas (zBaseModelType, zModelType, zModelFormat) to match the generated OpenAPI types. Remove redundant union workarounds from component prop types and Record definitions. Fix type errors in ModelEdit (react-hook-form Control invariance), parsing.tsx (model identifier narrowing), buildExternalGraph (edge typing), and ModelSettings import/export buttons. Move external_gemini_base_url and external_openai_base_url into api_keys.yaml alongside the API keys so all external provider config lives in one dedicated file, separate from invokeai.yaml.
Add combined resolution preset selector for external models that maps aspect ratio + image size to fixed dimensions. Gemini 3 Pro and 3.1 Flash now send imageConfig (aspectRatio + imageSize) via generationConfig instead of text-based aspect ratio hints used by Gemini 2.5 Flash. Backend: ExternalResolutionPreset model, resolution_presets capability field, image_size on ExternalGenerationRequest, and Gemini provider imageConfig logic. Frontend: ExternalSettingsAccordion with combo resolution select, dimension slider disabling for fixed-size models, and panel schema constraint wiring for Steps/Guidance/Seed controls.
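A minimal sketch of how a combined preset could map aspect ratio plus image size to fixed dimensions, and how the structured size hint could be nested in the request. `ResolutionPreset`, `PRESETS`, `find_preset`, and `build_generation_config` are hypothetical names, and the dimensions are illustrative values, not the ones shipped in this PR; only the `generationConfig` / `imageConfig` / `aspectRatio` / `imageSize` keys come from the commit message.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ResolutionPreset:
    """Hypothetical stand-in for the PR's ExternalResolutionPreset model."""

    aspect_ratio: str  # e.g. "16:9"
    image_size: str  # e.g. "1K" or "2K"
    width: int
    height: int

    @property
    def label(self) -> str:
        # Text shown in a combined "aspect ratio + size" selector.
        return f"{self.aspect_ratio} {self.image_size} ({self.width}x{self.height})"


# Illustrative values only; real presets would come from each model's capabilities.
PRESETS = [
    ResolutionPreset("1:1", "1K", 1024, 1024),
    ResolutionPreset("1:1", "2K", 2048, 2048),
    ResolutionPreset("16:9", "1K", 1344, 768),
]


def find_preset(aspect_ratio: str, image_size: str) -> Optional[ResolutionPreset]:
    """Return the preset matching both fields, or None if there is none."""
    for preset in PRESETS:
        if (preset.aspect_ratio, preset.image_size) == (aspect_ratio, image_size):
            return preset
    return None


def build_generation_config(preset: ResolutionPreset) -> dict:
    """Structured size hint: imageConfig nested inside generationConfig."""
    return {
        "generationConfig": {
            "imageConfig": {
                "aspectRatio": preset.aspect_ratio,
                "imageSize": preset.image_size,
            }
        }
    }
```

Because each preset carries fixed dimensions, the UI can disable the free-form dimension sliders whenever a model exposes presets.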
- Remove negative_prompt, steps, guidance, reference_image_weights, reference_image_modes from external model nodes (unused by any provider)
- Remove supports_negative_prompt, supports_steps, supports_guidance from ExternalModelCapabilities
- Add provider_options dict to ExternalGenerationRequest for provider-specific parameters
- Add OpenAI-specific fields: quality, background, input_fidelity
- Add Gemini-specific fields: temperature, thinking_level
- Add new OpenAI starter models: GPT Image 1.5, GPT Image 1 Mini, DALL-E 3, DALL-E 2
- Fix OpenAI provider to use output_format (GPT Image) vs response_format (DALL-E) and send model ID in requests
- Add fixed aspect ratio sizes for OpenAI models (bucketing)
- Add ExternalProviderRateLimitError with retry logic for 429 responses
- Add provider-specific UI components in ExternalSettingsAccordion
- Simplify ParamSteps/ParamGuidance by removing dead external overrides
- Update all backend and frontend tests
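The output_format vs response_format split and the provider_options pass-through could look roughly like the sketch below. `build_openai_image_payload` and `OPENAI_OPTION_KEYS` are hypothetical names, and the `"gpt-image"` prefix check is an assumption about how the two model families might be told apart; it is not taken from the PR's code.

```python
from typing import Any, Optional

# Option keys named in the commit message; accepted values are not shown here.
OPENAI_OPTION_KEYS = ("quality", "background", "input_fidelity")


def build_openai_image_payload(
    model_id: str,
    prompt: str,
    provider_options: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a request body, choosing the format field by model family."""
    # The model ID is always sent explicitly.
    payload: dict[str, Any] = {"model": model_id, "prompt": prompt}

    # Assumption: GPT Image model IDs share the "gpt-image" prefix.
    if model_id.startswith("gpt-image"):
        payload["output_format"] = "png"
    else:
        # DALL-E style models take response_format instead.
        payload["response_format"] = "b64_json"

    # Forward only the known provider-specific options; ignore everything else.
    options = provider_options or {}
    for key in OPENAI_OPTION_KEYS:
        value = options.get(key)
        if value is not None:
            payload[key] = value
    return payload
```

Filtering provider_options through an allow-list keeps options meant for another provider (for example Gemini's temperature) out of the OpenAI request.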
Add AlibabaCloudProvider supporting Qwen Image and Wan model families via the DashScope API. Includes sync (multimodal-generation) and async (image-generation with task polling) request modes, five starter models (Qwen Image 2.0 Pro, 2.0, Max, Wan 2.6 T2I, Qwen Image Edit Max), config fields for API key and base URL, and frontend registration.
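As a rough illustration of the two request modes, the sketch below routes a model to "sync" or "async" and polls an async task until it finishes. Everything here is hypothetical: the model IDs in `REQUEST_MODES` are placeholders rather than DashScope identifiers, and the `"state"` key with its `SUCCEEDED` / `FAILED` values is an assumed response shape, not the documented DashScope schema.

```python
import time
from typing import Callable

# Hypothetical routing table; the IDs are placeholders, not real model IDs.
REQUEST_MODES = {
    "example-sync-model": "sync",
    "example-async-model": "async",
}


def request_mode_for(model_id: str) -> str:
    """Look up the request mode explicitly and fail loudly on unknown models."""
    try:
        return REQUEST_MODES[model_id]
    except KeyError:
        raise ValueError(f"Unknown model_id: {model_id!r}") from None


def poll_task(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the task finishes: check first, sleep at the end of each pass."""
    for _ in range(max_polls):
        status = fetch_status()
        state = status.get("state")  # assumed field name
        if state == "SUCCEEDED":
            return status
        if state == "FAILED":
            raise RuntimeError(f"Task failed: {status}")
        # Sleeping last means a task that is already done returns immediately.
        sleep(interval_s)
    raise TimeoutError(f"Task still running after {max_polls} polls")
```

Passing `fetch_status` and `sleep` in as callables keeps the loop testable without network access or real delays.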
…rnal graph - Export imageSizeChanged from paramsSlice (required by the new ImageSize recall handler). - Emit the external graph's metadata model entry via zModelIdentifierField since ExternalApiModelConfig is not part of the AnyModelConfig union.
|
Functional testing only so far. I registered on the free tier, so perhaps some of the issues I've encountered are due to that.
Editing External Model Configure
In the model manager, when I select any of the external models (including Seedream, GPT, gemini etc), click the "Edit" button and then try to make changes to the model configuration, I get
Recall not working
The "Remix" button is restoring the prompt and dimensions, but doesn't restore the model or any other parameters.
Wan 2.6 text-to-image, Qwen Image 2.0, Qwen Image 2.0 Pro, Qwen Image Max
When I try to generate with any of these models, I get:
To continue testing, I patched the code using the patch attached below. I did not add the negative prompt to the linear view.
Qwen Image 2.0
Qwen Image 2.0 Pro
Same comments as Qwen Image 2.0
Qwen Image Edit Max
Wan 2.6 Text-to-image
Negative prompt patch
I used this just to get the invocation minimally working. It's not fully wired up to the user interface, and I'm not sure which models use negative prompts. The negative prompt seems to work with Qwen Image Max but not with Qwen Image 2.0.
|
@Pfannkuchensack Lots of conflicts, I'm afraid. I'll review your other PRs in the meantime.
|
@Pfannkuchensack With a fresh pull, I'm still getting this crash when generating with any of the Alibaba Qwen Image models or with Wan 2.6 Text-to-Image:
In addition, the menu item for
|
The Qwen Image Edit Max model can only do img2img.
|
Thanks. Code is running well now and all models check out. I have some comments:
Qwen edit model init image vs reference image. The input to Qwen Image Edit Max is an
Location of the alibaba API key. When using the BytesPlus (seedream) API, the API key is stored in
Here are some issues detected by Claude code review, ordered by descending importance:
Robustness
Test Coverage
|
… double-counting, Qwen Edit Max as ref-image model
- Explicit sync/async lookup, raise on unknown model_id
- Move poll sleep to end of loop, info-log on first poll
- if/elif in async parser to prevent url+b64_image double-count
- 429/5xx retry with Retry-After, wrap RequestException into ExternalProviderRequestError
- 32 MiB streaming cap on image downloads
- Drop dead routing-table entries and the init_image edit path
- Disable supports_negative_prompt on all Alibaba starter models (request schema has no negative_prompt field)
- Switch Qwen Image Edit Max to txt2img + reference_images panel (up to 3 inputs)
- Update docs
- Add 8 unit tests covering parser, routing, retries, polling, and download cap
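The 429/5xx retry with Retry-After could be sketched as below. This is not the PR's implementation: `retry_delay`, `call_with_retries`, and the `(status, headers, body)` tuple returned by `send` are hypothetical, the exponential fallback is an assumed policy, and only the numeric-seconds form of Retry-After is handled (the header may also carry an HTTP date, which this sketch treats as missing).

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def is_retryable(status: int) -> bool:
    """Retry on rate limiting (429) and on any 5xx server error."""
    return status == 429 or 500 <= status < 600


def retry_delay(headers: Mapping[str, str], attempt: int, base_s: float = 1.0) -> float:
    """Honor a numeric Retry-After header; otherwise back off exponentially."""
    raw = headers.get("Retry-After", "")
    try:
        return max(0.0, float(raw))
    except ValueError:
        # Header missing or not a plain number of seconds.
        return base_s * (2**attempt)


def call_with_retries(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, bytes]:
    """Call send() until it returns a non-retryable status or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if not is_retryable(status):
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_delay(headers, attempt))
    # Out of attempts: hand back the last response for the caller to report.
    return status, body
```

A real provider would more likely raise a dedicated error (such as the ExternalProviderRateLimitError mentioned earlier in this PR) once attempts are exhausted, rather than return the last status.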
lstein
left a comment
Working as advertised. I made a small commit to change the info text in the model installer to indicate that the api keys are being stored in api_keys.yaml rather than the main invoke config file.
Summary
Adds Alibaba Cloud DashScope as an external image generation provider on top of the external-models base (PR #8884). Introduces five starter models covering Qwen Image 2.0 Pro / 2.0 / Max / Edit Max and Wan 2.6 Text-to-Image.
What:
invokeai/app/services/external_generation/providers/alibabacloud.py
ExternalProvidersForm.tsx and providers/__init__.py
api_keys.yaml)
Why:
DashScope exposes Alibaba's Qwen Image family (strong bilingual text rendering) and Wan 2.6 (photorealistic T2I), expanding the set of external providers users can pick from.
How:
Implements the external-provider interface from PR #8884. All five models share similar capability shapes; Qwen Image Edit Max is the only img2img entry.
Related Issues / Discussions
QA Instructions
api_keys.yaml.
max_images_per_request=4 works for each model.
Merge Plan
Merge after #8884 lands. No DB migrations. Config-default changes are additive.
Checklist
What's New copy (if doing a release after this PR)