+- `bl omni --list-voices` prints the built-in output voices (ID, name, description, language) and exits without needing an API key. The built-in voice table is expanded from 6 to 17 voices, including dialect voices such as Dylan, Sunny, and Kiki.
+
+### Changed
+
+- `bl omni` default `--voice` is now `Tina` (previously `Cherry`). The `--voice` help points at `--list-voices` instead of listing every option inline.
+- `bl speech synthesize --list-voices` and its missing-`--voice` hint now include a link to the official CosyVoice voice documentation.
+- Agent skill setup guidance now covers console site selection (`--console-site domestic` / `international`) for console login and gateway commands.
+
+### Fixed
+
+- `bl speech synthesize` corrects the `cosyvoice-v3-flash` built-in voice ID from `longanhuan` to `longanhuan_v3`.
+
+## [1.4.1] - 2026-06-22
+
+### Changed
+
+- Video generation now defaults to the upgraded HappyHorse 1.1 model for better quality. The 1.0 models are still available via `--model`.
+- `bl update` now keeps the agent skill in sync across all your agent apps (Claude Code, Cursor, etc.), and refreshes it even when the CLI is already up to date.
README.md: 25 additions & 16 deletions
@@ -27,7 +27,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **Text chat** — Qwen3.7-max: major gains in agentic coding, frontend coding, and vibe coding
 - **Multimodal (Omni)** — Full omni-modal support across text + image + audio + video
 - **Image generation & editing** — Qwen-Image 2.0: pro text rendering, photorealism, strong semantic adherence, multi-image composition
-- **Video generation & editing** — HappyHorse-1.0 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
+- **Video generation & editing** — happyhorse-1.1 series: text-/image-/reference-to-video and natural-language video editing (up to 9-image reference)
 - **Speech synthesis & recognition** — CosyVoice streaming TTS, voice cloning from 5–20s samples; FunAudio-ASR covers 30 languages including 7 Chinese dialects and 20+ Mandarin accents
 - **Image & video understanding** — Qwen-VL: long-form video analysis, chart/document parsing, visual reasoning, multilingual OCR
@@ -38,6 +38,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **MCP integration** — Orchestrate Bailian MCP servers: list services, inspect tools, and invoke any tool directly from the terminal
 - **Web search** — Real-time internet retrieval for up-to-date, accurate answers
 - **Model recommendation** — Describe your scenario and get best-fit model suggestions; supports scoped search, model comparison, and alternative discovery
+- **Fine-tuning & deployment** — Upload datasets, create SFT/LoRA/DPO/CPT jobs (`finetune create`), probe job status non-blockingly (`finetune watch`), query per-model training capability (`finetune capability`), and deploy trained models as endpoints (`deploy create`)
 - **Local file auto-upload** — Every URL parameter accepts a local path; uploaded to free temp storage with 48-hour validity
@@ -54,7 +55,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from a single natural-language sentence, with **zero manual editing**. This showcase demonstrates how an AI Agent can compose a multi-step creative pipeline by orchestrating three primitives:

 - **[Qwen Code](https://github.com/QwenLM/qwen-code)** — the agentic coding model that interprets the user's intent and drives the workflow
-- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.0**, Aliyun Model Studio's text-/image-/reference-to-video generation model
+- **[Aliyun Model Studio CLI](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)** — invokes **HappyHorse 1.1**, Aliyun Model Studio's text-/image-/reference-to-video generation model
 - **[spark-video Skill](https://github.com/JohnKeating1997/spark-video)** — handles scene decomposition, storyboarding, shot continuity, and final stitching

 ### The single prompt
@@ -67,7 +68,7 @@ A complete **2-minute, 16:9 cinematic short film** — produced end-to-end from

 1. **Qwen Code** parses the request, plans the narrative beats, and decides which tools to call.
 2. The **spark-video Skill** breaks the story into shots, writes per-shot prompts, and enforces visual continuity (characters, lighting, palette, lens language).
-3. **`bl video generate`** dispatches each shot to **HappyHorse 1.0** in parallel.
+3. **`bl video generate`** dispatches each shot to **HappyHorse 1.1** in parallel.
 4. The skill stitches all clips back together into a single 16:9 / ~2-min deliverable.

 No timeline scrubbing. No frame-by-frame editing. Just one sentence → one video.
@@ -111,22 +112,30 @@ bl advisor recommend --message "qwen-max vs deepseek-v3 for code generation"
 # Browser login (required for console capability commands)
 bl auth login --console

+# Fine-tune & deploy — a one-shot train-to-serve workflow