@darryncampbell It would be great if this PR could be reviewed and merged ASAP, as it also contains a very small but very important change related to model access.
Pull request overview
This PR updates the Sarvam TTS plugin to support the newer Bulbul v3 model with its additional controls and model-specific parameter behavior. The changes add temperature control, output audio settings, and buffering/chunking controls while properly segregating which parameters are sent to which model versions.
Changes:
- Added support for the bulbul:v3 model alongside the existing bulbul:v2 and bulbul:v3-beta
- Added four new configurable parameters: temperature (0.01-1.0), output_audio_bitrate (32k/64k/96k/128k/192k), min_buffer_size (30-200), and max_chunk_length (50-500)
- Implemented model-specific parameter handling: temperature sent only for v3/v3-beta, enable_preprocessing sent only for v2
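The model-specific handling described in the last bullet can be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the helper name `build_tts_params`, the `V3_MODELS` constant, and the flat dict payload are all assumptions. Only the parameter names, the model names, and the send/omit rules come from this overview.

```python
from typing import Any

# Models that accept `temperature`, per the overview above.
V3_MODELS = {"bulbul:v3", "bulbul:v3-beta"}


def build_tts_params(
    model: str,
    *,
    temperature: float = 0.6,
    enable_preprocessing: bool = False,
) -> dict[str, Any]:
    """Hypothetical helper: include only the parameters valid for `model`."""
    params: dict[str, Any] = {"model": model}
    if model in V3_MODELS:
        # temperature is sent only for v3 / v3-beta
        params["temperature"] = temperature
    elif model == "bulbul:v2":
        # enable_preprocessing is sent only for v2
        params["enable_preprocessing"] = enable_preprocessing
    return params
```

Calling `build_tts_params("bulbul:v2", enable_preprocessing=True)` would then yield a payload without `temperature`, while the v3 variants would omit `enable_preprocessing`.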
| pace: Speech rate multiplier (0.5 to 2.0) | ||
| loudness: Volume multiplier (0.5 to 2.0) - only supported in v2 for now | ||
| temperature: Sampling temperature (0.01 to 1.0), only used in v3 and v3-beta | ||
| output_audio_bitrate: Output audio bitrate (default 128k) |
The documentation for output_audio_bitrate should include the allowed values. Consider updating to: "output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k; default 128k)"
| output_audio_bitrate: Output audio bitrate (default 128k) | |
| output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k; default 128k) |
| "output_audio_bitrate must be one of " | ||
| f"{', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}" |
The error message format is inconsistent between __init__ and update_options. In __init__ (lines 387-388), the message uses 'one of' with a comma-separated list. In update_options (lines 547-548), it uses 'one of' followed by the list on the next line without repeating 'one of'. For consistency, both should use the same format.
| "output_audio_bitrate must be one of " | |
| f"{', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}" | |
| f"output_audio_bitrate must be one of {', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}" |
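One way to keep the two call sites from drifting apart again would be a single shared check. The sketch below is an assumption, not code from this PR: the helper name `validate_output_audio_bitrate` is hypothetical, and `ALLOWED_OUTPUT_AUDIO_BITRATES` is assumed to be a set of the five strings listed in the overview. It also sorts the values numerically, which differs from the quoted code: plain `sorted()` on these strings orders them lexicographically (128k, 192k, 32k, ...).

```python
# Assumed to be a set of strings; the values come from the PR overview.
ALLOWED_OUTPUT_AUDIO_BITRATES = {"32k", "64k", "96k", "128k", "192k"}


def validate_output_audio_bitrate(value: str) -> str:
    """Hypothetical shared check so both call sites raise the same message."""
    if value not in ALLOWED_OUTPUT_AUDIO_BITRATES:
        # Sort by the numeric part so the message reads 32k, 64k, ..., 192k.
        allowed = sorted(
            ALLOWED_OUTPUT_AUDIO_BITRATES,
            key=lambda bitrate: int(bitrate.rstrip("k")),
        )
        raise ValueError(
            f"output_audio_bitrate must be one of {', '.join(allowed)}"
        )
    return value
```

Both the constructor and update_options could then call this helper instead of formatting the message separately.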
| pace: Speech rate multiplier (0.5 to 2.0) | ||
| loudness: Volume multiplier (0.5 to 2.0) | ||
| temperature: Sampling temperature (0.01 to 1.0), used for v3 and v3-beta | ||
| output_audio_bitrate: Output audio bitrate |
The documentation for output_audio_bitrate should include the allowed values for consistency with other parameters. Consider updating the documentation to: "output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k)"
| output_audio_bitrate: Output audio bitrate | |
| output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k) |
| min_buffer_size: Minimum character length for flushing | ||
| max_chunk_length: Maximum chunk length for sentence splitting |
The documentation for min_buffer_size and max_chunk_length should include their valid ranges for consistency. Consider updating to: "min_buffer_size: Minimum character length for flushing (30 to 200)" and "max_chunk_length: Maximum chunk length for sentence splitting (50 to 500)"
| min_buffer_size: Minimum character length for flushing | |
| max_chunk_length: Maximum chunk length for sentence splitting | |
| min_buffer_size: Minimum character length for flushing (30 to 200) | |
| max_chunk_length: Maximum chunk length for sentence splitting (50 to 500) |
This PR updates the Sarvam TTS plugin to support newer Bulbul model controls and model-specific parameter behavior in tts.py, with corresponding validation.
Key updates for bulbul:v3 / bulbul:v3-beta
- temperature with range validation and default.
- output_audio_bitrate
- min_buffer_size
- max_chunk_length
- temperature is sent only for bulbul:v3 / bulbul:v3-beta.
- enable_preprocessing is not sent for bulbul:v3 / bulbul:v3-beta (it is sent only for bulbul:v2).
Validation and runtime options
- update_options() now validates all newly added fields.
Why
Sarvam Bulbul v3 APIs expose additional output and buffering controls that were not previously configurable in the plugin. This PR makes those controls first-class in plugin options while preserving model-specific compatibility behavior.
Detailed changes
- temperature: float = 0.6 (valid 0.01..1.0)
- output_audio_bitrate: "128k" (allowed: 32k|64k|96k|128k|192k)
- min_buffer_size: 50 (valid 30..200)
- max_chunk_length: 150 (valid 50..500)
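For readers who want to see the defaults and ranges in one place, they can be modeled as a small dataclass. This is a sketch under assumptions, not the plugin's real options type: the class name `BulbulV3Options` is hypothetical, and the ranges are assumed to be inclusive at both ends.

```python
from dataclasses import dataclass


@dataclass
class BulbulV3Options:
    """Hypothetical container mirroring the defaults and ranges listed above."""

    temperature: float = 0.6
    output_audio_bitrate: str = "128k"
    min_buffer_size: int = 50
    max_chunk_length: int = 150

    def __post_init__(self) -> None:
        # Ranges are assumed inclusive; the PR text only gives the bounds.
        if not 0.01 <= self.temperature <= 1.0:
            raise ValueError("temperature must be between 0.01 and 1.0")
        if self.output_audio_bitrate not in {"32k", "64k", "96k", "128k", "192k"}:
            raise ValueError(
                "output_audio_bitrate must be one of 32k, 64k, 96k, 128k, 192k"
            )
        if not 30 <= self.min_buffer_size <= 200:
            raise ValueError("min_buffer_size must be between 30 and 200")
        if not 50 <= self.max_chunk_length <= 500:
            raise ValueError("max_chunk_length must be between 50 and 500")
```

Constructing `BulbulV3Options()` with no arguments gives the listed defaults, and an out-of-range value such as `min_buffer_size=10` raises `ValueError` at construction time.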