initial #4923

Merged
theomonnom merged 2 commits into livekit:main from dhruvladia-sarvam:sarvam-v3-tts-addns
Feb 26, 2026

Conversation

dhruvladia-sarvam (Contributor) commented Feb 23, 2026

This PR updates the Sarvam TTS plugin to support newer Bulbul model controls and model-specific parameter behavior in tts.py, with corresponding validation.

Key updates for bulbul:v3 / bulbul:v3-beta

  • Add temperature with range validation and default.
  • Add output and chunking controls:
    • output_audio_bitrate
    • min_buffer_size
    • max_chunk_length
  • Send temperature only for bulbul:v3/bulbul:v3-beta.
  • Do not send enable_preprocessing for bulbul:v3/bulbul:v3-beta (it is sent only for bulbul:v2).
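
The model-specific sending rules above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: the helper name `build_model_params`, the `V3_MODELS` constant, the payload key names, and the `enable_preprocessing` default are all assumptions made for the example.

```python
from typing import Any, Dict

# Hypothetical constant: the models that accept `temperature`.
V3_MODELS = {"bulbul:v3", "bulbul:v3-beta"}


def build_model_params(
    model: str,
    *,
    temperature: float = 0.6,
    enable_preprocessing: bool = False,  # assumed default, for illustration only
) -> Dict[str, Any]:
    """Return only the parameters that the given model should receive."""
    params: Dict[str, Any] = {"model": model}
    if model in V3_MODELS:
        # temperature is sent only for bulbul:v3 / bulbul:v3-beta
        params["temperature"] = temperature
    elif model == "bulbul:v2":
        # enable_preprocessing is sent only for bulbul:v2
        params["enable_preprocessing"] = enable_preprocessing
    return params
```

With this shape, a v3 request never carries `enable_preprocessing` and a v2 request never carries `temperature`, whatever the caller passes in.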

Validation and runtime options

  • Constructor and update_options() now validate all newly added fields.
  • Errors are raised for invalid enum/range inputs before request dispatch.

Why

Sarvam Bulbul v3 APIs expose additional output and buffering controls that were not previously configurable in the plugin. This PR makes those controls first-class in plugin options while preserving model-specific compatibility behavior.

Detailed changes

  • New options with defaults
    • temperature: float = 0.6 (valid 0.01..1.0)
    • output_audio_bitrate: "128k" (allowed: 32k|64k|96k|128k|192k)
    • min_buffer_size: 50 (valid 30..200)
    • max_chunk_length: 150 (valid 50..500)
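
For readers who want to see the ranges and defaults above as code, here is a hedged sketch of the kind of check the description implies. `ALLOWED_OUTPUT_AUDIO_BITRATES` is the name used in the diff; the function `validate_tts_options`, the helper `_check_range`, and the exact error wording are hypothetical.

```python
ALLOWED_OUTPUT_AUDIO_BITRATES = {"32k", "64k", "96k", "128k", "192k"}


def _check_range(name: str, value: float, low: float, high: float) -> None:
    """Raise ValueError when value falls outside the inclusive range [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")


def validate_tts_options(
    *,
    temperature: float = 0.6,
    output_audio_bitrate: str = "128k",
    min_buffer_size: int = 50,
    max_chunk_length: int = 150,
) -> None:
    """Validate the new options; raises before any request would be dispatched."""
    _check_range("temperature", temperature, 0.01, 1.0)
    if output_audio_bitrate not in ALLOWED_OUTPUT_AUDIO_BITRATES:
        allowed = ", ".join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))
        raise ValueError(f"output_audio_bitrate must be one of {allowed}")
    _check_range("min_buffer_size", min_buffer_size, 30, 200)
    _check_range("max_chunk_length", max_chunk_length, 50, 500)
```

Calling `validate_tts_options()` with no arguments passes, since every default sits inside its stated range.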

devin-ai-integration[bot]

This comment was marked as resolved.

@dhruvladia-sarvam
Contributor Author

@darryncampbell it would be great if we can have this PR reviewed and merged asap as it contains a very small but very important model access related change as well

Copilot AI left a comment

Pull request overview

This PR updates the Sarvam TTS plugin to support the newer Bulbul v3 model with its additional controls and model-specific parameter behavior. The changes add temperature control, output audio settings, and buffering/chunking controls while properly segregating which parameters are sent to which model versions.

Changes:

  • Added support for bulbul:v3 model alongside existing bulbul:v2 and bulbul:v3-beta
  • Added four new configurable parameters: temperature (0.01-1.0), output_audio_bitrate (32k/64k/96k/128k/192k), min_buffer_size (30-200), and max_chunk_length (50-500)
  • Implemented model-specific parameter handling: temperature sent only for v3/v3-beta, enable_preprocessing sent only for v2

pace: Speech rate multiplier (0.5 to 2.0)
loudness: Volume multiplier (0.5 to 2.0) - only supported in v2 for now
temperature: Sampling temperature (0.01 to 1.0), only used in v3 and v3-beta
output_audio_bitrate: Output audio bitrate (default 128k)

Copilot AI Feb 26, 2026

The documentation for output_audio_bitrate should include the allowed values. Consider updating to: "output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k; default 128k)"

Suggested change
- output_audio_bitrate: Output audio bitrate (default 128k)
+ output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k; default 128k)

Comment on lines +547 to +548
"output_audio_bitrate must be one of "
f"{', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}"

Copilot AI Feb 26, 2026

The error message format is inconsistent between __init__ and update_options. In __init__ (lines 387-388), the message uses 'one of' with a comma-separated list. In update_options (lines 547-548), it uses 'one of' followed by the list on the next line without repeating 'one of'. For consistency, both should use the same format.

Suggested change
- "output_audio_bitrate must be one of "
- f"{', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}"
+ f"output_audio_bitrate must be one of {', '.join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))}"
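
One way to keep the two messages from drifting apart is to route both call sites through a single helper. The sketch below assumes this approach; the class `SketchTTS` and the helper `_check_bitrate` are hypothetical stand-ins and are not taken from the plugin.

```python
from typing import Optional

ALLOWED_OUTPUT_AUDIO_BITRATES = {"32k", "64k", "96k", "128k", "192k"}


def _check_bitrate(value: str) -> str:
    """Single source of truth for the bitrate check and its error message."""
    if value not in ALLOWED_OUTPUT_AUDIO_BITRATES:
        allowed = ", ".join(sorted(ALLOWED_OUTPUT_AUDIO_BITRATES))
        raise ValueError(f"output_audio_bitrate must be one of {allowed}")
    return value


class SketchTTS:
    """Hypothetical stand-in for the plugin class; only the bitrate option is shown."""

    def __init__(self, *, output_audio_bitrate: str = "128k") -> None:
        self.output_audio_bitrate = _check_bitrate(output_audio_bitrate)

    def update_options(self, *, output_audio_bitrate: Optional[str] = None) -> None:
        # Validate before mutating, so a rejected update leaves the old value intact.
        if output_audio_bitrate is not None:
            self.output_audio_bitrate = _check_bitrate(output_audio_bitrate)
```

Because both paths call `_check_bitrate`, the constructor and `update_options()` raise the same text for the same bad input.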

pace: Speech rate multiplier (0.5 to 2.0)
loudness: Volume multiplier (0.5 to 2.0)
temperature: Sampling temperature (0.01 to 1.0), used for v3 and v3-beta
output_audio_bitrate: Output audio bitrate

Copilot AI Feb 26, 2026

The documentation for output_audio_bitrate should include the allowed values for consistency with other parameters. Consider updating the documentation to: "output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k)"

Suggested change
- output_audio_bitrate: Output audio bitrate
+ output_audio_bitrate: Output audio bitrate (32k, 64k, 96k, 128k, or 192k)

Comment on lines +275 to +276
min_buffer_size: Minimum character length for flushing
max_chunk_length: Maximum chunk length for sentence splitting

Copilot AI Feb 26, 2026

The documentation for min_buffer_size and max_chunk_length should include their valid ranges for consistency. Consider updating to: "min_buffer_size: Minimum character length for flushing (30 to 200)" and "max_chunk_length: Maximum chunk length for sentence splitting (50 to 500)"

Suggested change
- min_buffer_size: Minimum character length for flushing
- max_chunk_length: Maximum chunk length for sentence splitting
+ min_buffer_size: Minimum character length for flushing (30 to 200)
+ max_chunk_length: Maximum chunk length for sentence splitting (50 to 500)

@theomonnom theomonnom merged commit 69e66fa into livekit:main Feb 26, 2026
13 checks passed
3 participants