Summary
The Cerebras Cloud SDK (cerebras-cloud-sdk) is the official Python client for Cerebras Inference, which provides some of the world's fastest LLM inference using Cerebras' CS-3 wafer-scale processors. Its Cerebras and AsyncCerebras clients expose chat.completions.create() with full streaming support. This repository has zero instrumentation for any Cerebras SDK surface — no integration directory, no wrapper, no patcher, no auto_instrument() support.
While Cerebras exposes an OpenAI-compatible endpoint, the native cerebras.cloud.sdk.Cerebras client is not a subclass of openai.OpenAI, so wrap_openai() cannot be used with it. The pattern is identical to Groq and Mistral — both have OpenAI-compatible APIs but received dedicated native integrations in this repo. Users who follow Cerebras' official documentation and pip install cerebras-cloud-sdk get zero Braintrust tracing.
What needs to be instrumented
The cerebras-cloud-sdk package exposes these execution surfaces via Cerebras and AsyncCerebras, none of which are instrumented:
Chat completions (highest priority)
| SDK Method |
Description |
Streaming |
Return type |
client.chat.completions.create() |
Chat completions with tool calling, JSON mode, structured output |
stream=True returns iterator of ChatCompletionChunk |
ChatCompletion |
client.chat.completions.stream() |
Context-manager streaming variant |
Yields ChatCompletionStreamEvent |
ChatCompletionStreamManager |
Response shape is OpenAI-compatible: ChatCompletion has choices, usage (prompt_tokens, completion_tokens, total_tokens), model, id, time_info (wall and queue times) — standard span metrics extraction follows the OpenAI pattern, with the bonus of Cerebras-specific latency fields.
Streaming: stream=True on create() or the stream() context manager yields ChatCompletionChunk objects. The integration must handle the streaming span lifecycle (start on call, accumulate chunks, finalize on stream exhaustion).
Both sync (Cerebras) and async (AsyncCerebras) clients exist with identical method signatures.
Implementation notes
Stainless-generated SDK: The cerebras-cloud-sdk package is generated by Stainless (same toolchain as OpenAI, Groq, Mistral). The client structure (client.chat.completions.create()) mirrors the OpenAI SDK, so the patcher/wrapper pattern from the OpenAI or Mistral integration can serve as a close structural reference.
chat.completions.create() parameters relevant for span metadata: model, temperature, max_tokens, max_completion_tokens, top_p, frequency_penalty, presence_penalty, seed, tools, tool_choice, response_format, stop, n.
Extra timing fields: Cerebras responses include time_info with queue_time and total_time (floats, in seconds). These are unique to Cerebras and useful for performance monitoring spans.
No coverage in any instrumentation layer
- No integration directory (
py/src/braintrust/integrations/cerebras/)
- No wrapper function (e.g.
wrap_cerebras())
- No patcher in any existing integration
- No nox test session (
test_cerebras)
- No version entry in
py/src/braintrust/integrations/versioning.py
- No mention in
py/src/braintrust/integrations/__init__.py
- No entry in
[tool.braintrust.matrix] in py/pyproject.toml
A grep for cerebras across py/src/braintrust/ returns zero matches.
Braintrust docs status
not_found — Cerebras is not listed on the Braintrust integrations directory or the tracing guide.
Upstream references
Local repo files inspected
py/src/braintrust/integrations/ — no cerebras/ directory exists on main
py/src/braintrust/wrappers/ — no Cerebras wrapper
py/noxfile.py — no test_cerebras session
py/src/braintrust/integrations/__init__.py — Cerebras not listed in integration registry
py/src/braintrust/integrations/versioning.py — no Cerebras version matrix
py/pyproject.toml — no Cerebras entries in [tool.braintrust.matrix]
- Full repo grep for "cerebras" across
py/src/braintrust/ — zero matches
Summary
The Cerebras Cloud SDK (
cerebras-cloud-sdk) is the official Python client for Cerebras Inference, which provides some of the world's fastest LLM inference using Cerebras' CS-3 wafer-scale processors. ItsCerebrasandAsyncCerebrasclients exposechat.completions.create()with full streaming support. This repository has zero instrumentation for any Cerebras SDK surface — no integration directory, no wrapper, no patcher, noauto_instrument()support.While Cerebras exposes an OpenAI-compatible endpoint, the native
cerebras.cloud.sdk.Cerebrasclient is not a subclass ofopenai.OpenAI, sowrap_openai()cannot be used with it. The pattern is identical to Groq and Mistral — both have OpenAI-compatible APIs but received dedicated native integrations in this repo. Users who follow Cerebras' official documentation andpip install cerebras-cloud-sdkget zero Braintrust tracing.What needs to be instrumented
The
cerebras-cloud-sdkpackage exposes these execution surfaces viaCerebrasandAsyncCerebras, none of which are instrumented:Chat completions (highest priority)
client.chat.completions.create()stream=Truereturns iterator ofChatCompletionChunkChatCompletionclient.chat.completions.stream()ChatCompletionStreamEventChatCompletionStreamManagerResponse shape is OpenAI-compatible:
ChatCompletionhaschoices,usage(prompt_tokens,completion_tokens,total_tokens),model,id,time_info(wall and queue times) — standard span metrics extraction follows the OpenAI pattern, with the bonus of Cerebras-specific latency fields.Streaming:
stream=Trueoncreate()or thestream()context manager yieldsChatCompletionChunkobjects. The integration must handle the streaming span lifecycle (start on call, accumulate chunks, finalize on stream exhaustion).Both sync (
Cerebras) and async (AsyncCerebras) clients exist with identical method signatures.Implementation notes
Stainless-generated SDK: The
cerebras-cloud-sdkpackage is generated by Stainless (same toolchain as OpenAI, Groq, Mistral). The client structure (client.chat.completions.create()) mirrors the OpenAI SDK, so the patcher/wrapper pattern from the OpenAI or Mistral integration can serve as a close structural reference.chat.completions.create()parameters relevant for span metadata:model,temperature,max_tokens,max_completion_tokens,top_p,frequency_penalty,presence_penalty,seed,tools,tool_choice,response_format,stop,n.Extra timing fields: Cerebras responses include
time_infowithqueue_timeandtotal_time(floats, in seconds). These are unique to Cerebras and useful for performance monitoring spans.No coverage in any instrumentation layer
py/src/braintrust/integrations/cerebras/)wrap_cerebras())test_cerebras)py/src/braintrust/integrations/versioning.pypy/src/braintrust/integrations/__init__.py[tool.braintrust.matrix]inpy/pyproject.tomlA grep for
cerebrasacrosspy/src/braintrust/returns zero matches.Braintrust docs status
not_found— Cerebras is not listed on the Braintrust integrations directory or the tracing guide.Upstream references
Local repo files inspected
py/src/braintrust/integrations/— nocerebras/directory exists onmainpy/src/braintrust/wrappers/— no Cerebras wrapperpy/noxfile.py— notest_cerebrassessionpy/src/braintrust/integrations/__init__.py— Cerebras not listed in integration registrypy/src/braintrust/integrations/versioning.py— no Cerebras version matrixpy/pyproject.toml— no Cerebras entries in[tool.braintrust.matrix]py/src/braintrust/— zero matches