| title | category | tags | difficulty | description | demonstrates |
|---|---|---|---|---|---|
| MCP Agent | mcp | | beginner | Shows how to use a LiveKit Agent as an MCP client. | |
This example shows how to connect a LiveKit voice agent to an MCP (Model Context Protocol) server. MCP allows the agent to access external tools and data sources. In this case, the agent connects to a local Codex instance via stdio, enabling voice-based interaction with the Codex coding assistant.
- Add a `.env` in this directory with your LiveKit credentials:

  ```
  LIVEKIT_URL=your_livekit_url
  LIVEKIT_API_KEY=your_api_key
  LIVEKIT_API_SECRET=your_api_secret
  ```

- Install dependencies (the turn detector plugin is imported by the example, so include its extra):

  ```bash
  pip install "livekit-agents[silero,deepgram,openai,cartesia,turn-detector]" python-dotenv
  ```

- Have an MCP server available (this example uses Codex); a quick preflight check is sketched after this list.
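Since the agent spawns the `codex` executable as a subprocess to act as its MCP server, it can help to confirm the CLI is actually on your PATH before starting. A minimal preflight sketch using only the standard library (not part of the example code):

```python
import shutil

# Verify the `codex` CLI that the agent will spawn as its MCP server is installed.
codex_path = shutil.which("codex")
if codex_path is None:
    raise SystemExit("`codex` not found on PATH; install Codex before running this agent")
print(f"codex found at {codex_path}")
```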
Load environment variables and configure logging. Create an AgentServer to manage the agent lifecycle.
```python
import logging

from dotenv import load_dotenv
from livekit.agents import AgentServer, AgentSession, JobContext, JobProcess, cli, Agent, inference, mcp
from livekit.plugins import silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("mcp-agent")

load_dotenv()

server = AgentServer()
```

Preload the VAD model once per process. This runs before any sessions start and stores the VAD instance in `proc.userdata` so it can be reused.
```python
def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()

server.setup_fnc = prewarm
```

Keep the Agent lightweight with just instructions. The MCP server provides the tools, so the agent doesn't need to define any function tools itself. The instructions explain how to interact with the MCP server.
```python
class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                """
                You can retrieve data via the MCP server. The interface is voice-based:
                accept spoken user queries and respond with synthesized speech.
                The MCP server is a codex instance running on the local machine.
                When you call the codex MCP server, you should use the following parameters:
                - approval-policy: never
                - sandbox: workspace-write
                - prompt: [user_prompt_goes_here]
                """
            ),
        )

    async def on_enter(self):
        self.session.generate_reply()
```

Create the AgentSession with STT, LLM, TTS, VAD, and the MCP server configuration. The `mcp_servers` parameter accepts a list of MCP server connections; here we use `MCPServerStdio` to connect to a local Codex process.
```python
@server.rtc_session()
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        stt=inference.STT(model="deepgram/nova-3-general"),
        llm=inference.LLM(model="openai/gpt-4.1-mini"),
        tts=inference.TTS(model="cartesia/sonic-2", voice="6f84f4b8-58a2-430c-8c79-688dad597532"),
        vad=ctx.proc.userdata["vad"],
        turn_detection=MultilingualModel(),
        mcp_servers=[mcp.MCPServerStdio(command="codex", args=["mcp"], client_session_timeout_seconds=600000)],
        preemptive_generation=True,
    )

    agent = MyAgent()

    await session.start(agent=agent, room=ctx.room)
    await ctx.connect()
```

The `cli.run_app()` function starts the agent server, manages the worker lifecycle, and processes incoming jobs.
```python
if __name__ == "__main__":
    cli.run_app(server)
```

Run the agent using the console command for local testing:
```bash
python stdio_mcp_client.py console
```

To test with a real LiveKit room, use dev mode:
```bash
python stdio_mcp_client.py dev
```

- The agent connects to the MCP server (Codex) via stdio when the session starts.
- The MCP server exposes tools that the LLM can call.
- When users speak, their requests are transcribed and sent to the LLM.
- The LLM can invoke MCP tools to perform actions like code generation or file operations.
- Tool results are incorporated into the response and spoken back to the user.
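The `mcp_servers` list is not limited to stdio processes. As a hedged sketch (assuming your livekit-agents version ships `mcp.MCPServerHTTP`, and using a placeholder URL that is not part of this example), the Codex stdio connection could be swapped for a remote MCP server reached over HTTP:

```python
from livekit.agents import mcp

# Hypothetical alternative to the stdio Codex connection: talk to a remote MCP
# server over HTTP instead of spawning a local process. The class name and URL
# are assumptions; check the mcp module of your livekit-agents version.
mcp_servers = [
    mcp.MCPServerHTTP(url="http://localhost:8000/mcp"),
]
```

This list would be passed as the `mcp_servers` argument of `AgentSession`, exactly as in the entrypoint above. The complete code for `stdio_mcp_client.py` follows.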
```python
import logging

from dotenv import load_dotenv
from livekit.agents import AgentServer, AgentSession, JobContext, JobProcess, cli, Agent, inference, mcp
from livekit.plugins import silero
from livekit.plugins.turn_detector.multilingual import MultilingualModel

logger = logging.getLogger("mcp-agent")

load_dotenv()


class MyAgent(Agent):
    def __init__(self) -> None:
        super().__init__(
            instructions=(
                """
                You can retrieve data via the MCP server. The interface is voice-based:
                accept spoken user queries and respond with synthesized speech.
                The MCP server is a codex instance running on the local machine.
                When you call the codex MCP server, you should use the following parameters:
                - approval-policy: never
                - sandbox: workspace-write
                - prompt: [user_prompt_goes_here]
                """
            ),
        )

    async def on_enter(self):
        self.session.generate_reply()


server = AgentServer()


def prewarm(proc: JobProcess):
    proc.userdata["vad"] = silero.VAD.load()


server.setup_fnc = prewarm


@server.rtc_session()
async def entrypoint(ctx: JobContext):
    ctx.log_context_fields = {"room": ctx.room.name}

    session = AgentSession(
        stt=inference.STT(model="deepgram/nova-3-general"),
        llm=inference.LLM(model="openai/gpt-4.1-mini"),
        tts=inference.TTS(model="cartesia/sonic-2", voice="6f84f4b8-58a2-430c-8c79-688dad597532"),
        vad=ctx.proc.userdata["vad"],
        turn_detection=MultilingualModel(),
        mcp_servers=[mcp.MCPServerStdio(command="codex", args=["mcp"], client_session_timeout_seconds=600000)],
        preemptive_generation=True,
    )

    agent = MyAgent()

    await session.start(agent=agent, room=ctx.room)
    await ctx.connect()


if __name__ == "__main__":
    cli.run_app(server)
```