
Claude Code CLI Setup Guide

Complete guide to using Claude Code CLI with Lynkr for provider flexibility, cost savings, and local model support.


Overview

Lynkr acts as a drop-in replacement for Anthropic's backend, enabling Claude Code CLI to work with any LLM provider (Databricks, Bedrock, OpenRouter, Ollama, etc.) while maintaining full compatibility with all Claude Code features.

Why Use Lynkr with Claude Code CLI?

  • 💰 60-80% cost savings through token optimization
  • 🔓 Provider choice - Use any of 12+ supported providers
  • 🏠 Self-hosted - Full control over your AI infrastructure
  • 🔒 Local option - Run 100% offline with Ollama or llama.cpp
  • Zero code changes - Drop-in replacement for Anthropic backend
  • 📊 Full observability - Logs, metrics, token tracking

Quick Setup (3 Minutes)

Step 1: Install Lynkr

# Option A: NPM (Recommended)
npm install -g lynkr

# Option B: Homebrew (macOS)
brew tap vishalveerareddy123/lynkr
brew install lynkr

# Option C: Git Clone
git clone https://github.com/vishalveerareddy123/Lynkr.git
cd Lynkr && npm install
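
If you want to confirm the install before continuing, both of these are standard checks (the npm check only applies to the npm install):

# Confirm the binary is on your PATH
which lynkr

# For the npm install, confirm the global package is present
npm list -g lynkr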

Step 2: Configure Provider

Choose your provider and configure credentials:

Option A: AWS Bedrock (100+ models)

export MODEL_PROVIDER=bedrock
export AWS_BEDROCK_API_KEY=your-bearer-token
export AWS_BEDROCK_REGION=us-east-1
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0

Option B: Ollama (100% Local, FREE)

# Start Ollama first
ollama serve
ollama pull llama3.1:8b

export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=llama3.1:8b

Option C: OpenRouter (Simplest Cloud)

export MODEL_PROVIDER=openrouter
export OPENROUTER_API_KEY=sk-or-v1-your-key
export OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

Option D: Databricks (Enterprise)

export MODEL_PROVIDER=databricks
export DATABRICKS_API_BASE=https://your-workspace.databricks.com
export DATABRICKS_API_KEY=dapi1234567890abcdef

See Provider Configuration Guide for all 12+ providers.
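
Lynkr also reads variables from a .env file (see Enable Debug Logging below), which saves re-exporting them in every new shell. A minimal sketch for the Ollama option, assuming the file sits in the directory you start Lynkr from:

# .env (assumed location: the directory where you run `lynkr start`)
MODEL_PROVIDER=ollama
OLLAMA_MODEL=llama3.1:8b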


Step 3: Start Lynkr

lynkr start
# Or: npm start (if installed from source)

# Wait for: "Server listening at http://0.0.0.0:8081"

Step 4: Configure Claude Code CLI

Point Claude Code CLI to Lynkr instead of Anthropic:

# Set Lynkr as backend
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=dummy  # Required by CLI, but ignored by Lynkr

# Verify configuration
echo $ANTHROPIC_BASE_URL
# Should show: http://localhost:8081
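
Before moving on, you can also confirm that the CLI will be able to reach Lynkr at that address (the health endpoint is covered in more detail under Verification & Testing):

curl -s "$ANTHROPIC_BASE_URL/health/live"
# Should return: {"status":"ok",...}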

Step 5: Test It

# Simple test
claude "What is 2+2?"

# Should return a response from your configured provider ✅

# File operation test
claude "List files in current directory"

# Should use Read/Bash tools ✅

Configuration Options

Environment Variables

Core Variables:

# Lynkr backend URL (required)
export ANTHROPIC_BASE_URL=http://localhost:8081

# API key (required by CLI, but ignored by Lynkr)
export ANTHROPIC_API_KEY=dummy

# Workspace directory (optional, defaults to current directory)
export WORKSPACE_ROOT=/path/to/your/projects

Make Permanent (Optional):

Add to ~/.bashrc, ~/.zshrc, or ~/.profile:

# Add these lines to your shell config
export ANTHROPIC_BASE_URL=http://localhost:8081
export ANTHROPIC_API_KEY=dummy

Then reload:

source ~/.bashrc  # or ~/.zshrc

Feature Compatibility

Fully Supported Features

All Claude Code CLI features work through Lynkr:

| Feature | Status | Notes |
| --- | --- | --- |
| Chat conversations | ✅ Works | Full streaming support |
| File operations | ✅ Works | Read, Write, Edit tools |
| Bash commands | ✅ Works | Execute shell commands |
| Git operations | ✅ Works | Status, diff, commit, push |
| Tool calling | ✅ Works | All standard Claude Code tools |
| Streaming responses | ✅ Works | Real-time token streaming |
| Multi-turn conversations | ✅ Works | Full context retention |
| Code generation | ✅ Works | Works with all providers |
| Error handling | ✅ Works | Automatic retries, fallbacks |
| Token counting | ✅ Works | Accurate usage tracking |

Tool Execution Modes

Lynkr supports two tool execution modes:

Server Mode (Default)

# Tools execute on Lynkr server
export TOOL_EXECUTION_MODE=server

  • Tools run on the machine running Lynkr
  • Good for: Standalone proxy, shared team server
  • File operations access server filesystem

Client Mode (Passthrough)

# Tools execute on Claude Code CLI side
export TOOL_EXECUTION_MODE=client

  • Tools run on your local machine (where you run claude)
  • Good for: Local development, accessing local files
  • Full integration with local environment
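
For example, a team can point several laptops at one shared Lynkr instance while keeping file access local; lynkr-host below is a placeholder hostname, not a real default:

# On the shared server (placeholder hostname: lynkr-host)
export TOOL_EXECUTION_MODE=client
lynkr start

# On each developer laptop
export ANTHROPIC_BASE_URL=http://lynkr-host:8081
export ANTHROPIC_API_KEY=dummy
claude "Summarize README.md"  # the file is read on the laptop, not the server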

Usage Examples

Basic Chat

# Simple question
claude "Explain async/await in JavaScript"

# Code explanation
claude "Explain this function" < app.js

# Multi-line prompt
claude "Write a function that:
- Takes an array of numbers
- Filters out even numbers
- Returns the sum of odd numbers"

File Operations

# Read file
claude "What does this file do?" < src/server.js

# Create file
claude "Create a new Express server in server.js"

# Edit file
claude "Add error handling to src/api/router.js"

# Multiple files
claude "Refactor authentication across src/auth/*.js files"

Git Workflow

# Status check
claude "What files have changed?"

# Review diff
claude "Review my changes and suggest improvements"

# Commit changes
claude "Commit these changes with a descriptive message"

# Create PR (if gh CLI installed)
claude "Create a pull request for these changes"

Code Generation

# Generate function
claude "Write a binary search function in Python"

# Generate tests
claude "Write unit tests for utils/validation.js"

# Generate documentation
claude "Add JSDoc comments to this file" < src/helpers.js

Provider-Specific Considerations

AWS Bedrock

Best for: AWS ecosystem, 100+ models

export MODEL_PROVIDER=bedrock
export AWS_BEDROCK_MODEL_ID=anthropic.claude-3-5-sonnet-20241022-v2:0

Considerations:

  • ✅ Tool calling works (Claude models only)
  • ✅ Streaming supported
  • ⚠️ Non-Claude models don't support tools

Ollama (Local)

Best for: Privacy, offline work, zero costs

export MODEL_PROVIDER=ollama
export OLLAMA_MODEL=llama3.1:8b

Considerations:

  • ✅ 100% FREE, runs locally
  • ✅ Tool calling supported (llama3.1, llama3.2, qwen2.5, mistral)
  • ⚠️ Smaller models may struggle with complex tool usage
  • 💡 Use qwen2.5:14b for better tool calling

Recommended models:

  • llama3.1:8b - Good balance
  • qwen2.5:14b - Better reasoning (the 7b variant struggles with tool calling)
  • mistral:7b-instruct - Fast and capable
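
Before pointing Lynkr at a local model, it is worth pulling it and smoke-testing instruction following directly in Ollama; models that handle strict output instructions tend to do better at tool calling:

ollama pull qwen2.5:14b
ollama run qwen2.5:14b 'Reply with exactly: OK'
# A clean, literal "OK" is a good sign for tool-calling reliability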

OpenRouter

Best for: Simplicity, flexibility, 100+ models

export MODEL_PROVIDER=openrouter
export OPENROUTER_MODEL=anthropic/claude-3.5-sonnet

Considerations:

  • ✅ 100+ models available
  • ✅ Excellent tool calling support
  • ✅ Automatic fallbacks
  • 💰 Competitive pricing

Databricks

Best for: Enterprise production, Claude 4.5

export MODEL_PROVIDER=databricks

Considerations:

  • ✅ Claude Sonnet 4.5, Opus 4.5
  • ✅ Enterprise SLA
  • ✅ Excellent tool calling
  • 💰 Enterprise pricing

Tier-Based Routing (Cost Optimization)

Use local Ollama for simple tasks, cloud for complex ones:

# Configure tier-based routing (set all 4 to enable)
export TIER_SIMPLE=ollama:llama3.2
export TIER_MEDIUM=openrouter:openai/gpt-4o-mini
export TIER_COMPLEX=databricks:databricks-claude-sonnet-4-5
export TIER_REASONING=databricks:databricks-claude-sonnet-4-5
export FALLBACK_ENABLED=true
export FALLBACK_PROVIDER=databricks
export DATABRICKS_API_BASE=https://your-workspace.databricks.com
export DATABRICKS_API_KEY=your-key

# Start Lynkr
lynkr start

How it works (see the sketch after this list):

  • Each request is scored for complexity (0-100) and mapped to a tier
  • SIMPLE (0-25): Ollama (free, local, fast)
  • MEDIUM (26-50): OpenRouter (affordable cloud)
  • COMPLEX (51-75): Databricks (most capable)
  • REASONING (76-100): Databricks (best available)
  • Provider failures: Automatic transparent fallback to cloud
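
The scoring itself happens inside Lynkr, but the documented score-to-tier bands are easy to mirror; here is a small illustrative bash sketch of that mapping (not Lynkr's actual code):

tier_for_score() {
  # Mirrors the documented 0-100 complexity bands
  local score=$1
  if   [ "$score" -le 25 ]; then echo "SIMPLE    -> $TIER_SIMPLE"
  elif [ "$score" -le 50 ]; then echo "MEDIUM    -> $TIER_MEDIUM"
  elif [ "$score" -le 75 ]; then echo "COMPLEX   -> $TIER_COMPLEX"
  else                            echo "REASONING -> $TIER_REASONING"
  fi
}

tier_for_score 18  # SIMPLE    -> ollama:llama3.2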

Cost savings:

  • 65-100% for requests routed to local models
  • 40-87% faster for simple requests

Verification & Testing

Check Lynkr Health

curl http://localhost:8081/health/live

# Expected response:
{
  "status": "ok",
  "provider": "bedrock",
  "timestamp": "2026-01-12T00:00:00.000Z"
}
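
With jq installed, you can extract just the provider field to confirm the expected backend is active:

curl -s http://localhost:8081/health/live | jq -r .provider
# bedrock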

Test API Endpoint

curl http://localhost:8081/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'

# Should return a Claude-compatible response from your provider
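
Streaming can be exercised the same way. Assuming Lynkr mirrors the Anthropic Messages API's stream flag (Claude Code's streaming depends on it), you should see server-sent events instead of a single JSON body:

curl -N http://localhost:8081/v1/messages \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-3-5-sonnet-20241022",
    "max_tokens": 256,
    "stream": true,
    "messages": [{"role": "user", "content": "Count to three"}]
  }'

# Expect event: lines such as message_start and content_block_delta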

Test Claude CLI

# Simple test
claude "Hello, can you see this?"

# Tool calling test
claude "What files are in the current directory?"

# Should use Read/Bash tools and return results

Troubleshooting

Connection Refused

Symptoms: Connection refused or ECONNREFUSED

Solutions:

  1. Verify Lynkr is running:

    lsof -i :8081
    # Should show node process
  2. Check URL configuration:

    echo $ANTHROPIC_BASE_URL
    # Should be: http://localhost:8081
  3. Test health endpoint:

    curl http://localhost:8081/health/live
    # Should return: {"status":"ok"}

Provider Authentication Errors

Symptoms: 401 Unauthorized or 403 Forbidden

Solutions:

  1. Check provider credentials:

    # For Bedrock
    echo $AWS_BEDROCK_API_KEY
    
    # For Databricks
    echo $DATABRICKS_API_KEY
    
    # For OpenRouter
    echo $OPENROUTER_API_KEY
  2. Verify credentials are valid:

    • Bedrock: Check AWS Console → Bedrock → API Keys
    • Databricks: Check workspace → Settings → User Settings → Tokens
    • OpenRouter: Check openrouter.ai/keys
  3. Check Lynkr logs:

    # In the Lynkr terminal, look for authentication errors

Tool Execution Errors

Symptoms: Tools fail to execute or return errors

Solutions:

  1. Check tool execution mode:

    echo $TOOL_EXECUTION_MODE
    # Should be: server (default) or client
  2. Verify workspace root:

    echo $WORKSPACE_ROOT
    # Should be valid directory path
  3. Check file permissions:

    # For server mode, Lynkr needs read/write access
    ls -la $WORKSPACE_ROOT
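
If the listing shows the wrong owner, granting your user ownership of the workspace is usually enough (adjust to your environment):

# Give your user read/write access to the workspace (server mode)
sudo chown -R "$(whoami)" "$WORKSPACE_ROOT"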

Model Not Found

Symptoms: Model not found or Invalid model

Solutions:

  1. Verify model is available:

    # For Ollama
    ollama list
    # Should show your configured model
    
    # For Bedrock
    # Check AWS Console → Bedrock → Model access
  2. Check model name matches provider:

    • Bedrock: Use full model ID (e.g., anthropic.claude-3-5-sonnet-20241022-v2:0)
    • Ollama: Use exact model name (e.g., llama3.1:8b)
    • OpenRouter: Use provider prefix (e.g., anthropic/claude-3.5-sonnet)

Slow Responses

Symptoms: Responses take 5+ seconds

Solutions:

  1. Check provider latency:

    • Local (Ollama): Should be 100-500ms
    • Cloud: Should be 500ms-2s
  2. Enable tier-based routing:

    # Set all 4 TIER_* env vars to enable tier-based routing
    export TIER_SIMPLE=ollama:llama3.2
    export TIER_MEDIUM=openrouter:openai/gpt-4o-mini
    export TIER_COMPLEX=azure-openai:gpt-4o
    export TIER_REASONING=azure-openai:gpt-4o
    export FALLBACK_ENABLED=true
  3. Check Lynkr logs for actual response times


Enable Debug Logging

For detailed troubleshooting:

# In .env or export
export LOG_LEVEL=debug

# Restart Lynkr
lynkr start

# Check logs for detailed request/response info

Advanced Configuration

Custom Port

# Change Lynkr port
export PORT=8082

# Update Claude CLI configuration
export ANTHROPIC_BASE_URL=http://localhost:8082

Custom Workspace Root

# Set specific workspace directory
export WORKSPACE_ROOT=/path/to/your/projects

# Claude CLI will use this as base directory for file operations

Tool Execution Policies

# Allow git push (default: disabled)
export POLICY_GIT_ALLOW_PUSH=true

# Require tests before commit (default: disabled)
export POLICY_GIT_REQUIRE_TESTS=true

# Custom test command
export POLICY_GIT_TEST_COMMAND="npm test"

Memory System

# Enable long-term memory (default: enabled)
export MEMORY_ENABLED=true

# Memories to inject per request
export MEMORY_RETRIEVAL_LIMIT=5

# Surprise threshold (0.0-1.0)
export MEMORY_SURPRISE_THRESHOLD=0.3

See Memory System Guide for details.


Cost Comparison

Scenario: 100,000 requests/month, average 50k input tokens, 2k output tokens

| Provider | Without Lynkr | With Lynkr (60% savings) | Monthly Savings |
| --- | --- | --- | --- |
| Claude Sonnet 4.5 (via Databricks) | $16,000 | $6,400 | $9,600 |
| GPT-4o (via OpenRouter) | $12,000 | $4,800 | $7,200 |
| Ollama (Local) | API costs + compute | Local compute only | $12,000+ |
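
As a quick sanity check on the table, the "With Lynkr" column is simply the baseline with the claimed 60% optimization applied:

awk 'BEGIN { printf "%.0f\n", 16000 * (1 - 0.60) }'  # 6400
awk 'BEGIN { printf "%.0f\n", 12000 * (1 - 0.60) }'  # 4800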

Token optimization includes:

  • Smart tool selection (50-70% reduction for simple queries)
  • Prompt caching (30-45% reduction for repeated prompts)
  • Memory deduplication (20-30% reduction for long conversations)
  • Tool truncation (15-25% reduction for tool responses)

See Token Optimization Guide for details.


Architecture

Claude Code CLI
    ↓ Anthropic API format
Lynkr Proxy (localhost:8081)
    ↓ Format conversion
Your Provider (Databricks/Bedrock/OpenRouter/Ollama/etc.)
    ↓ Returns response
Lynkr Proxy
    ↓ Format conversion back
Claude Code CLI (displays result)

Next Steps


Getting Help