Open-source inference server and production cluster for embeddings, reranking, and extraction.
85+ models. Three functions. From laptop to Kubernetes. All Apache 2.0.
Docs · Quickstart · API Reference · Models
SIE is an open-source inference engine that serves embeddings, reranking, and entity extraction through a single unified API. It replaces the patchwork of separate model servers with one system that handles 85+ models across dense, sparse, multi-vector, vision, and cross-encoder architectures.
- Three functions (`encode`, `score`, `extract`) cover the entire embedding, reranking, and extraction pipeline
- 85+ pre-configured models, hot-swappable, all quality-verified against MTEB in CI
- Serves multiple models simultaneously with on-demand loading and LRU eviction
- Ships the full production stack: load-balancing router, KEDA autoscaling, Grafana dashboards, Terraform for GKE/EKS
- Integrates with LangChain, LlamaIndex, Haystack, DSPy, CrewAI, Chroma, Qdrant, and Weaviate
- OpenAI-compatible `/v1/embeddings` endpoint for drop-in migration
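Drop-in migration means existing OpenAI embedding calls only need a new base URL. A minimal sketch, using just the standard library to build such a request (the model id comes from SIE's catalog; the body is the standard OpenAI embeddings payload):

```python
import json
import urllib.request

# Assumes a local sie-server on its default port 8080.
SIE_BASE = "http://localhost:8080"

def embeddings_request(model: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST to SIE's OpenAI-compatible /v1/embeddings endpoint.

    The payload is the same one the OpenAI API accepts, so an existing
    client should work once it is pointed at the SIE base URL.
    """
    payload = json.dumps({"model": model, "input": texts}).encode()
    return urllib.request.Request(
        f"{SIE_BASE}/v1/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = embeddings_request("NovaSearch/stella_en_400M_v5", ["Hello world"])
print(req.full_url)  # http://localhost:8080/v1/embeddings
# Against a running server, urllib.request.urlopen(req) should return the
# familiar {"data": [{"embedding": [...]}]} response shape.
```

Equivalently, the official `openai` client can be pointed at the server via its `base_url` parameter, with no other code changes.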
Or try it in your browser, no install needed.
1. Start the server

```shell
pip install sie-server
sie-server serve   # auto-detects CUDA / Apple Silicon / CPU
```

Or with Docker:

```shell
docker run -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cpu-default                # CPU
docker run --gpus all -p 8080:8080 ghcr.io/superlinked/sie-server:latest-cuda12-default  # GPU
```

2. Install the SDK and go

```shell
pip install sie-sdk
```

The entire API is three functions: `encode`, `score`, `extract`.
```python
from sie_sdk import SIEClient
from sie_sdk.types import Item

client = SIEClient("http://localhost:8080")

# Encode: dense embeddings, 400M-parameter model
result = client.encode("NovaSearch/stella_en_400M_v5", Item(text="Hello world"))
print(result["dense"].shape)  # (1024,)

# Score: rerank documents by relevance
scores = client.score(
    "BAAI/bge-reranker-v2-m3",
    Item(text="What is machine learning?"),
    [Item(text="ML learns from data."), Item(text="The weather is sunny.")]
)
print(scores["scores"])
# [{'item_id': 'item-0', 'score': 0.998, 'rank': 0},
#  {'item_id': 'item-1', 'score': 0.012, 'rank': 1}]

# Extract: zero-shot named entity recognition, no training data
result = client.extract(
    "urchade/gliner_multi-v2.1",
    Item(text="Tim Cook is the CEO of Apple."),
    labels=["person", "organization"]
)
print(result["entities"])
# [{'text': 'Tim Cook', 'label': 'person', 'score': 0.96},
#  {'text': 'Apple', 'label': 'organization', 'score': 0.91}]
```

TypeScript: `pnpm add @sie/sdk` — TypeScript docs ->
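Once `encode` hands back dense vectors, downstream similarity search is plain linear algebra. A small NumPy sketch with stand-in vectors (real ones would come from `result["dense"]`):

```python
import numpy as np

def rank_by_cosine(query: np.ndarray, docs: np.ndarray) -> list[int]:
    """Return document indices sorted by cosine similarity to the query, best first."""
    q = query / np.linalg.norm(query)
    d = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    return list(np.argsort(-(d @ q)))

# Stand-in 4-d vectors; in practice these are 1024-d model outputs.
query = np.array([1.0, 0.0, 0.0, 0.0])
docs = np.array([
    [0.9, 0.1, 0.0, 0.0],   # nearly parallel to the query
    [0.0, 1.0, 0.0, 0.0],   # orthogonal to the query
])
print(rank_by_cosine(query, docs))  # [0, 1]
```

For the final ordering of a shortlist, the cross-encoder `score` call above is usually more accurate than raw cosine distance; the common pattern is embed-and-retrieve first, rerank second.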
Full quickstart guide -> · SDK reference ->
The same code works against a production cluster. SIE ships a load-balancing router, KEDA autoscaling (scale to zero), Grafana dashboards, and Terraform modules for GKE and EKS. Not just the server, the whole stack. All Apache 2.0.
```shell
helm upgrade --install sie-cluster deploy/helm/sie-cluster \
  --namespace sie --create-namespace \
  --set hfToken.create=true \
  --set hfToken.value=<TOKEN> \
  -f deploy/helm/sie-cluster/values-{gke|aws}.yaml
```

85+ models — Stella v5 · BGE-M3 · SPLADE v3 · SigLIP · ColQwen2.5 · BGE-reranker · GLiNER · Florence-2 · and more ->
Dense, sparse, multi-vector, vision, rerankers, extractors. All pre-configured. All quality-verified against MTEB in CI.
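Unlike dense vectors, sparse models such as SPLADE represent text as token-weight maps, and relevance between two of them is a dot product over shared tokens. A minimal sketch (the dict shape here is illustrative, not SIE's exact return type):

```python
def sparse_dot(a: dict[str, float], b: dict[str, float]) -> float:
    """Score two sparse token-weight maps by summing products over shared tokens."""
    if len(b) < len(a):          # iterate over the smaller map
        a, b = b, a
    return sum(w * b[t] for t, w in a.items() if t in b)

query = {"machine": 1.2, "learning": 0.9}
doc = {"machine": 0.8, "learning": 1.1, "weather": 0.4}
print(round(sparse_dot(query, doc), 2))  # 1.95
```

Because only overlapping tokens contribute, sparse scores stay interpretable: each matched term's contribution can be read directly off the two weight maps.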
Integrations — LangChain · LlamaIndex · Haystack · DSPy · CrewAI · Chroma · Qdrant · Weaviate
Notebooks — Quickstarts and walkthroughs
Examples — End-to-end project gallery
sie.dev/docs · Apache 2.0