Developers

Build with BrainstormRouter.

Orchestration SDKs, a session loop API, 117 MCP tools, an OpenAI- and Anthropic-compatible proxy, and full machine-readable discovery. Swap one base_url and leave the rest of your code unchanged.

Create an API key · Read the docs

SDKs

Drop-in OpenAI replacement.

Works with LangChain, LlamaIndex, CrewAI, and the Vercel AI SDK. Change your base URL and key; everything else keeps working.
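As a rough illustration of the base-URL swap, the sketch below points the official `openai` Python package at BrainstormRouter. The base URL is inferred from the REST example on this page, and `BRAINSTORMROUTER_API_KEY` is a hypothetical environment variable name; treat both as assumptions and check the docs before relying on them.

```python
import os

# Assumption: base URL inferred from the curl example (.../v1/chat/completions).
BASE_URL = "https://api.brainstormrouter.com/v1"


def client_kwargs(api_key=None):
    """Return the two settings that change when moving off api.openai.com."""
    # BRAINSTORMROUTER_API_KEY is a hypothetical variable name.
    key = api_key or os.environ.get("BRAINSTORMROUTER_API_KEY", "brk_...")
    return {"base_url": BASE_URL, "api_key": key}


def make_openai_client(api_key=None):
    """Build an OpenAI SDK client that talks to the router instead of OpenAI."""
    # Imported lazily so this sketch loads even without the openai package.
    from openai import OpenAI

    return OpenAI(**client_kwargs(api_key))


def say_hello(client):
    """Same call shape as against OpenAI; only the model string is router-specific."""
    return client.chat.completions.create(
        model="auto",
        messages=[{"role": "user", "content": "Hello"}],
    )
```

Existing code that already builds an `OpenAI(...)` client should only need these two constructor arguments changed.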

TypeScript SDK ESM · CJS

npm →

# install
npm install brainstormrouter

# use
import BrainstormRouter from "brainstormrouter";

const client = new BrainstormRouter({
  apiKey: "brk_...",
});

const res = await client.chat.completions.create({
  model: "auto",
  messages: [{ role: "user", content: "Hello" }],
});

Python SDK sync · async

PyPI →

# install
pip install brainstormrouter

# use
from brainstormrouter import BrainstormRouter

client = BrainstormRouter(api_key="brk_...")

res = client.chat.completions.create(
    model="auto",
    messages=[{"role": "user",
               "content": "Hello"}],
)

REST API OpenAI- & Anthropic-compatible

OpenAPI spec →

curl -X POST https://api.brainstormrouter.com/v1/chat/completions \
  -H "Authorization: Bearer brk_..." \
  -H "Content-Type: application/json" \
  -d '{
    "model": "auto",
    "messages": [{"role":"user","content":"Hello"}],
    "stream": true
  }'
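For readers without an SDK, here is a minimal standard-library sketch of the same call. It assumes the stream follows the OpenAI server-sent-events convention (`data: {json}` lines terminated by `data: [DONE]`), which "OpenAI-compatible" suggests but this page does not spell out. The helper names are illustrative.

```python
import json
import urllib.request

API_URL = "https://api.brainstormrouter.com/v1/chat/completions"


def build_request(api_key, prompt, model="auto", stream=True):
    """Build (but do not send) the same POST as the curl example."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def iter_text(lines):
    """Yield text deltas from OpenAI-style SSE lines (assumed wire format)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text
```

To send it, pass the request to `urllib.request.urlopen` and feed the response object to `iter_text`; the response iterates line by line.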

Auto-routing variants

Set `model` and let the router choose.

One of four string values per call. Thompson sampling handles the rest.

auto UCB1 + Gaussian posterior

General workloads — learns the optimal model per task type.

auto:fast Lowest p50 latency

Real-time UX, streaming interfaces, conversational agents.

auto:floor Cheapest above quality floor

Bulk processing, classification, triage.

auto:best Highest quality regardless of cost

Critical reasoning, code review, legal analysis.

Prometheus stress-test data: Thompson routed simple tasks to DeepSeek at $0.000005/req and complex tasks to Claude Opus at $0.000156/req — a 31× cost difference, chosen automatically.
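One way to wire the four variants into application code is a small selector like the sketch below. The precedence order (critical, then interactive, then bulk) is a hypothetical policy, not something the router prescribes. The last lines simply redo the arithmetic behind the 31× figure quoted above.

```python
from decimal import Decimal

# The four documented values for the `model` field.
VARIANTS = ("auto", "auto:fast", "auto:floor", "auto:best")


def choose_variant(critical=False, interactive=False, bulk=False):
    """Map coarse workload traits to a variant (hypothetical precedence)."""
    if critical:
        return "auto:best"   # highest quality regardless of cost
    if interactive:
        return "auto:fast"   # lowest p50 latency
    if bulk:
        return "auto:floor"  # cheapest above the quality floor
    return "auto"            # general workloads


# Per-request costs quoted in the stress-test note.
simple_cost = Decimal("0.000005")
complex_cost = Decimal("0.000156")
ratio = complex_cost / simple_cost  # 31.2, reported as roughly 31x
```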

Intelligence headers

Every response ships with a receipt.

Machine-readable metadata on every call. Drop them into dashboards, feed them to agents, store them for audit.

X-BR-Model-Requested:       auto
X-BR-Model-Resolved:        claude-sonnet-4-6
X-BR-Provider:              anthropic
X-BR-Route-Strategy:        thompson
X-BR-Estimated-Cost:        $0.00412
X-BR-Actual-Cost:           $0.00389
X-BR-Efficiency:            0.94
X-BR-Cache-Status:          MISS
X-BR-Guardian-Status:       ok
X-BR-Guardian-Overhead-Ms:  0.7
X-BR-Routing-Savings:       $0.00124
X-BR-Phases:                planning,execution
X-BR-Request-Id:            req_01HXXXXXX
X-BR-Tenant-Id:             tnt_XXXXXX
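To feed these headers into a dashboard or audit store, a client might parse them into a typed record. The sketch below uses only the header names shown above; the `Receipt` class and its field names are illustrative, and treating `HIT` as the cache-hit value is an assumption, since only `MISS` appears on this page.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Receipt:
    requested: str
    resolved: str
    provider: str
    estimated_cost: Decimal
    actual_cost: Decimal
    cache_hit: bool
    request_id: str


def _usd(value):
    """Parse a header value such as "$0.00412" into a Decimal."""
    return Decimal(value.strip().lstrip("$"))


def parse_receipt(headers):
    """Build a Receipt from a response-header mapping (names are case-insensitive)."""
    h = {k.lower(): v.strip() for k, v in headers.items()}
    return Receipt(
        requested=h["x-br-model-requested"],
        resolved=h["x-br-model-resolved"],
        provider=h["x-br-provider"],
        estimated_cost=_usd(h["x-br-estimated-cost"]),
        actual_cost=_usd(h["x-br-actual-cost"]),
        # Assumption: a cache hit is reported as "HIT".
        cache_hit=h.get("x-br-cache-status", "").upper() == "HIT",
        request_id=h["x-br-request-id"],
    )
```

Costs are kept as `Decimal` so that sums across many sub-cent requests do not accumulate float rounding error.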

Authorization plane

Delegation you can revoke.

Mint a capability grant ("this agent may use these models until expiry E, derived from this parent"), then present it on a request with one header. Grants are Ed25519-signed, chain-bound, and revocable with zero cache lag. Attenuate a narrower slice for a sub-agent: the chain stays monotone-narrowing (a child can never hold more than its parent) and every hop is evidenced.

POST /v1/grants                  // mint a root grant
POST /v1/grants/{id}/attenuate   // delegate a narrower child
POST /v1/grants/{id}/revoke      // cascade-revoke the subtree
POST /v1/grants/{id}/check       // signed allow/deny verdict (PDP)

X-BR-Grant-Id: grant_01HXXXXXX   // present on any completion
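The narrowing and cascade rules can be modeled locally in a few lines. This is an in-memory sketch of the semantics described above, not the service's implementation: the `Grant` class, its field names, and the error messages are all hypothetical, and signing is left out entirely.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(eq=False)
class Grant:
    grant_id: str
    models: frozenset
    expires_at: datetime
    parent: "Grant | None" = None
    revoked: bool = False
    children: list = field(default_factory=list)

    def attenuate(self, grant_id, models, expires_at):
        """Derive a child that is no broader than this grant."""
        models = frozenset(models)
        if not models <= self.models:
            raise ValueError("child may not add models")
        if expires_at > self.expires_at:
            raise ValueError("child may not outlive its parent")
        child = Grant(grant_id, models, expires_at, parent=self)
        self.children.append(child)
        return child

    def revoke(self):
        """Revoke this grant and, by cascade, its whole subtree."""
        self.revoked = True
        for child in self.children:
            child.revoke()

    def allows(self, model, now):
        """True only if every hop up the chain is live and this grant covers the model."""
        hop = self
        while hop is not None:
            if hop.revoked or now >= hop.expires_at:
                return False
            hop = hop.parent
        return model in self.models
```

Walking the parent chain on every check is what makes revocation of an ancestor take effect immediately for all descendants in this model.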

Plans vs Grants vs Delegation →

Evidence plane

Ask "why was I denied?" — and prove the answer.

Every decision and denial leaves committed lineage: the routing stages, the policy verdict, and the W3C trace identity, stored durably in Postgres and bound to the audit hash chain by a single hashed field. The WHY can be lost, but only visibly and with an honest coverage label; it is never silently rewritten or manufactured. Reconstruct a whole causal episode by trace id; every edge is labeled verified, claimed, or unverified based on its own committed fields.
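The idea of a tamper-evident hash chain can be sketched generically as follows. This is a plain SHA-256 chain over canonical JSON, written for illustration only; the hashing scheme, row fields, and genesis value are assumptions and do not describe BrainstormRouter's actual audit format.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical starting value for an empty chain


def row_hash(prev_hash, row):
    """Hash a row together with its predecessor's hash."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append(chain, row):
    """Append a row, linking it to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"row": row, "hash": row_hash(prev, row)})


def verify(chain):
    """Recompute every link; return the index of the first bad row, or None."""
    prev = GENESIS
    for i, entry in enumerate(chain):
        if entry["hash"] != row_hash(prev, entry["row"]):
            return i
        prev = entry["hash"]
    return None
```

Because each hash covers the previous one, editing any stored row changes what its hash should be, and recomputation flags it.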

GET /v1/governance/lineage/{request_id}   // the WHY answer + coverage
GET /v1/governance/lineage?trace_id=…     // reconstruct a causal episode
GET /v1/governance/audit/chain/verify     // chain + per-row provenance recompute

traceparent: 00-<trace>-<span>-01         // ingested, minted, echoed, propagated
X-BR-Request-Id: req_01HXXXXXX            // one canonical join key
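The `traceparent` value follows the W3C Trace Context layout: a 2-hex version, a 32-hex trace id, a 16-hex parent (span) id, and 2-hex flags. A minimal sketch of parsing and minting version-00 headers on the client side is shown below; the function names are illustrative, and how the router itself mints or propagates ids is not shown here.

```python
import re
import secrets

# Version-00 layout: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


def parse_traceparent(value):
    """Split a traceparent header into its fields, or return None if malformed."""
    m = _TRACEPARENT.match(value.strip())
    if not m:
        return None
    version, trace_id, parent_id, flags = m.groups()
    # The spec forbids version ff and all-zero trace or parent ids.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),
    }


def new_traceparent(sampled=True):
    """Mint a fresh root traceparent with random ids."""
    flags = "01" if sampled else "00"
    return f"00-{secrets.token_hex(16)}-{secrets.token_hex(8)}-{flags}"


def child_traceparent(parent_value):
    """Keep the trace id and flags, mint a new span id for the outgoing hop."""
    parsed = parse_traceparent(parent_value)
    if parsed is None:
        return new_traceparent()
    flags = "01" if parsed["sampled"] else "00"
    return f"00-{parsed['trace_id']}-{secrets.token_hex(8)}-{flags}"
```

Sending the same trace id on every request in an agent run is what lets a later `trace_id` lookup stitch the episode together.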

Committed Lineage →

MCP Server

117 governed tools on one connection.

Connect from Claude Desktop, Cursor, Windsurf, or any MCP client. One Streamable HTTP connection. RBAC per tool. Every call audited.

Endpoint:   POST https://api.brainstormrouter.com/v1/mcp/connect
Transport:  Streamable HTTP
Auth:       Bearer API key

• br_route_completion
• br_memory_store
• br_get_leaderboard
• br_replay_decisions
• br_kill_switch
• br_forensics

Browse all 117 tools in agents.json →
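MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method. The sketch below builds such a message and the POST that would carry it, without sending anything. The tool argument schemas are not documented on this page, so any `arguments` you pass are assumptions, and a real session would first perform the MCP `initialize` handshake, which is omitted here.

```python
import itertools
import json
import urllib.request

MCP_URL = "https://api.brainstormrouter.com/v1/mcp/connect"
_ids = itertools.count(1)  # JSON-RPC request ids


def tools_call(name, arguments):
    """Build a JSON-RPC 2.0 `tools/call` message as defined by MCP."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def as_request(message, api_key):
    """Wrap a message in a Streamable HTTP POST (built here, not sent)."""
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(message).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            # Streamable HTTP clients accept a JSON body or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
```

In practice an MCP client library handles this framing; the sketch is mainly useful for seeing what goes over the wire.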

Machine-Native Interface

AI clients discover this site themselves.

Machine-readable manifests so agents and coding assistants can find BrainstormRouter automatically.

• GET /v1/discovery — live capability discovery
• /llms.txt — concise AI-readable site map
• /llms-full.txt — complete API reference (all routes)
• /.well-known/ai-plugin.json — OpenAI plugin manifest
• /.well-known/agents.json — agent capability discovery
• /openapi.yaml — OpenAPI 3.1 spec
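An agent might probe these manifests in order and stop at the first one it can read. The sketch below shows that loop with an injected `fetch` callable so it stays network-free. Which host serves each path is not stated here, so the two base URLs and the probe order are assumptions.

```python
from urllib.parse import urljoin

# Assumption: the page does not say which host serves each path.
API_BASE = "https://api.brainstormrouter.com/"
SITE_BASE = "https://brainstormrouter.com/"

# Hypothetical probe order: live discovery first, then static manifests.
MANIFESTS = [
    (API_BASE, "v1/discovery"),
    (SITE_BASE, ".well-known/agents.json"),
    (SITE_BASE, ".well-known/ai-plugin.json"),
    (SITE_BASE, "openapi.yaml"),
    (SITE_BASE, "llms-full.txt"),
    (SITE_BASE, "llms.txt"),
]


def manifest_urls():
    """Return absolute URLs for every manifest, in probe order."""
    return [urljoin(base, path) for base, path in MANIFESTS]


def first_available(fetch):
    """Return (url, body) for the first manifest `fetch` can retrieve.

    `fetch` is any callable that takes a URL and returns bytes, or None on failure.
    """
    for url in manifest_urls():
        body = fetch(url)
        if body is not None:
            return url, body
    return None
```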

Error recovery

Every error is machine-actionable.

Errors include a recovery field. Agents read it programmatically to pick their next action — retry, downgrade, or call for help — without human intervention.

{
  "error": {
    "type": "budget_exceeded",
    "message": "Daily budget limit of $5.00 reached",
    "recovery": {
      "action": "wait_or_upgrade",
      "retry_after": "2026-04-21T00:00:00Z",
      "alternative": "Use auto:floor for lower-cost routing",
      "dashboard_url": "https://brainstormrouter.com/dashboard"
    }
  }
}
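A sketch of how an agent loop might act on that payload follows. Only the `wait_or_upgrade` action and the fields in the example above are taken from this page; the returned action names and the substring check on `alternative` are hypothetical heuristics for illustration.

```python
import json
from datetime import datetime, timezone


def next_step(error_body, now=None, can_downgrade=True):
    """Turn an error payload into an (action, detail) pair for an agent loop."""
    now = now or datetime.now(timezone.utc)
    recovery = json.loads(error_body)["error"].get("recovery", {})
    action = recovery.get("action")

    if action == "wait_or_upgrade":
        # Heuristic (assumption): downgrade if the hint mentions auto:floor.
        if can_downgrade and "auto:floor" in recovery.get("alternative", ""):
            return ("downgrade", "auto:floor")
        # Normalize the trailing "Z" for Python versions before 3.11.
        retry_at = datetime.fromisoformat(
            recovery["retry_after"].replace("Z", "+00:00")
        )
        return ("wait", max(0.0, (retry_at - now).total_seconds()))

    # Unknown or missing action: hand off, pointing at the dashboard if given.
    return ("escalate", recovery.get("dashboard_url"))
```

A caller would retry with the returned model string on `downgrade`, sleep for the returned number of seconds on `wait`, and surface the URL to an operator on `escalate`.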

Ship your first routed request.

Get an API key, swap your base_url, and watch the receipts roll in.

Create API key · Read the docs