Cryptographic Agency Firewall
SPIFFE-compatible identities with 5-minute ephemeral certificates. Every agent signs in; every request carries a provable origin. Internal CA, RSA-signed, mutual TLS verification, ALB passthrough for AWS.
Every request passes through adaptive routing, persistent memory, cryptographic identity, streaming security, agent governance, and cost intelligence — in under 5ms of overhead. Nothing to configure to start; full control when you need it.
Short-lived cryptographic identity, scoped provisioning, and a kill switch that takes effect in 300ms end-to-end.
Virtual corporate cards for AI. Budget, quality floor, lifecycle state. 5-state machine: provisioned → active → quarantined → suspended → terminated. Auto-downgrade at 80%, quarantine at 95%.
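As a rough illustration of the lifecycle above, here is a minimal sketch of a budgeted agent card. It assumes transitions only move forward in the listed order and that "auto-downgrade" means flagging the card so a cheaper model gets used. The names AgentCard, record_spend, and downgraded are hypothetical and are not the product's API.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    PROVISIONED = "provisioned"
    ACTIVE = "active"
    QUARANTINED = "quarantined"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


# Assumption: forward-only transitions, following the arrow order in the text.
ALLOWED = {
    State.PROVISIONED: {State.ACTIVE},
    State.ACTIVE: {State.QUARANTINED},
    State.QUARANTINED: {State.SUSPENDED},
    State.SUSPENDED: {State.TERMINATED},
    State.TERMINATED: set(),
}

DOWNGRADE_AT = 0.80   # auto-downgrade threshold from the text
QUARANTINE_AT = 0.95  # quarantine threshold from the text


@dataclass
class AgentCard:
    budget_usd: float
    spent_usd: float = 0.0
    state: State = State.PROVISIONED
    downgraded: bool = False  # hypothetical flag: route to a cheaper model

    def transition(self, target: State) -> None:
        if target not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {target.value}")
        self.state = target

    def record_spend(self, amount_usd: float) -> None:
        # Only active cards may spend.
        if self.state is not State.ACTIVE:
            raise RuntimeError(f"card is {self.state.value}; spend rejected")
        self.spent_usd += amount_usd
        used = self.spent_usd / self.budget_usd
        if used >= QUARANTINE_AT:
            self.transition(State.QUARANTINED)
        elif used >= DOWNGRADE_AT:
            self.downgraded = True
```

A card with a 100 USD budget that spends 85 USD stays active but is flagged for downgrade; a further 11 USD pushes it past 95% and into quarantine.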
Agents provision sub-agents with scoped permissions and budgets. Three trust tiers (minimal / standard / elevated) graduate based on operational history. 5 auth methods, including mTLS and SCIM.
Provider keys stay encrypted at rest with optional BYOK envelope encryption. Five auth methods cover every caller: human, SDK, mesh agent, identity provider.
AES-256-GCM encrypted storage for provider API keys. Each key carries its own daily budget ceiling. Zero-downtime rotation: the new key activates while the old drains in-flight requests.
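The drain-on-rotate behavior can be pictured with a small in-flight counter. This is only a sketch under stated assumptions: the RotatingKey class and its methods are hypothetical, encryption and budget ceilings are left out, and the real system may track in-flight requests quite differently.

```python
class RotatingKey:
    """Tracks which provider key is active and which old keys are draining."""

    def __init__(self, key_id: str):
        self.active = key_id
        self.in_flight = {key_id: 0}  # key id -> number of open requests
        self.draining = set()         # old keys still finishing requests

    def acquire(self) -> str:
        # New requests always get the currently active key.
        self.in_flight[self.active] += 1
        return self.active

    def release(self, key_id: str) -> None:
        # Called when a request finishes; retire a drained key at zero.
        self.in_flight[key_id] -= 1
        if key_id in self.draining and self.in_flight[key_id] == 0:
            self.draining.discard(key_id)
            del self.in_flight[key_id]

    def rotate(self, new_id: str) -> None:
        # The new key takes over immediately; the old key is kept only
        # while it still has requests in flight.
        old = self.active
        if new_id == old:
            return
        self.active = new_id
        self.in_flight.setdefault(new_id, 0)
        if self.in_flight[old] == 0:
            del self.in_flight[old]
        else:
            self.draining.add(old)
```

A request that acquired the old key before rotate() keeps using it until release(); every request acquired afterwards sees the new key, so no caller is interrupted.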
Thompson sampling picks per request. Quality scores feed back into the posterior. Circuit breakers fail fast; cascades recover without client-side retry logic.
UCB1 exploration with a Gaussian posterior over quality scores. Finds the cheapest model above your quality floor. Variants: :floor, :fast, :best, auto.
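To make the routing idea concrete, the sketch below uses Thompson sampling (named in the text above) with a Gaussian posterior over quality scores, then picks the cheapest model whose sampled quality clears the floor. The prior, the noise level, the fallback rule, and all names are assumptions for illustration; the UCB1 component and the :floor / :fast / :best variants are not modeled.

```python
import math
import random
from dataclasses import dataclass

PRIOR_MEAN, PRIOR_SD = 0.5, 0.25  # assumed prior over quality scores in [0, 1]
NOISE_SD = 0.1                    # assumed known observation noise


@dataclass
class Arm:
    name: str
    cost: float         # price per request, any consistent unit
    n: int = 0          # number of quality scores observed
    total: float = 0.0  # sum of observed quality scores

    def posterior(self):
        # Conjugate Gaussian update with known noise variance.
        precision = 1.0 / PRIOR_SD**2 + self.n / NOISE_SD**2
        mean = (PRIOR_MEAN / PRIOR_SD**2 + self.total / NOISE_SD**2) / precision
        return mean, math.sqrt(1.0 / precision)

    def update(self, score: float) -> None:
        # Feed an observed quality score back into the posterior.
        self.n += 1
        self.total += score


def pick(arms, floor: float, rng: random.Random) -> Arm:
    # One posterior draw per arm, per request.
    draws = {a.name: rng.gauss(*a.posterior()) for a in arms}
    eligible = [a for a in arms if draws[a.name] >= floor]
    if eligible:
        return min(eligible, key=lambda a: a.cost)
    # Assumed fallback: nothing clears the floor, so take the best draw.
    return max(arms, key=lambda a: draws[a.name])
```

With few observations the posterior is wide, so an untested cheap model occasionally draws above the floor and gets tried; as scores accumulate, the draws concentrate and the choice stabilizes.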
Hybrid cache: pgvector HNSW plus an in-memory tier, matching at 90% similarity. Streaming responses reconstruct from cache as SSE chunks, not just JSON. Cache hits return in ~4ms and save 100% of the request cost.
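The similarity lookup and the replay of a cached answer as SSE chunks might look roughly like this. It is a sketch only: a brute-force cosine scan stands in for the pgvector HNSW index, embeddings are supplied by the caller, and the "delta" field and "[DONE]" terminator are assumed conventions rather than the product's wire format.

```python
import json
import math


def cosine(a, b) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.90):
        self.threshold = threshold  # 90% similarity, per the text
        self.entries = []           # list of (embedding, response text)

    def put(self, embedding, response: str) -> None:
        self.entries.append((embedding, response))

    def get(self, embedding):
        # Linear scan; a real deployment would query an ANN index instead.
        best, best_sim = None, self.threshold
        for vec, resp in self.entries:
            sim = cosine(embedding, vec)
            if sim >= best_sim:
                best, best_sim = resp, sim
        return best


def as_sse_chunks(text: str, size: int = 16):
    # Replay a cached response as server-sent events, piece by piece.
    for i in range(0, len(text), size):
        yield f"data: {json.dumps({'delta': text[i:i + size]})}\n\n"
    yield "data: [DONE]\n\n"
```

On a hit, a streaming client receives the same event shape it would get from a live provider, so it does not need a separate code path for cached answers.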
Continuous health probes. Circuit opens on failure; re-probes at 15s and 60s to catch recovery. Cascade escalates to a different healthy provider — never a retry loop to the same endpoint.
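The open / re-probe / cascade behavior can be sketched as below. The assumptions are labeled in the comments: probe delays are measured from the moment the circuit opens, the sketch stops probing after the 60s attempt because the text does not say what follows, and the Circuit and cascade names are hypothetical.

```python
class Circuit:
    PROBE_DELAYS = (15.0, 60.0)  # seconds after opening, per the text

    def __init__(self):
        self.opened_at = None  # None means the circuit is closed (healthy)
        self.probes_done = 0

    def is_open(self) -> bool:
        return self.opened_at is not None

    def record_failure(self, now: float) -> None:
        if self.opened_at is None:
            self.opened_at = now
            self.probes_done = 0

    def probe_due(self, now: float) -> bool:
        # Assumption: no further probes once both delays are used up.
        if self.opened_at is None or self.probes_done >= len(self.PROBE_DELAYS):
            return False
        return now - self.opened_at >= self.PROBE_DELAYS[self.probes_done]

    def record_probe(self, healthy: bool) -> None:
        if healthy:
            self.opened_at = None  # recovery detected: close the circuit
            self.probes_done = 0
        else:
            self.probes_done += 1


def cascade(providers, circuits, call, now: float):
    # Try each healthy provider once, in order; never retry the same one.
    for name in providers:
        if circuits[name].is_open():
            continue
        try:
            return name, call(name)
        except Exception:
            circuits[name].record_failure(now)
    raise RuntimeError("no healthy provider available")
```

Because a failed provider's circuit opens immediately, the next request skips it without waiting for a timeout, and the client never has to implement its own retry loop.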
4-block architecture: HUMAN / SYSTEM / PROJECT / GENERAL. pgvector similarity retrieves context automatically. Nightly synthesis compacts memories into durable knowledge. Tenant-isolated, session-scoped.
Routing, memory, budget, kill switch, guardrails, governance, forensics, skills, workspace, agent runs. RBAC per tool. Single SSE connection gives agents secretless access to every upstream.
7-layer discovery: llms.txt, agents.json, RFC 8631 Link headers, /v1/discovery, /v1/self. AI clients bootstrap themselves.
Guardian predicts spend before execution. Budget velocity alerts fire before you cross a daily target. Every response ships with a receipt.
Under 5ms p95 overhead. Records cost + routing metadata per request. Identifies waste patterns: cheaper-equivalent swaps, retry-storm bursts, cost anomalies. Feeds Insights API.
Records Thompson's decision next to what static price-only or quality-only routing would have chosen. Welch's t-test, Cohen's d, win-rate confidence interval. Ring buffer, 10K entries.
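For readers who want to see the statistics named above, here is a standard-library sketch of Welch's t statistic, Cohen's d, a win-rate confidence interval, and a 10K-entry ring buffer. The Wilson score interval is an assumption (the text does not say which interval is used), and the function names are illustrative only.

```python
import math
from collections import deque
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va**2 / (len(a) - 1) + vb**2 / (len(b) - 1))
    return t, df


def cohens_d(a, b):
    """Effect size using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled = math.sqrt(
        ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    )
    return (mean(a) - mean(b)) / pooled


def win_rate_ci(wins: int, n: int, z: float = 1.96):
    """Wilson score interval for a win rate (z=1.96 gives ~95%)."""
    p = wins / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Ring buffer: once full, appending drops the oldest decision record.
decisions = deque(maxlen=10_000)
```

Here a and b would be per-request costs under two routing policies, for example the adaptive choice versus a static price-only baseline; a negative t and d then indicate that the first policy was cheaper on average.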
Available at /v1/intelligence/benchmark and as the X-BR-Routing-Savings header on every response.
PII severed mid-stream, not logged after delivery. Every verdict exports to SIEM. Every response carries phase prediction, efficiency, cost, and routing metadata.
7-check pipeline, 3 layers: input guardrails (PII, PCI-DSS, regex), streaming output interception (sliding window), tool-call governance (RBAC on tool_calls before execution). SIEM exports in CEF or ECS JSON.
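The sliding-window idea behind streaming output interception can be sketched as follows: hold back a short tail of the stream so that a pattern split across chunk boundaries is still caught before it reaches the client. The SSN-style regex, the window size, and the choice to redact rather than cut the stream are all assumptions made for this example.

```python
import re

# Hypothetical PII pattern: US SSN-style digits, 11 characters long.
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
WINDOW = 11  # length of the longest possible match


def redact_stream(chunks, pattern=SSN, window=WINDOW):
    buf = ""
    for chunk in chunks:
        # Scan the held-back tail together with the new chunk.
        buf = pattern.sub("[REDACTED]", buf + chunk)
        # Keep the last window-1 characters: they could be the start
        # of a match that completes in a later chunk.
        cut = len(buf) - (window - 1)
        if cut > 0:
            yield buf[:cut]
            buf = buf[cut:]
    if buf:
        yield buf  # flush the tail once the stream ends
```

For example, the chunks "My SSN is 123-", "45-67", "89, thanks" never emit the full number; the joined output reads "My SSN is [REDACTED], thanks". The cost is a delay of at most window-1 characters on the output.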
ONNX model classifies each request into workflow phases (planning, execution, validation). Header: X-BR-Phases. Drives phase-aware cost + caching policies.
Post-request anomaly detection. Catches retry storms, agents burning budget on wasted calls, model drift. Emits SIEM alerts; ARM can auto-quarantine offenders.
Change base_url. Add your provider keys. All 13 systems activate automatically.