Thompson sampling
A Fortune 500 SOC processed 12K alerts/day through Sonnet at $35K/month. 4K identical firewall alerts and 2K alerts from misconfigured health checks were pure waste. Routine tasks hit rate limits, starving critical-alert processing.
Thompson sampling routes by alert severity: duplicates and low-severity alerts go to Haiku, while critical incidents still hit Sonnet. A semantic cache serves the 4K identical alerts once the first one has been processed. The integration is a one-line change: model: "auto".
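How severity-based Thompson sampling might look in code is sketched below. This is a minimal Beta-Bernoulli illustration, not the product's actual router: the class name ThompsonRouter, the cost_penalty values, and the PINNED rule that hard-wires critical alerts to Sonnet are all assumptions made for the example, and "haiku" and "sonnet" are plain labels rather than real model IDs.

```python
import random

MODELS = ("haiku", "sonnet")      # labels only, not real model identifiers
PINNED = {"critical": "sonnet"}   # assumed hard policy, not taken from the source


class ThompsonRouter:
    """Beta-Bernoulli Thompson sampling with one posterior per (severity, model)."""

    def __init__(self, cost_penalty=None, seed=None):
        self.rng = random.Random(seed)
        # Hypothetical penalty subtracted from each sampled success rate so the
        # cheaper model wins when both perform about equally well.
        self.cost_penalty = cost_penalty or {"haiku": 0.0, "sonnet": 0.1}
        self.posteriors = {}  # (severity, model) -> [alpha, beta]

    def _params(self, severity, model):
        # Start every arm from a uniform Beta(1, 1) prior.
        return self.posteriors.setdefault((severity, model), [1.0, 1.0])

    def choose(self, severity):
        """Pick a model for one alert of the given severity."""
        if severity in PINNED:
            return PINNED[severity]
        best_model, best_score = None, float("-inf")
        for model in MODELS:
            alpha, beta = self._params(severity, model)
            # Draw one plausible success rate from the posterior, then charge cost.
            score = self.rng.betavariate(alpha, beta) - self.cost_penalty[model]
            if score > best_score:
                best_model, best_score = model, score
        return best_model

    def update(self, severity, model, success):
        """Record whether the chosen model handled the alert acceptably."""
        params = self._params(severity, model)
        if success:
            params[0] += 1.0
        else:
            params[1] += 1.0
```

Early on the posteriors are wide, so both models get tried for each unpinned severity; as outcomes accumulate the samples concentrate and the choice settles, which is the explore-then-stabilize pattern described here. How "success" is judged for an alert is left open in this sketch.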
The router explored in week 1 and stabilized in week 2. Haiku handles 3.4K low-severity and 1.8K medium-severity alerts. Sonnet handles 480 novel high-severity and 120 critical alerts.
52% of volume hits the cache. A firewall-rule change produces 4K identical alerts: the first runs through Haiku, and the remaining 3,999 return in <5ms at zero cost.
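The first-pass caching behavior can be illustrated with a small sketch. Note the substitution: this uses an exact-match cache keyed on a hash of the normalized alert text, which is a simpler stand-in for a semantic cache (a real one would typically compare embeddings). AlertCache, normalize, and the classify callable are hypothetical names introduced for the example.

```python
import hashlib
import re


def normalize(alert_text):
    """Lowercase and collapse whitespace so trivially different copies match."""
    return re.sub(r"\s+", " ", alert_text.strip().lower())


class AlertCache:
    """Exact-match cache: only the first copy of an alert reaches the model."""

    def __init__(self, classify):
        self.classify = classify  # hypothetical callable: alert text -> verdict
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, alert_text):
        key = hashlib.sha256(normalize(alert_text).encode("utf-8")).hexdigest()
        if key in self.store:
            self.hits += 1          # served from cache, no model call
            return self.store[key]
        self.misses += 1            # first occurrence: call the model once
        verdict = self.classify(alert_text)
        self.store[key] = verdict
        return verdict
```

Replaying 4,000 copies of the same firewall alert through this cache results in one call to classify and 3,999 cache hits, matching the ratio described above.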
Cost and routing decisions are logged. The team verified that critical alerts always hit Sonnet, a SOC 2 compliance requirement.
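A check like the team's verification could be scripted against the routing log. The sketch below assumes the log is available as JSON lines with "severity" and "model" fields; that format, the field names, and the function audit_critical_routing are assumptions for illustration, not the product's documented log schema.

```python
import json


def audit_critical_routing(log_lines, required_model="sonnet"):
    """Return log records where a critical alert went to any other model."""
    violations = []
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines
        record = json.loads(line)
        if record["severity"] == "critical" and record["model"] != required_model:
            violations.append(record)
    return violations
```

An empty result means every critical alert in the inspected log was routed to the required model.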
“Thompson sampling never sent a critical alert to Haiku. Exploration confirmed Sonnet superiority on critical categories and policy stabilized week 1. The SOC found 40% more real threats per analyst by eliminating noise-processing cost.”