An e-commerce platform increased first-contact resolution by 34% while cutting support AI costs from $28K to $12K per month.
E-commerce support at scale is a math problem. This company fielded 50,000 support tickets monthly — 80% were routine (tracking numbers, return policies, password resets). The other 20% were complex (billing disputes, custom orders, damage claims). They'd been routing everything to Claude Sonnet to ensure quality, spending $28K/month on AI that was overkill for FAQ questions.
The challenge: you can't know query complexity until you read it. And you can't risk degraded quality on edge cases. So they'd defaulted to premium models across the board.
BrainstormRouter solved this with adaptive routing. The team didn't classify queries manually: they set `model: "auto"` and let Thompson Sampling learn from outcomes.
The router used customer satisfaction scores and resolution rates as reward signals. Simple queries (FAQ matches, templated responses) produced high satisfaction on GPT-4o Mini. Complex escalations produced higher satisfaction on Claude Sonnet. Within 60 days, the router had mapped 47 distinct query patterns and assigned optimal models to each.
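BrainstormRouter's internals aren't public, but the mechanism described above — per-pattern arms, a reward signal from satisfaction and resolution data — is classic Thompson Sampling. Here is a minimal sketch using Beta-Bernoulli arms; the model names and the binary reward definition (e.g. "satisfaction ≥ 4" or "resolved on first contact") are assumptions for illustration:

```python
import random

class ThompsonRouter:
    """Beta-Bernoulli Thompson Sampling over candidate models.

    Hypothetical sketch: one instance per query pattern, one arm per
    model. Reward is a binary outcome signal fed back after the ticket
    closes (assumed here; the real reward shaping isn't documented).
    """

    def __init__(self, models):
        # Start every arm at a uniform Beta(1, 1) prior
        self.params = {m: [1, 1] for m in models}

    def choose(self):
        # Sample a success rate from each arm's posterior, pick the max.
        # Uncertain arms get sampled high sometimes -> built-in exploration.
        draws = {m: random.betavariate(a, b) for m, (a, b) in self.params.items()}
        return max(draws, key=draws.get)

    def update(self, model, reward):
        # reward: 1 = good outcome, 0 = bad outcome
        if reward:
            self.params[model][0] += 1  # increment alpha (successes)
        else:
            self.params[model][1] += 1  # increment beta (failures)

router = ThompsonRouter(["gpt-4o-mini", "claude-sonnet", "claude-haiku"])
model = router.choose()          # explores and exploits automatically
router.update(model, reward=1)   # feed the satisfaction signal back
```

Because each arm keeps a full posterior rather than a point estimate, the router never hard-commits: even a dominant model keeps a small challenger traffic share, which is how it stays calibrated as ticket mix drifts.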
```python
# The integration was a one-line change: point the OpenAI SDK at the router
from openai import OpenAI

client = OpenAI(
    api_key="br_live_...",
    base_url="https://api.brainstormrouter.com/v1",
)

# Before: hardcoded expensive model
response = client.chat.completions.create(
    model="claude-sonnet-4-0",
    messages=ticket_messages,
)

# After: router decides based on learned patterns
response = client.chat.completions.create(
    model="auto",
    messages=ticket_messages,
)
```
Week 1: Thompson Sampling explored broadly. All models received roughly equal traffic. Satisfaction data accumulated.
Weeks 2-3: Patterns emerged. "Where is my order?" queries routed 90% to GPT-4o Mini. "I was charged twice" queries routed 85% to Claude Sonnet. The router was learning.
Weeks 4-8: The routing policy stabilized. 47 distinct query patterns mapped to 3 models. GPT-4o Mini handled 68% of volume, Claude Sonnet 18%, and Claude Haiku 14% (quick acknowledgments, status checks).
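The shift from broad exploration to a stable policy is exactly what a bandit simulation predicts. The sketch below simulates one hypothetical pattern ("where is my order?") with two arms; the satisfaction rates (0.92 for GPT-4o Mini, 0.88 for Claude Sonnet on this simple pattern) are invented for illustration, not measured values from the case study:

```python
import random

random.seed(0)

# Assumed per-model probability of a satisfied customer on this pattern
true_rates = {"gpt-4o-mini": 0.92, "claude-sonnet": 0.88}
params = {m: [1, 1] for m in true_rates}  # Beta(alpha, beta) per arm
picks = {m: 0 for m in true_rates}

for ticket in range(5000):
    # Thompson Sampling: draw from each posterior, route to the best draw
    chosen = max(true_rates, key=lambda m: random.betavariate(*params[m]))
    picks[chosen] += 1
    # Simulated customer feedback becomes the reward signal
    satisfied = random.random() < true_rates[chosen]
    params[chosen][0 if satisfied else 1] += 1
```

Early tickets split roughly evenly (wide posteriors), then traffic concentrates on the cheaper model as its posterior sharpens — the same week-1-exploration, week-4-stability arc described above.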
| Pattern Category | Monthly Volume | Routed Model | Satisfaction |
|---|---|---|---|
| Order tracking | 14,200 | GPT-4o Mini | 4.6/5.0 |
| Return/refund FAQ | 9,800 | GPT-4o Mini | 4.5/5.0 |
| Account/password | 7,100 | Claude Haiku | 4.7/5.0 |
| Billing disputes | 5,400 | Claude Sonnet | 4.4/5.0 |
| Custom orders | 3,200 | Claude Sonnet | 4.3/5.0 |
| Damage claims | 2,800 | Claude Sonnet | 4.2/5.0 |
| Other (23 patterns) | 7,500 | Mixed | 4.4/5.0 |
Support costs dropped from $28K/month to $12K/month. More importantly, first-contact resolution improved 34% because Sonnet's deeper reasoning now focused exclusively on genuinely hard problems — billing disputes, damage claims, and custom order negotiations where nuanced language mattered.
Average response time fell from 3.4 seconds to 1.2 seconds. GPT-4o Mini responds faster than Sonnet, and it now handled the majority of traffic. Customers noticed — support satisfaction scores improved alongside the cost reduction.
The team didn't write a single routing rule. Thompson Sampling discovered the optimal mapping between query patterns and models entirely from outcome data.