Intelligent routing layer that picks the right model for each task — optimising cost and latency per request.
A middleware layer that sits between the application and the AI providers. When a request comes in, the router classifies the task type (reasoning, vision, simple extraction, creative writing) and routes the request to the best-suited model: Claude Sonnet for complex reasoning, Haiku for simple classification, GPT-4o for image analysis. The router itself uses Haiku to classify, adding only 50ms of overhead.
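The classify-then-route flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual BRVO implementation: the keyword heuristic is a hypothetical stand-in for the Haiku classification call, the model names are placeholder labels rather than real provider model IDs, and sending creative writing to Sonnet is an assumption, since the mapping for that task type is not stated.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class TaskType(Enum):
    REASONING = "reasoning"
    VISION = "vision"
    EXTRACTION = "simple_extraction"
    CREATIVE = "creative_writing"


# Placeholder labels, not real provider model IDs.
# Assumption: creative writing goes to Sonnet (not stated in the description).
ROUTES = {
    TaskType.REASONING: "sonnet",
    TaskType.VISION: "gpt-4o",
    TaskType.EXTRACTION: "haiku",
    TaskType.CREATIVE: "sonnet",
}


@dataclass
class Request:
    prompt: str
    has_image: bool = False


def keyword_classifier(request: Request) -> TaskType:
    """Hypothetical stand-in for the Haiku classification call."""
    if request.has_image:
        return TaskType.VISION
    text = request.prompt.lower()
    if any(phrase in text for phrase in ("extract", "list the", "find the")):
        return TaskType.EXTRACTION
    if any(phrase in text for phrase in ("write a story", "poem", "slogan")):
        return TaskType.CREATIVE
    # Default to the stronger model when the task type is unclear.
    return TaskType.REASONING


def route(
    request: Request,
    classify: Callable[[Request], TaskType] = keyword_classifier,
) -> str:
    """Classify the request, then look up the model assigned to that task type."""
    return ROUTES[classify(request)]
```

Passing the classifier in as a parameter keeps the routing table independent of how classification is done, so the stub can be swapped for a real model call without touching `route`.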
Currently being tested across 500 real requests from the BRVO site audit tool. Results so far: routing reduces average cost per request by 41% compared with sending everything to Sonnet, with only a 3% drop in output quality (measured by human evaluation). The biggest win is on simple tasks: extracting meta tags from HTML costs £0.001 with Haiku vs £0.008 with Sonnet, with the same accuracy.
Being integrated into all BRVO AI features to reduce client operating costs. When a chatbot answers a simple FAQ, it uses Haiku. When it needs to reason about a complex customer problem, it routes to Sonnet. Clients get better AI at lower monthly cost.
A chatbot handling 10,000 messages per day. Simple greetings and FAQs go to Haiku (£0.001/msg). Complex product comparisons go to Sonnet (£0.008/msg). Monthly cost drops from £2,400 (everything on Sonnet) to £1,400, with no quality loss on complex queries.
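The chatbot figures can be checked with a short calculation. The sketch below assumes a 30-day month and ignores the cost of the classification call itself, since neither is given here. Under those assumptions, £2,400 matches sending every message to Sonnet, and the £1,400 figure implies that a little under half of the messages are handled by Haiku.

```python
from fractions import Fraction

MESSAGES_PER_DAY = 10_000
DAYS_PER_MONTH = 30  # assumption: the month length is not stated
HAIKU_COST = Fraction(1, 1000)   # £0.001 per message
SONNET_COST = Fraction(8, 1000)  # £0.008 per message


def monthly_cost(haiku_share: Fraction) -> Fraction:
    """Monthly cost in pounds when `haiku_share` of messages go to Haiku."""
    messages = MESSAGES_PER_DAY * DAYS_PER_MONTH
    blended = haiku_share * HAIKU_COST + (1 - haiku_share) * SONNET_COST
    return messages * blended


def implied_haiku_share(target_cost: Fraction) -> Fraction:
    """Haiku share that yields `target_cost`, solved from the blended-cost formula."""
    messages = MESSAGES_PER_DAY * DAYS_PER_MONTH
    per_message = Fraction(target_cost) / messages
    return (SONNET_COST - per_message) / (SONNET_COST - HAIKU_COST)


baseline = monthly_cost(Fraction(0))         # everything on Sonnet
share = implied_haiku_share(Fraction(1400))  # share needed to reach £1,400

print(f"All-Sonnet baseline: £{float(baseline):,.0f}")  # £2,400
print(f"Implied Haiku share: {float(share):.1%}")       # 47.6%
```

Exact fractions are used instead of floats so the pound totals come out without rounding noise.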
An invoicing system processing 500 PDFs daily. Simple field extraction (date, amount, vendor) uses Haiku. Anomaly detection and fraud flagging route to Sonnet. 60% cost reduction.
A platform moderating user-generated content. Obviously safe content passes through Haiku instantly. Edge cases route to Sonnet for nuanced judgement. Faster moderation, lower cost, same safety.
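One way to express the "obviously safe passes, edge cases escalate" pattern is a two-stage cascade. The sketch below is illustrative only: the confidence score, the 0.9 threshold, and both stub models are hypothetical, and real calls would go to Haiku and Sonnet.

```python
from typing import Callable, NamedTuple


class Verdict(NamedTuple):
    safe: bool
    confidence: float  # 0.0 to 1.0, hypothetical score from the model
    model: str


CONFIDENCE_THRESHOLD = 0.9  # assumed cut-off; no value is given in the description


def moderate(
    text: str,
    cheap_check: Callable[[str], Verdict],
    careful_check: Callable[[str], Verdict],
    threshold: float = CONFIDENCE_THRESHOLD,
) -> Verdict:
    """Accept the cheap model's verdict only when it is confidently safe."""
    first = cheap_check(text)
    if first.safe and first.confidence >= threshold:
        return first
    # Anything flagged or uncertain counts as an edge case and is escalated.
    return careful_check(text)


# Stub models for illustration only.
def stub_haiku(text: str) -> Verdict:
    ambiguous = "kill" in text.lower()
    return Verdict(safe=not ambiguous, confidence=0.6 if ambiguous else 0.98, model="haiku")


def stub_sonnet(text: str) -> Verdict:
    # Pretend the stronger model recognises a figure of speech as safe.
    figurative = "killed it" in text.lower()
    return Verdict(safe=figurative, confidence=0.95, model="sonnet")
```

For example, `moderate("Lovely photo, thanks for sharing!", stub_haiku, stub_sonnet)` is settled by the cheap stub, while a message containing "killed it" is escalated to the careful one. Only confidently safe verdicts skip escalation, so uncertainty in the cheap model costs money rather than safety.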