Model registry
Provider matrix.
Continuously updated.
The Inference Arbitrage Router evaluates registered providers every 200 ms across four signals: cost, latency, quality, and availability. Provider state is refreshed on every evaluation cycle rather than served from a cache.
Provider tiers
Routing tiers
Tier S
Frontier reasoning
Highest quality. Used for complex reasoning, legal analysis, compliance review. Highest cost tier. Quality score: 0.98.
Tier A
Balanced performance
Balanced cost-quality. Default for most inference routing tasks. Quality score: 0.91.
Tier B
Cost-optimised
Lowest cost. Used for classification, extraction, summarisation. Quality score: 0.84.
Tier C
Specialised
Domain-specific models for code, multimodal, and structured output tasks.
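As an illustration only, the four tiers could be held in a small lookup table. The `Tier` class, `TIERS` mapping, and `tier_for_task` helper below are hypothetical names and not part of the router's documented API. The quality scores and task lists are copied from the descriptions above, and Tier C carries no score because none is listed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    label: str
    quality_score: Optional[float]  # None where no score is listed
    typical_tasks: tuple[str, ...]


# Values copied from the tier descriptions; the structure itself is an assumption.
TIERS = {
    "S": Tier("S", "Frontier reasoning", 0.98,
              ("complex reasoning", "legal analysis", "compliance review")),
    "A": Tier("A", "Balanced performance", 0.91, ()),  # default tier
    "B": Tier("B", "Cost-optimised", 0.84,
              ("classification", "extraction", "summarisation")),
    "C": Tier("C", "Specialised", None,
              ("code", "multimodal", "structured output")),
}

DEFAULT_TIER = "A"


def tier_for_task(task: str) -> Tier:
    """Return the tier that lists `task`, falling back to the default tier."""
    for tier in TIERS.values():
        if task in tier.typical_tasks:
            return tier
    return TIERS[DEFAULT_TIER]
```

A call such as `tier_for_task("legal analysis")` returns the Tier S entry, while an unlisted task falls back to Tier A, matching its description as the default for most routing tasks.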
Routing signals
Provider evaluation matrix
| Signal | Weight (Balanced) | Weight (Cheapest) | Weight (Fastest) | Weight (Quality) |
|---|---|---|---|---|
| Cost-per-token | 0.25 | 0.80 | 0.067 | 0.067 |
| Observed latency (EWMA) | 0.25 | 0.067 | 0.80 | 0.067 |
| Quality score | 0.25 | 0.067 | 0.067 | 0.80 |
| Availability (binary) | 0.25 | 0.067 | 0.067 | 0.067 |
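
One plausible reading of the matrix is a weighted sum per provider, with the highest-scoring provider winning. The sketch below assumes every signal has already been normalised to the range 0 to 1 with higher meaning better, so cost and latency are inverted beforehand. That normalisation, the `score` and `pick_provider` functions, and the provider names are assumptions for illustration. The page does not say how the router combines the signals.

```python
# Weight profiles copied from the provider evaluation matrix.
# The non-balanced columns sum to 1.001 because 1/15 is rounded to 0.067.
WEIGHTS = {
    "balanced": {"cost": 0.25, "latency": 0.25, "quality": 0.25, "availability": 0.25},
    "cheapest": {"cost": 0.80, "latency": 0.067, "quality": 0.067, "availability": 0.067},
    "fastest": {"cost": 0.067, "latency": 0.80, "quality": 0.067, "availability": 0.067},
    "quality": {"cost": 0.067, "latency": 0.067, "quality": 0.80, "availability": 0.067},
}


def score(signals: dict[str, float], profile: str) -> float:
    """Weighted sum of normalised signals (assumed: 0..1, higher is better)."""
    weights = WEIGHTS[profile]
    return sum(weights[name] * signals[name] for name in weights)


def pick_provider(providers: dict[str, dict[str, float]], profile: str) -> str:
    """Return the name of the provider with the highest weighted score."""
    return max(providers, key=lambda name: score(providers[name], profile))


# Hypothetical providers with made-up, pre-normalised signal values.
providers = {
    "provider_cheap": {"cost": 0.9, "latency": 0.4, "quality": 0.84, "availability": 1.0},
    "provider_fast": {"cost": 0.4, "latency": 0.9, "quality": 0.91, "availability": 1.0},
    "provider_best": {"cost": 0.1, "latency": 0.5, "quality": 0.98, "availability": 1.0},
}

choice = pick_provider(providers, "cheapest")
```

In this sketch availability is only a weighted term with value 0 or 1. Whether the router excludes unavailable providers outright is not stated, so the sketch does not filter them.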
The provider state matrix is refreshed every 200 ms. Quality scores are derived from WORM-sealed (write once, read many) outcome records.
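
The matrix lists observed latency as an EWMA (exponentially weighted moving average). A minimal sketch of how such an average could be updated with each new sample follows. The function name `ewma_update` and the smoothing factor `alpha=0.2` are assumptions, since the page does not give the router's actual smoothing parameter.

```python
from typing import Optional


def ewma_update(previous: Optional[float], sample_ms: float, alpha: float = 0.2) -> float:
    """Fold one latency sample into an exponentially weighted moving average.

    `alpha` is an assumed smoothing factor: higher values react faster
    to new samples, lower values smooth out more noise.
    """
    if previous is None:
        # First observation: there is no history to blend with.
        return sample_ms
    return alpha * sample_ms + (1.0 - alpha) * previous


# Hypothetical latency samples in milliseconds.
latency = None
for sample in (100.0, 200.0, 100.0):
    latency = ewma_update(latency, sample)
```

With a small `alpha`, a single slow response shifts the average only slightly, so one outlier is less likely to change a routing decision than a sustained slowdown.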