Model registry

Provider matrix.
Continuously updated.

The Inference Arbitrage Router evaluates registered providers at 200ms intervals across four signals: cost, latency, quality, and availability. Provider state is re-evaluated on every interval rather than served from a cache.
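As a rough illustration of a fixed-interval evaluation cycle, the sketch below polls a probe function and overwrites the stored state on each pass. Only the 200ms interval comes from the text above. The names `refresh_loop`, `probe`, `state`, `cycles`, and `interval_s` are hypothetical, and the router's actual scheduling mechanism is not described here.

```python
import asyncio
import time

REFRESH_INTERVAL_S = 0.2  # 200ms, as stated in the text above


async def refresh_loop(probe, state, cycles, interval_s=REFRESH_INTERVAL_S):
    """Call `probe` once per interval and overwrite `state` with its result.

    `probe` is a hypothetical async callable returning a dict of
    provider name -> latest measurements. Nothing is served from an
    older snapshot: each cycle replaces the previous values.
    """
    for _ in range(cycles):
        started = time.monotonic()
        state.update(await probe())
        # Sleep only for the remainder of the interval so that probe
        # time does not stretch the cycle.
        elapsed = time.monotonic() - started
        await asyncio.sleep(max(0.0, interval_s - elapsed))
```

A caller would run this with something like `asyncio.run(refresh_loop(my_probe, {}, cycles=5))`, where `my_probe` is whatever gathers the four signals.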

Provider tiers

Routing tiers

Tier S
Frontier reasoning
Highest quality. Used for complex reasoning, legal analysis, compliance review. Highest cost tier. Quality score: 0.98.
Tier A
Balanced performance
Balanced cost-quality. Default for most inference routing tasks. Quality score: 0.91.
Tier B
Cost-optimised
Lowest cost. Used for classification, extraction, summarisation. Quality score: 0.84.
Tier C
Specialised
Domain-specific models for code, multimodal, and structured output tasks.
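The tiers above can be written down as a small lookup table. This is a minimal sketch: the tier names, labels, quality scores, and task lists are taken from the descriptions above, while the `Tier` class, the `tier_for_task` helper, and the exact task strings are illustrative assumptions. Tier C carries no quality score because none is listed.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Tier:
    name: str
    label: str
    quality_score: Optional[float]  # None where no score is listed
    typical_tasks: Tuple[str, ...]


TIERS = {
    "S": Tier("S", "Frontier reasoning", 0.98,
              ("complex reasoning", "legal analysis", "compliance review")),
    "A": Tier("A", "Balanced performance", 0.91, ()),
    "B": Tier("B", "Cost-optimised", 0.84,
              ("classification", "extraction", "summarisation")),
    "C": Tier("C", "Specialised", None,
              ("code", "multimodal", "structured output")),
}

DEFAULT_TIER = "A"  # Tier A is described as the default for most tasks


def tier_for_task(task: str) -> Tier:
    """Return the tier whose task list names `task`, else the default tier."""
    for tier in TIERS.values():
        if task in tier.typical_tasks:
            return tier
    return TIERS[DEFAULT_TIER]
```

For example, `tier_for_task("legal analysis")` returns the Tier S entry, and an unlisted task falls back to Tier A.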
Routing signals

Provider evaluation matrix

Signal                     Weight (Balanced)   Weight (Cheapest)   Weight (Fastest)   Weight (Quality)
Cost-per-token             0.25                0.80                0.067              0.067
Observed latency (EWMA)    0.25                0.067               0.80               0.067
Quality score              0.25                0.067               0.067              0.80
Availability (binary)      0.25                0.067               0.067              0.067
Provider state matrix refreshed every 200ms. Quality scores derived from WORM-sealed outcome records.
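One way to read the weight table is as a weighted sum over normalised signals. The sketch below uses the table's weights, but everything else is an assumption: min-max normalisation across the candidate set, inverting cost and latency so that lower is better, the EWMA smoothing factor `alpha=0.2`, and the names `ProviderState`, `score`, and `pick`. How the router actually normalises or combines signals is not stated here.

```python
from dataclasses import dataclass

# Weights from the evaluation matrix, in the order:
# (cost, latency, quality, availability).
PROFILES = {
    "balanced": (0.25, 0.25, 0.25, 0.25),
    "cheapest": (0.80, 0.067, 0.067, 0.067),
    "fastest": (0.067, 0.80, 0.067, 0.067),
    "quality": (0.067, 0.067, 0.80, 0.067),
}


@dataclass
class ProviderState:
    name: str
    cost_per_token: float
    latency_ms: float  # smoothed with ewma() below
    quality: float
    available: bool


def ewma(previous, sample, alpha=0.2):
    """Exponentially weighted moving average; alpha is an assumed value."""
    return alpha * sample + (1 - alpha) * previous


def _normalise(value, values, higher_is_better):
    """Min-max scale `value` into [0, 1] relative to `values` (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 1.0  # all candidates tie on this signal
    scaled = (value - lo) / (hi - lo)
    return scaled if higher_is_better else 1.0 - scaled


def score(provider, candidates, profile):
    """Weighted sum of the four signals for one provider."""
    w_cost, w_lat, w_qual, w_avail = PROFILES[profile]
    costs = [p.cost_per_token for p in candidates]
    latencies = [p.latency_ms for p in candidates]
    qualities = [p.quality for p in candidates]
    return (
        w_cost * _normalise(provider.cost_per_token, costs, False)
        + w_lat * _normalise(provider.latency_ms, latencies, False)
        + w_qual * _normalise(provider.quality, qualities, True)
        + w_avail * (1.0 if provider.available else 0.0)
    )


def pick(candidates, profile):
    """Return the highest-scoring candidate under the given profile."""
    return max(candidates, key=lambda p: score(p, candidates, profile))
```

Note that in this sketch availability is only one weighted term, so an unavailable provider can still score well on the other three signals. Whether unavailable providers are excluded outright before scoring is a separate design choice that the table does not settle.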