Routing Explorer

Inference arbitrage.
Pareto-optimal.

The Inference Arbitrage Router maintains a provider state matrix refreshed every 200ms and solves a lightweight Pareto optimisation at dispatch time. Four priority modes map requests to distinct regions on the Pareto frontier.
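To make the term "Pareto frontier" concrete, here is a minimal sketch of a non-dominated filter over provider states. It is an illustration only: the `ProviderState` type, the field names, and the sample values are hypothetical, and the router's actual optimisation is not described on this page.

```python
from typing import List, NamedTuple


class ProviderState(NamedTuple):
    # Hypothetical row layout; field names are assumptions for illustration.
    name: str
    cost: float        # cost per token, lower is better
    latency_ms: float  # smoothed latency, lower is better
    quality: float     # quality score, higher is better
    available: bool    # binary availability flag


def dominates(a: ProviderState, b: ProviderState) -> bool:
    """True if a is no worse than b on every signal and strictly better on one."""
    no_worse = (a.cost <= b.cost and a.latency_ms <= b.latency_ms
                and a.quality >= b.quality)
    strictly_better = (a.cost < b.cost or a.latency_ms < b.latency_ms
                       or a.quality > b.quality)
    return no_worse and strictly_better


def pareto_frontier(providers: List[ProviderState]) -> List[ProviderState]:
    """Keep available providers that no other available provider dominates."""
    live = [p for p in providers if p.available]
    return [p for p in live if not any(dominates(q, p) for q in live)]


# Made-up sample data, not real provider figures.
providers = [
    ProviderState("A", 0.8, 120.0, 0.95, True),
    ProviderState("B", 0.3, 200.0, 0.85, True),
    ProviderState("C", 0.9, 210.0, 0.80, True),   # worse than A on every signal
    ProviderState("D", 0.1, 90.0, 0.99, False),   # unavailable, so excluded
]
frontier_names = [p.name for p in pareto_frontier(providers)]
```

In this sample, A and B remain on the frontier because each beats the other on at least one signal. A priority mode then chooses among such non-dominated candidates.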

State refresh: 200ms
Priority modes: 4
CBEI production: 0.94
Failover latency: 34ms
Priority modes

Four Pareto modes

balanced
Social optimum
Weights all four signals equally. Produces the Pareto-optimal allocation closest to the social optimum.
cost: 0.25 latency: 0.25 quality: 0.25 availability: 0.25
cheapest
Cost minimiser
Minimises cost subject to a quality floor of the mean quality score minus 2 standard deviations.
cost: 0.80 latency: 0.067 quality: 0.067 availability: 0.067
fastest
Latency minimiser
Minimises round-trip time subject to a cost ceiling of the mean cost plus 1 standard deviation.
cost: 0.067 latency: 0.80 quality: 0.067 availability: 0.067
quality
Quality maximiser
Maximises output quality subject to a cost ceiling of the mean cost plus 2 standard deviations.
cost: 0.067 latency: 0.067 quality: 0.80 availability: 0.067
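One way the four modes could be applied at dispatch time is a weighted-sum score over normalised signals, with each mode's floor or ceiling acting as a feasibility filter. The sketch below assumes exactly that. The weights come from the mode descriptions above; the min-max normalisation, the use of population standard deviation, the `select` function, and the sample providers are all assumptions made for illustration, not the router's documented behaviour.

```python
from statistics import mean, pstdev

# Weights copied from the four mode descriptions.
MODE_WEIGHTS = {
    "balanced": {"cost": 0.25, "latency": 0.25, "quality": 0.25, "availability": 0.25},
    "cheapest": {"cost": 0.80, "latency": 0.067, "quality": 0.067, "availability": 0.067},
    "fastest": {"cost": 0.067, "latency": 0.80, "quality": 0.067, "availability": 0.067},
    "quality": {"cost": 0.067, "latency": 0.067, "quality": 0.80, "availability": 0.067},
}


def normalise(values, lower_is_better=False):
    """Min-max scale to [0, 1] so that 1.0 is always the best value (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return [1.0 - s for s in scaled] if lower_is_better else scaled


def is_feasible(provider, live, mode):
    """Apply the mode's quality floor or cost ceiling (population std dev assumed)."""
    costs = [p["cost"] for p in live]
    quals = [p["quality"] for p in live]
    if mode == "cheapest":
        return provider["quality"] >= mean(quals) - 2 * pstdev(quals)
    if mode == "fastest":
        return provider["cost"] <= mean(costs) + 1 * pstdev(costs)
    if mode == "quality":
        return provider["cost"] <= mean(costs) + 2 * pstdev(costs)
    return True  # "balanced" has no stated constraint


def select(providers, mode):
    """Return the name of the best-scoring feasible, available provider, or None."""
    live = [p for p in providers if p["available"]]
    if not live:
        return None
    w = MODE_WEIGHTS[mode]
    cost_n = normalise([p["cost"] for p in live], lower_is_better=True)
    lat_n = normalise([p["latency_ms"] for p in live], lower_is_better=True)
    qual_n = normalise([p["quality"] for p in live])
    best_name, best_score = None, float("-inf")
    for p, c, l, q in zip(live, cost_n, lat_n, qual_n):
        if not is_feasible(p, live, mode):
            continue
        # Every live provider has availability 1.0, so that term is constant here.
        score = w["cost"] * c + w["latency"] * l + w["quality"] * q + w["availability"] * 1.0
        if score > best_score:
            best_name, best_score = p["name"], score
    return best_name


# Made-up sample data, not real provider figures.
providers = [
    {"name": "A", "cost": 3.0, "latency_ms": 400.0, "quality": 0.95, "available": True},
    {"name": "B", "cost": 1.0, "latency_ms": 300.0, "quality": 0.80, "available": True},
    {"name": "C", "cost": 2.0, "latency_ms": 100.0, "quality": 0.85, "available": True},
]
choices = {mode: select(providers, mode) for mode in MODE_WEIGHTS}
```

With this sample, `cheapest` picks B, `fastest` picks C, `quality` picks A, and `balanced` picks C. In `fastest` mode provider A is also excluded outright, because its cost exceeds the mean plus 1 standard deviation.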
Provider matrix

State matrix structure

P matrix — n providers × 4 signals
Provider     | Cost/token  | Latency EWMA | Quality score | Availability
Provider [A] | Tier A cost | EWMA α=0.3   | WORM-derived  | Binary, 200ms refresh
Provider [B] | Tier B cost | EWMA α=0.3   | WORM-derived  | Binary, 200ms refresh
Provider [C] | Tier C cost | EWMA α=0.3   | WORM-derived  | Binary, 200ms refresh
Provider [N] | Variable    | EWMA α=0.3   | WORM-derived  | Binary, 200ms refresh
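As a rough sketch of one row of the matrix, the snippet below models the four signals and a latency update with α=0.3. It assumes the common EWMA convention in which α weights the newest sample; the page gives only the value of α. The `ProviderRow` class, its field names, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass

ALPHA = 0.3  # smoothing factor from the table; assumed to weight the newest sample


@dataclass
class ProviderRow:
    # Hypothetical layout for one row of the n x 4 state matrix.
    name: str
    cost_per_token: float
    latency_ewma_ms: float
    quality_score: float
    available: bool  # binary flag, refreshed on each 200ms cycle

    def observe_latency(self, sample_ms: float) -> None:
        """Fold a new latency sample into the exponentially weighted moving average."""
        self.latency_ewma_ms = ALPHA * sample_ms + (1 - ALPHA) * self.latency_ewma_ms


# Made-up sample values.
row = ProviderRow("A", 0.002, 100.0, 0.9, True)
row.observe_latency(200.0)  # 0.3 * 200 + 0.7 * 100 = 130
```

With α=0.3, a single slow sample moves the estimate only 30% of the way toward the new value, so one outlier does not immediately reorder providers.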