Architecture

Two planes with one identity system. The data plane proxies model traffic on the hot path; the control plane manages everything about how that traffic is treated. Understanding the five-phase pipeline explains every behavior documented elsewhere.

The two planes

plane | host | responsibility
--- | --- | ---
data plane (gateway) | api.understudylabs.com/v1/* | Proxies LLM traffic. Validates keys, resolves scope, routes, forwards a single upstream call, observes the result.
control plane (admin API + dashboard) | api.understudylabs.com/admin/v1/* · app.understudylabs.com | Keys, projects, workloads, model catalog, capture listing, routing writes. Never touches live model traffic.

The separation is deliberate: a control-plane outage cannot drop your model traffic, and nothing on the hot path waits on a dashboard operation.

The request pipeline

Every proxied request passes through five phases, in order. Each phase's outcome is visible to you: in the response headers of the request itself, and in the capture envelope afterward.

identify  →  workload  →  route  →  forward  →  observe
phase | what it does
--- | ---
identify | Validates your sk_* key and establishes the credential source: managed (Understudy's supply will pay upstream) or BYO (your provider key rides along).
workload | Resolves the (organization, project, workload) triple from the x-understudy-project and x-understudy-workload headers, defaulting to your default project's main workload when absent.
route | Decides which upstream serves the request: passthrough to the provider your code called, the workload's configured model route, or a catalog model you named in body.model.
forward | Makes exactly one upstream call (with a fallback retry to your primary provider when a routed call fails; see Routing). Streams responses through without buffering.
observe | Emits a usage event for metering and, when the workload has capture enabled, writes the capture envelope. Happens after your response is already on its way, so observation never adds latency.
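
The workload and forward phases can be modeled in a few lines. This is an illustrative sketch, not the gateway's implementation. The header names come from the table above. The `Scope` type, the function names, and the default IDs `"default"` and `"main"` are assumptions made for the example. So is treating a missing project header as "use the default project" even when a workload header is present.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional

# Header names are taken from the pipeline table.
PROJECT_HEADER = "x-understudy-project"
WORKLOAD_HEADER = "x-understudy-workload"

# Hypothetical IDs: the docs only say "your default project's main workload".
DEFAULT_PROJECT = "default"
DEFAULT_WORKLOAD = "main"


@dataclass(frozen=True)
class Scope:
    organization: str
    project: str
    workload: str


def resolve_scope(organization: str, headers: Mapping[str, str]) -> Scope:
    """Workload phase: build the (organization, project, workload) triple."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    # A missing or empty header falls back to the assumed default.
    return Scope(
        organization=organization,
        project=lowered.get(PROJECT_HEADER) or DEFAULT_PROJECT,
        workload=lowered.get(WORKLOAD_HEADER) or DEFAULT_WORKLOAD,
    )


def forward(routed: Optional[Callable[[], str]], primary: Callable[[], str]) -> str:
    """Forward phase: one upstream call, plus one fallback if a routed call fails."""
    if routed is None:
        # Passthrough: the primary provider is the only upstream call.
        return primary()
    try:
        return routed()
    except Exception:
        # A failed routed call is retried once against the primary provider.
        return primary()
```

In this model a request without scoping headers resolves to `Scope(organization, "default", "main")`, and a routed call that raises is answered by the primary provider instead.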

Where things live

Routing configuration, projects, and workloads live in a relational store that the gateway reads, with caching, on the hot path. Captures are written as one object per request to durable object storage, keyed by organization, project, key, and date. Usage events flow through an async queue into an analytics store; the meter that billing derives from is append-only and recomputable.
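
To make the capture layout concrete, here is a minimal sketch of how an object key could be assembled from those four components. The path order, the `/` separator, the ISO date format, the trailing request ID, and the `.json` suffix are all assumptions for illustration. The source only states which fields the key is built from.

```python
from datetime import date


def capture_object_key(
    organization: str,
    project: str,
    key_id: str,
    day: date,
    request_id: str,
) -> str:
    """Build a hypothetical object-storage key for one captured request."""
    # One object per request: the request ID (assumed) makes the key unique
    # within the organization/project/key/date prefix.
    return f"{organization}/{project}/{key_id}/{day.isoformat()}/{request_id}.json"
```

With a layout like this, listing every capture for one key on one day is a single prefix scan, for example over `org_1/search/key_9/2024-05-17/`.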