Capture
Capture records request/response pairs as evidence — the raw material for evals, regression suites, and fine-tuning data. It's per-workload, sampled deterministically, written after your response is already streaming, and never normalized: what you sent and what came back, byte for byte.
When a request is captured
Three gates, all per workload: the workload has capture_enabled set, the request passes the workload's capture_sample_rate (0.0–1.0), and the request actually reached an upstream. The sampling gate hashes the request id, so a retried request is consistently captured or consistently skipped, and samples stay unbiased under retries.
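The three gates can be sketched in a few lines of Python. This is an illustration only: the actual hash function and bucketing scheme are not documented here, so SHA-256 and the function names `passes_sample_gate` and `should_capture` are assumptions, not the service's implementation.

```python
import hashlib


def passes_sample_gate(request_id: str, sample_rate: float) -> bool:
    """Deterministic sampling: the same request id always gets the same verdict.

    Assumption: SHA-256 stands in for whatever hash the service really uses.
    """
    if sample_rate <= 0.0:
        return False
    if sample_rate >= 1.0:
        return True
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into a float between 0 and 1.
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < sample_rate


def should_capture(capture_enabled: bool, sample_rate: float,
                   request_id: str, reached_upstream: bool) -> bool:
    """All three gates must pass for a request to be captured."""
    return (
        capture_enabled
        and reached_upstream
        and passes_sample_gate(request_id, sample_rate)
    )
```

Because the verdict depends only on the request id and the rate, a retry that reuses the id lands in the same bucket. A side effect of this particular scheme is that raising the rate only ever adds requests to the sample; it never drops ones that were previously included.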
Routing a workload to a model turns capture on by default: routed traffic is precisely the traffic you'll want evidence about. Pass an explicit capture_enabled: false to route without recording.
The capture envelope
One JSONL object per captured request, with the raw bodies plus the attribution needed to slice them later:
| field group | contents |
|---|---|
| identity | request_id, timestamp, organization, project, workload, and the API key id that sent the request |
| resolution | mode (byo/managed), provider, requested_model vs upstream_model (what your code asked for vs what served), endpoint |
| outcome | upstream status_code and latency_ms (latency is measured to first byte, so streaming responses report time-to-first-token) |
| payloads | customer_request_body, upstream_request_body (when a rewrite made them differ), response_body — raw, unparsed |
| your dimensions | tags — the key/value map from x-understudy-tags, if you sent one |
Error responses are captured too — a 4xx/5xx envelope with its body is often the most useful evidence in an incident.
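To show how the envelope fields can be sliced, here is a minimal Python sketch that reads JSONL envelopes, counts error responses per workload, and lists requests served by a different model than the one requested. It assumes the fields from the table appear as flat top-level keys; the real nesting may differ. The sample records and the helper names are made up for illustration.

```python
import json
from collections import Counter
from typing import Iterable, Iterator, List

# Hypothetical sample: two envelopes, trimmed to the fields used below.
SAMPLE = """\
{"request_id": "r1", "workload": "summarize", "status_code": 200, "latency_ms": 412, "requested_model": "model-a", "upstream_model": "model-a", "tags": {"tier": "free"}}
{"request_id": "r2", "workload": "summarize", "status_code": 502, "latency_ms": 1830, "requested_model": "model-a", "upstream_model": "model-b"}
"""


def iter_envelopes(lines: Iterable[str]) -> Iterator[dict]:
    """Parse one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def errors_by_workload(envelopes: Iterable[dict]) -> Counter:
    """Count 4xx/5xx envelopes per workload."""
    counts: Counter = Counter()
    for env in envelopes:
        if env.get("status_code", 0) >= 400:
            counts[env.get("workload", "unknown")] += 1
    return counts


def served_by_other_model(envelopes: Iterable[dict]) -> List[str]:
    """Request ids where upstream_model differs from requested_model."""
    return [
        env["request_id"]
        for env in envelopes
        if env.get("requested_model") != env.get("upstream_model")
    ]


envelopes = list(iter_envelopes(SAMPLE.splitlines()))
print(errors_by_workload(envelopes))
print(served_by_other_model(envelopes))
```

The lookups use `.get` because some fields are optional: tags, for example, are present only if the request sent them.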
What capture costs you
Nothing on the hot path. The observe phase runs after your response has begun streaming back; capture writes have their own retry and fallback machinery, and a capture failure can never fail or slow your request.
Where captures go, who sees them
Captures are stored per organization in durable object storage, partitioned by project, key, and date. They're browsable per-project in the dashboard (Captures), filterable by workload, and each entry opens to the full envelope. They are your data: captured for your evals and your replacement models, exportable in bulk on request during the preview.