Worker pool protocol (/orch)

How a pool of workers connects to pond, claims work, and reports results.

Code: app/routers/orch.py (pond, server side), swarm/src/backend.py (the orchestrator’s poll/push loops).

Workers poll pond

Pond never dials out to the pool. The orchestrator is a client of pond — it connects in, which is what lets workers live anywhere: behind NAT, on a laptop, on a third party’s machine pond has no route to.

An operator pairs an orchestrator once (swarm pair redeems a pairing code for a per-binding bearer token, which is stored locally). From then on it runs two loops:

  • poll: POST /orch/poll for dispatched jobs to claim;
  • push: relay each job’s state / output / telemetry, and push a worker registry snapshot every few seconds.
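The two loops can be sketched as one call each. This is a minimal illustration, not the code in swarm/src/backend.py: the injected transport callable, the Authorization: Bearer header scheme, the empty JSON poll body, and the "jobs" / "workers" payload keys are all assumptions made for the example.

```python
import json
from typing import Callable

# Hypothetical transport: (path, headers, body) -> response body.
# A real client would do an HTTPS POST to pond here.
Transport = Callable[[str, dict, bytes], bytes]


def _headers(bearer: str) -> dict:
    # Assumed header scheme for the per-binding bearer token.
    return {
        "Authorization": f"Bearer {bearer}",
        "Content-Type": "application/json",
    }


def poll_once(transport: Transport, bearer: str) -> list:
    """POST /orch/poll once and return the jobs claimed by this call."""
    raw = transport("/orch/poll", _headers(bearer), b"{}")
    payload = json.loads(raw)
    # Assumed response shape: {"jobs": [...]}; no key means nothing to claim.
    return payload.get("jobs", [])


def push_snapshot(transport: Transport, bearer: str, workers: list) -> None:
    """POST the current worker registry to /orch/workers/snapshot."""
    body = json.dumps({"workers": workers}).encode("utf-8")
    transport("/orch/workers/snapshot", _headers(bearer), body)
```

Both functions only ever call out from the orchestrator side, which matches the rule that pond never dials the pool. Passing the transport in keeps the sketch testable without a network.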

The surface

Every endpoint is authed by the per-binding orchestrator bearer; chunk/done are additionally gated by a job-scoped X-Claim-Token. A binding may only touch jobs it claimed.
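The two-layer gate described above might look like the following on the server side. It is a sketch under stated assumptions, not the logic in app/routers/orch.py: the job record shape (claimed_by, claim_token) and the function name are hypothetical.

```python
import hmac
from typing import Optional

# Endpoints that additionally require the job-scoped X-Claim-Token.
CLAIM_TOKEN_ENDPOINTS = {"chunk", "done"}


def may_touch_job(binding_id: str, job: dict, endpoint: str,
                  claim_token: Optional[str] = None) -> bool:
    """Return True if this binding may call `endpoint` on `job`.

    `job` is a hypothetical record: {"claimed_by": str, "claim_token": str}.
    """
    # Layer 1: a binding may only touch jobs it claimed.
    if job.get("claimed_by") != binding_id:
        return False
    # Layer 2: chunk/done also need the matching claim token.
    if endpoint in CLAIM_TOKEN_ENDPOINTS:
        if claim_token is None:
            return False
        # Constant-time comparison so timing does not leak the token.
        return hmac.compare_digest(
            claim_token.encode("utf-8"),
            job["claim_token"].encode("utf-8"),
        )
    return True
```

With this shape, a stolen bearer token alone is not enough to stream output into a job: the caller also needs the token issued when that specific job was claimed.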

Endpoint                                      Purpose
POST /orch/pair                               redeem a pairing code for a bearer token + protocol version
POST /orch/poll                               claim dispatched jobs (returns intents + tokens, no secrets)
POST /orch/jobs/{id}/state                    report a state transition (+ rc, error, signed attestation)
POST /orch/jobs/{id}/chunk                    stream stdout/stderr (size-capped)
POST /orch/jobs/{id}/telemetry                edge-captured model.call usage events
POST /orch/workers/snapshot                   push the worker registry (pins worker pubkeys)
POST /orch/enrollment/{code}/consumed         acknowledge a worker enrollment (pins keys)
POST /orch/seal, /orch/bundle/{token,redeem}  per-job secret / bundle sealing
GET /orch/whoami                              binding identity

No secrets on the poll

/orch/poll returns only intents: a cred_request (credential name + project, no value), a source_bundle flag + one-time token, and the job’s command + sandbox block. Real secrets are sealed to the claiming worker’s pinned key through the separate /orch/seal and /orch/bundle/redeem calls at claim time — see model-credentials and worker-source-delivery. A leaked or logged poll response therefore never exposes a credential.
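To make the "intents only" idea concrete, here is an illustrative poll job and a small check that its cred_request carries nothing beyond a name and a project. The exact key names ("id", "name", "project", "present", "token"), the sandbox contents, and all values are assumptions for the example; only cred_request, source_bundle, command, and sandbox come from the description above.

```python
# Illustrative poll job; every value here is made up.
example_job = {
    "id": "job-123",                               # hypothetical key
    "command": ["python", "run.py"],
    "sandbox": {"network": "none"},                # hypothetical contents
    # An intent: which credential is wanted, never its value.
    "cred_request": {"name": "MODEL_API_KEY", "project": "demo"},
    # A flag plus a one-time token, redeemed separately for the sealed bundle.
    "source_bundle": {"present": True, "token": "one-time-token"},
}

# Assumed key names for the credential intent.
CRED_REQUEST_KEYS = {"name", "project"}


def unexpected_cred_fields(job: dict) -> set:
    """Return any cred_request keys beyond the credential name and project."""
    cred = job.get("cred_request") or {}
    return set(cred) - CRED_REQUEST_KEYS


leaky_job = dict(example_job, cred_request={
    "name": "MODEL_API_KEY", "project": "demo", "value": "not-allowed",
})
print(unexpected_cred_fields(example_job))  # set()
print(unexpected_cred_fields(leaky_job))    # {'value'}
```

A check like this could run in a test suite or a logging filter to catch a regression that starts putting credential values on the poll.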

For a confidential run the poll job also carries recipients: the consumer’s X25519 public keys. Public keys are not secrets, so they ride the poll verbatim; the worker seals every chunk of agent output to them before upload, so the orchestrator and control plane only ever store ciphertext (see confidential-runs). The field is additive: older workers are tolerant readers and simply ignore it.
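A tolerant reader for the poll job can be sketched as below: it reads only the fields it knows, treats a missing recipients list as a non-confidential run, and ignores anything else. The PollJob class, the parse_job helper, and the "id" key are hypothetical names for illustration; the sealing step itself is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class PollJob:
    id: str                       # hypothetical key name
    command: object
    # X25519 public keys of the consumer; empty for a non-confidential run.
    recipients: list = field(default_factory=list)

    @property
    def confidential(self) -> bool:
        return bool(self.recipients)


def parse_job(raw: dict) -> PollJob:
    """Build a PollJob from a poll payload, ignoring unknown keys."""
    return PollJob(
        id=raw["id"],
        command=raw.get("command"),
        # Absent or null recipients both mean "not confidential".
        recipients=list(raw.get("recipients") or []),
    )


plain = parse_job({"id": "j1", "command": "echo hi", "future_field": 1})
sealed = parse_job({"id": "j2", "command": "echo hi", "recipients": ["pubkey-a"]})
print(plain.confidential, sealed.confidential)  # False True
```

Because unknown keys are dropped rather than rejected, a worker written this way keeps working when pond adds a new additive field.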

/orch/pair returns a protocol version; both sides are tolerant readers, so a pool and pond can run a version apart during a rollout.

Invariants

  • Pond never initiates the connection; the pool polls and pushes.
  • A binding can only read or mutate jobs it claimed.
  • The poll response is secret-free; every secret is delivered sealed to a specific worker’s pinned key.