Run an agent (worker pool)

swarm (agent) stages run on workers, not in pond. This guide shows how to attach a worker pool. (Builtin/shell/noop stages need none; skip this unless you’re running agents.)

How it fits together

An orchestrator (swarm serve) pairs with pond once, then polls /v1/orch for jobs and relays results back. Pond never dials out to it — so the pool can be anywhere.
Workers (swarm worker) register with the orchestrator, claim jobs, and run each agent in the sandbox the stage declared.
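The direction of the connections above can be sketched in a few lines. This is an illustrative model only, not the code in swarm.py: the class name, the fetch_jobs and relay_result callables, and the job/result dictionaries are all hypothetical stand-ins for whatever the real /v1/orch exchange carries. The point it shows is that every call to pond originates from the orchestrator, and workers only ever talk to the orchestrator.

```python
from collections import deque


class OrchestratorSketch:
    """Toy model of the pull-based flow; all names here are hypothetical."""

    def __init__(self, fetch_jobs, relay_result):
        # Both callables represent OUTBOUND requests from the orchestrator
        # to pond. Pond never opens a connection in the other direction.
        self._fetch_jobs = fetch_jobs
        self._relay_result = relay_result
        self.queue = deque()

    def poll(self):
        """Pull any pending jobs from pond into the local queue."""
        jobs = list(self._fetch_jobs())
        self.queue.extend(jobs)
        return len(jobs)

    def claim(self):
        """Called on behalf of a worker: hand out the next job, if any."""
        return self.queue.popleft() if self.queue else None

    def complete(self, job, output):
        """Called when a worker finishes: relay the result back to pond."""
        self._relay_result({"job_id": job["id"], "output": output})
```

Because pond is only ever the server in this exchange, the orchestrator needs outbound reachability to pond and nothing else, which is why the pool can sit behind NAT or on another network.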

The worker tool lives in this repo under ../../swarm/; it’s stdlib-only Python you copy to a worker host.

A pool is one orchestrator + its workers, and you can run several at once (e.g. a trusted in-house pool plus per-tenant BYO pools). For how this fits the overall deployment — what runs where and how many — see Concepts → What you deploy.

1. Pair an orchestrator

The orchestrator needs a one-time pairing code and pond’s URL, then it self-configures:

cd swarm
python swarm.py pair        # redeems a pairing code → writes backend-handle.json
python swarm.py serve --bind 127.0.0.1:8080 --token POOL_SECRET --state-dir ./swarm-state

serve reads the paired backend handle and starts polling pond. (--help lists the pair flags. Bind to loopback and terminate TLS in front of the orchestrator, or use --insecure-no-tls only on a trusted overlay network.)

Where do pairing and enrollment codes come from? You mint them from pond’s operator API: POST /v1/admin/pool/pairing-codes and …/enrollment-codes. The same API offers a read-only view of the pool at …/pool/orchestrators and …/pool/workers. See Set up an agent for the full walk-through, including registering the harness/model/credential a stage needs.
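As a rough sketch of what minting a pairing code might look like from a script, the helper below builds (but does not send) the POST request. Only the path /v1/admin/pool/pairing-codes comes from this guide. The bearer-token header, the empty JSON body, the function name, and the example URL are assumptions; check Set up an agent for the real authentication scheme and request/response shape.

```python
import urllib.request


def build_pairing_code_request(pond_url: str, operator_token: str) -> urllib.request.Request:
    """Build a POST to pond's pairing-code endpoint.

    ASSUMPTIONS: bearer-token auth and an empty JSON body. Neither is
    confirmed by this guide; only the endpoint path is.
    """
    url = pond_url.rstrip("/") + "/v1/admin/pool/pairing-codes"
    return urllib.request.Request(
        url,
        data=b"{}",  # assumed: no request parameters are required
        method="POST",
        headers={
            "Authorization": f"Bearer {operator_token}",  # assumed scheme
            "Content-Type": "application/json",
        },
    )


# Hypothetical usage; sending would be urllib.request.urlopen(req).
req = build_pairing_code_request("https://pond.example", "OPERATOR_TOKEN")
```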

2. Attach workers

python swarm.py worker \
  --orchestrator http://orch.lan:8080 \
  --enroll-code CODE \
  --tags pool:trusted,repo:acme \
  --capabilities harness.codex,sandbox.firecracker \
  --repo-root /srv/checkouts \
  --concurrency 2

--tags decides where jobs may land. Matching is by subset: every tag a stage requires must be present on the worker. Typical tags are pool:trusted / pool:byo, tenant:<id>, and deployment shards like repo:acme.
--capabilities decides what the worker can do (glob-matched): harness.* (which agent CLI), sandbox.* (which isolation backend). A worker only keeps a sandbox.<backend> capability it can actually enforce.
Install cryptography on the worker (uv sync in swarm/, or pip install cryptography) so it can unseal credentials/bundles and sign results. Without it the worker still runs, but can’t do sealed delivery or attestation.
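If you want to confirm the dependency before enrolling a worker, a small preflight check along these lines may help. It is a sketch, not part of swarm.py; the function names are hypothetical, and it only tests whether the cryptography package is importable in the current interpreter.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(name) is not None


def sealed_delivery_ready() -> bool:
    """Hypothetical preflight: is the cryptography package installed here?"""
    return module_available("cryptography")


if __name__ == "__main__":
    if sealed_delivery_ready():
        print("cryptography found: sealed delivery and attestation are possible")
    else:
        print("cryptography missing: install it before running sealed stages")
```

Run it with the same Python interpreter (or virtual environment) that will run the worker, since that is the environment that matters.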

Match the worker’s tags/capabilities to what your stages request (see Sandbox a run and Author a workflow). A job with no matching worker stays queued (fail-closed); it never runs somewhere unsuitable.
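The matching rules can be modelled as follows. This is a minimal sketch under stated assumptions, not the scheduler's actual code: it assumes a stage's required tags must be a subset of the worker's tags, and that each capability a stage requires is a glob pattern that at least one of the worker's advertised capabilities must match. The function names and the stage/worker dictionaries are hypothetical.

```python
from fnmatch import fnmatchcase


def tags_match(required_tags, worker_tags):
    """Subset match: every required tag must be present on the worker."""
    return set(required_tags) <= set(worker_tags)


def capabilities_match(required_patterns, worker_caps):
    """Glob match: each required pattern must match some worker capability.

    ASSUMPTION: the glob lives on the stage side (e.g. "sandbox.*").
    """
    return all(
        any(fnmatchcase(cap, pattern) for cap in worker_caps)
        for pattern in required_patterns
    )


def eligible(stage, worker):
    """A worker may claim a stage only if both checks pass (fail-closed)."""
    return tags_match(stage["tags"], worker["tags"]) and capabilities_match(
        stage["capabilities"], worker["capabilities"]
    )


# The worker from the command above.
worker = {
    "tags": ["pool:trusted", "repo:acme"],
    "capabilities": ["harness.codex", "sandbox.firecracker"],
}
# Hypothetical stage requirements.
ok_stage = {"tags": ["pool:trusted"], "capabilities": ["harness.codex", "sandbox.*"]}
other_tenant = {"tags": ["pool:trusted", "tenant:42"], "capabilities": ["harness.*"]}
```

Here eligible(ok_stage, worker) is true, while eligible(other_tenant, worker) is false because the worker lacks the tenant:42 tag, so that job would stay queued.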

3. Agents, models, credentials

A swarm stage resolves a harness (which CLI), a model, and a credential. These are configured through the same operator API noted above; pond seals the chosen credential to the claiming worker at dispatch (see specs/model-credentials).

Operating

The swarm.py subcommands status, jobs, cancel, and evict manage a running pool (run each with --help for its flags). The pool protocol itself is specified in specs/worker-pool-protocol.