Run an agent (worker pool)

swarm (agent) stages run on workers, not in pond. This guide attaches a pool. (Builtin/shell/noop stages need none — skip this unless you’re running agents.)

How it fits together

[Diagram: pond (/v1/orch) · orchestrator (swarm serve · claims jobs) · workers (swarm worker · run agent); the two links between them are labeled "poll".]
  • An orchestrator (swarm serve) pairs with pond once, then polls /v1/orch for jobs and relays results back. Pond never dials out to it — so the pool can be anywhere.
  • Workers (swarm worker) register with the orchestrator, claim jobs, and run each agent in the sandbox the stage declared.
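The pull-based flow described above can be sketched as a toy, in-memory simulation. This is only an illustration of the direction of the calls, not the real protocol: the class and method names (Pond, Orchestrator, poll, claim, relay, worker_step) and the job fields are hypothetical, and nothing here touches the network.

```python
from collections import deque


class Pond:
    """Stand-in for pond: it queues jobs and waits to be polled.

    It holds no reference to any orchestrator, so it cannot dial out.
    """

    def __init__(self):
        self.queue = deque()
        self.results = {}

    def poll(self):
        # Called BY the orchestrator; returns one job or None.
        return self.queue.popleft() if self.queue else None

    def post_result(self, job_id, result):
        self.results[job_id] = result


class Orchestrator:
    """Stand-in for `swarm serve`: polls pond, hands jobs to workers."""

    def __init__(self, pond):
        self.pond = pond
        self.pending = deque()

    def poll_pond(self):
        job = self.pond.poll()
        if job is not None:
            self.pending.append(job)

    def claim(self):
        # Called BY a worker; returns one job or None.
        return self.pending.popleft() if self.pending else None

    def relay(self, job_id, result):
        self.pond.post_result(job_id, result)


def worker_step(orchestrator):
    """Stand-in for one `swarm worker` iteration: claim, run, report."""
    job = orchestrator.claim()
    if job is None:
        return False
    orchestrator.relay(job["id"], f"ran {job['stage']}")
    return True


pond = Pond()
orchestrator = Orchestrator(pond)
pond.queue.append({"id": "j1", "stage": "review"})  # hypothetical job

orchestrator.poll_pond()   # orchestrator pulls from pond
worker_step(orchestrator)  # worker pulls from orchestrator, result flows back
print(pond.results)
```

Every call in the sketch originates from the pool side, which is why the pool can sit behind NAT or on another network as long as it can reach pond.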

The worker tool lives in this repo under ../../swarm/; it’s stdlib-only Python you copy to a worker host.

A pool is one orchestrator + its workers, and you can run several at once (e.g. a trusted in-house pool plus per-tenant BYO pools). For how this fits the overall deployment (what runs where, and how many), see Concepts: What you deploy.

1. Pair an orchestrator

The orchestrator needs a one-time pairing code and pond’s URL, then it self-configures:

cd swarm
python swarm.py pair        # redeems a pairing code → writes backend-handle.json
python swarm.py serve --bind 127.0.0.1:8080 --token POOL_SECRET --state-dir ./swarm-state

serve reads the paired backend handle and starts polling pond. (--help lists the pair flags. Bind to loopback and terminate TLS in front of the orchestrator, or use --insecure-no-tls only on a trusted overlay network.)

Where do pairing & enrollment codes come from? You mint them from pond’s operator API — POST /v1/admin/pool/pairing-codes and …/enrollment-codes (and there’s a read-only …/pool/orchestrators + …/pool/workers view). See Set up an agent for the full walk-through, including registering the harness/model/credential a stage needs.
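As a rough sketch, a request to the pairing-code endpoint named above could be built with the standard library as follows. Only the path /v1/admin/pool/pairing-codes comes from this guide; the base URL, the bearer-token header, and the empty JSON body are assumptions made for illustration, and the response shape is not shown because it is not specified here. Check Set up an agent for the actual request format before relying on this.

```python
import urllib.request

# Hypothetical values: substitute your pond URL and operator credential.
POND_URL = "https://pond.example.internal"
OPERATOR_TOKEN = "REPLACE_ME"


def build_pairing_code_request(base_url, operator_token):
    """Build (but do not send) a POST to the pairing-code endpoint.

    Assumptions: bearer-token auth and an empty JSON body. Neither is
    confirmed by this guide.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/admin/pool/pairing-codes",
        data=b"{}",
        method="POST",
        headers={
            "Authorization": f"Bearer {operator_token}",
            "Content-Type": "application/json",
        },
    )


request = build_pairing_code_request(POND_URL, OPERATOR_TOKEN)
print(request.get_method(), request.full_url)

# To send it against a real deployment:
#   with urllib.request.urlopen(request) as response:
#       print(response.read().decode())
```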

2. Attach workers

python swarm.py worker \
  --orchestrator http://orch.lan:8080 \
  --enroll-code CODE \
  --tags pool:trusted,repo:acme \
  --capabilities harness.codex,sandbox.firecracker \
  --repo-root /srv/checkouts \
  --concurrency 2
  • --tags decide where jobs may land (subset-matched against a stage’s required tags): pool:trusted / pool:byo, tenant:<id>, deployment shards like repo:acme.
  • --capabilities decide what the worker can do (glob-matched): harness.* (which agent CLI), sandbox.* (which isolation backend). A worker only keeps a sandbox.<backend> capability it can actually enforce.
  • Install cryptography on the worker (uv sync in swarm/, or pip install cryptography) so it can unseal credentials/bundles and sign results. Without it the worker still runs, but can’t do sealed delivery or attestation.
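A quick preflight check for the cryptography dependency mentioned above might look like this. It is a minimal sketch that only tests whether the package is importable on the worker host; the function name has_cryptography is made up for this example and is not part of swarm.py.

```python
import importlib.util


def has_cryptography():
    """Return True if the 'cryptography' package is importable here."""
    return importlib.util.find_spec("cryptography") is not None


if has_cryptography():
    print("cryptography found: sealed delivery and attestation can work")
else:
    print("cryptography missing: run `uv sync` in swarm/ or `pip install cryptography`")
```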

Match the worker’s tags/capabilities to what your stages request (see Sandbox a run and Author a workflow). A job with no matching worker stays queued (fail-closed); it never runs somewhere unsuitable.
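The matching rules can be illustrated with a small sketch. It assumes that a stage's required tags must all be present on the worker (subset match) and that each capability a stage requests is a glob pattern that must match at least one capability the worker advertises. The direction of the glob match, the dictionary layout, and the function names are assumptions for illustration, not the orchestrator's actual code.

```python
from fnmatch import fnmatchcase


def tags_match(required_tags, worker_tags):
    """Subset match: every tag the stage requires must be on the worker."""
    return set(required_tags) <= set(worker_tags)


def capabilities_match(required_patterns, worker_capabilities):
    """Glob match: each requested pattern must match some worker capability."""
    return all(
        any(fnmatchcase(capability, pattern) for capability in worker_capabilities)
        for pattern in required_patterns
    )


def eligible_workers(stage, workers):
    """Return names of workers that satisfy both tags and capabilities."""
    return [
        worker["name"]
        for worker in workers
        if tags_match(stage["tags"], worker["tags"])
        and capabilities_match(stage["capabilities"], worker["capabilities"])
    ]


# A worker shaped like the `swarm.py worker` example above.
workers = [
    {
        "name": "w1",
        "tags": ["pool:trusted", "repo:acme"],
        "capabilities": ["harness.codex", "sandbox.firecracker"],
    }
]

stage_ok = {"tags": ["pool:trusted"], "capabilities": ["harness.codex", "sandbox.*"]}
stage_unmatched = {"tags": ["pool:trusted", "tenant:42"], "capabilities": ["harness.*"]}

print(eligible_workers(stage_ok, workers))
print(eligible_workers(stage_unmatched, workers))
```

In this sketch the second stage has no eligible worker because nothing carries the hypothetical tenant:42 tag, which corresponds to the fail-closed case: the job would stay queued rather than run on w1.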

3. Agents, models, credentials

A swarm stage resolves a harness (which CLI), a model, and a credential. These are configured through the same operator console noted above; pond seals the chosen credential to the claiming worker at dispatch (see specs/model-credentials).

Operating

swarm.py status / jobs / cancel / evict manage a running pool (run each with --help for its options). The pool protocol itself is specified in specs/worker-pool-protocol.

See also: Untrusted & BYO workers.