Pond — hardened, trusted, still-functional runs
Design plan for executing agent runs in pond so they are hardened (untrusted
code + untrusted agent + untrusted worker host can’t harm pond, other tenants, or
the host) and trusted (pond can rely on the result), while staying a real,
modifiable coding harness (the agent can read/write files, run builds & tests,
install deps, and call the model). Companion to pond-migration.md.
Grounded in a survey of what already exists. Where the survey’s current-state claims are uncertain they’re marked (verify) — confirm before relying.
The tension
Hardening and “a fully functional coding harness” pull opposite ways: real isolation wants no filesystem, no network, no privileges; a coding agent legitimately needs a writable tree, a toolchain, package installs, and the model API. The resolution is layered, tiered isolation with a default-deny egress allowlist — lock everything, then open exactly the holes a harness needs, and match isolation strength to how much you trust the host.
Threat model (decided)
Actors and trust:
| Actor | Trusted? | Why it matters |
|---|---|---|
| Target repository code | No | Builds/tests run arbitrary code; may be malicious. |
| The agent itself | No | Prompt-injectable; can be steered to exfiltrate/pivot. |
| Worker host (BYO / third-party) | No | Pairing/enrollment lets anyone attach a worker. |
| Orchestrator | Semi | Bearer-authed; today results are trusted on its say-so. |
| Pond (control plane) | Yes | The only fully trusted component. |
Two distinct threats, often conflated — keep them separate:
- T1 — untrusted agent/code on a trusted host. Defended by a strong in-VM sandbox + egress allowlist + sealed secrets + quotas. This is the common case and the bulk of the value.
- T2 — untrusted host. A malicious worker operator can read anything the job touches after it’s decrypted on their box (source, unsealed creds, outputs) and can fabricate results. Isolation cannot fix this — a sandbox protects the host from the job, not the job from the host. T2 needs trust tiering + result attestation + data minimization, and for sensitive work on untrusted hardware, confidential computing (below).
Trust tiers + routing
Route by the existing two axes (tags for deployment sharding, capabilities
for features) plus a new trust facet:
- Tier A: pond-operated pool (tag: pool:trusted, KVM hosts). Runs untrusted code for any tenant. Host is trusted, so isolation protects the host + enforces cross-tenant separation. Sensitive / multi-tenant runs land here. Default for product-driven runs.
- Tier B: BYO / untrusted host (tag: pool:byo). Only runs work whose inputs that operator is already entitled to see (self-hosted, single-tenant: “bring your own compute for your own repos”). Results are attested and verified before pond trusts them. Never receives another tenant’s data.
- Tier C: confidential (future, capability: sandbox.cc-snp/cc-tdx). Untrusted host can run sensitive work because AMD SEV-SNP / Intel TDX keeps the guest memory opaque to the host and pond gets a remote-attestation quote before releasing secrets. The only honest answer to “sensitive workload on hardware I don’t control.” Firecracker alone does not provide this.
Pond is fail-closed: a run’s required tier is part of its sandbox spec; a job
only dispatches to a worker whose tags/capabilities satisfy it (extends the
existing _SANDBOX_PROFILE_MIN_CAP routing).
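The fail-closed dispatch rule can be sketched as a subset check. This is a minimal illustration, not the existing routing code: the `Worker` and `SandboxSpec` types, the `route` helper, and the capability names `sandbox.kata` and `sandbox.gvisor` are all hypothetical. Only the tags `pool:trusted` / `pool:byo` and the capability `sandbox.cc-snp` come from the text above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Worker:
    """Hypothetical worker record: deployment tags plus feature capabilities."""
    name: str
    tags: frozenset = field(default_factory=frozenset)
    capabilities: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class SandboxSpec:
    """Hypothetical slice of a job's sandbox spec that matters for routing."""
    required_tags: frozenset = field(default_factory=frozenset)
    required_capabilities: frozenset = field(default_factory=frozenset)


def eligible(worker: Worker, spec: SandboxSpec) -> bool:
    """A worker qualifies only if it satisfies every required tag and capability."""
    return (spec.required_tags <= worker.tags
            and spec.required_capabilities <= worker.capabilities)


def route(workers, spec):
    """Return the first eligible worker, or None when nothing qualifies."""
    for worker in workers:
        if eligible(worker, spec):
            return worker
    return None  # fail closed: never fall back to a weaker worker
```

With a pool containing one `pool:byo` worker and one `pool:trusted` worker, a spec requiring `pool:trusted` selects the second, and a spec requiring a capability no worker advertises returns `None`, so the job stays queued rather than being downgraded.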
Isolation backend: “Firecracker-grade”, delivered pragmatically
VM-level isolation is correct (shared-kernel containers are insufficient for untrusted code on a multi-tenant pool). But ship it so the harness stays an editable OCI image, not a hand-baked rootfs:
- Preferred: Kata Containers (Firecracker VMM). microVM hardware boundary + normal container images. “Modify the environment” = edit a Dockerfile, not a kernel. KVM-required.
- Alt: gVisor (runsc). Lighter ops, runs on more hosts; occasional syscall-compat gaps — acceptable for most coding harnesses.
- Dev fallback: passthrough (the existing NoneProvider) for local/darwin where there’s no KVM. Never for untrusted multi-tenant.
These become sandbox providers/backends alongside today’s NoneProvider +
DockerProvider (swarm/src/sandbox.py), selected by the sandbox.{profile, backend} block already on DispatchJob. Raw Firecracker stays possible via a
backend but isn’t the default — the image ergonomics are what keep the harness
“modifiable and functional.”
Sandbox profile catalog
Profiles fix the isolation posture (immutable); only resource/image knobs are
overridable — mirrors today’s _SANDBOX_OVERRIDABLE = (memory, cpus, pids, image).
| Profile | Filesystem | Network | Use |
|---|---|---|---|
| dev (existing none) | host | open | local/darwin only |
| review (existing untrusted-code-read) | ro checkout | broker-only | read-only analysis |
| harness (new untrusted-code-write) | rw throwaway copy | allowlist egress | the full coding harness (edit/build/test/install) |
harness is the new profile that makes “still functional” true: a disposable
read-write copy of the checkout (so edits + node_modules are fine and discarded
after), an allowlist egress (so pip/npm install + git + the model API work),
dropped caps + no-new-privileges + non-root user, microVM backend, and resource
quotas. Output (edit_result.json) is captured from the rw copy.
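The split between an immutable posture and overridable resource knobs might look like the sketch below. The posture values are copied from the catalog table and the knob names from the overridable tuple quoted above. The `Profile` type, the `PROFILES` mapping, and the `resolve` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from types import MappingProxyType

# The overridable knobs named in the text.
OVERRIDABLE = ("memory", "cpus", "pids", "image")


@dataclass(frozen=True)
class Profile:
    """Isolation posture; frozen so it cannot be mutated after definition."""
    filesystem: str
    network: str


# Posture values come from the catalog table; the mapping layout is an assumption.
PROFILES = MappingProxyType({
    "dev": Profile("host", "open"),
    "review": Profile("ro checkout", "broker-only"),
    "harness": Profile("rw throwaway copy", "allowlist egress"),
})


def resolve(profile_name: str, overrides: dict) -> dict:
    """Merge per-job overrides into a profile, rejecting any posture change."""
    if profile_name not in PROFILES:
        raise ValueError(f"unknown profile: {profile_name}")
    illegal = sorted(set(overrides) - set(OVERRIDABLE))
    if illegal:
        raise ValueError(f"not overridable: {illegal}")
    profile = PROFILES[profile_name]
    return {"filesystem": profile.filesystem, "network": profile.network, **overrides}
```

Rejecting unknown keys outright, instead of silently dropping them, keeps a mistyped or malicious override such as `network: open` from going unnoticed.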
Egress: default-deny allowlist proxy (decided)
Default-deny; open exactly:
    allow: <model API host>              # e.g. api.anthropic.com
           <package registries>          # pypi.org, files.pythonhosted.org,
                                         # registry.npmjs.org, crates.io, …
           <the target's own git remote>
    deny:  169.254.0.0/16 (metadata)  10.0.0.0/8  172.16.0.0/12
           192.168.0.0/16  ::1  loopback  everything else
- Enforcement point matters per tier. On Tier A the proxy is a pond-/pool-controlled egress gateway the microVM is forced through (a malicious agent can’t bypass it). On Tier B the allowlist lives in the guest’s network config: it constrains the agent, but a malicious host can ignore it (that’s the T2 reality; mitigate via data minimization, not network policy).
- The registry allowlist is operator-editable per project/workflow (this is part of “modifiable”) with safe defaults; adding a host is an explicit, audited config change.
- Pairs with the broker: the model key is reached only via the broker URL, which is itself on the allowlist.
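The default-deny decision can be illustrated with the standard `ipaddress` module. This is a sketch under assumptions, not the proxy's actual logic: the `egress_allowed` function and the two constants are hypothetical, the host list reuses the examples above and omits the per-target git remote, and reading “loopback” as `127.0.0.0/8` is an interpretation of the deny list.

```python
import ipaddress

# Hosts from the allowlist example; a real deployment would load these from
# operator-edited config and add the target's own git remote.
ALLOWED_HOSTS = frozenset({
    "api.anthropic.com",
    "pypi.org",
    "files.pythonhosted.org",
    "registry.npmjs.org",
    "crates.io",
})

# Ranges from the deny list. 127.0.0.0/8 is an assumed reading of "loopback".
DENIED_NETWORKS = tuple(ipaddress.ip_network(n) for n in (
    "169.254.0.0/16", "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
    "127.0.0.0/8", "::1/128",
))


def egress_allowed(host: str, resolved_ip: str) -> bool:
    """Default-deny: allow only listed hosts that resolve outside denied ranges."""
    ip = ipaddress.ip_address(resolved_ip)
    if any(ip.version == net.version and ip in net for net in DENIED_NETWORKS):
        # Checked first so an allowed name pointed at a metadata or private
        # address is still refused.
        return False
    return host.lower().rstrip(".") in ALLOWED_HOSTS
```

Checking the resolved address before the hostname means an allowlisted name cannot be used to reach the metadata service through a crafted DNS answer.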
Secret flow + data minimization
- Keep the sealed-box + broker design: model creds resolved at dispatch (project→org), Fernet at rest, sealed to the worker pubkey per job, broker sidecar holds the real key so the agent env only ever has a dummy + BROKER_URL (swarm/src/sandbox.py broker plan, crypto/sealed.py, harness runtime files {{SECRET:NAME}} / {{BROKER_URL}}).
- H0 RESULT (2026-06-21): sealing IS fully wired and fail-closed; the wire is safe. Traced live: seal() runs server-side at claim time on POST /orch/seal (model creds, orch.py:648) and POST /orch/bundle/redeem (bundle key, orch.py:745), each requiring the worker’s enrollment-pinned 32-byte pubkey (else ok=false, no payload). The /orch/poll response carries no secret, only a cred_request intent (credential_name + project_id), a source_bundle boolean, and a one-time bundle token (orch.py:233-255). The orchestrator wires BackendClient.seal as the claim-time callback (swarm/src/http.py); the worker unseals with its private key (swarm/src/security.py). No cleartext fallback if the pubkey is absent; the job just runs without the secret. The survey’s “exercised only by tests / cleartext on the wire” was a stale sealed.py docstring (now corrected), not a real gap.
- Data minimization for Tier B (T2): an untrusted host can read anything it decrypts. So BYO workers get only their own operator’s data; sensitive / cross-tenant secrets + source never route to pool:byo. Broker-only network keeps the key out of the agent, but not out of a malicious host; don’t oversell it.
Result trust / attestation
Today results are trusted because they arrived over an orchestrator bearer token — no proof the worker ran the job (survey §7). For untrusted hosts that’s not enough. Add, in order of cost:
- Signed results. Worker signs (job_id, input_sha, output_sha, rc) with the key already pinned at enrollment (pubkey pinning + optional fingerprint exist). Pond verifies the signature + that input_sha matches the bundle it issued. Cheap; closes result fabrication by anyone other than the keyholder.
- Output integrity + bounds. Hash the artifact set; sequence + size-cap the stdout/stderr chunks (today they’re byte-counted but unbounded and unordered) so a worker can’t DoS via gigabytes of output.
- Reproducibility / quorum (Tier B sensitive). Re-run high-stakes jobs on a Tier-A worker (or a second BYO) and compare output hashes; divergence → distrust.
- Remote attestation (Tier C). SEV-SNP/TDX quote bound to the guest measurement, verified before secret release. The real fix for “trust a result from hardware I don’t own.”
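The signed-results check has two parts: the signature must verify, and the signed input_sha must equal the hash of the bundle pond issued. The sketch below shows that control flow only. HMAC-SHA256 is swapped in as a stand-in so the example runs on the standard library; the design calls for an asymmetric signature made with the enrollment-pinned worker key, which a shared-secret MAC does not replicate. The function names and the JSON encoding of the tuple are assumptions.

```python
import hashlib
import hmac
import json


def canonical_claim(job_id: str, input_sha: str, output_sha: str, rc: int) -> bytes:
    """Serialize the signed tuple deterministically so both sides sign the same bytes."""
    return json.dumps([job_id, input_sha, output_sha, rc], separators=(",", ":")).encode()


def sign(key: bytes, claim: bytes) -> str:
    # STAND-IN: HMAC-SHA256, chosen only to keep the sketch stdlib-only.
    # A real worker would sign with its private key and pond would verify
    # against the public key pinned at enrollment.
    return hmac.new(key, claim, hashlib.sha256).hexdigest()


def verify_result(key: bytes, issued_input_sha: str, job_id: str,
                  input_sha: str, output_sha: str, rc: int, signature: str) -> bool:
    """Accept a result only if it is authentic AND refers to the issued input."""
    claim = canonical_claim(job_id, input_sha, output_sha, rc)
    if not hmac.compare_digest(sign(key, claim), signature):
        return False  # tampered fields or wrong key
    # A validly signed result for a different input is still rejected.
    return hmac.compare_digest(input_sha, issued_input_sha)
```

Binding input_sha into the signed claim is what stops a keyholder from replaying a genuine result that was computed over a different bundle.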
Resource governance (per-run quotas)
Make the optional Docker knobs enforced defaults, add what’s missing:
- cgroup defaults per profile: memory, cpus, pids-limit (fork-bomb), block-io.
- ulimits: nofile, nproc, core=0, stack — none today.
- disk quota on the rw copy + tmpfs caps (checkout extraction can fill disk).
- wall-clock budget per run (not just per-stage timeout) → auto-cancel.
- cost ceiling: enforce max_cost_cents pre-emptively. The telemetry rollup already computes run cost; cancel when it crosses the budget instead of only recording it after the fact.
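Pre-emptive enforcement of the wall-clock budget and cost ceiling amounts to a check that runs while the job is still in flight. A minimal sketch, assuming a periodic supervisor tick: `RunBudget` and `should_cancel` are hypothetical names, and only `max_cost_cents` appears in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunBudget:
    """Per-run limits; max_cost_cents is the field named in the text."""
    max_wall_seconds: float
    max_cost_cents: int


def should_cancel(budget: RunBudget, elapsed_seconds: float,
                  cost_cents: int) -> Optional[str]:
    """Return a cancellation reason once a limit is reached, else None."""
    if elapsed_seconds >= budget.max_wall_seconds:
        return "wall-clock budget exceeded"
    if cost_cents >= budget.max_cost_cents:
        return "cost ceiling reached"
    return None
```

A supervisor would call `should_cancel` on each telemetry update with the run's elapsed time and rolled-up cost, and cancel the run on the first non-`None` reason.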
What keeps it a functional, modifiable harness
The hardening is designed around the harness, not against it:
- rw throwaway checkout copy → edits, builds, node_modules, generated files all work; discarded after the run (rw-checkout-copy posture exists).
- allowlist egress incl. registries + git → pip/npm/cargo install and git operations succeed; only exfil/pivot is blocked.
- editable OCI image (Kata/gVisor) → operators bake/modify the toolchain via a Dockerfile; per-profile image override already allowed.
- harness runtime files → the agent CLI (claude/codex) + its config land in $HOME via the existing {{MODEL}} / {{BROKER_URL}} / {{SECRET:NAME}} templating.
- generous-but-bounded resources + wall-clock + cost ceiling → big enough for real builds, capped so a run can’t run away.
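The runtime-file templating can be pictured as a single substitution pass over the three placeholder forms. This is an illustrative sketch, not the existing implementation: the `render` function, its signature, and the rule that secret names are alphanumeric plus underscore are assumptions.

```python
import re

# Matches {{MODEL}}, {{BROKER_URL}}, and {{SECRET:NAME}}. The allowed
# characters in NAME are an assumption.
_PLACEHOLDER = re.compile(r"\{\{(MODEL|BROKER_URL|SECRET:([A-Za-z0-9_]+))\}\}")


def render(template: str, model: str, broker_url: str, secrets: dict) -> str:
    """Substitute placeholders; a missing secret raises instead of rendering blank."""
    def substitute(match: re.Match) -> str:
        token, secret_name = match.group(1), match.group(2)
        if token == "MODEL":
            return model
        if token == "BROKER_URL":
            return broker_url
        if secret_name not in secrets:
            raise KeyError(f"no value for secret {secret_name!r}")
        return secrets[secret_name]

    return _PLACEHOLDER.sub(substitute, template)
```

Under the broker design described earlier, the value supplied for a secret in the agent's environment would be the dummy key, with the real credential held by the broker sidecar.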
Implementation status
Everything in this model is shipped and fail-closed today: trust-tier routing,
the three sandbox profiles, the gVisor / Kata-Firecracker isolation backends
(fail-closed when the runtime is absent — never a silent downgrade to runc), the
default-deny egress allowlist (per-job --internal network + operator-editable
rules), per-job credential sealing + the broker sidecar, encrypted source bundles,
signed result attestation (Ed25519, pinned at enrollment), and per-run resource /
wall-clock / cost governance. The mechanisms are specified in
docs/specs/.
The one deferred piece is a confidential tier (SEV-SNP / TDX with remote-attestation-gated secret release), for running sensitive workloads on a fully untrusted host — additive, and only needed if that’s a hard requirement.