Run a real coding agent (codex)

Verified end-to-end on k3d: a git_https source → bundle → dispatch to a worker → the OpenAI Codex CLI runs on the checkout → a real model call → output promoted to a run artifact (results.jsonl). This is the recipe.

1. A worker image with the agent CLI

The base worker carries only the pool machinery; bake the agent CLI in:

# deploy/worker-codex.Dockerfile — Node + @openai/codex on top of the worker

Build, import, and run the pool on it:

docker build -t pond-worker-codex:dev -f deploy/worker-codex.Dockerfile .
k3d image import pond-worker-codex:dev -c pond
helm upgrade pond deploy/chart -n pond --reuse-values \
  --set worker.image.repository=pond-worker-codex \
  --set worker.capabilities=harness.codex

The control-plane image also needs git (it clones git_https sources) — already in Dockerfile.

2. Register credential + model + harness

Shortcut: GET /v1/admin/harness-presets lists built-in templates; POST /v1/admin/harnesses/from-preset {"preset":"codex"} registers a correct harness without hand-authoring argv. The manual form below is equivalent.

# credential (sealed): codex reads OPENAI_API_KEY for its configured provider
POST /v1/admin/credentials {"name":"openrouter","provider":"openrouter",
  "secrets":{"OPENAI_API_KEY":"sk-or-..."}}

# model: a model id on your provider
POST /v1/admin/models {"key":"orx","provider":"openrouter",
  "model_id":"openai/gpt-4o-mini","credential_name":"openrouter","capability":"harness.codex"}

# harness: DIRECT ARGV (no `sh -lc`), {prompt} as its own element
POST /v1/admin/harnesses {"key":"codex","capability":"harness.codex",
  "command_template":[
    "codex","exec","--dangerously-bypass-approvals-and-sandbox","--skip-git-repo-check",
    "-c","model_provider=openrouter",
    "-c","model_providers.openrouter.base_url=https://openrouter.ai/api/v1",
    "-c","model_providers.openrouter.env_key=OPENAI_API_KEY",
    "-c","model_providers.openrouter.wire_api=responses",
    "-m","{model}","{prompt}"]}

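The three registrations above can also be scripted. The following is a minimal Python sketch that only builds the requests; the endpoints and bodies are copied from this section, while BASE_URL, the helper names, and the absence of any auth header are assumptions (the section does not say how the admin API is exposed or authenticated).

```python
import json
import urllib.request

# Assumption: where the control plane listens. Not stated in this document.
BASE_URL = "http://localhost:8080"


def admin_post(path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for an admin endpoint."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def registration_requests(api_key: str) -> list:
    """Return the credential, model, and harness requests, in that order."""
    credential = {
        "name": "openrouter",
        "provider": "openrouter",
        "secrets": {"OPENAI_API_KEY": api_key},
    }
    model = {
        "key": "orx",
        "provider": "openrouter",
        "model_id": "openai/gpt-4o-mini",
        "credential_name": "openrouter",
        "capability": "harness.codex",
    }
    harness = {
        "key": "codex",
        "capability": "harness.codex",
        # Direct argv: every element is one argument, {prompt} stands alone.
        "command_template": [
            "codex", "exec",
            "--dangerously-bypass-approvals-and-sandbox", "--skip-git-repo-check",
            "-c", "model_provider=openrouter",
            "-c", "model_providers.openrouter.base_url=https://openrouter.ai/api/v1",
            "-c", "model_providers.openrouter.env_key=OPENAI_API_KEY",
            "-c", "model_providers.openrouter.wire_api=responses",
            "-m", "{model}", "{prompt}",
        ],
    }
    return [
        admin_post("/v1/admin/credentials", credential),
        admin_post("/v1/admin/models", model),
        admin_post("/v1/admin/harnesses", harness),
    ]
```

To send them, pass each request to urllib.request.urlopen in order, after adding whatever authentication your deployment requires.
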
3. Run it

POST /v1/runs {"project_id":"…","sources":[{"label":"repo","kind":"git_https",
  "config":{"url":"https://github.com/sindresorhus/slugify"}}],
  "definition":{"stages":[{"key":"agent","name":"Codex","executor":{
    "kind":"swarm","harness":"codex","model_ref":"orx",
    "prompt":"Summarize this repo. End with a ```result block: {\"external_id\":\"E2E-1\",\"title\":\"…\"}",
    "sandbox":{"profile":"none"},"capabilities":["harness.codex"]}}]}}
# → fetch → codex runs on the checkout → extracted 1 result → artifacts/results.jsonl → done
# (no `parse` block → the neutral default: fence `result` → results.jsonl)
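
As an illustration of what "fence `result` → results.jsonl" means, here is a minimal Python sketch of that kind of extraction. It is not the project's parser: the function names are hypothetical, and silently skipping blocks that are not valid JSON objects is an assumption of this sketch.

```python
import json
import re

# Three backticks, built at runtime so this listing contains no literal fence.
FENCE = "`" * 3

# A fenced block whose info string is exactly `result`.
RESULT_FENCE = re.compile(FENCE + r"result[ \t]*\n(.*?)\n?" + FENCE, re.DOTALL)


def extract_results(agent_output: str) -> list:
    """Collect every JSON object found inside a `result` fence."""
    results = []
    for match in RESULT_FENCE.finditer(agent_output):
        try:
            parsed = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue  # assumption: malformed blocks are skipped, not fatal
        if isinstance(parsed, dict):
            results.append(parsed)
    return results


def to_jsonl(results: list) -> str:
    """Serialize results as one JSON object per line."""
    return "".join(json.dumps(item) + "\n" for item in results)
```

Feeding it agent output that ends with one `result` block containing {"external_id":"E2E-1", ...} yields a single-line JSONL string, which matches the "extracted 1 result" step in the trace above.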

Check before you run. Upserting a harness lints its command template: it rejects {{…}} placeholders in the command, flags unknown placeholders and {prompt_file} on remote pools, and returns warnings. To see the exact argv a run would exec: POST /v1/admin/harnesses/<key>/dry-run {"model_ref":"…","prompt":"…"}.
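
The lint rules described above can be approximated locally before you upsert. This is a minimal Python sketch under stated assumptions: the set of known placeholders is only the three named in this document (the server may accept more), and the function name and message wording are hypothetical.

```python
import re

# Assumption: only the placeholders this document mentions.
KNOWN_PLACEHOLDERS = {"model", "prompt", "prompt_file"}

DOUBLE_BRACE = re.compile(r"\{\{.*?\}\}")
SINGLE_BRACE = re.compile(r"\{([A-Za-z_]+)\}")


def lint_command_template(argv: list, remote_pool: bool = True) -> tuple:
    """Return (errors, warnings) for a direct-argv command template."""
    errors, warnings = [], []
    for arg in argv:
        if DOUBLE_BRACE.search(arg):
            # Double braces belong in runtime files, not in the command.
            errors.append(f"double-brace placeholder in command: {arg!r}")
            continue
        for name in SINGLE_BRACE.findall(arg):
            if name not in KNOWN_PLACEHOLDERS:
                warnings.append(f"unknown placeholder {{{name}}}")
            elif name == "prompt_file" and remote_pool:
                warnings.append("{prompt_file} is not readable from a remote worker; use {prompt}")
    return errors, warnings
```

A template such as ["codex", "exec", "-m", "{{MODEL}}", "{prompt}"] produces one error and no warnings, while the template from step 2 passes cleanly.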

Gotchas (learned the hard way)

  • Command placeholders are single-brace: {model}, {prompt}, {prompt_file}, not {{MODEL}} (double braces are for runtime files). A wrong one silently passes the literal {{MODEL}} to the CLI.
  • For a remote/orchestrated pool use {prompt} (text), not {prompt_file} — the file path is on the control-plane host; a worker pod can’t open it. And give {prompt} its own argv element (no sh -lc) so backticks/quotes in the prompt aren’t shell-parsed.
  • codex ≥ 0.40 dropped wire_api="chat" — use wire_api="responses" (the provider must expose an OpenAI /responses endpoint; OpenRouter does).
  • --dangerously-bypass-approvals-and-sandbox is correct here — the worker pod is already the sandbox; it tells codex not to add its own.
  • sandbox.profile: none runs the agent directly in the worker pod (network open, credential in env). For untrusted code use a confined profile on a sandbox-capable worker (see sandboxing); egress is then governed by the operator allowlist (operator-setup).
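
The advice above to give {prompt} its own argv element can be illustrated with a short Python sketch. It assumes a simple single-pass substitution; render_argv is a hypothetical helper, not the worker's actual renderer.

```python
import re

PLACEHOLDER = re.compile(r"\{([a-z_]+)\}")


def render_argv(template: list, values: dict) -> list:
    """Substitute {name} placeholders element by element.

    Each template element stays exactly one argv element, and substituted
    text is not rescanned, so braces inside the prompt are left alone.
    """
    def substitute(match):
        # Unknown placeholders are left as-is in this sketch.
        return values.get(match.group(1), match.group(0))

    return [PLACEHOLDER.sub(substitute, element) for element in template]


template = ["codex", "exec", "-m", "{model}", "{prompt}"]
prompt = "Explain what `make test` does; don't guess. Reply as {\"title\": \"...\"}"
argv = render_argv(template, {"model": "openai/gpt-4o-mini", "prompt": prompt})

# The prompt arrives as one intact argument, backticks and quotes included.
assert len(argv) == len(template)
assert argv[-1] == prompt
```

Passing such a list to subprocess.run without shell=True hands each element to the program verbatim, which is why no shell quoting is needed for the prompt.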