docs(runtime): clarify fleet-backed subagents

State the v0.8.60 direction plainly: sub-agent is role and UX vocabulary, while durable detached work should use the fleet-backed worker lifecycle.

Document the current agent_open path as compatibility until retry, receipt, and ledger semantics are unified with Fleet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Hunter B
2026-06-13 13:06:28 -07:00
parent a21048a207
commit 5815f94f0b
5 changed files with 103 additions and 27 deletions
+13
View File
@@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
supports Ctrl+P/Ctrl+N movement without letting the global file picker steal
focus while the menu is open. Thanks @1Git2Clone for the PR.
### Changed
- **README and contributor credits.** The README now has a shorter public
overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`,
preserving public thanks for [DeepSeek](https://github.com/deepseek-ai),
[DataWhale](https://github.com/datawhalechina),
[OpenWarp](https://github.com/zerx-lab/warp), and
[Open Design](https://github.com/nexu-io/open-design).
- **Fleet-backed sub-agent direction.** Runtime docs now state the intended
cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached
work should converge on the fleet-backed worker lifecycle with retries,
receipts, and ledgered inspection.
### Fixed
- **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented
+13
View File
@@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
supports Ctrl+P/Ctrl+N movement without letting the global file picker steal
focus while the menu is open. Thanks @1Git2Clone for the PR.
### Changed
- **README and contributor credits.** The README now has a shorter public
overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`,
preserving public thanks for [DeepSeek](https://github.com/deepseek-ai),
[DataWhale](https://github.com/datawhalechina),
[OpenWarp](https://github.com/zerx-lab/warp), and
[Open Design](https://github.com/nexu-io/open-design).
- **Fleet-backed sub-agent direction.** Runtime docs now state the intended
cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached
work should converge on the fleet-backed worker lifecycle with retries,
receipts, and ledgered inspection.
### Fixed
- **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented
+43 -16
View File
@@ -1,15 +1,19 @@
# The CodeWhale Agent Runtime — one substrate, three launchers
# The CodeWhale Agent Runtime — one durable substrate, familiar launchers
This document explains how sub-agents, the headless `exec` path, and the Agent
Fleet relate. It exists because these had drifted into *two* parallel "worker"
systems, and the fix is to treat them as **one**. It also answers the
open direction question in #2972 ("how much Claude Code convergence is right?").
This document explains how sub-agents, the headless `exec` path, and Agent Fleet
relate. It exists because these had drifted into *two* parallel "worker"
systems, and the fix is to make the **fleet-backed worker run** the durable
primitive. "Sub-agent" remains useful product vocabulary for a nested role, but
it must not imply a separate execution substrate with weaker lifecycle
semantics. It also answers the open direction question in #2972 ("how much
Claude Code convergence is right?").
## The core idea
There is exactly **one** thing that runs work: a **headless agent runtime** — a
model loop with the full (policy-gated) tool surface that can, in turn, spawn
its own sub-agents. Everything else is just a different way to *launch* that one
There is exactly **one** thing that runs detached agent work: a **headless agent
runtime** wrapped in a durable worker lifecycle. It is a model loop with the
full (policy-gated) tool surface that can, in turn, delegate child work through
the same lifecycle. Everything else is just a different way to *launch* that one
runtime, or a different way to *observe* it.
```
@@ -30,8 +34,10 @@ runtime, or a different way to *observe* it.
└───────────────────────┘
```
- A **sub-agent** is a *nested* run of the same runtime, launched from inside
any agent process via `agent_open`. Same spec, same events, same depth axis.
- A **sub-agent** is the user-facing name for a *nested assignment* with a role
(`explore`, `review`, `implementer`, `verifier`, ...). It should be backed by
the same worker run lifecycle as fleet. `agent_open` is the compatibility
launcher, not a second runtime.
- **`codewhale exec`** is the headless front door: usable by anyone at any time
(CI, scripts, another agent), full tools, emits a `stream-json` event stream,
and can spawn sub-agents. It is *the* runtime with a CLI on it.
@@ -41,8 +47,29 @@ runtime, or a different way to *observe* it.
adds **orchestration** (durable ledger, scheduling/leasing/retry, host
transport, alert escalation) *over* the one runtime.
So "fleet vs sub-agent" is not two categories. It is **the same headless run**,
differing only in *where it was launched from* and *how durably it is tracked*.
So "fleet vs sub-agent" is not two categories. It is **the same headless run**:
Fleet is the durable control plane, while sub-agent is the role/UX vocabulary
for a nested worker.
## The cutover rule
If a detached `agent_open` child can fail on a one-off provider timeout with no
retry while an equivalent fleet worker would retry and preserve ledger evidence,
then the cutover is incomplete. Treat that as a CodeWhale runtime gap, not as
normal "sub-agent behavior".
The target rule is:
- durable or long-running work goes through the fleet worker lifecycle;
- `agent_open` may stay as the friendly nested-agent API, but it should enqueue
or observe a fleet-backed worker run instead of owning an independent
lifecycle;
- in-process children are allowed only as a small compatibility/latency
optimization, and they must expose the same terminal states, retry semantics,
receipts, and inspection handles as the fleet path.
In product language it is fine to say "open a sub-agent". In architecture
language that means "start a nested fleet worker with this role".
## Why this shape (and why it fixes the lag)
@@ -107,10 +134,10 @@ CodeWhale should converge with Claude Code on **shape**, not on branding:
multi-provider support; the local-first **Agent Fleet** (durable, SSH-capable
orchestration) as CodeWhale's own layer above the shared runtime; WhaleFlow as
the orchestration overlay.
- **Do not** fork execution semantics per surface. The TUI, `exec`, the Runtime
API, and the fleet must all drive the *same* runtime and observe the *same*
event stream — divergence there is what produced the "two moving targets" this
document exists to prevent.
- **Do not** fork execution semantics per surface. The TUI, `agent_open`,
`exec`, the Runtime API, and the fleet must all drive the *same* runtime and
observe the *same* event stream — divergence there is what produced the "two
moving targets" this document exists to prevent.
The litmus test for any new agent surface: *does it launch and observe the one
runtime, or does it invent a second one?* Only the former is allowed.
+7 -1
View File
@@ -4,7 +4,13 @@ Agent Fleet is the local-first control plane for durable multi-worker runs. It
is **not** a separate execution engine: a fleet worker is a headless
`codewhale exec` run that the fleet launches and tracks durably. See
[AGENT_RUNTIME.md](AGENT_RUNTIME.md) for how sub-agents, `exec`, and the fleet
are one runtime with three launchers. The initial CLI surface is:
converge on one durable runtime. In product language, a user may still "open a
sub-agent"; in architecture language, durable nested work should be a
fleet-backed worker with a role.
Use Fleet rather than the compatibility `agent_open` path whenever the work
needs retry, sleep/restart survival, remote execution, receipts, or a ledgered
audit trail. The initial CLI surface is:
```sh
codewhale fleet init
+27 -10
View File
@@ -1,14 +1,30 @@
# Sub-Agents
Sub-agents are persistent background instances of the agent loop. The parent
opens one with a focused task, gets back an `agent_id` and session name
immediately, and continues working while the sub-agent runs to completion.
Sub-agents inherit the parent's tool registry by default. `agent_open`
launches them as detached background work: cancelling the parent turn stops the
parent wait/eval path, but it does not kill already-opened child sessions. Use
Sub-agents are the user-facing vocabulary for nested worker assignments: a
parent opens a focused role (`explore`, `review`, `implementer`, `verifier`,
...) and gets back an `agent_id` plus session name while the worker runs.
Architecturally, sub-agents should not be a second execution substrate. The
durable primitive is the fleet-backed worker run described in
[`AGENT_RUNTIME.md`](AGENT_RUNTIME.md): retries, terminal status, receipts,
artifact refs, inspection, and restart behavior belong there. `agent_open`
stays as the familiar nested-agent launcher, but detached work should converge
on the same lifecycle as Agent Fleet.
The current `agent_open` implementation is a compatibility path while that
cutover completes. It can still be useful for short in-session delegation, but
if a child fails once on a transient provider timeout while an equivalent fleet
worker would retry from the ledger, that is a runtime unification gap. For work
that must survive provider hiccups, process restarts, sleep, or remote
execution, prefer Fleet or a WhaleFlow-backed fleet run.
Sub-agents inherit the parent's tool registry by default. `agent_open` launches
them as detached background work: cancelling the parent turn stops the parent
wait/eval path, but it does not kill already-opened child sessions. Use
`agent_close` to cancel a running child explicitly.
This doc covers the role taxonomy. The active orchestration surface is
This doc covers the role taxonomy and current compatibility controls. The active
orchestration surface is
`agent_open`, `agent_eval`, and `agent_close`; see `prompts/base.md`
"Sub-Agent Strategy" and the in-line tool descriptions.
@@ -112,7 +128,7 @@ the next turn.
## Concurrency cap
The dispatcher caps concurrent sub-agents at 10 by default
The current compatibility dispatcher caps concurrent sub-agents at 10 by default
(configurable via `[subagents].max_concurrent` in `~/.codewhale/config.toml`,
hard ceiling 20). When the parent hits the cap, `agent_open` returns
an error with the cap value; the parent should use `agent_eval` to wait for a
@@ -230,9 +246,10 @@ manager can't match them to the current boot.
## Run receipts, follow-up, and takeover
Each sub-agent has a persisted worker record in
Each compatibility sub-agent has a persisted worker record in
`.codewhale/state/subagents.v1.json`. The record is the current run-ledger
slice for sub-agent lanes: it stores `run_id`, objective, role/model,
slice for sub-agent lanes until those lanes are backed directly by the fleet
ledger: it stores `run_id`, objective, role/model,
workspace/branch, lifecycle events, artifact refs, follow-up target, takeover
target, usage provenance, and verification provenance.