docs(runtime): clarify fleet-backed subagents

State the v0.8.60 direction plainly: sub-agent is role and UX vocabulary, while durable detached work should use the fleet-backed worker lifecycle. Document the current agent_open path as compatibility until retry, receipt, and ledger semantics are unified with Fleet. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:06:28 -07:00
parent a21048a207
commit 5815f94f0b
5 changed files with 103 additions and 27 deletions
@@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  supports Ctrl+P/Ctrl+N movement without letting the global file picker steal
  focus while the menu is open. Thanks @1Git2Clone for the PR.

+### Changed
+
+- **README and contributor credits.** The README now has a shorter public
+  overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`,
+  preserving public thanks for [DeepSeek](https://github.com/deepseek-ai),
+  [DataWhale](https://github.com/datawhalechina),
+  [OpenWarp](https://github.com/zerx-lab/warp), and
+  [Open Design](https://github.com/nexu-io/open-design).
+- **Fleet-backed sub-agent direction.** Runtime docs now state the intended
+  cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached
+  work should converge on the fleet-backed worker lifecycle with retries,
+  receipts, and ledgered inspection.
+
 ### Fixed

 - **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented
@@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
  supports Ctrl+P/Ctrl+N movement without letting the global file picker steal
  focus while the menu is open. Thanks @1Git2Clone for the PR.

+### Changed
+
+- **README and contributor credits.** The README now has a shorter public
+  overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`,
+  preserving public thanks for [DeepSeek](https://github.com/deepseek-ai),
+  [DataWhale](https://github.com/datawhalechina),
+  [OpenWarp](https://github.com/zerx-lab/warp), and
+  [Open Design](https://github.com/nexu-io/open-design).
+- **Fleet-backed sub-agent direction.** Runtime docs now state the intended
+  cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached
+  work should converge on the fleet-backed worker lifecycle with retries,
+  receipts, and ledgered inspection.
+
 ### Fixed

 - **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented
@@ -1,15 +1,19 @@
-# The CodeWhale Agent Runtime — one substrate, three launchers
+# The CodeWhale Agent Runtime — one durable substrate, familiar launchers

-This document explains how sub-agents, the headless `exec` path, and the Agent
-Fleet relate. It exists because these had drifted into *two* parallel "worker"
-systems, and the fix is to treat them as **one**. It also answers the
-open direction question in #2972 ("how much Claude Code convergence is right?").
+This document explains how sub-agents, the headless `exec` path, and Agent Fleet
+relate. It exists because these had drifted into *two* parallel "worker"
+systems, and the fix is to make the **fleet-backed worker run** the durable
+primitive. "Sub-agent" remains useful product vocabulary for a nested role, but
+it must not imply a separate execution substrate with weaker lifecycle
+semantics. It also answers the open direction question in #2972 ("how much
+Claude Code convergence is right?").

 ## The core idea

-There is exactly **one** thing that runs work: a **headless agent runtime** — a
-model loop with the full (policy-gated) tool surface that can, in turn, spawn
-its own sub-agents. Everything else is just a different way to *launch* that one
+There is exactly **one** thing that runs detached agent work: a **headless agent
+runtime** wrapped in a durable worker lifecycle. It is a model loop with the
+full (policy-gated) tool surface that can, in turn, delegate child work through
+the same lifecycle. Everything else is just a different way to *launch* that one
 runtime, or a different way to *observe* it.

 ```
@@ -30,8 +34,10 @@ runtime, or a different way to *observe* it.
                                                 └───────────────────────┘
 ```

- A **sub-agent** is a *nested* run of the same runtime, launched from inside
-  any agent process via `agent_open`. Same spec, same events, same depth axis.
+- A **sub-agent** is the user-facing name for a *nested assignment* with a role
+  (`explore`, `review`, `implementer`, `verifier`, ...). It should be backed by
+  the same worker run lifecycle as fleet. `agent_open` is the compatibility
+  launcher, not a second runtime.
 - **`codewhale exec`** is the headless front door: usable by anyone at any time
  (CI, scripts, another agent), full tools, emits a `stream-json` event stream,
  and can spawn sub-agents. It is *the* runtime with a CLI on it.
@@ -41,8 +47,29 @@ runtime, or a different way to *observe* it.
  adds **orchestration** (durable ledger, scheduling/leasing/retry, host
  transport, alert escalation) *over* the one runtime.

-So "fleet vs sub-agent" is not two categories. It is **the same headless run**,
-differing only in *where it was launched from* and *how durably it is tracked*.
+So "fleet vs sub-agent" is not two categories. It is **the same headless run**:
+Fleet is the durable control plane, while sub-agent is the role/UX vocabulary
+for a nested worker.
+
+## The cutover rule
+
+If a detached `agent_open` child can fail on a one-off provider timeout with no
+retry while an equivalent fleet worker would retry and preserve ledger evidence,
+then the cutover is incomplete. Treat that as a CodeWhale runtime gap, not as
+normal "sub-agent behavior".
+
+The target rule is:
+
+- durable or long-running work goes through the fleet worker lifecycle;
+- `agent_open` may stay as the friendly nested-agent API, but it should enqueue
+  or observe a fleet-backed worker run instead of owning an independent
+  lifecycle;
+- in-process children are allowed only as a small compatibility/latency
+  optimization, and they must expose the same terminal states, retry semantics,
+  receipts, and inspection handles as the fleet path.
+
+In product language it is fine to say "open a sub-agent". In architecture
+language that means "start a nested fleet worker with this role".

 ## Why this shape (and why it fixes the lag)

@@ -107,10 +134,10 @@ CodeWhale should converge with Claude Code on **shape**, not on branding:
  multi-provider support; the local-first **Agent Fleet** (durable, SSH-capable
  orchestration) as CodeWhale's own layer above the shared runtime; WhaleFlow as
  the orchestration overlay.
- **Do not** fork execution semantics per surface. The TUI, `exec`, the Runtime
-  API, and the fleet must all drive the *same* runtime and observe the *same*
-  event stream — divergence there is what produced the "two moving targets" this
-  document exists to prevent.
+- **Do not** fork execution semantics per surface. The TUI, `agent_open`,
+  `exec`, the Runtime API, and the fleet must all drive the *same* runtime and
+  observe the *same* event stream — divergence there is what produced the "two
+  moving targets" this document exists to prevent.

 The litmus test for any new agent surface: *does it launch and observe the one
 runtime, or does it invent a second one?* Only the former is allowed.
@@ -4,7 +4,13 @@ Agent Fleet is the local-first control plane for durable multi-worker runs. It
 is **not** a separate execution engine: a fleet worker is a headless
 `codewhale exec` run that the fleet launches and tracks durably. See
 [AGENT_RUNTIME.md](AGENT_RUNTIME.md) for how sub-agents, `exec`, and the fleet
-are one runtime with three launchers. The initial CLI surface is:
+converge on one durable runtime. In product language, a user may still "open a
+sub-agent"; in architecture language, durable nested work should be a
+fleet-backed worker with a role.
+
+Use Fleet rather than the compatibility `agent_open` path whenever the work
+needs retry, sleep/restart survival, remote execution, receipts, or a ledgered
+audit trail. The initial CLI surface is:

 ```sh
 codewhale fleet init
@@ -1,14 +1,30 @@
 # Sub-Agents

-Sub-agents are persistent background instances of the agent loop. The parent
-opens one with a focused task, gets back an `agent_id` and session name
-immediately, and continues working while the sub-agent runs to completion.
-Sub-agents inherit the parent's tool registry by default. `agent_open`
-launches them as detached background work: cancelling the parent turn stops the
-parent wait/eval path, but it does not kill already-opened child sessions. Use
+Sub-agents are the user-facing vocabulary for nested worker assignments: a
+parent opens a focused role (`explore`, `review`, `implementer`, `verifier`,
+...) and gets back an `agent_id` plus session name while the worker runs.
+
+Architecturally, sub-agents should not be a second execution substrate. The
+durable primitive is the fleet-backed worker run described in
+[`AGENT_RUNTIME.md`](AGENT_RUNTIME.md): retries, terminal status, receipts,
+artifact refs, inspection, and restart behavior belong there. `agent_open`
+stays as the familiar nested-agent launcher, but detached work should converge
+on the same lifecycle as Agent Fleet.
+
+The current `agent_open` implementation is a compatibility path while that
+cutover completes. It can still be useful for short in-session delegation, but
+if a child fails once on a transient provider timeout while an equivalent fleet
+worker would retry from the ledger, that is a runtime unification gap. For work
+that must survive provider hiccups, process restarts, sleep, or remote
+execution, prefer Fleet or a WhaleFlow-backed fleet run.
+
+Sub-agents inherit the parent's tool registry by default. `agent_open` launches
+them as detached background work: cancelling the parent turn stops the parent
+wait/eval path, but it does not kill already-opened child sessions. Use
 `agent_close` to cancel a running child explicitly.

-This doc covers the role taxonomy. The active orchestration surface is
+This doc covers the role taxonomy and current compatibility controls. The active
+orchestration surface is
 `agent_open`, `agent_eval`, and `agent_close`; see `prompts/base.md`
 "Sub-Agent Strategy" and the in-line tool descriptions.

@@ -112,7 +128,7 @@ the next turn.

 ## Concurrency cap

-The dispatcher caps concurrent sub-agents at 10 by default
+The current compatibility dispatcher caps concurrent sub-agents at 10 by default
 (configurable via `[subagents].max_concurrent` in `~/.codewhale/config.toml`,
 hard ceiling 20). When the parent hits the cap, `agent_open` returns
 an error with the cap value; the parent should use `agent_eval` to wait for a
@@ -230,9 +246,10 @@ manager can't match them to the current boot.

 ## Run receipts, follow-up, and takeover

-Each sub-agent has a persisted worker record in
+Each compatibility sub-agent has a persisted worker record in
 `.codewhale/state/subagents.v1.json`. The record is the current run-ledger
-slice for sub-agent lanes: it stores `run_id`, objective, role/model,
+slice for sub-agent lanes until those lanes are backed directly by the fleet
+ledger: it stores `run_id`, objective, role/model,
 workspace/branch, lifecycle events, artifact refs, follow-up target, takeover
 target, usage provenance, and verification provenance.