From 5815f94f0bc22659a49ff558406023ae854b7f7b Mon Sep 17 00:00:00 2001 From: Hunter B Date: Sat, 13 Jun 2026 13:06:28 -0700 Subject: [PATCH] docs(runtime): clarify fleet-backed subagents State the v0.8.60 direction plainly: sub-agent is role and UX vocabulary, while durable detached work should use the fleet-backed worker lifecycle. Document the current agent_open path as compatibility until retry, receipt, and ledger semantics are unified with Fleet. Co-Authored-By: Claude Opus 4.8 (1M context) --- CHANGELOG.md | 13 +++++++++ crates/tui/CHANGELOG.md | 13 +++++++++ docs/AGENT_RUNTIME.md | 59 ++++++++++++++++++++++++++++++----------- docs/FLEET.md | 8 +++++- docs/SUBAGENTS.md | 37 +++++++++++++++++++------- 5 files changed, 103 insertions(+), 27 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 73308a52..124bc5ef 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 supports Ctrl+P/Ctrl+N movement without letting the global file picker steal focus while the menu is open. Thanks @1Git2Clone for the PR. +### Changed + +- **README and contributor credits.** The README now has a shorter public + overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`, + preserving public thanks for [DeepSeek](https://github.com/deepseek-ai), + [DataWhale](https://github.com/datawhalechina), + [OpenWarp](https://github.com/zerx-lab/warp), and + [Open Design](https://github.com/nexu-io/open-design). +- **Fleet-backed sub-agent direction.** Runtime docs now state the intended + cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached + work should converge on the fleet-backed worker lifecycle with retries, + receipts, and ledgered inspection. + ### Fixed - **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented diff --git a/crates/tui/CHANGELOG.md b/crates/tui/CHANGELOG.md index 7f88feda..b252c149 100644 --- a/crates/tui/CHANGELOG.md +++ b/crates/tui/CHANGELOG.md @@ -27,6 +27,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 supports Ctrl+P/Ctrl+N movement without letting the global file picker steal focus while the menu is open. Thanks @1Git2Clone for the PR. +### Changed + +- **README and contributor credits.** The README now has a shorter public + overview and moves the full contributor ledger to `docs/CONTRIBUTORS.md`, + preserving public thanks for [DeepSeek](https://github.com/deepseek-ai), + [DataWhale](https://github.com/datawhalechina), + [OpenWarp](https://github.com/zerx-lab/warp), and + [Open Design](https://github.com/nexu-io/open-design). +- **Fleet-backed sub-agent direction.** Runtime docs now state the intended + cutover clearly: "sub-agent" is role/UX vocabulary, while durable detached + work should converge on the fleet-backed worker lifecycle with retries, + receipts, and ledgered inspection. + ### Fixed - **Z.ai GLM thinking traces.** Direct Z.ai requests now use the documented diff --git a/docs/AGENT_RUNTIME.md b/docs/AGENT_RUNTIME.md index 3423e3c7..b3d9f07b 100644 --- a/docs/AGENT_RUNTIME.md +++ b/docs/AGENT_RUNTIME.md @@ -1,15 +1,19 @@ -# The CodeWhale Agent Runtime — one substrate, three launchers +# The CodeWhale Agent Runtime — one durable substrate, familiar launchers -This document explains how sub-agents, the headless `exec` path, and the Agent -Fleet relate. It exists because these had drifted into *two* parallel "worker" -systems, and the fix is to treat them as **one**. It also answers the -open direction question in #2972 ("how much Claude Code convergence is right?"). +This document explains how sub-agents, the headless `exec` path, and Agent Fleet +relate. It exists because these had drifted into *two* parallel "worker" +systems, and the fix is to make the **fleet-backed worker run** the durable +primitive. "Sub-agent" remains useful product vocabulary for a nested role, but +it must not imply a separate execution substrate with weaker lifecycle +semantics. It also answers the open direction question in #2972 ("how much +Claude Code convergence is right?"). ## The core idea -There is exactly **one** thing that runs work: a **headless agent runtime** — a -model loop with the full (policy-gated) tool surface that can, in turn, spawn -its own sub-agents. Everything else is just a different way to *launch* that one +There is exactly **one** thing that runs detached agent work: a **headless agent +runtime** wrapped in a durable worker lifecycle. It is a model loop with the +full (policy-gated) tool surface that can, in turn, delegate child work through +the same lifecycle. Everything else is just a different way to *launch* that one runtime, or a different way to *observe* it. ``` @@ -30,8 +34,10 @@ runtime, or a different way to *observe* it. └───────────────────────┘ ``` -- A **sub-agent** is a *nested* run of the same runtime, launched from inside - any agent process via `agent_open`. Same spec, same events, same depth axis. +- A **sub-agent** is the user-facing name for a *nested assignment* with a role + (`explore`, `review`, `implementer`, `verifier`, ...). It should be backed by + the same worker run lifecycle as fleet. `agent_open` is the compatibility + launcher, not a second runtime. - **`codewhale exec`** is the headless front door: usable by anyone at any time (CI, scripts, another agent), full tools, emits a `stream-json` event stream, and can spawn sub-agents. It is *the* runtime with a CLI on it. @@ -41,8 +47,29 @@ runtime, or a different way to *observe* it. adds **orchestration** (durable ledger, scheduling/leasing/retry, host transport, alert escalation) *over* the one runtime. -So "fleet vs sub-agent" is not two categories. It is **the same headless run**, -differing only in *where it was launched from* and *how durably it is tracked*. +So "fleet vs sub-agent" is not two categories. It is **the same headless run**: +Fleet is the durable control plane, while sub-agent is the role/UX vocabulary +for a nested worker. + +## The cutover rule + +If a detached `agent_open` child can fail on a one-off provider timeout with no +retry while an equivalent fleet worker would retry and preserve ledger evidence, +then the cutover is incomplete. Treat that as a CodeWhale runtime gap, not as +normal "sub-agent behavior". + +The target rule is: + +- durable or long-running work goes through the fleet worker lifecycle; +- `agent_open` may stay as the friendly nested-agent API, but it should enqueue + or observe a fleet-backed worker run instead of owning an independent + lifecycle; +- in-process children are allowed only as a small compatibility/latency + optimization, and they must expose the same terminal states, retry semantics, + receipts, and inspection handles as the fleet path. + +In product language it is fine to say "open a sub-agent". In architecture +language that means "start a nested fleet worker with this role". ## Why this shape (and why it fixes the lag) @@ -107,10 +134,10 @@ CodeWhale should converge with Claude Code on **shape**, not on branding: multi-provider support; the local-first **Agent Fleet** (durable, SSH-capable orchestration) as CodeWhale's own layer above the shared runtime; WhaleFlow as the orchestration overlay. -- **Do not** fork execution semantics per surface. The TUI, `exec`, the Runtime - API, and the fleet must all drive the *same* runtime and observe the *same* - event stream — divergence there is what produced the "two moving targets" this - document exists to prevent. +- **Do not** fork execution semantics per surface. The TUI, `agent_open`, + `exec`, the Runtime API, and the fleet must all drive the *same* runtime and + observe the *same* event stream — divergence there is what produced the "two + moving targets" this document exists to prevent. The litmus test for any new agent surface: *does it launch and observe the one runtime, or does it invent a second one?* Only the former is allowed. diff --git a/docs/FLEET.md b/docs/FLEET.md index 7e2f7aa5..ba321d94 100644 --- a/docs/FLEET.md +++ b/docs/FLEET.md @@ -4,7 +4,13 @@ Agent Fleet is the local-first control plane for durable multi-worker runs. It is **not** a separate execution engine: a fleet worker is a headless `codewhale exec` run that the fleet launches and tracks durably. See [AGENT_RUNTIME.md](AGENT_RUNTIME.md) for how sub-agents, `exec`, and the fleet -are one runtime with three launchers. The initial CLI surface is: +converge on one durable runtime. In product language, a user may still "open a +sub-agent"; in architecture language, durable nested work should be a +fleet-backed worker with a role. + +Use Fleet rather than the compatibility `agent_open` path whenever the work +needs retry, sleep/restart survival, remote execution, receipts, or a ledgered +audit trail. The initial CLI surface is: ```sh codewhale fleet init diff --git a/docs/SUBAGENTS.md b/docs/SUBAGENTS.md index 9d92f927..72175fb8 100644 --- a/docs/SUBAGENTS.md +++ b/docs/SUBAGENTS.md @@ -1,14 +1,30 @@ # Sub-Agents -Sub-agents are persistent background instances of the agent loop. The parent -opens one with a focused task, gets back an `agent_id` and session name -immediately, and continues working while the sub-agent runs to completion. -Sub-agents inherit the parent's tool registry by default. `agent_open` -launches them as detached background work: cancelling the parent turn stops the -parent wait/eval path, but it does not kill already-opened child sessions. Use +Sub-agents are the user-facing vocabulary for nested worker assignments: a +parent opens a focused role (`explore`, `review`, `implementer`, `verifier`, +...) and gets back an `agent_id` plus session name while the worker runs. + +Architecturally, sub-agents should not be a second execution substrate. The +durable primitive is the fleet-backed worker run described in +[`AGENT_RUNTIME.md`](AGENT_RUNTIME.md): retries, terminal status, receipts, +artifact refs, inspection, and restart behavior belong there. `agent_open` +stays as the familiar nested-agent launcher, but detached work should converge +on the same lifecycle as Agent Fleet. + +The current `agent_open` implementation is a compatibility path while that +cutover completes. It can still be useful for short in-session delegation, but +if a child fails once on a transient provider timeout while an equivalent fleet +worker would retry from the ledger, that is a runtime unification gap. For work +that must survive provider hiccups, process restarts, sleep, or remote +execution, prefer Fleet or a WhaleFlow-backed fleet run. + +Sub-agents inherit the parent's tool registry by default. `agent_open` launches +them as detached background work: cancelling the parent turn stops the parent +wait/eval path, but it does not kill already-opened child sessions. Use `agent_close` to cancel a running child explicitly. -This doc covers the role taxonomy. The active orchestration surface is +This doc covers the role taxonomy and current compatibility controls. The active +orchestration surface is `agent_open`, `agent_eval`, and `agent_close`; see `prompts/base.md` "Sub-Agent Strategy" and the in-line tool descriptions. @@ -112,7 +128,7 @@ the next turn. ## Concurrency cap -The dispatcher caps concurrent sub-agents at 10 by default +The current compatibility dispatcher caps concurrent sub-agents at 10 by default (configurable via `[subagents].max_concurrent` in `~/.codewhale/config.toml`, hard ceiling 20). When the parent hits the cap, `agent_open` returns an error with the cap value; the parent should use `agent_eval` to wait for a @@ -230,9 +246,10 @@ manager can't match them to the current boot. ## Run receipts, follow-up, and takeover -Each sub-agent has a persisted worker record in +Each compatibility sub-agent has a persisted worker record in `.codewhale/state/subagents.v1.json`. The record is the current run-ledger -slice for sub-agent lanes: it stores `run_id`, objective, role/model, +slice for sub-agent lanes until those lanes are backed directly by the fleet +ledger: it stores `run_id`, objective, role/model, workspace/branch, lifecycle events, artifact refs, follow-up target, takeover target, usage provenance, and verification provenance.