docs(subagents): clarify detached session lifecycle
This commit is contained in:
@@ -154,8 +154,9 @@ pub struct MailboxReceiver {
|
||||
|
||||
impl Mailbox {
|
||||
/// Create a new mailbox bound to the given cancellation token. Closing
|
||||
/// the mailbox (or dropping the last sender) cancels this token, which
|
||||
/// propagates to children via `child_token()` per `SubAgentRuntime`.
|
||||
/// the mailbox (or dropping the last sender) cancels this token. Runtimes
|
||||
/// that derive from the same token observe that cancellation; detached
|
||||
/// background `agent_open` sessions use their own runtime token.
|
||||
#[must_use]
|
||||
pub fn new(cancel_token: CancellationToken) -> (Self, MailboxReceiver) {
|
||||
let (tx, rx) = mpsc::unbounded_channel();
|
||||
@@ -211,10 +212,11 @@ impl Mailbox {
|
||||
|
||||
/// Close the mailbox AND cancel the bound cancellation token.
|
||||
///
|
||||
/// "Close-as-cancel": there's no useful state where the consumer is
|
||||
/// gone but children should keep producing. Closing the parent's
|
||||
/// mailbox cascades to every nested child because each child runtime
|
||||
/// derived its `cancel_token` via `child_token()` from the parent's.
|
||||
/// "Close-as-cancel": there's no useful state where the consumer is gone
|
||||
/// but producers bound to this mailbox token should keep publishing.
|
||||
/// Closing cancels the bound token; directly derived `child_runtime()`
|
||||
/// children observe it, while detached `agent_open` sessions rely on their
|
||||
/// own explicit cancellation.
|
||||
pub fn close(&self) {
|
||||
if !self.inner.closed.swap(true, Ordering::AcqRel) {
|
||||
self.inner.cancel_token.cancel();
|
||||
|
||||
@@ -602,8 +602,10 @@ pub struct SubAgentForkContext {
|
||||
/// Runtime configuration for spawning sub-agents.
|
||||
///
|
||||
/// Carries everything a child needs to (a) build its own tool registry —
|
||||
/// including the manager so grandchildren can spawn — and (b) cooperate
|
||||
/// with the rest of the spawn tree on cancellation and depth cap.
|
||||
/// including the manager so grandchildren can spawn — and (b) cooperate with
|
||||
/// lifecycle cancellation and depth caps. `child_runtime()` links cancellation
|
||||
/// tokens, while `background_runtime()` deliberately detaches long-running
|
||||
/// `agent_open` sessions from the caller's turn token.
|
||||
#[derive(Clone)]
|
||||
pub struct SubAgentRuntime {
|
||||
pub client: DeepSeekClient,
|
||||
@@ -625,8 +627,9 @@ pub struct SubAgentRuntime {
|
||||
/// exceed this is rejected at the spawn entry. Use `>` (strictly
|
||||
/// greater than) so equality is allowed — matches codex's pattern.
|
||||
pub max_spawn_depth: u32,
|
||||
/// Cooperative cancellation token. Children derive a child_token() from
|
||||
/// the parent so cancelling the root cascades down.
|
||||
/// Cooperative cancellation token. Direct `child_runtime()` callers derive
|
||||
/// a child token from the parent; model-visible `agent_open` uses
|
||||
/// `background_runtime()` to replace that token with a detached one.
|
||||
pub cancel_token: CancellationToken,
|
||||
/// Structured progress / lifecycle stream. Cloned across children so the
|
||||
/// whole spawn tree publishes into one ordered, fan-out-able mailbox.
|
||||
@@ -696,10 +699,10 @@ impl SubAgentRuntime {
|
||||
self
|
||||
}
|
||||
|
||||
/// Attach a `Mailbox` so this runtime (and every descendant — children
|
||||
/// clone it) publishes structured `MailboxMessage` envelopes alongside
|
||||
/// the legacy `Event` stream. Pair with [`Self::with_cancel_token`] when
|
||||
/// you want close-as-cancel to propagate the same way.
|
||||
/// Attach a `Mailbox` so this runtime and its derived children publish
|
||||
/// structured `MailboxMessage` envelopes alongside the legacy `Event`
|
||||
/// stream. Pair with [`Self::with_cancel_token`] when the mailbox close
|
||||
/// token should match this runtime's cancellation token.
|
||||
#[must_use]
|
||||
#[allow(dead_code)] // wired by #128 (in-transcript cards) when it lands.
|
||||
pub fn with_mailbox(mut self, mailbox: Mailbox) -> Self {
|
||||
@@ -3389,9 +3392,9 @@ async fn run_subagent(
|
||||
let mut pending_inputs: VecDeque<SubAgentInput> = VecDeque::new();
|
||||
|
||||
for _step in 0..max_steps {
|
||||
// Cooperative cancellation: bail if the parent (or root) cancelled
|
||||
// us while we were between steps. Children derive their token from
|
||||
// the parent's via `child_token()` so this propagates the whole tree.
|
||||
// Cooperative cancellation: bail if this session's token was cancelled
|
||||
// while we were between steps. Top-level model-visible sub-agents use
|
||||
// a detached token so parent turn cancellation does not stop them.
|
||||
if runtime.cancel_token.is_cancelled() {
|
||||
emit_agent_progress(
|
||||
runtime.event_tx.as_ref(),
|
||||
|
||||
+29
-25
@@ -3,9 +3,10 @@
|
||||
Sub-agents are persistent background instances of the agent loop. The parent
|
||||
opens one with a focused task, gets back an `agent_id` and session name
|
||||
immediately, and continues working while the sub-agent runs to completion.
|
||||
Sub-agents inherit the parent's tool registry by default and run with
|
||||
`CancellationToken::child_token()`, so cancelling the parent cancels every
|
||||
descendant.
|
||||
Sub-agents inherit the parent's tool registry by default. `agent_open`
|
||||
launches them as detached background work: cancelling the parent turn stops the
|
||||
parent wait/eval path, but it does not kill already-opened child sessions. Use
|
||||
`agent_close` to cancel a running child explicitly.
|
||||
|
||||
This doc covers the role taxonomy. The active orchestration surface is
|
||||
`agent_open`, `agent_eval`, and `agent_close`; see `prompts/base.md`
|
||||
@@ -17,15 +18,15 @@ The `type` field on `agent_open` selects a system-prompt posture for the child
|
||||
(`agent_type` is accepted as a compatibility alias). Each role is a distinct
|
||||
stance toward the work — not just a different label.
|
||||
|
||||
| Role | Stance | Writes? | Runs shell? | Typical use |
|
||||
|---------------|----------------------------------------|---------|-------------|----------------------------------------------|
|
||||
| `general` | flexible; do whatever the parent says | yes | yes | the default; multi-step tasks |
|
||||
| `explore` | read-only; map the relevant code fast | no | yes (read) | "find every call site of `Foo`" |
|
||||
| `plan` | analyse and produce a strategy | minimal | minimal | "design the migration; don't execute" |
|
||||
| `review` | read-and-grade with severity scores | no | no | "audit this PR for bugs" |
|
||||
| `implementer` | land a specific change with min edit | yes | yes | "rewrite `bar.rs::Foo::bar` to do X" |
|
||||
| `verifier` | run tests / validation, report outcome | no | yes (test) | "run cargo test --workspace, report" |
|
||||
| `custom` | explicit narrow tool allowlist | depends | depends | locked-down dispatch with hand-picked tools |
|
||||
| Role | Stance | Writes? | Shell posture | Typical use |
|
||||
|---------------|----------------------------------------|---------|---------------|----------------------------------------------|
|
||||
| `general` | flexible; do whatever the parent says | yes | yes | the default; multi-step tasks |
|
||||
| `explore` | read-only; map the relevant code fast | no | read-only | "find every call site of `Foo`" |
|
||||
| `plan` | analyse and produce a strategy | minimal | minimal | "design the migration; don't execute" |
|
||||
| `review` | read-and-grade with severity scores | no | read-only | "audit this PR for bugs" |
|
||||
| `implementer` | land a specific change with min edit | yes | yes | "rewrite `bar.rs::Foo::bar` to do X" |
|
||||
| `verifier` | run tests / validation, report outcome | no | test-focused | "run cargo test --workspace, report" |
|
||||
| `custom` | explicit narrow tool allowlist | depends | depends | locked-down dispatch with hand-picked tools |
|
||||
|
||||
Each role's full system prompt lives in
|
||||
`crates/tui/src/tools/subagent/mod.rs` (search for
|
||||
@@ -100,8 +101,9 @@ the next turn.
|
||||
The dispatcher caps concurrent sub-agents at 10 by default
|
||||
(configurable via `[subagents].max_concurrent` in `~/.deepseek/config.toml`,
|
||||
hard ceiling 20). When the parent hits the cap, `agent_open` returns
|
||||
an error with the cap value; the parent should use `agent_eval` to wait for
|
||||
completion or `agent_close` to free a slot before retrying.
|
||||
an error with the cap value; the parent should use `agent_eval` to wait for a
|
||||
running agent to complete, or `agent_close` to cancel a running agent, before
|
||||
retrying.
|
||||
|
||||
The cap counts only **running** agents — completed / failed /
|
||||
cancelled records persist for inspection but don't occupy a slot.
|
||||
@@ -116,17 +118,16 @@ Each opened session produces a record that progresses through:
|
||||
Pending → Running → (Completed | Failed(reason) | Cancelled | Interrupted(reason))
|
||||
```
|
||||
|
||||
`Interrupted` fires when the manager detects a `Running` agent
|
||||
whose task handle is gone — typically after a process restart that
|
||||
loaded the agent from `~/.deepseek/subagents.v1.json`. The parent
|
||||
can open a replacement session with the same assignment or treat it as a
|
||||
terminal state.
|
||||
`Interrupted` fires when the manager detects a `Running` agent whose task
|
||||
handle is gone — typically after a process restart that loaded the workspace's
|
||||
persisted state from `.deepseek/state/subagents.v1.json`. The parent can open a
|
||||
replacement session with the same assignment or treat it as a terminal state.
|
||||
|
||||
### Session boundaries (#405)
|
||||
|
||||
Each `SubAgentManager` instance assigns itself a fresh
|
||||
`session_boot_id` on construction. Every new session stamps the agent
|
||||
with that id; the persisted state file carries it across restarts.
|
||||
Each `SubAgentManager` instance assigns itself a fresh `session_boot_id` on
|
||||
construction. Every new session stamps the agent with that id; the workspace
|
||||
state file records it for restart recovery.
|
||||
|
||||
`agent_eval` and the sidebar/status projections focus on current-session
|
||||
agents by default. Prior-session agents that are not still running are treated
|
||||
@@ -166,10 +167,13 @@ don't go through the standard write-approval flow.
|
||||
|
||||
## Implementation notes
|
||||
|
||||
- Source: `crates/tui/src/tools/subagent/mod.rs` (about 3500 LOC).
|
||||
- Persisted state: `~/.deepseek/subagents.v1.json`. Schema version
|
||||
`1` (forward-compatible — new optional fields use
|
||||
- Source: `crates/tui/src/tools/subagent/mod.rs`.
|
||||
- Persisted state: `<workspace>/.deepseek/state/subagents.v1.json`. Schema
|
||||
version `1` (forward-compatible — new optional fields use
|
||||
`#[serde(default)]`).
|
||||
- `SubAgentRuntime::background_runtime()` starts from `child_runtime()` but
|
||||
replaces the turn-scoped child token with a fresh cancellation token, so
|
||||
parent turn cancellation does not stop detached background sessions.
|
||||
- The `is_running` check ignores agents whose `task_handle` is
|
||||
`None`; this avoids counting persisted-but-detached records
|
||||
toward the concurrency cap (#509).
|
||||
|
||||
@@ -219,8 +219,8 @@ The caps appear in each tool's description and error messages so the model
|
||||
(and the user) can choose the right tool for the job. If one sub-agent is
|
||||
enough but you need parallel semantic lookups over the same loaded context,
|
||||
prefer `rlm_eval` with `sub_query_batch`; if each task needs its own
|
||||
tool-carrying agent loop, use `agent_open` and close completed sessions to free
|
||||
slots.
|
||||
tool-carrying agent loop, use `agent_open` and wait for running sessions to
|
||||
complete or cancel no-longer-needed running sessions with `agent_close`.
|
||||
|
||||
## Removed legacy aliases and surfaces
|
||||
|
||||
|
||||
Reference in New Issue
Block a user