diff --git a/AGENTS.md b/AGENTS.md
index ae9f4db9..7fb64599 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -34,10 +34,12 @@ Use the **`gh` CLI** (`/opt/homebrew/bin/gh`) for all GitHub operations — issu
 - List open issues: `gh issue list --state open --limit 20`
 - View an issue: `gh issue view `
 - Create an issue branch: `gh issue develop --branch-name feat/issue--`
+- Close a verified issue: `gh issue close --comment "..."`
 - Create a PR: `gh pr create --base feat/v0.6.2 --title "..." --body "..."`
 - Check PR status: `gh pr view `
 
 Prefer `gh` over `fetch_url` or `web_search` for GitHub data — it's faster, authenticated, and avoids rate limits.
+Issues may be closed when the acceptance criteria have been verified or when the user explicitly asks for closure; avoid closing unrelated issues opportunistically.
 
 ## Important Notes
 
diff --git a/crates/tui/src/prompts.rs b/crates/tui/src/prompts.rs
index 59a40fd7..29c56200 100644
--- a/crates/tui/src/prompts.rs
+++ b/crates/tui/src/prompts.rs
@@ -379,6 +379,15 @@ mod tests {
         assert!(prompt.contains("### `rlm`"));
     }
 
+    #[test]
+    fn rlm_first_class_guidance_present() {
+        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
+        assert!(prompt.contains("RLM Is First-Class"));
+        assert!(prompt.contains("independent second opinions"));
+        assert!(prompt.contains("batched issue triage"));
+        assert!(prompt.contains("rlm` output is advisory"));
+    }
+
     #[test]
     fn subagent_done_sentinel_section_present() {
         let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
diff --git a/crates/tui/src/prompts/base.md b/crates/tui/src/prompts/base.md
index 7de6784a..b37d2e3d 100644
--- a/crates/tui/src/prompts/base.md
+++ b/crates/tui/src/prompts/base.md
@@ -33,11 +33,17 @@ Your default workflow for any non-trivial request:
 2. **Execute** — work through each checklist item, updating status as you go.
 3. **For complex initiatives**, layer `update_plan` (high-level strategy) above `checklist_write` (granular steps).
 4. **For parallel work**, spawn sub-agents (`agent_spawn` / `agent_swarm`) — each does one thing well. Link them to plan/todo items in your thinking. Batch independent tool calls in a single turn.
-5. **For long inputs that don't fit in your context** (whole files, transcripts, multi-doc corpora) or when you need recursive sub-LLM work, use `rlm` — it loads the input into a Python REPL as `context` and runs sub-LLM calls there so the long string never enters your window.
+5. **For long inputs, recursive sub-LLM work, or high-leverage parallel reasoning**, use `rlm` — it loads input into a Python REPL as `context` and runs sub-LLM calls there so long strings and batched deliberation stay out of your window.
 6. **For persistent cross-session memory**, use `note` sparingly for important decisions, open blockers, and architectural context.
 
 **Key principle**: make your work visible. The sidebar shows Plan / Todos / Tasks / Agents. When these panels are empty, the user has no idea what you're doing. Keep them populated.
 
+## RLM Is First-Class
+
+Treat `rlm` as a normal reasoning tool, not a last-resort escape hatch. Reach for it when you need independent second opinions, batched issue triage, design-option comparison, test-plan generation, risky implementation review, or map-reduce over bulky artifacts. Ask bounded questions with explicit inputs and expected output shape.
+
+`rlm` output is advisory. Use it to find blind spots and alternate routes, then ground decisions in local files, live tool output, GitHub issue text, and passing verification before claiming completion.
+
 ## Context
 
 You have a 1 M-token context window. When usage creeps above ~80%, suggest `/compact` to the user — it summarises earlier turns so you can keep working without losing thread.
@@ -81,7 +87,7 @@ When context is deep (past a soft seam): cache reasoning conclusions in concise
 - **Git / diag / tests**: `git_status`, `git_diff`, `git_show`, `git_log`, `git_blame`, `diagnostics`, `run_tests`, `review`.
 - **Sub-agents**: `agent_spawn` (`spawn_agent`, `delegate_to_agent`), `agent_swarm` (background by default), `swarm_status`, `swarm_result`, `swarm_cancel`, `agent_result`, `agent_cancel` (`close_agent`), `agent_list`, `agent_wait` (`wait`), `agent_send_input` (`send_input`), `agent_assign` (`assign_agent`), `resume_agent`.
 - **CSV batch**: `spawn_agents_on_csv`, `report_agent_job_result`.
-- **Recursive LM (long inputs)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk and process it; returns the synthesized answer. Read-only.
+- **Recursive LM (long inputs / parallel reasoning)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk, compare, critique, and synthesize; returns the synthesized answer. Read-only.
 - **Other**: `code_execution` (Python sandbox), `validate_data` (JSON/TOML), `request_user_input`, `finance` (market quotes), `tool_search_tool_regex`, `tool_search_tool_bm25` (deferred tool discovery).
 
 Multiple `tool_calls` in one turn run in parallel. `web_search` returns `ref_id`s — cite as `(ref_id)`.
@@ -119,7 +125,7 @@ Don't reach for `agent_spawn` when:
 ### `rlm`
 
 Don't reach for `rlm` (the recursive language model tool) when:
-- The input fits comfortably in your context window — just read it directly with `read_file`.
+- The input fits comfortably in your context window and the task is straightforward — just read it directly with `read_file`.
 - A simple `grep_files` or `exec_shell` pipeline can answer the question.
 - You need interactive, iterative exploration of the data — `rlm` is batch-oriented (the sub-LLM writes Python in one shot, then returns).
 - The task is a simple classification or extraction on short text — your own reasoning is faster and cheaper.
diff --git a/crates/tui/src/prompts/base.txt b/crates/tui/src/prompts/base.txt
index 6309e7f4..153ded68 100644
--- a/crates/tui/src/prompts/base.txt
+++ b/crates/tui/src/prompts/base.txt
@@ -9,11 +9,17 @@ Your default workflow for any non-trivial request:
 2. **Execute** — work through each checklist item, updating status as you go.
 3. **For complex initiatives**, layer `update_plan` (high-level strategy) above `checklist_write` (granular steps).
 4. **For parallel work**, spawn sub-agents (`agent_spawn` / `agent_swarm`) — each does one thing well. Link them to plan/todo items in your thinking.
-5. **For long inputs that don't fit in your context** (whole files, transcripts, multi-doc corpora) or when you need recursive sub-LLM work, use `rlm` — it loads the input into a Python REPL as `context` and runs sub-LLM calls there so the long string never enters your window.
+5. **For long inputs, recursive sub-LLM work, or high-leverage parallel reasoning**, use `rlm` — it loads input into a Python REPL as `context` and runs sub-LLM calls there so long strings and batched deliberation stay out of your window.
 6. **For persistent cross-session memory**, use `note` sparingly for important decisions, open blockers, and architectural context.
 
 **Key principle**: make your work visible. The sidebar shows Plan / Todos / Tasks / Agents. When these panels are empty, the user has no idea what you're doing. Keep them populated.
 
+## RLM Is First-Class
+
+Treat `rlm` as a normal reasoning tool, not a last-resort escape hatch. Reach for it when you need independent second opinions, batched issue triage, design-option comparison, test-plan generation, risky implementation review, or map-reduce over bulky artifacts. Ask bounded questions with explicit inputs and expected output shape.
+
+`rlm` output is advisory. Use it to find blind spots and alternate routes, then ground decisions in local files, live tool output, GitHub issue text, and passing verification before claiming completion.
+
 ## Context
 
 You have a 1 M-token context window. When usage creeps above ~80%, suggest `/compact` to the user — it summarises earlier turns so you can keep working without losing thread.
@@ -29,7 +35,7 @@ Model notes: DeepSeek V4 models emit *thinking tokens* (`ContentBlock::Thinking`
 - **Git / diag / tests**: `git_status`, `git_diff`, `git_show`, `git_log`, `git_blame`, `diagnostics`, `run_tests`, `review`.
 - **Sub-agents**: `agent_spawn` (`spawn_agent`, `delegate_to_agent`), `agent_swarm`, `agent_result`, `agent_cancel` (`close_agent`), `agent_list`, `agent_wait` (`wait`), `agent_send_input` (`send_input`), `agent_assign` (`assign_agent`), `resume_agent`.
 - **CSV batch**: `spawn_agents_on_csv`, `report_agent_job_result`.
-- **Recursive LM (long inputs)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk and process it; returns the synthesized answer. Read-only.
+- **Recursive LM (long inputs / parallel reasoning)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk, compare, critique, and synthesize; returns the synthesized answer. Read-only.
 - **Other**: `code_execution` (Python sandbox), `validate_data` (JSON/TOML), `request_user_input`, `finance` (market quotes), `tool_search_tool_regex`, `tool_search_tool_bm25` (deferred tool discovery).
 
 Multiple `tool_calls` in one turn run in parallel. `web_search` returns `ref_id`s — cite as `(ref_id)`.
diff --git a/crates/tui/src/tools/registry.rs b/crates/tui/src/tools/registry.rs
index 6a85ea87..4a0b0d00 100644
--- a/crates/tui/src/tools/registry.rs
+++ b/crates/tui/src/tools/registry.rs
@@ -280,10 +280,11 @@ impl ToolRegistryBuilder {
     /// Include shell execution tool.
     #[must_use]
     pub fn with_shell_tools(self) -> Self {
-        use super::shell::{ExecShellTool, ShellInteractTool, ShellWaitTool};
+        use super::shell::{ExecShellTool, ShellCancelTool, ShellInteractTool, ShellWaitTool};
         self.with_tool(Arc::new(ExecShellTool))
             .with_tool(Arc::new(ShellWaitTool::new("exec_shell_wait")))
             .with_tool(Arc::new(ShellInteractTool::new("exec_shell_interact")))
+            .with_tool(Arc::new(ShellCancelTool))
             .with_tool(Arc::new(ShellWaitTool::new("exec_wait")))
             .with_tool(Arc::new(ShellInteractTool::new("exec_interact")))
     }
diff --git a/crates/tui/src/tools/shell.rs b/crates/tui/src/tools/shell.rs
index 740f3c36..88472a97 100644
--- a/crates/tui/src/tools/shell.rs
+++ b/crates/tui/src/tools/shell.rs
@@ -462,6 +462,7 @@ pub struct ShellManager {
     default_workspace: PathBuf,
     sandbox_manager: SandboxManager,
     sandbox_policy: ExecutionSandboxPolicy,
+    foreground_background_requested: bool,
 }
 
 impl std::fmt::Debug for ShellManager {
@@ -471,6 +472,10 @@ impl std::fmt::Debug for ShellManager {
             .field("stale_jobs", &self.stale_jobs.len())
             .field("default_workspace", &self.default_workspace)
             .field("sandbox_policy", &self.sandbox_policy)
+            .field(
+                "foreground_background_requested",
+                &self.foreground_background_requested,
+            )
             .finish()
     }
 }
@@ -484,6 +489,7 @@ impl ShellManager {
             default_workspace: workspace,
             sandbox_manager: SandboxManager::new(),
             sandbox_policy: ExecutionSandboxPolicy::default(),
+            foreground_background_requested: false,
         }
     }
 
@@ -496,6 +502,7 @@ impl ShellManager {
             default_workspace: workspace,
             sandbox_manager: SandboxManager::new(),
             sandbox_policy: policy,
+            foreground_background_requested: false,
         }
     }
 
@@ -511,6 +518,22 @@ impl ShellManager {
         &self.sandbox_policy
     }
 
+    /// Request that the active foreground shell wait detach and leave its
+    /// process running in the background job table.
+    pub fn request_foreground_background(&mut self) {
+        self.foreground_background_requested = true;
+    }
+
+    fn clear_foreground_background_request(&mut self) {
+        self.foreground_background_requested = false;
+    }
+
+    fn take_foreground_background_request(&mut self) -> bool {
+        let requested = self.foreground_background_requested;
+        self.foreground_background_requested = false;
+        requested
+    }
+
     /// Check if sandboxing is available on this platform.
     #[allow(dead_code)]
     pub fn is_sandbox_available(&mut self) -> bool {
@@ -1131,6 +1154,22 @@ impl ShellManager {
         Ok(shell.snapshot())
     }
 
+    /// Kill every currently running background shell process.
+    pub fn kill_running(&mut self) -> Result> {
+        let ids = self
+            .processes
+            .iter()
+            .filter(|(_, shell)| shell.status == ShellStatus::Running)
+            .map(|(id, _)| id.clone())
+            .collect::>();
+
+        let mut results = Vec::with_capacity(ids.len());
+        for id in ids {
+            results.push(self.kill(&id)?);
+        }
+        Ok(results)
+    }
+
     /// Poll a background process and return incremental output.
     pub fn poll_delta(
         &mut self,
@@ -1304,6 +1343,7 @@ async fn execute_foreground_via_background(
             .shell_manager
             .lock()
             .map_err(|_| anyhow!("shell manager lock poisoned"))?;
+        manager.clear_foreground_background_request();
         manager.execute_with_options(
             command,
             None,
@@ -1345,6 +1385,9 @@ async fn execute_foreground_via_background(
                 .shell_manager
                 .lock()
                 .map_err(|_| anyhow!("shell manager lock poisoned"))?;
+            if manager.take_foreground_background_request() {
+                return manager.get_output(&task_id, false, 0);
+            }
             manager.get_output(&task_id, false, 0)?
         };
 
@@ -1559,7 +1602,9 @@ impl ToolSpec for ExecShellTool {
 
         match result {
             Ok(result) => {
-                if background
+                let backgrounded_foreground =
+                    !background && !interactive && result.status == ShellStatus::Running;
+                if (background || backgrounded_foreground)
                     && let (Some(shell_id), Some(task_id)) = (
                         result.task_id.as_deref(),
                         context.runtime.active_task_id.clone(),
@@ -1595,7 +1640,13 @@ impl ToolSpec for ExecShellTool {
                         format!("{}\n\nSTDERR:\n{}", result.stdout, result.stderr)
                     }
                 } else if result.status == ShellStatus::Running {
-                    format!("Background task started: {task_id_str}")
+                    if backgrounded_foreground {
+                        format!(
+                            "Command moved to background: {task_id_str}\n\nPoll with exec_shell_wait or cancel with exec_shell_cancel."
+                        )
+                    } else {
+                        format!("Background task started: {task_id_str}")
+                    }
                 } else if result.status == ShellStatus::Killed && was_cancelled {
                     format!(
                         "Command canceled; process killed.\n\nSTDOUT:\n{}\n\nSTDERR:\n{}",
@@ -1647,6 +1698,7 @@ impl ToolSpec for ExecShellTool {
                         }),
                     }),
                 });
+                metadata["backgrounded"] = json!(background || backgrounded_foreground);
                 if result.status == ShellStatus::TimedOut && !background && !interactive {
                     metadata["foreground_timeout_recovery"] = json!({
                         "process_killed": true,
@@ -1753,6 +1805,222 @@ fn build_shell_delta_tool_result(delta: ShellDeltaResult) -> ToolResult {
     }
 }
 
+async fn wait_for_shell_delta_cancellable(
+    context: &ToolContext,
+    task_id: &str,
+    timeout_ms: u64,
+) -> Result<(ShellDeltaResult, bool), ToolError> {
+    let timeout_ms = timeout_ms.clamp(1000, 600_000);
+    let deadline = Instant::now() + Duration::from_millis(timeout_ms);
+    let mut stdout_accum = String::new();
+    let mut stderr_accum = String::new();
+
+    let (result, stdout_total_len, stderr_total_len) = loop {
+        if context
+            .cancel_token
+            .as_ref()
+            .is_some_and(|token| token.is_cancelled())
+        {
+            let mut manager = context
+                .shell_manager
+                .lock()
+                .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
+            let delta = manager
+                .get_output_delta(task_id, false, 0)
+                .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+            append_shell_delta_output(&mut stdout_accum, &mut stderr_accum, &delta.result);
+            return Ok((
+                shell_delta_with_accumulated_output(
+                    delta.result,
+                    &stdout_accum,
+                    &stderr_accum,
+                    delta.stdout_total_len,
+                    delta.stderr_total_len,
+                ),
+                true,
+            ));
+        }
+
+        let delta = {
+            let mut manager = context
+                .shell_manager
+                .lock()
+                .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
+            manager
+                .get_output_delta(task_id, false, 0)
+                .map_err(|err| ToolError::execution_failed(err.to_string()))?
+        };
+
+        let stdout_total_len = delta.stdout_total_len;
+        let stderr_total_len = delta.stderr_total_len;
+        append_shell_delta_output(&mut stdout_accum, &mut stderr_accum, &delta.result);
+
+        let status = delta.result.status.clone();
+        if status != ShellStatus::Running || Instant::now() >= deadline {
+            break (delta.result, stdout_total_len, stderr_total_len);
+        }
+
+        tokio::time::sleep(Duration::from_millis(100)).await;
+    };
+
+    Ok((
+        shell_delta_with_accumulated_output(
+            result,
+            &stdout_accum,
+            &stderr_accum,
+            stdout_total_len,
+            stderr_total_len,
+        ),
+        false,
+    ))
+}
+
+fn append_shell_delta_output(
+    stdout_accum: &mut String,
+    stderr_accum: &mut String,
+    result: &ShellResult,
+) {
+    if !result.stdout.is_empty() {
+        stdout_accum.push_str(&result.stdout);
+    }
+    if !result.stderr.is_empty() {
+        stderr_accum.push_str(&result.stderr);
+    }
+}
+
+fn shell_delta_with_accumulated_output(
+    mut result: ShellResult,
+    stdout_accum: &str,
+    stderr_accum: &str,
+    stdout_total_len: usize,
+    stderr_total_len: usize,
+) -> ShellDeltaResult {
+    let (stdout, stdout_meta) = truncate_with_meta(stdout_accum);
+    let (stderr, stderr_meta) = truncate_with_meta(stderr_accum);
+    result.stdout = stdout;
+    result.stderr = stderr;
+    result.stdout_len = stdout_meta.original_len;
+    result.stderr_len = stderr_meta.original_len;
+    result.stdout_omitted = stdout_meta.omitted;
+    result.stderr_omitted = stderr_meta.omitted;
+    result.stdout_truncated = stdout_meta.truncated;
+    result.stderr_truncated = stderr_meta.truncated;
+
+    ShellDeltaResult {
+        result,
+        stdout_total_len,
+        stderr_total_len,
+    }
+}
+
+pub struct ShellCancelTool;
+
+#[async_trait]
+impl ToolSpec for ShellCancelTool {
+    fn name(&self) -> &'static str {
+        "exec_shell_cancel"
+    }
+
+    fn description(&self) -> &'static str {
+        "Cancel a running background shell task by task_id, or cancel all running background shell tasks with all=true."
+    }
+
+    fn input_schema(&self) -> serde_json::Value {
+        json!({
+            "type": "object",
+            "properties": {
+                "task_id": {
+                    "type": "string",
+                    "description": "Task ID returned by exec_shell or task_shell_start"
+                },
+                "id": {
+                    "type": "string",
+                    "description": "Alias for task_id"
+                },
+                "all": {
+                    "type": "boolean",
+                    "description": "Cancel all currently running background shell tasks"
+                }
+            }
+        })
+    }
+
+    fn capabilities(&self) -> Vec {
+        vec![ToolCapability::RequiresApproval]
+    }
+
+    fn approval_requirement(&self) -> ApprovalRequirement {
+        ApprovalRequirement::Required
+    }
+
+    async fn execute(
+        &self,
+        input: serde_json::Value,
+        context: &ToolContext,
+    ) -> Result {
+        let cancel_all = optional_bool(&input, "all", false);
+        let mut manager = context
+            .shell_manager
+            .lock()
+            .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
+
+        if cancel_all {
+            let results = manager
+                .kill_running()
+                .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+            if results.is_empty() {
+                return Ok(ToolResult {
+                    content: "No running background shell jobs.".to_string(),
+                    success: true,
+                    metadata: Some(json!({
+                        "status": "Noop",
+                        "canceled": 0,
+                        "task_ids": [],
+                    })),
+                });
+            }
+
+            let task_ids = results
+                .iter()
+                .filter_map(|result| result.task_id.clone())
+                .collect::>();
+            return Ok(ToolResult {
+                content: format!(
+                    "Canceled {} background shell job{}: {}",
+                    task_ids.len(),
+                    if task_ids.len() == 1 { "" } else { "s" },
+                    task_ids.join(", ")
+                ),
+                success: true,
+                metadata: Some(json!({
+                    "status": "Killed",
+                    "canceled": task_ids.len(),
+                    "task_ids": task_ids,
+                })),
+            });
+        }
+
+        let task_id = required_task_id(&input)?;
+        let result = manager
+            .kill(task_id)
+            .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+        let task_id = result
+            .task_id
+            .clone()
+            .unwrap_or_else(|| task_id.to_string());
+        Ok(ToolResult {
+            content: format!("Canceled background shell job: {task_id}"),
+            success: true,
+            metadata: Some(json!({
+                "status": format!("{:?}", result.status),
+                "task_id": task_id,
+                "exit_code": result.exit_code,
+                "duration_ms": result.duration_ms,
+            })),
+        })
+    }
+}
+
 #[async_trait]
 impl ToolSpec for ShellWaitTool {
     fn name(&self) -> &'static str {
@@ -1760,7 +2028,7 @@ impl ToolSpec for ShellWaitTool {
     }
 
     fn description(&self) -> &'static str {
-        "Wait for a background shell task and return incremental output."
+        "Wait for a background shell task and return incremental output. Turn cancellation stops waiting but leaves the background task running."
     }
 
     fn input_schema(&self) -> serde_json::Value {
@@ -1801,15 +2069,36 @@ impl ToolSpec for ShellWaitTool {
         let wait = optional_bool(&input, "wait", true);
         let timeout_ms = optional_u64(&input, "timeout_ms", 5_000);
 
-        let mut manager = context
-            .shell_manager
-            .lock()
-            .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
-        let delta = manager
-            .get_output_delta(task_id, wait, timeout_ms)
-            .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+        let (delta, wait_canceled) = if wait {
+            wait_for_shell_delta_cancellable(context, task_id, timeout_ms).await?
+        } else {
+            let mut manager = context
+                .shell_manager
+                .lock()
+                .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
+            let delta = manager
+                .get_output_delta(task_id, false, timeout_ms)
+                .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+            (delta, false)
+        };
 
-        Ok(build_shell_delta_tool_result(delta))
+        let status = delta.result.status.clone();
+        let mut result = build_shell_delta_tool_result(delta);
+        if wait_canceled {
+            if matches!(status, ShellStatus::Running) {
+                result.content = format!(
+                    "Wait canceled; background shell task {task_id} is still running.\n\n{}",
+                    result.content
+                );
+            }
+            if let Some(metadata) = result.metadata.as_mut()
+                && let Some(object) = metadata.as_object_mut()
+            {
+                object.insert("wait_canceled".to_string(), json!(true));
+            }
+        }
+
+        Ok(result)
     }
 }
 
@@ -1893,6 +2182,27 @@ impl ToolSpec for ShellInteractTool {
 
         let mut elapsed = 0u64;
         loop {
+            if context
+                .cancel_token
+                .as_ref()
+                .is_some_and(|token| token.is_cancelled())
+            {
+                let mut manager = context
+                    .shell_manager
+                    .lock()
+                    .map_err(|_| ToolError::execution_failed("shell manager lock poisoned"))?;
+                let delta = manager
+                    .get_output_delta(task_id, false, 0)
+                    .map_err(|err| ToolError::execution_failed(err.to_string()))?;
+                let mut result = build_shell_delta_tool_result(delta);
+                if let Some(metadata) = result.metadata.as_mut()
+                    && let Some(object) = metadata.as_object_mut()
+                {
+                    object.insert("wait_canceled".to_string(), json!(true));
+                }
+                return Ok(result);
+            }
+
             let delta = {
                 let mut manager = context
                     .shell_manager
diff --git a/crates/tui/src/tools/shell/tests.rs b/crates/tui/src/tools/shell/tests.rs
index 7fdb8bce..f6ddf983 100644
--- a/crates/tui/src/tools/shell/tests.rs
+++ b/crates/tui/src/tools/shell/tests.rs
@@ -338,3 +338,179 @@ async fn test_exec_shell_foreground_cancel_kills_process() {
     assert_eq!(meta.get("status").and_then(Value::as_str), Some("Killed"));
     assert_eq!(meta.get("canceled").and_then(Value::as_bool), Some(true));
 }
+
+#[tokio::test]
+async fn test_exec_shell_foreground_can_move_to_background() {
+    let tmp = tempdir().expect("tempdir");
+    let ctx = ToolContext::new(tmp.path());
+    let shell_manager = ctx.shell_manager.clone();
+    let command = sleep_command(30);
+    let task_ctx = ctx.clone();
+
+    let task = tokio::spawn(async move {
+        ExecShellTool
+            .execute(
+                json!({
+                    "command": command,
+                    "timeout_ms": 600_000
+                }),
+                &task_ctx,
+            )
+            .await
+            .expect("execute")
+    });
+
+    tokio::time::sleep(Duration::from_millis(150)).await;
+    shell_manager
+        .lock()
+        .expect("shell manager lock")
+        .request_foreground_background();
+
+    let result = tokio::time::timeout(Duration::from_secs(5), task)
+        .await
+        .expect("foreground shell should detach")
+        .expect("task should not panic");
+
+    assert!(result.success);
+    assert!(result.content.contains("Command moved to background"));
+    assert!(result.content.contains("exec_shell_cancel"));
+
+    let meta = result.metadata.expect("metadata");
+    assert_eq!(meta.get("status").and_then(Value::as_str), Some("Running"));
+    assert_eq!(
+        meta.get("backgrounded").and_then(Value::as_bool),
+        Some(true)
+    );
+    let task_id = meta
+        .get("task_id")
+        .and_then(Value::as_str)
+        .expect("task id")
+        .to_string();
+
+    let mut manager = shell_manager.lock().expect("shell manager lock");
+    let job = manager.inspect_job(&task_id).expect("inspect job");
+    assert_eq!(job.snapshot.status, ShellStatus::Running);
+    let killed = manager.kill(&task_id).expect("kill");
+    assert_eq!(killed.status, ShellStatus::Killed);
+}
+
+#[tokio::test]
+async fn test_exec_shell_wait_cancel_leaves_background_process_running() {
+    let tmp = tempdir().expect("tempdir");
+    let cancel_token = tokio_util::sync::CancellationToken::new();
+    let ctx = ToolContext::new(tmp.path()).with_cancel_token(cancel_token.clone());
+    let shell_manager = ctx.shell_manager.clone();
+    let started = shell_manager
+        .lock()
+        .expect("shell manager lock")
+        .execute(&sleep_command(30), None, 600_000, true)
+        .expect("execute");
+    let task_id = started.task_id.expect("task id");
+    let wait_task_id = task_id.clone();
+    let task_ctx = ctx.clone();
+
+    let task = tokio::spawn(async move {
+        ShellWaitTool::new("exec_shell_wait")
+            .execute(
+                json!({
+                    "task_id": wait_task_id,
+                    "wait": true,
+                    "timeout_ms": 600_000
+                }),
+                &task_ctx,
+            )
+            .await
+            .expect("wait")
+    });
+
+    tokio::time::sleep(Duration::from_millis(150)).await;
+    cancel_token.cancel();
+
+    let result = tokio::time::timeout(Duration::from_secs(5), task)
+        .await
+        .expect("wait should observe cancellation")
+        .expect("task should not panic");
+
+    assert!(result.success);
+    assert!(result.content.contains("still running"));
+    let meta = result.metadata.expect("metadata");
+    assert_eq!(meta.get("status").and_then(Value::as_str), Some("Running"));
+    assert_eq!(
+        meta.get("wait_canceled").and_then(Value::as_bool),
+        Some(true)
+    );
+
+    let mut manager = shell_manager.lock().expect("shell manager lock");
+    let job = manager.inspect_job(&task_id).expect("inspect job");
+    assert_eq!(job.snapshot.status, ShellStatus::Running);
+    let killed = manager.kill(&task_id).expect("kill");
+    assert_eq!(killed.status, ShellStatus::Killed);
+}
+
+#[tokio::test]
+async fn test_exec_shell_cancel_tool_kills_background_process() {
+    let tmp = tempdir().expect("tempdir");
+    let ctx = ToolContext::new(tmp.path());
+    let shell_manager = ctx.shell_manager.clone();
+    let started = shell_manager
+        .lock()
+        .expect("shell manager lock")
+        .execute(&sleep_command(30), None, 600_000, true)
+        .expect("execute");
+    let task_id = started.task_id.expect("task id");
+
+    let result = ShellCancelTool
+        .execute(json!({ "task_id": task_id }), &ctx)
+        .await
+        .expect("cancel");
+
+    assert!(result.success);
+    assert!(result.content.contains("Canceled background shell job"));
+    let meta = result.metadata.expect("metadata");
+    assert_eq!(meta.get("status").and_then(Value::as_str), Some("Killed"));
+
+    let task_id = meta
+        .get("task_id")
+        .and_then(Value::as_str)
+        .expect("task id");
+    let mut manager = shell_manager.lock().expect("shell manager lock");
+    let job = manager.inspect_job(task_id).expect("inspect job");
+    assert_eq!(job.snapshot.status, ShellStatus::Killed);
+}
+
+#[tokio::test]
+async fn test_exec_shell_cancel_tool_can_kill_all_running_processes() {
+    let tmp = tempdir().expect("tempdir");
+    let ctx = ToolContext::new(tmp.path());
+    let shell_manager = ctx.shell_manager.clone();
+    let first = shell_manager
+        .lock()
+        .expect("shell manager lock")
+        .execute(&sleep_command(30), None, 600_000, true)
+        .expect("execute first")
+        .task_id
+        .expect("first task id");
+    let second = shell_manager
+        .lock()
+        .expect("shell manager lock")
+        .execute(&sleep_command(30), None, 600_000, true)
+        .expect("execute second")
+        .task_id
+        .expect("second task id");
+
+    let result = ShellCancelTool
+        .execute(json!({ "all": true }), &ctx)
+        .await
+        .expect("cancel all");
+
+    assert!(result.success);
+    let meta = result.metadata.expect("metadata");
+    assert_eq!(meta.get("status").and_then(Value::as_str), Some("Killed"));
+    assert_eq!(meta.get("canceled").and_then(Value::as_u64), Some(2));
+
+    let mut manager = shell_manager.lock().expect("shell manager lock");
+    let first_job = manager.inspect_job(&first).expect("inspect first");
+    let second_job = manager.inspect_job(&second).expect("inspect second");
+    assert_eq!(first_job.snapshot.status, ShellStatus::Killed);
+    assert_eq!(second_job.snapshot.status, ShellStatus::Killed);
+}
diff --git a/crates/tui/src/tui/history.rs b/crates/tui/src/tui/history.rs
index 57594a03..8ab09ed7 100644
--- a/crates/tui/src/tui/history.rs
+++ b/crates/tui/src/tui/history.rs
@@ -688,6 +688,12 @@ impl ExecCell {
                 TOOL_OUTPUT_LINE_LIMIT,
                 mode,
             ));
+        } else if self.status == ToolStatus::Running && self.source == ExecSource::Assistant {
+            lines.extend(wrap_plain_line(
+                " Ctrl+B opens shell controls.",
+                Style::default().fg(palette::TEXT_MUTED),
+                width,
+            ));
         } else if self.status != ToolStatus::Running {
             lines.push(Line::from(Span::styled(
                 " (no output)",
diff --git a/crates/tui/src/tui/keybindings.rs b/crates/tui/src/tui/keybindings.rs
index 8d7d788e..c0cf9661 100644
--- a/crates/tui/src/tui/keybindings.rs
+++ b/crates/tui/src/tui/keybindings.rs
@@ -155,7 +155,12 @@ pub const KEYBINDINGS: &[KeybindingEntry] = &[
     },
     KeybindingEntry {
         chord: "Ctrl+C",
-        description: "Cancel request, or exit when nothing is running",
+        description: "Cancel request, or exit when idle",
+        section: KeybindingSection::Submission,
+    },
+    KeybindingEntry {
+        chord: "Ctrl+B",
+        description: "Open shell controls for a running foreground command",
         section: KeybindingSection::Submission,
     },
     KeybindingEntry {
diff --git a/crates/tui/src/tui/ui.rs b/crates/tui/src/tui/ui.rs
index 730bc514..0ca3576d 100644
--- a/crates/tui/src/tui/ui.rs
+++ b/crates/tui/src/tui/ui.rs
@@ -93,7 +93,9 @@ use super::history::{
 use super::slash_menu::{
     apply_slash_menu_selection, try_autocomplete_slash_command, visible_slash_menu_entries,
 };
-use super::views::{ConfigView, ContextMenuAction, HelpView, ModalKind, ViewEvent};
+use super::views::{
+    ConfigView, ContextMenuAction, HelpView, ModalKind, ShellControlView, ViewEvent,
+};
 use super::widgets::pending_input_preview::{ContextPreviewItem, PendingInputPreview};
 use super::widgets::{
     ChatWidget, ComposerWidget, FooterProps, FooterToast, FooterWidget, HeaderData, HeaderWidget,
@@ -1439,6 +1441,14 @@ async fn run_event_loop(
                     continue;
                 }
 
+                if matches!(key.code, KeyCode::Char('b') | KeyCode::Char('B'))
+                    && key.modifiers.contains(KeyModifiers::CONTROL)
+                    && app.view_stack.is_empty()
+                {
+                    open_shell_control(app);
+                    continue;
+                }
+
                 if matches!(key.code, KeyCode::Char('c') | KeyCode::Char('C'))
                     && key.modifiers.contains(KeyModifiers::ALT)
                     && !key.modifiers.contains(KeyModifiers::CONTROL)
@@ -4115,6 +4125,19 @@ async fn handle_view_events(
             ViewEvent::ContextMenuSelected { action } => {
                 handle_context_menu_action(app, action);
             }
+            ViewEvent::ShellControlBackground => {
+                request_foreground_shell_background(app);
+            }
+            ViewEvent::ShellControlCancel => {
+                app.backtrack.reset();
+                engine_handle.cancel();
+                app.is_loading = false;
+                app.streaming_state.reset();
+                app.runtime_turn_status = None;
+                app.finalize_active_cell_as_interrupted();
+                app.finalize_streaming_assistant_as_interrupted();
+                app.status_message = Some("Request cancelled".to_string());
+            }
         }
     }
 
@@ -4748,10 +4771,57 @@ fn active_tool_status_label(app: &App) -> Option {
     if let Some(elapsed) = elapsed {
         parts.push(elapsed);
     }
+    if active_foreground_shell_running(app) {
+        parts.push("Ctrl+B shell".to_string());
+    }
     parts.push("Alt+V".to_string());
     Some(parts.join(" \u{00B7} "))
 }
 
+fn open_shell_control(app: &mut App) {
+    if !app.is_loading || !active_foreground_shell_running(app) {
+        app.status_message = Some("No foreground shell command to control".to_string());
+        return;
+    }
+
+    app.view_stack.push(ShellControlView::new());
+    app.status_message = Some("Shell control opened".to_string());
+}
+
+fn request_foreground_shell_background(app: &mut App) {
+    if !app.is_loading || !active_foreground_shell_running(app) {
+        app.status_message = Some("No foreground shell command to background".to_string());
+        return;
+    }
+
+    let Some(shell_manager) = app.runtime_services.shell_manager.clone() else {
+        app.status_message = Some("Shell manager is not attached".to_string());
+        return;
+    };
+
+    match shell_manager.lock() {
+        Ok(mut manager) => {
+            manager.request_foreground_background();
+            app.status_message = Some("Backgrounding current shell command...".to_string());
+        }
+        Err(_) => {
+            app.status_message = Some("Shell manager lock is poisoned".to_string());
+        }
+    }
+}
+
+fn active_foreground_shell_running(app: &App) -> bool {
+    app.active_cell.as_ref().is_some_and(|active| {
+        active.entries().iter().any(|cell| {
+            matches!(
+                cell,
+                HistoryCell::Tool(ToolCell::Exec(exec))
+                    if exec.status == ToolStatus::Running && exec.interaction.is_none()
+            )
+        })
+    })
+}
+
 fn collect_active_tool_status(cell: &HistoryCell, snapshot: &mut ActiveToolStatusSnapshot) {
     let HistoryCell::Tool(tool) = cell else {
         return;
diff --git a/crates/tui/src/tui/views/mod.rs b/crates/tui/src/tui/views/mod.rs
index f74b4c01..b3ed31c1 100644
--- a/crates/tui/src/tui/views/mod.rs
+++ b/crates/tui/src/tui/views/mod.rs
@@ -31,6 +31,7 @@ pub enum ModalKind {
     FilePicker,
     StatusPicker,
     ContextMenu,
+    ShellControl,
 }
 
 #[derive(Debug, Clone)]
@@ -157,6 +158,8 @@ pub enum ViewEvent {
     ContextMenuSelected {
         action: ContextMenuAction,
     },
+    ShellControlBackground,
+    ShellControlCancel,
 }
 
 #[derive(Debug, Clone)]
@@ -288,6 +291,143 @@ impl fmt::Debug for ViewStack {
     }
 }
 
+#[derive(Debug, Clone, Copy, PartialEq, Eq)]
+enum ShellControlChoice {
+    Background,
+    Cancel,
+}
+
+impl ShellControlChoice {
+    fn event(self) -> ViewEvent {
+        match self {
+            ShellControlChoice::Background => ViewEvent::ShellControlBackground,
+            ShellControlChoice::Cancel => ViewEvent::ShellControlCancel,
+        }
+    }
+}
+
+pub struct ShellControlView {
+    selected: ShellControlChoice,
+}
+
+impl ShellControlView {
+    pub fn new() -> Self {
+        Self {
+            selected: ShellControlChoice::Background,
+        }
+    }
+
+    fn toggle(&mut self) {
+        self.selected = match self.selected {
+            ShellControlChoice::Background => ShellControlChoice::Cancel,
+            ShellControlChoice::Cancel => ShellControlChoice::Background,
+        };
+    }
+}
+
+impl ModalView for ShellControlView {
+    fn kind(&self) -> ModalKind {
+        ModalKind::ShellControl
+    }
+
+    fn as_any_mut(&mut self) -> &mut dyn std::any::Any {
+        self
+    }
+
+    fn handle_key(&mut self, key: KeyEvent) -> ViewAction {
+        match key.code {
+            KeyCode::Esc | KeyCode::Char('q') | KeyCode::Char('Q') => ViewAction::Close,
+            KeyCode::Up | KeyCode::Down | KeyCode::Left | KeyCode::Right | KeyCode::Tab => {
+                self.toggle();
+                ViewAction::None
+            }
+            KeyCode::Char('b') | KeyCode::Char('B') => {
+                ViewAction::EmitAndClose(ViewEvent::ShellControlBackground)
+            }
+            KeyCode::Char('c') | KeyCode::Char('C') => {
+                ViewAction::EmitAndClose(ViewEvent::ShellControlCancel)
+            }
+            KeyCode::Enter => ViewAction::EmitAndClose(self.selected.event()),
+            _ => ViewAction::None,
+        }
+    }
+
+    fn render(&self, area: Rect, buf: &mut Buffer) {
+        use ratatui::{
+            prelude::Stylize,
+            style::Style,
+            text::{Line, Span},
+            widgets::{Block, Borders, Clear, Padding, Paragraph, Widget},
+        };
+
+        let popup_width = 62.min(area.width.saturating_sub(4));
+        let popup_height = 11.min(area.height.saturating_sub(2));
+
+        let popup_area = Rect {
+            x: (area.width - popup_width) / 2,
+            y: (area.height - popup_height) / 2,
+            width: popup_width,
+            height: popup_height,
+        };
+
+        Clear.render(popup_area, buf);
+
+        let option_line = |choice: ShellControlChoice, key: &'static str, label: &'static str| {
+            let selected = self.selected == choice;
+            let style = if selected {
+                Style::default()
+                    .fg(palette::SELECTION_TEXT)
+                    .bg(palette::SELECTION_BG)
+            } else {
+                Style::default().fg(palette::TEXT_PRIMARY)
+            };
+            Line::from(vec![
+                Span::styled(if selected { "> " } else { " " }, style),
+                Span::styled(format!("{key:<3}"), style.bold()),
+                Span::styled(label, style),
+            ])
+        };
+
+        let lines = vec![
+            Line::from(Span::styled(
+                "Foreground shell command is still running.",
+                Style::default().fg(palette::TEXT_PRIMARY),
+            )),
+            Line::from(""),
+            option_line(
+                ShellControlChoice::Background,
+                "B",
+                "Background - detach and keep the command running",
+            ),
+            option_line(
+                ShellControlChoice::Cancel,
+                "C",
+                "Cancel - stop the command and interrupt this turn",
+            ),
+        ];
+
+        let view = Paragraph::new(lines)
+            .block(
+                Block::default()
+                    .title(Line::from(vec![Span::styled(
+                        " Shell command ",
+                        Style::default().fg(palette::DEEPSEEK_BLUE).bold(),
+                    )]))
+                    .title_bottom(Line::from(Span::styled(
+                        " Enter select | Esc close ",
+                        Style::default().fg(palette::TEXT_MUTED),
+                    )))
+                    .borders(Borders::ALL)
+                    .border_style(Style::default().fg(palette::BORDER_COLOR))
+
.style(Style::default().bg(palette::DEEPSEEK_INK)) + .padding(Padding::uniform(1)), + ) + .style(Style::default().fg(palette::TEXT_PRIMARY)); + + view.render(popup_area, buf); + } +} + #[derive(Debug, Clone, Copy, PartialEq, Eq)] enum ConfigScope { Session, @@ -1530,8 +1670,8 @@ fn truncate_view_text(text: &str, max_chars: usize) -> String { #[cfg(test)] mod tests { use super::{ - ConfigListItem, ConfigSection, ConfigView, ModalView, ViewAction, ViewEvent, - truncate_view_text, + ConfigListItem, ConfigSection, ConfigView, ModalView, ShellControlView, ViewAction, + ViewEvent, truncate_view_text, }; use crate::config::Config; use crate::localization::Locale; @@ -1800,6 +1940,30 @@ mod tests { assert_eq!(view.status.as_deref(), Some("Edit cancelled")); } + #[test] + fn shell_control_view_defaults_to_background() { + let mut view = ShellControlView::new(); + + let action = view.handle_key(KeyEvent::new(KeyCode::Enter, KeyModifiers::NONE)); + + assert!(matches!( + action, + ViewAction::EmitAndClose(ViewEvent::ShellControlBackground) + )); + } + + #[test] + fn shell_control_view_can_select_cancel() { + let mut view = ShellControlView::new(); + + let action = view.handle_key(KeyEvent::new(KeyCode::Char('c'), KeyModifiers::NONE)); + + assert!(matches!( + action, + ViewAction::EmitAndClose(ViewEvent::ShellControlCancel) + )); + } + fn buffer_text(buf: &Buffer, area: Rect) -> String { let mut out = String::new(); for y in area.top()..area.bottom() { diff --git a/docs/OPERATIONS_RUNBOOK.md b/docs/OPERATIONS_RUNBOOK.md index 5a974f00..d0966343 100644 --- a/docs/OPERATIONS_RUNBOOK.md +++ b/docs/OPERATIONS_RUNBOOK.md @@ -28,9 +28,11 @@ Checks: 3. Confirm no local sandbox/permission deadlock in tool output Actions: -1. Cancel current turn (`Esc` in TUI while loading) -2. Retry prompt; if still failing, restart TUI -3. On restart, verify the previous queued/in-flight runtime turn is shown as interrupted rather than left in a running state +1. 
If a foreground shell command is running, press `Ctrl+B` and choose whether to background it or cancel the current turn. +2. If the command was started in the background, ask the assistant to cancel it with `exec_shell_cancel` and the returned task id. +3. Use `Esc` or `Ctrl+C` to interrupt the current turn when you want to stop the request itself. +4. Retry prompt; if still failing, restart TUI. +5. On restart, verify the previous queued/in-flight runtime turn is shown as interrupted rather than left in a running state. ## Incident: Network Outage / Offline Behavior diff --git a/docs/TOOL_SURFACE.md b/docs/TOOL_SURFACE.md index 5bcbeaf5..ebfe6d71 100644 --- a/docs/TOOL_SURFACE.md +++ b/docs/TOOL_SURFACE.md @@ -41,8 +41,9 @@ chosen over the available shell equivalent. Companion to `crates/tui/src/prompts | Tool | Niche | |---|---| | `exec_shell` | Run a shell command. Foreground runs are cancellable, but use them only for bounded commands; timeout kills the process and returns a background-rerun hint. | -| `exec_shell_wait` | Poll a background task for incremental output. | +| `exec_shell_wait` | Poll a background task for incremental output. Canceling the turn stops waiting without killing the task. | | `exec_shell_interact` | Send stdin to a running background task and read incremental output. | +| `exec_shell_cancel` | Cancel one running background shell task by id, or all running background shell tasks when explicitly requested. | | `task_shell_start` | Start a long-running command in the background and return immediately. Preferred over foreground shell for diagnostics, tests, searches, and servers that may run for minutes. | | `task_shell_wait` | Poll a background command. If `gate` is supplied after completion, record structured gate evidence on the active durable task. |