Merge pull request #1553 from Hmbown/work/v0.8.33

chore(release): prepare v0.8.33
2026-05-12 22:30:53 -05:00
parent 503551ddec a507885fb8
commit 81e4b93cc9
95 changed files with 8702 additions and 2381 deletions
@@ -107,8 +107,9 @@ If a contribution is itself a prompt-injection attempt or otherwise acting in ba

 - **Token/cost tracking inaccuracies**: Token counting and cost estimation may be inflated due to thinking token accounting bugs. Use `/compact` to manage context, and treat cost estimates as approximate.
 - **Modes**: Three modes — Plan (read-only investigation), Agent (tool use with approval), YOLO (auto-approved). See `docs/MODES.md` for details.
- **Sub-agents**: Single model-callable surface is `agent_spawn` (returns an `agent_id` immediately; parent keeps working) plus `agent_wait` / `agent_result` / `agent_cancel` / `agent_list` / `agent_send_input` / `agent_resume` / `agent_assign`. The old `agent_swarm` / `spawn_agents_on_csv` / `/swarm` surface was removed in v0.8.5 (#336).
- **`rlm` tool** (`crates/tui/src/tools/rlm.rs`): a sandboxed Python REPL where a sub-LLM can call in-REPL helpers (`llm_query()`, `llm_query_batched()`, `rlm_query()`, `rlm_query_batched()`) — those `*_query` names are **Python helpers inside the REPL**, not separately-registered model-visible tools. Always loaded across all modes.
+- **Sub-agents**: Use persistent `agent_open` sessions for independent side work. Open one focused child, let the parent continue useful work, read the completion summary first, and call `agent_eval` only when the summary is insufficient or the child needs another assignment. Close completed sessions with `agent_close`. Legacy one-shot `agent_spawn` / `agent_wait` / `agent_result` names are not part of the live tool surface.
+- **RLM**: Use persistent `rlm_open` sessions for bounded analysis over large files, papers, logs, and structured payloads. Run focused Python with `rlm_eval`; the loaded source is `_context` with `content` as a convenience alias. Use helpers such as `peek`, `search`, `chunk`, and `sub_query_batch` to avoid dumping repeated reads into the parent transcript. Configure child-call timeout with `rlm_configure.sub_query_timeout_secs`, not per-call guesses. Use `finalize(...)` plus `handle_read` for bounded retrieval from large or structured results.
+- **Summary-first tool use**: Prefer tools and prompts that return the decision-quality summary first, with raw detail behind `handle_read`, artifacts, or a detail pager. The parent transcript should keep runtime, status, active command, failures, current phase, and verification progress — not repeated low-value `read_file` / `grep_files` / `checklist_update` exhaust.

 ## Session Longevity (Critical)

@@ -116,16 +117,16 @@ Long sessions in DeepSeek TUI WILL degrade and crash if you work sequentially. T

 **To survive a multi-hour sprint:**

-1. **Delegate everything to sub-agents.** Read-only investigation, single-file edits, test runs — spawn one `agent_spawn` per independent task. You are the coordinator, not the worker. Sub-agents start fresh sessions with clean context. Your session stays small.
+1. **Delegate independent work early.** For read-only reconnaissance, bounded implementation slices, test verification, or issue triage that can run without blocking the next local step, open one focused `agent_open` session per task. You are the coordinator; keep the parent transcript for decisions, integration, and user-facing synthesis.

-2. **Batch tool calls.** Never fire one `read_file` and wait. Fire 3 `read_file` + 2 `grep_files` + 1 `git_status` in one turn. The dispatcher runs them in parallel.
+2. **Batch independent reads/searches.** Avoid one `read_file`, wait, another `grep_files`, wait. Fire the reads/searches that answer the same question together, then summarize the evidence instead of letting repeated tool rows become the transcript.

 3. **Compact aggressively.** Suggest `/compact` at 60% context usage, not 80%. A compacted session that stays fast beats a dead session every time.

-4. **Max 3 sequential turns before delegating.** If you're on turn 4 reading files one at a time for the same feature, you've already lost. Spawn sub-agents.
+4. **Reassess after 3 sequential parent turns.** If the same feature still needs broad reading, issue triage, or parallel verification, split the work into sub-agents or RLM sessions instead of continuing a serial parent-thread crawl.

-5. **Use RLM for batch classification.** Need to categorize 15 files? `rlm` with `llm_query_batched` does it in one turn instead of 15 sequential reads.
+5. **Use RLM for batch classification.** Need to categorize 15 files, inspect a paper, or mine a long log? Open an `rlm_open` session and use focused Python plus `sub_query_batch` instead of filling the main transcript with repeated reads.

 6. **After every 3 turns, check:** context under 60%? Sub-agents still running? PRs ready to push? `cargo check` still passes?

-**The "mismanaged genius" problem:** The system prompt was written for a less capable model and treats sub-agents, RLM, and parallel execution as specialty escape hatches. The model *can* do all of this — the prompt just doesn't encourage it strongly enough. We fixed this in v0.8.6 (see `PROMPT_ANALYSIS.md`).
+**Operating model:** Keep the parent session lean. Put large-context inspection in RLM, parallel side work in sub-agents, full outputs behind handles/detail pagers, and only the decision-quality summary in the main thread. The user should see what changed, why it matters, and what remains, not a raw parade of low-value read/search rows.
@@ -7,6 +7,164 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+## [0.8.33] - 2026-05-12
+
+A sub-agent and RLM renovation release. The model-facing delegation
+surface is now session-oriented instead of one-shot: RLM work happens
+through `rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close`,
+sub-agent work happens through `agent_open` / `agent_eval` /
+`agent_close`, and large outputs can be parked behind typed handles
+that the model reads back explicitly with `handle_read`.
+
+### Added
+
+- **Persistent RLM sessions with bounded REPL helpers.** RLM prompts now
+  use `peek`, `search`, `chunk`, `context_meta`, `sub_query`,
+  `sub_query_batch`, `sub_query_map`, `sub_rlm`, and
+  `finalize(value, confidence)` instead of exposing the full parent
+  context as an ambient variable.
+- **Fork-aware sub-agent sessions.** `agent_open` supports named
+  sessions, `fork_context`, and bounded recursive depth so the parent can
+  ask for multiple perspectives while preserving prompt-cache-friendly
+  prefix context where available.
+- **Shared `handle_read` storage.** RLM finals, sub-agent transcripts,
+  and other large structured results can return `var_handle` references
+  with slice, range, count, and JSONPath projections.
+- **Slash-command routing for the new surface.** `/rlm [N] ...` and
+  `/agent [N] ...` now prompt the assistant to use the persistent tools
+  instead of the removed foreground RLM operation.
+- **Harness-friendly non-interactive exec sessions.** `deepseek exec`
+  now supports `--resume`, `--session-id`, `--continue`, and
+  `--output-format stream-json` so backend wrappers such as ClawBench can
+  keep conversation state and parse one JSON event per line without running
+  a long-lived server.
+- **`/relay` slash command with CJK aliases** (`/接力`). Hands the
+  assistant a structured handoff prompt for coordinated multi-turn
+  continuation across sessions.
+- **`checklist_write` sidebar rename.** The sidebar focus tab formerly
+  known as "Plan" / "Todos" is now "Work" — one panel for the active
+  checklist and optional plan, consistent across all three modes.
+- **Grayscale theme.** `/theme grayscale` and
+  `/set theme grayscale --save` provide a low-opinion black/white palette
+  for users who want less brand color in the terminal.
+
+### Changed
+
+- **Prompts and docs now teach only the new tool names.** Legacy
+  RLM/sub-agent helpers remain internally where needed for durable
+  transcript compatibility, but the registry exposes the session tools.
+- **Large or noisy tool results are easier to keep out of context.**
+  Tool output summaries, sub-agent results, and transcript snapshots now
+  point the model toward `handle_read` when it needs raw detail.
+- **Tool-surface smoke guidance is explicit.** Release checks now document
+  the exact version commands and registry-name searches for `handle_read`,
+  persistent RLM tools, and persistent sub-agent tools.
+- **README acknowledgements expanded.** The project thanks OpenWarp and
+  Open Design for support and collaboration around terminal-agent and
+  design-forward workflows.
+- **Light theme tuned for calmer contrast.** The canvas, panel, elevated,
+  border, and selection tokens now separate surfaces without the washed-out
+  white-on-white feel.
+- **Session picker is history-first.** `/sessions` and `Ctrl+R` now show
+  the full selected session history on the left with the session list on
+  the right; number keys `1`-`9` open visible session histories, `PgUp` /
+  `PgDn` scroll that history, and `Enter` still resumes.
+- **Foreground RLM operation removed.** The old `Op::Rlm` path and its
+  `handle_rlm` engine method are gone; all RLM work now flows through
+  the persistent-session tools.
+- **Stale competitive-analysis doc removed.** The old cross-agent matrix
+  had become an unreliable inventory of tool names rather than useful
+  release guidance.
+
+### Fixed
+
+- **Local/custom endpoints stay prompt-free when auth is optional.**
+  The dispatcher no longer reads the secret store for SGLang, vLLM,
+  Ollama, or loopback custom URLs unless API-key auth is explicitly
+  requested, and the direct TUI treats loopback model endpoints as
+  no-key by default. This avoids macOS Keychain prompts and stale
+  DeepSeek keys when users point the app at local OpenAI-compatible
+  servers.
+- **Transcript browsing stays put across resizes.** If the user is reading
+  older chat history, terminal resize events preserve the current transcript
+  position instead of jumping back to the live tail; the scrollbar and
+  jump-to-latest affordance now follow the active theme.
+- **Backtrack preview opens near the selected turn.** Pressing Esc twice no
+  longer opens the live transcript preview at the oldest conversation line;
+  the highlighted recent user turn is pinned into view, and changing the
+  backtrack target re-pins only that selection.
+- **Completed thinking no longer masquerades as prompt text.** Collapsed
+  completed reasoning now shows only explicit `Summary:` content inline; raw
+  reasoning remains available through Ctrl+O/transcript instead of appearing
+  as assistant self-talk in the main flow. When Ctrl+O starts from a reasoning
+  block, it opens a full-session reasoning timeline instead of a single
+  isolated chunk.
+- **Transcript selection keeps working while the agent is streaming.**
+  The loading-state mouse filter now drops inert move events but allows
+  active transcript and scrollbar drags to continue (reported as a known
+  issue in v0.8.32).
+- **Empty-composer arrow scrolling feels less twitchy.** When configured to
+  scroll the transcript, plain Up/Down now move by a small wheel-like step
+  instead of a single-line flick.
+- **Mouse and trackpad scrolling feel less sticky in long logs.** Rapid
+  same-direction transcript scrolls now get bounded acceleration while
+  direction changes reset to precise single-line movement.
+- **RLM smoke-test papercuts fixed.** `rlm_eval` now binds `content` as a
+  convenience alias for `_context`, tolerates common `timeout_secs` keyword
+  guesses on child-query helpers while preserving session-level timeout
+  policy, and stores JSON-serializable `finalize(...)` values as JSON handles
+  so `handle_read` can project them directly.
+- **RLM REPL uses the shared Python resolver.** RLM startup now tries
+  `python3`, `python`, and `py -3`, matching the dependency resolver used by
+  code execution and avoiding Windows failures where `python3` is absent
+  (harvested from PR #1540).
+- **Session titles and history previews hide metadata noise.** Saved
+  session titles and the picker history strip leading `<turn_meta>` envelopes
+  and thinking-tag blocks so historical conversations read like user-visible
+  chat rather than prompt plumbing (harvested from PR #1510).
+- **Companion binary version smoke is unambiguous.** `deepseek-tui --version`
+  now reports the `deepseek-tui` binary name instead of the dispatcher label.
+- **Vision path boundary test is platform-native.** The absolute-path
+  rejection smoke uses a Windows absolute path on Windows and `/etc/hosts`
+  elsewhere (harvested from PR #1526).
+- **Tool papercuts:** `file_search` has safer default excludes and an
+  explicit `exclude` option; `grep_files` returns single-line context as
+  strings; `fetch_url` can project JSON fields and returns headers;
+  `edit_file` can opt into leading-indentation fuzz; `exec_shell` can
+  merge stdout/stderr in chronological order; `revert_turn` rejects
+  no-op snapshot boundaries.
+- **CLI reasoning-effort honoured on non-auto exec routes** (PR #1511
+  from **@h3c-hexin**). `deepseek -p "..." --reasoning-effort high` now
+  applies the flag correctly instead of falling back to the config-file
+  default.
+- **Edit-file replacement boundaries clarified** (PR #1516). The tool
+  description and error messages now make it unambiguous that
+  `edit_file` is for one clear replacement in one file.
+- **Pandoc output validated before probing** (PR #1523). Binary-format
+  conversions that produce empty or invalid output now surface a clear
+  error instead of a confusing pandoc stack trace.
+- **Running turns can be steered and repainted** (PR #1533, #1537).
+  Composer input during an active turn no longer stalls; the TUI
+  redraws the transcript as the agent streams.
+- **Tasks and Activity Detail are calmer under load.** The Tasks panel now
+  keeps live/background/recent activity from double-counting the same shell
+  or RLM work, groups repeated read/search/checklist noise, and keeps
+  failures, status, command summaries, and durations visible. Ctrl+O now
+  opens Activity Detail for selected/live/recent tool work and the reasoning
+  timeline for thinking blocks, while Alt+V remains the direct tool-detail
+  pager; the idle footer now advertises that split for the visible activity.
+- **npm retry shows timeout hint on first failure** (PR #1538).
+  Installations behind slow proxies now see a clear "retrying" message
+  instead of a silent hang.
+- **Issue templates improved** (PR #1525 from **@reidliu41**). Bug and
+  feature-request templates are clearer and easier for new contributors.
+
+### Credits
+
+Thanks to **@reidliu41** (#1525/#1526), **@h3c-hexin** (#1511),
+**@xulongzhe** (#1530/#1544), **@tyouter** (#1510), and
+**@Duducoco** (#1540) for community contributions in this release.
+
 ## [0.8.32] - 2026-05-12

 A "more useful tools" release. v0.8.31 made the tool surface
@@ -3821,7 +3979,8 @@ Welcome — and thank you.
 - Hooks system and config profiles
 - Example skills and launch assets

-[Unreleased]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.32...HEAD
+[Unreleased]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.33...HEAD
+[0.8.33]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.32...v0.8.33
 [0.8.32]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.31...v0.8.32
 [0.8.31]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.30...v0.8.31
 [0.8.30]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.29...v0.8.30
@@ -1160,7 +1160,7 @@ dependencies = [

 [[package]]
 name = "deepseek-agent"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "deepseek-config",
 "serde",
@@ -1168,7 +1168,7 @@ dependencies = [

 [[package]]
 name = "deepseek-app-server"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "axum",
@@ -1190,7 +1190,7 @@ dependencies = [

 [[package]]
 name = "deepseek-config"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "deepseek-secrets",
@@ -1202,7 +1202,7 @@ dependencies = [

 [[package]]
 name = "deepseek-core"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "chrono",
@@ -1220,7 +1220,7 @@ dependencies = [

 [[package]]
 name = "deepseek-execpolicy"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "deepseek-protocol",
@@ -1229,7 +1229,7 @@ dependencies = [

 [[package]]
 name = "deepseek-hooks"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "async-trait",
@@ -1243,7 +1243,7 @@ dependencies = [

 [[package]]
 name = "deepseek-mcp"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "serde",
@@ -1252,7 +1252,7 @@ dependencies = [

 [[package]]
 name = "deepseek-protocol"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "serde",
 "serde_json",
@@ -1260,7 +1260,7 @@ dependencies = [

 [[package]]
 name = "deepseek-secrets"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "dirs",
 "keyring",
@@ -1273,7 +1273,7 @@ dependencies = [

 [[package]]
 name = "deepseek-state"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "chrono",
@@ -1285,7 +1285,7 @@ dependencies = [

 [[package]]
 name = "deepseek-tools"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "async-trait",
@@ -1298,7 +1298,7 @@ dependencies = [

 [[package]]
 name = "deepseek-tui"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "arboard",
@@ -1361,7 +1361,7 @@ dependencies = [

 [[package]]
 name = "deepseek-tui-cli"
-version = "0.8.32"
+version = "0.8.33"
 dependencies = [
 "anyhow",
 "chrono",
@@ -1386,7 +1386,7 @@ dependencies = [

 [[package]]
 name = "deepseek-tui-core"
-version = "0.8.32"
+version = "0.8.33"

 [[package]]
 name = "deltae"
@@ -19,7 +19,7 @@ default-members = ["crates/cli", "crates/app-server", "crates/tui"]
 resolver = "2"

 [workspace.package]
-version = "0.8.32"
+version = "0.8.33"
 edition = "2024"
 # Rust 1.88 stabilized `let_chains` in `if`/`while` conditions, which the
 # codebase relies on extensively. Cargo enforces this so users on older
@@ -68,7 +68,7 @@ It is built around DeepSeek V4 (`deepseek-v4-pro` / `deepseek-v4-flash`), includ
 - **Durable task queue** — background tasks can survive restarts
 - **HTTP/SSE runtime API** — `deepseek serve --http` for headless agent workflows
 - **MCP protocol** — connect to Model Context Protocol servers for extended tooling; please see [docs/MCP.md](docs/MCP.md)
- **Native RLM** (`rlm_query`) — run batched analysis through cheap `deepseek-v4-flash` children using the same API client
+- **Native RLM** (`rlm_open`/`rlm_eval`) — persistent REPL sessions for batched analysis; run cheap `deepseek-v4-flash` children with bounded helpers like `peek`, `search`, `chunk`, and `sub_query_batch`
 - **LSP diagnostics** — inline error/warning surfacing after every edit via rust-analyzer, pyright, typescript-language-server, gopls, clangd
 - **User memory** — optional persistent note file injected into the system prompt for cross-session preferences
 - **Localized UI** — `en`, `ja`, `zh-Hans`, `pt-BR` with auto-detection
@@ -83,6 +83,17 @@ It is built around DeepSeek V4 (`deepseek-v4-pro` / `deepseek-v4-flash`), includ

 See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full walkthrough.

+### Sub-agents: Concurrent Background Execution
+
+DeepSeek TUI can dispatch multiple sub-agents that run in parallel — like a concurrent task queue:
+
+- **Non-blocking launch.** `agent_open` returns immediately. The child gets its own fresh context and tool registry and runs independently. The parent keeps working.
+- **Background execution.** Sub-agents execute concurrently (default cap: 10, configurable to 20). The engine manages the pool — no polling loop needed.
+- **Completion notification.** When a sub-agent finishes, the runtime delivers a structured `<deepseek:subagent.done>` event with a summary, evidence list, and execution metrics. The parent model reads the `summary` field and integrates findings.
+- **Bounded result retrieval.** Large transcripts are parked behind `var_handle` references. The model calls `handle_read` for slices, ranges, or JSONPath projections — keeping the parent context lean.
+
+See [docs/SUBAGENTS.md](docs/SUBAGENTS.md) for the full sub-agent reference.
+
 ---

 ## Quickstart
@@ -225,90 +236,52 @@ deepseek --provider ollama --model deepseek-coder:1.3b

 ---

-## What's New In v0.8.32
+## What's New In v0.8.33

-A "more useful tools" release expanding the tool surface for real-world
-workflows. Five new tools, ten community PRs targeting model-protocol bugs
-and UX papercuts, and a snapshot cap that stops giant workspaces from
-hanging the TUI on first turn. [Full changelog](CHANGELOG.md).
+A sub-agent and RLM renovation release. The model-facing delegation
+surface is now session-oriented: `rlm_open` / `rlm_eval` /
+`rlm_configure` / `rlm_close` for persistent RLM work, `agent_open` /
+`agent_eval` / `agent_close` for named sub-agent sessions, and
+`handle_read` for bounded retrieval from large results. Six tool
+papercuts fixed, two community PRs landed, and the sidebar gets a
+cleaner "Work" tab. [Full changelog](CHANGELOG.md).

- **Five new tools.** `read_file` now extracts PDFs in pure Rust — no
-  Poppler install required. `pandoc_convert` moves documents between 11
-  formats (Markdown, HTML, DOCX, EPUB, LaTeX…). `image_ocr` runs local
-  tesseract on screenshots and scanned documents. `image_analyze` sends
-  images to a vision model for natural-language description (opt-in only).
-  `js_execution` mirrors `code_execution` for Node.js snippets.
- **Two more providers.** AtlasCloud joins as a first-class provider
-  (`provider = "atlascloud"`) with the same config-surface shape as the
-  existing NVIDIA NIM / Fireworks rows. `web_search` supports Tavily and
-  Bocha as configurable backends for regions where DuckDuckGo is
-  unreliable.
- **Prompt-cache survives mid-session edits** (PR #1345 from
-  **@Duducoco**). Moving `instructions`, user memory, and session goal
-  below the volatile-content boundary means the KV prefix cache no longer
-  breaks every time you edit your memory file — skills and context
-  management instructions stay hot regardless of how often you run
-  `/memory`.
- **vLLM thinking toggle actually works now** (PR #1480 from
-  **@h3c-hexin**). `reasoning_effort = "off"` on vLLM providers now emits
-  the OpenAI `chat_template_kwargs.enable_thinking` extension instead of
-  the silently-ignored Anthropic-native field. Measured improvement on
-  Qwen3: TTFT from ~13s → ~270ms.
- **Kitty keyboard protocol on Windows** (PR #1483 from
-  **@CrepuscularIRIS / autoghclaw**). `Shift+Enter` now inserts a
-  newline instead of submitting in VSCode and Windows Terminal —
-  previously indistinguishable from plain Enter on Windows.
- **Tool-result retrieval namespace unified** (#1541). Wire-dedup refs
-  and disk-spillover refs now share a lookup path — `retrieve_tool_result`
-  accepts SHA refs, bare hex hashes, `art_<id>` aliases, and absolute
-  paths, with error messages that list every accepted form.
- **Snapshots skip giant workspaces** (#1552). A 2 GB ceiling on
-  non-excluded workspace content prevents first-turn `git add -A` from
-  hanging the TUI on multi-hundred-GB project directories. Configurable
-  via `[snapshots] max_workspace_gb`; set to `0` to restore unbounded
-  behaviour.
- **`deepseek update` refreshes both binaries** (PR #1492 from
-  **@NorethSea**). The updater now enumerates colocated binaries (both
-  the dispatcher and the TUI runtime), downloads and verifies every
-  release asset, and writes the sibling first so a partial failure can't
-  leave the launcher updated while the TUI stays stale.
- **Approval modal collapses to a one-line banner** (PR #1455 from
-  **@tiger-dog**). Tab toggles between the full takeover card and a
-  bottom-line summary — the transcript stays visible while you decide.
- **`@`-mention truncation no longer splits CJK codepoints** (PR #1495
-  from **@CrepuscularIRIS / autoghclaw**). Files larger than 128 KB
-  used to truncate mid-codepoint; the truncator now rounds down to the
-  last valid UTF-8 boundary.
- **Startup empty-state shows the build version**, active model with a
-  `/model` hint, and current working directory (PR #1444 from
-  **@reidliu41**).
- **`/change` slash command** displays the latest CHANGELOG section
-  inside the TUI (PR #1416 from **@zhuangbiaowei**).
- **Toast overlay no longer renders on top of the composer** (PR #1485
-  from **@MeAiRobot**). Approval toasts now clamp to the gap between
-  the composer and footer.
- **TUI no longer freezes during long-running shell jobs** (PR #1494
-  from **@CrepuscularIRIS / autoghclaw**). The job panel's refresh path
-  now reads only the tail bytes under the mutex lock instead of cloning
-  the entire stdout buffer every 2.5 seconds.
- **Markdown renderer no longer eats underscores in identifiers** (PR
-  #1455 from **@tiger-dog**). `deepseek_tui` and `foo_bar_baz` no longer
-  render half-italic.
- **`/sessions` picker highlights the selected row** more strongly in
-  dark terminals (PR #1493 from **@reidliu41**), and no longer shows
-  `<turn_meta>` as the session title (PR #1498 from **@wdw8276**).
+- **Persistent RLM sessions.** RLM work now uses `rlm_open` /
+  `rlm_eval` / `rlm_close` with bounded REPL helpers (`peek`,
+  `search`, `chunk`, `sub_query`, `sub_query_batch`, `finalize`)
+  — the model drives the REPL through tool calls instead of a
+  foreground loop.
+- **Fork-aware sub-agent sessions.** `agent_open` supports named
+  sessions, `fork_context` for prompt-cache-friendly perspective
+  fanout, and bounded recursive depth. Sub-agent results and
+  transcripts can be parked behind `var_handle` references.
+- **Shared `handle_read` tool.** Large structured results (RLM
+  finals, sub-agent transcripts, tool artifacts) return typed handles
+  with slice, range, count, and JSONPath projections — the model
+  reads back only what it needs.
+- **Text selection now works during streaming.** The loading-state
+  mouse filter drops inert move events but allows transcript and
+  scrollbar drags to continue — the known issue from v0.8.32 is
+  resolved.
+- **Grayscale theme.** Use `/theme grayscale` for a quiet black/white
+  palette, or `/set theme grayscale --save` to make it the saved default.
+- **Session history picker.** `/sessions` and `Ctrl+R` now put full
+  session history on the left, the session list on the right, number keys
+  `1`-`9` open visible histories, and `PgUp` / `PgDn` scroll history.
+- **Six tool papercuts fixed.** `file_search` safer excludes;
+  `grep_files` returns clean strings; `fetch_url` JSON field
+  projection and headers; `edit_file` indentation fuzz;
+  `exec_shell` merged stdout/stderr; `revert_turn` rejects no-ops.
+- **CLI reasoning-effort honoured** on `--reasoning-effort high`
+  non-auto exec routes (PR #1511 from **@h3c-hexin**).
+- **Sidebar "Work" tab.** The former "Plan" / "Todos" tabs are now
+  one "Work" panel for the active checklist, consistent across Plan,
+  Agent, and YOLO modes.
+- **`/relay` command with CJK aliases** (`/接力`) for structured
+  multi-session handoff prompts.

-**Known issue in v0.8.32:** terminal-native text selection can still be
-blocked while the agent is thinking or streaming a response. v0.8.33 is
-planned to ship the text-selection fix alongside the sub-agent and RLM
-renovation.
-
-Thanks to **@CrepuscularIRIS** (4 landings), **@reidliu41** (2 landings),
-**@tiger-dog** (2 landings), **@Duducoco**, **@h3c-hexin**,
-**@NorethSea**, **@MeAiRobot**, **@zhuangbiaowei**, **@wdw8276**,
-**@MMMarcinho**, **@SamhandsomeLee**, **@sandofree**,
-**@lucaszhu-hue**, **@muyuliyan**, **@Oliver-ZPLiu**, **@czf0718**,
-**@jieshu666**, and **@YaYII**.
+Thanks to **@reidliu41** and **@h3c-hexin** for community
+contributions in this release.

 ---

@@ -317,6 +290,8 @@ Thanks to **@CrepuscularIRIS** (4 landings), **@reidliu41** (2 landings),
 ```bash
 deepseek                                         # interactive TUI
 deepseek "explain this function"                 # one-shot prompt
+deepseek exec --auto --output-format stream-json "fix this bug"  # NDJSON backend stream
+deepseek exec --resume <SESSION_ID> "follow up"  # continue a non-interactive session
 deepseek --model deepseek-v4-flash "summarize"   # model override
 deepseek --model auto "fix this bug"             # auto-select model + thinking
 deepseek --yolo                                  # auto-approve tools
@@ -498,6 +473,8 @@ Full Changelog: [CHANGELOG.md](CHANGELOG.md).

 - **[DeepSeek](https://github.com/deepseek-ai)** — thank you for the models and support that power every turn. 感谢 DeepSeek 提供模型与支持，让每一次交互成为可能。
 - **[DataWhale](https://github.com/datawhalechina)** 🐋 — thank you for your support and for welcoming us into the Whale Brother family. 感谢 DataWhale 的支持，并欢迎我们加入“鲸兄弟”大家庭。
+- **[OpenWarp](https://github.com/zerx-lab/warp)** — thank you for prioritizing DeepSeek TUI support and for collaborating on a better terminal-agent experience.
+- **[Open Design](https://github.com/nexu-io/open-design)** — thank you for support and collaboration around design-forward agent workflows.

 This project ships with help from a growing community of contributors:

@@ -536,7 +513,7 @@ This project ships with help from a growing community of contributors:
 - **Unic (YuniqueUnic)** — Schema-driven config UI (TUI + web)
 - **Jason** — SSRF security hardening
 - **[axobase001](https://github.com/axobase001)** — snapshot orphan cleanup, npm install guards, session telemetry fixes, model-scope cache clear, symlinked skill support, and npm mirror-escape-hatch guidance (#975, #1032, #1047, #1049, #1052, #1019, #1051, #1056)
- **[MengZ-super](https://github.com/MengZ-super)** — `/theme` command for dark/light toggle and SSE gzip/brotli decompression (#1057, #1061)
+- **[MengZ-super](https://github.com/MengZ-super)** — `/theme` command foundation and SSE gzip/brotli decompression (#1057, #1061)
 - **[DI-HUO-MING-YI](https://github.com/DI-HUO-MING-YI)** — Plan-mode read-only sandbox safety fix (#1077)
 - **[bevis-wong](https://github.com/bevis-wong)** — precise paste-Enter auto-submit reproducer (#1073)
 - **[Duducoco](https://github.com/Duducoco)** and **[AlphaGogoo](https://github.com/AlphaGogoo)** — skills slash-menu and `/skills` coverage fix (#1068, #1083)
@@ -53,7 +53,7 @@ DeepSeek TUI 是一个完全运行在终端里的编程智能体。它让 DeepSe

 ### 主要功能

- **原生 RLM**（`rlm_query`）—— 利用现有 API 客户端并行调度 1-16 个低成本 `deepseek-v4-flash` 子任务，用于批量分析和并行推理
+- **原生 RLM**（`rlm_open`/`rlm_eval`）—— 持久化 REPL 会话用于批量分析；使用带界面的辅助函数（`peek`、`search`、`chunk`、`sub_query_batch`）运行低成本 `deepseek-v4-flash` 子任务
 - **思考模式流式输出** —— 实时观察模型在解决问题时的思维链展开
 - **完整工具集** —— 文件操作、shell 执行、git、网页搜索/浏览、apply-patch、子智能体、MCP 服务器
 - **100 万 token 上下文** —— 上下文接近上限时自动智能压缩，支持前缀缓存感知以降低成本
@@ -78,6 +78,17 @@ DeepSeek TUI 是一个完全运行在终端里的编程智能体。它让 DeepSe

 详见 [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md)。

+### 子智能体：并发后台执行
+
+DeepSeek TUI 可以同时调度多个子智能体并行运行——类似于并发任务队列：
+
+- **非阻塞启动。** `agent_open` 立即返回。子智能体获得独立的上下文和工具注册表，独立运行。父进程继续工作。
+- **后台执行。** 子智能体并发运行（默认上限 10，可配置至 20）。引擎管理线程池——无需轮询循环。
+- **完成通知。** 子智能体完成后，运行时发送结构化的 `<deepseek:subagent.done>` 事件，包含摘要、证据列表和执行指标。父模型读取 `summary` 字段并整合结果。
+- **按需读取结果。** 大型对话记录暂存为 `var_handle` 引用。模型通过 `handle_read` 按切片、范围或 JSONPath 投影读取——保持父上下文精简。
+
+详见 [docs/SUBAGENTS.md](docs/SUBAGENTS.md)。
+
 ---

 ## 快速开始
@@ -105,6 +116,21 @@ deepseek doctor                          # 验证安装

 > 轮换或移除密钥：`deepseek auth clear --provider deepseek`。

+### Auto 模式
+
+使用 `deepseek --model auto` 或 `/model auto` 让 DeepSeek TUI 自行决定每轮需要多少模型和推理能力。
+
+Auto 模式同时控制两个设置：
+
+- 模型：`deepseek-v4-flash` 或 `deepseek-v4-pro`
+- 推理强度：`off`、`high` 或 `max`
+
+在真实请求发出之前，应用会先用关闭推理的 `deepseek-v4-flash` 进行一次小型路由调用。路由器审视最新请求和最近的上下文，然后为真实请求选定具体的模型和推理强度。简短/简单的轮次保持在 Flash + 关闭推理；编码、调试、发布、架构、安全审查或模糊的多步骤任务可升级到 Pro 和/或更高推理强度。
+
+`auto` 是 DeepSeek TUI 本地行为。上游 API 永远不会收到 `model: "auto"`，它只会收到为当前轮次选定的具体模型和推理强度设置。TUI 会显示选定的路由，成本跟踪按实际运行的模型计费。如果路由调用失败或返回无效答案，应用会回退到本地启发式规则。子智能体会继承 auto 模式，除非你为它们指定了显式模型。
+
+需要可重复基准测试、严格控制成本上限或特定提供商/模型映射时，请使用固定模型或固定推理强度。
+
 ### Linux ARM64（HarmonyOS 轻薄本、openEuler、Kylin、树莓派、Graviton 等）

 从 v0.8.8 起，`npm i -g deepseek-tui` 直接支持 glibc 系的 ARM64 Linux。你也可以从 [Releases 页面](https://github.com/Hmbown/DeepSeek-TUI/releases) 下载预编译二进制，放到 `PATH` 目录中。
@@ -179,6 +205,10 @@ deepseek --provider nvidia-nim
 deepseek auth set --provider fireworks --api-key "YOUR_FIREWORKS_API_KEY"
 deepseek --provider fireworks --model deepseek-v4-pro

+# 通用 OpenAI 兼容端点
+deepseek auth set --provider openai --api-key "YOUR_OPENAI_COMPATIBLE_API_KEY"
+OPENAI_BASE_URL="https://openai-compatible.example/v4" deepseek --provider openai --model glm-5
+
 # 自托管 SGLang
 SGLANG_BASE_URL="http://localhost:30000/v1" deepseek --provider sglang --model deepseek-v4-flash

@@ -192,93 +222,46 @@ deepseek --provider ollama --model deepseek-coder:1.3b

 ---

-## v0.8.29 新功能
+## v0.8.33 新功能

-维护版本，核心是修复 v0.8.27 / v0.8.28 引入的"滚动幽灵"回归
-（#1085 类问题）和 Ctrl+R 会话恢复跨项目泄漏的问题（#1395），
-外加 25 个社区 PR。[完整更新日志](CHANGELOG.md)。
+子智能体和 RLM 改造版本。面向模型的委托界面现在是面向会话的：
+`rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close` 用于持久
+RLM 工作，`agent_open` / `agent_eval` / `agent_close` 用于命名子
+智能体会话，`handle_read` 用于从大型结果中按需读取。修复了六个
+工具细节问题，落地了两个社区 PR，侧边栏合并为更清晰的"Work"标签。
+[完整更新日志](CHANGELOG.md)。

- **"滚动幽灵"彻底修复**（#1085 回归）。并行子代理运行
-  `exec_shell` 时，alt-screen 会被滚动出 ratatui 差分渲染器的
-  视野，header 上方出现越来越大的空白带。三层防护一并上线：
-  写入 `~/.deepseek/logs/tui-YYYY-MM-DD.log` 的 `tracing-subscriber`、
-  alt-screen 生命周期内的 fd 级 stderr 重定向（Unix `dup2`）、
-  以及 `tools/`、`core/`、`tui/`、`network_policy.rs`、
-  `runtime_threads.rs` 模块的
-  `#![deny(clippy::print_stdout, clippy::print_stderr)]`。今后在
-  这些模块新增 `eprintln!` 会被 CI 拒绝。
- **Ctrl+R 会话恢复改为按当前工作区过滤**（#1395，PR #1397，
-  来自 **@linzhiqin2003**）— 此前列出磁盘上所有会话，导致
-  在项目 B 打开 DeepSeek-TUI 时按下 Ctrl+R 可能恢复项目 A 的
-  历史记录。
- **运行时版本号直接显示在 header 中。** Header 右侧集群在
-  provider / effort / Live / context 之后增加一个 `v0.8.29`
-  小标签，在终端宽度紧张时最先收起。
- **MCP HTTP 传输现在尊重 HTTP(S)_PROXY**（#1408，来自
-  **@hlx98007**）— 公司出口代理、国内 Clash / Shadowsocks 代理
-  现在能正确应用于 MCP HTTP 连接，跟 box 上的其他工具
-  （curl、npm、git 等）保持一致。同时支持 `NO_PROXY`。
- **MCP 发现接受不规范条目**（PR #1410，来自 **@Liu-Vince**）—
-  一个错误的 tool / resource / prompt 条目不再让整页丢失；
-  错误条目被跳过，目录的其余部分正常返回。
- **MCP SSE 接受 CRLF 分隔的 endpoint 事件**（#1309，PR #1358，
-  来自 **@reidliu41**）— FastMCP / uvicorn 风格的 SSE 流不再因
-  只等待 LF 分隔符而超时。
- **输入框会忽略泄漏的鼠标报告字节**（#1418，PR #1421，来自
-  **@reidliu41**）— 某些 SSH / IDE 终端链路把 `[<35;44;18M`
-  这类鼠标报告泄漏到 stdin 时，不再把输入区域填满。
- **Footer 芯片会遵守可用宽度**（#1357，PR #1417，来自
-  **@Wenjunyun123**）— 窄终端下，过长的 cache / aux 芯片会先
-  收起，而不是挤压左侧状态或 composer 区域。
- **笔记管理斜杠命令**（PR #1407，来自 **@reidliu41**）—
-  `/note add`、`/note list` 等命令在 TUI 内提供持久笔记功能。
- **全局 `~/.deepseek/AGENTS.md` 与项目 AGENTS.md 合并**
-  （#1157，PR #1399，来自 **@linzhiqin2003**）— 此前工作区
-  自带 AGENTS.md 会完全遮蔽全局基准，现在分层叠加。
- **语言指令：thinking 跟随用户消息语言**（#1118，PR #1398，
-  来自 **@linzhiqin2003**）— 此前项目上下文推断的 `lang`
-  字段可能压制最新用户消息的语言，导致中文对话出现英文 thinking。
- **网络搜索过滤垃圾 SERP**（#964，PR #1396，来自
-  **@linzhiqin2003**）— Bing / DDG 回退路径丢弃污染快速查找
-  结果的 SEO 农场域名。
- **Auto 路由识别 CJK 调试 / 搜索关键词**（PR #1401、#1402，
-  来自 **@linzhiqin2003**）— `--model auto` 和推理强度选择器
-  现在能正确路由中文 / 日文技术查询，此前会回退到通用基准。
- **Deferred tools 首次执行前会先加载 schema**（#1419，PR #1429，
-  来自 **@SamhandsomeLee**）— `edit_file` 等延迟加载工具现在会先
-  展示期望字段并要求模型重试，而不是执行模型猜测出来的参数名。
- **DeepSeek 公开别名会正确回放 thinking-mode 工具轮次**（PR #1428，
-  来自 **@Beltran12138**）— `deepseek-chat` 和
-  `deepseek-reasoner` 现在与显式 V4 模型 ID 一样触发
-  `reasoning_content` replay，避免工具调用后的第二轮 400。
- **技能补全收敛到 `/skill` 下**（#1437，PR #1442，来自
-  **@reidliu41**）— 本地技能很多时不会再挤满根级 `/` 命令菜单。
- **`edit_file` 拒绝无变化替换**（PR #1460，来自
-  **@xiluoduyu**）— `search` / `replace` 完全相同时会直接返回
-  清晰的参数错误，而不是生成空 diff。
- **Windows 终端布局使用宽度稳定的字形**（#1314，PR #1465，来自
-  **@CrepuscularIRIS**）— header 和文件树不再依赖 cmd /
-  PowerShell 容易误判宽度的 SMP emoji。
- **Ghostty 默认启用低动态渲染**（#1445，PR #1468，来自
-  **@CrepuscularIRIS**）— 受影响终端无需手动配置即可避开动画闪烁。
- **Docker buildx provenance 的 EPERM 失败会给出提示**（#1449，
-  PR #1469，来自 **@CrepuscularIRIS**）— macOS shell 输出命中
-  受限 metadata 写入失败时，会提示 provenance 相关开关。
- **Windows CMD 的鼠标滚轮回退会滚动 transcript**（#1443，
-  PR #1471，来自 **@CrepuscularIRIS**）— 关闭 mouse capture 时，
-  被终端映射成 Up / Down 的滚轮事件不再循环 composer 历史。
- **`sync-cnb.yml` 工作流加固** — 显式 `permissions: contents:
-  read`、`actions/checkout` v3 → v4、触发器收紧到 `main` +
-  `v*` 标签（不再镜像 feature 分支）。
- **新增 +438 LOC 测试覆盖** — `error_taxonomy`、
-  `parse_pages_arg`、Web 搜索优先级、`sanitize_stream_chunk`
-  控制字节过滤（PR #1403–#1406，来自 **@linzhiqin2003**）。
+- **持久化 RLM 会话。** RLM 工作现在通过 `rlm_open` / `rlm_eval` /
+  `rlm_close` 进行，使用受限的 REPL 辅助函数（`peek`、`search`、
+  `chunk`、`sub_query`、`sub_query_batch`、`finalize`）——
+  模型通过工具调用来驱动 REPL，而非前台循环。
+- **Fork 感知的子智能体会话。** `agent_open` 支持命名会话、
+  `fork_context` 以实现前缀缓存友好的多视角展开，以及有界的递归
+  深度。子智能体结果和对话记录可以通过 `var_handle` 引用暂存。
+- **共享 `handle_read` 工具。** 大型结构化结果（RLM 最终输出、
+  子智能体对话记录、工具产物）返回带类型的句柄，支持切片、
+  范围、计数和 JSONPath 投影——模型只读取需要的内容。
+- **流式输出期间文本选择正常工作。** 加载状态的鼠标过滤器丢弃
+  无关移动事件，但允许对话记录和滚动条拖动继续——
+  v0.8.32 的已知问题已解决。
+- **灰度主题。** 使用 `/theme grayscale` 可切换到更克制的黑白
+  调色板；使用 `/set theme grayscale --save` 可保存为默认主题。
+- **会话历史选择器。** `/sessions` 和 `Ctrl+R` 现在左侧显示完整
+  会话历史，右侧显示会话列表；按 `1`-`9` 打开可见会话历史，
+  `PgUp` / `PgDn` 翻页查看历史。
+- **六个工具细节修复。** `file_search` 更安全的默认排除项；
+  `grep_files` 返回干净的字符串；`fetch_url` JSON 字段投影和
+  响应头；`edit_file` 缩进模糊匹配；`exec_shell` 合并
+  stdout/stderr；`revert_turn` 拒绝空操作。
+- **CLI 推理强度参数在非 auto 执行路径上生效**（PR #1511，
+  来自 **@h3c-hexin**）。`deepseek -p "..." --reasoning-effort high`
+  现在正确应用该标志。
+- **侧边栏 "Work" 标签。** 原先的 "Plan" / "Todos" 标签现在合并为
+  一个 "Work" 面板，在 Plan、Agent、YOLO 三种模式下保持一致。
+- **`/relay` 命令及中文别名**（`/接力`）——用于结构化的跨会话
+  接力提示。

-感谢本周期落地 10 个 PR 的 **@linzhiqin2003**、落地 5 个 PR 的
-**@reidliu41**、落地 4 个 PR 的 **@CrepuscularIRIS**，以及
-**@SamhandsomeLee**、**@Beltran12138**、**@Wenjunyun123**、
-**@hlx98007**、**@Liu-Vince**、**@xiluoduyu**，和报告 #1395 的
-**@shenxiaodaosanhua**。
+感谢 **@reidliu41** 和 **@h3c-hexin** 在本版本中的社区贡献。

 ---

@@ -287,6 +270,8 @@ deepseek --provider ollama --model deepseek-coder:1.3b
 ```bash
 deepseek                                       # 交互式 TUI
 deepseek "explain this function"              # 一次性提示
+deepseek exec --auto --output-format stream-json "fix this bug" # 面向后端集成的 NDJSON 流
+deepseek exec --resume <SESSION_ID> "follow up" # 继续非交互会话
 deepseek --model deepseek-v4-flash "summarize" # 指定模型
 deepseek --yolo                                # 自动批准工具
 deepseek auth set --provider deepseek         # 保存 API key
@@ -307,6 +292,36 @@ deepseek mcp-server                            # 启动 dispatcher MCP stdio 服
 deepseek update                                # 检查并应用二进制更新
 ```

+Docker 镜像发布在 GHCR 上：
+
+```bash
+docker volume create deepseek-tui-home
+
+docker run --rm -it \
+  -e DEEPSEEK_API_KEY="$DEEPSEEK_API_KEY" \
+  -v deepseek-tui-home:/home/deepseek/.deepseek \
+  ghcr.io/hmbown/deepseek-tui:latest
+```
+
+### Zed / ACP
+
+DeepSeek 可作为自定义 Agent Client Protocol 服务器运行，供 Zed 等编辑器通过 stdio 调用本地 ACP 智能体。在 Zed 中添加自定义智能体服务器：
+
+```json
+{
+  "agent_servers": {
+    "DeepSeek": {
+      "type": "custom",
+      "command": "deepseek",
+      "args": ["serve", "--acp"],
+      "env": {}
+    }
+  }
+}
+```
+
+首个 ACP 切片支持通过现有 DeepSeek 配置/API 密钥创建新会话和提示响应。工具支持的编辑和检查点回放尚未通过 ACP 暴露。
+
 ### 常用快捷键

 | 按键 | 功能 |
@@ -347,10 +362,11 @@ deepseek update                                # 检查并应用二进制更新
 | `DEEPSEEK_API_KEY` | DeepSeek API key |
 | `DEEPSEEK_BASE_URL` | API base URL |
 | `DEEPSEEK_MODEL` | 默认模型 |
-| `DEEPSEEK_PROVIDER` | `deepseek`（默认）、`nvidia-nim`、`fireworks`、`sglang`、`vllm`、`ollama` |
+| `DEEPSEEK_PROVIDER` | `deepseek`（默认）、`nvidia-nim`、`openai`、`openrouter`、`novita`、`atlascloud`、`fireworks`、`sglang`、`vllm`、`ollama` |
 | `DEEPSEEK_PROFILE` | 配置 profile 名称 |
 | `DEEPSEEK_MEMORY` | 设为 `on` 启用用户记忆 |
-| `NVIDIA_API_KEY` / `FIREWORKS_API_KEY` / `SGLANG_API_KEY` / `VLLM_API_KEY` / `OLLAMA_API_KEY` | 提供商认证 |
+| `NVIDIA_API_KEY` / `OPENAI_API_KEY` / `OPENROUTER_API_KEY` / `NOVITA_API_KEY` / `ATLASCLOUD_API_KEY` / `FIREWORKS_API_KEY` / `SGLANG_API_KEY` / `VLLM_API_KEY` / `OLLAMA_API_KEY` | 提供商认证 |
+| `OPENAI_BASE_URL` / `OPENAI_MODEL` | 通用 OpenAI 兼容端点和模型 ID |
 | `SGLANG_BASE_URL` | 自托管 SGLang 端点 |
 | `VLLM_BASE_URL` | 自托管 vLLM 端点 |
 | `OLLAMA_BASE_URL` | 自托管 Ollama 端点 |
@@ -449,6 +465,11 @@ description: 当 DeepSeek 需要遵循我的自定义工作流时使用这个技

 ## 致谢

+- **[DeepSeek](https://github.com/deepseek-ai)** — 感谢 DeepSeek 提供模型与支持，让每一次交互成为可能。
+- **[DataWhale](https://github.com/datawhalechina)** — 感谢 DataWhale 的支持，并欢迎我们加入“鲸兄弟”大家庭。
+- **[OpenWarp](https://github.com/zerx-lab/warp)** — 感谢 OpenWarp 优先支持 DeepSeek TUI，并一起打磨更好的终端智能体体验。
+- **[Open Design](https://github.com/nexu-io/open-design)** — 感谢 Open Design 对面向设计的智能体工作流提供支持与协作。
+
 本项目由不断壮大的贡献者社区共同打造：

 - **[merchloubna70-dot](https://github.com/merchloubna70-dot)** — 28 个 PR，涵盖功能、修复和 VS Code 扩展基础架构 (#645–#681)
@@ -486,7 +507,7 @@ description: 当 DeepSeek 需要遵循我的自定义工作流时使用这个技
 - **Unic (YuniqueUnic)** — 基于 schema 的配置 UI（TUI + web）
 - **Jason** — SSRF 安全加固
 - **[axobase001](https://github.com/axobase001)** — 快照孤儿文件清理、npm 安装守卫、会话遥测修复、模型作用域缓存清理、符号链接技能支持，以及 npm 镜像逃生路径指引 (#975, #1032, #1047, #1049, #1052, #1019, #1051, #1056)
- **[MengZ-super](https://github.com/MengZ-super)** — `/theme` 深色/浅色主题切换命令和 SSE gzip/brotli 解压支持 (#1057, #1061)
+- **[MengZ-super](https://github.com/MengZ-super)** — `/theme` 命令基础和 SSE gzip/brotli 解压支持 (#1057, #1061)
 - **[DI-HUO-MING-YI](https://github.com/DI-HUO-MING-YI)** — Plan 模式只读沙箱安全修复 (#1077)
 - **[bevis-wong](https://github.com/bevis-wong)** — 粘贴-回车自动提交问题的精确复现 (#1073)
 - **[Duducoco](https://github.com/Duducoco)** 和 **[AlphaGogoo](https://github.com/AlphaGogoo)** — 技能斜杠菜单和 `/skills` 覆盖范围修复 (#1068, #1083)
@@ -7,5 +7,5 @@ repository.workspace = true
 description = "Model/provider registry and fallback strategy for DeepSeek workspace architecture"

 [dependencies]
-deepseek-config = { path = "../config", version = "0.8.32" }
+deepseek-config = { path = "../config", version = "0.8.33" }
 serde.workspace = true
@@ -10,15 +10,15 @@ description = "Codex-style app-server transport for DeepSeek workspace architect
 anyhow.workspace = true
 axum.workspace = true
 clap.workspace = true
-deepseek-agent = { path = "../agent", version = "0.8.32" }
-deepseek-config = { path = "../config", version = "0.8.32" }
-deepseek-core = { path = "../core", version = "0.8.32" }
-deepseek-execpolicy = { path = "../execpolicy", version = "0.8.32" }
-deepseek-hooks = { path = "../hooks", version = "0.8.32" }
-deepseek-mcp = { path = "../mcp", version = "0.8.32" }
-deepseek-protocol = { path = "../protocol", version = "0.8.32" }
-deepseek-state = { path = "../state", version = "0.8.32" }
-deepseek-tools = { path = "../tools", version = "0.8.32" }
+deepseek-agent = { path = "../agent", version = "0.8.33" }
+deepseek-config = { path = "../config", version = "0.8.33" }
+deepseek-core = { path = "../core", version = "0.8.33" }
+deepseek-execpolicy = { path = "../execpolicy", version = "0.8.33" }
+deepseek-hooks = { path = "../hooks", version = "0.8.33" }
+deepseek-mcp = { path = "../mcp", version = "0.8.33" }
+deepseek-protocol = { path = "../protocol", version = "0.8.33" }
+deepseek-state = { path = "../state", version = "0.8.33" }
+deepseek-tools = { path = "../tools", version = "0.8.33" }
 serde.workspace = true
 serde_json.workspace = true
 tokio.workspace = true
@@ -14,13 +14,13 @@ path = "src/main.rs"
 anyhow.workspace = true
 clap.workspace = true
 clap_complete.workspace = true
-deepseek-agent = { path = "../agent", version = "0.8.32" }
-deepseek-app-server = { path = "../app-server", version = "0.8.32" }
-deepseek-config = { path = "../config", version = "0.8.32" }
-deepseek-execpolicy = { path = "../execpolicy", version = "0.8.32" }
-deepseek-mcp = { path = "../mcp", version = "0.8.32" }
-deepseek-secrets = { path = "../secrets", version = "0.8.32" }
-deepseek-state = { path = "../state", version = "0.8.32" }
+deepseek-agent = { path = "../agent", version = "0.8.33" }
+deepseek-app-server = { path = "../app-server", version = "0.8.33" }
+deepseek-config = { path = "../config", version = "0.8.33" }
+deepseek-execpolicy = { path = "../execpolicy", version = "0.8.33" }
+deepseek-mcp = { path = "../mcp", version = "0.8.33" }
+deepseek-secrets = { path = "../secrets", version = "0.8.33" }
+deepseek-state = { path = "../state", version = "0.8.33" }
 chrono.workspace = true
 dirs.workspace = true
 serde.workspace = true
@@ -128,6 +128,15 @@ enum Commands {
    /// Bootstrap MCP config and/or skills directories.
    Setup(TuiPassthroughArgs),
    /// Run the DeepSeek TUI non-interactive agent command.
+    #[command(after_help = "\
+Common forwarded flags:
+  --auto                           Enable agentic mode with tool access
+  --json                           Emit summary JSON
+  --resume <SESSION_ID>            Resume a previous session by ID or prefix
+  --session-id <SESSION_ID>        Resume a previous session by ID or prefix
+  --continue                       Continue the most recent session for this workspace
+  --output-format <FORMAT>         Output format: text or stream-json
+")]
    Exec(TuiPassthroughArgs),
    /// Run a DeepSeek-powered code review over a git diff.
    Review(TuiPassthroughArgs),
@@ -2654,6 +2663,18 @@ mod tests {
                ],
            ),
            ("sandbox", vec!["check"]),
+            (
+                "exec",
+                vec![
+                    "--auto",
+                    "--json",
+                    "--resume",
+                    "--session-id",
+                    "--continue",
+                    "--output-format",
+                    "stream-json",
+                ],
+            ),
            (
                "app-server",
                vec!["--host", "--port", "--config", "--stdio"],
@@ -8,7 +8,7 @@ description = "Config schema and precedence model for DeepSeek workspace archite

 [dependencies]
 anyhow.workspace = true
-deepseek-secrets = { path = "../secrets", version = "0.8.32" }
+deepseek-secrets = { path = "../secrets", version = "0.8.33" }
 dirs.workspace = true
 serde.workspace = true
 toml.workspace = true
@@ -914,26 +914,6 @@ impl ConfigToml {
        let root_deepseek_model = (provider == ProviderKind::Deepseek)
            .then(|| self.default_text_model.clone())
            .flatten();
-        // CLI flag wins outright. Otherwise: config-file → injected secrets/env.
-        // This makes `deepseek auth set` a reliable fix even when the user's
-        // shell still exports an old key. When the file is empty, the injected
-        // secrets façade recovers configured secret-store credentials before
-        // falling back to ambient env.
-        let from_file = provider_cfg.api_key.clone().or(root_deepseek_api_key);
-        let (api_key, api_key_source) = if let Some(value) = cli.api_key.clone() {
-            (Some(value), Some(RuntimeApiKeySource::Cli))
-        } else if let Some(value) = from_file.clone().filter(|v| !v.trim().is_empty()) {
-            (Some(value), Some(RuntimeApiKeySource::ConfigFile))
-        } else if let Some((value, source)) = secrets.resolve_with_source(provider.as_str()) {
-            let source = match source {
-                SecretSource::Keyring => RuntimeApiKeySource::Keyring,
-                SecretSource::Env => RuntimeApiKeySource::Env,
-            };
-            (Some(value), Some(source))
-        } else {
-            (None, None)
-        };
-
        let base_url = cli
            .base_url
            .clone()
@@ -952,6 +932,38 @@ impl ConfigToml {
                ProviderKind::Vllm => DEFAULT_VLLM_BASE_URL.to_string(),
                ProviderKind::Ollama => DEFAULT_OLLAMA_BASE_URL.to_string(),
            });
+        let auth_mode = cli
+            .auth_mode
+            .clone()
+            .or_else(|| env.auth_mode.clone())
+            .or_else(|| self.auth_mode.clone());
+        // CLI flag wins outright. Otherwise: config-file → injected secrets/env.
+        // This makes `deepseek auth set` a reliable fix even when the user's
+        // shell still exports an old key. When the file is empty, the injected
+        // secrets façade recovers configured secret-store credentials before
+        // falling back to ambient env.
+        let from_file = provider_cfg.api_key.clone().or(root_deepseek_api_key);
+        let (api_key, api_key_source) = if let Some(value) = cli.api_key.clone() {
+            (Some(value), Some(RuntimeApiKeySource::Cli))
+        } else if let Some(value) = from_file.clone().filter(|v| !v.trim().is_empty()) {
+            (Some(value), Some(RuntimeApiKeySource::ConfigFile))
+        } else if should_skip_secret_store_for_provider(provider, &base_url, auth_mode.as_deref()) {
+            match deepseek_secrets::env_for(provider.as_str()) {
+                Some(value) => (Some(value), Some(RuntimeApiKeySource::Env)),
+                None => (None, None),
+            }
+        } else {
+            match secrets.resolve_with_source(provider.as_str()) {
+                Some((value, source)) => {
+                    let source = match source {
+                        SecretSource::Keyring => RuntimeApiKeySource::Keyring,
+                        SecretSource::Env => RuntimeApiKeySource::Env,
+                    };
+                    (Some(value), Some(source))
+                }
+                None => (None, None),
+            }
+        };

        let explicit_model = cli.model.is_some()
            || env.model.is_some()
@@ -985,11 +997,6 @@ impl ConfigToml {
            .clone()
            .or_else(|| env.output_mode.clone())
            .or_else(|| self.output_mode.clone());
-        let auth_mode = cli
-            .auth_mode
-            .clone()
-            .or_else(|| env.auth_mode.clone())
-            .or_else(|| self.auth_mode.clone());
        let log_level = cli
            .log_level
            .clone()
@@ -1150,6 +1157,75 @@ fn provider_preserves_custom_base_url_model(provider: ProviderKind, base_url: &s
    base_url_is_custom_for_provider(provider, base_url)
 }

+fn should_skip_secret_store_for_provider(
+    provider: ProviderKind,
+    base_url: &str,
+    auth_mode: Option<&str>,
+) -> bool {
+    if auth_mode_requires_api_key(auth_mode) {
+        return false;
+    }
+    if auth_mode_disables_api_key(auth_mode) {
+        return true;
+    }
+
+    matches!(
+        provider,
+        ProviderKind::Sglang | ProviderKind::Vllm | ProviderKind::Ollama
+    ) || base_url_uses_local_host(base_url)
+}
+
+fn auth_mode_requires_api_key(auth_mode: Option<&str>) -> bool {
+    matches!(
+        auth_mode
+            .map(str::trim)
+            .filter(|value| !value.is_empty())
+            .map(|value| value.to_ascii_lowercase()),
+        Some(value)
+            if matches!(
+                value.as_str(),
+                "api_key" | "api-key" | "apikey" | "bearer" | "bearer-token"
+            )
+    )
+}
+
+fn auth_mode_disables_api_key(auth_mode: Option<&str>) -> bool {
+    matches!(
+        auth_mode
+            .map(str::trim)
+            .filter(|value| !value.is_empty())
+            .map(|value| value.to_ascii_lowercase()),
+        Some(value)
+            if matches!(
+                value.as_str(),
+                "none" | "off" | "disabled" | "no_auth" | "no-auth" | "anonymous"
+            )
+    )
+}
+
+fn base_url_uses_local_host(base_url: &str) -> bool {
+    let Some(host) = base_url_host(base_url) else {
+        return false;
+    };
+    let host = host.trim_matches(['[', ']']).to_ascii_lowercase();
+    if matches!(host.as_str(), "localhost" | "0.0.0.0") {
+        return true;
+    }
+    host.parse::<std::net::IpAddr>()
+        .is_ok_and(|addr| addr.is_loopback() || addr.is_unspecified())
+}
+
+fn base_url_host(base_url: &str) -> Option<&str> {
+    let without_scheme = base_url
+        .split_once("://")
+        .map_or(base_url, |(_, rest)| rest);
+    let authority = without_scheme.split('/').next()?.rsplit('@').next()?;
+    if let Some(rest) = authority.strip_prefix('[') {
+        return rest.split_once(']').map(|(host, _)| host);
+    }
+    authority.split(':').next().filter(|host| !host.is_empty())
+}
+
 #[derive(Debug, Clone, Default)]
 pub struct CliRuntimeOverrides {
    pub provider: Option<ProviderKind>,
@@ -1508,6 +1584,7 @@ mod tests {
    use super::*;
    use std::env;
    use std::ffi::OsString;
+    use std::sync::Arc;
    use std::sync::{Mutex, OnceLock};

    fn env_lock() -> std::sync::MutexGuard<'static, ()> {
@@ -1536,6 +1613,7 @@ mod tests {
        deepseek_http_headers: Option<OsString>,
        deepseek_model: Option<OsString>,
        deepseek_provider: Option<OsString>,
+        deepseek_auth_mode: Option<OsString>,
        nvidia_api_key: Option<OsString>,
        nvidia_nim_api_key: Option<OsString>,
        nim_base_url: Option<OsString>,
@@ -1563,6 +1641,7 @@ mod tests {
                deepseek_http_headers: env::var_os("DEEPSEEK_HTTP_HEADERS"),
                deepseek_model: env::var_os("DEEPSEEK_MODEL"),
                deepseek_provider: env::var_os("DEEPSEEK_PROVIDER"),
+                deepseek_auth_mode: env::var_os("DEEPSEEK_AUTH_MODE"),
                nvidia_api_key: env::var_os("NVIDIA_API_KEY"),
                nvidia_nim_api_key: env::var_os("NVIDIA_NIM_API_KEY"),
                nim_base_url: env::var_os("NIM_BASE_URL"),
@@ -1588,6 +1667,7 @@ mod tests {
                env::remove_var("DEEPSEEK_HTTP_HEADERS");
                env::remove_var("DEEPSEEK_MODEL");
                env::remove_var("DEEPSEEK_PROVIDER");
+                env::remove_var("DEEPSEEK_AUTH_MODE");
                env::remove_var("NVIDIA_API_KEY");
                env::remove_var("NVIDIA_NIM_API_KEY");
                env::remove_var("NIM_BASE_URL");
@@ -1627,6 +1707,7 @@ mod tests {
                Self::restore_var("DEEPSEEK_HTTP_HEADERS", self.deepseek_http_headers.take());
                Self::restore_var("DEEPSEEK_MODEL", self.deepseek_model.take());
                Self::restore_var("DEEPSEEK_PROVIDER", self.deepseek_provider.take());
+                Self::restore_var("DEEPSEEK_AUTH_MODE", self.deepseek_auth_mode.take());
                Self::restore_var("NVIDIA_API_KEY", self.nvidia_api_key.take());
                Self::restore_var("NVIDIA_NIM_API_KEY", self.nvidia_nim_api_key.take());
                Self::restore_var("NIM_BASE_URL", self.nim_base_url.take());
@@ -1648,6 +1729,39 @@ mod tests {
        }
    }

+    struct RecordingSecretsStore {
+        gets: Mutex<Vec<String>>,
+        value: Option<String>,
+    }
+
+    impl RecordingSecretsStore {
+        fn with_value(value: &str) -> Self {
+            Self {
+                gets: Mutex::new(Vec::new()),
+                value: Some(value.to_string()),
+            }
+        }
+    }
+
+    impl deepseek_secrets::KeyringStore for RecordingSecretsStore {
+        fn get(&self, key: &str) -> Result<Option<String>, deepseek_secrets::SecretsError> {
+            self.gets.lock().unwrap().push(key.to_string());
+            Ok(self.value.clone())
+        }
+
+        fn set(&self, _key: &str, _value: &str) -> Result<(), deepseek_secrets::SecretsError> {
+            Ok(())
+        }
+
+        fn delete(&self, _key: &str) -> Result<(), deepseek_secrets::SecretsError> {
+            Ok(())
+        }
+
+        fn backend_name(&self) -> &'static str {
+            "recording"
+        }
+    }
+
    #[test]
    fn root_deepseek_fields_are_runtime_fallbacks() {
        let _lock = env_lock();
@@ -2114,6 +2228,78 @@ mod tests {
        assert_eq!(resolved.api_key, None);
    }

+    #[test]
+    fn self_hosted_providers_do_not_probe_secret_store_by_default() {
+        let _lock = env_lock();
+        let _env = EnvGuard::without_deepseek_runtime_overrides();
+        let store = Arc::new(RecordingSecretsStore::with_value("secret-store-key"));
+        let secrets = Secrets::new(store.clone());
+
+        for provider in [
+            ProviderKind::Sglang,
+            ProviderKind::Vllm,
+            ProviderKind::Ollama,
+        ] {
+            let config = ConfigToml {
+                provider,
+                ..ConfigToml::default()
+            };
+
+            let resolved = config
+                .resolve_runtime_options_with_secrets(&CliRuntimeOverrides::default(), &secrets);
+
+            assert_eq!(resolved.provider, provider);
+            assert_eq!(resolved.api_key, None);
+        }
+
+        assert!(
+            store.gets.lock().unwrap().is_empty(),
+            "self-hosted providers should not read the secret store by default"
+        );
+    }
+
+    #[test]
+    fn self_hosted_api_key_auth_can_use_secret_store_when_requested() {
+        let _lock = env_lock();
+        let _env = EnvGuard::without_deepseek_runtime_overrides();
+        let store = Arc::new(RecordingSecretsStore::with_value("secret-store-key"));
+        let secrets = Secrets::new(store.clone());
+        let config = ConfigToml {
+            provider: ProviderKind::Ollama,
+            auth_mode: Some("api_key".to_string()),
+            ..ConfigToml::default()
+        };
+
+        let resolved =
+            config.resolve_runtime_options_with_secrets(&CliRuntimeOverrides::default(), &secrets);
+
+        assert_eq!(resolved.api_key.as_deref(), Some("secret-store-key"));
+        assert_eq!(store.gets.lock().unwrap().as_slice(), ["ollama"]);
+    }
+
+    #[test]
+    fn loopback_custom_deepseek_base_url_does_not_probe_secret_store_by_default() {
+        let _lock = env_lock();
+        let _env = EnvGuard::without_deepseek_runtime_overrides();
+        let store = Arc::new(RecordingSecretsStore::with_value("stale-deepseek-key"));
+        let secrets = Secrets::new(store.clone());
+        let config = ConfigToml {
+            base_url: Some("http://127.0.0.1:8000/v1".to_string()),
+            ..ConfigToml::default()
+        };
+
+        let resolved =
+            config.resolve_runtime_options_with_secrets(&CliRuntimeOverrides::default(), &secrets);
+
+        assert_eq!(resolved.provider, ProviderKind::Deepseek);
+        assert_eq!(resolved.base_url, "http://127.0.0.1:8000/v1");
+        assert_eq!(resolved.api_key, None);
+        assert!(
+            store.gets.lock().unwrap().is_empty(),
+            "loopback custom endpoints should not read macOS Keychain or any secret store"
+        );
+    }
+
    #[test]
    fn ollama_provider_preserves_model_tags() {
        let _lock = env_lock();
@@ -9,13 +9,13 @@ description = "Core runtime boundaries for DeepSeek workspace architecture"
 [dependencies]
 anyhow.workspace = true
 chrono.workspace = true
-deepseek-agent = { path = "../agent", version = "0.8.32" }
-deepseek-config = { path = "../config", version = "0.8.32" }
-deepseek-execpolicy = { path = "../execpolicy", version = "0.8.32" }
-deepseek-hooks = { path = "../hooks", version = "0.8.32" }
-deepseek-mcp = { path = "../mcp", version = "0.8.32" }
-deepseek-protocol = { path = "../protocol", version = "0.8.32" }
-deepseek-state = { path = "../state", version = "0.8.32" }
-deepseek-tools = { path = "../tools", version = "0.8.32" }
+deepseek-agent = { path = "../agent", version = "0.8.33" }
+deepseek-config = { path = "../config", version = "0.8.33" }
+deepseek-execpolicy = { path = "../execpolicy", version = "0.8.33" }
+deepseek-hooks = { path = "../hooks", version = "0.8.33" }
+deepseek-mcp = { path = "../mcp", version = "0.8.33" }
+deepseek-protocol = { path = "../protocol", version = "0.8.33" }
+deepseek-state = { path = "../state", version = "0.8.33" }
+deepseek-tools = { path = "../tools", version = "0.8.33" }
 serde_json.workspace = true
 uuid.workspace = true
@@ -8,5 +8,5 @@ description = "Execution policy and approval model parity for DeepSeek workspace

 [dependencies]
 anyhow.workspace = true
-deepseek-protocol = { path = "../protocol", version = "0.8.32" }
+deepseek-protocol = { path = "../protocol", version = "0.8.33" }
 serde.workspace = true
@@ -10,7 +10,7 @@ description = "Hook dispatch and notifications parity for DeepSeek workspace arc
 anyhow.workspace = true
 async-trait.workspace = true
 chrono.workspace = true
-deepseek-protocol = { path = "../protocol", version = "0.8.32" }
+deepseek-protocol = { path = "../protocol", version = "0.8.33" }
 reqwest.workspace = true
 serde.workspace = true
 serde_json.workspace = true
@@ -9,7 +9,7 @@ description = "Tool invocation lifecycle, schema validation, and scheduler paral
 [dependencies]
 anyhow.workspace = true
 async-trait.workspace = true
-deepseek-protocol = { path = "../protocol", version = "0.8.32" }
+deepseek-protocol = { path = "../protocol", version = "0.8.33" }
 serde.workspace = true
 serde_json.workspace = true
 tokio.workspace = true
@@ -7,6 +7,164 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [Unreleased]

+## [0.8.33] - 2026-05-12
+
+A sub-agent and RLM renovation release. The model-facing delegation
+surface is now session-oriented instead of one-shot: RLM work happens
+through `rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close`,
+sub-agent work happens through `agent_open` / `agent_eval` /
+`agent_close`, and large outputs can be parked behind typed handles
+that the model reads back explicitly with `handle_read`.
+
+### Added
+
+- **Persistent RLM sessions with bounded REPL helpers.** RLM prompts now
+  use `peek`, `search`, `chunk`, `context_meta`, `sub_query`,
+  `sub_query_batch`, `sub_query_map`, `sub_rlm`, and
+  `finalize(value, confidence)` instead of exposing the full parent
+  context as an ambient variable.
+- **Fork-aware sub-agent sessions.** `agent_open` supports named
+  sessions, `fork_context`, and bounded recursive depth so the parent can
+  ask for multiple perspectives while preserving prompt-cache-friendly
+  prefix context where available.
+- **Shared `handle_read` storage.** RLM finals, sub-agent transcripts,
+  and other large structured results can return `var_handle` references
+  with slice, range, count, and JSONPath projections.
+- **Slash-command routing for the new surface.** `/rlm [N] ...` and
+  `/agent [N] ...` now prompt the assistant to use the persistent tools
+  instead of the removed foreground RLM operation.
+- **Harness-friendly non-interactive exec sessions.** `deepseek exec`
+  now supports `--resume`, `--session-id`, `--continue`, and
+  `--output-format stream-json` so backend wrappers such as ClawBench can
+  keep conversation state and parse one JSON event per line without running
+  a long-lived server.
+- **`/relay` slash command with CJK aliases** (`/接力`). Hands the
+  assistant a structured handoff prompt for coordinated multi-turn
+  continuation across sessions.
+- **`checklist_write` sidebar rename.** The sidebar focus tab formerly
+  known as "Plan" / "Todos" is now "Work" — one panel for the active
+  checklist and optional plan, consistent across all three modes.
+- **Grayscale theme.** `/theme grayscale` and
+  `/set theme grayscale --save` provide a low-opinion black/white palette
+  for users who want less brand color in the terminal.
+
+### Changed
+
+- **Prompts and docs now teach only the new tool names.** Legacy
+  RLM/sub-agent helpers remain internally where needed for durable
+  transcript compatibility, but the registry exposes the session tools.
+- **Large or noisy tool results are easier to keep out of context.**
+  Tool output summaries, sub-agent results, and transcript snapshots now
+  point the model toward `handle_read` when it needs raw detail.
+- **Tool-surface smoke guidance is explicit.** Release checks now document
+  the exact version commands and registry-name searches for `handle_read`,
+  persistent RLM tools, and persistent sub-agent tools.
+- **README acknowledgements expanded.** The project thanks OpenWarp and
+  Open Design for support and collaboration around terminal-agent and
+  design-forward workflows.
+- **Light theme tuned for calmer contrast.** The canvas, panel, elevated,
+  border, and selection tokens now separate surfaces without the washed-out
+  white-on-white feel.
+- **Session picker is history-first.** `/sessions` and `Ctrl+R` now show
+  the full selected session history on the left with the session list on
+  the right; number keys `1`-`9` open visible session histories, `PgUp` /
+  `PgDn` scroll that history, and `Enter` still resumes.
+- **Foreground RLM operation removed.** The old `Op::Rlm` path and its
+  `handle_rlm` engine method are gone; all RLM work now flows through
+  the persistent-session tools.
+- **Stale competitive-analysis doc removed.** The old cross-agent matrix
+  had become an unreliable inventory of tool names rather than useful
+  release guidance.
+
+### Fixed
+
+- **Local/custom endpoints stay prompt-free when auth is optional.**
+  The dispatcher no longer reads the secret store for SGLang, vLLM,
+  Ollama, or loopback custom URLs unless API-key auth is explicitly
+  requested, and the direct TUI treats loopback model endpoints as
+  no-key by default. This avoids macOS Keychain prompts and stale
+  DeepSeek keys when users point the app at local OpenAI-compatible
+  servers.
+- **Transcript browsing stays put across resizes.** If the user is reading
+  older chat history, terminal resize events preserve the current transcript
+  position instead of jumping back to the live tail; the scrollbar and
+  jump-to-latest affordance now follow the active theme.
+- **Backtrack preview opens near the selected turn.** Pressing Esc twice no
+  longer opens the live transcript preview at the oldest conversation line;
+  the highlighted recent user turn is pinned into view, and changing the
+  backtrack target re-pins only that selection.
+- **Completed thinking no longer masquerades as prompt text.** Collapsed
+  completed reasoning now shows only explicit `Summary:` content inline; raw
+  reasoning remains available through Ctrl+O/transcript instead of appearing
+  as assistant self-talk in the main flow. When Ctrl+O starts from a reasoning
+  block, it opens a full-session reasoning timeline instead of a single
+  isolated chunk.
+- **Transcript selection keeps working while the agent is streaming.**
+  The loading-state mouse filter now drops inert move events but allows
+  active transcript and scrollbar drags to continue (reported as a known
+  issue in v0.8.32).
+- **Empty-composer arrow scrolling feels less twitchy.** When configured to
+  scroll the transcript, plain Up/Down now move by a small wheel-like step
+  instead of a single-line flick.
+- **Mouse and trackpad scrolling feel less sticky in long logs.** Rapid
+  same-direction transcript scrolls now get bounded acceleration while
+  direction changes reset to precise single-line movement.
+- **RLM smoke-test papercuts fixed.** `rlm_eval` now binds `content` as a
+  convenience alias for `_context`, tolerates common `timeout_secs` keyword
+  guesses on child-query helpers while preserving session-level timeout
+  policy, and stores JSON-serializable `finalize(...)` values as JSON handles
+  so `handle_read` can project them directly.
+- **RLM REPL uses the shared Python resolver.** RLM startup now tries
+  `python3`, `python`, and `py -3`, matching the dependency resolver used by
+  code execution and avoiding Windows failures where `python3` is absent
+  (harvested from PR #1540).
+- **Session titles and history previews hide metadata noise.** Saved
+  session titles and the picker history strip leading `<turn_meta>` envelopes
+  and thinking-tag blocks so historical conversations read like user-visible
+  chat rather than prompt plumbing (harvested from PR #1510).
+- **Companion binary version smoke is unambiguous.** `deepseek-tui --version`
+  now reports the `deepseek-tui` binary name instead of the dispatcher label.
+- **Vision path boundary test is platform-native.** The absolute-path
+  rejection smoke uses a Windows absolute path on Windows and `/etc/hosts`
+  elsewhere (harvested from PR #1526).
+- **Tool papercuts:** `file_search` has safer default excludes and an
+  explicit `exclude` option; `grep_files` returns single-line context as
+  strings; `fetch_url` can project JSON fields and returns headers;
+  `edit_file` can opt into leading-indentation fuzz; `exec_shell` can
+  merge stdout/stderr in chronological order; `revert_turn` rejects
+  no-op snapshot boundaries.
+- **CLI reasoning-effort honoured on non-auto exec routes** (PR #1511
+  from **@h3c-hexin**). `deepseek -p "..." --reasoning-effort high` now
+  applies the flag correctly instead of falling back to the config-file
+  default.
+- **Edit-file replacement boundaries clarified** (PR #1516). The tool
+  description and error messages now make it unambiguous that
+  `edit_file` is for one clear replacement in one file.
+- **Pandoc output validated before probing** (PR #1523). Binary-format
+  conversions that produce empty or invalid output now surface a clear
+  error instead of a confusing pandoc stack trace.
+- **Running turns can be steered and repainted** (PR #1533, #1537).
+  Composer input during an active turn no longer stalls; the TUI
+  redraws the transcript as the agent streams.
+- **Tasks and Activity Detail are calmer under load.** The Tasks panel now
+  keeps live/background/recent activity from double-counting the same shell
+  or RLM work, groups repeated read/search/checklist noise, and keeps
+  failures, status, command summaries, and durations visible. Ctrl+O now
+  opens Activity Detail for selected/live/recent tool work and the reasoning
+  timeline for thinking blocks, while Alt+V remains the direct tool-detail
+  pager; the idle footer now advertises that split for the visible activity.
+- **npm retry shows timeout hint on first failure** (PR #1538).
+  Installations behind slow proxies now see a clear "retrying" message
+  instead of a silent hang.
+- **Issue templates improved** (PR #1525 from **@reidliu41**). Bug and
+  feature-request templates are clearer and easier for new contributors.
+
+### Credits
+
+Thanks to **@reidliu41** (#1525/#1526), **@h3c-hexin** (#1511),
+**@xulongzhe** (#1530/#1544), **@tyouter** (#1510), and
+**@Duducoco** (#1540) for community contributions in this release.
+
 ## [0.8.32] - 2026-05-12

 A "more useful tools" release. v0.8.31 made the tool surface
@@ -3821,7 +3979,8 @@ Welcome — and thank you.
 - Hooks system and config profiles
 - Example skills and launch assets

-[Unreleased]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.32...HEAD
+[Unreleased]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.33...HEAD
+[0.8.33]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.32...v0.8.33
 [0.8.32]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.31...v0.8.32
 [0.8.31]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.30...v0.8.31
 [0.8.30]: https://github.com/Hmbown/DeepSeek-TUI/compare/v0.8.29...v0.8.30
@@ -21,8 +21,8 @@ path = "src/main.rs"
 [dependencies]
 anyhow = "1.0.100"
 arboard = "3.4"
-deepseek-secrets = { path = "../secrets", version = "0.8.32" }
-deepseek-tools = { path = "../tools", version = "0.8.32" }
+deepseek-secrets = { path = "../secrets", version = "0.8.33" }
+deepseek-tools = { path = "../tools", version = "0.8.33" }
 schemaui = { version = "0.12.0", default-features = false, optional = true }
 async-stream = "0.3.6"
 async-trait = "0.1"
@@ -1752,7 +1752,7 @@ mod tests {
            ],
            max_tokens: 1024,
            system: Some(SystemPrompt::Text(
-                "Base policy\n\n<project_instructions source=\"AGENTS.md\">\nStable project rules\n</project_instructions>\n\n## Previous Session Handoff\n\nDynamic handoff"
+                "Base policy\n\n<project_instructions source=\"AGENTS.md\">\nStable project rules\n</project_instructions>\n\n## Previous Session Relay\n\nDynamic relay"
                    .to_string(),
            )),
            tools: None,
@@ -1788,7 +1788,7 @@ mod tests {
            .and_then(Value::as_str)
            .expect("warmup system prompt");
        assert!(system.contains("Stable project rules"));
-        assert!(!system.contains("Dynamic handoff"));
+        assert!(!system.contains("Dynamic relay"));
        assert!(
            !wire
                .iter()
@@ -685,7 +685,7 @@ fn split_system_layers(content: &str) -> Vec<(String, PromptLayerStability, &str
        ("Skills", "## Skills"),
        ("Context management", "## Context Management"),
        ("Compact template", "## Compact"),
-        ("Previous session handoff", "## Previous Session Handoff"),
+        ("Previous session relay", "## Previous Session Relay"),
    ];

    let mut starts: Vec<(usize, &str)> = markers
@@ -706,7 +706,7 @@ fn split_system_layers(content: &str) -> Vec<(String, PromptLayerStability, &str

    for (i, (start, name)) in starts.iter().enumerate() {
        let end = starts.get(i + 1).map_or(content.len(), |(idx, _)| *idx);
-        let stability = if *name == "Previous session handoff" {
+        let stability = if *name == "Previous session relay" {
            PromptLayerStability::Dynamic
        } else if is_static_base_layer(name) {
            PromptLayerStability::Static
@@ -120,6 +120,9 @@ fn show_single_setting(app: &App, key: &str) -> CommandResult {
        }
        "approval_mode" | "approval" => Some(app.approval_mode.label().to_string()),
        "locale" | "language" => Some(locale_display(app.ui_locale).to_string()),
+        "theme" | "ui_theme" => {
+            Some(crate::palette::theme_label_for_mode(app.ui_theme.mode).to_string())
+        }
        "background_color" | "background" | "bg" => {
            crate::palette::hex_rgb_string(app.ui_theme.surface_bg)
                .or_else(|| Some("(default)".to_string()))
@@ -418,13 +421,11 @@ pub fn set_config_value(app: &mut App, key: &str, value: &str, persist: bool) ->
            app.ui_locale = resolve_locale(&settings.locale);
            app.needs_redraw = true;
        }
-        "background_color" | "background" | "bg" => {
-            let base_theme = crate::palette::UiTheme::detect();
-            app.ui_theme = settings
-                .background_color
-                .as_deref()
-                .and_then(crate::palette::parse_hex_rgb_color)
-                .map_or(base_theme, |color| base_theme.with_background_color(color));
+        "theme" | "ui_theme" | "background_color" | "background" | "bg" => {
+            app.ui_theme = crate::palette::ui_theme_from_settings(
+                &settings.theme,
+                settings.background_color.as_deref(),
+            );
            app.needs_redraw = true;
        }
        "cost_currency" | "currency" => {
@@ -487,6 +488,7 @@ pub fn set_config_value(app: &mut App, key: &str, value: &str, persist: bool) ->
    let display_value = match key.as_str() {
        "default_mode" | "mode" => settings.default_mode.clone(),
        "cost_currency" | "currency" => settings.cost_currency.clone(),
+        "theme" | "ui_theme" => settings.theme.clone(),
        "background_color" | "background" | "bg" => settings
            .background_color
            .clone()
@@ -580,22 +582,34 @@ fn mode_display_name(mode: AppMode) -> &'static str {
    }
 }

-/// Toggle between dark and light theme.
-pub fn theme(app: &mut App) -> CommandResult {
-    let new_theme = match app.ui_theme.mode {
-        crate::palette::PaletteMode::Dark => {
-            crate::palette::UiTheme::for_mode(crate::palette::PaletteMode::Light)
-        }
-        crate::palette::PaletteMode::Light => {
-            crate::palette::UiTheme::for_mode(crate::palette::PaletteMode::Dark)
+/// Switch the runtime theme. `/set theme <value> --save` persists it.
+pub fn theme(app: &mut App, arg: Option<&str>) -> CommandResult {
+    let requested = match arg.map(str::trim).filter(|value| !value.is_empty()) {
+        Some(value) => {
+            let Some(theme) = crate::palette::normalize_theme_name(value) else {
+                return CommandResult::error("Usage: /theme [dark|light|grayscale|system]");
+            };
+            theme
        }
+        None => match app.ui_theme.mode {
+            crate::palette::PaletteMode::Dark => "light",
+            crate::palette::PaletteMode::Light => "grayscale",
+            crate::palette::PaletteMode::Grayscale => "dark",
+        },
    };
-    app.ui_theme = new_theme;
-    let label = match new_theme.mode {
-        crate::palette::PaletteMode::Dark => "dark",
-        crate::palette::PaletteMode::Light => "light",
-    };
-    CommandResult::message(format!("Theme switched to {label}."))
+
+    let background = Settings::load()
+        .ok()
+        .and_then(|settings| settings.background_color);
+    app.ui_theme = crate::palette::ui_theme_from_settings(requested, background.as_deref());
+    app.needs_redraw = true;
+
+    let label = crate::palette::theme_label_for_mode(app.ui_theme.mode);
+    if requested == "system" {
+        CommandResult::message(format!("Theme switched to system ({label})."))
+    } else {
+        CommandResult::message(format!("Theme switched to {label}."))
+    }
 }

 /// Manage workspace-level trust and the per-path allowlist.
@@ -1178,6 +1192,9 @@ mod tests {
    #[test]
    fn test_mode_yolo_sets_all_flags() {
        let mut app = create_test_app();
+        // Switch to Agent first to guarantee a clean starting state regardless of
+        // user settings on the host machine.
+        let _ = mode(&mut app, Some("agent"));
        let result = mode(&mut app, Some("yolo"));
        assert!(result.message.unwrap().contains("Switched to YOLO mode"));
        assert!(app.allow_shell);
@@ -1443,6 +1460,54 @@ mod tests {
        assert!(saved.contains("cost_currency = \"cny\""));
    }

+    #[test]
+    fn theme_command_accepts_grayscale_arg() {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap()
+            .as_nanos();
+        let temp_root = env::temp_dir().join(format!(
+            "deepseek-tui-theme-command-test-{}-{}",
+            std::process::id(),
+            nanos
+        ));
+        fs::create_dir_all(&temp_root).unwrap();
+        let _guard = EnvGuard::new(&temp_root);
+
+        let mut app = create_test_app();
+        let result = theme(&mut app, Some("grayscale"));
+
+        assert_eq!(result.message.unwrap(), "Theme switched to grayscale.");
+        assert_eq!(app.ui_theme.mode, crate::palette::PaletteMode::Grayscale);
+        assert!(app.needs_redraw);
+    }
+
+    #[test]
+    fn set_theme_save_updates_live_app_and_persists() {
+        let nanos = SystemTime::now()
+            .duration_since(UNIX_EPOCH)
+            .unwrap()
+            .as_nanos();
+        let temp_root = env::temp_dir().join(format!(
+            "deepseek-tui-theme-save-test-{}-{}",
+            std::process::id(),
+            nanos
+        ));
+        fs::create_dir_all(&temp_root).unwrap();
+        let _guard = EnvGuard::new(&temp_root);
+
+        let mut app = create_test_app();
+        let result = set_config(&mut app, Some("theme grayscale --save"));
+        let msg = result.message.unwrap();
+
+        assert_eq!(msg, "theme = grayscale (saved)");
+        assert_eq!(app.ui_theme.mode, crate::palette::PaletteMode::Grayscale);
+
+        let settings_path = Settings::path().unwrap();
+        let saved = fs::read_to_string(settings_path).unwrap();
+        assert!(saved.contains("theme = \"grayscale\""));
+    }
+
    #[test]
    fn test_set_approval_mode_valid_values() {
        let mut app = create_test_app();
@@ -1497,7 +1562,8 @@ mod tests {
    #[test]
    fn test_trust_on_enables_flag() {
        let mut app = create_test_app();
-        assert!(!app.trust_mode);
+        // Normalize trust state regardless of user settings on the host machine.
+        app.trust_mode = false;
        let result = trust(&mut app, Some("on"));
        let msg = result.message.expect("message");
        assert!(msg.contains("Workspace trust mode enabled"));
@@ -545,6 +545,8 @@ mod tests {
    #[test]
    fn model_reset_same_model_keeps_turn_cache_history() {
        let mut app = create_test_app();
+        app.auto_model = false;
+        app.model = "deepseek-v4-pro".to_string();
        app.push_turn_cache_record(TurnCacheRecord {
            input_tokens: 100,
            output_tokens: 25,
@@ -32,6 +32,8 @@ mod status;
 mod task;
 mod user_commands;

+use std::fmt::Write as _;
+
 use crate::localization::{Locale, MessageId, tr};
 use crate::tui::app::{App, AppAction};

@@ -204,6 +206,12 @@ pub const COMMANDS: &[CommandInfo] = &[
        usage: "/subagents",
        description_id: MessageId::CmdSubagentsDescription,
    },
+    CommandInfo {
+        name: "agent",
+        aliases: &[],
+        usage: "/agent [N] <task>",
+        description_id: MessageId::CmdAgentDescription,
+    },
    CommandInfo {
        name: "links",
        aliases: &["dashboard", "api"],
@@ -295,6 +303,12 @@ pub const COMMANDS: &[CommandInfo] = &[
        usage: "/compact",
        description_id: MessageId::CmdCompactDescription,
    },
+    CommandInfo {
+        name: "relay",
+        aliases: &["batonpass", "接力"],
+        usage: "/relay [focus]",
+        description_id: MessageId::CmdRelayDescription,
+    },
    CommandInfo {
        name: "context",
        aliases: &["ctx"],
@@ -341,7 +355,7 @@ pub const COMMANDS: &[CommandInfo] = &[
    CommandInfo {
        name: "theme",
        aliases: &[],
-        usage: "/theme",
+        usage: "/theme [dark|light|grayscale|system]",
        description_id: MessageId::CmdThemeDescription,
    },
    CommandInfo {
@@ -482,7 +496,7 @@ pub const COMMANDS: &[CommandInfo] = &[
    CommandInfo {
        name: "rlm",
        aliases: &["recursive"],
-        usage: "/rlm <prompt>",
+        usage: "/rlm [N] <file_or_text>",
        description_id: MessageId::CmdRlmDescription,
    },
    // Debug/cost command
@@ -534,6 +548,7 @@ pub fn execute(cmd: &str, app: &mut App) -> CommandResult {
        "stash" | "park" => stash::stash(app, arg),
        "hooks" | "hook" => hooks::hooks(app, arg),
        "subagents" | "agents" => core::subagents(app),
+        "agent" => agent(app, arg),
        "links" | "dashboard" | "api" => core::deepseek_links(app),
        "feedback" => feedback::feedback(app, arg),
        "home" | "stats" | "overview" => core::home_dashboard(app),
@@ -551,6 +566,7 @@ pub fn execute(cmd: &str, app: &mut App) -> CommandResult {
        "sessions" | "resume" => session::sessions(app, arg),
        "load" => session::load(app, arg),
        "compact" => session::compact(app),
+        "relay" | "batonpass" | "接力" => relay(app, arg),
        "cycles" => cycle::list_cycles(app),
        "cycle" => cycle::show_cycle(app, arg),
        "recall" => cycle::recall_archive(app, arg),
@@ -562,7 +578,7 @@ pub fn execute(cmd: &str, app: &mut App) -> CommandResult {
        "status" => status::status(app),
        "statusline" => config::status_line(app),
        "mode" => config::mode(app, arg),
-        "theme" => config::theme(app),
+        "theme" => config::theme(app, arg),
        "verbose" => config::verbose(app, arg),
        "trust" => config::trust(app, arg),
        "logout" => config::logout(app),
@@ -686,49 +702,220 @@ pub use config::{
 /// in the REPL as the `PROMPT` variable. The root LLM will only see
 /// metadata about the REPL state, never the prompt text directly.
 pub fn rlm(app: &mut App, arg: Option<&str>) -> CommandResult {
-    let prompt = match arg {
+    let (max_depth, target) = match parse_depth_prefixed_arg(arg, 1) {
+        Ok(parsed) => parsed,
+        Err(message) => return CommandResult::error(message),
+    };
+    let target = match target {
        Some(p) if !p.trim().is_empty() => p.trim().to_string(),
        _ => {
            return CommandResult::error(
-                "Usage: /rlm <prompt>\n\n\
-                 Process a prompt using a Recursive Language Model (RLM).\n\
-                 The prompt is stored in a REPL and the model writes code\n\
-                 to decompose and process it recursively."
+                "Usage: /rlm [N] <file_or_text>\n\n\
+                 Opens a persistent RLM context with sub_rlm depth N (0-3, default 1)."
                    .to_string(),
            );
        }
    };

-    // Sanity-check: RLM is most useful for longer prompts.
-    if prompt.len() < 50 {
-        return CommandResult::message(
-            "Tip: RLM is designed for processing LONG prompts (>100 chars). \
-             For short queries, just type the message directly."
-                .to_string(),
+    let source_arg = if resolves_to_existing_file(app, &target) {
+        format!(r#"file_path: "{target}""#)
+    } else {
+        format!("content: {:?}", target)
+    };
+    let message = format!(
+        "Open and use a persistent RLM session for this request. Call `rlm_open` with name `slash_rlm` and {source_arg}. Then call `rlm_configure` with `sub_rlm_max_depth: {max_depth}`. Use `rlm_eval` to inspect the context through `peek`, `search`, and `chunk`, and call `finalize(...)` from the REPL when ready. If a `var_handle` is returned, use `handle_read` for bounded slices or projections before answering."
+    );
+
+    CommandResult::with_message_and_action(
+        format!("Opening persistent RLM context at depth {max_depth}..."),
+        AppAction::SendMessage(message),
+    )
+}
+
+/// Open a persistent sub-agent session from a slash command.
+pub fn agent(_app: &mut App, arg: Option<&str>) -> CommandResult {
+    let (max_depth, task) = match parse_depth_prefixed_arg(arg, 1) {
+        Ok(parsed) => parsed,
+        Err(message) => return CommandResult::error(message),
+    };
+    let task = match task {
+        Some(task) if !task.trim().is_empty() => task.trim().to_string(),
+        _ => {
+            return CommandResult::error(
+                "Usage: /agent [N] <task>\n\n\
+                 Opens a persistent sub-agent session with recursive agent depth N (0-3, default 1).",
+            );
+        }
+    };
+    let message = format!(
+        "Open a persistent sub-agent session for this task. Call `agent_open` with name `slash_agent`, `prompt: {:?}`, and `max_depth: {max_depth}`. Use `agent_eval` to wait for the next terminal/current projection and `handle_read` on the returned transcript_handle if you need more detail. Verify any claimed side effects before reporting success.",
+        task
+    );
+    CommandResult::with_message_and_action(
+        format!("Opening persistent sub-agent at depth {max_depth}..."),
+        AppAction::SendMessage(message),
+    )
+}
+
+/// Ask the active model to write a compact relay artifact for the next thread.
+///
+/// The visible command is `/relay` (with `/接力` for Chinese users), but the
+/// durable file path remains `.deepseek/handoff.md` for compatibility with
+/// existing sessions and startup prompt loading.
+pub fn relay(app: &mut App, arg: Option<&str>) -> CommandResult {
+    let focus = arg.map(str::trim).filter(|value| !value.is_empty());
+    let message = build_relay_instruction(app, focus);
+    CommandResult::with_message_and_action(
+        "Preparing session relay at .deepseek/handoff.md...",
+        AppAction::SendMessage(message),
+    )
+}
+
+fn build_relay_instruction(app: &App, focus: Option<&str>) -> String {
+    let mut out = String::new();
+    let _ = writeln!(
+        out,
+        "Create a compact session relay (接力) for a future DeepSeek TUI thread."
+    );
+    let _ = writeln!(out);
+    let _ = writeln!(out, "Write or update `.deepseek/handoff.md`.");
+    let _ = writeln!(
+        out,
+        "Keep the existing file path for compatibility, but title the artifact `# Session relay`."
+    );
+    let _ = writeln!(out);
+    let _ = writeln!(out, "Current session snapshot:");
+    let _ = writeln!(out, "- Workspace: {}", app.workspace.display());
+    let _ = writeln!(out, "- Mode: {}", app.mode.label());
+    let _ = writeln!(out, "- Model: {}", app.model_display_label());
+    if let Some(focus) = focus {
+        let _ = writeln!(out, "- Requested relay focus: {focus}");
+    }
+    if let Some(goal) = app.goal.goal_objective.as_deref() {
+        let _ = writeln!(out, "- Goal: {goal}");
+    }
+    if let Some(budget) = app.goal.goal_token_budget {
+        let _ = writeln!(out, "- Goal token budget: {budget}");
+    }
+    if app.cycle_count > 0 {
+        let _ = writeln!(out, "- Cycle count: {}", app.cycle_count);
+    }
+
+    if let Ok(todos) = app.todos.try_lock() {
+        let snapshot = todos.snapshot();
+        if !snapshot.items.is_empty() {
+            let _ = writeln!(
+                out,
+                "\nWork checklist (primary progress surface, {}% complete):",
+                snapshot.completion_pct
+            );
+            for item in snapshot.items {
+                let _ = writeln!(
+                    out,
+                    "- #{} [{}] {}",
+                    item.id,
+                    item.status.as_str(),
+                    item.content
+                );
+            }
+        }
+    } else {
+        let _ = writeln!(
+            out,
+            "\nWork checklist: unavailable because the checklist is busy."
        );
    }

-    let model = app.model.clone();
-    let child_model = "deepseek-v4-flash".to_string();
-    // Paper experiments use depth=1 (one level of `sub_rlm`); we default to
-    // depth=2 so the model can recurse twice if it chooses to.
-    let max_depth: u32 = 2;
+    if let Ok(plan) = app.plan_state.try_lock() {
+        let snapshot = plan.snapshot();
+        if snapshot.explanation.is_some() || !snapshot.items.is_empty() {
+            let _ = writeln!(out, "\nOptional strategy metadata from update_plan:");
+            if let Some(explanation) = snapshot.explanation.as_deref() {
+                let _ = writeln!(out, "- Explanation: {explanation}");
+            }
+            for item in snapshot.items {
+                let _ = writeln!(out, "- [{}] {}", plan_status_label(&item.status), item.step);
+            }
+        }
+    } else {
+        let _ = writeln!(
+            out,
+            "\nStrategy metadata: unavailable because plan state is busy."
+        );
+    }

-    CommandResult::with_message_and_action(
-        format!(
-            "Starting RLM turn for {} chars of prompt using {} (child={}, depth={})...",
-            prompt.len(),
-            model,
-            child_model,
-            max_depth,
-        ),
-        AppAction::Rlm {
-            prompt,
-            model,
-            child_model,
-            max_depth,
-        },
-    )
+    let _ = writeln!(
+        out,
+        "\nBefore writing, inspect the current transcript context and any live tool evidence you need. Do not invent test results, file changes, blockers, or decisions."
+    );
+    let _ = writeln!(
+        out,
+        "\nUse this compact structure:\n\
+         # Session relay\n\
+         \n\
+         ## Goal\n\
+         [the user's objective and any explicit constraints]\n\
+         \n\
+         ## Current work\n\
+         [the active Work checklist item, progress, and what is mid-flight]\n\
+         \n\
+         ## Files and state\n\
+         [changed files, important paths, sub-agents/RLM sessions, commands run]\n\
+         \n\
+         ## Decisions\n\
+         [why key choices were made]\n\
+         \n\
+         ## Verification\n\
+         [what passed, what failed, what was not run]\n\
+         \n\
+         ## Next action\n\
+         [one concrete action for the next thread]"
+    );
+    let _ = writeln!(
+        out,
+        "\nKeep it under about 900 words unless the session genuinely needs more. After writing, report the path and the single next action."
+    );
+    out
+}
+
+fn plan_status_label(status: &crate::tools::plan::StepStatus) -> &'static str {
+    match status {
+        crate::tools::plan::StepStatus::Pending => "pending",
+        crate::tools::plan::StepStatus::InProgress => "in_progress",
+        crate::tools::plan::StepStatus::Completed => "completed",
+    }
+}
+
+fn parse_depth_prefixed_arg(
+    arg: Option<&str>,
+    default_depth: u32,
+) -> Result<(u32, Option<&str>), String> {
+    let Some(raw) = arg.map(str::trim).filter(|raw| !raw.is_empty()) else {
+        return Ok((default_depth, None));
+    };
+    let mut parts = raw.splitn(2, char::is_whitespace);
+    let first = parts.next().unwrap_or_default();
+    if first.chars().all(|ch| ch.is_ascii_digit()) {
+        let depth: u32 = first
+            .parse()
+            .map_err(|_| "Depth must be an integer from 0 to 3".to_string())?;
+        if depth > 3 {
+            return Err("Depth must be between 0 and 3".to_string());
+        }
+        Ok((depth, parts.next().map(str::trim)))
+    } else {
+        Ok((default_depth, Some(raw)))
+    }
+}
+
+fn resolves_to_existing_file(app: &App, input: &str) -> bool {
+    let path = std::path::Path::new(input);
+    let candidate = if path.is_absolute() {
+        path.to_path_buf()
+    } else {
+        app.workspace.join(path)
+    };
+    candidate.is_file()
 }

 /// Get command info by name or alias
@@ -862,6 +1049,8 @@ fn suggest_command_names(input: &str, limit: usize) -> Vec<String> {
 mod tests {
    use super::*;
    use crate::config::Config;
+    use crate::tools::plan::{PlanItemArg, StepStatus, UpdatePlanArgs};
+    use crate::tools::todo::TodoStatus;
    use crate::tui::app::{App, AppAction, TuiOptions};
    use std::ffi::OsString;
    use std::path::{Path, PathBuf};
@@ -910,6 +1099,101 @@ mod tests {
        assert_eq!(links.aliases, &["dashboard", "api"]);
    }

+    #[test]
+    fn rlm_slash_command_routes_to_persistent_tool_instruction() {
+        let mut app = create_test_app();
+        let result = execute("/rlm 2 inspect this long corpus", &mut app);
+        assert!(!result.is_error);
+        assert!(result.message.as_deref().unwrap_or("").contains("depth 2"));
+        let Some(AppAction::SendMessage(message)) = result.action else {
+            panic!("expected SendMessage action");
+        };
+        assert!(message.contains("rlm_open"));
+        assert!(message.contains("rlm_configure"));
+        assert!(message.contains("sub_rlm_max_depth: 2"));
+    }
+
+    #[test]
+    fn agent_slash_command_routes_to_persistent_tool_instruction() {
+        let mut app = create_test_app();
+        let result = execute("/agent 0 inspect the parser", &mut app);
+        assert!(!result.is_error);
+        let Some(AppAction::SendMessage(message)) = result.action else {
+            panic!("expected SendMessage action");
+        };
+        assert!(message.contains("agent_open"));
+        assert!(message.contains("max_depth: 0"));
+    }
+
+    #[test]
+    fn relay_slash_command_routes_to_session_relay_instruction() {
+        let mut app = create_test_app();
+        app.goal.goal_objective = Some("Unify the work surface".to_string());
+        app.goal.goal_token_budget = Some(12_000);
+        app.cycle_count = 2;
+        {
+            let mut todos = app.todos.try_lock().expect("todo lock");
+            todos.add("inspect workspace".to_string(), TodoStatus::Completed);
+            todos.add("patch relay command".to_string(), TodoStatus::InProgress);
+        }
+        {
+            let mut plan = app.plan_state.try_lock().expect("plan lock");
+            plan.update(UpdatePlanArgs {
+                explanation: Some("RLM-style strategy".to_string()),
+                plan: vec![PlanItemArg {
+                    step: "keep checklist primary".to_string(),
+                    status: StepStatus::InProgress,
+                }],
+            });
+        }
+
+        let result = execute("/relay verify install", &mut app);
+        assert!(!result.is_error);
+        assert!(
+            result
+                .message
+                .as_deref()
+                .unwrap_or_default()
+                .contains(".deepseek/handoff.md")
+        );
+        let Some(AppAction::SendMessage(message)) = result.action else {
+            panic!("expected SendMessage action");
+        };
+        assert!(message.contains("session relay"));
+        assert!(message.contains("接力"));
+        assert!(message.contains("Write or update `.deepseek/handoff.md`"));
+        assert!(message.contains("# Session relay"));
+        assert!(message.contains("Requested relay focus: verify install"));
+        assert!(message.contains("Goal: Unify the work surface"));
+        assert!(message.contains("Goal token budget: 12000"));
+        assert!(message.contains("Cycle count: 2"));
+        assert!(message.contains("Work checklist (primary progress surface, 50% complete)"));
+        assert!(message.contains("#1 [completed] inspect workspace"));
+        assert!(message.contains("#2 [in_progress] patch relay command"));
+        assert!(message.contains("Optional strategy metadata from update_plan"));
+        assert!(message.contains("Explanation: RLM-style strategy"));
+        assert!(message.contains("[in_progress] keep checklist primary"));
+    }
+
+    #[test]
+    fn relay_command_has_bilingual_aliases() {
+        let relay = COMMANDS
+            .iter()
+            .find(|cmd| cmd.name == "relay")
+            .expect("relay command should exist");
+        assert_eq!(relay.aliases, &["batonpass", "接力"]);
+        assert!(relay.description_for(Locale::ZhHans).contains("接力"));
+        assert!(relay.description_for(Locale::ZhHant).contains("接力"));
+
+        let mut app = create_test_app();
+        let result = execute("/接力 next hand", &mut app);
+        assert!(!result.is_error);
+        let Some(AppAction::SendMessage(message)) = result.action else {
+            panic!("expected SendMessage action");
+        };
+        assert!(message.contains("Requested relay focus: next hand"));
+    }
+
    #[test]
    fn command_registry_has_unique_names_and_aliases() {
        let mut names = std::collections::BTreeSet::new();
@@ -1525,6 +1525,10 @@ impl Config {
            return Ok(value);
        }

+        if base_url_uses_local_host(&self.deepseek_base_url()) {
+            return Ok(String::new());
+        }
+
        match provider {
            ApiProvider::Deepseek | ApiProvider::DeepseekCN => anyhow::bail!(
                "DeepSeek API key not found.\n\
@@ -2545,6 +2549,29 @@ fn provider_preserves_custom_base_url_model(provider: ApiProvider, base_url: &st
    base_url_is_custom_for_provider(provider, base_url)
 }

+fn base_url_uses_local_host(base_url: &str) -> bool {
+    let Some(host) = base_url_host(base_url) else {
+        return false;
+    };
+    let host = host.trim_matches(['[', ']']).to_ascii_lowercase();
+    if matches!(host.as_str(), "localhost" | "0.0.0.0") {
+        return true;
+    }
+    host.parse::<std::net::IpAddr>()
+        .is_ok_and(|addr| addr.is_loopback() || addr.is_unspecified())
+}
+
+fn base_url_host(base_url: &str) -> Option<&str> {
+    let without_scheme = base_url
+        .split_once("://")
+        .map_or(base_url, |(_, rest)| rest);
+    let authority = without_scheme.split('/').next()?.rsplit('@').next()?;
+    if let Some(rest) = authority.strip_prefix('[') {
+        return rest.split_once(']').map(|(host, _)| host);
+    }
+    authority.split(':').next().filter(|host| !host.is_empty())
+}
+
 fn model_for_provider(provider: ApiProvider, normalized: String) -> String {
    let lowered = normalized.to_ascii_lowercase();
    match (provider, lowered.as_str()) {
@@ -3191,6 +3218,10 @@ pub fn has_api_key_for(config: &Config, provider: ApiProvider) -> bool {
        return true;
    }

+    if provider == config.api_provider() && base_url_uses_local_host(&config.deepseek_base_url()) {
+        return true;
+    }
+
    if config
        .provider_config_for(provider)
        .and_then(|entry| entry.api_key.as_ref())
@@ -4525,6 +4556,20 @@ api_key = "old-openrouter-key"
        assert_eq!(config.deepseek_base_url(), "https://api.deepseek.com");
    }

+    #[test]
+    fn loopback_deepseek_base_url_runs_without_api_key() -> Result<()> {
+        let _lock = lock_test_env();
+        let config = Config {
+            base_url: Some("http://127.0.0.1:8000/v1".to_string()),
+            ..Default::default()
+        };
+
+        assert_eq!(config.api_provider(), ApiProvider::Deepseek);
+        assert!(has_api_key(&config));
+        assert_eq!(config.deepseek_api_key()?, "");
+        Ok(())
+    }
+
    #[test]
    fn deepseek_model_env_overrides_default_text_model() -> Result<()> {
        let _lock = lock_test_env();
@@ -199,8 +199,7 @@ pub enum CostCurrencyValue {
 #[serde(rename_all = "snake_case")]
 pub enum SidebarFocusValue {
    Auto,
-    Plan,
-    Todos,
+    Work,
    Tasks,
    Agents,
    Context,
@@ -724,8 +723,7 @@ impl SidebarFocusValue {
    fn as_setting(self) -> &'static str {
        match self {
            Self::Auto => "auto",
-            Self::Plan => "plan",
-            Self::Todos => "todos",
+            Self::Work => "work",
            Self::Tasks => "tasks",
            Self::Agents => "agents",
            Self::Context => "context",
@@ -842,8 +840,7 @@ impl From<&str> for SidebarFocusValue {
    fn from(value: &str) -> Self {
        match SidebarFocus::from_setting(value) {
            SidebarFocus::Auto => Self::Auto,
-            SidebarFocus::Plan => Self::Plan,
-            SidebarFocus::Todos => Self::Todos,
+            SidebarFocus::Work => Self::Work,
            SidebarFocus::Tasks => Self::Tasks,
            SidebarFocus::Agents => Self::Agents,
            SidebarFocus::Context => Self::Context,
@@ -919,7 +916,9 @@ mod tests {
            notes_path: PathBuf::from("notes.txt"),
            mcp_config_path: PathBuf::from("mcp.json"),
            use_memory: false,
-            start_in_agent_mode: false,
+            // Keep this fixture independent from the developer's saved
+            // `default_mode` setting.
+            start_in_agent_mode: true,
            skip_onboarding: true,
            yolo: false,
            resume_session_id: None,
@@ -850,15 +850,6 @@ impl Engine {
                Op::CompactContext => {
                    self.handle_manual_compaction().await;
                }
-                Op::Rlm {
-                    content,
-                    model,
-                    child_model,
-                    max_depth,
-                } => {
-                    self.handle_rlm(content, model, child_model, max_depth)
-                        .await;
-                }
                Op::EditLastTurn { new_message } => {
                    // #383: /edit — remove the last user+assistant exchange
                    // from the session, then re-send with the new content.
@@ -1339,100 +1330,6 @@ impl Engine {
            .await;
    }

-    /// Handle a Recursive Language Model (RLM) query — Algorithm 1 from
-    /// Zhang et al. (arXiv:2512.24601).
-    ///
-    /// The prompt is stored as PROMPT in a REPL variable. The root LLM
-    /// only sees metadata about the REPL state, never the prompt text
-    /// directly. The model generates Python code, which is executed by
-    /// the REPL. When FINAL() is called, the loop ends.
-    async fn handle_rlm(
-        &mut self,
-        content: String,
-        model: String,
-        child_model: String,
-        max_depth: u32,
-    ) {
-        use crate::rlm::turn::run_rlm_turn;
-
-        let Some(ref client) = self.deepseek_client else {
-            let err = self
-                .deepseek_client_error
-                .as_deref()
-                .map(|s| s.to_string())
-                .unwrap_or_else(|| "API client not configured".to_string());
-            let _ = self
-                .tx_event
-                .send(Event::error(ErrorEnvelope::fatal_auth(format!(
-                    "RLM error: {err}"
-                ))))
-                .await;
-            return;
-        };
-
-        let _ = self
-            .tx_event
-            .send(Event::status("RLM turn started".to_string()))
-            .await;
-
-        let result = run_rlm_turn(
-            client,
-            model,
-            content,
-            child_model,
-            self.tx_event.clone(),
-            max_depth,
-        )
-        .await;
-
-        let has_error = result.error.is_some();
-        if let Some(ref err) = result.error {
-            let _ = self
-                .tx_event
-                .send(Event::error(ErrorEnvelope::tool(format!(
-                    "RLM error: {err}"
-                ))))
-                .await;
-        }
-
-        if !result.answer.is_empty() {
-            // Add the final answer as an assistant message in the session.
-            self.add_session_message(crate::models::Message {
-                role: "assistant".to_string(),
-                content: vec![crate::models::ContentBlock::Text {
-                    text: result.answer.clone(),
-                    cache_control: None,
-                }],
-            })
-            .await;
-
-            let _ = self
-                .tx_event
-                .send(Event::MessageDelta {
-                    index: 0,
-                    content: result.answer.clone(),
-                })
-                .await;
-            let _ = self
-                .tx_event
-                .send(Event::MessageComplete { index: 0 })
-                .await;
-        }
-
-        let _ = self
-            .tx_event
-            .send(Event::TurnComplete {
-                usage: result.usage,
-                status: if has_error {
-                    crate::core::events::TurnOutcomeStatus::Failed
-                } else {
-                    crate::core::events::TurnOutcomeStatus::Completed
-                },
-                error: result.error,
-            })
-            .await;
-    }
-
    fn estimated_input_tokens(&self) -> usize {
        estimate_input_tokens_conservative(
            &self.session.messages,
@@ -146,6 +146,10 @@ fn summarize_subagent_status(status: &serde_json::Value) -> String {
 }

 fn summarize_subagent_snapshot(snapshot: &serde_json::Value, index: usize) -> String {
+    if let Some(inner) = snapshot.get("snapshot") {
+        return summarize_subagent_snapshot(inner, index);
+    }
+
    let Some(obj) = snapshot.as_object() else {
        return format!(
            "- item {index}: {}",
@@ -202,7 +206,10 @@ fn summarize_subagent_snapshot(snapshot: &serde_json::Value, index: usize) -> St
 }

 fn compact_subagent_tool_result_for_context(tool_name: &str, raw: &str) -> Option<String> {
-    if !matches!(tool_name, "agent_result" | "agent_wait" | "wait") {
+    if !matches!(
+        tool_name,
+        "agent_open" | "agent_eval" | "agent_close" | "agent_result" | "agent_wait" | "wait"
+    ) {
        return None;
    }

@@ -217,7 +224,7 @@ fn compact_subagent_tool_result_for_context(tool_name: &str, raw: &str) -> Optio
    out.push_str(
        "Child results are self-reports; verify side effects with tools like read_file or list_dir before claiming success.\n",
    );
-    out.push_str("Use `agent_result` again only if you need the full raw payload.\n");
+    out.push_str("Use `agent_eval` for a fresh projection or `handle_read` on `transcript_handle` for bounded transcript slices.\n");
    for (idx, snapshot) in snapshots.iter().enumerate() {
        if idx >= 8 {
            out.push_str(&format!(
@@ -533,6 +533,7 @@ fn turn_tool_registry_builder_keeps_plan_mode_read_only_for_files() {
    assert!(registry.contains("update_plan"));
    assert!(registry.contains("task_list"));
    assert!(registry.contains("task_read"));
+    assert!(registry.contains("handle_read"));
    assert!(registry.contains("recall_archive"));

    let plan_state_tools = [
@@ -827,7 +828,7 @@ fn subagent_results_are_summarized_before_parent_context_insertion() {
        .to_string(),
    );

-    let context = compact_tool_result_for_context("deepseek-v4-pro", "agent_result", &output);
+    let context = compact_tool_result_for_context("deepseek-v4-pro", "agent_eval", &output);

    assert!(context.contains("[sub-agent result summarized for parent context]"));
    assert!(context.contains("agent_1234abcd (explore) status=Completed"));
@@ -837,6 +838,7 @@ fn subagent_results_are_summarized_before_parent_context_insertion() {
    assert!(context.contains("self-report"));
    assert!(context.contains("verify side effects"));
    assert!(context.contains("read_file") && context.contains("list_dir"));
+    assert!(context.contains("handle_read"));
 }

 #[test]
@@ -58,7 +58,11 @@ pub(super) fn should_default_defer_tool(name: &str, mode: AppMode) -> bool {
            | "grep_files"
            | "file_search"
            | "diagnostics"
-            | "rlm"
+            | "rlm_open"
+            | "rlm_eval"
+            | "rlm_configure"
+            | "rlm_close"
+            | "handle_read"
            | "recall_archive"
            | "notify"
            | MULTI_TOOL_PARALLEL_NAME
@@ -48,6 +48,7 @@ impl Engine {
                .with_diagnostics_tool()
                .with_skill_tools()
                .with_validation_tools()
+                .with_handle_tools()
                .with_runtime_read_only_task_tools()
                .with_todo_tool(todo_list)
                .with_plan_tool(plan_state)
@@ -75,21 +75,6 @@ pub enum Op {
    /// Run context compaction immediately.
    CompactContext,

-    /// Run a Recursive Language Model (RLM) turn per Algorithm 1 of
-    /// Zhang et al. (arXiv:2512.24601). The prompt is stored in the REPL
-    /// as `context`; the root LLM only sees metadata.
-    Rlm {
-        /// The user's prompt — stored in REPL, NOT in the LLM context.
-        content: String,
-        /// The model to use for root LLM calls.
-        model: String,
-        /// The model to use for sub-LLM (llm_query) calls.
-        child_model: String,
-        /// Recursion budget for `sub_rlm()` calls. Paper experiments use
-        /// depth=1; defaults set by the `/rlm` command.
-        max_depth: u32,
-    },
-
    /// Edit the last user message: remove the last user+assistant exchange
    /// from the session, then re-send with the new content.
    #[allow(dead_code)]
@@ -10,8 +10,8 @@
 //! summary as if it were verbatim and confabulates around the gaps.
 //!
 //! Checkpoint-restart fixes this by giving every cycle a *homogeneous* fresh
-//! context: original system prompt, structured state (todos / plan / working
-//! set / sub-agent handles), and a model-curated free-form briefing of at
+//! context: original system prompt, structured work state (checklist /
+//! strategy / working set / sub-agent handles), and a model-curated free-form briefing of at
 //! most ~3,000 tokens. The previous cycle is archived to disk in JSONL form
 //! so a future `recall_archive` tool (issue #127) can search it on demand.
 //!
@@ -271,8 +271,27 @@ impl StructuredState {
            out.push_str(&format!("- Cwd: `{}`\n", cwd.display()));
        }

+        if self.todo_snapshot.is_some() || self.plan_snapshot.is_some() {
+            out.push_str("\n### Work\n");
+        }
+
+        if let Some(todos) = self.todo_snapshot.as_ref() {
+            out.push_str(&format!(
+                "\nChecklist ({}% complete)\n",
+                todos.completion_pct
+            ));
+            for item in &todos.items {
+                let marker = match item.status {
+                    crate::tools::todo::TodoStatus::Pending => "[ ]",
+                    crate::tools::todo::TodoStatus::InProgress => "[~]",
+                    crate::tools::todo::TodoStatus::Completed => "[x]",
+                };
+                out.push_str(&format!("- {marker} {}\n", item.content));
+            }
+        }
+
        if let Some(plan) = self.plan_snapshot.as_ref() {
-            out.push_str("\n### Plan\n");
+            out.push_str("\nStrategy\n");
            if let Some(explanation) = plan.explanation.as_ref() {
                out.push_str(&format!("{explanation}\n\n"));
            }
@@ -286,21 +305,6 @@ impl StructuredState {
            }
        }

-        if let Some(todos) = self.todo_snapshot.as_ref() {
-            out.push_str(&format!(
-                "\n### Todos ({}% complete)\n",
-                todos.completion_pct
-            ));
-            for item in &todos.items {
-                let marker = match item.status {
-                    crate::tools::todo::TodoStatus::Pending => "[ ]",
-                    crate::tools::todo::TodoStatus::InProgress => "[~]",
-                    crate::tools::todo::TodoStatus::Completed => "[x]",
-                };
-                out.push_str(&format!("- {marker} {}\n", item.content));
-            }
-        }
-
        if !self.subagent_snapshots.is_empty() {
            out.push_str("\n### Open Sub-Agents\n");
            for s in &self.subagent_snapshots {
@@ -976,6 +980,41 @@ mod tests {
        assert!(block.contains("Workspace: `/tmp/ws`"));
    }

+    #[test]
+    fn structured_state_to_system_block_unifies_work_state() {
+        let state = StructuredState {
+            mode_label: "agent".to_string(),
+            workspace: PathBuf::from("/tmp/ws"),
+            cwd: None,
+            working_set_summary: None,
+            todo_snapshot: Some(TodoListSnapshot {
+                items: vec![crate::tools::todo::TodoItem {
+                    id: 1,
+                    content: "Run focused tests".to_string(),
+                    status: crate::tools::todo::TodoStatus::InProgress,
+                }],
+                completion_pct: 0,
+                in_progress_id: Some(1),
+            }),
+            plan_snapshot: Some(PlanSnapshot {
+                explanation: Some("Keep sidebar state unified".to_string()),
+                items: vec![crate::tools::plan::PlanItemArg {
+                    step: "Update prompts".to_string(),
+                    status: crate::tools::plan::StepStatus::Pending,
+                }],
+            }),
+            subagent_snapshots: Vec::new(),
+        };
+
+        let block = state.to_system_block().expect("renders");
+
+        assert!(block.contains("### Work"));
+        assert!(block.contains("Checklist (0% complete)"));
+        assert!(block.contains("Strategy"));
+        assert!(!block.contains("### Plan"));
+        assert!(!block.contains("### Todos"));
+    }
+
    #[test]
    fn archive_cycle_writes_jsonl_with_header_and_messages() {
        let dir = tempdir().expect("tempdir");
@@ -23,6 +23,7 @@ use crate::tui::history::ToolStatus;
 pub enum Variant {
    Dark,
    Light,
+    Grayscale,
 }

 /// Centralized visual tokens for sidebar, plan, and tool rendering.
@@ -67,7 +68,7 @@ impl Theme {
            section_bg: palette::DEEPSEEK_INK,
            section_title_color: palette::DEEPSEEK_BLUE,
            // Horizontal padding only. `Padding::uniform(1)` ate two rows of
-            // each sidebar panel — for compact terminals where Plan/Todos/Tasks
+            // each sidebar panel — for compact terminals where Work/Tasks/Agents
            // get ~3 rows total via the 25% layout split, that left zero rows
            // for content (#63 follow-up: panels rendered as empty boxes even
            // when "No todos" / "No active plan" should have shown).
@@ -113,11 +114,38 @@ impl Theme {
        }
    }

+    /// Neutral black/white tokens for users who want minimal brand color.
+    #[must_use]
+    pub const fn grayscale() -> Self {
+        Self {
+            variant: Variant::Grayscale,
+            section_borders: Borders::ALL,
+            section_border_type: BorderType::Plain,
+            section_border_color: palette::GRAYSCALE_BORDER,
+            section_bg: palette::GRAYSCALE_PANEL,
+            section_title_color: palette::GRAYSCALE_TEXT_SOFT,
+            section_padding: Padding::horizontal(1),
+            tool_title_color: palette::GRAYSCALE_TEXT_SOFT,
+            tool_value_color: palette::GRAYSCALE_TEXT_MUTED,
+            tool_label_color: palette::GRAYSCALE_TEXT_HINT,
+            tool_running_accent: palette::GRAYSCALE_TEXT_SOFT,
+            tool_success_accent: palette::GRAYSCALE_TEXT_HINT,
+            tool_failed_accent: palette::GRAYSCALE_TEXT_BODY,
+            plan_progress_color: palette::GRAYSCALE_TEXT_SOFT,
+            plan_summary_color: palette::GRAYSCALE_TEXT_MUTED,
+            plan_explanation_color: palette::GRAYSCALE_TEXT_HINT,
+            plan_pending_color: palette::GRAYSCALE_TEXT_MUTED,
+            plan_in_progress_color: palette::GRAYSCALE_TEXT_BODY,
+            plan_completed_color: palette::GRAYSCALE_TEXT_SOFT,
+        }
+    }
+
    #[must_use]
    pub const fn for_palette_mode(mode: PaletteMode) -> Self {
        match mode {
            PaletteMode::Dark => Self::dark(),
            PaletteMode::Light => Self::light(),
+            PaletteMode::Grayscale => Self::grayscale(),
        }
    }

@@ -201,6 +229,17 @@ mod tests {
        assert_eq!(theme.plan_summary_color, palette::LIGHT_TEXT_MUTED);
    }

+    #[test]
+    fn grayscale_theme_uses_neutral_tokens() {
+        let theme = Theme::for_palette_mode(crate::palette::PaletteMode::Grayscale);
+        assert_eq!(theme.variant, Variant::Grayscale);
+        assert_eq!(theme.section_bg, palette::GRAYSCALE_PANEL);
+        assert_eq!(theme.section_border_color, palette::GRAYSCALE_BORDER);
+        assert_eq!(theme.tool_running_accent, palette::GRAYSCALE_TEXT_SOFT);
+        assert_eq!(theme.tool_failed_accent, palette::GRAYSCALE_TEXT_BODY);
+        assert_eq!(theme.plan_summary_color, palette::GRAYSCALE_TEXT_MUTED);
+    }
+
    #[test]
    fn tool_status_color_maps_each_status() {
        let theme = Theme::dark();
@@ -1,6 +1,6 @@
 //! External-binary dependency resolution for tools that shell out to
-//! locally-installed programs (Python for `code_execution`, `pdftotext`
-//! for PDF reading in `read_file`, future tools as added).
+//! locally-installed programs (Python for `code_execution` / RLM REPL,
+//! `pdftotext` for PDF reading in `read_file`, future tools as added).
 //!
 //! Before v0.8.31, tools that called external binaries hardcoded the
 //! command name and failed at execution time when the binary wasn't on
@@ -9,8 +9,8 @@
 //! `python`, not `python3`) saw `Failed to execute tool: program not
 //! found` with no upstream hint of what was wrong.
 //!
-//! This module centralises the probe-then-decide pattern. The two
-//! supported callers today are:
+//! This module centralises the probe-then-decide pattern. The supported
+//! callers today are:
 //!
 //! - Tool catalog construction (`core::engine::tool_catalog`): for
 //!   tools that should be advertised to the model only when the
@@ -18,6 +18,8 @@
 //! - Doctor command (`run_doctor` in `main.rs`): for surfacing the
 //!   resolved state to the user so missing dependencies aren't an
 //!   invisible failure.
+//! - Long-lived REPL runtime (`repl::runtime`): for RLM and inline `repl`
+//!   blocks that need to spawn Python on every supported platform.
 //!
 //! Results are cached for the process lifetime via [`std::sync::OnceLock`]
 //! — probing a binary involves a `Command::output` per candidate and
@@ -83,7 +85,7 @@ pub fn resolve_python_interpreter() -> Option<String> {
                    tracing::info!(
                        target: "tool_dependencies",
                        candidate = candidate,
-                        "Resolved Python interpreter for code_execution",
+                        "Resolved Python interpreter",
                    );
                    return Some((*candidate).to_string());
                }
@@ -91,7 +93,7 @@ pub fn resolve_python_interpreter() -> Option<String> {
            tracing::warn!(
                target: "tool_dependencies",
                tried = ?PYTHON_CANDIDATES,
-                "No Python interpreter found; code_execution tool will not be advertised",
+                "No Python interpreter found",
            );
            None
        })
@@ -1,13 +1,13 @@
-// Used by the deferred context-limit handoff feature (#667). The implementation
+// Used by the deferred context-limit relay feature (#667). The implementation
 // path is staged but not yet wired from the engine; suppress dead-code warnings
 // rather than delete the table until the follow-up feature consumes it.
 #[allow(dead_code)]
 pub const THRESHOLDS: [(f32, &str); 3] = [
    (
        0.9,
-        "Context at 90%: stop and write handoff to .deepseek/handoff.md now",
+        "Context at 90%: stop and write relay to .deepseek/handoff.md now",
    ),
-    (0.8, "Context at 80%: draft handoff to .deepseek/handoff.md"),
+    (0.8, "Context at 80%: draft relay to .deepseek/handoff.md"),
    (0.7, "Context at 70%: consider wrapping current sub-task"),
 ];
 #[allow(dead_code)]
@@ -256,6 +256,7 @@ pub enum MessageId {
    CmdHelpDescription,
    CmdHomeDescription,
    CmdHooksDescription,
+    CmdAgentDescription,
    CmdGoalDescription,
    CmdInitDescription,
    CmdJobsDescription,
@@ -273,6 +274,7 @@ pub enum MessageId {
    CmdProviderDescription,
    CmdQueueDescription,
    CmdRecallDescription,
+    CmdRelayDescription,
    CmdRenameDescription,
    CmdRestoreDescription,
    CmdRetryDescription,
@@ -485,6 +487,7 @@ pub const ALL_MESSAGE_IDS: &[MessageId] = &[
    MessageId::CmdHelpDescription,
    MessageId::CmdHomeDescription,
    MessageId::CmdHooksDescription,
+    MessageId::CmdAgentDescription,
    MessageId::CmdInitDescription,
    MessageId::CmdJobsDescription,
    MessageId::CmdLinksDescription,
@@ -500,6 +503,7 @@ pub const ALL_MESSAGE_IDS: &[MessageId] = &[
    MessageId::CmdProviderDescription,
    MessageId::CmdQueueDescription,
    MessageId::CmdRecallDescription,
+    MessageId::CmdRelayDescription,
    MessageId::CmdRenameDescription,
    MessageId::CmdRestoreDescription,
    MessageId::CmdRetryDescription,
@@ -898,6 +902,9 @@ fn english(id: MessageId) -> &'static str {
        MessageId::CmdHelpDescription => "Show help information",
        MessageId::CmdHomeDescription => "Show home dashboard with stats and quick actions",
        MessageId::CmdHooksDescription => "List configured lifecycle hooks (read-only)",
+        MessageId::CmdAgentDescription => {
+            "Open a persistent sub-agent session: /agent [0-3] <task>"
+        }
        MessageId::CmdGoalDescription => "Set a session goal with optional token budget",
        MessageId::CmdInitDescription => "Generate AGENTS.md for project",
        MessageId::CmdLspDescription => "Toggle LSP diagnostics on or off",
@@ -915,23 +922,22 @@ fn english(id: MessageId) -> &'static str {
        MessageId::CmdModelsDescription => "List available models from API",
        MessageId::CmdNetworkDescription => "Manage network allow and deny rules",
        MessageId::CmdNoteDescription => "Add, list, edit, or remove workspace notes",
-        MessageId::CmdThemeDescription => "Toggle between dark and light theme",
+        MessageId::CmdThemeDescription => "Switch theme: dark, light, grayscale, or system",
        MessageId::CmdProviderDescription => {
            "Switch or view the active LLM backend (deepseek | nvidia-nim | ollama)"
        }
        MessageId::CmdQueueDescription => "View or edit queued messages",
        MessageId::CmdRecallDescription => "Search prior cycle archives (BM25 over message text)",
+        MessageId::CmdRelayDescription => "Create a session relay (接力) for a fresh thread",
        MessageId::CmdRenameDescription => "Rename the current session",
        MessageId::CmdRestoreDescription => {
            "Roll back the workspace to a prior pre/post-turn snapshot. With no arg, lists recent snapshots."
        }
        MessageId::CmdRetryDescription => "Retry the last request",
        MessageId::CmdReviewDescription => "Run a structured code review on a file, diff, or PR",
-        MessageId::CmdRlmDescription => {
-            "Recursive Language Model (RLM) turn — store the prompt in a Python REPL and let the model write code to process it, with `llm_query()` / `sub_rlm()` for sub-LLM calls."
-        }
+        MessageId::CmdRlmDescription => "Open a persistent RLM context: /rlm [0-3] <file_or_text>",
        MessageId::CmdSaveDescription => "Save session to file",
-        MessageId::CmdSessionsDescription => "Open session picker",
+        MessageId::CmdSessionsDescription => "Open session history picker",
        MessageId::CmdSettingsDescription => "Show persistent settings",
        MessageId::CmdSkillDescription => {
            "Activate a skill, or install/update/uninstall/trust a community skill"
@@ -1055,7 +1061,7 @@ fn english(id: MessageId) -> &'static str {
            "Open details for the selected tool or message (when input is empty)"
        }
        MessageId::KbToolDetailsPager => "Open tool-details pager",
-        MessageId::KbThinkingPager => "Open thinking pager",
+        MessageId::KbThinkingPager => "Open Activity Detail",
        MessageId::KbLiveTranscript => "Open live transcript overlay (sticky-tail auto-scroll)",
        MessageId::KbBacktrackMessage => {
            "Backtrack to a previous user message (Left/Right step, Enter to rewind)"
@@ -1065,7 +1071,7 @@ fn english(id: MessageId) -> &'static str {
        }
        MessageId::KbJumpPlanAgentYolo => "Jump directly to Plan / Agent / YOLO mode",
        MessageId::KbAltJumpPlanAgentYolo => "Alternative jump to Plan / Agent / YOLO mode",
-        MessageId::KbFocusSidebar => "Focus Plan / Todos / Tasks / Agents / Auto sidebar",
+        MessageId::KbFocusSidebar => "Focus Work / Tasks / Agents / Context / Auto sidebar",
        MessageId::KbTogglePlanAgent => "Toggle between Plan and Agent modes",
        MessageId::KbSessionPicker => "Open the session picker",
        MessageId::KbPasteAttach => "Paste text or attach a clipboard image",
@@ -1185,6 +1191,7 @@ fn translation(locale: Locale, id: MessageId) -> Option<&'static str> {

 fn traditional_chinese(id: MessageId) -> Option<&'static str> {
    Some(match id {
+        MessageId::CmdRelayDescription => "為新執行緒建立會話接力摘要",
        MessageId::CmdTranslateDescription => "切換輸出翻譯為目前系統語言的開關狀態",
        MessageId::CmdTranslateOff => "輸出翻譯已關閉（顯示原始模型輸出）",
        MessageId::CmdTranslateOn => "輸出翻譯已開啟：模型回覆將以繁體中文顯示",
@@ -1268,6 +1275,9 @@ fn japanese(id: MessageId) -> Option<&'static str> {
        MessageId::CmdHooksDescription => {
            "設定済みのライフサイクルフックを一覧表示（読み取り専用）"
        }
+        MessageId::CmdAgentDescription => {
+            "永続サブエージェントセッションを開く: /agent [0-3] <task>"
+        }
        MessageId::CmdGoalDescription => "トークンバジェット付きのセッション目標を設定",
        MessageId::CmdInitDescription => "プロジェクト用に AGENTS.md を生成",
        MessageId::CmdLspDescription => "LSP 診断のオン・オフを切り替え",
@@ -1285,7 +1295,9 @@ fn japanese(id: MessageId) -> Option<&'static str> {
        MessageId::CmdModelsDescription => "API から利用可能なモデルを一覧表示",
        MessageId::CmdNetworkDescription => "ネットワーク許可・拒否ルールを管理",
        MessageId::CmdNoteDescription => "ワークスペースノートの追加、一覧、編集、削除",
-        MessageId::CmdThemeDescription => "テーマ（ダーク/ライト）を切り替え",
+        MessageId::CmdThemeDescription => {
+            "テーマを切り替え（ダーク/ライト/グレースケール/システム）"
+        }
        MessageId::CmdProviderDescription => {
            "現在の LLM バックエンドを切り替え・確認（deepseek | nvidia-nim | ollama）"
        }
@@ -1293,17 +1305,16 @@ fn japanese(id: MessageId) -> Option<&'static str> {
        MessageId::CmdRecallDescription => {
            "過去のサイクルアーカイブを検索（メッセージ本文への BM25 検索）"
        }
+        MessageId::CmdRelayDescription => "新しいスレッド用のセッションリレー（接力）を作成",
        MessageId::CmdRenameDescription => "現在のセッションの名前を変更",
        MessageId::CmdRestoreDescription => {
            "ワークスペースを以前のターン前/後スナップショットへロールバック。引数なしで最近のスナップショットを一覧表示。"
        }
        MessageId::CmdRetryDescription => "直前のリクエストを再試行",
        MessageId::CmdReviewDescription => "ファイル・diff・PR に対して構造化コードレビューを実行",
-        MessageId::CmdRlmDescription => {
-            "再帰言語モデル（RLM）ターン — プロンプトを Python REPL に格納し、モデルが処理コードを記述。サブ LLM 呼び出しは `llm_query()` / `sub_rlm()`。"
-        }
+        MessageId::CmdRlmDescription => "永続 RLM コンテキストを開く: /rlm [0-3] <file_or_text>",
        MessageId::CmdSaveDescription => "セッションをファイルに保存",
-        MessageId::CmdSessionsDescription => "セッションピッカーを開く",
+        MessageId::CmdSessionsDescription => "セッション履歴ピッカーを開く",
        MessageId::CmdSettingsDescription => "永続化された設定を表示",
        MessageId::CmdSkillDescription => {
            "スキルを有効化、またはコミュニティスキルをインストール／更新／アンインストール／信頼"
@@ -1424,7 +1435,7 @@ fn japanese(id: MessageId) -> Option<&'static str> {
            "選択中のツールまたはメッセージの詳細を開く（入力が空の時）"
        }
        MessageId::KbToolDetailsPager => "ツール詳細のページャーを開く",
-        MessageId::KbThinkingPager => "思考内容のページャーを開く",
+        MessageId::KbThinkingPager => "Activity Detail を開く",
        MessageId::KbLiveTranscript => "ライブ会話履歴オーバーレイを開く（自動追尾スクロール）",
        MessageId::KbBacktrackMessage => {
            "前のユーザーメッセージに戻る（左右でステップ、Enter で巻き戻し）"
@@ -1434,7 +1445,9 @@ fn japanese(id: MessageId) -> Option<&'static str> {
        }
        MessageId::KbJumpPlanAgentYolo => "Plan / Agent / YOLO モードに直接ジャンプ",
        MessageId::KbAltJumpPlanAgentYolo => "Plan / Agent / YOLO モードへの代替ジャンプ",
-        MessageId::KbFocusSidebar => "Plan / Todos / Tasks / Agents / Auto サイドバーにフォーカス",
+        MessageId::KbFocusSidebar => {
+            "Work / Tasks / Agents / Context / Auto サイドバーにフォーカス"
+        }
        MessageId::KbTogglePlanAgent => "Plan モードと Agent モードを切り替え",
        MessageId::KbSessionPicker => "セッションピッカーを開く",
        MessageId::KbPasteAttach => "テキストを貼り付けまたはクリップボード画像を添付",
@@ -1605,6 +1618,7 @@ fn chinese_simplified(id: MessageId) -> Option<&'static str> {
        MessageId::CmdHelpDescription => "显示帮助信息",
        MessageId::CmdHomeDescription => "显示主页面板，含统计与快捷操作",
        MessageId::CmdHooksDescription => "列出已配置的生命周期钩子（只读）",
+        MessageId::CmdAgentDescription => "打开持久子代理会话：/agent [0-3] <task>",
        MessageId::CmdGoalDescription => "设置带有可选令牌预算的会话目标",
        MessageId::CmdInitDescription => "为项目生成 AGENTS.md",
        MessageId::CmdLspDescription => "切换 LSP 诊断的开启或关闭",
@@ -1620,23 +1634,22 @@ fn chinese_simplified(id: MessageId) -> Option<&'static str> {
        MessageId::CmdModelsDescription => "列出 API 中可用的模型",
        MessageId::CmdNetworkDescription => "管理网络允许和拒绝规则",
        MessageId::CmdNoteDescription => "添加、列出、编辑或删除工作区笔记",
-        MessageId::CmdThemeDescription => "在浅色和深色主题之间切换",
+        MessageId::CmdThemeDescription => "切换主题：深色、浅色、灰度或系统",
        MessageId::CmdProviderDescription => {
            "切换或查看当前 LLM 后端（deepseek | nvidia-nim | ollama）"
        }
        MessageId::CmdQueueDescription => "查看或编辑已排队的消息",
        MessageId::CmdRecallDescription => "搜索此前的循环归档（基于消息文本的 BM25 检索）",
+        MessageId::CmdRelayDescription => "为新线程创建会话接力摘要",
        MessageId::CmdRenameDescription => "重命名当前会话",
        MessageId::CmdRestoreDescription => {
            "将工作区回滚到此前的轮次前/后快照。不带参数时列出最近的快照。"
        }
        MessageId::CmdRetryDescription => "重试上一次请求",
        MessageId::CmdReviewDescription => "对文件、diff 或 PR 进行结构化代码审查",
-        MessageId::CmdRlmDescription => {
-            "递归语言模型（RLM）轮次 —— 将提示词存入 Python REPL，让模型编写代码进行处理；可用 `llm_query()` / `sub_rlm()` 调用子 LLM。"
-        }
+        MessageId::CmdRlmDescription => "打开持久 RLM 上下文：/rlm [0-3] <file_or_text>",
        MessageId::CmdSaveDescription => "将会话保存到文件",
-        MessageId::CmdSessionsDescription => "打开会话选择器",
+        MessageId::CmdSessionsDescription => "打开会话历史选择器",
        MessageId::CmdSettingsDescription => "显示持久化设置",
        MessageId::CmdSkillDescription => "激活技能，或安装/更新/卸载/信任社区技能",
        MessageId::CmdSkillsDescription => {
@@ -1741,7 +1754,7 @@ fn chinese_simplified(id: MessageId) -> Option<&'static str> {
        MessageId::KbLastMessagePager => "打开最后一条消息的分页器（输入框为空时）",
        MessageId::KbSelectedDetails => "打开选中工具或消息的详情（输入框为空时）",
        MessageId::KbToolDetailsPager => "打开工具详情分页器",
-        MessageId::KbThinkingPager => "打开思考内容分页器",
+        MessageId::KbThinkingPager => "打开 Activity Detail",
        MessageId::KbLiveTranscript => "打开实时对话覆盖层（自动滚动尾随）",
        MessageId::KbBacktrackMessage => "回退到之前的用户消息（左右键步进，Enter 回退）",
        MessageId::KbCompleteCycleModes => {
@@ -1749,7 +1762,7 @@ fn chinese_simplified(id: MessageId) -> Option<&'static str> {
        }
        MessageId::KbJumpPlanAgentYolo => "直接跳转到 Plan / Agent / YOLO 模式",
        MessageId::KbAltJumpPlanAgentYolo => "替代快捷键跳转到 Plan / Agent / YOLO 模式",
-        MessageId::KbFocusSidebar => "聚焦 Plan / 待办 / 任务 / 代理 / 代理 / 自动侧边栏",
+        MessageId::KbFocusSidebar => "聚焦 Work / 任务 / 代理 / Context / 自动侧边栏",
        MessageId::KbTogglePlanAgent => "在 Plan 和 Agent 模式之间切换",
        MessageId::KbSessionPicker => "打开会话选择器",
        MessageId::KbPasteAttach => "粘贴文本或附加剪贴板图片",
@@ -1918,6 +1931,9 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
        MessageId::CmdHooksDescription => {
            "Listar hooks de ciclo de vida configurados (somente leitura)"
        }
+        MessageId::CmdAgentDescription => {
+            "Abrir uma sessão persistente de sub-agente: /agent [0-3] <task>"
+        }
        MessageId::CmdGoalDescription => {
            "Definir uma meta de sessão com orçamento de tokens opcional"
        }
@@ -1939,7 +1955,7 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
        MessageId::CmdModelsDescription => "Listar os modelos disponíveis pela API",
        MessageId::CmdNetworkDescription => "Gerenciar regras de rede permitidas e bloqueadas",
        MessageId::CmdNoteDescription => "Adicionar, listar, editar ou remover notas do workspace",
-        MessageId::CmdThemeDescription => "Alternar entre o tema claro e escuro",
+        MessageId::CmdThemeDescription => "Alternar tema: escuro, claro, tons de cinza ou sistema",
        MessageId::CmdProviderDescription => {
            "Trocar ou exibir o backend LLM ativo (deepseek | nvidia-nim | ollama)"
        }
@@ -1947,6 +1963,7 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
        MessageId::CmdRecallDescription => {
            "Buscar arquivos de ciclos anteriores (BM25 sobre o texto das mensagens)"
        }
+        MessageId::CmdRelayDescription => "Criar um relay da sessão para um novo thread",
        MessageId::CmdRenameDescription => "Renomear a sessão atual",
        MessageId::CmdRestoreDescription => {
            "Reverter o workspace a um snapshot pré/pós-turno anterior. Sem argumento, lista os snapshots recentes."
@@ -1956,10 +1973,10 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
            "Executar uma revisão de código estruturada em um arquivo, diff ou PR"
        }
        MessageId::CmdRlmDescription => {
-            "Turno do Recursive Language Model (RLM) — guarda o prompt em um REPL Python e deixa o modelo escrever o código que o processa; use `llm_query()` / `sub_rlm()` para chamadas a sub-LLMs."
+            "Abrir um contexto RLM persistente: /rlm [0-3] <file_or_text>"
        }
        MessageId::CmdSaveDescription => "Salvar a sessão em arquivo",
-        MessageId::CmdSessionsDescription => "Abrir o seletor de sessões",
+        MessageId::CmdSessionsDescription => "Abrir seletor de histórico de sessões",
        MessageId::CmdSettingsDescription => "Exibir as configurações persistidas",
        MessageId::CmdSkillDescription => {
            "Ativar uma skill, ou instalar/atualizar/desinstalar/confiar em uma skill da comunidade"
@@ -2090,7 +2107,7 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
            "Abrir detalhes da ferramenta ou mensagem selecionada (quando entrada vazia)"
        }
        MessageId::KbToolDetailsPager => "Abrir paginador de detalhes da ferramenta",
-        MessageId::KbThinkingPager => "Abrir paginador de raciocínio",
+        MessageId::KbThinkingPager => "Abrir Activity Detail",
        MessageId::KbLiveTranscript => "Abrir sobreposição de transcrição ao vivo (auto-scroll)",
        MessageId::KbBacktrackMessage => {
            "Retroceder para mensagem anterior do usuário (esquerda/direita, Enter para rebobinar)"
@@ -2100,7 +2117,7 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
        }
        MessageId::KbJumpPlanAgentYolo => "Pular direto para modo Plan / Agent / YOLO",
        MessageId::KbAltJumpPlanAgentYolo => "Salto alternativo para modo Plan / Agent / YOLO",
-        MessageId::KbFocusSidebar => "Focar barra lateral Plan / Todos / Tasks / Agents / Auto",
+        MessageId::KbFocusSidebar => "Focar barra lateral Work / Tasks / Agents / Context / Auto",
        MessageId::KbTogglePlanAgent => "Alternar entre modos Plan e Agent",
        MessageId::KbSessionPicker => "Abrir seletor de sessões",
        MessageId::KbPasteAttach => "Colar texto ou anexar imagem da área de transferência",
@@ -6,7 +6,7 @@ use std::process::{Command, Stdio};
 use std::time::Duration;

 use anyhow::{Context, Result, anyhow, bail};
-use clap::{Args, CommandFactory, Parser, Subcommand};
+use clap::{Args, CommandFactory, Parser, Subcommand, ValueEnum};
 use clap_complete::{Shell, generate};
 use dotenvy::dotenv;
 use tempfile::NamedTempFile;
@@ -100,7 +100,8 @@ fn configure_windows_console_utf8() {}

 #[derive(Parser, Debug)]
 #[command(
-    name = "deepseek",
+    name = "deepseek-tui",
+    bin_name = "deepseek-tui",
    author,
    version = env!("DEEPSEEK_BUILD_VERSION"),
    about = "DeepSeek TUI/CLI for DeepSeek models",
@@ -285,14 +286,51 @@ struct ExecArgs {
    #[arg(long, default_value_t = false)]
    auto: bool,
    /// Emit machine-readable JSON output
-    #[arg(long, default_value_t = false)]
+    #[arg(long, default_value_t = false, conflicts_with = "output_format")]
    json: bool,
+    /// Resume a previous session by ID or prefix
+    #[arg(long, value_name = "SESSION_ID", conflicts_with_all = ["session_id", "continue_session"])]
+    resume: Option<String>,
+    /// Resume a previous session by ID or prefix
+    #[arg(long = "session-id", value_name = "SESSION_ID", conflicts_with_all = ["resume", "continue_session"])]
+    session_id: Option<String>,
+    /// Continue the most recent session for this workspace
+    #[arg(long = "continue", default_value_t = false, conflicts_with_all = ["resume", "session_id"])]
+    continue_session: bool,
+    /// Output format for exec mode
+    #[arg(long, value_enum, default_value_t = ExecOutputFormat::Text)]
+    output_format: ExecOutputFormat,
+}
+
+#[derive(Debug, Clone, Copy, PartialEq, Eq, ValueEnum)]
+enum ExecOutputFormat {
+    Text,
+    #[value(name = "stream-json")]
+    StreamJson,
 }

 fn join_prompt_parts(parts: &[String]) -> String {
    parts.join(" ")
 }

+fn resolve_exec_resume_session_id(args: &ExecArgs, workspace: &Path) -> Result<Option<String>> {
+    if let Some(id) = args.resume.as_ref().or(args.session_id.as_ref()) {
+        return Ok(Some(id.clone()));
+    }
+    if !args.continue_session {
+        return Ok(None);
+    }
+    latest_session_id_for_workspace(workspace)?.map_or_else(
+        || {
+            bail!(
+                "No saved sessions found for workspace {}. Use `deepseek sessions` to list sessions, or pass `deepseek exec --resume <SESSION_ID> ...`.",
+                workspace.display()
+            )
+        },
+        |id| Ok(Some(id)),
+    )
+}
+
 #[derive(Args, Debug, Clone, Default)]
 struct SetupArgs {
    /// Initialize MCP configuration at the configured path
@@ -674,13 +712,19 @@ async fn main() -> Result<()> {
                let config = load_config_from_cli(&cli)?;
                let model = args
                    .model
+                    .clone()
                    .or_else(|| config.default_text_model.clone())
                    .unwrap_or_else(|| config.default_model());
                let prompt = join_prompt_parts(&args.prompt);
-                if args.auto || cli.yolo {
-                    let workspace = cli.workspace.clone().unwrap_or_else(|| {
-                        std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
-                    });
+                let workspace = cli.workspace.clone().unwrap_or_else(|| {
+                    std::env::current_dir().unwrap_or_else(|_| PathBuf::from("."))
+                });
+                let resume_session_id = resolve_exec_resume_session_id(&args, &workspace)?;
+                let needs_engine = args.auto
+                    || cli.yolo
+                    || resume_session_id.is_some()
+                    || args.output_format == ExecOutputFormat::StreamJson;
+                if needs_engine {
                    let max_subagents = cli.max_subagents.map_or_else(
                        || config.max_subagents(),
                        |value| value.clamp(1, MAX_SUBAGENTS),
@@ -692,9 +736,11 @@ async fn main() -> Result<()> {
                        &prompt,
                        workspace,
                        max_subagents,
-                        true,
+                        auto_mode,
                        auto_mode,
                        args.json,
+                        resume_session_id,
+                        args.output_format,
                    )
                    .await
                } else if args.json {
@@ -4379,6 +4425,95 @@ async fn run_one_shot_json(config: &Config, model: &str, prompt: &str) -> Result
    Ok(())
 }

+#[derive(serde::Serialize)]
+struct ExecStreamMeta {
+    model: String,
+    input_tokens: u32,
+    output_tokens: u32,
+    session_id: String,
+    status: Option<String>,
+}
+
+#[derive(serde::Serialize)]
+#[serde(tag = "type")]
+enum ExecStreamEvent {
+    #[serde(rename = "content")]
+    Content { content: String },
+    #[serde(rename = "tool_use")]
+    ToolUse {
+        name: String,
+        id: String,
+        input: serde_json::Value,
+    },
+    #[serde(rename = "tool_result")]
+    ToolResult {
+        id: String,
+        output: String,
+        status: String,
+    },
+    #[serde(rename = "session_capture")]
+    SessionCapture { content: String },
+    #[serde(rename = "metadata")]
+    Metadata { meta: ExecStreamMeta },
+    #[serde(rename = "done")]
+    Done,
+    #[serde(rename = "error")]
+    Error { error: String },
+}
+
+fn emit_exec_stream_event(event: &ExecStreamEvent) -> Result<()> {
+    println!("{}", serde_json::to_string(event)?);
+    Ok(())
+}
+
+fn persist_exec_session(
+    messages: &[Message],
+    model: &str,
+    workspace: &Path,
+    system_prompt: &Option<SystemPrompt>,
+    session_id: Option<&str>,
+    total_tokens: u64,
+) -> Result<String> {
+    let manager =
+        SessionManager::default_location().context("could not open session manager for save")?;
+    let saved = if let Some(id) = session_id.filter(|id| !id.trim().is_empty()) {
+        match manager.load_session(id) {
+            Ok(existing) => session_manager::update_session(
+                existing,
+                messages,
+                total_tokens,
+                system_prompt.as_ref(),
+            ),
+            Err(err) if err.kind() == std::io::ErrorKind::NotFound => {
+                session_manager::create_saved_session_with_id_and_mode(
+                    id.to_string(),
+                    messages,
+                    model,
+                    workspace,
+                    total_tokens,
+                    system_prompt.as_ref(),
+                    Some("exec"),
+                )
+            }
+            Err(err) => return Err(err).context("could not load existing exec session"),
+        }
+    } else {
+        session_manager::create_saved_session_with_mode(
+            messages,
+            model,
+            workspace,
+            total_tokens,
+            system_prompt.as_ref(),
+            Some("exec"),
+        )
+    };
+    let id = saved.metadata.id.clone();
+    manager
+        .save_session(&saved)
+        .context("could not save exec session")?;
+    Ok(id)
+}
+
 #[allow(clippy::too_many_arguments)]
 async fn run_exec_agent(
    config: &Config,
@@ -4389,6 +4524,8 @@ async fn run_exec_agent(
    auto_approve: bool,
    trust_mode: bool,
    json_output: bool,
+    resume_session_id: Option<String>,
+    output_format: ExecOutputFormat,
 ) -> Result<()> {
    use crate::compaction::CompactionConfig;
    use crate::core::engine::{EngineConfig, spawn_engine};
@@ -4482,6 +4619,37 @@ async fn run_exec_agent(
        AppMode::Agent
    };

+    let mut loaded_session_id = None;
+    if let Some(session_id) = resume_session_id.as_deref() {
+        let manager = SessionManager::default_location()
+            .context("could not open session manager for exec resume")?;
+        let saved = manager
+            .load_session_by_prefix(session_id)
+            .with_context(|| format!("could not load session '{session_id}'"))?;
+        let saved_id = saved.metadata.id.clone();
+        if saved.metadata.workspace != workspace && output_format == ExecOutputFormat::Text {
+            eprintln!(
+                "Warning: session {} was created in a different workspace ({}). Resuming anyway.",
+                truncate_id(&saved_id),
+                saved.metadata.workspace.display(),
+            );
+        }
+
+        engine_handle
+            .send(Op::SyncSession {
+                session_id: Some(saved_id.clone()),
+                messages: saved.messages,
+                system_prompt: saved.system_prompt.map(SystemPrompt::Text),
+                model: saved.metadata.model,
+                workspace: saved.metadata.workspace,
+            })
+            .await?;
+        loaded_session_id = Some(saved_id.clone());
+        if output_format == ExecOutputFormat::Text && !json_output {
+            eprintln!("resumed session: {saved_id}");
+        }
+    }
+
    engine_handle
        .send(Op::SendMessage {
            content: prompt.to_string(),
@@ -4525,11 +4693,19 @@ async fn run_exec_agent(
    }
    let mut summary = ExecSummary {
        mode: "agent".to_string(),
-        model: effective_model,
+        model: effective_model.clone(),
        prompt: prompt.to_string(),
        ..ExecSummary::default()
    };

+    let should_persist_session =
+        resume_session_id.is_some() || output_format == ExecOutputFormat::StreamJson;
+    let mut latest_session_id = loaded_session_id;
+    let mut latest_messages: Vec<Message> = Vec::new();
+    let mut latest_system_prompt: Option<SystemPrompt> = None;
+    let mut latest_model = effective_model;
+    let mut latest_workspace = workspace.clone();
+
    let mut stdout = io::stdout();
    let mut ends_with_newline = false;
    loop {
@@ -4545,71 +4721,115 @@ async fn run_exec_agent(
        match event {
            Event::MessageDelta { content, .. } => {
                summary.output.push_str(&content);
-                if !json_output {
+                if output_format == ExecOutputFormat::StreamJson {
+                    emit_exec_stream_event(&ExecStreamEvent::Content { content })?;
+                } else if !json_output {
                    print!("{content}");
                    stdout.flush()?;
                }
-                ends_with_newline = content.ends_with('\n');
+                ends_with_newline = summary.output.ends_with('\n');
            }
-            Event::MessageComplete { .. } if !json_output && !ends_with_newline => {
+            Event::MessageComplete { .. }
+                if output_format == ExecOutputFormat::Text
+                    && !json_output
+                    && !ends_with_newline =>
+            {
                println!();
            }
-            Event::ToolCallStarted { name, input, .. } if !json_output => {
-                let summary = summarize_tool_args(&input);
-                if let Some(summary) = summary {
-                    eprintln!("tool: {name} ({summary})");
-                } else {
-                    eprintln!("tool: {name}");
+            Event::ThinkingDelta { .. } => {
+                // Exec stream-json intentionally omits reasoning deltas; the
+                // TUI transcript retains its existing Activity Detail surface.
+            }
+            Event::ToolCallStarted { id, name, input } => {
+                if output_format == ExecOutputFormat::StreamJson {
+                    emit_exec_stream_event(&ExecStreamEvent::ToolUse { name, id, input })?;
+                } else if !json_output {
+                    let summary = summarize_tool_args(&input);
+                    if let Some(summary) = summary {
+                        eprintln!("tool: {name} ({summary})");
+                    } else {
+                        eprintln!("tool: {name}");
+                    }
                }
            }
-            Event::ToolCallProgress { id, output } if !json_output => {
+            Event::ToolCallProgress { id, output }
+                if output_format == ExecOutputFormat::Text && !json_output =>
+            {
                eprintln!("tool {id}: {}", summarize_tool_output(&output));
            }
-            Event::ToolCallComplete { name, result, .. } => match result {
+            Event::ToolCallComplete {
+                id, name, result, ..
+            } => match result {
                Ok(output) => {
                    summary.tools.push(ExecToolEntry {
                        name: name.clone(),
                        success: output.success,
                        output: output.content.clone(),
                    });
-                    if name == "exec_shell" && !output.content.trim().is_empty() {
-                        if !json_output {
+                    if output_format == ExecOutputFormat::StreamJson {
+                        emit_exec_stream_event(&ExecStreamEvent::ToolResult {
+                            id,
+                            output: output.content,
+                            status: if output.success {
+                                "success".to_string()
+                            } else {
+                                "error".to_string()
+                            },
+                        })?;
+                    } else if !json_output {
+                        if name == "exec_shell" && !output.content.trim().is_empty() {
                            eprintln!("tool {name} completed");
                            eprintln!(
                                "--- stdout/stderr ---\n{}\n---------------------",
                                output.content
                            );
+                        } else {
+                            eprintln!(
+                                "tool {name} completed: {}",
+                                summarize_tool_output(&output.content)
+                            );
                        }
-                    } else if !json_output {
-                        eprintln!(
-                            "tool {name} completed: {}",
-                            summarize_tool_output(&output.content)
-                        );
                    }
                }
                Err(err) => {
+                    let error_text = err.to_string();
                    summary.tools.push(ExecToolEntry {
                        name: name.clone(),
                        success: false,
-                        output: err.to_string(),
+                        output: error_text.clone(),
                    });
-                    if !json_output {
+                    if output_format == ExecOutputFormat::StreamJson {
+                        emit_exec_stream_event(&ExecStreamEvent::ToolResult {
+                            id,
+                            output: error_text,
+                            status: "error".to_string(),
+                        })?;
+                    } else if !json_output {
                        eprintln!("tool {name} failed: {err}");
                    }
                }
            },
-            Event::AgentSpawned { id, prompt } => {
+            Event::AgentSpawned { id, prompt }
+                if output_format == ExecOutputFormat::Text && !json_output =>
+            {
                eprintln!("sub-agent {id} spawned: {}", summarize_tool_output(&prompt));
            }
-            Event::AgentProgress { id, status } => {
+            Event::AgentProgress { id, status }
+                if output_format == ExecOutputFormat::Text && !json_output =>
+            {
                eprintln!("sub-agent {id}: {status}");
            }
-            Event::AgentComplete { id, result } => {
+            Event::AgentComplete { id, result }
+                if output_format == ExecOutputFormat::Text && !json_output =>
+            {
                eprintln!(
                    "sub-agent {id} completed: {}",
                    summarize_tool_output(&result)
                );
            }
+            Event::AgentSpawned { .. }
+            | Event::AgentProgress { .. }
+            | Event::AgentComplete { .. } => {}
            Event::ApprovalRequired { id, .. } => {
                if auto_approve {
                    let _ = engine_handle.approve_tool_call(id).await;
@@ -4624,11 +4844,15 @@ async fn run_exec_agent(
                ..
            } => {
                if auto_approve {
-                    eprintln!("sandbox denied {tool_name}: {denial_reason} (auto-elevating)");
+                    if output_format == ExecOutputFormat::Text && !json_output {
+                        eprintln!("sandbox denied {tool_name}: {denial_reason} (auto-elevating)");
+                    }
                    let policy = crate::sandbox::SandboxPolicy::DangerFullAccess;
                    let _ = engine_handle.retry_tool_with_policy(tool_id, policy).await;
                } else {
-                    eprintln!("sandbox denied {tool_name}: {denial_reason}");
+                    if output_format == ExecOutputFormat::Text && !json_output {
+                        eprintln!("sandbox denied {tool_name}: {denial_reason}");
+                    }
                    let _ = engine_handle.deny_tool_call(tool_id).await;
                }
            }
@@ -4637,16 +4861,81 @@ async fn run_exec_agent(
                recoverable: _,
            } => {
                summary.error = Some(envelope.message.clone());
-                if !json_output {
+                if output_format == ExecOutputFormat::StreamJson {
+                    emit_exec_stream_event(&ExecStreamEvent::Error {
+                        error: envelope.message,
+                    })?;
+                } else if !json_output {
                    eprintln!("error: {}", envelope.message);
                }
            }
-            Event::TurnComplete { status, error, .. } => {
+            Event::TurnComplete {
+                status,
+                error,
+                usage,
+                ..
+            } => {
                summary.status = Some(format!("{status:?}").to_lowercase());
                summary.error = error;
+                let saved_session_id = if should_persist_session && !latest_messages.is_empty() {
+                    match persist_exec_session(
+                        &latest_messages,
+                        &latest_model,
+                        &latest_workspace,
+                        &latest_system_prompt,
+                        latest_session_id.as_deref(),
+                        u64::from(usage.input_tokens) + u64::from(usage.output_tokens),
+                    ) {
+                        Ok(id) => {
+                            if output_format == ExecOutputFormat::Text && !json_output {
+                                eprintln!("session: {id}");
+                            }
+                            Some(id)
+                        }
+                        Err(err) => {
+                            if output_format == ExecOutputFormat::Text && !json_output {
+                                eprintln!("warning: failed to save exec session: {err}");
+                            }
+                            latest_session_id.clone()
+                        }
+                    }
+                } else {
+                    latest_session_id.clone()
+                };
+
+                if output_format == ExecOutputFormat::StreamJson {
+                    if let Some(id) = saved_session_id.as_ref() {
+                        emit_exec_stream_event(&ExecStreamEvent::SessionCapture {
+                            content: id.clone(),
+                        })?;
+                    }
+                    emit_exec_stream_event(&ExecStreamEvent::Metadata {
+                        meta: ExecStreamMeta {
+                            model: latest_model.clone(),
+                            input_tokens: usage.input_tokens,
+                            output_tokens: usage.output_tokens,
+                            session_id: saved_session_id.unwrap_or_default(),
+                            status: summary.status.clone(),
+                        },
+                    })?;
+                    emit_exec_stream_event(&ExecStreamEvent::Done)?;
+                }
                let _ = engine_handle.send(Op::Shutdown).await;
                break;
            }
+            Event::SessionUpdated {
+                session_id,
+                messages,
+                system_prompt,
+                model,
+                workspace,
+            } => {
+                latest_session_id = Some(session_id);
+                latest_messages = messages;
+                latest_system_prompt = system_prompt;
+                latest_model = model;
+                latest_workspace = workspace;
+            }
            _ => {}
        }
    }
@@ -4843,6 +5132,11 @@ mod terminal_mode_tests {
        assert_eq!(cli.prompt, vec!["hello", "world"]);
    }

+    #[test]
+    fn companion_binary_reports_its_own_name() {
+        assert_eq!(Cli::command().get_name(), "deepseek-tui");
+    }
+
    #[test]
    fn exec_accepts_split_prompt_words_for_windows_cmd_shims() {
        let cli = parse_cli(&["deepseek", "exec", "hello", "world"]);
@@ -4864,6 +5158,76 @@ mod terminal_mode_tests {
        assert_eq!(args.prompt, vec!["hello", "world"]);
    }

+    #[test]
+    fn exec_accepts_resume_session_flags_for_harnesses() {
+        let cli = parse_cli(&[
+            "deepseek",
+            "exec",
+            "--resume",
+            "abc123",
+            "--output-format",
+            "stream-json",
+            "follow up",
+        ]);
+        let Some(Commands::Exec(args)) = cli.command else {
+            panic!("expected exec command");
+        };
+
+        assert_eq!(args.resume.as_deref(), Some("abc123"));
+        assert_eq!(args.output_format, ExecOutputFormat::StreamJson);
+        assert_eq!(args.prompt, vec!["follow up"]);
+    }
+
+    #[test]
+    fn exec_accepts_session_id_alias() {
+        let cli = parse_cli(&["deepseek", "exec", "--session-id", "abc123", "follow up"]);
+        let Some(Commands::Exec(args)) = cli.command else {
+            panic!("expected exec command");
+        };
+
+        assert_eq!(args.session_id.as_deref(), Some("abc123"));
+        assert_eq!(args.output_format, ExecOutputFormat::Text);
+    }
+
+    #[test]
+    fn exec_accepts_continue_for_latest_workspace_session() {
+        let cli = parse_cli(&["deepseek", "exec", "--continue", "follow up"]);
+        let Some(Commands::Exec(args)) = cli.command else {
+            panic!("expected exec command");
+        };
+
+        assert!(args.continue_session);
+    }
+
+    #[test]
+    fn exec_json_conflicts_with_stream_json_output() {
+        let err = Cli::try_parse_from([
+            "deepseek",
+            "exec",
+            "--json",
+            "--output-format",
+            "stream-json",
+            "hello",
+        ])
+        .expect_err("json summary and stream-json must not mix");
+
+        assert_eq!(err.kind(), clap::error::ErrorKind::ArgumentConflict);
+    }
+
+    #[test]
+    fn exec_stream_events_are_json_lines() {
+        let event = ExecStreamEvent::ToolResult {
+            id: "call_1".to_string(),
+            output: "line 1\nline 2".to_string(),
+            status: "success".to_string(),
+        };
+
+        let json = serde_json::to_string(&event).expect("serializes");
+        assert!(!json.contains('\n'));
+        let parsed: serde_json::Value = serde_json::from_str(&json).expect("valid json");
+        assert_eq!(parsed["type"], "tool_result");
+    }
+
    #[test]
    fn alternate_screen_defaults_on_in_auto_mode() {
        let cli = parse_cli(&["deepseek"]);
@@ -3233,6 +3233,7 @@ mod tests {
    }

    #[tokio::test]
+    #[ignore = "flaky: requires a live TCP listener and is sensitive to port allocation races"]
    async fn mcp_connection_supports_streamable_http_event_stream_responses() {
        use tokio::io::{AsyncReadExt, AsyncWriteExt};
        use tokio::net::{TcpListener, TcpStream};
@@ -12,18 +12,30 @@ pub const DEEPSEEK_INK_RGB: (u8, u8, u8) = (11, 21, 38);
 pub const DEEPSEEK_SLATE_RGB: (u8, u8, u8) = (18, 28, 46);
 pub const DEEPSEEK_RED_RGB: (u8, u8, u8) = (226, 80, 96);

-pub const LIGHT_SURFACE_RGB: (u8, u8, u8) = (248, 250, 252); // #F8FAFC
-pub const LIGHT_PANEL_RGB: (u8, u8, u8) = (241, 245, 249); // #F1F5F9
-pub const LIGHT_ELEVATED_RGB: (u8, u8, u8) = (226, 232, 240); // #E2E8F0
-pub const LIGHT_REASONING_RGB: (u8, u8, u8) = (254, 243, 199); // #FEF3C7
-pub const LIGHT_SUCCESS_RGB: (u8, u8, u8) = (220, 252, 231); // #DCFCE7
-pub const LIGHT_ERROR_RGB: (u8, u8, u8) = (254, 226, 226); // #FEE2E2
+pub const LIGHT_SURFACE_RGB: (u8, u8, u8) = (246, 248, 251); // #F6F8FB
+pub const LIGHT_PANEL_RGB: (u8, u8, u8) = (236, 242, 248); // #ECF2F8
+pub const LIGHT_ELEVATED_RGB: (u8, u8, u8) = (219, 229, 240); // #DBE5F0
+pub const LIGHT_REASONING_RGB: (u8, u8, u8) = (255, 246, 214); // #FFF6D6
+pub const LIGHT_SUCCESS_RGB: (u8, u8, u8) = (223, 247, 231); // #DFF7E7
+pub const LIGHT_ERROR_RGB: (u8, u8, u8) = (254, 229, 229); // #FEE5E5
 pub const LIGHT_TEXT_BODY_RGB: (u8, u8, u8) = (15, 23, 42); // #0F172A
 pub const LIGHT_TEXT_MUTED_RGB: (u8, u8, u8) = (51, 65, 85); // #334155
-pub const LIGHT_TEXT_HINT_RGB: (u8, u8, u8) = (71, 85, 105); // #475569
+pub const LIGHT_TEXT_HINT_RGB: (u8, u8, u8) = (100, 116, 139); // #64748B
 pub const LIGHT_TEXT_SOFT_RGB: (u8, u8, u8) = (30, 41, 59); // #1E293B
-pub const LIGHT_BORDER_RGB: (u8, u8, u8) = (71, 85, 105); // #475569
-pub const LIGHT_SELECTION_RGB: (u8, u8, u8) = (219, 234, 254); // #DBEAFE
+pub const LIGHT_BORDER_RGB: (u8, u8, u8) = (139, 161, 184); // #8BA1B8
+pub const LIGHT_SELECTION_RGB: (u8, u8, u8) = (207, 224, 247); // #CFE0F7
+pub const GRAYSCALE_SURFACE_RGB: (u8, u8, u8) = (10, 10, 10); // #0A0A0A
+pub const GRAYSCALE_PANEL_RGB: (u8, u8, u8) = (18, 18, 18); // #121212
+pub const GRAYSCALE_ELEVATED_RGB: (u8, u8, u8) = (31, 31, 31); // #1F1F1F
+pub const GRAYSCALE_REASONING_RGB: (u8, u8, u8) = (38, 38, 38); // #262626
+pub const GRAYSCALE_SUCCESS_RGB: (u8, u8, u8) = (34, 34, 34); // #222222
+pub const GRAYSCALE_ERROR_RGB: (u8, u8, u8) = (42, 42, 42); // #2A2A2A
+pub const GRAYSCALE_TEXT_BODY_RGB: (u8, u8, u8) = (236, 236, 236); // #ECECEC
+pub const GRAYSCALE_TEXT_MUTED_RGB: (u8, u8, u8) = (180, 180, 180); // #B4B4B4
+pub const GRAYSCALE_TEXT_HINT_RGB: (u8, u8, u8) = (138, 138, 138); // #8A8A8A
+pub const GRAYSCALE_TEXT_SOFT_RGB: (u8, u8, u8) = (220, 220, 220); // #DCDCDC
+pub const GRAYSCALE_BORDER_RGB: (u8, u8, u8) = (96, 96, 96); // #606060
+pub const GRAYSCALE_SELECTION_RGB: (u8, u8, u8) = (62, 62, 62); // #3E3E3E

 // New semantic colors
 pub const BORDER_COLOR_RGB: (u8, u8, u8) = (42, 74, 127); // #2A4A7F
@@ -106,6 +118,66 @@ pub const LIGHT_SELECTION_BG: Color = Color::Rgb(
    LIGHT_SELECTION_RGB.1,
    LIGHT_SELECTION_RGB.2,
 );
+pub const GRAYSCALE_SURFACE: Color = Color::Rgb(
+    GRAYSCALE_SURFACE_RGB.0,
+    GRAYSCALE_SURFACE_RGB.1,
+    GRAYSCALE_SURFACE_RGB.2,
+);
+pub const GRAYSCALE_PANEL: Color = Color::Rgb(
+    GRAYSCALE_PANEL_RGB.0,
+    GRAYSCALE_PANEL_RGB.1,
+    GRAYSCALE_PANEL_RGB.2,
+);
+pub const GRAYSCALE_ELEVATED: Color = Color::Rgb(
+    GRAYSCALE_ELEVATED_RGB.0,
+    GRAYSCALE_ELEVATED_RGB.1,
+    GRAYSCALE_ELEVATED_RGB.2,
+);
+pub const GRAYSCALE_REASONING: Color = Color::Rgb(
+    GRAYSCALE_REASONING_RGB.0,
+    GRAYSCALE_REASONING_RGB.1,
+    GRAYSCALE_REASONING_RGB.2,
+);
+pub const GRAYSCALE_SUCCESS: Color = Color::Rgb(
+    GRAYSCALE_SUCCESS_RGB.0,
+    GRAYSCALE_SUCCESS_RGB.1,
+    GRAYSCALE_SUCCESS_RGB.2,
+);
+pub const GRAYSCALE_ERROR: Color = Color::Rgb(
+    GRAYSCALE_ERROR_RGB.0,
+    GRAYSCALE_ERROR_RGB.1,
+    GRAYSCALE_ERROR_RGB.2,
+);
+pub const GRAYSCALE_TEXT_BODY: Color = Color::Rgb(
+    GRAYSCALE_TEXT_BODY_RGB.0,
+    GRAYSCALE_TEXT_BODY_RGB.1,
+    GRAYSCALE_TEXT_BODY_RGB.2,
+);
+pub const GRAYSCALE_TEXT_MUTED: Color = Color::Rgb(
+    GRAYSCALE_TEXT_MUTED_RGB.0,
+    GRAYSCALE_TEXT_MUTED_RGB.1,
+    GRAYSCALE_TEXT_MUTED_RGB.2,
+);
+pub const GRAYSCALE_TEXT_HINT: Color = Color::Rgb(
+    GRAYSCALE_TEXT_HINT_RGB.0,
+    GRAYSCALE_TEXT_HINT_RGB.1,
+    GRAYSCALE_TEXT_HINT_RGB.2,
+);
+pub const GRAYSCALE_TEXT_SOFT: Color = Color::Rgb(
+    GRAYSCALE_TEXT_SOFT_RGB.0,
+    GRAYSCALE_TEXT_SOFT_RGB.1,
+    GRAYSCALE_TEXT_SOFT_RGB.2,
+);
+pub const GRAYSCALE_BORDER: Color = Color::Rgb(
+    GRAYSCALE_BORDER_RGB.0,
+    GRAYSCALE_BORDER_RGB.1,
+    GRAYSCALE_BORDER_RGB.2,
+);
+pub const GRAYSCALE_SELECTION_BG: Color = Color::Rgb(
+    GRAYSCALE_SELECTION_RGB.0,
+    GRAYSCALE_SELECTION_RGB.1,
+    GRAYSCALE_SELECTION_RGB.2,
+);

 pub const TEXT_BODY: Color = Color::Rgb(226, 232, 240); // #E2E8F0
 pub const TEXT_SECONDARY: Color = Color::Rgb(177, 190, 207); // #B1BECF
@@ -177,6 +249,7 @@ pub const COMPOSER_BG: Color = DEEPSEEK_SLATE;
 pub enum PaletteMode {
    Dark,
    Light,
+    Grayscale,
 }

 impl PaletteMode {
@@ -278,12 +351,37 @@ pub const LIGHT_UI_THEME: UiTheme = UiTheme {
    border: LIGHT_BORDER,
 };

+pub const GRAYSCALE_UI_THEME: UiTheme = UiTheme {
+    name: "grayscale",
+    mode: PaletteMode::Grayscale,
+    surface_bg: GRAYSCALE_SURFACE,
+    panel_bg: GRAYSCALE_PANEL,
+    elevated_bg: GRAYSCALE_ELEVATED,
+    composer_bg: GRAYSCALE_PANEL,
+    selection_bg: GRAYSCALE_SELECTION_BG,
+    header_bg: GRAYSCALE_SURFACE,
+    footer_bg: GRAYSCALE_SURFACE,
+    mode_agent: GRAYSCALE_TEXT_SOFT,
+    mode_yolo: GRAYSCALE_TEXT_BODY,
+    mode_plan: GRAYSCALE_TEXT_MUTED,
+    status_ready: GRAYSCALE_TEXT_MUTED,
+    status_working: GRAYSCALE_TEXT_SOFT,
+    status_warning: GRAYSCALE_TEXT_BODY,
+    text_dim: GRAYSCALE_TEXT_HINT,
+    text_hint: GRAYSCALE_TEXT_HINT,
+    text_muted: GRAYSCALE_TEXT_MUTED,
+    text_body: GRAYSCALE_TEXT_BODY,
+    text_soft: GRAYSCALE_TEXT_SOFT,
+    border: GRAYSCALE_BORDER,
+};
+
 impl UiTheme {
    #[must_use]
    pub fn for_mode(mode: PaletteMode) -> Self {
        match mode {
            PaletteMode::Dark => UI_THEME,
            PaletteMode::Light => LIGHT_UI_THEME,
+            PaletteMode::Grayscale => GRAYSCALE_UI_THEME,
        }
    }

@@ -292,6 +390,17 @@ impl UiTheme {
        Self::for_mode(PaletteMode::detect())
    }

+    #[must_use]
+    pub fn from_setting(value: &str) -> Option<Self> {
+        match normalize_theme_name(value)? {
+            "system" => Some(Self::detect()),
+            "dark" => Some(Self::for_mode(PaletteMode::Dark)),
+            "light" => Some(Self::for_mode(PaletteMode::Light)),
+            "grayscale" => Some(Self::for_mode(PaletteMode::Grayscale)),
+            _ => None,
+        }
+    }
+
    #[must_use]
    pub fn with_background_color(mut self, color: Color) -> Self {
        self.surface_bg = color;
@@ -301,6 +410,36 @@ impl UiTheme {
    }
 }

+#[must_use]
+pub fn normalize_theme_name(value: &str) -> Option<&'static str> {
+    match value.trim().to_ascii_lowercase().as_str() {
+        "" | "auto" | "system" | "default" => Some("system"),
+        "dark" | "whale" | "whale-dark" => Some("dark"),
+        "light" | "whale-light" => Some("light"),
+        "grayscale" | "greyscale" | "gray" | "grey" | "mono" | "monochrome" | "black-white"
+        | "black_and_white" | "blackwhite" | "bw" | "b&w" => Some("grayscale"),
+        _ => None,
+    }
+}
+
+#[must_use]
+pub fn theme_label_for_mode(mode: PaletteMode) -> &'static str {
+    match mode {
+        PaletteMode::Dark => "dark",
+        PaletteMode::Light => "light",
+        PaletteMode::Grayscale => "grayscale",
+    }
+}
+
+#[must_use]
+pub fn ui_theme_from_settings(theme: &str, background_color: Option<&str>) -> UiTheme {
+    let mut ui_theme = UiTheme::from_setting(theme).unwrap_or_else(UiTheme::detect);
+    if let Some(background) = background_color.and_then(parse_hex_rgb_color) {
+        ui_theme = ui_theme.with_background_color(background);
+    }
+    ui_theme
+}
+
 #[must_use]
 pub fn parse_hex_rgb_color(value: &str) -> Option<Color> {
    let hex = value.trim().strip_prefix('#').unwrap_or(value.trim());
@@ -329,10 +468,23 @@ pub fn hex_rgb_string(color: Color) -> Option<String> {

 #[must_use]
 pub fn adapt_fg_for_palette_mode(color: Color, _bg: Color, mode: PaletteMode) -> Color {
-    if mode == PaletteMode::Dark {
-        return color;
+    match mode {
+        PaletteMode::Dark => color,
+        PaletteMode::Light => adapt_fg_for_light_palette(color),
+        PaletteMode::Grayscale => adapt_fg_for_grayscale_palette(color),
    }
+}

+#[must_use]
+pub fn adapt_bg_for_palette_mode(color: Color, mode: PaletteMode) -> Color {
+    match mode {
+        PaletteMode::Dark => color,
+        PaletteMode::Light => adapt_bg_for_light_palette(color),
+        PaletteMode::Grayscale => adapt_bg_for_grayscale_palette(color),
+    }
+}
+
+fn adapt_fg_for_light_palette(color: Color) -> Color {
    if color == TEXT_BODY || color == SELECTION_TEXT || color == Color::White {
        LIGHT_TEXT_BODY
    } else if color == TEXT_SECONDARY || color == TEXT_MUTED {
@@ -358,12 +510,7 @@ pub fn adapt_fg_for_palette_mode(color: Color, _bg: Color, mode: PaletteMode) ->
    }
 }

-#[must_use]
-pub fn adapt_bg_for_palette_mode(color: Color, mode: PaletteMode) -> Color {
-    if mode == PaletteMode::Dark {
-        return color;
-    }
-
+fn adapt_bg_for_light_palette(color: Color) -> Color {
    if color == DEEPSEEK_INK || color == BACKGROUND_DARK {
        LIGHT_SURFACE
    } else if color == DEEPSEEK_SLATE
@@ -394,6 +541,150 @@ pub fn adapt_bg_for_palette_mode(color: Color, mode: PaletteMode) -> Color {
    }
 }

+fn adapt_fg_for_grayscale_palette(color: Color) -> Color {
+    if color == Color::Reset {
+        return color;
+    }
+    if color == TEXT_BODY
+        || color == SELECTION_TEXT
+        || color == LIGHT_TEXT_BODY
+        || color == Color::White
+        || color == DEEPSEEK_RED
+        || color == STATUS_ERROR
+        || color == MODE_YOLO
+    {
+        GRAYSCALE_TEXT_BODY
+    } else if color == TEXT_SOFT
+        || color == TEXT_TOOL_OUTPUT
+        || color == LIGHT_TEXT_SOFT
+        || color == TEXT_ACCENT
+        || color == DEEPSEEK_SKY
+        || color == DEEPSEEK_BLUE
+        || color == ACCENT_TOOL_LIVE
+        || color == STATUS_SUCCESS
+        || color == STATUS_INFO
+        || color == MODE_AGENT
+    {
+        GRAYSCALE_TEXT_SOFT
+    } else if color == TEXT_SECONDARY
+        || color == TEXT_MUTED
+        || color == LIGHT_TEXT_MUTED
+        || color == TEXT_REASONING
+        || color == ACCENT_REASONING_LIVE
+        || color == STATUS_WARNING
+        || color == MODE_PLAN
+        || color == USER_BODY
+        || color == LIGHT_USER_BODY
+        || color == DIFF_ADDED
+    {
+        GRAYSCALE_TEXT_MUTED
+    } else if color == TEXT_HINT
+        || color == TEXT_DIM
+        || color == LIGHT_TEXT_HINT
+        || color == BORDER_COLOR
+        || color == LIGHT_BORDER
+        || color == ACCENT_TOOL_ISSUE
+    {
+        GRAYSCALE_TEXT_HINT
+    } else {
+        match color {
+            Color::Black => GRAYSCALE_TEXT_BODY,
+            Color::Gray | Color::DarkGray => GRAYSCALE_TEXT_HINT,
+            Color::Red
+            | Color::LightRed
+            | Color::Green
+            | Color::LightGreen
+            | Color::Yellow
+            | Color::LightYellow
+            | Color::Blue
+            | Color::LightBlue
+            | Color::Magenta
+            | Color::LightMagenta
+            | Color::Cyan
+            | Color::LightCyan => GRAYSCALE_TEXT_SOFT,
+            Color::Rgb(r, g, b) => grayscale_fg_from_luma(luma(r, g, b)),
+            Color::Indexed(_) => color,
+            _ => color,
+        }
+    }
+}
+
+fn adapt_bg_for_grayscale_palette(color: Color) -> Color {
+    if color == Color::Reset {
+        return color;
+    }
+    if color == DEEPSEEK_INK || color == BACKGROUND_DARK || color == LIGHT_SURFACE {
+        GRAYSCALE_SURFACE
+    } else if color == DEEPSEEK_SLATE
+        || color == COMPOSER_BG
+        || color == SURFACE_PANEL
+        || color == SURFACE_TOOL
+        || color == LIGHT_PANEL
+    {
+        GRAYSCALE_PANEL
+    } else if color == SURFACE_ELEVATED
+        || color == SURFACE_TOOL_ACTIVE
+        || color == LIGHT_ELEVATED
+        || color == SELECTION_BG
+        || color == LIGHT_SELECTION_BG
+    {
+        GRAYSCALE_ELEVATED
+    } else if color == SURFACE_REASONING
+        || color == SURFACE_REASONING_TINT
+        || color == SURFACE_REASONING_ACTIVE
+        || color == LIGHT_REASONING
+    {
+        GRAYSCALE_REASONING
+    } else if color == SURFACE_SUCCESS || color == DIFF_ADDED_BG || color == LIGHT_SUCCESS {
+        GRAYSCALE_SUCCESS
+    } else if color == SURFACE_ERROR || color == DIFF_DELETED_BG || color == LIGHT_ERROR {
+        GRAYSCALE_ERROR
+    } else {
+        match color {
+            Color::Black => GRAYSCALE_SURFACE,
+            Color::White | Color::Gray => GRAYSCALE_ELEVATED,
+            Color::DarkGray => GRAYSCALE_PANEL,
+            Color::Red
+            | Color::LightRed
+            | Color::Green
+            | Color::LightGreen
+            | Color::Yellow
+            | Color::LightYellow
+            | Color::Blue
+            | Color::LightBlue
+            | Color::Magenta
+            | Color::LightMagenta
+            | Color::Cyan
+            | Color::LightCyan => GRAYSCALE_ELEVATED,
+            Color::Rgb(r, g, b) => grayscale_bg_from_luma(luma(r, g, b)),
+            Color::Indexed(_) => color,
+            _ => color,
+        }
+    }
+}
+
+fn grayscale_fg_from_luma(luma: u8) -> Color {
+    match luma {
+        0..=95 => GRAYSCALE_TEXT_HINT,
+        96..=155 => GRAYSCALE_TEXT_MUTED,
+        156..=215 => GRAYSCALE_TEXT_SOFT,
+        _ => GRAYSCALE_TEXT_BODY,
+    }
+}
+
+fn grayscale_bg_from_luma(luma: u8) -> Color {
+    match luma {
+        0..=28 => GRAYSCALE_SURFACE,
+        29..=95 => GRAYSCALE_PANEL,
+        96..=185 => GRAYSCALE_ELEVATED,
+        _ => GRAYSCALE_REASONING,
+    }
+}
+
+fn luma(r: u8, g: u8, b: u8) -> u8 {
+    (((u16::from(r) * 299) + (u16::from(g) * 587) + (u16::from(b) * 114)) / 1000) as u8
+}
+
 // === Color depth + brightness helpers (v0.6.6 UI redesign) ===

 /// Terminal color depth, used to gate truecolor surfaces (e.g. reasoning bg
@@ -652,12 +943,15 @@ fn rgb_to_ansi256(r: u8, g: u8, b: u8) -> u8 {
 mod tests {
    use super::{
        ACCENT_REASONING_LIVE, ColorDepth, DEEPSEEK_INK, DEEPSEEK_RED, DEEPSEEK_SKY,
-        DEEPSEEK_SLATE, LIGHT_PANEL, LIGHT_REASONING, LIGHT_SURFACE, LIGHT_TEXT_BODY,
-        LIGHT_TEXT_HINT, LIGHT_UI_THEME, PaletteMode, SURFACE_REASONING, SURFACE_REASONING_TINT,
-        TEXT_BODY, TEXT_HINT, TEXT_REASONING, TEXT_TOOL_OUTPUT, UI_THEME, adapt_bg,
-        adapt_bg_for_palette_mode, adapt_color, adapt_fg_for_palette_mode, blend, nearest_ansi16,
-        normalize_hex_rgb_color, parse_hex_rgb_color, pulse_brightness, reasoning_surface_tint,
-        rgb_to_ansi256,
+        DEEPSEEK_SLATE, GRAYSCALE_BORDER, GRAYSCALE_ELEVATED, GRAYSCALE_PANEL, GRAYSCALE_REASONING,
+        GRAYSCALE_SURFACE, GRAYSCALE_TEXT_BODY, GRAYSCALE_TEXT_HINT, GRAYSCALE_TEXT_SOFT,
+        GRAYSCALE_UI_THEME, LIGHT_BORDER, LIGHT_ELEVATED, LIGHT_PANEL, LIGHT_REASONING,
+        LIGHT_SURFACE, LIGHT_TEXT_BODY, LIGHT_TEXT_HINT, LIGHT_UI_THEME, PaletteMode,
+        SURFACE_REASONING, SURFACE_REASONING_TINT, TEXT_BODY, TEXT_HINT, TEXT_REASONING,
+        TEXT_TOOL_OUTPUT, UI_THEME, adapt_bg, adapt_bg_for_palette_mode, adapt_color,
+        adapt_fg_for_palette_mode, blend, nearest_ansi16, normalize_hex_rgb_color,
+        normalize_theme_name, parse_hex_rgb_color, pulse_brightness, reasoning_surface_tint,
+        rgb_to_ansi256, theme_label_for_mode, ui_theme_from_settings,
    };
    use ratatui::style::Color;

@@ -683,6 +977,36 @@ mod tests {
        assert_eq!(theme.text_body, LIGHT_TEXT_BODY);
    }

+    #[test]
+    fn ui_theme_selects_grayscale_variant() {
+        let theme = super::UiTheme::for_mode(PaletteMode::Grayscale);
+        assert_eq!(theme, GRAYSCALE_UI_THEME);
+        assert_eq!(theme.surface_bg, GRAYSCALE_SURFACE);
+        assert_eq!(theme.panel_bg, GRAYSCALE_PANEL);
+        assert_eq!(theme.text_body, GRAYSCALE_TEXT_BODY);
+    }
+
+    #[test]
+    fn theme_names_normalize_common_grayscale_aliases() {
+        assert_eq!(normalize_theme_name("system"), Some("system"));
+        assert_eq!(normalize_theme_name("default"), Some("system"));
+        assert_eq!(normalize_theme_name("whale"), Some("dark"));
+        assert_eq!(normalize_theme_name("black-white"), Some("grayscale"));
+        assert_eq!(normalize_theme_name("mono"), Some("grayscale"));
+        assert_eq!(normalize_theme_name("solarized"), None);
+        assert_eq!(theme_label_for_mode(PaletteMode::Grayscale), "grayscale");
+    }
+
+    #[test]
+    fn light_palette_has_quiet_layer_separation() {
+        assert_eq!(LIGHT_SURFACE, Color::Rgb(246, 248, 251));
+        assert_eq!(LIGHT_PANEL, Color::Rgb(236, 242, 248));
+        assert_eq!(LIGHT_ELEVATED, Color::Rgb(219, 229, 240));
+        assert_eq!(LIGHT_BORDER, Color::Rgb(139, 161, 184));
+        assert_ne!(LIGHT_SURFACE, LIGHT_PANEL);
+        assert_ne!(LIGHT_PANEL, LIGHT_ELEVATED);
+    }
+
    #[test]
    fn dark_palette_uses_soft_body_text_and_warm_reasoning() {
        assert_eq!(TEXT_BODY, Color::Rgb(226, 232, 240));
@@ -738,6 +1062,46 @@ mod tests {
        );
    }

+    #[test]
+    fn grayscale_palette_maps_brand_hues_to_neutral_roles() {
+        assert_eq!(
+            adapt_bg_for_palette_mode(DEEPSEEK_INK, PaletteMode::Grayscale),
+            GRAYSCALE_SURFACE
+        );
+        assert_eq!(
+            adapt_bg_for_palette_mode(DEEPSEEK_SLATE, PaletteMode::Grayscale),
+            GRAYSCALE_PANEL
+        );
+        assert_eq!(
+            adapt_bg_for_palette_mode(SURFACE_REASONING, PaletteMode::Grayscale),
+            GRAYSCALE_REASONING
+        );
+        assert_eq!(
+            adapt_fg_for_palette_mode(DEEPSEEK_SKY, GRAYSCALE_SURFACE, PaletteMode::Grayscale),
+            GRAYSCALE_TEXT_SOFT
+        );
+        assert_eq!(
+            adapt_fg_for_palette_mode(DEEPSEEK_RED, GRAYSCALE_SURFACE, PaletteMode::Grayscale),
+            GRAYSCALE_TEXT_BODY
+        );
+        assert_eq!(
+            adapt_fg_for_palette_mode(TEXT_HINT, GRAYSCALE_SURFACE, PaletteMode::Grayscale),
+            GRAYSCALE_TEXT_HINT
+        );
+    }
+
+    #[test]
+    fn ui_theme_from_settings_applies_theme_and_background() {
+        let theme = ui_theme_from_settings("grayscale", Some("#111111"));
+        assert_eq!(theme.mode, PaletteMode::Grayscale);
+        assert_eq!(theme.surface_bg, Color::Rgb(17, 17, 17));
+        assert_eq!(theme.header_bg, Color::Rgb(17, 17, 17));
+        assert_eq!(theme.footer_bg, Color::Rgb(17, 17, 17));
+        assert_eq!(theme.panel_bg, GRAYSCALE_PANEL);
+        assert_eq!(theme.elevated_bg, GRAYSCALE_ELEVATED);
+        assert_eq!(theme.border, GRAYSCALE_BORDER);
+    }
+
    #[test]
    fn adapt_color_passes_through_truecolor() {
        let c = Color::Rgb(53, 120, 229);
@@ -30,7 +30,7 @@ pub struct PromptSessionContext<'a> {
    pub translation_enabled: bool,
 }

-/// Conventional location for the structured session-handoff artifact (#32).
+/// Conventional location for the structured session relay artifact (#32).
 /// A previous session writes it on exit / `/compact`; the next session reads
 /// it back on startup and prepends it to the system prompt so a fresh agent
 /// doesn't have to re-discover open blockers from scratch.
@@ -157,7 +157,7 @@ fn render_instructions_block(paths: &[PathBuf]) -> Option<String> {
    }
 }

-/// Read the workspace-local handoff artifact, if present, and format it as a
+/// Read the workspace-local relay artifact, if present, and format it as a
 /// system-prompt block. Returns `None` when the file is absent or empty so
 /// callers can keep the default-uncluttered prompt for fresh workspaces.
 fn load_handoff_block(workspace: &Path) -> Option<String> {
@@ -168,7 +168,7 @@ fn load_handoff_block(workspace: &Path) -> Option<String> {
        return None;
    }
    Some(format!(
-        "## Previous Session Handoff\n\nThe previous session in this workspace left a handoff at `{}`. Consider it the first artifact to read on this turn — open blockers, in-flight changes, and recent decisions live there. Update or rewrite it before exiting if state changes materially.\n\n{}",
+        "## Previous Session Relay\n\nThe previous session in this workspace left a relay artifact at `{}`. Consider it the first artifact to read on this turn — open blockers, in-flight changes, and recent decisions live there. Update or rewrite it before exiting if state changes materially.\n\n{}",
        HANDOFF_RELATIVE_PATH, trimmed
    ))
 }
@@ -354,7 +354,7 @@ pub const AUTO_APPROVAL: &str = include_str!("prompts/approvals/auto.md");
 pub const SUGGEST_APPROVAL: &str = include_str!("prompts/approvals/suggest.md");
 pub const NEVER_APPROVAL: &str = include_str!("prompts/approvals/never.md");

-/// Compaction handoff template — written into the system prompt so the
+/// Compaction relay template — written into the system prompt so the
 /// model knows the format to use when writing `.deepseek/handoff.md`.
 pub const COMPACT_TEMPLATE: &str = include_str!("prompts/compact.md");

@@ -514,11 +514,11 @@ pub fn system_prompt_for_mode_with_context(
 ///   2. project context / fallback (workspace-static)
 ///   3. skills block (skills-dir-static)
 ///   4. `## Context Management` (compile-time constant, Agent/Yolo only)
-///   5. compaction handoff template (compile-time constant)
-///   6. handoff block — file-backed; rewritten by `/compact` and on exit
+///   5. compaction relay template (compile-time constant)
+///   6. relay block — file-backed; rewritten by `/compact` and on exit
 ///
 /// Anything appended after a volatile block forfeits the cache for the rest
-/// of the request. New blocks belong above the handoff boundary unless they
+/// of the request. New blocks belong above the relay boundary unless they
 /// themselves are turn-volatile. Working-set metadata is now injected into the
 /// latest user message as per-turn metadata instead of this system prompt.
 pub fn system_prompt_for_mode_with_context_and_skills(
@@ -668,7 +668,7 @@ pub fn system_prompt_for_mode_with_context_skills_session_and_approval(
        );
    }

-    // 5. Compaction handoff template — so the model knows the format to use
+    // 5. Compaction relay template — so the model knows the format to use
    //    when writing `.deepseek/handoff.md` on exit / `/compact`.
    full_prompt.push_str("\n\n");
    full_prompt.push_str(COMPACT_TEMPLATE);
@@ -694,7 +694,7 @@ pub fn system_prompt_for_mode_with_context_skills_session_and_approval(
    // 6b. User memory block (#489). Placed below the volatile boundary
    // because memory entries are editable mid-session via `/memory` or
    // `# foo` quick-add. When they change, they only invalidate the
-    // trailing handoff block — the static prefix above stays cached.
+    // trailing relay block — the static prefix above stays cached.
    if let Some(memory_block) = session_context.user_memory_block
        && !memory_block.trim().is_empty()
    {
@@ -713,7 +713,7 @@ pub fn system_prompt_for_mode_with_context_skills_session_and_approval(
        );
    }

-    // 7. Previous-session handoff (file-backed, rewritten by `/compact`).
+    // 7. Previous-session relay (file-backed, rewritten by `/compact`).
    if let Some(handoff_block) = load_handoff_block(workspace) {
        full_prompt = format!("{full_prompt}\n\n{handoff_block}");
    }
@@ -775,9 +775,9 @@ mod tests {
    use super::*;
    use tempfile::tempdir;

-    /// Discriminator unique to the injected handoff block (not present in the
+    /// Discriminator unique to the injected relay block (not present in the
    /// agent prompt's own discussion of the convention).
-    const HANDOFF_BLOCK_MARKER: &str = "left a handoff at `.deepseek/handoff.md`";
+    const HANDOFF_BLOCK_MARKER: &str = "left a relay artifact at `.deepseek/handoff.md`";

    #[test]
    fn render_environment_block_lists_supplied_locale_and_workspace() {
@@ -1120,7 +1120,7 @@ mod tests {
        assert!(prompt.contains("<project_context_pack>"));
        assert!(
            prompt.find("<project_context_pack>").expect("pack")
-                < prompt.find("## Previous Session Handoff").expect("handoff")
+                < prompt.find("## Previous Session Relay").expect("relay")
        );
    }

@@ -1132,7 +1132,7 @@ mod tests {
        std::fs::create_dir_all(&handoff_dir).unwrap();
        std::fs::write(
            handoff_dir.join("handoff.md"),
-            "# Session handoff — prior\n\n## Active task\nFinish #32.\n\n## Open blockers\n- [ ] write the basic version\n",
+            "# Session relay — prior\n\n## Active task\nFinish #32.\n\n## Open blockers\n- [ ] write the basic version\n",
        )
        .unwrap();

@@ -1278,7 +1278,7 @@ mod tests {
            SystemPrompt::Text(text) => text,
            SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
        };
-        assert!(prompt.contains("## Compaction Handoff"));
+        assert!(prompt.contains("## Compaction Relay"));
        // #429: structured Markdown template. Goal/Constraints/Progress
        // (Done/InProgress/Blocked)/Key Decisions/Next step.
        assert!(prompt.contains("### Goal"));
@@ -1313,7 +1313,7 @@ mod tests {
        };

        let goal_pos = prompt.find("<session_goal>").expect("goal block");
-        let compact_pos = prompt.find("## Compaction Handoff").expect("compact block");
+        let compact_pos = prompt.find("## Compaction Relay").expect("compact block");

        assert!(prompt.contains("Fix transcript corruption"));
        // Session goal is volatile content — it lives below the
@@ -1353,7 +1353,7 @@ mod tests {
    fn tool_selection_guide_avoids_defensive_tool_suppression() {
        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
        assert!(prompt.contains("Tool Selection Guide"));
-        assert!(prompt.contains("Use `agent_result`"));
+        assert!(prompt.contains("Use `agent_eval`"));
        assert!(
            !prompt.contains("When NOT to use certain tools"),
            "the system prompt should steer tool choice without training the model to avoid available tools"
@@ -1432,6 +1432,62 @@ mod tests {
        );
    }

+    #[test]
+    fn workspace_orientation_guidance_present() {
+        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
+        assert!(prompt.contains("Workspace Orientation"));
+        assert!(prompt.contains("canonical project root"));
+        assert!(prompt.contains("AGENTS.md"));
+        assert!(prompt.contains("explore` / `explorer"));
+    }
+
+    #[test]
+    fn prompt_uses_persistent_agent_and_rlm_surface() {
+        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
+        for tool in [
+            "agent_open",
+            "agent_eval",
+            "agent_close",
+            "rlm_open",
+            "rlm_eval",
+            "rlm_configure",
+            "rlm_close",
+            "handle_read",
+        ] {
+            assert!(
+                prompt.contains(tool),
+                "prompt should mention new persistent tool `{tool}`"
+            );
+        }
+        for retired in [
+            "agent_spawn",
+            "agent_wait",
+            "agent_result",
+            "agent_send_input",
+            "agent_assign",
+            "agent_resume",
+            "agent_list",
+            "spawn_agent",
+            "delegate_to_agent",
+            "send_input",
+            "close_agent",
+        ] {
+            assert!(
+                !prompt.contains(retired),
+                "prompt should not advertise retired sub-agent tool `{retired}`"
+            );
+        }
+    }
+
+    #[test]
+    fn prompt_documents_fork_context_prefix_cache_contract() {
+        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
+        assert!(prompt.contains("fork_context: true"));
+        assert!(prompt.contains("byte-identical"));
+        assert!(prompt.contains("DeepSeek prefix-cache reuse"));
+        assert!(prompt.contains("Fresh sessions are the default"));
+    }
+
    #[test]
    fn subagent_done_sentinel_section_present() {
        let prompt = compose_prompt(AppMode::Agent, Personality::Calm);
@@ -1541,7 +1597,7 @@ mod tests {
    #[test]
    fn system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged() {
        // If `.deepseek/handoff.md` hasn't moved between two builds, the
-        // rendered prompt must produce identical bytes. The handoff block
+        // rendered prompt must produce identical bytes. The relay block
        // lands below the static boundary in
        // `system_prompt_for_mode_with_context_and_skills`.
        let tmp = tempdir().expect("tempdir");
@@ -1550,7 +1606,7 @@ mod tests {
        std::fs::create_dir_all(&handoff_dir).unwrap();
        std::fs::write(
            handoff_dir.join("handoff.md"),
-            "# Session handoff\n\n## Active task\nFinish #280.\n\n## Open blockers\n- [ ] none\n",
+            "# Session relay\n\n## Active task\nFinish #280.\n\n## Open blockers\n- [ ] none\n",
        )
        .unwrap();

@@ -1567,15 +1623,15 @@ mod tests {
            &a,
            &b,
        );
-        assert!(a.contains(HANDOFF_BLOCK_MARKER), "handoff must be embedded");
-        assert!(a.contains("Finish #280."), "handoff body must be present");
+        assert!(a.contains(HANDOFF_BLOCK_MARKER), "relay must be embedded");
+        assert!(a.contains("Finish #280."), "relay body must be present");
    }

    #[test]
    fn handoff_appears_after_static_blocks_without_working_set() {
-        // Cache-prefix invariant: the handoff block must come after static
-        // `## Context Management` and the compaction handoff template
-        // (`## Compaction Handoff`). Working-set metadata is per-turn user
+        // Cache-prefix invariant: the relay block must come after static
+        // `## Context Management` and the compaction relay template
+        // (`## Compaction Relay`). Working-set metadata is per-turn user
        // metadata now, not a system-prompt tail block.
        let tmp = tempdir().expect("tempdir");
        let workspace = tmp.path();
@@ -1594,11 +1650,11 @@ mod tests {
            .find("## Context Management")
            .expect("Context Management section present in Agent mode");
        let compact_pos = prompt
-            .find("## Compaction Handoff")
-            .expect("compaction handoff template present");
+            .find("## Compaction Relay")
+            .expect("compaction relay template present");
        let handoff_pos = prompt
            .find(HANDOFF_BLOCK_MARKER)
-            .expect("handoff block present when fixture file exists");
+            .expect("relay block present when fixture file exists");
        assert!(
            !prompt.contains("## Repo Working Set"),
            "working-set summary must stay out of the system prompt"
@@ -1606,11 +1662,11 @@ mod tests {

        assert!(
            context_pos < handoff_pos,
-            "## Context Management must precede the handoff block"
+            "## Context Management must precede the relay block"
        );
        assert!(
            compact_pos < handoff_pos,
-            "## Compaction Handoff must precede the handoff block"
+            "## Compaction Relay must precede the relay block"
        );
    }

@@ -1,7 +1,7 @@
 ## Mode: agent

-Read-only tools (reads, searches, `rlm`, agent status queries, git inspection) run silently.
-Any write, patch, shell execution, sub-agent spawn, or CSV batch operation will ask for approval first.
+Read-only tools (reads, searches, persistent RLM session tools, agent status queries, git inspection) run silently.
+Any write, patch, shell execution, sub-agent session open, or CSV batch operation will ask for approval first.

 Before requesting approval for writes, lay out your work with `checklist_write` so the user can see what
 you intend to do and approve with context. Complex changes should also get an `update_plan` first.
@@ -9,7 +9,7 @@ Decomposition builds trust — a clear plan gets faster approvals.

 ## Sub-agent completion sentinel

-When you spawn a sub-agent via `agent_spawn`, the child runs independently.
+When you open a sub-agent via `agent_open`, the child runs independently.
 You will receive a `<deepseek:subagent.done>` element in the transcript when it finishes.
 Read its `summary` field and integrate the work — do not re-do what the child already did.
-You can also call `agent_result` to pull the full structured result.
+You can also call `agent_eval` with the agent name or id to pull the current structured projection or transcript handle.
@@ -40,19 +40,27 @@ Use three decomposition patterns, selected by task scope:

 **PREVIEW** — Before diving into a large task, survey the terrain. Scan directory structure (`list_dir`), file headers, module trees. Identify problem boundaries and estimate complexity. A 30-second preview prevents hours of wrong-path exploration.

-**CHUNK + map-reduce** — When a task exceeds single-pass capacity: split into independent sub-tasks, process each independently (parallel where possible via parallel tool calls or `agent_spawn`), then synthesize findings into a coherent whole. Track chunks with `checklist_write`.
+**CHUNK + map-reduce** — When a task exceeds single-pass capacity: split into independent sub-tasks, process each independently (parallel where possible via parallel tool calls or persistent sub-agent sessions), then synthesize findings into a coherent whole. Track chunks with `checklist_write`.

-**RECURSIVE** — When sub-tasks reveal sub-problems: decompose recursively until each leaf is tractable. Maintain the task tree via `update_plan` (strategy) layered above `checklist_write` (leaf tasks). Propagate findings upward when sub-problems resolve.
+**RECURSIVE** — When sub-tasks reveal sub-problems: decompose recursively until each leaf is tractable. Keep the active leaves in `checklist_write`; use `update_plan` only when a genuinely complex initiative needs durable high-level strategy metadata. Propagate findings upward when sub-problems resolve.

 Your default workflow for any non-trivial request:
 1. **`checklist_write`** — break the work into concrete, verifiable steps. Mark the first one `in_progress`. This populates the sidebar so the user can see what you're doing.
 2. **Execute** — work through each checklist item, updating status as you go.
-3. **For complex initiatives**, layer `update_plan` (high-level strategy) above `checklist_write` (granular steps).
-4. **For parallel work**, spawn sub-agents (`agent_spawn`) — each does one thing well. Link them to plan/todo items in your thinking. Batch independent tool calls in a single turn.
-5. **Only when an input genuinely doesn't fit your context window** — a whole file > ~50K tokens, a long transcript, a multi-document corpus — use `rlm`. It loads the input into a Python REPL where a sub-agent processes it. For shorter inputs, use `read_file` and reason directly.
+3. **For complex initiatives only**, add `update_plan` as high-level strategy. Do not mirror the checklist into a second tracker.
+4. **For parallel work**, open sub-agent sessions with `agent_open` — each does one thing well. Use `agent_eval` for follow-ups or completion state, and `agent_close` when a session should be cancelled or released. Link them to Work/checklist items in your thinking. Batch independent tool calls in a single turn.
+5. **Only when an input genuinely doesn't fit your context window** — a whole file > ~50K tokens, a long transcript, a multi-document corpus — use persistent RLM sessions: `rlm_open` loads the input into a named Python REPL, `rlm_eval` runs bounded analysis, `handle_read` reads returned `var_handle`s, `rlm_configure` adjusts feedback/depth, and `rlm_close` releases the session. For shorter inputs, use `read_file` and reason directly.
 6. **For persistent cross-session memory**, use `note` sparingly for important decisions, open blockers, and architectural context.

-**Key principle**: make your work visible. The sidebar shows Plan / Todos / Tasks / Agents. When these panels are empty, the user has no idea what you're doing. Keep them populated.
+**Key principle**: make your work visible in one place. The sidebar shows Work / Tasks / Agents / Context. Keep the Work checklist current; it is the primary progress surface. `update_plan` appears there only as optional strategy when it has real content.
+
+## Workspace Orientation
+
+When you enter an unfamiliar workspace, orient before broad search. Use the project instructions already loaded into the prompt, then confirm the working shape with the cheapest deterministic tools: `list_dir`, direct reads of `AGENTS.md`/`README.md` when relevant, and targeted `grep_files`. If the current directory is a multi-project workspace or the user points at a child path, identify the canonical project root before searching. If the correct project remains ambiguous after a quick orientation pass, ask instead of spraying searches across sibling checkouts.
+
+Treat workspace instructions as authority for where work should happen. If they say a sibling directory is stale, historical, frozen, or not the canonical checkout, do not spend high-value context there unless the user explicitly asks. Prefer exact paths from the user over guessing.
+
+Use `explore` sub-agents for independent read-only reconnaissance. Call the role `explore` / `explorer`, and give each child one bounded question with the project root and expected evidence shape. Use RLM for long inputs or many semantic slices, not for basic path discovery.

 ## Verification Principle

@@ -76,23 +84,23 @@ If a tool call fails, inspect the error before retrying. Do not repeat the ident

 ## Composition Pattern for Multi-Step Work

-For any task estimated to take 5+ steps:
+For any task estimated to take 5+ concrete steps:

-1. **`update_plan`** — 3-6 high-level phases (status: pending). This gives the user a map.
-2. **`checklist_write`** — concrete leaf tasks under the first phase (mark first `in_progress`).
-3. **Execute phase 1**, updating checklist as you go. Batch independent steps into parallel tool calls.
-4. **After each phase**, re-read your plan: does phase 2 still make sense? Update the plan if new information changes the approach. Don't blindly follow a plan drafted before you understood the code.
-5. **When a phase reveals sub-problems**, add them to the checklist or spawn investigation sub-agents — don't guess.
+1. **`checklist_write`** — concrete leaf tasks, with the first item `in_progress`.
+2. **Execute**, updating checklist status as you go. Batch independent steps into parallel tool calls.
+3. **For multi-phase or ambiguous initiatives**, optionally add `update_plan` with 3-6 high-level phases. Keep it strategic; do not duplicate checklist items.
+4. **After each phase**, re-check whether the next checklist items still make sense. Update the checklist, and update strategy only if the high-level approach changed.
+5. **When a phase reveals sub-problems**, add them to the checklist or open investigation sub-agent sessions — don't guess.

 ## Sub-Agent Strategy

 Sub-agents are cheap — DeepSeek V4 Flash costs $0.14/M input. Use them liberally for parallel work:

- **Parallel investigation**: When you need to understand 3+ independent files or modules, spawn one read-only sub-agent per target. They run concurrently in one turn and return structured findings you synthesize. This is faster AND more thorough than reading sequentially.
- **Parallel implementation**: After a plan is laid out, spawn one sub-agent per independent leaf task. Each does one thing well; you integrate results.
- **Solo tasks**: A single read, a single search, a focused question — do these yourself. Spawning has overhead; one-turn reads are faster direct.
- **Sequential work**: If step B depends on step A's output, run A yourself, then decide whether to spawn B based on what A found. Don't pre-spawn dependent work.
- **Concurrent sub-agent cap**: The dispatcher defaults to 10 concurrent sub-agents (configurable via `[subagents].max_concurrent` in `config.toml`, hard ceiling 20). When you need more, batch them: spawn up to the cap, wait for completions, then spawn the next batch.
+- **Parallel investigation**: When you need to understand 3+ independent files or modules, open one read-only sub-agent session per target. They run concurrently in one turn and return structured findings you synthesize. This is faster AND more thorough than reading sequentially.
+- **Parallel implementation**: After a plan is laid out, open one sub-agent session per independent leaf task. Each does one thing well; you integrate results.
+- **Solo tasks**: A single read, a single search, a focused question — do these yourself. Opening a sub-agent has overhead; one-turn reads are faster direct.
+- **Sequential work**: If step B depends on step A's output, run A yourself, then decide whether to open a sub-agent based on what A found. Don't pre-open dependent work.
+- **Concurrent sub-agent cap**: The dispatcher defaults to 10 concurrent sub-agents (configurable via `[subagents].max_concurrent` in `config.toml`, hard ceiling 20). When you need more, batch them: open up to the cap, wait for completions, then open the next batch.

 ## Parallel-First Heuristic

@@ -101,23 +109,25 @@ Before you fire any tool, scan your checklist: is there another tool you could r
 - Reading 3 files → 3 `read_file` calls in one turn
 - Searching for 2 patterns → 2 `grep_files` calls in one turn
 - Checking git status AND reading a config → `git_status` + `read_file` in one turn
- Spawning sub-agents for independent investigations → all `agent_spawn` calls in one turn
+- Opening sub-agents for independent investigations → all `agent_open` calls in one turn

 The dispatcher runs parallel tool calls simultaneously. Serializing independent operations wastes the user's time and grows your context faster than necessary.

 ## RLM — How to Use It

-RLM loads input into a Python REPL where you write code that calls sub-LLM helpers (`llm_query`, `llm_query_batched`, `rlm_query`). Three patterns, not one — choose based on the shape of the work:
+RLM is a persistent Python REPL for context that is too large or too repetitive to keep in the parent transcript. Open a named session with `rlm_open`, run bounded code with `rlm_eval`, read large returned payloads through `handle_read`, tune feedback with `rlm_configure`, and close finished sessions with `rlm_close`.
+
+Inside the REPL, use deterministic Python for exact work and the RLM helper functions for semantic work. The current helper family is `peek`, `search`, `chunk`, `context_meta`, `sub_query`, `sub_query_batch`, `sub_query_map`, `sub_rlm`, `finalize`, and `evaluate_progress`. These are in-REPL helpers, not separate model-visible tools. Three patterns, not one — choose based on the shape of the work:
+
+The RLM paper's core design is symbolic state: the long input and intermediate values live in the REPL environment, not copied into the root model context. Inspect with bounded slices, transform with Python, batch child calls programmatically, and keep large intermediate strings in variables or `var_handle`s. Do not paste the whole body back into a prompt or verbalize a long list of sub-calls when a loop can launch them.

 **CHUNK** — A single input that genuinely doesn't fit in your context window (a whole file > 50K tokens, a long transcript, a multi-document corpus). Split it, process each chunk, synthesize.

-**BATCH** — Many independent items that each need LLM attention (classify 20 entries, extract fields from 30 documents, score 15 candidates). Use `llm_query_batched` for parallel execution — it fans out to the same DeepSeek client and finishes in one turn what would take 15 sequential reads.
+**BATCH** — Many independent items that each need LLM attention (classify 20 entries, extract fields from 30 documents, score 15 candidates). Use `sub_query_batch` for parallel execution — it fans out to the same DeepSeek client and finishes in one turn what would take 15 sequential reads.

-**RECURSE** — A problem that benefits from decomposition + critique. Use `rlm_query` to have a sub-LLM review your reasoning, identify gaps, or explore alternative approaches. The sub-LLM returns a synthesized answer you verify against live tool output.
+**RECURSE** — A problem that benefits from decomposition + critique. Use `sub_query` or `sub_rlm` to have a sub-LLM review your reasoning, identify gaps, or explore alternative approaches. The sub-LLM returns a synthesized answer you verify against live tool output.

-For exact counts or structured aggregates, compute them directly in Python inside the REPL (`len`, regexes, parsers, counters) and use child LLM calls only for semantic interpretation. When you chunk a whole input, use `chunk_context()` plus `chunk_coverage()` and report coverage explicitly: chunks processed, total chunks, line/char ranges, and any skipped sections. Cross-check surprising aggregate results with deterministic code before presenting them.
-
-The Python helpers visible inside the REPL (`llm_query`, `llm_query_batched`, `rlm_query`, `rlm_query_batched`) are NOT separately-callable tools — they are functions the sub-agent uses inside its Python code. You only call `rlm` itself from the model side.
+For exact counts or structured aggregates, compute them directly in Python inside the REPL (`len`, regexes, parsers, counters) and use child LLM calls only for semantic interpretation. When you chunk a whole input, use `chunk()` and report coverage explicitly: chunks processed, total chunks, line/char ranges, and any skipped sections. Cross-check surprising aggregate results with deterministic code before presenting them. Use `finalize(...)` for the answer you want returned; if it comes back as a `var_handle`, call `handle_read` for a bounded slice, count, or JSON projection instead of asking the runtime to replay the whole value.

 ## Context
 You have a 1 M-token context window. When usage creeps above ~80%, suggest `/compact` to the user — it summarises earlier turns so you can keep working without losing thread.
@@ -154,14 +164,15 @@ When context is deep (past a soft seam): cache reasoning conclusions in concise

 ## Toolbox (fast reference — tool descriptions are authoritative)

- **Planning / tracking**: `update_plan` (high-level strategy), `task_create` / `task_list` / `task_read` / `task_cancel` (durable work objects), `checklist_write` (granular progress under the active task/thread), `checklist_add` / `checklist_update` / `checklist_list`, `todo_*` aliases (legacy compatibility), `note` (persistent memory).
+- **Planning / tracking**: `checklist_write` (primary Work progress under the active task/thread), `checklist_add` / `checklist_update` / `checklist_list`, `update_plan` (optional high-level strategy metadata for complex initiatives), `task_create` / `task_list` / `task_read` / `task_cancel` (durable work objects), `todo_*` aliases (legacy compatibility), `note` (persistent memory).
 - **File I/O**: `read_file` (PDFs auto-extracted), `list_dir`, `write_file`, `edit_file`, `apply_patch`, `retrieve_tool_result` for prior spilled large tool outputs.
 - **Shell**: `task_shell_start` + `task_shell_wait` for long-running commands, diagnostics, tests, searches, and servers; `exec_shell` for bounded cancellable foreground commands; `exec_shell_wait`, `exec_shell_interact`. If foreground `exec_shell` times out, the process was killed; rerun long work with `task_shell_start` or `exec_shell` using `background: true`, then poll/wait.
 - **Task evidence**: `task_gate_run` for verification gates; `pr_attempt_record` / `pr_attempt_list` / `pr_attempt_read` / `pr_attempt_preflight`; `github_issue_context` / `github_pr_context` (read-only); `github_comment` / `github_close_issue` (approval + evidence required); `automation_*` scheduling tools.
 - **Structured search**: `grep_files`, `file_search`, `web_search`, `fetch_url`, `web.run` (browse).
 - **Git / diag / tests**: `git_status`, `git_diff`, `git_show`, `git_log`, `git_blame`, `diagnostics`, `run_tests`, `review`.
- **Sub-agents**: `agent_spawn` (`spawn_agent`, `delegate_to_agent`), `agent_result`, `agent_cancel` (`close_agent`), `agent_list`, `agent_wait` (`wait`), `agent_send_input` (`send_input`), `agent_assign` (`assign_agent`), `resume_agent`.
- **Recursive LM (long inputs / parallel reasoning)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk, compare, critique, and synthesize; returns the synthesized answer. Read-only.
+- **Sub-agents**: `agent_open`, `agent_eval`, `agent_close`. Open fresh sessions by default; pass `fork_context: true` only when the child needs the current parent context and prefix-cache continuity.
+- **Recursive LM (long inputs / parallel reasoning)**: `rlm_open`, `rlm_eval`, `rlm_configure`, `rlm_close` — open a named Python REPL over a file/string/URL, run deterministic and semantic analysis, return compact results or `var_handle`s, then close when done.
+- **Large symbolic outputs**: `handle_read` — read bounded slices, counts, ranges, or JSONPath projections from returned `var_handle`s without replaying the whole payload.
 - **Skills**: `load_skill` (#434) — when the user names a skill or the task matches one in the `## Skills` section above, call this with the skill id to pull its `SKILL.md` body and companion-file list into context in one tool call. Faster than `read_file` + `list_dir`.
 - **Other**: `code_execution` (Python sandbox), `validate_data` (JSON/TOML), `request_user_input`, `finance` (market quotes), `tool_search_tool_regex`, `tool_search_tool_bm25` (deferred tool discovery).

@@ -178,17 +189,17 @@ Use `edit_file` for one clear replacement in one file. Do not use it for multi-b
 ### `exec_shell`
 Use `exec_shell` for shell-native diagnostics, pipelines, and bounded commands. Use structured tools for structured operations when they map directly (`grep_files`, `git_diff`, `read_file`). For long commands, servers, full test suites, or release computations, start background work with `task_shell_start` or `exec_shell` using `background: true`, then poll with `task_shell_wait` or `exec_shell_wait`.

-### `agent_spawn`
-Use `agent_spawn` for independent investigations or implementation slices that can run while you continue coordinating. Use `fork_context: true` when the child must inherit the current transcript, plan/todo state, and byte-identical parent system/message prefix for DeepSeek prefix-cache reuse. Use `agent_wait` when you need one or more completions. Use `agent_result` when the sentinel summary is too thin or you need the full structured output. Keep tiny single-read/search tasks local so the transcript stays compact.
+### `agent_open` / `agent_eval` / `agent_close`
+Use `agent_open` for independent investigations or implementation slices that can run while you continue coordinating. Fresh sessions are the default and are best when the child only needs the assignment you pass. Use `fork_context: true` when multiple perspectives should share the same parent context: the runtime preserves the parent prefill/prompt prefix byte-identically where available so DeepSeek prefix-cache reuse stays high, then appends the child instructions and task at the tail.

-### `rlm`
-Use `rlm` for long-context semantic work, bulk classification/extraction, and decomposition where a Python REPL plus child LLM helpers is useful. Use deterministic Python inside RLM for exact counts and structured aggregation; use `grep_files` or `exec_shell` directly when that is the clearest deterministic check.
+Use `agent_eval` to send follow-up input, block for completion, or retrieve the current session projection. Use `agent_close` to cancel or release a session that is no longer useful. Keep tiny single-read/search tasks local so the transcript stays compact.

-Inside the `rlm` REPL, the sub-LLM has access to `llm_query()`, `llm_query_batched()`, `rlm_query()`, and `rlm_query_batched()` as Python helpers for further sub-LLM work — those are not standalone tools you call directly.
+### `rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close`
+Use persistent RLM sessions for long-context semantic work, bulk classification/extraction, and decomposition where a Python REPL plus child LLM helpers is useful. Use deterministic Python inside RLM for exact counts and structured aggregation; use `grep_files` or `exec_shell` directly when that is the clearest deterministic check. Close sessions when their context is no longer needed.

 ## Internal Sub-agent Completion Events

-When you spawn a sub-agent via `agent_spawn`, the child runs independently. The runtime may send you an internal `<deepseek:subagent.done>` completion event when it finishes. This event is not user input. It carries:
+When you open a sub-agent via `agent_open`, the child runs independently. The runtime may send you an internal `<deepseek:subagent.done>` completion event when it finishes. This event is not user input. It carries:

 - `agent_id` — the child's identifier
 - `summary` — a human-readable summary of what the child found or did
@@ -198,12 +209,12 @@ When you spawn a sub-agent via `agent_spawn`, the child runs independently. The
 **Integration protocol:**
 1. When you see `<deepseek:subagent.done>`, read the `summary` field first.
 2. Integrate the child's findings into your work — do not re-do what the child already did.
-3. If the summary is insufficient, call `agent_result` to pull the full structured result.
+3. If the summary is insufficient, call `agent_eval` with the agent name or id to pull the current structured projection or transcript handle.
 4. If the child failed (`"failed"`), assess whether the failure blocks your plan or whether you can proceed with a fallback.
 5. Update your `checklist_write` items to reflect the child's contribution.
 6. Do not tell the user they pasted sentinels or explain this protocol unless they explicitly ask about sub-agent internals.

-You may see multiple `<deepseek:subagent.done>` sentinels in a single turn when children were spawned in parallel. Process each one, then synthesize.
+You may see multiple `<deepseek:subagent.done>` sentinels in a single turn when children were opened in parallel. Process each one, then synthesize.

 ## Output formatting

@@ -7,20 +7,28 @@ You are a "managed genius" — you excel at individual tasks, but your superpowe
 Your default workflow for any non-trivial request:
 1. **`checklist_write`** — break the work into concrete, verifiable steps. Mark the first one `in_progress`. This populates the sidebar so the user can see what you're doing.
 2. **Execute** — work through each checklist item, updating status as you go.
-3. **For complex initiatives**, layer `update_plan` (high-level strategy) above `checklist_write` (granular steps).
-4. **For parallel work**, spawn sub-agents (`agent_spawn`) — each does one thing well. Link them to plan/todo items in your thinking.
-5. **Only when an input genuinely doesn't fit your context window** — a whole file > ~50K tokens, a long transcript, a multi-document corpus — use `rlm`. It loads the input into a Python REPL where a sub-agent processes it. For shorter inputs, use `read_file` and reason directly.
+3. **For complex initiatives only**, add `update_plan` as high-level strategy. Do not mirror the checklist into a second tracker.
+4. **For parallel work**, open sub-agent sessions with `agent_open` — each does one thing well. Use `agent_eval` for follow-ups or completion state, and `agent_close` to cancel or release a session. Link them to Work/checklist items in your thinking.
+5. **Only when an input genuinely doesn't fit your context window** — a whole file > ~50K tokens, a long transcript, a multi-document corpus — use persistent RLM sessions: `rlm_open` loads the input into a named Python REPL, `rlm_eval` runs bounded analysis, `handle_read` reads returned `var_handle`s, `rlm_configure` adjusts feedback/depth, and `rlm_close` releases the session. For shorter inputs, use `read_file` and reason directly.
 6. **For persistent cross-session memory**, use `note` sparingly for important decisions, open blockers, and architectural context.

-**Key principle**: make your work visible. The sidebar shows Plan / Todos / Tasks / Agents. When these panels are empty, the user has no idea what you're doing. Keep them populated.
+**Key principle**: make your work visible in one place. The sidebar shows Work / Tasks / Agents / Context. Keep the Work checklist current; it is the primary progress surface. `update_plan` appears there only as optional strategy when it has real content.
+
+## Workspace Orientation
+
+In unfamiliar workspaces, orient before broad search. Use loaded project instructions plus cheap deterministic tools (`list_dir`, relevant `AGENTS.md`/`README.md`, targeted `grep_files`) to identify the canonical project root. If a workspace holds several projects or stale sibling checkouts, follow the user's path and the project instructions; ask if the target remains ambiguous.
+
+Use `explore` / `explorer` sub-agents for independent read-only reconnaissance. Give each child one bounded question with the project root and expected evidence shape. Use RLM for long inputs or many semantic slices, not basic path discovery.

 ## RLM Is a Specialty Tool

-`rlm` is for one specific shape of work: a long input that genuinely does not fit in your context (a whole file > ~50K tokens, a long transcript, a multi-document corpus). Reach for it ONLY when direct reasoning over the input is impossible because of its size. For everything else — short inputs, focused questions, parallel exploration — use `read_file`, `grep_files`, or `agent_spawn` instead.
+Persistent RLM sessions are for one specific shape of work: a long input that genuinely does not fit in your context (a whole file > ~50K tokens, a long transcript, a multi-document corpus) or a batch of many semantic sub-questions that is cleaner in Python. Reach for RLM only when direct reasoning over the input is impossible or wasteful. For everything else — short inputs, focused questions, parallel exploration — use `read_file`, `grep_files`, or `agent_open` instead.

-When you do use `rlm`, ask bounded questions with explicit inputs and expected output shape. The result is advisory — ground decisions in local files, live tool output, and passing verification before claiming completion.
+When you do use RLM, open a named context with `rlm_open`, run bounded code with `rlm_eval`, read large `var_handle` results with `handle_read`, and close the context with `rlm_close`. Results are advisory — ground decisions in local files, live tool output, and passing verification before claiming completion.

-The Python helpers visible inside the REPL (`llm_query`, `llm_query_batched`, `rlm_query`, `rlm_query_batched`) are NOT separately-callable tools — they are functions the sub-agent uses inside its Python code.
+RLM works by keeping the long input and intermediate values as symbolic REPL state. Inspect bounded slices, transform with Python, batch child calls programmatically, and avoid copying the whole body back into the root prompt.
+
+The Python helpers visible inside the REPL (`sub_query`, `sub_query_batch`, `sub_query_map`, `sub_rlm`, `finalize`, and related context helpers) are NOT separately-callable tools — they are functions the sub-agent uses inside its Python code.

 ## Context
 You have a 1 M-token context window. When usage creeps above ~80%, suggest `/compact` to the user — it summarises earlier turns so you can keep working without losing thread.
@@ -29,14 +37,15 @@ Model notes: DeepSeek V4 models emit *thinking tokens* (`ContentBlock::Thinking`

 ## Toolbox (fast reference — tool descriptions are authoritative)

- **Planning / tracking**: `update_plan` (high-level strategy), `task_create` / `task_list` / `task_read` / `task_cancel` (durable work objects), `checklist_write` (granular progress under the active task/thread), `checklist_add` / `checklist_update` / `checklist_list`, `todo_*` aliases (legacy compatibility), `note` (persistent memory).
+- **Planning / tracking**: `checklist_write` (primary Work progress under the active task/thread), `checklist_add` / `checklist_update` / `checklist_list`, `update_plan` (optional high-level strategy metadata for complex initiatives), `task_create` / `task_list` / `task_read` / `task_cancel` (durable work objects), `todo_*` aliases (legacy compatibility), `note` (persistent memory).
 - **File I/O**: `read_file` (PDFs auto-extracted), `list_dir`, `write_file`, `edit_file`, `apply_patch`, `retrieve_tool_result` for prior spilled large tool outputs.
 - **Shell**: `task_shell_start` + `task_shell_wait` for long-running commands, diagnostics, tests, searches, and servers; `exec_shell` for bounded cancellable foreground commands; `exec_shell_wait`, `exec_shell_interact`.
 - **Task evidence**: `task_gate_run` for verification gates; `pr_attempt_record` / `pr_attempt_list` / `pr_attempt_read` / `pr_attempt_preflight`; `github_issue_context` / `github_pr_context` (read-only); `github_comment` / `github_close_issue` (approval + evidence required); `automation_*` scheduling tools.
 - **Structured search**: `grep_files`, `file_search`, `web_search`, `fetch_url`, `web.run` (browse).
 - **Git / diag / tests**: `git_status`, `git_diff`, `git_show`, `git_log`, `git_blame`, `diagnostics`, `run_tests`, `review`.
- **Sub-agents**: `agent_spawn` (`spawn_agent`, `delegate_to_agent`), `agent_result`, `agent_cancel` (`close_agent`), `agent_list`, `agent_wait` (`wait`), `agent_send_input` (`send_input`), `agent_assign` (`assign_agent`), `resume_agent`.
- **Recursive LM (long inputs / parallel reasoning)**: `rlm` — load a file/string as `context` in a Python REPL, sub-agent writes Python that calls `llm_query`/`llm_query_batched`/`rlm_query` to chunk, compare, critique, and synthesize; returns the synthesized answer. Read-only.
+- **Sub-agents**: `agent_open`, `agent_eval`, `agent_close`. Fresh sessions are the default; use `fork_context: true` when multiple perspectives need the current parent context and byte-identical prefill/prompt prefix for DeepSeek prefix-cache reuse.
+- **Recursive LM (long inputs / parallel reasoning)**: `rlm_open`, `rlm_eval`, `rlm_configure`, `rlm_close` — open a named Python REPL over a file/string/URL, run deterministic and semantic analysis, return compact results or `var_handle`s, then close when done.
+- **Large symbolic outputs**: `handle_read` — read bounded slices, counts, ranges, or JSONPath projections from returned `var_handle`s.
 - **Other**: `code_execution` (Python sandbox), `validate_data` (JSON/TOML), `request_user_input`, `finance` (market quotes), `tool_search_tool_regex`, `tool_search_tool_bm25` (deferred tool discovery).

 Multiple `tool_calls` in one turn run in parallel. `web_search` returns `ref_id`s — cite as `(ref_id)`.
@@ -1,4 +1,4 @@
-## Compaction Handoff
+## Compaction Relay

 The conversation above this point has been compacted. Below is a structured summary of what was discussed and decided. Read this first — it replaces re-reading the compressed transcript.

@@ -2,14 +2,15 @@

 You are running in Agent mode — autonomous task execution with tool access.

-Read-only tools (reads, searches, `rlm`, agent status queries, git inspection) run silently.
-Any write, patch, shell execution, sub-agent spawn, or CSV batch operation will ask for approval first.
+Read-only tools (reads, searches, persistent RLM session tools, agent status queries, git inspection) run silently.
+Any write, patch, shell execution, sub-agent session open, or CSV batch operation will ask for approval first.

 Before requesting approval for writes, lay out your work with `checklist_write` so the user can see what
-you intend to do and approve with context. Complex changes should also get an `update_plan` first.
-Decomposition builds trust — a clear plan gets faster approvals.
+you intend to do and approve with context. Use `update_plan` only when a complex initiative needs
+high-level strategy metadata that is not just a copy of the checklist.
+Decomposition builds trust — a clear Work checklist gets faster approvals.

-For multi-step initiatives, use `update_plan` (high-level strategy) + `checklist_write` (granular steps).
+For multi-step initiatives, keep `checklist_write` current. Add `update_plan` only for genuinely useful strategy.

 ## Efficient Approvals

@@ -23,8 +24,8 @@ Don't sequence approvals one at a time — the user wants context, not interrupt
 ## Session Longevity

 Long sessions accumulate context. To stay fast:
- Spawn sub-agents for independent work instead of doing everything sequentially
+- Open sub-agent sessions for independent work instead of doing everything sequentially
 - Batch reads/searches/git-inspections into parallel tool calls
- Suggest `/compact` when context nears 80% — the compaction handoff preserves open blockers
+- Suggest `/compact` when context nears 80% — the compaction relay preserves open blockers
 - Use `note` for decisions you'll need across compaction boundaries
 - A 3-turn session that fans out to sub-agents finishes faster AND stays responsive longer than a 15-turn sequential grind
@@ -2,8 +2,9 @@

 You are running in Plan mode — design before implementing.

-Investigate first, act later. Use `update_plan` to lay out high-level strategy and `checklist_write` for
-granular, verifiable steps. All writes and patches are blocked — you can read the world but you
+Investigate first, act later. Use `checklist_write` for visible, granular progress. Add `update_plan`
+only when high-level strategy adds value beyond the checklist.
+All writes and patches are blocked — you can read the world but you
 can't change it. Shell and code execution are unavailable.

 Use this mode to build a thorough plan. Spawn read-only sub-agents for parallel investigation.
@@ -7,4 +7,5 @@ overwrite user work, or run destructive commands, pause and double-check. The un

 Even with auto-approval, create a `checklist_write` first so your work is visible and trackable in the
 sidebar. Decomposition is not red tape — it's how you organize complex work and demonstrate thoroughness.
-For multi-step initiatives, use `update_plan` + `checklist_write` together.
+For multi-step initiatives, keep `checklist_write` current. Add `update_plan` only when a high-level strategy
+would help and do not duplicate the checklist there.
@@ -1,13 +1,14 @@
 //! Long-lived Python REPL runtime.
 //!
-//! One `python3 -u` subprocess lives for the duration of an RLM turn (or an
+//! One Python subprocess lives for the duration of an RLM turn (or an
 //! inline `repl` block sequence in the agent loop). Code blocks are sent
 //! over stdin framed by `__RLM_RUN__`/`__RLM_END__` sentinels; the bootstrap
 //! `exec()`s them into the same global namespace so variables, imports,
 //! and even open file handles persist naturally across rounds.
 //!
-//! Sub-LLM helpers (`llm_query`, `llm_query_batched`, `rlm_query`,
-//! `rlm_query_batched`) are wired through a stdin/stdout RPC protocol:
+//! Sub-LLM helpers (`sub_query`, `sub_query_batch`, `sub_rlm`, plus legacy
+//! `llm_query`, `llm_query_batched`, `rlm_query`, `rlm_query_batched`) are
+//! wired through a stdin/stdout RPC protocol:
 //! Python emits `__RLM_REQ_<sid>__::{json}` on stdout, Rust dispatches the
 //! request and writes `__RLM_RESP_<sid>__::{json}` back on stdin. No HTTP
 //! sidecar, no temp ports — the same pipes carry both control and data.
@@ -22,11 +23,13 @@ use std::process::Stdio;
 use std::time::{Duration, Instant};

 use serde::{Deserialize, Serialize};
+use serde_json::Value;
 use tokio::io::{AsyncBufReadExt, AsyncWriteExt, BufReader};
 use tokio::process::{Child, ChildStdin, ChildStdout, Command};
 use uuid::Uuid;

 use crate::child_env;
+use crate::dependencies::{PYTHON_CANDIDATES, resolve_python_interpreter, split_interpreter_spec};

 // ---------------------------------------------------------------------------
 // Public types
@@ -43,9 +46,15 @@ pub struct ReplRound {
    pub stderr: String,
    /// `True` if the user code raised an unhandled Python exception.
    pub has_error: bool,
-    /// Captured `FINAL(value)` payload, if any.
+    /// Captured `finalize(value, confidence=...)` payload, if any.
    pub final_value: Option<String>,
-    /// Number of `llm_query`/`rlm_query` RPCs the round issued.
+    /// Captured final value before string fallback. Structured `finalize`
+    /// payloads use this so `handle_read` can expose JSON instead of a Python
+    /// repr string.
+    pub final_json: Option<Value>,
+    /// Optional confidence supplied to `finalize(...)`.
+    pub final_confidence: Option<Value>,
+    /// Number of `sub_query`/`sub_rlm` RPCs the round issued.
    pub rpc_count: u32,
    /// Wall-clock duration of the round.
    pub elapsed: Duration,
@@ -173,8 +182,8 @@ impl PythonRuntime {
        )
    }

-    /// Spawn a REPL with `context` (and `ctx`) preloaded from a file. Used
-    /// by the RLM turn loop.
+    /// Spawn a REPL with the long input preloaded from a file. Used by the
+    /// RLM turn loop.
    pub async fn spawn_with_context(context_path: &Path) -> Result<Self, String> {
        Self::spawn_inner(Some(context_path), None).await
    }
@@ -186,8 +195,23 @@ impl PythonRuntime {
        let session_id = Uuid::new_v4().simple().to_string();
        let bootstrap = render_bootstrap(&session_id);

-        let mut cmd = Command::new("python3");
-        cmd.arg("-u")
+        let interpreter = resolve_python_interpreter().ok_or_else(|| {
+            format!(
+                "no Python interpreter found on PATH (tried {:?}). \
+                 Install Python 3 and ensure one of these commands works, then restart deepseek-tui.",
+                PYTHON_CANDIDATES,
+            )
+        })?;
+        let (program, interpreter_args) = split_interpreter_spec(&interpreter);
+        if program.is_empty() {
+            return Err(format!(
+                "resolved Python interpreter is empty: {interpreter:?}"
+            ));
+        }
+
+        let mut cmd = Command::new(&program);
+        cmd.args(&interpreter_args)
+            .arg("-u")
            .arg("-c")
            .arg(&bootstrap)
            .stdin(Stdio::piped())
@@ -207,16 +231,16 @@ impl PythonRuntime {

        let mut child = cmd
            .spawn()
-            .map_err(|e| format!("failed to spawn python3: {e}"))?;
+            .map_err(|e| format!("failed to spawn Python interpreter `{interpreter}`: {e}"))?;

        let stdin = child
            .stdin
            .take()
-            .ok_or_else(|| "python3 stdin pipe missing".to_string())?;
+            .ok_or_else(|| format!("Python interpreter `{interpreter}` stdin pipe missing"))?;
        let raw_stdout = child
            .stdout
            .take()
-            .ok_or_else(|| "python3 stdout pipe missing".to_string())?;
+            .ok_or_else(|| format!("Python interpreter `{interpreter}` stdout pipe missing"))?;
        let stdout = BufReader::new(raw_stdout);

        let mut rt = Self {
@@ -240,12 +264,14 @@ impl PythonRuntime {
            Ok(Ok(())) => Ok(rt),
            Ok(Err(e)) => {
                let _ = rt.child.kill().await;
-                Err(format!("python3 bootstrap failed: {e}"))
+                Err(format!(
+                    "Python interpreter `{interpreter}` bootstrap failed: {e}"
+                ))
            }
            Err(_) => {
                let _ = rt.child.kill().await;
                Err(format!(
-                    "python3 bootstrap did not signal ready within {}s",
+                    "Python interpreter `{interpreter}` bootstrap did not signal ready within {}s",
                    SPAWN_READY_TIMEOUT.as_secs()
                ))
            }
@@ -261,7 +287,7 @@ impl PythonRuntime {
                .await
                .map_err(|e| format!("stdout read: {e}"))?;
            if n == 0 {
-                return Err("python3 closed stdout before ready signal".to_string());
+                return Err("Python interpreter closed stdout before ready signal".to_string());
            }
            let trimmed = line.trim_end_matches(['\n', '\r']);
            if trimmed == ready_sentinel {
@@ -310,6 +336,8 @@ impl PythonRuntime {

        let mut stdout_buf = String::new();
        let mut final_value: Option<String> = None;
+        let mut final_json: Option<Value> = None;
+        let mut final_confidence: Option<Value> = None;
        let mut had_error = false;
        let mut rpc_count: u32 = 0;
        let round_timeout = self.round_timeout;
@@ -323,7 +351,7 @@ impl PythonRuntime {
                    .await
                    .map_err(|e| format!("stdout read: {e}"))?;
                if n == 0 {
-                    return Err("python3 closed stdout mid-round".to_string());
+                    return Err("Python interpreter closed stdout mid-round".to_string());
                }
                let trimmed = line.trim_end_matches(['\n', '\r']);

@@ -332,10 +360,37 @@ impl PythonRuntime {
                    break;
                }
                if let Some(rest) = trimmed.strip_prefix(&final_prefix) {
-                    // Stored as a JSON-encoded string.
-                    let v =
-                        serde_json::from_str::<String>(rest).unwrap_or_else(|_| rest.to_string());
-                    final_value = Some(v);
+                    // New sessions emit an object with value/confidence;
+                    // legacy helpers emitted a JSON string.
+                    match serde_json::from_str::<Value>(rest) {
+                        Ok(Value::Object(map)) => {
+                            let value_json = map
+                                .get("value")
+                                .cloned()
+                                .unwrap_or(Value::String(rest.to_string()));
+                            let value = value_json
+                                .as_str()
+                                .map(str::to_string)
+                                .unwrap_or_else(|| value_json.to_string());
+                            final_json = Some(value_json);
+                            final_value = Some(value);
+                            final_confidence = map.get("confidence").cloned();
+                        }
+                        Ok(Value::String(value)) => {
+                            final_json = Some(Value::String(value.clone()));
+                            final_value = Some(value);
+                            final_confidence = None;
+                        }
+                        Ok(other) => {
+                            final_json = Some(other.clone());
+                            final_value = Some(other.to_string());
+                            final_confidence = None;
+                        }
+                        Err(_) => {
+                            final_value = Some(rest.to_string());
+                            final_confidence = None;
+                        }
+                    }
                    continue;
                }
                if let Some(rest) = trimmed.strip_prefix(&err_prefix) {
@@ -399,6 +454,8 @@ impl PythonRuntime {
            stderr,
            has_error: had_error,
            final_value,
+            final_json,
+            final_confidence,
            rpc_count,
            elapsed: started.elapsed(),
        })
@@ -493,6 +550,7 @@ fn render_bootstrap(session_id: &str) -> String {
 const BOOTSTRAP_TEMPLATE: &str = r#"
 import json as _json
 import os as _os
+import re as _re
 import sys as _sys
 import traceback as _traceback

@@ -574,16 +632,75 @@ def rlm_query_batched(prompts, model=None):
            out.append(r.get("text",""))
    return out

-def FINAL(value):
-    """Signal the loop to stop with this final answer."""
-    _sys.stdout.write(_FINAL + _json.dumps(str(value)) + "\n")
+def _slice_text(slice_value):
+    if slice_value is None:
+        return ""
+    if isinstance(slice_value, dict):
+        if "text" in slice_value:
+            return str(slice_value["text"])
+        return _json.dumps(slice_value, ensure_ascii=False)
+    return str(slice_value)
+
+def _prompt_with_slice(prompt, slice_value):
+    text = _slice_text(slice_value)
+    if not text:
+        return str(prompt)
+    if isinstance(slice_value, dict) and ("index" in slice_value or ("start" in slice_value and "end" in slice_value)):
+        label = f"slice index={slice_value.get('index', '?')} range={slice_value.get('start', '?')}:{slice_value.get('end', '?')}"
+    else:
+        label = "slice"
+    return f"{prompt}\n\n--- {label} ---\n{text}"
+
+def sub_query(prompt, slice=None, timeout_secs=None, **kwargs):
+    """One child LLM call, optionally scoped to a bounded slice."""
+    return llm_query(_prompt_with_slice(prompt, slice))
+
+def sub_query_batch(prompt, slices, timeout_secs=None, **kwargs):
+    """Apply one prompt to many bounded slices concurrently."""
+    if not isinstance(slices, (list, tuple)):
+        return ["[sub_query_batch: slices must be a list]"]
+    return llm_query_batched([_prompt_with_slice(prompt, s) for s in slices])
+
+def sub_query_map(prompts, slices=None, timeout_secs=None, **kwargs):
+    """Run N distinct prompts, optionally paired with N bounded slices."""
+    if not isinstance(prompts, (list, tuple)):
+        return ["[sub_query_map: prompts must be a list]"]
+    if slices is None:
+        return llm_query_batched([str(p) for p in prompts])
+    if not isinstance(slices, (list, tuple)):
+        return ["[sub_query_map: slices must be a list]"]
+    if len(prompts) != len(slices):
+        return [f"[sub_query_map: size mismatch ({len(prompts)}/{len(slices)})]" for _ in prompts]
+    return llm_query_batched([_prompt_with_slice(p, s) for p, s in zip(prompts, slices)])
+
+def sub_rlm(prompt, source=None, timeout_secs=None, **kwargs):
+    """Recursive sub-RLM call for tasks that need their own decomposition."""
+    return rlm_query(_prompt_with_slice(prompt, source))
+
+def _json_safe(value):
+    try:
+        _json.dumps(value, ensure_ascii=False)
+        return value
+    except Exception:
+        return str(value)
+
+def _emit_final(value, confidence=None):
+    safe_value = _json_safe(value)
+    _sys.stdout.write(_FINAL + _json.dumps({
+        "value": safe_value,
+        "confidence": confidence,
+    }, ensure_ascii=False) + "\n")
    _sys.stdout.flush()

+def FINAL(value):
+    """Legacy compatibility alias for finalize(value)."""
+    _emit_final(value)
+
 def FINAL_VAR(name):
-    """Signal the loop to stop, returning the value of a named variable."""
+    """Legacy compatibility alias for finalize(repl_get(name))."""
    name_str = str(name).strip().strip("'\"")
    if name_str in globals():
-        FINAL(globals()[name_str])
+        _emit_final(globals()[name_str])
    else:
        print(f"FINAL_VAR error: variable '{name_str}' not found. "
              f"Use SHOW_VARS() to list available variables.", flush=True)
@@ -603,8 +720,61 @@ def repl_get(name, default=None):
 def repl_set(name, value):
    globals()[str(name)] = value

-def chunk_context(max_chars=20000, overlap=0):
-    """Return full-coverage context chunks with index/start/end/text fields."""
+def context_meta():
+    """Return bounded metadata about the loaded input; never includes the full text."""
+    text = _context
+    line_count = 0 if text == "" else text.count("\n") + (0 if text.endswith("\n") else 1)
+    return {
+        "chars": len(text),
+        "lines": line_count,
+        "preview": text[:500],
+        "tail_preview": text[-500:] if len(text) > 500 else text,
+    }
+
+def _slice_chars(start, end):
+    total = len(_context)
+    s = max(0, int(start))
+    e = max(s, min(total, int(end)))
+    return _context[s:e]
+
+def _slice_lines(start, end):
+    lines = _context.splitlines()
+    s = max(0, int(start))
+    e = max(s, min(len(lines), int(end)))
+    return "\n".join(lines[s:e])
+
+def peek(start, end, unit="chars"):
+    """Return a bounded slice of the input by char offsets or line numbers."""
+    if str(unit).lower() in ("line", "lines"):
+        return _slice_lines(start, end)
+    if str(unit).lower() not in ("char", "chars"):
+        raise ValueError("unit must be 'chars' or 'lines'")
+    return _slice_chars(start, end)
+
+def search(pattern, max_hits=100):
+    """Regex-search the input and return bounded hit records with snippets."""
+    max_hits = max(0, int(max_hits))
+    hits = []
+    if max_hits == 0:
+        return hits
+    rx = _re.compile(str(pattern), _re.MULTILINE)
+    for i, m in enumerate(rx.finditer(_context)):
+        if i >= max_hits:
+            break
+        start, end = m.span()
+        snippet_start = max(0, start - 120)
+        snippet_end = min(len(_context), end + 120)
+        hits.append({
+            "index": i,
+            "start": start,
+            "end": end,
+            "match": m.group(0),
+            "snippet": _context[snippet_start:snippet_end],
+        })
+    return hits
+
+def chunk(max_chars=20000, overlap=0):
+    """Return full-coverage input chunks with index/start/end/text fields."""
    max_chars = int(max_chars)
    overlap = max(0, int(overlap))
    if max_chars <= 0:
@@ -614,18 +784,22 @@ def chunk_context(max_chars=20000, overlap=0):
    chunks = []
    start = 0
    idx = 0
-    total = len(context)
+    total = len(_context)
    while start < total:
        end = min(total, start + max_chars)
-        chunks.append({"index": idx, "start": start, "end": end, "text": context[start:end]})
+        chunks.append({"index": idx, "start": start, "end": end, "text": _context[start:end]})
        idx += 1
        if end >= total:
            break
        start = end - overlap
    return chunks

+def chunk_context(max_chars=20000, overlap=0):
+    """Compatibility alias for chunk()."""
+    return chunk(max_chars=max_chars, overlap=overlap)
+
 def chunk_coverage(chunks):
-    """Summarize coverage for chunks produced by chunk_context()."""
+    """Summarize coverage for chunks produced by chunk()."""
    spans = []
    for c in chunks:
        try:
@@ -642,36 +816,60 @@ def chunk_coverage(chunks):
        if end > cursor:
            covered += end - max(start, cursor)
            cursor = end
-    if cursor < len(context):
-        gaps.append((cursor, len(context)))
+    if cursor < len(_context):
+        gaps.append((cursor, len(_context)))
    return {
        "chunks": len(chunks),
-        "context_chars": len(context),
+        "context_chars": len(_context),
+        "input_chars": len(_context),
        "covered_chars": covered,
        "gaps": gaps,
-        "complete": covered >= len(context) and not gaps,
+        "complete": covered >= len(_context) and not gaps,
    }

-# Load the long input as `context` (and `ctx`) from a file. This keeps the
-# big string out of the process command-line and out of the LLM's window.
+def finalize(value, confidence=None):
+    """Signal the session's final answer and persist confidence metadata."""
+    global final_answer, final_confidence, final_result
+    final_answer = _json_safe(value)
+    final_confidence = confidence
+    final_result = {
+        "value": final_answer,
+        "confidence": confidence,
+    }
+    _emit_final(final_answer, confidence=confidence)
+    return final_answer
+
+def evaluate_progress():
+    """Return lightweight state useful before deciding the next REPL step."""
+    vars_now = SHOW_VARS()
+    return {
+        "has_final_answer": "final_answer" in globals(),
+        "final_confidence": globals().get("final_confidence", None),
+        "user_variables": vars_now,
+    }
+
+# Load the long input from a file. This keeps the big string out of the
+# process command-line and out of the LLM's window.
 _ctx_file = _os.environ.get("RLM_CONTEXT_FILE","")
-context = ""
+_context = ""
 if _ctx_file:
    try:
        with open(_ctx_file, "r", encoding="utf-8", errors="replace") as f:
-            context = f.read()
+            _context = f.read()
    except Exception as e:
        _sys.stderr.write(f"[bootstrap] failed to load context: {e}\n")
-ctx = context  # short alias matching aleph
+content = _context

 _BOOTSTRAP_NAMES = {
    "_SID","_REQ","_RESP","_FINAL","_ERR","_RUN","_END","_DONE","_READY",
-    "_rpc","_ctx_file","_BOOTSTRAP_NAMES","_main_loop",
+    "_rpc","_ctx_file","_context","_slice_chars","_slice_lines","_BOOTSTRAP_NAMES","_main_loop",
+    "_emit_final","_json_safe","_slice_text","_prompt_with_slice",
    "llm_query","llm_query_batched","rlm_query","rlm_query_batched",
+    "sub_query","sub_query_batch","sub_query_map","sub_rlm",
    "FINAL","FINAL_VAR","SHOW_VARS","repl_get","repl_set",
-    "chunk_context","chunk_coverage",
-    "context","ctx",
-    "_json","_os","_sys","_traceback",
+    "context_meta","peek","search","chunk","chunk_context","chunk_coverage",
+    "finalize","evaluate_progress","content",
+    "_json","_os","_re","_sys","_traceback",
 }

 def _main_loop():
@@ -829,7 +1027,7 @@ mod tests {
            .await
            .expect("spawn");
        let round = rt
-            .execute("print(len(context), context[:5])")
+            .execute("print(context_meta()['chars'], peek(0, 5))")
            .await
            .expect("execute");
        assert!(round.stdout.contains("19"));
@@ -838,13 +1036,16 @@ mod tests {
    }

    #[tokio::test]
-    async fn ctx_alias_works() {
+    async fn context_aliases_keep_common_content_name_bounded() {
        let path = write_temp_context("aleph-style");
        let mut rt = PythonRuntime::spawn_with_context(&path)
            .await
            .expect("spawn");
-        let round = rt.execute("print(ctx)").await.expect("execute");
-        assert!(round.stdout.contains("aleph-style"));
+        let round = rt
+            .execute("print(content == _context, 'context' in globals(), 'ctx' in globals())")
+            .await
+            .expect("execute");
+        assert!(round.stdout.contains("True False False"));
        rt.shutdown().await;
    }

@@ -866,6 +1067,108 @@ mod tests {
        rt.shutdown().await;
    }

+    #[tokio::test]
+    async fn bounded_input_helpers_work() {
+        let path = write_temp_context("alpha\nbeta needle\ngamma needle\nomega");
+        let mut rt = PythonRuntime::spawn_with_context(&path)
+            .await
+            .expect("spawn");
+        let round = rt
+            .execute(
+                "meta = context_meta()\n\
+                 hits = search('needle', max_hits=1)\n\
+                 print(meta['chars'], meta['lines'])\n\
+                 print(peek(6, 17))\n\
+                 print(peek(1, 3, unit='lines'))\n\
+                 print(len(hits), hits[0]['match'], hits[0]['start'])",
+            )
+            .await
+            .expect("execute");
+        let stdout = round.stdout.replace("\r\n", "\n");
+        assert!(stdout.contains("36 4"), "{stdout}");
+        assert!(stdout.contains("beta needle"), "{stdout}");
+        assert!(stdout.contains("beta needle\ngamma needle"), "{stdout}");
+        assert!(stdout.contains("1 needle 11"), "{stdout}");
+        rt.shutdown().await;
+    }
+
+    #[tokio::test]
+    async fn new_chunk_helper_reports_full_coverage() {
+        let path = write_temp_context("abcdefghijklmnopqrstuvwxyz");
+        let mut rt = PythonRuntime::spawn_with_context(&path)
+            .await
+            .expect("spawn");
+        let round = rt
+            .execute(
+                "chunks = chunk(max_chars=10)\n\
+                 coverage = chunk_coverage(chunks)\n\
+                 print(len(chunks), coverage['input_chars'], coverage['covered_chars'], coverage['complete'])",
+            )
+            .await
+            .expect("execute");
+        assert!(round.stdout.contains("3 26 26 True"), "{}", round.stdout);
+        rt.shutdown().await;
+    }
+
+    #[tokio::test]
+    async fn finalize_helper_is_captured_directly() {
+        let mut rt = PythonRuntime::new().await.expect("spawn");
+        let round = rt
+            .execute("finalize('computed answer', confidence='high')")
+            .await
+            .expect("execute");
+        assert_eq!(round.final_value.as_deref(), Some("computed answer"));
+        assert_eq!(
+            round.final_json.as_ref().and_then(Value::as_str),
+            Some("computed answer")
+        );
+        assert_eq!(
+            round.final_confidence.as_ref().and_then(Value::as_str),
+            Some("high")
+        );
+        rt.shutdown().await;
+    }
+
+    #[tokio::test]
+    async fn finalize_preserves_json_values_for_handles() {
+        let mut rt = PythonRuntime::new().await.expect("spawn");
+        let round = rt
+            .execute("finalize({'answer': 42, 'items': ['a', 'b']})")
+            .await
+            .expect("execute");
+
+        assert_eq!(
+            round.final_value.as_deref(),
+            Some(r#"{"answer":42,"items":["a","b"]}"#)
+        );
+        assert_eq!(
+            round.final_json,
+            Some(serde_json::json!({"answer": 42, "items": ["a", "b"]}))
+        );
+        rt.shutdown().await;
+    }
+
+    #[tokio::test]
+    async fn sub_query_accepts_timeout_keyword_for_agent_guesses() {
+        let bridge = StubBridge::new();
+        let mut rt = PythonRuntime::new().await.expect("spawn");
+        let round = rt
+            .run(
+                "answer = sub_query('summarize', timeout_secs=2)\nprint(answer)",
+                Some(&bridge),
+            )
+            .await
+            .expect("execute");
+
+        assert!(!round.has_error, "{}", round.stdout);
+        assert!(
+            round.stdout.contains("stub#0: summarize"),
+            "{}",
+            round.stdout
+        );
+        rt.shutdown().await;
+    }
+
    #[tokio::test]
    async fn rlm_context_runtime_has_no_fixed_round_timeout() {
        let path = write_temp_context("long input");
@@ -887,7 +1190,7 @@ mod tests {
    }

    #[tokio::test]
-    async fn final_is_captured() {
+    async fn legacy_final_is_captured() {
        let mut rt = PythonRuntime::new().await.expect("spawn");
        let round = rt
            .execute("FINAL('the answer is 42')")
@@ -898,7 +1201,7 @@ mod tests {
    }

    #[tokio::test]
-    async fn final_var_is_captured() {
+    async fn legacy_final_var_is_captured() {
        let mut rt = PythonRuntime::new().await.expect("spawn");
        rt.execute("answer = 'computed'").await.expect("r1");
        let round = rt.execute("FINAL_VAR('answer')").await.expect("r2");
@@ -945,6 +1248,33 @@ mod tests {
        rt.shutdown().await;
    }

+    #[tokio::test]
+    async fn rpc_dispatcher_round_trips_sub_query_alias() {
+        let bridge = StubBridge::new();
+        let calls = Arc::clone(&bridge.calls);
+
+        let mut rt = PythonRuntime::new().await.expect("spawn");
+        let round = rt
+            .run("print(sub_query('hello from sub'))", Some(&bridge))
+            .await
+            .expect("execute");
+        assert!(
+            round.stdout.contains("stub#0: hello from sub"),
+            "stdout: {:?}",
+            round.stdout
+        );
+        assert_eq!(round.rpc_count, 1);
+
+        let recorded = calls.lock().await;
+        assert_eq!(recorded.len(), 1);
+        match &recorded[0] {
+            RpcRequest::Llm { prompt, .. } => assert_eq!(prompt, "hello from sub"),
+            other => panic!("expected Llm request, got {other:?}"),
+        }
+        drop(recorded);
+        rt.shutdown().await;
+    }
+
    #[tokio::test]
    async fn rpc_dispatcher_round_trips_batch() {
        let bridge = StubBridge::new();
@@ -20,12 +20,13 @@
 //! - The root LLM receives small metadata messages — length, preview,
 //!   helper list, prior-round summary.
 //! - Code rounds and sub-LLM calls travel over a single stdin/stdout
-//!   pipe to a long-lived `python3 -u` subprocess. No HTTP sidecar.
+//!   pipe to a long-lived Python subprocess. No HTTP sidecar.

 use crate::models::Usage;

 pub mod bridge;
 pub mod prompt;
+pub mod session;
 pub mod turn;

 pub use bridge::RlmBridge;
@@ -11,74 +11,90 @@ pub fn rlm_system_prompt() -> SystemPrompt {
    SystemPrompt::Text(RLM_SYSTEM_PROMPT.trim().to_string())
 }

-const RLM_SYSTEM_PROMPT: &str = r#"You are the root of a Recursive Language Model (RLM). Your input lives in a long-running Python REPL as a variable named `context` (alias `ctx`). You DO NOT see `context` in your prompt — only its length and a short preview. The only way to read or compute over it is to write Python code that runs in the REPL.
+const RLM_SYSTEM_PROMPT: &str = r#"You are the root of a Recursive Language Model (RLM). The input is loaded into a long-running Python REPL. You hold a live context handle, not the raw body. Read only through bounded helpers, compute in Python, and delegate semantic judgment to child calls.
+
+The point is symbolic recursion. Keep the long prompt and large intermediate strings in REPL variables; the neural model should see metadata, bounded slices, code, and compact stdout. Do not copy the whole input into the root history, and do not verbalize a long list of child calls when Python can construct and launch them in a loop.

 The REPL exposes:
- `context` (alias `ctx`) — the full input string. Often huge — never `print(context)` in full.
- `llm_query(prompt, model=None, max_tokens=None, system=None)` — one-shot child LLM. Cheap. Use for chunk-level work. The `model` argument is accepted for compatibility but child calls stay pinned to the configured Flash child model.
- `llm_query_batched(prompts, model=None)` — concurrent fan-out. Returns `list[str]` in input order. The `model` argument is accepted for compatibility but ignored.
- `rlm_query(prompt, model=None)` — recursive sub-RLM. Use when a sub-task itself needs decomposition. The `model` argument is accepted for compatibility but ignored.
- `rlm_query_batched(prompts, model=None)` — concurrent recursive sub-RLMs. The `model` argument is accepted for compatibility but ignored.
- `chunk_context(max_chars=20000, overlap=0)` — full-coverage chunks with index/start/end/text fields.
- `chunk_coverage(chunks)` — coverage summary for chunks produced by `chunk_context`.
- `SHOW_VARS()` — list user variables and their types.
- `repl_set(name, value)` / `repl_get(name)` — explicit cross-round storage.
- `print(...)` — diagnostic output. The driver feeds you a truncated preview next round.
- `FINAL(value)` — end the loop with this string answer.
- `FINAL_VAR(name)` — end the loop with the value of a named variable.
+- `context_meta()` - bounded metadata: char count, line count, preview, tail preview.
+- `peek(start, end, unit="chars")` - bounded slice by char offsets or line numbers.
+- `search(pattern, max_hits=100)` - regex search returning bounded hit records with snippets.
+- `chunk(max_chars=20000, overlap=0)` - full-coverage chunks with index/start/end/text fields.
+- `chunk_coverage(chunks)` - coverage summary for chunks produced by `chunk`.
+- `sub_query(prompt, slice=None)` - one child LLM call, optionally scoped to one bounded slice.
+- `sub_query_batch(prompt, slices)` - apply one prompt to many bounded slices concurrently.
+- `sub_query_map(prompts, slices=None)` - run N distinct prompts, optionally paired with N bounded slices.
+- `sub_rlm(prompt, source=None)` - recursive sub-RLM for a sub-task that needs its own decomposition. Pass a bounded source, not the whole body.
+- `SHOW_VARS()` - list user variables and their types.
+- `repl_set(name, value)` / `repl_get(name)` - explicit cross-round storage.
+- `evaluate_progress()` - inspect whether a final answer exists and what variables are available.
+- `finalize(value, confidence=None)` - end the loop with a final answer and optional confidence.
+- `print(...)` - diagnostic output. The driver feeds you a truncated preview next round.

-Variables, imports, and any other state PERSIST across rounds — the REPL is a single long-lived Python process for the whole turn.
+Variables, imports, and any other state persist across rounds. There is no `context` or `ctx` variable. Use `peek`, `search`, `chunk`, and `context_meta`.

-Contract — every turn, output ONE ` ```repl ` block of Python. That's it. No prose-only turns. No "I will do X" — just emit the code that does X.
+Contract: every turn, output exactly one ` ```repl ` block of Python and nothing else. No prose-only turns. No "I will do X"; emit the code that does X.

-Strategy patterns
+Five-phase skeleton

-1. PREVIEW first.
+1. Load
 ```repl
-print(f"len(context) = {len(context)}")
-print(context[:500])
+meta = context_meta()
+print(meta)
 ```
+Confirm the handle shape. Do not re-load the body. Keep the head small: names and metadata only.

-2. CHUNK + map-reduce with batched concurrent calls.
+2. Orient
 ```repl
-chunk_size = 8000
-chunks = chunk_context(max_chars=chunk_size)
+hits = search(r"term|phrase", max_hits=20)
+sample = peek(0, min(meta["chars"], 1200))
+print({"hits": len(hits), "sample": sample[:300]})
+```
+Search before peeking. Pull only the slices you need. Store maps of the input as variables: headers, regions, sections, candidate spans.
+
+3. Compute
+```repl
+chunks = chunk(max_chars=12000, overlap=400)
 coverage = chunk_coverage(chunks)
-prompts = [f"Extract any mentions of X from section {c['index']} ({c['start']}:{c['end']}):\n\n{c['text']}" for c in chunks]
-partials = llm_query_batched(prompts)
+partials = sub_query_batch(
+    "Extract the facts needed for the user's question from this slice. "
+    "Return only grounded facts and cite the slice index/range.",
+    chunks,
+)
+print({"coverage": coverage, "partials": len(partials)})
+```
+Use deterministic Python first for counts, regex, parsing, sorting, dedupe, joins, and coverage. You do NO math by asking a child model to count; if Python can enumerate, parse, or simulate it exactly, do that in Python.
+
+4. Recurse
+```repl
 combined = "\n\n".join(partials)
-answer = llm_query(f"Coverage: {coverage}\n\nSynthesize across these section-level extractions:\n\n{combined}")
-print(answer[:500])
-```
-Then on the next turn:
-```repl
-FINAL(answer)
+analysis = sub_rlm(
+    "Synthesize these section findings into a precise answer. "
+    "Call out conflicts and missing coverage.",
+    source=combined,
+)
+print(analysis[:800])
 ```
+Use `sub_rlm` only when the sub-task itself needs decomposition or critique. Pass slices or compact variables, not the whole body. Memoize recursive results in variables.

-3. RECURSIVE decomposition for hard sub-problems.
+5. Converge
 ```repl
-trend = rlm_query(f"Analyze this dataset and conclude with one word — up, down, or stable: {data}")
-recommendation = "Hold" if "stable" in trend.lower() else ("Hedge" if "down" in trend.lower() else "Increase")
-print(trend, "→", recommendation)
-```
-
-4. PROGRAMMATIC computation + LLM interpretation.
-```repl
-import math
-theta = math.degrees(math.atan2(v_perp, v_parallel))
-final_answer = llm_query(f"Entry angle is {theta:.2f}°. Phrase the answer for a physics student.")
-FINAL(final_answer)
+progress = evaluate_progress()
+finalize(
+    f"{analysis}\n\nCoverage: {coverage['covered_chars']}/{coverage['input_chars']} chars "
+    f"across {coverage['chunks']} chunks; complete={coverage['complete']}.",
+    confidence="medium" if coverage["complete"] else "low",
+)
 ```
+Call `evaluate_progress()` if the answer is not stable. Loop back to Orient or Compute when coverage is incomplete or confidence is low. Call `finalize(...)` only when the answer is supported by variables you can inspect.

 Rules

- Emit exactly ONE ` ```repl ` block per turn. The block must contain Python code only.
- Never `print(context)` or otherwise dump it whole — slice, sample, or chunk.
- You MUST call `llm_query` / `llm_query_batched` / `rlm_query` at least once before `FINAL(...)`. Calling FINAL from a top-level prose answer (without ever running a `repl` block that touched `context` via a sub-LLM) is REJECTED — the driver will discard the FINAL and ask you to actually use the REPL.
- Sub-LLMs are powerful — feed them generous chunks (tens of thousands of chars), not tiny windows.
- For exact counts, package totals, line totals, or other structured aggregates, compute them with Python over `context` directly. Do not ask a child LLM to count.
- For whole-input map-reduce, report coverage in the final answer: chunks processed, total chunks, and whether every line/char range was included. If you only processed a subset, say that explicitly.
- Do NOT pad your output with prose like "Here is what I'll do:" — just emit the next ```repl block.
+- Use the bounded helpers (`context_meta`, `peek`, `search`, `chunk`) to inspect input.
+- Use `sub_query`, `sub_query_batch`, `sub_query_map`, or `sub_rlm` before finalizing unless the task is purely deterministic and fully computed in Python.
+- End only by calling `finalize(value, confidence=...)`.
+- For exact counts, totals, parsing, and structured aggregates, compute with Python. Do not ask a child LLM to count.
+- For whole-input map-reduce, include coverage in the final answer: chunks processed, total chunks, and whether every char range was included. If you only processed a subset, say that explicitly.
 "#;

 #[cfg(test)]
@@ -103,49 +119,66 @@ mod tests {
    }

    #[test]
-    fn rlm_prompt_mentions_context_variable() {
-        assert!(body().contains("`context`"));
-    }
-
-    #[test]
-    fn rlm_prompt_mentions_ctx_alias() {
-        assert!(body().contains("`ctx`"));
+    fn rlm_prompt_uses_five_phase_skeleton() {
+        let s = body();
+        for phase in ["Load", "Orient", "Compute", "Recurse", "Converge"] {
+            assert!(s.contains(phase), "system prompt missing phase: {phase}");
+        }
    }

    #[test]
    fn rlm_prompt_mentions_all_helpers() {
        let s = body();
        for name in [
-            "llm_query",
-            "llm_query_batched",
-            "rlm_query",
-            "rlm_query_batched",
-            "chunk_context",
+            "peek",
+            "search",
+            "chunk",
            "chunk_coverage",
+            "context_meta",
+            "sub_query",
+            "sub_query_batch",
+            "sub_query_map",
+            "sub_rlm",
+            "finalize",
+            "evaluate_progress",
            "SHOW_VARS",
-            "FINAL",
-            "FINAL_VAR",
        ] {
            assert!(s.contains(name), "system prompt missing helper: {name}");
        }
    }

    #[test]
-    fn rlm_prompt_forbids_prose_shortcut() {
-        // The new contract requires a sub-LLM call before FINAL — the
-        // prompt must say so explicitly so the model doesn't try to bail
-        // with FINAL("...inferred from preview...").
-        assert!(
-            body().contains("REJECTED") || body().contains("rejected"),
-            "system prompt should reject the prose-shortcut path explicitly"
-        );
+    fn rlm_prompt_does_not_publicize_context_variables() {
+        let s = body();
+        assert!(s.contains("There is no `context` or `ctx` variable"));
+        assert!(!s.contains("len(context)"));
+        assert!(!s.contains("chunk_context"));
+        assert!(!s.contains("llm_query"));
+        assert!(!s.contains("rlm_query"));
+    }
+
+    #[test]
+    fn rlm_prompt_is_finalize_only() {
+        let s = body();
+        assert!(s.contains("finalize(value"));
+        assert!(!s.contains("FINAL_VAR"));
+        assert!(!s.contains("FINAL(value)"));
+        assert!(!s.contains("FINAL("));
    }

    #[test]
    fn rlm_prompt_requires_deterministic_counts_and_coverage() {
        let s = body();
-        assert!(s.contains("compute them with Python"));
-        assert!(s.contains("report coverage"));
+        assert!(s.contains("compute with Python"));
+        assert!(s.contains("include coverage"));
        assert!(s.contains("chunks processed"));
    }
+
+    #[test]
+    fn rlm_prompt_mentions_symbolic_state_contract() {
+        let s = body();
+        assert!(s.contains("symbolic recursion"));
+        assert!(s.contains("REPL variables"));
+        assert!(s.contains("Do not copy the whole input"));
+    }
 }
@@ -0,0 +1,180 @@
+//! Persistent RLM session state for the v0.8.33 head/hands tool surface.
+
+use std::collections::HashMap;
+use std::path::{Path, PathBuf};
+use std::sync::Arc;
+use std::time::{Duration, Instant};
+
+use serde::{Deserialize, Serialize};
+use sha2::{Digest, Sha256};
+use tokio::sync::Mutex;
+use uuid::Uuid;
+
+use crate::repl::PythonRuntime;
+
+pub type SharedRlmSessionStore = Arc<Mutex<HashMap<String, Arc<Mutex<RlmSession>>>>>;
+
+#[must_use]
+pub fn new_shared_rlm_session_store() -> SharedRlmSessionStore {
+    Arc::new(Mutex::new(HashMap::new()))
+}
+
+#[derive(Debug)]
+pub struct RlmSession {
+    pub name: String,
+    pub id: String,
+    pub kernel: Option<PythonRuntime>,
+    pub context_meta: ContextMeta,
+    pub config: RlmSessionConfig,
+    pub rpc_count: u32,
+    pub total_duration: Duration,
+    pub peak_var_count: usize,
+    pub final_count: usize,
+    pub created_at: Instant,
+    pub last_used_at: Instant,
+    pub context_path: PathBuf,
+}
+
+impl RlmSession {
+    #[must_use]
+    pub fn new(
+        name: String,
+        kernel: PythonRuntime,
+        context_meta: ContextMeta,
+        context_path: PathBuf,
+    ) -> Self {
+        let now = Instant::now();
+        Self {
+            name,
+            id: format!("rlm:{}", Uuid::new_v4().simple()),
+            kernel: Some(kernel),
+            context_meta,
+            config: RlmSessionConfig::default(),
+            rpc_count: 0,
+            total_duration: Duration::ZERO,
+            peak_var_count: 0,
+            final_count: 0,
+            created_at: now,
+            last_used_at: now,
+            context_path,
+        }
+    }
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct ContextMeta {
+    pub length: usize,
+    #[serde(rename = "type")]
+    pub type_name: String,
+    pub preview_500: String,
+    pub sha256: String,
+}
+
+impl ContextMeta {
+    #[must_use]
+    pub fn from_body(body: &str, type_name: impl Into<String>) -> Self {
+        Self {
+            length: body.chars().count(),
+            type_name: type_name.into(),
+            preview_500: body.chars().take(500).collect(),
+            sha256: sha256_hex(body.as_bytes()),
+        }
+    }
+}
+
+#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq, Eq)]
+#[serde(rename_all = "snake_case")]
+pub enum OutputFeedback {
+    Full,
+    Metadata,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct RlmSessionConfig {
+    pub output_feedback: OutputFeedback,
+    pub sub_query_timeout_secs: u64,
+    pub sub_rlm_max_depth: u32,
+    pub share_session: bool,
+}
+
+impl Default for RlmSessionConfig {
+    fn default() -> Self {
+        Self {
+            output_feedback: OutputFeedback::Full,
+            sub_query_timeout_secs: 120,
+            sub_rlm_max_depth: 1,
+            share_session: false,
+        }
+    }
+}
+
+pub fn write_context_file(body: &str) -> std::io::Result<PathBuf> {
+    let dir = std::env::temp_dir().join("deepseek_rlm_ctx");
+    std::fs::create_dir_all(&dir)?;
+    let path = dir.join(format!(
+        "session_{}_{}.txt",
+        std::process::id(),
+        Uuid::new_v4().simple()
+    ));
+    std::fs::write(&path, body)?;
+    Ok(path)
+}
+
+#[must_use]
+pub fn derive_session_name(source_hint: Option<&str>) -> String {
+    let hint = source_hint
+        .and_then(|raw| {
+            Path::new(raw)
+                .file_name()
+                .and_then(|name| name.to_str())
+                .or(Some(raw))
+        })
+        .unwrap_or("context");
+    let mut out = String::new();
+    for ch in hint.chars() {
+        if ch.is_ascii_alphanumeric() {
+            out.push(ch.to_ascii_lowercase());
+        } else if !out.ends_with('_') {
+            out.push('_');
+        }
+        if out.len() >= 48 {
+            break;
+        }
+    }
+    let out = out.trim_matches('_');
+    if out.is_empty() {
+        "context".to_string()
+    } else {
+        out.to_string()
+    }
+}
+
+fn sha256_hex(bytes: &[u8]) -> String {
+    let mut hasher = Sha256::new();
+    hasher.update(bytes);
+    format!("{:x}", hasher.finalize())
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn derive_session_name_slugifies_path() {
+        assert_eq!(
+            derive_session_name(Some("src/Big File.rs")),
+            "big_file_rs".to_string()
+        );
+    }
+
+    #[test]
+    fn context_meta_hashes_and_previews_body() {
+        let meta = ContextMeta::from_body("abcdef", "text");
+        assert_eq!(meta.length, 6);
+        assert_eq!(meta.preview_500, "abcdef");
+        assert_eq!(
+            meta.sha256,
+            "bef57ec7f53a6d40beb640a780a639c83bc29ac8a9816f1fc6c5c6dcd93c4721"
+        );
+    }
+}
@@ -1960,6 +1960,8 @@ impl RuntimeThreadManager {
                active_thread_id: Some(thread.id.clone()),
                shell_manager: None,
                hook_executor: None,
+                handle_store: crate::tools::handle::new_shared_handle_store(),
+                rlm_sessions: crate::rlm::session::new_shared_rlm_session_store(),
            },
            subagent_model_overrides: self.config.subagent_model_overrides(),
            memory_enabled: self.config.memory_enabled(),
@@ -674,8 +674,13 @@ pub fn create_saved_session_with_id_and_mode(
        .find(|m| m.role == "user")
        .and_then(|m| {
            m.content.iter().find_map(|block| match block {
-                ContentBlock::Text { text, .. } if !text.starts_with("<turn_meta>") => {
-                    Some(truncate_title(text, 50))
+                ContentBlock::Text { text, .. } => {
+                    let prompt = extract_user_prompt(text);
+                    if prompt.is_empty() {
+                        None
+                    } else {
+                        Some(truncate_title(prompt, 50))
+                    }
                }
                _ => None,
            })
@@ -881,6 +886,51 @@ pub fn truncate_id(id: &str) -> &str {
    id.get(..8).unwrap_or(id)
 }

+/// Strip a leading `<turn_meta>...</turn_meta>` block from saved user text.
+///
+/// Older sessions can have turn metadata prefixed to the first user message.
+/// The session picker and generated session titles should show the user's
+/// prompt, not the cache/debug envelope.
+pub(crate) fn extract_user_prompt(raw: &str) -> &str {
+    let trimmed = raw.trim_start();
+    let Some(after_open) = trimmed.strip_prefix("<turn_meta>") else {
+        return trimmed;
+    };
+    if let Some(close_pos) = after_open.find("</turn_meta>") {
+        return after_open[close_pos + "</turn_meta>".len()..].trim_start();
+    }
+    after_open.trim_start()
+}
+
+/// Clean a stored title for display, falling back to a neutral label.
+pub(crate) fn extract_title(raw: &str) -> &str {
+    let title = extract_user_prompt(raw);
+    if title.is_empty() { "Session" } else { title }
+}
+
+/// Strip common inline thinking/reasoning XML sections from saved assistant
+/// text before it is shown in session previews.
+pub(crate) fn strip_thinking_tags(text: &str) -> String {
+    if !text.contains("<think") && !text.contains("<thinking") && !text.contains("<reasoning") {
+        return text.to_string();
+    }
+
+    let tags = ["think", "thinking", "reasoning"];
+    let mut result = text.to_string();
+    for tag in tags {
+        let open = format!("<{tag}>");
+        let close = format!("</{tag}>");
+        while let Some(start) = result.find(&open) {
+            let Some(end) = result[start..].find(&close) else {
+                break;
+            };
+            let end_abs = start + end + close.len();
+            result.replace_range(start..end_abs, "");
+        }
+    }
+    result
+}
+
 /// Truncate a string to create a title (character-safe for UTF-8)
 fn truncate_title(s: &str, max_len: usize) -> String {
    let s = s.trim();
@@ -898,7 +948,7 @@ fn truncate_title(s: &str, max_len: usize) -> String {
 /// Format a session for display in a picker
 pub fn format_session_line(meta: &SessionMetadata) -> String {
    let age = format_age(&meta.updated_at);
-    let truncated_title = truncate_title(&meta.title, 40);
+    let truncated_title = truncate_title(extract_title(&meta.title), 40);

    format!(
        "{} | {} | {} msgs | {}",
@@ -1250,6 +1300,41 @@ mod tests {
        assert_eq!(truncate_title("Line 1\nLine 2", 50), "Line 1");
    }

+    #[test]
+    fn extract_user_prompt_strips_turn_meta_prefix() {
+        assert_eq!(
+            extract_user_prompt("<turn_meta>{\"cache\":\"x\"}</turn_meta>\nReal prompt"),
+            "Real prompt"
+        );
+        assert_eq!(extract_user_prompt("  Real prompt"), "Real prompt");
+        assert_eq!(
+            extract_user_prompt("<turn_meta>{\"unterminated\":true}\nReal prompt"),
+            "{\"unterminated\":true}\nReal prompt"
+        );
+    }
+
+    #[test]
+    fn create_saved_session_uses_prompt_after_turn_meta_for_title() {
+        let tmp = tempdir().expect("tempdir");
+        let messages = vec![make_test_message(
+            "user",
+            "<turn_meta>{\"cache\":\"x\"}</turn_meta>\nFix the session picker history pane",
+        )];
+        let session = create_saved_session(&messages, "test-model", tmp.path(), 100, None);
+        assert_eq!(
+            session.metadata.title,
+            "Fix the session picker history pane"
+        );
+    }
+
+    #[test]
+    fn strip_thinking_tags_removes_common_inline_blocks() {
+        let text = "Before <think>private</think> middle <reasoning>hidden</reasoning> after";
+        let cleaned = strip_thinking_tags(text);
+        assert_eq!(cleaned, "Before  middle  after");
+        assert_eq!(strip_thinking_tags("plain answer"), "plain answer");
+    }
+
    #[test]
    fn test_format_age() {
        let now = Utc::now();
@@ -12,7 +12,7 @@ use serde::{Deserialize, Serialize};

 use crate::config::{expand_path, normalize_model_name};
 use crate::localization::normalize_configured_locale;
-use crate::palette::normalize_hex_rgb_color;
+use crate::palette::{normalize_hex_rgb_color, normalize_theme_name};

 // ============================================================================
 // TuiPrefs — ~/.deepseek/tui.toml
@@ -28,7 +28,7 @@ use crate::palette::normalize_hex_rgb_color;
 /// # Example `~/.deepseek/tui.toml`
 ///
 /// ```toml
-/// theme    = "dark"        # "dark" | "light" | "system"
+/// theme    = "dark"        # "system" | "dark" | "light" | "grayscale"
 /// font_size = 14
 ///
 /// [keybinds]
@@ -43,7 +43,8 @@ use crate::palette::normalize_hex_rgb_color;
 #[derive(Debug, Clone, Serialize, Deserialize)]
 #[serde(default)]
 pub struct TuiPrefs {
-    /// UI colour theme: `"dark"` | `"light"` | `"system"`. Default `"dark"`.
+    /// UI colour theme: `"dark"` | `"light"` | `"grayscale"` | `"system"`.
+    /// Default `"dark"`.
    pub theme: String,
    /// Terminal font size hint forwarded to supporting front-ends (e.g. the
    /// Tauri shell). `0` means "use terminal default". Default `0`.
@@ -149,14 +150,13 @@ impl TuiPrefs {
    /// surface a helpful message rather than silently ignoring a typo.
    pub fn validate(&mut self) -> Result<()> {
        let theme = self.theme.trim().to_ascii_lowercase();
-        match theme.as_str() {
-            "dark" | "light" | "system" => {
-                self.theme = theme;
-            }
-            other => {
-                anyhow::bail!("Invalid tui.toml theme '{other}': expected dark, light, or system.");
-            }
-        }
+        let Some(theme) = normalize_theme_name(&theme) else {
+            anyhow::bail!(
+                "Invalid tui.toml theme '{}': expected system, dark, light, or grayscale.",
+                self.theme
+            );
+        };
+        self.theme = theme.to_string();
        Ok(())
    }
 }
@@ -195,6 +195,8 @@ pub struct Settings {
    pub show_tool_details: bool,
    /// UI locale: auto, en, ja, zh-Hans, pt-BR
    pub locale: String,
+    /// UI theme: system, dark, light, grayscale
+    pub theme: String,
    /// Optional main TUI background color as a 6-digit hex RGB value.
    pub background_color: Option<String>,
    /// Composer layout density: compact, comfortable, spacious
@@ -211,7 +213,7 @@ pub struct Settings {
    pub default_mode: String,
    /// Sidebar width as percentage of terminal width
    pub sidebar_width_percent: u16,
-    /// Sidebar focus mode: auto, plan, todos, tasks, agents, context
+    /// Sidebar focus mode: auto, work, tasks, agents, context
    pub sidebar_focus: String,
    /// Enable the session-context panel (#504). Shows working set, tokens,
    /// cost, MCP/LSP status, cycle count, and memory info.
@@ -287,6 +289,7 @@ impl Default for Settings {
            show_thinking: true,
            show_tool_details: true,
            locale: "auto".to_string(),
+            theme: "system".to_string(),
            background_color: None,
            composer_density: "comfortable".to_string(),
            composer_border: true,
@@ -350,6 +353,7 @@ impl Settings {
            s.locale = normalize_configured_locale(&s.locale)
                .unwrap_or("en")
                .to_string();
+            s.theme = normalize_settings_theme(&s.theme).to_string();
            s.background_color = normalize_optional_background_color(s.background_color.as_deref());
            s.default_model = s.default_model.as_deref().and_then(normalize_default_model);
            s
@@ -474,6 +478,14 @@ impl Settings {
                };
                self.locale = locale.to_string();
            }
+            "theme" | "ui_theme" => {
+                let Some(theme) = normalize_theme_name(value) else {
+                    anyhow::bail!(
+                        "Failed to update setting: invalid theme '{value}'. Expected: system, dark, light, grayscale."
+                    );
+                };
+                self.theme = theme.to_string();
+            }
            "background_color" | "background" | "bg" => {
                self.background_color = normalize_background_color_setting(value)?;
            }
@@ -555,13 +567,13 @@ impl Settings {
            "sidebar_focus" | "focus" => {
                let normalized = match value.trim().to_ascii_lowercase().as_str() {
                    "auto" => "auto",
-                    "plan" => "plan",
-                    "todos" => "todos",
+                    "work" | "plan" | "todos" => "work",
                    "tasks" => "tasks",
                    "agents" | "subagents" | "sub-agents" => "agents",
+                    "context" | "session" => "context",
                    _ => {
                        anyhow::bail!(
-                            "Failed to update setting: invalid sidebar focus '{value}'. Expected: auto, plan, todos, tasks, agents."
+                            "Failed to update setting: invalid sidebar focus '{value}'. Expected: auto, work, tasks, agents, context."
                        )
                    }
                };
@@ -631,6 +643,7 @@ impl Settings {
        lines.push(format!("  show_thinking:      {}", self.show_thinking));
        lines.push(format!("  show_tool_details:  {}", self.show_tool_details));
        lines.push(format!("  locale:            {}", self.locale));
+        lines.push(format!("  theme:             {}", self.theme));
        lines.push(format!(
            "  background_color:   {}",
            self.background_color.as_deref().unwrap_or("(default)")
@@ -700,6 +713,7 @@ impl Settings {
                "locale",
                "UI locale and default model language: auto, en, ja, zh-Hans, pt-BR",
            ),
+            ("theme", "UI theme: system, dark, light, grayscale"),
            (
                "background_color",
                "Main TUI background color: #RRGGBB or default",
@@ -732,7 +746,7 @@ impl Settings {
            ("sidebar_width", "Sidebar width percentage: 10-50"),
            (
                "sidebar_focus",
-                "Sidebar focus: auto, plan, todos, tasks, agents",
+                "Sidebar focus: auto, work, tasks, agents, context",
            ),
            ("cost_currency", "Cost display currency: usd, cny"),
            ("max_history", "Max input history entries"),
@@ -836,6 +850,10 @@ fn normalize_synchronized_output(value: &str) -> &str {
    }
 }

+fn normalize_settings_theme(value: &str) -> &'static str {
+    normalize_theme_name(value).unwrap_or("system")
+}
+
 /// Returns `true` when the active terminal is Ptyxis (the new default
 /// terminal on Ubuntu 26.04). Used by [`Settings::apply_env_overrides`]
 /// to flip `synchronized_output` from `auto` to `off` so DEC mode 2026
@@ -886,8 +904,7 @@ fn normalize_background_color_setting(value: &str) -> Result<Option<String>> {

 fn normalize_sidebar_focus(value: &str) -> &str {
    match value.trim().to_ascii_lowercase().as_str() {
-        "plan" => "plan",
-        "todos" => "todos",
+        "work" | "plan" | "todos" => "work",
        "tasks" => "tasks",
        "agents" | "subagents" | "sub-agents" => "agents",
        "context" | "session" => "context",
@@ -968,6 +985,26 @@ mod tests {
        assert!(err.to_string().contains("invalid locale"));
    }

+    #[test]
+    fn theme_normalizes_supported_values_and_rejects_unknowns() {
+        let mut settings = Settings::default();
+        assert_eq!(settings.theme, "system");
+
+        settings.set("theme", "grayscale").expect("set grayscale");
+        assert_eq!(settings.theme, "grayscale");
+
+        settings.set("ui_theme", "black-white").expect("set alias");
+        assert_eq!(settings.theme, "grayscale");
+
+        settings.set("theme", "whale").expect("set dark alias");
+        assert_eq!(settings.theme, "dark");
+
+        let err = settings
+            .set("theme", "solarized")
+            .expect_err("unknown theme should fail");
+        assert!(err.to_string().contains("invalid theme"));
+    }
+
    #[test]
    fn background_color_normalizes_hex_and_accepts_default() {
        let mut settings = Settings::default();
@@ -1008,6 +1045,28 @@ mod tests {
        assert!(err.to_string().contains("invalid cost currency"));
    }

+    #[test]
+    fn sidebar_focus_accepts_work_values_and_legacy_aliases() {
+        let mut settings = Settings::default();
+
+        settings.set("sidebar_focus", "work").expect("set work");
+        assert_eq!(settings.sidebar_focus, "work");
+
+        settings.set("focus", "plan").expect("legacy plan alias");
+        assert_eq!(settings.sidebar_focus, "work");
+
+        settings.set("focus", "todos").expect("legacy todos alias");
+        assert_eq!(settings.sidebar_focus, "work");
+
+        settings.set("focus", "context").expect("context focus");
+        assert_eq!(settings.sidebar_focus, "context");
+
+        let err = settings
+            .set("sidebar_focus", "classic")
+            .expect_err("classic is not a supported public focus");
+        assert!(err.to_string().contains("invalid sidebar focus"));
+    }
+
    #[test]
    fn display_localizes_header_and_config_file_label() {
        let settings = Settings::default();
@@ -1536,7 +1595,7 @@ mod tests {

    #[test]
    fn tui_prefs_validate_accepts_known_themes() {
-        for theme in ["dark", "light", "system"] {
+        for theme in ["dark", "light", "system", "grayscale"] {
            let mut prefs = TuiPrefs {
                theme: theme.to_string(),
                ..TuiPrefs::default()
@@ -1551,11 +1610,13 @@ mod tests {
    #[test]
    fn tui_prefs_validate_normalises_theme_case() {
        let mut prefs = TuiPrefs {
-            theme: "DARK".to_string(),
+            theme: "MONO".to_string(),
            ..TuiPrefs::default()
        };
-        prefs.validate().expect("DARK should normalise to dark");
-        assert_eq!(prefs.theme, "dark");
+        prefs
+            .validate()
+            .expect("MONO should normalise to grayscale");
+        assert_eq!(prefs.theme, "grayscale");
    }

    #[test]
@@ -1568,6 +1629,10 @@ mod tests {
            .validate()
            .expect_err("solarized is not a valid theme");
        assert!(err.to_string().contains("Invalid tui.toml theme"));
+        assert!(
+            err.to_string()
+                .contains("expected system, dark, light, or grayscale")
+        );
    }

    #[test]
@@ -7,6 +7,7 @@
 //! (`format = "markdown"`); pass `format = "raw"` to keep the bytes intact
 //! when the model wants to do its own parsing.

+use super::handle::query_jsonpath;
 use super::spec::{
    ApprovalRequirement, ToolCapability, ToolContext, ToolError, ToolResult, ToolSpec, optional_u64,
 };
@@ -15,6 +16,7 @@ use async_trait::async_trait;
 use regex::Regex;
 use serde::Serialize;
 use serde_json::{Value, json};
+use std::collections::BTreeMap;
 use std::sync::OnceLock;
 use std::time::Duration;

@@ -73,9 +75,12 @@ impl Format {
 struct FetchResponse {
    url: String,
    status: u16,
+    headers: BTreeMap<String, String>,
    content_type: String,
    content: String,
    truncated: bool,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    fields: Option<BTreeMap<String, Vec<Value>>>,
 }

 pub struct FetchUrlTool;
@@ -110,6 +115,11 @@ impl ToolSpec for FetchUrlTool {
                "timeout_ms": {
                    "type": "integer",
                    "description": "Request timeout in milliseconds (default 15,000; max 60,000)."
+                },
+                "fields": {
+                    "type": "array",
+                    "items": { "type": "string" },
+                    "description": "Optional JSONPath projections for JSON responses. Supports $, .field, [index], [*], and ['field']; returns matches under `fields`."
                }
            },
            "required": ["url"]
@@ -146,6 +156,7 @@ impl ToolSpec for FetchUrlTool {
        let max_bytes = optional_u64(&input, "max_bytes", DEFAULT_MAX_BYTES).min(HARD_MAX_BYTES);
        let timeout_ms =
            optional_u64(&input, "timeout_ms", DEFAULT_TIMEOUT_MS).min(HARD_MAX_TIMEOUT_MS);
+        let requested_fields = parse_fields(&input)?;
        let mut current_url = reqwest::Url::parse(&url)
            .map_err(|e| ToolError::invalid_input(format!("invalid URL: {e}")))?;
        let mut redirects_followed = 0usize;
@@ -202,6 +213,7 @@ impl ToolSpec for FetchUrlTool {
            .and_then(|v| v.to_str().ok())
            .unwrap_or("application/octet-stream")
            .to_string();
+        let headers = response_headers(resp.headers());

        let bytes = resp
            .bytes()
@@ -216,6 +228,7 @@ impl ToolSpec for FetchUrlTool {
        };

        let body_text = String::from_utf8_lossy(usable).to_string();
+        let fields = project_json_fields(&body_text, &content_type, &requested_fields)?;
        let processed = match format {
            Format::Raw => body_text,
            Format::Text | Format::Markdown => {
@@ -230,9 +243,11 @@ impl ToolSpec for FetchUrlTool {
        let response = FetchResponse {
            url: final_url,
            status: status.as_u16(),
+            headers,
            content_type,
            content: processed,
            truncated,
+            fields,
        };

        if !status.is_success() {
@@ -386,6 +401,66 @@ fn validate_dns_resolved_ip(
    )))
 }

+fn parse_fields(input: &Value) -> Result<Vec<String>, ToolError> {
+    let Some(values) = input.get("fields") else {
+        return Ok(Vec::new());
+    };
+    let Some(values) = values.as_array() else {
+        return Err(ToolError::invalid_input("`fields` must be an array"));
+    };
+    let mut fields = Vec::new();
+    for value in values {
+        let Some(field) = value.as_str() else {
+            return Err(ToolError::invalid_input(
+                "`fields` entries must be JSONPath strings",
+            ));
+        };
+        let field = field.trim();
+        if !field.is_empty() {
+            fields.push(field.to_string());
+        }
+    }
+    Ok(fields)
+}
+
+fn response_headers(headers: &reqwest::header::HeaderMap) -> BTreeMap<String, String> {
+    headers
+        .iter()
+        .filter_map(|(name, value)| {
+            value
+                .to_str()
+                .ok()
+                .map(|value| (name.as_str().to_ascii_lowercase(), value.to_string()))
+        })
+        .collect()
+}
+
+fn project_json_fields(
+    body_text: &str,
+    content_type: &str,
+    fields: &[String],
+) -> Result<Option<BTreeMap<String, Vec<Value>>>, ToolError> {
+    if fields.is_empty() {
+        return Ok(None);
+    }
+    if !content_type.to_ascii_lowercase().contains("json") {
+        return Err(ToolError::invalid_input(
+            "`fields` can only be used with JSON responses",
+        ));
+    }
+    let body_json: Value = serde_json::from_str(body_text).map_err(|e| {
+        ToolError::execution_failed(format!("response body is not valid JSON for `fields`: {e}"))
+    })?;
+    let mut out = BTreeMap::new();
+    for field in fields {
+        let matches = query_jsonpath(&body_json, field).map_err(|e| {
+            ToolError::invalid_input(format!("invalid JSONPath `{field}` in `fields`: {e}"))
+        })?;
+        out.insert(field.clone(), matches);
+    }
+    Ok(Some(out))
+}
+
 /// Strip `<script>` / `<style>` blocks, drop remaining tags, and collapse
 /// whitespace. Good enough for "let the model read this page" — not a full
 /// HTML-to-Markdown converter.
@@ -453,6 +528,31 @@ mod tests {
        assert!(Format::parse(Some("yaml")).is_err());
    }

+    #[test]
+    fn project_json_fields_returns_requested_jsonpath_matches() {
+        let fields = vec!["$.items[*].name".to_string(), "$.count".to_string()];
+        let projected = project_json_fields(
+            r#"{"items":[{"name":"alpha"},{"name":"beta"}],"count":2}"#,
+            "application/json",
+            &fields,
+        )
+        .expect("project")
+        .expect("some");
+
+        assert_eq!(
+            projected.get("$.items[*].name").unwrap(),
+            &vec![json!("alpha"), json!("beta")]
+        );
+        assert_eq!(projected.get("$.count").unwrap(), &vec![json!(2)]);
+    }
+
+    #[test]
+    fn project_json_fields_rejects_non_json_content_type() {
+        let fields = vec!["$.name".to_string()];
+        let err = project_json_fields("{}", "text/plain", &fields).expect_err("must reject");
+        assert!(format!("{err}").contains("JSON responses"));
+    }
+
    #[tokio::test]
    async fn rejects_non_http_schemes() {
        let tool = FetchUrlTool;
@@ -6,7 +6,7 @@
 use super::diff_format::make_unified_diff;
 use super::spec::{
    ApprovalRequirement, ToolCapability, ToolContext, ToolError, ToolResult, ToolSpec,
-    lsp_diagnostics_for_paths, optional_str, required_str,
+    lsp_diagnostics_for_paths, optional_bool, optional_str, required_str,
 };
 use async_trait::async_trait;
 use serde_json::{Value, json};
@@ -473,7 +473,7 @@ impl ToolSpec for EditFileTool {
    }

    fn description(&self) -> &'static str {
-        "Replace text in a single file via exact search/replace. Use this instead of `sed -i` in `exec_shell` for one unambiguous in-place edit. `search` must match exactly, including whitespace and indentation. Returns a compact unified diff, not the full file. For structural, multi-block, or cross-file changes, use `apply_patch` or `write_file` instead."
+        "Replace text in a single file via exact search/replace. Use this instead of `sed -i` in `exec_shell` for one unambiguous in-place edit. `search` matches exactly by default, including whitespace and indentation; set `fuzz: true` to tolerate leading-indentation differences. Returns a compact unified diff, not the full file. For structural, multi-block, or cross-file changes, use `apply_patch` or `write_file` instead."
    }

    fn input_schema(&self) -> Value {
@@ -491,6 +491,10 @@ impl ToolSpec for EditFileTool {
                "replace": {
                    "type": "string",
                    "description": "Text to replace with"
+                },
+                "fuzz": {
+                    "type": "boolean",
+                    "description": "When true, tolerate leading whitespace differences on each searched line (default false)"
                }
            },
            "required": ["path", "search", "replace"]
@@ -513,6 +517,7 @@ impl ToolSpec for EditFileTool {
        let path_str = required_str(&input, "path")?;
        let search = required_str(&input, "search")?;
        let replace = required_str(&input, "replace")?;
+        let fuzz = optional_bool(&input, "fuzz", false);

        if search == replace {
            return Err(ToolError::invalid_input(
@@ -527,14 +532,36 @@ impl ToolSpec for EditFileTool {
        })?;

        let count = contents.matches(search).count();
-        if count == 0 {
+        let (updated, count, fuzz_used) = if count == 0 && fuzz {
+            let matches = leading_whitespace_fuzzy_matches(&contents, search);
+            match matches.as_slice() {
+                [] => {
+                    return Err(ToolError::execution_failed(format!(
+                        "Search string not found in {}",
+                        file_path.display()
+                    )));
+                }
+                [(start, end)] => {
+                    let mut updated = contents.clone();
+                    updated.replace_range(*start..*end, replace);
+                    (updated, 1, true)
+                }
+                _ => {
+                    return Err(ToolError::execution_failed(format!(
+                        "Fuzzy search matched {} locations in {}; refine search text",
+                        matches.len(),
+                        file_path.display()
+                    )));
+                }
+            }
+        } else if count == 0 {
            return Err(ToolError::execution_failed(format!(
                "Search string not found in {}",
                file_path.display()
            )));
-        }
-
-        let updated = contents.replace(search, replace);
+        } else {
+            (contents.replace(search, replace), count, false)
+        };

        fs::write(&file_path, &updated).map_err(|e| {
            ToolError::execution_failed(format!("Failed to write {}: {}", file_path.display(), e))
@@ -549,7 +576,12 @@ impl ToolSpec for EditFileTool {
                 Verify the result with read_file before proceeding."
            )
        } else {
-            format!("Replaced 1 occurrence in {display}")
+            let fuzz_note = if fuzz_used {
+                " (fuzzy indentation match)"
+            } else {
+                ""
+            };
+            format!("Replaced 1 occurrence in {display}{fuzz_note}")
        };
        let body = if diff.is_empty() {
            format!("{summary}\n(no textual changes)")
@@ -569,6 +601,52 @@ impl ToolSpec for EditFileTool {
    }
 }

+fn strip_line_leading_whitespace_with_map(input: &str) -> (String, Vec<usize>) {
+    let mut normalized = String::with_capacity(input.len());
+    let mut byte_map = Vec::with_capacity(input.len());
+    let mut at_line_start = true;
+    for (idx, ch) in input.char_indices() {
+        if at_line_start && matches!(ch, ' ' | '\t') {
+            continue;
+        }
+        normalized.push(ch);
+        for _ in 0..ch.len_utf8() {
+            byte_map.push(idx);
+        }
+        at_line_start = ch == '\n';
+    }
+    (normalized, byte_map)
+}
+
+fn line_start_before(input: &str, idx: usize) -> usize {
+    input[..idx]
+        .rfind('\n')
+        .map_or(0, |newline| newline.saturating_add(1))
+}
+
+fn leading_whitespace_fuzzy_matches(contents: &str, search: &str) -> Vec<(usize, usize)> {
+    let (normalized_contents, byte_map) = strip_line_leading_whitespace_with_map(contents);
+    let (normalized_search, _) = strip_line_leading_whitespace_with_map(search);
+    if normalized_search.is_empty() {
+        return Vec::new();
+    }
+
+    let mut matches = Vec::new();
+    let mut cursor = 0;
+    while let Some(rel_idx) = normalized_contents[cursor..].find(&normalized_search) {
+        let norm_start = cursor + rel_idx;
+        let norm_end = norm_start + normalized_search.len();
+        let Some(&mapped_start) = byte_map.get(norm_start) else {
+            break;
+        };
+        let original_start = line_start_before(contents, mapped_start);
+        let original_end = byte_map.get(norm_end).copied().unwrap_or(contents.len());
+        matches.push((original_start, original_end));
+        cursor = norm_start.saturating_add(1);
+    }
+    matches
+}
+
 // === ListDirTool ===

 /// Tool for listing directory contents.
@@ -1205,6 +1283,41 @@ mod tests {
        assert!(!result.content.contains("multiple matches were replaced"));
    }

+    #[tokio::test]
+    async fn test_edit_file_fuzz_tolerates_leading_whitespace() {
+        let tmp = tempdir().expect("tempdir");
+        let ctx = ToolContext::new(tmp.path().to_path_buf());
+
+        let test_file = tmp.path().join("fuzzy.txt");
+        fs::write(
+            &test_file,
+            "fn main() {\n    if true {\n        let value = 1;\n    }\n}\n",
+        )
+        .expect("write");
+
+        let tool = EditFileTool;
+        let result = tool
+            .execute(
+                json!({
+                    "path": "fuzzy.txt",
+                    "search": "if true {\n    let value = 1;\n}",
+                    "replace": "    if true {\n        let value = 2;\n    }",
+                    "fuzz": true
+                }),
+                &ctx,
+            )
+            .await
+            .expect("execute");
+
+        assert!(result.success);
+        assert!(result.content.contains("fuzzy indentation match"));
+        let edited = fs::read_to_string(&test_file).expect("read");
+        assert_eq!(
+            edited,
+            "fn main() {\n    if true {\n        let value = 2;\n    }\n}\n"
+        );
+    }
+
    #[tokio::test]
    async fn test_edit_file_not_found() {
        let tmp = tempdir().expect("tempdir");
@@ -8,6 +8,8 @@ use ignore::WalkBuilder;
 use serde::Serialize;
 use serde_json::{Value, json};

+use crate::tools::search::matches_glob;
+
 use super::spec::{
    ApprovalRequirement, ToolCapability, ToolContext, ToolError, ToolResult, ToolSpec,
    optional_str, optional_u64, required_str,
@@ -52,6 +54,11 @@ impl ToolSpec for FileSearchTool {
                    "type": "array",
                    "items": { "type": "string" },
                    "description": "Optional list of file extensions to include (e.g. [\"rs\", \"md\"])."
+                },
+                "exclude": {
+                    "type": "array",
+                    "items": { "type": "string" },
+                    "description": "Optional glob patterns to exclude, matching grep_files' convention (e.g. [\"target/**\", \"*.lock\"])."
                }
            },
            "required": ["query"]
@@ -79,7 +86,8 @@ impl ToolSpec for FileSearchTool {
        };

        let extensions = parse_extensions(&input);
-        let matches = search_files(query, &base_path, extensions, limit)?;
+        let exclude_patterns = parse_exclude_patterns(&input);
+        let matches = search_files(query, &base_path, extensions, exclude_patterns, limit)?;
        ToolResult::json(&matches).map_err(|e| ToolError::execution_failed(e.to_string()))
    }
 }
@@ -107,10 +115,37 @@ fn parse_extensions(input: &Value) -> Vec<String> {
    out
 }

+fn parse_exclude_patterns(input: &Value) -> Vec<String> {
+    if let Some(values) = input.get("exclude").and_then(Value::as_array) {
+        return values
+            .iter()
+            .filter_map(Value::as_str)
+            .map(str::trim)
+            .filter(|pattern| !pattern.is_empty())
+            .map(ToOwned::to_owned)
+            .collect();
+    }
+
+    [
+        "target/**",
+        "node_modules/**",
+        ".git/**",
+        "DerivedData/**",
+        "dist/**",
+        "build/**",
+        "*.lock",
+        "*.plist",
+    ]
+    .into_iter()
+    .map(ToOwned::to_owned)
+    .collect()
+}
+
 fn search_files(
    query: &str,
    base_path: &Path,
    extensions: Vec<String>,
+    exclude_patterns: Vec<String>,
    limit: usize,
 ) -> Result<Vec<FileSearchMatch>, ToolError> {
    if !base_path.exists() {
@@ -137,15 +172,19 @@ fn search_files(
        }

        let path = entry.path();
-        if !extensions.is_empty() && !extension_matches(path, &extensions) {
-            continue;
-        }
-
        let rel_path = path
            .strip_prefix(base_path)
            .unwrap_or(path)
            .to_string_lossy()
-            .to_string();
+            .replace('\\', "/");
+        if should_exclude(&rel_path, &exclude_patterns) {
+            continue;
+        }
+
+        if !extensions.is_empty() && !extension_matches(path, &extensions) {
+            continue;
+        }
+
        let name = file_name(path);

        let score = match score_match(&query_norm, &rel_path, &name) {
@@ -167,6 +206,12 @@ fn search_files(
    Ok(results)
 }

+fn should_exclude(rel_path: &str, exclude_patterns: &[String]) -> bool {
+    exclude_patterns
+        .iter()
+        .any(|pattern| matches_glob(rel_path, pattern))
+}
+
 fn extension_matches(path: &Path, extensions: &[String]) -> bool {
    let Some(ext) = path.extension().and_then(|e| e.to_str()) else {
        return false;
@@ -323,6 +368,46 @@ mod tests {
        assert!(!result.content.contains("notes.md"));
    }

+    #[tokio::test]
+    async fn test_file_search_exclude_filter() {
+        let tmp = tempdir().expect("tempdir");
+        let root = tmp.path();
+        std::fs::create_dir_all(root.join("fixtures")).expect("mkdir");
+        std::fs::write(root.join("fixtures").join("needle.txt"), "no\n").expect("write");
+        std::fs::write(root.join("needle.txt"), "yes\n").expect("write");
+
+        let ctx = ToolContext::new(root.to_path_buf());
+        let tool = FileSearchTool;
+        let result = tool
+            .execute(json!({"query": "needle", "exclude": ["fixtures/**"]}), &ctx)
+            .await
+            .expect("execute");
+
+        assert!(result.success);
+        assert!(result.content.contains("\"path\": \"needle.txt\""));
+        assert!(!result.content.contains("fixtures/needle.txt"));
+    }
+
+    #[tokio::test]
+    async fn test_file_search_default_excludes_build_artifacts() {
+        let tmp = tempdir().expect("tempdir");
+        let root = tmp.path();
+        std::fs::create_dir_all(root.join("target")).expect("mkdir");
+        std::fs::write(root.join("target").join("needle.txt"), "no\n").expect("write");
+        std::fs::write(root.join("needle.txt"), "yes\n").expect("write");
+
+        let ctx = ToolContext::new(root.to_path_buf());
+        let tool = FileSearchTool;
+        let result = tool
+            .execute(json!({"query": "needle"}), &ctx)
+            .await
+            .expect("execute");
+
+        assert!(result.success);
+        assert!(result.content.contains("\"path\": \"needle.txt\""));
+        assert!(!result.content.contains("target/needle.txt"));
+    }
+
    #[tokio::test]
    #[cfg(unix)]
    async fn test_file_search_does_not_follow_symlinked_files() {
@@ -0,0 +1,812 @@
+//! Symbolic handle storage and bounded reads.
+//!
+//! `var_handle` is the shared protocol that lets expensive environments
+//! (RLM sessions, sub-agent transcripts, large artifacts) hand the parent a
+//! small symbolic reference instead of copying the whole payload into the
+//! parent transcript.
+
+use std::collections::HashMap;
+use std::sync::Arc;
+
+use async_trait::async_trait;
+use serde::{Deserialize, Serialize};
+use serde_json::{Value, json};
+use sha2::{Digest, Sha256};
+use tokio::sync::Mutex;
+
+use crate::tools::spec::{
+    ApprovalRequirement, ToolCapability, ToolContext, ToolError, ToolResult, ToolSpec,
+};
+
+const DEFAULT_MAX_CHARS: usize = 12_000;
+const HARD_MAX_CHARS: usize = 50_000;
+#[allow(dead_code)] // Used by producers as they begin returning var_handle records.
+const REPR_PREVIEW_CHARS: usize = 160;
+
+pub type SharedHandleStore = Arc<Mutex<HandleStore>>;
+
+#[must_use]
+pub fn new_shared_handle_store() -> SharedHandleStore {
+    Arc::new(Mutex::new(HandleStore::default()))
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize, PartialEq, Eq)]
+pub struct VarHandle {
+    pub kind: String,
+    pub session_id: String,
+    pub name: String,
+    #[serde(rename = "type")]
+    pub type_name: String,
+    pub length: usize,
+    pub repr_preview: String,
+    pub sha256: String,
+}
+
+impl VarHandle {
+    #[must_use]
+    pub fn key(&self) -> HandleKey {
+        HandleKey {
+            session_id: self.session_id.clone(),
+            name: self.name.clone(),
+        }
+    }
+}
+
+#[derive(Debug, Clone, PartialEq, Eq, Hash)]
+pub struct HandleKey {
+    pub session_id: String,
+    pub name: String,
+}
+
+#[derive(Debug, Clone)]
+pub struct HandleRecord {
+    pub handle: VarHandle,
+    pub value: HandleValue,
+}
+
+#[allow(dead_code)] // Producers land in later v0.8.33 slices; handle_read is first.
+#[derive(Debug, Clone)]
+pub enum HandleValue {
+    Text(String),
+    Json(Value),
+}
+
+#[allow(dead_code)] // Foundation methods used by upcoming RLM/agent session producers.
+impl HandleValue {
+    fn length(&self) -> usize {
+        match self {
+            Self::Text(text) => text.chars().count(),
+            Self::Json(Value::Array(items)) => items.len(),
+            Self::Json(Value::Object(map)) => map.len(),
+            Self::Json(value) => value.to_string().chars().count(),
+        }
+    }
+
+    fn type_name(&self) -> String {
+        match self {
+            Self::Text(_) => "str".to_string(),
+            Self::Json(Value::Array(_)) => "list".to_string(),
+            Self::Json(Value::Object(_)) => "dict".to_string(),
+            Self::Json(Value::String(_)) => "str".to_string(),
+            Self::Json(Value::Bool(_)) => "bool".to_string(),
+            Self::Json(Value::Number(_)) => "number".to_string(),
+            Self::Json(Value::Null) => "null".to_string(),
+        }
+    }
+
+    fn stable_bytes(&self) -> Vec<u8> {
+        match self {
+            Self::Text(text) => text.as_bytes().to_vec(),
+            Self::Json(value) => serde_json::to_vec(value).unwrap_or_default(),
+        }
+    }
+
+    fn repr_preview(&self) -> String {
+        match self {
+            Self::Text(text) => truncate_chars(text, REPR_PREVIEW_CHARS),
+            Self::Json(value) => truncate_chars(&value.to_string(), REPR_PREVIEW_CHARS),
+        }
+    }
+}
+
+#[derive(Debug, Default)]
+pub struct HandleStore {
+    records: HashMap<HandleKey, HandleRecord>,
+}
+
+#[allow(dead_code)] // Insertors are for producer tools; this PR wires the reader first.
+impl HandleStore {
+    #[must_use]
+    pub fn insert_text(
+        &mut self,
+        session_id: impl Into<String>,
+        name: impl Into<String>,
+        text: impl Into<String>,
+    ) -> VarHandle {
+        self.insert(session_id, name, HandleValue::Text(text.into()))
+    }
+
+    #[must_use]
+    pub fn insert_json(
+        &mut self,
+        session_id: impl Into<String>,
+        name: impl Into<String>,
+        value: Value,
+    ) -> VarHandle {
+        self.insert(session_id, name, HandleValue::Json(value))
+    }
+
+    #[must_use]
+    pub fn get(&self, handle: &VarHandle) -> Option<&HandleRecord> {
+        self.records.get(&handle.key())
+    }
+
+    fn insert(
+        &mut self,
+        session_id: impl Into<String>,
+        name: impl Into<String>,
+        value: HandleValue,
+    ) -> VarHandle {
+        let session_id = session_id.into();
+        let name = name.into();
+        let handle = VarHandle {
+            kind: "var_handle".to_string(),
+            session_id: session_id.clone(),
+            name: name.clone(),
+            type_name: value.type_name(),
+            length: value.length(),
+            repr_preview: value.repr_preview(),
+            sha256: sha256_hex(&value.stable_bytes()),
+        };
+        let key = HandleKey { session_id, name };
+        self.records.insert(
+            key,
+            HandleRecord {
+                handle: handle.clone(),
+                value,
+            },
+        );
+        handle
+    }
+}
+
+pub struct HandleReadTool;
+
+#[async_trait]
+impl ToolSpec for HandleReadTool {
+    fn name(&self) -> &'static str {
+        "handle_read"
+    }
+
+    fn description(&self) -> &'static str {
+        "Read a bounded projection from a var_handle returned by tools such \
+         as RLM sessions, sub-agents, or large artifact producers. Provide \
+         exactly one projection: `slice` for char/line slices, `range` for \
+         one-based line ranges, `count` for metadata counts, or `jsonpath` \
+         for a small JSON-path projection. This retrieves from the handle's \
+         backing environment instead of asking the parent transcript to hold \
+         the full payload."
+    }
+
+    fn input_schema(&self) -> Value {
+        json!({
+            "type": "object",
+            "required": ["handle"],
+            "properties": {
+                "handle": {
+                    "description": "A var_handle object, or a compact `session_id/name` string.",
+                    "oneOf": [
+                        {
+                            "type": "object",
+                            "required": ["kind", "session_id", "name"],
+                            "properties": {
+                                "kind": { "type": "string", "const": "var_handle" },
+                                "session_id": { "type": "string" },
+                                "name": { "type": "string" },
+                                "type": { "type": "string" },
+                                "length": { "type": "integer" },
+                                "repr_preview": { "type": "string" },
+                                "sha256": { "type": "string" }
+                            }
+                        },
+                        { "type": "string" }
+                    ]
+                },
+                "slice": {
+                    "type": "object",
+                    "description": "Zero-based half-open slice over chars or lines.",
+                    "properties": {
+                        "start": { "type": "integer", "minimum": 0 },
+                        "end": { "type": "integer", "minimum": 0 },
+                        "unit": { "type": "string", "enum": ["chars", "lines"], "default": "chars" }
+                    }
+                },
+                "range": {
+                    "type": "object",
+                    "description": "One-based inclusive line range.",
+                    "required": ["start", "end"],
+                    "properties": {
+                        "start": { "type": "integer", "minimum": 1 },
+                        "end": { "type": "integer", "minimum": 1 }
+                    }
+                },
+                "count": {
+                    "type": "boolean",
+                    "description": "Return counts for the handle payload."
+                },
+                "jsonpath": {
+                    "type": "string",
+                    "description": "Small JSONPath subset: $, .field, [index], [*], and ['field']."
+                },
+                "max_chars": {
+                    "type": "integer",
+                    "description": "Maximum characters to return in this projection. Defaults to 12000; hard-capped at 50000."
+                }
+            }
+        })
+    }
+
+    fn capabilities(&self) -> Vec<ToolCapability> {
+        vec![ToolCapability::ReadOnly]
+    }
+
+    fn approval_requirement(&self) -> ApprovalRequirement {
+        ApprovalRequirement::Auto
+    }
+
+    fn supports_parallel(&self) -> bool {
+        true
+    }
+
+    async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> {
+        let handle = parse_handle(
+            input
+                .get("handle")
+                .ok_or_else(|| ToolError::missing_field("handle"))?,
+        )?;
+        let projection = parse_projection(&input)?;
+        let max_chars = input
+            .get("max_chars")
+            .and_then(Value::as_u64)
+            .map(|n| (n as usize).min(HARD_MAX_CHARS))
+            .unwrap_or(DEFAULT_MAX_CHARS);
+
+        let store = context.runtime.handle_store.lock().await;
+        let record = store.get(&handle).ok_or_else(|| {
+            ToolError::invalid_input(format!(
+                "handle_read: no payload found for handle {}/{}",
+                handle.session_id, handle.name
+            ))
+        })?;
+        if !handle.sha256.is_empty() && handle.sha256 != record.handle.sha256 {
+            return Err(ToolError::invalid_input(
+                "handle_read: handle sha256 does not match stored payload",
+            ));
+        }
+
+        let output = match projection {
+            Projection::Count => count_projection(record),
+            Projection::Slice { start, end, unit } => {
+                slice_projection(record, start, end, unit, max_chars)
+            }
+            Projection::Range { start, end } => {
+                line_range_projection(record, start, end, max_chars)
+            }
+            Projection::JsonPath(path) => jsonpath_projection(record, &path, max_chars)?,
+        };
+
+        ToolResult::json(&output).map_err(|e| ToolError::execution_failed(e.to_string()))
+    }
+}
+
+#[derive(Debug, Clone, Copy)]
+enum SliceUnit {
+    Chars,
+    Lines,
+}
+
+enum Projection {
+    Count,
+    Slice {
+        start: usize,
+        end: Option<usize>,
+        unit: SliceUnit,
+    },
+    Range {
+        start: usize,
+        end: usize,
+    },
+    JsonPath(String),
+}
+
+fn parse_handle(value: &Value) -> Result<VarHandle, ToolError> {
+    if let Some(raw) = value.as_str() {
+        let Some((session_id, name)) = raw.rsplit_once('/') else {
+            return Err(ToolError::invalid_input(
+                "handle_read: string handle must use `session_id/name`",
+            ));
+        };
+        return Ok(VarHandle {
+            kind: "var_handle".to_string(),
+            session_id: session_id.to_string(),
+            name: name.to_string(),
+            type_name: String::new(),
+            length: 0,
+            repr_preview: String::new(),
+            sha256: String::new(),
+        });
+    }
+
+    let handle: VarHandle = serde_json::from_value(value.clone()).map_err(|e| {
+        ToolError::invalid_input(format!("handle_read: invalid var_handle object: {e}"))
+    })?;
+    if handle.kind != "var_handle" {
+        return Err(ToolError::invalid_input(
+            "handle_read: handle.kind must be `var_handle`",
+        ));
+    }
+    if handle.session_id.trim().is_empty() || handle.name.trim().is_empty() {
+        return Err(ToolError::invalid_input(
+            "handle_read: handle.session_id and handle.name must be non-empty",
+        ));
+    }
+    Ok(handle)
+}
+
+fn parse_projection(input: &Value) -> Result<Projection, ToolError> {
+    let mut count = 0usize;
+    count += usize::from(input.get("slice").is_some());
+    count += usize::from(input.get("range").is_some());
+    count += usize::from(input.get("count").and_then(Value::as_bool).unwrap_or(false));
+    count += usize::from(input.get("jsonpath").is_some());
+    if count != 1 {
+        return Err(ToolError::invalid_input(
+            "handle_read: provide exactly one of `slice`, `range`, `count: true`, or `jsonpath`",
+        ));
+    }
+
+    if input.get("count").and_then(Value::as_bool).unwrap_or(false) {
+        return Ok(Projection::Count);
+    }
+    if let Some(path) = input.get("jsonpath") {
+        let path = path
+            .as_str()
+            .ok_or_else(|| ToolError::invalid_input("handle_read: jsonpath must be a string"))?
+            .trim();
+        if path.is_empty() {
+            return Err(ToolError::invalid_input(
+                "handle_read: jsonpath must not be empty",
+            ));
+        }
+        return Ok(Projection::JsonPath(path.to_string()));
+    }
+    if let Some(slice) = input.get("slice") {
+        let start = slice.get("start").and_then(Value::as_u64).unwrap_or(0) as usize;
+        let end = slice.get("end").and_then(Value::as_u64).map(|n| n as usize);
+        if let Some(end) = end
+            && end < start
+        {
+            return Err(ToolError::invalid_input(
+                "handle_read: slice.end must be greater than or equal to slice.start",
+            ));
+        }
+        let unit = match slice.get("unit").and_then(Value::as_str).unwrap_or("chars") {
+            "chars" => SliceUnit::Chars,
+            "lines" => SliceUnit::Lines,
+            other => {
+                return Err(ToolError::invalid_input(format!(
+                    "handle_read: unsupported slice.unit `{other}`"
+                )));
+            }
+        };
+        return Ok(Projection::Slice { start, end, unit });
+    }
+    let range = input
+        .get("range")
+        .ok_or_else(|| ToolError::invalid_input("handle_read: missing projection"))?;
+    let start = range
+        .get("start")
+        .and_then(Value::as_u64)
+        .ok_or_else(|| ToolError::missing_field("range.start"))? as usize;
+    let end = range
+        .get("end")
+        .and_then(Value::as_u64)
+        .ok_or_else(|| ToolError::missing_field("range.end"))? as usize;
+    if start == 0 || end == 0 || end < start {
+        return Err(ToolError::invalid_input(
+            "handle_read: range is one-based inclusive and end must be >= start",
+        ));
+    }
+    Ok(Projection::Range { start, end })
+}
+
+fn count_projection(record: &HandleRecord) -> Value {
+    match &record.value {
+        HandleValue::Text(text) => json!({
+            "handle": record.handle,
+            "projection": "count",
+            "chars": text.chars().count(),
+            "lines": text.lines().count(),
+            "bytes": text.len(),
+        }),
+        HandleValue::Json(value) => json!({
+            "handle": record.handle,
+            "projection": "count",
+            "json_type": json_type(value),
+            "length": record.handle.length,
+            "bytes": value.to_string().len(),
+        }),
+    }
+}
+
+fn slice_projection(
+    record: &HandleRecord,
+    start: usize,
+    end: Option<usize>,
+    unit: SliceUnit,
+    max_chars: usize,
+) -> Value {
+    let text = record_text(record);
+    match unit {
+        SliceUnit::Chars => {
+            let total = text.chars().count();
+            let end = end.unwrap_or(total).min(total);
+            let raw = char_slice(&text, start.min(total), end);
+            bounded_text_projection(
+                record,
+                "slice",
+                raw,
+                max_chars,
+                json!({
+                    "unit": "chars",
+                    "start": start.min(total),
+                    "end": end,
+                    "total_chars": total,
+                }),
+            )
+        }
+        SliceUnit::Lines => {
+            let lines: Vec<&str> = text.lines().collect();
+            let total = lines.len();
+            let end = end.unwrap_or(total).min(total);
+            let raw = if start >= end {
+                String::new()
+            } else {
+                lines[start.min(total)..end].join("\n")
+            };
+            bounded_text_projection(
+                record,
+                "slice",
+                raw,
+                max_chars,
+                json!({
+                    "unit": "lines",
+                    "start": start.min(total),
+                    "end": end,
+                    "total_lines": total,
+                }),
+            )
+        }
+    }
+}
+
+fn line_range_projection(
+    record: &HandleRecord,
+    start: usize,
+    end: usize,
+    max_chars: usize,
+) -> Value {
+    let text = record_text(record);
+    let lines: Vec<&str> = text.lines().collect();
+    let total = lines.len();
+    let zero_start = start.saturating_sub(1).min(total);
+    let zero_end = end.min(total);
+    let raw = if zero_start >= zero_end {
+        String::new()
+    } else {
+        lines[zero_start..zero_end].join("\n")
+    };
+    bounded_text_projection(
+        record,
+        "range",
+        raw,
+        max_chars,
+        json!({
+            "start": start,
+            "end": end,
+            "shown_start": zero_start + 1,
+            "shown_end": zero_end,
+            "total_lines": total,
+        }),
+    )
+}
+
+fn jsonpath_projection(
+    record: &HandleRecord,
+    path: &str,
+    max_chars: usize,
+) -> Result<Value, ToolError> {
+    let HandleValue::Json(value) = &record.value else {
+        return Err(ToolError::invalid_input(
+            "handle_read: jsonpath projection requires a JSON handle",
+        ));
+    };
+    let matches = query_jsonpath(value, path)
+        .map_err(|e| ToolError::invalid_input(format!("handle_read: {e}")))?;
+    let mut payload = json!({
+        "handle": record.handle,
+        "projection": "jsonpath",
+        "jsonpath": path,
+        "count": matches.len(),
+        "matches": matches,
+        "truncated": false,
+    });
+    let rendered = serde_json::to_string(&payload).unwrap_or_default();
+    if rendered.chars().count() > max_chars {
+        payload["matches"] = json!([]);
+        payload["preview"] = json!(truncate_chars(&rendered, max_chars));
+        payload["truncated"] = json!(true);
+    }
+    Ok(payload)
+}
+
+fn bounded_text_projection(
+    record: &HandleRecord,
+    projection: &str,
+    raw: String,
+    max_chars: usize,
+    extra: Value,
+) -> Value {
+    let raw_chars = raw.chars().count();
+    let content = truncate_chars(&raw, max_chars);
+    let shown_chars = content.chars().count();
+    json!({
+        "handle": record.handle,
+        "projection": projection,
+        "content": content,
+        "truncated": shown_chars < raw_chars,
+        "shown_chars": shown_chars,
+        "omitted_chars": raw_chars.saturating_sub(shown_chars),
+        "meta": extra,
+    })
+}
+
+fn record_text(record: &HandleRecord) -> String {
+    match &record.value {
+        HandleValue::Text(text) => text.clone(),
+        HandleValue::Json(value) => serde_json::to_string_pretty(value).unwrap_or_default(),
+    }
+}
+
+pub(crate) fn query_jsonpath(root: &Value, path: &str) -> Result<Vec<Value>, String> {
+    if !path.starts_with('$') {
+        return Err("jsonpath must start with `$`".to_string());
+    }
+    let mut idx = 1usize;
+    let bytes = path.as_bytes();
+    let mut current = vec![root];
+    while idx < bytes.len() {
+        match bytes[idx] {
+            b'.' => {
+                idx += 1;
+                if idx < bytes.len() && bytes[idx] == b'.' {
+                    return Err("recursive descent (`..`) is not supported".to_string());
+                }
+                let start = idx;
+                while idx < bytes.len()
+                    && (bytes[idx].is_ascii_alphanumeric() || bytes[idx] == b'_')
+                {
+                    idx += 1;
+                }
+                if start == idx {
+                    return Err("expected field name after `.`".to_string());
+                }
+                let field = &path[start..idx];
+                current = current
+                    .into_iter()
+                    .filter_map(|value| value.get(field))
+                    .collect();
+            }
+            b'[' => {
+                let Some(close_rel) = path[idx + 1..].find(']') else {
+                    return Err("unterminated `[` segment".to_string());
+                };
+                let close = idx + 1 + close_rel;
+                let token = path[idx + 1..close].trim();
+                idx = close + 1;
+                current = apply_bracket_token(current, token)?;
+            }
+            other => {
+                return Err(format!(
+                    "unexpected character `{}` in jsonpath",
+                    other as char
+                ));
+            }
+        }
+    }
+    Ok(current.into_iter().cloned().collect())
+}
+
+fn apply_bracket_token<'a>(values: Vec<&'a Value>, token: &str) -> Result<Vec<&'a Value>, String> {
+    if token == "*" {
+        let mut out = Vec::new();
+        for value in values {
+            match value {
+                Value::Array(items) => out.extend(items),
+                Value::Object(map) => out.extend(map.values()),
+                _ => {}
+            }
+        }
+        return Ok(out);
+    }
+
+    if let Some(field) = quoted_field(token) {
+        return Ok(values
+            .into_iter()
+            .filter_map(|value| value.get(field))
+            .collect());
+    }
+
+    let index = token
+        .parse::<usize>()
+        .map_err(|_| format!("unsupported bracket token `{token}`"))?;
+    Ok(values
+        .into_iter()
+        .filter_map(|value| value.as_array().and_then(|items| items.get(index)))
+        .collect())
+}
+
+fn quoted_field(token: &str) -> Option<&str> {
+    if token.len() < 2 {
+        return None;
+    }
+    let bytes = token.as_bytes();
+    let quote = bytes[0];
+    if !matches!(quote, b'\'' | b'"') || bytes[token.len() - 1] != quote {
+        return None;
+    }
+    Some(&token[1..token.len() - 1])
+}
+
+fn char_slice(text: &str, start: usize, end: usize) -> String {
+    text.chars()
+        .skip(start)
+        .take(end.saturating_sub(start))
+        .collect()
+}
+
+fn truncate_chars(text: &str, max_chars: usize) -> String {
+    let mut out = String::new();
+    for (idx, ch) in text.chars().enumerate() {
+        if idx == max_chars {
+            break;
+        }
+        out.push(ch);
+    }
+    out
+}
+
+#[allow(dead_code)] // Used when producer tools register handle payloads.
+fn sha256_hex(bytes: &[u8]) -> String {
+    let mut hasher = Sha256::new();
+    hasher.update(bytes);
+    format!("{:x}", hasher.finalize())
+}
+
+fn json_type(value: &Value) -> &'static str {
+    match value {
+        Value::Null => "null",
+        Value::Bool(_) => "bool",
+        Value::Number(_) => "number",
+        Value::String(_) => "string",
+        Value::Array(_) => "array",
+        Value::Object(_) => "object",
+    }
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use serde_json::json;
+
+    fn ctx() -> ToolContext {
+        ToolContext::new(".")
+    }
+
+    #[tokio::test]
+    async fn handle_read_slices_text_by_chars() {
+        let ctx = ctx();
+        let handle = {
+            let mut store = ctx.runtime.handle_store.lock().await;
+            store.insert_text("rlm:test", "matches", "abcdef")
+        };
+
+        let result = HandleReadTool
+            .execute(
+                json!({"handle": handle, "slice": {"start": 1, "end": 4}}),
+                &ctx,
+            )
+            .await
+            .expect("execute");
+        let body: Value = serde_json::from_str(&result.content).expect("json");
+        assert_eq!(body["content"], "bcd");
+        assert_eq!(body["truncated"], false);
+    }
+
+    #[tokio::test]
+    async fn handle_read_ranges_text_by_one_based_lines() {
+        let ctx = ctx();
+        let handle = {
+            let mut store = ctx.runtime.handle_store.lock().await;
+            store.insert_text("agent:test", "transcript", "one\ntwo\nthree\nfour")
+        };
+
+        let result = HandleReadTool
+            .execute(
+                json!({"handle": handle, "range": {"start": 2, "end": 3}}),
+                &ctx,
+            )
+            .await
+            .expect("execute");
+        let body: Value = serde_json::from_str(&result.content).expect("json");
+        assert_eq!(body["content"], "two\nthree");
+        assert_eq!(body["meta"]["shown_start"], 2);
+        assert_eq!(body["meta"]["shown_end"], 3);
+    }
+
+    #[tokio::test]
+    async fn handle_read_counts_json_collections() {
+        let ctx = ctx();
+        let handle = {
+            let mut store = ctx.runtime.handle_store.lock().await;
+            store.insert_json("rlm:test", "items", json!([{"a": 1}, {"a": 2}]))
+        };
+
+        let result = HandleReadTool
+            .execute(json!({"handle": handle, "count": true}), &ctx)
+            .await
+            .expect("execute");
+        let body: Value = serde_json::from_str(&result.content).expect("json");
+        assert_eq!(body["json_type"], "array");
+        assert_eq!(body["length"], 2);
+    }
+
+    #[tokio::test]
+    async fn handle_read_projects_jsonpath_subset() {
+        let ctx = ctx();
+        let handle = {
+            let mut store = ctx.runtime.handle_store.lock().await;
+            store.insert_json(
+                "rlm:test",
+                "items",
+                json!({"items": [{"name": "a"}, {"name": "b"}]}),
+            )
+        };
+
+        let result = HandleReadTool
+            .execute(
+                json!({"handle": handle, "jsonpath": "$.items[*].name"}),
+                &ctx,
+            )
+            .await
+            .expect("execute");
+        let body: Value = serde_json::from_str(&result.content).expect("json");
+        assert_eq!(body["matches"], json!(["a", "b"]));
+        assert_eq!(body["count"], 2);
+    }
+
+    #[tokio::test]
+    async fn handle_read_rejects_unbounded_projection_requests() {
+        let ctx = ctx();
+        let handle = {
+            let mut store = ctx.runtime.handle_store.lock().await;
+            store.insert_text("rlm:test", "body", "abc")
+        };
+
+        let err = HandleReadTool
+            .execute(json!({"handle": handle}), &ctx)
+            .await
+            .expect_err("projection required");
+        assert!(err.to_string().contains("exactly one"));
+    }
+}
@@ -23,6 +23,7 @@ pub mod fim;
 pub mod git;
 pub mod git_history;
 pub mod github;
+pub mod handle;
 pub mod image_ocr;
 pub mod js_execution;
 pub mod large_output_router;
@@ -660,15 +660,22 @@ impl ToolRegistryBuilder {
        self.with_tool(Arc::new(RevertTurnTool))
    }

-    /// Include the RLM tool (`rlm`). Runs the full recursive language-model
-    /// loop on a long input (file or inline content); the long input never
-    /// enters the calling model's context window. The Python REPL exposes
-    /// `llm_query` / `llm_query_batched` / `rlm_query` / `rlm_query_batched`
-    /// helpers for sub-LLM work — that's where parallel fan-out belongs.
+    /// Include persistent RLM session tools.
    #[must_use]
-    pub fn with_rlm_tool(self, client: Option<DeepSeekClient>, root_model: String) -> Self {
-        use super::rlm::RlmTool;
-        self.with_tool(Arc::new(RlmTool::new(client, root_model)))
+    pub fn with_rlm_tool(self, client: Option<DeepSeekClient>, _root_model: String) -> Self {
+        use super::rlm::{RlmCloseTool, RlmConfigureTool, RlmEvalTool, RlmOpenTool};
+        self.with_tool(Arc::new(RlmOpenTool))
+            .with_tool(Arc::new(RlmEvalTool::new(client)))
+            .with_tool(Arc::new(RlmConfigureTool))
+            .with_tool(Arc::new(RlmCloseTool))
+    }
+
+    /// Include `handle_read`, the bounded projection reader for symbolic
+    /// `var_handle` payloads.
+    #[must_use]
+    pub fn with_handle_tools(self) -> Self {
+        use super::handle::HandleReadTool;
+        self.with_tool(Arc::new(HandleReadTool))
    }

    /// Include the review tool.
@@ -769,6 +776,7 @@ impl ToolRegistryBuilder {
            .with_test_runner_tool()
            .with_validation_tools()
            .with_tool_result_retrieval_tool()
+            .with_handle_tools()
            .with_runtime_task_tools()
            .with_revert_turn_tool()
            .with_pandoc_tools()
@@ -840,51 +848,14 @@ impl ToolRegistryBuilder {
        manager: super::subagent::SharedSubAgentManager,
        runtime: super::subagent::SubAgentRuntime,
    ) -> Self {
-        use super::subagent::{
-            AgentAssignTool, AgentCancelTool, AgentCloseTool, AgentListTool, AgentResultTool,
-            AgentResumeTool, AgentSendInputTool, AgentSpawnTool, AgentWaitTool,
-            DelegateToAgentTool,
-        };
+        use super::subagent::{AgentCloseTool, AgentEvalTool, AgentOpenTool};

-        self.with_tool(Arc::new(AgentSpawnTool::new(
+        self.with_tool(Arc::new(AgentOpenTool::new(
            manager.clone(),
            runtime.clone(),
        )))
-        .with_tool(Arc::new(AgentSpawnTool::with_name(
-            manager.clone(),
-            runtime.clone(),
-            "spawn_agent",
-        )))
-        .with_tool(Arc::new(DelegateToAgentTool::new(
-            manager.clone(),
-            runtime.clone(),
-        )))
-        .with_tool(Arc::new(AgentResultTool::new(manager.clone())))
-        .with_tool(Arc::new(AgentSendInputTool::new(
-            manager.clone(),
-            "send_input",
-        )))
-        .with_tool(Arc::new(AgentAssignTool::new(
-            manager.clone(),
-            "agent_assign",
-        )))
-        .with_tool(Arc::new(AgentAssignTool::new(
-            manager.clone(),
-            "assign_agent",
-        )))
-        .with_tool(Arc::new(AgentWaitTool::new(manager.clone(), "wait")))
-        .with_tool(Arc::new(AgentSendInputTool::new(
-            manager.clone(),
-            "agent_send_input",
-        )))
-        .with_tool(Arc::new(AgentWaitTool::new(manager.clone(), "agent_wait")))
-        .with_tool(Arc::new(AgentResumeTool::new(
-            manager.clone(),
-            runtime.clone(),
-        )))
-        .with_tool(Arc::new(AgentCloseTool::new(manager.clone())))
-        .with_tool(Arc::new(AgentCancelTool::new(manager.clone())))
-        .with_tool(Arc::new(AgentListTool::new(manager)))
+        .with_tool(Arc::new(AgentEvalTool::new(manager.clone())))
+        .with_tool(Arc::new(AgentCloseTool::new(manager)))
    }

    /// Build the registry with the given context.
@@ -97,6 +97,17 @@ impl ToolSpec for RevertTurnTool {
                    )
                })?
                .clone();
+            if repo
+                .work_tree_matches_snapshot(&target.id)
+                .map_err(|e| format!("Snapshot comparison failed: {e}"))?
+            {
+                return Err(format!(
+                    "NoSnapshotForTurn: target '{}' ({}) already matches the current workspace. \
+                     Revert operates at completed turn boundaries; there is no distinct later snapshot to restore.",
+                    target.label,
+                    short_sha(target.id.as_str()),
+                ));
+            }
            repo.restore(&target.id)
                .map_err(|e| format!("Restore failed: {e}"))?;
            Ok(format!(
@@ -189,6 +200,24 @@ mod tests {
        assert!(r.is_err());
    }

+    #[tokio::test]
+    async fn revert_turn_rejects_snapshot_matching_current_workspace() {
+        let tmp = tempdir().unwrap();
+        let workspace = tmp.path().join("ws");
+        std::fs::create_dir_all(&workspace).unwrap();
+        let _guard = scoped_home(tmp.path());
+
+        let repo = SnapshotRepo::open_or_init(&workspace).unwrap();
+        std::fs::write(workspace.join("a.txt"), b"unchanged").unwrap();
+        repo.snapshot("pre-turn:1").unwrap();
+
+        let tool = RevertTurnTool;
+        let ctx = ToolContext::new(workspace);
+        let r = tool.execute(json!({}), &ctx).await.expect("execute");
+        assert!(!r.success);
+        assert!(r.content.contains("NoSnapshotForTurn"), "{}", r.content);
+    }
+
    #[tokio::test]
    async fn revert_turn_no_snapshots_returns_error_result() {
        let tmp = tempdir().unwrap();
@@ -213,9 +213,16 @@ impl ToolSpec for GrepFilesTool {
            }
        }

-        // Build result
+        let matches_json: Vec<Value> = results
+            .iter()
+            .map(|item| grep_match_to_json(item, context_lines))
+            .collect();
+
+        // Build result. When context_lines == 1, return the single context
+        // line as a string instead of a one-item array. That keeps the common
+        // "show just the adjacent line" case easy for model callers to read.
        let result = json!({
-            "matches": results,
+            "matches": matches_json,
            "total_matches": total_matches,
            "files_searched": files_searched,
            "truncated": total_matches > max_results,
@@ -225,6 +232,20 @@ impl ToolSpec for GrepFilesTool {
    }
 }

+fn grep_match_to_json(item: &GrepMatch, context_lines: usize) -> Value {
+    if context_lines == 1 {
+        json!({
+            "file": item.file,
+            "line_number": item.line_number,
+            "line": item.line,
+            "context_before": item.context_before.first().cloned().unwrap_or_default(),
+            "context_after": item.context_after.first().cloned().unwrap_or_default(),
+        })
+    } else {
+        json!(item)
+    }
+}
+
 /// Collect files to search based on include/exclude patterns
 fn collect_files(
    root: &Path,
@@ -315,7 +336,7 @@ fn should_include(path: &str, patterns: &[String]) -> bool {

 /// Simple glob pattern matching
 /// Supports: * (any chars), ** (any path), ? (single char)
-fn matches_glob(path: &str, pattern: &str) -> bool {
+pub(crate) fn matches_glob(path: &str, pattern: &str) -> bool {
    // Handle ** for any path
    if pattern.contains("**") {
        let parts: Vec<&str> = pattern.split("**").collect();
@@ -502,6 +523,34 @@ mod tests {
        assert!(result.success);
        assert!(result.content.contains("line2")); // context before
        assert!(result.content.contains("line4")); // context after
+
+        let parsed: Value = serde_json::from_str(&result.content).unwrap();
+        let matches = parsed["matches"].as_array().unwrap();
+        assert_eq!(matches.len(), 1);
+        assert_eq!(matches[0]["context_before"], "line2");
+        assert_eq!(matches[0]["context_after"], "line4");
+        assert!(matches[0]["context_before"].is_string());
+        assert!(matches[0]["context_after"].is_string());
+    }
+
+    #[tokio::test]
+    async fn test_grep_files_multi_line_context_remains_arrays() {
+        let tmp = tempdir().expect("tempdir");
+        let ctx = ToolContext::new(tmp.path().to_path_buf());
+
+        fs::write(tmp.path().join("test.txt"), "a\nb\nMATCH\nd\ne\n").expect("write");
+
+        let tool = GrepFilesTool;
+        let result = tool
+            .execute(json!({"pattern": "MATCH", "context_lines": 2}), &ctx)
+            .await
+            .expect("execute");
+
+        let parsed: Value = serde_json::from_str(&result.content).unwrap();
+        let matches = parsed["matches"].as_array().unwrap();
+        assert_eq!(matches.len(), 1);
+        assert_eq!(matches[0]["context_before"], json!(["a", "b"]));
+        assert_eq!(matches[0]["context_after"], json!(["d", "e"]));
    }

    #[tokio::test]
@@ -1586,6 +1586,7 @@ async fn execute_foreground_via_background(
    command: &str,
    timeout_ms: u64,
    stdin_data: Option<&str>,
+    tty: bool,
    policy_override: Option<ExecutionSandboxPolicy>,
    extra_env: HashMap<String, String>,
 ) -> Result<ShellResult> {
@@ -1602,7 +1603,7 @@ async fn execute_foreground_via_background(
            timeout_ms,
            true,
            stdin_data,
-            false,
+            tty,
            policy_override,
            extra_env,
        )?
@@ -1706,6 +1707,10 @@ impl ToolSpec for ExecShellTool {
                "tty": {
                    "type": "boolean",
                    "description": "Allocate a pseudo-terminal for interactive programs (implies background)"
+                },
+                "combined_output": {
+                    "type": "boolean",
+                    "description": "Capture stdout and stderr as one chronological PTY stream (default false). In foreground mode, waits for completion; in background mode, implies tty."
                }
            },
            "required": ["command"]
@@ -1733,7 +1738,8 @@ impl ToolSpec for ExecShellTool {
        let timeout_ms = optional_u64(&input, "timeout_ms", 120_000).min(600_000);
        let background = optional_bool(&input, "background", false);
        let interactive = optional_bool(&input, "interactive", false);
-        let tty = optional_bool(&input, "tty", false);
+        let combined_output = optional_bool(&input, "combined_output", false);
+        let tty = optional_bool(&input, "tty", false) || (combined_output && background);
        let stdin_data = input
            .get("stdin")
            .or_else(|| input.get("input"))
@@ -1746,9 +1752,9 @@ impl ToolSpec for ExecShellTool {
                "Interactive commands cannot run in background mode.",
            ));
        }
-        if interactive && tty {
+        if interactive && (tty || combined_output) {
            return Ok(ToolResult::error(
-                "Interactive mode cannot be combined with TTY sessions.",
+                "Interactive mode cannot be combined with TTY or combined_output sessions.",
            ));
        }
        if interactive && stdin_data.is_some() {
@@ -1969,6 +1975,7 @@ impl ToolSpec for ExecShellTool {
                command,
                timeout_ms,
                stdin_data.as_deref(),
+                combined_output,
                policy_override,
                extra_env,
            )
@@ -2067,6 +2074,7 @@ impl ToolSpec for ExecShellTool {
                    "stderr_summary": stderr_summary,
                    "safety_level": format!("{:?}", safety.level),
                    "interactive": interactive,
+                    "combined_output": combined_output,
                    "canceled": was_cancelled,
                    "execpolicy": execpolicy_decision.as_ref().map(|decision| match decision {
                        ExecPolicyDecision::Allow => json!({
@@ -403,6 +403,29 @@ async fn test_exec_shell_metadata_includes_summaries() {
    assert!(meta.get("stdout_truncated").is_some());
 }

+#[cfg(not(windows))]
+#[tokio::test]
+async fn test_exec_shell_combined_output_uses_single_stream() {
+    let tmp = tempdir().expect("tempdir");
+    let ctx = ToolContext::new(tmp.path());
+    let tool = ExecShellTool;
+    let command = "printf 'out\\n'; printf 'err\\n' >&2";
+
+    let result = tool
+        .execute(json!({"command": command, "combined_output": true}), &ctx)
+        .await
+        .expect("execute");
+    assert!(result.success, "{}", result.content);
+    assert!(result.content.contains("out"), "{}", result.content);
+    assert!(result.content.contains("err"), "{}", result.content);
+
+    let meta = result.metadata.expect("metadata");
+    assert_eq!(
+        meta.get("combined_output").and_then(Value::as_bool),
+        Some(true)
+    );
+}
+
 #[tokio::test]
 async fn test_exec_shell_foreground_timeout_guides_background_rerun() {
    let tmp = tempdir().expect("tempdir");
@@ -16,7 +16,9 @@ use tokio_util::sync::CancellationToken;
 use crate::features::Features;
 use crate::lsp::LspManager;
 use crate::network_policy::NetworkPolicyDecider;
+use crate::rlm::session::{SharedRlmSessionStore, new_shared_rlm_session_store};
 use crate::sandbox::backend::SandboxBackend;
+use crate::tools::handle::{SharedHandleStore, new_shared_handle_store};
 use crate::tools::shell::{SharedShellManager, new_shared_shell_manager};
 #[allow(unused_imports)]
 pub use deepseek_tools::{
@@ -30,7 +32,7 @@ pub use deepseek_tools::{
 /// contexts keep working. Tools that need durable task/automation state fail
 /// closed with a clear "not available" error when the relevant service is not
 /// attached.
-#[derive(Clone, Default)]
+#[derive(Clone)]
 pub struct RuntimeToolServices {
    pub shell_manager: Option<SharedShellManager>,
    pub task_manager: Option<crate::task_manager::SharedTaskManager>,
@@ -42,6 +44,27 @@ pub struct RuntimeToolServices {
    /// tool-side hook events. `None` outside the live engine — test
    /// contexts that don't care about hooks get a no-op.
    pub hook_executor: Option<std::sync::Arc<crate::hooks::HookExecutor>>,
+    /// Per-session backing store for `var_handle` payloads. Cloned tool
+    /// contexts share this Arc so handles survive across turns.
+    pub handle_store: SharedHandleStore,
+    /// Per-session persistent RLM kernels, keyed by caller-chosen context name.
+    pub rlm_sessions: SharedRlmSessionStore,
+}
+
+impl Default for RuntimeToolServices {
+    fn default() -> Self {
+        Self {
+            shell_manager: None,
+            task_manager: None,
+            automations: None,
+            task_data_dir: None,
+            active_task_id: None,
+            active_thread_id: None,
+            hook_executor: None,
+            handle_store: new_shared_handle_store(),
+            rlm_sessions: new_shared_rlm_session_store(),
+        }
+    }
 }

 impl std::fmt::Debug for RuntimeToolServices {
@@ -54,6 +77,8 @@ impl std::fmt::Debug for RuntimeToolServices {
            .field("active_task_id", &self.active_task_id)
            .field("active_thread_id", &self.active_thread_id)
            .field("hook_executor", &self.hook_executor.is_some())
+            .field("handle_store", &true)
+            .field("rlm_sessions", &true)
            .finish()
    }
 }
@@ -3,6 +3,11 @@
 //! Provides tools to spawn background sub-agents, query their status,
 //! and retrieve results. Sub-agents run with a filtered toolset and
 //! inherit the workspace configuration from the main session.
+//!
+//! v0.8.33's new model-facing surface is `agent_open` / `agent_eval` /
+//! `agent_close`. Some older structs and manager helpers remain in this
+//! module while the durable runtime is being reused by the new surface.
+#![allow(dead_code)]

 use std::collections::{HashMap, VecDeque};
 use std::fs;
@@ -24,6 +29,7 @@ use crate::config::MAX_SUBAGENTS;
 use crate::core::events::Event;
 use crate::llm_client::LlmClient;
 use crate::models::{ContentBlock, Message, MessageRequest, SystemPrompt, Tool};
+use crate::tools::handle::VarHandle;
 use crate::tools::plan::{PlanState, SharedPlanState};
 use crate::tools::registry::{ToolRegistry, ToolRegistryBuilder};
 use crate::tools::spec::{
@@ -64,6 +70,7 @@ const TOOL_TIMEOUT: Duration = Duration::from_secs(30);
 const STEP_API_TIMEOUT: Duration = Duration::from_secs(120);
 const RESULT_POLL_INTERVAL: Duration = Duration::from_millis(250);
 const DEFAULT_RESULT_TIMEOUT_MS: u64 = 30_000;
+#[allow(dead_code)] // Legacy agent_wait clamp; new agent_eval uses DEFAULT/MAX.
 const MIN_WAIT_TIMEOUT_MS: u64 = 10_000;
 const MAX_RESULT_TIMEOUT_MS: u64 = 3_600_000;
 const COMPLETED_AGENT_RETENTION: Duration = Duration::from_secs(60 * 60);
@@ -407,7 +414,10 @@ pub enum SubAgentStatus {
 /// Snapshot of sub-agent state for tool results.
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct SubAgentResult {
+    pub name: String,
    pub agent_id: String,
+    pub context_mode: String,
+    pub fork_context: bool,
    pub agent_type: SubAgentType,
    pub assignment: SubAgentAssignment,
    #[serde(default)]
@@ -432,6 +442,7 @@ fn is_false(b: &bool) -> bool {

 #[derive(Debug, Clone, Default)]
 pub(crate) struct SubAgentSpawnOptions {
+    pub name: Option<String>,
    pub model: Option<String>,
    pub nickname: Option<String>,
    pub fork_context: bool,
@@ -452,6 +463,7 @@ impl WaitMode {
        }
    }

+    #[allow(dead_code)] // Legacy wait metadata while registry moves to agent_eval.
    fn as_str(self) -> &'static str {
        match self {
            Self::Any => "any",
@@ -479,6 +491,7 @@ struct SubAgentInput {

 #[derive(Debug, Clone)]
 struct SpawnRequest {
+    session_name: Option<String>,
    prompt: String,
    agent_type: SubAgentType,
    assignment: SubAgentAssignment,
@@ -497,6 +510,9 @@ struct SpawnRequest {
    /// When true, seed the child with the parent's system prompt and message
    /// prefix before appending the child task.
    fork_context: bool,
+    /// Optional recursion budget for descendants opened by this child.
+    /// `0` means the child may not call `agent_open` recursively.
+    max_depth: Option<u32>,
 }

 #[derive(Debug, Clone)]
@@ -511,6 +527,10 @@ struct AssignRequest {
 #[derive(Debug, Clone, Serialize, Deserialize)]
 struct PersistedSubAgent {
    id: String,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    session_name: Option<String>,
+    #[serde(default)]
+    fork_context: bool,
    agent_type: SubAgentType,
    prompt: String,
    assignment: SubAgentAssignment,
@@ -789,6 +809,8 @@ impl SubAgentRuntime {
 /// A running sub-agent instance.
 pub struct SubAgent {
    pub id: String,
+    pub session_name: String,
+    pub fork_context: bool,
    pub agent_type: SubAgentType,
    pub prompt: String,
    pub assignment: SubAgentAssignment,
@@ -824,9 +846,12 @@ impl SubAgent {
        session_boot_id: String,
    ) -> Self {
        let id = format!("agent_{}", &Uuid::new_v4().to_string()[..8]);
+        let session_name = id.clone();

        Self {
            id,
+            session_name,
+            fork_context: false,
            agent_type,
            prompt,
            assignment,
@@ -847,7 +872,10 @@ impl SubAgent {
    #[must_use]
    pub fn snapshot(&self) -> SubAgentResult {
        SubAgentResult {
+            name: self.session_name.clone(),
            agent_id: self.id.clone(),
+            context_mode: if self.fork_context { "forked" } else { "fresh" }.to_string(),
+            fork_context: self.fork_context,
            agent_type: self.agent_type.clone(),
            assignment: self.assignment.clone(),
            model: self.model.clone(),
@@ -928,6 +956,8 @@ impl SubAgentManager {
        for agent in self.agents.values() {
            agents.push(PersistedSubAgent {
                id: agent.id.clone(),
+                session_name: Some(agent.session_name.clone()),
+                fork_context: agent.fork_context,
                agent_type: agent.agent_type.clone(),
                prompt: agent.prompt.clone(),
                assignment: agent.assignment.clone(),
@@ -998,6 +1028,11 @@ impl SubAgentManager {
            };
            let agent = SubAgent {
                id: persisted.id.clone(),
+                session_name: persisted
+                    .session_name
+                    .filter(|name| !name.trim().is_empty())
+                    .unwrap_or_else(|| persisted.id.clone()),
+                fork_context: persisted.fork_context,
                agent_type: persisted.agent_type,
                prompt: persisted.prompt,
                assignment: persisted.assignment,
@@ -1126,6 +1161,22 @@ impl SubAgentManager {
            input_tx,
            self.current_session_boot_id.clone(),
        );
+        if let Some(name) = options
+            .name
+            .as_deref()
+            .map(str::trim)
+            .filter(|name| !name.is_empty())
+        {
+            if self
+                .agents
+                .values()
+                .any(|existing| existing.session_name == name)
+            {
+                return Err(anyhow!("Sub-agent session name '{name}' is already in use"));
+            }
+            agent.session_name = name.to_string();
+        }
+        agent.fork_context = options.fork_context;
        let agent_id = agent.id.clone();
        let started_at = agent.started_at;
        let max_steps = self.max_steps;
@@ -1175,6 +1226,29 @@ impl SubAgentManager {
        Ok(agent.snapshot())
    }

+    /// Resolve either a durable agent id or a model-facing session name.
+    fn resolve_agent_ref(&self, agent_ref: &str) -> Result<String> {
+        let agent_ref = agent_ref.trim();
+        if self.agents.contains_key(agent_ref) {
+            return Ok(agent_ref.to_string());
+        }
+
+        let matches = self
+            .agents
+            .values()
+            .filter(|agent| agent.session_name == agent_ref)
+            .map(|agent| agent.id.clone())
+            .collect::<Vec<_>>();
+
+        match matches.as_slice() {
+            [id] => Ok(id.clone()),
+            [] => Err(anyhow!("Agent session {agent_ref} not found")),
+            _ => Err(anyhow!(
+                "Agent session name '{agent_ref}' is ambiguous; use an agent_id"
+            )),
+        }
+    }
+
    /// Cancel a running sub-agent.
    pub fn cancel(&mut self, agent_id: &str) -> Result<SubAgentResult> {
        let (snapshot, changed) = {
@@ -1202,6 +1276,7 @@ impl SubAgentManager {
    }

    /// Resume a non-running sub-agent by restarting it with the original assignment.
+    #[allow(dead_code)] // Legacy agent_resume path; retained until registry migration.
    pub fn resume(
        &mut self,
        manager_handle: SharedSubAgentManager,
@@ -1505,6 +1580,86 @@ impl SubAgentManager {
 /// Thread-safe wrapper for `SubAgentManager`.
 pub type SharedSubAgentManager = Arc<RwLock<SubAgentManager>>;

+/// Model-facing session projection returned by the v0.8.33 sub-agent API.
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct SubAgentSessionProjection {
+    pub name: String,
+    pub agent_id: String,
+    pub status: String,
+    pub terminal: bool,
+    pub context_mode: String,
+    pub fork_context: bool,
+    pub prefix_cache: SubAgentPrefixCacheProjection,
+    pub transcript_handle: VarHandle,
+    pub snapshot: SubAgentResult,
+    #[serde(default, skip_serializing_if = "is_false")]
+    pub timed_out: bool,
+}
+
+#[derive(Debug, Clone, Serialize, Deserialize)]
+pub struct SubAgentPrefixCacheProjection {
+    pub mode: String,
+    pub parent_prefix: String,
+    pub deepseek_prefix_cache_reuse: String,
+}
+
+fn subagent_prefix_cache_projection(snapshot: &SubAgentResult) -> SubAgentPrefixCacheProjection {
+    if snapshot.fork_context {
+        SubAgentPrefixCacheProjection {
+            mode: "forked".to_string(),
+            parent_prefix: "preserved_byte_identical_when_available".to_string(),
+            deepseek_prefix_cache_reuse: "optimized_for_existing_parent_prefill".to_string(),
+        }
+    } else {
+        SubAgentPrefixCacheProjection {
+            mode: "fresh".to_string(),
+            parent_prefix: "not_inherited".to_string(),
+            deepseek_prefix_cache_reuse: "independent_child_prefill".to_string(),
+        }
+    }
+}
+
+async fn subagent_session_projection(
+    snapshot: SubAgentResult,
+    timed_out: bool,
+    context: &ToolContext,
+) -> SubAgentSessionProjection {
+    let transcript_payload = json!({
+        "kind": "subagent_session_snapshot",
+        "agent_id": snapshot.agent_id.clone(),
+        "name": snapshot.name.clone(),
+        "status": subagent_status_name(&snapshot.status),
+        "context_mode": snapshot.context_mode.clone(),
+        "fork_context": snapshot.fork_context,
+        "result": snapshot.result.clone(),
+        "steps_taken": snapshot.steps_taken,
+        "duration_ms": snapshot.duration_ms,
+        "assignment": snapshot.assignment.clone(),
+        "snapshot": snapshot.clone(),
+    });
+    let transcript_handle = {
+        let mut store = context.runtime.handle_store.lock().await;
+        store.insert_json(
+            format!("agent:{}", snapshot.agent_id),
+            "transcript",
+            transcript_payload,
+        )
+    };
+
+    SubAgentSessionProjection {
+        name: snapshot.name.clone(),
+        agent_id: snapshot.agent_id.clone(),
+        status: subagent_status_name(&snapshot.status).to_string(),
+        terminal: snapshot.status != SubAgentStatus::Running,
+        context_mode: snapshot.context_mode.clone(),
+        fork_context: snapshot.fork_context,
+        prefix_cache: subagent_prefix_cache_projection(&snapshot),
+        transcript_handle,
+        snapshot,
+        timed_out,
+    }
+}
+
 fn default_state_path(workspace: &Path) -> PathBuf {
    workspace
        .join(".deepseek")
@@ -1552,6 +1707,143 @@ pub fn new_shared_subagent_manager(workspace: PathBuf, max_agents: usize) -> Sha

 // === Tool Implementations ===

+/// Open a named background sub-agent session.
+#[allow(dead_code)] // Registered by the adjacent v0.8.33 registry surface update.
+pub struct AgentOpenTool {
+    manager: SharedSubAgentManager,
+    runtime: SubAgentRuntime,
+}
+
+impl AgentOpenTool {
+    #[allow(dead_code)] // Registered by the adjacent v0.8.33 registry surface update.
+    #[must_use]
+    pub fn new(manager: SharedSubAgentManager, runtime: SubAgentRuntime) -> Self {
+        Self { manager, runtime }
+    }
+}
+
+#[async_trait]
+impl ToolSpec for AgentOpenTool {
+    fn name(&self) -> &'static str {
+        "agent_open"
+    }
+
+    fn description(&self) -> &'static str {
+        concat!(
+            "Open a named child sub-agent session for focused background work. Returns the session name, status, agent_id, context_mode, prefix_cache metadata, and a handle_read-compatible transcript_handle. ",
+            "Use agent_eval to fetch or wait on the session, and agent_close to cancel/close it.\n\n",
+            "Context control is explicit: omit fork_context or set it false for a fresh child with an independent prefill; set fork_context=true for perspective fanout over the current parent context. ",
+            "Forked children preserve the parent system prompt and leading message prefix byte-identically where the runtime has that prefix, so DeepSeek can reuse its prefix cache before the child-specific task is appended.\n\n",
+            "Sub-agent results are self-reports. Re-verify claimed side effects such as file edits, commands, network writes, tests, or git operations before reporting them as facts."
+        )
+    }
+
+    fn input_schema(&self) -> Value {
+        json!({
+            "type": "object",
+            "properties": {
+                "name": {
+                    "type": "string",
+                    "description": "Stable model-facing session name. Defaults to the generated agent_id when omitted."
+                },
+                "session_name": {
+                    "type": "string",
+                    "description": "Alias for name"
+                },
+                "prompt": {
+                    "type": "string",
+                    "description": "Initial task description for the child session"
+                },
+                "message": {
+                    "type": "string",
+                    "description": "Alias for prompt"
+                },
+                "objective": {
+                    "type": "string",
+                    "description": "Alias for prompt"
+                },
+                "items": {
+                    "type": "array",
+                    "description": "Structured input items (text, mention, skill, local_image, image)",
+                    "items": { "type": "object" }
+                },
+                "type": {
+                    "type": "string",
+                    "description": "Sub-agent type: general, explore, plan, review, implementer, verifier, custom"
+                },
+                "agent_type": {
+                    "type": "string",
+                    "description": "Alias for type"
+                },
+                "role": {
+                    "type": "string",
+                    "description": "Role alias: worker, explorer, awaiter, default"
+                },
+                "agent_role": {
+                    "type": "string",
+                    "description": "Alias for role"
+                },
+                "allowed_tools": {
+                    "type": "array",
+                    "items": { "type": "string" },
+                    "description": "Explicit tool allowlist (required for custom type)"
+                },
+                "model": {
+                    "type": "string",
+                    "description": "Optional DeepSeek model id for this child"
+                },
+                "cwd": {
+                    "type": "string",
+                    "description": "Optional working directory for the child; must be inside the parent workspace"
+                },
+                "resident_file": {
+                    "type": "string",
+                    "description": "Optional file path for cache-aware resident mode"
+                },
+                "fork_context": {
+                    "type": "boolean",
+                    "description": "false (default): fresh child with independent context/prefill. true: forked child that preserves the parent's byte-identical system/message prefix where available, then appends this task for DeepSeek prefix-cache reuse."
+                },
+                "max_depth": {
+                    "type": "integer",
+                    "minimum": 0,
+                    "maximum": 3,
+                    "description": "Recursive child-agent budget for this session. 0 blocks agent_open from the child; 1-3 allow that many descendant levels."
+                }
+            }
+        })
+    }
+
+    fn capabilities(&self) -> Vec<ToolCapability> {
+        vec![
+            ToolCapability::ExecutesCode,
+            ToolCapability::RequiresApproval,
+        ]
+    }
+
+    fn approval_requirement(&self) -> ApprovalRequirement {
+        ApprovalRequirement::Required
+    }
+
+    async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> {
+        let spawn_tool = AgentSpawnTool::new(self.manager.clone(), self.runtime.clone());
+        let result = spawn_tool.execute(input, context).await?;
+        let snapshot: SubAgentResult = serde_json::from_str(&result.content).map_err(|e| {
+            ToolError::execution_failed(format!("agent_open projection failed: {e}"))
+        })?;
+        let projection = subagent_session_projection(snapshot, false, context).await;
+        let mut tool_result = ToolResult::json(&projection)
+            .map_err(|e| ToolError::execution_failed(e.to_string()))?;
+        tool_result.metadata = Some(json!({
+            "status": projection.status,
+            "terminal": projection.terminal,
+            "context_mode": projection.context_mode,
+            "prefix_cache": projection.prefix_cache,
+        }));
+        Ok(tool_result)
+    }
+}
+
 /// Tool to spawn a background sub-agent.
 pub struct AgentSpawnTool {
    manager: SharedSubAgentManager,
@@ -1725,6 +2017,9 @@ impl ToolSpec for AgentSpawnTool {
        // optionally overrides cwd if the caller passed one (used for the
        // parallel-worktree pattern).
        let mut child_runtime = self.runtime.background_runtime();
+        if let Some(max_depth) = spawn_request.max_depth {
+            child_runtime.max_spawn_depth = child_runtime.spawn_depth.saturating_add(max_depth);
+        }
        if let Some(cwd) = validated_cwd {
            child_runtime.context.workspace = cwd;
        }
@@ -1790,6 +2085,7 @@ impl ToolSpec for AgentSpawnTool {
                spawn_request.assignment,
                spawn_request.allowed_tools,
                SubAgentSpawnOptions {
+                    name: spawn_request.session_name.clone(),
                    model: Some(effective_model),
                    nickname: None,
                    fork_context: spawn_request.fork_context,
@@ -1843,13 +2139,143 @@ impl ToolSpec for AgentSpawnTool {
    }
 }

+/// Evaluate/fetch a child session boundary for the v0.8.33 sub-agent API.
+#[allow(dead_code)] // Registered by the adjacent v0.8.33 registry surface update.
+pub struct AgentEvalTool {
+    manager: SharedSubAgentManager,
+}
+
+impl AgentEvalTool {
+    #[allow(dead_code)] // Registered by the adjacent v0.8.33 registry surface update.
+    #[must_use]
+    pub fn new(manager: SharedSubAgentManager) -> Self {
+        Self { manager }
+    }
+}
+
+#[async_trait]
+impl ToolSpec for AgentEvalTool {
+    fn name(&self) -> &'static str {
+        "agent_eval"
+    }
+
+    fn description(&self) -> &'static str {
+        "Fetch or wait on a child sub-agent session. Optionally deliver a message/items to a running session, then return the latest session projection. With block=true (default), waits for the session to reach a terminal boundary; block=false is a non-blocking status fetch."
+    }
+
+    fn input_schema(&self) -> Value {
+        json!({
+            "type": "object",
+            "properties": {
+                "name": {
+                    "type": "string",
+                    "description": "Session name returned by agent_open"
+                },
+                "agent_id": {
+                    "type": "string",
+                    "description": "Generated agent id returned by agent_open"
+                },
+                "id": {
+                    "type": "string",
+                    "description": "Alias for agent_id"
+                },
+                "message": {
+                    "type": "string",
+                    "description": "Optional message to deliver before evaluating the session"
+                },
+                "input": {
+                    "type": "string",
+                    "description": "Alias for message"
+                },
+                "items": {
+                    "type": "array",
+                    "description": "Structured input items (text, mention, skill, local_image, image)",
+                    "items": { "type": "object" }
+                },
+                "interrupt": {
+                    "type": "boolean",
+                    "description": "When sending input, prioritize it over pending inputs"
+                },
+                "block": {
+                    "type": "boolean",
+                    "description": "Wait for a terminal boundary before returning (default true)"
+                },
+                "timeout_ms": {
+                    "type": "integer",
+                    "description": "Max wait time in milliseconds (default: 30000, clamped to 1000-3600000)"
+                }
+            }
+        })
+    }
+
+    fn capabilities(&self) -> Vec<ToolCapability> {
+        vec![ToolCapability::ReadOnly]
+    }
+
+    async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> {
+        let agent_ref = input
+            .get("name")
+            .or_else(|| input.get("agent_id"))
+            .or_else(|| input.get("id"))
+            .and_then(Value::as_str)
+            .map(str::trim)
+            .filter(|value| !value.is_empty())
+            .ok_or_else(|| ToolError::missing_field("name"))?;
+        let message = parse_optional_text_or_items(&input, &["message", "input"], "items")?;
+        let interrupt = optional_bool(&input, "interrupt", false);
+        let block = optional_bool(&input, "block", true);
+        let timeout_ms = optional_u64(&input, "timeout_ms", DEFAULT_RESULT_TIMEOUT_MS)
+            .clamp(1000, MAX_RESULT_TIMEOUT_MS);
+
+        let agent_id = {
+            let manager = self.manager.read().await;
+            manager
+                .resolve_agent_ref(agent_ref)
+                .map_err(|e| ToolError::execution_failed(e.to_string()))?
+        };
+
+        if let Some(message) = message {
+            let mut manager = self.manager.write().await;
+            manager
+                .send_input(&agent_id, message, interrupt)
+                .map_err(|e| ToolError::execution_failed(e.to_string()))?;
+        }
+
+        let (snapshot, timed_out) = if block {
+            wait_for_result(&self.manager, &agent_id, Duration::from_millis(timeout_ms)).await?
+        } else {
+            let manager = self.manager.read().await;
+            (
+                manager
+                    .get_result(&agent_id)
+                    .map_err(|e| ToolError::execution_failed(e.to_string()))?,
+                false,
+            )
+        };
+
+        let projection = subagent_session_projection(snapshot, timed_out, context).await;
+        let mut result = ToolResult::json(&projection)
+            .map_err(|e| ToolError::execution_failed(e.to_string()))?;
+        result.metadata = Some(json!({
+            "status": if timed_out { "TimedOut".to_string() } else { projection.status.clone() },
+            "timed_out": timed_out,
+            "terminal": projection.terminal,
+            "context_mode": projection.context_mode,
+            "timeout_ms": timeout_ms
+        }));
+        Ok(result)
+    }
+}
+
 /// Tool to fetch a sub-agent's result.
+#[allow(dead_code)] // Legacy surface superseded by agent_eval.
 pub struct AgentResultTool {
    manager: SharedSubAgentManager,
 }

 impl AgentResultTool {
    /// Create a new result tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_eval.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager) -> Self {
        Self { manager }
@@ -1933,12 +2359,14 @@ impl ToolSpec for AgentResultTool {
 }

 /// Tool to cancel a sub-agent.
+#[allow(dead_code)] // Legacy surface superseded by agent_close.
 pub struct AgentCancelTool {
    manager: SharedSubAgentManager,
 }

 impl AgentCancelTool {
    /// Create a new cancel tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_close.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager) -> Self {
        Self { manager }
@@ -1991,6 +2419,7 @@ impl ToolSpec for AgentCancelTool {
 }

 /// Tool to list all sub-agents.
+#[allow(dead_code)] // Legacy surface superseded by named agent_open/eval/close sessions.
 pub struct AgentListTool {
    manager: SharedSubAgentManager,
 }
@@ -2011,11 +2440,11 @@ impl AgentCloseTool {
 #[async_trait]
 impl ToolSpec for AgentCloseTool {
    fn name(&self) -> &'static str {
-        "close_agent"
+        "agent_close"
    }

    fn description(&self) -> &'static str {
-        "Close a running sub-agent. Alias for agent_cancel."
+        "Close a child sub-agent session by cancelling it if still running. Returns the final session projection with transcript_handle metadata."
    }

    fn input_schema(&self) -> Value {
@@ -2024,7 +2453,11 @@ impl ToolSpec for AgentCloseTool {
            "properties": {
                "id": {
                    "type": "string",
-                    "description": "Agent id returned by agent_spawn"
+                    "description": "Agent id returned by agent_open"
+                },
+                "name": {
+                    "type": "string",
+                    "description": "Session name returned by agent_open"
                },
                "agent_id": {
                    "type": "string",
@@ -2045,27 +2478,30 @@ impl ToolSpec for AgentCloseTool {
        ApprovalRequirement::Required
    }

-    async fn execute(&self, input: Value, _context: &ToolContext) -> Result<ToolResult, ToolError> {
+    async fn execute(&self, input: Value, context: &ToolContext) -> Result<ToolResult, ToolError> {
        let agent_id = input
-            .get("id")
+            .get("name")
+            .or_else(|| input.get("id"))
            .or_else(|| input.get("agent_id"))
            .and_then(|v| v.as_str())
            .ok_or_else(|| ToolError::missing_field("id"))?;
+        let agent_id = {
+            let manager = self.manager.read().await;
+            manager
+                .resolve_agent_ref(agent_id)
+                .map_err(|e| ToolError::execution_failed(e.to_string()))?
+        };
        let mut manager = self.manager.write().await;
        let result = manager
-            .cancel(agent_id)
+            .cancel(&agent_id)
            .map_err(|e| ToolError::execution_failed(format!("Failed to close sub-agent: {e}")))?;
-        let tool_result =
-            ToolResult::json(&result).map_err(|e| ToolError::execution_failed(e.to_string()))?;
-        Ok(wrap_with_deprecation_notice(
-            tool_result,
-            "close_agent",
-            "agent_cancel",
-        ))
+        let projection = subagent_session_projection(result, false, context).await;
+        ToolResult::json(&projection).map_err(|e| ToolError::execution_failed(e.to_string()))
    }
 }

 /// Tool to resume an existing sub-agent.
+#[allow(dead_code)] // Legacy surface superseded by agent_open/eval.
 pub struct AgentResumeTool {
    manager: SharedSubAgentManager,
    runtime: SubAgentRuntime,
@@ -2073,6 +2509,7 @@ pub struct AgentResumeTool {

 impl AgentResumeTool {
    /// Create a new resume tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_open/eval.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager, runtime: SubAgentRuntime) -> Self {
        Self { manager, runtime }
@@ -2132,6 +2569,7 @@ impl ToolSpec for AgentResumeTool {

 impl AgentListTool {
    /// Create a new list tool.
+    #[allow(dead_code)] // Legacy surface superseded by named agent_open/eval/close sessions.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager) -> Self {
        Self { manager }
@@ -2181,6 +2619,7 @@ impl ToolSpec for AgentListTool {
 }

 /// Tool to send input to a running sub-agent.
+#[allow(dead_code)] // Legacy surface superseded by agent_eval.
 pub struct AgentSendInputTool {
    manager: SharedSubAgentManager,
    name: &'static str,
@@ -2188,6 +2627,7 @@ pub struct AgentSendInputTool {

 impl AgentSendInputTool {
    /// Create a new send-input tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_eval.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager, name: &'static str) -> Self {
        Self { manager, name }
@@ -2277,6 +2717,7 @@ impl ToolSpec for AgentSendInputTool {
 }

 /// Tool to update assignment metadata for a sub-agent.
+#[allow(dead_code)] // Legacy surface superseded by agent_eval/open metadata.
 pub struct AgentAssignTool {
    manager: SharedSubAgentManager,
    name: &'static str,
@@ -2284,6 +2725,7 @@ pub struct AgentAssignTool {

 impl AgentAssignTool {
    /// Create a new assignment tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_eval/open metadata.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager, name: &'static str) -> Self {
        Self { manager, name }
@@ -2371,6 +2813,7 @@ impl ToolSpec for AgentAssignTool {
 }

 /// Tool to wait for sub-agents to complete.
+#[allow(dead_code)] // Legacy surface superseded by agent_eval.
 pub struct AgentWaitTool {
    manager: SharedSubAgentManager,
    name: &'static str,
@@ -2378,6 +2821,7 @@ pub struct AgentWaitTool {

 impl AgentWaitTool {
    /// Create a new wait tool.
+    #[allow(dead_code)] // Legacy surface superseded by agent_eval.
    #[must_use]
    pub fn new(manager: SharedSubAgentManager, name: &'static str) -> Self {
        Self { manager, name }
@@ -2517,6 +2961,7 @@ impl ToolSpec for AgentWaitTool {

 /// Compatibility delegate tool. It routes through `agent_spawn`, but defaults
 /// to `fork_context=true` because delegation is usually continuation work.
+#[allow(dead_code)] // Legacy alias superseded by agent_open(fork_context=true).
 pub struct DelegateToAgentTool {
    manager: SharedSubAgentManager,
    runtime: SubAgentRuntime,
@@ -2524,6 +2969,7 @@ pub struct DelegateToAgentTool {

 impl DelegateToAgentTool {
    /// Create a new delegation tool.
+    #[allow(dead_code)] // Legacy alias superseded by agent_open(fork_context=true).
    #[must_use]
    pub fn new(manager: SharedSubAgentManager, runtime: SubAgentRuntime) -> Self {
        Self { manager, runtime }
@@ -2855,7 +3301,8 @@ async fn run_subagent(
    mut input_rx: mpsc::UnboundedReceiver<SubAgentInput>,
 ) -> Result<SubAgentResult> {
    let system_prompt = build_subagent_system_prompt(&agent_type, &assignment);
-    let fork_context = fork_context
+    let fork_context_enabled = fork_context;
+    let fork_context = fork_context_enabled
        .then_some(runtime.fork_context.as_ref())
        .flatten();
    let request_system = subagent_request_system_prompt(&system_prompt, fork_context);
@@ -2911,7 +3358,15 @@ async fn run_subagent(
                });
            }
            return Ok(SubAgentResult {
+                name: agent_id.clone(),
                agent_id: agent_id.clone(),
+                context_mode: if fork_context_enabled {
+                    "forked"
+                } else {
+                    "fresh"
+                }
+                .to_string(),
+                fork_context: fork_context_enabled,
                agent_type: agent_type.clone(),
                assignment: assignment.clone(),
                model: runtime.model.clone(),
@@ -2984,7 +3439,10 @@ async fn run_subagent(
                    });
                }
                return Ok(SubAgentResult {
+                    name: agent_id.clone(),
                    agent_id: agent_id.clone(),
+                    context_mode: if fork_context_enabled { "forked" } else { "fresh" }.to_string(),
+                    fork_context: fork_context_enabled,
                    agent_type: agent_type.clone(),
                    assignment: assignment.clone(),
                    model: runtime.model.clone(),
@@ -3121,7 +3579,15 @@ async fn run_subagent(
    release_resident_leases_for(&agent_id);

    Ok(SubAgentResult {
+        name: agent_id.clone(),
        agent_id,
+        context_mode: if fork_context_enabled {
+            "forked"
+        } else {
+            "fresh"
+        }
+        .to_string(),
+        fork_context: fork_context_enabled,
        agent_type,
        assignment,
        model: runtime.model.clone(),
@@ -3160,6 +3626,7 @@ async fn wait_for_result(
    }
 }

+#[allow(dead_code)] // Legacy agent_wait helper; agent_eval uses wait_for_result.
 async fn wait_for_agents(
    manager: &SharedSubAgentManager,
    ids: &[String],
@@ -3369,6 +3836,9 @@ fn parse_spawn_request(input: &Value) -> Result<SpawnRequest, ToolError> {
        "items",
        "prompt",
    )?;
+    let session_name = optional_input_str(input, &["name", "session_name"])
+        .map(validate_session_name)
+        .transpose()?;

    let type_input = optional_input_str(input, &["type", "agent_type", "agent_name"]);
    let role_input = optional_input_str(input, &["role", "agent_role"]);
@@ -3444,8 +3914,28 @@ fn parse_spawn_request(input: &Value) -> Result<SpawnRequest, ToolError> {
    let fork_context =
        parse_optional_bool(input, &["fork_context", "forkContext", "inherit_context"])
            .unwrap_or(false);
+    let max_depth = input
+        .get("max_depth")
+        .or_else(|| input.get("maxDepth"))
+        .or_else(|| input.get("max_spawn_depth"))
+        .and_then(Value::as_u64)
+        .map(|depth| {
+            u32::try_from(depth)
+                .map_err(|_| ToolError::invalid_input("max_depth must be between 0 and 3"))
+                .and_then(|depth| {
+                    if depth <= 3 {
+                        Ok(depth)
+                    } else {
+                        Err(ToolError::invalid_input(
+                            "max_depth must be between 0 and 3",
+                        ))
+                    }
+                })
+        })
+        .transpose()?;

    Ok(SpawnRequest {
+        session_name,
        prompt: prompt.clone(),
        agent_type,
        assignment: SubAgentAssignment::new(prompt, role),
@@ -3454,9 +3944,31 @@ fn parse_spawn_request(input: &Value) -> Result<SpawnRequest, ToolError> {
        cwd,
        resident_file,
        fork_context,
+        max_depth,
    })
 }

+fn validate_session_name(name: &str) -> Result<String, ToolError> {
+    let trimmed = name.trim();
+    if trimmed.is_empty() {
+        return Err(ToolError::invalid_input("name cannot be blank"));
+    }
+    if trimmed.chars().any(char::is_whitespace) {
+        return Err(ToolError::invalid_input(
+            "name must not contain whitespace; use letters, numbers, '-', '_', or '.'",
+        ));
+    }
+    if !trimmed
+        .chars()
+        .all(|ch| ch.is_ascii_alphanumeric() || matches!(ch, '-' | '_' | '.'))
+    {
+        return Err(ToolError::invalid_input(
+            "name may only contain ASCII letters, numbers, '-', '_', or '.'",
+        ));
+    }
+    Ok(trimmed.to_string())
+}
+
 fn parse_optional_bool(input: &Value, names: &[&str]) -> Option<bool> {
    names
        .iter()
@@ -3983,9 +4495,10 @@ const GENERAL_AGENT_INTRO: &str = concat!(
 );

 const EXPLORE_AGENT_INTRO: &str = concat!(
-    "You are an exploration sub-agent. Map the relevant code quickly and stay read-only.\n",
-    "Use list_dir/file_search, grep_files, and read_file; stop once evidence is sufficient.\n",
-    "EVIDENCE is load-bearing: cite `path:line-range` for each finding.\n",
+    "You are an exploration sub-agent (role: `explore`). Map the relevant code quickly and stay read-only.\n",
+    "Orient first: confirm the workspace/project root, read relevant AGENTS.md/README guidance when the tree is unfamiliar, then search only the likely scope.\n",
+    "Use list_dir/file_search, grep_files, and read_file; use RLM only for long inputs or many semantic slices, not basic path discovery.\n",
+    "DeepSeek V4 can hold broad evidence, but your value is compressed reconnaissance: cite `path:line-range` for each finding and stop once evidence is sufficient.\n",
    "CHANGES will almost always be \"None.\" for an explorer.\n\n"
 );

@@ -7,7 +7,10 @@ fn make_assignment() -> SubAgentAssignment {

 fn make_snapshot(status: SubAgentStatus) -> SubAgentResult {
    SubAgentResult {
+        name: "agent_test".to_string(),
        agent_id: "agent_test".to_string(),
+        context_mode: "fresh".to_string(),
+        fork_context: false,
        agent_type: SubAgentType::General,
        assignment: make_assignment(),
        model: "deepseek-v4-flash".to_string(),
@@ -176,36 +179,43 @@ fn test_agent_type_prompts_include_shared_output_contract_once() {
 }

 #[test]
-fn agent_spawn_description_warns_parent_to_verify_self_reports_within_budget() {
+fn explore_prompt_orients_before_searching() {
+    let prompt = SubAgentType::Explore.system_prompt();
+    assert!(prompt.contains("role: `explore`"));
+    assert!(prompt.contains("AGENTS.md/README"));
+    assert!(prompt.contains("workspace/project root"));
+    assert!(prompt.contains("compressed reconnaissance"));
+}
+
+#[test]
+fn agent_open_description_explains_fresh_vs_forked_context_and_trust_model() {
    let tmp = tempdir().expect("tempdir");
    let manager = new_shared_subagent_manager(tmp.path().to_path_buf(), 1);
-    let tool = AgentSpawnTool::new(manager, stub_runtime());
+    let tool = AgentOpenTool::new(manager, stub_runtime());
    let description = tool.description();

-    assert!(
-        description
-            .contains("## Trust model: subagent results are self-reports, not verified facts")
-    );
-    assert!(description.contains("`agent_result` returns the child's narrative summary"));
-    assert!(description.contains("| Side effect | Re-verify with |"));
-    assert!(description.contains("If the child returns a verifiable handle"));
-    for row in [
-        "| URL claimed posted/written | `fetch_url` and check the response |",
-        "| File claimed created | `read_file` or `list_dir` |",
-        "| File claimed edited | `read_file` and check the change is present |",
-        "| HTTP POST/PUT response | inspect status code and body |",
-        "| Git operation | `git_status` / `git_diff` |",
-        "| Test claimed passing | `run_tests` |",
-        "| Process claimed started | `exec_shell` (e.g. `pgrep`, `lsof -i`) |",
-    ] {
-        assert!(description.contains(row));
-    }
+    assert!(description.contains("fresh child with an independent prefill"));
+    assert!(description.contains("fork_context=true"));
+    assert!(description.contains("byte-identically"));
+    assert!(description.contains("DeepSeek can reuse its prefix cache"));
+    assert!(description.contains("Sub-agent results are self-reports"));
    assert!(
        estimate_tool_description_tokens_conservative(description) <= 1024,
-        "agent_spawn description exceeds the conservative 1024-token budget"
+        "agent_open description exceeds the conservative 1024-token budget"
    );
 }

+#[test]
+fn new_session_tools_use_open_eval_close_names() {
+    let manager = Arc::new(RwLock::new(SubAgentManager::new(PathBuf::from("."), 1)));
+    assert_eq!(
+        AgentOpenTool::new(manager.clone(), stub_runtime()).name(),
+        "agent_open"
+    );
+    assert_eq!(AgentEvalTool::new(manager.clone()).name(), "agent_eval");
+    assert_eq!(AgentCloseTool::new(manager).name(), "agent_close");
+}
+
 #[test]
 fn test_implementer_allowed_tools_include_writes() {
    // Implementer is the write-heavy role; the deprecated
@@ -288,6 +298,66 @@ fn test_parse_spawn_request_accepts_fork_context() {
    assert!(parsed.fork_context);
 }

+#[test]
+fn test_parse_spawn_request_accepts_session_name_for_agent_open() {
+    let input = json!({
+        "name": "review.parser",
+        "prompt": "inspect parser",
+        "fork_context": true,
+        "max_depth": 0
+    });
+    let parsed = parse_spawn_request(&input).expect("open request should parse");
+    assert_eq!(parsed.session_name.as_deref(), Some("review.parser"));
+    assert!(parsed.fork_context);
+    assert_eq!(parsed.max_depth, Some(0));
+}
+
+#[test]
+fn test_parse_spawn_request_rejects_invalid_session_name() {
+    let input = json!({
+        "name": "bad name",
+        "prompt": "inspect parser"
+    });
+    let err = parse_spawn_request(&input).expect_err("space in name should fail");
+    assert!(err.to_string().contains("name must not contain whitespace"));
+}
+
+#[test]
+fn test_parse_spawn_request_rejects_out_of_range_max_depth() {
+    let input = json!({
+        "name": "review.parser",
+        "prompt": "inspect parser",
+        "max_depth": 4
+    });
+    let err = parse_spawn_request(&input).expect_err("max_depth should be capped at schema range");
+    assert!(
+        err.to_string()
+            .contains("max_depth must be between 0 and 3")
+    );
+}
+
+#[tokio::test]
+async fn session_projection_exposes_forked_prefix_cache_contract() {
+    let mut snapshot = make_snapshot(SubAgentStatus::Running);
+    snapshot.name = "fanout_review".to_string();
+    snapshot.context_mode = "forked".to_string();
+    snapshot.fork_context = true;
+
+    let ctx = ToolContext::new(".");
+    let projection = subagent_session_projection(snapshot, false, &ctx).await;
+
+    assert_eq!(projection.name, "fanout_review");
+    assert_eq!(projection.context_mode, "forked");
+    assert!(projection.fork_context);
+    assert_eq!(projection.prefix_cache.mode, "forked");
+    assert_eq!(
+        projection.prefix_cache.parent_prefix,
+        "preserved_byte_identical_when_available"
+    );
+    assert_eq!(projection.transcript_handle.kind, "var_handle");
+    assert_eq!(projection.transcript_handle.name, "transcript");
+}
+
 #[test]
 fn test_delegate_defaults_to_fork_context() {
    let input = with_default_fork_context(json!({ "prompt": "review current work" }), true);
@@ -427,23 +497,6 @@ fn test_parse_assign_request_requires_update_fields() {
    );
 }

-#[test]
-fn test_send_input_schema_does_not_require_message_field() {
-    let manager = Arc::new(RwLock::new(SubAgentManager::new(PathBuf::from("."), 1)));
-    let schema = AgentSendInputTool::new(manager, "send_input").input_schema();
-    let required = schema
-        .get("required")
-        .and_then(Value::as_array)
-        .cloned()
-        .unwrap_or_default();
-    assert!(
-        !required
-            .iter()
-            .any(|entry| entry.as_str().is_some_and(|name| name == "message")),
-        "send_input schema should allow items-only payloads"
-    );
-}
-
 #[test]
 fn test_build_allowed_tools_independent_of_allow_shell() {
    // v0.6.6: allow_shell no longer filters at the build_allowed_tools
@@ -880,83 +933,6 @@ fn test_interrupted_status_name_and_summary() {
    assert!(summarize_subagent_result(&snapshot).contains(SUBAGENT_RESTART_REASON));
 }

-// === Deprecation notice tests ===
-
-/// Helper: build a plain ToolResult with a JSON payload.
-fn make_plain_result(payload: serde_json::Value) -> crate::tools::spec::ToolResult {
-    crate::tools::spec::ToolResult::json(&payload).expect("json result")
-}
-
-#[test]
-fn test_wrap_with_deprecation_notice_adds_deprecation_block() {
-    let result = make_plain_result(json!({"agent_id": "abc"}));
-    let wrapped = wrap_with_deprecation_notice(result, "spawn_agent", "agent_spawn");
-
-    let meta = wrapped.metadata.expect("metadata should be present");
-    let dep = &meta["_deprecation"];
-    assert_eq!(dep["this_tool"], "spawn_agent");
-    assert_eq!(dep["use_instead"], "agent_spawn");
-    assert_eq!(dep["removed_in"], DEPRECATION_REMOVAL_VERSION);
-    assert!(
-        dep["message"]
-            .as_str()
-            .unwrap_or("")
-            .contains("spawn_agent")
-    );
-}
-
-#[test]
-fn test_wrap_with_deprecation_notice_preserves_existing_metadata() {
-    let result = make_plain_result(json!({"agent_id": "abc"}))
-        .with_metadata(json!({"status": "Running", "snapshot": {}}));
-    let wrapped = wrap_with_deprecation_notice(result, "close_agent", "agent_cancel");
-
-    let meta = wrapped.metadata.expect("metadata should be present");
-    // Existing metadata key must survive.
-    assert_eq!(meta["status"], "Running");
-    // Deprecation block must be present alongside.
-    assert_eq!(meta["_deprecation"]["this_tool"], "close_agent");
-    assert_eq!(meta["_deprecation"]["use_instead"], "agent_cancel");
-}
-
-#[test]
-fn test_canonical_agent_send_input_has_no_deprecation() {
-    let manager = Arc::new(RwLock::new(SubAgentManager::new(PathBuf::from("."), 1)));
-    // The canonical name "agent_send_input" must NOT receive a deprecation notice.
-    // We verify this by inspecting the tool's name — the deprecation branch
-    // only fires when name == "send_input".
-    let tool = AgentSendInputTool::new(manager.clone(), "agent_send_input");
-    assert_eq!(tool.name(), "agent_send_input");
-
-    let alias = AgentSendInputTool::new(manager, "send_input");
-    assert_eq!(alias.name(), "send_input");
-}
-
-#[test]
-fn test_wrap_with_deprecation_notice_all_alias_mappings() {
-    let cases = [
-        ("spawn_agent", "agent_spawn"),
-        ("delegate_to_agent", "agent_spawn"),
-        ("close_agent", "agent_cancel"),
-        ("send_input", "agent_send_input"),
-    ];
-
-    for (alias, canonical) in cases {
-        let result = make_plain_result(json!({"ok": true}));
-        let wrapped = wrap_with_deprecation_notice(result, alias, canonical);
-        let meta = wrapped.metadata.expect("metadata for alias {alias}");
-        assert_eq!(meta["_deprecation"]["this_tool"], alias, "alias={alias}");
-        assert_eq!(
-            meta["_deprecation"]["use_instead"], canonical,
-            "alias={alias}"
-        );
-        assert_eq!(
-            meta["_deprecation"]["removed_in"], DEPRECATION_REMOVAL_VERSION,
-            "alias={alias}"
-        );
-    }
-}
-
 // === v0.6.6 — sub-agent authority unification ===

 #[test]
@@ -203,8 +203,7 @@ impl ReasoningEffort {
 #[derive(Debug, Clone, Copy, PartialEq, Eq)]
 pub enum SidebarFocus {
    Auto,
-    Plan,
-    Todos,
+    Work,
    Tasks,
    Agents,
    Context,
@@ -250,8 +249,7 @@ impl SidebarFocus {
    #[must_use]
    pub fn from_setting(value: &str) -> Self {
        match value.trim().to_ascii_lowercase().as_str() {
-            "plan" => Self::Plan,
-            "todos" => Self::Todos,
+            "work" | "plan" | "todos" => Self::Work,
            "tasks" => Self::Tasks,
            "agents" | "subagents" | "sub-agents" => Self::Agents,
            "context" | "session" => Self::Context,
@@ -264,8 +262,7 @@ impl SidebarFocus {
    pub fn as_setting(self) -> &'static str {
        match self {
            Self::Auto => "auto",
-            Self::Plan => "plan",
-            Self::Todos => "todos",
+            Self::Work => "work",
            Self::Tasks => "tasks",
            Self::Agents => "agents",
            Self::Context => "context",
@@ -1262,14 +1259,8 @@ impl App {
        let sidebar_focus = SidebarFocus::from_setting(&settings.sidebar_focus);
        let max_input_history = settings.max_input_history;
        let use_paste_burst_detection = settings.paste_burst_detection;
-        let mut ui_theme = palette::UiTheme::detect();
-        if let Some(background) = settings
-            .background_color
-            .as_deref()
-            .and_then(palette::parse_hex_rgb_color)
-        {
-            ui_theme = ui_theme.with_background_color(background);
-        }
+        let ui_theme =
+            palette::ui_theme_from_settings(&settings.theme, settings.background_color.as_deref());
        let model = settings
            .provider_models
            .as_ref()
@@ -2590,10 +2581,12 @@ impl App {

    /// Handle terminal resize event.
    pub fn handle_resize(&mut self, _width: u16, _height: u16) {
+        let preserved_scroll = (!self.viewport.transcript_scroll.is_at_tail())
+            .then_some(self.viewport.last_transcript_top);
        self.viewport.transcript_cache = TranscriptViewCache::new();

-        if !self.viewport.transcript_scroll.is_at_tail() {
-            self.viewport.transcript_scroll = TranscriptScroll::to_bottom();
+        if let Some(top) = preserved_scroll {
+            self.viewport.transcript_scroll = TranscriptScroll::at_line(top);
        }

        self.viewport.pending_scroll_delta = 0;
@@ -4002,19 +3995,6 @@ pub enum AppAction {
    },
    /// Send a message to the AI (normal chat mode).
    SendMessage(String),
-    /// Run a Recursive Language Model (RLM) turn — Algorithm 1 from
-    /// Zhang et al. (arXiv:2512.24601). The prompt is stored in the REPL;
-    /// the root LLM only sees metadata.
-    Rlm {
-        /// The user's prompt — stored in REPL, NOT in LLM context.
-        prompt: String,
-        /// Model for the root LLM.
-        model: String,
-        /// Model for sub-LLM (llm_query) calls.
-        child_model: String,
-        /// Recursion budget for `sub_rlm()` calls.
-        max_depth: u32,
-    },
    ListSubAgents,
    FetchModels,
    CacheWarmup,
@@ -4126,7 +4106,9 @@ mod tests {
            notes_path: PathBuf::from("notes.txt"),
            mcp_config_path: PathBuf::from("mcp.json"),
            use_memory: false,
-            start_in_agent_mode: yolo,
+            // Keep unit tests independent from the developer's saved
+            // `default_mode` setting.
+            start_in_agent_mode: true,
            skip_onboarding: false,
            yolo,
            resume_session_id: None,
@@ -4140,6 +4122,18 @@ mod tests {
        assert!(app.trust_mode);
    }

+    #[test]
+    fn sidebar_focus_accepts_work_and_maps_legacy_trackers_to_work() {
+        assert_eq!(SidebarFocus::from_setting("auto"), SidebarFocus::Auto);
+        assert_eq!(SidebarFocus::from_setting("work"), SidebarFocus::Work);
+        assert_eq!(SidebarFocus::from_setting("plan"), SidebarFocus::Work);
+        assert_eq!(SidebarFocus::from_setting("todos"), SidebarFocus::Work);
+        assert_eq!(SidebarFocus::from_setting("tasks"), SidebarFocus::Tasks);
+        assert_eq!(SidebarFocus::from_setting("agents"), SidebarFocus::Agents);
+        assert_eq!(SidebarFocus::from_setting("context"), SidebarFocus::Context);
+        assert_eq!(SidebarFocus::Work.as_setting(), "work");
+    }
+
    #[test]
    fn composer_strips_raw_sgr_mouse_report_when_mouse_capture_is_enabled() {
        let mut app = App::new(test_options(false), &Config::default());
@@ -4420,7 +4414,6 @@ mod tests {
    #[test]
    fn test_cycle_mode_transitions() {
        let mut app = App::new(test_options(false), &Config::default());
-        // Default mode should be Agent based on settings
        let initial_mode = app.mode;
        app.cycle_mode();
        // Mode should have changed
@@ -4577,6 +4570,32 @@ mod tests {
        app.scroll_down(3);
    }

+    #[test]
+    fn resize_preserves_scrolled_transcript_position() {
+        let mut app = App::new(test_options(false), &Config::default());
+        app.viewport.transcript_scroll = TranscriptScroll::at_line(42);
+        app.viewport.last_transcript_top = 42;
+        app.viewport.pending_scroll_delta = 5;
+
+        app.handle_resize(120, 40);
+
+        let meta = vec![TranscriptLineMeta::Spacer; 240];
+        let (_, top) = app.viewport.transcript_scroll.resolve_top(&meta, 200);
+        assert_eq!(top, 42);
+        assert_eq!(app.viewport.pending_scroll_delta, 0);
+    }
+
+    #[test]
+    fn resize_keeps_tail_state_when_user_was_at_tail() {
+        let mut app = App::new(test_options(false), &Config::default());
+        app.viewport.transcript_scroll = TranscriptScroll::to_bottom();
+        app.viewport.last_transcript_top = 42;
+
+        app.handle_resize(120, 40);
+
+        assert!(app.viewport.transcript_scroll.is_at_tail());
+    }
+
    #[test]
    fn test_add_message() {
        let mut app = App::new(test_options(false), &Config::default());
@@ -207,6 +207,18 @@ mod tests {
        assert_eq!(cell.bg, palette::LIGHT_SURFACE);
    }

+    #[test]
+    fn grayscale_palette_maps_hued_cells_before_depth_adaptation() {
+        let mut cell = Cell::default();
+        cell.set_fg(palette::DEEPSEEK_SKY);
+        cell.set_bg(palette::DEEPSEEK_INK);
+
+        adapt_cell_colors(&mut cell, ColorDepth::TrueColor, PaletteMode::Grayscale);
+
+        assert_eq!(cell.fg, palette::GRAYSCALE_TEXT_SOFT);
+        assert_eq!(cell.bg, palette::GRAYSCALE_SURFACE);
+    }
+
    #[test]
    fn backend_palette_mode_can_follow_runtime_theme_changes() {
        let writer = SharedWriter::default();
@@ -215,5 +227,7 @@ mod tests {
        assert_eq!(backend.palette_mode, PaletteMode::Dark);
        backend.set_palette_mode(PaletteMode::Light);
        assert_eq!(backend.palette_mode, PaletteMode::Light);
+        backend.set_palette_mode(PaletteMode::Grayscale);
+        assert_eq!(backend.palette_mode, PaletteMode::Grayscale);
    }
 }
@@ -309,7 +309,7 @@ impl HistoryCell {
    /// Render the cell in transcript mode: full content, no caps, no
    /// "Alt+V for details" affordances.
    ///
-    /// Use this for the pager (`v` / `Ctrl+O`), clipboard exports, and any
+    /// Use this for full-detail pagers, clipboard exports, and any
    /// surface that wants the complete body rather than the live summary.
    /// For most variants (User / Assistant / System) this matches `lines()`;
    /// `Thinking` and `Tool` are where the live and transcript surfaces
@@ -1247,19 +1247,21 @@ impl GenericToolCell {
            return lines;
        }

-        // Issue #409: `agent_spawn` already gets a dedicated `DelegateCard`
+        // Issue #409: sub-agent open already gets a dedicated `DelegateCard`
        // that owns the live action tree, status, and final summary. The
        // generic tool block for the same call duplicates that signal at
        // 3-4 lines per spawn — N parallel spawns multiply the noise. In
        // live mode, render one compact summary line and let the
        // DelegateCard be the source of truth. Transcript mode keeps the
        // full block so session replay remains complete.
-        if matches!(mode, RenderMode::Live) && self.name == "agent_spawn" {
+        if matches!(mode, RenderMode::Live)
+            && matches!(self.name.as_str(), "agent_open" | "agent_spawn")
+        {
            return self.render_agent_spawn_compact(low_motion);
        }

        let mut lines = Vec::new();
-        // Map the actual tool name (e.g. `agent_spawn`, `apply_patch`) to a
+        // Map the actual tool name (e.g. `agent_open`, `apply_patch`) to a
        // family rather than the catch-all `"Tool"` title — this is what
        // gives a `GenericToolCell` the right verb glyph (◐ delegate, ⋮⋮
        // fanout, etc.) instead of falling back to the neutral bullet.
@@ -1343,13 +1345,13 @@ impl GenericToolCell {
        wrap_card_rail(lines)
    }

-    /// Render `agent_spawn` as a single compact summary line for live
+    /// Render `agent_open`/legacy `agent_spawn` as a single compact summary line for live
    /// mode (#409). The companion `DelegateCard` already carries the
    /// live action tree, status, and final summary; this line is just
    /// the pointer that says "a spawn happened, here's the agent id".
    ///
    /// Output shape (header):
-    ///   `◐ delegate · agent_spawn  agent-abc12  [running]`
+    ///   `◐ delegate · agent_open  agent-abc12  [running]`
    /// Falls back to a placeholder when the spawn is still pending and
    /// no agent id has been assigned yet.
    fn render_agent_spawn_compact(&self, low_motion: bool) -> Vec<Line<'static>> {
@@ -1438,7 +1440,7 @@ fn render_spillover_annotation(path: &std::path::Path, width: u16) -> Line<'stat
    ])
 }

-/// Pull the `agent_id` field out of an `agent_spawn` tool output. The
+/// Pull the `agent_id` field out of a sub-agent open tool output. The
 /// tool emits structured JSON shaped like
 /// `{"agent_id": "agent-abc12", "nickname": "...", "model": "..."}` so we
 /// look for the `agent_id` key and return its string value.
@@ -2009,8 +2011,20 @@ pub fn output_is_image(output: &str) -> bool {
    .any(|ext| lower.contains(ext))
 }

+#[allow(dead_code)] // Kept for compatibility/tests; live view uses explicit summaries only.
 #[must_use]
 pub fn extract_reasoning_summary(text: &str) -> Option<String> {
+    extract_explicit_reasoning_summary(text).or_else(|| {
+        let fallback = text.trim();
+        if fallback.is_empty() {
+            None
+        } else {
+            Some(fallback.to_string())
+        }
+    })
+}
+
+fn extract_explicit_reasoning_summary(text: &str) -> Option<String> {
    let mut lines = text.lines().peekable();
    while let Some(line) = lines.next() {
        let trimmed = line.trim();
@@ -2042,12 +2056,7 @@ pub fn extract_reasoning_summary(text: &str) -> Option<String> {
            };
        }
    }
-    let fallback = text.trim();
-    if fallback.is_empty() {
-        None
-    } else {
-        Some(fallback.to_string())
-    }
+    None
 }

 fn render_thinking(
@@ -2092,6 +2101,7 @@ fn render_thinking(
    lines.push(Line::from(header_spans));

    let content_width = width.saturating_sub(3).max(1);
+    let mut collapsed_without_explicit_summary = false;
    let body_text = if collapsed {
        if streaming {
            // #861 RC4 / #1324: during streaming we don't yet have a
@@ -2102,7 +2112,13 @@ fn render_thinking(
            // staring at an empty placeholder.
            content.to_string()
        } else {
-            extract_reasoning_summary(content).unwrap_or_else(|| content.trim().to_string())
+            match extract_explicit_reasoning_summary(content) {
+                Some(summary) => summary,
+                None => {
+                    collapsed_without_explicit_summary = true;
+                    String::new()
+                }
+            }
        }
    } else {
        content.to_string()
@@ -2156,13 +2172,13 @@ fn render_thinking(
            // knows there's more above and how to reach it.
            truncated
        } else {
-            truncated || body_text.trim() != content.trim()
+            collapsed_without_explicit_summary || truncated || body_text.trim() != content.trim()
        };
    if needs_affordance {
        let label = if streaming {
-            "thinking continues; press Ctrl+O for full text"
+            "More reasoning in Ctrl+O"
        } else {
-            "thinking collapsed; press Ctrl+O for full text"
+            "Full reasoning in Ctrl+O"
        };
        lines.push(Line::from(vec![
            Span::styled(REASONING_RAIL.to_string(), rail_style),
@@ -3646,7 +3662,7 @@ mod tests {
            .iter()
            .flat_map(|line| line.spans.iter().map(|span| span.content.as_ref()))
            .collect::<String>();
-        assert!(text.contains("thinking collapsed; press Ctrl+O for full text"));
+        assert!(text.contains("Full reasoning in Ctrl+O"));
        assert!(text.contains("thinking"));
    }

@@ -3694,7 +3710,7 @@ mod tests {
            .flat_map(|line| line.spans.iter().map(|span| span.content.as_ref()))
            .collect::<String>();
        assert!(
-            text.contains("thinking continues; press Ctrl+O for full text"),
+            text.contains("More reasoning in Ctrl+O"),
            "streaming-truncation affordance missing, got: {text}"
        );
        // The most recent line must be the visible tail (head dropped).
@@ -4274,10 +4290,10 @@ mod tests {

    // === display_lines (lines_with_options) vs transcript_lines parity ===
    //
-    // These lock the contract for CX#8: live view compresses thinking and
-    // caps tool output, transcript view shows the full body. Both surfaces
-    // must contain the first paragraph / first line of the underlying
-    // content so users never lose the lede.
+    // These lock the contract for CX#8: live view keeps reasoning compact
+    // and caps tool output, transcript view shows the full body. Completed
+    // reasoning without an explicit Summary stays out of the main flow so it
+    // cannot masquerade as user text.

    fn line_text(line: &ratatui::text::Line<'static>) -> String {
        line.spans
@@ -4293,8 +4309,9 @@ mod tests {
    #[test]
    fn long_thinking_display_is_shorter_than_transcript() {
        // Build a multi-paragraph thinking body so the live view has
-        // something to compress. The first paragraph is the lede; both
-        // surfaces must keep it.
+        // something to compress. Without an explicit Summary block, the
+        // live surface should show status + affordance only; Ctrl+O remains
+        // the path to the full body.
        let body = "First paragraph lede.\n\
                    Second sentence of the first paragraph.\n\n\
                    Second paragraph: deeper analysis follows.\n\
@@ -4328,14 +4345,14 @@ mod tests {
        let live_text = lines_text(&live);
        let transcript_text = lines_text(&transcript);

-        assert!(
-            live_text.contains("First paragraph lede"),
-            "live thinking must keep the lede: {live_text}"
-        );
        assert!(
            transcript_text.contains("First paragraph lede"),
            "transcript thinking must keep the lede"
        );
+        assert!(
+            !live_text.contains("First paragraph lede"),
+            "live thinking must not show raw completed reasoning: {live_text}"
+        );
        assert!(
            transcript_text.contains("Fourth paragraph"),
            "transcript thinking must keep the full body"
@@ -4345,19 +4362,20 @@ mod tests {
            "live thinking must drop the tail when collapsed"
        );
        assert!(
-            live_text.contains("press Ctrl+O for full text"),
+            live_text.contains("Full reasoning in Ctrl+O"),
            "live thinking must offer the pager affordance"
        );
        assert!(
-            !transcript_text.contains("press Ctrl+O for full text"),
+            !transcript_text.contains("Full reasoning in Ctrl+O"),
            "transcript thinking must not include the live affordance"
        );
    }

    #[test]
-    fn short_thinking_display_equals_transcript() {
-        // A single-line thinking body has nothing to compress; live and
-        // transcript surfaces should agree.
+    fn completed_thinking_without_summary_stays_out_of_live_view() {
+        // Even a short completed reasoning body can read like the user's
+        // prompt when rendered inline. Keep it in transcript/detail surfaces
+        // and show the Ctrl+O affordance in the main flow.
        let cell = HistoryCell::Thinking {
            content: "One brief reasoning step.".to_string(),
            streaming: false,
@@ -4376,13 +4394,17 @@ mod tests {
        let live_text = lines_text(&live);
        let transcript_text = lines_text(&transcript);

-        assert_eq!(
-            live_text, transcript_text,
-            "short thinking must render identically on both surfaces"
+        assert!(
+            !live_text.contains("One brief reasoning step."),
+            "live thinking must hide raw completed reasoning: {live_text}"
        );
        assert!(
-            !live_text.contains("press Ctrl+O for full text"),
-            "short thinking must not show the collapse affordance"
+            transcript_text.contains("One brief reasoning step."),
+            "transcript thinking must keep the full reasoning body"
+        );
+        assert!(
+            live_text.contains("Full reasoning in Ctrl+O"),
+            "live thinking must offer the detail affordance"
        );
    }

@@ -242,7 +242,7 @@ pub const KEYBINDINGS: &[KeybindingEntry] = &[
        section: KeybindingSection::Modes,
    },
    KeybindingEntry {
-        chord: "Alt+! / Alt+@ / Alt+# / Alt+4 / Alt+$ / Alt+0",
+        chord: "Alt+! / Alt+@ / Alt+# / Alt+$ / Alt+0",
        description_id: crate::localization::MessageId::KbFocusSidebar,
        section: KeybindingSection::Modes,
    },
@@ -335,6 +335,23 @@ mod tests {
        );
    }

+    #[test]
+    fn ctrl_o_help_copy_matches_activity_detail_behavior() {
+        let ctrl_o = KEYBINDINGS
+            .iter()
+            .find(|entry| entry.chord == "Ctrl+O")
+            .expect("Ctrl+O keybinding should be documented");
+
+        assert_eq!(
+            ctrl_o.description_id,
+            crate::localization::MessageId::KbThinkingPager
+        );
+        assert_eq!(
+            crate::localization::tr(crate::localization::Locale::En, ctrl_o.description_id,),
+            "Open Activity Detail"
+        );
+    }
+
    #[test]
    fn section_rank_is_a_total_order() {
        let sections = [
@@ -20,7 +20,7 @@
 //! key just changed; revision bumps invalidate only the cells that mutated;
 //! cells that didn't change reuse their existing wrap.

-use std::cell::RefCell;
+use std::cell::{Cell, RefCell};

 use crossterm::event::{KeyCode, KeyEvent, KeyModifiers};
 use ratatui::{
@@ -66,6 +66,11 @@ struct CellSnapshot {
    cell: HistoryCell,
 }

+struct FlattenedTranscript {
+    lines: Vec<Line<'static>>,
+    highlighted_range: Option<(usize, usize)>,
+}
+
 pub struct LiveTranscriptOverlay {
    /// Latest cell snapshots (history + active). Refreshed via
    /// `refresh_from_app` immediately before each render so streaming
@@ -79,20 +84,23 @@ pub struct LiveTranscriptOverlay {
    /// Sticky-tail flag: when `true`, refresh re-pins scroll to the bottom.
    /// Flipped to `false` when the user scrolls up; flipped back to `true`
    /// when they scroll past the last visible line.
-    sticky_to_bottom: bool,
+    sticky_to_bottom: Cell<bool>,
    /// Current top-of-viewport line offset into the flattened line list.
-    scroll: usize,
+    scroll: Cell<usize>,
    /// Visible content height from the last render. Used by paging keys
    /// before the next render frame populates a fresh value.
-    last_visible_height: RefCell<usize>,
+    last_visible_height: Cell<usize>,
    /// Last total line count after wrapping; cached so `handle_key` can
    /// clamp scroll without re-wrapping. Updated by `render`.
-    last_total_lines: RefCell<usize>,
+    last_total_lines: Cell<usize>,
    /// Pending `gg` second keystroke for Vim-style jump-to-top.
    pending_g: bool,
    /// Render mode — `Tail` is the live-stream mode; `BacktrackPreview`
    /// highlights the selected user message (#133).
    mode: Mode,
+    /// Set when a backtrack selection changes. The next render pins the
+    /// selected cell into view once we know the wrapped line range.
+    preview_pin_pending: Cell<bool>,
 }

 impl LiveTranscriptOverlay {
@@ -102,12 +110,13 @@ impl LiveTranscriptOverlay {
            snapshots: Vec::new(),
            options: TranscriptRenderOptions::default(),
            cache: RefCell::new(TranscriptCache::new()),
-            sticky_to_bottom: true,
-            scroll: 0,
-            last_visible_height: RefCell::new(0),
-            last_total_lines: RefCell::new(0),
+            sticky_to_bottom: Cell::new(true),
+            scroll: Cell::new(0),
+            last_visible_height: Cell::new(0),
+            last_total_lines: Cell::new(0),
            pending_g: false,
            mode: Mode::Tail,
+            preview_pin_pending: Cell::new(false),
        }
    }

@@ -118,7 +127,8 @@ impl LiveTranscriptOverlay {
    /// highlight overlay does.
    pub fn set_backtrack_preview(&mut self, selected_idx: usize) {
        self.mode = Mode::BacktrackPreview { selected_idx };
-        self.sticky_to_bottom = false;
+        self.sticky_to_bottom.set(false);
+        self.preview_pin_pending.set(true);
    }

    /// Return the overlay to live-tail mode (used when backtrack is
@@ -126,7 +136,8 @@ impl LiveTranscriptOverlay {
    #[allow(dead_code)] // exposed for callers that retain an overlay across a backtrack cancel; current UI just pops the view.
    pub fn set_tail_mode(&mut self) {
        self.mode = Mode::Tail;
-        self.sticky_to_bottom = true;
+        self.sticky_to_bottom.set(true);
+        self.preview_pin_pending.set(false);
    }

    /// For tests + UI: current mode.
@@ -179,9 +190,10 @@ impl LiveTranscriptOverlay {
    /// first line and reverse-video styling on every line so the eye
    /// snaps to them at a glance. The decoration is applied *after* the
    /// cache lookup so toggling preview mode never invalidates wraps.
-    fn flatten(&self, width: u16) -> Vec<Line<'static>> {
+    fn flatten(&self, width: u16) -> FlattenedTranscript {
        let width = width.max(1);
        let mut out: Vec<Line<'static>> = Vec::new();
+        let mut highlighted_range = None;

        // Pre-compute which cell index (in `self.snapshots`) is the one
        // the user has selected via Esc-Esc. We walk snapshots backwards
@@ -217,16 +229,24 @@ impl LiveTranscriptOverlay {
            };

            if Some(cell_idx) == highlighted_cell_idx {
+                let start = out.len();
                out.extend(decorate_highlight(lines));
+                let end = out.len();
+                if end > start {
+                    highlighted_range = Some((start, end));
+                }
            } else {
                out.extend(lines);
            }
        }
-        out
+        FlattenedTranscript {
+            lines: out,
+            highlighted_range,
+        }
    }

    fn page_height(&self) -> usize {
-        let cached = *self.last_visible_height.borrow();
+        let cached = self.last_visible_height.get();
        if cached == 0 { 10 } else { cached }
    }

@@ -235,33 +255,38 @@ impl LiveTranscriptOverlay {
    }

    fn max_scroll(&self) -> usize {
-        let total = *self.last_total_lines.borrow();
+        let total = self.last_total_lines.get();
        let visible = self.page_height();
        total.saturating_sub(visible)
    }

    fn scroll_up(&mut self, amount: usize) {
-        self.scroll = self.scroll.saturating_sub(amount);
+        self.scroll.set(self.scroll.get().saturating_sub(amount));
        // Any upward motion exits sticky-tail; explicit user intent.
-        self.sticky_to_bottom = false;
+        self.sticky_to_bottom.set(false);
+        self.preview_pin_pending.set(false);
    }

    fn scroll_down(&mut self, amount: usize) {
        let max = self.max_scroll();
-        self.scroll = (self.scroll + amount).min(max);
-        if self.scroll >= max {
-            self.sticky_to_bottom = true;
+        let scroll = self.scroll.get().saturating_add(amount).min(max);
+        self.scroll.set(scroll);
+        self.preview_pin_pending.set(false);
+        if scroll >= max && matches!(self.mode, Mode::Tail) {
+            self.sticky_to_bottom.set(true);
        }
    }

    fn jump_to_top(&mut self) {
-        self.scroll = 0;
-        self.sticky_to_bottom = false;
+        self.scroll.set(0);
+        self.sticky_to_bottom.set(false);
+        self.preview_pin_pending.set(false);
    }

    fn jump_to_bottom(&mut self) {
-        self.scroll = self.max_scroll();
-        self.sticky_to_bottom = true;
+        self.scroll.set(self.max_scroll());
+        self.sticky_to_bottom.set(matches!(self.mode, Mode::Tail));
+        self.preview_pin_pending.set(false);
    }

    /// For tests: snapshot count.
@@ -273,13 +298,13 @@ impl LiveTranscriptOverlay {
    /// For tests: whether sticky-tail is currently armed.
    #[cfg(test)]
    pub fn is_sticky(&self) -> bool {
-        self.sticky_to_bottom
+        self.sticky_to_bottom.get()
    }

    /// For tests: current scroll offset.
    #[cfg(test)]
    pub fn scroll_offset(&self) -> usize {
-        self.scroll
+        self.scroll.get()
    }
 }

@@ -317,6 +342,26 @@ fn decorate_highlight(mut lines: Vec<Line<'static>>) -> Vec<Line<'static>> {
    lines
 }

+fn scroll_to_show_range(
+    current: usize,
+    start: usize,
+    end: usize,
+    visible_height: usize,
+    max_scroll: usize,
+) -> usize {
+    if visible_height == 0 {
+        return 0;
+    }
+    let end = end.max(start.saturating_add(1));
+    if start < current {
+        start.min(max_scroll)
+    } else if end > current.saturating_add(visible_height) {
+        end.saturating_sub(visible_height).min(max_scroll)
+    } else {
+        current.min(max_scroll)
+    }
+}
+
 impl ModalView for LiveTranscriptOverlay {
    fn kind(&self) -> ModalKind {
        ModalKind::LiveTranscript
@@ -461,23 +506,36 @@ impl ModalView for LiveTranscriptOverlay {
        // Compute inner content height once: borders eat 1 row top + 1 bottom,
        // padding eats 1 more on each side.
        let visible_height = popup_area.height.saturating_sub(4) as usize;
-        *self.last_visible_height.borrow_mut() = visible_height;
+        self.last_visible_height.set(visible_height);

        // Wrap content using the per-cell cache; subtract padding from width
        // so wrapped lines fit between the inner edges.
        let content_width = popup_width.saturating_sub(4);
-        let lines = self.flatten(content_width);
-        *self.last_total_lines.borrow_mut() = lines.len();
+        let flattened = self.flatten(content_width);
+        let lines = flattened.lines;
+        self.last_total_lines.set(lines.len());

        let max_scroll = lines.len().saturating_sub(visible_height);
        // Sticky-tail: every render re-pins scroll to the bottom unless the
        // user has explicitly scrolled away. Without this, streaming new
        // content would push the visible window backwards as `scroll` stays
        // fixed against a growing total.
-        let scroll = if self.sticky_to_bottom {
+        let scroll = if self.sticky_to_bottom.get() {
+            self.scroll.set(max_scroll);
            max_scroll
+        } else if self.preview_pin_pending.replace(false) {
+            let next = flattened
+                .highlighted_range
+                .map(|(start, end)| {
+                    scroll_to_show_range(self.scroll.get(), start, end, visible_height, max_scroll)
+                })
+                .unwrap_or_else(|| self.scroll.get().min(max_scroll));
+            self.scroll.set(next);
+            next
        } else {
-            self.scroll.min(max_scroll)
+            let next = self.scroll.get().min(max_scroll);
+            self.scroll.set(next);
+            next
        };
        let end = (scroll + visible_height).min(lines.len());
        let visible_lines: Vec<Line<'static>> = if lines.is_empty() {
@@ -495,7 +553,7 @@ impl ModalView for LiveTranscriptOverlay {
                selected_idx + 1
            ),
            Mode::Tail => {
-                if self.sticky_to_bottom {
+                if self.sticky_to_bottom.get() {
                    " Live transcript (tailing) ".to_string()
                } else {
                    " Live transcript (paused) ".to_string()
@@ -560,6 +618,17 @@ mod tests {
            .collect();
    }

+    fn buffer_text(buf: &Buffer) -> String {
+        let mut out = String::new();
+        for y in 0..buf.area.height {
+            for x in 0..buf.area.width {
+                out.push_str(buf[(x, y)].symbol());
+            }
+            out.push('\n');
+        }
+        out
+    }
+
    #[test]
    fn new_overlay_starts_sticky() {
        let v = LiveTranscriptOverlay::new();
@@ -577,8 +646,8 @@ mod tests {
        );
        prime_layout(&mut v, 10);
        // Force scroll non-zero so scroll_up actually moves.
-        v.scroll = 5;
-        v.sticky_to_bottom = true;
+        v.scroll.set(5);
+        v.sticky_to_bottom.set(true);
        let _ = v.handle_key(KeyEvent::new(KeyCode::Char('k'), KeyModifiers::NONE));
        assert!(!v.is_sticky(), "scrolling up must release the sticky tail");
    }
@@ -592,8 +661,8 @@ mod tests {
        );
        prime_layout(&mut v, 10);
        // Drop out of sticky mode by scrolling up.
-        v.scroll = 10;
-        v.sticky_to_bottom = false;
+        v.scroll.set(10);
+        v.sticky_to_bottom.set(false);
        let _ = v.handle_key(KeyEvent::new(KeyCode::End, KeyModifiers::NONE));
        assert!(
            v.is_sticky(),
@@ -609,12 +678,12 @@ mod tests {
            (0..50).map(|i| user(&format!("line {i}"))).collect(),
        );
        prime_layout(&mut v, 10);
-        v.sticky_to_bottom = false;
+        v.sticky_to_bottom.set(false);
        // PageDown once should not re-arm since we're not yet at the tail.
        let _ = v.handle_key(KeyEvent::new(KeyCode::PageDown, KeyModifiers::NONE));
        // Now jump explicitly to bottom and verify re-arm.
-        v.scroll = 0;
-        v.sticky_to_bottom = false;
+        v.scroll.set(0);
+        v.sticky_to_bottom.set(false);
        let _ = v.handle_key(KeyEvent::new(KeyCode::Char('G'), KeyModifiers::NONE));
        assert!(v.is_sticky());
    }
@@ -784,6 +853,36 @@ mod tests {
        }
    }

+    #[test]
+    fn backtrack_preview_opens_near_latest_user_not_transcript_start() {
+        let mut v = LiveTranscriptOverlay::new();
+        let mut cells = Vec::new();
+        for i in 0..12 {
+            cells.push(user(&format!("user {i}")));
+            cells.push(assistant(&format!("assistant {i}"), false));
+        }
+        install_snapshots(&mut v, cells);
+
+        v.set_backtrack_preview(0);
+        let area = Rect::new(0, 0, 48, 10);
+        let mut buf = Buffer::empty(area);
+        v.render(area, &mut buf);
+        let rendered = buffer_text(&buf);
+
+        assert!(
+            v.scroll_offset() > 0,
+            "preview should pin near the selected recent turn, got top offset 0"
+        );
+        assert!(
+            rendered.contains("user 11"),
+            "latest user turn should be visible after opening preview: {rendered}"
+        );
+        assert!(
+            !rendered.contains("user 0"),
+            "preview must not open at the oldest transcript line: {rendered}"
+        );
+    }
+
    #[test]
    fn backtrack_preview_out_of_range_does_not_panic() {
        // Selecting beyond the user-cell count should simply not
@@ -17,6 +17,12 @@

 use std::time::{Duration, Instant};

+const TRACKPAD_EVENT_WINDOW: Duration = Duration::from_millis(35);
+const WHEEL_LINES_PER_TICK: i32 = 3;
+const TRACKPAD_BASE_LINES_PER_TICK: i32 = 1;
+const TRACKPAD_MID_LINES_PER_TICK: i32 = 2;
+const TRACKPAD_MAX_LINES_PER_TICK: i32 = 3;
+
 // === Transcript Line Metadata ===

 /// Metadata describing how rendered transcript lines map to history cells.
@@ -203,7 +209,8 @@ impl ScrollDirection {
 #[derive(Debug, Default)]
 pub struct MouseScrollState {
    last_event_at: Option<Instant>,
-    pending_lines: i32,
+    last_direction: Option<ScrollDirection>,
+    rapid_same_direction_ticks: u8,
 }

 /// A computed scroll delta from user input.
@@ -222,17 +229,37 @@ impl MouseScrollState {
    /// Process a scroll event and return the resulting delta.
    pub fn on_scroll(&mut self, direction: ScrollDirection) -> ScrollUpdate {
        let now = Instant::now();
+        self.on_scroll_at(direction, now)
+    }
+
+    fn on_scroll_at(&mut self, direction: ScrollDirection, now: Instant) -> ScrollUpdate {
        let is_trackpad = self
            .last_event_at
-            .is_some_and(|last| now.duration_since(last) < Duration::from_millis(35));
+            .is_some_and(|last| now.saturating_duration_since(last) < TRACKPAD_EVENT_WINDOW);
+        let same_direction = self.last_direction == Some(direction);
+
        self.last_event_at = Some(now);
+        self.last_direction = Some(direction);

-        let lines_per_tick = if is_trackpad { 1 } else { 3 };
-        self.pending_lines += direction.sign() * lines_per_tick;
+        let lines_per_tick = if is_trackpad {
+            if same_direction {
+                self.rapid_same_direction_ticks = self.rapid_same_direction_ticks.saturating_add(1);
+            } else {
+                self.rapid_same_direction_ticks = 1;
+            }
+            match self.rapid_same_direction_ticks {
+                0..=2 => TRACKPAD_BASE_LINES_PER_TICK,
+                3..=5 => TRACKPAD_MID_LINES_PER_TICK,
+                _ => TRACKPAD_MAX_LINES_PER_TICK,
+            }
+        } else {
+            self.rapid_same_direction_ticks = 0;
+            WHEEL_LINES_PER_TICK
+        };

-        let delta = self.pending_lines;
-        self.pending_lines = 0;
-        ScrollUpdate { delta_lines: delta }
+        ScrollUpdate {
+            delta_lines: direction.sign() * lines_per_tick,
+        }
    }
 }

@@ -433,4 +460,90 @@ mod tests {
        assert!(state.is_at_tail());
        assert_eq!(top, max_start);
    }
+
+    #[test]
+    fn mouse_scroll_single_wheel_tick_moves_three_lines() {
+        let mut state = MouseScrollState::new();
+        let start = Instant::now();
+
+        assert_eq!(
+            state.on_scroll_at(ScrollDirection::Down, start).delta_lines,
+            3
+        );
+        assert_eq!(
+            state.on_scroll_at(ScrollDirection::Up, start).delta_lines,
+            -1,
+            "same timestamp is treated as a rapid precise input"
+        );
+    }
+
+    #[test]
+    fn mouse_scroll_rapid_same_direction_accelerates_but_caps() {
+        let mut state = MouseScrollState::new();
+        let start = Instant::now();
+
+        let deltas = [
+            state.on_scroll_at(ScrollDirection::Down, start).delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(10))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(20))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(30))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(40))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(50))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(60))
+                .delta_lines,
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(70))
+                .delta_lines,
+        ];
+
+        assert_eq!(deltas, [3, 1, 1, 2, 2, 2, 3, 3]);
+    }
+
+    #[test]
+    fn mouse_scroll_direction_change_resets_acceleration() {
+        let mut state = MouseScrollState::new();
+        let start = Instant::now();
+
+        for step in 0..8 {
+            let _ = state.on_scroll_at(
+                ScrollDirection::Down,
+                start + Duration::from_millis(step * 10),
+            );
+        }
+
+        assert_eq!(
+            state
+                .on_scroll_at(ScrollDirection::Up, start + Duration::from_millis(90))
+                .delta_lines,
+            -1
+        );
+    }
+
+    #[test]
+    fn mouse_scroll_slow_gap_resets_to_wheel_tick() {
+        let mut state = MouseScrollState::new();
+        let start = Instant::now();
+
+        assert_eq!(
+            state.on_scroll_at(ScrollDirection::Down, start).delta_lines,
+            3
+        );
+        assert_eq!(
+            state
+                .on_scroll_at(ScrollDirection::Down, start + Duration::from_millis(100))
+                .delta_lines,
+            3
+        );
+    }
 }
@@ -16,7 +16,10 @@ use ratatui::{
 use unicode_width::{UnicodeWidthChar, UnicodeWidthStr};

 use crate::palette;
-use crate::session_manager::{SavedSession, SessionManager, SessionMetadata};
+use crate::session_manager::{
+    SavedSession, SessionManager, SessionMetadata, extract_title, extract_user_prompt,
+    strip_thinking_tags,
+};
 use crate::tui::views::{ModalKind, ModalView, ViewAction, ViewEvent};

 fn modal_block(title: &str) -> Block<'static> {
@@ -46,6 +49,8 @@ pub struct SessionPickerView {
    selected: usize,
    list_scroll: Cell<usize>,
    list_visible_rows: Cell<usize>,
+    history_scroll: Cell<usize>,
+    history_visible_rows: Cell<usize>,
    search_input: String,
    search_mode: bool,
    sort_mode: SortMode,
@@ -78,6 +83,8 @@ impl SessionPickerView {
            selected: 0,
            list_scroll: Cell::new(0),
            list_visible_rows: Cell::new(8),
+            history_scroll: Cell::new(0),
+            history_visible_rows: Cell::new(12),
            search_input: String::new(),
            search_mode: false,
            sort_mode: SortMode::Recent,
@@ -164,11 +171,62 @@ impl SessionPickerView {
        self.refresh_preview();
    }

+    fn select_visible_shortcut(&mut self, c: char) -> bool {
+        let Some(slot) = c.to_digit(10) else {
+            return false;
+        };
+        if !(1..=9).contains(&slot) {
+            return false;
+        }
+        let index = self.list_scroll.get().saturating_add(slot as usize - 1);
+        if index >= self.filtered.len() {
+            return false;
+        }
+        self.selected = index;
+        self.ensure_selected_visible();
+        self.refresh_preview();
+        if let Some(session) = self.selected_session() {
+            self.status = Some(format!(
+                "Opened history for {}",
+                crate::session_manager::truncate_id(&session.id)
+            ));
+        }
+        true
+    }
+
    fn update_list_viewport(&self, visible_rows: usize) {
        self.list_visible_rows.set(visible_rows.max(1));
        self.ensure_selected_visible();
    }

+    fn update_history_viewport(&self, visible_rows: usize) {
+        self.history_visible_rows.set(visible_rows.max(1));
+        self.ensure_history_scroll_in_bounds();
+    }
+
+    fn scroll_history(&self, delta: isize) {
+        let max_scroll = self
+            .current_preview
+            .len()
+            .saturating_sub(self.history_visible_rows.get().max(1));
+        let current = self.history_scroll.get();
+        let next = if delta.is_negative() {
+            current.saturating_sub(delta.unsigned_abs())
+        } else {
+            current.saturating_add(delta as usize)
+        };
+        self.history_scroll.set(next.min(max_scroll));
+    }
+
+    fn ensure_history_scroll_in_bounds(&self) {
+        let max_scroll = self
+            .current_preview
+            .len()
+            .saturating_sub(self.history_visible_rows.get().max(1));
+        self.history_scroll
+            .set(self.history_scroll.get().min(max_scroll));
+    }
+
    fn ensure_selected_visible(&self) {
        if self.filtered.is_empty() {
            self.list_scroll.set(0);
@@ -245,11 +303,13 @@ impl SessionPickerView {
    fn refresh_preview(&mut self) {
        let Some(session) = self.selected_session() else {
            self.current_preview = vec!["No sessions found.".to_string()];
+            self.history_scroll.set(0);
            return;
        };

        if let Some(lines) = self.preview_cache.get(&session.id) {
            self.current_preview = lines.clone();
+            self.history_scroll.set(0);
            return;
        }

@@ -257,6 +317,7 @@ impl SessionPickerView {
            Ok(manager) => manager,
            Err(_) => {
                self.current_preview = vec!["Failed to open sessions directory.".to_string()];
+                self.history_scroll.set(0);
                return;
            }
        };
@@ -265,6 +326,7 @@ impl SessionPickerView {
            Ok(saved) => saved,
            Err(_) => {
                self.current_preview = vec!["Failed to load session preview.".to_string()];
+                self.history_scroll.set(0);
                return;
            }
        };
@@ -273,6 +335,7 @@ impl SessionPickerView {
        self.preview_cache
            .insert(session.id.clone(), preview.clone());
        self.current_preview = preview;
+        self.history_scroll.set(0);
    }
 }

@@ -338,11 +401,13 @@ impl ModalView for SessionPickerView {
                ViewAction::None
            }
            KeyCode::PageUp => {
-                self.move_selection(-5);
+                let rows = self.history_visible_rows.get().max(1);
+                self.scroll_history(-(rows as isize));
                ViewAction::None
            }
            KeyCode::PageDown => {
-                self.move_selection(5);
+                let rows = self.history_visible_rows.get().max(1);
+                self.scroll_history(rows as isize);
                ViewAction::None
            }
            KeyCode::Char('/') => {
@@ -367,6 +432,7 @@ impl ModalView for SessionPickerView {
                self.status = Some("Delete session? (y/n)".to_string());
                ViewAction::None
            }
+            KeyCode::Char(c) if self.select_visible_shortcut(c) => ViewAction::None,
            KeyCode::Enter => {
                if let Some(session) = self.selected_session() {
                    ViewAction::EmitAndClose(ViewEvent::SessionSelected {
@@ -390,16 +456,26 @@ impl ModalView for SessionPickerView {

        Clear.render(popup_area, buf);

+        let narrow = popup_area.width < 95;
        let chunks = Layout::default()
-            .direction(if popup_area.width < 95 {
+            .direction(if narrow {
                Direction::Vertical
            } else {
                Direction::Horizontal
            })
-            .constraints([Constraint::Percentage(45), Constraint::Percentage(55)])
+            .constraints(if narrow {
+                [Constraint::Percentage(42), Constraint::Percentage(58)]
+            } else {
+                [Constraint::Percentage(64), Constraint::Percentage(36)]
+            })
            .split(popup_area);
+        let (history_area, list_area) = if narrow {
+            (chunks[1], chunks[0])
+        } else {
+            (chunks[0], chunks[1])
+        };

-        let list_inner = modal_block(" Sessions ").inner(chunks[0]);
+        let list_inner = modal_block(" Sessions (1-9) ").inner(list_area);
        let header_rows = 1 + usize::from(self.confirm_delete || self.status.is_some());
        let footer_rows = usize::from(!self.filtered.is_empty());
        let visible_rows = usize::from(list_inner.height)
@@ -421,21 +497,19 @@ impl ModalView for SessionPickerView {
            self.status.as_deref(),
        );
        let list = Paragraph::new(list_lines)
-            .block(modal_block(" Sessions "))
+            .block(modal_block(" Sessions (1-9) "))
            .wrap(Wrap { trim: false });
-        list.render(chunks[0], buf);
+        list.render(list_area, buf);

-        let preview_inner = modal_block(" Preview ").inner(chunks[1]);
-        let preview_lines = format_preview(
-            &self.current_preview,
-            preview_inner.width,
-            preview_inner.height as usize,
-        );
+        let history_inner = modal_block(" History (PgUp/PgDn) ").inner(history_area);
+        self.update_history_viewport(history_inner.height as usize);
+        let preview_lines = format_preview(&self.current_preview);

        let preview = Paragraph::new(preview_lines)
-            .block(modal_block(" Preview "))
+            .block(modal_block(" History (PgUp/PgDn) "))
+            .scroll((self.history_scroll.get().min(u16::MAX as usize) as u16, 0))
            .wrap(Wrap { trim: false });
-        preview.render(chunks[1], buf);
+        preview.render(history_area, buf);
    }
 }

@@ -456,7 +530,9 @@ fn build_list_lines(
    let header = if search_mode {
        format!("/{}", search_input)
    } else {
-        format!("Sort: {sort_label} | / search | s sort | d delete")
+        format!(
+            "1-9 history | PgUp/PgDn scroll | Enter resume | / search | s sort | a all | d delete | Sort: {sort_label}"
+        )
    };
    lines.push(Line::from(Span::styled(
        truncate(&header, width),
@@ -486,7 +562,13 @@ fn build_list_lines(
    }

    for (idx, session) in sessions.iter().enumerate().skip(scroll).take(visible_rows) {
-        let mut line = format_session_line(session);
+        let slot = idx.saturating_sub(scroll).saturating_add(1);
+        let prefix = if slot <= 9 {
+            format!("{slot}. ")
+        } else {
+            "   ".to_string()
+        };
+        let mut line = format!("{prefix}{}", format_session_line(session));
        line = truncate(&line, width);
        let style = if idx == selected {
            Style::default()
@@ -516,7 +598,12 @@ fn build_list_lines(

 fn format_session_line(session: &SessionMetadata) -> String {
    let updated = format_relative_time(&session.updated_at);
-    let title = truncate(&session.title, 32);
+    let raw_title = extract_title(&session.title);
+    let title = if raw_title == "Session" {
+        truncate(crate::session_manager::truncate_id(&session.id), 32)
+    } else {
+        truncate(raw_title, 32)
+    };
    let mode = session
        .mode
        .as_deref()
@@ -534,7 +621,7 @@ fn format_session_line(session: &SessionMetadata) -> String {

 fn build_preview_lines(session: &SavedSession) -> Vec<String> {
    let mut out = Vec::new();
-    out.push(format!("Title: {}", session.metadata.title));
+    out.push(format!("Title: {}", extract_title(&session.metadata.title)));
    out.push(format!(
        "Updated: {}",
        session
@@ -552,26 +639,72 @@ fn build_preview_lines(session: &SavedSession) -> Vec<String> {
    }
    out.push("".to_string());

-    for message in session.messages.iter().take(6) {
-        let role = message.role.to_ascii_uppercase();
-        let mut text = String::new();
-        for block in &message.content {
-            if let crate::models::ContentBlock::Text { text: body, .. } = block {
-                text.push_str(body);
-            }
+    for message in &session.messages {
+        let text = message_text_for_history(message);
+        if text.trim().is_empty() {
+            continue;
        }
-        let preview = truncate(&text.replace('\n', " "), 120);
-        out.push(format!("{role}: {preview}"));
+        out.push(format!("{}:", message.role.to_ascii_uppercase()));
+        for line in text.lines() {
+            out.push(format!("  {line}"));
+        }
+        out.push(String::new());
+    }
+    if out.last().is_some_and(String::is_empty) {
+        out.pop();
    }
    out
 }

-fn format_preview(lines: &[String], width: u16, height: usize) -> Vec<Line<'static>> {
+fn message_text_for_history(message: &crate::models::Message) -> String {
+    let mut text = String::new();
+    for block in &message.content {
+        let part = match block {
+            crate::models::ContentBlock::Text { text: body, .. } => {
+                if message.role.eq_ignore_ascii_case("user") {
+                    extract_user_prompt(body).to_string()
+                } else {
+                    strip_thinking_tags(body)
+                }
+            }
+            crate::models::ContentBlock::Thinking { .. } => String::new(),
+            crate::models::ContentBlock::ToolUse { name, input, .. } => {
+                format!("tool call: {name} {}", truncate(&input.to_string(), 180))
+            }
+            crate::models::ContentBlock::ToolResult {
+                content, is_error, ..
+            } => {
+                let label = if is_error.unwrap_or(false) {
+                    "tool error"
+                } else {
+                    "tool result"
+                };
+                format!("{label}: {}", truncate(&content.replace('\n', " "), 220))
+            }
+            crate::models::ContentBlock::ServerToolUse { name, input, .. } => {
+                format!("server tool: {name} {}", truncate(&input.to_string(), 180))
+            }
+            crate::models::ContentBlock::ToolSearchToolResult { content, .. }
+            | crate::models::ContentBlock::CodeExecutionToolResult { content, .. } => {
+                format!("tool result: {}", truncate(&content.to_string(), 220))
+            }
+        };
+        let part = part.trim();
+        if !part.is_empty() {
+            if !text.is_empty() {
+                text.push('\n');
+            }
+            text.push_str(part);
+        }
+    }
+    text
+}
+
+fn format_preview(lines: &[String]) -> Vec<Line<'static>> {
    let mut out = Vec::new();
-    let available = height.saturating_sub(2).max(1);
-    for line in lines.iter().take(available) {
+    for line in lines {
        out.push(Line::from(Span::styled(
-            truncate(line, width),
+            line.clone(),
            Style::default().fg(palette::TEXT_PRIMARY),
        )));
    }
@@ -678,6 +811,28 @@ mod tests {
        s
    }

+    fn text_message(role: &str, text: &str) -> crate::models::Message {
+        crate::models::Message {
+            role: role.to_string(),
+            content: vec![crate::models::ContentBlock::Text {
+                text: text.to_string(),
+                cache_control: None,
+            }],
+        }
+    }
+
+    fn saved_session_with_messages(messages: Vec<crate::models::Message>) -> SavedSession {
+        let mut session = crate::session_manager::create_saved_session(
+            &messages,
+            "deepseek-v4-pro",
+            std::path::Path::new("/tmp"),
+            100,
+            None,
+        );
+        session.metadata.title = "<turn_meta>{}</turn_meta>\nClean session title".to_string();
+        session
+    }
+
    fn picker_with(sessions: Vec<SessionMetadata>, scope: Option<&str>) -> SessionPickerView {
        let workspace_scope = scope.map(PathBuf::from);
        let mut view = SessionPickerView {
@@ -686,6 +841,8 @@ mod tests {
            selected: 0,
            list_scroll: Cell::new(0),
            list_visible_rows: Cell::new(8),
+            history_scroll: Cell::new(0),
+            history_visible_rows: Cell::new(12),
            search_input: String::new(),
            search_mode: false,
            sort_mode: SortMode::Recent,
@@ -798,6 +955,85 @@ mod tests {
        assert!(span.style.add_modifier.contains(Modifier::BOLD));
    }

+    #[test]
+    fn build_list_lines_numbers_visible_rows_for_shortcuts() {
+        let sessions = vec![
+            test_session(1, "first session"),
+            test_session(2, "second session"),
+        ];
+        let lines = build_list_lines(&sessions, 0, 80, 0, 5, false, "", "recent", false, None);
+
+        let rendered = lines
+            .iter()
+            .flat_map(|line| line.spans.iter())
+            .map(|span| span.content.as_ref())
+            .collect::<Vec<_>>()
+            .join("\n");
+        assert!(rendered.contains("1. session-"));
+        assert!(rendered.contains("2. session-"));
+    }
+
+    #[test]
+    fn digit_shortcut_selects_visible_session_for_history() {
+        let sessions = vec![
+            test_session(1, "first session"),
+            test_session(2, "second session"),
+            test_session(3, "third session"),
+        ];
+        let mut view = picker_with(sessions, None);
+
+        assert!(view.select_visible_shortcut('2'));
+        assert_eq!(view.selected, 1);
+        assert!(
+            view.status
+                .as_deref()
+                .is_some_and(|status| status.contains("Opened history"))
+        );
+        assert!(!view.select_visible_shortcut('9'));
+    }
+
+    #[test]
+    fn history_scroll_pages_and_clamps() {
+        let mut view = picker_with(vec![test_session(1, "first")], None);
+        view.current_preview = (0..20).map(|idx| format!("line {idx}")).collect();
+        view.history_visible_rows.set(5);
+
+        view.scroll_history(6);
+        assert_eq!(view.history_scroll.get(), 6);
+        view.scroll_history(100);
+        assert_eq!(view.history_scroll.get(), 15);
+        view.scroll_history(-200);
+        assert_eq!(view.history_scroll.get(), 0);
+    }
+
+    #[test]
+    fn build_preview_lines_shows_full_clean_history() {
+        let messages = vec![
+            text_message(
+                "user",
+                "<turn_meta>{\"cache\":\"x\"}</turn_meta>\nFirst visible prompt",
+            ),
+            text_message(
+                "assistant",
+                "<thinking>hidden reasoning</thinking>\nFirst visible answer",
+            ),
+            text_message("user", "Second prompt"),
+            text_message("assistant", "Second answer"),
+            text_message("user", "Third prompt"),
+            text_message("assistant", "Third answer"),
+            text_message("user", "Fourth prompt beyond old six-message preview"),
+        ];
+        let session = saved_session_with_messages(messages);
+        let lines = build_preview_lines(&session).join("\n");
+
+        assert!(lines.contains("Title: Clean session title"));
+        assert!(lines.contains("First visible prompt"));
+        assert!(lines.contains("First visible answer"));
+        assert!(lines.contains("Fourth prompt beyond old six-message preview"));
+        assert!(!lines.contains("turn_meta"));
+        assert!(!lines.contains("hidden reasoning"));
+    }
+
    #[test]
    fn ensure_selected_visible_updates_scroll_window() {
        let sessions = (0..10)
@@ -810,6 +1046,8 @@ mod tests {
            selected: 0,
            list_scroll: Cell::new(0),
            list_visible_rows: Cell::new(3),
+            history_scroll: Cell::new(0),
+            history_visible_rows: Cell::new(12),
            search_input: String::new(),
            search_mode: false,
            sort_mode: SortMode::Recent,
@@ -142,7 +142,7 @@ pub(super) fn handle_subagent_mailbox(app: &mut App, seq: u64, message: &Mailbox
    };

    let dispatch_kind = app.pending_subagent_dispatch.as_deref();
-    let is_fanout = matches!(dispatch_kind, Some("rlm"));
+    let is_fanout = matches!(dispatch_kind, Some("rlm_open" | "rlm_eval" | "rlm"));

    if is_fanout {
        // Reuse the active fanout card for sibling spawns; otherwise create
@@ -154,7 +154,7 @@ pub(super) fn handle_subagent_mailbox(app: &mut App, seq: u64, message: &Mailbox
            card.claim_pending_worker(&agent_id, AgentLifecycle::Running);
            app.subagent_card_index.insert(agent_id, idx);
        } else {
-            let mut card = FanoutCard::new(dispatch_kind.unwrap_or("rlm").to_string());
+            let mut card = FanoutCard::new(dispatch_kind.unwrap_or("rlm_eval").to_string());
            card.upsert_worker(&agent_id, AgentLifecycle::Running);
            app.add_message(HistoryCell::SubAgent(SubAgentCell::Fanout(card)));
            let idx = app.history.len().saturating_sub(1);
@@ -102,7 +102,8 @@ use super::approval::{
    ApprovalMode, ApprovalRequest, ApprovalView, ElevationRequest, ElevationView, ReviewDecision,
 };
 use super::history::{
-    HistoryCell, ToolCell, ToolStatus, history_cells_from_message, summarize_tool_output,
+    HistoryCell, ToolCell, ToolStatus, TranscriptRenderOptions, history_cells_from_message,
+    summarize_tool_output,
 };
 use super::slash_menu::{
    apply_slash_menu_selection, try_autocomplete_slash_command, visible_slash_menu_entries,
@@ -128,6 +129,7 @@ const SLASH_MENU_LIMIT: usize = 128;
 const MENTION_MENU_LIMIT: usize = 6;
 const MIN_CHAT_HEIGHT: u16 = 3;
 const MIN_COMPOSER_HEIGHT: u16 = 2;
+const COMPOSER_ARROW_SCROLL_LINES: usize = 3;
 const CONTEXT_WARNING_THRESHOLD_PERCENT: f64 = 85.0;
 const CONTEXT_CRITICAL_THRESHOLD_PERCENT: f64 = 95.0;
 const UI_IDLE_POLL_MS: u64 = 48;
@@ -433,6 +435,8 @@ pub async fn run_tui(config: &Config, options: TuiOptions) -> Result<()> {
        // #456: plumb the App's HookExecutor so `exec_shell` can surface
        // the configured `shell_env` hooks. Wrapped in Arc once and shared.
        hook_executor: Some(std::sync::Arc::new(app.hooks.clone())),
+        handle_store: app.runtime_services.handle_store.clone(),
+        rlm_sessions: app.runtime_services.rlm_sessions.clone(),
    };
    refresh_active_task_panel(&mut app, &task_manager).await;

@@ -687,7 +691,11 @@ fn active_rlm_task_entries(app: &App) -> Vec<TaskPanelEntry> {
            let HistoryCell::Tool(ToolCell::Generic(generic)) = entry else {
                return None;
            };
-            if generic.name != "rlm" || generic.status != ToolStatus::Running {
+            if !matches!(
+                generic.name.as_str(),
+                "rlm_open" | "rlm_eval" | "rlm_configure" | "rlm_close" | "rlm"
+            ) || generic.status != ToolStatus::Running
+            {
                return None;
            }
            let summary = generic
@@ -1074,9 +1082,17 @@ async fn run_event_loop(
                        // Note this dispatch so the next sub-agent `Started`
                        // mailbox envelope routes into the right card kind
                        // (delegate vs fanout).
-                        if matches!(name.as_str(), "agent_spawn" | "rlm" | "delegate") {
+                        if matches!(
+                            name.as_str(),
+                            "agent_open"
+                                | "agent_spawn"
+                                | "rlm_open"
+                                | "rlm_eval"
+                                | "rlm"
+                                | "delegate"
+                        ) {
                            app.pending_subagent_dispatch = Some(name.clone());
-                            if name == "rlm" {
+                            if matches!(name.as_str(), "rlm_open" | "rlm_eval" | "rlm") {
                                // New fanout invocation — children should
                                // group under a fresh card, not the
                                // previous fanout's leftover.
@@ -1115,7 +1131,9 @@ async fn run_event_loop(
                        // poll. Also merge shell jobs (#373).
                        if matches!(
                            name.as_str(),
-                            "agent_spawn"
+                            "agent_open"
+                                | "agent_spawn"
+                                | "agent_close"
                                | "agent_cancel"
                                | "todo_write"
                                | "task_shell_start"
@@ -1126,7 +1144,9 @@ async fn run_event_loop(
                        }
                        if matches!(
                            name.as_str(),
-                            "agent_spawn"
+                            "agent_open"
+                                | "agent_eval"
+                                | "agent_close"
                                | "agent_cancel"
                                | "agent_wait"
                                | "agent_result"
@@ -2028,9 +2048,7 @@ async fn run_event_loop(
            if app.use_mouse_capture
                && let Event::Mouse(mouse) = evt
            {
-                if app.is_loading
-                    && matches!(mouse.kind, MouseEventKind::Moved | MouseEventKind::Drag(_))
-                {
+                if should_drop_loading_mouse_motion(app, mouse) {
                    continue;
                }
                let events = handle_mouse_event(app, mouse);
@@ -2470,7 +2488,7 @@ async fn run_event_loop(
                KeyCode::Char('o')
                    if key.modifiers.contains(KeyModifiers::CONTROL)
                        && app.input.is_empty()
-                        && open_thinking_pager(app) =>
+                        && open_activity_detail_pager(app) =>
                {
                    continue;
                }
@@ -2482,8 +2500,8 @@ async fn run_event_loop(
                }
                KeyCode::Char('1') if key.modifiers.contains(KeyModifiers::ALT) => {
                    if key.modifiers.contains(KeyModifiers::CONTROL) {
-                        app.set_sidebar_focus(SidebarFocus::Plan);
-                        app.status_message = Some("Sidebar focus: plan".to_string());
+                        app.set_sidebar_focus(SidebarFocus::Work);
+                        app.status_message = Some("Sidebar focus: work".to_string());
                    } else {
                        app.set_mode(AppMode::Plan);
                    }
@@ -2491,8 +2509,8 @@ async fn run_event_loop(
                }
                KeyCode::Char('2') if key.modifiers.contains(KeyModifiers::ALT) => {
                    if key.modifiers.contains(KeyModifiers::CONTROL) {
-                        app.set_sidebar_focus(SidebarFocus::Todos);
-                        app.status_message = Some("Sidebar focus: todos".to_string());
+                        app.set_sidebar_focus(SidebarFocus::Tasks);
+                        app.status_message = Some("Sidebar focus: tasks".to_string());
                    } else {
                        app.set_mode(AppMode::Agent);
                    }
@@ -2500,8 +2518,8 @@ async fn run_event_loop(
                }
                KeyCode::Char('3') if key.modifiers.contains(KeyModifiers::ALT) => {
                    if key.modifiers.contains(KeyModifiers::CONTROL) {
-                        app.set_sidebar_focus(SidebarFocus::Tasks);
-                        app.status_message = Some("Sidebar focus: tasks".to_string());
+                        app.set_sidebar_focus(SidebarFocus::Agents);
+                        app.status_message = Some("Sidebar focus: agents".to_string());
                    } else {
                        app.set_mode(AppMode::Yolo);
                    }
@@ -2512,26 +2530,23 @@ async fn run_event_loop(
                    continue;
                }
                KeyCode::Char('!') if key.modifiers.contains(KeyModifiers::ALT) => {
-                    app.set_sidebar_focus(SidebarFocus::Plan);
-                    app.status_message = Some("Sidebar focus: plan".to_string());
+                    app.set_sidebar_focus(SidebarFocus::Work);
+                    app.status_message = Some("Sidebar focus: work".to_string());
                    continue;
                }
                KeyCode::Char('@') if key.modifiers.contains(KeyModifiers::ALT) => {
-                    app.set_sidebar_focus(SidebarFocus::Todos);
-                    app.status_message = Some("Sidebar focus: todos".to_string());
-                    continue;
-                }
-                KeyCode::Char('#') if key.modifiers.contains(KeyModifiers::ALT) => {
                    app.set_sidebar_focus(SidebarFocus::Tasks);
                    app.status_message = Some("Sidebar focus: tasks".to_string());
                    continue;
                }
-                KeyCode::Char('$') if key.modifiers.contains(KeyModifiers::ALT) => {
+                KeyCode::Char('#') if key.modifiers.contains(KeyModifiers::ALT) => {
                    app.set_sidebar_focus(SidebarFocus::Agents);
                    app.status_message = Some("Sidebar focus: agents".to_string());
                    continue;
                }
-                KeyCode::Char('%') if key.modifiers.contains(KeyModifiers::ALT) => {
+                KeyCode::Char('$') | KeyCode::Char('%')
+                    if key.modifiers.contains(KeyModifiers::ALT) =>
+                {
                    app.set_sidebar_focus(SidebarFocus::Context);
                    app.status_message = Some("Sidebar focus: context".to_string());
                    continue;
@@ -4122,7 +4137,7 @@ fn handle_composer_history_arrow(
    match key.code {
        KeyCode::Up => {
            if scroll_on_empty {
-                app.scroll_up(1);
+                app.scroll_up(COMPOSER_ARROW_SCROLL_LINES);
            } else {
                app.history_up();
            }
@@ -4130,7 +4145,7 @@ fn handle_composer_history_arrow(
        }
        KeyCode::Down => {
            if scroll_on_empty {
-                app.scroll_down(1);
+                app.scroll_down(COMPOSER_ARROW_SCROLL_LINES);
            } else {
                app.history_down();
            }
@@ -5151,22 +5166,6 @@ async fn apply_command_result(
                let queued = build_queued_message(app, content);
                submit_or_steer_message(app, config, engine_handle, queued).await?;
            }
-            AppAction::Rlm {
-                prompt,
-                model,
-                child_model,
-                max_depth,
-            } => {
-                app.status_message = Some("RLM turn starting...".to_string());
-                let _ = engine_handle
-                    .send(Op::Rlm {
-                        content: prompt,
-                        model,
-                        child_model,
-                        max_depth,
-                    })
-                    .await;
-            }
            AppAction::ListSubAgents => {
                let _ = engine_handle.send(Op::ListSubAgents).await;
            }
@@ -7688,7 +7687,7 @@ fn collect_active_tool_status(cell: &HistoryCell, snapshot: &mut ActiveToolStatu
            // status. RLM is different today: it is a foreground tool call,
            // so keep it in the live tool footer until the async RLM
            // workbench lands (#513).
-            if generic.name == "agent_spawn" {
+            if matches!(generic.name.as_str(), "agent_open" | "agent_spawn") {
                return;
            }
            snapshot.record(format!("tool {}", generic.name), generic.status, None);
@@ -8381,6 +8380,21 @@ pub(crate) fn truncate_line_to_width(text: &str, max_width: usize) -> String {
    out
 }

+fn should_drop_loading_mouse_motion(app: &App, mouse: MouseEvent) -> bool {
+    if !app.is_loading {
+        return false;
+    }
+
+    match mouse.kind {
+        MouseEventKind::Moved => true,
+        MouseEventKind::Drag(_) => {
+            !app.viewport.transcript_selection.dragging
+                && !app.viewport.transcript_scrollbar_dragging
+        }
+        _ => false,
+    }
+}
+
 fn handle_mouse_event(app: &mut App, mouse: MouseEvent) -> Vec<ViewEvent> {
    if app.view_stack.top_kind() == Some(ModalKind::ContextMenu) {
        if matches!(mouse.kind, MouseEventKind::Down(MouseButton::Right)) {
@@ -8399,7 +8413,10 @@ fn handle_mouse_event(app: &mut App, mouse: MouseEvent) -> Vec<ViewEvent> {
    match mouse.kind {
        MouseEventKind::ScrollUp => {
            let update = app.viewport.mouse_scroll.on_scroll(ScrollDirection::Up);
-            app.viewport.pending_scroll_delta += update.delta_lines;
+            app.viewport.pending_scroll_delta = app
+                .viewport
+                .pending_scroll_delta
+                .saturating_add(update.delta_lines);
            if update.delta_lines != 0 {
                app.user_scrolled_during_stream = true;
                app.needs_redraw = true;
@@ -8407,7 +8424,10 @@ fn handle_mouse_event(app: &mut App, mouse: MouseEvent) -> Vec<ViewEvent> {
        }
        MouseEventKind::ScrollDown => {
            let update = app.viewport.mouse_scroll.on_scroll(ScrollDirection::Down);
-            app.viewport.pending_scroll_delta += update.delta_lines;
+            app.viewport.pending_scroll_delta = app
+                .viewport
+                .pending_scroll_delta
+                .saturating_add(update.delta_lines);
            if update.delta_lines != 0 {
                app.user_scrolled_during_stream = true;
                app.needs_redraw = true;
@@ -9003,21 +9023,68 @@ fn open_pager_for_last_message(app: &mut App) -> bool {
    true
 }

-/// Open a pager showing the full thinking block. Targets the cell at the
-/// current selection if it's a Thinking cell; otherwise falls back to the
-/// most recent Thinking cell across the virtual transcript (history +
-/// in-flight `active_cell`). Bound to Ctrl+O so users can read reasoning
-/// content that's been collapsed in calm-mode rendering.
-///
-/// The virtual-index lookup matters: after `ThinkingComplete` fires the
-/// finalized thinking entry sits in `active_cell` with `streaming = false`
-/// until the active cell flushes to history. During that window the
-/// transcript already renders the "thinking collapsed; press Ctrl+O for
-/// full text" affordance, so the handler must address active-cell entries
-/// or the affordance becomes a lie.
+/// Compatibility wrapper for the old test name. The user-facing Ctrl+O
+/// surface is now Activity Detail, not a thinking-only pager.
+#[cfg(test)]
 fn open_thinking_pager(app: &mut App) -> bool {
-    let selected_cell = app
+    open_activity_detail_pager(app)
+}
+
+/// Open a pager for the activity the user is most likely asking about.
+///
+/// Ctrl+O uses this path. It prefers an explicitly selected activity cell,
+/// then a live activity in the current turn, then the most recent meaningful
+/// activity across history + active cells. Tool activity is intentionally
+/// rendered through the compact live view so Activity Detail does not become
+/// an accidental raw-output dump; Alt+V remains the direct full tool-detail
+/// surface.
+fn open_activity_detail_pager(app: &mut App) -> bool {
+    let Some(idx) = activity_target_cell_index(app) else {
+        app.status_message = Some("No activity detail available".to_string());
+        return true;
+    };
+
+    let width = app
        .viewport
+        .last_transcript_area
+        .map(|area| area.width)
+        .unwrap_or(80);
+    let Some(text) = activity_detail_text(app, idx, width) else {
+        app.status_message = Some("No activity detail available".to_string());
+        return true;
+    };
+    let title = if matches!(
+        app.cell_at_virtual_index(idx),
+        Some(HistoryCell::Thinking { .. })
+    ) {
+        "Reasoning Timeline"
+    } else {
+        "Activity Detail"
+    };
+    app.view_stack
+        .push(PagerView::from_text(title, &text, width.saturating_sub(2)));
+    true
+}
+
+fn activity_target_cell_index(app: &App) -> Option<usize> {
+    if let Some(selected) = selected_transcript_cell_index(app)
+        && app
+            .cell_at_virtual_index(selected)
+            .is_some_and(is_meaningful_activity_cell)
+    {
+        return Some(selected);
+    }
+
+    current_activity_cell_index(app).or_else(|| {
+        (0..app.virtual_cell_count()).rev().find(|&idx| {
+            app.cell_at_virtual_index(idx)
+                .is_some_and(is_meaningful_activity_cell)
+        })
+    })
+}
+
+fn selected_transcript_cell_index(app: &App) -> Option<usize> {
+    app.viewport
        .transcript_selection
        .ordered_endpoints()
        .and_then(|(start, _)| {
@@ -9028,45 +9095,315 @@ fn open_thinking_pager(app: &mut App) -> bool {
                .and_then(|meta| meta.cell_line())
                .map(|(cell_index, _)| cell_index)
        })
+}
+
+fn current_activity_cell_index(app: &App) -> Option<usize> {
+    let active = app.active_cell.as_ref()?;
+    let base = app.history.len();
+    for desired_rank in [0, 1, 2] {
+        if let Some((entry_idx, _)) = active
+            .entries()
+            .iter()
+            .enumerate()
+            .rev()
+            .find(|(_, cell)| activity_cell_rank(cell) == Some(desired_rank))
+        {
+            return Some(base + entry_idx);
+        }
+    }
+    None
+}
+
+fn is_meaningful_activity_cell(cell: &HistoryCell) -> bool {
+    activity_cell_rank(cell).is_some()
+}
+
+fn activity_cell_rank(cell: &HistoryCell) -> Option<u8> {
+    match cell {
+        HistoryCell::Thinking {
+            streaming: true, ..
+        } => Some(0),
+        HistoryCell::Tool(tool) => match tool_status_for_activity(tool) {
+            Some(ToolStatus::Running) => Some(0),
+            Some(ToolStatus::Failed) => Some(1),
+            Some(ToolStatus::Success) => Some(2),
+            None => Some(2),
+        },
+        HistoryCell::SubAgent(_) => Some(0),
+        HistoryCell::Error { .. } => Some(1),
+        HistoryCell::Thinking { .. } => Some(2),
+        _ => None,
+    }
+}
+
+fn activity_detail_text(app: &App, cell_index: usize, width: u16) -> Option<String> {
+    let cell = app.cell_at_virtual_index(cell_index)?;
+    if matches!(cell, HistoryCell::Thinking { .. }) {
+        return reasoning_timeline_text(app, cell_index);
+    }
+
+    let mut sections = Vec::new();
+
+    if let Some(turn_id) = app.runtime_turn_id.as_ref() {
+        let status = app.runtime_turn_status.as_deref().unwrap_or("in progress");
+        sections.push(format!(
+            "Turn: {} ({status})",
+            truncate_line_to_width(turn_id, 24)
+        ));
+    }
+
+    sections.push(format!(
+        "Activity: {}",
+        activity_cell_label(app, cell_index, cell)
+    ));
+
+    if let Some(status) = activity_status_line(cell) {
+        sections.push(status);
+    }
+
+    if let Some((position, total)) = thinking_chunk_position(app, cell_index) {
+        sections.push(format!("Thinking chunk: {position} of {total}"));
+    }
+
+    sections.push(String::new());
+    sections.push(activity_cell_to_text(cell, width));
+    Some(sections.join("\n"))
+}
+
+fn reasoning_timeline_text(app: &App, selected_cell_index: usize) -> Option<String> {
+    let thinking_indices: Vec<usize> = (0..app.virtual_cell_count())
        .filter(|&idx| {
            matches!(
                app.cell_at_virtual_index(idx),
-                Some(crate::tui::history::HistoryCell::Thinking { .. })
-            )
-        });
-
-    let target_idx = selected_cell.or_else(|| {
-        (0..app.virtual_cell_count()).rev().find(|&idx| {
-            matches!(
-                app.cell_at_virtual_index(idx),
-                Some(crate::tui::history::HistoryCell::Thinking { .. })
+                Some(HistoryCell::Thinking { .. })
            )
        })
+        .collect();
+    if thinking_indices.is_empty() {
+        return None;
+    }
+
+    let selected_position = thinking_indices
+        .iter()
+        .position(|&idx| idx == selected_cell_index)
+        .map(|idx| idx + 1);
+    let total = thinking_indices.len();
+    let running = thinking_indices.iter().any(|&idx| {
+        matches!(
+            app.cell_at_virtual_index(idx),
+            Some(HistoryCell::Thinking {
+                streaming: true,
+                ..
+            })
+        )
    });

-    let Some(idx) = target_idx else {
-        app.status_message = Some("No thinking blocks to expand".to_string());
-        return true;
-    };
-
-    let width = app
-        .viewport
-        .last_transcript_area
-        .map(|area| area.width)
-        .unwrap_or(80);
-    let text = {
-        let Some(cell) = app.cell_at_virtual_index(idx) else {
-            app.status_message = Some("No thinking blocks to expand".to_string());
-            return true;
-        };
-        history_cell_to_text(cell, width)
-    };
-    app.view_stack.push(PagerView::from_text(
-        "Thinking",
-        &text,
-        width.saturating_sub(2),
+    let mut sections = Vec::new();
+    if let Some(turn_id) = app.runtime_turn_id.as_ref() {
+        let status = app.runtime_turn_status.as_deref().unwrap_or("in progress");
+        sections.push(format!(
+            "Turn: {} ({status})",
+            truncate_line_to_width(turn_id, 24)
+        ));
+    }
+    sections.push("Activity: reasoning timeline".to_string());
+    sections.push(format!(
+        "Status: {} · {total} chunk{}",
+        if running { "running" } else { "done" },
+        if total == 1 { "" } else { "s" }
    ));
-    true
+    if let Some(position) = selected_position {
+        sections.push(format!("Selected chunk: {position} of {total}"));
+    }
+    sections.push(String::new());
+
+    for (position, cell_index) in thinking_indices.iter().copied().enumerate() {
+        let Some(HistoryCell::Thinking {
+            content,
+            streaming,
+            duration_secs,
+        }) = app.cell_at_virtual_index(cell_index)
+        else {
+            continue;
+        };
+        let position = position + 1;
+        let marker = if Some(position) == selected_position {
+            " (selected)"
+        } else {
+            ""
+        };
+        let mut status = if *streaming {
+            "running".to_string()
+        } else {
+            "done".to_string()
+        };
+        if let Some(duration_secs) = duration_secs {
+            status.push_str(" · ");
+            status.push_str(&format!("{duration_secs:.1}s"));
+        }
+        sections.push(format!("Thinking chunk {position} of {total}{marker}"));
+        sections.push(format!("Status: {status}"));
+        let body = content.trim();
+        if body.is_empty() {
+            sections.push("(no reasoning text recorded)".to_string());
+        } else {
+            sections.push(body.to_string());
+        }
+        sections.push(String::new());
+    }
+
+    Some(sections.join("\n"))
+}
+
+fn activity_cell_label(app: &App, cell_index: usize, cell: &HistoryCell) -> String {
+    match cell {
+        HistoryCell::Thinking { .. } => "thinking".to_string(),
+        HistoryCell::Error { .. } => "error".to_string(),
+        HistoryCell::SubAgent(_) => "sub-agent".to_string(),
+        HistoryCell::Tool(_) => {
+            detail_target_label(app, cell_index).unwrap_or_else(|| "tool activity".to_string())
+        }
+        _ => "message".to_string(),
+    }
+}
+
+fn activity_status_line(cell: &HistoryCell) -> Option<String> {
+    match cell {
+        HistoryCell::Thinking {
+            streaming,
+            duration_secs,
+            ..
+        } => {
+            let mut line = if *streaming {
+                "Status: running".to_string()
+            } else {
+                "Status: done".to_string()
+            };
+            if let Some(duration_secs) = duration_secs {
+                line.push_str(" · ");
+                line.push_str(&format!("{duration_secs:.1}s"));
+            }
+            Some(line)
+        }
+        HistoryCell::Tool(tool) => {
+            let status = tool_status_for_activity(tool)?;
+            let mut line = format!("Status: {}", activity_status_label(status));
+            if let Some(duration_ms) = tool_duration_for_activity(tool) {
+                line.push_str(" · ");
+                line.push_str(&format_activity_duration_ms(duration_ms));
+            }
+            Some(line)
+        }
+        HistoryCell::Error { severity, .. } => Some(format!("Status: {:?}", severity)),
+        HistoryCell::SubAgent(_) => None,
+        _ => None,
+    }
+}
+
+fn tool_status_for_activity(tool: &ToolCell) -> Option<ToolStatus> {
+    match tool {
+        ToolCell::Exec(cell) => Some(cell.status),
+        ToolCell::Exploring(cell) => {
+            if cell
+                .entries
+                .iter()
+                .any(|entry| entry.status == ToolStatus::Running)
+            {
+                Some(ToolStatus::Running)
+            } else if cell
+                .entries
+                .iter()
+                .any(|entry| entry.status == ToolStatus::Failed)
+            {
+                Some(ToolStatus::Failed)
+            } else {
+                Some(ToolStatus::Success)
+            }
+        }
+        ToolCell::PlanUpdate(cell) => Some(cell.status),
+        ToolCell::PatchSummary(cell) => Some(cell.status),
+        ToolCell::Review(cell) => Some(cell.status),
+        ToolCell::DiffPreview(_) => Some(ToolStatus::Success),
+        ToolCell::Mcp(cell) => Some(cell.status),
+        ToolCell::ViewImage(_) => Some(ToolStatus::Success),
+        ToolCell::WebSearch(cell) => Some(cell.status),
+        ToolCell::Generic(cell) => Some(cell.status),
+    }
+}
+
+fn tool_duration_for_activity(tool: &ToolCell) -> Option<u64> {
+    match tool {
+        ToolCell::Exec(cell) => cell.duration_ms.or_else(|| {
+            (cell.status == ToolStatus::Running).then(|| {
+                u64::try_from(
+                    cell.started_at
+                        .map(|started| started.elapsed().as_millis())
+                        .unwrap_or_default(),
+                )
+                .unwrap_or(u64::MAX)
+            })
+        }),
+        _ => None,
+    }
+}
+
+fn activity_status_label(status: ToolStatus) -> &'static str {
+    match status {
+        ToolStatus::Running => "running",
+        ToolStatus::Success => "done",
+        ToolStatus::Failed => "failed",
+    }
+}
+
+fn format_activity_duration_ms(ms: u64) -> String {
+    if ms < 1000 {
+        format!("{ms}ms")
+    } else {
+        format!("{:.1}s", ms as f64 / 1000.0)
+    }
+}
+
+fn thinking_chunk_position(app: &App, cell_index: usize) -> Option<(usize, usize)> {
+    if !matches!(
+        app.cell_at_virtual_index(cell_index),
+        Some(HistoryCell::Thinking { .. })
+    ) {
+        return None;
+    }
+
+    let mut total = 0usize;
+    let mut position = None;
+    for idx in 0..app.virtual_cell_count() {
+        if matches!(
+            app.cell_at_virtual_index(idx),
+            Some(HistoryCell::Thinking { .. })
+        ) {
+            total += 1;
+            if idx == cell_index {
+                position = Some(total);
+            }
+        }
+    }
+    position.map(|pos| (pos, total))
+}
+
+fn activity_cell_to_text(cell: &HistoryCell, width: u16) -> String {
+    let lines = match cell {
+        HistoryCell::Tool(_) => cell.lines_with_options(
+            width,
+            TranscriptRenderOptions {
+                calm_mode: true,
+                low_motion: true,
+                ..TranscriptRenderOptions::default()
+            },
+        ),
+        _ => cell.transcript_lines(width),
+    };
+    lines
+        .iter()
+        .map(line_to_plain)
+        .collect::<Vec<_>>()
+        .join("\n")
 }

 fn open_tool_details_pager(app: &mut App) -> bool {
@@ -9233,19 +9570,44 @@ fn selected_detail_footer_label(app: &App) -> Option<String> {
    if app.viewport.transcript_selection.is_active() {
        return None;
    }
-    let cell_index = app.detail_cell_index_for_viewport(
-        app.viewport.last_transcript_top,
-        app.viewport.last_transcript_visible.max(1),
-        app.viewport.transcript_cache.line_meta(),
-    )?;
-    let label = detail_target_label(app, cell_index)?;
+    let cell_index = activity_footer_target_cell_index(app)?;
+    let cell = app.cell_at_virtual_index(cell_index)?;
+    let label = truncate_line_to_width(&activity_cell_label(app, cell_index, cell), 30);
+    let raw_hint = if app.cell_has_detail_target(cell_index) {
+        format!(" · {} raw", tool_details_shortcut_label())
+    } else {
+        String::new()
+    };
    Some(format!(
-        "{} details: {}",
-        tool_details_shortcut_label(),
-        truncate_line_to_width(&label, 34)
+        "{} Activity: {label}{raw_hint}",
+        activity_shortcut_label()
    ))
 }

+fn activity_footer_target_cell_index(app: &App) -> Option<usize> {
+    let line_meta = app.viewport.transcript_cache.line_meta();
+    let start = app
+        .viewport
+        .last_transcript_top
+        .min(line_meta.len().saturating_sub(1));
+    let end = start
+        .saturating_add(app.viewport.last_transcript_visible.max(1))
+        .min(line_meta.len());
+    for meta in line_meta.iter().take(end).skip(start) {
+        let Some((cell_index, _)) = meta.cell_line() else {
+            continue;
+        };
+        if app
+            .cell_at_virtual_index(cell_index)
+            .is_some_and(is_meaningful_activity_cell)
+        {
+            return Some(cell_index);
+        }
+    }
+
+    activity_target_cell_index(app)
+}
+
 fn detail_target_label(app: &App, cell_index: usize) -> Option<String> {
    if let Some(detail) = app.tool_detail_record_for_cell(cell_index) {
        return Some(detail.tool_name.clone());
@@ -9322,6 +9684,10 @@ fn tool_details_shortcut_label() -> &'static str {
    }
 }

+fn activity_shortcut_label() -> &'static str {
+    "Ctrl+O"
+}
+
 /// Modifier predicate for the v0.8.30 family of `Alt+<letter>` transcript-
 /// nav shortcuts (`Alt+G` / `Alt+Shift+G` / `Alt+[` / `Alt+]` / `Alt+?` /
 /// `Alt+L` / `Alt+V`). Requires `Alt` and disallows `Ctrl` / `Super` so the
@@ -361,7 +361,14 @@ fn selection_to_text_copies_rendered_transcript_block() {
    let selected = selection_to_text(&app).expect("selection text");
    assert!(selected.contains("Note copy system"), "{selected:?}");
    assert!(selected.contains("copy user"), "{selected:?}");
-    assert!(selected.contains("copy thinking"), "{selected:?}");
+    assert!(
+        !selected.contains("copy thinking"),
+        "raw completed thinking should stay out of live selection text: {selected:?}"
+    );
+    assert!(
+        selected.contains("Ctrl+O"),
+        "selection should keep the reasoning detail affordance: {selected:?}"
+    );
    assert!(selected.contains("tool output line"), "{selected:?}");
    assert!(selected.contains("copy assistant"), "{selected:?}");
    // #1163: tool-card middle lines are rendered with a `│ ` left rail
@@ -449,6 +456,35 @@ fn mouse_selection_autocopies_on_release_without_ctrl_c() {
    );
 }

+#[test]
+fn loading_mouse_filter_keeps_active_drags() {
+    let mut app = create_test_app();
+    app.is_loading = true;
+
+    let moved = MouseEvent {
+        kind: MouseEventKind::Moved,
+        column: 3,
+        row: 2,
+        modifiers: KeyModifiers::NONE,
+    };
+    let drag = MouseEvent {
+        kind: MouseEventKind::Drag(MouseButton::Left),
+        column: 5,
+        row: 2,
+        modifiers: KeyModifiers::NONE,
+    };
+
+    assert!(should_drop_loading_mouse_motion(&app, moved));
+    assert!(should_drop_loading_mouse_motion(&app, drag));
+
+    app.viewport.transcript_selection.dragging = true;
+    assert!(!should_drop_loading_mouse_motion(&app, drag));
+
+    app.viewport.transcript_selection.dragging = false;
+    app.viewport.transcript_scrollbar_dragging = true;
+    assert!(!should_drop_loading_mouse_motion(&app, drag));
+}
+
 #[test]
 fn jump_to_latest_button_click_scrolls_to_tail() {
    let mut app = create_test_app();
@@ -1081,7 +1117,9 @@ fn create_test_app() -> App {
        notes_path: PathBuf::from("notes.txt"),
        mcp_config_path: PathBuf::from("mcp.json"),
        use_memory: false,
-        start_in_agent_mode: false,
+        // Keep UI tests independent from the developer's saved
+        // `default_mode` setting.
+        start_in_agent_mode: true,
        skip_onboarding: false,
        yolo: false,
        resume_session_id: None,
@@ -1106,7 +1144,9 @@ fn create_test_options() -> TuiOptions {
        notes_path: PathBuf::from("notes.txt"),
        mcp_config_path: PathBuf::from("mcp.json"),
        use_memory: false,
-        start_in_agent_mode: false,
+        // Keep UI tests independent from the developer's saved
+        // `default_mode` setting.
+        start_in_agent_mode: true,
        skip_onboarding: false,
        yolo: false,
        resume_session_id: None,
@@ -1732,7 +1772,10 @@ fn make_subagent(
    status: crate::tools::subagent::SubAgentStatus,
 ) -> crate::tools::subagent::SubAgentResult {
    crate::tools::subagent::SubAgentResult {
+        name: id.to_string(),
        agent_id: id.to_string(),
+        context_mode: "fresh".to_string(),
+        fork_context: false,
        agent_type: crate::tools::subagent::SubAgentType::General,
        assignment: crate::tools::subagent::SubAgentAssignment {
            objective: format!("objective-{id}"),
@@ -1898,6 +1941,8 @@ fn event_poll_timeout_has_nonzero_floor() {
 fn footer_status_line_spans_show_mode_and_model_idle_and_active() {
    let mut app = create_test_app();
    app.model = "deepseek-v4-flash".to_string();
+    // Pin Agent mode regardless of user settings on the host machine.
+    let _ = app.set_mode(crate::tui::app::AppMode::Agent);

    let idle = spans_text(&footer_status_line_spans(&app, 60));
    assert!(idle.contains("agent"));
@@ -2762,6 +2807,15 @@ fn first_line_for_cell(app: &App, cell_index: usize) -> usize {
        .expect("cell should have rendered line")
 }

+fn pop_pager_body(app: &mut App) -> String {
+    let mut view = app.view_stack.pop().expect("pager view");
+    let pager = view
+        .as_any_mut()
+        .downcast_mut::<PagerView>()
+        .expect("top view should be pager");
+    pager.body_text()
+}
+
 #[test]
 fn detail_target_prefers_visible_tool_card() {
    let mut app = create_test_app();
@@ -2824,13 +2878,42 @@ fn detail_target_prefers_visible_tool_card() {
    app.viewport.last_transcript_visible = 6;

    assert_eq!(detail_target_cell_index(&app), Some(1));
-    let expected = format!("{} details: file_search", tool_details_shortcut_label());
+    let expected = format!(
+        "{} Activity: file_search · {} raw",
+        activity_shortcut_label(),
+        tool_details_shortcut_label()
+    );
    assert_eq!(
        selected_detail_footer_label(&app).as_deref(),
        Some(expected.as_str())
    );
 }

+#[test]
+fn activity_footer_hint_surfaces_visible_thinking_without_raw_tool_hint() {
+    let mut app = create_test_app();
+    app.history = vec![HistoryCell::Thinking {
+        content: "visible reasoning".to_string(),
+        streaming: false,
+        duration_secs: Some(1.4),
+    }];
+    app.resync_history_revisions();
+    let revisions = app.history_revisions.clone();
+    app.viewport.transcript_cache.ensure(
+        &app.history,
+        &revisions,
+        100,
+        app.transcript_render_options(),
+    );
+    app.viewport.last_transcript_top = first_line_for_cell(&app, 0);
+    app.viewport.last_transcript_visible = 4;
+
+    assert_eq!(
+        selected_detail_footer_label(&app).as_deref(),
+        Some("Ctrl+O Activity: thinking")
+    );
+}
+
 #[test]
 fn macos_option_v_glyph_is_treated_as_details_shortcut_only_on_macos() {
    let option_v = KeyEvent::new(KeyCode::Char('\u{221A}'), KeyModifiers::NONE);
@@ -3680,7 +3763,7 @@ fn orphan_during_active_keeps_subsequent_completion_routed_correctly() {

 #[test]
 fn tool_details_survive_active_cell_flush() {
-    // The pager / Ctrl+O resolves tool details by cell index. Flushing the
+    // Detail pagers resolve tool details by cell index. Flushing the
    // active cell must move detail records into `tool_details_by_cell` so
    // the pager keeps working after the turn settles.
    let mut app = create_test_app();
@@ -3984,9 +4067,8 @@ fn open_thinking_pager_finds_thinking_in_active_cell() {
    // After ThinkingComplete fires, the finalized thinking entry stays in
    // `app.active_cell` with `streaming = false` until the active cell is
    // flushed to history (end-of-turn, or when an assistant text arrives).
-    // During that window the transcript still renders the
-    // "thinking collapsed; press Ctrl+O for full text" affordance from
-    // `render_thinking`, so the handler must reach across the virtual
+    // During that window the transcript still renders the Ctrl+O affordance
+    // from `render_thinking`, so the handler must reach across the virtual
    // transcript — not just `app.history` — or the promise is a lie.
    // Regression guard for the v0.8.29 affordance/handler mismatch.
    let mut app = create_test_app();
@@ -4013,6 +4095,129 @@ fn open_thinking_pager_finds_thinking_in_active_cell() {
        Some(ModalKind::Pager),
        "pager must open for thinking entries still in active_cell"
    );
+    let body = pop_pager_body(&mut app);
+    assert!(body.contains("Activity: reasoning timeline"), "{body}");
+    assert!(body.contains("Thinking chunk 1 of 1"), "{body}");
+    assert!(body.contains("deliberating"), "{body}");
+}
+
+#[test]
+fn activity_detail_opens_reasoning_timeline_for_selected_thinking() {
+    let mut app = create_test_app();
+    app.history = vec![
+        HistoryCell::Thinking {
+            content: "first chunk reasoning".to_string(),
+            streaming: false,
+            duration_secs: Some(0.8),
+        },
+        HistoryCell::Assistant {
+            content: "interlude".to_string(),
+            streaming: false,
+        },
+        HistoryCell::Thinking {
+            content: "second chunk reasoning".to_string(),
+            streaming: false,
+            duration_secs: Some(1.1),
+        },
+    ];
+    app.resync_history_revisions();
+    let revisions = app.history_revisions.clone();
+    app.viewport.transcript_cache.ensure(
+        &app.history,
+        &revisions,
+        100,
+        app.transcript_render_options(),
+    );
+    let line = first_line_for_cell(&app, 0);
+    let point = TranscriptSelectionPoint {
+        line_index: line,
+        column: 0,
+    };
+    app.viewport.transcript_selection.anchor = Some(point);
+    app.viewport.transcript_selection.head = Some(point);
+
+    assert!(open_activity_detail_pager(&mut app));
+    let body = pop_pager_body(&mut app);
+
+    assert!(
+        body.contains("Activity: reasoning timeline"),
+        "activity label missing: {body}"
+    );
+    assert!(
+        body.contains("Selected chunk: 1 of 2"),
+        "chunk position missing: {body}"
+    );
+    assert!(body.contains("Thinking chunk 1 of 2 (selected)"), "{body}");
+    assert!(body.contains("Thinking chunk 2 of 2"), "{body}");
+    assert!(body.contains("first chunk reasoning"), "body: {body}");
+    assert!(
+        body.contains("second chunk reasoning"),
+        "timeline should include the whole session's thinking: {body}"
+    );
+}
+
+#[test]
+fn activity_detail_fallback_prefers_live_activity_context() {
+    let mut app = create_test_app();
+    let mut active = ActiveCell::new();
+    active.push_tool(
+        "active-1",
+        HistoryCell::Tool(ToolCell::Generic(GenericToolCell {
+            name: "agent_eval".to_string(),
+            status: ToolStatus::Running,
+            input_summary: Some("agent_id: agent_af58ba3a".to_string()),
+            output: None,
+            prompts: None,
+            spillover_path: None,
+            output_summary: None,
+            is_diff: false,
+        })),
+    );
+    app.active_cell = Some(active);
+    app.runtime_turn_id = Some("turn_live_123456789".to_string());
+    app.runtime_turn_status = Some("in_progress".to_string());
+
+    assert!(open_activity_detail_pager(&mut app));
+    let body = pop_pager_body(&mut app);
+
+    assert!(body.contains("Turn: turn_live_123456789"));
+    assert!(body.contains("Activity: tool agent_eval"));
+    assert!(body.contains("Status: running"));
+    assert!(body.contains("agent_id: agent_af58ba3a"));
+}
+
+#[test]
+fn activity_detail_fallback_uses_recent_meaningful_activity_without_full_tool_dump() {
+    let mut app = create_test_app();
+    let output = (0..20)
+        .map(|idx| format!("line {idx}"))
+        .collect::<Vec<_>>()
+        .join("\n");
+    app.history
+        .push(HistoryCell::Tool(ToolCell::Generic(GenericToolCell {
+            name: "read_file".to_string(),
+            status: ToolStatus::Success,
+            input_summary: Some("src/large.rs".to_string()),
+            output: Some(output),
+            prompts: None,
+            spillover_path: None,
+            output_summary: None,
+            is_diff: false,
+        })));
+
+    assert!(open_activity_detail_pager(&mut app));
+    let body = pop_pager_body(&mut app);
+
+    assert!(body.contains("Activity: tool read_file"));
+    assert!(body.contains("Status: done"));
+    assert!(
+        body.contains("Alt+V for details"),
+        "activity detail should stay bounded and point to Alt+V for raw detail: {body}"
+    );
+    assert!(
+        !body.contains("line 10"),
+        "middle of large raw output should not be dumped into Activity Detail: {body}"
+    );
 }

 #[test]
@@ -4762,7 +4967,7 @@ fn composer_arrows_scroll_empty_up() {
        false,
        false,
    ));
-    assert_eq!(app.viewport.pending_scroll_delta, -1);
+    assert_eq!(app.viewport.pending_scroll_delta, -3);
    assert!(app.input.is_empty());
 }

@@ -4777,7 +4982,7 @@ fn composer_arrows_scroll_empty_down() {
        false,
        false,
    ));
-    assert_eq!(app.viewport.pending_scroll_delta, 1);
+    assert_eq!(app.viewport.pending_scroll_delta, 3);
 }

 #[test]
@@ -609,6 +609,13 @@ impl ConfigView {
                editable: true,
                scope: ConfigScope::Saved,
            },
+            ConfigRow {
+                section: ConfigSection::Display,
+                key: "theme".to_string(),
+                value: settings.theme.clone(),
+                editable: true,
+                scope: ConfigScope::Saved,
+            },
            ConfigRow {
                section: ConfigSection::Display,
                key: "locale".to_string(),
@@ -1040,11 +1047,12 @@ fn config_hint_for_key(key: &str) -> &'static str {
        | "composer_border"
        | "paste_burst_detection" => "on/off, true/false, yes/no, 1/0",
        "composer_density" | "transcript_spacing" => "compact | comfortable | spacious",
+        "theme" => "system | dark | light | grayscale",
        "locale" => "auto | en | ja | zh-Hans | pt-BR",
        "background_color" => "#RRGGBB | default",
        "default_mode" => "agent | plan | yolo",
        "sidebar_width" => "10..=50",
-        "sidebar_focus" => "auto | plan | todos | tasks | agents",
+        "sidebar_focus" => "auto | work | tasks | agents | context",
        "max_history" => "integer (0 allowed)",
        "default_model" => "deepseek-v4-pro | deepseek-v4-flash | deepseek-* | none/default",
        "mcp_config_path" => "path to mcp.json",
@@ -1516,7 +1524,10 @@ fn live_subagent_result(
    role: Option<&str>,
 ) -> SubAgentResult {
    SubAgentResult {
+        name: agent_id.to_string(),
        agent_id: agent_id.to_string(),
+        context_mode: "fresh".to_string(),
+        fork_context: false,
        agent_type,
        assignment: SubAgentAssignment {
            objective: summarize_tool_output(objective),
@@ -1938,7 +1949,10 @@ mod tests {

    fn manager_agent(id: &str, status: SubAgentStatus) -> SubAgentResult {
        SubAgentResult {
+            name: id.to_string(),
            agent_id: id.to_string(),
+            context_mode: "fresh".to_string(),
+            fork_context: false,
            agent_type: SubAgentType::Explore,
            assignment: SubAgentAssignment {
                objective: "read the docs".to_string(),
@@ -2061,6 +2075,7 @@ mod tests {
            .collect::<Vec<_>>();
        assert!(keys.contains(&"model"));
        assert!(keys.contains(&"approval_mode"));
+        assert!(keys.contains(&"theme"));
        assert!(keys.contains(&"locale"));
        assert!(keys.contains(&"background_color"));
        assert!(keys.contains(&"auto_compact"));
@@ -283,7 +283,7 @@ impl FanoutCard {
        let mut lines = Vec::with_capacity(3);
        let header_status = self.aggregate_status();
        let title = format!("{} ({} workers)", self.kind, self.workers.len());
-        let family = if self.kind == "rlm" {
+        let family = if matches!(self.kind.as_str(), "rlm_open" | "rlm_eval" | "rlm") {
            ToolFamily::Rlm
        } else {
            ToolFamily::Fanout
@@ -55,6 +55,10 @@ pub struct ChatWidget {
    scrollbar: Option<TranscriptScrollbar>,
    jump_to_latest_button: Option<Rect>,
    background: Color,
+    scroll_track: Color,
+    scroll_thumb: Color,
+    jump_border: Color,
+    jump_arrow: Color,
 }

 #[derive(Debug, Clone, Copy)]
@@ -68,6 +72,10 @@ impl ChatWidget {
    pub fn new(app: &mut App, area: Rect) -> Self {
        let content_area = area;
        let background = app.ui_theme.surface_bg;
+        let scroll_track = app.ui_theme.border;
+        let scroll_thumb = app.ui_theme.status_working;
+        let jump_border = app.ui_theme.border;
+        let jump_arrow = app.ui_theme.status_working;
        let visible_lines = content_area.height as usize;
        let render_options = app.transcript_render_options();

@@ -85,6 +93,10 @@ impl ChatWidget {
                scrollbar: None,
                jump_to_latest_button: None,
                background,
+                scroll_track,
+                scroll_thumb,
+                jump_border,
+                jump_arrow,
            };
        }

@@ -294,6 +306,10 @@ impl ChatWidget {
            scrollbar,
            jump_to_latest_button,
            background,
+            scroll_track,
+            scroll_thumb,
+            jump_border,
+            jump_arrow,
        }
    }
 }
@@ -339,14 +355,20 @@ impl Renderable for ChatWidget {
                .begin_symbol(None)
                .end_symbol(None)
                .track_symbol(Some("│"))
-                .track_style(Style::default().fg(palette::BORDER_COLOR))
+                .track_style(Style::default().fg(self.scroll_track))
                .thumb_symbol("┃")
-                .thumb_style(Style::default().fg(palette::DEEPSEEK_SKY))
+                .thumb_style(Style::default().fg(self.scroll_thumb))
                .render(area, buf, &mut state);
        }

        if let Some(button_area) = self.jump_to_latest_button {
-            render_jump_to_latest_button(button_area, buf, self.background);
+            render_jump_to_latest_button(
+                button_area,
+                buf,
+                self.background,
+                self.jump_border,
+                self.jump_arrow,
+            );
        }
    }

@@ -378,21 +400,25 @@ fn jump_to_latest_button_rect(area: Rect, has_scrollbar: bool) -> Option<Rect> {
    })
 }

-fn render_jump_to_latest_button(area: Rect, buf: &mut Buffer, background: Color) {
+fn render_jump_to_latest_button(
+    area: Rect,
+    buf: &mut Buffer,
+    background: Color,
+    border: Color,
+    arrow: Color,
+) {
    Block::default()
        .borders(Borders::ALL)
        .border_type(BorderType::Rounded)
-        .border_style(Style::default().fg(palette::BORDER_COLOR))
+        .border_style(Style::default().fg(border))
        .style(Style::default().bg(background))
        .render(area, buf);

    let arrow_x = area.x.saturating_add(1);
    let arrow_y = area.y.saturating_add(1);
-    buf[(arrow_x, arrow_y)].set_symbol("↓").set_style(
-        Style::default()
-            .fg(palette::DEEPSEEK_SKY)
-            .add_modifier(Modifier::BOLD),
-    );
+    buf[(arrow_x, arrow_y)]
+        .set_symbol("↓")
+        .set_style(Style::default().fg(arrow).add_modifier(Modifier::BOLD));
 }

 pub struct ComposerWidget<'a> {
@@ -2931,6 +2957,57 @@ mod tests {
        assert_eq!(buf[(button.x + 1, button.y + 1)].symbol(), "↓");
    }

+    #[test]
+    fn chat_widget_uses_light_theme_scroll_chrome() {
+        let mut app = create_test_app();
+        app.ui_theme = palette::LIGHT_UI_THEME;
+        app.use_mouse_capture = true;
+        for i in 0..120 {
+            app.add_message(HistoryCell::User {
+                content: format!("user message {i}"),
+            });
+        }
+        app.viewport.transcript_scroll = TranscriptScroll::at_line(0);
+
+        let area = Rect {
+            x: 0,
+            y: 0,
+            width: 80,
+            height: 8,
+        };
+        let mut buf = Buffer::empty(area);
+        let widget = ChatWidget::new(&mut app, area);
+        widget.render(area, &mut buf);
+
+        let mut saw_track = false;
+        let mut saw_thumb = false;
+        for y in 0..area.height {
+            let cell = &buf[(area.width - 1, y)];
+            match cell.symbol() {
+                "│" => {
+                    saw_track = true;
+                    assert_eq!(cell.fg, palette::LIGHT_UI_THEME.border);
+                }
+                "┃" => {
+                    saw_thumb = true;
+                    assert_eq!(cell.fg, palette::LIGHT_UI_THEME.status_working);
+                }
+                _ => {}
+            }
+        }
+        assert!(saw_track, "scrollbar track should render");
+        assert!(saw_thumb, "scrollbar thumb should render");
+
+        let button = app
+            .viewport
+            .jump_to_latest_button_area
+            .expect("button appears when transcript is not at tail");
+        assert_eq!(
+            buf[(button.x + 1, button.y + 1)].fg,
+            palette::LIGHT_UI_THEME.status_working
+        );
+    }
+
    #[test]
    fn chat_widget_hides_jump_to_latest_button_at_tail() {
        let mut app = create_test_app();
@@ -69,7 +69,7 @@ pub fn tool_family_for_title(title: &str) -> ToolFamily {
 }

 /// Map an arbitrary tool name (as exposed to the model — e.g. `read_file`,
-/// `apply_patch`, `agent_spawn`) to a family. Used by `GenericToolCell`
+/// `apply_patch`, `agent_open`) to a family. Used by `GenericToolCell`
 /// where the `tool_family_for_title` shortcut isn't enough because every
 /// generic cell shares the title `"Tool"`.
 #[must_use]
@@ -79,8 +79,8 @@ pub fn tool_family_for_name(name: &str) -> ToolFamily {
        "edit_file" | "apply_patch" | "write_file" => ToolFamily::Patch,
        "exec_shell" | "exec_shell_wait" | "exec_shell_interact" => ToolFamily::Run,
        "grep_files" | "file_search" | "web_search" | "fetch_url" => ToolFamily::Find,
-        "agent_spawn" => ToolFamily::Delegate,
-        "rlm" => ToolFamily::Rlm,
+        "agent_open" | "agent_eval" | "agent_close" | "agent_spawn" => ToolFamily::Delegate,
+        "rlm_open" | "rlm_eval" | "rlm_configure" | "rlm_close" | "rlm" => ToolFamily::Rlm,
        _ => ToolFamily::Generic,
    }
 }
@@ -217,8 +217,8 @@ mod tests {
        assert_eq!(tool_family_for_name("apply_patch"), ToolFamily::Patch);
        assert_eq!(tool_family_for_name("exec_shell"), ToolFamily::Run);
        assert_eq!(tool_family_for_name("grep_files"), ToolFamily::Find);
-        assert_eq!(tool_family_for_name("agent_spawn"), ToolFamily::Delegate);
-        assert_eq!(tool_family_for_name("rlm"), ToolFamily::Rlm);
+        assert_eq!(tool_family_for_name("agent_open"), ToolFamily::Delegate);
+        assert_eq!(tool_family_for_name("rlm_eval"), ToolFamily::Rlm);
        assert_eq!(
            tool_family_for_name("totally_new_tool"),
            ToolFamily::Generic
@@ -270,8 +270,13 @@ mod tests {
        let tmp = tempdir().expect("tempdir");
        let ctx = ToolContext::new(tmp.path().to_path_buf());
        let tool = ImageAnalyzeTool::new(fake_config());
+        let outside_workspace = if cfg!(windows) {
+            r"C:\Windows\System32\drivers\etc\hosts"
+        } else {
+            "/etc/hosts"
+        };
        let err = tool
-            .execute(json!({"image_path": "/etc/hosts"}), &ctx)
+            .execute(json!({"image_path": outside_workspace}), &ctx)
            .await
            .expect_err("absolute path must reject");
        assert!(
@@ -7,7 +7,7 @@ Current boundary note (v0.8.6):
 - Other workspace crates are being split out incrementally, but they are not yet the sole runtime source of truth.
 - The LSP subsystem (`crates/tui/src/lsp/`) is fully wired into the engine's post-tool-execution path
  (`core/engine/lsp_hooks.rs`), providing inline diagnostics after every edit_file/apply_patch/write_file.
- The swarm agent system was removed in v0.8.5 in favour of sub-agents (agent_spawn) and RLM (rlm_query).
+- The swarm agent system was removed in v0.8.5. The active v0.8.33 orchestration surface is persistent sub-agent sessions (`agent_open` / `agent_eval` / `agent_close`) and persistent RLM sessions (`rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close`).
  No model-visible swarm tool remains in the active codebase.

 ## High-Level Overview
@@ -129,9 +129,9 @@ drives turns through Chat Completions.
  - `github.rs` - Read-only GitHub context and guarded comment/closure tools backed by `gh`
  - `automation.rs` - Model-visible scheduling tools over `AutomationManager`
  - `plan.rs` - Planning tools
-  - `subagent.rs` - Sub-agent spawning (replaces the removed `agent_swarm` surface)
+  - `subagent.rs` - Persistent sub-agent sessions (replaces the removed `agent_swarm` surface)
  - `spec.rs` - Tool specifications
-  - `rlm.rs` - Recursive Language Model (RLM) tool — sandboxed Python REPL with `llm_query()` helpers
+  - `rlm.rs` - Persistent Recursive Language Model (RLM) sessions — sandboxed Python REPLs with semantic helper calls and `var_handle` output support

 ### Extension Systems

@@ -1,320 +0,0 @@
-# Competitive Analysis: DeepSeek TUI vs OpenCode vs Codex CLI
-
-Analysis of capabilities across three AI coding agents: OpenCode (`/Volumes/VIXinSSD/opencode`), Codex CLI (`/Volumes/VIXinSSD/codex-main`), and DeepSeek TUI (`/Volumes/VIXinSSD/deepseek-tui`).
-
-## Tool Matrix
-
-| Capability | OpenCode | Codex CLI | DeepSeek TUI |
-|---|---|---|---|
-| File read | ✅ Read | ✅ | ✅ file |
-| File write | ✅ Write | ✅ | ✅ file |
-| File edit | ✅ Edit (string replace) | ✅ apply_patch (diff format) | ✅ edit_file + apply_patch |
-| File glob | ✅ Glob | ✅ | ✅ file_search |
-| Code search | ✅ Grep + CodeSearch (Exa) | ✅ | ✅ grep_files + search |
-| Shell exec | ✅ Bash | ✅ exec/shell | ✅ shell |
-| Web fetch | ✅ WebFetch | ✅ | ✅ fetch_url |
-| Web search | ✅ WebSearch | ✅ WebSearchRequest | ✅ web_search |
-| Web browse | ❌ | ❌ | ✅ web_run |
-| LSP | ✅ Lsp (experimental) | ❌ | ✅ Post-edit diagnostics (auto) |
-| Task/todo tracking | ✅ TodoWrite | ✅ | ✅ todo_write |
-| Subagent spawn | ✅ Task | ✅ Collab/SpawnCsv | ✅ agent_spawn |
-| Skill system | ✅ Skill (multi-location discovery) | ✅ core-skills | ⚠️ Partial (.deepseek/skills/) |
-| Plan mode | ✅ plan-enter/exit | ✅ Plan mode | ✅ Plan mode |
-| User question | ✅ Question | ✅ request_user_input | ✅ user_input |
-| Patch apply | ✅ apply_patch (custom format) | ✅ apply_patch (diff format) | ✅ apply_patch |
-| Data validation | ❌ | ❌ | ✅ validate_data |
-| Finance | ❌ | ❌ | ✅ finance |
-| Git ops | Via Bash tool | ✅ git-utils | ✅ git module |
-| GitHub ops | Via Bash (gh) | ✅ | ✅ github |
-| Test running | ❌ | ✅ | ✅ test_runner |
-| Automation | ❌ | ❌ | ✅ automation |
-| Code review | ❌ | ✅ GuardianApproval | ✅ review |
-| Recall/archive | ❌ | ❌ | ✅ recall_archive |
-| Diagnostics | ❌ | ✅ | ✅ diagnostics |
-| Revert turn | ❌ | ❌ | ✅ revert_turn |
-| Image generation | ❌ | ✅ ImageGeneration | ❌ |
-| Browser use | ❌ | ✅ BrowserUse | ❌ (web_run is headless) |
-| Computer use | ❌ | ✅ ComputerUse | ❌ |
-| Realtime voice | ❌ | ✅ RealtimeConversation | ❌ |
-
---
-
-## High Priority Gaps
-
-These are capabilities that would most directly improve DeepSeek TUI's effectiveness as a coding agent.
-
-### 1. LSP Integration — ✅ IMPLEMENTED (Post-Edit Diagnostics)
-
-**Status:** Implemented in `crates/tui/src/lsp/` + `crates/tui/src/core/engine/lsp_hooks.rs`. Shipped as automatic post-edit diagnostics injection.
-
-**What DeepSeek TUI has:**
-
- **Post-edit diagnostics hook:** After every successful `edit_file`, `write_file`, or `apply_patch`, the engine automatically requests diagnostics from the appropriate LSP server and injects compiler errors into the model's context as a synthetic message.
- **Custom JSON-RPC stdio client** (`client.rs`): Implements the LSP wire protocol without `tower-lsp` dependency. Spawns LSP servers as child processes, handles `Content-Length` framing, routes `publishDiagnostics` notifications.
- **Language registry** (`registry.rs`): Detects language from file extensions and maps to built-in defaults:
-  - Rust → `rust-analyzer`
-  - Go → `gopls serve`
-  - Python → `pyright-langserver --stdio`
-  - TypeScript/JavaScript → `typescript-language-server --stdio`
-  - C/C++ → `clangd`
- **Configurable** via `[lsp]` table in `~/.deepseek/config.toml`: `enabled`, `poll_after_edit_ms` (default 5000), `max_diagnostics_per_file` (default 20), `include_warnings` (default false), and per-language `[lsp.servers]` overrides.
- **Non-blocking by design:** Missing LSP binary, server crashes, or timeouts degrade silently to "no diagnostics this turn." Servers spawn lazily on first edit per language.
- **Test infrastructure:** `FakeTransport` seam for CI testing without real LSP servers.
-
-**Remaining gap vs OpenCode:** OpenCode exposes LSP as a **model-callable tool** with 9 operations (goToDefinition, findReferences, hover, documentSymbol, workspaceSymbol, goToImplementation, prepareCallHierarchy, incomingCalls, outgoingCalls). DeepSeek TUI's LSP is currently passive (auto-fires after edits) rather than active (model can query on demand for navigation).
-
-**What DeepSeek TUI could still add:**
-
-A model-callable `lsp` tool in `crates/tui/src/tools/` that exposes the interactive LSP operations (goToDefinition, findReferences, hover, documentSymbol, workspaceSymbol). The transport infrastructure already exists — the gap is only the tool wrapper and the request/response cycle for LSP methods beyond `didOpen`/`didChange`/`publishDiagnostics`.
-
-### 2. Granular Permission System
-
-**What it is:** Allow/deny/ask rules keyed on tool name × file path pattern, with wildcard support, home-directory expansion, and cascading to pending requests.
-
-**Why it matters:** The current all-or-nothing approval model creates friction. Users can't express "always allow reads in `src/` but always ask for `.env` files." The ability to permanently approve a pattern reduces approval fatigue by 60–80% over a long session.
-
-**OpenCode implementation:** `packages/opencode/src/permission/index.ts` implements:
-
- `Action`: `allow | deny | ask`
- `Rule`: `{ permission: string, pattern: string, action: Action }`
- `Ruleset`: ordered list of rules with last-match-wins semantics
- Pattern expansion for `~/`, `$HOME/`
- Wildcard matching on both permission names and path patterns
- Reply modes: `once` (approve this one call), `always` (approve pattern forever), `reject` (deny this one)
- Automatic cascading: an "always" reply auto-resolves pending requests for the same session
- Distinct error types: `DeniedError` (rule-based), `RejectedError` (user said no), `CorrectedError` (user said no with feedback)
-
-Agent definitions inherit permission rulesets that can be user-overridden:
-```typescript
-build: {
-  permission: merge(defaults, { question: "allow", plan_enter: "allow" }, user),
-}
-plan: {
-  permission: merge(defaults, { edit: { "*": "deny" } }, user),
-}
-explore: {
-  permission: merge(defaults, { "*": "deny", grep: "allow", read: "allow", ... }, user),
-}
-```
-
-**What DeepSeek TUI would need:** A permission rule engine with the same dimension (tool name × path pattern × action), persistence to disk, and hook integration so approval decisions can cascade.
-
-### 3. Lifecycle Hooks
-
-**What it is:** User-defined shell commands or plugin functions that fire on specific lifecycle events — before a tool executes, after it completes, when permission is requested, at session start, when the user submits a prompt, and at session stop.
-
-**Why it matters:** Hooks are the escape hatch that lets users enforce invariants without polluting the system prompt. "Always run `cargo fmt` after writing a `.rs` file." "Warn me before any `rm -rf`." "Log every shell command to a file." They are composable, auditable, and don't consume context window tokens.
-
-**Codex CLI implementation:** `codex-rs/hooks/` defines six event types with typed request/response payloads:
-
-| Event | When it fires | Payload |
-|---|---|---|
-| `PreToolUse` | Before tool execution | tool name, input params, sandbox state |
-| `PostToolUse` | After tool execution | tool name, input, success/failure, duration, output preview |
-| `PermissionRequest` | When model requests permission | permission type, justification |
-| `SessionStart` | New session begins | session ID, cwd, source (new/resume) |
-| `UserPromptSubmit` | User sends a message | prompt text |
-| `Stop` | Session ending | reason |
-
-Each hook handler supports:
- `matcher`: optional regex to filter which tool calls trigger the hook
- `command`: shell command to run
- `timeout_sec`: maximum runtime
- `status_message`: shown to the user while the hook runs
- `source_path` + `source`: tracks where the hook was defined (project hooks.json, user config, plugin)
- Hooks can return `Success`, `FailedContinue`, or `FailedAbort` (blocks the operation)
-
-**What DeepSeek TUI would need:** Extend `crates/hooks/` to support the full event surface, add matcher-based filtering, and provide a `hooks.json` discovery mechanism similar to Codex CLI's.
-
-### 4. Persistent Memories
-
-**What it is:** Automatic extraction of user preferences, project conventions, and past decisions from conversations, stored as retrievable memories that are injected into new sessions.
-
-**Why it matters:** Across a long debugging session, the agent rediscovers the same facts: "this project uses Rust edition 2024," "tests run with `cargo test --workspace`," "the user prefers 4-space indentation." A memory system compounds value — each session builds on prior knowledge rather than starting from zero.
-
-**Codex CLI implementation:** The `MemoryTool` feature (experimental, behind `/experimental` menu) enables:
- Memory generation: the model creates structured memories from conversation content
- Memory retrieval: relevant memories are injected into new conversation context
- The `Chronicle` feature adds passive screen-context memories via a sidecar process
- Memories are stored in SQLite and surfaced in the TUI via `/memories` command
-
-**What DeepSeek TUI would need:** A memory extraction prompt, a vector or keyword-based retrieval system, and storage in the existing session/state infrastructure.
-
-### 5. Skill Auto-Discovery
-
-**What it is:** Automatic scanning of multiple locations for `SKILL.md` files that provide domain-specific instructions, scripts, and references. Skills are injected into the conversation on demand via a `skill` tool.
-
-**Why it matters:** Skills are how the community packages expertise. A "Rust refactoring" skill, a "Docker deployment" skill, a "GitHub Actions" skill — each provides specialized instructions without bloating the main system prompt. OpenCode's multi-location discovery means skills can be project-local, user-global, or pulled from URLs.
-
-**OpenCode implementation:** `packages/opencode/src/skill/index.ts` scans:
-
-1. `~/.claude/skills/**/SKILL.md` (Claude Code compatibility)
-2. `~/.agents/skills/**/SKILL.md` (Agents SDK compatibility)  
-3. Parent directories from cwd to workspace root for `.claude/skills/` and `.agents/skills/`
-4. Project config directories for `{skill,skills}/**/SKILL.md`
-5. User-configured paths (with `~/` expansion)
-6. User-configured URLs (pulled via discovery module)
-
-Skills are parsed for YAML frontmatter (`name`, `description`) and Markdown content. Duplicate names warn but don't error. Skills respect agent permissions — an agent can only load skills its permission ruleset allows.
-
-**What DeepSeek TUI would need:** Extend the existing `~/.deepseek/skills/` discovery to parent-directory walking, Claude Code compatibility paths, and URL-based skill sources. Add YAML frontmatter parsing.
-
---
-
-## Medium Priority Gaps
-
-These would meaningfully improve the agent experience but are less urgent.
-
-### 6. Agent Profiles with Permission Inheritance
-
-**What it is:** Named agent types (build, plan, general, explore) that inherit different tool permission sets. Users can define custom agents with specific models, temperatures, system prompts, and permission rules.
-
-**OpenCode implementation:** `packages/opencode/src/agent/agent.ts`:
-
- `build`: full-access with ask on sensitive paths
- `plan`: all edit tools denied, plan-exit allowed, plan file writes in `.opencode/plans/` allowed
- `general`: subagent-only, todo-write denied
- `explore`: read-only, grep/glob/read/bash/webfetch/websearch allowed
- Plus hidden agents for internal tasks (compaction, title generation, summarization)
-
-Each agent carries its own `model`, `temperature`, `topP`, `prompt`, and `permission` ruleset. A `generate` function creates new agent configs dynamically from user descriptions.
-
-**What DeepSeek TUI would need:** Extend the mode system (Plan/Agent/YOLO) to support named agent profiles with per-profile tool filtering and model configuration.
-
-### 7. Shell Sandboxing
-
-**What it is:** OS-level sandbox enforcement for shell commands — network restrictions, filesystem read-only mounts, allowed/disallowed paths.
-
-**Codex CLI implementation:** `codex-rs/sandboxing/`:
-
- macOS: Seatbelt (`sandboxing/src/seatbelt.rs`) with `.sbpl` policy files
- Linux: bubblewrap (default) or Landlock (legacy fallback)
- Windows: restricted token
- Configurable sandbox policies per command
- Integration tests can detect they're running under sandbox and early-exit
-
-**What DeepSeek TUI would need:** Extend `crates/execpolicy/` to support platform-specific sandbox enforcement. Start with macOS Seatbelt (most DeepSeek TUI users are on macOS).
-
-### 8. Tool Search / Deferred MCP Tool Exposure
-
-**What it is:** Instead of dumping all MCP tools into the system prompt (bloating context), expose a `tool_search` function that the model calls to discover relevant tools by name or description.
-
-**Codex CLI implementation:** `ToolSearch` feature (stable, default-enabled). `ToolSearchAlwaysDeferMcpTools` goes further — never exposes MCP tools directly, always requires search. This is critical when MCP servers expose hundreds of tools.
-
-**What DeepSeek TUI would need:** `tool_search_tool_regex` and `tool_search_tool_bm25` already exist as deferred tool discovery mechanisms. Extend them to gate MCP tool exposure behind on-demand search.
-
-### 9. ExecPolicy / Command Approval Rules
-
-**What it is:** A policy engine that evaluates shell commands against user-defined rules — prefix allowlists, network restrictions, pattern matching — and auto-approves, denies, or escalates.
-
-**Codex CLI implementation:** `codex-rs/execpolicy/src/`:
-
- `Policy`: ordered list of `Rule` entries
- `Rule`: prefix patterns (e.g., allow `cargo build*`, deny `rm *`)
- `NetworkRule`: protocol-level network restrictions
- `MatchOptions`: controls rule evaluation behavior
- `Evaluation`: result of policy evaluation against a command
-
-Rules can be amended at runtime via `blocking_append_allow_prefix_rule`.
-
-**What DeepSeek TUI would need:** Extend `crates/execpolicy/` to support prefix rules, network rules, and runtime policy amendments.
-
-### 10. Dynamic Agent Generation
-
-**What it is:** On-the-fly generation of new agent configurations from natural language descriptions.
-
-**OpenCode implementation:** The `generate` function in `agent.ts` takes a description like "code reviewer that only reads files and reports issues" and returns an `{ identifier, whenToUse, systemPrompt }` object using a structured LLM call. Generated agents respect existing agent name collisions.
-
-**What DeepSeek TUI would need:** A model-callable tool or slash command that generates agent configs from descriptions and registers them for the session.
-
-### 11. Streaming Patch Events
-
-**What it is:** Structured progress events streamed while the model is generating `apply_patch` input, giving the user real-time feedback on what files will change.
-
-**Codex CLI implementation:** `ApplyPatchStreamingEvents` feature (under development) streams file-level progress as the model produces patch hunks. The `StreamingPatchParser` in `apply-patch/src/streaming_parser.rs` handles incremental parsing.
-
-**What DeepSeek TUI would need:** Extend `apply_patch.rs` to emit progress events during streaming model output.
-
---
-
-## Lower Priority Gaps
-
-Specialized features that are valuable but less critical for core coding workflow.
-
-| Capability | Where | Notes |
-|---|---|---|
-| Image Generation | Codex CLI `ImageGeneration` | Niche for coding; useful for documentation diagrams |
-| Browser Use | Codex CLI `BrowserUse` | Interactive browser automation (click, type, screenshot). DeepSeek TUI has `web_run` for headless |
-| Computer Use | Codex CLI `ComputerUse` | Full desktop automation. Desktop-app-gated |
-| Realtime Voice | Codex CLI `RealtimeConversation` | Voice conversation mode. Experimental |
-| Unified PTY Exec | Codex CLI `UnifiedExec` | Single PTY-backed shell with state snapshotting across turns |
-| Artifacts | Codex CLI `Artifact` | Native artifact rendering tools |
-| Goals | Codex CLI `Goals` | Persistent thread goals that survive compaction and session restarts |
-| Git Commit Attribution | Codex CLI `CodexGitCommit` | Model instructions for proper commit attribution |
-| CSV Agent Spawning | Codex CLI `SpawnCsv` | CSV-backed parallel agent job distribution |
-| Shell Snapshotting | Codex CLI `ShellSnapshot` | Save/restore shell state across turns |
-| Prevent Idle Sleep | Codex CLI `PreventIdleSleep` | Keep machine awake during long-running agent tasks |
-
---
-
-## Architectural Patterns
-
-### OpenCode
-
-**Client/Server Architecture:** The TUI is one client; the server can be driven remotely from a mobile app, desktop app, or web console. This decouples the agent runtime from the UI layer.
-
-**Plugin System:** `packages/opencode/src/plugin/` supports hot-loadable JS/TS plugins that add tools, models, auth providers, and chat middleware. Plugins receive a typed context with tool execution, auth, and filesystem access.
-
-**Multi-Provider:** Not coupled to any single AI provider. Models are configured with provider IDs and resolved through a provider registry. OAuth support for OpenAI Codex (ChatGPT subscription integration) in `plugin/codex.ts`.
-
-**Config Layering:** Config is loaded from multiple sources (global, project, env vars) and merged with well-defined precedence.
-
-### Codex CLI
-
-**App-Server Protocol:** `codex-rs/app-server-protocol/` defines a versioned RPC protocol (v2) between the TUI frontend and the agent backend. All new API development goes through v2 with strict naming conventions (`*Params`/`*Response`/`*Notification`, `resource/method` RPC naming).
-
-**Feature Flag System:** `codex-rs/features/` centralizes 60+ feature flags with lifecycle stages (UnderDevelopment, Experimental, Stable, Deprecated, Removed). Features have metadata (menu name, description, announcement text) and can carry custom config structs.
-
-**Bazel + Cargo Dual Build:** Codex CLI uses both Cargo (for development) and Bazel (for CI/release). The `find_resource!` macro and `cargo_bin()` helper abstract over runfile differences.
-
-**Snapshot Testing:** `codex-rs/tui/` extensively uses `insta` for UI snapshot tests. Any UI change requires corresponding snapshot coverage.
-
-**Core Modularity:** Explicit resistance to adding code to `codex-core`. New functionality goes into purpose-built crates (`codex-apply-patch`, `codex-memories`, `codex-sandboxing`) rather than growing the core crate.
-
-### DeepSeek TUI
-
-**RLM (Recursive Language Model):** Unique in this space. A sandboxed Python REPL where a sub-LLM can call helpers (`llm_query`, `llm_query_batched`, `rlm_query`) for batch processing, chunking, and recursive critique. Neither competitor has an equivalent.
-
-**Durable Tasks:** Restart-aware persistent task objects with evidence tracking (gate runs, PR attempts, timeline). Designed for long-running autonomous work that survives restarts.
-
-**Automations:** Scheduled recurring tasks with cron-style RRULE recurrence. Unique among the three.
-
---
-
-## What DeepSeek TUI Already Excels At
-
- **LSP diagnostics** — automatic post-edit compiler/linter feedback injected into model context; neither competitor has passive LSP integration (OpenCode's is model-callable only)
- **RLM** — batch/bulk LLM processing in a Python sandbox; no equivalent in either competitor
- **Finance** — live stock/crypto quotes; unique in this space
- **Automations** — scheduled recurring tasks with cron rules
- **Durable tasks** — restart-aware with evidence tracking and gate verification
- **Turn revert** — undo workspace changes per turn via side-git snapshots
- **Data validation** — JSON/TOML validation tool
- **Web run** — headless browser interaction (Codex CLI has Browser Use but it's desktop-app-gated)
- **Parallel tool execution** — explicitly modeled as infrastructure
- **Git/GitHub operations** — comprehensive git module with blame, log, diff, status plus full GitHub API via gh
- **Project map** — high-level project structure generation
-
---
-
-## Recommended Implementation Order
-
-1. ~~**LSP tool**~~ — ✅ **DONE** (post-edit diagnostics). Remaining: model-callable navigation tool.
-2. **Path-pattern permissions** — reduces approval fatigue by 60–80% over long sessions.
-3. **Persistent memory** — compounds value across sessions; foundational for long-running projects.
-4. **Pre/Post-tool-use hooks** — escape hatch for user-defined guardrails without system prompt bloat.
-5. **Skill auto-discovery** — enables community skill ecosystem and Claude Code compatibility.
-6. **LSP navigation tool** — expose goToDefinition/findReferences/hover as model-callable tool. Infrastructure exists; add request/response methods + tool wrapper.
-7. **Agent profiles** — named agent types with model/permission inheritance.
-8. **Tool search for MCP** — keeps context window manageable when connecting to MCP servers with many tools.
-9. **Shell sandboxing** — security improvement, starting with macOS Seatbelt.
@@ -77,7 +77,10 @@ through unchanged for OpenAI-compatible gateways. `atlascloud` defaults to
 `deepseek-ai/deepseek-v4-flash` as its default model. SGLang, vLLM, and Ollama are
 self-hosted and can run without an API key by default. Ollama defaults to
 `http://localhost:11434/v1` and sends model tags such as `deepseek-coder:1.3b`
-or `qwen2.5-coder:7b` unchanged.
+or `qwen2.5-coder:7b` unchanged. Self-hosted providers and loopback custom
+URLs (`localhost`, `127.0.0.1`, `[::1]`, `0.0.0.0`) do not read the secret store
+unless API-key auth is explicitly requested; use an env var or config-file key
+when a local server does require bearer auth.

 Third-party OpenAI-compatible gateways that need extra request headers can set
 `http_headers = { "X-Model-Provider-Id" = "your-model-provider" }` at the top
@@ -295,7 +298,10 @@ replacement compaction. You can inspect or update these from the TUI with

 Common settings keys:

- `theme` (default, dark, light, whale)
+- `theme` (`system`, `dark`, `light`, `grayscale`; default `system`):
+  `system` follows terminal background detection, `dark`/`light` use the
+  DeepSeek palettes, and `grayscale` is the low-opinion black/white theme.
+  Aliases such as `whale`, `mono`, and `black-white` are accepted.
 - `auto_compact` (on/off, default off)
 - `paste_burst_detection` (on/off, default on): fallback rapid-key paste
  detection for terminals that do not emit bracketed-paste events. This is
@@ -316,6 +322,10 @@ Common settings keys:
  context panel, `/cost`, `/tokens`, and long-turn notification summaries. The
  aliases `rmb` and `yuan` normalize to `cny`.
 - `default_mode` (agent, plan, yolo; legacy `normal` is accepted and normalized to `agent`)
+- `sidebar_focus` (`auto`, `work`, `tasks`, `agents`, `context`; default
+  `auto`): selects the right sidebar focus. `auto` prioritizes Work, Tasks,
+  Agents, then optional Context, and uses Work as the single quiet empty state.
+  Legacy `plan` and `todos` values are accepted and normalized to `work`.
 - `max_history` (number of submitted input history entries; cleared drafts are
  also kept locally for composer history search)
 - `default_model` (model name override)
@@ -392,8 +402,8 @@ If you are upgrading from older releases:
 - `managed_config_path` (string, optional): managed config file loaded after user/env config.
 - `requirements_path` (string, optional): requirements file used to enforce allowed approval/sandbox values.
 - `max_subagents` (int, optional): defaults to `10` and is clamped to `1..=20`.
- `subagents.*` (optional): per-role/type model defaults for `agent_spawn` and
-  related sub-agent tools. Explicit tool `model` values win, then role/type
+- `subagents.*` (optional): per-role/type model defaults for `agent_open` and
+  related persistent sub-agent sessions. Explicit tool `model` values win, then role/type
  overrides, then the parent runtime model. Supported convenience keys are
  `default_model`, `worker_model`, `explorer_model`, `awaiter_model`,
  `review_model`, `custom_model`, and `max_concurrent`. The
@@ -16,6 +16,7 @@ Bindings are not (yet) user-configurable — tracked for a future release (#436,
 | `Shift-Tab`          | Cycle reasoning effort: off → high → max → off                 |
 | `Ctrl-R`             | Open the resume-session picker                                 |
 | `Ctrl-L`             | Refresh / clear the screen                                     |
+| `Ctrl-O`             | Open Activity Detail for selected/live/recent tool work, or the full reasoning timeline for thinking blocks when the composer is empty |
 | `Ctrl-Shift-E` / `Cmd-Shift-E` | Toggle the file-tree sidebar                          |
 | `Esc`                | Close topmost modal · cancel slash menu · dismiss toast        |

@@ -40,7 +41,7 @@ Editing the message you're about to send.
 | `Ctrl-S`                    | Stash current draft (`/stash list`, `/stash pop` to recover) |
 | `Alt-R`                    | Search prompt history (Alt-R to exit)                  |
 | `Tab`                       | Slash-command / `@`-mention completion (popup-aware)    |
-| `Ctrl-O`                    | Open external editor for the composer draft             |
+| `Ctrl-O`                    | Open external editor for the composer draft when it has focus |

 ### `@` mentions

@@ -69,7 +70,7 @@ When `[memory] enabled = true`, typing `# foo` and pressing `Enter` appends `foo
 |----------------------|-----------------------------------------------------|
 | `↑` / `↓` / `j` / `k`| Move selection                                     |
 | `Enter`              | Activate the selected item (open / focus / cancel) |
-| `Tab`                | Cycle to next sidebar panel (Files → Tasks → Agents → Todos) |
+| `Tab`                | Cycle to next sidebar panel (Work → Tasks → Agents → Context) |
 | `Esc`                | Return focus to composer                           |

 ## Slash-command palette (after `Ctrl-K` or typing `/`)
@@ -80,6 +81,20 @@ When `[memory] enabled = true`, typing `# foo` and pressing `Enter` appends `foo
 | `Enter` / `Tab`      | Run / complete the highlighted command             |
 | `Esc`                | Dismiss palette                                     |

+## Session Picker (`Ctrl-R` or `/sessions`)
+
+| Chord                | Action                                              |
+|----------------------|-----------------------------------------------------|
+| `↑` / `↓` / `j` / `k`| Move selection in the session list                 |
+| `1`-`9`              | Open the visible session history at that list slot |
+| `PgUp` / `PgDn`      | Page the history pane                              |
+| `Enter`              | Resume the selected session                        |
+| `/`                  | Search sessions                                    |
+| `s`                  | Cycle sort order                                   |
+| `a`                  | Toggle current-workspace scope vs all workspaces   |
+| `d`                  | Delete selected session after confirmation         |
+| `Esc` / `q`          | Close the picker                                   |
+
 ## Approval modal (when a tool requests approval)

 | Chord                | Action                                              |
@@ -18,7 +18,7 @@ Run `/mode` to open the mode picker, or switch directly with `/mode agent`,
 - **Agent**: multi-step tool use. Approvals for shell and paid tools (file writes are allowed without a prompt).
 - **YOLO**: enables shell + trust mode and auto-approves all tools. Use only in trusted repos.

-All three modes have access to the `rlm` tool. Inside its Python REPL, `llm_query_batched` fans out 1–16 cheap parallel child calls pinned to `deepseek-v4-flash`. The model reaches for it when work is decomposable.
+All three modes have access to persistent RLM sessions through `rlm_open`, `rlm_eval`, `rlm_configure`, and `rlm_close`. Inside an RLM Python REPL, `sub_query_batch` fans out 1-16 cheap parallel child calls pinned to `deepseek-v4-flash`. The model reaches for it when work is too large or repetitive for the parent transcript.

 ## Compatibility Notes

@@ -78,6 +78,9 @@ See `MCP.md`.
 Run `deepseek --help` for the canonical list. Common flags:

 - `-p, --prompt <TEXT>`: one-shot prompt mode (prints and exits)
+- `deepseek exec --output-format stream-json <PROMPT>`: emit one JSON object per line for harnesses and backend wrappers
+- `deepseek exec --resume <ID|PREFIX> <PROMPT>` / `--session-id <ID|PREFIX>`: continue a saved session non-interactively
+- `deepseek exec --continue <PROMPT>`: continue the most recent saved session for this workspace non-interactively
 - `--model <MODEL>`: when using the `deepseek` facade, forward a DeepSeek model override to the TUI
 - `--workspace <DIR>`: workspace root for file tools
 - `--yolo`: start in YOLO mode
@@ -1,23 +1,21 @@
 # Sub-Agents

-Sub-agents are background instances of the agent loop. The parent
-agent spawns one with a focused task, gets back an `agent_id`
-immediately, and continues working while the sub-agent runs to
-completion. Sub-agents inherit the parent's tool registry by default
-and run with `CancellationToken::child_token()`, so cancelling the
-parent cancels every descendant.
+Sub-agents are persistent background instances of the agent loop. The parent
+opens one with a focused task, gets back an `agent_id` and session name
+immediately, and continues working while the sub-agent runs to completion.
+Sub-agents inherit the parent's tool registry by default and run with
+`CancellationToken::child_token()`, so cancelling the parent cancels every
+descendant.

-This doc covers the role taxonomy. For the orchestration tool surface
-(`agent_spawn` / `agent_wait` / `agent_result` / `agent_cancel` /
-`agent_list` / `agent_send_input` / `agent_resume` / `agent_assign`)
-see `prompts/base.md` "Sub-Agent Strategy" and the in-line tool
-descriptions.
+This doc covers the role taxonomy. The active orchestration surface is
+`agent_open`, `agent_eval`, and `agent_close`; see `prompts/base.md`
+"Sub-Agent Strategy" and the in-line tool descriptions.

 ## Role taxonomy

-The `agent_type` field on `agent_spawn` selects a system-prompt
-posture for the child. Each role is a distinct stance toward the
-work — not just a different label.
+The `type` field on `agent_open` selects a system-prompt posture for the child
+(`agent_type` is accepted as a compatibility alias). Each role is a distinct
+stance toward the work — not just a different label.

 | Role          | Stance                                 | Writes? | Runs shell? | Typical use                                  |
 |---------------|----------------------------------------|---------|-------------|----------------------------------------------|
@@ -32,23 +30,22 @@ work — not just a different label.
 Each role's full system prompt lives in
 `crates/tui/src/tools/subagent/mod.rs` (search for
 `*_AGENT_PROMPT`). The prompt prefix loads automatically when the
-child agent boots; the parent's spawn prompt becomes the first
+child agent boots; the parent's assignment prompt becomes the first
 turn's user message.

 ## Context forking

-`agent_spawn` starts fresh by default: the child gets its role prompt
-plus the task you pass. Use `fork_context: true` when the child should
-continue from the parent's current request prefix instead. In fork
-mode the child request keeps the parent's system prompt and message
-history byte-identical, appends a structured state snapshot, then
-adds the sub-agent role instructions and task at the tail. That keeps
-DeepSeek prefix-cache reuse high while giving the child the context
-needed for continuation, review, summarization, or compaction work.
+`agent_open` starts fresh by default: the child gets its role prompt plus the
+task you pass. Use `fork_context: true` when the child should continue from
+the parent's current request prefix instead. In fork mode the runtime keeps the
+parent prefill/prompt prefix byte-identical where available, appends a
+structured state snapshot, then adds the sub-agent role instructions and task
+at the tail. That preserves DeepSeek prefix-cache reuse while giving the child
+the context needed for continuation, review, summarization, or compaction work.

-Use fresh spawns for independent exploration. Use forked spawns when
-the task depends on decisions, files, todos, or plan state already in
-the parent transcript.
+Use fresh sessions for independent exploration. Use forked sessions when the
+task depends on decisions, files, todos, or plan state already in the parent
+transcript.

 ### When to pick which role

@@ -56,8 +53,12 @@ the parent transcript.
  look", "design", or "verify". This is the right default; reach for
  a more specific role only when the posture matters.
 - **`explore`** — when the parent needs evidence before deciding what
-  to do next. Explorers are cheap and fast; spawn 2–3 in parallel
+  to do next. Explorers are cheap and fast; open 2–3 in parallel
  for independent regions.
+  They should orient first: confirm the project root, read relevant
+  `AGENTS.md`/`README.md` guidance in unfamiliar trees, search only the
+  likely scope, and return `path:line-range` evidence instead of a narrative
+  tour. The role name to use is `explore` or `explorer`.
 - **`plan`** — when the parent has an objective but no executable
  decomposition. Planners write artifacts (`update_plan` rows,
  `checklist_write` entries) but don't carry them out.
@@ -74,7 +75,7 @@ the parent transcript.
  candidates under RISKS.
 - **`custom`** — only when the parent needs to constrain the tool
  set explicitly. Pass the allowlist via the `allowed_tools` field
-  on `agent_spawn`.
+  on `agent_open`.

 ### Aliases

@@ -98,9 +99,9 @@ the next turn.

 The dispatcher caps concurrent sub-agents at 10 by default
 (configurable via `[subagents].max_concurrent` in `~/.deepseek/config.toml`,
-hard ceiling 20). When the parent hits the cap, `agent_spawn` returns
-an error with the cap value; the parent should `agent_wait` for
-completion or `agent_cancel` to free a slot before retrying.
+hard ceiling 20). When the parent hits the cap, `agent_open` returns
+an error with the cap value; the parent should use `agent_eval` to wait for
+completion or `agent_close` to free a slot before retrying.

 The cap counts only **running** agents — completed / failed /
 cancelled records persist for inspection but don't occupy a slot.
@@ -109,7 +110,7 @@ restart) also don't count against the cap.

 ## Lifecycle

-Each spawn produces a record that progresses through:
+Each opened session produces a record that progresses through:

 ```
 Pending → Running → (Completed | Failed(reason) | Cancelled | Interrupted(reason))
@@ -118,20 +119,18 @@ Pending → Running → (Completed | Failed(reason) | Cancelled | Interrupted(re
 `Interrupted` fires when the manager detects a `Running` agent
 whose task handle is gone — typically after a process restart that
 loaded the agent from `~/.deepseek/subagents.v1.json`. The parent
-can `agent_resume` to attempt continuation or treat it as a
+can open a replacement session with the same assignment or treat it as a
 terminal state.

 ### Session boundaries (#405)

 Each `SubAgentManager` instance assigns itself a fresh
-`session_boot_id` on construction. Every spawn stamps the agent
+`session_boot_id` on construction. Every new session stamps the agent
 with that id; the persisted state file carries it across restarts.

-`agent_list` defaults to **current-session only**: prior-session
-agents that aren't still running are filtered out. Pass
-`include_archived=true` to surface every record, with the
-`from_prior_session: true` flag so the model can tell archived
-records apart from live ones.
+`agent_eval` and the sidebar/status projections focus on current-session
+agents by default. Prior-session agents that are not still running are treated
+as archived records so the model does not mistake stale work for live work.

 Records that loaded from a pre-#405 persisted state file (no
 `session_boot_id` field) classify as prior-session because the
@@ -15,7 +15,7 @@ chosen over the available shell equivalent. Companion to `crates/tui/src/prompts
  for the same backing operation are a model trap — the LLM will alternate
  between them and the cache hit rate suffers.

-## Current surface (v0.7.5)
+## Current surface (v0.8.33)

 ### File operations

@@ -27,6 +27,7 @@ chosen over the available shell equivalent. Companion to `crates/tui/src/prompts
 | `edit_file` | Search-and-replace inside a single file. Cheaper than a full rewrite. |
 | `apply_patch` | Apply a unified diff. The right tool for multi-hunk edits. |
 | `retrieve_tool_result` | Read summaries or slices of prior large tool outputs spilled to `~/.deepseek/tool_outputs/`; use `summary`, `head`, `tail`, `lines`, or `query` instead of replaying the whole result. |
+| `handle_read` | Read bounded projections from `var_handle` payloads held by live tool environments. This is the foundation for RLM sessions, sub-agent transcripts, and other large symbolic payloads. |

 ### Search

@@ -137,18 +138,72 @@ Large logs and command outputs should be artifacts with compact summaries in the

 ### Sub-agents

-`agent_spawn` plus the supporting tools (`agent_result` / `wait` / `send_input` /
-`agent_assign` / `agent_cancel` / `resume_agent` / `agent_list`).
+v0.8.33 begins moving large tool outputs toward symbolic handles: tools return
+small `var_handle` objects, and `handle_read` retrieves bounded slices, counts,
+or JSON projections from the backing environment. This keeps the parent
+transcript small while preserving a recovery path to the full payload.
+
+The active model-facing sub-agent surface is persistent and intentionally small:
+
+| Tool | Niche |
+|---|---|
+| `agent_open` | Open a named sub-agent session for independent work. Returns a session projection immediately so the parent can keep coordinating. |
+| `agent_eval` | Send follow-up input, block for completion, or fetch the current projection/transcript handle for an existing session. |
+| `agent_close` | Cancel or release a sub-agent session by name or id. |
+
 See `agent.txt` for the delegation protocol and
 [`SUBAGENTS.md`](SUBAGENTS.md) for the role taxonomy
 (`general` / `explore` / `plan` / `review` / `implementer` /
 `verifier` / `custom`).

-`agent_spawn` defaults to a fresh child conversation. Pass
-`fork_context: true` for continuation-style work that should inherit the
-parent's system prompt and message prefix for DeepSeek prefix-cache reuse.
-The deprecated `delegate_to_agent` compatibility wrapper routes through
-`agent_spawn` and defaults `fork_context` to true.
+`agent_open` defaults to a fresh child conversation. Pass
+`fork_context: true` for continuation-style work or multi-perspective reviews
+that should inherit the parent's context. In fork mode, the runtime preserves
+the parent prefill/prompt prefix byte-identically where available so DeepSeek's
+prefix cache can be reused, then appends the child role instructions and task.
+
+### Recursive LM sessions
+
+RLM is now persistent as well:
+
+| Tool | Niche |
+|---|---|
+| `rlm_open` | Open a named Python REPL over a file, inline content, or URL. |
+| `rlm_eval` | Run bounded Python against that session, using deterministic code and in-REPL semantic helpers such as `sub_query_batch`. |
+| `rlm_configure` | Adjust output feedback, child-query timeout/depth, and session-sharing settings. |
+| `rlm_close` | Shut down the Python runtime and return final session stats. |
+
+Large RLM outputs should come back as `var_handle`s. Use `handle_read` for
+bounded text slices, line ranges, counts, or JSONPath projections instead of
+replaying the full value into the parent transcript.
+
+Inside `rlm_eval`, the loaded source is available as `_context`; `content` is
+also bound as a convenience alias because agents naturally reach for it during
+Python analysis. The shorter `context` and `ctx` names are intentionally not
+bound so user variables can use them without colliding with the bootstrap.
+
+Child-call timeouts are session policy: use `rlm_configure` with
+`sub_query_timeout_secs` before running a large fan-out. The helpers
+`sub_query`, `sub_query_batch`, `sub_query_map`, and `sub_rlm` accept a
+`timeout_secs` keyword for compatibility with common agent guesses, but the
+effective timeout remains configured at the RLM session level.
+
+`finalize(value, confidence=...)` preserves JSON-serializable values. Strings
+become text handles; dicts, lists, numbers, booleans, and null become JSON
+handles that `handle_read` can project with JSONPath.
+
+### Session relay
+
+`/relay [focus]` asks the current agent to write `.deepseek/handoff.md` as a
+compact `# Session relay` artifact for the next thread. The filename remains
+for compatibility with existing prompt loading and older sessions; the visible
+mental model is relay / 接力.
+
+Aliases: `/batonpass`, `/接力`.
+
+Use it before a long break, compaction, or moving work to a fresh session. The
+relay should preserve the goal, current Work checklist item, changed files,
+decisions, verification state, and one concrete next action.

 ### Parallel fan-out: cost-class caps

@@ -157,55 +212,73 @@ reflect very different cost classes:

 | Tool | What each child does | Wall-clock | Token cost | Cap |
 |---|---|---|---|---|
-| `agent_spawn` | Full sub-agent loop (planning, tool calls, multi-turn streaming, can spawn children) | minutes | thousands of tokens | 10 in flight by default (`[subagents].max_concurrent`, hard ceiling 20) |
-| `rlm` helper `llm_query_batched` | One-shot non-streaming Chat Completions calls pinned to `deepseek-v4-flash` | seconds | ~hundreds of tokens | 16 per call |
+| `agent_open` | Full sub-agent loop (planning, tool calls, multi-turn streaming, can open children) | minutes | thousands of tokens | 10 in flight by default (`[subagents].max_concurrent`, hard ceiling 20) |
+| `rlm_eval` helper `sub_query_batch` | One-shot non-streaming Chat Completions calls pinned to `deepseek-v4-flash` inside a live RLM session | seconds | ~hundreds of tokens | 16 per call |

 The caps appear in each tool's description and error messages so the model
 (and the user) can choose the right tool for the job. If one sub-agent is
-enough but you need parallel lookups, prefer `rlm` with `llm_query_batched`; if each task needs
-its own tool-carrying agent loop, use `agent_spawn` (and cancel completed
-ones to free slots).
+enough but you need parallel semantic lookups over the same loaded context,
+prefer `rlm_eval` with `sub_query_batch`; if each task needs its own
+tool-carrying agent loop, use `agent_open` and close completed sessions to free
+slots.

-## Recently consolidated (v0.5.1)
+## Removed legacy aliases and surfaces

-Removed from the prompt as duplicates of equivalent tools (the underlying
-dispatchers still resolve them, so existing sessions don't break — they just
-no longer pollute the model's tool list):
+v0.8.33 removes the old model-facing sub-agent fan-out surface from active
+prompting and tool catalogs. Do not use these names in new active guidance:
+`agent_spawn`, `agent_wait`, `agent_result`, `agent_send_input`,
+`agent_assign`, `agent_resume`, `agent_list`, `spawn_agent`,
+`delegate_to_agent`, `send_input`, and `close_agent`.

- `spawn_agent` → use `agent_spawn`.
- `close_agent` → use `agent_cancel`.
- `assign_agent` → use `agent_assign`.
+The old one-shot `rlm` model-facing tool is also replaced by persistent
+`rlm_open` / `rlm_eval` / `rlm_configure` / `rlm_close` sessions.

-## Deprecation schedule (v0.6.2 → v0.8.0)
-
-The alias tools below still execute successfully but now attach a
-`_deprecation` block to every result they return. Models should migrate to
-the canonical name before v0.8.0, when the aliases will be removed.
-
-| Deprecated alias | Canonical name | Warning since | Removal |
-|---|---|---|---|
-| `spawn_agent` | `agent_spawn` | v0.6.2 | v0.8.0 |
-| `delegate_to_agent` | `agent_spawn` | v0.6.2 | v0.8.0 |
-| `close_agent` | `agent_cancel` | v0.6.2 | v0.8.0 |
-| `send_input` | `agent_send_input` | v0.6.2 | v0.8.0 |
-
-The `_deprecation` block shape:
+Historical compatibility results may include a `_deprecation` block shaped
+like this:

 ```json
 {
  "_deprecation": {
    "this_tool": "spawn_agent",
-    "use_instead": "agent_spawn",
-    "removed_in": "0.8.0",
-    "message": "Tool 'spawn_agent' is deprecated; switch to 'agent_spawn' before v0.8.0."
+    "use_instead": "agent_open",
+    "removed_in": "0.8.33",
+    "message": "Tool 'spawn_agent' is deprecated; switch to 'agent_open'."
  }
 }
 ```

-This block is merged into the tool result's `metadata` object alongside any
-other metadata keys (e.g. `status`, `timed_out`) so it does not displace
-existing metadata.  A one-line deprecation warning is also emitted to the
-audit log at `tracing::warn` level every time an alias is invoked.
+This is a legacy/compatibility note, not the active recommended surface.
+
+## Release smoke: verify the live names
+
+When validating a release, verify the model-visible registry names directly.
+Do not grep random handler function names; handler names are allowed to drift
+while the registry contract stays stable.
+
+Version smoke:
+
+```bash
+deepseek --version
+deepseek-tui --version
+```
+
+Tool-surface smoke:
+
+```bash
+rg -n '"handle_read"|"rlm_open"|"rlm_eval"|"rlm_configure"|"rlm_close"|"agent_open"|"agent_eval"|"agent_close"' crates/tui/src
+rg -n 'handle_read|rlm_open|rlm_eval|rlm_configure|rlm_close|agent_open|agent_eval|agent_close' docs crates/tui/src/prompts crates/tui/src/tools
+```
+
+The canonical v0.8.33 live names are:
+
+- `handle_read`
+- `rlm_open`, `rlm_eval`, `rlm_configure`, `rlm_close`
+- `agent_open`, `agent_eval`, `agent_close`
+
+The registry should not actively advertise the legacy one-shot names
+`agent_spawn`, `agent_wait`, `agent_result`, or the old foreground `rlm` tool
+outside legacy/removal notes. Historical changelog entries and compatibility
+code may still mention them.

 ## Why we don't ship a single `bash` tool

@@ -1,7 +1,7 @@
 {
  "name": "deepseek-tui",
-  "version": "0.8.32",
-  "deepseekBinaryVersion": "0.8.32",
+  "version": "0.8.33",
+  "deepseekBinaryVersion": "0.8.33",
  "description": "Install and run deepseek and deepseek-tui binaries from GitHub release artifacts.",
  "author": "Hmbown",
  "license": "MIT",
@@ -1,6 +1,6 @@
 import { fetchFeed, fetchRepoStats } from "@/lib/github";
 import { curate } from "@/lib/deepseek";
-import { putDispatch } from "@/lib/kv";
+import { putDispatchWithKv } from "@/lib/kv";
 import {
  agentChat,
  TRIAGE_PROMPT,
@@ -32,6 +32,8 @@ export interface AgentEnv {
  MAINTAINER_GITHUB_PAT?: string;
 }

+const CRON_STATUS_TTL = 60 * 60 * 24 * 14;
+
 function dsEnv(env: AgentEnv): DeepSeekEnv {
  return {
    baseUrl: env.DEEPSEEK_BASE_URL ?? process.env.DEEPSEEK_BASE_URL,
@@ -49,10 +51,29 @@ export async function runCurate(env: AgentEnv): Promise<Record<string, unknown>>
      fetchFeed(env.GITHUB_TOKEN, 30),
    ]);
    const dispatch = await curate(env.DEEPSEEK_API_KEY, stats, feed, dsEnv(env));
-    await putDispatch(dispatch);
-    return { ok: true, headline: dispatch.headline };
+    await putDispatchWithKv(env.CURATED_KV, dispatch);
+    await env.CURATED_KV?.put(
+      "cron:curate:last",
+      JSON.stringify({
+        ok: true,
+        generatedAt: dispatch.generatedAt,
+        headline: dispatch.headline,
+      }),
+      { expirationTtl: CRON_STATUS_TTL }
+    );
+    return { ok: true, headline: dispatch.headline, stored: env.CURATED_KV ? "kv" : "memory" };
  } catch (e) {
-    return { ok: false, error: String(e) };
+    const error = String(e);
+    await env.CURATED_KV?.put(
+      "cron:curate:last",
+      JSON.stringify({
+        ok: false,
+        generatedAt: new Date().toISOString(),
+        error,
+      }),
+      { expirationTtl: CRON_STATUS_TTL }
+    );
+    return { ok: false, error };
  }
 }

@@ -18,8 +18,8 @@ export interface RepoFacts {
 }

 export const FACTS: RepoFacts = {
-  "generatedAt": "2026-05-12T19:02:49.213Z",
-  "version": "0.8.32",
+  "generatedAt": "2026-05-12T22:56:03.599Z",
+  "version": "0.8.33",
  "crates": [
    "agent",
    "app-server",
@@ -90,7 +90,7 @@ export const FACTS: RepoFacts = {
  ],
  "defaultModel": "deepseek-v4-pro",
  "nodeEngines": ">=18",
-  "toolCount": 64,
+  "toolCount": 68,
  "license": "MIT",
  "latestRelease": null
 };
@@ -6,7 +6,7 @@ import type { CuratedDispatch } from "./types";

 const MEM = new Map<string, string>();

-interface KVNamespace {
+export interface KVNamespace {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, opts?: { expirationTtl?: number }): Promise<void>;
  list(opts?: { prefix?: string; limit?: number }): Promise<{ keys: { name: string }[] }>;
@@ -53,9 +53,13 @@ export async function getDispatch(): Promise<CuratedDispatch | null> {

 export async function putDispatch(d: CuratedDispatch): Promise<void> {
  const env = await getEnv();
+  await putDispatchWithKv(env.CURATED_KV, d);
+}
+
+export async function putDispatchWithKv(kv: KVNamespace | undefined, d: CuratedDispatch): Promise<void> {
  const value = JSON.stringify(d);
-  if (env.CURATED_KV) {
-    await env.CURATED_KV.put("dispatch:latest", value, { expirationTtl: 60 * 60 * 24 * 7 });
+  if (kv) {
+    await kv.put("dispatch:latest", value, { expirationTtl: 60 * 60 * 24 * 7 });
  } else {
    MEM.set("dispatch:latest", value);
  }