v0.6.6 — sub-agents inherit the parent's full tool registry,
auto-approve, respect a depth cap, and propagate cancellation.
Adds optional cwd to agent_spawn for parallel-worktree dispatch.
Schema-ready for roles (full library lands in 0.6.7).
Changes:
- New ToolRegistryBuilder::with_full_agent_surface(...) shared by parent and child
- SubAgentToolRegistry::new refactored to use shared builder; per-type
allowlist becomes advisory
- SubAgentRuntime gains auto_approve, spawn_depth, max_spawn_depth, cancel_token
- Depth check at spawn entry; cancellation cascade via CancellationToken::child_token()
- <deepseek:subagent.done> sentinel emitted on child completion
- cwd: Option<PathBuf> on agent_spawn with workspace-boundary validation
- Stream wall-clock cap bumped to 30 min (was 300s)
- max_spawn_depth configurable via EngineConfig (default 3)
- Version bump to 0.6.6
Closes#99.
The previous /rlm slash command flow had a UI rendering gap (the answer
never made it back to the model's view) and required the user to invoke
it manually. Pivoting to a tool-call surface and aligning the in-REPL
helpers with the canonical reference (alexzhang13/rlm) by the paper
authors so the same prompts and decomposition patterns transfer.
New tool: rlm_process
- crates/tui/src/tools/rlm_process.rs
- Inputs: task (small, shown to root LLM each iter as root_prompt) +
exactly one of file_path (workspace-relative, preferred) or content
(inline, capped at 200k chars). Optional child_model and max_depth.
- Loaded across Plan/Agent/YOLO; never deferred via ToolSearch.
- Returns the final answer string + metadata (iterations, duration,
tokens, termination).
REPL surface aligned with reference (alexzhang13/rlm):
- Variable name `context` (was PROMPT)
- Code fence ```repl (was ```python; python/py kept as fallback)
- Helpers: llm_query, llm_query_batched (NEW), rlm_query (was sub_rlm),
rlm_query_batched (NEW), SHOW_VARS (NEW), FINAL, FINAL_VAR,
repl_get/repl_set
- Top-level JSON-serializable user variables auto-persist across rounds
(no repl_set ceremony required)
- FINAL(...) / FINAL_VAR(...) parseable from the model's raw response
text (parse_text_final), in addition to the in-REPL sentinel path.
Code-fenced occurrences are correctly ignored to prevent false hits.
Sidecar (axum, 127.0.0.1:0):
- Added POST /llm_batch and POST /rlm_batch endpoints (parallel fanout,
cap 16 prompts per batch). Mirrors the reference's batched semantics.
Other:
- System prompt rewritten with reference's strategy patterns
(PREVIEW → CHUNK+map-reduce via llm_query_batched → RECURSIVE
decomposition via rlm_query → programmatic compute + LLM interp).
- Strict termination loop unchanged: must emit ```repl or text-level
FINAL each round; one fence-less round → reminder, two → DirectAnswer.
- /rlm slash command remains for manual debug; description points the
model toward rlm_process for the in-agent flow.
Versions: workspace 0.6.4 → 0.6.5; npm wrapper 0.6.4 → 0.6.5.
Gates green: cargo fmt, cargo clippy --all-targets --all-features
--locked -D warnings, cargo test --workspace --all-features --locked
(all pass), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS=
-Dwarnings cargo doc --workspace --no-deps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The v0.6.3 RLM loop had Algorithm 1's outer shape but the substrate was
non-functional: `llm_query()` was a Python stub that returned a hardcoded
string and `child_model` was bound with an underscore prefix and silently
dropped. The recursive sub-LLM call advertised by /rlm never fired.
This commit wires the substrate end-to-end per Zhang/Kraska/Khattab
(arXiv:2512.24601, Algorithm 1):
- New axum HTTP sidecar (`rlm/sidecar.rs`) bound to 127.0.0.1:0 for the
duration of one RLM turn. Python's `llm_query()` and `sub_rlm()` are
real `urllib.request` POSTs; Rust services them via the existing
DeepSeek client. Token usage from sidecar-served calls folds into the
parent `RlmTurnResult.usage`.
- `child_model` is plumbed through `Op::RlmQuery` → `AppAction::RlmQuery`
→ `run_rlm_turn` → sidecar handlers; default remains `deepseek-v4-flash`.
- New `sub_rlm(prompt)` Python helper runs a full Algorithm-1 turn at
depth-1 (paper's `sub_RLM`). Default `max_depth = 2` from `/rlm`. The
recursive opaque-future cycle is broken by returning a concrete
`Pin<Box<dyn Future + Send>>` from `run_rlm_turn_inner`.
- Strict termination: the loop ends only via `FINAL(value)` (or the
iteration cap). One fence-less round is tolerated with a reminder
appended; two consecutive ones surface the model text as a
`RlmTermination::DirectAnswer` exit. New `RlmTermination` enum lets
callers tell `Final | DirectAnswer | Exhausted | Error` apart.
- Richer `Metadata(state)`: includes paper-required access patterns
(`repl_get` / slicing / `splitlines` / `repl_set` / `llm_query` /
`sub_rlm` / `FINAL`) and a live list of variable keys currently in
the REPL state file.
- Unicode-safe `truncate_text` (was mixing bytes with chars), per-turn
state-file cleanup, `ROOM_TEMPERATURE` typo → `ROOT_TEMPERATURE`.
- New end-to-end test `sidecar_url_is_exported_to_python_env` stands up
a stand-in axum server, runs `print(llm_query('hello'))` in the real
PythonRuntime, and asserts the reply round-trips. Catches future
regressions in sidecar URL passthrough.
Versions: workspace 0.6.3 → 0.6.4 in Cargo.toml; npm wrapper 0.6.3 → 0.6.4
in npm/deepseek-tui/package.json.
Gates: cargo fmt, cargo clippy --all-targets --all-features --locked
-D warnings, cargo test --workspace --all-features --locked (1088
passed), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS=-Dwarnings
cargo doc --workspace --no-deps — all green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Highlights:
- fix(client): SSE idle-timeout so a stalled stream surfaces a clear error
instead of hanging the active cell (#76)
- fix(tui): sidebar Agents panel reads live engine progress, not just the
cached snapshot — matches the footer chip in real time (#63)
- fix(tui): generic tool result preview preserves newlines for diff stats
/ file lists / todo snapshots (#80)
- fix(tui): slash-menu scroll viewport now exercises center-tracking past
the first 6 entries (#64)
- feat(mcp): connect-failure errors include URL, status, body excerpt,
transport — credentials masked (#71)
- feat(tools): mark alias tools (spawn_agent, close_agent, send_input,
delegate_to_agent) with _deprecation metadata; removal slated 0.8.0 (#72)
- feat(capacity): V4 model priors (deepseek-v4-pro/flash) + key
normalization, plus DEEPSEEK_CAPACITY_PRIOR_V4_* env overrides (#73)
- feat(tools): explain parallel fan-out caps in agent_spawn vs rlm_query
descriptions and error messages — cost-class table in TOOL_SURFACE.md (#81)
- chore(errors): partial wiring of the error taxonomy — classify_error_message
helper used in capacity controller, audit log fields pending (#66)
- chore(providers): scaffold OpenRouter and Novita variants end-to-end
(env keys, default base URLs, model normalization). Modal /provider
picker UI still pending (#52)
Build hygiene:
- cargo fmt clean, cargo clippy --workspace -- -D warnings clean
- cargo test --workspace passes (979+ tests across crates)
- pre-existing dead-code warnings gated per-item with TODO refs to #61/#66
EOF
)
- V4 cache-hit input prices cut to 1/10th per DeepSeek pricing update:
Pro promo 0.03625→0.003625, Pro base 0.145→0.0145, Flash 0.028→0.0028
- Remove the 'light' theme variant (Variant::Light, Theme::light(), test)
- Remove the theme setting entirely — hardcode UI_THEME to whale/dark,
drop the theme field from Settings, ConfigView, and config command
- Bump workspace version 0.6.0 → 0.6.1 (Cargo.toml, npm pkg, CHANGELOG)
- De-cringe the README: drop emojis, marketing fluff, unverified claims
Adds a structured rlm_query tool for parallel/batched LLM fan-out.
The model calls it with one prompt or up to 16 concurrent prompts;
children dispatch via tokio::join_all against the existing DeepSeek
client. Default child model is deepseek-v4-flash; override per-call
via the model field. Available in Plan / Agent / YOLO. Cost folds
into the session's running total automatically.
Fixes scroll-stuck regression (#56): TranscriptScroll::resolve_top
and scrolled_by now use a three-level fallback chain (same line →
same cell line 0 → nearest cell at-or-before) instead of teleporting
to ToBottom when an anchor cell vanishes.
Loosens command-safety chains (#57): cargo build && cargo test and
similar chains of known-safe commands now escalate to RequiresApproval
instead of being hard-blocked as Dangerous. Chains containing unknown
commands still block.
Suppresses the GettingCrowded footer chip — context-percent header
already covers conversation pressure.
Refactors:
- Extracts file_mention parsing/completion/expansion (~450 LOC) from
the 5,500-line ui.rs into crates/tui/src/tui/file_mention.rs.
- Deletes truly unused helpers (write_bytes, timestamped_filename,
extension_from_url, output_path, has_project_doc, primary_doc_path).
Tests: 853 pass. cargo clippy --workspace -D warnings clean.
cargo fmt --all -- --check clean.
Closes#46#47#48#49#50#53#54#55#56#57#58.
Single-feature release that lands #39 (the /model two-pane picker)
on top of the v0.5.1 quality-of-life batch. Bumps workspace +
npm wrapper + Cargo.lock in lockstep; check-versions.sh verifies.
Closes#39.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps workspace version to 0.5.1 and finalises the changelog with the
issues that landed since v0.5.0:
* #25, #27, #31, #33, #34 (already on main)
* #28 @file Tab-completion
* #29 per-workspace trust list with /trust slash command
* #30 reasoning-replay token chip in the footer
* #36 regression tests for sidebar gutter bleed
scripts/release/check-versions.sh is green: workspace=0.5.1, npm=0.5.1,
Cargo.lock in sync.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
### Fixed
- DeepSeek thinking-mode tool-call rounds now always replay reasoning_content
in all subsequent requests (including across new user turns), matching the
documented API contract that assistant tool-call messages must retain their
reasoning content forever. Previously, reasoning_content was cleared after
the current user turn completed, which could cause HTTP 400 errors.
- Missing reasoning_content on a tool-call assistant message now substitutes
a safe placeholder ("(reasoning omitted)") instead of dropping the tool
calls and their matching tool results, preventing orphaned conversation
chains and API 400 rejections.
- Session checkpoint now persists a Thinking-block placeholder for tool-call
turns that produced no streamed reasoning text, keeping on-disk sessions
structurally correct for subsequent requests.
- Token estimation for compaction now counts thinking tokens across ALL
tool-call rounds (not just the current user turn), aligning with the
updated reasoning_content replay rule.
### Changed
- Internal crate dependency pins bumped 0.4.5 → 0.4.9 to match workspace.
- npm wrapper version and deepseekBinaryVersion bumped to 0.4.9.
- README fully rewritten: clearer feature highlights, V4 model focus,
keyboard shortcut table, improved docs index, and more engaging layout.
- CHANGELOG entry for 0.4.9 with comparison URLs.
Bump workspace version 0.4.6 -> 0.4.7 and ship the bug fix flagged by Devin
on PR #18: an uncommented `DEEPSEEK_API_KEY=` line in .env.example caused
`cp .env.example .env` to load an empty key, which `apply_env_overrides`
then propagated into the config and `Config::validate()` rejected on
startup with "api_key cannot be empty string".
- .env.example: comment out the empty `DEEPSEEK_API_KEY=` placeholder.
- crates/tui/src/config.rs: skip empty `DEEPSEEK_API_KEY` env values in
`apply_env_overrides`, matching the facade's empty-string filter.
- Add `apply_env_overrides_ignores_empty_api_key` regression test.
- Bump Cargo.toml workspace version, npm wrapper version + binary version,
and Cargo.lock.
Bundles new work since v0.4.4: NIM provider support, file mention
attachments, provider switch UX, setup --status/--clean/--tools/--plugins,
doctor --json, and protocol-recovery hardening that strips fake
tool-call wrappers from streaming model output.
Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.
Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.
Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
- Move src/* into crates/tui/src/ to create a proper workspace structure
- Add .claude/ and .trimtab/ directories for Trimtab closed-loop workflow
- Add DEPENDENCY_GRAPH.md and update documentation
- Update Cargo.toml files to reflect new crate dependencies
- Update CI workflows and npm package scripts
- All tests pass, release build works
- Add npm/deepseek-tui package that downloads prebuilt binaries from
GitHub releases (supports macOS, Linux, Windows)
- Published as deepseek-tui@0.3.28 on npmjs.com
- Update README to feature npm as primary install method
- Add npm badge