Inject LSP diagnostics as a synthetic user message after every successful
file edit (`edit_file`, `apply_patch`, `write_file`) so the agent sees
compile breaks before its next reasoning step. Largest agent-quality
lever in v0.7.0.
Pieces:
- `crates/tui/src/lsp/`: thin JSON-RPC stdio client (no `tower-lsp`),
per-language registry, diagnostics renderer producing the
`<diagnostics file="…">` block format. `LspManager` owns lazily
spawned per-language transports keyed by `Language`.
- `core/engine.rs`: hook on the success branch of the tool-result loop
derives the edited file path(s) per tool, queries the LspManager
with a 5 s timeout, and collects rendered blocks into
`pending_lsp_blocks`. The queue is flushed as a `text` content
block on the next request iteration so the model sees the
diagnostics before it streams its next turn.
- `[lsp]` config schema (`enabled`, `poll_after_edit_ms`,
`max_diagnostics_per_file`, `include_warnings`, optional
`servers` override) with built-in defaults for rust-analyzer,
gopls, pyright, typescript-language-server, and clangd.
- Failure modes are non-blocking by design: a missing LSP binary
logs a one-time warning and skips the hook; a crashed server or
poll timeout simply drops that turn's diagnostics. The agent's
work is never blocked.
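A minimal sketch of the filter/sort/truncate step and the block renderer described above. Only the `<diagnostics file="…">` opening tag and the `include_warnings` / `max_diagnostics_per_file` knobs come from this commit; the `Diagnostic` struct, the per-line layout inside the block, the closing tag, and the "more omitted" line are assumptions for illustration.

```rust
// Hypothetical types; the real renderer in crates/tui/src/lsp/ may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Severity {
    Error, // declared first so errors sort ahead of warnings
    Warning,
}

#[derive(Clone, Debug)]
struct Diagnostic {
    line: u32,
    col: u32,
    severity: Severity,
    message: String,
}

/// Renders one block per file, or `None` when nothing survives filtering.
fn render_block(
    file: &str,
    diags: &[Diagnostic],
    include_warnings: bool,
    max_per_file: usize,
) -> Option<String> {
    // Filter: errors only unless warnings are explicitly included.
    let mut kept: Vec<&Diagnostic> = diags
        .iter()
        .filter(|d| include_warnings || d.severity == Severity::Error)
        .collect();
    if kept.is_empty() {
        return None;
    }
    // Sort by severity, then position; truncate to the per-file cap.
    kept.sort_by_key(|d| (d.severity, d.line, d.col));
    let total = kept.len();
    kept.truncate(max_per_file);

    let mut out = format!("<diagnostics file=\"{file}\">\n");
    for d in &kept {
        let label = match d.severity {
            Severity::Error => "error",
            Severity::Warning => "warning",
        };
        out.push_str(&format!("{}:{}: {}: {}\n", d.line, d.col, label, d.message));
    }
    if total > kept.len() {
        out.push_str(&format!("... {} more omitted\n", total - kept.len()));
    }
    out.push_str("</diagnostics>");
    Some(out)
}
```

Returning `None` for an empty result keeps the caller from queueing an empty block for the next request.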
Tests: 24 unit tests cover language detection, registry overrides,
filter/sort/truncate behavior, and the rendered block format. Three
engine-level tokio tests exercise the full path through a fake
transport (no real LSP server is ever spawned in CI).
Acceptance criteria (per #136):
- Edit introducing a type error -> next request body contains
`<diagnostics file="…">` block at the right line/col.
- `[lsp] enabled = false` -> no diagnostics injected.
- Snapshot test exercises full path with mock transport.
- LSP binary not on PATH -> one-time warning, agent proceeds.
- 5 s timeout, errors-only by default.
- Transports spawn lazily on first edit per language.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a queue-driven `MockLlmClient` that implements the `LlmClient` trait
by replaying canned per-turn `StreamEvent` vectors and capturing every
outgoing `MessageRequest`. The mock lives at the trait boundary so it
stays decoupled from the concrete reqwest plumbing inside `DeepSeekClient`,
and surfaces builders (`canned::*`) for the common event shapes (text
delta, thinking delta, tool_use start, input JSON delta, message delta).
Wires a new `--record <DIR>` flag into `deepseek eval` that appends one
JSON Lines fixture line per step to `<DIR>/<scenario>.jsonl`. The format
is documented at the top of `eval.rs` and is the storage shape the mock
will replay from.
`crates/tui/src/llm_client.rs` becomes `crates/tui/src/llm_client/mod.rs`
to host the new submodule cleanly. The trait shape is unchanged.
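A simplified, synchronous sketch of the queue-driven mock idea. The real `LlmClient` trait is not shown in this commit message, so the trait method, the `StreamEvent` variants, and the `MessageRequest` fields below are hypothetical stand-ins.

```rust
use std::collections::VecDeque;

// Hypothetical, trimmed-down event and request shapes.
#[derive(Clone, Debug, PartialEq)]
enum StreamEvent {
    TextDelta(String),
    MessageStop,
}

#[derive(Clone, Debug, PartialEq)]
struct MessageRequest {
    prompt: String,
}

// Assumed trait boundary; the real trait is likely async and streaming.
trait LlmClient {
    fn send(&mut self, request: MessageRequest) -> Vec<StreamEvent>;
}

#[derive(Default)]
struct MockLlmClient {
    turns: VecDeque<Vec<StreamEvent>>, // canned events, one entry per turn
    captured: Vec<MessageRequest>,     // every outgoing request, in order
}

impl MockLlmClient {
    fn push_turn(&mut self, events: Vec<StreamEvent>) {
        self.turns.push_back(events);
    }
}

impl LlmClient for MockLlmClient {
    fn send(&mut self, request: MessageRequest) -> Vec<StreamEvent> {
        self.captured.push(request);
        // An exhausted queue yields an empty turn rather than panicking.
        self.turns.pop_front().unwrap_or_default()
    }
}
```

Because the mock only depends on the trait, a test can assert on `captured` without any HTTP plumbing involved.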
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces `network_policy::{Decision, NetworkPolicy, NetworkPolicyDecider,
NetworkAuditor, NetworkSessionCache, NetworkDenied}` for gating outbound
network calls.
Deny-wins precedence: a host listed in both `allow` and `deny` is denied.
Subdomain wildcard via leading-dot entries (`.example.com` matches
`api.example.com` but not the apex). Audit log writes one plaintext line
per terminal decision to `~/.deepseek/audit.log` in the format
`<RFC3339> network <host> <tool> <Allow|Deny|Prompt-Approved|Prompt-Denied>`.
Approve-once-for-session caching is implemented in `NetworkSessionCache`;
`approve_persistent` mutates the policy's allow list so callers can write
back to config later.
19 unit tests cover deny-wins precedence, subdomain matching, audit
logging, session-cache short-circuit, and `NetworkDenied` shape.
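The precedence and wildcard rules can be sketched as follows. The type names echo the ones above, but their fields, the `Prompt` fallback for unlisted hosts, and the case-insensitive comparison are assumptions, not the shipped implementation.

```rust
#[derive(Debug, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny,
    Prompt, // assumed fallback when a host is in neither list
}

struct NetworkPolicy {
    allow: Vec<String>,
    deny: Vec<String>,
}

/// A leading-dot entry matches strict subdomains only; other entries match exactly.
fn entry_matches(entry: &str, host: &str) -> bool {
    let entry = entry.to_ascii_lowercase();
    let host = host.to_ascii_lowercase();
    if entry.starts_with('.') {
        // ".example.com" matches "api.example.com" but not "example.com".
        host.len() > entry.len() && host.ends_with(entry.as_str())
    } else {
        host == entry
    }
}

impl NetworkPolicy {
    fn decide(&self, host: &str) -> Decision {
        // Deny is checked first, so it wins when a host is in both lists.
        if self.deny.iter().any(|e| entry_matches(e, host)) {
            Decision::Deny
        } else if self.allow.iter().any(|e| entry_matches(e, host)) {
            Decision::Allow
        } else {
            Decision::Prompt
        }
    }
}
```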
Adds first-class keyring management on the dispatcher CLI and wires
the TUI to read its DeepSeek key through the same Secrets façade.
Subcommands:
* `auth set --provider <name>` writes to the OS keyring; prompts
on stdin without echo, never prints the key, never touches
`config.toml`. Supports `--api-key` and `--api-key-stdin`.
* `auth get --provider <name>` reports `set` / `not set` plus the
resolving layer (keyring / env / config-file). Never prints the
value.
* `auth clear --provider <name>` deletes from keyring and from any
legacy plaintext slot in `config.toml` for parity.
* `auth list` prints a table of all known providers and
whether each layer holds a key. Non-revealing.
* `auth migrate [--dry-run]` reads `api_key` (root + per-provider
blocks) from `config.toml`, writes them to the keyring, then
strips the entries from disk. Idempotent.
* `auth status` now also reports the active
keyring backend and per-provider keyring state.
`doctor` now prints `keyring backend: ...` plus per-provider
`keyring=yes/no, env=yes/no` lines and points users at
`deepseek auth set` when no key resolves.
`Config::deepseek_api_key()` in the TUI is rewritten to consult
`Secrets::auto_detect()` first (keyring -> env), then fall back to
the existing TOML slots with a deprecation warning. Error messages
now lead with `deepseek auth set --provider <name>`.
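The layered lookup order (keyring, then env, then the legacy TOML slot) can be sketched as below. Plain `HashMap`s stand in for the three layers so the example stays self-contained; `resolve_key`, `Layer`, the empty-value rule, and the warning text are hypothetical and are not the `Secrets` façade's actual API.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq, Eq)]
enum Layer {
    Keyring,
    Env,
    ConfigFile,
}

// Stand-in for each secret source, keyed by provider name.
type Store = HashMap<String, String>;

/// Returns the first layer holding a non-empty key, plus the key itself.
fn resolve_key(
    provider: &str,
    keyring: &Store,
    env: &Store,
    config_file: &Store,
) -> Option<(Layer, String)> {
    // Assumption: an empty value counts as "not set".
    let lookup = |store: &Store| store.get(provider).filter(|v| !v.is_empty()).cloned();

    if let Some(key) = lookup(keyring) {
        return Some((Layer::Keyring, key));
    }
    if let Some(key) = lookup(env) {
        return Some((Layer::Env, key));
    }
    if let Some(key) = lookup(config_file) {
        // The legacy plaintext slot still works but is flagged as deprecated.
        eprintln!("warning: api_key for {provider} read from config.toml; consider `deepseek auth migrate`");
        return Some((Layer::ConfigFile, key));
    }
    None
}
```

Returning the layer alongside the key is what would let a command like `auth get` report where a key resolves from without printing its value.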
5 new unit tests cover argument parsing for the new subcommands and
end-to-end auth set/clear/migrate behaviour against an
`InMemoryKeyringStore`, verifying that no plaintext key ever lands
in `config.toml`.
Verified manually on macOS:
$ deepseek auth set --provider deepseek --api-key-stdin
$ security find-generic-password -s deepseek -a deepseek
# entry present
$ deepseek auth migrate
# api_key lines stripped from ~/.deepseek/config.toml
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add crates/tui/assets/skills/skill-creator/SKILL.md (ported from
OpenAI codex repo, MIT-licensed; all Codex→deepseek / ~/.codex→~/.deepseek
replacements applied; MIT attribution comment at top).
- Convert crates/tui/src/skills.rs → skills/mod.rs + new skills/system.rs.
install_system_skills() writes the bundled SKILL.md on fresh install or
version bump; respects user-deleted directories; idempotent by design.
Version marker at ~/.deepseek/skills/.system-installed-version.
- Wire install_system_skills() into run_interactive() (main.rs) before the
TUI mounts; errors are non-fatal (logged as warnings).
- Add /skill new as an alias for /skill skill-creator in commands/skills.rs.
- 7 unit tests covering fresh install, idempotence, user-deleted dir, version
bump, uninstall (all pass).
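One way to picture the install decision is as a pure function over the version marker and the directory state. This is a sketch under stated assumptions: `plan_install` and `InstallAction` are hypothetical names, and treating "marker present but directory missing" as a user deletion that is honored even across a version bump is a guess at the intended behavior.

```rust
#[derive(Debug, PartialEq, Eq)]
enum InstallAction {
    Install,
    Skip,
}

/// `marker` is the content of the version marker file, if it exists.
fn plan_install(
    marker: Option<&str>,
    bundled_version: &str,
    skill_dir_exists: bool,
) -> InstallAction {
    match marker {
        // No marker at all: fresh install.
        None => InstallAction::Install,
        // Marker present but the directory is gone: assume the user deleted
        // the skill on purpose and leave it alone.
        Some(_) if !skill_dir_exists => InstallAction::Skip,
        // Same version already installed: nothing to do (idempotent).
        Some(installed) if installed.trim() == bundled_version => InstallAction::Skip,
        // Different version recorded: version bump, rewrite the bundled file.
        Some(_) => InstallAction::Install,
    }
}
```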
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
StatusItem enum covers all current footer items plus the new ones (rate-limit,
ctx %, git branch, last-tool elapsed). /statusline opens a multi-select picker with
live preview; selections persist to config.toml under `tui.status_items`.
Default selection mirrors today's footer so upgraders see no change.
Replaces lossy summarization compaction with a checkpoint-restart
architecture (#124). At 110K cumulative tokens (per V4's 128K retrieval
elbow) the engine runs a briefing turn, archives the cycle to JSONL at
~/.deepseek/sessions/<id>/cycles/<n>.jsonl, then resets the in-memory
buffer to a fresh context: original system prompt + structured state
(plan/todos/working-set/sub-agents) + the model-curated <carry_forward>
briefing (~3K token cap, hard-bounded).
The compaction summarizer is now off by default. Per-model thresholds in
[cycle.per_model] let operators tune deepseek-v4-pro vs -flash separately.
The phase guard in should_advance_cycle refuses to advance mid-tool, mid-stream,
or mid-approval; the engine only invokes it at clean turn-completed events. Sub-agents are not
awaited — their handles are captured in the structured-state block so the
new cycle sees them still running.
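A minimal sketch of the threshold check and phase guard. The function name `should_advance_cycle` and the 110K default come from this commit; the `Phase` enum, the signature, and the `threshold_for` helper are hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical engine phases; only TurnCompleted is a clean boundary.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Phase {
    TurnCompleted,
    ToolRunning,
    Streaming,
    AwaitingApproval,
}

const DEFAULT_THRESHOLD: u64 = 110_000;

/// Per-model override (as in `[cycle.per_model]`), else the default.
fn threshold_for(model: &str, per_model: &HashMap<String, u64>) -> u64 {
    per_model.get(model).copied().unwrap_or(DEFAULT_THRESHOLD)
}

/// Advance only at a clean boundary and only once the budget is reached.
fn should_advance_cycle(cumulative_tokens: u64, threshold: u64, phase: Phase) -> bool {
    phase == Phase::TurnCompleted && cumulative_tokens >= threshold
}
```

Keeping the guard a pure function of token count and phase makes each blocked boundary easy to cover with a one-line unit test.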
Adds the recall_archive tool (#127) — BM25 over message text in archived
cycles, top-N hits with cycle/index/excerpt. Always-loaded across modes
via should_default_defer_tool so the agent doesn't need ToolSearch to
discover it. Children inherit it via with_full_agent_surface.
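For readers unfamiliar with BM25, here is a compact sketch of the ranking over a list of message texts. It uses the common Okapi BM25 form with k1 = 1.2 and b = 0.75 and a naive alphanumeric tokenizer; those parameters, the tokenizer, and the `bm25_top_n` name are assumptions, and the real tool additionally returns cycle and excerpt information.

```rust
use std::collections::HashMap;

/// Lowercased alphanumeric tokens; a deliberately naive tokenizer.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Returns up to `n` (document index, score) pairs, best first.
fn bm25_top_n(docs: &[&str], query: &str, n: usize) -> Vec<(usize, f64)> {
    const K1: f64 = 1.2;
    const B: f64 = 0.75;

    let tokenized: Vec<Vec<String>> = docs.iter().map(|d| tokenize(d)).collect();
    if tokenized.is_empty() {
        return Vec::new();
    }
    let total = tokenized.len() as f64;
    let avg_len = tokenized.iter().map(|d| d.len()).sum::<usize>() as f64 / total;

    // Document frequency: in how many documents each term appears.
    let mut df: HashMap<&str, usize> = HashMap::new();
    for doc in &tokenized {
        let mut seen: Vec<&str> = doc.iter().map(String::as_str).collect();
        seen.sort_unstable();
        seen.dedup();
        for term in seen {
            *df.entry(term).or_insert(0) += 1;
        }
    }

    let query_terms = tokenize(query);
    let mut scored: Vec<(usize, f64)> = Vec::new();
    for (idx, doc) in tokenized.iter().enumerate() {
        let mut score = 0.0;
        for term in &query_terms {
            let tf = doc.iter().filter(|t| *t == term).count() as f64;
            if tf == 0.0 {
                continue;
            }
            let n_t = df[term.as_str()] as f64;
            let idf = ((total - n_t + 0.5) / (n_t + 0.5) + 1.0).ln();
            // Length normalization: longer-than-average documents are damped.
            let norm = tf + K1 * (1.0 - B + B * doc.len() as f64 / avg_len);
            score += idf * tf * (K1 + 1.0) / norm;
        }
        if score > 0.0 {
            scored.push((idx, score));
        }
    }
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(n);
    scored
}
```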
UI surfaces:
- /cycles, /cycle <n>, /recall <query> slash commands
- Sidebar shows cycle counter once a boundary fires
- CycleAdvanced engine event carries the full briefing so the UI can
populate app.cycle_briefings for /cycle <n>
- runtime_threads schema bumped to v2 (cycle.advanced events appear in
the durable timeline; load rejects future versions)
Tests: 21 cycle_manager + 13 recall_archive + 4 commands::cycle.
All 1168 workspace tests pass. Three parity gates pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
v0.6.6 — sub-agents inherit the parent's full tool registry,
auto-approve, respect a depth cap, and propagate cancellation.
Adds optional cwd to agent_spawn for parallel-worktree dispatch.
Schema-ready for roles (full library lands in 0.6.7).
Changes:
- New ToolRegistryBuilder::with_full_agent_surface(...) shared by parent and child
- SubAgentToolRegistry::new refactored to use shared builder; per-type
allowlist becomes advisory
- SubAgentRuntime gains auto_approve, spawn_depth, max_spawn_depth, cancel_token
- Depth check at spawn entry; cancellation cascade via CancellationToken::child_token()
- <deepseek:subagent.done> sentinel emitted on child completion
- cwd: Option<PathBuf> on agent_spawn with workspace-boundary validation
- Stream wall-clock cap bumped to 30 min (was 300s)
- max_spawn_depth configurable via EngineConfig (default 3)
- Version bump to 0.6.6
Closes #99.
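The depth check at spawn entry can be illustrated with a small sketch. The field names `spawn_depth` and `max_spawn_depth` come from the change list; the `child` method, the `DepthExceeded` error, and the exact boundary (a runtime already at the cap may not spawn) are assumptions. Cancellation is left out because the real code relies on `CancellationToken::child_token()` from an external crate.

```rust
#[derive(Debug, PartialEq, Eq)]
struct DepthExceeded {
    depth: u32,
    max: u32,
}

#[derive(Clone, Debug)]
struct SubAgentRuntime {
    spawn_depth: u32,
    max_spawn_depth: u32,
}

impl SubAgentRuntime {
    /// Derives the runtime for a child agent, or refuses at the depth cap.
    fn child(&self) -> Result<SubAgentRuntime, DepthExceeded> {
        if self.spawn_depth >= self.max_spawn_depth {
            return Err(DepthExceeded {
                depth: self.spawn_depth,
                max: self.max_spawn_depth,
            });
        }
        Ok(SubAgentRuntime {
            spawn_depth: self.spawn_depth + 1,
            max_spawn_depth: self.max_spawn_depth,
        })
    }
}
```

Under these assumptions and the default cap of 3, a root agent at depth 0 can have descendants at depths 1, 2, and 3, and the next spawn attempt fails.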
Adds the true Recursive Language Model (RLM) inference paradigm:
- rlm/mod.rs — module root with public API
- rlm/prompt.rs — RLM system prompt teaching the model to write code
- rlm/turn.rs — Algorithm 1 implementation:
- P stored as REPL variable (NEVER in LLM context window)
- Metadata-only context sent to root LLM (constant-size)
- LLM generates Python code, not free text
- Code executed in PythonRuntime with llm_query() for recursion
- FINAL() detection ends the loop
- Op::RlmQuery variant in ops.rs
- /rlm command in the command system
- AppAction::RlmQuery handler in ui.rs
- PythonRuntime::with_state_path made public for RLM integration
- 18 new unit tests for code extraction, metadata building, truncation
Key differences from previous 'RLM-inspired' approach:
✅ P is external (REPL variable), not in LLM context
✅ Only metadata(state) in LLM context (constant-size)
✅ LLM generates code, not free text + tool calls
✅ sub-LLM recursion via llm_query() inside REPL code
✅ FINAL() mechanism for programmatic termination
After the assistant message is persisted, when tool_uses is empty,
check for inline ```repl blocks and execute them via PythonRuntime:
- Extract REPL blocks from assistant text
- Spawn PythonRuntime and execute each block sequentially
- If a round returns FINAL: replace the assistant message text with
the final value and break the turn
- If no FINAL: append truncated stdout/stderr as user feedback and
continue the turn loop for iterative refinement
- Emit status events so the user sees 'REPL round N: ...' in the UI
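The extraction step might look roughly like the sketch below. `extract_repl_blocks` is a hypothetical name, and the rules that an unterminated block is dropped and that only the exact `repl` info string counts are assumptions. The fence string is built at runtime purely so the example can sit inside a fenced block.

```rust
/// Collects the bodies of fenced blocks whose info string is exactly `repl`.
fn extract_repl_blocks(text: &str) -> Vec<String> {
    let fence = "`".repeat(3); // three backticks
    let open = format!("{fence}repl");

    let mut blocks = Vec::new();
    let mut body: Vec<&str> = Vec::new();
    let mut in_block = false;

    for line in text.lines() {
        let trimmed = line.trim();
        if in_block {
            if trimmed == fence.as_str() {
                // Closing fence: emit the collected body.
                blocks.push(body.join("\n"));
                body.clear();
                in_block = false;
            } else {
                body.push(line);
            }
        } else if trimmed == open.as_str() {
            in_block = true;
        }
    }
    // A block that never closes is dropped rather than executed.
    blocks
}
```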
All 26 REPL tests + RLM tests pass. Release build verified.
Refs: paper-spec RLM (Zhang et al., arXiv:2512.24601) §2
Closes #77, refs #75.
#77: bracketed paste was unconditionally enabled at terminal init. Add a
`bracketed_paste` field to Settings (default true) and propagate it through
TuiOptions → App → run_tui / pause_terminal / resume_terminal so users on
the rare terminal that mishandles `\e[?2004h` can disable it via
`/set bracketed_paste off` or `bracketed_paste = false` in
`~/.config/deepseek/settings.toml`. Modern terminals continue to work as
before. All TuiOptions construction sites updated in one pass.
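A small sketch of what gating the escape sequence on the setting can look like. `ESC[?2004h` and `ESC[?2004l` are the standard enable and disable sequences for bracketed paste; the `enter_terminal` / `leave_terminal` helpers are hypothetical, and the actual TUI may go through its terminal library rather than writing raw bytes.

```rust
use std::io::{self, Write};

const ENABLE_BRACKETED_PASTE: &[u8] = b"\x1b[?2004h";
const DISABLE_BRACKETED_PASTE: &[u8] = b"\x1b[?2004l";

/// Emits the enable sequence only when the setting is on.
fn enter_terminal(out: &mut impl Write, bracketed_paste: bool) -> io::Result<()> {
    if bracketed_paste {
        out.write_all(ENABLE_BRACKETED_PASTE)?;
    }
    out.flush()
}

/// Mirrors `enter_terminal`: only undo what was actually enabled.
fn leave_terminal(out: &mut impl Write, bracketed_paste: bool) -> io::Result<()> {
    if bracketed_paste {
        out.write_all(DISABLE_BRACKETED_PASTE)?;
    }
    out.flush()
}
```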
#75: added an ignored-test microbench for `compute_profile` in
`crates/tui/src/core/capacity.rs`. Run with:
cargo test -p deepseek-tui --release bench_compute_profile -- --ignored --nocapture
Baseline (release, M1):
window= 16 per-call= 48ns
window= 64 per-call= 126ns
window= 256 per-call= 385ns
window=1024 per-call=1438ns
Sub-µs at typical window sizes — no optimization shipped, bench locks in
the regression contract. No new dev-deps (uses std::time::Instant +
black_box, gated as #[ignore]).
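The general shape of such an ignored-test microbench, using only `std`, is sketched below. `sum_window` is a hypothetical stand-in for `compute_profile`, whose signature is not shown here, and the iteration count is arbitrary.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Average wall-clock time per call; `iterations` must be non-zero.
fn per_call(iterations: u32, mut work: impl FnMut()) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    start.elapsed() / iterations
}

// Hypothetical workload standing in for the function under test.
fn sum_window(window: &[u64]) -> u64 {
    window.iter().sum()
}

#[test]
#[ignore] // only runs with `-- --ignored`, so normal test runs stay fast
fn bench_sum_window() {
    for size in [16usize, 64, 256, 1024] {
        let window: Vec<u64> = (0..size as u64).collect();
        let t = per_call(100_000, || {
            // black_box keeps the optimizer from deleting the call.
            black_box(sum_window(black_box(window.as_slice())));
        });
        println!("window={size:>4} per-call={:>4}ns", t.as_nanos());
    }
}
```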
Adds a persistent allowlist of external paths the agent may read/write
from outside the current workspace, scoped to the workspace it was
granted in. The list lives in ~/.deepseek/workspace-trust.json with
schema {"workspaces": {"<ws>": ["<trusted>", ...]}}; canonical paths on
both sides keep symlink-aliased macOS tempdirs sane.
Surface area:
* crates/tui/src/workspace_trust.rs — new module: load_for / add /
remove plus *_at variants for tests that need an explicit file path
rather than HOME mutation.
* tools/spec.rs — ToolContext gains trusted_external_paths and
resolve_path consults it before returning PathEscape, both for the
existing-path branch and the to-be-created (parent-canonical) branch.
* core/engine.rs — build_tool_context loads the trust snapshot on every
tool dispatch so /trust mutations apply on the next call.
* commands/config.rs — /trust now takes subcommands (add, remove,
list, on, off, status) instead of being a single all-or-nothing
toggle. Tilde expansion handled in-line.
* commands/mod.rs — registry entry updated with the new usage string
and a dispatcher that forwards args.
* tools/diagnostics.rs — adds trusted_external_paths to the JSON
output so the agent and the user can see the list at a glance.
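A minimal sketch of the per-workspace lookup that a path resolver could consult before reporting an escape. The map mirrors the `{"workspaces": {"<ws>": [...]}}` schema, but the `WorkspaceTrust` struct, the `is_trusted` method, and the rule that a trusted entry also covers its descendants are assumptions; both sides are expected to be canonicalized already.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};

// Hypothetical in-memory form of workspace-trust.json.
struct WorkspaceTrust {
    workspaces: HashMap<PathBuf, Vec<PathBuf>>,
}

impl WorkspaceTrust {
    /// True when `candidate` is a trusted path, or lies under one,
    /// for the given workspace. Grants do not leak across workspaces.
    fn is_trusted(&self, workspace: &Path, candidate: &Path) -> bool {
        self.workspaces.get(workspace).map_or(false, |trusted| {
            // Path::starts_with compares whole components, so
            // "/data/shared-old" is not under "/data/shared".
            trusted.iter().any(|root| candidate.starts_with(root))
        })
    }
}
```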
The interactive "Allow once / Always allow / Deny" prompt that the
issue describes is deferred — for v0.5.1 the workflow is "grant
ahead with /trust add". A future change will add a hook in
ToolContext::resolve_path that surfaces an ApprovalRequest when an
escape path is hit, so the slash-command remains the durable
mechanism while the prompt becomes the discovery one.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A user couldn't find an `NVIDIA_API_KEY` block in `~/.deepseek/config.toml`
because the example file only mentioned NIM as commented-out alternates
to the top-level keys. Two fixes:
- `config.example.toml` now has explicit `[providers.deepseek]` and
`[providers.nvidia_nim]` sections (placed after all top-level keys so
the TOML still parses cleanly), each documenting `api_key` /
`base_url` / `model` plus the env vars that override them. Both
providers can be stored at once and toggled via `/provider` or
`--provider` without re-entering keys.
- `setup --status` "missing api_key" message is now provider-aware: on
`nvidia-nim` it points at `NVIDIA_API_KEY` + `[providers.nvidia_nim]`
+ `deepseek auth set --provider nvidia-nim`, instead of the
DeepSeek-only hint.
Audit verified: the v0.5.0 multi-turn replay fix path
(`should_replay_reasoning_content` → `requires_reasoning_content` in
`crates/tui/src/client.rs:1796`) keys off the model name (matches
`deepseek-v4`), not the provider, so NIM-hosted V4 models get the
replay automatically. No NIM-specific 400-class regression there.
Closes #37 (docs/UX); the live multi-turn-against-NIM verification
remains a manual smoke step listed in the issue (no NIM creds in CI).
Local main's unpushed commits had removed DEFAULT_TEXT_MODEL from the
crate::config import in main.rs, but the merged branch's new code at
two call sites still uses it. Textual three-way merge took the local
import line and the branch's call sites, producing a build break.
Re-add the symbol to the import.
Adds a compact `setup --status` view, a `setup --clean` for regenerable
session checkpoints, and `--tools`/`--plugins` scaffolding for
~/.deepseek/{tools,plugins} so the extension model has a documented home
that doctor can count. `doctor --json` lands as a CI-safe alternative to
the human-readable doctor (skips the live API probe).
Also hardens the engine against fake tool-call wrappers:
filter_tool_call_delta and the marker constants are now testable, the
streaming loop emits one compact status notice per turn when it strips
a wrapper, and a new protocol_recovery integration test asserts that
the legacy text parser never turns <function_calls> into a real tool
call. Adds 23 unit tests + 14 integration tests covering both slices.
Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.
Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.
Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
* fix: address PR #4 follow-ups
Honor low_motion in the default tool transcript path and align composer cursor padding with the rendered placeholder. Add focused regression tests for both behaviors.
* lint: remove redundant imports in empty_state test, reuse create_test_app
The test had inner `use` statements for Config, App, TuiOptions, and
PathBuf that duplicated the module-level test imports. It also manually
constructed App instead of calling the existing create_test_app() helper.
* fix: replace useless format!("{text}") with text.to_string() in details_affordance_line
* test: pin composer_density in cursor test to avoid sensitivity to loaded settings
Settings::load() may return a non-default composer_density on some CI
environments. Explicitly set ComposerDensity::Comfortable so the
expected cursor position is deterministic across all platforms.
* fix: make tool low_motion test robust against coarse Windows timers
Use a 2× cycle offset so the animated frame index is 2 (maximally
distant from 0), giving 1800 ms of headroom before the animation could
wrap back to index 0. The previous 1× offset left only ~15 ms of
margin, causing flaky failures on Windows where Instant resolution is
approximately 15.6 ms.
* fix: correct headroom comment in tool animation test (3600ms, not 1800ms)
* fix: resolve lint, parity, and Windows test failures
- Fix rustfmt line-length issue in history.rs tool animation test
- Settings::path() now respects DEEPSEEK_CONFIG_PATH for Windows test compat
- doctor_check_mcp_server recognizes Unix-style absolute paths on Windows
- Use checked_sub for Instant arithmetic in web_run tests to prevent
underflow on freshly-booted Windows CI runners
* fix: expand ~ in DEEPSEEK_CONFIG_PATH when resolving settings path
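The `checked_sub` fix for `Instant` arithmetic mentioned above can be sketched as follows. `instant_ago` is a hypothetical helper, and clamping to `now` when the subtraction is not representable is an assumed fallback; the point is that `Instant - Duration` panics on underflow while `checked_sub` returns `None`.

```rust
use std::time::{Duration, Instant};

/// Returns an instant `age` before `now`, clamping to `now` if the
/// platform cannot represent a point that far in the past.
fn instant_ago(now: Instant, age: Duration) -> Instant {
    // `now - age` would panic on underflow, which can happen when the
    // clock's origin is recent (for example shortly after boot).
    now.checked_sub(age).unwrap_or(now)
}
```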
- Move src/* into crates/tui/src/ to create a proper workspace structure
- Add .claude/ and .trimtab/ directories for Trimtab closed-loop workflow
- Add DEPENDENCY_GRAPH.md and update documentation
- Update Cargo.toml files to reflect new crate dependencies
- Update CI workflows and npm package scripts
- All tests pass, release build works