dgf1988/codewhale

Author	SHA1	Message	Date
20bytes	8aed1bb674	memory: polish help and docs (#569) - add /memory help and clearer invalid-subcommand guidance - register /memory in shared slash-command help - align memory docs with current behavior and config - add focused tests for help and discovery	2026-05-04 02:25:13 -05:00
Hunter Bown	fc1970fa55	fix(auth): use config-backed setup without credential prompts	2026-05-03 23:02:11 -05:00
Hunter Bown	84c55e9022	chore(release): bump version to 0.8.8 - Workspace `version = "0.8.8"` in root `Cargo.toml`. - 31 internal `deepseek-*` path-dep version pins across the 9 crates that declare them. - `npm/deepseek-tui/package.json` `version` and `deepseekBinaryVersion` both updated. - `Cargo.lock` regenerated for the new workspace version. - `CHANGELOG.md` `[Unreleased]` heading promoted to `[0.8.8] - 2026-05-03`. `scripts/release/check-versions.sh` reports the workspace, npm wrapper, and lockfile all aligned. Pushing this to `main` should fire `auto-tag.yml`, which creates the `v0.8.8` tag with `RELEASE_TAG_PAT`. The tag triggers `release.yml` to build the matrix and draft the GitHub Release. The npm wrapper publish remains manual (npm 2FA OTP requirement). What ships in v0.8.8 ==================== The full polish stack already merged via PRs #514 (stabilization), #515 (OSC 8 hyperlinks), #517 (inline diff render), #518 (user memory MVP), #519 (foreground polish + per-project overlay + security + Windows redraw fix), and #508 (Linux ARM64 prebuilts + install docs). See `CHANGELOG.md` and the README "What's new in v0.8.8" section for the full list.	2026-05-03 08:55:41 -05:00
Hunter Bown	bda30b0fd6	Merge main into feat/v0.8.8-tui-polish + gemini-code-assist feedback Resolves the post-#514/#517/#518 conflicts: - CHANGELOG.md: kept both polish-stack and Linux ARM64 entries under [Unreleased]; reordered so the ARM64/install-message Changed/Docs sections precede the Releases footer. - config.example.toml: kept both the `instructions = [...]` example and the `[memory]` opt-in stanza in sequence. - crates/tui/src/config.rs: kept both `instructions_paths()` (#454) and `memory_enabled()` (#489) on the Config impl. - crates/tui/src/prompts.rs: extended `system_prompt_for_mode_with_context_and_skills` to take BOTH `instructions: Option<&[PathBuf]>` and `user_memory_block: Option<&str>`. Section 2.5a renders instructions; 2.5b renders the memory block — both above the skills block so KV prefix caching still wins. - crates/tui/src/core/engine.rs: thread both args through the two call sites. - crates/tui/src/prompts.rs: update the `system_prompt_for_mode_with_context` forwarder and the test caller to pass `None` for the new arg. - .gitignore: ignore `.claude/.local.md` and `.local.json` so local ralph / Claude-Code notes can't leak into commits. Folds in two valid suggestions from the gemini-code-assist review on #519: - `client.rs`: collapse the duplicated `LlmError → label` match and the `human_retry_reason` body into a single `retry_reason_label_and_human(err) -> (&'static str, String)` helper. - `widgets/footer.rs::retry_banner_spans`: merge the two separate `match &props.retry` blocks into one that returns both `(label, color)`. Behavior is unchanged; refactor is a pure DRY win.	2026-05-03 08:29:59 -05:00
Hunter Bown	d9701c1dde	perf(tui): lock composer height while slash/mention menu is open User feedback (Windows 10 PowerShell + WSL, Telegram thread): typing through `/skill` feels visibly laggy because every keystroke shrinks the matched-entry list, which shrinks the composer panel, which forces the chat area above to repaint cells. On Unix terminals the work is invisible; on the Windows console backend the per-cell write cost makes it noticeable. Fix: when the slash- or mention-menu is open, `desired_height` reserves the panel's worst-case envelope (`composer_max_height`) for the whole menu session instead of tracking the matched-entry count. The chat-area Rect stays stable, so ratatui's diff renderer skips the cells above the composer entirely. The menu itself still renders only the entries that actually match — extra rows are panel padding inside the same Rect. `render()` and `cursor_pos` route through the same locked-budget calculation so the input stays at the top of the panel and the cursor lands on the row the input is drawn on. New unit test pins the invariant: 5-match and 1-match menus produce the same composer height; closing the menu releases the reserved rows.	2026-05-03 08:02:23 -05:00
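The height rule described in this commit can be sketched as below. This is a minimal illustration rather than the repository's code: the free function and its parameter list are simplified stand-ins, and only the names `desired_height` and `composer_max_height` come from the commit message.

```rust
/// Hypothetical, simplified form of the composer height rule.
/// `input_rows` is the number of rows the typed text needs and
/// `matched_entries` is the number of menu rows that currently match.
fn desired_height(
    input_rows: u16,
    matched_entries: u16,
    menu_open: bool,
    composer_max_height: u16,
) -> u16 {
    if menu_open {
        // Reserve the worst-case envelope for the whole menu session, so the
        // chat area above keeps the same Rect while the match list shrinks.
        let _ = matched_entries; // intentionally ignored while locked
        composer_max_height
    } else {
        // Menu closed: release the reserved rows and track the input again.
        input_rows.min(composer_max_height)
    }
}
```

With the menu open, a 5-match and a 1-match menu yield the same height, which is the invariant the commit's unit test pins.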
Hunter Bown	f6c7a36076	feat(execpolicy): heredoc body parsing in normalize_command (#419 ) `normalize_command` now strips heredoc bodies before shlex tokenization so a user's `auto_allow = ["cat > file.txt"]` pattern matches the heredoc form `cat <<EOF > file.txt\nbody\nEOF` cleanly. Recognises the common forms (`<<DELIM`, `<<-DELIM`, `<<'DELIM'`, `<<"DELIM"`) while leaving the here-string operator (`<<<`) untouched. Six unit tests cover: simple body strip, dash form, quoted delimiter, non-heredoc passthrough, here-string preservation, and the end-to-end pattern-match path.	2026-05-03 07:44:43 -05:00
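A rough sketch of the heredoc stripping described above, assuming the body is removed line by line before tokenization. The helper names `split_heredoc` and `strip_heredoc_bodies` are hypothetical, and the real `normalize_command` may handle more edge cases (several heredocs on one line, whitespace after `<<`, and so on).

```rust
/// Hypothetical helper: if `line` contains a heredoc operator, return the
/// line with the operator removed, the delimiter, and whether it is the
/// `<<-` form. Here-strings (`<<<`) return `None` and are left untouched.
fn split_heredoc(line: &str) -> Option<(String, String, bool)> {
    let start = line.find("<<")?;
    let rest = &line[start + 2..];
    if rest.starts_with('<') {
        return None; // here-string, not a heredoc
    }
    let (dash, rest) = match rest.strip_prefix('-') {
        Some(r) => (true, r),
        None => (false, rest),
    };
    let (delim, after) = match rest.chars().next() {
        // Quoted delimiter: <<'DELIM' or <<"DELIM"
        Some(q @ ('\'' | '"')) => {
            let inner = &rest[1..];
            let close = inner.find(q)?;
            (&inner[..close], &inner[close + 1..])
        }
        // Bare delimiter: runs up to the next whitespace
        Some(_) => {
            let end = rest.find(char::is_whitespace).unwrap_or(rest.len());
            (&rest[..end], &rest[end..])
        }
        None => return None,
    };
    if delim.is_empty() {
        return None;
    }
    let head = format!("{}{}", &line[..start], after.trim_start());
    Some((head.trim_end().to_string(), delim.to_string(), dash))
}

/// Drop heredoc bodies (and their terminator lines) from a command string.
fn strip_heredoc_bodies(command: &str) -> String {
    let mut out: Vec<String> = Vec::new();
    let mut lines = command.lines();
    while let Some(line) = lines.next() {
        match split_heredoc(line) {
            Some((head, delim, dash)) => {
                out.push(head);
                // Skip body lines up to and including the terminator.
                for body in lines.by_ref() {
                    let candidate = if dash { body.trim_start_matches('\t') } else { body };
                    if candidate == delim {
                        break;
                    }
                }
            }
            None => out.push(line.to_string()),
        }
    }
    out.join("\n")
}
```

Under these assumptions the heredoc form reduces to `cat > file.txt`, which is the string a user's `auto_allow` pattern would be compared against.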
Hunter Bown	604edc9f83	feat(tls): honor SSL_CERT_FILE for corporate-CA / MITM proxies (#418 ) Corporate users behind TLS-inspecting proxies (Zscaler, Netskope, Palo Alto, in-house mitmproxy fleets) need to add the proxy's intermediate CA to the trusted-roots set so the deepseek client doesn't fail with `unable to get local issuer certificate`. The reqwest builder already trusts the platform's system store via native-tls. This adds opt-in support for the conventional `SSL_CERT_FILE` env var so users can point at their own bundle: * New `add_extra_root_certs(builder, path)` helper reads the file, tries `Certificate::from_pem_bundle` (covers single-cert files too), falls back to `from_der` for binary cert files. * Wired into `build_http_client` when `SSL_CERT_FILE` is set and non-empty. Failures log a warning via the existing `logging::warn` channel and return the builder unchanged — the existing system trust still applies, so a malformed env var degrades gracefully instead of bricking the launch. * Each successful load logs `info` with the cert count so operators can confirm their bundle was picked up. Documented in `docs/CONFIGURATION.md`'s environment-variables list alongside the existing TLS-related notes. No new dependency — reqwest's `native-tls` feature already exposes `Certificate::from_pem_bundle` / `from_der`.	2026-05-03 07:35:23 -05:00
Hunter Bown	6566a59097	feat(security): deny loosest approval/sandbox values at project scope (#417) Continues #417 by closing the value-level escalation case for the two pure-loosening values: * `approval_policy = "auto"` would auto-approve every tool call that the user's stricter setting (`suggest`, `never`, etc.) was prompting on. Pure escalation; a project should never be able to set this. * `sandbox_mode = "danger-full-access"` exits the workspace sandbox entirely. Pure escalation; a project should never be able to set this. Both denies are unconditional at project scope: the user's prior value (or absence) doesn't matter. The denied value emits a stderr warning so users see the deny. Sub-tightening comparisons (e.g. user `"never"` → project `"on-request"` is allowed even though it loosens) stay a v0.8.9 follow-up because they need a richer ordering check across all `approval_policy` / `sandbox_mode` values. Tests: * `project_overlay_denies_approval_auto_and_sandbox_danger_values` exercises both escalation values in the same merge and confirms a non-escalation field on the same project file still applies. * `project_overlay_preserves_user_strict_value_when_project_tries_to_loosen` exercises the belt-and-suspenders case: user has `approval_policy = "never"`, project tries `"auto"`, the user's strict value survives.	2026-05-03 07:32:08 -05:00
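The value-level deny can be pictured roughly as follows. The two denied key/value pairs come from the commit message; the function names, the `Option<String>` representation of a setting, and the warning wording are assumptions made for illustration.

```rust
/// True for the two pure-loosening values the commit denies at project scope.
fn is_denied_project_value(key: &str, value: &str) -> bool {
    matches!(
        (key, value),
        ("approval_policy", "auto") | ("sandbox_mode", "danger-full-access")
    )
}

/// Hypothetical merge step for one field: returns the value that survives.
/// A denied project value leaves the user's value (or its absence) untouched.
fn apply_project_value(user_value: Option<String>, key: &str, project_value: &str) -> Option<String> {
    if is_denied_project_value(key, project_value) {
        // Illustrative wording only; the real warning text is not shown here.
        eprintln!("warning: project-scope value for `{key}` is ignored");
        return user_value;
    }
    Some(project_value.to_string())
}
```

Note that a non-denied project value still replaces the user's value here, matching the commit's statement that richer ordering checks are deferred.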
Hunter Bown	926ffcb4f4	feat(security): deny dangerous keys at project-config scope (#417 ) A malicious `<workspace>/.deepseek/config.toml` could escalate privileges via the per-project overlay shipped in #485: * `api_key` / `base_url` / `provider` — exfiltrate prompts to an attacker-controlled endpoint by swapping the user's credentials and target host. * `mcp_config_path` — point the MCP loader at a config that spawns arbitrary stdio servers under the user's identity. Adds a `DENY_AT_PROJECT_SCOPE` allowlist-by-omission to `merge_project_config`. The four credential / redirect keys are silently dropped from the overlay; a stderr warning fires when one is present so a user who did expect the override sees the deny instead of a silent discard: warning: project-scope config key `api_key` is ignored — set it in `~/.deepseek/config.toml` instead. The remaining override surface (model, approval_policy, sandbox_mode, notes_path, reasoning_effort, max_subagents, allow_shell, instructions array) is unchanged. Note that this slice does NOT yet block escalation via value comparison — a project setting `approval_policy = "auto"` still wins over a user's stricter `"never"`. That richer check is filed as a v0.8.9 follow-up. Tests: * `project_overlay_overrides_model_but_denies_provider` replaces the previous test that asserted provider WOULD override (now reversed). * New `project_overlay_denies_dangerous_credentials_and_redirects` models the attacker scenario directly: project sets all four denied keys, asserts the user's pre-existing values survive and the project's are discarded. CHANGELOG documents the deny-list rationale and lists which fields remain overridable.	2026-05-03 07:27:44 -05:00
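A minimal sketch of the key-level deny described above, assuming the config can be modelled as a flat string map. The constant name and the four denied keys are taken from the commit message; the map type, the function signature, and returning the denied keys (instead of printing a warning) are simplifications for illustration.

```rust
use std::collections::BTreeMap;

/// Keys a project-scope config file may not override (from the commit).
const DENY_AT_PROJECT_SCOPE: [&str; 4] = ["api_key", "base_url", "provider", "mcp_config_path"];

/// Overlay `project` onto `user`, skipping denied keys.
/// Returns the denied keys that were present so the caller can warn about them.
fn merge_project_config(
    user: &mut BTreeMap<String, String>,
    project: BTreeMap<String, String>,
) -> Vec<String> {
    let mut denied = Vec::new();
    for (key, value) in project {
        if DENY_AT_PROJECT_SCOPE.contains(&key.as_str()) {
            // The user's existing value (if any) is left in place.
            denied.push(key);
            continue;
        }
        user.insert(key, value);
    }
    denied
}
```

In this sketch a project file that sets both `model` and `api_key` changes the model but leaves the user's key untouched, mirroring the attacker scenario the new test models.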
Hunter Bown	a4c8cb2514	feat(prompts): structured Markdown compaction template (#429 ) Replaces the legacy compaction template with the spec'd Goal / Constraints / Progress (Done / In Progress / Blocked) / Key Decisions / Next step structure. The richer Progress sub-bullets help long resumed sessions distinguish "what's verified done" from "what's mid-flight" — useful when the model writes `.deepseek/handoff.md` before a long break. The previous Active-task / Files-touched / Key-decisions / Open-blockers / Next-step framing collapsed "in progress" and "blocked" into a single "open blockers" heading, which lost the lineage of "I started X, hit Y, then…" trails. Backwards compat: existing `.deepseek/handoff.md` files continue to render fine because the loader (`prompts.rs::load_handoff_block`) injects them as plain markdown — the template only guides what NEW handoffs look like. The "pinned-tool-output configurability" half of #429's spec remains a v0.8.9 follow-up because it requires changes to `cycle_manager.rs` compaction logic itself; the template restructure is independently shippable and is the bigger UX delta in practice. Tests: existing `compact_template_is_included_in_full_prompt` updated to assert the new section headings and the nested Progress sub-bullets. All 24 prompt tests pass.	2026-05-03 07:12:45 -05:00
Hunter Bown	c0b6c2a1e5	perf(hooks): fast-path skip when no hooks configured (#455 follow-up) Now that `tool_call_before` / `tool_call_after` fire on every tool dispatch, the cost of constructing a `HookContext` (which allocates for `workspace`, `model`, `session_id`, …) shows up on the hot path even when the user has zero hooks configured — the common case. Adds `HookExecutor::has_hooks_for_event(event)` as a cheap boolean gate that callers consult before building the context. The pre-check returns false when: * `config.enabled == false` (globally disabled). * No hook in the config has the given `event`. Wired through every fire site: * `tool_routing.rs::handle_tool_call_started` — `ToolCallBefore`. * `tool_routing.rs::handle_tool_call_complete` — `ToolCallAfter`. Also skips the `result.content.clone()` that the `with_tool_result` builder demands. * `ui.rs::dispatch_user_message` — `MessageSubmit`. * `ui.rs::apply_engine_error_to_app` — `OnError`. Inside `HookExecutor::execute` itself, also short-circuit before calling `context.to_env_vars()` when no hooks match the event — defends against a caller that builds the context but forgets to gate. Tests: 3 new tests cover empty-config / globally-disabled / per-event filtering. The existing 18 hook tests pass unchanged. No behavioral change for users with hooks configured; pure allocation-free fast path otherwise.	2026-05-03 07:07:11 -05:00
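The pre-check described above amounts to a boolean test that runs before any context is allocated. The sketch below assumes simplified `Hook`, `HooksConfig`, and `HookExecutor` shapes; only the method name, the `enabled` flag, and the event variants are taken from the commit messages.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum HookEvent {
    SessionStart,
    SessionEnd,
    MessageSubmit,
    ToolCallBefore,
    ToolCallAfter,
    ModeChange,
    OnError,
}

/// Simplified stand-in for one configured hook.
struct Hook {
    event: HookEvent,
    command: String,
}

/// Simplified stand-in for the `[hooks]` config table.
struct HooksConfig {
    enabled: bool,
    hooks: Vec<Hook>,
}

struct HookExecutor {
    config: HooksConfig,
}

impl HookExecutor {
    /// Cheap gate: false when hooks are globally disabled or when no
    /// configured hook listens for `event`. Nothing is allocated here.
    fn has_hooks_for_event(&self, event: HookEvent) -> bool {
        self.config.enabled && self.config.hooks.iter().any(|hook| hook.event == event)
    }
}
```

A caller would consult this gate first and build the (comparatively expensive) hook context only when it returns `true`.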
Hunter Bown	e569f2ca99	feat(hooks): fire message_submit + on_error too (#455 observer-only) Completes the observer-only slice of #455 by wiring the two remaining `HookEvent` variants that were defined but never fired: * `MessageSubmit` fires from `dispatch_user_message` before the message is handed to the engine. Hook context carries `message` so observers can log every prompt the user submits, redact for compliance audit, or page on `/wipe-database`-style content. Read-only. * `OnError` fires from `apply_engine_error_to_app` before the error cell reaches the transcript. Hook context carries `error`. Useful for paging on auth / billing / invalid- request failures without tailing the audit log. Combined with the prior `tool_call_before` / `tool_call_after` wiring, every `HookEvent` variant now has a live producer: `SessionStart`, `SessionEnd`, `MessageSubmit`, `ToolCallBefore`, `ToolCallAfter`, `ModeChange`, `OnError`. The `/hooks events` listing already enumerates them with their on-fire semantics. Hooks remain read-only observers in this slice. Mutation is v0.8.9 follow-up because it needs a synchronous-gate contract that would change semantics for every hook surface — including the lifecycle events that have shipped for many releases.	2026-05-03 07:01:52 -05:00
Hunter Bown	4310202645	feat(hooks): fire tool_call_before / tool_call_after (#455 observer-only) The `HookEvent::ToolCallBefore` and `HookEvent::ToolCallAfter` enum variants were defined but never fired from production code, so `[[hooks.hooks]]` entries with those events sat dormant. Wires the fires from `tui/tool_routing.rs`: * `handle_tool_call_started` fires `ToolCallBefore` with the hook context populated with `tool_name` and `tool_args`. The fire happens before any UI bookkeeping so observers see the call as early as possible. * `handle_tool_call_complete` fires `ToolCallAfter` after the cell finalization with the result content (or stringified error) + success flag. Stays last in the function so any UI state the hook might want to observe via shell-out is already settled. Hooks remain read-only observers in this slice. Mutation (modifying tool args before execution, or the result before it reaches the model) is a v0.8.9 follow-up that needs a synchronous-gate contract; the existing executor is fire-and- forget and adding mutation would change semantics for every existing hook surface (session_start, mode_change, etc.). Operators can wire `tool_call_before` / `tool_call_after` hooks in `~/.deepseek/config.toml` immediately to log every tool call, page on long shell exec, or audit risky operations. The `/hooks events` listing already enumerates them. No new tests — `tool_routing.rs` has no existing test surface, and the hook execution path is already covered via `hooks::tests::*`. The wiring is mechanically minimal.	2026-05-03 06:59:26 -05:00
Hunter Bown	8ed1cb4e68	feat(hooks): /hooks events subcommand for discovery (#460 polish) The shipped `/hooks list` told users WHAT was configured but not WHAT they could configure. Without this, the only way to learn the supported `HookEvent` values is to grep source, which is not ideal when most users just want to wire up a notification on session_end. Adds `/hooks events` (aliases `event` / `list-events`) which prints every `HookEvent` variant alongside a short descriptive blurb (when it fires, current observability-vs-mutation status). Ordered lifecycle → per-tool → situational so the listing reads naturally and stays stable across releases. Updates `CommandInfo::usage` to `/hooks [list|events]` so the fuzzy autocomplete shows the new subcommand. Tests: 1 new test (`events_subcommand_lists_every_event_variant_in_documented_order`) pins the order, the per-event descriptive blurb format, and exhaustive variant coverage. The existing 6 hooks tests pass unchanged.	2026-05-03 06:51:27 -05:00
Hunter Bown	a8e0693958	feat(doctor): report spillover dir + composer stash file (#422/#440 polish) The v0.8.8 polish stack added two on-disk surfaces operators might want to inspect — `~/.deepseek/tool_outputs/` for spilled tool output (#422 / #500), and `~/.deepseek/composer_stash.jsonl` for parked composer drafts (#440). Neither showed up in `deepseek doctor`, so users couldn't see at a glance "do I have parked drafts?" or "how much disk has spillover claimed?" Adds a `Storage:` section to the human-readable doctor and a `storage` object to the JSON doctor: * Spillover slot reports the dir's existence and entry count. Pre-creation state ("not yet created") is shown explicitly rather than as a missing dir — the dir is created lazily on first spill, not at boot. * Stash slot reports the file's existence and parked-draft count by re-reading via `composer_stash::load_stash`. Empty / missing stash shows the Ctrl+S hint so the user knows how to use the feature. The JSON schema always emits both nested slots regardless of state (so dashboard schemas stay stable across hosts); the human-readable hides the "not yet created" line for spillover when the dir is missing to keep the report scannable.	2026-05-03 06:46:20 -05:00
Hunter Bown	b1c6e6b173	feat(doctor): report .opencode + .claude skill dirs (#432 follow-up) The cross-tool skill discovery shipped in 432a0c1 walks `.opencode/skills/` and `.claude/skills/` alongside the `.agents/skills/` and `skills/` workspace folders, but the `deepseek doctor` output still only listed the original three slots. Operators staring at "where are my Claude-style skills?" had no way to confirm whether the new dirs were even being checked. Updates both surfaces: * Human-readable doctor — adds two conditionally-printed lines for `.opencode skills dir` and `.claude skills dir`. Empty dirs are omitted to keep the report scannable; the dirs exist on most workspaces only when the user has installed another AI tool's skill catalog there. * JSON doctor (`deepseek doctor --json`) — adds `opencode` and `claude` slots to the `skills` object alongside the existing `global`, `agents`, `local`. Each carries `path`, `present`, and `count`. JSON consumers see all five keys regardless of presence so dashboard schemas stay stable across hosts. The `selected_skills_dir` field still reflects the legacy "highest-precedence single dir" — workspace-aware discovery is done at runtime by `discover_in_workspace`, but `selected` is a useful "where do I install a NEW skill" hint and stays unchanged for backwards compatibility with existing diagnostic tooling.	2026-05-03 06:43:47 -05:00
Hunter Bown	a368dc53b8	feat(commands): /hooks read-only lifecycle hook listing (#460 MVP) Slash command enumerates configured lifecycle hooks from the user's `[hooks]` table, grouped by event. The full picker / persisted enable-disable surface in #460 is still M-sized work; this MVP gives users a no-typing view of what's actually loaded — the most-asked question once hooks start firing. Implementation: * `crates/tui/src/commands/hooks.rs` formats the hook list with per-event headings, hook name (or `(unnamed)`), background marker, timeout, condition summary, and a 60-char shell command preview. * `condition_summary` covers every `HookCondition` variant (Always/ToolName/ToolCategory/Mode/ExitCode/All/Any) so the listing stays informative for compound conditions too. * `event_label` maps each `HookEvent` to its config-file string so the listing matches what the user wrote in TOML. * New `HookExecutor::config()` accessor exposes the underlying `HooksConfig` for read-only callers; doesn't open the door to mutation, which still belongs to the broader #460 work. * Registered in `commands::COMMANDS` with `aliases: &["hook"]`, usage `/hooks [list]`, and `MessageId::CmdHooksDescription` localized in en, ja, zh-Hans, pt-BR. * Wired into `command_palette::command_runs_directly` so pressing Enter from Ctrl+K runs `/hooks list` straight. Tests: 6 unit tests covering preview-cap truncation, newline stripping, condition-summary variants, event-label exhaustiveness, and BTreeMap-grouping ordering.	2026-05-03 06:36:37 -05:00
Hunter Bown	15127046e8	feat(stash): /stash clear subcommand to wipe the stash file (#440 polish) Pairs with `/stash list` and `/stash pop` so the user can fully manage the stash from inside the TUI without reaching for `rm`. * New `composer_stash::clear_stash()` returns the number of entries dropped so the slash command can report it. Atomic-write replaces the file with empty content; missing / empty files return `Ok(0)` without erroring. * `clear` / `wipe` / `drop` are accepted as the subcommand alias. The "unknown subcommand" hint now lists the three live subcommands explicitly. * CommandInfo usage updated to `/stash [list|pop|clear]` so `/help` and the autocomplete reflect the new option. * 3 new tests in `composer_stash`: returns-0 when file absent, returns-0 when file is empty, drops entries and reports count on a populated stash. No new dependency; reuses `crate::utils::write_atomic` for the truncate-and-rewrite.	2026-05-03 06:28:18 -05:00
Hunter Bown	ba871c56f6	feat(cli): deepseek pr <N> — pre-seed TUI with PR context (#451 ) `deepseek pr 1234` fetches the PR's title, body, base/head, URL, and full diff via `gh`, then launches the interactive TUI with a review prompt already typed in the composer. The user can edit before sending or hit Enter to fire as-is. Falls back gracefully with an actionable error when `gh` is not on PATH. Implementation: * `Commands::Pr { number, repo, checkout }` subcommand. Optional `--repo <owner/name>` mirrors `gh pr view`'s flag. Optional `--checkout` opt-in for `gh pr checkout`; default is to leave the working tree alone since `gh pr checkout` errors out on dirty trees. * `run_pr` helper drives three best-effort gh shell-outs (`pr view --json`, `pr diff`, optional `pr checkout`) and formats a structured prompt: PR header → URL → branches → description → fenced ```diff block. * `format_pr_prompt` caps the diff at 200 KiB with codepoint- safe truncation so a massive PR doesn't blow the model's context window before the user even hits Enter. * New `TuiOptions::initial_input: Option<String>` plumbs the pre-typed text into `App::new` (which now branches its composer-state init around the option). Cursor lands at the end of the seed text. Future callers (welcome screens, share- link landing pages, etc.) can reuse the same channel. * `run_interactive` gains an `initial_input: Option<String>` parameter; existing callers pass `None`. Tests: 3 new tests in `pr_prompt_tests` cover the happy path (title/url/branches/body/diff render correctly), empty-input fallbacks (placeholder for missing title/body/branches/url), and codepoint-safe truncation when the diff exceeds the 200 KiB cap. Bulk update: every other `TuiOptions { ... }` test-builder across the workspace (~21 sites) gains `initial_input: None` so the new field doesn't break the existing test suite.	2026-05-03 06:23:54 -05:00
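The codepoint-safe truncation mentioned above can be sketched with the standard library alone. The 200 KiB cap is from the commit message; the constant and function names are hypothetical, and the real `format_pr_prompt` may also append a truncation notice.

```rust
/// Diff cap from the commit message: 200 KiB.
const DIFF_CAP_BYTES: usize = 200 * 1024;

/// Return the longest prefix of `text` that fits in `max_bytes` without
/// splitting a multi-byte UTF-8 character.
fn truncate_at_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Walk back until `end` sits on a character boundary. Index 0 is always
    // a boundary, so the loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

Slicing a `&str` at an arbitrary byte offset panics if the offset falls inside a multi-byte character, which is why the boundary walk-back is needed before applying the cap to diff text.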
Hunter Bown	6fb8739feb	feat(composer): prompt stash — Ctrl+S parks, /stash list+pop (#440 ) A stash is a side-channel from history: it holds drafts the user parked deliberately instead of submissions made in the past (which live in `composer_history.rs`). * `crates/tui/src/composer_stash.rs` — JSONL-backed store at `~/.deepseek/composer_stash.jsonl`. One JSON object per line with `ts` (RFC 3339) and `text`. Self-healing parser drops malformed lines instead of poisoning the file. Multi-line drafts round-trip intact via JSON's newline escaping. Capped at 200 entries; oldest pruned at push time. Empty / whitespace-only text is silently dropped. * `crates/tui/src/commands/stash.rs` — `/stash list` renders the stash with one-line previews and timestamps; `/stash pop` restores the most recently parked draft into the composer (LIFO) and rewrites the file. `/park` aliases `/stash`. * Composer Ctrl+S handler in `tui/ui.rs` — pushes the current draft onto the stash, clears the composer, and surfaces a toast confirming the action so the no-op-feel doesn't fool users into thinking nothing happened. Empty composers are a no-op so a stray Ctrl+S can't pollute the file. * New `KbStashDraft` keybinding entry registered in the help overlay; localized in en, ja, zh-Hans, pt-BR. Tests: 7 unit tests in `composer_stash.rs` cover round-trip, LIFO pop, empty-on-pop, drop-empty-text, multi-line preservation, malformed-line resilience, and cap pruning. 4 unit tests in `commands/stash.rs` cover the preview helper's truncation, multi-line first-line behavior, and empty-input handling.	2026-05-03 06:09:35 -05:00
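The stash semantics listed above (drop empty text, prune the oldest entry past the cap, pop in LIFO order) can be sketched in memory as follows. The 200-entry cap is from the commit message; the `Stash` type is a hypothetical stand-in that omits the JSONL persistence and timestamps of the real store.

```rust
/// Entry cap from the commit message.
const STASH_CAP: usize = 200;

/// In-memory stand-in for the on-disk stash; oldest entry first.
#[derive(Default)]
struct Stash {
    entries: Vec<String>,
}

impl Stash {
    /// Park a draft. Returns false (and stores nothing) for empty or
    /// whitespace-only text.
    fn push(&mut self, text: &str) -> bool {
        if text.trim().is_empty() {
            return false;
        }
        self.entries.push(text.to_string());
        // Prune the oldest entries once the cap is exceeded.
        if self.entries.len() > STASH_CAP {
            let overflow = self.entries.len() - STASH_CAP;
            self.entries.drain(..overflow);
        }
        true
    }

    /// Restore the most recently parked draft (LIFO), if any.
    fn pop(&mut self) -> Option<String> {
        self.entries.pop()
    }
}
```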
Hunter Bown	0fa042dc99	feat(audit): emit tool.spillover events when output is spilled (#500 polish) The existing `tool.result` audit event records that a tool finished but says nothing about spillover — operators tailing `~/.deepseek/audit.log` couldn't see when 200 KiB of stdout landed under `~/.deepseek/tool_outputs/`. Adds a discrete `tool.spillover` event keyed off `apply_spillover`'s return value, fired in both the sequential and parallel tool paths so the log entry exists regardless of how the tool was scheduled. Each event carries: {"event": "tool.spillover", "tool_id": "...", "tool_name": "exec_shell", "path": "/.../call-abc.txt"} This is a pure observability addition. The model still receives the same truncated head + footer; the UI still renders the inline `full output: <path>` annotation; the spillover writer contract is unchanged. No new tests — `apply_spillover` already has unit-level coverage and the engine paths are exercised by integration runs.	2026-05-03 05:58:02 -05:00
Hunter Bown	d7017b7829	feat(skills): walk workspace .opencode + .claude skill dirs (#432 ) The skills catalogue and `load_skill` tool now scan every candidate directory in the workspace plus the global default, not just the first one that exists: <workspace>/.agents/skills (deepseek-native convention) <workspace>/skills (flat, project-local) <workspace>/.opencode/skills (OpenCode interop) <workspace>/.claude/skills (Claude Code interop) ~/.deepseek/skills (global, user-installed) Skills installed for any AI-tool convention land in the same catalogue without the user having to symlink or duplicate files. Name conflicts resolve first-match-wins per the precedence list above, so workspace-local skills shadow user/global ones — that's the right shadowing for "this repo overrides my defaults". Implementation: * `skills::skills_directories(workspace)` returns the existing candidate dirs in precedence order (host-dependent for the global default). * `skills::discover_in_workspace(workspace)` walks each, merges the discovered skills, and accumulates warnings. * `render_available_skills_context_for_workspace(workspace)` wraps `discover_in_workspace` for `prompts.rs`. The legacy single-dir `render_available_skills_context(skills_dir)` is retained as a fallback so callers that don't have a workspace view (e.g. mcp_server.rs) still work. * `LoadSkillTool` (#434) routes through `discover_in_workspace` so its lookup matches what the system-prompt catalogue advertises. The "skill not found" error message now lists the searched dirs to help the user debug missing installs. Tests: 4 new tests in `skills/mod.rs`: precedence-order resolution, first-wins merge across .agents and .claude, .opencode discovery, system-prompt rendering for cross-tool dirs. The existing 6 single-dir tests pass unchanged.	2026-05-03 05:52:28 -05:00
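The first-match-wins merge described above can be sketched as a single pass over the directories in precedence order. The `Skill` struct and `merge_first_match_wins` are hypothetical simplifications; the real `discover_in_workspace` reads the filesystem and also accumulates warnings.

```rust
use std::collections::HashSet;

/// A discovered skill: its name plus the directory it was found in.
#[derive(Debug, Clone, PartialEq)]
struct Skill {
    name: String,
    dir: String,
}

/// Merge the skills found in each directory. `dirs` must already be in
/// precedence order; the first directory to provide a name wins.
fn merge_first_match_wins(dirs: &[(&str, Vec<&str>)]) -> Vec<Skill> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for (dir, names) in dirs {
        for name in names {
            // `insert` returns false when an earlier directory claimed the name.
            if seen.insert(*name) {
                merged.push(Skill {
                    name: name.to_string(),
                    dir: dir.to_string(),
                });
            }
        }
    }
    merged
}
```

With `.agents/skills` listed before `.claude/skills`, a skill named `review` present in both is taken from `.agents/skills`, which is the shadowing the commit describes.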
Hunter Bown	8290b136e1	feat(tui): push DISAMBIGUATE_ESCAPE_CODES on startup (#442 ) Opt into the Kitty keyboard protocol's escape-code disambiguation so terminals that support it (Kitty, Ghostty, Alacritty 0.13+, WezTerm, recent Konsole / xterm) report unambiguous events for Option/Alt-modified keys, plain Esc, and multi-byte sequences. Push happens after `enable_raw_mode` and the alt-screen / mouse-capture / bracketed-paste setup so the order matches shutdown's reverse-order pop. Only the disambiguation tier is pushed — `REPORT_EVENT_TYPES` and the higher tiers emit release events that the existing key handlers would mis-route as duplicate presses. Pop on exit was already wired in main.rs (panic) and ui.rs (normal shutdown) per #443; the recent #443 follow-up extended that to the suspend paths so editor / shell-suspend children inherit a clean keyboard mode. The push + the four pops form a complete pair. Failure to push is logged at debug level and ignored — a quirky terminal can't block startup. On terminals without protocol support the escape sequence is silently discarded and behaviour is identical to today (iTerm2, Terminal.app, Windows 10 conhost). No new dependency; everything runs through crossterm's existing `PushKeyboardEnhancementFlags` command.	2026-05-03 05:45:52 -05:00
Hunter Bown	e8af3cd37d	feat(tools): load_skill model-callable tool (#434 ) Adds a `load_skill` tool that takes a skill id and returns the SKILL.md body plus the sibling companion-file list in one tool call. The existing progressive-disclosure pattern (system prompt lists skills → model `read_file <path>`) still works; this tool is the higher-level affordance for skills that ship with multiple resource files. Implementation: * `LoadSkillTool` lives in `crates/tui/src/tools/skill.rs`. Read- only, auto-approved, parallel-safe. * On call, resolves the active skills directory via the new `skills::resolve_skills_dir` helper, which mirrors `App::new`'s hierarchy: `<workspace>/.agents/skills` → `<workspace>/skills` → `~/.deepseek/skills`. No new plumbing through ToolContext — the workspace is already there. * Returns the skill body wrapped in a self-contained block: description quote, source path, the SKILL.md verbatim, and a `## Companion files` section listing siblings (sorted lex, deterministic for tests). Solo skills skip the companions section entirely so the tool result stays tight. * Errors with a helpful hint when the name is unknown — the hint includes the catalogue ("Available: foo, bar, baz") so the model can recover without an extra discovery call. * Wired into `ToolRegistryBuilder::with_skill_tools` and pulled into both Agent and Plan tool-setup paths. Plan mode benefits because skills are read-only references that planners often need. Tests: 5 unit tests covering: description-headed body, companion enumeration excluding SKILL.md and nested dirs, empty result for solo skills, and the conditional `## Companion files` section.	2026-05-03 05:43:18 -05:00
Hunter Bown	5deaf97253	fix(tui): pop keyboard flags on suspend paths too (#443 follow-up) `main.rs` (process panic) and the normal TUI shutdown both pop keyboard enhancement flags before handing the terminal back to the child shell. The two suspend paths — `pause_terminal` (Ctrl+Z and shell-suspend) and `external_editor::spawn_editor_for_input` (composer `$EDITOR` launch) — were missing the same defensive pop. Today this is dormant: the TUI doesn't push keyboard enhancement flags explicitly, so there's nothing to pop. The fix is defence-in-depth: the day a future code path enables the flags (kitty keyboard protocol for sub-second-precision modifier reporting, say), the suspend handlers won't leak the half-configured input mode to Vim / less / a shell child. Aligns the four terminal-handoff sites (shutdown, panic, suspend, editor) so they all do the same thing.	2026-05-03 05:29:11 -05:00
Hunter Bown	ac0c16996e	feat(config): instructions array merged into system prompt (#454 ) Adds a new optional `instructions = ["./AGENTS.md", "~/.deepseek/global.md"]` config field that's loaded at startup and concatenated into the system prompt, in declared order, above the skills block. * `Config::instructions: Option<Vec<String>>` — raw paths from `~/.deepseek/config.toml` or the per-project overlay. * `Config::instructions_paths()` — `expand_path` each entry, drop empties, return the resolved `Vec<PathBuf>`. * `merge_project_config` — project's array replaces the user-level array wholesale (including `instructions = []` to clear the user list for the current repo). The typical "merge" pattern is for users who want both — they list `~/global.md` inside the project array. * `EngineConfig::instructions: Vec<PathBuf>` — threaded from config through both engine entry points (`Engine::new` for Default and `refresh_system_prompt` for runtime swaps). * `prompts::render_instructions_block(paths)` — loads each file in order, caps each at 100 KiB with a `[…elided]` marker on overflow, skips missing files with a tracing warning. Returns `None` when nothing renders so the caller appends nothing. * `system_prompt_for_mode_with_context_and_skills` gains an `instructions: Option<&[PathBuf]>` parameter. Block lives between the project-context block and the skills block so it benefits from KV prefix caching and per-project overrides apply consistently turn-over-turn. Documentation: * `config.example.toml` documents the field, the wholesale- override semantics, and the size cap. Tests: * 5 new tests in `prompts.rs`: no-op for empty input, skip missing files, declared-order concatenation, skip empty files, truncate oversize files, plus an end-to-end test that the block appears in the assembled system prompt when configured.	2026-05-03 05:25:31 -05:00
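The wholesale-override rule for `instructions` reduces to "a project value, if present, replaces the user value entirely". A minimal sketch, assuming both layers are held as `Option<Vec<String>>`; the function name `merge_instructions` is hypothetical.

```rust
/// Project array replaces the user array wholesale. `Some(vec![])` from the
/// project clears the user's list; `None` (field absent) keeps it.
fn merge_instructions(
    user: Option<Vec<String>>,
    project: Option<Vec<String>>,
) -> Option<Vec<String>> {
    // `Option::or` keeps the project value whenever it is `Some`,
    // even when the vector inside is empty.
    project.or(user)
}
```

Because nothing is concatenated, a project that wants the user's global file as well has to list it explicitly in its own array, as the commit notes.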
Hunter Bown	5e83f073b1	feat(footer): cumulative session-elapsed indicator (#448 ) Adds `App::session_started_at: Instant` (set at construction) and a low-priority `worked Nh Mm` chip in the footer's right cluster that surfaces session age once it crosses 60s. * `footer_worked_chip(elapsed)` returns empty spans for the first minute of a session so a fresh launch doesn't render a noisy ticker. Above the threshold it reuses the multi-day `humanize_duration` helper (#447) so the band promotion stays consistent: `1m`, `3h 12m`, `2d 5h`, `1w 2d`. * The chip slots in last in `auxiliary_spans`, which means under narrow widths it's the first thing the priority-drop loop removes — the existing chips (coherence / agents / replay / cache / mcp) keep their slots. * `FooterProps` carries a captured `worked: Vec<Span<'static>>` built at props-build time (matches the existing `retry` capture pattern). Render stays pure, tests can pin a known state without relying on wall-clock. Tests: 3 new tests in `tui/widgets/footer.rs` — chip hidden under 60s, chip rendered with humanized labels at 60s / 3h 12m / 2d 5h bands. The existing `from_app_idle_state` test gains a `worked.is_empty()` assertion (the test app is freshly constructed, well under the 60s threshold).	2026-05-03 05:17:01 -05:00
Hunter Bown	6dfb10f321	feat(a11y): NO_ANIMATIONS env override + accessibility docs (#450 ) `fancy_animations: false` and `low_motion: true` already exist on the settings struct, but the flag was undocumented and the only ways to opt in were the `/settings` slash command or hand-editing `~/.config/deepseek/settings.toml` — there was no environment- level signal that platform a11y tooling could carry forward. * `NO_ANIMATIONS=1` env var now forces `low_motion = true` and `fancy_animations = false` at startup, regardless of what's on disk. Recognises `1`, `true`, `yes`, `on` (case-insensitive); any other value is treated as unset. * `Settings::apply_env_overrides()` is now called at the end of `Settings::load()`, so every consumer (App::new, /config, the doctor surface) sees the override applied uniformly. The override is a startup-time overlay — changing the env var mid-session has no effect. * New `docs/ACCESSIBILITY.md` documents the existing `low_motion`, `fancy_animations`, `calm_mode`, `show_thinking`, and `show_tool_details` toggles plus the `NO_ANIMATIONS` startup override. Includes guidance for screen-reader users and a link back to this issue for follow-up motion regressions. Tests: 3 new tests in `settings.rs` (force-low-motion-on, override- user-opt-in, truthy-spelling-recognition). All three serialise through a static Mutex so the cargo parallel runner doesn't observe interleaved env mutations.	2026-05-03 05:09:17 -05:00
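A minimal sketch of the `NO_ANIMATIONS` override described above, assuming a cut-down `Settings` struct that models only the two affected fields. `is_truthy` and `apply_no_animations` are hypothetical helper names introduced for illustration, and trimming whitespace before matching is an assumption not stated in the commit.

```rust
/// Hypothetical stand-in for the real settings struct; only the two
/// fields the override touches are modelled here.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub low_motion: bool,
    pub fancy_animations: bool,
}

/// True for the spellings listed in the commit: 1, true, yes, on
/// (case-insensitive). Anything else counts as unset.
pub fn is_truthy(raw: &str) -> bool {
    matches!(
        raw.trim().to_ascii_lowercase().as_str(),
        "1" | "true" | "yes" | "on"
    )
}

impl Settings {
    /// Apply the override for a raw env value (`None` means the variable is unset).
    /// Kept separate from the env read so tests need not mutate the process env.
    pub fn apply_no_animations(&mut self, raw: Option<&str>) {
        if raw.map(is_truthy).unwrap_or(false) {
            self.low_motion = true;
            self.fancy_animations = false;
        }
    }

    /// Startup-time overlay: read `NO_ANIMATIONS` once and apply it.
    pub fn apply_env_overrides(&mut self) {
        let raw = std::env::var("NO_ANIMATIONS").ok();
        self.apply_no_animations(raw.as_deref());
    }
}
```

Splitting the pure decision from the environment read is one way to sidestep the interleaved-env-mutation problem that the commit's tests address with a static Mutex.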
Hunter Bown	3625b887fa	feat(ui): humanize_duration handles hours, days, and weeks (#447 ) Long-running sessions (multi-hour cycles, multi-day automations) were rendering with the seconds/minutes-only formatter, so a two-day session showed as `2880m 0s` and `/goal` status used Rust's Debug Duration form (`188415.234s`). `humanize_duration` now walks through w/d/h/m/s and caps the output at two units so it stays compact in headers and notifications: * `45s`, `1m 12s`, `59m 59s` (existing seconds/minutes path) * `1h`, `2h 2m`, `3h 12m` (was `192m 30s`) * `1d`, `1d 1h`, `2d 5h` (the multi-day case from the issue) * `1w`, `1w 1d`, `3w 2d` (long-running automation) The two-tier rule drops sub-minute precision once you're past the hour boundary; the goal is "is this a couple of hours or days," not stopwatch precision. `/goal` status now wires through this formatter so multi-day goal-elapsed times read as `2d 4h` instead of the previous `188415.234s` Debug form. The notification system was the existing caller and picks up the new format automatically. Tests: 4 test functions in `notifications.rs` covering the four formatting bands (s/m, h/m, d/h, w/d) plus the boundary cases on each unit.	2026-05-03 05:05:30 -05:00
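The two-unit rule can be sketched as follows. This is an illustrative reimplementation inferred from the examples in the message, not the code from `notifications.rs`: the `UNITS` table is a hypothetical name, and the choice to show only the largest non-zero unit plus the unit immediately below it (so `1h 0m 5s` renders as `1h`) is an assumption drawn from the "drops sub-minute precision" remark.

```rust
use std::time::Duration;

/// Unit suffixes and their size in seconds, largest first.
const UNITS: [(&str, u64); 5] = [
    ("w", 7 * 24 * 3600),
    ("d", 24 * 3600),
    ("h", 3600),
    ("m", 60),
    ("s", 1),
];

/// Format a duration with at most two adjacent units, e.g. `3h 12m`.
pub fn humanize_duration(d: Duration) -> String {
    let mut rest = d.as_secs();
    // Split the total into per-unit counts, largest unit first.
    let mut counts = [0u64; 5];
    for (i, &(_, size)) in UNITS.iter().enumerate() {
        counts[i] = rest / size;
        rest %= size;
    }
    // The largest non-zero unit leads; a zero duration renders as `0s`.
    let Some(first) = counts.iter().position(|&c| c > 0) else {
        return "0s".to_string();
    };
    let mut out = format!("{}{}", counts[first], UNITS[first].0);
    // Append only the immediately smaller unit, and only when it is non-zero.
    if let Some(&next) = counts.get(first + 1) {
        if next > 0 {
            out.push_str(&format!(" {}{}", next, UNITS[first + 1].0));
        }
    }
    out
}
```

Under these assumptions `humanize_duration(Duration::from_secs(11_550))` yields `3h 12m`, matching the `192m 30s` case in the message.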
Hunter Bown	0b99ad1f25	feat(engine): wire tool-output spillover into the engine and pager (#500 ) The spillover writer (#422) and inline cell annotation (#423) were already in place; this commit makes the pipeline actually fire and gives the user a way to see the elided tail. * `apply_spillover` lives in `tools/truncate.rs` and mutates a `ToolResult` in place: writes the full content to `~/.deepseek/tool_outputs/<id>.txt`, replaces the inline content with a 32 KiB head plus a footer pointing at the file, and stamps `metadata.spillover_path` so downstream renderers can find it. Skips error results so the model still sees the failure verbatim. Preserves prior metadata when present. * `core/engine/turn_loop.rs` calls `apply_spillover` immediately after `execute_tool_with_lock` returns, before the result fans out to the model context (`ContentBlock::ToolResult`) and the UI (`Event::ToolCallComplete`). Both the parallel and sequential tool paths get the same hook so the model and the UI always see the same truncated content. * `tui/ui.rs::open_details_pager_for_cell` now folds the full spillover-file body into the tool-details pager when the focused cell has a `spillover_path`. Truncated head stays at the top (so the user can see what the model received) followed by a `── Full output (spillover) ──` separator and the file body. Missing files render an inline notice instead of silently truncating. * The model's footer ("Use `read_file path=…` if you need the elided tail") teaches the agent how to recover the rest of the payload on its next turn, so spilled output is not lost, just not paid for in context tokens unless the agent decides it actually needs the tail. Tests: 4 new unit tests in `tools/truncate.rs` (no-op below threshold, no-op for errors, truncate + stamp above threshold, preserve prior metadata). 3 new tests in `tui/ui/tests.rs` for the pager helper (no-op without spillover_path, file-load happy path, graceful notice when the file is missing).	2026-05-03 05:02:11 -05:00
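The spillover step described in this commit might look roughly like the sketch below. The `ToolResult` shape here is a hypothetical simplification (string metadata map, explicit `is_error` flag), the spill threshold is assumed to equal the 32 KiB head size, the footer wording is paraphrased, and the output directory is passed in rather than hard-coded to `~/.deepseek/tool_outputs/`.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Size of the inline head kept for the model (32 KiB, per the commit).
/// Assumption: the same value doubles as the spill threshold.
pub const HEAD_BYTES: usize = 32 * 1024;

/// Hypothetical, simplified stand-in for the real `ToolResult`.
#[derive(Debug, Clone, Default)]
pub struct ToolResult {
    pub id: String,
    pub content: String,
    pub is_error: bool,
    pub metadata: HashMap<String, String>,
}

/// Spill oversized output to `<dir>/<id>.txt` and truncate the inline copy.
/// Returns `Ok(true)` when a spill happened, `Ok(false)` for the no-op cases.
pub fn apply_spillover(result: &mut ToolResult, dir: &Path) -> io::Result<bool> {
    // Error results pass through verbatim; small results need no spill.
    if result.is_error || result.content.len() <= HEAD_BYTES {
        return Ok(false);
    }
    fs::create_dir_all(dir)?;
    let path = dir.join(format!("{}.txt", result.id));
    fs::write(&path, &result.content)?;

    // Cut the head on a UTF-8 char boundary at or below HEAD_BYTES.
    let mut end = HEAD_BYTES;
    while !result.content.is_char_boundary(end) {
        end -= 1;
    }
    let head = &result.content[..end];
    let inline = format!(
        "{head}\n[output truncated; use `read_file path={}` if you need the elided tail]",
        path.display()
    );
    result.content = inline;
    // Existing metadata keys are left untouched; only the path is stamped.
    result
        .metadata
        .insert("spillover_path".to_string(), path.display().to_string());
    Ok(true)
}
```

Because the function mutates the result before it fans out anywhere, whatever consumes it afterwards (model context or UI) sees the same truncated content, which is the property the commit calls out.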
Hunter Bown	637d0f088f	fix(agents): list Implementer/Verifier in agent_spawn + agent_assign schemas (#404 ) The SubAgentType enum gained `Implementer` and `Verifier` variants in #404, but the JSON-schema `description` strings on AgentSpawnTool::input_schema and DelegateToAgentTool::input_schema still listed the pre-#404 set (general/explore/plan/review/custom). The model only sees those descriptions, so the new roles were effectively hidden behind a docs lookup. Updates both descriptions to the post-#404 surface and references docs/SUBAGENTS.md for posture. Also adds the long-form aliases (builder/validator/tester) to the agent_assign hint so it matches the canonical alias map. Pure copy change — no behaviour delta.	2026-05-03 04:50:51 -05:00
Hunter Bown	482fcdee7c	docs(changelog): collapse #422 + #423 spillover entry Both halves now shipped; combined entry reads more clearly than two separate ones split across Added. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:45:03 -05:00
Hunter Bown	cea4617fb4	docs(changelog): record #422 spillover writer in v0.8.8 entry Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:38:17 -05:00
Hunter Bown	01fa11b96f	docs(changelog): note /sessions prune slash command in #406 entry Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:31:18 -05:00
Hunter Bown	2fa23c1d74	docs(changelog): record session-prune helper + doctor memory block Two items added in this stabilization pass that weren't yet in the changelog: - SessionManager::prune_sessions_older_than (#406 phase-1) - doctor --json memory block (#489 follow-up) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:27:12 -05:00
Hunter Bown	8e7664bc70	docs(changelog): populate [Unreleased] with v0.8.8 stabilization entries Catalogues the 24 v0.8.8 issues shipped across PRs #514 / #515 / #517 / #518 / #519 in the standard Keep-a-Changelog format, organized into Added / Changed / Fixed buckets with issue cross-references. Captured: - Added (10): memory MVP + remember tool, inline diff, OSC 8, retry banner, MCP chip, project config overlay, Implementer/Verifier roles, two doc files, competitive analysis - Changed (8): sub-agent cap, RwLock, output summarization, agent_list session boundary, concise todos, compact agent_spawn, Plan panel, RLM family - Fixed (8): self-update arch, Option+Backspace word delete, offline queue scope, display_path Windows, footer theme color, panic-exit keyboard flags, CI workflow cleanup, plus the v0.8.8 release-base fix Plus a Releases callout reminding maintainers that the npm wrapper publish stays manual and the GitHub release automation depends on the `RELEASE_TAG_PAT` secret. The dated section header lands when the actual version-bump commit fires `auto-tag.yml`. This commit just populates the [Unreleased] body so contributors get a clean summary while the PRs are still in review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:19:09 -05:00
Claude	0e5afe0b01	feat(v0.8.8): linux ARM64 prebuilts + install docs overhaul Triggered by a Telegram report from a Chinese user trying to deploy DeepSeek TUI on a HarmonyOS ARM64 thin-and-light: `npm i -g deepseek-tui` exited with `Unsupported architecture: arm64 on platform linux` because v0.8.7 only published x64 Linux artifacts. They worked around it with `cargo install`, but the README never documented that path for ARM users. This PR closes that gap on three layers: - Release workflow — add `aarch64-unknown-linux-gnu` to the build matrix using GitHub's `ubuntu-24.04-arm` runner. v0.8.8 will publish `deepseek-linux-arm64` and `deepseek-tui-linux-arm64` alongside the existing x64/macOS/Windows assets, plus add the row to the Release body's manual-download table. - npm wrapper — uncomment the linux/arm64 row in `ASSET_MATRIX`, rewrite the `Unsupported architecture/platform` error to print the full `cargo install deepseek-tui-cli deepseek-tui --locked` recipe and link to docs/INSTALL.md, and add `DEEPSEEK_TUI_OPTIONAL_INSTALL=1` so CI matrices that include unsupported platforms can keep running without a binary. - Docs — new docs/INSTALL.md covering every supported platform, prebuilt vs. cargo install vs. manual download, cross-compiling x64 -> ARM64 with `cross` or `gcc-aarch64-linux-gnu`, China mirror setup, and a troubleshooting section for the common arm64, MISSING_COMPANION_BINARY, and self-update arch-mapping (#503) errors. README and README.zh-CN now have an explicit Linux ARM64 quickstart pointing at `cargo install` for v0.8.7 today and `npm i -g` for v0.8.8+; the v0.8.7 known-issue block is updated to mention both #503 and the missing arm64 prebuilt. https://claude.ai/code/session_01Fg1FKMtDxVnC4pp6bNBRCS	2026-05-03 04:42:53 +00:00
Hunter Bown	98ab76a99c	docs: add v0.8.7 changelog + README release notes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 22:26:31 -05:00
Hunter Bown	5bfc1feb62	v0.8.6: survivability, UX polish, and release hardening Merge the v0.8.6 feature batch and release hardening. Includes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes. Remote PR checks passed on the final head before merge.	2026-05-02 20:11:33 -05:00
Hunter Bown	3d3ff0c5cf	Release v0.8.4: Phase 1 i18n + cache-prefix stability * fix(pricing): extend V4 Pro 75% discount expiry to 2026-05-31 15:59 UTC DeepSeek extended the promotional discount past the original 2026-05-05 cutoff. Without this update the TUI would have started showing 4× the actual billed cost on May 6. Source: https://api-docs.deepseek.com/quick_start/pricing — "extended until 2026/05/31 15:59 UTC". Adds a regression test pinning the new active window so a future revert to the May 5 date trips the suite immediately. Closes #267 * chore: remove stale TODO(integrate) markers from already-integrated modules Five `// TODO(integrate)` comments and one matching "Not yet integrated" note were misleading anyone grepping for integration work. Each module is in fact wired up: - execpolicy/mod.rs → tools/shell.rs:1322 (load_default_policy) - sandbox/mod.rs → tools/shell.rs:28, main.rs:2647, tui/approval.rs:30 - sandbox/policy.rs → main.rs:2752, tui/approval.rs:30 (SandboxPolicy) - command_safety.rs → tools/shell.rs:1321, tools/tasks.rs:13, tools/approval_cache.rs:26 - tui/streaming/mod.rs → tui/app.rs:38 (StreamingState) The remaining TODO at mcp.rs:1771 covers a separate "wire legacy sync API into CLI subcommands or remove" decision and is left in place. Closes #266 * docs(release): add install + dual-binary template to GitHub Release page Closes #265. The Release page used the auto-generated commit-title body. New users hitting the Release page from Twitter / npm-search had no on-page guidance that the dispatcher (`deepseek`) and the TUI runtime (`deepseek-tui`) ship as two binaries that must coexist; #258 was an external user spending 11 minutes figuring this out and #272 was the follow-on confusion. 
The new body covers: - npm wrapper as the recommended install - `cargo install deepseek-tui-cli deepseek-tui --locked` (both crates) - Manual download with a per-platform table showing both artifacts - sha256 verify using the existing `deepseek-artifacts-sha256.txt` - Changelog link * feat(debug): add /cache command surfacing per-turn DeepSeek cache hit/miss Step 1 of #263. Without per-turn telemetry the prefix-cache audit is unfounded speculation; the rest of the issue's investigation steps depend on this surface. The DeepSeek API already returns `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` per turn, and we already store the latest on App. This adds a 50-turn ring (`turn_cache_history`) populated at the same site as `last_prompt_cache_*_tokens`, plus a `/cache [count]` slash command that renders a fixed-width table of the last N turns with per-turn ratios and a session aggregate. Default count is 10; larger values clamp to the ring size. Edge cases the formatter handles: - No telemetry yet → friendly "no turns recorded" message - `cache_hit_tokens = None` (provider didn't report) → row renders all em-dashes and is excluded from session aggregates so one missing-telemetry turn can't make the average ratio look broken. - `cache_hit_tokens = Some, cache_miss_tokens = None` → infer miss as `input − hit` and mark the cell with `*`. Footer documents the asterisk. - Ring at cap (50) → push evicts oldest. Tests cover all four paths plus the cap. * test(prompts): add cache-prefix stability harness for #263 step 2 The DeepSeek prefix-cache only hits while the byte prefix of each request matches the prior call. Anything in the cached prefix that varies turn-to-turn for unchanged inputs is a cache buster. Adds a focused harness next to the production surface so the property is regression-guarded: 1. 
`first_divergence(a, b)` helper that returns the first divergent byte position with a `±32 byte` window of context, used by the custom assertion `assert_byte_identical`. Future suspect tests can reuse this to surface "where" rather than just "fail". 2. `compose_prompt_is_byte_stable_across_calls` — sweeps every (mode, personality) pair and pins that two consecutive calls produce identical bytes. Rules out suspect #4 (mode-prompt churn). 3. `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace` — the call site `engine.rs::build_tool_context` actually invokes, pinned for an empty workspace across all three modes. 4. `system_prompt_with_working_set_summary_is_byte_stable_for_constant_summary` — pins that the surrounding prompt construction faithfully embeds the working_set summary it's given without injecting extra non-determinism. (The actual working_set summary stability lives in `working_set.rs` and is the next investigation target — see issue note in PR description.) Foundation for the suspect-by-suspect bisection in the rest of #263. * fix(secrets): never overwrite the secrets file when load_unlocked errors `FileKeyringStore::set` and `delete` did `self.load_unlocked().unwrap_or_default()`, which wiped every existing secret if the read failed for any reason other than \"file is missing\": - file mode != 0600 (`InsecurePermissions`) — easy on headless / CI environments where a permissive umask got applied - corrupt JSON - transient I/O error In all of those, the next `store_unlocked` overwrote the file with an empty-or-single-entry blob and reset perms to 0600, silently losing every other provider's key. Switch both call sites to `?`. `load_unlocked` already returns `Ok(default)` for a missing file, so the first-write-creates-the-file ergonomic is preserved (covered by the new `file_store_set_still_creates_file_when_missing` test). Adds four regression tests: - set: insecure perms surface InsecurePermissions and leave the file byte-identical. 
- delete: same. - set: corrupt JSON surfaces the parse error and leaves the file byte-identical. - set: missing file path still works (idempotence guard). Closes #281 * fix(cache): make tool catalog byte-stable across calls and sessions DeepSeek's KV prefix cache hits on the longest matching byte prefix of the request. Two places in the tool-array path were silently introducing divergence: 1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly. Rust's default `HashMap` is seeded with `RandomState` per process, so every `deepseek` launch produced a different tool order — the cross- session resume case (the one with the biggest cache wins) never hit. 2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the active set in catalog order. When ToolSearch activated a previously- deferred tool mid-conversation, the new tool appeared at its catalog index, shifting every later tool's byte offset and busting the cached prefix from there onwards. Fixes: - `to_api_tools()` now sorts by tool name before emitting the API tool array. Stable across calls AND across launches. - `build_model_tool_catalog()` sorts each partition (built-ins first, contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's `assembleToolPool` strategy where they explicitly call out cache stability as the reason: "a flat sort would interleave MCP tools into built-ins and invalidate all downstream cache keys whenever an MCP tool sorts between existing built-ins." - `active_tool_list_from_catalog()` puts always-loaded tools in catalog order at the head and deferred-but-now-active tools at the tail. A deferred-tool activation during ToolSearch no longer shifts earlier tools' positions. Adds three regression tests: - `to_api_tools_emits_alphabetical_order_regardless_of_registration_order` - `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability` - `active_tool_list_pushes_deferred_activations_to_the_tail` Refs #263. 
Findings produced by reading reference Claude Code source side-by-side with our request-building flow; full delta analysis in the PR description. * fix(sandbox): elevate Agent-mode shell sandbox to allow network access The seatbelt-default policy is `WorkspaceWrite { network_access: false }`, which on macOS emits `(deny default)` with no `(allow network-outbound)` / `(allow system-socket)`. Every outbound socket call from a sandboxed shell command — including `getaddrinfo` for DNS — gets denied by the kernel. Symptom: "DNS resolution failed" for any URL the model tries to reach via curl, yt-dlp, package managers, etc. Engine.build_tool_context only elevated the policy in Yolo mode, leaving Agent mode (the default) stuck on the strict default. That's tighter than competitors (Claude Code, Codex) without buying any safety the application-level NetworkPolicy or the approval flow doesn't already provide. Switch the elevation to a `match` so: - Plan → no elevation (read-only investigation; shell tool not registered) - Agent → WorkspaceWrite { network_access: true, … } - Yolo → WorkspaceWrite { network_access: true, … } (unchanged) Adds `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network` so a future revert to the no-network default trips CI immediately. Closes #273 * fix(skills): treat bare github.com/<owner>/<repo> URLs as GitHubRepo Closes #269. `/skill install https://github.com/obra/superpowers` failed on every platform with `invalid gzip header`. Root cause: `InstallSource::parse` matched any `https://`-prefixed spec as `DirectUrl`, so the installer downloaded the HTML repo page (200 OK, `text/html`) and tried to gzip-decode HTML. The user reported it from Win11 + PowerShell but the parse path is platform-independent. 
Recognize bare GitHub repo URLs in `InstallSource::parse`: - `https://github.com/<owner>/<repo>` - `https://github.com/<owner>/<repo>/` - `https://github.com/<owner>/<repo>.git` - `https://github.com/<owner>/<repo>.git/` - `https://www.github.com/<owner>/<repo>` - `http://github.com/<owner>/<repo>` (legacy) …all route to the existing `GitHubRepo` source, which already produces `https://github.com/<repo>/archive/refs/heads/{main,master}.tar.gz` candidates with proper fallback. URLs with a third path segment (`/archive/...`, `/blob/...`, `/tree/...`) keep going through `DirectUrl` because the user picked that exact path. Adds two regression tests: one asserting the seven recognised forms all canonicalize to `github:obra/superpowers`, and one pinning the sub-resource paths to `DirectUrl`. * fix(cache): drop volatile fields from working_set summary block (#280) (#287) The working-set summary lands inside the system prompt before the historical conversation, so any byte that drifts there cache-misses everything that follows in DeepSeek's KV prefix cache. Two sources of turn-over-turn drift are removed: 1. The rendered line is now `- {path} ({kind})`. The previous form interpolated `entry.touches` and `self.turn - entry.last_turn`, both of which advance on every user message even when no new paths are observed. 2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC) instead of the turn-aware `sorted_entries`. The recency bonus in `score_entry` crosses bucket boundaries as turns advance, so even without rendering `last seen` the order — and which entries cross the `max_prompt_entries` cutoff — drifted. Compaction pinning still uses `sorted_entries` because it genuinely wants recency. Adds a regression test that observes a fixed message set, calls `summary_block` before and after `next_turn()`, and asserts the two outputs are byte-identical. 
The shared `first_divergence` / `assert_byte_identical` helpers (from #279) move from `prompts::tests` into `test_support` so working_set tests can reuse them. Closes #280. * fix(cache): memoise tool catalog so descriptions stay byte-stable (#289) `to_api_tools` previously re-sampled `tool.description()` and `tool.input_schema()` on every call. Native tools return `&'static str` and a `json!` literal, so the bytes were stable in practice — but the `McpToolAdapter` returns `self.tool.description.as_deref()`, which can drift when the upstream MCP server reconnects with a different description string. Any drift mid-session rewrites the tool catalog that lands in the cached prefix and busts every byte that follows. Adds an `api_cache: OnceLock<Vec<Tool>>` field on `ToolRegistry`. The first `to_api_tools` call materialises the catalog; subsequent calls return a clone of the cached vector. Mutations (`register`, `remove`, `clear`) reset the field so the next read rebuilds. Mirrors reference-cc's `getToolSchemaCache` (`utils/api.ts:119–208`). Tests: - `to_api_tools_pins_description_bytes_across_calls` registers a tool whose `description()` advances through a script of pre-built strings on each call. After the cache is populated, the second `to_api_tools` read returns the original description because `description()` is no longer invoked. Without the cache the second read would return the next script entry. - `register_invalidates_api_tools_cache` registers a tool, snapshots, registers another, snapshots again, and asserts the second snapshot reflects both tools (cache rebuilt) and that the varying tool's description advanced (proving the rebuild actually re-sampled). - `remove_and_clear_invalidate_api_tools_cache` covers the other two invalidation paths. 
* fix(cache): sort project_tree and summarize_project output (#290) Both helpers walked the workspace via `ignore::WalkBuilder::build()` and emitted entries in the OS readdir order — non-deterministic across filesystems (htree-hash on ext4, insertion-order on APFS, etc.). Their output lands in the fallback branch of the system prompt's project context (when the workspace has no AGENTS.md / CLAUDE.md) and inside the `project_map` tool surface, both of which feed the cached prefix. `summarize_project` now sorts the collected key-files list before the type-detection logic and the fallback `Project with key files: …` join. `project_tree` collects `(rel_path, is_dir)` tuples, sorts by full path, and only then formats the indented tree. Sorting by full path preserves the visual tree shape — `"src" < "src/lib.rs"` because the shorter string compares less — while making siblings deterministic. Tests cover sibling order, parent-before-children invariant, byte stability across two consecutive calls, and the fallback `Project with key files:` branch (the only branch where the joined order escapes into output without further sorting downstream). * fix(client): unique fallback id for parallel streaming tool calls (#291) When a streamed tool_call delta omits the `id` field, the chat-completion decoder used to fall back to the literal string `"tool_call"` for every call. With the V4 API's native parallel tool calls (multiple tool_calls in one delta), every parallel call ended up with the same fallback id — downstream tool-result routing then matched the first call's result twice and the second call hung waiting for an answer that never arrived. The fallback now indexes by the assigned `content_block` position, producing `"call_0"`, `"call_1"`, … within a single response. Upstream- supplied ids are still forwarded verbatim; only the fallback path changes. 
Tests pin both invariants: - `decoder_assigns_unique_fallback_ids_to_parallel_tool_calls_missing_id` feeds two tool calls without `id` in one delta and asserts they get distinct ids. - `decoder_preserves_upstream_tool_call_id_when_present` keeps the forward-as-is path honest. * fix(cache): place handoff and working_set after static prompt blocks (#292) * fix(cache): drop volatile fields from working_set summary block (#280) The working-set summary lands inside the system prompt before the historical conversation, so any byte that drifts there cache-misses everything that follows in DeepSeek's KV prefix cache. Two sources of turn-over-turn drift are removed: 1. The rendered line is now `- {path} ({kind})`. The previous form interpolated `entry.touches` and `self.turn - entry.last_turn`, both of which advance on every user message even when no new paths are observed. 2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC) instead of the turn-aware `sorted_entries`. The recency bonus in `score_entry` crosses bucket boundaries as turns advance, so even without rendering `last seen` the order — and which entries cross the `max_prompt_entries` cutoff — drifted. Compaction pinning still uses `sorted_entries` because it genuinely wants recency. Adds a regression test that observes a fixed message set, calls `summary_block` before and after `next_turn()`, and asserts the two outputs are byte-identical. The shared `first_divergence` / `assert_byte_identical` helpers (from #279) move from `prompts::tests` into `test_support` so working_set tests can reuse them. Closes #280. * fix(cache): place handoff and working_set after static prompt blocks `system_prompt_for_mode_with_context_and_skills` previously interleaved volatile content into the static prefix: 1. mode prompt static 2. project context static 3. working_set_summary ← volatile 4. skills_block static 5. handoff_block ← volatile 6. ## Context Management static 7. 
COMPACT_TEMPLATE static Anything past byte (3) cache-missed every time the working-set drifted or `/compact` rewrote `.deepseek/handoff.md` — including the static `## Context Management` and `## Compaction Handoff` blocks behind them. New order keeps every static block in the cached prefix and pushes the two volatile blocks to the end: 1. mode prompt 2. project context (or fallback automap) 3. skills block 4. ## Context Management (Agent / Yolo only) 5. COMPACT_TEMPLATE ── volatile boundary ── 6. handoff block 7. working-set summary Adds a doc comment on the function describing the volatile-content-last invariant so future contributors don't reintroduce churn into the prefix. Adds two regression tests: - `system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged` pins the handoff path with a fixture file. - `handoff_and_working_set_appear_after_static_blocks` asserts the ordering invariant directly so a future reorder fails loudly. Reference: Claude Code's own prompt builder marks this same boundary with a `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` constant; we don't introduce the abstraction yet but match the principle. * feat(i18n): localize slash command help (Phase 1a, #285) (#294) Adds 44 new MessageIds, one per slash command, and translations to all four shipped locales (en/ja/zh-Hans/pt-BR). Refactors CommandInfo so the English description now lives in localization.rs (single source of truth) instead of being duplicated on the struct, and threads the active Locale through the three render surfaces: - crates/tui/src/tui/views/help.rs (the ?/F1/Ctrl+/ help overlay) - crates/tui/src/tui/command_palette.rs (Ctrl+K palette) - crates/tui/src/commands/core.rs (the /help text command) Usage strings (e.g. /cache [count]) stay English by design — they're placeholder syntax, not natural language. 
The existing locale-coverage test (`shipped_first_pack_has_no_missing_core_messages`) already iterates ALL_MESSAGE_IDS across Locale::shipped(), so the 44 new IDs are automatically required to be present in all four locale arms or CI fails. This is the first of several incremental Phase 1 PRs. Phase 1b covers the debug commands (/tokens /cost /cache), 1c the footer hints, and 1d doctor output. Phases 2–3 cover onboarding and error surfaces. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(i18n): localize /tokens /cost /cache debug output (Phase 1b, #285) (#295) Adds 13 new MessageIds covering the report templates and the sub-strings shared across them, with translations for all four shipped locales (en/ja/zh-Hans/pt-BR): - CmdTokensReport, CmdTokensContextWithWindow, CmdTokensContextUnknownWindow - CmdTokensCacheBoth, CmdTokensCacheHitOnly, CmdTokensCacheMissOnly - CmdTokensNotReported - CmdCostReport - CmdCacheNoData, CmdCacheHeader, CmdCacheTotals, CmdCacheFootnote, CmdCacheAdvice Each template uses {placeholder} substitution via String::replace rather than format!, since format! requires a literal — the locale-resolved &'static str isn't one. The placeholder convention ({active}, {hit}, {miss}, …) means a translator can re-order or restructure a sentence freely without changing the call site. Helpers `token_count`, `active_context_summary`, `cache_summary`, and `format_cache_history` now take `Locale` so each can resolve their templates from the same source of truth. The English templates byte-match the previous hardcoded format strings so the existing 16 debug-command tests pass unchanged. Column headers in the cache table (`turn in out hit miss …`) are intentionally NOT localized — the body rows are formatted with fixed column widths and translating the header words would break alignment. Numbers, ratios, and the model id stay in English form. 
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(i18n): localize footer state + help section labels (Phase 1c, #285) (#296) Adds 11 new MessageIds covering visible footer chrome and the help-overlay section headings, with translations for all four shipped locales: Footer: - FooterWorking — animated `working` / `working.` / … pulse - FooterAgentSingular / FooterAgentsPlural — the sub-agent count chip - FooterPressCtrlCAgain — the quit-confirmation toast Help overlay sections (`?` / `F1` / `Ctrl+/`): - HelpSectionNavigation, HelpSectionEditing, HelpSectionActions, HelpSectionModes, HelpSectionSessions, HelpSectionClipboard, HelpSectionHelp `KeybindingSection::label` now takes Locale and returns tr(locale, …). `footer_working_label` and `footer_agents_chip` likewise take Locale; the two production callsites in tui/ui.rs pass `app.ui_locale`. The mode chip itself (agent / yolo / plan) intentionally stays English — those are brand/acronym labels, and translating them would mean explaining to maintainers what `代理` means in a bug report. The keybinding catalog DESCRIPTIONS (41 entries) are not translated in this PR — those are technical prose that would dwarf the rest of i18n work and can ship in v0.8.5. Section labels are translated so the help overlay groups read as expected in any locale. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(commands): smoke-test that every / command dispatches to a handler (#299) Adds two parallel-safe smoke tests in `crates/tui/src/commands/mod.rs` that iterate the COMMANDS registry and verify every command — and every declared alias — dispatches to a real handler. A dispatch miss surfaces as the fall-through `Unknown command:` error message in `execute`, which used to be invisible until a user typed the command and saw the "did you mean" suggestion fire on a registered command. 
The tests build a workspace-isolated app via `tempfile::TempDir` so side-effecting handlers (`/init` writing AGENTS.md, `/save` and `/export` writing files) do not pollute `crates/tui/` when CI runs from there. `/save` and `/export` get an explicit tempdir-relative path because their no-arg defaults still resolve relative to `cwd`. `/restore` is skipped — it shells out to git for the snapshot repo and its own dedicated tests in `commands/restore.rs` already serialize on the global env mutex via `scoped_home`. The existing coverage there is sufficient. Closes a gap surfaced when verifying that the v0.8.4 i18n refactor (#294, #295, #296) did not silently break any slash-command dispatch. All 44 commands and their aliases pass (16 aliases on top of the 44 names; `/restore` is the only skip). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(release): bump version to 0.8.4 (#297) CHANGELOG entry covers the v0.8.4 work landed since 0.8.3: - Localization Phase 1 (#285) — slash command help (#294), debug command output (#295), footer state and help-overlay section labels (#296). Adds 68 new MessageIds across all four shipped locales (en/ja/zh-Hans/pt-BR). - Cache-prefix stability (#263) — five companion fixes (#287, #288→#292, #289, #290, #291) that keep the DeepSeek prefix cache stable across turns. - Plus the items already in [Unreleased]: agent-mode network exec (#272), /skill GitHub URL parsing (#269), and the V4 Pro discount expiry extension (#267). Bumps: - Cargo.toml workspace version 0.8.3 → 0.8.4 - npm/deepseek-tui/package.json version + deepseekBinaryVersion 0.8.3 → 0.8.4 - Cargo.lock regenerated from the new workspace version. Phase 1d (doctor output), Phase 2 (onboarding/init/missing-companion), and Phase 3 (tool errors / sandbox denials / approvals) deferred to v0.8.5. The shipped Phase 1 surfaces (slash commands, debug telemetry, footer chrome) cover the highest-traffic UI paths Chinese users see first. 
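The dispatch smoke test from #299 (iterate the registry, run every name and alias, and treat the `Unknown command:` fall-through as a miss) might look roughly like the sketch below. The registry contents, the `CommandInfo` fields, and the `execute` signature are assumptions made for illustration; the real ones live in `crates/tui/src/commands/mod.rs` and are not shown in this log. The tempdir isolation and the `/restore` skip are left out.

```rust
// Hypothetical sketch: a three-entry registry and a toy dispatcher stand in
// for the project's real COMMANDS table and `execute` function.

pub struct CommandInfo {
    pub name: &'static str,
    pub aliases: &'static [&'static str],
}

pub const COMMANDS: &[CommandInfo] = &[
    CommandInfo { name: "tokens", aliases: &[] },
    CommandInfo { name: "help", aliases: &["?"] },
    CommandInfo { name: "exit", aliases: &["quit", "q"] },
];

/// Toy dispatcher: anything without a match arm falls through to the
/// "Unknown command:" error, mirroring the behavior the commit describes.
pub fn execute(input: &str) -> Result<String, String> {
    match input.trim_start_matches('/') {
        "tokens" => Ok("token report".to_string()),
        "help" | "?" => Ok("help text".to_string()),
        "exit" | "quit" | "q" => Ok("bye".to_string()),
        other => Err(format!("Unknown command: {other}")),
    }
}

/// Return every registered name or alias that fails to reach a handler.
pub fn undispatched() -> Vec<String> {
    let mut missing = Vec::new();
    for cmd in COMMANDS {
        // Check the primary name first, then each declared alias.
        for name in std::iter::once(&cmd.name).chain(cmd.aliases.iter()) {
            let result = execute(&format!("/{name}"));
            if matches!(&result, Err(msg) if msg.starts_with("Unknown command:")) {
                missing.push((*name).to_string());
            }
        }
    }
    missing
}
```

A smoke test then reduces to asserting that `undispatched()` is empty, so adding a registry entry without a matching dispatch arm fails in CI rather than in front of a user.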
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(release): bump internal path-dep versions + repair doc link (#301) CI on PR #300 (release feat/v0.8.4 → main) flagged two regressions introduced by the 0.8.4 version bump: 1. Version drift — path-dependency `version = "0.8.3"` references inside the workspace crates (10 crates: agent, app-server, cli, config, core, execpolicy, hooks, mcp, tools, tui) did not move with the workspace `[workspace.package] version = "0.8.4"`. The CI guard `scripts/release/check-versions.sh` requires they match. 2. Broken intra-doc-link `[crate::localization::english]` in the CommandInfo doc comment — `english` is private. Replaced with a reference to the public `description_for` accessor and the public `tr()` function. Verified with: - scripts/release/check-versions.sh — Version state OK. - RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — green. - cargo fmt + clippy + test all green. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 23:02:38 -05:00
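The version-drift rule from #301 (every internal path-dependency pin must equal the workspace version) can be expressed as a simple filter. The sketch below is not `scripts/release/check-versions.sh`, whose contents are not shown in this log; the `PathDepPin` type, the `drifted` function, and the crate pairings in the usage note are hypothetical.

```rust
// Hypothetical sketch of the drift rule only; manifest parsing is omitted and
// the project's real guard is a shell script.

/// One internal path-dependency pin found in a crate manifest.
#[derive(Debug, PartialEq, Eq)]
pub struct PathDepPin {
    /// Crate whose manifest declares the dependency.
    pub owner: &'static str,
    /// Internal crate being depended on.
    pub dependency: &'static str,
    /// Value of the `version = "..."` pin next to the path.
    pub version: &'static str,
}

/// Return the pins whose version differs from the workspace version.
pub fn drifted<'a>(workspace_version: &str, pins: &'a [PathDepPin]) -> Vec<&'a PathDepPin> {
    pins.iter()
        .filter(|pin| pin.version != workspace_version)
        .collect()
}
```

For example, with a workspace version of `0.8.4` and a made-up pin `PathDepPin { owner: "cli", dependency: "config", version: "0.8.3" }`, `drifted` returns that pin, which is the situation the fix commit describes for the crates left at `0.8.3`.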
Hunter Bown	e620e75f99	chore: release v0.8.3 Bumps workspace, all internal path-deps, and npm wrapper (version + deepseekBinaryVersion) from 0.8.2 → 0.8.3. Lockfile re-locked offline. CHANGELOG entry summarizing the 0.8.3 lane: skills path bug fix, privacy contraction, helpful missing-companion error (#258), engine decomposition (#227), bridge/persistence/palette test gap closures, crates.io badge, and 10 issue closures. Local v0.8.3 verified at /tmp/deepseek-0.8.3-test/ before publish. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 09:46:21 -05:00
Hunter Bown	bf6d82e4ba	chore: release v0.8.2 — Windows build fix, npm offline, model-visible skills, zh-CN README Bumps workspace, all internal path-deps, and npm wrapper (version + deepseekBinaryVersion) from 0.8.1 → 0.8.2. Lockfile re-locked offline to keep the registry index untouched. Triggers auto-tag.yml on push, which creates v0.8.2 and fires release.yml to build cross-platform binaries and draft the GitHub Release. npm publish remains manual per CLAUDE.md release runbook. Note: npm registry already has 0.8.2 published (with binaryVersion 0.8.1 from an earlier checkpoint). That release keeps working unchanged because v0.8.1 binaries stay on GitHub. Repo state aligns to 0.8.2 so the version-drift gate passes; next npm publish (which will need to be 0.8.3 since 0.8.2 is taken) will pick up binaryVersion=0.8.2 and pull the new binaries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 01:41:47 -05:00
Hunter Bown	5770a5747b	fix cargo install packaging for v0.8.1	2026-04-30 23:45:21 -05:00
Hunter Bown	3f24759966	release: stabilize shell handles for v0.8.0 Bumps the workspace/npm wrapper to 0.8.0 and fixes completed background shell jobs retaining live process handles, which could cause Too many open files, checkpoint save failures, shell spawn failures, and lag around send/close/Esc. Also includes Windows REPL bootstrap timeout hardening and Cargo/TUNA mirror install docs.	2026-04-30 21:34:00 -05:00
Hunter Bown	3e8da4b99b	chore: bump version to 0.7.9 Includes: - Post-turn freeze fix (reorder maybe_advance_cycle before TurnComplete) - Enter/steering fix (QueueFollowUp when model is streaming) - Esc fanout hardening (idempotent finalize methods) - cargo fmt pass on new code - CHANGELOG, README, and version bump across workspace + npm	2026-04-30 20:53:10 -05:00
Hunter Bown	d25783fe5b	fix(v0.7.8): reconcile swarm state and unicode search	2026-04-30 19:50:01 -05:00
Hunter Bown	820985671d	chore: bump version to 0.7.8 - Cargo.toml workspace version: 0.7.7 → 0.7.8 - npm/deepseek-tui/package.json: 0.7.7 → 0.7.8 - deepseekBinaryVersion: 0.7.7 → 0.7.8 - CHANGELOG.md: add v0.7.8 section	2026-04-30 18:13:35 -05:00
Hunter Bown	4a1768001b	docs: add v0.7.7 CHANGELOG entry	2026-04-30 10:43:40 -05:00
Hunter Bown	8ba8600155	release: v0.7.6 - Bump workspace version to 0.7.6 (Cargo.toml + all crate internal dep pins) - Bump npm wrapper version and deepseekBinaryVersion to 0.7.6 - Add v0.7.6 changelog entry: localization, paste burst, history search, pending input preview, grouped /config editor, searchable help overlay, Alt+↑ edit-last-queued, composer attachment management - Update README with v0.7.6 features (localization, paste, history search) - Archive v0.7.5 implementation plan to docs/archive/ - Update Cargo.lock	2026-04-29 17:00:36 -05:00
Hunter Bown	64d1698bde	Release 0.7.1 (#156)	2026-04-28 18:38:44 -05:00

101 Commits