codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
Hunter Bown	7630259748	chore(release): bump to 0.8.29 Workspace + per-crate path-dep version pins, npm wrapper, and deepseekBinaryVersion all advance 0.8.28 -> 0.8.29. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:40:20 -05:00
Hunter Bown	e4255539fc	fix(tui): close the scroll-demon class structurally (#1085 regression) Issue #1085 ("TUI viewport drifts down inside alt-screen at end of turn, leaving top rows blank, esp. after sub-agents") was closed in v0.8.18 by adding `reset_terminal_viewport()` to home the cursor on TurnComplete / focus / resize. v0.8.27's flicker fix (`abf3fa66f`) dropped the `\x1b[2J\x1b[3J` deep-clear from that path to stop the double-clear flicker on Ghostty / VSCode / Win10 conhost. That left ratatui's incremental-diff renderer relying on its internal model matching reality — which only holds while nothing else writes to the terminal. Two latent `eprintln!` sites had been quietly emitting raw bytes into the alt-screen for the entire v0.8.x cycle: * `tools/subagent/mod.rs::persist_state_best_effort` (fires whenever the per-step sub-agent state save hits an error; under parallel sub-agents this can fire dozens of times per turn) * `tools/subagent/mod.rs::new_shared_subagent_manager` (fires once on init if the prior state file fails to load) Plus a third found during this fix: * `network_policy.rs::record` (fires every time a network-policy audit write fails) Each eprintln advanced the alt-screen cursor by one row and scrolled the buffer up by one row, but ratatui's renderer didn't know — it kept writing to absolute row positions, which now meant "one row higher than visible." After ~30 leaks the TUI content appeared to drift downward, with a blank band growing above the header. v0.8.18's periodic full-clear had been masking it; v0.8.27's flicker fix unmasked it. Three layers of defence so this class of bug "isn't an option anymore": 1. `crates/tui/src/runtime_log.rs` — file-backed tracing subscriber + Unix fd-level stderr redirect. A daily-rolling log file at `~/.deepseek/logs/tui-YYYY-MM-DD.log` is created at TUI startup (right after `EnterAlternateScreen`). A `tracing-subscriber` registry routes `tracing::warn!` / `tracing::error!` calls to it. On Unix, the process's stderr fd is `dup2`'d to the same file for the lifetime of the `TuiLogGuard`. Any future raw `eprintln!` — ours, a panic message, a third-party crate's verbose output — lands in the log file instead of the alt-screen. The guard restores the original stderr fd on drop so shutdown messages still reach the user's terminal. 2. `tracing::warn!` replacements for the three known leak sites (`subagent/mod.rs` ×2, `network_policy.rs` ×1). With (1) in place these messages now go to the log file with structured fields (`?err`, `host`, `tool`) instead of opaque text rows in the alt-screen. 3. Module-level `#![deny(clippy::print_stdout, clippy::print_stderr)]` on `tools/`, `core/`, `tui/`, `runtime_threads.rs`, and `network_policy.rs`. Any future `eprintln!` / `println!` added to a TUI runtime path fails the lint at compile time. Legitimate CLI-print paths (`main.rs` eval / init / doctor, `runtime_api.rs` server banners, `logging.rs` verbose helpers, `skills/mod.rs` listing utilities, `execpolicy/execpolicycheck.rs` JSON output, `ui::run_event_loop` post-`LeaveAlternateScreen` resume hint, two `#[test] #[ignore]` perf benches in `tui/transcript.rs` / `tui/widgets/mod.rs` / `core/capacity.rs`) keep their existing prints — they all run outside the alt-screen lifetime. The dup2 redirect is Unix-only because there's no equivalent stable Rust API for fd-redirecting `STDERR_FILENO` on Windows; on Windows the tracing-subscriber layer + the clippy denies still apply, and ratatui's own use of crossterm avoids the worst leakage classes. Cross-platform stderr redirect via `SetStdHandle` is a follow-up. The new `runtime_log` module ships with one test (`log_directory_prefers_home`) that pins the `HOME` / `USERPROFILE` / `dirs::home_dir()` resolution order — uses the process-wide `test_support::lock_test_env()` lock for env-mutation safety. Two `#[test] #[ignore]` benches in `tui/transcript.rs` (rail-prefix memory) and `tui/widgets/mod.rs` (transcript scroll bench) and one in `core/capacity.rs` (`bench_compute_profile`) keep their stdout prints via `#[allow(clippy::print_stdout)]` on the individual test. New dependencies: `tracing-subscriber 0.3` (env-filter + fmt features) and `tracing-appender 0.2` at the workspace root, both pulled into `crates/tui` only. Closes the v0.8.28 regression Hunter reported in screenshots: parallel sub-agents running `exec_shell` triggered the scroll demon with the TUI content squeezed into the bottom third of the terminal and ~30 rows of blank above the header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:20:25 -05:00
Hunter Bown	61f40420aa	chore(release): bump to 0.8.28 Workspace + per-crate path-dep version pins, npm wrapper, and `deepseekBinaryVersion` all advance from 0.8.27 → 0.8.28. Lockfile refreshed via `cargo update --workspace --offline`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:21:58 -05:00
Hunter Bown	6c25a18b42	chore(release): bump to v0.8.27, add CHANGELOG	2026-05-10 08:41:04 -05:00
Hunter Bown	b92d3569fd	chore(release): prepare v0.8.26 — security hotfix Two responsibly-disclosed security fixes: - GHSA-88gh-2526-gfrr (@JafarAkhondali) - GHSA-72w5-pf8h-xfp4 (@47Cid) Plus version bump, CHANGELOG, regression tests for both. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 22:30:16 -05:00
Hunter Bown	bf582cb0f9	chore(release): prepare v0.8.25 - Bump workspace to 0.8.25 (per-crate path-dep pins, Cargo.lock, npm/deepseek-tui/package.json all in sync; check-versions.sh green). - Add v0.8.25 CHANGELOG entry covering markdown table wrap, reqwest-based self-update with SHA-256 verification, MCP framing centralization, terminal-mode recovery unification, recall_archive parent-registry exposure, ContextConfig.per_model removal, plus community PRs (#1300 Reid Liu, #1297 Duducoco, #1265 jinpengxuan, #1290 Reid Liu, #1246 heloanc, #1282 Wenjunyun123, #1274 Liu-Vince). - Refresh README.md and README.zh-CN.md "What's New" sections. - Prune stale docs: V0_7_5_IMPLEMENTATION_PLAN redirect stub, v0.8.8-coordinator-prompt, archived v0.8.10 handoff. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-09 20:50:06 -05:00
Hunter Bown	97d79862c2	chore(release): bump to 0.8.24, update CHANGELOG Version bump: 0.8.23 → 0.8.24 in workspace Cargo.toml. CHANGELOG entries for: - Workspace-local slash commands (#1259) - @-completion for gitignored dot-dirs - MCP paginated discovery via nextCursor (#1250, #1256, credit Liu-Vince) - Snapshot disk-space cap at 500 MB (#1112, credit Giggitycountless) - /clear resets Todos sidebar (#1258) - Language directive strengthened against project-context bias - Known issue: Windows flicker (#1260, #1251) — viewport-reset escape sequence on Windows conhost, investigation in progress	2026-05-09 00:12:33 -05:00
Hunter Bown	8e9957da5c	chore(release): prepare v0.8.23 - Bump workspace version 0.8.22 → 0.8.23 across Cargo.toml, every per-crate path-dependency pin, npm/deepseek-tui/package.json (both `version` and `deepseekBinaryVersion`), and Cargo.lock. - Add a 0.8.23 CHANGELOG entry covering the security hardening stack (sanitized child env, plan-mode tool surface, sub-agent approvals, symlink walks, runtime API auth, shell safety classification, MCP config path traversal), the macOS Keychain prompt fix, the #1244 MCP spawn error visibility + env passthrough work, the compact-thinking UX change, and a Known issues callout for mid-run MCP stderr. - Backfill missing CHANGELOG entries for v0.8.21 (community-heavy release, contributors credited) and v0.8.22 (fetch_url redirect validation). The gap was unintentional, so contributor work is being reflected in-repo now. - Add docs/RELEASE_CHECKLIST.md so future releases gate on the CHANGELOG/version/preflight steps explicitly.	2026-05-08 18:17:44 -05:00
Hunter Bown	8b60275981	chore(release): prepare v0.8.22 Validate redirected fetch targets before following them and prepare v0.8.22.	2026-05-08 13:34:26 -05:00
Hunter Bown	1fc892e604	chore(release): prepare v0.8.21 (#1229 )	2026-05-08 11:20:03 -05:00
Hunter Bown	f283e56bd1	fix(prompts): prioritize user language for reasoning (#1137 )	2026-05-07 23:53:15 -05:00
Hunter Bown	b31b93aaae	v0.8.19: endpoint, release workflow, IME + viewport fixes (#1128 ) * fix(config): keep DeepSeek beta endpoint for legacy cn alias * fix(ci): filter download-artifact to deepseek* pattern Prevents the release aggregation job from picking up non-binary artifacts (e.g. Docker .dockerbuild cache layers) that cause the checksum manifest to include spurious entries and the Release to carry files it shouldn't. * fix(tui): enable focus events to restore IME after app-switch On macOS, switching away (Cmd+Tab) and back suspends the IME compositor. Without focus-event handling, the TUI never signals readiness to the terminal, so CJK input methods (Pinyin, Zhuyin, etc.) stop working. - EnableFocusChange on startup so the terminal reports FocusGained/FocusLost - Re-push KeyboardEnhancementFlags on FocusGained (some terminals reset the enhanced keyboard mode on focus-loss) - DisableFocusChange on shutdown for clean terminal handoff * chore: cargo fmt * docs: add DataWhale and DeepSeek to acknowledgments * docs: fix DeepSeek name etymology in acknowledgments * fix(tui): recapture viewport on focus restore * docs: thank DeepSeek and DataWhale bilingually	2026-05-07 23:05:39 -05:00
Hunter Bown	aa9e32bf0e	chore(release): prepare v0.8.18	2026-05-07 15:36:30 -05:00
Hunter Bown	ee0ce460ee	chore(release): v0.8.17 A community-driven reliability release. Plan-mode safety, paste-Enter auto-submit, slash-menu skills coverage, the deepseek-cn endpoint preset, and a handful of platform / streaming / gateway-compat fixes, plus a small PTY-driven QA harness. See CHANGELOG.md for the full annotated change list with credits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-07 12:52:53 -05:00
Hunter Bown	4f77c625fd	fix(tui): forward-port v0.8.16 hotfix to main Forward-port the v0.8.16 RLM/sub-agent hotfix onto main after tagging the release branch.	2026-05-07 00:04:31 -05:00
Hunter Bown	1043374e6a	chore(release): prepare v0.8.15	2026-05-06 11:06:00 -05:00
Hunter Bown	ab59ef8ff2	fix(cost): count V4 reasoning tokens in usage output (#762 )	2026-05-05 19:57:25 -05:00
Hunter Bown	c4cbd7c19f	chore(release): finalize v0.8.13 stabilization	2026-05-05 13:06:09 -05:00
Hunter Bown	668f2c37f5	fix(build): stable-Rust compatibility + silence deferred-code warnings The match guard at tui/ui.rs:1603 used `&& let Some(...) = ...` inside an `if` guard, which requires the `if_let_guard` nightly feature on Rust < 1.94. Reported by an external user attempting `cargo install deepseek-tui` on stable rustc — it failed with E0658. Rewrite as a plain match guard with a nested `if let` inside the arm body so the language-picker hotkeys compile on every supported rustc. Workspace also now declares `rust-version = "1.88"` to match the codebase's actual reliance on `let_chains` in if/while conditions, so users on too-old toolchains see a clear cargo error instead of a confusing rustc one. `AGENTS.md` and `CLAUDE.md` gain a "stable Rust only" section documenting the trap and how to rewrite around it. Also annotate the deferred `TuiPrefs` (#657) and `handoff::THRESHOLDS` (#667) APIs with `#[allow(dead_code)]` so CI's `-D warnings` flag stays green while the call sites are staged for v0.8.13. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 01:57:14 -05:00
Hunter Bown	8785e6865e	chore(release): bump version to 0.8.12	2026-05-05 00:18:14 -05:00
Hunter Bown	9366a7c5f3	chore(release): bump version to 0.8.11 Final step in the v0.8.11 patch release. Bumps the workspace `Cargo.toml`, all 9 internal path-dep version pins, and `npm/deepseek-tui/package.json` to 0.8.11. `Cargo.lock` regenerated alongside. The v0.8.11 CHANGELOG entry already landed on `main` via the cache-maxing overhaul PR (#684). This commit only stamps the version. Together they ship: * Cache-maxing for V4 1M context — engine no longer rebuilds the system prompt on every turn (#684's `Session::last_system_prompt_hash`), the volatile working-set summary moved out of the system prompt into per-turn `<turn_meta>` on the latest user message, the tool array is anchored with `cache_control: ephemeral`, and the `messages_with_turn_metadata` injection skips tool-result messages so the assistant→tool_result invariant stays intact. * 500K compaction floor — automatic compaction refuses below 500K tokens via `MINIMUM_AUTO_COMPACTION_TOKENS`. Manual `/compact` bypasses (explicit user agency). * Token-only compaction trigger — dropped `CompactionConfig::message_threshold` and the message-count branch in `should_compact`; that 128K-era heuristic only fired on long sessions of small messages, exactly the case where rewriting the V4 prefix cache is most wasteful. * Legacy 128K naming — `DEFAULT_CONTEXT_WINDOW_TOKENS` → `LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS`. * `npm install` resilience — `install.js` now retries with exponential backoff, enforces per-attempt timeout + 30 s stall detector, honors `HTTPS_PROXY` / `HTTP_PROXY` / `NO_PROXY` (pure Node, no new dependencies), and prints download progress to stderr. Driven by a community report that `npm install` took 18 minutes through a CN npm mirror; the GitHub Releases binary fetch was the bottleneck and CN mirrors don't proxy GitHub. Verified locally: * cargo fmt --all -- --check ✓ * cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓ * cargo test --workspace --all-features --locked ✓ * parity gates (snapshot, parity_protocol, parity_state) ✓ * bash scripts/release/check-versions.sh ✓ (workspace=0.8.11, npm=0.8.11, lockfile in sync) * node scripts/release/npm-wrapper-smoke.js ✓ Reminder for the maintainer at release time: the npm publish is manual and requires 2FA OTP on every publish. After this PR merges and the GitHub Release is fully drafted by `auto-tag.yml`, publish from a developer machine: cd npm/deepseek-tui npm publish --access public The `prepublishOnly` hook checks all eight binaries plus the SHA256 manifest are present on the GitHub Release before letting `npm publish` proceed, so this must happen after the GitHub Release is finalized. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:01:16 -05:00
Hunter Bown	a92c449de5	chore(release): bump version to 0.8.10 + CHANGELOG Picks up the v0.8.10 patch release contents: * Daemon API quartet for whalescale-desktop integration (#561-#564, PR #567). * Bug cluster: macOS seatbelt cargo registry (#558), MCP SIGTERM shutdown (#420), Linux PR_SET_PDEATHSIG (#421). * npm install on older glibc fix (#555/#560 via #556 + #565). * Shell cwd workspace-boundary validation (#524). * Memory help/docs polish (#497 via #569). * Onboarding language picker (#566). * Whale nicknames interleaved with Simplified Chinese. First-time contributors credited in CHANGELOG: @staryxchen, @shentoumengxin, @Vishnu1837, @20bytes. Workspace `Cargo.toml`, all 9 internal path-dep version pins, and `npm/deepseek-tui/package.json` all bumped to 0.8.10. `Cargo.lock` regenerated and committed alongside. Verified locally: * cargo fmt --all -- --check * cargo clippy --workspace --all-targets --all-features --locked -- -D warnings * cargo test --workspace --all-features --locked * bash scripts/release/check-versions.sh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 10:13:26 -05:00
Hunter Bown	4511ea763f	chore(release): bump version to 0.8.9 + cargo fmt	2026-05-04 00:56:51 -05:00
Hunter Bown	84c55e9022	chore(release): bump version to 0.8.8 - Workspace `version = "0.8.8"` in root `Cargo.toml`. - 31 internal `deepseek-*` path-dep version pins across the 9 crates that declare them. - `npm/deepseek-tui/package.json` `version` and `deepseekBinaryVersion` both updated. - `Cargo.lock` regenerated for the new workspace version. - `CHANGELOG.md` `[Unreleased]` heading promoted to `[0.8.8] - 2026-05-03`. `scripts/release/check-versions.sh` reports the workspace, npm wrapper, and lockfile all aligned. Pushing this to `main` should fire `auto-tag.yml`, which creates the `v0.8.8` tag with `RELEASE_TAG_PAT`. The tag triggers `release.yml` to build the matrix and draft the GitHub Release. The npm wrapper publish remains manual (npm 2FA OTP requirement). What ships in v0.8.8 ==================== The full polish stack already merged via PRs #514 (stabilization), #515 (OSC 8 hyperlinks), #517 (inline diff render), #518 (user memory MVP), #519 (foreground polish + per-project overlay + security + Windows redraw fix), and #508 (Linux ARM64 prebuilts + install docs). See `CHANGELOG.md` and the README "What's new in v0.8.8" section for the full list.	2026-05-03 08:55:41 -05:00
Hunter Bown	15da881e1d	chore(release): bump version to 0.8.7 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 21:56:37 -05:00
Hunter Bown	5bfc1feb62	v0.8.6: survivability, UX polish, and release hardening Merge the v0.8.6 feature batch and release hardening.\n\nIncludes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes.\n\nRemote PR checks passed on the final head before merge.	2026-05-02 20:11:33 -05:00
Hunter Bown	216f6be349	chore(release): bump version to 0.8.5 Workspace + npm wrapper + every internal crate path-dep pin moved from 0.8.4 → 0.8.5. scripts/release/check-versions.sh confirms parity across the three sources. cargo build / clippy / test all clean. Pushing this commit to main is the trigger for auto-tag.yml to create the v0.8.5 tag, which fires release.yml to build the cross-platform matrix and draft the GitHub Release. The npm publish remains a manual follow-up (2FA on every publish, no automation token provisioned). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-02 14:48:18 -05:00
Hunter Bown	3d3ff0c5cf	Release v0.8.4: Phase 1 i18n + cache-prefix stability * fix(pricing): extend V4 Pro 75% discount expiry to 2026-05-31 15:59 UTC DeepSeek extended the promotional discount past the original 2026-05-05 cutoff. Without this update the TUI would have started showing 4× the actual billed cost on May 6. Source: https://api-docs.deepseek.com/quick_start/pricing — "extended until 2026/05/31 15:59 UTC". Adds a regression test pinning the new active window so a future revert to the May 5 date trips the suite immediately. Closes #267 * chore: remove stale TODO(integrate) markers from already-integrated modules Five `// TODO(integrate)` comments and one matching "Not yet integrated" note were misleading anyone grepping for integration work. Each module is in fact wired up: - execpolicy/mod.rs → tools/shell.rs:1322 (load_default_policy) - sandbox/mod.rs → tools/shell.rs:28, main.rs:2647, tui/approval.rs:30 - sandbox/policy.rs → main.rs:2752, tui/approval.rs:30 (SandboxPolicy) - command_safety.rs → tools/shell.rs:1321, tools/tasks.rs:13, tools/approval_cache.rs:26 - tui/streaming/mod.rs → tui/app.rs:38 (StreamingState) The remaining TODO at mcp.rs:1771 covers a separate "wire legacy sync API into CLI subcommands or remove" decision and is left in place. Closes #266 * docs(release): add install + dual-binary template to GitHub Release page Closes #265. The Release page used the auto-generated commit-title body. New users hitting the Release page from Twitter / npm-search had no on-page guidance that the dispatcher (`deepseek`) and the TUI runtime (`deepseek-tui`) ship as two binaries that must coexist; #258 was an external user spending 11 minutes figuring this out and #272 was the follow-on confusion. The new body covers: - npm wrapper as the recommended install - `cargo install deepseek-tui-cli deepseek-tui --locked` (both crates) - Manual download with a per-platform table showing both artifacts - sha256 verify using the existing `deepseek-artifacts-sha256.txt` - Changelog link * feat(debug): add /cache command surfacing per-turn DeepSeek cache hit/miss Step 1 of #263. Without per-turn telemetry the prefix-cache audit is unfounded speculation; the rest of the issue's investigation steps depend on this surface. The DeepSeek API already returns `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` per turn, and we already store the latest on App. This adds a 50-turn ring (`turn_cache_history`) populated at the same site as `last_prompt_cache__tokens`, plus a `/cache [count]` slash command that renders a fixed-width table of the last N turns with per-turn ratios and a session aggregate. Default count is 10; larger values clamp to the ring size. Edge cases the formatter handles: - No telemetry yet → friendly "no turns recorded" message - `cache_hit_tokens = None` (provider didn't report) → row renders all em-dashes and is excluded from session aggregates so one missing- telemetry turn can't make the average ratio look broken. - `cache_hit_tokens = Some, cache_miss_tokens = None` → infer miss as `input − hit` and mark the cell with ``. Footer documents the asterisk. - Ring at cap (50) → push evicts oldest. Tests cover all four paths plus the cap. * test(prompts): add cache-prefix stability harness for #263 step 2 The DeepSeek prefix-cache only hits while the byte prefix of each request matches the prior call. Anything in the cached prefix that varies turn-to-turn for unchanged inputs is a cache buster. Adds a focused harness next to the production surface so the property is regression-guarded: 1. `first_divergence(a, b)` helper that returns the first divergent byte position with a `±32 byte` window of context, used by the custom assertion `assert_byte_identical`. Future suspect tests can reuse this to surface "where" rather than just "fail". 2. `compose_prompt_is_byte_stable_across_calls` — sweeps every (mode, personality) pair and pins that two consecutive calls produce identical bytes. Rules out suspect #4 (mode-prompt churn). 3. `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace` — the call site `engine.rs::build_tool_context` actually invokes, pinned for an empty workspace across all three modes. 4. `system_prompt_with_working_set_summary_is_byte_stable_for_constant_summary` — pins that the surrounding prompt construction faithfully embeds the working_set summary it's given without injecting extra non-determinism. (The actual working_set summary stability lives in `working_set.rs` and is the next investigation target — see issue note in PR description.) Foundation for the suspect-by-suspect bisection in the rest of #263. * fix(secrets): never overwrite the secrets file when load_unlocked errors `FileKeyringStore::set` and `delete` did `self.load_unlocked().unwrap_or_default()`, which wiped every existing secret if the read failed for any reason other than \"file is missing\": - file mode != 0600 (`InsecurePermissions`) — easy on headless / CI environments where a permissive umask got applied - corrupt JSON - transient I/O error In all of those, the next `store_unlocked` overwrote the file with an empty-or-single-entry blob and reset perms to 0600, silently losing every other provider's key. Switch both call sites to `?`. `load_unlocked` already returns `Ok(default)` for a missing file, so the first-write-creates-the-file ergonomic is preserved (covered by the new `file_store_set_still_creates_file_when_missing` test). Adds four regression tests: - set: insecure perms surface InsecurePermissions and leave the file byte-identical. - delete: same. - set: corrupt JSON surfaces the parse error and leaves the file byte-identical. - set: missing file path still works (idempotence guard). Closes #281 * fix(cache): make tool catalog byte-stable across calls and sessions DeepSeek's KV prefix cache hits on the longest matching byte prefix of the request. Two places in the tool-array path were silently introducing divergence: 1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly. Rust's default `HashMap` is seeded with `RandomState` per process, so every `deepseek` launch produced a different tool order — the cross- session resume case (the one with the biggest cache wins) never hit. 2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the active set in catalog order. When ToolSearch activated a previously- deferred tool mid-conversation, the new tool appeared at its catalog index, shifting every later tool's byte offset and busting the cached prefix from there onwards. Fixes: - `to_api_tools()` now sorts by tool name before emitting the API tool array. Stable across calls AND across launches. - `build_model_tool_catalog()` sorts each partition (built-ins first, contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's `assembleToolPool` strategy where they explicitly call out cache stability as the reason: "a flat sort would interleave MCP tools into built-ins and invalidate all downstream cache keys whenever an MCP tool sorts between existing built-ins." - `active_tool_list_from_catalog()` puts always-loaded tools in catalog order at the head and deferred-but-now-active tools at the tail. A deferred-tool activation during ToolSearch no longer shifts earlier tools' positions. Adds three regression tests: - `to_api_tools_emits_alphabetical_order_regardless_of_registration_order` - `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability` - `active_tool_list_pushes_deferred_activations_to_the_tail` Refs #263. Findings produced by reading reference Claude Code source side-by-side with our request-building flow; full delta analysis in the PR description. * fix(sandbox): elevate Agent-mode shell sandbox to allow network access The seatbelt-default policy is `WorkspaceWrite { network_access: false }`, which on macOS emits `(deny default)` with no `(allow network-outbound)` / `(allow system-socket)`. Every outbound socket call from a sandboxed shell command — including `getaddrinfo` for DNS — gets denied by the kernel. Symptom: "DNS resolution failed" for any URL the model tries to reach via curl, yt-dlp, package managers, etc. Engine.build_tool_context only elevated the policy in Yolo mode, leaving Agent mode (the default) stuck on the strict default. That's tighter than competitors (Claude Code, Codex) without buying any safety the application-level NetworkPolicy or the approval flow doesn't already provide. Switch the elevation to a `match` so: - Plan → no elevation (read-only investigation; shell tool not registered) - Agent → WorkspaceWrite { network_access: true, … } - Yolo → WorkspaceWrite { network_access: true, … } (unchanged) Adds `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network` so a future revert to the no-network default trips CI immediately. Closes #273 * fix(skills): treat bare github.com/<owner>/<repo> URLs as GitHubRepo Closes #269. `/skill install https://github.com/obra/superpowers` failed on every platform with `invalid gzip header`. Root cause: `InstallSource::parse` matched any `https://`-prefixed spec as `DirectUrl`, so the installer downloaded the HTML repo page (200 OK, `text/html`) and tried to gzip-decode HTML. The user reported it from Win11 + PowerShell but the parse path is platform-independent. Recognize bare GitHub repo URLs in `InstallSource::parse`: - `https://github.com/<owner>/<repo>` - `https://github.com/<owner>/<repo>/` - `https://github.com/<owner>/<repo>.git` - `https://github.com/<owner>/<repo>.git/` - `https://www.github.com/<owner>/<repo>` - `http://github.com/<owner>/<repo>` (legacy) …all route to the existing `GitHubRepo` source, which already produces `https://github.com/<repo>/archive/refs/heads/{main,master}.tar.gz` candidates with proper fallback. URLs with a third path segment (`/archive/...`, `/blob/...`, `/tree/...`) keep going through `DirectUrl` because the user picked that exact path. Adds two regression tests: one asserting the seven recognised forms all canonicalize to `github:obra/superpowers`, and one pinning the sub-resource paths to `DirectUrl`. * fix(cache): drop volatile fields from working_set summary block (#280) (#287) The working-set summary lands inside the system prompt before the historical conversation, so any byte that drifts there cache-misses everything that follows in DeepSeek's KV prefix cache. Two sources of turn-over-turn drift are removed: 1. The rendered line is now `- {path} ({kind})`. The previous form interpolated `entry.touches` and `self.turn - entry.last_turn`, both of which advance on every user message even when no new paths are observed. 2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC) instead of the turn-aware `sorted_entries`. The recency bonus in `score_entry` crosses bucket boundaries as turns advance, so even without rendering `last seen` the order — and which entries cross the `max_prompt_entries` cutoff — drifted. Compaction pinning still uses `sorted_entries` because it genuinely wants recency. Adds a regression test that observes a fixed message set, calls `summary_block` before and after `next_turn()`, and asserts the two outputs are byte-identical. The shared `first_divergence` / `assert_byte_identical` helpers (from #279) move from `prompts::tests` into `test_support` so working_set tests can reuse them. Closes #280. * fix(cache): memoise tool catalog so descriptions stay byte-stable (#289) `to_api_tools` previously re-sampled `tool.description()` and `tool.input_schema()` on every call. Native tools return `&'static str` and a `json!` literal, so the bytes were stable in practice — but the `McpToolAdapter` returns `self.tool.description.as_deref()`, which can drift when the upstream MCP server reconnects with a different description string. Any drift mid-session rewrites the tool catalog that lands in the cached prefix and busts every byte that follows. Adds an `api_cache: OnceLock<Vec<Tool>>` field on `ToolRegistry`. The first `to_api_tools` call materialises the catalog; subsequent calls return a clone of the cached vector. Mutations (`register`, `remove`, `clear`) reset the field so the next read rebuilds. Mirrors reference-cc's `getToolSchemaCache` (`utils/api.ts:119–208`). Tests: - `to_api_tools_pins_description_bytes_across_calls` registers a tool whose `description()` advances through a script of pre-built strings on each call. After the cache is populated, the second `to_api_tools` read returns the original description because `description()` is no longer invoked. Without the cache the second read would return the next script entry. - `register_invalidates_api_tools_cache` registers a tool, snapshots, registers another, snapshots again, and asserts the second snapshot reflects both tools (cache rebuilt) and that the varying tool's description advanced (proving the rebuild actually re-sampled). - `remove_and_clear_invalidate_api_tools_cache` covers the other two invalidation paths. * fix(cache): sort project_tree and summarize_project output (#290) Both helpers walked the workspace via `ignore::WalkBuilder::build()` and emitted entries in the OS readdir order — non-deterministic across filesystems (htree-hash on ext4, insertion-order on APFS, etc.). Their output lands in the fallback branch of the system prompt's project context (when the workspace has no AGENTS.md / CLAUDE.md) and inside the `project_map` tool surface, both of which feed the cached prefix. `summarize_project` now sorts the collected key-files list before the type-detection logic and the fallback `Project with key files: …` join. `project_tree` collects `(rel_path, is_dir)` tuples, sorts by full path, and only then formats the indented tree. Sorting by full path preserves the visual tree shape — `"src" < "src/lib.rs"` because the shorter string compares less — while making siblings deterministic. Tests cover sibling order, parent-before-children invariant, byte stability across two consecutive calls, and the fallback `Project with key files:` branch (the only branch where the joined order escapes into output without further sorting downstream). * fix(client): unique fallback id for parallel streaming tool calls (#291) When a streamed tool_call delta omits the `id` field, the chat-completion decoder used to fall back to the literal string `"tool_call"` for every call. With the V4 API's native parallel tool calls (multiple tool_calls in one delta), every parallel call ended up with the same fallback id — downstream tool-result routing then matched the first call's result twice and the second call hung waiting for an answer that never arrived. The fallback now indexes by the assigned `content_block` position, producing `"call_0"`, `"call_1"`, … within a single response. Upstream- supplied ids are still forwarded verbatim; only the fallback path changes. Tests pin both invariants: - `decoder_assigns_unique_fallback_ids_to_parallel_tool_calls_missing_id` feeds two tool calls without `id` in one delta and asserts they get distinct ids. - `decoder_preserves_upstream_tool_call_id_when_present` keeps the forward-as-is path honest. * fix(cache): place handoff and working_set after static prompt blocks (#292) * fix(cache): drop volatile fields from working_set summary block (#280) The working-set summary lands inside the system prompt before the historical conversation, so any byte that drifts there cache-misses everything that follows in DeepSeek's KV prefix cache. Two sources of turn-over-turn drift are removed: 1. The rendered line is now `- {path} ({kind})`. The previous form interpolated `entry.touches` and `self.turn - entry.last_turn`, both of which advance on every user message even when no new paths are observed. 2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC) instead of the turn-aware `sorted_entries`. The recency bonus in `score_entry` crosses bucket boundaries as turns advance, so even without rendering `last seen` the order — and which entries cross the `max_prompt_entries` cutoff — drifted. Compaction pinning still uses `sorted_entries` because it genuinely wants recency. Adds a regression test that observes a fixed message set, calls `summary_block` before and after `next_turn()`, and asserts the two outputs are byte-identical. The shared `first_divergence` / `assert_byte_identical` helpers (from #279) move from `prompts::tests` into `test_support` so working_set tests can reuse them. Closes #280. * fix(cache): place handoff and working_set after static prompt blocks `system_prompt_for_mode_with_context_and_skills` previously interleaved volatile content into the static prefix: 1. mode prompt static 2. project context static 3. working_set_summary ← volatile 4. skills_block static 5. handoff_block ← volatile 6. ## Context Management static 7. COMPACT_TEMPLATE static Anything past byte (3) cache-missed every time the working-set drifted or `/compact` rewrote `.deepseek/handoff.md` — including the static `## Context Management` and `## Compaction Handoff` blocks behind them. New order keeps every static block in the cached prefix and pushes the two volatile blocks to the end: 1. mode prompt 2. project context (or fallback automap) 3. skills block 4. ## Context Management (Agent / Yolo only) 5. COMPACT_TEMPLATE ── volatile boundary ── 6. handoff block 7. working-set summary Adds a doc comment on the function describing the volatile-content-last invariant so future contributors don't reintroduce churn into the prefix. Adds two regression tests: - `system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged` pins the handoff path with a fixture file. - `handoff_and_working_set_appear_after_static_blocks` asserts the ordering invariant directly so a future reorder fails loudly. Reference: Claude Code's own prompt builder marks this same boundary with a `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` constant; we don't introduce the abstraction yet but match the principle. * feat(i18n): localize slash command help (Phase 1a, #285) (#294) Adds 44 new MessageIds, one per slash command, and translations to all four shipped locales (en/ja/zh-Hans/pt-BR). Refactors CommandInfo so the English description now lives in localization.rs (single source of truth) instead of being duplicated on the struct, and threads the active Locale through the three render surfaces: - crates/tui/src/tui/views/help.rs (the ?/F1/Ctrl+/ help overlay) - crates/tui/src/tui/command_palette.rs (Ctrl+K palette) - crates/tui/src/commands/core.rs (the /help text command) Usage strings (e.g. /cache [count]) stay English by design — they're placeholder syntax, not natural language. The existing locale-coverage test (`shipped_first_pack_has_no_missing_core_messages`) already iterates ALL_MESSAGE_IDS across Locale::shipped(), so the 44 new IDs are automatically required to be present in all four locale arms or CI fails. This is the first of several incremental Phase 1 PRs. Phase 1b covers the debug commands (/tokens /cost /cache), 1c the footer hints, and 1d doctor output. Phases 2–3 cover onboarding and error surfaces. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(i18n): localize /tokens /cost /cache debug output (Phase 1b, #285) (#295) Adds 13 new MessageIds covering the report templates and the sub-strings shared across them, with translations for all four shipped locales (en/ja/zh-Hans/pt-BR): - CmdTokensReport, CmdTokensContextWithWindow, CmdTokensContextUnknownWindow - CmdTokensCacheBoth, CmdTokensCacheHitOnly, CmdTokensCacheMissOnly - CmdTokensNotReported - CmdCostReport - CmdCacheNoData, CmdCacheHeader, CmdCacheTotals, CmdCacheFootnote, CmdCacheAdvice Each template uses {placeholder} substitution via String::replace rather than format!, since format! requires a literal — the locale-resolved &'static str isn't one. The placeholder convention ({active}, {hit}, {miss}, …) means a translator can re-order or restructure a sentence freely without changing the call site. Helpers `token_count`, `active_context_summary`, `cache_summary`, and `format_cache_history` now take `Locale` so each can resolve their templates from the same source of truth. The English templates byte-match the previous hardcoded format strings so the existing 16 debug-command tests pass unchanged. Column headers in the cache table (`turn in out hit miss …`) are intentionally NOT localized — the body rows are formatted with fixed column widths and translating the header words would break alignment. Numbers, ratios, and the model id stay in English form. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(i18n): localize footer state + help section labels (Phase 1c, #285) (#296) Adds 11 new MessageIds covering visible footer chrome and the help-overlay section headings, with translations for all four shipped locales: Footer: - FooterWorking — animated `working` / `working.` / … pulse - FooterAgentSingular / FooterAgentsPlural — the sub-agent count chip - FooterPressCtrlCAgain — the quit-confirmation toast Help overlay sections (`?` / `F1` / `Ctrl+/`): - HelpSectionNavigation, HelpSectionEditing, HelpSectionActions, HelpSectionModes, HelpSectionSessions, HelpSectionClipboard, HelpSectionHelp `KeybindingSection::label` now takes Locale and returns tr(locale, …). `footer_working_label` and `footer_agents_chip` likewise take Locale; the two production callsites in tui/ui.rs pass `app.ui_locale`. The mode chip itself (agent / yolo / plan) intentionally stays English — those are brand/acronym labels, and translating them would mean explaining to maintainers what `代理` means in a bug report. The keybinding catalog DESCRIPTIONS (41 entries) are not translated in this PR — those are technical prose that would dwarf the rest of i18n work and can ship in v0.8.5. Section labels are translated so the help overlay groups read as expected in any locale. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(commands): smoke-test that every / command dispatches to a handler (#299) Adds two parallel-safe smoke tests in `crates/tui/src/commands/mod.rs` that iterate the COMMANDS registry and verify every command — and every declared alias — dispatches to a real handler. A dispatch miss surfaces as the fall-through `Unknown command:` error message in `execute`, which used to be invisible until a user typed the command and saw the "did you mean" suggestion fire on a registered command. The tests build a workspace-isolated app via `tempfile::TempDir` so side-effecting handlers (`/init` writing AGENTS.md, `/save` and `/export` writing files) do not pollute `crates/tui/` when CI runs from there. `/save` and `/export` get an explicit tempdir-relative path because their no-arg defaults still resolve relative to `cwd`. `/restore` is skipped — it shells out to git for the snapshot repo and its own dedicated tests in `commands/restore.rs` already serialize on the global env mutex via `scoped_home`. The existing coverage there is sufficient. Closes a gap surfaced when verifying that the v0.8.4 i18n refactor (#294, #295, #296) did not silently break any slash-command dispatch. All 44 commands and their aliases pass (16 aliases on top of the 44 names; `/restore` is the only skip). Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(release): bump version to 0.8.4 (#297) CHANGELOG entry covers the v0.8.4 work landed since 0.8.3: - Localization Phase 1 (#285) — slash command help (#294), debug command output (#295), footer state and help-overlay section labels (#296). Adds 68 new MessageIds across all four shipped locales (en/ja/zh-Hans/pt-BR). - Cache-prefix stability (#263) — five companion fixes (#287, #288→#292, #289, #290, #291) that keep the DeepSeek prefix cache stable across turns. - Plus the items already in [Unreleased]: agent-mode network exec (#272), /skill GitHub URL parsing (#269), and the V4 Pro discount expiry extension (#267). Bumps: - Cargo.toml workspace version 0.8.3 → 0.8.4 - npm/deepseek-tui/package.json version + deepseekBinaryVersion 0.8.3 → 0.8.4 - Cargo.lock regenerated from the new workspace version. Phase 1d (doctor output), Phase 2 (onboarding/init/missing-companion), and Phase 3 (tool errors / sandbox denials / approvals) deferred to v0.8.5. The shipped Phase 1 surfaces (slash commands, debug telemetry, footer chrome) cover the highest-traffic UI paths Chinese users see first. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(release): bump internal path-dep versions + repair doc link (#301) CI on PR #300 (release feat/v0.8.4 → main) flagged two regressions introduced by the 0.8.4 version bump: 1. Version drift — path-dependency `version = "0.8.3"` references inside the workspace crates (10 crates: agent, app-server, cli, config, core, execpolicy, hooks, mcp, tools, tui) did not move with the workspace `[workspace.package] version = "0.8.4"`. The CI guard `scripts/release/check-versions.sh` requires they match. 2. Broken intra-doc-link `[crate::localization::english]` in the CommandInfo doc comment — `english` is private. Replaced with a reference to the public `description_for` accessor and the public `tr()` function. Verified with: - scripts/release/check-versions.sh — Version state OK. - RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — green. - cargo fmt + clippy + test all green. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 23:02:38 -05:00
Hunter Bown	e620e75f99	chore: release v0.8.3 Bumps workspace, all internal path-deps, and npm wrapper (version + deepseekBinaryVersion) from 0.8.2 → 0.8.3. Lockfile re-locked offline. CHANGELOG entry summarizing the 0.8.3 lane: skills path bug fix, privacy contraction, helpful missing-companion error (#258), engine decomposition (#227), bridge/persistence/palette test gap closures, crates.io badge, and 10 issue closures. Local v0.8.3 verified at /tmp/deepseek-0.8.3-test/ before publish. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 09:46:21 -05:00
Hunter Bown	bf6d82e4ba	chore: release v0.8.2 — Windows build fix, npm offline, model-visible skills, zh-CN README Bumps workspace, all internal path-deps, and npm wrapper (version + deepseekBinaryVersion) from 0.8.1 → 0.8.2. Lockfile re-locked offline to keep the registry index untouched. Triggers auto-tag.yml on push, which creates v0.8.2 and fires release.yml to build cross-platform binaries and draft the GitHub Release. npm publish remains manual per CLAUDE.md release runbook. Note: npm registry already has 0.8.2 published (with binaryVersion 0.8.1 from an earlier checkpoint). That release keeps working unchanged because v0.8.1 binaries stay on GitHub. Repo state aligns to 0.8.2 so the version-drift gate passes; next npm publish (which will need to be 0.8.3 since 0.8.2 is taken) will pick up binaryVersion=0.8.2 and pull the new binaries. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-01 01:41:47 -05:00
Hunter Bown	5770a5747b	fix cargo install packaging for v0.8.1	2026-04-30 23:45:21 -05:00
Hunter Bown	3f24759966	release: stabilize shell handles for v0.8.0 Bumps the workspace/npm wrapper to 0.8.0 and fixes completed background shell jobs retaining live process handles, which could cause Too many open files, checkpoint save failures, shell spawn failures, and lag around send/close/Esc. Also includes Windows REPL bootstrap timeout hardening and Cargo/TUNA mirror install docs.	2026-04-30 21:34:00 -05:00
Hunter Bown	3e8da4b99b	chore: bump version to 0.7.9 Includes: - Post-turn freeze fix (reorder maybe_advance_cycle before TurnComplete) - Enter/steering fix (QueueFollowUp when model is streaming) - Esc fanout hardening (idempotent finalize methods) - cargo fmt pass on new code - CHANGELOG, README, and version bump across workspace + npm	2026-04-30 20:53:10 -05:00
Hunter Bown	820985671d	chore: bump version to 0.7.8 - Cargo.toml workspace version: 0.7.7 → 0.7.8 - npm/deepseek-tui/package.json: 0.7.7 → 0.7.8 - deepseekBinaryVersion: 0.7.7 → 0.7.8 - CHANGELOG.md: add v0.7.8 section	2026-04-30 18:13:35 -05:00
Hunter Bown	7f2f47edf8	v0.7.7: stabilize sub-agent / swarm / fanout lifecycle, Windows install, and TUI polish (#246 ) * wip(v0.7.7): handoff baseline of partial sub-agent stabilization Captures uncommitted work-in-progress on the v0.7.7 stabilization lane so subsequent fixes have a stable starting point. Subsequent commits finish the canonical SubAgentJob/SwarmJob model, fix sidebar/transcript/ footer agreement, copy/paste/cancel contract, checklist rendering, shell summary preservation, monotonic spend, and version provenance. Refs #235 #236 #237 #238 #239 #240 #241 #242 #243 #244 #245 * release: bump workspace version to 0.7.7 (#245) Refs #245 * fix(v0.7.7): canonical swarm card binding, monotonic spend, checklist + shell summary - Add `swarm_card_index: HashMap<swarm_id, history_index>` so overlapping fanouts each project to their own FanoutCard. Eliminates the screenshot contradiction where a stale background swarm's progress clobbered a newer card (#236, #238). - Suppress fanout-class tools (`agent_swarm`, `spawn_agents_on_csv`, `rlm`, `agent_spawn`) from `active_tool_status_label` so the footer no longer reports "tool agent_swarm · 1 active" while sidebar+card show the actual worker counts (#236, #238). - Add `App::displayed_session_cost` + `displayed_cost_high_water` so the visible session+sub-agent total is monotonic across reconciliation events (cache discounts, provisional → final). New tests: monotonicity under negative reconciliation; duplicate dedup keeps display steady (#244). - Preserve high-signal summary lines from the truncated tail of shell output: `test result:`, `failures:`, `error[E…]`, `Finished`, `Compiling`, panic markers. Stops the agent re-running cargo gates just to see pass/fail under truncation (#242). - Render `checklist_write` / `todo_` results as a purpose-built checklist card with completed/total + percent header, per-item status markers, and a collapsing affordance for long lists. Plumbed through the existing `GenericToolCell` so no new variant threading is needed (#241). Refs #236 #238 #241 #242 #244 fix(v0.7.7): Esc clears active tool entries optimistically (#243) When Esc cancels the foreground turn we now finalize the active cell immediately rather than waiting for the engine's TurnComplete echo to drain. This stops the footer "tool ... · X active" chip from briefly contradicting the cancelled state, and frees the composer for the next message. Background `block:false` swarms are intentionally NOT killed here — they remain durable, tracked through `swarm_jobs` and `swarm_card_index` so their FanoutCard updates as workers land. Subsequent `swarm_status` / `swarm_result` / `swarm_cancel` tool calls see the canonical store. New focused test verifies: after Esc, `active_cell` is None, the background swarm record is preserved, and `is_loading` is cleared so the composer can submit immediately. Refs #243 * fix(v0.7.7): Windows .exe lookup + post-turn snapshot detach (#247, #234) #247 — npm-distributed Windows package failed at runtime because the Rust dispatcher's `delegate_to_tui` / `delegate_simple_tui` looked for a sibling named exactly "deepseek-tui", while the actual file shipped by `scripts/install.js` is `deepseek-tui.exe`. Replace both lookups with `locate_sibling_tui_binary`, which: - Honours `DEEPSEEK_TUI_BIN` for explicit overrides - Tries `deepseek-tui{EXE_SUFFIX}` first (`.exe` on Windows, "" elsewhere) - Falls back to suffix-less `deepseek-tui` on Windows so users who applied the issue's manual workaround still launch successfully - Emits a platform-correct error path in the bail message Tests: `sibling_tui_candidate_picks_platform_correct_name`, `sibling_tui_candidate_windows_falls_back_to_suffixless` (windows-only), `locate_sibling_tui_binary_honours_env_override`. #234 — Detach the post-turn workspace snapshot so `git add -A && git commit` no longer pins the engine loop after `Event::TurnComplete`. The snapshot still runs on `tokio::task::spawn_blocking`, but the engine no longer awaits its `JoinHandle`, so the UI accepts input (text, copy, paste, selection) without waiting for the bookkeeping to finish. Cycle advance and pre-turn snapshot remain awaited — they are correctness-sensitive and the cycle path already emits a status chip ("↻ context refreshing…") so the user has visible feedback. Refs #234 #247 * chore(v0.7.7): bump npm package version 0.7.6 → 0.7.7 Required by `scripts/release/check-versions.sh` ("Version drift" CI gate); the workspace was bumped to 0.7.7 but `npm/deepseek-tui/package.json` still reported 0.7.6, blocking PR #246 from going green. Refs #245	2026-04-30 07:26:26 -05:00
Hunter Bown	8ba8600155	release: v0.7.6 - Bump workspace version to 0.7.6 (Cargo.toml + all crate internal dep pins) - Bump npm wrapper version and deepseekBinaryVersion to 0.7.6 - Add v0.7.6 changelog entry: localization, paste burst, history search, pending input preview, grouped /config editor, searchable help overlay, Alt+↑ edit-last-queued, composer attachment management - Update README with v0.7.6 features (localization, paste, history search) - Archive v0.7.5 implementation plan to docs/archive/ - Update Cargo.lock	2026-04-29 17:00:36 -05:00
Hunter Bown	c2b2c284f6	release: v0.7.5 — token-basis fixes, shell timeout recovery, context/cache policy Issues #202, #203, #204, #205: - Cycle/seam triggers use active request input size + response headroom reserve, not lifetime cumulative API usage. - V4 hard-cycle headroom calibrated around fixed TURN_MAX_OUTPUT_TOKENS plus CONTEXT_HEADROOM_TOKENS safety buffer. - /tokens, /cost, footer/header labels, and docs now separate active context, turn telemetry, cumulative usage, cache hit/miss, context percent, and cost. - Foreground exec_shell timeout output tells the model the process was killed and suggests task_shell_start or background exec_shell plus poll/wait. - Added regression tests for active-token basis, V4 headroom, seam trigger basis, footer label behavior, and shell timeout recovery metadata. - Preserved #200/#201 policy: V4 default is append-only, prefix-cache preserving; replacement compaction, Flash seams, and capacity intervention remain opt-in.	2026-04-29 10:13:27 -05:00
Hunter Bown	00c92e1c2a	Implement v0.7.4 long-running agent tools	2026-04-29 00:50:43 -05:00
Hunter Bown	35db361a87	v0.7.2: clippy clean, cost counter wiring, layered context fixup #167: Fix all 7 clippy warnings — annotated SeamMetadata dead fields, removed unused should_cycle calls, collapsed nested ifs, fixed useless_format and nonminimal_bool. #168: Wire TokenUsage mailbox drain to subagent_cost accumulator. handle_subagent_mailbox now intercepts TokenUsage before routing to cards, computes cost via calculate_turn_cost, and increments app.subagent_cost in real time. Footer reflects live sub-agent spend. Restored ArchivedContext variant to HistoryCell (corrupted by prior apply_patch). Version bump to 0.7.2. Refs: #166, #167, #168	2026-04-28 21:46:25 -05:00
Hunter Bown	64d1698bde	Release 0.7.1 (#156 )	2026-04-28 18:38:44 -05:00
Hunter Bown	a02907b89d	chore(release): v0.7.0	2026-04-28 16:08:33 -05:00
Hunter Bown	f3ada0be88	chore(secrets): #134 scaffold deepseek-secrets crate Adds the `deepseek-secrets` crate with the OS keyring backend, in-memory store for tests, and a JSON-on-disk fallback for headless environments. The Secrets façade collapses keyring -> env into a single resolver; callers layer on CLI flags above and TOML config below to preserve the keyring -> env -> config-file precedence. * `KeyringStore` trait + `DefaultKeyringStore` (keyring 3.6 with per-platform native features). * `InMemoryKeyringStore` for unit tests. * `FileKeyringStore` writes ~/.deepseek/secrets/secrets.json with mode 0600 on unix; rejects world-readable files at read time. * `Secrets::auto_detect` probes the OS keyring and falls back to the file store on headless Linux. * 9 unit tests covering round-trips, precedence, and 0600 perms. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 00:01:10 -05:00
Hunter Bown	65f5daa9f9	chore(release): v0.6.7 Bumps workspace + npm wrapper to 0.6.7. Auto-tag.yml will create the v0.6.7 tag, which fires release.yml to build the binary matrix and draft the GitHub Release. v0.6.7 highlights (all 12 open issues for this release landed): - #130 sub-agent mailbox abstraction with monotonic seq + watch-based backpressure; close-as-cancel propagates through nested children - #128 in-transcript DelegateCard + FanoutCard consume the mailbox stream; sidebar demoted to a navigator - #129 approval modal Codex-style takeover with benign / destructive variant routing — destructive ops require explicit second-key confirm - #103 stream-error diagnostics + transparent retry on early decode failure; HTTP/2 keepalive defaults; DEEPSEEK_FORCE_HTTP1 escape hatch - #101 @-file mention BLOCKER fix — two-pass workspace→cwd→fuzzy resolution; .deepseekignore honored; Workspace::resolve extracted - #52 OpenRouter + Novita as first-class providers; /provider picker modal with inline API-key prompt; OPENROUTER_API_KEY / NOVITA_API_KEY - #93 help overlay (?) with searchable command + keybinding catalog - #95 /statusline picker for configurable footer items, persisted to config.toml under tui.status_items - #94 live transcript overlay (Ctrl+T) with sticky-bottom auto-scroll and (cell, width, revision) wrap cache - #85 pending input preview widget — three semantic buckets (pending steers / rejected steers / queued follow-ups); Alt+Up edits last queued - #66 error taxonomy wired through engine + capacity controller + audit log + TUI severity rendering; ad-hoc string matching gone - #68 sub-agent prompts tightened — explicit SUMMARY/EVIDENCE/CHANGES/RISKS/BLOCKERS contract, mode-specific guidance, tool-calling conventions in one canonical file 1358 deepseek-tui binary tests passing (1217 baseline → +141), all parity gates green, clippy -D warnings clean, rustdoc clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 22:46:38 -05:00
Hunter Bown	d4b9ccfdb3	feat(subagent): full registry inheritance + auto-approve + depth cap + cwd (#99 ) v0.6.6 — sub-agents inherit the parent's full tool registry, auto-approve, respect a depth cap, and propagate cancellation. Adds optional cwd to agent_spawn for parallel-worktree dispatch. Schema-ready for roles (full library lands in 0.6.7). Changes: - New ToolRegistryBuilder::with_full_agent_surface(...) shared by parent and child - SubAgentToolRegistry::new refactored to use shared builder; per-type allowlist becomes advisory - SubAgentRuntime gains auto_approve, spawn_depth, max_spawn_depth, cancel_token - Depth check at spawn entry; cancellation cascade via CancellationToken::child_token() - <deepseek:subagent.done> sentinel emitted on child completion - cwd: Option<PathBuf> on agent_spawn with workspace-boundary validation - Stream wall-clock cap bumped to 30 min (was 300s) - max_spawn_depth configurable via EngineConfig (default 3) - Version bump to 0.6.6 Closes #99.	2026-04-27 19:16:22 -05:00
Hunter Bown	5cec1534be	feat(rlm): align with reference impl + add rlm_process tool; bump 0.6.5 The previous /rlm slash command flow had a UI rendering gap (the answer never made it back to the model's view) and required the user to invoke it manually. Pivoting to a tool-call surface and aligning the in-REPL helpers with the canonical reference (alexzhang13/rlm) by the paper authors so the same prompts and decomposition patterns transfer. New tool: rlm_process - crates/tui/src/tools/rlm_process.rs - Inputs: task (small, shown to root LLM each iter as root_prompt) + exactly one of file_path (workspace-relative, preferred) or content (inline, capped at 200k chars). Optional child_model and max_depth. - Loaded across Plan/Agent/YOLO; never deferred via ToolSearch. - Returns the final answer string + metadata (iterations, duration, tokens, termination). REPL surface aligned with reference (alexzhang13/rlm): - Variable name `context` (was PROMPT) - Code fence ```repl (was ```python; python/py kept as fallback) - Helpers: llm_query, llm_query_batched (NEW), rlm_query (was sub_rlm), rlm_query_batched (NEW), SHOW_VARS (NEW), FINAL, FINAL_VAR, repl_get/repl_set - Top-level JSON-serializable user variables auto-persist across rounds (no repl_set ceremony required) - FINAL(...) / FINAL_VAR(...) parseable from the model's raw response text (parse_text_final), in addition to the in-REPL sentinel path. Code-fenced occurrences are correctly ignored to prevent false hits. Sidecar (axum, 127.0.0.1:0): - Added POST /llm_batch and POST /rlm_batch endpoints (parallel fanout, cap 16 prompts per batch). Mirrors the reference's batched semantics. Other: - System prompt rewritten with reference's strategy patterns (PREVIEW → CHUNK+map-reduce via llm_query_batched → RECURSIVE decomposition via rlm_query → programmatic compute + LLM interp). - Strict termination loop unchanged: must emit ```repl or text-level FINAL each round; one fence-less round → reminder, two → DirectAnswer. - /rlm slash command remains for manual debug; description points the model toward rlm_process for the in-agent flow. Versions: workspace 0.6.4 → 0.6.5; npm wrapper 0.6.4 → 0.6.5. Gates green: cargo fmt, cargo clippy --all-targets --all-features --locked -D warnings, cargo test --workspace --all-features --locked (all pass), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS= -Dwarnings cargo doc --workspace --no-deps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:17:09 -05:00
Hunter Bown	bd938a559c	fix(rlm): wire real recursive substrate; bump 0.6.4 The v0.6.3 RLM loop had Algorithm 1's outer shape but the substrate was non-functional: `llm_query()` was a Python stub that returned a hardcoded string and `child_model` was bound with an underscore prefix and silently dropped. The recursive sub-LLM call advertised by /rlm never fired. This commit wires the substrate end-to-end per Zhang/Kraska/Khattab (arXiv:2512.24601, Algorithm 1): - New axum HTTP sidecar (`rlm/sidecar.rs`) bound to 127.0.0.1:0 for the duration of one RLM turn. Python's `llm_query()` and `sub_rlm()` are real `urllib.request` POSTs; Rust services them via the existing DeepSeek client. Token usage from sidecar-served calls folds into the parent `RlmTurnResult.usage`. - `child_model` is plumbed through `Op::RlmQuery` → `AppAction::RlmQuery` → `run_rlm_turn` → sidecar handlers; default remains `deepseek-v4-flash`. - New `sub_rlm(prompt)` Python helper runs a full Algorithm-1 turn at depth-1 (paper's `sub_RLM`). Default `max_depth = 2` from `/rlm`. The recursive opaque-future cycle is broken by returning a concrete `Pin<Box<dyn Future + Send>>` from `run_rlm_turn_inner`. - Strict termination: the loop ends only via `FINAL(value)` (or the iteration cap). One fence-less round is tolerated with a reminder appended; two consecutive ones surface the model text as a `RlmTermination::DirectAnswer` exit. New `RlmTermination` enum lets callers tell `Final \| DirectAnswer \| Exhausted \| Error` apart. - Richer `Metadata(state)`: includes paper-required access patterns (`repl_get` / slicing / `splitlines` / `repl_set` / `llm_query` / `sub_rlm` / `FINAL`) and a live list of variable keys currently in the REPL state file. - Unicode-safe `truncate_text` (was mixing bytes with chars), per-turn state-file cleanup, `ROOM_TEMPERATURE` typo → `ROOT_TEMPERATURE`. - New end-to-end test `sidecar_url_is_exported_to_python_env` stands up a stand-in axum server, runs `print(llm_query('hello'))` in the real PythonRuntime, and asserts the reply round-trips. Catches future regressions in sidecar URL passthrough. Versions: workspace 0.6.3 → 0.6.4 in Cargo.toml; npm wrapper 0.6.3 → 0.6.4 in npm/deepseek-tui/package.json. Gates: cargo fmt, cargo clippy --all-targets --all-features --locked -D warnings, cargo test --workspace --all-features --locked (1088 passed), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:36:59 -05:00
Hunter Bown	28e8d70ffc	chore(release): bump workspace + npm wrapper to 0.6.3	2026-04-26 17:52:29 -05:00
Hunter Bown	ac1332565c	release: v0.6.2 Highlights: - fix(client): SSE idle-timeout so a stalled stream surfaces a clear error instead of hanging the active cell (#76) - fix(tui): sidebar Agents panel reads live engine progress, not just the cached snapshot — matches the footer chip in real time (#63) - fix(tui): generic tool result preview preserves newlines for diff stats / file lists / todo snapshots (#80) - fix(tui): slash-menu scroll viewport now exercises center-tracking past the first 6 entries (#64) - feat(mcp): connect-failure errors include URL, status, body excerpt, transport — credentials masked (#71) - feat(tools): mark alias tools (spawn_agent, close_agent, send_input, delegate_to_agent) with _deprecation metadata; removal slated 0.8.0 (#72) - feat(capacity): V4 model priors (deepseek-v4-pro/flash) + key normalization, plus DEEPSEEK_CAPACITY_PRIOR_V4_* env overrides (#73) - feat(tools): explain parallel fan-out caps in agent_spawn vs rlm_query descriptions and error messages — cost-class table in TOOL_SURFACE.md (#81) - chore(errors): partial wiring of the error taxonomy — classify_error_message helper used in capacity controller, audit log fields pending (#66) - chore(providers): scaffold OpenRouter and Novita variants end-to-end (env keys, default base URLs, model normalization). Modal /provider picker UI still pending (#52) Build hygiene: - cargo fmt clean, cargo clippy --workspace -- -D warnings clean - cargo test --workspace passes (979+ tests across crates) - pre-existing dead-code warnings gated per-item with TODO refs to #61/#66 EOF )	2026-04-26 13:56:40 -05:00
Hunter Bown	e1ac84ae44	release: v0.6.1 — pricing update, remove light theme + theme setting - V4 cache-hit input prices cut to 1/10th per DeepSeek pricing update: Pro promo 0.03625→0.003625, Pro base 0.145→0.0145, Flash 0.028→0.0028 - Remove the 'light' theme variant (Variant::Light, Theme::light(), test) - Remove the theme setting entirely — hardcode UI_THEME to whale/dark, drop the theme field from Settings, ConfigView, and config command - Bump workspace version 0.6.0 → 0.6.1 (Cargo.toml, npm pkg, CHANGELOG) - De-cringe the README: drop emojis, marketing fluff, unverified claims	2026-04-26 11:56:41 -05:00
Hunter Bown	5f223adea6	v0.6.0: native rlm_query tool + scroll fix + cleanup Adds a structured rlm_query tool for parallel/batched LLM fan-out. The model calls it with one prompt or up to 16 concurrent prompts; children dispatch via tokio::join_all against the existing DeepSeek client. Default child model is deepseek-v4-flash; override per-call via the model field. Available in Plan / Agent / YOLO. Cost folds into the session's running total automatically. Fixes scroll-stuck regression (#56): TranscriptScroll::resolve_top and scrolled_by now use a three-level fallback chain (same line → same cell line 0 → nearest cell at-or-before) instead of teleporting to ToBottom when an anchor cell vanishes. Loosens command-safety chains (#57): cargo build && cargo test and similar chains of known-safe commands now escalate to RequiresApproval instead of being hard-blocked as Dangerous. Chains containing unknown commands still block. Suppresses the GettingCrowded footer chip — context-percent header already covers conversation pressure. Refactors: - Extracts file_mention parsing/completion/expansion (~450 LOC) from the 5,500-line ui.rs into crates/tui/src/tui/file_mention.rs. - Deletes truly unused helpers (write_bytes, timestamped_filename, extension_from_url, output_path, has_project_doc, primary_doc_path). Tests: 853 pass. cargo clippy --workspace -D warnings clean. cargo fmt --all -- --check clean. Closes #46 #47 #48 #49 #50 #53 #54 #55 #56 #57 #58.	2026-04-25 21:48:17 -05:00

1 2 3

102 Commits