dgf1988/codewhale

Author	SHA1	Message	Date
Hunter Bown	abc8751d81	docs(release): add v0.8.11 fix entries to CHANGELOG + update README What's New - CHANGELOG: add YOLO sandbox, scroll lock, and capacity controller fixes - README: replace v0.8.10 highlights with v0.8.11 cache-maxing + fixes - Date corrected to 2026-05-04	2026-05-04 23:50:15 -05:00
Hunter Bown	4c783be52d	fix(yolo): drop sandbox to DangerFullAccess — no guardrails as advertised User report: YOLO mode was still routing shell commands through the WorkspaceWrite sandbox, which intercepted legitimate outside-workspace writes (package installs, sub-agent workspaces, package-manager state under ~/.cache, brew, npm install -g, pipx, …) and forced approval round-trips. That contradicts the YOLO contract — the user opted into "no guardrails" and instead got a guardrail. YOLO already auto-approves all tools and enables trust mode. The sandbox was the last residual restriction. Drop it. Change in `Engine::build_tool_context`: split the previously-merged `AppMode::Agent | AppMode::Yolo` arm into two: * Agent keeps `WorkspaceWrite { writable_roots, network_access: true, … }` — interactive mode with explicit per-tool approval, so the sandbox plus the approval flow form a defense-in-depth layer. * Yolo uses `DangerFullAccess` — no sandbox. The user has opted into auto-approval + trust mode + no sandbox as one consistent posture. Plan mode unchanged (read-only, no shell tool registered). Updated `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network` to pin the new YOLO contract: `DangerFullAccess` specifically, not just "has network access." Verified locally: * `cargo fmt --all -- --check` clean. * `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` clean. * `cargo test --workspace --all-features --locked` — green (the snapshot::repo flake still flakes in batch but passes in isolation; unrelated). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:37:20 -05:00
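The per-mode split described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the enum shapes are simplified stand-ins, `sandbox_for_mode` is a hypothetical helper name, and the `ReadOnly` variant for Plan mode is an assumption (the log only says Plan is read-only with no shell tool registered).

```rust
// Simplified, hypothetical stand-ins for the types named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AppMode {
    Plan,
    Agent,
    Yolo,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum SandboxPolicy {
    // Assumed variant name; the log only says Plan mode is read-only.
    ReadOnly,
    WorkspaceWrite {
        writable_roots: Vec<String>,
        network_access: bool,
    },
    DangerFullAccess,
}

/// Pick a sandbox policy per mode. Agent and Yolo get separate arms,
/// instead of sharing one merged `Agent | Yolo` arm.
pub fn sandbox_for_mode(mode: AppMode, workspace: &str) -> SandboxPolicy {
    match mode {
        AppMode::Plan => SandboxPolicy::ReadOnly,
        // Interactive mode: sandbox plus per-tool approval.
        AppMode::Agent => SandboxPolicy::WorkspaceWrite {
            writable_roots: vec![workspace.to_string()],
            network_access: true,
        },
        // Opt-in "no guardrails" mode: no sandbox at all.
        AppMode::Yolo => SandboxPolicy::DangerFullAccess,
    }
}
```

Keeping the two arms separate means a later change to the Agent sandbox cannot silently reintroduce a restriction in YOLO mode.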
Hunter Bown	5c72e5f463	fix(scroll): preserve user_scrolled_during_stream lock across resolve Live repro: in a session producing content rapidly (sub-agent running, multiple tool calls), the user scrolls up to read earlier output. Their scroll position briefly takes effect, then snaps back to the live tail when the next stream chunk arrives. Symptom is "scrolling is broken / takes over instead of the transcript". Root cause in `crates/tui/src/tui/widgets/mod.rs:188-210`: * The user's mouse-scroll-up sets `transcript_scroll = at_line(N)` and `user_scrolled_during_stream = true`. * During render, `resolve_top` clamps the state against `max_start = total_lines.saturating_sub(visible_lines)`. If `max_start < N` (transcript shrunk between scrolls and render — e.g., a sub-agent in-progress card collapsed into a smaller finished card, or the content briefly fits in one screen), `resolve_top` returns `Self::to_bottom()` (TAIL_SENTINEL). * `is_at_tail()` on the post-resolve state returns `true`. * The auto-clear at line 208 fires → `user_scrolled_during_stream = false`. * Next `add_message` / sub-agent envelope sees `is_at_tail() && !user_scrolled_during_stream` and calls `scroll_to_bottom()`. The user is yanked off their position mid-read. `scrolled_by` has the same trapdoor: when `total_lines <= visible_lines` it returns `to_bottom()` regardless of scroll direction (line 145-148 in scrolling.rs). A user scroll-up while content fits in one screen produces `to_bottom()` → `is_at_tail()` true → auto-clear → next chunk yanks. The fix ======= Snapshot whether the user's PRIOR state was deliberately tail (`is_at_tail()` BEFORE `resolve_top`), and only clear the lock when: 1. Prior state was already TAIL_SENTINEL (deliberate, set by `scrolled_by` reaching `max_start` while scrolling DOWN, or by `scroll_to_bottom()`). 2. 
AND `total_lines > visible_lines` (so "tail" is meaningful — if the whole transcript fits, "is_at_tail" is trivially true and clearing the lock would yank the user back to bottom on the next chunk despite their explicit scroll-up). This preserves all the legitimate clear paths: * `TurnComplete` event clears the lock at the per-turn boundary (`ui.rs:879`). * User invokes `scroll_to_bottom()` explicitly via key/menu (`app.rs:2459`). * User scrolls down enough that `scrolled_by` reaches `max_start` in a transcript with real scroll room — state goes through `to_bottom()` BEFORE resolve, so `was_explicit_tail = true` and the lock clears. What it stops: * Render-time resolve clamping `at_line(N)` to tail when content shrunk doesn't quietly revoke the user's intent. * `scrolled_by` collapsing a scroll-up to `to_bottom()` when content briefly fits in one screen no longer triggers the auto-clear (the prior state wasn't tail). Verified locally: * `cargo fmt --all -- --check` clean. * `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` clean. * `cargo test --workspace --all-features --locked` — 2038 passed, 2 ignored, 0 failed (a snapshot::repo flake unrelated to scroll; passes in isolation). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:35:39 -05:00
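The two-part clear condition from the scroll fix above can be sketched as below. This is an illustration under assumptions: `Scroll` is a simplified stand-in for the real scroll state (the log names `at_line`, `to_bottom` and a tail sentinel but not the type), `lock_after_render` is a hypothetical helper, and the handling of `N == max_start` is a guess.

```rust
// Hypothetical, simplified scroll state: a fixed line or the live tail.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scroll {
    AtLine(usize),
    Tail,
}

impl Scroll {
    pub fn is_at_tail(self) -> bool {
        matches!(self, Scroll::Tail)
    }

    /// Clamp against the current transcript size. A position past the last
    /// valid start line collapses to `Tail`, as the commit describes.
    pub fn resolve_top(self, total_lines: usize, visible_lines: usize) -> Scroll {
        let max_start = total_lines.saturating_sub(visible_lines);
        match self {
            // Assumption: a position exactly at `max_start` is kept as-is.
            Scroll::AtLine(n) if n <= max_start => Scroll::AtLine(n),
            _ => Scroll::Tail,
        }
    }
}

/// Returns the resolved scroll state and the new value of the
/// "user scrolled during stream" lock.
pub fn lock_after_render(
    prior: Scroll,
    locked: bool,
    total_lines: usize,
    visible_lines: usize,
) -> (Scroll, bool) {
    // Snapshot intent BEFORE clamping, so a render-time clamp to tail
    // is not mistaken for a deliberate jump to the bottom.
    let was_explicit_tail = prior.is_at_tail();
    let resolved = prior.resolve_top(total_lines, visible_lines);
    // Only clear when the tail was deliberate AND there is real scroll room.
    let may_clear = was_explicit_tail && total_lines > visible_lines;
    (resolved, locked && !may_clear)
}
```

With this shape, a transcript that shrinks under a scrolled-up user still resolves to the tail for rendering, but the lock survives, so the next chunk does not trigger an automatic scroll to bottom.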
Hunter Bown	1131e7a7b0	fix(capacity): disable controller by default (silent transcript wipe) User-facing repro: * In YOLO mode at low context utilisation (~5%), the engine briefly showed `resetting plan` in the footer and the transcript area went mostly black. Tools kept running (Plan panel + sidebar still rendered), but the chat history above the latest turn was gone. Root cause: the capacity controller's `VerifyAndReplan` action (`crates/tui/src/core/engine/capacity_flow.rs::apply_verify_and_replan`) runs `self.session.messages.clear()` and rebuilds from the canonical state. The capacity controller fires this when its slack-based `p_fail` calculation crosses the high-severe band — independently of the `auto_compact` setting, independently of token utilisation. The user opted out of auto-compaction in v0.8.11 (default `auto_compact = false`, #665), explicitly trusting the model with the full 1M-token V4 window. Auto-managing the prefix on their behalf via the capacity controller contradicts that posture and silently destroys both the user-visible transcript and V4's prefix cache. The fix ======= Flip `CapacityControllerConfig::default().enabled` from `true` to `false`. The controller's `observe_*` and `decide` methods already short-circuit when `enabled` is false (`capacity.rs:255`, `capacity.rs:396`), so the existing wiring becomes a no-op for the default config — no need for defensive gating in `capacity_flow.rs`. Power users who want the controller can opt in via `capacity.enabled = true` in `~/.deepseek/config.toml`. The slack heuristics, model priors, cooldowns, and intervention paths all remain in the codebase, ready to re-engage on opt-in. Nothing deleted. Tests ===== * `default_controller_is_disabled_and_skips_observations` — pins the new default; `observe_pre_turn` returns `None`. * `opt_in_controller_observes_and_decides` — confirms `enabled = true` rearms the controller end-to-end. 
* `app_config_without_capacity_uses_default_disabled` — pins that loading a config with no `[capacity]` section produces `enabled = false`. * `capacity_disabled_by_default_keeps_messages_intact` — direct regression for the user-reported symptom: with default config, even a forced error-escalation checkpoint cannot trigger `messages.clear()`. Asserts the transcript length is preserved. Verified locally: * `cargo fmt --all -- --check` clean. * `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` clean. * `cargo test --workspace --all-features --locked` — 2039 passed, 2 ignored, 0 failed (one flake on `snapshot::repo::tests::restore_removes_files_added_after_target_snapshot` was filesystem-timing-dependent, passes on isolation re-run; unrelated to this change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:24:24 -05:00
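The short-circuit that makes the default config a no-op might look like the sketch below. Everything except the names `CapacityControllerConfig`, `enabled`, `decide` and `VerifyAndReplan` is an assumption: the `Intervention` enum, the `HIGH_SEVERE_P_FAIL` cut-off and `run_checkpoint` are hypothetical, and the real controller carries much more state.

```rust
/// Hypothetical, trimmed-down config: only the `enabled` flag from the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CapacityControllerConfig {
    pub enabled: bool,
}

impl Default for CapacityControllerConfig {
    fn default() -> Self {
        // The new default: the controller is off unless the user opts in.
        Self { enabled: false }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Intervention {
    VerifyAndReplan,
}

/// Assumed cut-off for the "high-severe" band; the real value is not in the log.
pub const HIGH_SEVERE_P_FAIL: f64 = 0.8;

pub struct CapacityController {
    pub config: CapacityControllerConfig,
}

impl CapacityController {
    /// Short-circuits when disabled, so callers need no extra gating.
    pub fn decide(&self, p_fail: f64) -> Option<Intervention> {
        if !self.config.enabled {
            return None;
        }
        (p_fail >= HIGH_SEVERE_P_FAIL).then_some(Intervention::VerifyAndReplan)
    }
}

/// Stand-in for the engine checkpoint: the transcript is cleared only
/// when the controller actually asks for a replan.
pub fn run_checkpoint(controller: &CapacityController, p_fail: f64, messages: &mut Vec<String>) {
    if let Some(Intervention::VerifyAndReplan) = controller.decide(p_fail) {
        messages.clear();
    }
}
```

Because the gate sits inside `decide`, the default configuration keeps the transcript intact even when the failure estimate is very high.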
Hunter Bown	d4e5ee4eff	chore(release): date v0.8.11 CHANGELOG section	2026-05-04 23:07:57 -05:00
Hunter Bown	54ed8c1d2f	Merge pull request #678 from Hmbown/chore/v0.8.11-release chore(release): bump version to 0.8.11 + CHANGELOG	2026-05-04 23:07:44 -05:00
Hunter Bown	9366a7c5f3	chore(release): bump version to 0.8.11 Final step in the v0.8.11 patch release. Bumps the workspace `Cargo.toml`, all 9 internal path-dep version pins, and `npm/deepseek-tui/package.json` to 0.8.11. `Cargo.lock` regenerated alongside. The v0.8.11 CHANGELOG entry already landed on `main` via the cache-maxing overhaul PR (#684). This commit only stamps the version. Together they ship: * Cache-maxing for V4 1M context — engine no longer rebuilds the system prompt on every turn (#684's `Session::last_system_prompt_hash`), the volatile working-set summary moved out of the system prompt into per-turn `<turn_meta>` on the latest user message, the tool array is anchored with `cache_control: ephemeral`, and the `messages_with_turn_metadata` injection skips tool-result messages so the assistant→tool_result invariant stays intact. * 500K compaction floor — automatic compaction refuses below 500K tokens via `MINIMUM_AUTO_COMPACTION_TOKENS`. Manual `/compact` bypasses (explicit user agency). * Token-only compaction trigger — dropped `CompactionConfig::message_threshold` and the message-count branch in `should_compact`; that 128K-era heuristic only fired on long sessions of small messages, exactly the case where rewriting the V4 prefix cache is most wasteful. * Legacy 128K naming — `DEFAULT_CONTEXT_WINDOW_TOKENS` → `LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS`. * `npm install` resilience — `install.js` now retries with exponential backoff, enforces per-attempt timeout + 30 s stall detector, honors `HTTPS_PROXY` / `HTTP_PROXY` / `NO_PROXY` (pure Node, no new dependencies), and prints download progress to stderr. Driven by a community report that `npm install` took 18 minutes through a CN npm mirror; the GitHub Releases binary fetch was the bottleneck and CN mirrors don't proxy GitHub. 
Verified locally: * cargo fmt --all -- --check ✓ * cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓ * cargo test --workspace --all-features --locked ✓ * parity gates (snapshot, parity_protocol, parity_state) ✓ * bash scripts/release/check-versions.sh ✓ (workspace=0.8.11, npm=0.8.11, lockfile in sync) * node scripts/release/npm-wrapper-smoke.js ✓ Reminder for the maintainer at release time: the npm publish is manual and requires 2FA OTP on every publish. After this PR merges and the GitHub Release is fully drafted by `auto-tag.yml`, publish from a developer machine: cd npm/deepseek-tui npm publish --access public The `prepublishOnly` hook checks all eight binaries plus the SHA256 manifest are present on the GitHub Release before letting `npm publish` proceed, so this must happen after the GitHub Release is finalized. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 23:01:16 -05:00
Hunter Bown	8dfecfb5d9	Merge pull request #685 from Agent-Skill-007/main Revise README for clarity and structure	2026-05-04 23:00:15 -05:00
Hunter Bown	e9b2c2ca5d	Merge pull request #684 from Hmbown/feat/v0.8.11-compaction-v4-overhaul feat(v0.8.11): cache-maxing overhaul — per-turn rebuild gate + working_set extraction + tool anchor	2026-05-04 22:51:29 -05:00
Hunter Bown	9e67e04e4a	fix(install,tests): fmt nit + downloadText flowing-mode bug CI on PR #684 caught two real issues that local checks missed: Lint failure (cargo fmt). A regression test landed with a multi-line `let ContentBlock::Text { text, .. } = real_user.content...` pattern that local rustfmt accepted but CI's pinned toolchain collapsed onto a single line. Reformatted to match. npm wrapper smoke failure ("Checksum manifest is missing deepseek-<platform>"). Subtle Node.js streams interaction in `install.js` introduced by the network-resilience cluster: * `httpRequest` attaches a `data` event listener on the response to re-arm the stall timer. * Attaching a `data` listener on a `Readable` puts the stream into flowing mode immediately. * `downloadText` then ran `for await (const chunk of response)` to collect the body — the async iterator expects paused-mode and silently misses chunks that flow before / between iteration ticks. * For small bodies (the ~100-byte SHA256 manifest), the entire response could flow through the stall listener before the async iterator's `read()` calls landed, leaving the joined body empty. * Result: `parseChecksumManifest("")` returned an empty Map → `verifyChecksum` saw no entries → "manifest is missing X" after the actual binary download succeeded. Binary downloads were unaffected because `download()` uses `response.pipe(sink)` plus a `data` listener for progress — both consume chunks via `data` events, no async iterator involved. Fix: collect the response body in `downloadText` via direct `data`/`end` event subscription. `data` listeners stack — both the stall re-arm and the body collector fire on every chunk, no flowing-vs-paused conflict. Stall detection still works. Verified locally: `node scripts/release/npm-wrapper-smoke.js` "npm wrapper smoke passed with local assets from <url>". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:37:23 -05:00
Hunter Bown	76dd924c7f	fix(engine): turn_meta must skip tool-result messages (HTTP 400 fix) Live-test repro: typing a single user message in the TUI triggered a tool call (read_file Cargo.toml), and the next request to DeepSeek's API returned HTTP 400: "An assistant message with 'tool_calls' must be followed by tool messages responding to each 'tool_call_id'. (insufficient tool messages following tool_calls message)" Root cause: `messages_with_turn_metadata` walked the message list from the tail and prepended a `<turn_meta>` Text block to the last message with role="user". But tool-result messages also use role="user" internally (they serialize to role="tool" on the wire). Inserting a Text content block at index 0 of a tool-result message changed the shape from `[ToolResult(...)]` to `[Text("turn_meta..."), ToolResult(...)]`, which on the wire becomes a role="user" message with text instead of the role="tool" message the API needs to satisfy the assistant's prior tool_call. Hence the 400. The fix: * Restrict the injection target to messages that have at least one Text content block AND no ToolResult blocks. This identifies actual user-typed messages and skips tool-result envelopes. * When the trailing slice has no eligible user message (e.g. mid-turn when a tool result is the most recent message), skip injection entirely. The working_set will surface again on the next genuine user prompt; we don't retroactively prepend onto an earlier user message because that would also confuse the API's tool-call continuity checks. Two regression tests pin the contract: * `turn_metadata_skips_tool_result_messages` — assistant tool_call + tool_result + earlier user message: only the user message gets the prefix, the tool_result message stays a single-block ToolResult. 
* `turn_metadata_skips_when_only_tool_results_trail` — the corner case where the trailing user-role message is solely a tool result (no real user message in the slice): no injection happens, the message returns unchanged. Verified locally: * 2038 tests passed in TUI bin (2 ignored, was 2036 — these are the +2 new regressions). * `cargo fmt`, `cargo clippy --locked -D warnings`, parity gates all clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:27:11 -05:00
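The eligibility rule from the fix above (at least one Text block and no ToolResult blocks) can be sketched like this. The types are simplified stand-ins, `with_turn_metadata` is a hypothetical name for the behavior of `messages_with_turn_metadata`, the whole list is treated as the "trailing slice", and the closing `</turn_meta>` tag is an assumption about the wire format.

```rust
// Simplified, hypothetical message model for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text(String),
    ToolResult { tool_call_id: String, output: String },
}

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: String,
    pub content: Vec<ContentBlock>,
}

/// A user-typed message: role "user", at least one Text block,
/// and no ToolResult blocks (tool results also use role "user" internally).
fn is_real_user_message(m: &Message) -> bool {
    m.role == "user"
        && m.content.iter().any(|b| matches!(b, ContentBlock::Text(_)))
        && !m.content.iter().any(|b| matches!(b, ContentBlock::ToolResult { .. }))
}

/// Prepend a turn-metadata Text block to the most recent eligible user
/// message. If no message is eligible, return the list unchanged.
pub fn with_turn_metadata(mut messages: Vec<Message>, meta: &str) -> Vec<Message> {
    if let Some(target) = messages.iter_mut().rev().find(|m| is_real_user_message(m)) {
        target
            .content
            .insert(0, ContentBlock::Text(format!("<turn_meta>{}</turn_meta>", meta)));
    }
    messages
}
```

The key property is that a tool-result message keeps its single-block shape, so it still serializes as the reply to the assistant's preceding tool call.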
Hunter Bown	a08c891542	docs(changelog): cover message_threshold removal + install.js resilience Folds two follow-on changes into the 0.8.11 entry: * The token-only compaction trigger (drops the 128K-era message-count heuristic). * The npm `install.js` network-resilience cluster (retry + timeout + proxy + progress). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:20:07 -05:00
Hunter Bown	229f02ea2c	feat(npm): install.js network resilience for slow / firewalled networks A community user from China reported `npm install deepseek-tui` took 18 minutes through a CN npm mirror. The bottleneck is the GitHub Releases binary fetch (~46 MB across two binaries), not the npm tarball (which is 6.9 kB). The CN mirror does NOT proxy GitHub release downloads, so any user behind a slow or lossy connection is hitting the GitHub fetch directly with no resilience. Four behaviors added to `npm/deepseek-tui/scripts/install.js`: 1. Retry with exponential backoff. Up to 5 attempts on network errors (ECONNRESET, ECONNREFUSED, ETIMEDOUT, EAI_AGAIN, network/host unreachable, EPIPE, ECONNABORTED) and 5xx upstream responses. Backoff `1s, 2s, 4s, 8s, 16s` with ±20% jitter. 4xx and checksum-mismatch are flagged non-retryable so we don't thrash on permanent failures. Final error includes the underlying message and the attempt count. 2. Per-attempt total timeout + stall detector. Total timeout defaults to 5 minutes per attempt (`DEEPSEEK_TUI_DOWNLOAD_TIMEOUT_MS`, alias `DEEPSEEK_DOWNLOAD_TIMEOUT_MS`). A stall detector aborts the request when no bytes arrive for 30 s (`DEEPSEEK_TUI_DOWNLOAD_STALL_MS`, alias `DEEPSEEK_DOWNLOAD_STALL_MS`) so a hung connection doesn't waste the whole timeout. Both budgets are surfaced in the error so the user can dial them up if they're on a slow pipe. 3. HTTPS_PROXY / HTTP_PROXY support — pure Node, no new dependencies. Detects `HTTPS_PROXY` / `HTTP_PROXY` (and the lowercase variants) and routes through the proxy via CONNECT tunneling. `NO_PROXY` exclusion list honored, with `*` and dotted-suffix matching. Proxy auth via standard `user:pass@` URL form is passed through as `Proxy-Authorization: Basic ...`. Pure-Node implementation using `net` + `tls` + `http` + `https` builtins — no `https-proxy-agent` dependency added. 4. 
Download progress indicator. Writes to stderr every ~1 MB or every 2 s in TTY mode using `\r` to overwrite a single line. Non-TTY mode (CI, piped) emits one line per 5 MB so logs stay reasonable. Suppressed when `DEEPSEEK_TUI_QUIET_INSTALL=1` or when `npm_config_loglevel` is `silent` or `error`. Falls back to `N MB downloaded` when the response has no `Content-Length`. Public API unchanged: existing callers of `getBinaryPath` and `run` keep working identically when no new env vars are set. The escape hatch `DEEPSEEK_TUI_DISABLE_INSTALL=1` still exits cleanly. Verified locally: * `node -c install.js` and module-load syntax checks. * `DEEPSEEK_TUI_FORCE_DOWNLOAD=1 DEEPSEEK_TUI_VERSION=0.8.10 node install.js` — real GitHub Releases download succeeded with visible progress, both binaries landed. * `HTTPS_PROXY=http://invalid.proxy.local:9999 ... node install.js` — proxy path exercised, fails cleanly with the bad host named in the error message after retries exhausted. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:19:34 -05:00
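The retry policy in item 1 above (a `1s, 2s, 4s, 8s, 16s` schedule with ±20% jitter, and 4xx or checksum mismatch treated as permanent) is written in Node in the repository. The sketch below restates the same arithmetic in Rust purely for illustration. All names are hypothetical, and the jitter source is passed in as a value in `[-1.0, 1.0]` rather than drawn from a random number generator.

```rust
use std::time::Duration;

/// Attempt cap taken from the commit message.
pub const MAX_ATTEMPTS: u32 = 5;

/// Base delay for a 1-based attempt number: 1s, 2s, 4s, 8s, 16s.
/// Attempts past the fifth are capped at 16s in this sketch.
pub fn base_delay(attempt: u32) -> Duration {
    let exponent = attempt.saturating_sub(1).min(4);
    Duration::from_secs(1u64 << exponent)
}

/// Apply +/-20% jitter. `jitter_unit` is expected in [-1.0, 1.0]
/// (for example from a random source) and is clamped defensively.
pub fn jittered_delay(attempt: u32, jitter_unit: f64) -> Duration {
    let factor = 1.0 + 0.2 * jitter_unit.clamp(-1.0, 1.0);
    base_delay(attempt).mul_f64(factor)
}

/// Network-level errors (no HTTP status) and 5xx responses are retried;
/// 4xx responses and checksum mismatches are treated as permanent.
pub fn is_retryable(status: Option<u16>, checksum_mismatch: bool) -> bool {
    if checksum_mismatch {
        return false;
    }
    match status {
        None => true,
        Some(code) => (500..600).contains(&code),
    }
}
```

Separating the schedule from the jitter source keeps the arithmetic deterministic and easy to check, whichever random generator the caller uses.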
Hunter Bown	d0e95f23b0	refactor(compaction): drop message_threshold, token-only triggering The `message_threshold` field on `CompactionConfig` was a 128K-era heuristic that fired compaction on long sessions of small messages — exactly the case where rewriting V4's prefix cache is most wasteful. Token budget is the only signal that maps to actual model context pressure; counting messages adds nothing. Changes: * Remove `CompactionConfig::message_threshold` field. * Remove the message-count branch in `should_compact` — token threshold + 500K floor is now the sole compaction trigger. * Remove `compaction_message_threshold_for_model`, `DEFAULT_COMPACTION_MESSAGE_THRESHOLD`, `COMPACTION_MESSAGE_DIVISOR`, `MAX_COMPACTION_MESSAGE_THRESHOLD` from `models.rs`. * Drop the `forced_config.message_threshold` tweak in the engine's capacity-guardrail forced-compaction path; that path now also bypasses the floor (`auto_floor_tokens = 0`) because we're at a hard ceiling and have to free budget regardless of cache cost. * Update production constructors (`main.rs`, `runtime_threads.rs`, `app.rs::compaction_config`) to drop the field. * Update tests: keep the floor + token-threshold assertions, delete the two tests that specifically validated message-count triggering, replace `should_compact_respects_message_threshold` with `message_count_no_longer_triggers_compaction` pinning the new contract. Verified locally: * `cargo fmt --all -- --check` clean. * `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` clean. * `cargo test --workspace --all-features --locked` — 2036 passed in TUI bin (2 ignored), all other crates green. * parity gates: snapshot, parity_protocol, parity_state — all pass. * `git diff --exit-code -- Cargo.lock` — clean. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 22:19:09 -05:00
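A token-only trigger with a floor, as described in the refactor above, could be sketched as follows. `MINIMUM_AUTO_COMPACTION_TOKENS` and `auto_floor_tokens` are names from the log; the struct layout, the `token_threshold` field name and the function signature are assumptions made for illustration.

```rust
/// Floor below which automatic compaction refuses to run (name from the log).
pub const MINIMUM_AUTO_COMPACTION_TOKENS: usize = 500_000;

/// Hypothetical, trimmed-down config. There is no message-count field.
#[derive(Debug, Clone, Copy)]
pub struct CompactionConfig {
    pub enabled: bool,
    pub token_threshold: usize,
    /// Set to 0 on the forced path to bypass the floor.
    pub auto_floor_tokens: usize,
}

/// Token budget is the only signal. `_message_count` is accepted here
/// only to show that it no longer influences the decision.
pub fn should_compact(cfg: &CompactionConfig, estimated_tokens: usize, _message_count: usize) -> bool {
    cfg.enabled
        && estimated_tokens >= cfg.auto_floor_tokens
        && estimated_tokens >= cfg.token_threshold
}
```

A long session of many small messages stays untouched until its token estimate clears both the floor and the threshold.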
Agent007	c2a992ece0	Revise README for clarity and structure Updated README.md for clarity and formatting improvements.	2026-05-05 11:16:44 +08:00
Hunter Bown	e98efcf31d	fix(engine): drop dead working set prompt marker	2026-05-04 22:08:07 -05:00
Hunter Bown	991ae41571	docs(changelog): describe v0.8.11 cache overhaul	2026-05-04 22:07:04 -05:00
Hunter Bown	1a6589c55a	perf(tools): anchor tool array with cache control	2026-05-04 22:06:58 -05:00
Hunter Bown	b48b68f078	perf(engine): stabilize system prompt and move working set metadata	2026-05-04 22:06:55 -05:00
Hunter Bown	a14227edf8	refactor(models): rename legacy DeepSeek context window	2026-05-04 22:06:16 -05:00
Hunter Bown	a4dee56fcc	fix(compaction): 500K hard floor plus V4 default	2026-05-04 22:06:07 -05:00
Hunter Bown	43bb055dfe	Merge pull request #670 from Hmbown/fix/v0.8.11-view-stack-tracing diag(tui): trace view_stack push/pop for post-mortem black-screen repro	2026-05-04 20:24:39 -05:00
Hunter Bown	cba5e829fc	diag(tui): trace view_stack push/pop for post-mortem black-screen repro Maintainer-reported (handoff): after spawning a sub-agent in YOLO, the transcript renders solid black and scroll keys go dead, but footer + sidebar still render fine. The shape (black + dead input together) strongly suggests a `View` is on the stack that returns empty layout AND intercepts key events at the top level. The fix wants a tighter repro than we have today. Add `tracing::debug!` to every push / push_boxed / pop on `ViewStack` and to the implicit pops in `apply_action` (Close + EmitAndClose). Each line carries the `ModalKind` and post-action depth, so a future `RUST_LOG=deepseek_tui::view_stack=debug` capture will show exactly which view stayed pushed when the symptom recurred. No behavior change. The handoff explicitly suggested this as the first-look diagnostic step; we ship the diagnostic now so the next report comes with evidence. Refs the unresolved sub-agent black-transcript symptom captured in session-3 handoff. Will surface to a tracking issue once we have a concrete repro from the maintainer. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 20:17:43 -05:00
Hunter Bown	f6e4f634d4	Merge pull request #665 from Hmbown/fix/v0.8.11-stop-compacting-so-much fix(compaction): default off + raise unknown-model floor to 80% (#664, v0.8.11)	2026-05-04 20:05:57 -05:00
Hunter Bown	2640d8c091	Merge pull request #594 from Hmbown/fix/593-keyring-shadow fix(auth): dual-write API key to keyring + config to stop stale-keyring shadow (#593)	2026-05-04 20:05:52 -05:00
Hunter Bown	68f6d6995d	Merge pull request #592 from Hmbown/feat/584-compaction-telemetry feat(compaction): debug telemetry on summary calls + document framing fork (#584)	2026-05-04 20:05:46 -05:00
Hunter Bown	f1764704d8	Merge pull request #590 from Hmbown/fix/588-mirror-user-language fix(prompts): mirror user's language in reasoning + reply (#588)	2026-05-04 20:05:40 -05:00
Hunter Bown	d586ff05a8	Merge pull request #591 from Hmbown/fix/583-windows-bel-default-off fix(notifications): default Windows Auto fallback to Off, not BEL (#583)	2026-05-04 20:05:34 -05:00
Hunter Bown	a48f50d0db	Merge pull request #599 from Hmbown/fix/chinese-input-regression-tests test(tui): pin Chinese / IME character input contract for the composer	2026-05-04 20:05:24 -05:00
Hunter Bown	fc4f1e6564	fix(compaction): default to off + raise unknown-model floor to 80% (#664) Two coordinated changes that stop the engine from routinely rewriting the prompt prefix and burning DeepSeek V4's prefix-cache discount: 1. `Settings::default().auto_compact` flips from `true` to `false`. The `auto_compact = on` opt-in and the explicit `/compact` slash command stay available for users / agents that decide their workload benefits from compaction more than from cache stability. With V4's 1M-token window the user has plenty of headroom to run long sessions without auto-trimming, and aggressive compaction has been the dominant cost-spike vector in long sessions (the rewritten prefix invalidates ~90% of the cache discount on every compaction event). 2. `DEFAULT_COMPACTION_TOKEN_THRESHOLD` raised from `50_000` to `102_400` (80% of `DEFAULT_CONTEXT_WINDOW_TOKENS = 128_000`). This is the last-resort threshold used when `context_window_for_model` returns `None` — i.e. an unrecognised model id. Pre-v0.8.11 the fallback compacted at ~5% of a V4 window when model detection silently fell through. Now the fallback inherits the same late-trigger discipline as the V4 path, so model-detection drift doesn't quietly burn cache. Together: the two changes mean compaction never fires automatically by default, and even when explicitly opted in (or when the runtime-thread / capacity-flow paths invoke compaction with their own `enabled = true` config), the threshold is anchored at 80% of the model's context window (or 80% of the 128K default if the model is unknown), never below. Tests ===== - `default_settings_disable_auto_compact_to_protect_v4_prefix_cache` — pins the new default and explains the rationale inline. - `auto_compact_remains_explicitly_configurable` — unchanged; still asserts the `set("auto_compact", "on" | "off")` round-trip works. 
- `compaction_threshold_scales_with_context_window` — updated to assert `compaction_threshold_for_model("unknown-model") == 102_400`. - `v4_soft_caps_only_apply_to_v4_models` — updated to assert the unknown-model + reasoning-effort path also lands on the new floor. Verification ============ - `cargo fmt --all -- --check` clean. - `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features --locked -- -D warnings` clean. - `cargo test -p deepseek-tui --bin deepseek-tui --locked` → 2028 passed, 2 ignored. Refs #664 (handoff-instead-of-compact pattern, full implementation deferred). Behaviour-only change for v0.8.11; the larger agent-aware-handoff mechanism is its own design surface. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 19:28:02 -05:00
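The 80% rule above works out to 128_000 × 80 / 100 = 102_400 for an unrecognised model. A minimal sketch of that fallback follows. The constant and function names come from the commit message, but the body of `context_window_for_model` is a hypothetical lookup (the real model table and any V4-specific soft caps are not shown in the log), and the `deepseek-v4` id is a placeholder.

```rust
/// Fallback window used when the model id is not recognised (value from the log).
pub const DEFAULT_CONTEXT_WINDOW_TOKENS: u64 = 128_000;

/// Hypothetical lookup: treats any id containing "v4" as a 1M-token model.
pub fn context_window_for_model(model: &str) -> Option<u64> {
    if model.contains("v4") {
        Some(1_000_000)
    } else {
        None
    }
}

/// Threshold anchored at 80% of the context window, with the 128K
/// default standing in for unknown models.
pub fn compaction_threshold_for_model(model: &str) -> u64 {
    let window = context_window_for_model(model).unwrap_or(DEFAULT_CONTEXT_WINDOW_TOKENS);
    window * 80 / 100
}
```

Integer arithmetic keeps the result exact, which is what lets a test pin the fallback at `102_400`.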
Hunter Bown	03d72840e6	test(tui): pin Chinese / IME character input contract for the composer Adds two regression tests to crates/tui/src/tui/paste.rs::tests that nail down what is currently a working code path but was not previously covered by name: * `ime_chinese_chars_route_through_to_composer` — simulates the macOS/Windows IME commit pattern (one `KeyCode::Char(c)` event per Chinese codepoint with realistic ~50 ms gaps so the paste-burst heuristic doesn't false-positive). Asserts that "你好世界" lands in `app.input` verbatim and that `cursor_position` advances by one per codepoint, not per UTF-8 byte. The non-ASCII branch in `handle_paste_burst_key` (paste.rs:42) is the structural anchor; this test pins it so a future "filter to ASCII for the paste-burst detector" change would surface immediately. * `bracketed_paste_preserves_chinese_and_mixed_text` — pastes a mix of CJK and Latin text ("你好世界 hello 世界 café") through the bracketed-paste path (`insert_paste_text` → `normalize_paste_text` → `insert_str`) and confirms every codepoint survives plus the cursor tracks codepoints, not bytes. Why these tests, why now: a community report surfaced the question "can users input Chinese characters" without specifying the exact failure mode. Code review of the input data path turned up nothing broken, and these tests confirm the data path is correct end-to-end for both single-char IME commits and bulk bracketed paste. The tests serve as evidence (the data path is provably fine) and as a guard against future regressions to Chinese-input support. The tests cost nothing at runtime and build under `cfg(test)` only. 
If users are still seeing a Chinese-input failure after this lands, the candidates worth investigating in priority order are: (1) display layer — `wrap_input_lines` / `cursor_row_col` may be miscounting double-width CJK cells; (2) terminal-specific delivery — certain IMEs / terminals don't emit the events crossterm expects; (3) locale at launch — `LC_ALL=C` in non-interactive shells breaks UTF-8 input upstream of crossterm. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 14:50:24 -05:00
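The "codepoints, not bytes" contract that the two tests above pin can be illustrated with a small sketch. `Composer` is a hypothetical stand-in for the app state. Only the `input` and `cursor_position` field names and `insert_str` come from the log; `insert_char` is an assumed helper.

```rust
/// Hypothetical composer state: `cursor_position` counts chars, not bytes.
#[derive(Debug, Default)]
pub struct Composer {
    pub input: String,
    pub cursor_position: usize,
}

impl Composer {
    /// Insert one char at the cursor, translating the char-based cursor
    /// into a byte offset so multi-byte text is never split.
    pub fn insert_char(&mut self, c: char) {
        let byte_idx = self
            .input
            .char_indices()
            .nth(self.cursor_position)
            .map(|(i, _)| i)
            .unwrap_or(self.input.len());
        self.input.insert(byte_idx, c);
        self.cursor_position += 1;
    }

    /// Bulk insert (for example from a bracketed paste), one char at a time.
    pub fn insert_str(&mut self, s: &str) {
        for c in s.chars() {
            self.insert_char(c);
        }
    }
}
```

For "你好世界" the cursor ends at 4 while the string occupies 12 bytes in UTF-8, which is exactly the gap a byte-based cursor would get wrong.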
Hunter Bown	071d23a4b7	fix(auth): dual-write API key to keyring + config so stale keyring stops shadowing onboarding (#593) Reproduction (from the user who filed #593, also the reporter of #586): 1. At any prior point, the user runs `deepseek auth set --provider deepseek`, which writes to the OS keyring under the `deepseek` slot. 2. The key is later rotated, the prior install is replaced, or the user moves to a different account. 3. The user opens the TUI, gets the in-TUI onboarding screen, and pastes their fresh API key. 4. `submit_api_key` → `save_api_key` writes only to `~/.deepseek/config.toml`. 5. At request time, `Secrets::resolve` follows the documented `keyring → env → config-file` precedence, and the stale keyring entry shadows the fresh config.toml value. 6. API call goes out with the dead key, gets a 401, the TUI shows "no response" with no obvious diagnostic. The fix ======= `save_api_key` now writes to both layers when a keyring backend is reachable: * The config file remains the durable, inspectable record of the active key (works in npm installs, IDE terminals, headless CI — everywhere). v0.8.8 made this the canonical location for a reason. * The OS keyring entry is rewritten on every onboarding submit so a stale credential from a prior install is overwritten in place. `SavedCredential` gains a new `KeyringAndConfigFile { backend, path }` variant; the existing `ConfigFile(PathBuf)` variant remains the fallback when no keyring backend is reachable (or under `cfg(test)`, so the unit suite never pollutes the host keyring). The onboarding toast naturally reports the actual outcome via `SavedCredential::describe`, which now reads `OS keyring (system keyring) and ~/.deepseek/config.toml` for the common case. 
`save_api_key_for` (the multi-provider entry point) is updated to extract the path from either variant, so non-DeepSeek providers (OpenRouter / Novita / Fireworks / NIM / SGLang) continue writing provider-table entries to config.toml only, with no behavior change. `deepseek doctor` warning ========================= `run_doctor` now compares the keyring's `deepseek` slot against the config file's `api_key` slot. When both are present and differ, the report surfaces the discrepancy with copy-paste remediation — `deepseek auth set --provider deepseek` rewrites both layers in one shot, and the in-TUI onboarding now does the same. The check skips keyring probes for other providers because they don't write to the keyring today; probing absent slots only triggers macOS Always-Allow prompts for nothing. Why dual-write rather than keyring-only ======================================= A previous attempt (`4e360274`, never merged to main) swapped the write path to keyring-only. That hides the key from anyone who expected to see it under `~/.deepseek/config.toml` and breaks the "deepseek-tui works in every folder, in npm installs, in IDE terminals" promise of v0.8.8. Dual-write keeps the inspectable copy and adds the layered override that defeats stale-shadow without changing the visible mental model. Tests ===== * `saved_credential_describe_lists_both_targets_for_keyring_and_config` pins the toast text shape so the user sees both targets after onboarding. * The existing `save_api_key_writes_config_file_under_cfg_test` and `test_save_api_key_doesnt_match_similar_keys` continue to pass — under `cfg(test)` the keyring path is gated out, so the config-only outcome remains the test-time contract. Verification ============ * `cargo fmt --all -- --check` clean. * `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features --locked -- -D warnings` clean. * `cargo test -p deepseek-tui --bin deepseek-tui --locked` → 2029 passed, 2 ignored. Closes #593. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:55:07 -05:00
Hunter Bown	4e86a0fb8e	fix(prompts): expand language-mirroring carve-out + pin reasoning_content anchor Two small follow-ups to #588's review: * Gemini-code-assist suggested explicitly listing environment variables, command-line flags, and URLs alongside identifiers/tool-names in the carve-out clause, since those are exactly the categories an LLM is likeliest to "helpfully" translate (e.g. `--verbose` or `DEBUG=true`). Adopting verbatim — the additions are non-controversial and the failure mode they prevent is real. * Copilot flagged that the structural test only checked for the `## Language` heading. A future edit could keep the heading but silently weaken the section to a generic "respond in the user's language" directive, dropping the cross-cutting #588 commitment that the model's `reasoning_content` field — not just the visible reply — follows the user's language. Add a second structural anchor: assert the section body mentions `reasoning_content`. This matches the existing rlm test's "anchor tokens, not prose" convention (the API field name is the feature contract, not a wording choice). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:41:30 -05:00
Hunter Bown	a68c8dc974	docs(notifications): only completed turns notify; add Key Reference + WezTerm-on-Windows test Post-merge review feedback on #583 surfaced four small accuracy gaps: 1. The narrative docs in `docs/CONFIGURATION.md` and the inline comment in `config.example.toml` said the notification fires "when a turn takes longer than a threshold" — but the call site in `tui/ui.rs:928` is gated on `TurnOutcomeStatus::Completed`. Failed and cancelled turns are silent on purpose. Spell that out so users don't expect alerts on long failures. 2. The `notify_done` rustdoc still summarised `Auto` as "Osc9 for known terminals, Bel otherwise" — internally inconsistent with the new Windows-aware fallback documented one screen earlier on the `Method::Auto` enum and on `resolve_method`. Update the public rustdoc to point at the canonical resolution table on `resolve_method` and call out the `Off`-on-Windows branch. 3. The `## Key Reference` list in `docs/CONFIGURATION.md` had no entries for `[notifications].method`, `[notifications].threshold_secs`, or `[notifications].include_summary`. Other features with a dedicated subsection (e.g. `[memory].enabled`) are listed there too, so readers scanning the canonical key list could not discover the notification knobs. Added the three keys with cross-references to the Notifications subsection. 4. The Windows-only test only covered the unknown-`TERM_PROGRAM` → `Off` fallback. The positive path (known OSC-9 terminal still resolves to `Osc9`) was only tested via `iTerm.app`, which is a macOS-only program — Windows CI would still pass if the `WezTerm` arm of the match disappeared. Added `auto_detect_picks_osc9_for_wezterm_on_windows` so the WezTerm-on-Windows compatibility guarantee is exercised on the Windows runner. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:38:21 -05:00
Hunter Bown	dcadb5d388	fix(compaction): cache-hit % denominator + correct RUST_LOG filter docs The post-merge review on #584 surfaced two real bugs in the new summary-call telemetry: 1. The cache-hit percentage used `cache_hit + cache_miss` as the denominator. Providers that populate `prompt_cache_hit_tokens` but leave `prompt_cache_miss_tokens` as `None` (the rest of the codebase already infers misses from `input_tokens` for cost reporting and `/cache`) were silently reported as a flat 100% hit rate, masking the actual ratio. Switch the denominator to `usage.input_tokens` so the ratio matches how the rest of the project reasons about cache spread. Extract the calc into a small `summary_cache_hit_percent` helper so the invariant is unit-testable. 2. The doc comment on the emit site advertised that `RUST_LOG=deepseek_tui::compaction=debug` would also work as a filter. It does not — `EnvFilter` matches the explicit target string when one is set, so only `RUST_LOG=compaction=debug` activates the event. Drop the misleading parenthetical and call out the filter semantics explicitly. The new unit test pins the partial-telemetry guard so a future regression to the `(hit + miss)` denominator would be caught immediately. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 13:35:05 -05:00
Hunter Bown	7712a37272	feat(compaction): debug telemetry on summary calls + document framing fork Two follow-ups from the post-#572 cache-aligned compaction review (#584): 1. `should_use_cache_aligned_summary` now carries a doc comment that explains why the cache-aligned and fallback summary requests are framed differently. Cache-aligned replays the conversation as the model's own history under `system: None`; fallback reformats it into a `User:/Assistant:` transcript under a "concise summaries" system prompt. The fallback's external-transcript framing is more conservative for the older / smaller models the cache-aligned path explicitly excludes, so dropping the system prompt risks regressing those models without a corresponding gain. Unifying the two paths is a research question that wants an A/B summary-quality eval, not a drive-by cleanup — flagged here for a future PR rather than resolved silently. 2. `create_summary` now emits one `tracing::debug!` event per summary call carrying which path was chosen, the prompt-token count, and the cache-hit / miss split. Filter with `RUST_LOG=compaction=debug` (or the full module path `RUST_LOG=deepseek_tui::compaction=debug`). This makes the V4 prefix-cache win from #572 observable post-deploy without adding UI surface — the compaction summary call is the request we most expect to benefit, and previously we had no per-call signal for it. No UI surface changes. No model-facing prompt changes. Only adds the path-choice variable and the debug log; existing compaction tests (56 across `compaction::` and `models::`) still pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 12:54:59 -05:00
Hunter Bown	3636908bb9	fix(notifications): default Windows Auto fallback to Off, not BEL On Windows, the audio stack maps BEL (`\x07`) to the `SystemAsterisk` / `MB_OK` chime — the same sound applications use for error popups. So with the previous `Method::Auto` fallback to `Bel`, every successful turn-completion notification ended up sounding identical to a software error. Reported by a community user who described it as "the popup-error sound from a CAD program I used to use" (#583). resolve_method() now returns `Off` instead of `Bel` on Windows for unknown TERM_PROGRAM values. Known OSC-9-capable terminals (`iTerm.app`, `Ghostty`, `WezTerm`) still resolve to `Osc9` on every platform, so users running WezTerm on Windows keep getting real notifications. macOS and Linux behaviour is unchanged. Windows users who actively want an audible cue can opt back in by setting `[notifications].method = "bel"` in `~/.deepseek/config.toml`. Also: - Documents `[notifications]` in `docs/CONFIGURATION.md` with an explicit Windows note (the schema was previously undocumented). - Updates the inline comment in `config.example.toml` so users reading the seed config see the platform-specific behaviour. - Splits the existing `auto_detect_picks_bel_for_unknown` test into a Unix variant (`#[cfg(not(target_os = "windows"))]`) and adds a new Windows-gated test that asserts the `Off` fallback, so CI's Windows runner exercises the platform-specific path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 12:49:03 -05:00
Hunter Bown	a239968f5b	fix(prompts): mirror user's language in reasoning + reply DeepSeek V4's `reasoning_content` channel inherits the system prompt's English bias even when users write in Chinese, so the visible thinking trace stays in English alongside (sometimes mixed-language) replies. Adds a `## Language` section near the top of `base.md` directing the model to mirror the user's language in both `reasoning_content` and the final reply, with a carve-out so identifiers, file paths, tool names, and log lines stay in their original form (translating `read_file` to `读取文件` would break tool calls). Default remains English when no clear signal is present, so existing behaviour is preserved. Includes a structural test in `crates/tui/src/prompts.rs` that asserts the section ships in every mode (Agent / Yolo / Plan). Wording is intentionally not asserted on, per the existing test module's "don't fail on prose" comment. Reported via the project Telegram community (#588). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 12:10:23 -05:00
Hunter Bown	a61da8de2c	Merge pull request #589 from Hmbown/docs/v0.8.10-readme-refresh docs: refresh README + zh-CN README for v0.8.10	2026-05-04 12:03:21 -05:00
Hunter Bown	b5b4d49324	docs: refresh README + zh-CN README for v0.8.10 - Add "What's new in v0.8.10" section covering hotfixes, runtime API expansion, cache-aware compaction, glibc 2.28 baseline, markdown rendering overhaul, and platform fixes - Remove verbose per-version sections (v0.7.6 through v0.8.8) — those belong in CHANGELOG.md, both READMEs now link to it - Condense "How it's wired" architecture blurb to 2 sentences - Restructure quickstart for flow: npm install → API key → platform notes → China mirror → providers - Trim Linux ARM64 section, drop ancient v0.8.7 workaround - Drop redundant /attach shortcut (covered by @path) - Tighten configuration env table to one row per variable - Cut English skills section (numbered publish workflow duplicated from SKILL.md) - Add 5 missing docs to Documentation table (INSTALL, MEMORY, SUBAGENTS, KEYBINDINGS, CHANGELOG) - Bring zh-CN README to parity: add Key Features, How it's wired, Thanks, Star History; preserve locale-switching guide English: 324 lines (was ~1090). Chinese: 333 lines (was ~520, now content-equivalent to English).	2026-05-04 11:59:55 -05:00
Hunter Bown	a16ff2890f	Merge pull request #587 from Hmbown/fix/release-arm64-target-path fix(release): use rust target (not glibc-versioned) path for binary copy	2026-05-04 11:24:39 -05:00
Hunter Bown	0136935b8d	fix(release): use rust target (not glibc-versioned) for binary copy path The aarch64 deepseek build in release.yml run 25329602631 succeeded in 4m 53s but the rename step failed: cp: cannot stat 'target/aarch64-unknown-linux-gnu.2.28/release/deepseek' cargo zigbuild parses `aarch64-unknown-linux-gnu.2.28` by passing `aarch64-unknown-linux-gnu` to cargo and the `.2.28` glibc minimum to zig's CC. The cargo target output dir is therefore `target/aarch64-unknown-linux-gnu/release/`, never the glibc-versioned form. v0.8.9 release.yml hard-coded the rust triple in the rename step and worked. v0.8.10 added `target_zig: <triple>.<glibc>` to the matrix and switched the rename step to `${{ matrix.target_zig || matrix.target }}`, which silently became wrong for every zigbuild matrix leg. This commit: - Always uses `matrix.target` (rust triple) for the copy source path. - Adds a defensive `find target -name "${binary}"` debug listing if the expected binary isn't at the rust-target path, so future cargo-zigbuild output-dir changes are visible in the build log rather than just "No such file".	2026-05-04 11:24:08 -05:00
Hunter Bown	e7a21012b8	Merge pull request #585 from Hmbown/fix/release-arm64-rustls fix(release): switch TUI reqwest from native-tls to rustls (unblock v0.8.10 aarch64)	2026-05-04 11:10:28 -05:00
Hunter Bown	6ba6add03d	fix(release): switch TUI reqwest from native-tls to rustls The aarch64-unknown-linux-gnu release build for `deepseek-tui` failed in release.yml run 25327475634 with: openssl-sys v0.9.111: 'openssl/opensslconf.h' file not found `crates/tui/src/main.rs` was the only crate in the workspace pulling `reqwest` with `default-features = false, features = ["native-tls", ...]` — every other crate (including the dispatcher in `crates/cli`) already inherits the workspace default `["json", "rustls"]`. The aarch64 leg builds with `cargo zigbuild --target aarch64-unknown-linux-gnu.2.28`, whose zig sysroot does not ship openssl headers; the matching native-tls job for v0.8.9 succeeded by chance against an earlier runner image but the current `ubuntu-24.04-arm` image no longer satisfies openssl-sys's header probe under zigbuild. Switching the TUI's reqwest features from `native-tls` to `rustls` brings it in line with the rest of the workspace and removes nine crates from the build graph entirely (`openssl`, `openssl-sys`, `openssl-probe`, `openssl-macros`, `native-tls`, `hyper-tls`, `tokio-native-tls`, `foreign-types`, `foreign-types-shared`). reqwest 0.13.1 already uses `rustls-platform-verifier` for OS trust-store integration, so end-user TLS behavior against api.deepseek.com remains equivalent. Verified locally: - cargo clippy --workspace --all-targets --all-features --locked passes - cargo build --release -p deepseek-tui --locked succeeds - cargo fmt --all -- --check is clean - no source code in `crates/` references native-tls / openssl directly This is a release-pipeline-only fix; no user-visible feature changes.	2026-05-04 11:00:54 -05:00
Hunter Bown	e9b472e972	Merge pull request #571 from Hmbown/chore/v0.8.10-release chore(release): bump to v0.8.10	2026-05-04 10:21:08 -05:00
Hunter Bown	a92c449de5	chore(release): bump version to 0.8.10 + CHANGELOG Picks up the v0.8.10 patch release contents: * Daemon API quartet for whalescale-desktop integration (#561-#564, PR #567). * Bug cluster: macOS seatbelt cargo registry (#558), MCP SIGTERM shutdown (#420), Linux PR_SET_PDEATHSIG (#421). * npm install on older glibc fix (#555/#560 via #556 + #565). * Shell cwd workspace-boundary validation (#524). * Memory help/docs polish (#497 via #569). * Onboarding language picker (#566). * Whale nicknames interleaved with Simplified Chinese. First-time contributors credited in CHANGELOG: @staryxchen, @shentoumengxin, @Vishnu1837, @20bytes. Workspace `Cargo.toml`, all 9 internal path-dep version pins, and `npm/deepseek-tui/package.json` all bumped to 0.8.10. `Cargo.lock` regenerated and committed alongside. Verified locally: * cargo fmt --all -- --check * cargo clippy --workspace --all-targets --all-features --locked -- -D warnings * cargo test --workspace --all-features --locked * bash scripts/release/check-versions.sh Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 10:13:26 -05:00
Hunter Bown	36acd1fa10	Merge pull request #572 from Hmbown/feat/v0.8.10-features feat(v0.8.10): shell_env hook + toast stack + @-mention frecency + keybindings audit (#456 #439 #441 #559)	2026-05-04 10:10:57 -05:00
Hunter Bown	64cf0d8e46	Merge remote-tracking branch 'origin/main' into feat/v0.8.10-features	2026-05-04 10:03:06 -05:00
Hunter Bown	0b5da7a34b	Merge pull request #578 from loongmiaow-pixel/docs/windows-china-install-guide docs(install): add Windows build guide, AV troubleshooting, and China mirror details	2026-05-04 10:02:53 -05:00
Wu Yuxin	6bcf07a479	Update crates/tui/src/tui/markdown_render.rs Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-05-04 10:01:52 -05:00
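
The denominator change described in commit dcadb5d388 above can be illustrated with a short Rust sketch. This is not the project's code: only the helper name `summary_cache_hit_percent` and the field names `prompt_cache_hit_tokens`, `prompt_cache_miss_tokens`, and `input_tokens` come from the commit message. The signatures, the `Option<f64>` return type, the clamp, and the `legacy_hit_percent` function are assumptions made for illustration.

```rust
/// Hypothetical sketch of the old calculation: the denominator is
/// `hit + miss`, and a missing miss count is treated as zero.
fn legacy_hit_percent(hit: Option<u32>, miss: Option<u32>) -> Option<f64> {
    let hit = hit?;
    let total = hit + miss.unwrap_or(0);
    if total == 0 {
        return None;
    }
    Some(f64::from(hit) / f64::from(total) * 100.0)
}

/// Hypothetical sketch of the fixed calculation: the denominator is the
/// prompt's `input_tokens`, so a provider that omits the miss count no
/// longer reads as a flat 100% hit rate.
fn summary_cache_hit_percent(hit: Option<u32>, input_tokens: u32) -> Option<f64> {
    let hit = hit?;
    if input_tokens == 0 {
        return None;
    }
    // Assumption: clamp so an over-reported hit count cannot exceed 100%.
    let hit = hit.min(input_tokens);
    Some(f64::from(hit) / f64::from(input_tokens) * 100.0)
}

fn main() {
    // Provider reports 300 cached tokens out of a 1200-token prompt and
    // leaves the miss count unset.
    println!("old: {:?}", legacy_hit_percent(Some(300), None));
    println!("new: {:?}", summary_cache_hit_percent(Some(300), 1200));
}
```

With these inputs the old calculation yields 100% and the sketch of the fixed one yields 25%, which is the partial-telemetry case the commit's unit test is said to pin.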

603 Commits