dgf1988/codewhale

Author	SHA1	Message	Date
CrepuscularIRIS	97a77d82f0	fix(settings): detect TERM_PROGRAM=ghostty and apply low-motion cap (#1445) Ghostty's GPU compositor flash-renders each full-screen repaint at 120 FPS, producing visible flicker identical to the VS Code issue fixed in #1356. Extend apply_env_overrides() to also force low_motion=true + fancy_animations=false when TERM_PROGRAM=ghostty, capping redraws to 30 FPS. Add ghostty_term_program_forces_low_motion_on test mirroring the existing vscode test, serialised through the process-wide lock_test_env() guard. Fixes #1445 Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-11 13:09:25 -05:00
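The override described in 97a77d82f0 can be sketched roughly as follows. This is a minimal illustration, not the project's `apply_env_overrides()`: the `MotionSettings` struct and the `apply_term_program_override` function are hypothetical names, and the terminal identifier is passed in as a parameter so the sketch does not mutate process environment.

```rust
/// Hypothetical subset of the settings touched by the override.
#[derive(Debug, Clone, PartialEq)]
pub struct MotionSettings {
    pub low_motion: bool,
    pub fancy_animations: bool,
}

/// Terminals the commit log describes as flicker-prone under full repaints.
/// Exact, case-sensitive matching is an assumption of this sketch.
fn is_flicker_prone(term_program: &str) -> bool {
    matches!(term_program, "ghostty" | "vscode")
}

/// Force the low-motion profile when TERM_PROGRAM names a flicker-prone terminal.
/// `term_program` stands in for the value of the TERM_PROGRAM variable.
pub fn apply_term_program_override(settings: &mut MotionSettings, term_program: Option<&str>) {
    if term_program.is_some_and(is_flicker_prone) {
        settings.low_motion = true;
        settings.fancy_animations = false;
    }
}
```

A caller would typically pass `std::env::var("TERM_PROGRAM").ok().as_deref()` as the second argument.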
AutoGHClaw	4bc5375fa6	fix(render): replace SMP emoji with stable BMP glyphs to fix layout on Windows terminals Emoji in U+1F000+ have no stable column-width contract across terminal emulators. On cmd.exe/PowerShell they render as 1-column placeholder boxes even though unicode_width reports 2; on WezTerm/Alacritty with certain font stacks the rendered width can be off by one column. Both cases break layout arithmetic in the header and file-tree widgets. Changes: - header.rs: replace 🐳 (U+1F433, 2-wide) with ◆ (U+25C6, always 1-wide) in the "max" reasoning-effort chip - file_tree.rs: drop the 📁/📄 (U+1F4C1/U+1F4C4) entry-icon prefix; the ▼/▶ expand marker already distinguishes dirs from files Fixes #1314	2026-05-11 13:09:25 -05:00
xiedongsheng	1352a65ac2	fix: reject no-op edit_file replacements	2026-05-11 13:09:22 -05:00
Hunter Bown	195388b62b	docs(release): add final v0.8.29 late fixes	2026-05-11 07:36:49 -05:00
reidliu41	7ac79799e7	Fix flaky workspace and env-sensitive tests Tighten session workspace Git root detection so invalid parent .git markers are not treated as real repositories. This prevents unrelated temporary workspaces from being scoped together when a stray .git directory exists under /tmp. Also move env-mutating tests onto the shared test env lock and make the streamable HTTP MCP mock server serve until the test ends, avoiding parallel test races and premature mock server shutdowns. (cherry picked from commit eecfc16fc99d072ac389980ec9e5e3f208297b8e)	2026-05-11 07:34:50 -05:00
reidliu41	46ab1fdf62	Scope skill completions to /skill Keep individual skills out of the top-level slash command menu so large skill collections do not crowd out built-in commands. Skills still complete after `/skill`, including both the full skill list after `/skill ` and prefix matches after `/skill <prefix>`. (cherry picked from commit 57f8e3ad84dad9cf46290c0dc23e2b26504196df)	2026-05-11 07:34:50 -05:00
Beltran12138	ca284d1fc0	fix: treat deepseek-chat / deepseek-reasoner aliases as reasoning models `requires_reasoning_content()` only matched literal `deepseek-v4*` model IDs, but `deepseek-chat` and `deepseek-reasoner` are DeepSeek's public API aliases that resolve server-side to `deepseek-v4-flash` and `deepseek-v4-pro` respectively. Both have thinking mode enabled by default, so when a user sets `default_text_model = "deepseek-chat"` (the value `deepseek auth` / onboarding writes), the thinking-mode sanitizer is skipped and tool-call assistant messages are sent without `reasoning_content`. DeepSeek then rejects the second turn with: HTTP 400: The `reasoning_content` in the thinking mode must be passed back to the API. Extend `requires_reasoning_content()` to recognise the `deepseek-chat` and `deepseek-reasoner` alias prefixes (covering suffixed variants like `deepseek-chat:free` used by proxied deployments). The explicit `reasoning_effort = "off"` escape hatch still disables replay via the unchanged `should_replay_reasoning_content()` check. Adds `alias_thinking_detection_tests` covering the aliases, explicit V4 IDs (regression guard), excluded non-thinking models, suffixed variants, and the reasoning-off override. Refs: https://api-docs.deepseek.com/guides/thinking_mode (cherry picked from commit 46941142123827fa16fc9a1fb41b78c293e935ce)	2026-05-11 07:34:50 -05:00
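The prefix check described in ca284d1fc0 might look like the sketch below. The function name comes from the commit message, but the body is an assumption: the real function may normalise case or consult other settings, and the `reasoning_effort = "off"` escape hatch lives in a separate check that is not modelled here.

```rust
/// Prefixes the commit message names as thinking-mode models.
/// The list is taken from the log; the matching strategy is a sketch.
const THINKING_MODEL_PREFIXES: [&str; 3] = ["deepseek-v4", "deepseek-chat", "deepseek-reasoner"];

/// Returns true when the model ID should have `reasoning_content` replayed.
/// Prefix matching also covers suffixed variants such as `deepseek-chat:free`.
pub fn requires_reasoning_content(model_id: &str) -> bool {
    THINKING_MODEL_PREFIXES
        .iter()
        .any(|prefix| model_id.starts_with(prefix))
}
```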
samhandsome	9279d92f4e	fix(tools): hydrate deferred schemas before execution Return a schema hydration result on first deferred tool use so the model can retry with visible parameters instead of executing guessed arguments. Add edit_file coverage for old_string/new_string aliases. (cherry picked from commit 91be171cc15dd895170bd1a486445f5e05356b57)	2026-05-11 07:34:50 -05:00
Hunter Bown	cd3767c75a	docs(release): credit late v0.8.29 bug-fix cherry-picks	2026-05-10 23:29:33 -05:00
reidliu41	0164ee869e	fix: parse CRLF SSE MCP events Accept both LF and CRLF SSE event separators in the MCP SSE transport so uvicorn and FastMCP servers can publish endpoint events correctly. Add regression coverage for CRLF endpoint discovery.	2026-05-10 23:27:51 -05:00
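One way to accept both LF and CRLF separators, as 0164ee869e describes, is to split on lines with `str::lines()`, which strips either ending, and treat a blank line as the end of an event. The sketch below is not the project's transport code: `SseEvent` and `parse_sse_events` are hypothetical names, only the `event` and `data` fields are handled, and an event without a terminating blank line is treated as incomplete and dropped.

```rust
/// Hypothetical parsed event: optional `event:` name plus joined `data:` lines.
#[derive(Debug, PartialEq)]
pub struct SseEvent {
    pub event: Option<String>,
    pub data: String,
}

/// Strip the single optional space that may follow the field colon.
fn field_value(rest: &str) -> &str {
    rest.strip_prefix(' ').unwrap_or(rest)
}

/// Parse complete events from a buffer that may use LF or CRLF line endings.
pub fn parse_sse_events(input: &str) -> Vec<SseEvent> {
    let mut events = Vec::new();
    let mut event_name: Option<String> = None;
    let mut data_lines: Vec<String> = Vec::new();

    // `lines()` removes a trailing "\n" or "\r\n", so both encodings look alike here.
    for line in input.lines() {
        if line.is_empty() {
            // A blank line dispatches the event collected so far, if any.
            if event_name.is_some() || !data_lines.is_empty() {
                events.push(SseEvent {
                    event: event_name.take(),
                    data: data_lines.join("\n"),
                });
                data_lines.clear();
            }
        } else if let Some(rest) = line.strip_prefix("event:") {
            event_name = Some(field_value(rest).to_string());
        } else if let Some(rest) = line.strip_prefix("data:") {
            data_lines.push(field_value(rest).to_string());
        }
    }
    events
}
```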
Wenjunyun123	21867b2d3a	style(tui): format footer width tests	2026-05-10 23:27:21 -05:00
Wenjunyun123	80a8559e29	fix(tui): keep footer chips within available width	2026-05-10 23:27:21 -05:00
reidliu41	4afa88ebfc	fix(tui): ignore leaked mouse reports in composer PR #1421 from @reidliu41. Filters SGR mouse-report bursts that some terminal chains leak into stdin while mouse capture is enabled, while preserving ordinary coordinate-like text.	2026-05-10 23:27:14 -05:00
Hunter Bown	f5784909cf	security(tui): avoid session-id taint in resume hint	2026-05-10 22:20:42 -05:00
Hunter Bown	678198440a	docs(prompts): capture the Chinese-native-model design tension in locale-preamble docstring Community feedback on the v0.8.29 follow-up (WeChat thread on #1118) made a sharp point: the standard Western-LLM advice "always write prompts in English" doesn't transfer to DeepSeek V4, which is a Chinese-first multilingual model with a Chinese-co-trained tokenizer. `你好` typically encodes to ~1 token, not 2; the "Chinese is expensive" framing is folk wisdom from a different model family. The naïve translation of that argument is "ship a fully translated base.md per locale" — and that's the move v0.9.x might eventually make. For v0.8.29 we deliberately stop at the bookend (preamble + closer in native script, English middle) because of three concrete costs: 1. Drift risk between N translated copies of a 200-line prompt — every rule change has to land in lockstep. 2. Cache stability — one English `base.md` lets us share prefix-cache state across locales for the workspace- static portion of the prompt. 3. Translation QA expense — 95% right is bad, because the missing 5% becomes silent behavior divergence. Captured all of this in the `locale_reinforcement_preamble` docstring so the next maintainer reading the prompt-assembly code sees the design tension and the cost model explicitly, and knows full translation is the natural next step if the bookend stops being sufficient. No runtime change; documentation only. Credit @MuMu (via Hunter) for the bookend pattern that motivates this design, and the unnamed WeChat commenter who made the tokenizer-economics argument that motivates this docstring expansion. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 22:04:49 -05:00
Hunter Bown	5b6902006c	security(mcp): defuse GitGuardian Basic-Auth-String false positive in proxy-redact test GitGuardian's "Basic Auth String" detector flagged commit `09dcbede0` because the test fixture for `redact_proxy_userinfo_strips_password` contained literal URL strings of the shape `scheme://username:password@host` — `alice:hunter2` and `bob`. The values are obvious placeholders (not real credentials), but the detector's regex is shape-based: any scheme-prefixed colon-separated userinfo segment terminated by `@` matches, regardless of whether the content is a real secret. The test still needs to exercise the redaction logic for credential- carrying proxy URLs. Fix: assemble the URLs via `format!` from explicit placeholder constants (`PLACEHOLDER_USER`, `PLACEHOLDER_PASS`) so the literal source text never contains a contiguous `scheme://name:secret@host` pattern. Runtime behavior is identical — `redact_proxy_userinfo` receives the same string and returns the same redacted form. Also reworded the function docstring (line 61) and the inline comment at the warning log site (line 993) to describe the userinfo segment without spelling out a literal `user:pass@host` shape that the same detector could later trip on. Two preexisting fixtures elsewhere in this file (`mask_url_secrets("https://user:s3cret@…")` at line 3155 and its docstring at line 46) have been on `main` for several releases and are presumably already on GitGuardian's allowlist — left untouched in this commit so the fix scope stays minimal. If they re-fire on a future scan, the same `format!` pattern can be applied there. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:39:23 -05:00
Hunter Bown	b6cf0199de	feat(prompts): locale-native closing bookend to defeat English-context drift (#1118 follow-up #2 ) The opening preamble from commit `47f6d69e5` works for the first few turns, but as English context accumulates in the session (code read into the transcript, error logs, file listings, search results, project context), the transformer's recency bias pulls `reasoning_content` back toward English even when the user keeps writing in their own language. The empirical fingerprint is "model thinks in Chinese for the first 3-4 turns, then quietly switches to English thinking around turn 5 as more code lands in context." Community feedback (WeChat thread on #1118 — @MuMu describes an XML-tagged "bilingual bookend" pattern they used in another project, and @益达 confirms the translation-accuracy problem with fully-translated prompts) pointed at the bookend pattern: keep the rule-heavy middle of the prompt in English (single source of truth, model is natively multilingual), but reinforce the locale directive at BOTH ends in native script. The opening anchors behavior at session start; the closer sits at the maximum- recency position right before the user's next message and re-asserts the rule each turn. `locale_reinforcement_closer()` returns Some for `zh-Hans` / `zh-CN` / `zh`, `ja` / `ja-JP`, `pt-BR` / `pt`. English (and unmatched locales) return None — system prompt stays byte-identical to the previous behavior for English users. The closer is appended after the previous-session handoff block (the existing "last block" position), so it's the very last content before the user's first message. Any future block that needs to sit closer to the user should be added BEFORE the closer with an updated test invariant. Three new tests pin the contract: * `locale_reinforcement_closer_returns_native_script_for_supported_locales` — each supported locale's closer is in its native script and explicitly mentions `reasoning_content` (the V4 knob). 
* `system_prompt_bookends_zh_hans_with_preamble_and_closer` — the full zh-Hans system prompt contains both `## 语言要求` (preamble) and `## 语言再次提醒` (closer), in that order, and no other top-level `##` section follows the closer. * `system_prompt_skips_locale_preamble_for_english` (extended) — English locale gets neither the preamble nor any of the three locale closers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:32:08 -05:00
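The bookend assembly described in b6cf0199de can be illustrated with a small sketch. The locale list is taken from the commit message; `has_locale_bookend` and `assemble_system_prompt` are hypothetical names, and the actual preamble and closer texts are passed in as parameters because the log does not reproduce them.

```rust
/// Locale tags the commit message lists as receiving a native-script bookend.
pub fn has_locale_bookend(locale_tag: &str) -> bool {
    matches!(
        locale_tag,
        "zh-Hans" | "zh-CN" | "zh" | "ja" | "ja-JP" | "pt-BR" | "pt"
    )
}

/// Assemble a prompt with an optional native-script preamble before the
/// English base and an optional closer after it. With neither present the
/// result is byte-identical to `base`, matching the English-locale contract.
pub fn assemble_system_prompt(base: &str, preamble: Option<&str>, closer: Option<&str>) -> String {
    let mut prompt = String::new();
    if let Some(text) = preamble {
        prompt.push_str(text);
        prompt.push_str("\n\n");
    }
    prompt.push_str(base);
    if let Some(text) = closer {
        // The closer is appended last so it sits at the maximum-recency position.
        prompt.push_str("\n\n");
        prompt.push_str(text);
    }
    prompt
}
```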
Hunter Bown	09dcbede08	security(mcp): redact HTTP(S)_PROXY userinfo before logging malformed-URL warning PR #1408's MCP proxy support (commit `865db6248`) added a `tracing::warn!` for malformed `HTTPS_PROXY` values that included the raw URL via `proxy = %proxy_url`. With v0.8.29's new file-backed tracing subscriber (writing to `~/.deepseek/logs/tui-YYYY-MM-DD.log`), that means a corporate proxy URL of the shape `http://user:pass@proxy.example/` would leak the password to disk whenever reqwest rejected the URL. Fix: redact the `user:pass@` userinfo segment before logging via a new `redact_proxy_userinfo()` helper. `http://alice:hunter2@proxy/` becomes `http://***@proxy/`. URLs without userinfo are returned unchanged; the `@` is only treated as a userinfo separator when it appears before any `/`, `?`, or `#` (so path-embedded `@` doesn't trigger redaction). Garbage input (no `://`) passes through — the warning log site is already in the malformed-URL failure path. Pinned by `redact_proxy_userinfo_strips_password` covering five cases: full creds, user-only, no-userinfo, path-only-`@`, and garbage. The non-malformed path (where reqwest accepts the URL) never logs the URL at all, so this is the only leak vector. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:26:45 -05:00
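The redaction rules in 09dcbede08 are specific enough to sketch. The function name and the five cases come from the commit message; the body below is an independent reconstruction, and redacting a user-only segment to `***@` is an assumption, since the log does not state the exact output for that case.

```rust
/// Replace the userinfo segment of a URL with `***` before logging.
/// Sketch only: the project's helper may differ in detail.
pub fn redact_proxy_userinfo(url: &str) -> String {
    // Garbage input without a scheme separator passes through unchanged.
    let Some(scheme_end) = url.find("://") else {
        return url.to_string();
    };
    let authority_start = scheme_end + 3;
    let rest = &url[authority_start..];

    // The authority ends at the first '/', '?', or '#', so an '@' that
    // appears later (in the path or query) is never treated as userinfo.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..authority_end];

    match authority.rfind('@') {
        Some(at) => format!(
            "{}***@{}{}",
            &url[..authority_start],
            &authority[at + 1..],
            &rest[authority_end..]
        ),
        None => url.to_string(),
    }
}
```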
Hunter Bown	47f6d69e58	feat(prompts): locale-native reinforcement preamble for non-English UIs (#1118 follow-up) `base.md` stays the single source of truth (English meta-language, DeepSeek V4 is natively multilingual, prefix-cache stable across users in the same locale). For non-English UI locales we now prepend a short locale-native passage so the model's first exposure to the prompt is an explicit "think and reply in {locale}" directive in the user's own writing system — defeats the failure mode reported in #1118 and visible in the recent WeChat screenshot where a user with `locale = zh-Hans` configured still got English thinking because the task context (Rust code, English log lines) overpowered the inferred `## Environment.lang` signal. Locales supported (matched against `PromptSessionContext.locale_tag`, which the caller resolves from `Settings`): * `zh-Hans` / `zh-CN` / `zh` — Simplified Chinese preamble * `ja` / `ja-JP` — Japanese preamble * `pt-BR` / `pt` — Brazilian Portuguese preamble English (and any unmatched locale) returns `None` and the system prompt is byte-identical to v0.8.28 — so this is a strict additive change for non-English users. Each preamble is ~6-8 lines and explicitly: * names the runtime ("DeepSeek TUI") so the model knows it's not switching personas * declares the directive for BOTH `reasoning_content` and the final reply (the V4 knob that #1118 hinges on) * preserves tool-name immutability (`read_file`, `exec_shell`, paths, env vars, CLI flags, URLs stay in their original form) * handles mid-session language switches (next-turn switching) * defers to explicit user override ("think in English" etc.) Three new tests pin the contract: * `locale_reinforcement_preamble_returns_native_script_for_supported_locales` — preamble must be in the locale's native script, must mention `reasoning_content`, and must call out tool-name immutability; English/unknown locales must return `None`. 
* `system_prompt_prepends_locale_preamble_for_zh_hans` — the preamble must appear before the English base prompt body in the assembled system prompt (attention precedence + cache ordering both depend on this). * `system_prompt_skips_locale_preamble_for_english` — English locale must produce a byte-identical prompt to the pre-feature behavior (no zh / ja / pt strings anywhere). Prefix-cache impact: per-locale cache shards stay intact (a zh-Hans user's prompt shares the preamble across turns; an English user's prompt is unchanged). Cross-locale cache is invalidated, which is correct — different users in different locales were never sharing cache for the right reasons. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:11:37 -05:00
Hunter Bown	f03df9d3e0	chore(web): regenerate facts.generated.ts for v0.8.29 Bumps version 0.8.26 → 0.8.29 and toolCount 61 → 62 (new tool from the v0.8.28 / v0.8.29 cycle landed on the canonical surface). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:01:38 -05:00
Hunter Bown	3bac8d4a1c	docs(changelog,readme): add [0.8.29] entry + What's New sections CHANGELOG.md gains a `[0.8.29]` section above `[0.8.28]` covering the scroll-demon structural fix, the #1395 wrong-project Ctrl+R fix, MCP HTTP proxy support, MCP discovery skip-malformed, note commands, AGENTS.md merge, CJK Auto routing, sync-cnb hardening, and the 4-PR test coverage batch. README.md and README.zh-CN.md "What's New" sections rewritten to match (v0.8.27 → v0.8.29). The `prompts::tests::changelog_entry_exists_for_current_package_version` integration test pins the CHANGELOG-must-have-current-version invariant. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 21:01:12 -05:00
Hunter Bown	865db62487	feat(mcp): honor HTTP(S)_PROXY env vars on MCP HTTP transport (#1408) `reqwest 0.13` does not auto-detect proxy env vars by default, so MCP HTTP connections were bypassing the proxy that every other tool on the user's box (curl, npm, git, …) was using. Users behind corporate egress proxies and China-mainland setups routing through a local Clash / Shadowsocks tunnel had their MCP servers fail to connect or silently leak around the tunnel. When the `MCP HTTP transport client builder` runs, we now read `HTTPS_PROXY` / `https_proxy` / `HTTP_PROXY` / `http_proxy` (first non-empty wins) and route via `reqwest::Proxy::all(...)`. `NO_PROXY` is honored via `reqwest::NoProxy::from_env()`. Malformed proxy URLs log a `tracing::warn!` (no scroll-demon leak; see runtime_log) and the connection proceeds without a proxy rather than failing the whole MCP attach. Closes #1408. Thanks @hlx98007 for the report. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:58:33 -05:00
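The "first non-empty wins" lookup order from 865db62487 could be sketched as below. The variable order comes from the commit message; `first_proxy_url` is a hypothetical helper, and the environment is abstracted behind a lookup closure so the sketch stays free of process-wide state. Wiring the result into `reqwest` is left out.

```rust
/// Proxy variables in the precedence order given in the commit message.
const PROXY_VARS: [&str; 4] = ["HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"];

/// Return the first proxy variable whose value is set and not blank.
/// `lookup` stands in for an environment read such as `std::env::var(name).ok()`.
pub fn first_proxy_url<F>(lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    PROXY_VARS
        .iter()
        .copied()
        .filter_map(|name| lookup(name))
        .find(|value| !value.trim().is_empty())
}
```

In production code the closure would be `|name| std::env::var(name).ok()`.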
reidliu41	6d099d425c	feat: add note management commands Extend /note beyond append-only usage with list, show, edit, remove, clear, path, and explicit add subcommands. Keep existing /note <text> behavior compatible, preserve the existing --- separated file format, and number notes only at display time so the stored notes stay clean. Update command help, localization, docs, and tests.	2026-05-10 20:56:39 -05:00
Vince	5d3ec1b439	fix(mcp): skip malformed items during discovery instead of dropping all When the MCP server returns a list where one entry cannot be deserialized (e.g. a tool missing the required `name` field), the previous code called `.unwrap_or_default()` on the whole-list deserialization, silently discarding every valid entry in the page. Switch all four discovery functions (tools, resources, resource-templates, prompts) to iterate over the JSON array and deserialize each item individually, skipping only those that fail. This ensures a single non-conformant entry never hides the rest of the list. Related: #1250	2026-05-10 20:56:35 -05:00
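The difference between whole-list and per-item deserialization in 5d3ec1b439 can be shown without a JSON library. In this sketch each raw entry is modelled as an `Option<&str>` name, where `None` stands in for an entry missing the required `name` field; `Tool`, `parse_tool`, and both discovery functions are hypothetical illustrations rather than the project's code.

```rust
#[derive(Debug, PartialEq)]
pub struct Tool {
    pub name: String,
}

/// Stand-in for deserializing one list entry; fails when `name` is absent.
fn parse_tool(raw: &Option<&str>) -> Result<Tool, String> {
    raw.map(|name| Tool { name: name.to_string() })
        .ok_or_else(|| "missing required `name` field".to_string())
}

/// Old behaviour: one bad entry fails the whole list, and the default
/// (an empty list) silently replaces every valid entry.
pub fn discover_all_or_nothing(raw: &[Option<&str>]) -> Vec<Tool> {
    raw.iter()
        .map(parse_tool)
        .collect::<Result<Vec<_>, _>>()
        .unwrap_or_default()
}

/// New behaviour: parse each entry on its own and skip only the failures.
pub fn discover_skipping_malformed(raw: &[Option<&str>]) -> Vec<Tool> {
    raw.iter().filter_map(|entry| parse_tool(entry).ok()).collect()
}
```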
Hunter Bown	5f976d7e20	feat(header): surface runtime version as a header chip Adds `v0.8.29` (the workspace `CARGO_PKG_VERSION` resolved at compile time) to the right cluster of the header bar, after the existing provider / effort / Live / context chips. Users have been asking for the live version somewhere in the UI — previously only reachable via `deepseek --version` (CLI flag, not in the TUI) or `/status` (slash command, requires action). The chip is the lowest-priority element in the width cascade in `right_spans()`: under tight terminal width it drops before any of the existing status chips. Two pinned tests: * `header_renders_version_chip_when_width_allows` — at width 120 the chip must appear with the current `env!("CARGO_PKG_VERSION")`. * `narrow_header_drops_version_chip_before_dropping_mode` — at width 12 (extreme narrow) the chip drops while the mode label survives, matching the cascade priority. Styled with `palette::TEXT_HINT` so it sits visually behind the streaming dot / context signal — present but not distracting. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:47:56 -05:00
Hunter Bown	7630259748	chore(release): bump to 0.8.29 Workspace + per-crate path-dep version pins, npm wrapper, and deepseekBinaryVersion all advance 0.8.28 -> 0.8.29. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:40:20 -05:00
Hunter Bown	e4255539fc	fix(tui): close the scroll-demon class structurally (#1085 regression) Issue #1085 ("TUI viewport drifts down inside alt-screen at end of turn, leaving top rows blank, esp. after sub-agents") was closed in v0.8.18 by adding `reset_terminal_viewport()` to home the cursor on TurnComplete / focus / resize. v0.8.27's flicker fix (`abf3fa66f`) dropped the `\x1b[2J\x1b[3J` deep-clear from that path to stop the double-clear flicker on Ghostty / VSCode / Win10 conhost. That left ratatui's incremental-diff renderer relying on its internal model matching reality — which only holds while nothing else writes to the terminal. Two latent `eprintln!` sites had been quietly emitting raw bytes into the alt-screen for the entire v0.8.x cycle: * `tools/subagent/mod.rs::persist_state_best_effort` (fires whenever the per-step sub-agent state save hits an error; under parallel sub-agents this can fire dozens of times per turn) * `tools/subagent/mod.rs::new_shared_subagent_manager` (fires once on init if the prior state file fails to load) Plus a third found during this fix: * `network_policy.rs::record` (fires every time a network-policy audit write fails) Each eprintln advanced the alt-screen cursor by one row and scrolled the buffer up by one row, but ratatui's renderer didn't know — it kept writing to absolute row positions, which now meant "one row higher than visible." After ~30 leaks the TUI content appeared to drift downward, with a blank band growing above the header. v0.8.18's periodic full-clear had been masking it; v0.8.27's flicker fix unmasked it. Three layers of defence so this class of bug "isn't an option anymore": 1. `crates/tui/src/runtime_log.rs` — file-backed tracing subscriber + Unix fd-level stderr redirect. A daily-rolling log file at `~/.deepseek/logs/tui-YYYY-MM-DD.log` is created at TUI startup (right after `EnterAlternateScreen`). A `tracing-subscriber` registry routes `tracing::warn!` / `tracing::error!` calls to it. 
On Unix, the process's stderr fd is `dup2`'d to the same file for the lifetime of the `TuiLogGuard`. Any future raw `eprintln!` — ours, a panic message, a third-party crate's verbose output — lands in the log file instead of the alt-screen. The guard restores the original stderr fd on drop so shutdown messages still reach the user's terminal. 2. `tracing::warn!` replacements for the three known leak sites (`subagent/mod.rs` ×2, `network_policy.rs` ×1). With (1) in place these messages now go to the log file with structured fields (`?err`, `host`, `tool`) instead of opaque text rows in the alt-screen. 3. Module-level `#![deny(clippy::print_stdout, clippy::print_stderr)]` on `tools/`, `core/`, `tui/`, `runtime_threads.rs`, and `network_policy.rs`. Any future `eprintln!` / `println!` added to a TUI runtime path fails the lint at compile time. Legitimate CLI-print paths (`main.rs` eval / init / doctor, `runtime_api.rs` server banners, `logging.rs` verbose helpers, `skills/mod.rs` listing utilities, `execpolicy/execpolicycheck.rs` JSON output, `ui::run_event_loop` post-`LeaveAlternateScreen` resume hint, two `#[test] #[ignore]` perf benches in `tui/transcript.rs` / `tui/widgets/mod.rs` / `core/capacity.rs`) keep their existing prints — they all run outside the alt-screen lifetime. The dup2 redirect is Unix-only because there's no equivalent stable Rust API for fd-redirecting `STDERR_FILENO` on Windows; on Windows the tracing-subscriber layer + the clippy denies still apply, and ratatui's own use of crossterm avoids the worst leakage classes. Cross-platform stderr redirect via `SetStdHandle` is a follow-up. The new `runtime_log` module ships with one test (`log_directory_prefers_home`) that pins the `HOME` / `USERPROFILE` / `dirs::home_dir()` resolution order — uses the process-wide `test_support::lock_test_env()` lock for env-mutation safety. 
Two `#[test] #[ignore]` benches in `tui/transcript.rs` (rail-prefix memory) and `tui/widgets/mod.rs` (transcript scroll bench) and one in `core/capacity.rs` (`bench_compute_profile`) keep their stdout prints via `#[allow(clippy::print_stdout)]` on the individual test. New dependencies: `tracing-subscriber 0.3` (env-filter + fmt features) and `tracing-appender 0.2` at the workspace root, both pulled into `crates/tui` only. Closes the v0.8.28 regression Hunter reported in screenshots: parallel sub-agents running `exec_shell` triggered the scroll demon with the TUI content squeezed into the bottom third of the terminal and ~30 rows of blank above the header. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 20:20:25 -05:00
LinQ	602f7b5f1c	test(ui): cover sanitize_stream_chunk control-byte filtering `sanitize_stream_chunk` is the per-chunk filter every piece of streaming text passes through — assistant content, thinking content, tool results, web-search snippets — before reaching the renderer. Its job is to keep newlines / tabs intact while dropping control bytes that could otherwise let a chunk emit terminal escape sequences (\u{1b}[2J clear-screen, \u{8} backspace, \u{7} bell). Today the function has zero tests, so a future "let's normalise newlines" or "let's collapse all whitespace" innocuous tweak could silently regress the security posture or visibly mangle code blocks. Adds three unit tests: * `sanitize_stream_chunk_keeps_printable_and_drops_control_bytes` — newline/tab survive; ESC, BEL, BS, VT, FF, CR all drop. * `sanitize_stream_chunk_preserves_unicode` — CJK characters, emoji, and accented Latin pass through untouched. * `sanitize_stream_chunk_handles_empty_and_whitespace` — empty input stays empty; whitespace-only input is preserved; a chunk that is entirely control bytes legitimately shrinks to empty (the caller's "skip empty chunk" branches handle the result). Zero behaviour change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:30:21 -05:00
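A filter with the behaviour the tests in 602f7b5f1c pin down might be written as follows. This is a sketch inferred from the test descriptions, not the project's implementation: it keeps newline and tab, drops every other character that `char::is_control` reports, and leaves all remaining text (including CJK and emoji) untouched.

```rust
/// Drop control characters from a streamed chunk while keeping '\n' and '\t'.
/// After filtering, an escape sequence such as ESC "[2J" loses its ESC byte
/// and is left as the inert printable text "[2J".
pub fn sanitize_stream_chunk(chunk: &str) -> String {
    chunk
        .chars()
        .filter(|&c| c == '\n' || c == '\t' || !c.is_control())
        .collect()
}
```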
LinQ	5cd89ba3ac	test(web-search): cover optional_search_max_results precedence + edge cases `optional_search_max_results` decides how many results `web_search` fetches — too high wastes bandwidth, too low misses the model's intent. The function has three branches today (top-level `max_results`, then `search_query[0].max_results`, then the DEFAULT_MAX_RESULTS constant), zero of which are exercised in tests. `extract_search_query` already has four tests covering the basic shapes; this PR completes coverage with two edge cases the helper silently handles (whitespace trim + empty rejection). Six new tests, no behaviour change: * `optional_max_results_prefers_top_level_value` — the explicit outer field wins over a sibling in the array form. Pins the precedence so a future structured-query implementation can't flip it accidentally. * `optional_max_results_falls_back_to_array_form` — when only the inner form sets the bound (V4's structured `search_query: […]` shape), it reaches the caller correctly. * `optional_max_results_uses_default_when_neither_set` — DEFAULT applies for both the top-level and the array shapes, so the model can't burn the MAX_RESULTS budget by omitting the field. * `optional_max_results_only_reads_first_array_entry` — sub-search fan-out is a future feature; future entries are ignored today and a multi-query implementation will need to update this test intentionally. * `extract_search_query_trims_whitespace_from_array_form_q_alias` — pads from heredocs/copy-paste don't reach the upstream URL. * `extract_search_query_rejects_empty_query` — `""`, all-whitespace `q`, and an empty body each surface the same missing-field error rather than a confusing upstream "Bot challenge" page. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:30:21 -05:00
LinQ	6440052089	test(file-tool): cover parse_pages_arg edge cases `parse_pages_arg` validates the user-supplied `pages` argument that `ReadFileTool` forwards to `pdftotext -f START -l END`. The function has zero tests today even though it's the only gatekeeper between user input and a pdftotext spawn — silent acceptance of a malformed range yields a confusing empty extraction with no actionable error message. Adds five tests: * `parse_pages_arg_accepts_single_page` — `"3"` and `" 7 "` both return `Some((n, n))`. * `parse_pages_arg_accepts_range` — `"1-5"`, `"10-20"`, and whitespace-tolerant `" 1 - 5 "` all parse correctly. * `parse_pages_arg_rejects_invalid_ranges` — `5-1` (end < start), `0` and `0-3` (one-indexed contract), empty / whitespace-only inputs, `abc` (non-numeric), and `3.5` (floats) all return `None`. * `parse_pages_arg_rejects_half_open_ranges` — `1-`, `-5`, and `-` reject rather than silently extending to `u32::MAX` or `0`. * `parse_pages_arg_rejects_negative_numbers` — `-3-5` doesn't wrap into a giant positive number via u32 parsing. Zero behaviour change; locks the contract so a future innocuous edit can't silently shift validation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:30:21 -05:00
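The contract listed in 6440052089 is detailed enough to reconstruct a parser that satisfies it. The sketch below is written from the test descriptions only; the `parse_page` helper is hypothetical, and the real function may be structured differently.

```rust
/// Parse a one-indexed page spec: either "N" or "START-END" with START <= END.
/// Returns `None` for anything else.
pub fn parse_pages_arg(raw: &str) -> Option<(u32, u32)> {
    let trimmed = raw.trim();
    let (start, end) = match trimmed.split_once('-') {
        // Splitting at the first '-' makes "-5", "1-", "-" and "-3-5" fail,
        // because at least one side is empty or still contains a '-'.
        Some((first, last)) => (parse_page(first)?, parse_page(last)?),
        None => {
            let page = parse_page(trimmed)?;
            (page, page)
        }
    };
    (start <= end).then_some((start, end))
}

/// Accept only a non-empty run of ASCII digits that parses to a value >= 1.
fn parse_page(text: &str) -> Option<u32> {
    let text = text.trim();
    if text.is_empty() || !text.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    text.parse::<u32>().ok().filter(|&page| page >= 1)
}
```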
LinQ	68b25584cc	test(error-taxonomy): cover classify_error_message and Display impls `error_taxonomy.rs` is the central typed-error layer — every subsystem (client, tools, runtime, UI) routes through `ErrorEnvelope` and `classify_error_message` to decide retry policy, UI colour, and offline-flip semantics. Today it ships with zero tests, so a single innocuous keyword reordering could shift classification across the whole tree. Adds 17 unit tests: * One per category (Network, Authentication, Authorization, RateLimit, Timeout, InvalidInput, Parse, State, Tool, Internal) exercising the keyword variants the function intends to catch — e.g. context-overflow phrasings ("maximum context length", "context_length_exceeded", "prompt is too long", the OpenAI "you requested … the maximum is" wording, "context window"), HTTP 5xx with various spacing rules (502 / 503 / 504, leading space, trailing space, exact match, embedded), and 429/quota rate-limit phrasings. * Three precedence tests pinning the load-bearing ordering: InvalidInput beats Tool (so a "tool returned: maximum context length" still surfaces as a /compact-able invalid input), Timeout beats Network (so "504 Gateway Timeout" classifies as Timeout because its retry semantics are gentler than Network's), and RateLimit beats Authentication (so a 429 with API token phrasing doesn't get misrouted to auth-failure handling). * Unicode handling: a Chinese error message that still mentions "context length" hits InvalidInput; a pure-Chinese unknown message falls through to Internal. * Display impls round-trip through their snake_case wire form so consumers depending on the labels can't be silently broken. Zero behaviour change; only tests + one comment pinning the 504 precedence rule. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:30:21 -05:00
LinQ	05fcb0df24	feat(auto-router): recognise CJK complex keywords in model picker The Flash-router fallback heuristic `auto_model_heuristic` only matched English complexity keywords (`refactor`, `architecture`, `design`, `debug`, `security`, `review`, `audit`, `migrate`, `optimize`, `rewrite`, `implement`, `analyze`). A Chinese-speaking user typing "帮我重构这个模块" or "审计安全漏洞" silently fell through to the short/long-message length branches and usually landed on Flash for work that obviously needs Pro-grade reasoning — the symmetric of the companion gap in `auto_reasoning::select` (and the same root cause). Extracts the array into a `COMPLEX_KEYWORDS` constant and adds the Simplified and Traditional Chinese counterparts for each English keyword: * refactor → 重构 / 重構 * architecture → 架构 / 架構 * design → 设计 / 設計 * debug → 调试 / 調試 * security → 安全 * review → 审查 / 審查 * audit → 审计 / 審計 * migrate → 迁移 / 遷移 * optimize → 优化 / 優化 * rewrite → 重写 / 重寫 * implement → 实现 / 實現 * analyze → 分析 CJK matches the literal form because the existing `to_lowercase()` is a no-op for those scripts. English keywords are byte-identical to before, so English-only behaviour doesn't shift. Three new tests cover Simplified and Traditional Chinese keyword routing to Pro, plus a sanity test that short non-keyword Chinese prose still gets the cost-saving Flash fallback. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
LinQ	68cc5d19cc	feat(auto-reasoning): recognise CJK debug/search keywords in Auto tier `auto_reasoning::select` is the per-turn classifier that picks `reasoning_effort` for `reasoning_effort = "auto"`. Today it only recognises English keywords (`debug`, `error`, `search`, `lookup`), so a user typing in Chinese or Japanese never trips the tier shifts: "帮我调试代码" stays on `High` instead of escalating to `Max`, "搜索一下文件" stays on `High` instead of dropping to `Low`. For a non-English Auto-mode user that's both wrong-side-of-cheap and wrong-side-of-careful on every turn. Extracts the keyword sets into `HIGH_EFFORT_KEYWORDS` and `LOW_EFFORT_KEYWORDS` constants and adds the Chinese / Japanese vocabulary that maps to the same intents: * HIGH (→ `Max`): 调试 / 错误 / 报错 / 出错 / 崩溃 / 調試 / 錯誤 in Chinese; デバッグ / エラー / バグ in Japanese. * LOW (→ `Low`): 搜索 / 查找 / 查询 in Chinese; 検索 in Japanese. Latin lowercase is preserved (the caller still lowercases the message), and CJK matches the literal form because CJK has no case. Four new tests cover Chinese debug keywords, Japanese debug keywords, Chinese search keywords, the single Japanese search keyword, and a sanity test that ordinary CJK prose (without keyword hits) still returns `High` — matching the English-only behaviour the function already had. All previous tests (`subagent_returns_low`, `debug_or_error_returns_max`, `search_or_lookup_returns_low`, `default_returns_high`) continue to pass — the original English-only paths are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
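The keyword-driven tier selection described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's actual `auto_reasoning::select`: the enum name, the function signature, and the choice to check the high-effort set before the low-effort set are assumptions, and the sub-agent branch is omitted. The keyword lists are copied from the commit message.

```rust
/// Reasoning tiers named in the commit message (the enum itself is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReasoningEffort {
    Low,
    High,
    Max,
}

/// Keywords that escalate to `Max`, as listed in the commit message.
const HIGH_EFFORT_KEYWORDS: &[&str] = &[
    "debug", "error",
    "调试", "错误", "报错", "出错", "崩溃", "調試", "錯誤",
    "デバッグ", "エラー", "バグ",
];

/// Keywords that drop to `Low`, as listed in the commit message.
const LOW_EFFORT_KEYWORDS: &[&str] = &["search", "lookup", "搜索", "查找", "查询", "検索"];

/// Picks a tier by substring match. Lowercasing only affects cased scripts,
/// so the CJK keywords match in their literal form.
/// Assumption: high-effort keywords take precedence over low-effort ones.
pub fn select(message: &str) -> ReasoningEffort {
    let lowered = message.to_lowercase();
    if HIGH_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        ReasoningEffort::Max
    } else if LOW_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        ReasoningEffort::Low
    } else {
        ReasoningEffort::High
    }
}
```

With this sketch, `select("帮我调试代码")` yields `Max`, `select("搜索一下文件")` yields `Low`, and prose without a keyword hit stays on `High`.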
LinQ	a5c4a21c9b	feat(project-context): merge global AGENTS.md with project AGENTS.md (#1157 ) travel into every session, ideally merged with a project's local AGENTS.md when both exist. Maintainer agreed: > yes that makes sense! am working on getting this organizational > structure better today so that worktrees etc can feel like an > intended way of using this. The fallback path already loaded the global file when no workspace context existed, but dropped it silently the moment a project AGENTS.md showed up. After this PR: * Both files present → merged. The global block is prepended with a labelled HTML-style fence (`<!-- global: /home/u/.deepseek/AGENTS.md -->`), then the project block follows with its own fence (`<!-- project (overrides global where they conflict) -->`). Order is global-first so workspace rules read last and win "last word" precedence with the model when they disagree. * Only project file present → unchanged from before. * Only global file present → unchanged from before (still acts as a fallback). The merge framing is suppressed in the global-only case so the prompt stays minimal. `source_path` continues to point at the more-specific file (project > global > nothing) because that's the path the user is likely to edit when they want to override something. Two tests: * `test_local_and_global_agents_merge_when_both_exist` — the actual #1157 scenario. Asserts both blocks are present, global precedes project, and the merge-framing label appears between them. * `test_global_agents_only_no_project_unchanged_fallback` — sanity check that the global-only path doesn't accidentally inherit the merge framing. The pre-existing `test_load_global_agents_when_project_has_no_context` still passes, so the global-as-fallback contract is preserved. Refs #1157 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
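The three merge cases in this commit can be illustrated with a small sketch. The function name `merge_agents`, its signature, and the exact whitespace between blocks are hypothetical; only the two fence comments and the global-first ordering come from the commit message.

```rust
/// Merges a global AGENTS.md with a project AGENTS.md.
/// `global` is an optional `(path, contents)` pair; `project` is optional contents.
/// Hypothetical helper: the real loader's types and spacing are not shown in the log.
pub fn merge_agents(global: Option<(&str, &str)>, project: Option<&str>) -> Option<String> {
    match (global, project) {
        // Both present: global block first, project block last so it gets the "last word".
        (Some((path, global_text)), Some(project_text)) => Some(format!(
            "<!-- global: {path} -->\n{}\n\n<!-- project (overrides global where they conflict) -->\n{}",
            global_text.trim_end(),
            project_text.trim_end()
        )),
        // Only one present: pass it through without any merge framing.
        (None, Some(project_text)) => Some(project_text.to_string()),
        (Some((_, global_text)), None) => Some(global_text.to_string()),
        (None, None) => None,
    }
}
```

Keeping the framing out of the single-file cases mirrors the stated goal that the prompt stays minimal when there is nothing to merge.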
LinQ	29a42ba31a	fix(web_search): drop spam-stuffed SERPs in Bing/DDG fallback (#964) #964 reports that `web_search` returns garbage results: every query in their reproduction case returned eight entries from a single low-quality forum domain (`*.forumgratuit.org`) regardless of input. The root cause is upstream: when Bing's scraping endpoint serves a stuffed page (often when our request looks too bot-like or the query falls into a degraded bucket), the parser happily extracts the b_algo entries and the model receives the junk as authoritative search results. Adds an `is_likely_spam_results` heuristic that runs after both Bing and DDG parsers. When 60% or more of the parsed entries share the same registrable root domain (with at least three entries to avoid false positives on legitimate two-link answers), the batch is discarded. The existing "no results" handling then surfaces a clean error message to the model instead of routing it toward spam. `root_domain` strips subdomains so `astralia.forumgratuit.org` and `russia.forumgratuit.org` collapse to `forumgratuit.org` for the purpose of dominance counting; eTLD+1 is approximated by keeping the last two labels, which is close enough for the threshold check. Five new tests cover the threshold (3-of-5 trips, 2-of-5 doesn't), short-batch passthrough, normal diverse SERPs (Wikipedia + SO + Reddit) staying through, and the precise spam reproduction from #964. Refs #964 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
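A rough sketch of the dominance check described in this commit follows. The function names come from the commit message, but the signatures are assumptions: this version takes bare hostnames rather than parsed results, and it reads "at least three entries" as a minimum batch size. The last-two-labels rule is the approximation the commit itself describes, so multi-part suffixes such as `co.uk` are not handled.

```rust
use std::collections::HashMap;

/// Approximates eTLD+1 by keeping the last two dot-separated labels.
pub fn root_domain(host: &str) -> String {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    let labels: Vec<&str> = host.split('.').collect();
    let start = labels.len().saturating_sub(2);
    labels[start..].join(".")
}

/// Returns true when one root domain accounts for 60% or more of the batch.
/// Assumption: batches with fewer than three entries are never flagged.
pub fn is_likely_spam_results(hosts: &[&str]) -> bool {
    const MIN_ENTRIES: usize = 3;
    if hosts.len() < MIN_ENTRIES {
        return false;
    }
    let mut counts: HashMap<String, usize> = HashMap::new();
    for host in hosts {
        *counts.entry(root_domain(host)).or_insert(0) += 1;
    }
    let dominant = counts.values().copied().max().unwrap_or(0);
    // Integer form of dominant / len >= 0.60, avoiding floating point.
    dominant * 100 >= hosts.len() * 60
}
```

Under these assumptions, three `*.forumgratuit.org` hosts out of five trip the check, while two out of five do not.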
LinQ	f56f73e371	fix(prompts): strengthen language directive so thinking matches user (#1118) #1118 reports that even after configuring the locale to Chinese, V4 keeps emitting English `reasoning_content` (the thinking block) when the surrounding code/error logs are English-heavy. Maintainer agreed the prompt needs editing. The existing language directive already said "both for `reasoning_content` and for the final reply", but V4 falls into a failure mode where it mirrors the user message for the final answer while quietly defaulting to English for thinking. Three additions to `crates/tui/src/prompts/base.md` sharpen the rule: 1. Bold the "must both be in Simplified Chinese" requirement, and add the failure-mode escape hatches the prompt previously left implicit ("even when the surrounding system prompt is in English, and even when the task context [...] is overwhelmingly English"). 2. Spell out the mid-session-switch rule for `reasoning_content` explicitly. Today the prompt says "switch with them" but doesn't reinforce that this includes thinking; V4 sometimes carries the previous turn's reasoning language forward. 3. Add an explicit-override clause for the opposite preference (#1118 commenter pmsleepcheck preferred English thinking for token cost). Users can say "think in English" / "用英文思考" and the model honours that until the next override. The final reply still tracks the user's message language; only thinking is overridable. Adds `language_section_carries_reasoning_content_directives_for_1118` pinning the four load-bearing phrases ("reasoning_content", "must both be in Simplified Chinese", "overwhelmingly English", and both English + Chinese override examples) so a future innocuous edit can't quietly drop them. The existing `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace` test still passes, so byte-stability for a fixed session is intact. Refs #1118 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
Hunter Bown	efa00ff69b	security(ci): harden sync-cnb.yml — permissions, checkout v4, narrow trigger - Add explicit permissions: contents: read (least-privilege) - Bump actions/checkout@v3 → @v4 - Narrow trigger from on: [push] to on: push: branches: [main] + tags: ['v*'] Matches the hardening convention used by every other workflow in the repo.	2026-05-10 19:29:05 -05:00
LinQ	5b0b81c5fc	fix(session-picker): scope Ctrl+R picker to the current workspace (#1395 ) #1395 reports that Ctrl+R in project B restores a session that belongs to project A — the picker was calling `list_sessions()` and showing every session on disk regardless of where the user is. With hundreds of past sessions across many repos, the first hit on the "most recent" sort is rarely the one from the project the user just opened. `SessionMetadata.workspace` is already persisted, so the data needed to filter is there. This PR: 1. Adds a `workspace_scope` field to `SessionPickerView` and a `show_all_workspaces` toggle. `SessionPickerView::new` now takes `&Path` so every caller is forced to pass a scope. 2. Filters `filtered` to sessions whose recorded `workspace` canonicalises to the same path as the active workspace. Both sides go through `std::fs::canonicalize` so a symlinked or relative checkout matches its canonical form. 3. Adds an `a` keybinding inside the picker to flip `show_all_workspaces`, with a status-line readout ("scoped to this workspace" / "showing sessions from every workspace"). The user can always escape the scope if they need to. Three new tests: - `workspace_scope_filters_sessions_to_current_project` — reproduces the #1395 scenario (sessions in /tmp/project-a vs /tmp/project-b; the picker only surfaces the matching project). - `workspace_scope_toggle_a_expands_to_all_workspaces` — `a` flips back and forth; status announces the new mode. - `workspace_scope_none_means_show_all` — the historical unscoped behaviour is still reachable when the caller passes no workspace (used for tests + future opt-out). Updates the two call sites (`ui.rs` Ctrl+R handler and `commands/session.rs` `/sessions [show]`) to pass `&app.workspace`. Closes #1395 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:28:29 -05:00
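The workspace filter in this commit might look roughly like the sketch below. `SessionMeta`, `same_workspace`, and `visible_sessions` are hypothetical stand-ins for `SessionMetadata` and the picker's internal filter, and falling back to the raw path when `std::fs::canonicalize` fails (for example, when a recorded workspace no longer exists) is an assumption the commit message does not confirm.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for the persisted session metadata.
pub struct SessionMeta {
    pub id: String,
    pub workspace: PathBuf,
}

/// Canonicalizes a path, falling back to the path as given if that fails
/// (assumption: the real code's failure handling is not shown in the log).
fn canonical_or_raw(path: &Path) -> PathBuf {
    std::fs::canonicalize(path).unwrap_or_else(|_| path.to_path_buf())
}

/// Both sides go through canonicalization so symlinked or relative checkouts match.
pub fn same_workspace(recorded: &Path, active: &Path) -> bool {
    canonical_or_raw(recorded) == canonical_or_raw(active)
}

/// Returns the sessions the picker would list. A `None` scope, or the
/// `show_all_workspaces` toggle, disables filtering.
pub fn visible_sessions<'a>(
    sessions: &'a [SessionMeta],
    scope: Option<&Path>,
    show_all_workspaces: bool,
) -> Vec<&'a SessionMeta> {
    sessions
        .iter()
        .filter(|s| match scope {
            Some(active) if !show_all_workspaces => same_workspace(&s.workspace, active),
            _ => true,
        })
        .collect()
}
```

Passing the scope as an `Option` keeps the historical unscoped listing reachable, which is what the `workspace_scope_none_means_show_all` test name suggests.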
Hunter Bown	793031a2e0	Release v0.8.28 Release v0.8.28	2026-05-10 18:34:51 -05:00
Hunter Bown	61f40420aa	chore(release): bump to 0.8.28 Workspace + per-crate path-dep version pins, npm wrapper, and `deepseekBinaryVersion` all advance from 0.8.27 → 0.8.28. Lockfile refreshed via `cargo update --workspace --offline`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:21:58 -05:00
Hunter Bown	ad738dfe44	docs(changelog): extend [0.8.28] with #1347 / #1379 / #1270 follow-up + settings guard fold Adds entries for: * PR #1347 Ctrl+Enter idle-engine fix (#1331) * PR #1379 V4-steered tool descriptions (#711) * #1270 wiring follow-up that un-ignores the two add_local_reference_completions tests * `term_program` / `no_animations` test guard fold into `lock_test_env` (Internal) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:16:56 -05:00
Hunter Bown	842802b873	test: fold settings.rs term_program + no_animations guards into lock_test_env PR #1365 (cherry-picked into v0.8.28) introduced `term_program_test_guard` as a fresh module-local `static Mutex<()>`, mirroring the existing `no_animations_test_guard`. Both serialize their own family of tests but not with each other — so under cargo's parallel runner, a `NO_ANIMATIONS=1` leak from one family lands in the env at the exact moment a `TERM_PROGRAM=iTerm.app` test calls the shared `apply_env_overrides`, flipping `low_motion` to true and failing `non_vscode_term_program_does_not_force_low_motion`. Both guards now return `crate::test_support::lock_test_env()` (the same fold the v0.8.28 test-stabilization commit applied to the EnvGuard family in `commands/config.rs`, `commands/network.rs`, and `tools/recall_archive.rs`). This serializes the two test groups with each other and with every other env-mutating test in the suite, eliminating the cross-test env-var race. `save_api_key_for_openrouter_writes_provider_table` was failing intermittently for the same reason — a concurrent env mutation in an unrelated test was clobbering HOME / DEEPSEEK_CONFIG_PATH in the window between our `EnvGuard::new` and `save_api_key_for`'s `default_config_path()` read. With the broader serialization in place, the race window closes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:16:18 -05:00
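The fold described in this commit relies on every env-mutating test taking the same process-wide lock. A minimal sketch of such a guard is shown below; the name `lock_test_env` comes from the commit message, while the static's name and the poison-recovery behaviour are assumptions about how `crate::test_support` might implement it.

```rust
use std::sync::{Mutex, MutexGuard};

/// One lock shared by every test that reads or writes process environment
/// variables (hypothetical name for the backing static).
static TEST_ENV_LOCK: Mutex<()> = Mutex::new(());

/// Serializes env-mutating tests. A panic in one test poisons the mutex, so
/// the guard is recovered rather than cascading failures into later tests.
pub fn lock_test_env() -> MutexGuard<'static, ()> {
    TEST_ENV_LOCK
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

A test would hold the returned guard for its whole body, for example `let _env = lock_test_env();` as its first statement. Two module-local mutexes cannot give this guarantee because they only order tests within their own family.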
Hunter Bown	9759a77ae5	fix(working-set): wire #1270 's add_local_reference_completions into completions() and build_file_index PR #1270 from @SamhandsomeLee landed the function definitions and both regression tests as part of v0.8.27, but the wiring was incomplete: `add_local_reference_completions` was never called from `completions()`, and `build_file_index` never walked `local_reference_paths`. Hunter marked the two tests `#[ignore]` with a "v0.8.28 follow-up" trailer in `fe0673d68`. This completes the wiring: * `Workspace::completions()` now calls `add_local_reference_completions` for both the diverging-cwd branch and the workspace-root branch, after the existing `walk_for_completions`. The helper is a no-op unless the needle starts with `.` or contains `/` / `\`, so prose mentions skip the extra walk. * `Workspace::build_file_index()` now walks `local_reference_paths` after the curated dot-dir whitelist (`.deepseek`, `.cursor`, `.claude`, `.agents`), so explicit user paths into other gitignored dirs (e.g. `.generated/specs/device-layout.md`) fuzzy-resolve too. Honors `FILE_INDEX_MAX_ENTRIES` so the #697 walk-cap still bounds first-turn latency. * Drops `#[allow(dead_code)]` from the four helpers (`LOCAL_REFERENCE_SCAN_LIMIT`, `add_local_reference_completions`, `should_try_local_reference_completion`, `local_reference_paths`, `should_skip_local_reference_dir`) and un-ignores both `working_set` regression tests: `workspace_completions_surface_explicit_hidden_and_ignored_paths` and `fuzzy_index_resolves_hidden_and_ignored_files_except_deepseekignored`. Both tests pass. `.deepseekignore` entries remain blocked from both completion and basename fuzzy-resolution paths because `local_reference_paths` adds `.deepseekignore` as a custom-ignore file on the walker. Closes the v0.8.28 follow-up trailer left on `fe0673d68`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:11:52 -05:00
Zhiping	1035502bf9	fix: Ctrl+Enter content lost when engine is idle (#1331) The steer channel (rx_steer) is only monitored inside handle_deepseek_turn, so when the engine is idle (no active turn), Ctrl+Enter sent the message into the void. It appeared in the transcript via the local mirror in steer_user_message, but the LLM never received it, and the next handle_send_message would drain it as a "stale steer". Fix: check app.is_loading in the Ctrl+Enter handler. When the engine is busy, steer into the current turn as before. When idle, send via submit_or_steer_message so the message goes through the regular Op::SendMessage path.	2026-05-10 18:10:36 -05:00
xuezhaoyu	beb4120adc	fix: guarantee END_SYNC_UPDATE is sent even when draw fails Wrap fallible draw operations in closures so that ESC[?2026l is always sent regardless of whether an intermediate step (write_all, flush, clear, draw) returns an error. Without this, a failing ? would return early and leave the terminal stuck in synchronized-update mode with a frozen screen. Review feedback from gemini-code-assist[bot] on #1361.	2026-05-10 18:10:00 -05:00
xuezhaoyu	0bd2832e56	fix: add DEC 2026 synchronized updates to prevent flickering on GPU terminals Wrap terminal.draw() and reset_terminal_viewport() with ESC[?2026h/l so GPU-accelerated terminals (Ghostty, VS Code, Kitty) defer rendering until the full frame is written, eliminating intermediate-frame flicker. Merge viewport-reset + draw into a single sync batch to avoid a visible blank frame between the two operations. Best-effort — unsupported terminals silently ignore the sequences. Fixes #1352	2026-05-10 18:10:00 -05:00
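The two synchronized-update commits above combine into one pattern: emit `ESC[?2026h`, run the fallible draw, then always emit `ESC[?2026l`. The sketch below illustrates that shape over any `std::io::Write`; `draw_synchronized` and the constant names are hypothetical, and the real code wraps `terminal.draw()` and `reset_terminal_viewport()` rather than a generic closure.

```rust
use std::io::{self, Write};

/// DEC private mode 2026: begin / end synchronized update.
const BEGIN_SYNC_UPDATE: &[u8] = b"\x1b[?2026h";
const END_SYNC_UPDATE: &[u8] = b"\x1b[?2026l";

/// Runs `draw` inside a synchronized-update bracket. The end sequence is
/// written even if the begin write or `draw` fails, so the terminal is not
/// left frozen in synchronized-update mode.
pub fn draw_synchronized<W, F>(out: &mut W, draw: F) -> io::Result<()>
where
    W: Write,
    F: FnOnce(&mut W) -> io::Result<()>,
{
    // Capture the draw result instead of returning early with `?`.
    let drawn = match out.write_all(BEGIN_SYNC_UPDATE) {
        Ok(()) => draw(&mut *out),
        Err(e) => Err(e),
    };
    // Always attempt to close the bracket and flush.
    let ended = match out.write_all(END_SYNC_UPDATE) {
        Ok(()) => out.flush(),
        Err(e) => Err(e),
    };
    // Report the draw error first; otherwise report any error from closing.
    drawn.and(ended)
}
```

Holding the intermediate result in a variable is what prevents the early return that the follow-up fix addresses.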
LinQ	9b2e747eeb	docs(tools): add steering sentences to tool descriptions Each tool description now names what to use instead of (cat/head/tail/sed/grep/find/curl/heredocs in exec_shell), the return shape, and the limits. Steering language routes V4 toward our typed tools and away from shell footguns. Tools updated: read_file, write_file, edit_file, list_dir, grep_files, file_search, web_search, apply_patch, fetch_url. Removes the unused legacy normal.txt / plan.txt / yolo.txt prompt templates and the YOLO_PROMPT / PLAN_PROMPT constants. Both constants were referenced only by their own self-tests in prompts.rs; AGENT_PROMPT is preserved (its companion .txt is in the scope of a separate issue). All description strings stay under 1024 chars (max: 350) with no embedded newlines or Markdown headers, so the cached tool catalogue stays prefix-stable for V4's KV cache. Closes #711 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:08:47 -05:00
CrepuscularIRIS	3b1d0ae907	fix(settings): clarify apply_env_overrides comment — env always wins The previous comment incorrectly suggested that a user-set low_motion=false in the settings file could override the TERM_PROGRAM=vscode detection. In fact, apply_env_overrides() runs after disk load and unconditionally sets the flag, identical to the existing NO_ANIMATIONS precedent. Update the comment to state the actual precedence clearly. Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-10 18:08:43 -05:00
CrepuscularIRIS	5bededf77c	fix(settings): enable low_motion automatically in VS Code terminal VS Code's integrated terminal sets TERM_PROGRAM=vscode. Its compositor cannot keep up with the default 120 FPS redraw rate, producing rapid flickering on some machines while other terminal apps (Terminal.app, iTerm2) are unaffected (#1356). Extend apply_env_overrides() to detect TERM_PROGRAM=vscode and automatically activate low_motion mode (30 FPS cap, no fancy animations), matching the existing NO_ANIMATIONS env-var pattern. This is a zero-config fix: users running in VS Code get a stable display with no settings change required. Users who want the full animation rate can still set low_motion = false explicitly in their settings file; that file-level value is already loaded before apply_env_overrides() is called, so an explicit false in the file wins over this auto-detection. Two tests added: - vscode_term_program_forces_low_motion_on: TERM_PROGRAM=vscode enables low_motion and disables fancy_animations. - non_vscode_term_program_does_not_force_low_motion: other well-known terminal programs (iTerm.app, Apple_Terminal, WezTerm, xterm-256color) are unaffected. Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-10 18:08:43 -05:00
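The override behaviour from the two settings commits above can be sketched as below. It follows the later clarification (3b1d0ae907), where the environment check runs after the settings file is loaded and sets the flags unconditionally. The `Settings` struct is reduced to the two fields named in the log, the environment values are passed as parameters so the sketch does not touch process state, and treating only `NO_ANIMATIONS=1` as set is an assumption.

```rust
/// Reduced, hypothetical view of the settings touched by the override.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Settings {
    pub low_motion: bool,
    pub fancy_animations: bool,
}

/// Applies environment overrides on top of values already loaded from disk.
/// `term_program` and `no_animations` stand in for the TERM_PROGRAM and
/// NO_ANIMATIONS environment variables.
pub fn apply_env_overrides(
    settings: &mut Settings,
    term_program: Option<&str>,
    no_animations: Option<&str>,
) {
    let in_vscode = term_program == Some("vscode");
    // Assumption: only the literal value "1" counts as enabling NO_ANIMATIONS.
    let animations_disabled = no_animations == Some("1");
    if in_vscode || animations_disabled {
        // Env always wins: a low_motion = false loaded from the file is overwritten.
        settings.low_motion = true;
        settings.fancy_animations = false;
    }
}
```

Taking the values as arguments also sidesteps the cross-test environment races that the `lock_test_env` commit addresses, although the real function reads the process environment directly.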
Hunter Bown	0fa86a7c35	docs(changelog): extend [0.8.28] with six cherry-picks + test isolation Adds entries for PRs #1373 (CNB mirror), #1340 (Cmux LC_TERMINAL), #1386 (file-index walk cap, #697), #1393 (prompt reliability), #1368 (duplicate error toast), and #1370 (continue tip), plus the `provider_switch_clears_turn_cache_history` HOME-isolation note. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:51:45 -05:00


1096 Commits