dgf1988/codewhale

Author	SHA1	Message	Date
LinQ	68cc5d19cc	feat(auto-reasoning): recognise CJK debug/search keywords in Auto tier `auto_reasoning::select` is the per-turn classifier that picks `reasoning_effort` for `reasoning_effort = "auto"`. Today it only recognises English keywords (`debug`, `error`, `search`, `lookup`), so a user typing in Chinese or Japanese never trips the tier shifts: "帮我调试代码" stays on `High` instead of escalating to `Max`, "搜索一下文件" stays on `High` instead of dropping to `Low`. For a non-English Auto-mode user that's both wrong-side-of-cheap and wrong-side-of-careful on every turn. Extracts the keyword sets into `HIGH_EFFORT_KEYWORDS` and `LOW_EFFORT_KEYWORDS` constants and adds the Chinese / Japanese vocabulary that maps to the same intents: * HIGH (→ `Max`): 调试 / 错误 / 报错 / 出错 / 崩溃 / 調試 / 錯誤 in Chinese; デバッグ / エラー / バグ in Japanese. * LOW (→ `Low`): 搜索 / 查找 / 查询 in Chinese; 検索 in Japanese. Latin lowercase is preserved (the caller still lowercases the message), and CJK matches the literal form because CJK has no case. Four new tests cover Chinese debug keywords, Japanese debug keywords, Chinese search keywords, the single Japanese search keyword, and a sanity test that ordinary CJK prose (without keyword hits) still returns `High` — matching the English-only behaviour the function already had. All previous tests (`subagent_returns_low`, `debug_or_error_returns_max`, `search_or_lookup_returns_low`, `default_returns_high`) continue to pass — the original English-only paths are unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
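The keyword tiers described in 68cc5d19cc can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `Effort` enum, the `select` signature, and the precedence order (sub-agent first, then high-effort keywords, then low-effort keywords) are assumptions; only the constant names and the keyword lists come from the commit message.

```rust
/// Illustrative tiers; the real type behind `reasoning_effort` may differ.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Effort {
    Low,
    High,
    Max,
}

/// Keywords that escalate to `Max` (English, Chinese, Japanese).
const HIGH_EFFORT_KEYWORDS: &[&str] = &[
    "debug", "error",
    "调试", "错误", "报错", "出错", "崩溃", "調試", "錯誤",
    "デバッグ", "エラー", "バグ",
];

/// Keywords that drop to `Low` (English, Chinese, Japanese).
const LOW_EFFORT_KEYWORDS: &[&str] = &[
    "search", "lookup",
    "搜索", "查找", "查询",
    "検索",
];

/// Picks a tier for one turn. Assumed order: sub-agent, then high, then low.
pub fn select(message: &str, is_subagent: bool) -> Effort {
    if is_subagent {
        return Effort::Low;
    }
    // Lowercasing only affects cased scripts; CJK text passes through unchanged.
    let lowered = message.to_lowercase();
    if HIGH_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        return Effort::Max;
    }
    if LOW_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        return Effort::Low;
    }
    Effort::High
}
```

With this sketch, `select("帮我调试代码", false)` yields `Effort::Max` and `select("搜索一下文件", false)` yields `Effort::Low`, matching the two examples in the message.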
LinQ	a5c4a21c9b	feat(project-context): merge global AGENTS.md with project AGENTS.md (#1157 ) travel into every session, ideally merged with a project's local AGENTS.md when both exist. Maintainer agreed: > yes that makes sense! am working on getting this organizational > structure better today so that worktrees etc can feel like an > intended way of using this. The fallback path already loaded the global file when no workspace context existed, but dropped it silently the moment a project AGENTS.md showed up. After this PR: * Both files present → merged. The global block is prepended with a labelled HTML-style fence (`<!-- global: /home/u/.deepseek/AGENTS.md -->`), then the project block follows with its own fence (`<!-- project (overrides global where they conflict) -->`). Order is global-first so workspace rules read last and win "last word" precedence with the model when they disagree. * Only project file present → unchanged from before. * Only global file present → unchanged from before (still acts as a fallback). The merge framing is suppressed in the global-only case so the prompt stays minimal. `source_path` continues to point at the more-specific file (project > global > nothing) because that's the path the user is likely to edit when they want to override something. Two tests: * `test_local_and_global_agents_merge_when_both_exist` — the actual #1157 scenario. Asserts both blocks are present, global precedes project, and the merge-framing label appears between them. * `test_global_agents_only_no_project_unchanged_fallback` — sanity check that the global-only path doesn't accidentally inherit the merge framing. The pre-existing `test_load_global_agents_when_project_has_no_context` still passes, so the global-as-fallback contract is preserved. Refs #1157 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
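A rough sketch of the three merge cases listed in a5c4a21c9b. The function name `merge_agents`, its signature, and the blank-line spacing between blocks are hypothetical; the two fence labels are taken from the commit message.

```rust
/// Merges an optional global AGENTS.md with an optional project AGENTS.md.
/// `global` is `(path, contents)`; `project` is the contents only.
/// Hypothetical helper: the real loader's types are not shown in the log.
pub fn merge_agents(global: Option<(&str, &str)>, project: Option<&str>) -> Option<String> {
    match (global, project) {
        // Both present: global block first, project block last so it has the final word.
        (Some((path, global_body)), Some(project_body)) => Some(format!(
            "<!-- global: {} -->\n{}\n\n<!-- project (overrides global where they conflict) -->\n{}",
            path,
            global_body.trim_end(),
            project_body.trim_end(),
        )),
        // Only one present: returned as-is, with no merge framing.
        (None, Some(project_body)) => Some(project_body.to_owned()),
        (Some((_, global_body)), None) => Some(global_body.to_owned()),
        (None, None) => None,
    }
}
```

Keeping the framing out of the single-file cases mirrors the message's point that the prompt should stay minimal when there is nothing to merge.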
LinQ	29a42ba31a	fix(web_search): drop spam-stuffed SERPs in Bing/DDG fallback (#964) #964 reports that `web_search` returns garbage results: every query in their reproduction case returned eight entries from a single low-quality forum domain (`*.forumgratuit.org`) regardless of input. The root cause is upstream: when Bing's scraping endpoint serves a stuffed page (often when our request looks too bot-like or the query falls into a degraded bucket), the parser happily extracts the b_algo entries and the model receives the junk as authoritative search results. Adds an `is_likely_spam_results` heuristic that runs after both Bing and DDG parsers. When 60% or more of the parsed entries share the same registrable root domain (with at least three entries to avoid false positives on legitimate two-link answers), the batch is discarded. The existing "no results" handling then surfaces a clean error message to the model instead of routing it toward spam. `root_domain` strips subdomains so `astralia.forumgratuit.org` and `russia.forumgratuit.org` collapse to `forumgratuit.org` for the purpose of dominance counting; eTLD+1 is approximated by keeping the last two labels, which is close enough for the threshold check. Five new tests cover the threshold (3-of-5 trips, 2-of-5 doesn't), short-batch passthrough, normal diverse SERPs (Wikipedia + SO + Reddit) passing through, and the precise spam reproduction from #964. Refs #964 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
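The dominance check in 29a42ba31a might look something like the sketch below. The function names follow the commit message, but the signatures and URL handling are assumptions, and "at least three entries" is read here as a minimum batch size (the message could also mean a minimum count for the dominant domain).

```rust
use std::collections::HashMap;

/// Approximates eTLD+1 by keeping the last two dot-separated host labels.
fn root_domain(url: &str) -> Option<String> {
    let after_scheme = url.split("://").nth(1).unwrap_or(url);
    let authority = after_scheme
        .split(|c: char| c == '/' || c == '?' || c == '#')
        .next()?;
    let host_port = authority.rsplit('@').next()?; // drop any userinfo
    let host = host_port.split(':').next()?; // drop any port
    let labels: Vec<&str> = host.split('.').filter(|l| !l.is_empty()).collect();
    match labels.len() {
        0 => None,
        1 => Some(labels[0].to_ascii_lowercase()),
        n => Some(labels[n - 2..].join(".").to_ascii_lowercase()),
    }
}

/// True when one root domain accounts for 60% or more of a batch of 3+ URLs.
fn is_likely_spam_results(urls: &[&str]) -> bool {
    const MIN_ENTRIES: usize = 3; // assumed to be a batch-size floor
    if urls.len() < MIN_ENTRIES {
        return false;
    }
    let mut counts: HashMap<String, usize> = HashMap::new();
    for url in urls {
        if let Some(root) = root_domain(url) {
            *counts.entry(root).or_insert(0) += 1;
        }
    }
    let top = counts.values().copied().max().unwrap_or(0);
    // Integer form of top / len >= 0.6, avoiding float rounding.
    top * 10 >= urls.len() * 6
}
```

Keeping only the last two labels misjudges multi-part suffixes such as `co.uk`, which the commit message accepts as close enough for a threshold check.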
LinQ	f56f73e371	fix(prompts): strengthen language directive so thinking matches user (#1118) #1118 reports that even after configuring the locale to Chinese, V4 keeps emitting English `reasoning_content` (the thinking block) when the surrounding code/error logs are English-heavy. Maintainer agreed the prompt needs editing. The existing language directive already said "both for `reasoning_content` and for the final reply", but V4 falls into a failure mode where it mirrors the user message for the final answer while quietly defaulting to English for thinking. Three additions to `crates/tui/src/prompts/base.md` sharpen the rule: 1. Bold the "must both be in Simplified Chinese" requirement, and add the failure-mode escape hatches the prompt previously left implicit ("even when the surrounding system prompt is in English, and even when the task context [...] is overwhelmingly English"). 2. Spell out the mid-session-switch rule for `reasoning_content` explicitly. Today the prompt says "switch with them" but doesn't reinforce that this includes thinking; V4 sometimes carries the previous turn's reasoning language forward. 3. Add an explicit-override clause for the opposite preference (#1118 commenter pmsleepcheck preferred English thinking for token cost). Users can say "think in English" / "用英文思考" and the model honours that until the next override. The final reply still tracks the user's message language; only thinking is overridable. Adds `language_section_carries_reasoning_content_directives_for_1118` pinning the four load-bearing phrases ("reasoning_content", "must both be in Simplified Chinese", "overwhelmingly English", and both English + Chinese override examples) so a future innocuous edit can't quietly drop them. The existing `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace` test still passes, so byte-stability for a fixed session is intact. Refs #1118 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:29:43 -05:00
LinQ	5b0b81c5fc	fix(session-picker): scope Ctrl+R picker to the current workspace (#1395 ) #1395 reports that Ctrl+R in project B restores a session that belongs to project A — the picker was calling `list_sessions()` and showing every session on disk regardless of where the user is. With hundreds of past sessions across many repos, the first hit on the "most recent" sort is rarely the one from the project the user just opened. `SessionMetadata.workspace` is already persisted, so the data needed to filter is there. This PR: 1. Adds a `workspace_scope` field to `SessionPickerView` and a `show_all_workspaces` toggle. `SessionPickerView::new` now takes `&Path` so every caller is forced to pass a scope. 2. Filters `filtered` to sessions whose recorded `workspace` canonicalises to the same path as the active workspace. Both sides go through `std::fs::canonicalize` so a symlinked or relative checkout matches its canonical form. 3. Adds an `a` keybinding inside the picker to flip `show_all_workspaces`, with a status-line readout ("scoped to this workspace" / "showing sessions from every workspace"). The user can always escape the scope if they need to. Three new tests: - `workspace_scope_filters_sessions_to_current_project` — reproduces the #1395 scenario (sessions in /tmp/project-a vs /tmp/project-b; the picker only surfaces the matching project). - `workspace_scope_toggle_a_expands_to_all_workspaces` — `a` flips back and forth; status announces the new mode. - `workspace_scope_none_means_show_all` — the historical unscoped behaviour is still reachable when the caller passes no workspace (used for tests + future opt-out). Updates the two call sites (`ui.rs` Ctrl+R handler and `commands/session.rs` `/sessions [show]`) to pass `&app.workspace`. Closes #1395 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 19:28:29 -05:00
Hunter Bown	61f40420aa	chore(release): bump to 0.8.28 Workspace + per-crate path-dep version pins, npm wrapper, and `deepseekBinaryVersion` all advance from 0.8.27 → 0.8.28. Lockfile refreshed via `cargo update --workspace --offline`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:21:58 -05:00
Hunter Bown	842802b873	test: fold settings.rs term_program + no_animations guards into lock_test_env PR #1365 (cherry-picked into v0.8.28) introduced `term_program_test_guard` as a fresh module-local `static Mutex<()>`, mirroring the existing `no_animations_test_guard`. Both serialize their own family of tests but not with each other — so under cargo's parallel runner, a `NO_ANIMATIONS=1` leak from one family lands in the env at the exact moment a `TERM_PROGRAM=iTerm.app` test calls the shared `apply_env_overrides`, flipping `low_motion` to true and failing `non_vscode_term_program_does_not_force_low_motion`. Both guards now return `crate::test_support::lock_test_env()` (the same fold the v0.8.28 test-stabilization commit applied to the EnvGuard family in `commands/config.rs`, `commands/network.rs`, and `tools/recall_archive.rs`). This serializes the two test groups with each other and with every other env-mutating test in the suite, eliminating the cross-test env-var race. `save_api_key_for_openrouter_writes_provider_table` was failing intermittently for the same reason — a concurrent env mutation in an unrelated test was clobbering HOME / DEEPSEEK_CONFIG_PATH in the window between our `EnvGuard::new` and `save_api_key_for`'s `default_config_path()` read. With the broader serialization in place, the race window closes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:16:18 -05:00
Hunter Bown	9759a77ae5	fix(working-set): wire #1270 's add_local_reference_completions into completions() and build_file_index PR #1270 from @SamhandsomeLee landed the function definitions and both regression tests as part of v0.8.27, but the wiring was incomplete: `add_local_reference_completions` was never called from `completions()`, and `build_file_index` never walked `local_reference_paths`. Hunter marked the two tests `#[ignore]` with a "v0.8.28 follow-up" trailer in `fe0673d68`. This completes the wiring: * `Workspace::completions()` now calls `add_local_reference_completions` for both the diverging-cwd branch and the workspace-root branch, after the existing `walk_for_completions`. The helper is a no-op unless the needle starts with `.` or contains `/` / `\`, so prose mentions skip the extra walk. * `Workspace::build_file_index()` now walks `local_reference_paths` after the curated dot-dir whitelist (`.deepseek`, `.cursor`, `.claude`, `.agents`), so explicit user paths into other gitignored dirs (e.g. `.generated/specs/device-layout.md`) fuzzy-resolve too. Honors `FILE_INDEX_MAX_ENTRIES` so the #697 walk-cap still bounds first-turn latency. * Drops `#[allow(dead_code)]` from the four helpers (`LOCAL_REFERENCE_SCAN_LIMIT`, `add_local_reference_completions`, `should_try_local_reference_completion`, `local_reference_paths`, `should_skip_local_reference_dir`) and un-ignores both `working_set` regression tests: `workspace_completions_surface_explicit_hidden_and_ignored_paths` and `fuzzy_index_resolves_hidden_and_ignored_files_except_deepseekignored`. Both tests pass. `.deepseekignore` entries remain blocked from both completion and basename fuzzy-resolution paths because `local_reference_paths` adds `.deepseekignore` as a custom-ignore file on the walker. Closes the v0.8.28 follow-up trailer left on `fe0673d68`. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:11:52 -05:00
Zhiping	1035502bf9	fix: Ctrl+Enter content lost when engine is idle (#1331 ) The steer channel (rx_steer) is only monitored inside handle_deepseek_turn — when the engine is idle (no active turn), Ctrl+Enter sent the message into the void. It appeared in the transcript via the local mirror in steer_user_message, but the LLM never received it, and the next handle_send_message would drain it as a "stale steer". Fix: check app.is_loading in the Ctrl+Enter handler. When the engine is busy, steer into the current turn as before. When idle, send via submit_or_steer_message so the message goes through the regular Op::SendMessage path.	2026-05-10 18:10:36 -05:00
xuezhaoyu	beb4120adc	fix: guarantee END_SYNC_UPDATE is sent even when draw fails Wrap fallible draw operations in closures so that ESC[?2026l is always sent regardless of whether an intermediate step (write_all, flush, clear, draw) returns an error. Without this, a failing ? would return early and leave the terminal stuck in synchronized-update mode with a frozen screen. Review feedback from gemini-code-assist[bot] on #1361.	2026-05-10 18:10:00 -05:00
xuezhaoyu	0bd2832e56	fix: add DEC 2026 synchronized updates to prevent flickering on GPU terminals Wrap terminal.draw() and reset_terminal_viewport() with ESC[?2026h/l so GPU-accelerated terminals (Ghostty, VS Code, Kitty) defer rendering until the full frame is written, eliminating intermediate-frame flicker. Merge viewport-reset + draw into a single sync batch to avoid a visible blank frame between the two operations. Best-effort — unsupported terminals silently ignore the sequences. Fixes #1352	2026-05-10 18:10:00 -05:00
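The two synchronized-update commits above (0bd2832e56 and its follow-up beb4120adc) amount to a pattern like the one sketched here: begin the batch, run the fallible drawing step, and send the end sequence whether or not drawing succeeded. The helper names and the generic `Write` target are assumptions for illustration; the project wraps its own `terminal.draw()` call.

```rust
use std::io::{self, Write};

/// DEC private mode 2026: begin / end synchronized update.
const BEGIN_SYNC_UPDATE: &[u8] = b"\x1b[?2026h";
const END_SYNC_UPDATE: &[u8] = b"\x1b[?2026l";

/// Fallible part of a frame: begin marker, caller's drawing, flush.
fn draw_frame<W, F>(out: &mut W, draw: F) -> io::Result<()>
where
    W: Write,
    F: FnOnce(&mut W) -> io::Result<()>,
{
    out.write_all(BEGIN_SYNC_UPDATE)?;
    draw(out)?;
    out.flush()
}

/// Runs `draw` inside a synchronized update and always attempts to end it,
/// so an early error cannot leave the terminal frozen in sync mode.
pub fn draw_synchronized<W, F>(out: &mut W, draw: F) -> io::Result<()>
where
    W: Write,
    F: FnOnce(&mut W) -> io::Result<()>,
{
    let result = draw_frame(out, draw);
    let end = out.write_all(END_SYNC_UPDATE).and_then(|()| out.flush());
    // Report the drawing error first; otherwise report any error from ending the batch.
    result.and(end)
}
```

Isolating the `?`-using steps in their own function is one way to get the "closure" effect the follow-up commit describes: an early return only leaves `draw_frame`, not the code that sends the end sequence.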
LinQ	9b2e747eeb	docs(tools): add steering sentences to tool descriptions Each tool description now names the shell commands it replaces (cat/head/tail/sed/grep/find/curl/heredocs in exec_shell), the return shape, and the limits. Steering language routes V4 toward our typed tools and away from shell footguns. Tools updated: read_file, write_file, edit_file, list_dir, grep_files, file_search, web_search, apply_patch, fetch_url. Removes the unused legacy normal.txt / plan.txt / yolo.txt prompt templates and the YOLO_PROMPT / PLAN_PROMPT constants. Both constants were referenced only by their own self-tests in prompts.rs; AGENT_PROMPT is preserved (its companion .txt is in the scope of a separate issue). All description strings stay under 1024 chars (max: 350) with no embedded newlines or Markdown headers, so the cached tool catalogue stays prefix-stable for V4's KV cache. Closes #711 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 18:08:47 -05:00
CrepuscularIRIS	3b1d0ae907	fix(settings): clarify apply_env_overrides comment — env always wins The previous comment incorrectly suggested that a user-set low_motion=false in the settings file could override the TERM_PROGRAM=vscode detection. In fact, apply_env_overrides() runs after disk load and unconditionally sets the flag, identical to the existing NO_ANIMATIONS precedent. Update the comment to state the actual precedence clearly. Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-10 18:08:43 -05:00
CrepuscularIRIS	5bededf77c	fix(settings): enable low_motion automatically in VS Code terminal VS Code's integrated terminal sets TERM_PROGRAM=vscode. Its compositor cannot keep up with the default 120 FPS redraw rate, producing rapid flickering on some machines while other terminal apps (Terminal.app, iTerm2) are unaffected (#1356). Extend apply_env_overrides() to detect TERM_PROGRAM=vscode and automatically activate low_motion mode (30 FPS cap, no fancy animations), matching the existing NO_ANIMATIONS env-var pattern. This is a zero-config fix: users running in VS Code get a stable display with no settings change required. Users who want the full animation rate can still set low_motion = false explicitly in their settings file; that file-level value is already loaded before apply_env_overrides() is called, so an explicit false in the file wins over this auto-detection. Two tests added: - vscode_term_program_forces_low_motion_on: TERM_PROGRAM=vscode enables low_motion and disables fancy_animations. - non_vscode_term_program_does_not_force_low_motion: other well-known terminal programs (iTerm.app, Apple_Terminal, WezTerm, xterm-256color) are unaffected. Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-10 18:08:43 -05:00
Hunter Bown	9ba42f0995	test: stop provider_switch test from clobbering user settings.toml `provider_switch_clears_turn_cache_history` in `tui/ui/tests.rs` calls `switch_provider(..., ApiProvider::Ollama, ...)`, which internally persists the new provider via `Settings::save()` — writing `default_provider = "ollama"` to `~/Library/Application Support/deepseek/settings.toml` (or its `dirs::data_dir()` equivalent on Linux/Windows). Because the test's `create_test_app` did not isolate `HOME` / `USERPROFILE`, each run silently overwrote the developer's real preferences. The contamination then leaked back into adjacent picker tests: `model_picker::tests::arrow_keys_move_within_focused_pane` became order-sensitive, passing when it happened to run before `provider_switch_clears_turn_cache_history` and failing after, because Ollama is a pass-through provider and `ModelPickerView` then hid the DeepSeek model rows. Two fixes: * `tui/ui/tests.rs::provider_switch_clears_turn_cache_history` now wraps the test in a `HomeGuard` that redirects HOME / USERPROFILE to a tempdir for the test's lifetime and restores the original values on drop. The guard owns the `test_support::lock_test_env()` mutex so clippy's `await_holding_lock` lint stays quiet through the `.await` (the pattern mirrors `tools::recall_archive::HomeGuard`). * `tui/model_picker.rs::create_test_app` now also pins `app.api_provider = ApiProvider::Deepseek` alongside the existing `app.model` / `app.reasoning_effort` overrides, so the picker tests stop depending on whatever `default_provider` happens to be in the developer's `settings.toml` for any other reason. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:50:36 -05:00
nightfallsad	0388a5510f	fix: the continue tip is vague	2026-05-10 17:38:07 -05:00
nightfallsad	7846908ac8	fix: the continue tip is vague	2026-05-10 17:38:07 -05:00
Leo Douglas	a5c107e115	fix(ui): suppress duplicate error toast when turn error is already in transcript When an engine error occurs (e.g. refusing an insecure HTTP base URL), the same error was displayed twice: once as a HistoryCell::Error in the transcript and again as a sticky toast in the footer/composer area. The toast was created because apply_engine_error_to_app set status_message, which sync_status_message_to_toasts() converted into a sticky toast (15s TTL) since the text contained "error"/"failed". Add turn_error_posted flag to App, set when an EngineEvent::Error is posted to the transcript, reset on TurnStarted. The TurnComplete error handler and apply_engine_error_to_app now skip setting status_message when the flag is set, keeping the error display in the transcript only. The auth+env_only onboarding path retains its status_message since that flow relies on it to prompt the user for a saved API key.	2026-05-10 17:38:04 -05:00
Zhiping	a6a598bd32	feat: add prompt guidance for verification and tool failure handling Three additions to base.md's Verification Principle section: 1. Before reporting a task as complete, verify the result when practical; if not verified, say so explicitly. 2. Preserve only key facts from tool results (paths, errors, exit status, cache values); do not copy large raw outputs. 3. Inspect error before retrying a failed tool call; do not repeat the identical action blindly.	2026-05-10 17:38:00 -05:00
LinQ	af22a5c7b1	fix(working-set): cap file-index walk to bound first-turn latency (#697 ) `WorkingSet::build_file_index` walks the workspace tree (depth 6) plus all `DISCOVERY_ALWAYS_DIRS` (depth 5) the first time `fuzzy_resolve` is called. On huge workspaces that walk dominates the first turn's wall clock, surfacing as the ~10-second `Working...` hang reported in #697. Adds a `FILE_INDEX_MAX_ENTRIES = 50_000` cap. When the walk produces more than 50K (file or directory) entries the index is returned early with a warning. A surplus entry simply isn't fuzzy-resolvable; literal paths still resolve via the existing fallback so functionality is preserved on outsized workspaces. 50K is well above any realistic project's depth-6 entry count, so for typical users the cap is a no-op. The existing `working_set` tests (26/26) still pass — this is purely a defensive upper bound on a path that previously had none. Refs #697 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:36:40 -05:00
CrepuscularIRIS	60347b8940	feat(notifications): add LC_TERMINAL fallback probe for Cmux and test coverage Cmux typically does not set TERM_PROGRAM; it sets LC_TERMINAL=Cmux instead. The previous resolve_method() only checked TERM_PROGRAM, causing Cmux users to fall back to the Bel method instead of OSC 9 notifications (#1281). Changes: - Add LC_TERMINAL as a secondary env-var probe in resolve_method(), checked after TERM_PROGRAM. This picks up Cmux (and any other OSC-9 capable terminal that sets LC_TERMINAL rather than TERM_PROGRAM). - Add Cmux to the OSC9_TERMINALS allowlist. - Document that terminals setting neither env var can force OSC 9 with [notifications].method = "osc9" in the config file. - Add two new tests: - auto_detect_picks_osc9_for_cmux_via_lc_terminal - auto_detect_picks_osc9_for_wezterm_via_lc_terminal - Harden existing auto_detect_picks_bel_for_unknown_on_unix to clear LC_TERMINAL before asserting the Bel fallback, preventing flakiness in test runner environments where LC_TERMINAL is set to a known terminal. - Update NotificationsConfig.method doc to mention Cmux and the LC_TERMINAL probe. Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>	2026-05-10 17:36:33 -05:00
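The probe order from 60347b8940 (TERM_PROGRAM first, then LC_TERMINAL, then the Bel fallback) could be sketched as below. Everything here is illustrative: the `Method` enum, the pure-function signature that takes the two values as arguments rather than reading the environment, and the case-insensitive comparison are assumptions, and the allowlist holds only the two names the commit message confirms. The explicit `[notifications].method = "osc9"` override is left out.

```rust
/// Illustrative notification methods.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Method {
    Osc9,
    Bel,
}

/// Partial allowlist: only names confirmed by the commit message.
const OSC9_TERMINALS: &[&str] = &["Cmux", "WezTerm"];

/// Assumed to compare case-insensitively; the real check may be stricter.
fn is_osc9_terminal(name: &str) -> bool {
    OSC9_TERMINALS.iter().any(|t| t.eq_ignore_ascii_case(name))
}

/// Checks TERM_PROGRAM first, then LC_TERMINAL, then falls back to Bel.
pub fn resolve_method(term_program: Option<&str>, lc_terminal: Option<&str>) -> Method {
    if term_program.map_or(false, is_osc9_terminal)
        || lc_terminal.map_or(false, is_osc9_terminal)
    {
        Method::Osc9
    } else {
        Method::Bel
    }
}
```

Taking the values as parameters keeps the function independent of the process environment, which sidesteps the kind of flakiness the hardened `auto_detect_picks_bel_for_unknown_on_unix` test guards against.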
Hunter Bown	c2d13c21bf	test: stabilize parallel test execution + bump task_manager timeouts Fixes a cluster of intermittent failures observed on macOS under parallel test load. Root causes were tests mutating shared global state (HOME, USERPROFILE, DEEPSEEK_CONFIG_PATH env vars and ~/.deepseek/ filesystem) without holding the process-wide test lock, plus a few outdated-by-PR assertions and a tight 3s timeout on Windows CI. Changes: * Three EnvGuard / HomeGuard types (commands/config.rs, commands/network.rs, tools/recall_archive.rs) now acquire crate::test_support::lock_test_env() and hold the MutexGuard for their full lifetime, replacing local mutexes that serialized only within a module. Call sites that previously acquired lock_test_env() explicitly with `let _lock = ...` before constructing the guard drop that redundant acquisition; std::sync::Mutex is not reentrant and double-locking on the same thread would deadlock. * settings.rs::config_path_test_guard() now returns the global test_env lock instead of an isolated module-local mutex. * model_picker.rs create_test_app() now returns (App, MutexGuard) so picker tests hold the same lock — eliminates env-var races with config-mutating tests in adjacent modules. * task_manager.rs: 4 tests using wait_for_terminal_state bump 3s -> 10s to give Windows CI file-I/O headroom (we saw one intermittent timeout on the v0.8.27 PR Windows job). * config.rs: 2 api-key tests now set DEEPSEEK_SECRET_BACKEND=local so they exercise file-backed storage in CI rather than fail on Keychain access. * history.rs: removes streaming_thinking_live_collapses_unless_verbose which asserted the OLD behavior PR #1390 (#861 RC4) intentionally changed. The new contract is covered by the three tests PR #1390 added. * .claude/HANDOFF_v0.8.28_user_issues.md: notes #1394 / PR #1393 as a deferred prompt-reliability enhancement. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 17:33:57 -05:00
Zhiping	5ccfe91bff	fix: cache usage unknown shown as 0, and add truthful reporting prompt (#false-claim) Two independent fixes: 1. Prompt truthful reporting (base.md): add explicit rules for honest outcome reporting — if a tool fails/returns-empty say so; if cache usage is unobserved treat it as unknown/null, not 0. 2. Cache usage u64 → Option<u64> (session.rs): when the API does not report cache hit/miss tokens, the cumulative SessionUsage defaulted to 0. Models interpreted this as "no cache hits" rather than "unknown". Changing to Option<u64> ensures absent cache data serializes as null in the model context. Tests added for all three cases: starts None, stays None when API omits cache, accumulates correctly when API reports cache.	2026-05-10 16:43:26 -05:00
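The `u64` → `Option<u64>` change in 5ccfe91bff can be illustrated with a small accumulator that keeps "unknown" distinct from zero. The field names and the `record_turn` method are hypothetical; only the `SessionUsage` name and the three behaviours under test (starts `None`, stays `None` when the API omits cache data, accumulates when it reports it) come from the commit message.

```rust
/// Cumulative usage where `None` means "never reported", not "zero".
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct SessionUsage {
    pub cache_hit_tokens: Option<u64>,  // hypothetical field name
    pub cache_miss_tokens: Option<u64>, // hypothetical field name
}

/// Adds a reported value to the running total; an absent report changes nothing.
fn accumulate(total: &mut Option<u64>, reported: Option<u64>) {
    if let Some(n) = reported {
        *total = Some(total.unwrap_or(0).saturating_add(n));
    }
}

impl SessionUsage {
    /// Records one turn's cache counters as reported (or omitted) by the API.
    pub fn record_turn(&mut self, hit: Option<u64>, miss: Option<u64>) {
        accumulate(&mut self.cache_hit_tokens, hit);
        accumulate(&mut self.cache_miss_tokens, miss);
    }
}
```

Because the total only becomes `Some` once a real value arrives, a session on a provider that never reports cache data keeps `None` throughout, which is what lets it serialize as null rather than 0.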
LinQ	ca5204e311	fix(ui): stream thinking content during collapsed view (#861 RC4, #1324 ) Closes the visibility gap reported in #1324 ("Thinking 思考内容不能流式输出，只能等到完全输出后通过 ctrl+O 查看完整思考内容") and root cause 4 of #861. Today `render_thinking` blanks the body whenever `collapsed && streaming`: ```rust let body_text = if collapsed && streaming { String::new() } else if collapsed { … } else { … }; ``` That left the user staring at a "thinking..." placeholder for the entire reasoning phase — V4-Pro thinking can run for tens of seconds, so the live transcript looked frozen even though tokens were flowing. Fix: 1. During `collapsed && streaming` we now render the raw content instead of blanking. `extract_reasoning_summary` is meaningless while the block is mid-flight (no completed reasoning to summarise), so the streaming branch returns the body verbatim. 2. The `> THINKING_SUMMARY_LINE_LIMIT` truncation now drops head lines while streaming, keeping the visible window tracking the live cursor at the bottom — which is what users expect when watching a model think. 3. The existing "thinking collapsed; press Ctrl+O for full text" affordance was gated on `!streaming`; it now renders during streaming as well, with a slightly different label ("thinking continues; …") so the user knows there's more content above and how to reach it. Three new tests cover the new contract: streaming-collapsed shows live content, the head is dropped not the tail, and the live affordance fires when truncated. Refs #861 (RC4), closes #1324 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 16:43:16 -05:00
LinQ	741f36d2be	fix(ui): drain active thinking entry on MessageComplete (#861 RC3) Closes the order-of-events race in #861 root cause 3: when the engine bursts events, the dispatch loop can pull `MessageComplete` off the channel ahead of `ThinkingComplete`. Today's `MessageComplete` reads `app.last_reasoning.take()` to attach the reasoning block to the assistant message in `api_messages`. If `ThinkingComplete` has not fired yet, `last_reasoning` is `None` and the thinking content is dropped — DeepSeek V4 then returns HTTP 400 on the next turn because it requires `reasoning_content` replay for assistant messages that carry tool calls. Adds a defensive head-of-handler drain in `MessageComplete`: when `streaming_thinking_active_entry.is_some()`, finalize the active thinking entry and stash the reasoning buffer into `last_reasoning` before the existing body runs. The drain is a no-op in the normal case where `ThinkingComplete` arrived first (the entry has already been cleared), so this branch is order-independent. Adds `message_complete_drain_preserves_thinking_when_thinking_complete_lost` which exercises the head-of-handler invariant directly: with a thinking entry still active and `last_reasoning` empty, the drain must move the buffer into `last_reasoning` before downstream reads. Refs #861 (RC3). RC1 and RC2 are already addressed by the existing `finalize_current_streaming_thinking` plumbing in `apply_engine_error_to_app` and `start_streaming_thinking_block`; RC4 (streaming-time truncation affordance) is left out of this PR to keep the scope on the data-loss path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 16:43:16 -05:00
Zhiping	d4a4892ee3	fix: deny of one tool call permanently blocks all future same-tool calls (#1377 ) When a user denies a tool call (e.g. edit_file), the tool_name was inserted into approval_session_denied alongside the per-call approval_key. Every subsequent invocation of the same tool type was then auto-denied for the rest of the session without prompting. Fix: only store the approval_key (per-call unique). This still prevents the model's retry loop from re-prompting the exact same command (#360), but allows the user to approve a fresh invocation of the same tool type.	2026-05-10 16:43:16 -05:00
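The fix in d4a4892ee3 narrows what goes into the denied set. A minimal sketch, assuming a `HashSet<String>` behind `approval_session_denied` and a hypothetical `ApprovalState` wrapper; the key strings in the usage note are invented for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical holder for the per-session denial set.
#[derive(Default)]
pub struct ApprovalState {
    approval_session_denied: HashSet<String>,
}

impl ApprovalState {
    /// Records a denial by its per-call key only, never by tool name.
    pub fn record_denial(&mut self, approval_key: &str) {
        self.approval_session_denied.insert(approval_key.to_owned());
    }

    /// A call is auto-denied only if this exact key was denied before.
    pub fn is_auto_denied(&self, approval_key: &str) -> bool {
        self.approval_session_denied.contains(approval_key)
    }
}
```

After `record_denial("edit_file:src/a.rs")`, a retry with the same key is auto-denied, while a fresh `edit_file` call with a different key still prompts the user.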
Hunter Bown	8956b698c0	fix(mcp): validate path inside mcp_config_mtime (CodeQL defensive) CodeQL's rust/path-injection scan flagged `mcp_config_mtime(path)` because the helper takes `&Path` and calls `fs::metadata(path)`. Both call sites already validate via `validate_mcp_config_path` — `from_config_path` runs the check before constructing the pool, and `reload_if_config_changed` only sees paths that came from a `from_config_path`-validated `config_source` field — so the alert is a false positive about cross-function data flow. The clean fix is to tie the validation to the call site rather than rely on cross-function reasoning: `mcp_config_mtime` now short-circuits to `None` for paths that fail the same allow-list check `load_config` and `save_config` already use. The lazy-reload caller already treats `None` as "skip the check this turn", so a rejected path simply degrades gracefully rather than producing an error path. Cost is one regex check per call on a path we're about to stat anyway. This also makes the helper safe-by-construction for any future caller that forgets to validate, which matches the pattern of the adjacent `load_config` / `save_config` helpers. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 12:01:19 -05:00
Hunter Bown	c87196835c	feat(skills): /skills <prefix> filters the local list (#1318 ) On top of v0.8.26's inter-row spacing for /skills (#1328 from @reidliu41), the list now also accepts an optional name-prefix argument so users with crowded skill folders can narrow the view without scrolling. /skills → full list (unchanged) /skills git → only skills whose name starts with "git" /skills GIT → same (case-insensitive) /skills nope → "No skills match prefix `nope` (out of 12 …)" /skills --remote → unchanged /skills sync → unchanged /skills --bogus → "Usage: …" error (rejected so future flags don't silently turn into no-match prefixes) The match-count header reflects both the matched count and the registry total, so the user can see at a glance how aggressive the filter is. Empty match sets explicitly say so and point back at unfiltered `/skills`. Skill names that start with `-` aren't allowed by the loader, so reserving the dash prefix for flags is safe. Plus the matching usage / description updates in the command metadata + all four shipping locales (en / ja / zh-Hans / pt-BR) so /help shows the new argument. Closes #1318. Thanks @simuusang for the report. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:51:57 -05:00
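The argument handling described in c87196835c might be sketched as follows. The `SkillsArg` enum and both function names are hypothetical, and the error text is a placeholder: the real usage string is elided in the commit message and is not reproduced here.

```rust
/// Hypothetical parse result for the `/skills` argument.
#[derive(Debug, PartialEq, Eq)]
pub enum SkillsArg {
    All,
    Remote,
    Sync,
    Prefix(String),
}

/// Classifies the argument; unknown dash-prefixed flags are rejected
/// instead of being treated as a prefix that matches nothing.
pub fn parse_skills_arg(arg: &str) -> Result<SkillsArg, String> {
    match arg.trim() {
        "" => Ok(SkillsArg::All),
        "--remote" => Ok(SkillsArg::Remote),
        "sync" => Ok(SkillsArg::Sync),
        // Placeholder message; the real command prints a usage string.
        flag if flag.starts_with('-') => Err(format!("unrecognised flag: {}", flag)),
        prefix => Ok(SkillsArg::Prefix(prefix.to_lowercase())),
    }
}

/// Case-insensitive name-prefix filter over the local skill names.
pub fn filter_by_prefix<'a>(names: &[&'a str], prefix: &str) -> Vec<&'a str> {
    let needle = prefix.to_lowercase();
    names
        .iter()
        .copied()
        .filter(|name| name.to_lowercase().starts_with(needle.as_str()))
        .collect()
}
```

Reserving the leading dash for flags is safe for the reason the message gives: the loader does not allow skill names that start with `-`.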
Hunter Bown	3abd4dd988	feat(tui): localize onboarding screens (api key, trust, tips, picker) When a user picked 简体中文 / 日本語 / Português (Brasil) on the step-2 language picker, every subsequent onboarding screen used to stay in English. The set_locale_from_onboarding path already re-resolved `app.ui_locale`, but the hardcoded `Line::from(Span::styled("Connect your DeepSeek API key", …))` strings in `onboarding/api_key.rs`, `trust_directory.rs`, `language.rs`, and the `tips_lines()` block in `onboarding/mod.rs` never consulted the locale. This commit: - Adds 25 `MessageId` entries (`OnboardLanguageTitle`, `OnboardApiKey`, `OnboardTrust`, `OnboardTips*`, …) covering the title / body / hint / footer strings for each screen. - Translates each into all four shipping locales (en / ja / zh-Hans / pt-BR), with the same care the existing translation surfaces use (no machine translation; idiomatic phrasing for each locale). - Threads the active locale through `language::lines`, `api_key::lines`, `trust_directory::lines`, and `tips_lines` via `app.tr(MessageId::…)`. - Adds `api_key_screen_renders_in_selected_locale` regression test pinning that the rendered lines actually contain the translated strings for zh-Hans / ja / en. Particularly noticeable for users on CJK input methods: picking their language at step 2 now means the remaining setup runs in that language rather than forcing IME juggling for English text. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:43:04 -05:00
Hunter Bown	f267d4b874	fix(tui): paste-burst detects short CJK first line as paste (#1302) Cherry-picked from @reidliu41's PR #1342. Pasting `请联网搜索：\n…` (short non-ASCII first line + newline) used to fail the `decide_begin_buffer` heuristic — `grabbed.chars().any(is_whitespace)` is false on a 6-codepoint Chinese run, and `chars().count() >= 16` is false at 6 chars — so the trailing pasted newline fell through as a real Enter and submitted the first line on its own. The heuristic now also treats `!grabbed.is_ascii()` as paste-like, which captures the CJK case without false-firing on ASCII typing (plain ASCII typists still need either whitespace or 16+ chars to look like a paste). Includes the regression test from PR #1342, slightly reworded. Closes #1302. Thanks @reidliu41. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:30:57 -05:00
Hunter Bown	8bf2557692	feat(tui): consolidate large pastes at paste-time + auto-disable burst Two paste-UX improvements that address recurring complaints: 1. Visible-before-submit consolidation. v0.7.x added a 16 000-char safety cap that folded oversized inputs into `.deepseek/pastes/paste-…md` and swapped them for an `@`-mention so the model could read the full content via the normal mention-resolution path. The cap was checked inside `submit_input` only — meaning a user who pasted 50k chars and pressed Enter saw the file get created AND the message sent in the same frame, with no chance to review the @-mention beforehand. People reasonably read this as "the TUI auto-sent an @-mention I didn't authorise." Consolidation now also runs at the end of `insert_paste_text`, so the @-mention shows up in the composer (along with a "consolidated — sent as @mention" toast) the moment the paste lands. The submit-time path stays as a safety net for any other code path that fills the buffer above the cap, so the cap is still enforced exactly once. 2. Auto-disable paste-burst on verified bracketed paste. The paste-burst heuristic (rapid-keystroke detection for terminals without bracketed paste) used to run unconditionally. On modern terminals (Ghostty / iTerm2 / WezTerm / Windows Terminal) bracketed paste is reliable, and paste-burst running alongside it created false positives — fast typing, IME commits, autocomplete bursts could all be mis-classified as a paste. The new `App::bracketed_paste_seen` flag flips to `true` the first time a real `Event::Paste` arrives; from that moment, `handle_paste_burst_key` short-circuits. Terminals that never deliver bracketed paste (the original target audience) are unaffected — the heuristic still fires for them. Both changes have new unit tests: - `paste_consolidates_oversized_text_into_paste_file_visibly` - `paste_under_threshold_does_not_consolidate` - `paste_burst_short_circuits_after_bracketed_paste_observed` Existing `submit_input_consolidates_oversized_input_into_paste_file` still passes — it bypasses `insert_paste_text` and exercises the safety net. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:29:03 -05:00
Hunter Bown	524c513c03	chore: workspace clippy + missing-field fix-up for preflight Two small workspace-clippy gaps that snuck through the per-crate sweeps in this branch: * `crates/cli/src/lib.rs` — the OpenAI-provider passthrough test was building a `ResolvedRuntimeOptions` literal directly and missed the `yolo: Option<bool>` field that landed earlier on this branch in `665801bb8` (`fix(cli): forward --yolo to TUI binary`). Set to `None` to match the test's non-yolo intent. * `crates/tui/src/mcp.rs` — the new `reload_if_config_changed` swap test was using `iter().any(|n| *n == "new")`, which is rust-1.94 clippy's `manual_contains` lint. Switched to `names.contains(&"new")`. `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` is now green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:15:09 -05:00
Hunter Bown	499937be50	feat(tools): add model-callable notify tool (#1322) Adds a new always-loaded tool spec `notify` that lets the model trigger a single desktop notification when a long-running task completes or genuinely needs the user's attention. Implementation delegates to the existing `tui::notifications` infrastructure, so the user's `[notifications].method` config drives delivery: - iTerm2 / Ghostty / WezTerm → OSC 9 (banner + sound) - macOS / Linux fallback → BEL - Windows → off by default; opt-in to BEL + MessageBeep - `method = "off"` → silent no-op (the tool still succeeds) Title and body are character-bounded (80 / 200 chars) and trim-checked, so a runaway model can't paint a paragraph into the terminal title bar or slice through a multi-byte sequence and emit invalid UTF-8. tmux passthrough is detected via `$TMUX` and OSC 9 is double-escaped so the outer terminal still receives it. The tool description steers the model away from chatter — only fire on real completion / attention beats, not as a "still alive" ping. Always-loaded (added to `should_default_defer_tool`'s allowlist) so the model sees it without a ToolSearch round-trip; auto-approval since the only side effect is a single terminal escape write. Closes #1322. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:09:50 -05:00
Hunter Bown	0ee70178ef	feat(mcp): lazy auto-reload of pool on mcp.json content change (#1267 pt2) v0.8.26 surfaced the spawn-error tail (`Stdio transport closed` → the underlying EACCES/sandbox-deny line). v0.8.27 closes the second half of the report: users no longer need to manually `/mcp reload` after editing `~/.deepseek/mcp.json`. `McpPool` gained three fields: a `config_source` path (set when the pool is built via `from_config_path`), a 64-bit content hash of the active config, and the most recently observed mtime of the source file. `reload_if_config_changed` does a cheap `stat` first; on mtime-equal it returns immediately. Only when the mtime has moved does the pool re-read the file, hash it, and compare to the stored hash — content-unchanged reloads (e.g. `touch` on a networked FS) are skipped. On a real content change the connections map is cleared so the next `get_or_connect` reattaches under the new config (sandbox flags, env, args, server set). `get_or_connect` now invokes `reload_if_config_changed` at entry and swallows its errors (a transient stat/parse failure can't take down the existing pool). Pools built via `McpPool::new` (tests, ad-hoc snapshots) are unaffected — they have no source path and short-circuit out. No file watcher, no long-lived task, no signature changes for the existing callers. Closes the part-2 follow-up on #1267. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 10:04:21 -05:00
Hunter Bown	21f9e9d38d	feat(tui): context-sensitive Ctrl+C — copy / cancel / arm-exit (#1337, #1367) Plain Ctrl+C used to mean "cancel turn or arm exit" unconditionally, which fought the OS-wide copy convention on Windows: every time a user pressed Ctrl+C to copy a model response, they instead armed the exit prompt and lost their place. Ctrl+Shift+C and Cmd+C copied correctly but weren't discoverable. The handler is now a four-stage decision, factored into a `CtrlCDisposition` helper with a unit-tested priority table: 1. CopySelection — transcript selection active → copy + clear it (matches Windows / cross-platform Ctrl+C convention; #1337). 2. CancelTurn — turn in flight → cancel (unchanged). 3. ConfirmExit — quit-armed within the 2s window → exit. 4. ArmExit — idle, no selection → arm the "press Ctrl+C again" prompt for 2s (unchanged). A turn-in-flight beats a quit-arm even when both are true, so a Ctrl+C that lands while the user is mid-turn but had recently half-armed the exit prompt always cancels the turn rather than exiting. Pinned by `ctrl_c_disposition_loading_beats_armed_quit`. Cmd+C (macOS) and Ctrl+Shift+C continue to copy via `is_copy_shortcut` unchanged; only plain Ctrl+C now branches on selection state. For #1367, on TurnStarted the status-message slot now surfaces "Press Esc or Ctrl+C to cancel" if it's empty. Real transient messages still take precedence; the hint clears on the next status update. Closes the discoverability gap for users who didn't know how to interrupt a long-running task. Closes #1337, #1367. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 09:57:57 -05:00
Hunter Bown	be54a046d0	fix(markdown): hard-break overlong words to stop right-edge overflow Both `render_line_with_links` (paragraphs, list items) and the standalone `wrap_text` (code blocks) were word-based wrappers: when a single word's display width exceeded the available column budget they placed the word alone on a line and let it overflow the right edge of the transcript silently. Long URLs, file paths, commit hashes, JWTs, and any no-whitespace CJK run all hit this in #1344 and #1351 reports. The fix mirrors the v0.8.25 table-cell fix (`wrap_cell_text`): extract the per-character width-aware splitter as a free helper `push_word_breaking_chars`, and call it from `wrap_text`, `wrap_cell_text`, and the new char-break branch in `render_line_with_links`. Each rendered line is now guaranteed to fit in the requested width; full content is preserved across the wrapped segments. Snapshot-style regression tests pin the invariant at widths 40, 60, 80, and 120 — covering 200-char `a`-runs, long URL fixtures, mixed-short+overlong-word fixtures, and the existing table-cell property. A regression guard also confirms short words still break on whitespace (no mid-word breaks for ordinary prose). Closes #1344 (output-side overflow). Partial fix for #1351 (the table-cell concern was already fixed in v0.8.25; the long-prompt input-area concern is a separate visible-window issue, not a wrap bug — the composer already uses a grapheme-based wrapper). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 09:51:49 -05:00
Hunter Bown	aea6bb5f46	feat(tui): pager copy-out via c / y key (#1354) The pager intercepts mouse capture, so terminal-native selection is disabled inside it. Until now there was no in-app way to copy the content the user came specifically to see — high-frustration UX gap for the Alt+V (tool details), Ctrl+O (thinking), shell-job, task, MCP-manager, and selection pagers. Both `c` (clipboard convention) and `y` (vim-yank convention) now emit a `ViewEvent::CopyToClipboard` carrying the full pager body. The host dispatcher in `ui.rs` writes through `app.clipboard` and toasts a status confirmation ("Pager content copied" / "Copy failed"). Empty-body pagers report the empty state instead of silently no-op'ing. Footer hint updated to surface the new keys: j/k scroll Space page Ctrl+D/U half g/G top/bottom / search c copy q/Esc close Mouse selection inside the pager remains intercepted (the alternative — releasing capture inside the modal — would break vim-style navigation), so this is the supported copy path. Closes #1354. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 09:45:37 -05:00
Hunter Bown	328e3bd191	fix(skills): surface common-cause hints on /skills --remote / sync errors The user-facing error path already formatted the underlying anyhow chain with `{err:#}`, but reqwest chains alone read as opaque fragments ("error sending request: tcp connect: connection refused" etc.) for users without low-level network experience. `format_registry_error` now inspects the formatted chain for common failure signatures and appends a one-line hint: - DNS lookup / `getaddrinfo` failures - connection refused / reset / aborted - TLS handshake / certificate / SSL - HTTP 404 / 401 / 403 / 429 - request timed out Each hint points at the most likely cause (network reachability, trust store, registry URL, rate limit) and a concrete next step. The original chain is still rendered verbatim above the hint, so power users keep their detail and casual users get a starting point. Closes #1329 (the diagnostic side; the actual root cause is now diagnosable from the surfaced chain + hint). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 09:40:57 -05:00
Hunter Bown	abf3fa66f6	fix(tui): drop destructive 2J/3J from viewport reset to fix flicker The forced-repaint sequence written before each TurnComplete / focus-gain / resize used to be: \x1b[r\x1b[?6l\x1b[H\x1b[2J\x1b[3J which combined with the immediately-following ratatui `terminal.clear()` produced a double-clear. Terminals that don't optimize successive clears against the alt-screen buffer (Ghostty, VSCode integrated terminal, Win10 conhost) rendered the second clear as a visible blank-then-repaint flicker on every redraw trigger. The lighter sequence `\x1b[r\x1b[?6l\x1b[H` resets DECSTBM and DECOM and homes the cursor (still solving the original viewport-drift fix that 0.8.22 added) but leaves the pixel-clear to ratatui's diff renderer. The alt-screen buffer's double-buffering absorbs that single clear without flicker on every terminal we tested. Terminals that were already flicker-free (macOS Terminal.app, iTerm2, alacritty) remain so. Closes #1119, #1260, #1295, #1352, #1356, #1363, #1366. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 09:26:16 -05:00
Hunter Bown	fe0673d683	test: ignore unwired #1270 tests; mark for v0.8.28 follow-up	2026-05-10 09:00:31 -05:00
Hunter Bown	b8f01f1e49	fix: resolve clippy warnings from cherry-picked PRs	2026-05-10 08:55:53 -05:00
Hunter Bown	6c25a18b42	chore(release): bump to v0.8.27, add CHANGELOG	2026-05-10 08:41:04 -05:00
imkingjh999	cb084d1564	style: fix fmt and clippy warnings	2026-05-10 08:37:25 -05:00
imkingjh999	5065092a11	fix: let OPENAI_MODEL env override config.toml provider model Previously OPENAI_MODEL only set default_text_model which was lower priority than the provider config model. Now it directly overrides the openai provider's model field.	2026-05-10 08:37:25 -05:00
imkingjh999	a80c318098	fix: only use default_model fallback for DeepSeek providers Other providers (openai, ollama) should get their model from config.toml / env vars (e.g. OPENAI_MODEL), not from the global default_model setting which is DeepSeek-centric.	2026-05-10 08:37:25 -05:00
imkingjh999	9fae156182	fix: address PR review feedback - Use CARGO_PKG_VERSION for User-Agent instead of hardcoded version - Restore default_model fallback for backward compatibility	2026-05-10 08:37:25 -05:00
imkingjh999	b024a92ec4	feat: persist provider/model per-provider in settings - Save provider choice to settings.default_provider on switch - Save model per-provider to settings.provider_models - On startup, load provider-specific model instead of global default - Hide DeepSeek models from picker on pass-through providers - Show friendly message for /models on unsupported providers	2026-05-10 08:36:52 -05:00
lbcheng	0ab0d77b98	feat(tui): add card-rail (╭ │ ╰) boxing to tool output cells Wraps ExecCell and GenericToolCell rendered output with card-rail glyphs for visual structure, similar to Claude Code's card-style tool rendering. - wrap_card_rail(): adds ╭/│/╰ glyphs to rendered lines - Applied to ExecCell::render and GenericToolCell::lines_with_mode - Pre-computed caches (output_summary, is_diff) kept from previous commit for per-frame performance - Live mode output remains visible (head+tail+omitted), not collapsed - Card-rail glyphs reused from existing tool_card.rs CardRail enum Test plan: cargo test -p deepseek-tui (2380 passed, 0 failed) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 08:34:55 -05:00
lbcheng	3cbd336bd5	perf(tui): pre-compute tool output summaries and collapse live view Pre-compute render caches to avoid re-parsing every frame: - Add output_summary: Option<String> to ExecCell and GenericToolCell - Add is_diff: bool to GenericToolCell (cached after first detection) - Populate caches once in handle_tool_call_complete / orphan path Live mode rendering simplified to one-line summary + expand affordance: - GenericToolCell and ExecCell now show a single muted summary line with "Enter to expand tool output" affordance in Live mode - Transcript mode still emits full output - render_tool_output_summary_line truncates to fit terminal width - Make output_looks_like_diff pub(crate) for pre-computation access Test plan: - cargo test -p deepseek-tui (2379 passed) - config_ui::build_document_reflects_app_state is a pre-existing failure Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-05-10 08:34:55 -05:00
lbcheng	9f2b8a5f14	fix(tui): move tolerance to module-level const, add total helpers Promote COST_EQ_TOLERANCE from a function-local const to a module-level constant in sidebar.rs. Add SessionCostSnapshot::total_usd() and total_cny() helpers that encapsulate session+subagent cost summation, used during session restore. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> # Conflicts: # crates/tui/src/session_manager.rs # crates/tui/src/tui/sidebar.rs # crates/tui/src/tui/ui.rs	2026-05-10 08:32:25 -05:00
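
The Ctrl+C priority order described in commit 21f9e9d38d can be sketched as a pure function. This is a minimal illustration under stated assumptions, not the repository's code: the `CtrlCDisposition` name and its four variants come from the commit message, while `decide_ctrl_c` and its three boolean parameters are hypothetical, and the 2s arming window is reduced to a plain `quit_armed` flag.

```rust
/// The four outcomes named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CtrlCDisposition {
    CopySelection,
    CancelTurn,
    ConfirmExit,
    ArmExit,
}

/// Hypothetical helper: picks a disposition by checking the conditions
/// in the order the commit message lists them.
///
/// * `has_selection`  - a transcript selection is active
/// * `turn_in_flight` - a model turn is currently running
/// * `quit_armed`     - a previous Ctrl+C armed the exit prompt and the
///                      arming window has not yet expired
pub fn decide_ctrl_c(
    has_selection: bool,
    turn_in_flight: bool,
    quit_armed: bool,
) -> CtrlCDisposition {
    if has_selection {
        // Copy wins over everything else.
        CtrlCDisposition::CopySelection
    } else if turn_in_flight {
        // A running turn beats an armed quit, so this check comes first.
        CtrlCDisposition::CancelTurn
    } else if quit_armed {
        CtrlCDisposition::ConfirmExit
    } else {
        // Idle with nothing selected: arm the "press again" prompt.
        CtrlCDisposition::ArmExit
    }
}
```

Keeping the decision free of timers and terminal state makes the priority table easy to test exhaustively: with three booleans there are only eight input combinations.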


771 Commits
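
The paste-burst check described in commit f267d4b874 combines three signals. The sketch below is a hedged reading of that commit message only: `looks_like_paste` is a hypothetical name standing in for the relevant part of `decide_begin_buffer`, and the real function may take other inputs into account.

```rust
/// Hypothetical stand-in for the "does this look like a paste?" test.
/// `grabbed` is the text captured so far on the first line.
pub fn looks_like_paste(grabbed: &str) -> bool {
    // Signal 1: typed-in bursts rarely contain whitespace this early.
    let has_whitespace = grabbed.chars().any(char::is_whitespace);
    // Signal 2: 16 or more characters arriving at once.
    let is_long = grabbed.chars().count() >= 16;
    // Signal 3 (the fix): any non-ASCII content, which covers short
    // CJK runs that trip neither of the first two signals.
    let has_non_ascii = !grabbed.is_ascii();

    has_whitespace || is_long || has_non_ascii
}
```

Under these assumptions the six-codepoint run `请联网搜索：` is classified as a paste through the non-ASCII signal alone, while a short ASCII word such as `hello` is still treated as ordinary typing.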