Commit Graph

813 Commits

Author SHA1 Message Date
Hunter Bown fd82f85800 fix(tui): auto-disable DEC 2026 sync output on Ptyxis to stop VTE 0.84 flicker
Ptyxis 50.x (the new default terminal on Ubuntu 26.04) ships with
VTE 0.84.x, which parses the `\x1b[?2026h` / `\x1b[?2026l` synchronized-
output begin/end pair but still flashes the entire viewport on every
wrapped frame instead of deferring rendering. gnome-terminal 3.58 on
the same VTE renders cleanly, so the heuristic stays narrow: trigger
only on TERM_PROGRAM matching `ptyxis` (case-insensitive) or
PTYXIS_VERSION non-empty.

Add a new `synchronized_output` setting (`auto` | `on` | `off`,
default `auto`) controlling whether the renderer wraps each frame in
DEC 2026. `apply_env_overrides` flips `auto` → `off` when Ptyxis is
detected; the four wrapping sites in ui.rs (`draw_app_frame_inner`,
`reset_terminal_viewport`, `resume_terminal`, and the early-init
viewport reset) now respect the resolved flag. Users on Ptyxis who
upgrade past an upstream fix or want to confirm one landed can
override with `/set synchronized_output on`.

8 new tests cover: default-auto resolves enabled, off disables,
on stays enabled, set/aliases, Ptyxis via TERM_PROGRAM, Ptyxis via
PTYXIS_VERSION alone, explicit `on` beats the heuristic, explicit
`off` is preserved, and no non-Ptyxis TERM_PROGRAM (including Ghostty
and VS Code, which both keep DEC 2026 on) regresses.

Reported via WeChat by Cyrux on Ubuntu 26.04 with v0.8.30 npm install;
analysis by Hunter pinpointed Ptyxis + VTE 0.84 as the cause.
2026-05-11 22:04:26 -05:00
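The `auto` resolution described in this commit can be sketched as a pair of pure functions. This is a minimal illustration, not the code in `apply_env_overrides`: the `SyncOutput` enum, both function names, and the choice of exact case-insensitive equality for "matching `ptyxis`" are assumptions, and the environment values are passed in as arguments rather than read from the process.

```rust
/// Hypothetical mirror of the `synchronized_output` setting values.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum SyncOutput {
    Auto,
    On,
    Off,
}

/// Narrow Ptyxis heuristic: TERM_PROGRAM equals "ptyxis" ignoring ASCII case,
/// or PTYXIS_VERSION is present and non-empty.
pub fn looks_like_ptyxis(term_program: Option<&str>, ptyxis_version: Option<&str>) -> bool {
    let by_name = term_program.map_or(false, |v| v.eq_ignore_ascii_case("ptyxis"));
    let by_version = ptyxis_version.map_or(false, |v| !v.is_empty());
    by_name || by_version
}

/// Resolve the setting to "wrap each frame in DEC 2026?".
/// Only `Auto` consults the heuristic; explicit `On` / `Off` always win.
pub fn sync_output_enabled(
    setting: SyncOutput,
    term_program: Option<&str>,
    ptyxis_version: Option<&str>,
) -> bool {
    match setting {
        SyncOutput::On => true,
        SyncOutput::Off => false,
        SyncOutput::Auto => !looks_like_ptyxis(term_program, ptyxis_version),
    }
}
```

Keeping the environment lookup outside the functions lets each case (Ptyxis via name, Ptyxis via version, explicit override) be tested without mutating process-wide state.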
Hunter Bown 1a73791e1d chore(release): bump to 0.8.30
- workspace.package.version: 0.8.29 → 0.8.30
- per-crate path-dependency version pins: 0.8.29 → 0.8.30
- npm/deepseek-tui: version + deepseekBinaryVersion → 0.8.30
- Cargo.lock refreshed via `cargo update --workspace --offline`
- CHANGELOG: `[Unreleased]` → `[0.8.30] - 2026-05-11` with the full
  release-theme paragraph and the new "Changed" section for the
  Alt+<key> unification

Verified with `./scripts/release/check-versions.sh`:
  Version state OK: workspace=0.8.30, npm=0.8.30, lockfile in sync.
2026-05-11 19:21:45 -05:00
Hunter Bown 31198c11d0 fix(tui): unify transcript-nav shortcuts under Alt+<key>; stop eating first letters
v0.8.29 spot-fixed `g` with a gg double-tap (commit c13ddb04d), but
the underlying bug class still affected `G`, `[`, `]`, `?`, `l`, and
both `v`/`V`: bare press on an empty composer hijacked the keystroke
for transcript navigation, swallowing the first character of a
message. Even the gg fix itself only suppressed the SCROLL — the
first `g` was still eaten, so typing "good morning" produced "ood
morning" with no whale and no warning.

Unified fix: all seven bindings now require the `Alt` modifier (same
pattern as the existing `Alt+R` history search and `Alt+V` tool
details). Plain letters always insert as text:

  Alt+G          → scroll to top
  Alt+Shift+G    → scroll to bottom
  Alt+[ / Alt+]  → previous / next tool output
  Alt+?          → open searchable help (F1 / Ctrl+/ also bound)
  Alt+L          → pager for the last message
  Alt+V          → tool-details pager  (was already bound; only path now)

The `App::transcript_pending_g` field from the v0.8.29 half-fix is
removed along with its Esc / Enter / Char-catch-all resets. The
existing helper `details_shortcut_modifiers` (which accepted bare,
Shift, and Alt-only — the permissive predicate that ate the bare `v`
keystroke) is replaced by `alt_nav_modifiers`, which requires `Alt`,
allows `Shift`, and blocks `Ctrl` / `Super` so the bindings don't
collide with platform clipboard or window-management shortcuts.

Same modifier vocabulary as `Alt+R` / `Alt+P` / `Alt+1..3`, so this
makes the keymap more consistent rather than introducing a new
convention.

Regression-guarded by `alt_nav_modifiers_require_alt_and_exclude_ctrl_super`
in `crates/tui/src/tui/ui/tests.rs`, which exercises every modifier
combination the predicate needs to accept or reject. Full
`cargo test -p deepseek-tui` (2751 tests) passes; clippy clean; fmt
clean.

User-facing migration: any user who learned the bare-letter nav
shortcuts in v0.8.x needs to add `Alt+`. The trade-off is unambiguous
— losing the first letter of "good", "great", "let", "list", "very",
"verify", and anything starting with `?` was burning more users than
the bare-letter shortcuts were serving.
2026-05-11 19:21:31 -05:00
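The modifier rule stated above (requires `Alt`, allows `Shift`, blocks `Ctrl` / `Super`) can be sketched as follows. The `Mods` struct is a hypothetical stand-in for the terminal library's modifier bitflags, and the body is an assumption about what `alt_nav_modifiers` checks, based only on the description in this commit.

```rust
/// Hypothetical stand-in for the terminal library's modifier bitflags.
#[derive(Clone, Copy, Debug, Default)]
pub struct Mods {
    pub alt: bool,
    pub shift: bool,
    pub ctrl: bool,
    pub super_key: bool,
}

/// Accept a transcript-nav chord only when Alt is held and neither Ctrl nor
/// Super is. Shift is deliberately not inspected, so Alt+Shift+G passes.
pub fn alt_nav_modifiers(m: Mods) -> bool {
    m.alt && !m.ctrl && !m.super_key
}
```

A bare keypress (`Mods::default()`) fails the predicate, which is what lets plain letters fall through to text insertion.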
Hunter Bown a328344691 feat(tui): restore 🐳🐋 cycling status indicator next to the effort chip
The whale was a 12-frame animated indicator (`🐳, 🐳., 🐳.., 🐳..., 🐳..,
🐳., 🐋, 🐋., 🐋.., 🐋..., 🐋.., 🐋.`) that shipped from v0.3.5 onward
and rendered in the top-right status cluster of the header. Commit
`1a04659a9` ("smoother TUI streaming") quietly swapped it for a 6-frame
geometric ring (`◍ ◉ ◌ ◌ ◉ ◍`); `f4dbf828c` later deleted the function
entirely. Nothing in the CHANGELOG mentioned either step, and the
absence has been on the maintainer's mind ever since.

This commit restores the whale as a configurable status indicator that
sits immediately before the reasoning-effort chip ("next to max"):

- `widgets/header.rs` gains a public `header_status_indicator_frame`
  helper and a `HeaderData::with_status_indicator(Option<&'static str>)`
  builder. The frame computation is pure (keyed off `turn_started_at`
  and the mode string) so the widget itself stays a stateless render.
- The chip renders as the first item in the status cluster, before
  `provider` / `effort` / `Live` / context. Idle state shows a steady
  🐳; an active turn cycles frames every 420 ms (same cadence as the
  original v0.3.5 implementation).

New setting `status_indicator`:

- `whale` (default) — restored historical cycling.
- `dots` — the 6-frame geometric replacement, for users who came in
  during the dots era and prefer it.
- `off` — hide the chip entirely.

Settable via `/config status_indicator <whale|dots|off>`, persisted in
`settings.toml`, mirrored in the typed `config_ui::SettingsSection`
with a new `StatusIndicatorValue` enum so the web/JSON config surface
sees it too.

Default-to-whale rationale: this restores the historical behaviour for
every user, including those who never realized the whale was gone, and
keeps the "🐳 in /config" delight that the project's name has always
implied.

Regression-guarded by seven new tests in `widgets/header.rs::tests`
covering idle frame, frame advancement, dots variant, off variant
including aliases, unknown-mode fallback to whale, render placement
before the effort label, and confirmation that `off` hides the chip
without disturbing the effort chip layout.
2026-05-11 18:59:52 -05:00
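A pure frame computation along the lines described above might look like this. The frame tables and the 420 ms cadence come from the commit message; the function name, the string-typed mode argument, and the choice to show the first frame when idle in `dots` mode are assumptions made for illustration. Aliases for `off` are not modelled.

```rust
const WHALE_FRAMES: [&str; 12] = [
    "🐳", "🐳.", "🐳..", "🐳...", "🐳..", "🐳.",
    "🐋", "🐋.", "🐋..", "🐋...", "🐋..", "🐋.",
];
const DOTS_FRAMES: [&str; 6] = ["◍", "◉", "◌", "◌", "◉", "◍"];
const FRAME_MS: u64 = 420;

/// Pick the indicator frame for a mode.
/// `active_elapsed_ms` is `None` while idle and `Some(ms since the turn
/// started)` during an active turn. Unknown modes fall back to the whale.
pub fn status_indicator_frame(mode: &str, active_elapsed_ms: Option<u64>) -> Option<&'static str> {
    let frames: &[&'static str] = match mode {
        "off" => return None,
        "dots" => &DOTS_FRAMES[..],
        _ => &WHALE_FRAMES[..],
    };
    let idx = match active_elapsed_ms {
        // One step every 420 ms, wrapping around the frame table.
        Some(ms) => ((ms / FRAME_MS) % frames.len() as u64) as usize,
        // Idle: steady first frame.
        None => 0,
    };
    Some(frames[idx])
}
```

Because the function depends only on its arguments, the widget that calls it can stay a stateless render.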
Hunter Bown b1998fff8c revert(tui): drop typing-synced wave frame source; keep gate decoupling
The previous commit (15525751c) did two things in one shot:
1. Decoupled the footer water-spout gate from `low_motion`, so
   `low_motion = true` no longer hides the wave when `fancy_animations
   = true`.
2. Re-wired the wave's frame source from wall-clock milliseconds
   to a per-turn character-commit counter, on the theory that the
   wave should visually move at the same cadence as the text on
   screen ("water = typing").

The user-visible result of (2) was that the wave looked notably
different than in v0.8.29 — slower, sluggish, less alive. Root cause:
the sine math in `footer_working_strip_glyph_at` (`t = frame / 1000.0`,
primary term × 8.0) was tuned for frame ≈ 1000 units/sec, which is
what wall-clock ms produces. Driving frame off character commits
gives ~10–30 units/sec, so the wave evolves ~30–100× slower than the
intended tuning. Theoretically fixable by re-tuning the sine
constants, but that's a bigger change with its own visual
regressions to vet, and the user explicitly asked to "put it back
to where it was."

This commit reverts only (2):

- Removes `StreamingState::stream_commit_frame` field.
- Removes the increment in `commit_text` and `finalize_block_text`.
- Removes the zeroing in `reset`.
- Removes the five `stream_commit_frame_*` regression tests.
- Changes `render_footer` to assign `Some(now_ms)` again instead of
  `Some(app.streaming_state.stream_commit_frame)`.

The decoupling from (1) stays: the gate is still
`if app.fancy_animations { ... }`, so `low_motion = true` no longer
hides the wave. The settings.rs docstrings stay updated.

CHANGELOG entry is collapsed to a single short bullet describing the
decoupling-only fix.

Net effect for users: the wave looks and feels exactly like v0.8.29,
but `low_motion = true` now keeps the whale visible (was the
original regression that started all of this).
2026-05-11 18:40:17 -05:00
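The slowdown in the root-cause paragraph can be checked with a few lines of arithmetic. This sketch models only the phase of the primary term (`t = frame / 1000.0`, scaled by 8.0) as quoted above; the function names are hypothetical and the rest of `footer_working_strip_glyph_at` is not reproduced.

```rust
/// Phase of the primary sine term for a given frame value,
/// assuming `t = frame / 1000.0` and a primary multiplier of 8.0.
pub fn primary_phase(frame: f64) -> f64 {
    (frame / 1000.0) * 8.0
}

/// How many times slower the phase advances when the frame source produces
/// `units_per_sec` instead of the 1000 units/sec of wall-clock milliseconds.
pub fn slowdown_factor(units_per_sec: f64) -> f64 {
    primary_phase(1000.0) / primary_phase(units_per_sec)
}
```

Under these assumptions, wall-clock milliseconds advance the phase by 8.0 per second, while 30 and 10 character commits per second advance it by 0.24 and 0.08, which is about 33× and 100× slower respectively.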
Hunter Bown 15525751ce feat(tui): decouple footer water-spout from low_motion; sync wave to typing cadence
The water-spout strip in the footer used to be hard-gated by `!low_motion`,
which meant the typewriter-streaming option silently killed the spout
animation — even with `fancy_animations = true` the strip stayed plain
whitespace. Users testing the typewriter pacing in v0.8.29 reported "where
did the whale go," which is on us: we'd collapsed two concerns
(streaming pacing vs footer animation) onto one flag.

This commit makes the two flags orthogonal:

- `low_motion` governs streaming pacing only (typewriter = one char per
  commit tick vs upstream cadence = drain everything queued).
- `fancy_animations` governs whether the spout-strip is rendered at all.

It also wires in a new idea that fell out naturally once the two were
decoupled: instead of driving the wave animation off wall-clock
milliseconds, drive it off a per-turn character-commit counter
(`StreamingState::stream_commit_frame`). The wave then visually moves at
the same cadence as the text:

- Typewriter mode → wave drips at one frame per character.
- Upstream mode → wave surges when V4-pro bursts a warm-cache turn.
- Tool calls and planning pauses → no chars arrive, wave freezes. The
  textual `working...` pulse still ticks on wall-clock, so a heartbeat
  is always visible.
- New turn (`StreamingState::reset`) → counter zeroes so each turn
  opens with a fresh wave shape.

`stream_commit_frame` is a `u64` advanced inside `commit_text` and
`finalize_block_text` by the character count of each committed slice,
so multi-byte UTF-8 (e.g., CJK) advances the wave by one frame per
character rather than one frame per byte (three for most CJK),
matching the visual weight of each glyph.

Regression-guarded by five new tests in `crates/tui/src/tui/streaming/mod.rs`:
- `stream_commit_frame_advances_by_character_count_on_commit`
- `stream_commit_frame_counts_unicode_chars_not_bytes`
- `stream_commit_frame_advances_on_finalize`
- `stream_commit_frame_resets_on_reset`
- `stream_commit_frame_freezes_when_no_text_arrives`

Also folds in `cargo fmt` cleanup for two files where prior commits on
this branch landed without re-formatting (`crates/tui/src/tui/ui.rs`
around the new Esc-arm wrapper introduced for the `gg` double-tap, and
the new `fireworks_custom_base_url_preserves_provider_model` test in
`crates/config/src/lib.rs`). No behavior change from those edits.

Settings doc comments in `crates/tui/src/settings.rs` updated to spell
out the new orthogonal semantics so the next maintainer doesn't have
to reverse-engineer it from `render_footer`.

CHANGELOG entry added under a new `[Unreleased]` section.
2026-05-11 16:55:37 -05:00
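The character-versus-byte distinction behind `stream_commit_frame` can be illustrated with a minimal sketch. The struct below keeps only the counter; the real `StreamingState` has other fields, and the method bodies here are assumptions based on the description in this commit.

```rust
/// Minimal stand-in for the streaming state: only the per-turn counter.
#[derive(Debug, Default)]
pub struct StreamingState {
    pub stream_commit_frame: u64,
}

impl StreamingState {
    /// Advance the counter by the number of Unicode scalar values committed,
    /// not by the UTF-8 byte length of the slice.
    pub fn commit_text(&mut self, slice: &str) {
        self.stream_commit_frame += slice.chars().count() as u64;
    }

    /// Zero the counter at the start of a new turn.
    pub fn reset(&mut self) {
        self.stream_commit_frame = 0;
    }
}
```

For example, committing `"你好"` advances the counter by 2, even though `"你好".len()` reports 6 bytes.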
Hunter Bown c13ddb04d4 fix(tui): change transcript 'g' scroll to vim-style 'gg' double-tap
A single bare 'g' with an empty composer was hijacked as a scroll-to-top
command, preventing users from typing 'g' as the first character of a message.
The transcript would jump to line 0 instead of inserting 'g' into the composer.

Change to a vim-style 'gg' double-tap: first 'g' arms transcript_pending_g,
second 'g' executes the scroll. Any other character input, Enter, or Escape
resets the pending flag, so a stray 'g' during composition arms the flag
but never scrolls.

Also adds transcript_pending_g field to App struct (default false).
2026-05-11 16:43:45 -05:00
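The double-tap behaviour described above amounts to a two-state machine. The sketch below is an assumption about its shape: `NavState`, `Key`, and `Action` are hypothetical types, only the `transcript_pending_g` name comes from the commit, and the empty-composer precondition is left to the caller.

```rust
#[derive(Debug, Default)]
pub struct NavState {
    pub transcript_pending_g: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    None,
    ScrollToTop,
}

pub enum Key {
    Char(char),
    Enter,
    Esc,
}

impl NavState {
    /// First 'g' arms the flag, a second consecutive 'g' scrolls to the top,
    /// and any other key disarms without scrolling.
    pub fn on_key(&mut self, key: Key) -> Action {
        match key {
            Key::Char('g') if self.transcript_pending_g => {
                self.transcript_pending_g = false;
                Action::ScrollToTop
            }
            Key::Char('g') => {
                self.transcript_pending_g = true;
                Action::None
            }
            _ => {
                self.transcript_pending_g = false;
                Action::None
            }
        }
    }
}
```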
Hunter Bown 7454b23ae6 fix(shell): kill process group in collect_output to prevent UI freeze
Cherry-picked from PR #1475 by CrepuscularIRIS (autoghclaw/issue-828).

When a shell command spawns background subprocesses (nohup, sleep &, etc.),
those subprocesses inherit the pipe write-ends. After the shell exits, the
subprocesses keep those write-ends open, causing handle.join() on reader
threads to block indefinitely in read(). Since list_jobs() calls poll() →
collect_output() on every TUI render tick, the entire UI event loop blocks.

Fix: kill the process group (PGID = child PID) before joining reader threads,
so orphaned subprocesses release their pipe write-ends. Also wires the
previously dead-coded cleanup() into list_jobs() with a 1-hour eviction
window to bound process table growth.

Fixes #828.
2026-05-11 16:34:00 -05:00
Hunter Bown 62a893d9c2 fix(config): preserve model names for all providers with custom base URLs
Previously only OpenRouter was whitelisted via provider_preserves_custom_base_url_model,
causing six other providers (Sglang, Novita, Fireworks, Vllm, Ollama, NvidiaNim) to still
rewrite user-configured model names when a custom base URL was set. Users routing through
their own gateway would get 400s because the TUI sent provider-prefixed model names
(e.g. accounts/fireworks/models/deepseek-v4-pro) that the gateway didn't recognise.

The fix removes the provider-specific guard: when base_url_is_custom_for_provider()
returns true (i.e. the user set a non-default endpoint), the model name is preserved
as-is for every provider, not just OpenRouter.

Affected:
- crates/config/src/lib.rs: ProviderKind::Openrouter guard removed
- crates/tui/src/config.rs: ApiProvider::Openrouter guard removed
- Test: fireworks_custom_base_url_preserves_provider_model added
- Test: nvidia_nim_reads_facade_provider_table updated for new behaviour

Addresses the #857 class bug (B1 in the v0.8.30 audit).
2026-05-11 16:25:45 -05:00
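The rule after this fix (a custom base URL preserves the configured model name for every provider) can be sketched as below. The `Provider` enum, the `resolve_model` name, and the default rewrite are illustrative assumptions; only the Fireworks prefix is taken from the example in the commit message.

```rust
/// Hypothetical, reduced provider set for illustration.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Provider {
    Fireworks,
    Other,
}

/// Decide which model name is sent on the wire.
/// With a custom base URL the user's string is passed through untouched;
/// otherwise the provider's default rewrite applies.
pub fn resolve_model(provider: Provider, configured: &str, base_url_is_custom: bool) -> String {
    if base_url_is_custom {
        return configured.to_string();
    }
    match provider {
        // Prefix shape taken from the example in the commit message.
        Provider::Fireworks => format!("accounts/fireworks/models/{configured}"),
        Provider::Other => configured.to_string(),
    }
}
```

The point of the sketch is that the pass-through branch no longer depends on which provider is selected.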
Hunter Bown 716f45cfbc fix(ui): Ctrl+O guard accepts extra modifier bits + AGENTS two-binary note
The Ctrl+O thinking-pager arm guarded on
`key.modifiers == KeyModifiers::CONTROL` (exact match), so any
additional modifier bit set by the terminal — Shift while a
native-selection mouse bypass was active, Caps Lock indicator on
some keyboard layouts — silently fell through to the $EDITOR arm at
ui.rs:2833 and did nothing visible when the composer was empty. The
user saw the "thinking collapsed; press Ctrl+O for full text"
affordance, pressed it, and the handler appeared to ignore them.

Relaxed to `contains(KeyModifiers::CONTROL)` to match the established
pattern at Ctrl+P (ui.rs:2068) and Ctrl+B (ui.rs:2077). With the
existing `app.input.is_empty()` guard preserved, the $EDITOR arm
still owns the non-empty-composer case, so the two handlers continue
to partition Ctrl+O cleanly.

Also documents the two-binary install gotcha in AGENTS.md: the CLI
dispatcher (`crates/cli` → `deepseek`) and the TUI runtime
(`crates/tui` → `deepseek-tui`) ship as separate executables, and
`cargo install --path crates/cli` alone leaves the TUI stale — which
is how both this fix and the active_cell fix from dc2433a8b
initially appeared to be no-ops during local maintainer testing.
The release pipeline packages both binaries, so end users were
never affected by that side; this is purely a maintainer-local
footgun and is now spelled out for future agents.

Extends the existing v0.8.29 CHANGELOG entry to credit both halves
of the Ctrl+O fix.
2026-05-11 14:31:41 -05:00
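The difference between the exact-match guard and the relaxed `contains` guard is easiest to see on raw bits. The constants below are hypothetical bit assignments, not the terminal library's actual `KeyModifiers` values.

```rust
// Hypothetical modifier bits for illustration only.
pub const SHIFT: u8 = 0b0001;
pub const CONTROL: u8 = 0b0010;

/// Old guard: true only when CONTROL is the sole modifier bit set.
pub fn exact_ctrl(mods: u8) -> bool {
    mods == CONTROL
}

/// Relaxed guard: true whenever the CONTROL bit is set, regardless of extras.
pub fn contains_ctrl(mods: u8) -> bool {
    (mods & CONTROL) == CONTROL
}
```

With `CONTROL | SHIFT`, `exact_ctrl` returns `false` while `contains_ctrl` returns `true`, which is the fall-through case the commit describes.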
Hunter Bown dc2433a8b5 fix(ui): Ctrl+O expands thinking blocks still in active_cell
After `ThinkingComplete` the finalized thinking entry sits in
`app.active_cell` with `streaming = false` until the active cell
flushes to history at end-of-turn. During that window the transcript
rendered the "thinking collapsed; press Ctrl+O for full text"
affordance from `render_thinking`, but `open_thinking_pager` only
searched `app.history` — so the handler surfaced "No thinking blocks
to expand" while pointing at the affordance. The affordance was
truthful; the handler was lying.

Routed the lookup through `cell_at_virtual_index` /
`virtual_cell_count`, the existing virtual-index API that
`open_tool_details_pager` already uses for the same active-cell
window. The selection-based path resolves through the virtual index
too, so dragging into an in-flight thinking block and pressing
Ctrl+O now works as well.

Regression guard: `open_thinking_pager_finds_thinking_in_active_cell`
drives the entry into active_cell, finalizes it so the "collapsed"
affordance is what render_thinking emits, then asserts Ctrl+O pushes
the Pager view instead of surfacing the "No thinking blocks" status.
2026-05-11 14:03:53 -05:00
Hunter Bown bfb9da3462 test: isolate config-mutating smoke tests 2026-05-11 13:28:45 -05:00
CrepuscularIRIS dde1e5e2f1 fix(tui): default composer_arrows_scroll on when mouse capture is off
On platforms where mouse capture is disabled by default (Windows CMD /
legacy conhost), the terminal sends mouse-wheel events as Up/Down arrow-key
sequences. Without composer_arrows_scroll those sequences cycle the
input history instead of scrolling the transcript (#1443).

Set the default for composer_arrows_scroll to !use_mouse_capture so that
terminals that forward wheel events as arrows get page-scrolling out of the
box, while terminals with real mouse capture (Windows Terminal, Linux, macOS)
keep the existing history-navigation default.

The explicit [tui] composer_arrows_scroll config key still overrides the
derived default in both directions.

Also enable mouse capture by default for ConEmu/Cmder (ConEmuPID env var),
which handles VT mouse-mode reporting cleanly, giving those users in-app
scrolling without needing --mouse-capture.

Fixes #1443

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
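The derived default with an explicit override in both directions reduces to a one-liner. The function name and signature are assumptions; the rule itself (`!use_mouse_capture` unless the `[tui] composer_arrows_scroll` key is set) is taken from the commit message.

```rust
/// Resolve `composer_arrows_scroll`.
/// `explicit` is the value of the config key when the user set it;
/// otherwise the default is the inverse of mouse capture.
pub fn composer_arrows_scroll(explicit: Option<bool>, use_mouse_capture: bool) -> bool {
    explicit.unwrap_or(!use_mouse_capture)
}
```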
CrepuscularIRIS ae45d1054b fix(shell): detect Docker buildx provenance EPERM and surface actionable hint
When exec_shell runs a Docker build on macOS and Docker Desktop's signed
process has written com.apple.provenance-tagged files under
~/.docker/buildx/activity/, the child process spawned by the TUI
sandbox gets EPERM when it tries to update those files, producing:

  failed to update builder last activity time: open
  /Users/.../.docker/buildx/activity/.tmp-...: operation not permitted

Add looks_like_macos_provenance_failure() to detect this pattern via
three heuristics (provenance xattr mention, activity-time message, or
buildx/activity path + EPERM), with an early-return guard that suppresses
the hint on clean exits. Wire the hint into both the foreground exec_shell
path and build_shell_delta_tool_result so it surfaces on background task
polls too.

Four unit tests cover the positive cases and the two guard cases (exit 0,
unrelated EPERM).

Closes #1449

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
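The three heuristics plus the clean-exit guard could be sketched as follows. The signature, the lowercase normalisation, and the exact substrings matched are assumptions derived from the error text quoted in this commit, not the real body of `looks_like_macos_provenance_failure()`.

```rust
/// Heuristic detector for the Docker buildx provenance EPERM failure.
pub fn looks_like_macos_provenance_failure(exit_code: i32, output: &str) -> bool {
    // Early-return guard: never hint on a clean exit.
    if exit_code == 0 {
        return false;
    }
    let text = output.to_lowercase();
    let eperm = text.contains("operation not permitted");
    // Heuristic 1: the provenance xattr is mentioned by name.
    let mentions_xattr = text.contains("com.apple.provenance");
    // Heuristic 2: the buildx activity-time message appears.
    let activity_message = text.contains("failed to update builder last activity time");
    // Heuristic 3: a buildx activity path together with EPERM.
    let activity_path = text.contains(".docker/buildx/activity") && eperm;
    mentions_xattr || activity_message || activity_path
}
```

An unrelated `operation not permitted` error does not trigger the hint on its own, because EPERM is only considered in combination with the buildx activity path.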
CrepuscularIRIS 97a77d82f0 fix(settings): detect TERM_PROGRAM=ghostty and apply low-motion cap (#1445)
Ghostty's GPU compositor flash-renders each full-screen repaint at 120 FPS,
producing visible flicker identical to the VS Code issue fixed in #1356.
Extend apply_env_overrides() to also force low_motion=true +
fancy_animations=false when TERM_PROGRAM=ghostty, capping redraws to 30 FPS.

Add ghostty_term_program_forces_low_motion_on test mirroring the existing
vscode test, serialised through the process-wide lock_test_env() guard.

Fixes #1445

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
AutoGHClaw 4bc5375fa6 fix(render): replace SMP emoji with stable BMP glyphs to fix layout on Windows terminals
Emoji in U+1F000+ have no stable column-width contract across terminal
emulators. On cmd.exe/PowerShell they render as 1-column placeholder
boxes even though unicode_width reports 2; on WezTerm/Alacritty with
certain font stacks the rendered width can be off by one column.  Both
cases break layout arithmetic in the header and file-tree widgets.

Changes:
- header.rs: replace 🐳 (U+1F433, 2-wide) with ◆ (U+25C6, always 1-wide)
  in the "max" reasoning-effort chip
- file_tree.rs: drop the 📁/📄 (U+1F4C1/U+1F4C4) entry-icon prefix; the
  ▼/▶ expand marker already distinguishes dirs from files

Fixes #1314
2026-05-11 13:09:25 -05:00
xiedongsheng 1352a65ac2 fix: reject no-op edit_file replacements 2026-05-11 13:09:22 -05:00
reidliu41 7ac79799e7 Fix flaky workspace and env-sensitive tests
Tighten session workspace Git root detection so invalid parent .git
markers are not treated as real repositories. This prevents unrelated
temporary workspaces from being scoped together when a stray .git
directory exists under /tmp.

Also move env-mutating tests onto the shared test env lock and make the
streamable HTTP MCP mock server serve until the test ends, avoiding
parallel test races and premature mock server shutdowns.

(cherry picked from commit eecfc16fc99d072ac389980ec9e5e3f208297b8e)
2026-05-11 07:34:50 -05:00
reidliu41 46ab1fdf62 Scope skill completions to /skill
Keep individual skills out of the top-level slash command menu so large
skill collections do not crowd out built-in commands.

Skills still complete after `/skill`, including both the full skill list
after `/skill ` and prefix matches after `/skill <prefix>`.

(cherry picked from commit 57f8e3ad84dad9cf46290c0dc23e2b26504196df)
2026-05-11 07:34:50 -05:00
Beltran12138 ca284d1fc0 fix: treat deepseek-chat / deepseek-reasoner aliases as reasoning models
`requires_reasoning_content()` only matched literal `deepseek-v4*` model
IDs, but `deepseek-chat` and `deepseek-reasoner` are DeepSeek's public
API aliases that resolve server-side to `deepseek-v4-flash` and
`deepseek-v4-pro` respectively. Both have thinking mode enabled by
default, so when a user sets `default_text_model = "deepseek-chat"` (the
value `deepseek auth` / onboarding writes), the thinking-mode sanitizer
is skipped and tool-call assistant messages are sent without
`reasoning_content`. DeepSeek then rejects the second turn with:

  HTTP 400: The `reasoning_content` in the thinking mode must be passed
  back to the API.

Extend `requires_reasoning_content()` to recognise the `deepseek-chat`
and `deepseek-reasoner` alias prefixes (covering suffixed variants like
`deepseek-chat:free` used by proxied deployments). The explicit
`reasoning_effort = "off"` escape hatch still disables replay via the
unchanged `should_replay_reasoning_content()` check.

Adds `alias_thinking_detection_tests` covering the aliases, explicit V4
IDs (regression guard), excluded non-thinking models, suffixed variants,
and the reasoning-off override.

Refs: https://api-docs.deepseek.com/guides/thinking_mode
(cherry picked from commit 46941142123827fa16fc9a1fb41b78c293e935ce)
2026-05-11 07:34:50 -05:00
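The extended prefix check can be sketched as below. Only the three prefixes come from the commit message; the body is an assumption about how `requires_reasoning_content()` matches them, and the separate `reasoning_effort = "off"` escape hatch is not modelled.

```rust
/// True for model IDs that run in thinking mode by default:
/// literal `deepseek-v4*` IDs plus the public API aliases.
pub fn requires_reasoning_content(model: &str) -> bool {
    const PREFIXES: [&str; 3] = ["deepseek-v4", "deepseek-chat", "deepseek-reasoner"];
    // Prefix matching also covers suffixed variants such as `deepseek-chat:free`.
    PREFIXES.iter().any(|p| model.starts_with(*p))
}
```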
samhandsome 9279d92f4e fix(tools): hydrate deferred schemas before execution
Return a schema hydration result on first deferred tool use so the model can retry with visible parameters instead of executing guessed arguments. Add edit_file coverage for old_string/new_string aliases.

(cherry picked from commit 91be171cc15dd895170bd1a486445f5e05356b57)
2026-05-11 07:34:50 -05:00
reidliu41 0164ee869e fix: parse CRLF SSE MCP events
Accept both LF and CRLF SSE event separators in the MCP SSE transport so
uvicorn and FastMCP servers can publish endpoint events correctly.

Add regression coverage for CRLF endpoint discovery.
2026-05-10 23:27:51 -05:00
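Accepting both separators means treating either a blank LF line or a blank CRLF line as the end of an event. The sketch below shows one way to do that on a string buffer; both function names are hypothetical, and trimming all leading whitespace from the `data:` value is a simplification of the SSE field rules.

```rust
/// Split the first complete SSE event off the front of `buf`.
/// Returns `(event, rest)`, or `None` if no event terminator has arrived yet.
pub fn split_first_event(buf: &str) -> Option<(&str, &str)> {
    let lf = buf.find("\n\n").map(|i| (i, 2));
    let crlf = buf.find("\r\n\r\n").map(|i| (i, 4));
    // Use whichever terminator appears first in the buffer.
    let (idx, len) = match (lf, crlf) {
        (Some(a), Some(b)) => if a.0 < b.0 { a } else { b },
        (Some(a), None) => a,
        (None, Some(b)) => b,
        (None, None) => return None,
    };
    Some((&buf[..idx], &buf[idx + len..]))
}

/// Extract the value of the first `data:` field in an event.
/// `str::lines` already strips a trailing `\r`, so CRLF events parse cleanly.
pub fn data_field(event: &str) -> Option<&str> {
    event
        .lines()
        .find_map(|line| line.strip_prefix("data:"))
        .map(str::trim_start)
}
```

A transport that only searches for `"\n\n"` never finds a terminator in a pure CRLF stream, because `"\r\n\r\n"` does not contain two adjacent LF bytes.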
Wenjunyun123 21867b2d3a style(tui): format footer width tests 2026-05-10 23:27:21 -05:00
Wenjunyun123 80a8559e29 fix(tui): keep footer chips within available width 2026-05-10 23:27:21 -05:00
reidliu41 4afa88ebfc fix(tui): ignore leaked mouse reports in composer
PR #1421 from @reidliu41. Filters SGR mouse-report bursts that some terminal chains leak into stdin while mouse capture is enabled, while preserving ordinary coordinate-like text.
2026-05-10 23:27:14 -05:00
Hunter Bown f5784909cf security(tui): avoid session-id taint in resume hint 2026-05-10 22:20:42 -05:00
Hunter Bown 678198440a docs(prompts): capture the Chinese-native-model design tension in locale-preamble docstring
Community feedback on the v0.8.29 follow-up (WeChat thread on
#1118) made a sharp point: the standard Western-LLM advice
"always write prompts in English" doesn't transfer to DeepSeek
V4, which is a Chinese-first multilingual model with a
Chinese-co-trained tokenizer. `你好` typically encodes to ~1
token, not 2; the "Chinese is expensive" framing is folk wisdom
from a different model family.

The naïve translation of that argument is "ship a fully
translated base.md per locale" — and that's the move v0.9.x
might eventually make. For v0.8.29 we deliberately stop at the
bookend (preamble + closer in native script, English middle)
because of three concrete costs:

  1. Drift risk between N translated copies of a 200-line
     prompt — every rule change has to land in lockstep.
  2. Cache stability — one English `base.md` lets us share
     prefix-cache state across locales for the workspace-
     static portion of the prompt.
  3. Translation QA expense — 95% right is bad, because the
     missing 5% becomes silent behavior divergence.

Captured all of this in the `locale_reinforcement_preamble`
docstring so the next maintainer reading the prompt-assembly
code sees the design tension and the cost model explicitly,
and knows full translation is the natural next step if the
bookend stops being sufficient.

No runtime change; documentation only. Credit @MuMu (via Hunter)
for the bookend pattern that motivates this design, and the
unnamed WeChat commenter who made the tokenizer-economics
argument that motivates this docstring expansion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:04:49 -05:00
Hunter Bown 5b6902006c security(mcp): defuse GitGuardian Basic-Auth-String false positive in proxy-redact test
GitGuardian's "Basic Auth String" detector flagged commit 09dcbede0
because the test fixture for `redact_proxy_userinfo_strips_password`
contained literal URL strings of the shape
`scheme://username:password@host` — `alice:hunter2` and `bob`. The
values are obvious placeholders (not real credentials), but the
detector's regex is shape-based: any scheme-prefixed colon-separated
userinfo segment terminated by `@` matches, regardless of whether the
content is a real secret.

The test still needs to exercise the redaction logic for credential-
carrying proxy URLs. Fix: assemble the URLs via `format!` from
explicit placeholder constants (`PLACEHOLDER_USER`,
`PLACEHOLDER_PASS`) so the literal source text never contains a
contiguous `scheme://name:secret@host` pattern. Runtime behavior is
identical — `redact_proxy_userinfo` receives the same string and
returns the same redacted form.

Also reworded the function docstring (line 61) and the inline comment
at the warning log site (line 993) to describe the userinfo segment
without spelling out a literal `user:pass@host` shape that the same
detector could later trip on.

Two preexisting fixtures elsewhere in this file
(`mask_url_secrets("https://user:s3cret@…")` at line 3155 and its
docstring at line 46) have been on `main` for several releases and
are presumably already on GitGuardian's allowlist — left untouched
in this commit so the fix scope stays minimal. If they re-fire on a
future scan, the same `format!` pattern can be applied there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:39:23 -05:00
Hunter Bown b6cf0199de feat(prompts): locale-native closing bookend to defeat English-context drift (#1118 follow-up #2)
The opening preamble from commit `47f6d69e5` works for the first
few turns, but as English context accumulates in the session (code
read into the transcript, error logs, file listings, search
results, project context), the transformer's recency bias pulls
`reasoning_content` back toward English even when the user keeps
writing in their own language. The empirical fingerprint is "model
thinks in Chinese for the first 3-4 turns, then quietly switches
to English thinking around turn 5 as more code lands in context."

Community feedback (WeChat thread on #1118 — @MuMu describes an
XML-tagged "bilingual bookend" pattern they used in another
project, and @益达 confirms the translation-accuracy problem with
fully-translated prompts) pointed at the bookend pattern: keep the
rule-heavy middle of the prompt in English (single source of
truth, model is natively multilingual), but reinforce the locale
directive at BOTH ends in native script. The opening anchors
behavior at session start; the closer sits at the maximum-
recency position right before the user's next message and
re-asserts the rule each turn.

`locale_reinforcement_closer()` returns Some for `zh-Hans` /
`zh-CN` / `zh`, `ja` / `ja-JP`, `pt-BR` / `pt`. English (and
unmatched locales) return None — system prompt stays
byte-identical to the previous behavior for English users.

The closer is appended after the previous-session handoff block
(the existing "last block" position), so it's the very last
content before the user's first message. Any future block that
needs to sit closer to the user should be added BEFORE the
closer with an updated test invariant.

Three new tests pin the contract:

* `locale_reinforcement_closer_returns_native_script_for_supported_locales`
  — each supported locale's closer is in its native script and
  explicitly mentions `reasoning_content` (the V4 knob).
* `system_prompt_bookends_zh_hans_with_preamble_and_closer` —
  the full zh-Hans system prompt contains both `## 语言要求`
  (preamble) and `## 语言再次提醒` (closer), in that order, and
  no other top-level `##` section follows the closer.
* `system_prompt_skips_locale_preamble_for_english` (extended)
  — English locale gets neither the preamble nor any of the
  three locale closers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:32:08 -05:00
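The bookend assembly (native-script preamble, English base prompt, native-script closer) can be sketched as below. Only the locale tags and the two zh-Hans headings come from the commit messages; `locale_bookends`, `assemble_system_prompt`, the bracketed placeholder strings, and exact tag matching are assumptions, and the real prompt has additional blocks between the base and the closer.

```rust
/// Hypothetical lookup returning `(preamble, closer)` for supported locales.
/// Text in square brackets is placeholder content, not the shipped wording.
pub fn locale_bookends(locale_tag: &str) -> Option<(&'static str, &'static str)> {
    match locale_tag {
        "zh-Hans" | "zh-CN" | "zh" => Some(("## 语言要求", "## 语言再次提醒")),
        "ja" | "ja-JP" => Some(("[ja preamble]", "[ja closer]")),
        "pt-BR" | "pt" => Some(("[pt-BR preamble]", "[pt-BR closer]")),
        // English and unmatched locales get no bookends.
        _ => None,
    }
}

/// Assemble the prompt: preamble first, English base in the middle,
/// closer last. Without bookends the base is returned byte-identical.
pub fn assemble_system_prompt(base: &str, locale_tag: &str) -> String {
    match locale_bookends(locale_tag) {
        Some((preamble, closer)) => format!("{preamble}\n\n{base}\n\n{closer}"),
        None => base.to_string(),
    }
}
```

Returning the base unchanged for unmatched locales is what keeps the English prompt byte-identical to the previous behavior.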
Hunter Bown 09dcbede08 security(mcp): redact HTTP(S)_PROXY userinfo before logging malformed-URL warning
PR #1408's MCP proxy support (commit 865db6248) added a
`tracing::warn!` for malformed `HTTPS_PROXY` values that included
the raw URL via `proxy = %proxy_url`. With v0.8.29's new
file-backed tracing subscriber (writing to
`~/.deepseek/logs/tui-YYYY-MM-DD.log`), that means a corporate
proxy URL of the shape `http://user:pass@proxy.example/` would
leak the password to disk whenever reqwest rejected the URL.

Fix: redact the `user:pass@` userinfo segment before logging via
a new `redact_proxy_userinfo()` helper. `http://alice:hunter2@proxy/`
becomes `http://***@proxy/`. URLs without userinfo are returned
unchanged; the `@` is only treated as a userinfo separator when it
appears before any `/`, `?`, or `#` (so path-embedded `@` doesn't
trigger redaction). Garbage input (no `://`) passes through — the
warning log site is already in the malformed-URL failure path.

Pinned by `redact_proxy_userinfo_strips_password` covering five
cases: full creds, user-only, no-userinfo, path-only-`@`, and
garbage. The non-malformed path (where reqwest accepts the URL)
never logs the URL at all, so this is the only leak vector.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:26:45 -05:00
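The redaction rules listed above translate into a short string routine. This is a sketch written from the commit's description (mask everything before an `@` that precedes any `/`, `?`, or `#`, and pass through input without `://`); it is not the actual body of `redact_proxy_userinfo()`.

```rust
/// Replace the userinfo segment of a proxy URL with `***` before logging.
pub fn redact_proxy_userinfo(url: &str) -> String {
    // Garbage input without a scheme separator passes through unchanged.
    let scheme_end = match url.find("://") {
        Some(i) => i,
        None => return url.to_string(),
    };
    let authority_start = scheme_end + 3;
    let rest = &url[authority_start..];
    // The authority ends at the first path, query, or fragment delimiter.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    // Only an `@` inside the authority separates userinfo from the host.
    match rest[..authority_end].rfind('@') {
        Some(at) => format!("{}***{}", &url[..authority_start], &rest[at..]),
        None => url.to_string(),
    }
}
```

Restricting the search to the authority is what keeps a path-embedded `@` from triggering redaction.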
Hunter Bown 47f6d69e58 feat(prompts): locale-native reinforcement preamble for non-English UIs (#1118 follow-up)
`base.md` stays the single source of truth (English meta-language,
DeepSeek V4 is natively multilingual, prefix-cache stable across
users in the same locale). For non-English UI locales we now prepend
a short locale-native passage so the model's first exposure to the
prompt is an explicit "think and reply in {locale}" directive in the
user's own writing system — defeats the failure mode reported in
#1118 and visible in the recent WeChat screenshot where a user with
`locale = zh-Hans` configured still got English thinking because the
task context (Rust code, English log lines) overpowered the inferred
`## Environment.lang` signal.

Locales supported (matched against `PromptSessionContext.locale_tag`,
which the caller resolves from `Settings`):

* `zh-Hans` / `zh-CN` / `zh` — Simplified Chinese preamble
* `ja` / `ja-JP` — Japanese preamble
* `pt-BR` / `pt` — Brazilian Portuguese preamble

English (and any unmatched locale) returns `None` and the system
prompt is byte-identical to v0.8.28 — so this is a strict additive
change for non-English users.

Each preamble is ~6-8 lines and explicitly:

* names the runtime ("DeepSeek TUI") so the model knows it's not
  switching personas
* declares the directive for BOTH `reasoning_content` and the final
  reply (the V4 knob that #1118 hinges on)
* preserves tool-name immutability (`read_file`, `exec_shell`,
  paths, env vars, CLI flags, URLs stay in their original form)
* handles mid-session language switches (next-turn switching)
* defers to explicit user override ("think in English" etc.)

Three new tests pin the contract:

* `locale_reinforcement_preamble_returns_native_script_for_supported_locales`
  — preamble must be in the locale's native script, must mention
  `reasoning_content`, and must call out tool-name immutability;
  English/unknown locales must return `None`.
* `system_prompt_prepends_locale_preamble_for_zh_hans` — the
  preamble must appear *before* the English base prompt body in
  the assembled system prompt (attention precedence + cache
  ordering both depend on this).
* `system_prompt_skips_locale_preamble_for_english` — English
  locale must produce a byte-identical prompt to the pre-feature
  behavior (no zh / ja / pt strings anywhere).

Prefix-cache impact: per-locale cache shards stay intact (a
zh-Hans user's prompt shares the preamble across turns; an English
user's prompt is unchanged). Cross-locale cache is invalidated,
which is correct — different users in different locales were never
sharing cache for the right reasons.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:11:37 -05:00
Hunter Bown 865db62487 feat(mcp): honor HTTP(S)_PROXY env vars on MCP HTTP transport (#1408)
`reqwest 0.13` does not auto-detect proxy env vars by default, so MCP
HTTP connections were bypassing the proxy that every other tool on
the user's box (curl, npm, git, …) was using. Users behind corporate
egress proxies and China-mainland setups routing through a local
Clash / Shadowsocks tunnel had their MCP servers fail to connect or
silently leak around the tunnel.

When the MCP HTTP transport's client builder runs, we now read
`HTTPS_PROXY` / `https_proxy` / `HTTP_PROXY` / `http_proxy` (first
non-empty wins) and route via `reqwest::Proxy::all(...)`. `NO_PROXY`
is honored via `reqwest::NoProxy::from_env()`. Malformed proxy URLs
log a `tracing::warn!` (no scroll-demon leak — see runtime_log) and
the connection proceeds without a proxy rather than failing the
whole MCP attach.
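
The "first non-empty wins" lookup can be sketched as below. The `lookup` parameter and both function names are hypothetical, introduced so the precedence can be exercised without mutating the process environment; the sketch stops short of building the `reqwest` client.

```rust
/// Candidate variables in the precedence order listed above.
const PROXY_VARS: [&str; 4] = ["HTTPS_PROXY", "https_proxy", "HTTP_PROXY", "http_proxy"];

/// Returns the first non-empty proxy value. `lookup` is a hypothetical
/// injection point (not in the original commit) standing in for the
/// environment.
fn first_proxy_url(lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    PROXY_VARS
        .iter()
        .filter_map(|name| lookup(*name))
        // A variable that is set but empty does not count as configured.
        .find(|value| !value.is_empty())
}

/// Wrapper that reads the real process environment.
fn proxy_url_from_env() -> Option<String> {
    first_proxy_url(|name| std::env::var(name).ok())
}
```

A caller would hand the resulting string to the HTTP client's proxy configuration and, per the message above, fall back to a direct connection if that URL is rejected.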

Closes #1408. Thanks @hlx98007 for the report.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:58:33 -05:00
reidliu41 6d099d425c feat: add note management commands
Extend /note beyond append-only usage with list, show, edit, remove,
  clear, path, and explicit add subcommands.

  Keep existing /note <text> behavior compatible, preserve the existing
  --- separated file format, and number notes only at display time so the
  stored notes stay clean.

  Update command help, localization, docs, and tests.
2026-05-10 20:56:39 -05:00
Vince 5d3ec1b439 fix(mcp): skip malformed items during discovery instead of dropping all
When the MCP server returns a list where one entry cannot be deserialized
(e.g. a tool missing the required `name` field), the previous code called
`.unwrap_or_default()` on the whole-list deserialization, silently discarding
every valid entry in the page.

Switch all four discovery functions (tools, resources, resource-templates,
prompts) to iterate over the JSON array and deserialize each item
individually, skipping only those that fail.  This ensures a single
non-conformant entry never hides the rest of the list.
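
The difference between the two strategies can be shown without any JSON dependency. In this sketch `Tool`, `parse_tool`, and both `discover_*` functions are hypothetical stand-ins for the project's real types and per-item deserialization.

```rust
#[derive(Debug, PartialEq)]
struct Tool {
    name: String,
}

/// Stand-in for per-item deserialization (hypothetical). An item with an
/// empty name plays the role of a tool missing the required `name` field.
fn parse_tool(raw: &str) -> Result<Tool, String> {
    let name = raw.trim();
    if name.is_empty() {
        Err("missing required `name` field".to_string())
    } else {
        Ok(Tool { name: name.to_string() })
    }
}

/// Old shape: one bad item fails the whole list, which then defaults to empty.
fn discover_all_or_nothing(raw_items: &[&str]) -> Vec<Tool> {
    raw_items
        .iter()
        .map(|raw| parse_tool(raw))
        .collect::<Result<Vec<_>, _>>()
        .unwrap_or_default()
}

/// New shape: each item is parsed on its own and only failures are skipped.
fn discover_lenient(raw_items: &[&str]) -> Vec<Tool> {
    raw_items
        .iter()
        .filter_map(|raw| parse_tool(raw).ok())
        .collect()
}
```

With the input `["read_file", "", "exec_shell"]` the first function yields an empty list while the second keeps both valid entries.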

Related: #1250
2026-05-10 20:56:35 -05:00
Hunter Bown 5f976d7e20 feat(header): surface runtime version as a header chip
Adds `v0.8.29` (the workspace `CARGO_PKG_VERSION` resolved at compile
time) to the right cluster of the header bar, after the existing
provider / effort / Live / context chips. Users have been asking for
the live version somewhere in the UI — previously only reachable via
`deepseek --version` (CLI flag, not in the TUI) or `/status` (slash
command, requires action).

The chip is the lowest-priority element in the width cascade in
`right_spans()`: under tight terminal width it drops before any of
the existing status chips. Two pinned tests:

* `header_renders_version_chip_when_width_allows` — at width 120
  the chip must appear with the current `env!("CARGO_PKG_VERSION")`.
* `narrow_header_drops_version_chip_before_dropping_mode` — at
  width 12 (extremely narrow) the chip drops while the mode label
  survives, matching the cascade priority.

Styled with `palette::TEXT_HINT` so it sits visually behind the
streaming dot / context signal — present but not distracting.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:47:56 -05:00
Hunter Bown 7630259748 chore(release): bump to 0.8.29
Workspace + per-crate path-dep version pins, npm wrapper, and
deepseekBinaryVersion all advance 0.8.28 -> 0.8.29.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:40:20 -05:00
Hunter Bown e4255539fc fix(tui): close the scroll-demon class structurally (#1085 regression)
Issue #1085 ("TUI viewport drifts down inside alt-screen at end of
turn, leaving top rows blank, esp. after sub-agents") was closed in
v0.8.18 by adding `reset_terminal_viewport()` to home the cursor on
TurnComplete / focus / resize. v0.8.27's flicker fix (`abf3fa66f`)
dropped the `\x1b[2J\x1b[3J` deep-clear from that path to stop the
double-clear flicker on Ghostty / VSCode / Win10 conhost. That left
ratatui's incremental-diff renderer relying on its internal model
matching reality — which only holds while nothing else writes to
the terminal.

Two latent `eprintln!` sites had been quietly emitting raw bytes
into the alt-screen for the entire v0.8.x cycle:

* `tools/subagent/mod.rs::persist_state_best_effort` (fires whenever
  the per-step sub-agent state save hits an error; under parallel
  sub-agents this can fire dozens of times per turn)
* `tools/subagent/mod.rs::new_shared_subagent_manager` (fires once
  on init if the prior state file fails to load)

Plus a third found during this fix:

* `network_policy.rs::record` (fires every time a network-policy
  audit write fails)

Each eprintln advanced the alt-screen cursor by one row and
scrolled the buffer up by one row, but ratatui's renderer didn't
know — it kept writing to absolute row positions, which now meant
"one row higher than visible." After ~30 leaks the TUI content
appeared to drift downward, with a blank band growing above the
header. v0.8.18's periodic full-clear had been masking it; v0.8.27's
flicker fix unmasked it.

Three layers of defence so this class of bug "isn't an option
anymore":

1. **`crates/tui/src/runtime_log.rs` — file-backed tracing
   subscriber + Unix fd-level stderr redirect.** A daily-rolling log
   file at `~/.deepseek/logs/tui-YYYY-MM-DD.log` is created at TUI
   startup (right after `EnterAlternateScreen`). A
   `tracing-subscriber` registry routes `tracing::warn!` /
   `tracing::error!` calls to it. On Unix, the process's stderr fd
   is `dup2`'d to the same file for the lifetime of the
   `TuiLogGuard`. Any future raw `eprintln!` — ours, a panic
   message, a third-party crate's verbose output — lands in the log
   file instead of the alt-screen. The guard restores the original
   stderr fd on drop so shutdown messages still reach the user's
   terminal.

2. **`tracing::warn!` replacements** for the three known leak sites
   (`subagent/mod.rs` ×2, `network_policy.rs` ×1). With (1) in
   place these messages now go to the log file with structured
   fields (`?err`, `host`, `tool`) instead of opaque text rows in
   the alt-screen.

3. **Module-level
   `#![deny(clippy::print_stdout, clippy::print_stderr)]`** on
   `tools/`, `core/`, `tui/`, `runtime_threads.rs`, and
   `network_policy.rs`. Any future `eprintln!` / `println!` added
   to a TUI runtime path fails the lint at compile time.
   Legitimate CLI-print paths (`main.rs` eval / init / doctor,
   `runtime_api.rs` server banners, `logging.rs` verbose helpers,
   `skills/mod.rs` listing utilities, `execpolicy/execpolicycheck.rs`
   JSON output, `ui::run_event_loop` post-`LeaveAlternateScreen`
   resume hint, three `#[test] #[ignore]` perf benches in
   `tui/transcript.rs` / `tui/widgets/mod.rs` / `core/capacity.rs`)
   keep their existing prints — they all run outside the alt-screen
   lifetime.

The dup2 redirect is Unix-only because there's no equivalent stable
Rust API for fd-redirecting `STDERR_FILENO` on Windows; on Windows
the tracing-subscriber layer + the clippy denies still apply, and
ratatui's own use of crossterm avoids the worst leakage classes.
Cross-platform stderr redirect via `SetStdHandle` is a follow-up.

The new `runtime_log` module ships with one test
(`log_directory_prefers_home`) that pins the `HOME` /
`USERPROFILE` / `dirs::home_dir()` resolution order — uses the
process-wide `test_support::lock_test_env()` lock for env-mutation
safety. Two `#[test] #[ignore]` benches in
`tui/transcript.rs` (rail-prefix memory) and `tui/widgets/mod.rs`
(transcript scroll bench) and one in `core/capacity.rs`
(`bench_compute_profile`) keep their stdout prints via
`#[allow(clippy::print_stdout)]` on the individual test.

New dependencies: `tracing-subscriber 0.3` (env-filter + fmt
features) and `tracing-appender 0.2` at the workspace root, both
pulled into `crates/tui` only.

Closes the v0.8.28 regression Hunter reported in screenshots:
parallel sub-agents running `exec_shell` triggered the scroll
demon with the TUI content squeezed into the bottom third of the
terminal and ~30 rows of blank above the header.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 20:20:25 -05:00
LinQ 602f7b5f1c test(ui): cover sanitize_stream_chunk control-byte filtering
`sanitize_stream_chunk` is the per-chunk filter every piece of
streaming text passes through — assistant content, thinking
content, tool results, web-search snippets — before reaching the
renderer. Its job is to keep newlines / tabs intact while dropping
control bytes that could otherwise let a chunk emit terminal escape
sequences (\u{1b}[2J clear-screen, \u{8} backspace, \u{7} bell).

Today the function has zero tests, so a future innocuous-looking tweak
("let's normalise newlines", "let's collapse all whitespace") could
silently regress the security posture or visibly mangle code blocks.

Adds three unit tests:

* `sanitize_stream_chunk_keeps_printable_and_drops_control_bytes` —
  newline/tab survive; ESC, BEL, BS, VT, FF, CR all drop.
* `sanitize_stream_chunk_preserves_unicode` — CJK characters,
  emoji, and accented Latin pass through untouched.
* `sanitize_stream_chunk_handles_empty_and_whitespace` — empty
  input stays empty; whitespace-only input is preserved; a chunk
  that is entirely control bytes legitimately shrinks to empty
  (the caller's "skip empty chunk" branches handle the result).
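
For orientation, a filter with the contract these tests pin can be as small as the sketch below. It is an assumption about the shape of the function, not a copy of the implementation in `ui.rs`.

```rust
/// Minimal sketch of the contract described above: keep newlines, tabs and
/// all printable text; drop every other control character.
fn sanitize_stream_chunk(chunk: &str) -> String {
    chunk
        .chars()
        // `char::is_control` covers ESC, BEL, BS, VT, FF and CR (category Cc).
        .filter(|&c| c == '\n' || c == '\t' || !c.is_control())
        .collect()
}
```

Dropping only the ESC byte leaves the remainder of an escape sequence (for example `[2J`) as inert printable text, which is enough to stop the terminal from interpreting it.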

Zero behaviour change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:30:21 -05:00
LinQ 5cd89ba3ac test(web-search): cover optional_search_max_results precedence + edge cases
`optional_search_max_results` decides how many results `web_search`
fetches — too high wastes bandwidth, too low misses the model's
intent. The function has three branches today (top-level
`max_results`, then `search_query[0].max_results`, then the
DEFAULT_MAX_RESULTS constant), zero of which are exercised in tests.

`extract_search_query` already has four tests covering the basic
shapes; this PR completes coverage with two edge cases the helper
silently handles (whitespace trim + empty rejection).

Six new tests, no behaviour change:

* `optional_max_results_prefers_top_level_value` — the explicit
  outer field wins over a sibling in the array form. Pins the
  precedence so a future structured-query implementation can't
  flip it accidentally.
* `optional_max_results_falls_back_to_array_form` — when only the
  inner form sets the bound (V4's structured `search_query: […]`
  shape), it reaches the caller correctly.
* `optional_max_results_uses_default_when_neither_set` — DEFAULT
  applies for both the top-level and the array shapes, so the model
  can't burn the MAX_RESULTS budget by omitting the field.
* `optional_max_results_only_reads_first_array_entry` — sub-search
  fan-out is a future feature; entries after the first are ignored today
  and a multi-query implementation will need to update this test
  intentionally.
* `extract_search_query_trims_whitespace_from_array_form_q_alias` —
  pads from heredocs/copy-paste don't reach the upstream URL.
* `extract_search_query_rejects_empty_query` — `""`, all-whitespace
  `q`, and an empty body each surface the same missing-field error
  rather than a confusing upstream "Bot challenge" page.
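
The three-branch precedence can be sketched over plain `Option`s. The function name, the flattened argument shape, and the value of `DEFAULT_MAX_RESULTS` here are all hypothetical; the real helper reads these fields out of the tool's JSON input.

```rust
/// Hypothetical value; the commit message does not state the real default.
const DEFAULT_MAX_RESULTS: u32 = 5;

/// `top_level` models the outer `max_results` field, `search_query` models
/// the `max_results` of each entry in the structured array form.
fn resolve_max_results(top_level: Option<u32>, search_query: &[Option<u32>]) -> u32 {
    top_level
        // Only the first array entry is consulted; later entries are ignored.
        .or_else(|| search_query.first().copied().flatten())
        .unwrap_or(DEFAULT_MAX_RESULTS)
}
```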

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:30:21 -05:00
LinQ 6440052089 test(file-tool): cover parse_pages_arg edge cases
`parse_pages_arg` validates the user-supplied `pages` argument that
`ReadFileTool` forwards to `pdftotext -f START -l END`. The function
has zero tests today even though it's the only gatekeeper between
user input and a pdftotext spawn — silent acceptance of a malformed
range yields a confusing empty extraction with no actionable error
message.

Adds five tests:

* `parse_pages_arg_accepts_single_page` — `"3"` and `" 7 "` both
  return `Some((n, n))`.
* `parse_pages_arg_accepts_range` — `"1-5"`, `"10-20"`, and
  whitespace-tolerant `" 1 - 5 "` all parse correctly.
* `parse_pages_arg_rejects_invalid_ranges` — `5-1` (end < start),
  `0` and `0-3` (one-indexed contract), empty / whitespace-only
  inputs, `abc` (non-numeric), and `3.5` (floats) all return `None`.
* `parse_pages_arg_rejects_half_open_ranges` — `1-`, `-5`, and `-`
  reject rather than silently extending to `u32::MAX` or `0`.
* `parse_pages_arg_rejects_negative_numbers` — `-3-5` doesn't wrap
  into a giant positive number via u32 parsing.
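
A parser satisfying the cases listed above might look like the following sketch. The return type follows the `Some((n, n))` shape mentioned in the tests; the `parse_page` helper and the digits-only check are assumptions, not the project's code.

```rust
/// Sketch of a one-indexed page-range parser: "N" or "START-END".
fn parse_pages_arg(arg: &str) -> Option<(u32, u32)> {
    let arg = arg.trim();
    let (start, end) = match arg.split_once('-') {
        // Range form: both halves must be valid pages, so "1-", "-5",
        // "-" and "-3-5" all fail on an empty or malformed half.
        Some((a, b)) => (parse_page(a)?, parse_page(b)?),
        // Single-page form.
        None => {
            let n = parse_page(arg)?;
            (n, n)
        }
    };
    (start <= end).then_some((start, end))
}

/// Hypothetical helper: digits only, one-indexed.
fn parse_page(text: &str) -> Option<u32> {
    let text = text.trim();
    if text.is_empty() || !text.bytes().all(|b| b.is_ascii_digit()) {
        return None; // rejects signs, floats and non-numeric input
    }
    text.parse::<u32>().ok().filter(|&n| n >= 1)
}
```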

Zero behaviour change; locks the contract so a future innocuous edit
can't silently shift validation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:30:21 -05:00
LinQ 68b25584cc test(error-taxonomy): cover classify_error_message and Display impls
`error_taxonomy.rs` is the central typed-error layer — every
subsystem (client, tools, runtime, UI) routes through
`ErrorEnvelope` and `classify_error_message` to decide retry
policy, UI colour, and offline-flip semantics. Today it ships with
zero tests, so a single innocuous keyword reordering could shift
classification across the whole tree.

Adds 17 unit tests:

* One per category (Network, Authentication, Authorization,
  RateLimit, Timeout, InvalidInput, Parse, State, Tool, Internal)
  exercising the keyword variants the function intends to catch —
  e.g. context-overflow phrasings ("maximum context length",
  "context_length_exceeded", "prompt is too long", the OpenAI
  "you requested … the maximum is" wording, "context window"),
  HTTP 5xx with various spacing rules (502 / 503 / 504, leading
  space, trailing space, exact match, embedded), and 429/quota
  rate-limit phrasings.
* Three precedence tests pinning the load-bearing ordering:
  InvalidInput beats Tool (so a "tool returned: maximum context
  length" still surfaces as a /compact-able invalid input),
  Timeout beats Network (so "504 Gateway Timeout" classifies as
  Timeout because its retry semantics are gentler than Network's),
  and RateLimit beats Authentication (so a 429 with API token
  phrasing doesn't get misrouted to auth-failure handling).
* Unicode handling: a Chinese error message that still mentions
  "context length" hits InvalidInput; a pure-Chinese unknown
  message falls through to Internal.
* Display impls round-trip through their snake_case wire form so
  consumers depending on the labels can't be silently broken.

Zero behaviour change; only tests + one comment pinning the 504
precedence rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:30:21 -05:00
LinQ 05fcb0df24 feat(auto-router): recognise CJK complex keywords in model picker
The Flash-router fallback heuristic `auto_model_heuristic` only matched
English complexity keywords (`refactor`, `architecture`, `design`,
`debug`, `security`, `review`, `audit`, `migrate`, `optimize`,
`rewrite`, `implement`, `analyze`). A Chinese-speaking user typing
"帮我重构这个模块" or "审计安全漏洞" silently fell through to the
short/long-message length branches and usually landed on Flash for
work that obviously needs Pro-grade reasoning. This mirrors the
companion gap in `auto_reasoning::select` and has the same root cause.

Extracts the array into a `COMPLEX_KEYWORDS` constant and adds the
Simplified and Traditional Chinese counterparts for each English
keyword:

* refactor → 重构 / 重構
* architecture → 架构 / 架構
* design → 设计 / 設計
* debug → 调试 / 調試
* security → 安全
* review → 审查 / 審查
* audit → 审计 / 審計
* migrate → 迁移 / 遷移
* optimize → 优化 / 優化
* rewrite → 重写 / 重寫
* implement → 实现 / 實現
* analyze → 分析

CJK matches the literal form because the existing `to_lowercase()`
is a no-op for those scripts. English keywords are byte-identical to
before, so English-only behaviour doesn't shift.

Three new tests cover Simplified and Traditional Chinese keyword
routing to Pro, plus a sanity test that short non-keyword Chinese
prose still gets the cost-saving Flash fallback.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:29:43 -05:00
LinQ 68cc5d19cc feat(auto-reasoning): recognise CJK debug/search keywords in Auto tier
`auto_reasoning::select` is the per-turn classifier that picks
`reasoning_effort` for `reasoning_effort = "auto"`. Today it only
recognises English keywords (`debug`, `error`, `search`, `lookup`),
so a user typing in Chinese or Japanese never trips the tier shifts:
"帮我调试代码" stays on `High` instead of escalating to `Max`,
"搜索一下文件" stays on `High` instead of dropping to `Low`. For a
non-English Auto-mode user that's both wrong-side-of-cheap and
wrong-side-of-careful on every turn.

Extracts the keyword sets into `HIGH_EFFORT_KEYWORDS` and
`LOW_EFFORT_KEYWORDS` constants and adds the Chinese / Japanese
vocabulary that maps to the same intents:

* HIGH (→ `Max`): 调试 / 错误 / 报错 / 出错 / 崩溃 / 調試 / 錯誤
  in Chinese; デバッグ / エラー / バグ in Japanese.
* LOW (→ `Low`): 搜索 / 查找 / 查询 in Chinese; 検索 in Japanese.

Latin lowercase is preserved (the caller still lowercases the
message), and CJK matches the literal form because CJK has no case.
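
The keyword tiers can be sketched as follows. The `Effort` enum, the signature of `select`, and the choice to check the high-effort set first are assumptions; the real classifier also has a sub-agent branch that is omitted here.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Effort {
    Low,
    High,
    Max,
}

const HIGH_EFFORT_KEYWORDS: &[&str] = &[
    "debug", "error", "调试", "错误", "报错", "出错", "崩溃", "調試", "錯誤",
    "デバッグ", "エラー", "バグ",
];
const LOW_EFFORT_KEYWORDS: &[&str] = &["search", "lookup", "搜索", "查找", "查询", "検索"];

/// Sketch of the per-turn classifier: keyword hit decides the tier,
/// everything else stays on `High`.
fn select(message: &str) -> Effort {
    // Lowercasing only affects cased scripts; CJK text is unchanged.
    let lowered = message.to_lowercase();
    if HIGH_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        Effort::Max
    } else if LOW_EFFORT_KEYWORDS.iter().any(|k| lowered.contains(*k)) {
        Effort::Low
    } else {
        Effort::High
    }
}
```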

Four new tests cover Chinese debug keywords, Japanese debug keywords,
Chinese search keywords, and the single Japanese search keyword; a
further sanity test checks that ordinary CJK prose (without keyword
hits) still returns `High`, matching the English-only behaviour the
function already had.

All previous tests (`subagent_returns_low`, `debug_or_error_returns_max`,
`search_or_lookup_returns_low`, `default_returns_high`) continue to
pass — the original English-only paths are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:29:43 -05:00
LinQ a5c4a21c9b feat(project-context): merge global AGENTS.md with project AGENTS.md (#1157)
#1157 asks that the global AGENTS.md travel into every session,
ideally merged with a project's local
AGENTS.md when both exist. Maintainer agreed:

> yes that makes sense! am working on getting this organizational
> structure better today so that worktrees etc can feel like an
> intended way of using this.

The fallback path already loaded the global file when no workspace
context existed, but dropped it silently the moment a project
AGENTS.md showed up. After this PR:

* Both files present → merged. The global block is prepended with a
  labelled HTML-style fence (`<!-- global: /home/u/.deepseek/AGENTS.md -->`),
  then the project block follows with its own fence
  (`<!-- project (overrides global where they conflict) -->`). Order
  is global-first so workspace rules read last and win "last word"
  precedence with the model when they disagree.
* Only project file present → unchanged from before.
* Only global file present → unchanged from before (still acts as a
  fallback). The merge framing is suppressed in the global-only case
  so the prompt stays minimal.
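
The three cases above can be sketched as a small pure function. The function name and argument shape are hypothetical; only the fence strings and the global-first ordering come from the description.

```rust
/// Sketch of the merge rules. `global` carries (path, contents) of the
/// global AGENTS.md, `project` the contents of the project file.
fn merge_agents(global: Option<(&str, &str)>, project: Option<&str>) -> Option<String> {
    match (global, project) {
        // Both present: global block first, project block last so it has
        // the "last word" when the two disagree.
        (Some((path, global_body)), Some(project_body)) => Some(format!(
            "<!-- global: {path} -->\n{global_body}\n\n\
             <!-- project (overrides global where they conflict) -->\n{project_body}"
        )),
        // Only one file present: returned as-is, without merge framing.
        (None, Some(body)) => Some(body.to_string()),
        (Some((_, body)), None) => Some(body.to_string()),
        (None, None) => None,
    }
}
```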

`source_path` continues to point at the more-specific file (project
> global > nothing) because that's the path the user is likely to
edit when they want to override something.

Two tests:
* `test_local_and_global_agents_merge_when_both_exist` —
  the actual #1157 scenario. Asserts both blocks are present, global
  precedes project, and the merge-framing label appears between them.
* `test_global_agents_only_no_project_unchanged_fallback` — sanity
  check that the global-only path doesn't accidentally inherit the
  merge framing.

The pre-existing `test_load_global_agents_when_project_has_no_context`
still passes, so the global-as-fallback contract is preserved.

Refs #1157
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:29:43 -05:00
LinQ 29a42ba31a fix(web_search): drop spam-stuffed SERPs in Bing/DDG fallback (#964)
#964 reports that `web_search` returns garbage results — every query
in their reproduction case returned eight entries from a single low-
quality forum domain (`*.forumgratuit.org`) regardless of input. The
root cause is upstream: when Bing's scraping endpoint serves a stuffed
page (often when our request looks too bot-like or the query falls
into a degraded bucket), the parser happily extracts the b_algo
entries and the model receives the junk as authoritative search
results.

Adds an `is_likely_spam_results` heuristic that runs after both Bing
and DDG parsers. When 60% or more of the parsed entries share the
same registrable root domain (with at least three entries to avoid
false positives on legitimate two-link answers), the batch is
discarded. The existing "no results" handling then surfaces a clean
error message to the model instead of routing it toward spam.

`root_domain` strips subdomains so `astralia.forumgratuit.org` and
`russia.forumgratuit.org` collapse to `forumgratuit.org` for the
purpose of dominance counting; eTLD+1 is approximated by keeping the
last two labels, which is close enough for the threshold check.
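
The dominance check can be sketched over bare host names. This version assumes hosts are already extracted from the result URLs and reads "at least three entries" as a minimum batch size; both are assumptions, and the real functions may take different inputs.

```rust
use std::collections::HashMap;

/// Approximates eTLD+1 by keeping the last two labels of the host.
fn root_domain(host: &str) -> String {
    let labels: Vec<&str> = host.trim_end_matches('.').split('.').collect();
    let start = labels.len().saturating_sub(2);
    labels[start..].join(".").to_lowercase()
}

/// Returns true when one root domain accounts for 60% or more of a batch
/// of at least three results.
fn is_likely_spam_results(hosts: &[&str]) -> bool {
    if hosts.len() < 3 {
        return false; // short batches pass through untouched
    }
    let mut counts: HashMap<String, usize> = HashMap::new();
    for host in hosts {
        *counts.entry(root_domain(host)).or_insert(0) += 1;
    }
    let top = counts.values().copied().max().unwrap_or(0);
    // Integer form of top / len >= 0.6, avoiding floating point.
    top * 10 >= hosts.len() * 6
}
```

The two-label approximation misjudges suffixes such as `co.uk`, which the message above accepts as close enough for a threshold check.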

Five new tests cover the threshold (3-of-5 trips, 2-of-5 doesn't),
short-batch passthrough, normal diverse SERPs (Wikipedia + SO +
Reddit) passing through, and the precise spam reproduction from #964.

Refs #964
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:29:43 -05:00
LinQ f56f73e371 fix(prompts): strengthen language directive so thinking matches user (#1118)
#1118 reports that even after configuring the locale to Chinese, V4
keeps emitting English `reasoning_content` (the thinking block) when
the surrounding code/error logs are English-heavy. Maintainer agreed
the prompt needs editing.

The existing language directive already said "both for `reasoning_content`
and for the final reply", but V4 falls into a failure mode where it
mirrors the user message for the final answer while quietly defaulting
to English for thinking. Three additions to `crates/tui/src/prompts/base.md`
sharpen the rule:

1. **Bold the "must both be in Simplified Chinese" requirement**, and
   add the failure-mode escape hatches the prompt previously left
   implicit ("even when the surrounding system prompt is in English,
   and even when the task context [...] is overwhelmingly English").

2. **Spell out the mid-session-switch rule for `reasoning_content`**
   explicitly. Today the prompt says "switch with them" but doesn't
   reinforce that this includes thinking — V4 sometimes carries the
   previous turn's reasoning language forward.

3. **Add an explicit-override clause** for the opposite preference
   (#1118 commenter pmsleepcheck preferred English thinking for token
   cost). Users can say "think in English" / "用英文思考" and the
   model honours that until the next override. The final reply still
   tracks the user's message language — only thinking is overridable.

Adds `language_section_carries_reasoning_content_directives_for_1118`
pinning the four load-bearing phrases ("reasoning_content",
"must both be in Simplified Chinese", "overwhelmingly English", and
both English + Chinese override examples) so a future innocuous edit
can't quietly drop them.

The existing `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace`
test still passes, so byte-stability for a fixed session is intact.

Refs #1118
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:29:43 -05:00
LinQ 5b0b81c5fc fix(session-picker): scope Ctrl+R picker to the current workspace (#1395)
#1395 reports that Ctrl+R in project B restores a session that
belongs to project A — the picker was calling `list_sessions()` and
showing every session on disk regardless of where the user is. With
hundreds of past sessions across many repos, the first hit on the
"most recent" sort is rarely the one from the project the user just
opened.

`SessionMetadata.workspace` is already persisted, so the data needed
to filter is there. This PR:

1. Adds a `workspace_scope` field to `SessionPickerView` and a
   `show_all_workspaces` toggle. `SessionPickerView::new` now takes
   `&Path` so every caller is forced to pass a scope.
2. Filters `filtered` to sessions whose recorded `workspace`
   canonicalises to the same path as the active workspace. Both
   sides go through `std::fs::canonicalize` so a symlinked or
   relative checkout matches its canonical form.
3. Adds an `a` keybinding inside the picker to flip
   `show_all_workspaces`, with a status-line readout
   ("scoped to this workspace" / "showing sessions from every
   workspace"). The user can always escape the scope if they need
   to.

Three new tests:
- `workspace_scope_filters_sessions_to_current_project` —
  reproduces the #1395 scenario (sessions in /tmp/project-a vs
  /tmp/project-b; the picker only surfaces the matching project).
- `workspace_scope_toggle_a_expands_to_all_workspaces` — `a` flips
  back and forth; status announces the new mode.
- `workspace_scope_none_means_show_all` — the historical
  unscoped behaviour is still reachable when the caller passes
  no workspace (used for tests + future opt-out).

Updates the two call sites (`ui.rs` Ctrl+R handler and
`commands/session.rs` `/sessions [show]`) to pass
`&app.workspace`.

Closes #1395
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 19:28:29 -05:00
Hunter Bown 61f40420aa chore(release): bump to 0.8.28
Workspace + per-crate path-dep version pins, npm wrapper, and
`deepseekBinaryVersion` all advance from 0.8.27 → 0.8.28. Lockfile
refreshed via `cargo update --workspace --offline`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:21:58 -05:00
Hunter Bown 842802b873 test: fold settings.rs term_program + no_animations guards into lock_test_env
PR #1365 (cherry-picked into v0.8.28) introduced
`term_program_test_guard` as a fresh module-local
`static Mutex<()>`, mirroring the existing
`no_animations_test_guard`. Both serialize their own family of
tests but not with each other — so under cargo's parallel runner,
a `NO_ANIMATIONS=1` leak from one family lands in the env at
the exact moment a `TERM_PROGRAM=iTerm.app` test calls the shared
`apply_env_overrides`, flipping `low_motion` to true and failing
`non_vscode_term_program_does_not_force_low_motion`.

Both guards now return `crate::test_support::lock_test_env()`
(the same fold the v0.8.28 test-stabilization commit applied to
the EnvGuard family in `commands/config.rs`, `commands/network.rs`,
and `tools/recall_archive.rs`). This serializes the two test
groups with each other and with every other env-mutating test in
the suite, eliminating the cross-test env-var race.

`save_api_key_for_openrouter_writes_provider_table` was failing
intermittently for the same reason — a concurrent env mutation
in an unrelated test was clobbering HOME / DEEPSEEK_CONFIG_PATH
in the window between our `EnvGuard::new` and
`save_api_key_for`'s `default_config_path()` read. With the
broader serialization in place, the race window closes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:16:18 -05:00
Hunter Bown 9759a77ae5 fix(working-set): wire #1270's add_local_reference_completions into completions() and build_file_index
PR #1270 from @SamhandsomeLee landed the function definitions and
both regression tests as part of v0.8.27, but the wiring was
incomplete: `add_local_reference_completions` was never called
from `completions()`, and `build_file_index` never walked
`local_reference_paths`. Hunter marked the two tests `#[ignore]`
with a "v0.8.28 follow-up" trailer in `fe0673d68`.

This completes the wiring:

* `Workspace::completions()` now calls
  `add_local_reference_completions` for both the diverging-cwd
  branch and the workspace-root branch, after the existing
  `walk_for_completions`. The helper is a no-op unless the needle
  starts with `.` or contains `/` / `\`, so prose mentions skip
  the extra walk.
* `Workspace::build_file_index()` now walks `local_reference_paths`
  after the curated dot-dir whitelist (`.deepseek`, `.cursor`,
  `.claude`, `.agents`), so explicit user paths into other
  gitignored dirs (e.g. `.generated/specs/device-layout.md`)
  fuzzy-resolve too. Honors `FILE_INDEX_MAX_ENTRIES` so the
  #697 walk-cap still bounds first-turn latency.
* Drops `#[allow(dead_code)]` from the constant and four helpers
  (`LOCAL_REFERENCE_SCAN_LIMIT`, `add_local_reference_completions`,
  `should_try_local_reference_completion`,
  `local_reference_paths`, `should_skip_local_reference_dir`) and
  un-ignores both `working_set` regression tests:
  `workspace_completions_surface_explicit_hidden_and_ignored_paths`
  and
  `fuzzy_index_resolves_hidden_and_ignored_files_except_deepseekignored`.

Both tests pass. `.deepseekignore` entries remain blocked from
both completion and basename fuzzy-resolution paths because
`local_reference_paths` adds `.deepseekignore` as a custom-ignore
file on the walker.

Closes the v0.8.28 follow-up trailer left on `fe0673d68`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 18:11:52 -05:00