Commit Graph

835 Commits

Author SHA1 Message Date
Hunter Bown dcc2c448eb fix(client): vLLM uses chat_template_kwargs to toggle reasoning, not the Anthropic field
`apply_reasoning_effort`'s vLLM branch was injecting
`thinking: {type: "disabled"}` at the top of the request body to
turn off model reasoning. But vLLM speaks OpenAI's
chat-completions protocol, not Anthropic-native extension fields,
and silently ignored that directive — the model emitted a full
hidden reasoning trace into the non-OpenAI-standard `reasoning`
field (which this client does not surface), so users saw a
~13-second perceived freeze before the first content token
arrived.

The vLLM branch now emits the OpenAI extension
`chat_template_kwargs.enable_thinking` — the canonical way to
toggle Qwen3's `<think>` mode, DeepSeek-R1's reasoning trace, and
any other reasoning-capable model served via vLLM. End-to-end
measurement against vLLM hosting Qwen3.6-35B-A3B-FP8:

  - TTFT:           13039ms → 274ms
  - Total LLM call: 13s     → 5.7s
  - Output rate:    3 ch/s  → 46 ch/s

The `high` / `max` reasoning levels likewise route through
`chat_template_kwargs` so the toggle is consistent across effort
levels. No change for any non-vLLM provider (NVIDIA NIM continues
to use the NVIDIA-specific `chat_template_kwargs.thinking` key;
Anthropic-native providers keep the Anthropic-native field).

Resolved a 3-way merge conflict against the v0.8.32 AtlasCloud
harvest (PR #1436) so AtlasCloud stays in the no-op match arm
alongside OpenAI / Ollama while the new vLLM arm gets its own
branch. Note for future Sglang / Fireworks / Novita work: those
servers likely have the same bug but each has its own
chat_template_kwargs schema; this PR is intentionally minimal
to the verified-fix scope.

Harvested from PR #1480 by @h3c-hexin

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:25:16 -05:00
Hunter Bown f60a77e191 fix(session): skip turn_meta block when deriving session title
`SessionManager::create_saved_session_with_id_and_mode` picks the
first `ContentBlock::Text` off the user's message via `find_map`
and uses that as the session title. The engine prepends a synthetic
`<turn_meta>...</turn_meta>` block (Block 0) ahead of the real user
text (Block 1), so the `/sessions` picker was rendering the metadata
blob as the session name.

Guard the find_map filter on `!text.starts_with("<turn_meta>")` so
titles fall through to the actual user input. Existing sessions
without the prefix block are unaffected (the guard is a no-op when
no metadata block is present); the existing `truncate_title` long-
input handling continues to apply.

Harvested from PR #1498 by @wdw8276

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:23:14 -05:00
Hunter Bown 2a306d2b0f chore(docs): wrap bare URLs in <...> so rustdoc -D warnings stays green
The PR #1294 (Tavily / Bocha provider) and PR #1467 vision-tool
harvests both surfaced rustdoc warnings about bare URLs in doc
comments — `https://tavily.com` rather than `<https://tavily.com>`.
Under `RUSTDOCFLAGS=-D warnings` (which CI runs) those warnings
escalate to errors. Wrapping each URL in `<...>` produces the
clickable autolink rustdoc expects.

Fixed sites: `SearchProvider::{Tavily, Bocha}` doc comments in
`crates/tui/src/config.rs`, both Tavily/Bocha API endpoint refs in
`crates/tui/src/tools/web_search.rs`, and the pandoc.org link in
the `crates/tui/src/tools/pandoc.rs` module header.

Two pre-existing `WorkingSet::build_file_index` /
`WorkingSet::fuzzy_resolve` unresolved-link errors on `main` are
intentionally NOT touched in this commit — they were introduced
by `9759a77ae` and live outside the v0.8.32 scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:10:28 -05:00
Hunter Bown c8614a24fd fix(tui): enable Kitty keyboard protocol on Windows so Shift+Enter inserts newline
Closes #1359. On Windows 11 + VSCode integrated terminal +
PowerShell, pressing `Shift+Enter` in the composer submitted the
message instead of inserting a newline. `Alt+Enter` / `Ctrl+J`
were broken the same way. Root cause: crossterm's
`PushKeyboardEnhancementFlags` checks `is_ansi_code_supported()`
before emitting the escape, and on Windows that helper queries
the console mode rather than the VT capability and
unconditionally returns false — so the Kitty push `\x1b[>1u` was
never written. xterm.js then stayed in legacy mode where
`Shift+Enter` and `Enter` both encode as `\r`, indistinguishable.

The fix writes the push and pop escapes directly under
`#[cfg(windows)]`, bypassing the crossterm capability gate.
VSCode and Windows Terminal honour the Kitty keyboard protocol;
terminals that don't (older conhost without VT processing)
silently discard the unknown escapes. The same gate also meant
`PopKeyboardEnhancementFlags` was silently dropped on Windows in
the `main.rs` panic hook and in
`tui::external_editor::spawn_editor_for_input` — both call sites
now route through `pop_keyboard_enhancement_flags` so a crash or
`$EDITOR` invocation can't leave the parent shell with a
Kitty-enhanced keyboard state.

Two `#[cfg(windows)]` regression tests pin the direct-write path
so accidentally falling back to `execute!()` against
`crossterm::PushKeyboardEnhancementFlags` would now fail in CI:

- `push_keyboard_flags_writes_kitty_push_sequence_on_windows`
- `pop_keyboard_flags_writes_kitty_pop_sequence_on_windows`

Non-Windows behaviour is unchanged — the existing
`recover_terminal_modes_emits_expected_csi_sequences_with_gating`
test still passes on Linux and macOS.

Also adds a v0.8.29 audit note to `docs/KEYBINDINGS.md` and
documents a pre-existing FocusGained stack-depth bug for a
separate fix.

Harvested from PR #1483 by @CrepuscularIRIS

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:04:47 -05:00
Hunter Bown 3aaf0ad95e feat(vision): add image_analyze tool gated behind vision_model feature flag
`image_analyze` sends an image file to an OpenAI-compatible vision
endpoint and returns the model's natural-language description.
Complements `image_ocr` (which uses local tesseract for "what text
is on this image"); `image_analyze` is for "what is this image
about" — visual reasoning the local OCR engine can't do.

Trust-boundary scope: **two-step opt-in only**.

1. The feature is gated by `[features] vision_model = true` —
   default `false`.
2. The tool needs a `[vision_model]` config block specifying
   `model` (with optional `api_key` / `base_url` — falls back to
   the main config api_key + the OpenAI base URL).

Without both, the tool isn't registered, so no install fires a
vision API call without explicit user setup. Workspace boundary:
the tool rejects absolute paths and any `..` parent-dir
traversal before any base64 encoding or HTTP call. Stateless —
each call sends only the requested image + optional prompt; no
session, no conversation history attached. Supports PNG, JPEG,
GIF, WebP, and BMP inputs.

**Billing**: each call hits the configured vision endpoint
(OpenAI by default — `gpt-4o-mini` / `gpt-4o` family commonly
configured). Users with their own deployments (Gemini, Claude
Vision via OpenAI shim, local llama.cpp) can point `base_url` /
`api_key` at the alternative.

Tests cover the tool metadata (read-only capability, correct
name), MIME-type detection across the supported formats and the
unsupported-format rejection path, and the workspace-boundary
checks (absolute paths and `..` traversal both reject before
any API call). Skipped from the upstream PR: the
`.github/workflows/sync-cnb.yml` rewrite, which v0.8.31 already
addressed with the concurrency/scoped-push refactor; landing the
older form would regress that commit.

Resolved a clippy::collapsible_if in tool_setup.rs (the
`if feature && let Some(cfg) = ...` form) to satisfy the
workspace -D warnings gate.

Harvested from PR #1467 by @MMMarcinho

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:03:19 -05:00
Hunter Bown bd603a271c feat(tools): add image_ocr tool — extract text from images via tesseract
Lets the model OCR a screenshot, scanned receipt, whiteboard photo,
or image-only PDF the user drops into the workspace, without
bouncing through `exec_shell` (which would mean an approval prompt
plus the model having to remember tesseract's CLI surface). The
tool spawns `tesseract <image> -` and returns the recognised text
inline — no file is written. Capability is ReadOnly + parallel
since OCR is a side-effect-free read.

Registration is gated on `crate::dependencies::resolve_tesseract()`
via the new `ToolRegistryBuilder::with_image_ocr_tools()` builder,
hooked into `with_agent_tools` alongside `pandoc_convert`. When
tesseract is missing the tool isn't advertised — same
probe-then-decide pattern v0.8.31 introduced for Python. The
execute path also late-resolves so a concurrent uninstall surfaces
the install-tesseract hint rather than the raw spawn failure.

`deepseek doctor`'s "Tool Dependencies" section reports tesseract
status next to pandoc / node / python with platform-aware install
hints. For non-default language packs or PSM modes the user can
still drop into `exec_shell` with the full tesseract CLI surface.

Tests check the metadata (ReadOnly + parallel, not WritesFiles),
the missing-path rejection, and the happy-path OCR round-trip
against `crates/tui/tests/fixtures/ocr_hello.png` — a 2 KB
300×100 grayscale PNG generated with ImageMagick rendering
"HELLO OCR" in Helvetica. The happy-path test skips silently on
hosts without tesseract (matching the catalog-build behaviour) and
on hosts where the fixture isn't checked out (sparse / shallow
clones).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:58:48 -05:00
Hunter Bown aed7dbefaa feat(tools): add pandoc_convert tool — universal document conversion
Pandoc is the de-facto Swiss Army knife for moving prose between
the formats engineers and writers actually use: Markdown to HTML,
HTML to Markdown, reST to anything, anything to LaTeX / DOCX /
EPUB. Surfacing it as a model-callable tool unblocks a large class
of "rewrite this report as ..." / "publish this changelog as ..."
workflows that previously required the user to drop into a
terminal between turns.

The tool reads `source_path` (any pandoc-supported input format —
pandoc autodetects from the extension), converts to one of the
11 whitelisted target formats, and either writes the result to
`output_path` (when provided) or returns the converted text
inline. Target whitelist:

  markdown, gfm, commonmark, html, rst, latex,
  docx, odt, epub, plain, asciidoc

Picked for coverage of real document-handling without dragging in
additional system tooling (no PDF target — that needs a LaTeX
engine; no S5/Slidy — niche and surprising). Binary targets
(docx, odt, epub) reject inline-text requests with a clear error
naming the required `output_path`; text targets work in either
mode.

Registration is gated on `crate::dependencies::resolve_pandoc()`
through the new `ToolRegistryBuilder::with_pandoc_tools()` builder
method, hooked into `with_agent_tools` so the tool surface picks
it up everywhere the existing diagnostics / project tools do.
When pandoc is missing the tool simply isn't registered (same
probe-then-decide pattern v0.8.31 introduced for Python).
Approval routes through the WritesFiles / Suggest tier matching
other file-writing tools.

`deepseek doctor`'s "Tool Dependencies" section reports pandoc as
present / absent with platform-aware install hints
(`brew install pandoc` / `apt install pandoc` /
`winget install JohnMacFarlane.Pandoc`).

Tests cover the format whitelist round-tripping into the schema's
`enum` field, the binary-format rejection path, the unsupported-
format rejection path, the missing-source-file rejection, the
Markdown→HTML inline round-trip, and the `output_path` write
roundtrip + summary message.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:56:00 -05:00
Hunter Bown 2566f3c546 feat(tools): add js_execution tool — model-provided JavaScript via local Node
Mirrors `code_execution` (Python) so the model sees a single
consistent surface for "run this snippet locally and tell me what
it printed" across both interpreters. Same tempfile spawn pattern,
same 120-second timeout, same stdout/stderr/return_code JSON
result shape — prompt-cache layouts covering one tool also cover
the other.

Registration is gated on `crate::dependencies::resolve_node()`:
when Node is missing the tool is simply not advertised in the
model's catalog (the same probe-then-decide pattern v0.8.31
introduced for Python), so the model never sees a runtime it
can't actually use. `deepseek doctor`'s "Tool Dependencies"
section reports Node alongside Python with platform-aware
install hints.

Approval, Plan-mode gating, missing-tool dispatch, and turn-loop
exec wiring follow the existing `CODE_EXECUTION_TOOL_NAME`
pathways one-for-one — the new tool name `js_execution` is added
to every site that already special-cased `code_execution` in
`turn_loop.rs`. Default approval description: "Run model-provided
JavaScript code in local Node.js execution sandbox" (Suggest tier).

Tests cover the tool definition shape, happy-path stdout capture,
non-zero exit on a thrown Error, and the `code` field requirement.
The two spawning tests skip cleanly on hosts without Node (the
catalog wouldn't advertise the tool there anyway, so failing the
suite would be a regression on the probe contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:52:13 -05:00
Hunter Bown d62facafac feat(translate): opt-in /translate command localises model replies
Two-layer design for users whose UI locale is not English:

1. **System-prompt directive (primary)**: when the user enables
   translation via `/translate`, a `## Language Output Requirement`
   block is appended to the system prompt instructing the model to
   reply in the resolved session locale (Simplified Chinese,
   Traditional Chinese, Japanese, or Brazilian Portuguese). Code
   identifiers, technical terms without an established translation,
   and code blocks the user explicitly requests in English are
   exempt. The block is gated on `PromptSessionContext.translation_
   enabled`, so it adds zero tokens for installs that don't opt in.

2. **Post-hoc heuristic (fallback)**: a lightweight detector in
   `tui::translation` compares Latin-letter count against weighted
   CJK characters (CJK chars carry ~3× the information per glyph,
   so the ratio comparison stays fair across mixed code+prose).
   When a reply still surfaces English despite the directive, the
   detector flags it and a focused per-message `client.translate()`
   call renders the localised version before display. The dedicated
   translation request runs without conversation history, tool
   calls, or streaming — the only role is translate-and-return.

Adds the `/translate` slash command, locale strings for the new UI
states, the post-hoc fallback module, the per-message
`TranslationStatus`, and threading through `core::ops`,
`core::engine`, `runtime_threads`, and the TUI app/UI surface.

Trust-boundary check: opt-in only — `translation_enabled` defaults
to false everywhere, so English-locale installs see zero behaviour
change. The system prompt addition is conditional on the runtime
flag, not the contributor's earlier always-on form. Threaded the
new `Locale::ZhHant` arm through the v0.8.32 `/change` slash
command match to keep the pattern exhaustiveness check passing.

Harvested from PR #1462 by @YaYII

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:46:50 -05:00
Hunter Bown 8f33e4bd48 feat(providers): add AtlasCloud as a first-class provider
AtlasCloud (https://atlascloud.ai) hosts the V4 family on its own
DeepSeek-compatible endpoint at `https://api.atlascloud.ai/v1`, and
several contributors had been running it through the
OpenAI-compatible passthrough with manual `base_url` / model
overrides. Selecting `provider = "atlascloud"` in
`~/.deepseek/config.toml` (or via `DEEPSEEK_PROVIDER=atlascloud`)
now wires up:

- documented `DEFAULT_ATLASCLOUD_BASE_URL` /
  `DEFAULT_ATLASCLOUD_MODEL` defaults so a fresh install needs
  only the api_key
- a `[providers.atlascloud]` config block with the same fields
  every other named provider exposes (api_key / base_url / model
  / http_headers)
- `ATLASCLOUD_API_KEY` env var path, including the secrets test
  cleanup loop so per-test env hygiene continues to work
- the provider-picker / `/provider` slash command entries so the
  provider is reachable from the runtime UI, not just config
- the env-driven `*_BASE_URL` override branch so users who pin a
  proxy can still flip it without editing config.toml

Trust-boundary pins held: AtlasCloud is opt-in (default remains
DeepSeek), no API keys are hardcoded, the api_key resolution flows
through the same `secrets` crate path every other provider uses,
and the provider-config base_url stays settable per environment.

Resolved 3-way merge conflicts in `crates/secrets/src/lib.rs` (env
cleanup loop) and `crates/tui/src/config.rs` (per-provider
base_url match arm + `provider_passes_model_through` predicate)
so the contributor's AtlasCloud branch coexists with the v0.8.x
provider expansion already on `main`. Added the missing match arm
in `validate_provider_base_url` so the non-exhaustive-pattern
check passes after the new variant lands.

Harvested from PR #1436 by @lucaszhu-hue

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:40:43 -05:00
Hunter Bown 40df46c73d feat(web_search): add configurable Tavily and Bocha provider backends
DuckDuckGo HTML scraping with Bing fallback remains the default
`web_search` backend — no API key required, no behaviour change for
installs that don't opt in. Users in regions where those scrapers
are rate-limited or unreliable can now set `[search]
provider = "tavily" | "bocha"` plus `api_key = "..."` in
`~/.deepseek/config.toml` (or via the `DEEPSEEK_SEARCH_PROVIDER` /
`DEEPSEEK_SEARCH_API_KEY` env vars) to route every `web_search`
call through the chosen API.

Tavily targets general AI-agent search; Bocha (博查) is the
mainland-China-friendly equivalent. Both providers are gated by
the existing `[network]` policy on their respective hosts
(`api.tavily.com`, `api.bochaai.com`) and surface a clear
`ToolError` (rather than a silent fallback to DuckDuckGo) when
the user has opted in but forgotten to set `api_key`. Test pins
the missing-key behaviour for both providers.

Resolved 3-way merge conflicts in `web_search.rs` (description
text and test module) so the contributor's helpers
(`truncate_error_body`, `sanitize_error_body`) coexist with the
v0.8.30 spam-detection and query-parsing tests already on `main`.
Folded the `SearchProvider::default()` impl into a `#[default]`
derive to satisfy the workspace `-D warnings` clippy gate, and
threaded `search: Option<SearchConfig>` through `merge_config` so
the multi-config layering doesn't break the field initialiser.

Harvested from PR #1294 by @sandofree

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:37:34 -05:00
Hunter Bown a57cb4dfa7 feat(approval): collapse approval modal to a one-line banner with Tab
Previously the approval modal rendered as a full-screen takeover
card that hid the transcript behind it, so users had to dismiss the
prompt — losing the decision context — just to re-read the tool
call they were being asked to approve. The new collapsed mode flips
the modal to a single-line banner pinned at the bottom of the area
("<tool> — <risk badge>  [Tab to expand]"), so the conversation
stays visible while the decision is pending. Tab toggles between
the two modes; the selected option, pending-confirm state, and
risk colour scheme are preserved across the toggle.

Test pin: a fresh `ApprovalView` starts expanded, the first Tab
collapses, the second Tab restores — and both return `ViewAction::None`
so no decision side effect leaks out of the toggle.

Harvested from PR #1455 by @tiger-dog

Note for the maintainer: PR #1455 also includes a Chinese-language
preamble to `prompts/base.md` that biases reasoning_content toward
Chinese on Chinese-language turns. That change touches the system
prompt and is left for a separate sign-off — see
.private/handoffs/v0.8.32-1455-prompt-preamble.md for the diff and
the suggested call.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:29:31 -05:00
Hunter Bown 02f889f193 fix(markdown): preserve underscores inside identifiers when rendering inline spans
Before this change, the inline markdown parser greedily matched the
underscore in identifiers like `deepseek_tui` or `foo_bar_baz` against
the `_italic_` pattern, so the second half of the identifier rendered
in italic and transcript snippets that quoted code symbols read as
garbled prose. The same bug applied to `*italic*` against tokens like
`look_at*tail`.

Both italic delimiters now apply a CommonMark-style boundary check on
the closing run: when the character immediately after the closing
`_` / `*` is a letter, digit, or underscore, the delimiter is left
as literal text rather than treated as markup. Identifiers survive
intact; legitimate emphasis still renders.

Regression-pinned with `crate deepseek_tui handles approvals`,
`see foo_bar_baz for details`, and `look at *not_emphasised*tail`.

Harvested from PR #1455 by @tiger-dog

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:28:10 -05:00
Hunter Bown 1f0065ccee feat(commands): add /change slash command to display latest CHANGELOG entry
`/change` reads the most recent `## [version]` section from the
workspace `CHANGELOG.md` (or the bundled release-notes copy when
no workspace changelog is available) and renders it inline in the
TUI. On non-English locales the command also queues a model-side
translation request so localised users see the changelog text in
their UI language; with no API key configured, the offline path
returns the section verbatim with a brief explanatory header.

Lets users discover what changed in the version they just upgraded
into without leaving the chat — and keeps the v0.8.32 release-notes
flow consistent with `deepseek update`'s newly-fixed sibling-TUI
refresh: now both binaries match the version, and `/change` shows
what that version actually delivered.

Resolved a `clippy::needless_range_loop` warning in the section
extractor (idiomatic `iter().enumerate().skip(...).find(...)` instead
of an indexed range loop) so the harvest passes the workspace's
`-D warnings` clippy gate without touching the contributor's design.

Harvested from PR #1416 by @zhuangbiaowei

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:25:30 -05:00
Hunter Bown 23d3a71126 fix(cli): deepseek update refreshes sibling TUI binary alongside dispatcher
Before this change, `deepseek update` would replace the running
dispatcher binary at `~/.cargo/bin/deepseek` but leave the sibling
`~/.cargo/bin/deepseek-tui` at whatever version was installed last.
The dispatcher then reported the new release while the TUI binary it
shells out to for every interactive turn stayed pinned to the old
build — most visible on Volta-managed npm installs and on any flow
that uses `deepseek update` instead of re-running both
`cargo install --path crates/{cli,tui}`.

The updater now enumerates the running binary plus an existing
colocated sibling up front, fetches and SHA256-verifies every needed
release asset before replacing anything on disk, then swaps the
sibling first and the running dispatcher last so a partial network
failure can't leave the launcher updated while the TUI remains stale.
The success message lists every refreshed binary by full path.

Tests cover sibling target detection (dispatcher present + sibling TUI
present → both targeted) and the no-sibling fallback (dispatcher only
→ single target).

Harvested from PR #1492 by @NorethSea

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:23:27 -05:00
Hunter Bown f94bedf1ea feat(tools): swap read_file PDF path to bundled pdf-extract; pdftotext now opt-in
Before v0.8.32 the PDF branch in `read_file` shelled out to `pdftotext`
(Poppler), which meant first-time users on hosts without it saw the
tool return a `binary_unavailable` sentinel and had to install Poppler
before the model could open a PDF. The `pdf-extract` crate already
powered URL-fetched PDFs in `web_run`, so this change reuses it for
the local PDF path too — extraction is now zero-dependency on every
supported host. The `pages` parameter still filters by 1-indexed
inclusive page range; both whole-file and per-page variants run with
no system dependency.

Users with column-heavy or complex-table PDFs (academic papers,
financial filings) where `pdftotext -layout` still wins can opt into
the external path with `prefer_external_pdftotext = true` in
`~/.config/deepseek/settings.toml`. When set, the previous Poppler
dispatch returns — including the `binary_unavailable` install hint
when the binary is missing on PATH. `deepseek doctor` reframes
`pdftotext` as optional and explains the opt-in instead of treating
it as a missing dependency.

Tests round-trip a checked-in academic PDF through the new path and
verify the `pages` window still slices correctly. The pre-v0.8.32
"binary_unavailable on missing pdftotext" branch is exercised against
the new opt-in setting via a serialised env-var guard.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:21:24 -05:00
Hunter Bown 383ed44fe8 fix(child_env): preserve MSVC toolchain vars for Windows exec_shell
Harvested from PR #1487 by @Jianfengwu2024 — the rest of that PR
(the TriadMind architecture-governance crate) needs a Discussion-
level design conversation before it can land, but this Windows
env-allowlist fix is a clean independent improvement and stands on
its own.

When the parent shell has already loaded VsDevCmd / vcvars (the
standard pattern for running Rust + MSVC on Windows), `exec_shell`
was stripping the toolchain env on its way to the child. The result:
the model finds `link.exe` via `PATH` but the linker can't resolve
`kernel32.lib` / `ucrt.lib` because LIB and the SDK roots were
filtered out. Any model-driven `cargo build` from inside the TUI
silently broke on Windows installs that don't run inside a Developer
Command Prompt.

Adds 13 MSVC-related env vars to the `is_allowed_parent_env_key`
allowlist so they survive the sanitization pass:

  LIB / LIBPATH / INCLUDE
  VSINSTALLDIR / VCINSTALLDIR / VCTOOLSINSTALLDIR
  WINDOWSSDKDIR / WINDOWSSDKVERSION
  UNIVERSALCRTSDKDIR / UCRTVERSION
  EXTENSIONSDKDIR / DEVENVDIR / VISUALSTUDIOVERSION

Also extends the `mcp_env_allowlist_inherits_base_keys` fixture and
adds `sanitized_child_env_preserves_windows_toolchain_vars` as a
regression test (locked under `env_lock()` so it serialises with the
other env-mutating tests in the file).

Pure additive — no non-Windows behaviour changes.

Harvested from PR #1487 by @Jianfengwu2024
2026-05-11 23:19:44 -05:00
Hunter Bown 0407451150 chore(release): bump to 0.8.31
- workspace.package.version: 0.8.30 → 0.8.31
- per-crate path-dependency version pins: 0.8.30 → 0.8.31 (31 entries)
- npm/deepseek-tui: version + deepseekBinaryVersion → 0.8.31
- Cargo.lock refreshed via `cargo update --workspace --offline`
- CHANGELOG: [Unreleased] → [0.8.31] - 2026-05-12 with theme paragraph
- new [Unreleased] anchor + [0.8.31] compare link

Verified with `./scripts/release/check-versions.sh`:
  Version state OK: workspace=0.8.31, npm=0.8.31, lockfile in sync.

Verified parity gates green pre-bump (see prior commits) and
post-bump (cargo fmt --all --check, cargo clippy --workspace
--all-targets --all-features --locked -- -D warnings).
2026-05-11 23:07:02 -05:00
Hunter Bown 2b54b7e79d feat(tools): probe Python interpreter at catalog-build time; doctor surfaces missing deps
Closes the "code_execution: program not found" failure reported by a
Windows contributor and adds the surrounding plumbing so future
external-binary dependencies can be added without each one repeating
the same probe-and-skip logic.

New module `crates/tui/src/dependencies.rs`:

  * `PYTHON_CANDIDATES = ["python3", "python", "py -3"]` — try the
    Unix-style name first, then the bare name (common on Windows and
    modern macOS), then the Windows launcher as a last resort.
  * `probe_executable(spec)` — splits the spec on whitespace, spawns
    `<program> [args...] --version`, returns true on success. stdout
    and stderr are routed to /dev/null so the probe doesn't print to
    the TUI's first frame.
  * `resolve_python_interpreter()` — OnceLock-cached resolver that
    returns the first candidate that probed successfully, or None.
  * `resolve_pdftotext()` — same shape, for doctor's PDF-path
    diagnostic.
  * `split_interpreter_spec("py -3") → ("py", ["-3"])` — needed so
    `tokio::process::Command::new(program).args(args).arg(script)`
    runs the Windows launcher correctly.

Wiring in `core/engine/tool_catalog.rs::ensure_advanced_tooling`:

  * The hardcoded `catalog.push(Tool { name: "code_execution", … })`
    is now gated on `resolve_python_interpreter().is_some()`. On a
    machine without Python the tool is not advertised — the model
    never sees a tool it can't actually run, which removes the
    "Failed to execute tool: program not found" class of failure
    entirely. On a machine WITH Python the behavior is unchanged
    from the user's perspective.

Wiring in `execute_code_execution_tool`:

  * Resolves the interpreter at call time (still cached) and splits
    on whitespace so `py -3` runs as `py -3 /tmp/.../code.py`.
  * Writes the user code to a tempfile under tempdir() and runs the
    file rather than passing it through `python3 -c "..."`. Fixes
    argv length limits on Windows, multiline code with quote
    nesting, and traceback line numbers (real filename instead of
    `<string>`). Folds in the contributor's Part 2 proposal.
  * Returns a clear actionable error if the interpreter somehow
    disappears between startup and the call (uninstall mid-session,
    PATH manipulation, etc.).

Doctor (`crates/tui/src/main.rs::run_doctor`) gains two new sections:

  * "Tool Dependencies": ✓/✗ for Python and pdftotext with a
    platform-specific install hint when missing. The Python miss
    explicitly tells users that `code_execution` is not advertised.
  * "Terminal Quirks": shows which env-driven auto-overrides are
    currently active (VS Code / Ghostty low-motion, Termius/SSH
    low-motion per #1433, Ptyxis sync-output-off per v0.8.31).
    Answers the "why is my flicker / motion behaving this way"
    question without needing the user to read the source.

The contributor's Part 1 proposal (share the sandbox/approval
substrate with task_shell_start) is deferred to v0.8.32 — that
touches `sandbox/` which CLAUDE.md flags as "stop and ask before
changing" and needs an explicit design pass on approval semantics.

Six new unit tests in `dependencies.rs` covering probe behavior,
spec splitting, and cache stability. Existing tool_catalog tests
still pass.
2026-05-11 22:54:52 -05:00
Hunter Bown cc71ec191f fix(settings): auto low-motion in Termius + every SSH session (#1433)
Closes #1433. Harvested from PR #1479 by @CrepuscularIRIS / autoghclaw,
with two changes from the original PR:

  * `is_some_and` instead of `.map_or(false, |v| !v.is_empty())` —
    the latter trips `clippy::unnecessary_map_or` on Rust 1.94+
    under `-D warnings`, which is what blocked the PR's Lint check
    in CI. is_some_and reads cleaner and ships the same behavior.
  * `non_vscode_term_program_does_not_force_low_motion` now clears
    SSH_CLIENT / SSH_TTY before iterating its negative-case
    fixtures so the suite still passes when run from a developer's
    actual SSH session.

Detection logic mirrors the existing VS Code (#1356) and Ghostty
(#1445) overrides: any of TERM_PROGRAM=Termius, SSH_CLIENT set, or
SSH_TTY set unconditionally flips low_motion = true and
fancy_animations = false. The 120 FPS cursor-repositioning that
caused the cursor to cycle through input boxes over SSH is dropped
to the 30 FPS cap the typewriter path already uses.

Two new tests: termius_term_program_forces_low_motion_on and
ssh_session_forces_low_motion_on. Both serialise through the
existing term_program_test_guard / crate-wide test lock to avoid
racing concurrent env-var-mutating tests in the suite.
2026-05-11 22:13:53 -05:00
Hunter Bown 1f7cd9cc2f feat(mcp): custom HTTP headers per server for authenticated gateways
Closes #1454. Harvested from PR #1456 by @Oliver-ZPLiu.

Adds `pub headers: HashMap<String, String>` to McpServerConfig,
threaded through HttpTransport::new and StreamableHttpTransport so
every outbound POST applies the user-configured headers after the
fixed Accept / Content-Type framing. Mirrors the field shape that
Claude Code, Codex, and OpenCode already accept in their MCP
config formats — unblocks Hugging Face MCP, GitHub MCP, Atlassian
Rovo MCP, and any other Streamable HTTP gateway that needs a
Bearer token or API key.

Defense-in-depth filter (`is_safe_custom_header`) drops:
  * empty / whitespace-only keys (would surface as a reqwest
    builder error mid-send and abort the connection);
  * `Accept` / `Content-Type` duplicates (the MCP Streamable HTTP
    protocol negotiates on these exact values; a stray override
    would silently break tool discovery);
  * values containing ASCII CR or LF (response-splitting defense
    against a misbehaving proxy).

Skipped headers emit a `tracing::warn!` and the rest of the
request still goes out, so a single bad entry can't take down a
server.

Scope-limited vs PR #1456:
  * SSE legacy transport intentionally not threaded (follow-up).
    The PR didn't cover it; modern MCP servers are Streamable HTTP.
  * No env-var interpolation in v0.8.31 — headers are sent
    literally, matching the PR. Documented in the field doc so
    users know tokens pasted directly into mcp.json live there
    as plain text. `${VAR}` substitution is a follow-up.

8 new tests (the original PR shipped without any): config
round-trip with custom headers, empty headers omitted from
serialized output, accept-normal-auth, reject empty key, reject
CR/LF in value, reject Accept/Content-Type override (case-
insensitive), and StreamableHttpTransport stores headers.

Empty-headers MCP fixtures updated at 10 sites across mcp.rs and
main.rs to match the new struct shape.
2026-05-11 22:12:19 -05:00
Hunter Bown b4158dcc1b feat(read_file): bounded chunked reads via start_line / max_lines
Harvested from PR #1451 by @Oliver-ZPLiu (closes part of #1450).
Token-budget control for read_file: large files no longer drop their
entire contents into the conversation context on every turn.

Default window is 200 lines / ~16 KB; the hard cap is 500 lines.
Small files (≤ 200 lines AND ≤ 16 KB, no explicit range params) keep
the historical raw-contents return so existing prompts that read
config files / single source files see no behavior change. Otherwise
the response is wrapped in a `<file …>` tag with line-numbered
content, `shown_lines`, `truncated`, and `next_start_line`
attributes, plus a `[TRUNCATED]` hint so the model can page through
in 16 KB slices.

Cleanups from the original PR while harvesting:
- shown_lines is now 1-based inclusive (e.g. "3-6"), matching
  start_line / next_start_line / the line-number prefix on each
  rendered row. The original PR mixed 0-based indices in attributes
  with 1-based numbers in body output, which was confusing.
- The continuation hint mirrors that 1-based range so model
  reasoning over "what did I just see?" is unambiguous.
- Added 6 unit tests covering: small-file fast path, explicit
  range wrap-in-file-tag with 1-based lines, out-of-range
  no-content sentinel, zero start_line / max_lines rejection,
  hard-cap clamp at 500 lines, and large-file no-range default
  window. The original PR shipped without tests.
2026-05-11 22:07:07 -05:00
Hunter Bown fd82f85800 fix(tui): auto-disable DEC 2026 sync output on Ptyxis to stop VTE 0.84 flicker
Ptyxis 50.x (the new default terminal on Ubuntu 26.04) ships with
VTE 0.84.x, which parses the `\x1b[?2026h` / `\x1b[?2026l` synchronized-
output begin/end pair but still flashes the entire viewport on every
wrapped frame instead of deferring rendering. gnome-terminal 3.58 on
the same VTE renders cleanly, so the heuristic stays narrow: trigger
only on TERM_PROGRAM matching `ptyxis` (case-insensitive) or
PTYXIS_VERSION non-empty.

Add a new `synchronized_output` setting (`auto` | `on` | `off`,
default `auto`) controlling whether the renderer wraps each frame in
DEC 2026. `apply_env_overrides` flips `auto` → `off` when Ptyxis is
detected; the four wrapping sites in ui.rs (`draw_app_frame_inner`,
`reset_terminal_viewport`, `resume_terminal`, and the early-init
viewport reset) now respect the resolved flag. Users on Ptyxis who
upgrade past an upstream fix or want to confirm one landed can
override with `/set synchronized_output on`.

8 new tests cover: default-auto resolves enabled, off disables,
on stays enabled, set/aliases, Ptyxis via TERM_PROGRAM, Ptyxis via
PTYXIS_VERSION alone, explicit `on` beats the heuristic, explicit
`off` is preserved, and no non-Ptyxis TERM_PROGRAM (including Ghostty
and VS Code, which both keep DEC 2026 on) regresses.

Reported via WeChat by Cyrux on Ubuntu 26.04 with v0.8.30 npm install;
analysis by Hunter pinpointed Ptyxis + VTE 0.84 as the cause.
2026-05-11 22:04:26 -05:00
Hunter Bown 1a73791e1d chore(release): bump to 0.8.30
- workspace.package.version: 0.8.29 → 0.8.30
- per-crate path-dependency version pins: 0.8.29 → 0.8.30
- npm/deepseek-tui: version + deepseekBinaryVersion → 0.8.30
- Cargo.lock refreshed via `cargo update --workspace --offline`
- CHANGELOG: `[Unreleased]` → `[0.8.30] - 2026-05-11` with the full
  release-theme paragraph and the new "Changed" section for the
  Alt+<key> unification

Verified with `./scripts/release/check-versions.sh`:
  Version state OK: workspace=0.8.30, npm=0.8.30, lockfile in sync.
2026-05-11 19:21:45 -05:00
Hunter Bown 31198c11d0 fix(tui): unify transcript-nav shortcuts under Alt+<key>; stop eating first letters
v0.8.29 spot-fixed `g` with a gg double-tap (commit c13ddb04d), but
the underlying bug class still affected `G`, `[`, `]`, `?`, `l`, and
both `v`/`V`: bare press on an empty composer hijacked the keystroke
for transcript navigation, swallowing the first character of a
message. Even the gg fix itself only suppressed the SCROLL — the
first `g` was still eaten, so typing "good morning" produced "ood
morning" with no whale and no warning.

Unified fix: all seven bindings now require the `Alt` modifier (same
pattern as the existing `Alt+R` history search and `Alt+V` tool
details). Plain letters always insert as text:

  Alt+G          → scroll to top
  Alt+Shift+G    → scroll to bottom
  Alt+[ / Alt+]  → previous / next tool output
  Alt+?          → open searchable help (F1 / Ctrl+/ also bound)
  Alt+L          → pager for the last message
  Alt+V          → tool-details pager  (was already bound; only path now)

The `App::transcript_pending_g` field from the v0.8.29 half-fix is
removed along with its Esc / Enter / Char-catch-all resets. The
existing helper `details_shortcut_modifiers` (which accepted bare,
Shift, and Alt-only — the permissive predicate that ate the bare `v`
keystroke) is replaced by `alt_nav_modifiers`, which requires `Alt`,
allows `Shift`, and blocks `Ctrl` / `Super` so the bindings don't
collide with platform clipboard or window-management shortcuts.

Same modifier vocabulary as `Alt+R` / `Alt+P` / `Alt+1..3`, so this
makes the keymap more consistent rather than introducing a new
convention.

Regression-guarded by `alt_nav_modifiers_require_alt_and_exclude_ctrl_super`
in `crates/tui/src/tui/ui/tests.rs`, which exercises every modifier
combination the predicate needs to accept or reject. Full
`cargo test -p deepseek-tui` (2751 tests) passes; clippy clean; fmt
clean.

User-facing migration: any user who learned the bare-letter nav
shortcuts in v0.8.x needs to add `Alt+`. The trade-off is unambiguous
— losing the first letter of "good", "great", "let", "list", "very",
"verify", and anything starting with `?` was burning more users than
the bare-letter shortcuts were serving.
2026-05-11 19:21:31 -05:00
Hunter Bown a328344691 feat(tui): restore 🐳🐋 cycling status indicator next to the effort chip
The whale was a 12-frame animated indicator (`🐳, 🐳., 🐳.., 🐳..., 🐳..,
🐳., 🐋, 🐋., 🐋.., 🐋..., 🐋.., 🐋.`) that shipped from v0.3.5 onward
and rendered in the top-right status cluster of the header. Commit
`1a04659a9` ("smoother TUI streaming") quietly swapped it for a 6-frame
geometric ring (`◍ ◉ ◌ ◌ ◉ ◍`); `f4dbf828c` later deleted the function
entirely. Nothing in the CHANGELOG mentioned either step, and the
absence has been on the maintainer's mind ever since.

This commit restores the whale as a configurable status indicator that
sits immediately before the reasoning-effort chip ("next to max"):

- `widgets/header.rs` gains a public `header_status_indicator_frame`
  helper and a `HeaderData::with_status_indicator(Option<&'static str>)`
  builder. The frame computation is pure (keyed off `turn_started_at`
  and the mode string) so the widget itself stays a stateless render.
- The chip renders as the first item in the status cluster, before
  `provider` / `effort` / `Live` / context. Idle state shows a steady
  🐳; an active turn cycles frames every 420 ms (same cadence as the
  original v0.3.5 implementation).

New setting `status_indicator`:

- `whale` (default) — restored historical cycling.
- `dots` — the 6-frame geometric replacement, for users who came in
  during the dots era and prefer it.
- `off` — hide the chip entirely.

Settable via `/config status_indicator <whale|dots|off>`, persisted in
`settings.toml`, mirrored in the typed `config_ui::SettingsSection`
with a new `StatusIndicatorValue` enum so the web/JSON config surface
sees it too.

Default-to-whale rationale: this restores the historical behaviour for
every user, including those who never realized the whale was gone, and
keeps the "🐳 in /config" delight that the project's name has always
implied.

Regression-guarded by seven new tests in `widgets/header.rs::tests`
covering idle frame, frame advancement, dots variant, off variant
including aliases, unknown-mode fallback to whale, render placement
before the effort label, and confirmation that `off` hides the chip
without disturbing the effort chip layout.
2026-05-11 18:59:52 -05:00
Hunter Bown b1998fff8c revert(tui): drop typing-synced wave frame source; keep gate decoupling
The previous commit (15525751c) did two things in one shot:
1. Decoupled the footer water-spout gate from `low_motion`, so
   `low_motion = true` no longer hides the wave when `fancy_animations
   = true`.
2. Re-wired the wave's frame source from wall-clock milliseconds
   to a per-turn character-commit counter, on the theory that the
   wave should visually move at the same cadence as the text on
   screen ("water = typing").

The user-visible result of (2) was that the wave looked notably
different than in v0.8.29 — slower, sluggish, less alive. Root cause:
the sine math in `footer_working_strip_glyph_at` (`t = frame / 1000.0`,
primary term × 8.0) was tuned for frame ≈ 1000 units/sec, which is
what wall-clock ms produces. Driving frame off character commits
gives ~10–30 units/sec, so the wave evolves ~30× slower than the
intended tuning. Theoretically fixable by re-tuning the sine
constants, but that's a bigger change with its own visual
regressions to vet, and the user explicitly asked to "put it back
to where it was."

This commit reverts only (2):

- Removes `StreamingState::stream_commit_frame` field.
- Removes the increment in `commit_text` and `finalize_block_text`.
- Removes the zeroing in `reset`.
- Removes the five `stream_commit_frame_*` regression tests.
- Changes `render_footer` to assign `Some(now_ms)` again instead of
  `Some(app.streaming_state.stream_commit_frame)`.

The decoupling from (1) stays: the gate is still
`if app.fancy_animations { ... }`, so `low_motion = true` no longer
hides the wave. The settings.rs docstrings stay updated.

CHANGELOG entry is collapsed to a single short bullet describing the
decoupling-only fix.

Net effect for users: the wave looks and feels exactly like v0.8.29,
but `low_motion = true` now keeps the whale visible (was the
original regression that started all of this).
2026-05-11 18:40:17 -05:00
Hunter Bown 15525751ce feat(tui): decouple footer water-spout from low_motion; sync wave to typing cadence
The water-spout strip in the footer used to be hard-gated by `!low_motion`,
which meant the typewriter-streaming option silently killed the spout
animation — even with `fancy_animations = true` the strip stayed plain
whitespace. Users testing the typewriter pacing in v0.8.29 reported "where
did the whale go," which is on us: we'd collapsed two concerns
(streaming pacing vs footer animation) onto one flag.

This commit makes the two flags orthogonal:

- `low_motion` governs streaming pacing only (typewriter = one char per
  commit tick vs upstream cadence = drain everything queued).
- `fancy_animations` governs whether the spout-strip is rendered at all.

It also wires in a new idea that fell out naturally once the two were
decoupled: instead of driving the wave animation off wall-clock
milliseconds, drive it off a per-turn character-commit counter
(`StreamingState::stream_commit_frame`). The wave then visually moves at
the same cadence as the text:

- Typewriter mode → wave drips at one frame per character.
- Upstream mode → wave surges when V4-pro bursts a warm-cache turn.
- Tool calls and planning pauses → no chars arrive, wave freezes. The
  textual `working...` pulse still ticks on wall-clock, so a heartbeat
  is always visible.
- New turn (`StreamingState::reset`) → counter zeroes so each turn
  opens with a fresh wave shape.

`stream_commit_frame` is a `u64` advanced inside `commit_text` and
`finalize_block_text` by the character count of each committed slice,
so multi-byte UTF-8 (e.g., CJK) advances the wave by one glyph per
character rather than three frames per character — matching the
visual weight of each glyph.

Regression-guarded by five new tests in `crates/tui/src/tui/streaming/mod.rs`:
- `stream_commit_frame_advances_by_character_count_on_commit`
- `stream_commit_frame_counts_unicode_chars_not_bytes`
- `stream_commit_frame_advances_on_finalize`
- `stream_commit_frame_resets_on_reset`
- `stream_commit_frame_freezes_when_no_text_arrives`

Also folds in `cargo fmt` cleanup for two files where prior commits on
this branch landed without re-formatting (`crates/tui/src/tui/ui.rs`
around the new Esc-arm wrapper introduced for the `gg` double-tap, and
the new `fireworks_custom_base_url_preserves_provider_model` test in
`crates/config/src/lib.rs`). No behavior change from those edits.

Settings doc comments in `crates/tui/src/settings.rs` updated to spell
out the new orthogonal semantics so the next maintainer doesn't have
to reverse-engineer it from `render_footer`.

CHANGELOG entry added under a new `[Unreleased]` section.
2026-05-11 16:55:37 -05:00
Hunter Bown c13ddb04d4 fix(tui): change transcript 'g' scroll to vim-style 'gg' double-tap
A single bare 'g' with an empty composer was hijacked as a scroll-to-top
command, preventing users from typing 'g' as the first character of a message.
The transcript would jump to line 0 instead of inserting 'g' into the composer.

Change to a vim-style 'gg' double-tap: first 'g' arms transcript_pending_g,
second 'g' executes the scroll. Any other character input, Enter, or Escape
resets the pending flag so a stray 'g' during composition arms without
scrolling.

Also adds transcript_pending_g field to App struct (default false).
2026-05-11 16:43:45 -05:00
Hunter Bown 7454b23ae6 fix(shell): kill process group in collect_output to prevent UI freeze
Cherry-picked from PR #1475 by CrepuscularIRIS (autoghclaw/issue-828).

When a shell command spawns background subprocesses (nohup, sleep &, etc.),
those subprocesses inherit the pipe write-ends. After the shell exits, the
subprocesses keep those write-ends open, causing handle.join() on reader
threads to block indefinitely in read(). Since list_jobs() calls poll() →
collect_output() on every TUI render tick, the entire UI event loop blocks.

Fix: kill the process group (PGID = child PID) before joining reader threads,
so orphaned subprocesses release their pipe write-ends. Also wires the
previously dead-coded cleanup() into list_jobs() with a 1-hour eviction
window to bound process table growth.

Fixes #828.
2026-05-11 16:34:00 -05:00
Hunter Bown 62a893d9c2 fix(config): preserve model names for all providers with custom base URLs
Previously only OpenRouter was whitelisted via provider_preserves_custom_base_url_model,
causing six other providers (Sglang, Novita, Fireworks, Vllm, Ollama, NvidiaNim) to still
rewrite user-configured model names when a custom base URL was set. Users routing through
their own gateway would get 400s because the TUI sent provider-prefixed model names
(e.g. accounts/fireworks/models/deepseek-v4-pro) that the gateway didn't recognise.

The fix removes the provider-specific guard: when base_url_is_custom_for_provider()
returns true (i.e. the user set a non-default endpoint), the model name is preserved
as-is for every provider, not just OpenRouter.

Affected:
- crates/config/src/lib.rs: ProviderKind::Openrouter guard removed
- crates/tui/src/config.rs: ApiProvider::Openrouter guard removed
- Test: fireworks_custom_base_url_preserves_provider_model added
- Test: nvidia_nim_reads_facade_provider_table updated for new behaviour

Addresses the #857 class bug (B1 in the v0.8.30 audit).
2026-05-11 16:25:45 -05:00
Hunter Bown 716f45cfbc fix(ui): Ctrl+O guard accepts extra modifier bits + AGENTS two-binary note
The Ctrl+O thinking-pager arm guarded on
`key.modifiers == KeyModifiers::CONTROL` (exact match), so any
additional modifier bit set by the terminal — Shift while a
native-selection mouse bypass was active, Caps Lock indicator on
some keyboard layouts — silently fell through to the $EDITOR arm at
ui.rs:2833 and did nothing visible when the composer was empty. The
user saw the "thinking collapsed; press Ctrl+O for full text"
affordance, pressed it, and the handler appeared to ignore them.

Relaxed to `contains(KeyModifiers::CONTROL)` to match the established
pattern at Ctrl+P (ui.rs:2068) and Ctrl+B (ui.rs:2077). With the
existing `app.input.is_empty()` guard preserved, the $EDITOR arm
still owns the non-empty-composer case, so the two handlers continue
to partition Ctrl+O cleanly.

Also documents the two-binary install gotcha in AGENTS.md: the CLI
dispatcher (`crates/cli` → `deepseek`) and the TUI runtime
(`crates/tui` → `deepseek-tui`) ship as separate executables, and
`cargo install --path crates/cli` alone leaves the TUI stale — which
is how both this fix and the active_cell fix from dc2433a8b
initially appeared to be no-ops during local maintainer testing.
The release pipeline packages both binaries, so end users were
never affected by that side; this is purely a maintainer-local
footgun and is now spelled out for future agents.

Extends the existing v0.8.29 CHANGELOG entry to credit both halves
of the Ctrl+O fix.
2026-05-11 14:31:41 -05:00
Hunter Bown dc2433a8b5 fix(ui): Ctrl+O expands thinking blocks still in active_cell
After `ThinkingComplete` the finalized thinking entry sits in
`app.active_cell` with `streaming = false` until the active cell
flushes to history at end-of-turn. During that window the transcript
rendered the "thinking collapsed; press Ctrl+O for full text"
affordance from `render_thinking`, but `open_thinking_pager` only
searched `app.history` — so the handler surfaced "No thinking blocks
to expand" while pointing at the affordance. The affordance was
truthful; the handler was lying.

Routed the lookup through `cell_at_virtual_index` /
`virtual_cell_count`, the existing virtual-index API that
`open_tool_details_pager` already uses for the same active-cell
window. The selection-based path resolves through the virtual index
too, so dragging into an in-flight thinking block and pressing
Ctrl+O now works as well.

Regression guard: `open_thinking_pager_finds_thinking_in_active_cell`
drives the entry into active_cell, finalizes it so the "collapsed"
affordance is what render_thinking emits, then asserts Ctrl+O pushes
the Pager view instead of surfacing the "No thinking blocks" status.
2026-05-11 14:03:53 -05:00
Hunter Bown bfb9da3462 test: isolate config-mutating smoke tests 2026-05-11 13:28:45 -05:00
CrepuscularIRIS dde1e5e2f1 fix(tui): default composer_arrows_scroll on when mouse capture is off
On platforms where mouse capture is disabled by default (Windows CMD /
legacy conhost), the terminal sends mouse-wheel events as Up/Down arrow-
key sequences.  Without composer_arrows_scroll those sequences cycle the
input history instead of scrolling the transcript (#1443).

Set the default for composer_arrows_scroll to !use_mouse_capture so that
terminals that forward wheel events as arrows get page-scrolling out of the
box, while terminals with real mouse capture (Windows Terminal, Linux, macOS)
keep the existing history-navigation default.

The explicit [tui] composer_arrows_scroll config key still overrides the
derived default in both directions.

Also enable mouse capture by default for ConEmu/Cmder (ConEmuPID env var),
which handles VT mouse-mode reporting cleanly, giving those users in-app
scrolling without needing --mouse-capture.

Fixes #1443

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
CrepuscularIRIS ae45d1054b fix(shell): detect Docker buildx provenance EPERM and surface actionable hint
When exec_shell runs a Docker build on macOS and Docker Desktop's signed
process has written com.apple.provenance-tagged files under
~/.docker/buildx/activity/, the child process spawned by the TUI
sandbox gets EPERM when it tries to update those files, producing:

  failed to update builder last activity time: open
  /Users/.../.docker/buildx/activity/.tmp-...: operation not permitted

Add looks_like_macos_provenance_failure() to detect this pattern via
three heuristics (provenance xattr mention, activity-time message, or
buildx/activity path + EPERM), with an early-return guard that suppresses
the hint on clean exits. Wire the hint into both the foreground exec_shell
path and build_shell_delta_tool_result so it surfaces on background task
polls too.

Four unit tests cover the positive cases and the two guard cases (exit 0,
unrelated EPERM).

Closes #1449

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
CrepuscularIRIS 97a77d82f0 fix(settings): detect TERM_PROGRAM=ghostty and apply low-motion cap (#1445)
Ghostty's GPU compositor flash-renders each full-screen repaint at 120 FPS,
producing visible flicker identical to the VS Code issue fixed in #1356.
Extend apply_env_overrides() to also force low_motion=true +
fancy_animations=false when TERM_PROGRAM=ghostty, capping redraws to 30 FPS.

Add ghostty_term_program_forces_low_motion_on test mirroring the existing
vscode test, serialised through the process-wide lock_test_env() guard.

Fixes #1445

Signed-off-by: CrepuscularIRIS <serenitygp@qq.com>
2026-05-11 13:09:25 -05:00
AutoGHClaw 4bc5375fa6 fix(render): replace SMP emoji with stable BMP glyphs to fix layout on Windows terminals
Emoji in U+1F000+ have no stable column-width contract across terminal
emulators.  On cmd.exe/PowerShell they render as 1-column placeholder
boxes even though unicode_width reports 2; on WezTerm/Alacritty with
certain font stacks the rendered width can be off by one column.  Both
cases break layout arithmetic in the header and file-tree widgets.

Changes:
- header.rs: replace 🐳 (U+1F433, 2-wide) with ◆ (U+25C6, always 1-wide)
  in the "max" reasoning-effort chip
- file_tree.rs: drop the 📁/📄 (U+1F4C1/U+1F4C4) entry-icon prefix; the
  ▼/▶ expand marker already distinguishes dirs from files

Fixes #1314
2026-05-11 13:09:25 -05:00
xiedongsheng 1352a65ac2 fix: reject no-op edit_file replacements 2026-05-11 13:09:22 -05:00
reidliu41 7ac79799e7 Fix flaky workspace and env-sensitive tests
Tighten session workspace Git root detection so invalid parent .git
  markers are not treated as real repositories. This prevents unrelated
  temporary workspaces from being scoped together when a stray .git
  directory exists under /tmp.

  Also move env-mutating tests onto the shared test env lock and make the
  streamable HTTP MCP mock server serve until the test ends, avoiding
  parallel test races and premature mock server shutdowns.

(cherry picked from commit eecfc16fc99d072ac389980ec9e5e3f208297b8e)
2026-05-11 07:34:50 -05:00
reidliu41 46ab1fdf62 Scope skill completions to /skill
Keep individual skills out of the top-level slash command menu so large
  skill collections do not crowd out built-in commands.

  Skills still complete after `/skill`, including both the full skill list
  after `/skill ` and prefix matches after `/skill <prefix>`.

(cherry picked from commit 57f8e3ad84dad9cf46290c0dc23e2b26504196df)
2026-05-11 07:34:50 -05:00
Beltran12138 ca284d1fc0 fix: treat deepseek-chat / deepseek-reasoner aliases as reasoning models
`requires_reasoning_content()` only matched literal `deepseek-v4*` model
IDs, but `deepseek-chat` and `deepseek-reasoner` are DeepSeek's public
API aliases that resolve server-side to `deepseek-v4-flash` and
`deepseek-v4-pro` respectively. Both have thinking mode enabled by
default, so when a user sets `default_text_model = "deepseek-chat"` (the
value `deepseek auth` / onboarding writes), the thinking-mode sanitizer
is skipped and tool-call assistant messages are sent without
`reasoning_content`. DeepSeek then rejects the second turn with:

  HTTP 400: The `reasoning_content` in the thinking mode must be passed
  back to the API.

Extend `requires_reasoning_content()` to recognise the `deepseek-chat`
and `deepseek-reasoner` alias prefixes (covering suffixed variants like
`deepseek-chat:free` used by proxied deployments). The explicit
`reasoning_effort = "off"` escape hatch still disables replay via the
unchanged `should_replay_reasoning_content()` check.

Adds `alias_thinking_detection_tests` covering the aliases, explicit V4
IDs (regression guard), excluded non-thinking models, suffixed variants,
and the reasoning-off override.

Refs: https://api-docs.deepseek.com/guides/thinking_mode
(cherry picked from commit 46941142123827fa16fc9a1fb41b78c293e935ce)
2026-05-11 07:34:50 -05:00
samhandsome 9279d92f4e fix(tools): hydrate deferred schemas before execution
Return a schema hydration result on first deferred tool use so the model can retry with visible parameters instead of executing guessed arguments. Add edit_file coverage for old_string/new_string aliases.

(cherry picked from commit 91be171cc15dd895170bd1a486445f5e05356b57)
2026-05-11 07:34:50 -05:00
reidliu41 0164ee869e fix: parse CRLF SSE MCP events
Accept both LF and CRLF SSE event separators in the MCP SSE transport so
  uvicorn and FastMCP servers can publish endpoint events correctly.

  Add regression coverage for CRLF endpoint discovery.
2026-05-10 23:27:51 -05:00
Wenjunyun123 21867b2d3a style(tui): format footer width tests 2026-05-10 23:27:21 -05:00
Wenjunyun123 80a8559e29 fix(tui): keep footer chips within available width 2026-05-10 23:27:21 -05:00
reidliu41 4afa88ebfc fix(tui): ignore leaked mouse reports in composer
PR #1421 from @reidliu41. Filters SGR mouse-report bursts that some terminal chains leak into stdin while mouse capture is enabled, while preserving ordinary coordinate-like text.
2026-05-10 23:27:14 -05:00
Hunter Bown f5784909cf security(tui): avoid session-id taint in resume hint 2026-05-10 22:20:42 -05:00
Hunter Bown 678198440a docs(prompts): capture the Chinese-native-model design tension in locale-preamble docstring
Community feedback on the v0.8.29 follow-up (WeChat thread on
#1118) made a sharp point: the standard Western-LLM advice
"always write prompts in English" doesn't transfer to DeepSeek
V4, which is a Chinese-first multilingual model with a
Chinese-co-trained tokenizer. `你好` typically encodes to ~1
token, not 2; the "Chinese is expensive" framing is folk wisdom
from a different model family.

The naïve translation of that argument is "ship a fully
translated base.md per locale" — and that's the move v0.9.x
might eventually make. For v0.8.29 we deliberately stop at the
bookend (preamble + closer in native script, English middle)
because of three concrete costs:

  1. Drift risk between N translated copies of a 200-line
     prompt — every rule change has to land in lockstep.
  2. Cache stability — one English `base.md` lets us share
     prefix-cache state across locales for the workspace-
     static portion of the prompt.
  3. Translation QA expense — 95% right is bad, because the
     missing 5% becomes silent behavior divergence.

Captured all of this in the `locale_reinforcement_preamble`
docstring so the next maintainer reading the prompt-assembly
code sees the design tension and the cost model explicitly,
and knows full translation is the natural next step if the
bookend stops being sufficient.

No runtime change; documentation only. Credit @MuMu (via Hunter)
for the bookend pattern that motivates this design, and the
unnamed WeChat commenter who made the tokenizer-economics
argument that motivates this docstring expansion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 22:04:49 -05:00
Hunter Bown 5b6902006c security(mcp): defuse GitGuardian Basic-Auth-String false positive in proxy-redact test
GitGuardian's "Basic Auth String" detector flagged commit 09dcbede0
because the test fixture for `redact_proxy_userinfo_strips_password`
contained literal URL strings of the shape
`scheme://username:password@host` — `alice:hunter2` and `bob`. The
values are obvious placeholders (not real credentials), but the
detector's regex is shape-based: any scheme-prefixed colon-separated
userinfo segment terminated by `@` matches, regardless of whether the
content is a real secret.

The test still needs to exercise the redaction logic for credential-
carrying proxy URLs. Fix: assemble the URLs via `format!` from
explicit placeholder constants (`PLACEHOLDER_USER`,
`PLACEHOLDER_PASS`) so the literal source text never contains a
contiguous `scheme://name:secret@host` pattern. Runtime behavior is
identical — `redact_proxy_userinfo` receives the same string and
returns the same redacted form.

Also reworded the function docstring (line 61) and the inline comment
at the warning log site (line 993) to describe the userinfo segment
without spelling out a literal `user:pass@host` shape that the same
detector could later trip on.

Two preexisting fixtures elsewhere in this file
(`mask_url_secrets("https://user:s3cret@…")` at line 3155 and its
docstring at line 46) have been on `main` for several releases and
are presumably already on GitGuardian's allowlist — left untouched
in this commit so the fix scope stays minimal. If they re-fire on a
future scan, the same `format!` pattern can be applied there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 21:39:23 -05:00