Final step in the v0.8.11 patch release. Bumps the workspace
`Cargo.toml`, all 9 internal path-dep version pins, and
`npm/deepseek-tui/package.json` to **0.8.11**. `Cargo.lock`
regenerated alongside.
The v0.8.11 CHANGELOG entry already landed on `main` via the
cache-maxing overhaul PR (#684). This commit only stamps the
version. Together they ship:
* **Cache-maxing for V4 1M context** — engine no longer rebuilds the
system prompt on every turn (#684's `Session::last_system_prompt_hash`),
the volatile working-set summary moved out of the system prompt
into per-turn `<turn_meta>` on the latest user message, the tool
array is anchored with `cache_control: ephemeral`, and the
`messages_with_turn_metadata` injection skips tool-result
messages so the assistant→tool_result invariant stays intact.
* **500K compaction floor** — automatic compaction refuses below
500K tokens via `MINIMUM_AUTO_COMPACTION_TOKENS`. Manual
`/compact` bypasses (explicit user agency).
* **Token-only compaction trigger** — dropped
`CompactionConfig::message_threshold` and the message-count
branch in `should_compact`; that 128K-era heuristic only fired
on long sessions of small messages, exactly the case where
rewriting the V4 prefix cache is most wasteful.
* **Legacy 128K naming** — `DEFAULT_CONTEXT_WINDOW_TOKENS` →
`LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS`.
* **`npm install` resilience** — `install.js` now retries with
exponential backoff, enforces per-attempt timeout + 30 s stall
detector, honors `HTTPS_PROXY` / `HTTP_PROXY` / `NO_PROXY` (pure
Node, no new dependencies), and prints download progress to
stderr. Driven by a community report that `npm install` took 18
minutes through a CN npm mirror; the GitHub Releases binary
fetch was the bottleneck and CN mirrors don't proxy GitHub.
Verified locally:
* cargo fmt --all -- --check ✓
* cargo clippy --workspace --all-targets --all-features
--locked -- -D warnings ✓
* cargo test --workspace --all-features --locked ✓
* parity gates (snapshot, parity_protocol, parity_state) ✓
* bash scripts/release/check-versions.sh ✓
(workspace=0.8.11, npm=0.8.11, lockfile in sync)
* node scripts/release/npm-wrapper-smoke.js ✓
Reminder for the maintainer at release time: the npm publish is
manual and requires 2FA OTP on every publish. After this PR
merges and the GitHub Release is fully drafted by `auto-tag.yml`,
publish from a developer machine:
cd npm/deepseek-tui
npm publish --access public
The `prepublishOnly` hook checks all eight binaries plus the
SHA256 manifest are present on the GitHub Release before letting
`npm publish` proceed, so this must happen *after* the GitHub
Release is finalized.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI on PR #684 caught two real issues that local checks missed:
**Lint failure (cargo fmt).** A regression test landed with a multi-line
`let ContentBlock::Text { text, .. } = real_user.content...` pattern
that local rustfmt accepted but CI's pinned toolchain collapsed onto a
single line. Reformatted to match.
**npm wrapper smoke failure ("Checksum manifest is missing
deepseek-<platform>").** Subtle Node.js streams interaction in
`install.js` introduced by the network-resilience cluster:
* `httpRequest` attaches a `data` event listener on the response to
re-arm the stall timer.
* Attaching a `data` listener on a `Readable` puts the stream into
flowing mode immediately.
* `downloadText` then ran `for await (const chunk of response)` to
collect the body — the async iterator expects paused-mode and
silently misses chunks that flow before / between iteration ticks.
* For small bodies (the ~100-byte SHA256 manifest), the entire
response could flow through the stall listener before the async
iterator's `read()` calls landed, leaving the joined body empty.
* Result: `parseChecksumManifest("")` returned an empty Map →
`verifyChecksum` saw no entries → "manifest is missing X" after
the actual binary download succeeded.
Binary downloads were unaffected because `download()` uses
`response.pipe(sink)` plus a `data` listener for progress — both
consume chunks via `data` events, no async iterator involved.
Fix: collect the response body in `downloadText` via direct `data`/
`end` event subscription. `data` listeners stack — both the stall
re-arm and the body collector fire on every chunk, no flowing-vs-
paused conflict. Stall detection still works.
Verified locally: `node scripts/release/npm-wrapper-smoke.js`
"npm wrapper smoke passed with local assets from <url>".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live-test repro: typing a single user message in the TUI triggered a
tool call (read_file Cargo.toml), and the *next* request to DeepSeek's
API returned HTTP 400:
"An assistant message with 'tool_calls' must be followed by tool
messages responding to each 'tool_call_id'. (insufficient tool
messages following tool_calls message)"
Root cause: `messages_with_turn_metadata` walked the message list from
the tail and prepended a `<turn_meta>` Text block to the *last* message
with role="user". But tool-result messages also use role="user"
internally (they serialize to role="tool" on the wire). Inserting a
Text content block at index 0 of a tool-result message changed the
shape from `[ToolResult(...)]` to `[Text("turn_meta..."), ToolResult(...)]`,
which on the wire becomes a role="user" message with text instead of
the role="tool" message the API needs to satisfy the assistant's
prior tool_call. Hence the 400.
The fix:
* Restrict the injection target to messages that have at least one
Text content block AND no ToolResult blocks. This identifies actual
user-typed messages and skips tool-result envelopes.
* When the trailing slice has no eligible user message (e.g. mid-turn
when a tool result is the most recent message), skip injection
entirely. The working_set will surface again on the next genuine
user prompt; we don't retroactively prepend onto an earlier user
message because that would also confuse the API's tool-call
continuity checks.
Two regression tests pin the contract:
* `turn_metadata_skips_tool_result_messages` — assistant tool_call +
tool_result + earlier user message: only the user message gets the
prefix, the tool_result message stays a single-block ToolResult.
* `turn_metadata_skips_when_only_tool_results_trail` — the corner
case where the trailing user-role message is solely a tool result
(no real user message in the slice): no injection happens, the
message returns unchanged.
Verified locally:
* 2038 tests passed in TUI bin (2 ignored, was 2036 — these are the
+2 new regressions).
* `cargo fmt`, `cargo clippy --locked -D warnings`, parity gates all
clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Folds two follow-on changes into the 0.8.11 entry:
* The token-only compaction trigger (drops the 128K-era message-count
heuristic).
* The npm `install.js` network-resilience cluster (retry + timeout +
proxy + progress).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A community user from China reported `npm install deepseek-tui`
took 18 minutes through a CN npm mirror. The bottleneck is the
GitHub Releases binary fetch (~46 MB across two binaries), not the
npm tarball (which is 6.9 kB). The CN mirror does NOT proxy GitHub
release downloads, so any user behind a slow or lossy connection
is hitting the GitHub fetch directly with no resilience.
Four behaviors added to `npm/deepseek-tui/scripts/install.js`:
1. **Retry with exponential backoff.** Up to 5 attempts on network
errors (ECONNRESET, ECONNREFUSED, ETIMEDOUT, EAI_AGAIN,
network/host unreachable, EPIPE, ECONNABORTED) and 5xx upstream
responses. Backoff `1s, 2s, 4s, 8s, 16s` with ±20% jitter. 4xx
and checksum-mismatch are flagged non-retryable so we don't
thrash on permanent failures. Final error includes the underlying
message and the attempt count.
2. **Per-attempt total timeout + stall detector.** Total timeout
defaults to 5 minutes per attempt (`DEEPSEEK_TUI_DOWNLOAD_TIMEOUT_MS`,
alias `DEEPSEEK_DOWNLOAD_TIMEOUT_MS`). A stall detector aborts
the request when no bytes arrive for 30 s
(`DEEPSEEK_TUI_DOWNLOAD_STALL_MS`, alias
`DEEPSEEK_DOWNLOAD_STALL_MS`) so a hung connection doesn't waste
the whole timeout. Both budgets are surfaced in the error so the
user can dial them up if they're on a slow pipe.
3. **HTTPS_PROXY / HTTP_PROXY support — pure Node, no new
dependencies.** Detects `HTTPS_PROXY` / `HTTP_PROXY` (and the
lowercase variants) and routes through the proxy via CONNECT
tunneling. `NO_PROXY` exclusion list honored, with `*` and dotted-
suffix matching. Proxy auth via standard `user:pass@` URL form is
passed through as `Proxy-Authorization: Basic ...`. Pure-Node
implementation using `net` + `tls` + `http` + `https` builtins —
no `https-proxy-agent` dependency added.
4. **Download progress indicator.** Writes to stderr every ~1 MB
or every 2 s in TTY mode using `\r` to overwrite a single line.
Non-TTY mode (CI, piped) emits one line per 5 MB so logs stay
reasonable. Suppressed when `DEEPSEEK_TUI_QUIET_INSTALL=1` or
when `npm_config_loglevel` is `silent` or `error`. Falls back to
`N MB downloaded` when the response has no `Content-Length`.
Public API unchanged: existing callers of `getBinaryPath` and `run`
keep working identically when no new env vars are set. The escape
hatch `DEEPSEEK_TUI_DISABLE_INSTALL=1` still exits cleanly.
Verified locally:
* `node -c install.js` and module-load syntax checks.
* `DEEPSEEK_TUI_FORCE_DOWNLOAD=1 DEEPSEEK_TUI_VERSION=0.8.10 node
install.js` — real GitHub Releases download succeeded with
visible progress, both binaries landed.
* `HTTPS_PROXY=http://invalid.proxy.local:9999 ... node install.js`
— proxy path exercised, fails cleanly with the bad host named
in the error message after retries exhausted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `message_threshold` field on `CompactionConfig` was a 128K-era
heuristic that fired compaction on long sessions of small messages
— exactly the case where rewriting V4's prefix cache is most wasteful.
Token budget is the only signal that maps to actual model context
pressure; counting messages adds nothing.
Changes:
* Remove `CompactionConfig::message_threshold` field.
* Remove the message-count branch in `should_compact` — token
threshold + 500K floor is now the sole compaction trigger.
* Remove `compaction_message_threshold_for_model`,
`DEFAULT_COMPACTION_MESSAGE_THRESHOLD`,
`COMPACTION_MESSAGE_DIVISOR`, `MAX_COMPACTION_MESSAGE_THRESHOLD`
from `models.rs`.
* Drop the `forced_config.message_threshold` tweak in the engine's
capacity-guardrail forced-compaction path; that path now also
bypasses the floor (`auto_floor_tokens = 0`) because we're at a
hard ceiling and have to free budget regardless of cache cost.
* Update production constructors (`main.rs`, `runtime_threads.rs`,
`app.rs::compaction_config`) to drop the field.
* Update tests: keep the floor + token-threshold assertions, delete
the two tests that specifically validated message-count
triggering, replace `should_compact_respects_message_threshold`
with `message_count_no_longer_triggers_compaction` pinning the
new contract.
Verified locally:
* `cargo fmt --all -- --check` clean.
* `cargo clippy --workspace --all-targets --all-features --locked
-- -D warnings` clean.
* `cargo test --workspace --all-features --locked` — 2036 passed
in TUI bin (2 ignored), all other crates green.
* parity gates: snapshot, parity_protocol, parity_state — all pass.
* `git diff --exit-code -- Cargo.lock` — clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Maintainer-reported (handoff): after spawning a sub-agent in YOLO, the
transcript renders solid black and scroll keys go dead, but footer +
sidebar still render fine. The shape (black + dead input together)
strongly suggests a `View` is on the stack that returns empty layout
AND intercepts key events at the top level. The fix wants a tighter
repro than we have today.
Add `tracing::debug!` to every push / push_boxed / pop on `ViewStack`
and to the implicit pops in `apply_action` (Close + EmitAndClose).
Each line carries the `ModalKind` and post-action depth, so a future
`RUST_LOG=deepseek_tui::view_stack=debug` capture will show exactly
which view stayed pushed when the symptom recurred.
No behavior change. The handoff explicitly suggested this as the
first-look diagnostic step; we ship the diagnostic now so the next
report comes with evidence.
Refs the unresolved sub-agent black-transcript symptom captured in
session-3 handoff. Will surface to a tracking issue once we have a
concrete repro from the maintainer.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two coordinated changes that stop the engine from routinely rewriting the
prompt prefix and burning DeepSeek V4's prefix-cache discount:
1. `Settings::default().auto_compact` flips from `true` to `false`. The
`auto_compact = on` opt-in and the explicit `/compact` slash command
stay available for users / agents that decide their workload benefits
from compaction more than from cache stability. With V4's 1M-token
window the user has plenty of headroom to run long sessions without
auto-trimming, and aggressive compaction has been the dominant
cost-spike vector in long sessions (the rewritten prefix invalidates
~90% of the cache discount on every compaction event).
2. `DEFAULT_COMPACTION_TOKEN_THRESHOLD` raised from `50_000` to
`102_400` (80% of `DEFAULT_CONTEXT_WINDOW_TOKENS = 128_000`). This is
the last-resort threshold used when `context_window_for_model` returns
`None` — i.e. an unrecognised model id. Pre-v0.8.11 the fallback
compacted at ~5% of a V4 window when model detection silently fell
through. Now the fallback inherits the same late-trigger discipline as
the V4 path, so model-detection drift doesn't quietly burn cache.
Together: the two changes mean compaction never fires automatically by
default, and even when explicitly opted in (or when the runtime-thread /
capacity-flow paths invoke compaction with their own `enabled = true`
config), the threshold is anchored at 80% of the model's context window
(or 80% of the 128K default if the model is unknown), never below.
Tests
=====
- `default_settings_disable_auto_compact_to_protect_v4_prefix_cache` —
pins the new default and explains the rationale inline.
- `auto_compact_remains_explicitly_configurable` — unchanged; still
asserts the `set("auto_compact", "on" | "off")` round-trip works.
- `compaction_threshold_scales_with_context_window` — updated to assert
`compaction_threshold_for_model("unknown-model") == 102_400`.
- `v4_soft_caps_only_apply_to_v4_models` — updated to assert the
unknown-model + reasoning-effort path also lands on the new floor.
Verification
============
- `cargo fmt --all -- --check` clean.
- `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features
--locked -- -D warnings` clean.
- `cargo test -p deepseek-tui --bin deepseek-tui --locked` →
2028 passed, 2 ignored.
Refs #664 (handoff-instead-of-compact pattern, full implementation
deferred). Behaviour-only change for v0.8.11; the larger
agent-aware-handoff mechanism is its own design surface.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds two regression tests to crates/tui/src/tui/paste.rs::tests that
nail down what is currently a working code path but was not previously
covered by name:
* `ime_chinese_chars_route_through_to_composer` — simulates the
macOS/Windows IME commit pattern (one `KeyCode::Char(c)` event per
Chinese codepoint with realistic ~50 ms gaps so the paste-burst
heuristic doesn't false-positive). Asserts that "你好世界" lands in
`app.input` verbatim and that `cursor_position` advances by one per
codepoint, not per UTF-8 byte. The non-ASCII branch in
`handle_paste_burst_key` (paste.rs:42) is the structural anchor;
this test pins it so a future "filter to ASCII for the paste-burst
detector" change would surface immediately.
* `bracketed_paste_preserves_chinese_and_mixed_text` — pastes a mix
of CJK and Latin text ("你好世界 hello 世界 café") through the
bracketed-paste path (`insert_paste_text` → `normalize_paste_text`
→ `insert_str`) and confirms every codepoint survives plus the
cursor tracks codepoints, not bytes.
Why these tests, why now: a community report surfaced the question
"can users input Chinese characters" without specifying the exact
failure mode. Code review of the input data path turned up nothing
broken, and these tests confirm the data path is correct end-to-end
for both single-char IME commits and bulk bracketed paste. The tests
serve as evidence (the data path is provably fine) and as a guard
against future regressions to Chinese-input support.
The tests cost nothing at runtime and build under `cfg(test)` only.
If users are still seeing a Chinese-input failure after this lands,
the candidates worth investigating in priority order are: (1) display
layer — `wrap_input_lines` / `cursor_row_col` may be miscounting
double-width CJK cells; (2) terminal-specific delivery — certain
IMEs / terminals don't emit the events crossterm expects; (3) locale
at launch — `LC_ALL=C` in non-interactive shells breaks UTF-8 input
upstream of crossterm.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduction (from the user who filed #593, also the reporter of #586):
1. At any prior point, the user runs `deepseek auth set --provider deepseek`,
which writes to the OS keyring under the `deepseek` slot.
2. The key is later rotated, the prior install is replaced, or the user
moves to a different account.
3. The user opens the TUI, gets the in-TUI onboarding screen, and pastes
their fresh API key.
4. `submit_api_key` → `save_api_key` writes only to `~/.deepseek/config.toml`.
5. At request time, `Secrets::resolve` follows the documented
`keyring → env → config-file` precedence, and the **stale** keyring
entry shadows the fresh config.toml value.
6. API call goes out with the dead key, gets a 401, the TUI shows
"no response" with no obvious diagnostic.
The fix
=======
`save_api_key` now writes to **both** layers when a keyring backend is
reachable:
* The config file remains the durable, inspectable record of the
active key (works in npm installs, IDE terminals, headless CI —
everywhere). v0.8.8 made this the canonical location for a reason.
* The OS keyring entry is rewritten on every onboarding submit so a
stale credential from a prior install is overwritten in place.
`SavedCredential` gains a new `KeyringAndConfigFile { backend, path }`
variant; the existing `ConfigFile(PathBuf)` variant remains the
fallback when no keyring backend is reachable (or under `cfg(test)`,
so the unit suite never pollutes the host keyring). The onboarding
toast naturally reports the actual outcome via
`SavedCredential::describe`, which now reads
`OS keyring (system keyring) and ~/.deepseek/config.toml` for the
common case.
`save_api_key_for` (the multi-provider entry point) is updated to
extract the path from either variant, so non-DeepSeek providers
(OpenRouter / Novita / Fireworks / NIM / SGLang) continue writing
provider-table entries to config.toml only, with no behavior change.
`deepseek doctor` warning
=========================
`run_doctor` now compares the keyring's `deepseek` slot against the
config file's `api_key` slot. When both are present and differ, the
report surfaces the discrepancy with copy-paste remediation —
`deepseek auth set --provider deepseek` rewrites both layers in one
shot, and the in-TUI onboarding now does the same. The check skips
keyring probes for other providers because they don't write to the
keyring today; probing absent slots only triggers macOS Always-Allow
prompts for nothing.
Why dual-write rather than keyring-only
=======================================
A previous attempt (`4e360274`, never merged to main) swapped the
write path to keyring-only. That hides the key from anyone who
expected to see it under `~/.deepseek/config.toml` and breaks the
"deepseek-tui works in every folder, in npm installs, in IDE
terminals" promise of v0.8.8. Dual-write keeps the inspectable copy
and adds the layered override that defeats stale-shadow without
changing the visible mental model.
Tests
=====
* `saved_credential_describe_lists_both_targets_for_keyring_and_config`
pins the toast text shape so the user sees both targets after
onboarding.
* The existing `save_api_key_writes_config_file_under_cfg_test` and
`test_save_api_key_doesnt_match_similar_keys` continue to pass —
under `cfg(test)` the keyring path is gated out, so the
config-only outcome remains the test-time contract.
Verification
============
* `cargo fmt --all -- --check` clean.
* `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features
--locked -- -D warnings` clean.
* `cargo test -p deepseek-tui --bin deepseek-tui --locked` →
2029 passed, 2 ignored.
Closes#593.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two small follow-ups to #588's review:
* Gemini-code-assist suggested explicitly listing environment variables,
command-line flags, and URLs alongside identifiers/tool-names in the
carve-out clause, since those are exactly the categories an LLM is
likeliest to "helpfully" translate (e.g. `--verbose` or `DEBUG=true`).
Adopting verbatim — the additions are non-controversial and the failure
mode they prevent is real.
* Copilot flagged that the structural test only checked for the `## Language`
heading. A future edit could keep the heading but silently weaken the
section to a generic "respond in the user's language" directive,
dropping the cross-cutting #588 commitment that the model's
`reasoning_content` field — not just the visible reply — follows the
user's language. Add a second structural anchor: assert the section
body mentions `reasoning_content`. This matches the existing rlm test's
"anchor tokens, not prose" convention (the API field name is the
feature contract, not a wording choice).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Post-merge review feedback on #583 surfaced four small accuracy gaps:
1. The narrative docs in `docs/CONFIGURATION.md` and the inline comment
in `config.example.toml` said the notification fires "when a turn
takes longer than a threshold" — but the call site in
`tui/ui.rs:928` is gated on `TurnOutcomeStatus::Completed`. Failed
and cancelled turns are silent on purpose. Spell that out so users
don't expect alerts on long failures.
2. The `notify_done` rustdoc still summarised `Auto` as "Osc9 for known
terminals, Bel otherwise" — internally inconsistent with the new
Windows-aware fallback documented one screen earlier on the
`Method::Auto` enum and on `resolve_method`. Update the public
rustdoc to point at the canonical resolution table on
`resolve_method` and call out the `Off`-on-Windows branch.
3. The `## Key Reference` list in `docs/CONFIGURATION.md` had no entries
for `[notifications].method`, `[notifications].threshold_secs`, or
`[notifications].include_summary`. Other features with a dedicated
subsection (e.g. `[memory].enabled`) are listed there too, so readers
scanning the canonical key list could not discover the notification
knobs. Added the three keys with cross-references to the
Notifications subsection.
4. The Windows-only test only covered the unknown-`TERM_PROGRAM` →
`Off` fallback. The positive path (known OSC-9 terminal still
resolves to `Osc9`) was only tested via `iTerm.app`, which is a
macOS-only program — Windows CI would still pass if the `WezTerm`
arm of the match disappeared. Added
`auto_detect_picks_osc9_for_wezterm_on_windows` so the
WezTerm-on-Windows compatibility guarantee is exercised on the
Windows runner.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The post-merge review on #584 surfaced two real bugs in the new
summary-call telemetry:
1. The cache-hit percentage used `cache_hit + cache_miss` as the
denominator. Providers that populate `prompt_cache_hit_tokens` but
leave `prompt_cache_miss_tokens` as `None` (the rest of the codebase
already infers misses from `input_tokens` for cost reporting and
`/cache`) were silently reported as a flat 100% hit rate, masking the
actual ratio. Switch the denominator to `usage.input_tokens` so the
ratio matches how the rest of the project reasons about cache spread.
Extract the calc into a small `summary_cache_hit_percent` helper so
the invariant is unit-testable.
2. The doc comment on the emit site advertised that
`RUST_LOG=deepseek_tui::compaction=debug` would also work as a
filter. It does not — `EnvFilter` matches the explicit target string
when one is set, so only `RUST_LOG=compaction=debug` activates the
event. Drop the misleading parenthetical and call out the filter
semantics explicitly.
The new unit test pins the partial-telemetry guard so a future regress
to `(hit + miss)` denominator would be caught immediately.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups from the post-#572 cache-aligned compaction review (#584):
1. `should_use_cache_aligned_summary` now carries a doc comment that
explains why the cache-aligned and fallback summary requests are
framed differently. Cache-aligned replays the conversation as the
model's own history under `system: None`; fallback reformats it
into a `User:/Assistant:` transcript under a "concise summaries"
system prompt. The fallback's external-transcript framing is more
conservative for the older / smaller models the cache-aligned path
explicitly excludes, so dropping the system prompt risks regressing
those models without a corresponding gain. Unifying the two paths
is a research question that wants an A/B summary-quality eval, not
a drive-by cleanup — flagged here for a future PR rather than
resolved silently.
2. `create_summary` now emits one `tracing::debug!` event per summary
call carrying which path was chosen, the prompt-token count, and
the cache-hit / miss split. Filter with `RUST_LOG=compaction=debug`
(or the full module path
`RUST_LOG=deepseek_tui::compaction=debug`). This makes the V4
prefix-cache win from #572 observable post-deploy without adding
UI surface — the compaction summary call is the request we most
expect to benefit, and previously we had no per-call signal for it.
No UI surface changes. No model-facing prompt changes. Only adds the
path-choice variable and the debug log; existing compaction tests
(56 across `compaction::*` and `models::*`) still pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows, the audio stack maps BEL (`\x07`) to the
`SystemAsterisk` / `MB_OK` chime — the same sound applications use
for error popups. So with the previous `Method::Auto` fallback to
`Bel`, every successful turn-completion notification ended up
sounding identical to a software error.
Reported by a community user who described it as "the popup-error
sound from a CAD program I used to use" (#583).
resolve_method() now returns `Off` instead of `Bel` on Windows for
unknown TERM_PROGRAM values. Known OSC-9-capable terminals
(`iTerm.app`, `Ghostty`, `WezTerm`) still resolve to `Osc9` on
every platform, so users running WezTerm on Windows keep getting
real notifications. macOS and Linux behaviour is unchanged.
Windows users who actively want an audible cue can opt back in by
setting `[notifications].method = "bel"` in `~/.deepseek/config.toml`.
Also:
- Documents `[notifications]` in `docs/CONFIGURATION.md` with an
explicit Windows note (the schema was previously undocumented).
- Updates the inline comment in `config.example.toml` so users
reading the seed config see the platform-specific behaviour.
- Splits the existing `auto_detect_picks_bel_for_unknown` test
into a Unix variant (`#[cfg(not(target_os = "windows"))]`) and
adds a new Windows-gated test that asserts the `Off` fallback,
so CI's Windows runner exercises the platform-specific path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DeepSeek V4's `reasoning_content` channel inherits the system
prompt's English bias even when users write in Chinese, so the
visible thinking trace stays in English alongside (sometimes
mixed-language) replies.
Adds a `## Language` section near the top of `base.md` directing
the model to mirror the user's language in *both*
`reasoning_content` and the final reply, with a carve-out so
identifiers, file paths, tool names, and log lines stay in their
original form (translating `read_file` to `读取文件` would break
tool calls). Default remains English when no clear signal is
present, so existing behaviour is preserved.
Includes a structural test in `crates/tui/src/prompts.rs` that
asserts the section ships in every mode (Agent / Yolo / Plan).
Wording is intentionally not asserted on, per the existing test
module's "don't fail on prose" comment.
Reported via the project Telegram community (#588).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add "What's new in v0.8.10" section covering hotfixes, runtime API
expansion, cache-aware compaction, glibc 2.28 baseline, markdown
rendering overhaul, and platform fixes
- Remove verbose per-version sections (v0.7.6 through v0.8.8) — those
belong in CHANGELOG.md, both READMEs now link to it
- Condense "How it's wired" architecture blurb to 2 sentences
- Restructure quickstart for flow: npm install → API key → platform
notes → China mirror → providers
- Trim Linux ARM64 section, drop ancient v0.8.7 workaround
- Drop redundant /attach shortcut (covered by @path)
- Tighten configuration env table to one row per variable
- Cut English skills section (numbered publish workflow duplicated
from SKILL.md)
- Add 5 missing docs to Documentation table (INSTALL, MEMORY,
SUBAGENTS, KEYBINDINGS, CHANGELOG)
- Bring zh-CN README to parity: add Key Features, How it's wired,
Thanks, Star History; preserve locale-switching guide
English: 324 lines (was ~1090). Chinese: 333 lines (was ~520, now
content-equivalent to English).
The aarch64 deepseek build in release.yml run 25329602631 succeeded in
4m 53s but the rename step failed:
cp: cannot stat 'target/aarch64-unknown-linux-gnu.2.28/release/deepseek'
cargo zigbuild parses `aarch64-unknown-linux-gnu.2.28` by passing
`aarch64-unknown-linux-gnu` to cargo and the `.2.28` glibc minimum to
zig's CC. The cargo target output dir is therefore
`target/aarch64-unknown-linux-gnu/release/`, never the
glibc-versioned form.
v0.8.9 release.yml hard-coded the rust triple in the rename step and
worked. v0.8.10 added `target_zig: <triple>.<glibc>` to the matrix and
switched the rename step to `${{ matrix.target_zig || matrix.target }}`,
which silently became wrong for every zigbuild matrix leg.
This commit:
- Always uses `matrix.target` (rust triple) for the copy source path.
- Adds a defensive `find target -name "${binary}"` debug listing if
the expected binary isn't at the rust-target path, so future
cargo-zigbuild output-dir changes are visible in the build log
rather than just "No such file".
The aarch64-unknown-linux-gnu release build for `deepseek-tui` failed in
release.yml run 25327475634 with:
openssl-sys v0.9.111: 'openssl/opensslconf.h' file not found
`crates/tui/src/main.rs` was the only crate in the workspace pulling
`reqwest` with `default-features = false, features = ["native-tls", ...]`
— every other crate (including the dispatcher in `crates/cli`) already
inherits the workspace default `["json", "rustls"]`. The aarch64 leg
builds with `cargo zigbuild --target aarch64-unknown-linux-gnu.2.28`,
whose zig sysroot does not ship openssl headers; the matching native-tls
job for v0.8.9 succeeded by chance against an earlier runner image but
the current `ubuntu-24.04-arm` image no longer satisfies openssl-sys's
header probe under zigbuild.
Switching the TUI's reqwest features from `native-tls` to `rustls` brings
it in line with the rest of the workspace and removes nine crates from
the build graph entirely (`openssl`, `openssl-sys`, `openssl-probe`,
`openssl-macros`, `native-tls`, `hyper-tls`, `tokio-native-tls`,
`foreign-types`, `foreign-types-shared`). reqwest 0.13.1 already uses
`rustls-platform-verifier` for OS trust-store integration, so end-user
TLS behavior against api.deepseek.com remains equivalent.
Verified locally:
- cargo clippy --workspace --all-targets --all-features --locked passes
- cargo build --release -p deepseek-tui --locked succeeds
- cargo fmt --all -- --check is clean
- no source code in `crates/` references native-tls / openssl directly
This is a release-pipeline-only fix; no user-visible feature changes.
Two related polish items wrapped together because both touch how the
user perceives the model's context behavior.
### Cache awareness in the agent prompt
The system prompt's Context Management section already lives inside
the volatile-content-last invariant — but the model never knew *why*
the prompt is shaped that way, or that it has any agency over keeping
the cache hit rate up.
Added a `### Prompt-cache awareness` subsection (Agent / Yolo modes)
with five concrete dos-and-don'ts:
- Append, don't reorder.
- Don't paraphrase quoted content (refer back by path).
- Use `/compact` as a hard reset, not a tweak.
- Read once, refer back instead of re-reading.
- Watch the `cache hit %` chip — red < 40%, yellow < 80%.
The chip itself already exists in the default footer status set
(`StatusItem::Cache`); the prompt addition closes the loop so the model
treats it as a real signal instead of a passive readout.
### #573 — typing `/mo` + Enter activates the first matching command
Previously a partial slash command + Enter sent the literal `/mo` as a
turn. The popup was already showing `/model` highlighted, so the user
expectation (and the OPENCODE behavior the issue cites) is that Enter
runs the highlight. The fix routes Enter through
`apply_slash_menu_selection` first when the popup is open and the input
starts with `/`. If the popup is empty (no matches) the legacy submit
path still fires — Enter on a non-slash line is unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Walks every key handler in `crates/tui/src/tui/ui.rs` and
`crates/tui/src/tui/app.rs`, confirms each chord resolves to a live
action, and groups them by context (global, composer, transcript,
sidebar, palette, approval modal, onboarding) so users have a single
page to point at instead of guessing from the help overlay.
Audit findings inline at the bottom of the doc:
* No broken bindings: every chord resolves to a live handler.
* `Ctrl-P` was previously double-bound (history + palette); that's
reconciled — the palette opens via `Ctrl-K`, `Ctrl-P` keeps history.
* The `?` help overlay entries all correspond to bindings in the
catalog; aspirational ones were either implemented this release or
dropped.
Deferral note for #436 (configurable keymap) and #437 (separate
`tui.toml`): both need a named-binding registry that names every chord
on this page and lets a user file override individual entries with
conflict detection. Half-implementing that in a patch release is worse
than landing the spec first; v0.8.10 ships the spec, the registry
follows in v0.8.11.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>