Release v0.8.4: Phase 1 i18n + cache-prefix stability

* fix(pricing): extend V4 Pro 75% discount expiry to 2026-05-31 15:59 UTC

DeepSeek extended the promotional discount past the original 2026-05-05
cutoff. Without this update the TUI would have started showing 4× the
actual billed cost on May 6.

Source: https://api-docs.deepseek.com/quick_start/pricing — "extended
until 2026/05/31 15:59 UTC".

Adds a regression test pinning the new active window so a future revert
to the May 5 date trips the suite immediately.
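
The 4× figure follows from the discount itself: a 75% discount bills 25% of list price, so showing list price overstates the cost by 1 / 0.25 = 4. A minimal sketch of a date-gated multiplier, assuming an inclusive expiry and hypothetical names (`UtcMinute`, `price_multiplier`); this is not the project's pricing code:

```rust
/// (year, month, day, hour, minute) in UTC; tuples compare lexicographically.
type UtcMinute = (u16, u8, u8, u8, u8);

/// Expiry quoted in the commit message: 2026-05-31 15:59 UTC.
const DISCOUNT_EXPIRY: UtcMinute = (2026, 5, 31, 15, 59);

/// Fraction of list price billed at `now`.
/// Treating the expiry minute as still discounted is an assumption.
fn price_multiplier(now: UtcMinute, expiry: UtcMinute) -> f64 {
    if now <= expiry { 0.25 } else { 1.0 }
}
```

With the old 2026-05-05 expiry, a May 6 timestamp yields 1.0 instead of 0.25, which is the 4× overstatement the regression test guards against.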

Closes #267

* chore: remove stale TODO(integrate) markers from already-integrated modules

Five `// TODO(integrate)` comments and one matching "Not yet integrated"
note were misleading anyone grepping for integration work. Each module
is in fact wired up:

- execpolicy/mod.rs       → tools/shell.rs:1322 (load_default_policy)
- sandbox/mod.rs          → tools/shell.rs:28, main.rs:2647, tui/approval.rs:30
- sandbox/policy.rs       → main.rs:2752, tui/approval.rs:30 (SandboxPolicy)
- command_safety.rs       → tools/shell.rs:1321, tools/tasks.rs:13,
                            tools/approval_cache.rs:26
- tui/streaming/mod.rs    → tui/app.rs:38 (StreamingState)

The remaining TODO at mcp.rs:1771 covers a separate "wire legacy sync API
into CLI subcommands or remove" decision and is left in place.

Closes #266

* docs(release): add install + dual-binary template to GitHub Release page

Closes #265.

The Release page used the auto-generated commit-title body. New users
hitting the Release page from Twitter / npm-search had no on-page
guidance that the dispatcher (`deepseek`) and the TUI runtime
(`deepseek-tui`) ship as two binaries that must coexist; in #258 an
external user spent 11 minutes figuring this out, and #272 was the
follow-on confusion.

The new body covers:
- npm wrapper as the recommended install
- `cargo install deepseek-tui-cli deepseek-tui --locked` (both crates)
- Manual download with a per-platform table showing both artifacts
- sha256 verify using the existing `deepseek-artifacts-sha256.txt`
- Changelog link

* feat(debug): add /cache command surfacing per-turn DeepSeek cache hit/miss

Step 1 of #263. Without per-turn telemetry the prefix-cache audit is
unfounded speculation; the rest of the issue's investigation steps
depend on this surface.

The DeepSeek API already returns `prompt_cache_hit_tokens` and
`prompt_cache_miss_tokens` per turn, and we already store the *latest*
on App. This adds a 50-turn ring (`turn_cache_history`) populated at
the same site as `last_prompt_cache_*_tokens`, plus a `/cache [count]`
slash command that renders a fixed-width table of the last N turns
with per-turn ratios and a session aggregate. Default count is 10;
larger values clamp to the ring size.

Edge cases the formatter handles:

- No telemetry yet → friendly "no turns recorded" message
- `cache_hit_tokens = None` (provider didn't report) → row renders all
  em-dashes and is excluded from session aggregates so one missing-
  telemetry turn can't make the average ratio look broken.
- `cache_hit_tokens = Some, cache_miss_tokens = None` → infer miss as
  `input − hit` and mark the cell with `*`. Footer documents the
  asterisk.
- Ring at cap (50) → push evicts oldest.

Tests cover all four paths plus the cap.
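
The ring and the edge cases above can be sketched as follows. The type and method names (`TurnCache`, `CacheHistory`, `resolved`, `hit_ratio`) are hypothetical stand-ins, not the fields on `App`; only the 50-turn cap, the `input − hit` inference, and the exclusion of unreported turns come from the description above.

```rust
use std::collections::VecDeque;

const RING_CAP: usize = 50;

#[derive(Clone, Copy, Debug, PartialEq)]
struct TurnCache {
    input_tokens: u64,
    hit: Option<u64>,
    miss: Option<u64>,
}

impl TurnCache {
    /// Returns (hit, miss, miss_was_inferred), or None when the provider
    /// reported no hit count for this turn.
    fn resolved(&self) -> Option<(u64, u64, bool)> {
        let hit = self.hit?;
        match self.miss {
            Some(miss) => Some((hit, miss, false)),
            // Infer miss as input - hit; the table marks such cells with `*`.
            None => Some((hit, self.input_tokens.saturating_sub(hit), true)),
        }
    }
}

#[derive(Default)]
struct CacheHistory {
    turns: VecDeque<TurnCache>,
}

impl CacheHistory {
    /// Pushes a turn, evicting the oldest one once the ring is full.
    fn push(&mut self, turn: TurnCache) {
        if self.turns.len() == RING_CAP {
            self.turns.pop_front();
        }
        self.turns.push_back(turn);
    }

    /// Session hit ratio over the turns that reported telemetry.
    fn hit_ratio(&self) -> Option<f64> {
        let (mut hit, mut total) = (0u64, 0u64);
        for (h, m, _) in self.turns.iter().filter_map(TurnCache::resolved) {
            hit += h;
            total += h + m;
        }
        (total > 0).then(|| hit as f64 / total as f64)
    }
}
```

Returning `None` from `resolved` keeps unreported turns out of the aggregate without a special case in the summing loop.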

* test(prompts): add cache-prefix stability harness for #263 step 2

The DeepSeek prefix-cache only hits while the byte prefix of each
request matches the prior call. Anything in the cached prefix that
varies turn-to-turn for unchanged inputs is a cache buster.

Adds a focused harness next to the production surface so the property
is regression-guarded:

1. `first_divergence(a, b)` helper that returns the first divergent
   byte position with a `±32 byte` window of context, used by the
   custom assertion `assert_byte_identical`. Future suspect tests can
   reuse this to surface "where" rather than just "fail".

2. `compose_prompt_is_byte_stable_across_calls` — sweeps every
   (mode, personality) pair and pins that two consecutive calls
   produce identical bytes. Rules out suspect #4 (mode-prompt churn).

3. `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace`
   — the call site `engine.rs::build_tool_context` actually invokes,
   pinned for an empty workspace across all three modes.

4. `system_prompt_with_working_set_summary_is_byte_stable_for_constant_summary`
   — pins that the surrounding prompt construction faithfully embeds
   the working_set summary it's given without injecting extra
   non-determinism. (The actual working_set summary stability lives
   in `working_set.rs` and is the next investigation target — see
   issue note in PR description.)

Foundation for the suspect-by-suspect bisection in the rest of #263.
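
A minimal sketch of the divergence helper and the assertion built on it. The signatures are assumptions: the real `first_divergence` returns the position together with its context window, whereas this sketch splits the window into a separate hypothetical `context_window` function.

```rust
/// First byte offset where `a` and `b` differ, or None when identical.
/// A length mismatch counts as a divergence at the shorter length.
fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    let common = a.iter().zip(b).take_while(|(x, y)| x == y).count();
    (common < a.len() || common < b.len()).then_some(common)
}

/// Up to 32 bytes of context on each side of `pos`, lossily decoded.
fn context_window(bytes: &[u8], pos: usize) -> String {
    let end = (pos + 32).min(bytes.len());
    let start = pos.saturating_sub(32).min(end);
    String::from_utf8_lossy(&bytes[start..end]).into_owned()
}

/// Panics with the divergence offset and surrounding context.
fn assert_byte_identical(a: &str, b: &str) {
    if let Some(pos) = first_divergence(a.as_bytes(), b.as_bytes()) {
        panic!(
            "diverged at byte {pos}:\n  left:  {:?}\n  right: {:?}",
            context_window(a.as_bytes(), pos),
            context_window(b.as_bytes(), pos),
        );
    }
}
```

A stability test then reduces to calling the function under test twice and passing both outputs to `assert_byte_identical`.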

* fix(secrets): never overwrite the secrets file when load_unlocked errors

`FileKeyringStore::set` and `delete` did
`self.load_unlocked().unwrap_or_default()`, which wiped every existing
secret if the read failed for any reason other than "file is missing":

- file mode != 0600 (`InsecurePermissions`) — easy on headless / CI
  environments where a permissive umask got applied
- corrupt JSON
- transient I/O error

In all of those, the next `store_unlocked` overwrote the file with an
empty-or-single-entry blob and reset perms to 0600, silently losing
every other provider's key.

Switch both call sites to `?`. `load_unlocked` already returns
`Ok(default)` for a missing file, so the first-write-creates-the-file
ergonomic is preserved (covered by the new
`file_store_set_still_creates_file_when_missing` test).
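
The difference between `unwrap_or_default()` and `?` can be sketched with an in-memory stand-in for the file. `Store`, `FileState`, and `StoreError` here are hypothetical simplifications, not the real `FileKeyringStore` types.

```rust
use std::collections::BTreeMap;

type Secrets = BTreeMap<String, String>;

#[derive(Clone, Copy, Debug, PartialEq)]
enum StoreError {
    InsecurePermissions,
    Corrupt,
}

/// Stand-in for the on-disk secrets file.
enum FileState {
    Missing,
    Readable(Secrets),
    Unreadable(StoreError),
}

struct Store {
    file: FileState,
}

impl Store {
    /// A missing file loads as an empty map; other failures are errors.
    fn load(&self) -> Result<Secrets, StoreError> {
        match &self.file {
            FileState::Missing => Ok(Secrets::new()),
            FileState::Readable(s) => Ok(s.clone()),
            FileState::Unreadable(e) => Err(*e),
        }
    }

    fn set(&mut self, key: &str, value: &str) -> Result<(), StoreError> {
        // Propagate the error instead of `.unwrap_or_default()`, so an
        // unreadable file is never replaced by a near-empty one.
        let mut secrets = self.load()?;
        secrets.insert(key.to_string(), value.to_string());
        self.file = FileState::Readable(secrets);
        Ok(())
    }
}
```

Because `load` already maps the missing-file case to `Ok`, the first write still creates the file.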

Adds four regression tests:

- set: insecure perms surface InsecurePermissions and leave the file
  byte-identical.
- delete: same.
- set: corrupt JSON surfaces the parse error and leaves the file
  byte-identical.
- set: missing file path still works (idempotence guard).

Closes #281

* fix(cache): make tool catalog byte-stable across calls and sessions

DeepSeek's KV prefix cache hits on the longest matching byte prefix of
the request. Two places in the tool-array path were silently introducing
divergence:

1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly.
   Rust's default `HashMap` hasher (`RandomState`) is randomly seeded, so
   iteration order is unspecified and every `deepseek` launch produced a
   different tool order. The cross-session resume case (the one with the
   biggest cache wins) never hit.

2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the
   active set in catalog order. When ToolSearch activated a previously-
   deferred tool mid-conversation, the new tool appeared at its catalog
   index, shifting every later tool's byte offset and busting the cached
   prefix from there onwards.

Fixes:

- `to_api_tools()` now sorts by tool name before emitting the API tool
  array. Stable across calls AND across launches.
- `build_model_tool_catalog()` sorts each partition (built-ins first,
  contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's
  `assembleToolPool` strategy where they explicitly call out cache
  stability as the reason: "a flat sort would interleave MCP tools into
  built-ins and invalidate all downstream cache keys whenever an MCP
  tool sorts between existing built-ins."
- `active_tool_list_from_catalog()` puts always-loaded tools in catalog
  order at the head and deferred-but-now-active tools at the tail. A
  deferred-tool activation during ToolSearch no longer shifts earlier
  tools' positions.
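
The two ordering rules can be sketched as below. `ToolDef`, `build_catalog`, and `active_list` are hypothetical names, and appending activated tools in activation order is an assumption; the text above only says they go to the tail.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ToolDef {
    name: String,
    is_mcp: bool,
    deferred: bool,
}

/// Built-ins first, MCP tools after; each partition sorted by name.
fn build_catalog(mut tools: Vec<ToolDef>) -> Vec<ToolDef> {
    // `false < true`, so sorting on (is_mcp, name) keeps built-ins contiguous.
    tools.sort_by(|a, b| (a.is_mcp, &a.name).cmp(&(b.is_mcp, &b.name)));
    tools
}

/// Always-loaded tools in catalog order, then activated deferred tools.
fn active_list<'a>(catalog: &'a [ToolDef], activated: &[&str]) -> Vec<&'a str> {
    let mut out: Vec<&str> = catalog
        .iter()
        .filter(|t| !t.deferred)
        .map(|t| t.name.as_str())
        .collect();
    for name in activated {
        if let Some(t) = catalog.iter().find(|t| t.deferred && t.name == *name) {
            out.push(t.name.as_str());
        }
    }
    out
}
```

Activating a deferred tool only appends to the list, so the bytes for every earlier tool keep their offsets.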

Adds three regression tests:

- `to_api_tools_emits_alphabetical_order_regardless_of_registration_order`
- `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability`
- `active_tool_list_pushes_deferred_activations_to_the_tail`

Refs #263. Findings produced by reading reference Claude Code source
side-by-side with our request-building flow; full delta analysis in
the PR description.

* fix(sandbox): elevate Agent-mode shell sandbox to allow network access

The seatbelt-default policy is `WorkspaceWrite { network_access: false }`,
which on macOS emits `(deny default)` with no `(allow network-outbound)` /
`(allow system-socket)`. Every outbound socket call from a sandboxed
shell command — including `getaddrinfo` for DNS — gets denied by the
kernel. Symptom: "DNS resolution failed" for any URL the model tries to
reach via curl, yt-dlp, package managers, etc.

Engine.build_tool_context only elevated the policy in Yolo mode, leaving
Agent mode (the default) stuck on the strict default. That's tighter
than competitors (Claude Code, Codex) without buying any safety the
application-level NetworkPolicy or the approval flow doesn't already
provide.

Switch the elevation to a `match` so:

- Plan       → no elevation (read-only investigation; shell tool not registered)
- Agent      → WorkspaceWrite { network_access: true, … }
- Yolo       → WorkspaceWrite { network_access: true, … } (unchanged)

Adds `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network` so a
future revert to the no-network default trips CI immediately.
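
A simplified sketch of the per-mode elevation. `Mode`, this one-field `SandboxPolicy`, and `elevated_policy` are illustrative; the real `WorkspaceWrite` variant carries more fields (elided as `…` above).

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Plan,
    Agent,
    Yolo,
}

#[derive(Clone, Debug, PartialEq)]
enum SandboxPolicy {
    WorkspaceWrite { network_access: bool },
}

/// Returns the elevated policy, or None when the mode keeps the default.
fn elevated_policy(mode: Mode) -> Option<SandboxPolicy> {
    match mode {
        // Plan never registers the shell tool, so nothing to elevate.
        Mode::Plan => None,
        Mode::Agent | Mode::Yolo => Some(SandboxPolicy::WorkspaceWrite {
            network_access: true,
        }),
    }
}
```

An exhaustive `match` means adding a new mode later forces an explicit decision about its sandbox policy at compile time.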

Closes #273

* fix(skills): treat bare github.com/<owner>/<repo> URLs as GitHubRepo

Closes #269.

`/skill install https://github.com/obra/superpowers` failed on every
platform with `invalid gzip header`. Root cause: `InstallSource::parse`
matched any `https://`-prefixed spec as `DirectUrl`, so the installer
downloaded the HTML repo page (200 OK, `text/html`) and tried to
gzip-decode HTML. The user reported it from Win11 + PowerShell but the
parse path is platform-independent.

Recognize bare GitHub repo URLs in `InstallSource::parse`:

- `https://github.com/<owner>/<repo>`
- `https://github.com/<owner>/<repo>/`
- `https://github.com/<owner>/<repo>.git`
- `https://github.com/<owner>/<repo>.git/`
- `https://www.github.com/<owner>/<repo>`
- `http://github.com/<owner>/<repo>` (legacy)

…all route to the existing `GitHubRepo` source, which already produces
`https://github.com/<owner>/<repo>/archive/refs/heads/{main,master}.tar.gz`
candidates with proper fallback. URLs with a third path segment
(`/archive/...`, `/blob/...`, `/tree/...`) keep going through
`DirectUrl` because the user picked that exact path.

Adds two regression tests: one asserting the seven recognised forms
all canonicalize to `github:obra/superpowers`, and one pinning the
sub-resource paths to `DirectUrl`.
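
One way the recognition could look, as a sketch only. This two-variant `InstallSource` and the helper `github_repo` are hypothetical, and the `owner/repo` payload stands in for the `github:` canonical form mentioned above.

```rust
#[derive(Debug, PartialEq)]
enum InstallSource {
    GitHubRepo(String),
    DirectUrl(String),
}

/// Extracts "owner/repo" from a bare GitHub repo URL, else None.
fn github_repo(spec: &str) -> Option<String> {
    let rest = spec
        .strip_prefix("https://")
        .or_else(|| spec.strip_prefix("http://"))?;
    let rest = rest.strip_prefix("www.").unwrap_or(rest);
    let path = rest.strip_prefix("github.com/")?;
    let path = path.strip_suffix('/').unwrap_or(path);
    let path = path.strip_suffix(".git").unwrap_or(path);
    let mut parts = path.split('/');
    let (owner, repo) = (parts.next()?, parts.next()?);
    // A third segment (archive/blob/tree/...) means the user chose that path.
    if parts.next().is_some() || owner.is_empty() || repo.is_empty() {
        return None;
    }
    Some(format!("{owner}/{repo}"))
}

fn parse_url(spec: &str) -> InstallSource {
    github_repo(spec)
        .map(InstallSource::GitHubRepo)
        .unwrap_or_else(|| InstallSource::DirectUrl(spec.to_string()))
}
```

Requiring exactly two path segments is what keeps sub-resource URLs on the `DirectUrl` path.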

* fix(cache): drop volatile fields from working_set summary block (#280) (#287)

The working-set summary lands inside the system prompt before the
historical conversation, so any byte that drifts there cache-misses
everything that follows in DeepSeek's KV prefix cache. Two sources of
turn-over-turn drift are removed:

1. The rendered line is now `- {path} ({kind})`. The previous form
   interpolated `entry.touches` and `self.turn - entry.last_turn`,
   both of which advance on every user message even when no new
   paths are observed.

2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC)
   instead of the turn-aware `sorted_entries`. The recency bonus in
   `score_entry` crosses bucket boundaries as turns advance, so even
   without rendering `last seen` the order — and which entries cross
   the `max_prompt_entries` cutoff — drifted. Compaction pinning
   still uses `sorted_entries` because it genuinely wants recency.
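
A sketch of the turn-independent ordering and rendering. This `Entry` struct is a hypothetical reduction of the real working-set entry; turn counters are deliberately absent from both the sort key and the rendered line.

```rust
#[derive(Clone, Debug)]
struct Entry {
    path: String,
    kind: &'static str,
    touches: u32,
}

/// Orders by (touches DESC, path ASC); no turn-dependent input.
fn sorted_for_prompt(entries: &[Entry]) -> Vec<&Entry> {
    let mut out: Vec<&Entry> = entries.iter().collect();
    out.sort_by(|a, b| b.touches.cmp(&a.touches).then_with(|| a.path.cmp(&b.path)));
    out
}

/// Renders `- {path} ({kind})` lines for the top entries.
fn summary_block(entries: &[Entry], max_prompt_entries: usize) -> String {
    sorted_for_prompt(entries)
        .iter()
        .take(max_prompt_entries)
        .map(|e| format!("- {} ({})\n", e.path, e.kind))
        .collect()
}
```

The path tiebreaker matters: without it, entries with equal touch counts could swap places depending on input order.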

Adds a regression test that observes a fixed message set, calls
`summary_block` before and after `next_turn()`, and asserts the two
outputs are byte-identical. The shared `first_divergence` /
`assert_byte_identical` helpers (from #279) move from `prompts::tests`
into `test_support` so working_set tests can reuse them.

Closes #280.

* fix(cache): memoise tool catalog so descriptions stay byte-stable (#289)

`to_api_tools` previously re-sampled `tool.description()` and
`tool.input_schema()` on every call. Native tools return `&'static str`
and a `json!` literal, so the bytes were stable in practice — but the
`McpToolAdapter` returns `self.tool.description.as_deref()`, which can
drift when the upstream MCP server reconnects with a different
description string. Any drift mid-session rewrites the tool catalog
that lands in the cached prefix and busts every byte that follows.

Adds an `api_cache: OnceLock<Vec<Tool>>` field on `ToolRegistry`. The
first `to_api_tools` call materialises the catalog; subsequent calls
return a clone of the cached vector. Mutations (`register`, `remove`,
`clear`) reset the field so the next read rebuilds. Mirrors
reference-cc's `getToolSchemaCache` (`utils/api.ts:119–208`).
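
The memoisation pattern, sketched with `std::sync::OnceLock`. The registry below is a hypothetical reduction (tools are plain name/description pairs), and `simulate_upstream_drift` exists only to model an MCP server changing a description without a registry mutation.

```rust
use std::sync::OnceLock;

#[derive(Clone, Debug, PartialEq)]
struct ApiTool {
    name: String,
    description: String,
}

#[derive(Default)]
struct ToolRegistry {
    tools: Vec<(String, String)>,
    api_cache: OnceLock<Vec<ApiTool>>,
}

impl ToolRegistry {
    fn register(&mut self, name: &str, description: &str) {
        self.tools.push((name.to_string(), description.to_string()));
        // Mutation resets the cache so the next read rebuilds it.
        self.api_cache = OnceLock::new();
    }

    /// Models upstream drift; deliberately leaves the cache untouched.
    fn simulate_upstream_drift(&mut self, name: &str, description: &str) {
        if let Some(t) = self.tools.iter_mut().find(|(n, _)| n.as_str() == name) {
            t.1 = description.to_string();
        }
    }

    /// First call materialises the sorted catalog; later calls clone it.
    fn to_api_tools(&self) -> Vec<ApiTool> {
        self.api_cache
            .get_or_init(|| {
                let mut out: Vec<ApiTool> = self
                    .tools
                    .iter()
                    .map(|(n, d)| ApiTool { name: n.clone(), description: d.clone() })
                    .collect();
                out.sort_by(|a, b| a.name.cmp(&b.name));
                out
            })
            .clone()
    }
}
```

Resetting the field in every `&mut self` mutator keeps invalidation local to the registry, at the cost of one rebuild per mutation.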

Tests:
- `to_api_tools_pins_description_bytes_across_calls` registers a tool
  whose `description()` advances through a script of pre-built strings
  on each call. After the cache is populated, the second `to_api_tools`
  read returns the original description because `description()` is no
  longer invoked. Without the cache the second read would return the
  next script entry.
- `register_invalidates_api_tools_cache` registers a tool, snapshots,
  registers another, snapshots again, and asserts the second snapshot
  reflects both tools (cache rebuilt) and that the varying tool's
  description advanced (proving the rebuild actually re-sampled).
- `remove_and_clear_invalidate_api_tools_cache` covers the other two
  invalidation paths.

* fix(cache): sort project_tree and summarize_project output (#290)

Both helpers walked the workspace via `ignore::WalkBuilder::build()`
and emitted entries in the OS readdir order — non-deterministic across
filesystems (htree-hash on ext4, insertion-order on APFS, etc.). Their
output lands in the fallback branch of the system prompt's project
context (when the workspace has no AGENTS.md / CLAUDE.md) and inside
the `project_map` tool surface, both of which feed the cached prefix.

`summarize_project` now sorts the collected key-files list before the
type-detection logic and the fallback `Project with key files: …` join.

`project_tree` collects `(rel_path, is_dir)` tuples, sorts by full
path, and only then formats the indented tree. Sorting by full path
preserves the visual tree shape — `"src" < "src/lib.rs"` because the
shorter string compares less — while making siblings deterministic.
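
A sketch of the collect, sort, then format approach, assuming `/`-separated relative paths. `render_tree` is a hypothetical name and the two-space indent is an assumption about the output format.

```rust
/// Sorts `(rel_path, is_dir)` tuples by full path, then renders an
/// indented tree. Directories get a trailing slash.
fn render_tree(mut entries: Vec<(String, bool)>) -> String {
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    let mut out = String::new();
    for (rel_path, is_dir) in &entries {
        let depth = rel_path.matches('/').count();
        let name = rel_path.rsplit('/').next().unwrap_or(rel_path.as_str());
        out.push_str(&"  ".repeat(depth));
        out.push_str(name);
        if *is_dir {
            out.push('/');
        }
        out.push('\n');
    }
    out
}
```

One caveat with plain string ordering: a sibling such as `src-old` sorts between `src` and `src/lib.rs`, because `-` compares below `/` in ASCII. Comparing path components instead of whole strings would avoid that if it matters for the workspace.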

Tests cover sibling order, parent-before-children invariant, byte
stability across two consecutive calls, and the fallback `Project
with key files:` branch (the only branch where the joined order
escapes into output without further sorting downstream).

* fix(client): unique fallback id for parallel streaming tool calls (#291)

When a streamed tool_call delta omits the `id` field, the chat-completion
decoder used to fall back to the literal string `"tool_call"` for every
call. With the V4 API's native parallel tool calls (multiple tool_calls
in one delta), every parallel call ended up with the same fallback id —
downstream tool-result routing then matched the first call's result
twice and the second call hung waiting for an answer that never arrived.

The fallback now indexes by the assigned `content_block` position,
producing `"call_0"`, `"call_1"`, … within a single response. Upstream-
supplied ids are still forwarded verbatim; only the fallback path
changes.
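
The fallback rule is small enough to sketch directly. `tool_call_id` is a hypothetical helper name; the `call_{index}` format is the one described above.

```rust
/// Forwards an upstream id verbatim; otherwise derives a unique id
/// from the content-block position within the response.
fn tool_call_id(upstream_id: Option<&str>, block_index: usize) -> String {
    upstream_id
        .map(str::to_string)
        .unwrap_or_else(|| format!("call_{block_index}"))
}
```

Since each tool call occupies its own content block, the index is unique within one response, which is all that result routing needs.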

Tests pin both invariants:
- `decoder_assigns_unique_fallback_ids_to_parallel_tool_calls_missing_id`
  feeds two tool calls without `id` in one delta and asserts they get
  distinct ids.
- `decoder_preserves_upstream_tool_call_id_when_present` keeps the
  forward-as-is path honest.

* fix(cache): place handoff and working_set after static prompt blocks (#292)

`system_prompt_for_mode_with_context_and_skills` previously interleaved
volatile content into the static prefix:

  1. mode prompt           static
  2. project context       static
  3. working_set_summary   ← volatile
  4. skills_block          static
  5. handoff_block         ← volatile
  6. ## Context Management static
  7. COMPACT_TEMPLATE      static

Anything past block (3) cache-missed every time the working-set drifted
or `/compact` rewrote `.deepseek/handoff.md` — including the static
`## Context Management` and `## Compaction Handoff` blocks behind them.

New order keeps every static block in the cached prefix and pushes the
two volatile blocks to the end:

  1. mode prompt
  2. project context (or fallback automap)
  3. skills block
  4. ## Context Management (Agent / Yolo only)
  5. COMPACT_TEMPLATE
  ── volatile boundary ──
  6. handoff block
  7. working-set summary
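
The volatile-content-last invariant can be sketched as a two-stage assembly. `PromptParts` and `assemble` are hypothetical, and joining blocks with a blank line is an assumption about the real separator.

```rust
struct PromptParts<'a> {
    mode_prompt: &'a str,
    project_context: &'a str,
    skills_block: &'a str,
    context_management: Option<&'a str>, // Agent / Yolo only
    compact_template: &'a str,
    handoff_block: Option<&'a str>,       // volatile
    working_set_summary: Option<&'a str>, // volatile
}

/// Emits every static block first, then the volatile ones.
fn assemble(p: &PromptParts) -> String {
    let static_blocks = [
        Some(p.mode_prompt),
        Some(p.project_context),
        Some(p.skills_block),
        p.context_management,
        Some(p.compact_template),
    ];
    // Everything below this point may change between turns.
    let volatile_blocks = [p.handoff_block, p.working_set_summary];
    static_blocks
        .iter()
        .chain(volatile_blocks.iter())
        .copied()
        .flatten()
        .collect::<Vec<&str>>()
        .join("\n\n")
}
```

With this layout, a change to the working-set summary alters only the tail of the prompt, so the shared static prefix stays byte-identical.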

Adds a doc comment on the function describing the volatile-content-last
invariant so future contributors don't reintroduce churn into the
prefix. Adds two regression tests:

- `system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged`
  pins the handoff path with a fixture file.
- `handoff_and_working_set_appear_after_static_blocks` asserts the
  ordering invariant directly so a future reorder fails loudly.

Reference: Claude Code's own prompt builder marks this same boundary
with a `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` constant; we don't introduce
the abstraction yet but match the principle.

* feat(i18n): localize slash command help (Phase 1a, #285) (#294)

Adds 44 new MessageIds, one per slash command, and translations for all
four shipped locales (en/ja/zh-Hans/pt-BR). Refactors CommandInfo so the
English description now lives in localization.rs (single source of
truth) instead of being duplicated on the struct, and threads the
active Locale through the three render surfaces:

- crates/tui/src/tui/views/help.rs (the ?/F1/Ctrl+/ help overlay)
- crates/tui/src/tui/command_palette.rs (Ctrl+K palette)
- crates/tui/src/commands/core.rs (the /help text command)

Usage strings (e.g. /cache [count]) stay English by design — they're
placeholder syntax, not natural language.

The existing locale-coverage test
(`shipped_first_pack_has_no_missing_core_messages`) already iterates
ALL_MESSAGE_IDS across Locale::shipped(), so the 44 new IDs are
automatically required to be present in all four locale arms or CI
fails.

This is the first of several incremental Phase 1 PRs. Phase 1b covers
the debug commands (/tokens /cost /cache), 1c the footer hints, and
1d doctor output. Phases 2–3 cover onboarding and error surfaces.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): localize /tokens /cost /cache debug output (Phase 1b, #285) (#295)

Adds 13 new MessageIds covering the report templates and the
sub-strings shared across them, with translations for all four
shipped locales (en/ja/zh-Hans/pt-BR):

- CmdTokensReport, CmdTokensContextWithWindow, CmdTokensContextUnknownWindow
- CmdTokensCacheBoth, CmdTokensCacheHitOnly, CmdTokensCacheMissOnly
- CmdTokensNotReported
- CmdCostReport
- CmdCacheNoData, CmdCacheHeader, CmdCacheTotals, CmdCacheFootnote, CmdCacheAdvice

Each template uses {placeholder} substitution via String::replace
rather than format!, since format! requires a literal — the
locale-resolved &'static str isn't one. The placeholder convention
({active}, {hit}, {miss}, …) means a translator can re-order or
restructure a sentence freely without changing the call site.
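
A sketch of runtime placeholder substitution. The `Locale` variants, `cache_both_template`, and `render` are hypothetical, and both template strings are made up for illustration; they are not the shipped translations.

```rust
#[derive(Clone, Copy)]
enum Locale {
    En,
    Other,
}

/// Illustrative templates; the second reorders the placeholders.
fn cache_both_template(locale: Locale) -> &'static str {
    match locale {
        Locale::En => "cache: {hit} hit / {miss} miss",
        Locale::Other => "cache: {miss} miss / {hit} hit",
    }
}

/// Replaces each `{name}` with its value. `format!` cannot do this
/// because its format string must be a literal known at compile time.
fn render(template: &str, args: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (name, value) in args {
        out = out.replace(&format!("{{{name}}}"), value);
    }
    out
}
```

The call site passes the same named arguments for every locale; only the template decides where they land.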

Helpers `token_count`, `active_context_summary`, `cache_summary`, and
`format_cache_history` now take `Locale` so each can resolve its
templates from the same source of truth.

The English templates byte-match the previous hardcoded format strings
so the existing 16 debug-command tests pass unchanged.

Column headers in the cache table (`turn   in    out   hit   miss …`)
are intentionally NOT localized — the body rows are formatted with
fixed column widths and translating the header words would break
alignment. Numbers, ratios, and the model id stay in English form.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): localize footer state + help section labels (Phase 1c, #285) (#296)

Adds 11 new MessageIds covering visible footer chrome and the help-overlay
section headings, with translations for all four shipped locales:

Footer:
- FooterWorking — animated `working` / `working.` / … pulse
- FooterAgentSingular / FooterAgentsPlural — the sub-agent count chip
- FooterPressCtrlCAgain — the quit-confirmation toast

Help overlay sections (`?` / `F1` / `Ctrl+/`):
- HelpSectionNavigation, HelpSectionEditing, HelpSectionActions,
  HelpSectionModes, HelpSectionSessions, HelpSectionClipboard,
  HelpSectionHelp

`KeybindingSection::label` now takes Locale and returns tr(locale, …).
`footer_working_label` and `footer_agents_chip` likewise take Locale; the
two production callsites in tui/ui.rs pass `app.ui_locale`.

The mode chip itself (agent / yolo / plan) intentionally stays English —
those are brand/acronym labels, and translating them would mean explaining
to maintainers what `代理` means in a bug report.

The keybinding catalog DESCRIPTIONS (41 entries) are not translated in this
PR — those are technical prose that would dwarf the rest of i18n work and
can ship in v0.8.5. Section labels are translated so the help overlay
groups read as expected in any locale.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(commands): smoke-test that every / command dispatches to a handler (#299)

Adds two parallel-safe smoke tests in `crates/tui/src/commands/mod.rs`
that iterate the COMMANDS registry and verify every command — and every
declared alias — dispatches to a real handler. A dispatch miss surfaces
as the fall-through `Unknown command:` error message in `execute`,
which used to be invisible until a user typed the command and saw the
"did you mean" suggestion fire on a registered command.

The tests build a workspace-isolated app via `tempfile::TempDir` so
side-effecting handlers (`/init` writing AGENTS.md, `/save` and
`/export` writing files) do not pollute `crates/tui/` when CI runs from
there. `/save` and `/export` get an explicit tempdir-relative path
because their no-arg defaults still resolve relative to `cwd`.

`/restore` is skipped — it shells out to git for the snapshot repo and
its own dedicated tests in `commands/restore.rs` already serialize on
the global env mutex via `scoped_home`. The existing coverage there is
sufficient.

Closes a gap surfaced when verifying that the v0.8.4 i18n refactor
(#294, #295, #296) did not silently break any slash-command dispatch.
All 44 commands and their aliases pass (16 aliases on top of the
44 names; `/restore` is the only skip).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): bump version to 0.8.4 (#297)

CHANGELOG entry covers the v0.8.4 work landed since 0.8.3:

- Localization Phase 1 (#285) — slash command help (#294), debug command
  output (#295), footer state and help-overlay section labels (#296).
  Adds 68 new MessageIds across all four shipped locales (en/ja/zh-Hans/pt-BR).

- Cache-prefix stability (#263) — five companion fixes (#287, #288→#292,
  #289, #290, #291) that keep the DeepSeek prefix cache stable across turns.

- Plus the items already in [Unreleased]: agent-mode network exec (#272),
  /skill GitHub URL parsing (#269), and the V4 Pro discount expiry extension
  (#267).

Bumps:
- Cargo.toml workspace version 0.8.3 → 0.8.4
- npm/deepseek-tui/package.json version + deepseekBinaryVersion 0.8.3 → 0.8.4
- Cargo.lock regenerated from the new workspace version.

Phase 1d (doctor output), Phase 2 (onboarding/init/missing-companion),
and Phase 3 (tool errors / sandbox denials / approvals) deferred to v0.8.5.
The shipped Phase 1 surfaces (slash commands, debug telemetry, footer
chrome) cover the highest-traffic UI paths Chinese users see first.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(release): bump internal path-dep versions + repair doc link (#301)

CI on PR #300 (release feat/v0.8.4 → main) flagged two regressions
introduced by the 0.8.4 version bump:

1. Version drift — path-dependency `version = "0.8.3"` references
   inside the workspace crates (10 crates: agent, app-server, cli,
   config, core, execpolicy, hooks, mcp, tools, tui) did not move with
   the workspace `[workspace.package] version = "0.8.4"`. The CI guard
   `scripts/release/check-versions.sh` requires they match.

2. Broken intra-doc-link `[crate::localization::english]` in the
   CommandInfo doc comment — `english` is private. Replaced with a
   reference to the public `description_for` accessor and the public
   `tr()` function.

Verified with:
- scripts/release/check-versions.sh — Version state OK.
- RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — green.
- cargo fmt + clippy + test all green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hunter Bown
2026-05-01 23:02:38 -05:00
committed by GitHub
parent 40ec563f3e
commit 3d3ff0c5cf
42 changed files with 2436 additions and 258 deletions
+47
@@ -128,6 +128,53 @@ jobs:
with:
files: artifacts/*/*
prerelease: false
body: |
## Install
### Recommended — npm (one command, both binaries)
```bash
npm install -g deepseek-tui
```
The wrapper downloads both binaries from this Release and places them in the same directory.
### Cargo (Linux / macOS)
```bash
cargo install deepseek-tui-cli deepseek-tui --locked
```
Both crates are required — `deepseek-tui-cli` produces the `deepseek` dispatcher and `deepseek-tui` produces the interactive runtime that the dispatcher delegates to. Installing only one binary will fail at runtime with a `MISSING_COMPANION_BINARY` error.
### Manual download
**Both** binaries below must be downloaded for your platform and dropped into the same directory (e.g. `~/.local/bin/`):
| Platform | Dispatcher | TUI runtime |
|---|---|---|
| Linux x64 | `deepseek-linux-x64` | `deepseek-tui-linux-x64` |
| macOS x64 | `deepseek-macos-x64` | `deepseek-tui-macos-x64` |
| macOS ARM | `deepseek-macos-arm64` | `deepseek-tui-macos-arm64` |
| Windows x64 | `deepseek-windows-x64.exe` | `deepseek-tui-windows-x64.exe` |
Then `chmod +x` both (Unix) and run `./deepseek`.
### Verify (recommended)
Download `deepseek-artifacts-sha256.txt` from this Release and verify:
```bash
# Linux
sha256sum -c deepseek-artifacts-sha256.txt
# macOS
shasum -a 256 -c deepseek-artifacts-sha256.txt
```
## Changelog
See [CHANGELOG.md](https://github.com/Hmbown/DeepSeek-TUI/blob/main/CHANGELOG.md) for the full notes for this release.
# npm publish is intentionally not automated. The npm account requires 2FA OTP
# on every publish, and a granular automation token that bypasses 2FA has not
+53
@@ -7,6 +7,59 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## [0.8.4] - 2026-05-02
### Added
- **Localization expansion (Phase 1, #285)** — every slash command's help
description, the full `/tokens` / `/cost` / `/cache` debug output, the
footer state and chip text, and the help-overlay section headings are
now translated for all four shipped locales (`en`, `ja`, `zh-Hans`,
`pt-BR`). Set the language with `/config locale zh-Hans` (or
`LANG=zh_CN.UTF-8` / `LC_ALL=zh_CN.UTF-8` from the shell). Non-Latin
scripts render via the same `unicode_width` plumbing the existing 27
chrome strings already use; the `shipped_first_pack_has_no_missing_core_messages`
test enforces full coverage across all four locales for every new
`MessageId`. Tool descriptions sent to the model and the base system
prompt intentionally remain English (training-data alignment, prefix
cache stability).
- Phase 1a (#294): 44 new IDs covering slash commands.
- Phase 1b (#295): 13 new IDs covering `/tokens` / `/cost` / `/cache`
debug output. Templates use `{placeholder}` substitution so a
translator can re-order args freely.
- Phase 1c (#296): 11 new IDs covering footer state, sub-agent chip,
quit-confirmation toast, and help-overlay section labels.
- **Stable cache prefix** (#263) — five companion fixes to keep the
DeepSeek prefix cache stable across turns: drop volatile fields from
the working-set summary block (#280, #287), place handoff and
working-set after the static prompt blocks (#288→#292), memoise the
tool catalog so descriptions stay byte-stable (#289), sort
`project_tree` and `summarize_project` output (#290), and use a unique
fallback id for parallel streaming tool calls so downstream tool-result
routing doesn't match the first call twice (#291). The combined effect
is a meaningful jump in cache hit rate after the third turn.
### Fixed
- **Agent-mode shell exec could not reach the network** (#272) — the seatbelt
default policy denies all outbound network including DNS, so any
`exec_shell` command needing the network (`curl`, `yt-dlp`, package
managers, …) failed in Agent mode unless the user dropped to Yolo. The
engine now elevates the sandbox policy to `WorkspaceWrite { network_access:
true, … }` for both Agent and Yolo. Plan mode is unchanged (read-only
investigation never registers the shell tool). The application-level
`NetworkPolicy` (`crates/tui/src/network_policy.rs`) remains the only
outbound-traffic boundary.
- **`/skill install <github-repo-url>` failed with `invalid gzip header`** (#269)
`https://github.com/<owner>/<repo>` parsed as a raw direct URL, so the
installer downloaded the HTML repo page and tried to gzip-decode HTML.
Bare GitHub repo URLs (with or without `.git`, with or without `www.`,
with or without a trailing slash) now route to the `GitHubRepo` source the
same as `github:<owner>/<repo>`. URLs that already point at a specific
archive / blob / tree path still go through `DirectUrl`.
- **V4 Pro discount expiry extended** (#267) — DeepSeek extended the V4 Pro 75%
promotional discount from 2026-05-05 15:59 UTC to 2026-05-31 15:59 UTC. Without
this update the TUI would have started showing 4× the actual billed cost on
May 6 onwards. Verified at https://api-docs.deepseek.com/quick_start/pricing.
## [0.8.3] - 2026-05-01
### Fixed
Generated
+14 -14
@@ -1011,7 +1011,7 @@ dependencies = [
[[package]]
name = "deepseek-agent"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"deepseek-config",
"serde",
@@ -1019,7 +1019,7 @@ dependencies = [
[[package]]
name = "deepseek-app-server"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"axum",
@@ -1042,7 +1042,7 @@ dependencies = [
[[package]]
name = "deepseek-config"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"deepseek-secrets",
@@ -1055,7 +1055,7 @@ dependencies = [
[[package]]
name = "deepseek-core"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"chrono",
@@ -1074,7 +1074,7 @@ dependencies = [
[[package]]
name = "deepseek-execpolicy"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"deepseek-protocol",
@@ -1083,7 +1083,7 @@ dependencies = [
[[package]]
name = "deepseek-hooks"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"async-trait",
@@ -1097,7 +1097,7 @@ dependencies = [
[[package]]
name = "deepseek-mcp"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"deepseek-protocol",
@@ -1107,7 +1107,7 @@ dependencies = [
[[package]]
name = "deepseek-protocol"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"serde",
"serde_json",
@@ -1115,7 +1115,7 @@ dependencies = [
[[package]]
name = "deepseek-secrets"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"dirs",
"keyring",
@@ -1128,7 +1128,7 @@ dependencies = [
[[package]]
name = "deepseek-state"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"chrono",
@@ -1140,7 +1140,7 @@ dependencies = [
[[package]]
name = "deepseek-tools"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"async-trait",
@@ -1153,7 +1153,7 @@ dependencies = [
[[package]]
name = "deepseek-tui"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"arboard",
@@ -1214,7 +1214,7 @@ dependencies = [
[[package]]
name = "deepseek-tui-cli"
version = "0.8.3"
version = "0.8.4"
dependencies = [
"anyhow",
"chrono",
@@ -1237,7 +1237,7 @@ dependencies = [
[[package]]
name = "deepseek-tui-core"
version = "0.8.3"
version = "0.8.4"
[[package]]
name = "deranged"
+1 -1
@@ -19,7 +19,7 @@ default-members = ["crates/cli", "crates/app-server", "crates/tui"]
resolver = "2"
[workspace.package]
version = "0.8.3"
version = "0.8.4"
edition = "2024"
license = "MIT"
repository = "https://github.com/Hmbown/DeepSeek-TUI"
+1 -1
@@ -7,5 +7,5 @@ repository.workspace = true
description = "Model/provider registry and fallback strategy for DeepSeek workspace architecture"
[dependencies]
deepseek-config = { path = "../config", version = "0.8.3" }
deepseek-config = { path = "../config", version = "0.8.4" }
serde.workspace = true
+9 -9
@@ -10,15 +10,15 @@ description = "Codex-style app-server transport for DeepSeek workspace architect
anyhow.workspace = true
axum.workspace = true
clap.workspace = true
deepseek-agent = { path = "../agent", version = "0.8.3" }
deepseek-config = { path = "../config", version = "0.8.3" }
deepseek-core = { path = "../core", version = "0.8.3" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.3" }
deepseek-hooks = { path = "../hooks", version = "0.8.3" }
deepseek-mcp = { path = "../mcp", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-state = { path = "../state", version = "0.8.3" }
deepseek-tools = { path = "../tools", version = "0.8.3" }
deepseek-agent = { path = "../agent", version = "0.8.4" }
deepseek-config = { path = "../config", version = "0.8.4" }
deepseek-core = { path = "../core", version = "0.8.4" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.4" }
deepseek-hooks = { path = "../hooks", version = "0.8.4" }
deepseek-mcp = { path = "../mcp", version = "0.8.4" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
deepseek-state = { path = "../state", version = "0.8.4" }
deepseek-tools = { path = "../tools", version = "0.8.4" }
serde.workspace = true
serde_json.workspace = true
tokio.workspace = true
+7 -7
@@ -14,13 +14,13 @@ path = "src/main.rs"
anyhow.workspace = true
clap.workspace = true
clap_complete.workspace = true
deepseek-agent = { path = "../agent", version = "0.8.3" }
deepseek-app-server = { path = "../app-server", version = "0.8.3" }
deepseek-config = { path = "../config", version = "0.8.3" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.3" }
deepseek-mcp = { path = "../mcp", version = "0.8.3" }
deepseek-secrets = { path = "../secrets", version = "0.8.3" }
deepseek-state = { path = "../state", version = "0.8.3" }
deepseek-agent = { path = "../agent", version = "0.8.4" }
deepseek-app-server = { path = "../app-server", version = "0.8.4" }
deepseek-config = { path = "../config", version = "0.8.4" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.4" }
deepseek-mcp = { path = "../mcp", version = "0.8.4" }
deepseek-secrets = { path = "../secrets", version = "0.8.4" }
deepseek-state = { path = "../state", version = "0.8.4" }
chrono.workspace = true
dirs.workspace = true
serde.workspace = true
+1 -1
@@ -8,7 +8,7 @@ description = "Config schema and precedence model for DeepSeek workspace archite
[dependencies]
anyhow.workspace = true
deepseek-secrets = { path = "../secrets", version = "0.8.3" }
deepseek-secrets = { path = "../secrets", version = "0.8.4" }
dirs.workspace = true
serde.workspace = true
serde_json.workspace = true
+8 -8
@@ -9,14 +9,14 @@ description = "Core runtime boundaries for DeepSeek workspace architecture"
[dependencies]
anyhow.workspace = true
chrono.workspace = true
deepseek-agent = { path = "../agent", version = "0.8.3" }
deepseek-config = { path = "../config", version = "0.8.3" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.3" }
deepseek-hooks = { path = "../hooks", version = "0.8.3" }
deepseek-mcp = { path = "../mcp", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-state = { path = "../state", version = "0.8.3" }
deepseek-tools = { path = "../tools", version = "0.8.3" }
deepseek-agent = { path = "../agent", version = "0.8.4" }
deepseek-config = { path = "../config", version = "0.8.4" }
deepseek-execpolicy = { path = "../execpolicy", version = "0.8.4" }
deepseek-hooks = { path = "../hooks", version = "0.8.4" }
deepseek-mcp = { path = "../mcp", version = "0.8.4" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
deepseek-state = { path = "../state", version = "0.8.4" }
deepseek-tools = { path = "../tools", version = "0.8.4" }
serde_json.workspace = true
tokio.workspace = true
uuid.workspace = true
+1 -1
@@ -8,5 +8,5 @@ description = "Execution policy and approval model parity for DeepSeek workspace
[dependencies]
anyhow.workspace = true
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
serde.workspace = true
+1 -1
@@ -10,7 +10,7 @@ description = "Hook dispatch and notifications parity for DeepSeek workspace arc
anyhow.workspace = true
async-trait.workspace = true
chrono.workspace = true
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
reqwest.workspace = true
serde.workspace = true
serde_json.workspace = true
+1 -1
@@ -8,6 +8,6 @@ description = "MCP server lifecycle and tool proxy compatibility for DeepSeek wo
[dependencies]
anyhow.workspace = true
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
serde.workspace = true
serde_json.workspace = true
+101 -2
@@ -273,13 +273,20 @@ impl KeyringStore for FileKeyringStore {
}
fn set(&self, key: &str, value: &str) -> Result<(), SecretsError> {
let mut blob = self.load_unlocked().unwrap_or_default();
// load_unlocked already returns Ok(default) for a missing file, so the
// first-write-creates-the-file path is preserved. Any other Err
// (insecure permissions, corrupt JSON, transient I/O) MUST surface to
// the caller: swallowing it via `unwrap_or_default()` silently
// wipes every previously stored secret on the next `store_unlocked`.
let mut blob = self.load_unlocked()?;
blob.entries.insert(key.to_string(), value.to_string());
self.store_unlocked(&blob)
}
fn delete(&self, key: &str) -> Result<(), SecretsError> {
let mut blob = self.load_unlocked().unwrap_or_default();
// Same invariant as `set`: never fall back to an empty blob on read
// error, or `delete <one-key>` becomes `delete <every-key>`.
let mut blob = self.load_unlocked()?;
blob.entries.remove(key);
self.store_unlocked(&blob)
}
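
The difference between the old `unwrap_or_default()` call and the new `?` can be shown with a toy store. This is a simplified sketch, not the `FileKeyringStore` implementation: `ToyStore`, its `key=value` line format, and the `LoadError` type are all hypothetical stand-ins chosen so the example needs only the standard library.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum LoadError {
    Corrupt,
}

/// Toy stand-in for the secrets file: `None` means the file is missing,
/// `Some(text)` holds one `key=value` entry per line.
struct ToyStore {
    disk: Option<String>,
}

impl ToyStore {
    /// A missing file loads as an empty map; a malformed line is an error.
    fn load(&self) -> Result<BTreeMap<String, String>, LoadError> {
        let Some(text) = &self.disk else {
            return Ok(BTreeMap::new());
        };
        text.lines()
            .map(|line| {
                line.split_once('=')
                    .map(|(k, v)| (k.to_string(), v.to_string()))
                    .ok_or(LoadError::Corrupt)
            })
            .collect()
    }

    fn store(&mut self, entries: &BTreeMap<String, String>) {
        let lines: Vec<String> = entries.iter().map(|(k, v)| format!("{k}={v}")).collect();
        self.disk = Some(lines.join("\n"));
    }

    /// Pre-fix shape: a failed read becomes an empty map, which is then
    /// written back over whatever was on disk.
    fn set_lossy(&mut self, key: &str, value: &str) {
        let mut entries = self.load().unwrap_or_default();
        entries.insert(key.to_string(), value.to_string());
        self.store(&entries);
    }

    /// Post-fix shape: a failed read returns early and nothing is written.
    fn set_strict(&mut self, key: &str, value: &str) -> Result<(), LoadError> {
        let mut entries = self.load()?;
        entries.insert(key.to_string(), value.to_string());
        self.store(&entries);
        Ok(())
    }
}
```

In this sketch `set_strict` leaves a corrupt file untouched, while `set_lossy` replaces it with a blob containing only the new key. Both still create the file on first write, because a missing file loads as `Ok` with an empty map.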
@@ -564,6 +571,98 @@ mod tests {
);
}
// Regression for #281: `set` and `delete` used to call
// `load_unlocked().unwrap_or_default()`, which silently wiped every
// existing secret whenever the read failed (insecure permissions,
// corrupt JSON, or any other I/O error).
#[cfg(unix)]
#[test]
fn file_store_set_does_not_clobber_secrets_when_perms_are_bad() {
use std::os::unix::fs::PermissionsExt;
let tmp = tempfile::tempdir().unwrap();
let path = tmp.path().join("secrets.json");
let original = "{\"entries\":{\"deepseek\":\"sk-keep\",\"nvidia\":\"nv-keep\"}}";
fs::write(&path, original).unwrap();
let mut perms = fs::metadata(&path).unwrap().permissions();
perms.set_mode(0o644);
fs::set_permissions(&path, perms).unwrap();
let store = FileKeyringStore::new(path.clone());
let err = store.set("openrouter", "or-new").unwrap_err();
assert!(
matches!(err, SecretsError::InsecurePermissions { .. }),
"set must surface the read error rather than overwriting; got: {err}"
);
let on_disk = fs::read_to_string(&path).unwrap();
assert_eq!(
on_disk, original,
"set must not modify the file when load_unlocked errored"
);
}
#[cfg(unix)]
#[test]
fn file_store_delete_does_not_clobber_secrets_when_perms_are_bad() {
use std::os::unix::fs::PermissionsExt;
let tmp = tempfile::tempdir().unwrap();
let path = tmp.path().join("secrets.json");
let original = "{\"entries\":{\"deepseek\":\"sk-keep\",\"nvidia\":\"nv-keep\"}}";
fs::write(&path, original).unwrap();
let mut perms = fs::metadata(&path).unwrap().permissions();
perms.set_mode(0o644);
fs::set_permissions(&path, perms).unwrap();
let store = FileKeyringStore::new(path.clone());
let err = store.delete("nvidia").unwrap_err();
assert!(
matches!(err, SecretsError::InsecurePermissions { .. }),
"delete must surface the read error rather than wiping the file; got: {err}"
);
let on_disk = fs::read_to_string(&path).unwrap();
assert_eq!(on_disk, original);
}
#[test]
fn file_store_set_does_not_clobber_secrets_when_json_is_corrupt() {
let tmp = tempfile::tempdir().unwrap();
let path = tmp.path().join("secrets.json");
// Corrupt JSON. Permissions ok where unix; on Windows the perm-check
// doesn't run so we exercise the json-error path directly.
fs::write(&path, "{ this is not valid json").unwrap();
#[cfg(unix)]
{
use std::os::unix::fs::PermissionsExt;
let mut perms = fs::metadata(&path).unwrap().permissions();
perms.set_mode(0o600);
fs::set_permissions(&path, perms).unwrap();
}
let store = FileKeyringStore::new(path.clone());
let err = store.set("deepseek", "sk-new").unwrap_err();
assert!(
matches!(err, SecretsError::Json(_)),
"set must surface the parse error rather than wiping the file; got: {err}"
);
let on_disk = fs::read_to_string(&path).unwrap();
assert_eq!(on_disk, "{ this is not valid json");
}
#[test]
fn file_store_set_still_creates_file_when_missing() {
// Regression guard: the #281 fix removed `unwrap_or_default()` from
// the load call. Make sure the original first-write-creates-the-file
// ergonomic still works — `load_unlocked` returns `Ok(default)` for
// a missing file, so the `?` should pass through cleanly.
let tmp = tempfile::tempdir().unwrap();
let path = tmp.path().join("nested").join("secrets.json");
let store = FileKeyringStore::new(path.clone());
store.set("deepseek", "sk-fresh").unwrap();
assert_eq!(store.get("deepseek").unwrap(), Some("sk-fresh".to_string()));
}
#[test]
fn file_store_default_path_uses_home() {
// We don't override HOME here (other tests do); we just check the
+1 -1
@@ -9,7 +9,7 @@ description = "Tool invocation lifecycle, schema validation, and scheduler paral
[dependencies]
anyhow.workspace = true
async-trait.workspace = true
deepseek-protocol = { path = "../protocol", version = "0.8.3" }
deepseek-protocol = { path = "../protocol", version = "0.8.4" }
serde.workspace = true
serde_json.workspace = true
tokio.workspace = true
+3 -3
@@ -14,9 +14,9 @@ path = "src/main.rs"
[dependencies]
anyhow = "1.0.100"
arboard = "3.4"
deepseek-tui-cli = { path = "../cli", version = "0.8.3" }
deepseek-secrets = { path = "../secrets", version = "0.8.3" }
deepseek-tools = { path = "../tools", version = "0.8.3" }
deepseek-tui-cli = { path = "../cli", version = "0.8.4" }
deepseek-secrets = { path = "../secrets", version = "0.8.4" }
deepseek-tools = { path = "../tools", version = "0.8.4" }
async-stream = "0.3.6"
async-trait = "0.1"
bytes = "1.11.0"
+72 -3
@@ -1178,11 +1178,21 @@ pub(super) fn parse_sse_chunk(
*thinking_started = false;
}
let block_index = *content_index;
let id = tc
.get("id")
.and_then(Value::as_str)
.unwrap_or("tool_call")
.to_string();
.map(str::to_string)
// Some upstream gateways (and the responses-API
// bridge) elide the `id` on the first chunk of a
// tool call. Falling back to a constant string
// collides when the model emits parallel tool
// calls in the same delta — every call ended up
// with the same id and downstream tool-result
// routing matched the first one twice. Index by
// the content-block position to keep the
// fallback unique within the response.
.unwrap_or_else(|| format!("call_{block_index}"));
let name = tc
.get("function")
.and_then(|f| f.get("name"))
@@ -1201,7 +1211,6 @@ pub(super) fn parse_sse_chunk(
})
});
let block_index = *content_index;
events.push(StreamEvent::ContentBlockStart {
index: block_index,
content_block: ContentBlockStart::ToolUse {
@@ -1471,4 +1480,64 @@ mod stream_decoder_tests {
"should yield InputJsonDelta carrying the tool args; got {events:?}"
);
}
/// Regression for the parallel-tool-calls-without-id collision (audit
/// Finding 8): when the upstream chunk omits the `id` field, the
/// fallback used to be the literal string `"tool_call"` for every
/// parallel call, so two tool calls in one delta ended up sharing an
/// id. Downstream routing then matched the first call's tool_result
/// twice and the second call hung. The fallback is now indexed by the
/// content-block position, keeping each call unique within the
/// response.
#[test]
fn decoder_assigns_unique_fallback_ids_to_parallel_tool_calls_missing_id() {
let events = decode_chunk(
r#"{"choices":[{"delta":{"tool_calls":[
{"index":0,"function":{"name":"grep_files","arguments":"{\"pattern\":\"a\"}"}},
{"index":1,"function":{"name":"read_file","arguments":"{\"path\":\"x\"}"}}
]}}]}"#,
);
let ids: Vec<&str> = events
.iter()
.filter_map(|e| match e {
StreamEvent::ContentBlockStart {
content_block: ContentBlockStart::ToolUse { id, .. },
..
} => Some(id.as_str()),
_ => None,
})
.collect();
assert_eq!(
ids.len(),
2,
"expected two tool-use blocks for parallel tool calls; got {events:?}"
);
assert_ne!(
ids[0], ids[1],
"parallel tool calls without upstream `id` must get distinct fallback ids; got {ids:?}"
);
}
#[test]
fn decoder_preserves_upstream_tool_call_id_when_present() {
// Counter-test to the fallback regression: when the upstream chunk
// does include `id`, we forward it verbatim — we shouldn't quietly
// rewrite ids the API gave us just because we have a fallback path.
let events = decode_chunk(
r#"{"choices":[{"delta":{"tool_calls":[{"index":0,"id":"call_xyz","function":{"name":"grep_files","arguments":"{}"}}]}}]}"#,
);
let id = events
.iter()
.find_map(|e| match e {
StreamEvent::ContentBlockStart {
content_block: ContentBlockStart::ToolUse { id, .. },
..
} => Some(id.as_str()),
_ => None,
})
.expect("tool-use block present");
assert_eq!(id, "call_xyz");
}
}
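
The fallback-id rule from the hunk above can be isolated into a small standalone sketch. The helper names `tool_call_id` and `assign_ids` are hypothetical, and the sketch assumes each tool call in a delta occupies the next content-block index, which the real decoder tracks through `content_index`.

```rust
/// Forward the upstream id verbatim when present; otherwise derive a
/// fallback from the content-block position so parallel calls differ.
fn tool_call_id(upstream_id: Option<&str>, block_index: usize) -> String {
    upstream_id
        .map(str::to_string)
        .unwrap_or_else(|| format!("call_{block_index}"))
}

/// Assign ids to the tool calls of one delta, assuming consecutive
/// content-block indices starting at `first_block_index`.
fn assign_ids(upstream_ids: &[Option<&str>], first_block_index: usize) -> Vec<String> {
    upstream_ids
        .iter()
        .enumerate()
        .map(|(offset, id)| tool_call_id(*id, first_block_index + offset))
        .collect()
}
```

With a constant fallback such as `"tool_call"`, two id-less calls would receive the same id; indexing by position keeps them distinct within one response.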
-1
@@ -1,4 +1,3 @@
// TODO(integrate): Wire command safety analysis into shell tool approval flow
#![allow(dead_code)]
//! Command safety analysis for shell execution
+3 -1
@@ -15,7 +15,9 @@ pub fn help(app: &mut App, topic: Option<&str>) -> CommandResult {
if let Some(cmd) = super::get_command_info(topic) {
let mut help = format!(
"{}\n\n {}\n\n Usage: {}",
cmd.name, cmd.description, cmd.usage
cmd.name,
cmd.description_for(app.ui_locale),
cmd.usage
);
if !cmd.aliases.is_empty() {
let _ = write!(help, "\n Aliases: {}", cmd.aliases.join(", "));
+286 -43
@@ -2,82 +2,81 @@
//! Debug commands: tokens, cost, system, context, undo, retry
use std::time::Instant;
use super::CommandResult;
use crate::compaction::estimate_input_tokens_conservative;
use crate::localization::{Locale, MessageId, tr};
use crate::models::{SystemPrompt, context_window_for_model};
use crate::tui::app::{App, AppAction};
use crate::tui::app::{App, AppAction, TurnCacheRecord};
use crate::tui::history::HistoryCell;
fn token_count(value: Option<u32>) -> String {
value.map_or_else(|| "not reported".to_string(), |tokens| tokens.to_string())
fn token_count(value: Option<u32>, locale: Locale) -> String {
value.map_or_else(
|| tr(locale, MessageId::CmdTokensNotReported).to_string(),
|tokens| tokens.to_string(),
)
}
fn active_context_summary(app: &App) -> String {
fn active_context_summary(app: &App, locale: Locale) -> String {
let estimated =
estimate_input_tokens_conservative(&app.api_messages, app.system_prompt.as_ref());
match context_window_for_model(&app.model) {
Some(window) => {
let used = estimated.min(window as usize);
let percent = (used as f64 / f64::from(window) * 100.0).clamp(0.0, 100.0);
format!("~{used} / {window} ({percent:.1}%)")
tr(locale, MessageId::CmdTokensContextWithWindow)
.replace("{used}", &used.to_string())
.replace("{window}", &window.to_string())
.replace("{percent}", &format!("{percent:.1}"))
}
None => format!("~{estimated} / unknown window"),
None => tr(locale, MessageId::CmdTokensContextUnknownWindow)
.replace("{estimated}", &estimated.to_string()),
}
}
fn cache_summary(app: &App) -> String {
fn cache_summary(app: &App, locale: Locale) -> String {
match (
app.last_prompt_cache_hit_tokens,
app.last_prompt_cache_miss_tokens,
) {
(Some(hit), Some(miss)) => format!("{hit} hit / {miss} miss"),
(Some(hit), None) => format!("{hit} hit / miss not reported"),
(None, Some(miss)) => format!("hit not reported / {miss} miss"),
(None, None) => "not reported".to_string(),
(Some(hit), Some(miss)) => tr(locale, MessageId::CmdTokensCacheBoth)
.replace("{hit}", &hit.to_string())
.replace("{miss}", &miss.to_string()),
(Some(hit), None) => {
tr(locale, MessageId::CmdTokensCacheHitOnly).replace("{hit}", &hit.to_string())
}
(None, Some(miss)) => {
tr(locale, MessageId::CmdTokensCacheMissOnly).replace("{miss}", &miss.to_string())
}
(None, None) => tr(locale, MessageId::CmdTokensNotReported).to_string(),
}
}
/// Show token usage for session
pub fn tokens(app: &mut App) -> CommandResult {
let locale = app.ui_locale;
let message_count = app.api_messages.len();
let chat_count = app.history.len();
CommandResult::message(format!(
"Token Usage:\n\
\n\
Active context: {}\n\
Last API input: {} (turn telemetry; may count repeated prefix across tool rounds)\n\
Last API output: {}\n\
Cache hit/miss: {} (telemetry/cost only)\n\
Cumulative tokens: {} (session usage telemetry)\n\
Approx session cost: ${:.4}\n\
API messages: {}\n\
Chat messages: {}\n\
Model: {}",
active_context_summary(app),
token_count(app.last_prompt_tokens),
token_count(app.last_completion_tokens),
cache_summary(app),
app.total_tokens,
app.session_cost,
message_count,
chat_count,
app.model,
))
let report = tr(locale, MessageId::CmdTokensReport)
.replace("{active}", &active_context_summary(app, locale))
.replace("{input}", &token_count(app.last_prompt_tokens, locale))
.replace("{output}", &token_count(app.last_completion_tokens, locale))
.replace("{cache}", &cache_summary(app, locale))
.replace("{total}", &app.total_tokens.to_string())
.replace("{cost}", &format!("{:.4}", app.session_cost))
.replace("{api_messages}", &message_count.to_string())
.replace("{chat_messages}", &chat_count.to_string())
.replace("{model}", &app.model);
CommandResult::message(report)
}
/// Show session cost breakdown
pub fn cost(app: &mut App) -> CommandResult {
CommandResult::message(format!(
"Session Cost:\n\
\n\
Approx total spent: ${:.4}\n\n\
Cost estimates are approximate and use provider usage telemetry when available.\n\n\
DeepSeek API Pricing:\n\
\n\
Pricing details are not configured in this CLI.",
app.session_cost,
))
let report = tr(app.ui_locale, MessageId::CmdCostReport)
.replace("{cost}", &format!("{:.4}", app.session_cost));
CommandResult::message(report)
}
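
The `tr(...).replace("{name}", ...)` chains above follow a simple named-placeholder pattern. A minimal sketch of that pattern, assuming a hypothetical helper `fill_template` that does not exist in the source:

```rust
/// Substitute `{name}` placeholders in a localized template.
/// Each pair is (placeholder name without braces, replacement value).
fn fill_template(template: &str, values: &[(&str, String)]) -> String {
    let mut out = template.to_string();
    for (name, value) in values {
        // `{{` and `}}` are literal braces, so this builds "{name}".
        out = out.replace(&format!("{{{name}}}"), value);
    }
    out
}
```

Because replacements are applied one after another, a substituted value that itself contains a later placeholder (for example a model name containing `{cost}`) would be expanded again. That is a property of this sketch and of any sequential `replace` chain, and is worth keeping in mind when values come from user input.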
/// Show current system prompt
@@ -121,6 +120,141 @@ pub fn context(_app: &mut App) -> CommandResult {
CommandResult::action(AppAction::OpenContextInspector)
}
/// Show per-turn DeepSeek prefix-cache telemetry for the last N turns (#263).
///
/// `arg` is parsed as a count override (default 10, capped at the ring size).
/// Renders a fixed-width table the user can paste into a bug report.
pub fn cache(app: &mut App, arg: Option<&str>) -> CommandResult {
let want = arg
.and_then(|s| s.trim().parse::<usize>().ok())
.unwrap_or(10);
let cap = app.turn_cache_history.len();
let count = want
.min(cap)
.min(crate::tui::app::App::TURN_CACHE_HISTORY_CAP);
if cap == 0 {
return CommandResult::message(tr(app.ui_locale, MessageId::CmdCacheNoData));
}
CommandResult::message(format_cache_history(app, count, app.ui_locale))
}
fn format_cache_history(app: &App, count: usize, locale: Locale) -> String {
let total = app.turn_cache_history.len();
let start = total.saturating_sub(count);
let rows: Vec<&TurnCacheRecord> = app.turn_cache_history.iter().skip(start).collect();
let mut totals_input: u64 = 0;
let mut totals_hit: u64 = 0;
let mut totals_miss: u64 = 0;
let mut header = tr(locale, MessageId::CmdCacheHeader)
.replace("{count}", &rows.len().to_string())
.replace("{total}", &total.to_string())
.replace("{model}", &app.model);
header.push_str(&"─".repeat(76));
header.push('\n');
header.push_str("turn in out hit miss replay ratio age\n");
header.push_str(&"─".repeat(76));
header.push('\n');
let now = Instant::now();
let mut body = String::new();
let absolute_start = total.saturating_sub(rows.len());
for (i, rec) in rows.iter().enumerate() {
let turn_index = absolute_start + i + 1;
totals_input += u64::from(rec.input_tokens);
let replay_cell = rec
.reasoning_replay_tokens
.map_or_else(|| "—".to_string(), |t| t.to_string());
let age = humanize_age(now.saturating_duration_since(rec.recorded_at));
// No cache telemetry → render `—` everywhere and don't pollute totals
// with inferred zeros. Some providers (and some routes inside DeepSeek)
// skip the cache fields; including a synthesized 0/N for those turns
// would make every aggregate ratio look broken.
let Some(hit) = rec.cache_hit_tokens else {
body.push_str(&format!(
"{turn:>4} {input:>5} {output:>5} {hit:>5} {miss:>5} {replay:>6} {ratio:>6} {age}\n",
turn = turn_index,
input = rec.input_tokens,
output = rec.output_tokens,
hit = "—",
miss = "—",
replay = replay_cell,
ratio = "—",
age = age,
));
continue;
};
let miss_reported = rec.cache_miss_tokens;
let miss = miss_reported.unwrap_or_else(|| rec.input_tokens.saturating_sub(hit));
let accounted = u64::from(hit) + u64::from(miss);
let ratio = if accounted == 0 {
"—".to_string()
} else {
format!("{:>5.1}%", 100.0 * f64::from(hit) / accounted as f64)
};
totals_hit += u64::from(hit);
totals_miss += u64::from(miss);
let miss_cell = match miss_reported {
Some(_) => format!("{miss}"),
None => format!("{miss}*"),
};
body.push_str(&format!(
"{turn:>4} {input:>5} {output:>5} {hit:>5} {miss:>5} {replay:>6} {ratio} {age}\n",
turn = turn_index,
input = rec.input_tokens,
output = rec.output_tokens,
hit = hit,
miss = miss_cell,
replay = replay_cell,
ratio = ratio,
age = age,
));
}
let totals_accounted = totals_hit + totals_miss;
let avg_ratio = if totals_accounted == 0 {
"—".to_string()
} else {
format!(
"{:.1}%",
100.0 * totals_hit as f64 / totals_accounted as f64
)
};
let mut footer = String::new();
footer.push_str(&"─".repeat(76));
footer.push('\n');
footer.push_str(
&tr(locale, MessageId::CmdCacheTotals)
.replace("{sum_in}", &totals_input.to_string())
.replace("{sum_hit}", &totals_hit.to_string())
.replace("{sum_miss}", &totals_miss.to_string())
.replace("{avg}", &avg_ratio),
);
footer.push_str(tr(locale, MessageId::CmdCacheFootnote));
footer.push_str(tr(locale, MessageId::CmdCacheAdvice));
format!("{header}{body}{footer}")
}
fn humanize_age(d: std::time::Duration) -> String {
let secs = d.as_secs();
if secs < 60 {
format!("{secs}s")
} else if secs < 3600 {
format!("{}m{:02}s", secs / 60, secs % 60)
} else {
format!("{}h{:02}m", secs / 3600, (secs % 3600) / 60)
}
}
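
The per-turn and aggregate hit-ratio rules in `format_cache_history` can be condensed into a standalone sketch. `TurnSample`, `turn_hit_miss`, and `aggregate_hit_ratio` are hypothetical names; the struct mirrors only the three `TurnCacheRecord` fields that the ratio depends on.

```rust
/// Hypothetical, trimmed-down stand-in for `TurnCacheRecord`.
struct TurnSample {
    input_tokens: u32,
    cache_hit_tokens: Option<u32>,
    cache_miss_tokens: Option<u32>,
}

/// Returns (hit, miss, miss_was_inferred), or `None` when the provider
/// reported no hit count, in which case the turn is skipped entirely.
fn turn_hit_miss(t: &TurnSample) -> Option<(u32, u32, bool)> {
    let hit = t.cache_hit_tokens?;
    let (miss, inferred) = match t.cache_miss_tokens {
        Some(miss) => (miss, false),
        // Miss not reported: infer it as input minus hit.
        None => (t.input_tokens.saturating_sub(hit), true),
    };
    Some((hit, miss, inferred))
}

/// Aggregate hit ratio in percent over the turns that have telemetry.
fn aggregate_hit_ratio(turns: &[TurnSample]) -> Option<f64> {
    let (mut hit_sum, mut miss_sum) = (0u64, 0u64);
    for (hit, miss, _) in turns.iter().filter_map(turn_hit_miss) {
        hit_sum += u64::from(hit);
        miss_sum += u64::from(miss);
    }
    let total = hit_sum + miss_sum;
    if total == 0 {
        None
    } else {
        Some(100.0 * hit_sum as f64 / total as f64)
    }
}
```

For the four turns used in the test further down (hit/miss of 3000/1000, 3000/3000, 2500 with inferred 2500, and one turn without telemetry), the sums are 8500 hit and 6500 miss, which gives 8500 / 15000 ≈ 56.7%.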
#[cfg(test)]
mod tests {
use super::*;
@@ -256,6 +390,115 @@ mod tests {
assert!(msg.contains("chars total"));
}
#[test]
fn cache_command_reports_no_data_before_first_turn() {
let mut app = create_test_app();
let result = cache(&mut app, None);
let msg = result.message.expect("cache produces a message");
assert!(msg.contains("no turns recorded yet"), "got: {msg}");
}
#[test]
fn cache_command_renders_recorded_turns_with_ratio() {
let mut app = create_test_app();
let now = Instant::now();
// Four turns: 75% hit, 50% hit, hit-only (miss inferred), and no telemetry.
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: 4_000,
output_tokens: 200,
cache_hit_tokens: Some(3_000),
cache_miss_tokens: Some(1_000),
reasoning_replay_tokens: None,
recorded_at: now,
});
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: 6_000,
output_tokens: 250,
cache_hit_tokens: Some(3_000),
cache_miss_tokens: Some(3_000),
reasoning_replay_tokens: Some(150),
recorded_at: now,
});
// Turn 3: hit reported but provider didn't report miss separately —
// infer miss = input - hit and mark with `*`.
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: 5_000,
output_tokens: 100,
cache_hit_tokens: Some(2_500),
cache_miss_tokens: None,
reasoning_replay_tokens: None,
recorded_at: now,
});
// Turn 4: no telemetry at all — must not pollute aggregate ratios.
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: 1_000,
output_tokens: 50,
cache_hit_tokens: None,
cache_miss_tokens: None,
reasoning_replay_tokens: None,
recorded_at: now,
});
let result = cache(&mut app, None);
let msg = result.message.expect("cache produces a message");
// Header reflects the shown and total row counts.
assert!(msg.contains("last 4 of 4 turn(s)"), "got: {msg}");
// Per-turn ratios are rendered.
assert!(msg.contains("75.0%"), "got: {msg}");
assert!(msg.contains("50.0%"), "got: {msg}");
// Turn 3: hit=2500, inferred miss=2500 → 50.0% with `*`-marked miss.
assert!(msg.contains("2500*"), "got: {msg}");
// Turn 4 (no telemetry) shows em-dashes and is excluded from totals.
// Aggregate over turns 1-3: hit=8500, miss=6500 → 56.7%.
assert!(msg.contains("avg hit ratio: 56.7%"), "got: {msg}");
// Footer guidance is present.
assert!(msg.contains("70%"), "got: {msg}");
}
#[test]
fn cache_command_count_argument_clamps_to_history() {
let mut app = create_test_app();
for _ in 0..3 {
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: 1_000,
output_tokens: 100,
cache_hit_tokens: Some(500),
cache_miss_tokens: Some(500),
reasoning_replay_tokens: None,
recorded_at: Instant::now(),
});
}
let result = cache(&mut app, Some("100"));
let msg = result.message.expect("cache produces a message");
// Asked for 100 turns, only 3 exist — should report "last 3 of 3".
assert!(msg.contains("last 3 of 3 turn(s)"), "got: {msg}");
}
#[test]
fn turn_cache_history_is_capped_at_50() {
let mut app = create_test_app();
for i in 0..(crate::tui::app::App::TURN_CACHE_HISTORY_CAP + 12) {
app.push_turn_cache_record(TurnCacheRecord {
input_tokens: i as u32,
output_tokens: 1,
cache_hit_tokens: Some(i as u32),
cache_miss_tokens: Some(0),
reasoning_replay_tokens: None,
recorded_at: Instant::now(),
});
}
assert_eq!(
app.turn_cache_history.len(),
crate::tui::app::App::TURN_CACHE_HISTORY_CAP
);
// Oldest record was evicted; newest record is still at the back.
assert_eq!(
app.turn_cache_history.back().unwrap().input_tokens,
(crate::tui::app::App::TURN_CACHE_HISTORY_CAP + 11) as u32
);
}
#[test]
fn test_context_shows_usage_stats() {
let mut app = create_test_app();
+175 -49
@@ -20,6 +20,7 @@ mod session;
mod skills;
mod task;
use crate::localization::{Locale, MessageId, tr};
use crate::tui::app::{App, AppAction};
/// Result of executing a command
@@ -74,13 +75,18 @@ impl CommandResult {
}
}
/// Command metadata for help and autocomplete
/// Command metadata for help and autocomplete.
///
/// The English description lives in `localization::english` (private), keyed
/// by `description_id`. Callers resolve a localized description through
/// [`CommandInfo::description_for`] which delegates to
/// [`crate::localization::tr`].
#[derive(Debug, Clone, Copy)]
pub struct CommandInfo {
pub name: &'static str,
pub aliases: &'static [&'static str],
pub description: &'static str,
pub usage: &'static str,
pub description_id: MessageId,
}
impl CommandInfo {
@@ -96,11 +102,16 @@ impl CommandInfo {
}
}
pub fn palette_description(&self) -> String {
pub fn description_for(&self, locale: Locale) -> &'static str {
tr(locale, self.description_id)
}
pub fn palette_description_for(&self, locale: Locale) -> String {
let desc = self.description_for(locale);
if self.aliases.is_empty() {
self.description.to_string()
desc.to_string()
} else {
format!("{} aliases: {}", self.description, self.aliases.join(", "))
format!("{} aliases: {}", desc, self.aliases.join(", "))
}
}
}
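
The `description_id` indirection can be illustrated with a reduced model of the lookup. Everything here is an assumption made for illustration: the `Locale` variants, the Spanish string, and the fall-back-to-English behaviour are hypothetical, and the real `localization` module is not shown in this diff. Only the two English descriptions are taken from the removed `description` fields.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Locale {
    En,
    Es, // hypothetical second locale
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum MessageId {
    CmdHelpDescription,
    CmdClearDescription,
}

/// English strings are the exhaustive baseline.
fn english(id: MessageId) -> &'static str {
    match id {
        MessageId::CmdHelpDescription => "Show help information",
        MessageId::CmdClearDescription => "Clear conversation history",
    }
}

/// Translations may be partial (assumption for this sketch).
fn translated(locale: Locale, id: MessageId) -> Option<&'static str> {
    match (locale, id) {
        (Locale::Es, MessageId::CmdHelpDescription) => Some("Mostrar información de ayuda"),
        _ => None,
    }
}

/// Resolve a message, falling back to English when no translation exists.
fn tr(locale: Locale, id: MessageId) -> &'static str {
    translated(locale, id).unwrap_or_else(|| english(id))
}

struct CommandInfo {
    name: &'static str,
    description_id: MessageId,
}

impl CommandInfo {
    fn description_for(&self, locale: Locale) -> &'static str {
        tr(locale, self.description_id)
    }
}
```

Storing a `MessageId` rather than a string keeps the command table `Copy` and `'static` while letting the locale be chosen at render time.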
@@ -111,266 +122,273 @@ pub const COMMANDS: &[CommandInfo] = &[
CommandInfo {
name: "help",
aliases: &["?"],
description: "Show help information",
usage: "/help [command]",
description_id: MessageId::CmdHelpDescription,
},
CommandInfo {
name: "clear",
aliases: &[],
description: "Clear conversation history",
usage: "/clear",
description_id: MessageId::CmdClearDescription,
},
CommandInfo {
name: "exit",
aliases: &["quit", "q"],
description: "Exit the application",
usage: "/exit",
description_id: MessageId::CmdExitDescription,
},
CommandInfo {
name: "model",
aliases: &[],
description: "Switch or view current model",
usage: "/model [name]",
description_id: MessageId::CmdModelDescription,
},
CommandInfo {
name: "models",
aliases: &[],
description: "List available models from API",
usage: "/models",
description_id: MessageId::CmdModelsDescription,
},
CommandInfo {
name: "provider",
aliases: &[],
description: "Switch or view the active LLM backend (deepseek | nvidia-nim)",
usage: "/provider [name]",
description_id: MessageId::CmdProviderDescription,
},
CommandInfo {
name: "queue",
aliases: &["queued"],
description: "View or edit queued messages",
usage: "/queue [list|edit <n>|drop <n>|clear]",
description_id: MessageId::CmdQueueDescription,
},
CommandInfo {
name: "subagents",
aliases: &["agents"],
description: "List sub-agent status",
usage: "/subagents",
description_id: MessageId::CmdSubagentsDescription,
},
CommandInfo {
name: "links",
aliases: &["dashboard", "api"],
description: "Show DeepSeek dashboard and docs links",
usage: "/links",
description_id: MessageId::CmdLinksDescription,
},
CommandInfo {
name: "home",
aliases: &["stats", "overview"],
description: "Show home dashboard with stats and quick actions",
usage: "/home",
description_id: MessageId::CmdHomeDescription,
},
CommandInfo {
name: "note",
aliases: &[],
description: "Append note to persistent notes file (.deepseek/notes.md)",
usage: "/note <text>",
description_id: MessageId::CmdNoteDescription,
},
CommandInfo {
name: "attach",
aliases: &["image", "media"],
description: "Attach image/video media; use @path for text files or directories",
usage: "/attach <path>",
description_id: MessageId::CmdAttachDescription,
},
CommandInfo {
name: "task",
aliases: &["tasks"],
description: "Manage background tasks",
usage: "/task [add <prompt>|list|show <id>|cancel <id>]",
description_id: MessageId::CmdTaskDescription,
},
CommandInfo {
name: "jobs",
aliases: &["job"],
description: "Inspect and control background shell jobs",
usage: "/jobs [list|show <id>|poll <id>|wait <id>|stdin <id> <input>|cancel <id>]",
description_id: MessageId::CmdJobsDescription,
},
CommandInfo {
name: "mcp",
aliases: &[],
description: "Open or manage MCP servers",
usage: "/mcp [init|add stdio <name> <command> [args...]|add http <name> <url>|enable <name>|disable <name>|remove <name>|validate|reload]",
description_id: MessageId::CmdMcpDescription,
},
// Session commands
CommandInfo {
name: "save",
aliases: &[],
description: "Save session to file",
usage: "/save [path]",
description_id: MessageId::CmdSaveDescription,
},
CommandInfo {
name: "sessions",
aliases: &["resume"],
description: "Open session picker",
usage: "/sessions",
description_id: MessageId::CmdSessionsDescription,
},
CommandInfo {
name: "load",
aliases: &[],
description: "Load session from file",
usage: "/load [path]",
description_id: MessageId::CmdLoadDescription,
},
CommandInfo {
name: "compact",
aliases: &[],
description: "Trigger context compaction to free up space (legacy; v0.6.6 prefers cycle restart)",
usage: "/compact",
description_id: MessageId::CmdCompactDescription,
},
CommandInfo {
name: "context",
aliases: &["ctx"],
description: "Open compact session context inspector",
usage: "/context",
description_id: MessageId::CmdContextDescription,
},
CommandInfo {
name: "cycles",
aliases: &[],
description: "List checkpoint-restart cycle handoffs in this session",
usage: "/cycles",
description_id: MessageId::CmdCyclesDescription,
},
CommandInfo {
name: "cycle",
aliases: &[],
description: "Show the carry-forward briefing for a specific cycle",
usage: "/cycle <n>",
description_id: MessageId::CmdCycleDescription,
},
CommandInfo {
name: "recall",
aliases: &[],
description: "Search prior cycle archives (BM25 over message text)",
usage: "/recall <query>",
description_id: MessageId::CmdRecallDescription,
},
CommandInfo {
name: "export",
aliases: &[],
description: "Export conversation to markdown",
usage: "/export [path]",
description_id: MessageId::CmdExportDescription,
},
// Config commands
CommandInfo {
name: "config",
aliases: &[],
description: "Open interactive configuration editor",
usage: "/config",
description_id: MessageId::CmdConfigDescription,
},
CommandInfo {
name: "yolo",
aliases: &[],
description: "Enable YOLO mode (shell + trust + auto-approve)",
usage: "/yolo",
description_id: MessageId::CmdYoloDescription,
},
CommandInfo {
name: "agent",
aliases: &[],
description: "Switch to agent mode",
usage: "/agent",
description_id: MessageId::CmdAgentDescription,
},
CommandInfo {
name: "plan",
aliases: &[],
description: "Switch to plan mode and review suggested implementation steps",
usage: "/plan",
description_id: MessageId::CmdPlanDescription,
},
CommandInfo {
name: "trust",
aliases: &[],
description: "Manage workspace trust and per-path allowlist (`/trust add <path>`, `/trust list`, `/trust on|off`)",
usage: "/trust [on|off|add <path>|remove <path>|list]",
description_id: MessageId::CmdTrustDescription,
},
CommandInfo {
name: "logout",
aliases: &[],
description: "Clear API key and return to setup",
usage: "/logout",
description_id: MessageId::CmdLogoutDescription,
},
// Debug commands
CommandInfo {
name: "tokens",
aliases: &[],
description: "Show token usage for session",
usage: "/tokens",
description_id: MessageId::CmdTokensDescription,
},
CommandInfo {
name: "system",
aliases: &[],
description: "Show current system prompt",
usage: "/system",
description_id: MessageId::CmdSystemDescription,
},
CommandInfo {
name: "undo",
aliases: &[],
description: "Remove last message pair",
usage: "/undo",
description_id: MessageId::CmdUndoDescription,
},
CommandInfo {
name: "retry",
aliases: &[],
description: "Retry the last request",
usage: "/retry",
description_id: MessageId::CmdRetryDescription,
},
CommandInfo {
name: "init",
aliases: &[],
description: "Generate AGENTS.md for project",
usage: "/init",
description_id: MessageId::CmdInitDescription,
},
CommandInfo {
name: "settings",
aliases: &[],
description: "Show persistent settings",
usage: "/settings",
description_id: MessageId::CmdSettingsDescription,
},
CommandInfo {
name: "statusline",
aliases: &["status"],
description: "Configure which items appear in the footer",
usage: "/statusline",
description_id: MessageId::CmdStatuslineDescription,
},
// Skills commands
CommandInfo {
name: "skills",
aliases: &[],
description: "List local skills (or --remote to browse the curated registry)",
usage: "/skills [--remote]",
description_id: MessageId::CmdSkillsDescription,
},
CommandInfo {
name: "skill",
aliases: &[],
description: "Activate a skill, or install/update/uninstall/trust a community skill",
usage: "/skill <name|install <spec>|update <name>|uninstall <name>|trust <name>>",
description_id: MessageId::CmdSkillDescription,
},
CommandInfo {
name: "review",
aliases: &[],
description: "Run a structured code review on a file, diff, or PR",
usage: "/review <target>",
description_id: MessageId::CmdReviewDescription,
},
CommandInfo {
name: "restore",
aliases: &[],
description: "Roll back the workspace to a prior pre/post-turn snapshot. With no arg, lists recent snapshots.",
usage: "/restore [N]",
description_id: MessageId::CmdRestoreDescription,
},
// RLM command
CommandInfo {
name: "rlm",
aliases: &["recursive"],
description: "Recursive Language Model (RLM) turn — store the prompt in a Python REPL and let the model write code to process it, with `llm_query()` / `sub_rlm()` for sub-LLM calls.",
usage: "/rlm <prompt>",
description_id: MessageId::CmdRlmDescription,
},
// Debug/cost command
CommandInfo {
name: "cost",
aliases: &[],
description: "Show session cost breakdown",
usage: "/cost",
description_id: MessageId::CmdCostDescription,
},
// Cache telemetry (#263)
CommandInfo {
name: "cache",
aliases: &[],
usage: "/cache [count]",
description_id: MessageId::CmdCacheDescription,
},
];
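The registry entries above now carry a `description_id`, and the updated test in this file calls `description_for(Locale::En)`. A minimal sketch of how such a lookup might resolve, assuming `tr` falls back to English when a locale table has no entry. The `Option` return type of the per-locale tables suggests this, but the body of `tr` is not shown in this diff. All types below are simplified, hypothetical stand-ins rather than the crate's real definitions.

```rust
// Simplified stand-ins for the real `Locale`, `MessageId`, and `CommandInfo`;
// only the shape of the lookup is illustrated.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Locale {
    En,
    Ja,
}

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum MessageId {
    CmdQueueDescription,
    CmdCacheDescription,
}

// The English table is total: every id has a string.
fn english(id: MessageId) -> &'static str {
    match id {
        MessageId::CmdQueueDescription => "View or edit queued messages",
        MessageId::CmdCacheDescription => {
            "Show DeepSeek prefix-cache hit/miss stats for the last N turns"
        }
    }
}

// Deliberately partial here so the fallback path is exercised.
fn japanese(id: MessageId) -> Option<&'static str> {
    match id {
        MessageId::CmdQueueDescription => Some("キューされたメッセージを確認・編集"),
        MessageId::CmdCacheDescription => None,
    }
}

/// Assumed behavior: use the locale's string if present, else English.
pub fn tr(locale: Locale, id: MessageId) -> &'static str {
    match locale {
        Locale::En => english(id),
        Locale::Ja => japanese(id).unwrap_or_else(|| english(id)),
    }
}

pub struct CommandInfo {
    pub name: &'static str,
    pub description_id: MessageId,
}

impl CommandInfo {
    /// Resolve the description at display time instead of storing it inline.
    pub fn description_for(&self, locale: Locale) -> &'static str {
        tr(locale, self.description_id)
    }
}

pub const COMMANDS: &[CommandInfo] = &[
    CommandInfo { name: "queue", description_id: MessageId::CmdQueueDescription },
    CommandInfo { name: "cache", description_id: MessageId::CmdCacheDescription },
];
```

Keeping only an id in the registry means a new command cannot ship with an English-only inline string by accident: the compiler forces an entry in the exhaustive English `match`.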
@@ -423,6 +441,7 @@ pub fn execute(cmd: &str, app: &mut App) -> CommandResult {
// Debug commands
"tokens" => debug::tokens(app),
"cost" => debug::cost(app),
"cache" => debug::cache(app, arg),
"system" => debug::system_prompt(app),
"context" | "ctx" => debug::context(app),
"undo" => debug::undo(app),
@@ -712,7 +731,7 @@ mod tests {
.find(|cmd| cmd.name == "context")
.expect("context command should exist");
assert_eq!(context.aliases, &["ctx"]);
- assert!(context.description.contains("inspector"));
+ assert!(context.description_for(Locale::En).contains("inspector"));
let mut app = create_test_app();
let result = execute("/ctx", &mut app);
@@ -764,6 +783,113 @@ mod tests {
assert!(deepseek_result.action.is_none());
}
/// Build an App scoped to an isolated tempdir so dispatch-side-effects
/// (e.g. `/init` writing AGENTS.md, `/export` writing chat transcripts)
/// don't pollute the repo working tree when the smoke tests run.
fn create_isolated_test_app() -> (App, tempfile::TempDir) {
let tmpdir = tempfile::TempDir::new().expect("tempdir for smoke test");
let workspace = tmpdir.path().to_path_buf();
let options = TuiOptions {
model: "deepseek-v4-pro".to_string(),
workspace: workspace.clone(),
allow_shell: false,
use_alt_screen: true,
use_mouse_capture: false,
use_bracketed_paste: true,
max_subagents: 1,
skills_dir: workspace.join("skills"),
memory_path: workspace.join("memory.md"),
notes_path: workspace.join("notes.txt"),
mcp_config_path: workspace.join("mcp.json"),
use_memory: false,
start_in_agent_mode: false,
skip_onboarding: true,
yolo: false,
resume_session_id: None,
};
let app = App::new(options, &Config::default());
(app, tmpdir)
}
/// `/save` and `/export` default their output paths to `cwd`-relative
/// filenames when no arg is supplied, which would scribble files into
/// `crates/tui/` when CI runs from there. Pass an explicit tempdir-
/// relative path for those two so the dispatch test stays sandboxed.
fn invocation_for(command_name: &str, alias_or_name: &str, tmpdir: &std::path::Path) -> String {
match command_name {
"save" => format!("/{alias_or_name} {}", tmpdir.join("session.json").display()),
"export" => format!("/{alias_or_name} {}", tmpdir.join("chat.md").display()),
_ => format!("/{alias_or_name}"),
}
}
/// `/restore` is covered by its own dedicated tests in
/// `commands/restore.rs` that serialize on the global env mutex via
/// `scoped_home` (snapshot repo init shells out to git, which races
/// against parallel-running tests). Skip it here so this smoke test
/// stays parallel-safe.
fn skip_in_dispatch_smoke(name: &str) -> bool {
name == "restore"
}
/// Smoke test: every entry in `COMMANDS` must dispatch to a real handler.
/// A dispatch miss surfaces as the fall-through `Unknown command:` error
/// message in `execute`. This catches the case where a new command is
/// added to `COMMANDS` (so it shows up in `/help` and the palette) but
/// the matching arm in `execute` is forgotten — the user would type the
/// command, see it autocomplete, and then get an unhelpful "did you
/// mean" suggestion. Also catches panics in handlers because the test
/// runner unwinds the panic and reports the offending command.
#[test]
fn every_registered_command_dispatches_to_a_handler() {
for command in COMMANDS {
if skip_in_dispatch_smoke(command.name) {
continue;
}
let (mut app, tmpdir) = create_isolated_test_app();
let invocation = invocation_for(command.name, command.name, tmpdir.path());
let result = execute(&invocation, &mut app);
if let Some(msg) = &result.message {
assert!(
!msg.contains("Unknown command"),
"/{} fell through to the unknown-command branch: {msg}",
command.name,
);
}
}
}
/// Same check, but for declared aliases — `/q` should not fall through
/// just because the registry lists it as an alias of `/exit`.
#[test]
fn every_command_alias_dispatches_to_a_handler() {
for command in COMMANDS {
if skip_in_dispatch_smoke(command.name) {
continue;
}
for alias in command.aliases {
let (mut app, tmpdir) = create_isolated_test_app();
let invocation = invocation_for(command.name, alias, tmpdir.path());
let result = execute(&invocation, &mut app);
if let Some(msg) = &result.message {
assert!(
!msg.contains("Unknown command"),
"/{alias} (alias of /{}) fell through to unknown: {msg}",
command.name,
);
}
}
}
}
#[test]
fn unknown_command_suggests_nearest_match() {
let mut app = create_test_app();
+19 -9
@@ -1195,15 +1195,25 @@ impl Engine {
ctx = ctx.with_network_policy(decider.clone());
}
- if mode == AppMode::Yolo {
- ctx.with_elevated_sandbox_policy(crate::sandbox::SandboxPolicy::WorkspaceWrite {
- writable_roots: vec![self.session.workspace.clone()],
- network_access: true,
- exclude_tmpdir: false,
- exclude_slash_tmp: false,
- })
- } else {
- ctx
+ match mode {
+ // Plan mode is read-only investigation; the shell tool is not
+ // registered, so leaving the sandbox policy at the seatbelt-strict
+ // default is fine.
+ AppMode::Plan => ctx,
+ // Agent and Yolo both register the shell tool. The sandbox-default
+ // policy denies all outbound network — including DNS — which
+ // breaks ordinary developer commands (cargo fetch, npm install,
+ // curl, yt-dlp, …) without buying the user any safety the
+ // application-level NetworkPolicy / approval flow doesn't already
+ // provide. Elevate to workspace-write + network. (#273)
+ AppMode::Agent | AppMode::Yolo => {
+ ctx.with_elevated_sandbox_policy(crate::sandbox::SandboxPolicy::WorkspaceWrite {
+ writable_roots: vec![self.session.workspace.clone()],
+ network_access: true,
+ exclude_tmpdir: false,
+ exclude_slash_tmp: false,
+ })
+ }
}
}
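The mode-to-policy mapping in this hunk can be sketched in isolation. The types below are simplified, hypothetical stand-ins: the real `SandboxPolicy::WorkspaceWrite` has more fields, and the real code attaches the policy through `ctx.with_elevated_sandbox_policy`. The function name `elevated_policy_for` is invented for illustration.

```rust
use std::path::PathBuf;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum AppMode {
    Plan,
    Agent,
    Yolo,
}

// Reduced to the two fields relevant to the network question.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum SandboxPolicy {
    WorkspaceWrite {
        writable_roots: Vec<PathBuf>,
        network_access: bool,
    },
}

impl SandboxPolicy {
    pub fn has_network_access(&self) -> bool {
        match self {
            SandboxPolicy::WorkspaceWrite { network_access, .. } => *network_access,
        }
    }
}

/// `None` means "leave the strict default in place".
pub fn elevated_policy_for(mode: AppMode, workspace: PathBuf) -> Option<SandboxPolicy> {
    match mode {
        // No shell tool is registered in Plan mode, so nothing to elevate.
        AppMode::Plan => None,
        // Both shell-capable modes get workspace-write plus network.
        AppMode::Agent | AppMode::Yolo => Some(SandboxPolicy::WorkspaceWrite {
            writable_roots: vec![workspace],
            network_access: true,
        }),
    }
}
```

Using an exhaustive `match` instead of `if mode == AppMode::Yolo` means a future fourth mode fails to compile until someone decides which policy it should get.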
+96
@@ -3,6 +3,7 @@ use super::*;
use super::context::WORKING_SET_SUMMARY_MARKER;
use crate::models::SystemBlock;
use serde_json::json;
use std::collections::HashSet;
use std::fs;
use std::path::PathBuf;
use std::time::Instant;
@@ -258,6 +259,63 @@ fn model_tool_catalog_keeps_everything_loaded_in_yolo_mode() {
assert!(catalog.iter().all(|tool| tool.defer_loading == Some(false)));
}
#[test]
fn model_tool_catalog_sorts_each_partition_for_prefix_cache_stability() {
// Regression for #263: deterministic byte order of the tools array is a
// hard requirement for DeepSeek's KV prefix cache. Built-ins stay as a
// contiguous prefix; MCP tools follow. Within each partition: alphabetical.
let catalog = build_model_tool_catalog(
vec![
api_tool("read_file"),
api_tool("apply_patch"),
api_tool("exec_shell"),
],
vec![api_tool("mcp_zoo_b"), api_tool("mcp_aardvark_a")],
AppMode::Yolo,
);
let names: Vec<&str> = catalog.iter().map(|t| t.name.as_str()).collect();
assert_eq!(
names,
vec![
"apply_patch",
"exec_shell",
"read_file",
"mcp_aardvark_a",
"mcp_zoo_b",
],
"built-ins must be alphabetical and contiguous; MCP tools follow, alphabetical",
);
}
#[test]
fn active_tool_list_pushes_deferred_activations_to_the_tail() {
// Regression for #263: when ToolSearch activates a deferred tool mid-
// session, it must NOT be inserted at its catalog index — that would
// shift every later tool's byte offset and bust the cached prefix.
// Deferred-but-now-active tools belong at the tail.
let mut a = api_tool("a_load_now");
a.defer_loading = Some(false);
let mut search = api_tool("search_via_toolsearch");
search.defer_loading = Some(true);
let mut b = api_tool("b_load_now");
b.defer_loading = Some(false);
let catalog = vec![a, search, b];
let active: HashSet<String> = ["a_load_now", "search_via_toolsearch", "b_load_now"]
.into_iter()
.map(String::from)
.collect();
let listed = active_tools_for_step(&catalog, &active, false);
let names: Vec<&str> = listed.iter().map(|t| t.name.as_str()).collect();
assert_eq!(
names,
vec!["a_load_now", "b_load_now", "search_via_toolsearch"],
"deferred-but-active tools must come after always-loaded tools",
);
}
#[test]
fn turn_tool_registry_builder_keeps_plan_mode_read_only_for_files() {
let (engine, _handle) = Engine::new(EngineConfig::default(), &Config::default());
@@ -290,6 +348,44 @@ fn agent_mode_can_build_auto_approved_tool_context() {
assert!(engine.build_tool_context(AppMode::Yolo, false).auto_approve);
}
#[test]
fn agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network() {
// Regression for #273: the seatbelt-default policy denies all outbound
// network (including DNS), which broke `curl`, `yt-dlp`, package managers,
// and similar shell commands in Agent mode. Elevation must include
// network access so the application-level NetworkPolicy stays the only
// outbound boundary.
let (engine, _handle) = Engine::new(EngineConfig::default(), &Config::default());
let agent_ctx = engine.build_tool_context(AppMode::Agent, false);
let agent_policy = agent_ctx
.elevated_sandbox_policy
.as_ref()
.expect("Agent mode should elevate the sandbox policy");
assert!(
agent_policy.has_network_access(),
"Agent mode must allow shell network access; got {agent_policy:?}",
);
let yolo_ctx = engine.build_tool_context(AppMode::Yolo, false);
assert!(
yolo_ctx
.elevated_sandbox_policy
.as_ref()
.expect("Yolo mode should elevate the sandbox policy")
.has_network_access(),
);
// Plan mode is read-only investigation and does not register the shell
// tool, so it intentionally leaves the policy at the strict default.
assert!(
engine
.build_tool_context(AppMode::Plan, false)
.elevated_sandbox_policy
.is_none(),
);
}
#[tokio::test]
async fn session_update_preserves_reasoning_tool_only_turn() {
let (mut engine, handle) = Engine::new(EngineConfig::default(), &Config::default());
+27 -5
@@ -105,6 +105,14 @@ pub(super) fn build_model_tool_catalog(
) -> Vec<Tool> {
apply_native_tool_deferral(&mut native_tools, mode);
apply_mcp_tool_deferral(&mut mcp_tools, mode);
// Sort each partition by name for prefix-cache stability (#263). The
// upstream `to_api_tools()` already sorts the registry's HashMap output;
// this catalog is built from caller-supplied Vecs which the test harness
// and (future) caller refactors may not pre-sort. Built-ins stay as a
// contiguous prefix ahead of MCP tools so adding/removing an MCP tool
// never shifts a built-in's position.
native_tools.sort_by(|a, b| a.name.cmp(&b.name));
mcp_tools.sort_by(|a, b| a.name.cmp(&b.name));
native_tools.extend(mcp_tools);
native_tools
}
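To illustrate why each partition is sorted separately rather than sorting the merged list, here is a small self-contained comparison. `Tool` is reduced to a name, and `build_catalog` / `build_catalog_global_sort` are hypothetical simplifications (the deferral steps of the real `build_model_tool_catalog` are omitted).

```rust
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Tool {
    pub name: String,
}

pub fn tool(name: &str) -> Tool {
    Tool { name: name.to_string() }
}

/// Partitioned ordering: built-ins sorted, then MCP tools sorted.
pub fn build_catalog(mut native: Vec<Tool>, mut mcp: Vec<Tool>) -> Vec<Tool> {
    native.sort_by(|a, b| a.name.cmp(&b.name));
    mcp.sort_by(|a, b| a.name.cmp(&b.name));
    native.extend(mcp);
    native
}

/// Counter-example: one global sort lets an MCP tool land ahead of built-ins.
pub fn build_catalog_global_sort(mut native: Vec<Tool>, mcp: Vec<Tool>) -> Vec<Tool> {
    native.extend(mcp);
    native.sort_by(|a, b| a.name.cmp(&b.name));
    native
}

pub fn names(tools: &[Tool]) -> Vec<&str> {
    tools.iter().map(|t| t.name.as_str()).collect()
}
```

With an MCP tool named `aaa_server_tool`, the global sort places it first and shifts every built-in, while the partitioned build leaves the built-in prefix identical to a catalog with no MCP tools at all.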
@@ -188,11 +196,25 @@ pub(super) fn initial_active_tools(catalog: &[Tool]) -> HashSet<String> {
}
fn active_tool_list_from_catalog(catalog: &[Tool], active: &HashSet<String>) -> Vec<Tool> {
- catalog
- .iter()
- .filter(|tool| active.contains(&tool.name))
- .cloned()
- .collect()
+ // Two-pass for prefix-cache stability (#263). Always-loaded tools come
+ // first in their stable catalog order; tools that started life deferred
+ // and were activated mid-conversation by ToolSearch get appended at the
+ // tail. Otherwise activating a deferred tool shifts every later tool's
+ // byte offset and busts the cached prefix from that point onwards.
+ let mut head: Vec<Tool> = Vec::new();
+ let mut tail: Vec<Tool> = Vec::new();
+ for tool in catalog {
+ if !active.contains(&tool.name) {
+ continue;
+ }
+ if tool.defer_loading.unwrap_or(false) {
+ tail.push(tool.clone());
+ } else {
+ head.push(tool.clone());
+ }
+ }
+ head.extend(tail);
+ head
}
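The effect of the two-pass ordering on the cached prefix can be made concrete by counting how many leading tool entries survive a mid-session activation. This is a rough sketch under simplifying assumptions: `Tool` is reduced to a name and a plain `bool`, and `shared_prefix_len` compares tool names rather than serialized bytes. All function names here are hypothetical.

```rust
use std::collections::HashSet;

#[derive(Clone, Debug)]
pub struct Tool {
    pub name: String,
    pub defer_loading: bool,
}

pub fn tool(name: &str, defer_loading: bool) -> Tool {
    Tool { name: name.to_string(), defer_loading }
}

pub fn set(names: &[&str]) -> HashSet<String> {
    names.iter().map(|n| n.to_string()).collect()
}

/// Old behavior: active tools in plain catalog order.
pub fn single_pass(catalog: &[Tool], active: &HashSet<String>) -> Vec<String> {
    catalog
        .iter()
        .filter(|t| active.contains(&t.name))
        .map(|t| t.name.clone())
        .collect()
}

/// New behavior: always-loaded tools first, activated deferred tools last.
pub fn two_pass(catalog: &[Tool], active: &HashSet<String>) -> Vec<String> {
    // `partition` returns (matching, non-matching), i.e. (deferred, always-loaded).
    let (tail, head): (Vec<&Tool>, Vec<&Tool>) = catalog
        .iter()
        .filter(|t| active.contains(&t.name))
        .partition(|t| t.defer_loading);
    head.into_iter().chain(tail).map(|t| t.name.clone()).collect()
}

/// Number of leading entries the two lists have in common.
pub fn shared_prefix_len(a: &[String], b: &[String]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}
```

For the catalog used in the regression test (`a_load_now`, deferred `search_via_toolsearch`, `b_load_now`), activating the deferred tool leaves only one shared leading entry under the single-pass ordering but keeps both always-loaded entries in place under the two-pass ordering.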
pub(super) fn active_tools_for_step(
-1
@@ -1,4 +1,3 @@
- // TODO(integrate): Wire exec-policy into shell tool — tracked as future work
#![allow(dead_code)]
#![allow(unused_imports)]
+621
@@ -214,6 +214,74 @@ pub enum MessageId {
HelpFooterMove,
HelpFooterJump,
HelpFooterClose,
CmdAgentDescription,
CmdAttachDescription,
CmdCacheDescription,
CmdClearDescription,
CmdCompactDescription,
CmdConfigDescription,
CmdContextDescription,
CmdCostDescription,
CmdCycleDescription,
CmdCyclesDescription,
CmdExitDescription,
CmdExportDescription,
CmdHelpDescription,
CmdHomeDescription,
CmdInitDescription,
CmdJobsDescription,
CmdLinksDescription,
CmdLoadDescription,
CmdLogoutDescription,
CmdMcpDescription,
CmdModelDescription,
CmdModelsDescription,
CmdNoteDescription,
CmdPlanDescription,
CmdProviderDescription,
CmdQueueDescription,
CmdRecallDescription,
CmdRestoreDescription,
CmdRetryDescription,
CmdReviewDescription,
CmdRlmDescription,
CmdSaveDescription,
CmdSessionsDescription,
CmdSettingsDescription,
CmdSkillDescription,
CmdSkillsDescription,
CmdStatuslineDescription,
CmdSubagentsDescription,
CmdSystemDescription,
CmdTaskDescription,
CmdTokensDescription,
CmdTrustDescription,
CmdUndoDescription,
CmdYoloDescription,
CmdCacheAdvice,
CmdCacheFootnote,
CmdCacheHeader,
CmdCacheNoData,
CmdCacheTotals,
CmdCostReport,
CmdTokensCacheBoth,
CmdTokensCacheHitOnly,
CmdTokensCacheMissOnly,
CmdTokensContextUnknownWindow,
CmdTokensContextWithWindow,
CmdTokensNotReported,
CmdTokensReport,
FooterAgentSingular,
FooterAgentsPlural,
FooterPressCtrlCAgain,
FooterWorking,
HelpSectionActions,
HelpSectionClipboard,
HelpSectionEditing,
HelpSectionHelp,
HelpSectionModes,
HelpSectionNavigation,
HelpSectionSessions,
}
#[allow(dead_code)]
@@ -245,6 +313,74 @@ pub const ALL_MESSAGE_IDS: &[MessageId] = &[
MessageId::HelpFooterMove,
MessageId::HelpFooterJump,
MessageId::HelpFooterClose,
MessageId::CmdAgentDescription,
MessageId::CmdAttachDescription,
MessageId::CmdCacheDescription,
MessageId::CmdClearDescription,
MessageId::CmdCompactDescription,
MessageId::CmdConfigDescription,
MessageId::CmdContextDescription,
MessageId::CmdCostDescription,
MessageId::CmdCycleDescription,
MessageId::CmdCyclesDescription,
MessageId::CmdExitDescription,
MessageId::CmdExportDescription,
MessageId::CmdHelpDescription,
MessageId::CmdHomeDescription,
MessageId::CmdInitDescription,
MessageId::CmdJobsDescription,
MessageId::CmdLinksDescription,
MessageId::CmdLoadDescription,
MessageId::CmdLogoutDescription,
MessageId::CmdMcpDescription,
MessageId::CmdModelDescription,
MessageId::CmdModelsDescription,
MessageId::CmdNoteDescription,
MessageId::CmdPlanDescription,
MessageId::CmdProviderDescription,
MessageId::CmdQueueDescription,
MessageId::CmdRecallDescription,
MessageId::CmdRestoreDescription,
MessageId::CmdRetryDescription,
MessageId::CmdReviewDescription,
MessageId::CmdRlmDescription,
MessageId::CmdSaveDescription,
MessageId::CmdSessionsDescription,
MessageId::CmdSettingsDescription,
MessageId::CmdSkillDescription,
MessageId::CmdSkillsDescription,
MessageId::CmdStatuslineDescription,
MessageId::CmdSubagentsDescription,
MessageId::CmdSystemDescription,
MessageId::CmdTaskDescription,
MessageId::CmdTokensDescription,
MessageId::CmdTrustDescription,
MessageId::CmdUndoDescription,
MessageId::CmdYoloDescription,
MessageId::CmdCacheAdvice,
MessageId::CmdCacheFootnote,
MessageId::CmdCacheHeader,
MessageId::CmdCacheNoData,
MessageId::CmdCacheTotals,
MessageId::CmdCostReport,
MessageId::CmdTokensCacheBoth,
MessageId::CmdTokensCacheHitOnly,
MessageId::CmdTokensCacheMissOnly,
MessageId::CmdTokensContextUnknownWindow,
MessageId::CmdTokensContextWithWindow,
MessageId::CmdTokensNotReported,
MessageId::CmdTokensReport,
MessageId::FooterAgentSingular,
MessageId::FooterAgentsPlural,
MessageId::FooterPressCtrlCAgain,
MessageId::FooterWorking,
MessageId::HelpSectionActions,
MessageId::HelpSectionClipboard,
MessageId::HelpSectionEditing,
MessageId::HelpSectionHelp,
MessageId::HelpSectionModes,
MessageId::HelpSectionNavigation,
MessageId::HelpSectionSessions,
];
pub fn tr(locale: Locale, id: MessageId) -> &'static str {
@@ -396,6 +532,130 @@ fn english(id: MessageId) -> &'static str {
MessageId::HelpFooterMove => " Up/Down move ",
MessageId::HelpFooterJump => " PgUp/PgDn jump ",
MessageId::HelpFooterClose => " Esc close ",
MessageId::CmdAgentDescription => "Switch to agent mode",
MessageId::CmdAttachDescription => {
"Attach image/video media; use @path for text files or directories"
}
MessageId::CmdCacheDescription => {
"Show DeepSeek prefix-cache hit/miss stats for the last N turns"
}
MessageId::CmdClearDescription => "Clear conversation history",
MessageId::CmdCompactDescription => {
"Trigger context compaction to free up space (legacy; v0.6.6 prefers cycle restart)"
}
MessageId::CmdConfigDescription => "Open interactive configuration editor",
MessageId::CmdContextDescription => "Open compact session context inspector",
MessageId::CmdCostDescription => "Show session cost breakdown",
MessageId::CmdCycleDescription => "Show the carry-forward briefing for a specific cycle",
MessageId::CmdCyclesDescription => "List checkpoint-restart cycle handoffs in this session",
MessageId::CmdExitDescription => "Exit the application",
MessageId::CmdExportDescription => "Export conversation to markdown",
MessageId::CmdHelpDescription => "Show help information",
MessageId::CmdHomeDescription => "Show home dashboard with stats and quick actions",
MessageId::CmdInitDescription => "Generate AGENTS.md for project",
MessageId::CmdJobsDescription => "Inspect and control background shell jobs",
MessageId::CmdLinksDescription => "Show DeepSeek dashboard and docs links",
MessageId::CmdLoadDescription => "Load session from file",
MessageId::CmdLogoutDescription => "Clear API key and return to setup",
MessageId::CmdMcpDescription => "Open or manage MCP servers",
MessageId::CmdModelDescription => "Switch or view current model",
MessageId::CmdModelsDescription => "List available models from API",
MessageId::CmdNoteDescription => {
"Append note to persistent notes file (.deepseek/notes.md)"
}
MessageId::CmdPlanDescription => {
"Switch to plan mode and review suggested implementation steps"
}
MessageId::CmdProviderDescription => {
"Switch or view the active LLM backend (deepseek | nvidia-nim)"
}
MessageId::CmdQueueDescription => "View or edit queued messages",
MessageId::CmdRecallDescription => "Search prior cycle archives (BM25 over message text)",
MessageId::CmdRestoreDescription => {
"Roll back the workspace to a prior pre/post-turn snapshot. With no arg, lists recent snapshots."
}
MessageId::CmdRetryDescription => "Retry the last request",
MessageId::CmdReviewDescription => "Run a structured code review on a file, diff, or PR",
MessageId::CmdRlmDescription => {
"Recursive Language Model (RLM) turn — store the prompt in a Python REPL and let the model write code to process it, with `llm_query()` / `sub_rlm()` for sub-LLM calls."
}
MessageId::CmdSaveDescription => "Save session to file",
MessageId::CmdSessionsDescription => "Open session picker",
MessageId::CmdSettingsDescription => "Show persistent settings",
MessageId::CmdSkillDescription => {
"Activate a skill, or install/update/uninstall/trust a community skill"
}
MessageId::CmdSkillsDescription => {
"List local skills (or --remote to browse the curated registry)"
}
MessageId::CmdStatuslineDescription => "Configure which items appear in the footer",
MessageId::CmdSubagentsDescription => "List sub-agent status",
MessageId::CmdSystemDescription => "Show current system prompt",
MessageId::CmdTaskDescription => "Manage background tasks",
MessageId::CmdTokensDescription => "Show token usage for session",
MessageId::CmdTrustDescription => {
"Manage workspace trust and per-path allowlist (`/trust add <path>`, `/trust list`, `/trust on|off`)"
}
MessageId::CmdUndoDescription => "Remove last message pair",
MessageId::CmdYoloDescription => "Enable YOLO mode (shell + trust + auto-approve)",
MessageId::CmdCacheAdvice => {
"Hit ratios above ~70% after the third turn indicate a stable cache prefix;\n\
lower than that on long sessions suggests prefix churn worth investigating (#263)."
}
MessageId::CmdCacheFootnote => {
"* miss inferred from input hit when the provider did not report it explicitly.\n"
}
MessageId::CmdCacheHeader => {
"Cache telemetry — last {count} of {total} turn(s) (model: {model})\n"
}
MessageId::CmdCacheNoData => {
"Cache history: no turns recorded yet.\n\n\
DeepSeek surfaces `prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` \
on every API turn where the model supports it (V4 family). Run a turn \
and try /cache again."
}
MessageId::CmdCacheTotals => {
"Σ in: {sum_in} Σ hit: {sum_hit} Σ miss: {sum_miss} avg hit ratio: {avg}\n"
}
MessageId::CmdCostReport => {
"Session Cost:\n\
\n\
Approx total spent: ${cost}\n\n\
Cost estimates are approximate and use provider usage telemetry when available.\n\n\
DeepSeek API Pricing:\n\
\n\
Pricing details are not configured in this CLI."
}
MessageId::CmdTokensCacheBoth => "{hit} hit / {miss} miss",
MessageId::CmdTokensCacheHitOnly => "{hit} hit / miss not reported",
MessageId::CmdTokensCacheMissOnly => "hit not reported / {miss} miss",
MessageId::CmdTokensContextUnknownWindow => "~{estimated} / unknown window",
MessageId::CmdTokensContextWithWindow => "~{used} / {window} ({percent}%)",
MessageId::FooterAgentSingular => "1 agent",
MessageId::FooterAgentsPlural => "{count} agents",
MessageId::FooterPressCtrlCAgain => "Press Ctrl+C again to quit",
MessageId::FooterWorking => "working",
MessageId::HelpSectionActions => "Actions",
MessageId::HelpSectionClipboard => "Clipboard",
MessageId::HelpSectionEditing => "Input editing",
MessageId::HelpSectionHelp => "Help",
MessageId::HelpSectionModes => "Modes",
MessageId::HelpSectionNavigation => "Navigation",
MessageId::HelpSectionSessions => "Sessions",
MessageId::CmdTokensNotReported => "not reported",
MessageId::CmdTokensReport => {
"Token Usage:\n\
\n\
Active context: {active}\n\
Last API input: {input} (turn telemetry; may count repeated prefix across tool rounds)\n\
Last API output: {output}\n\
Cache hit/miss: {cache} (telemetry/cost only)\n\
Cumulative tokens: {total} (session usage telemetry)\n\
Approx session cost: ${cost}\n\
API messages: {api_messages}\n\
Chat messages: {chat_messages}\n\
Model: {model}"
}
}
}
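Several of the templates above contain named placeholders such as `{hit}`, `{count}`, or `{model}`. Because they are plain `&'static str` values and not `format!` literals, something has to substitute the values at runtime. The diff does not show how the crate does this; the helper below is a hypothetical minimal sketch of one way it could work.

```rust
/// Replace every `{key}` in `template` with the matching value.
/// Placeholders with no matching key are left untouched.
///
/// Caveat of this simple approach: substitutions run in sequence, so a value
/// that itself contains `{other_key}` could be substituted again by a later
/// pair. Callers passing untrusted values would need a single-pass scanner.
pub fn fill(template: &str, args: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (key, value) in args {
        // `{{` and `}}` are escaped braces, so this builds the text "{key}".
        out = out.replace(&format!("{{{key}}}"), value);
    }
    out
}
```

For example, `fill("{hit} hit / {miss} miss", &[("hit", "120"), ("miss", "8")])` yields `120 hit / 8 miss`, and the same call works unchanged on a translated template because only the placeholder names matter, not their order.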
@@ -443,6 +703,129 @@ fn japanese(id: MessageId) -> Option<&'static str> {
MessageId::HelpFooterMove => " Up/Down 移動 ",
MessageId::HelpFooterJump => " PgUp/PgDn ジャンプ ",
MessageId::HelpFooterClose => " Esc 閉じる ",
MessageId::CmdAgentDescription => "Agent モードに切り替え",
MessageId::CmdAttachDescription => {
"画像・動画メディアを添付(テキストファイルやディレクトリは @path)"
}
MessageId::CmdCacheDescription => {
"直近 N ターンの DeepSeek プレフィックスキャッシュのヒット/ミス統計を表示"
}
MessageId::CmdClearDescription => "会話履歴をクリア",
MessageId::CmdCompactDescription => {
"コンテキスト圧縮で容量を確保(旧式:v0.6.6 以降はサイクル再起動を推奨)"
}
MessageId::CmdConfigDescription => "インタラクティブな設定エディタを開く",
MessageId::CmdContextDescription => "コンパクトなセッションコンテキスト検査ツールを開く",
MessageId::CmdCostDescription => "セッションのコスト内訳を表示",
MessageId::CmdCycleDescription => "指定したサイクルの引き継ぎブリーフィングを表示",
MessageId::CmdCyclesDescription => {
"セッション内のチェックポイント再起動サイクルの引き継ぎを一覧表示"
}
MessageId::CmdExitDescription => "アプリを終了",
MessageId::CmdExportDescription => "会話を Markdown にエクスポート",
MessageId::CmdHelpDescription => "ヘルプを表示",
MessageId::CmdHomeDescription => "統計とクイックアクション付きのホームダッシュボードを表示",
MessageId::CmdInitDescription => "プロジェクト用に AGENTS.md を生成",
MessageId::CmdJobsDescription => "バックグラウンドのシェルジョブを確認・制御",
MessageId::CmdLinksDescription => "DeepSeek ダッシュボードとドキュメントへのリンクを表示",
MessageId::CmdLoadDescription => "ファイルからセッションを読み込み",
MessageId::CmdLogoutDescription => "API キーを消去してセットアップに戻る",
MessageId::CmdMcpDescription => "MCP サーバを開く・管理する",
MessageId::CmdModelDescription => "現在のモデルを切り替え・確認",
MessageId::CmdModelsDescription => "API から利用可能なモデルを一覧表示",
MessageId::CmdNoteDescription => "永続ノートファイル(.deepseek/notes.md)に追記",
MessageId::CmdPlanDescription => "Plan モードに切り替え、推奨される実装手順を確認",
MessageId::CmdProviderDescription => {
"現在の LLM バックエンドを切り替え・確認(deepseek | nvidia-nim)"
}
MessageId::CmdQueueDescription => "キューされたメッセージを確認・編集",
MessageId::CmdRecallDescription => {
"過去のサイクルアーカイブを検索(メッセージ本文への BM25 検索)"
}
MessageId::CmdRestoreDescription => {
"ワークスペースを以前のターン前/後スナップショットへロールバック。引数なしで最近のスナップショットを一覧表示。"
}
MessageId::CmdRetryDescription => "直前のリクエストを再試行",
MessageId::CmdReviewDescription => "ファイル・diff・PR に対して構造化コードレビューを実行",
MessageId::CmdRlmDescription => {
"再帰言語モデル(RLM)ターン — プロンプトを Python REPL に格納し、モデルが処理コードを記述。サブ LLM 呼び出しは `llm_query()` / `sub_rlm()`。"
}
MessageId::CmdSaveDescription => "セッションをファイルに保存",
MessageId::CmdSessionsDescription => "セッションピッカーを開く",
MessageId::CmdSettingsDescription => "永続化された設定を表示",
MessageId::CmdSkillDescription => {
"スキルを有効化、またはコミュニティスキルをインストール/更新/アンインストール/信頼"
}
MessageId::CmdSkillsDescription => {
"ローカルスキルを一覧表示(--remote で精選レジストリを参照)"
}
MessageId::CmdStatuslineDescription => "フッターに表示する項目を設定",
MessageId::CmdSubagentsDescription => "サブエージェントの状態を一覧表示",
MessageId::CmdSystemDescription => "現在のシステムプロンプトを表示",
MessageId::CmdTaskDescription => "バックグラウンドタスクを管理",
MessageId::CmdTokensDescription => "セッションのトークン使用量を表示",
MessageId::CmdTrustDescription => {
"ワークスペースの信頼設定とパス別許可リストを管理(`/trust add <path>`、`/trust list`、`/trust on|off`)"
}
MessageId::CmdUndoDescription => "最後のメッセージ対を削除",
MessageId::CmdYoloDescription => "YOLO モードを有効化(shell + 信頼 + 自動承認)",
MessageId::CmdCacheAdvice => {
"3 ターン目以降にヒット率が ~70% 以上で安定していれば、プレフィックスキャッシュは健全。\n\
長いセッションでこれを下回る場合は、プレフィックスの変動を調査する価値があります (#263)。"
}
MessageId::CmdCacheFootnote => {
"* プロバイダがミスを単独で報告しない場合は「入力 − ヒット」から推定。\n"
}
MessageId::CmdCacheHeader => {
"キャッシュテレメトリ — 直近 {count} / {total} ターン(モデル: {model})\n"
}
MessageId::CmdCacheNoData => {
"キャッシュ履歴: まだターンを記録していません。\n\n\
DeepSeek は対応モデル (V4 ファミリー) の API ターンごとに `prompt_cache_hit_tokens` / \
`prompt_cache_miss_tokens` を返します。1 ターン実行してから /cache を再度お試しください。"
}
MessageId::CmdCacheTotals => {
"Σ 入力: {sum_in} Σ ヒット: {sum_hit} Σ ミス: {sum_miss} 平均ヒット率: {avg}\n"
}
MessageId::CmdCostReport => {
"セッション費用:\n\
\n\
概算の合計支出: ${cost}\n\n\
費用の見積もりは概算で、利用可能な場合はプロバイダの使用量テレメトリを使用します。\n\n\
DeepSeek API 料金:\n\
\n\
料金の詳細はこの CLI では設定されていません。"
}
MessageId::CmdTokensCacheBoth => "ヒット {hit} / ミス {miss}",
MessageId::CmdTokensCacheHitOnly => "ヒット {hit} / ミスは未報告",
MessageId::CmdTokensCacheMissOnly => "ヒットは未報告 / ミス {miss}",
MessageId::CmdTokensContextUnknownWindow => "~{estimated} / コンテキスト窓不明",
MessageId::CmdTokensContextWithWindow => "~{used} / {window} ({percent}%)",
MessageId::FooterAgentSingular => "1 エージェント",
MessageId::FooterAgentsPlural => "{count} エージェント",
MessageId::FooterPressCtrlCAgain => "もう一度 Ctrl+C で終了",
MessageId::FooterWorking => "処理中",
MessageId::HelpSectionActions => "操作",
MessageId::HelpSectionClipboard => "クリップボード",
MessageId::HelpSectionEditing => "入力編集",
MessageId::HelpSectionHelp => "ヘルプ",
MessageId::HelpSectionModes => "モード",
MessageId::HelpSectionNavigation => "ナビゲーション",
MessageId::HelpSectionSessions => "セッション",
MessageId::CmdTokensNotReported => "未報告",
MessageId::CmdTokensReport => {
"トークン使用量:\n\
\n\
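アクティブコンテキスト: {active}\n\
直近の API 入力: {input}(ターンテレメトリ。ツールラウンド間で同じプレフィックスを重複して数える場合あり)\n\
直近の API 出力: {output}\n\
キャッシュ ヒット/ミス: {cache}(テレメトリ/コスト算出のみ)\n\
累積トークン: {total}(セッションの使用量テレメトリ)\n\
セッション概算費用: ${cost}\n\
API メッセージ数: {api_messages}\n\
チャットメッセージ数: {chat_messages}\n\
モデル: {model}"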
}
})
}
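Translated templates are only safe to use if they keep the same placeholders as their English counterparts; a dropped `{model}` or a stray closing bracket would otherwise surface only at runtime. A possible guard, sketched here with hypothetical helper names and not taken from the crate, is to extract the placeholder set from each template and compare the sets across locales.

```rust
use std::collections::BTreeSet;

/// Collect the names of all `{identifier}` placeholders in a template.
/// Only ASCII alphanumerics and `_` count as identifier characters.
pub fn placeholders(template: &str) -> BTreeSet<String> {
    let mut found = BTreeSet::new();
    let mut rest = template;
    while let Some(open) = rest.find('{') {
        // `{` is one byte in UTF-8, so `open + 1` is a valid char boundary.
        let after = &rest[open + 1..];
        let Some(close) = after.find('}') else { break };
        let name = &after[..close];
        let is_ident = !name.is_empty()
            && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
        if is_ident {
            found.insert(name.to_string());
            rest = &after[close + 1..];
        } else {
            // Not a placeholder: resume scanning just past this `{`.
            rest = after;
        }
    }
    found
}

/// True when both templates reference exactly the same placeholder names.
pub fn same_placeholders(reference: &str, translated: &str) -> bool {
    placeholders(reference) == placeholders(translated)
}
```

A test could loop over `ALL_MESSAGE_IDS` and assert `same_placeholders(english(id), translated)` for every locale that provides a string. Order is deliberately ignored, since word order differs between languages.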
@@ -479,6 +862,111 @@ fn chinese_simplified(id: MessageId) -> Option<&'static str> {
MessageId::HelpFooterMove => " Up/Down 移动 ",
MessageId::HelpFooterJump => " PgUp/PgDn 跳转 ",
MessageId::HelpFooterClose => " Esc 关闭 ",
MessageId::CmdAgentDescription => "切换到 Agent 模式",
MessageId::CmdAttachDescription => "附加图片或视频媒体;文本文件或目录请使用 @path",
MessageId::CmdCacheDescription => "显示最近 N 轮的 DeepSeek 前缀缓存命中/未命中统计",
MessageId::CmdClearDescription => "清除对话历史",
MessageId::CmdCompactDescription => {
"触发上下文压缩以释放空间(旧版命令;v0.6.6 起建议改用循环重启)"
}
MessageId::CmdConfigDescription => "打开交互式配置编辑器",
MessageId::CmdContextDescription => "打开紧凑会话上下文检查器",
MessageId::CmdCostDescription => "显示本次会话的费用明细",
MessageId::CmdCycleDescription => "显示指定循环的延续简报",
MessageId::CmdCyclesDescription => "列出本次会话中的检查点重启循环交接",
MessageId::CmdExitDescription => "退出应用",
MessageId::CmdExportDescription => "将对话导出为 Markdown",
MessageId::CmdHelpDescription => "显示帮助信息",
MessageId::CmdHomeDescription => "显示主页面板,含统计与快捷操作",
MessageId::CmdInitDescription => "为项目生成 AGENTS.md",
MessageId::CmdJobsDescription => "查看并管理后台 shell 作业",
MessageId::CmdLinksDescription => "显示 DeepSeek 控制台与文档链接",
MessageId::CmdLoadDescription => "从文件加载会话",
MessageId::CmdLogoutDescription => "清除 API 密钥并返回设置",
MessageId::CmdMcpDescription => "打开或管理 MCP 服务器",
MessageId::CmdModelDescription => "切换或查看当前模型",
MessageId::CmdModelsDescription => "列出 API 中可用的模型",
MessageId::CmdNoteDescription => "将笔记追加到持久笔记文件(.deepseek/notes.md)",
MessageId::CmdPlanDescription => "切换到 Plan 模式并查看建议的实现步骤",
MessageId::CmdProviderDescription => "切换或查看当前 LLM 后端(deepseek | nvidia-nim)",
MessageId::CmdQueueDescription => "查看或编辑已排队的消息",
MessageId::CmdRecallDescription => "搜索此前的循环归档(基于消息文本的 BM25 检索)",
MessageId::CmdRestoreDescription => {
"将工作区回滚到此前的轮次前/后快照。不带参数时列出最近的快照。"
}
MessageId::CmdRetryDescription => "重试上一次请求",
MessageId::CmdReviewDescription => "对文件、diff 或 PR 进行结构化代码审查",
MessageId::CmdRlmDescription => {
"递归语言模型(RLM)轮次 —— 将提示词存入 Python REPL,让模型编写代码进行处理;可用 `llm_query()` / `sub_rlm()` 调用子 LLM。"
}
MessageId::CmdSaveDescription => "将会话保存到文件",
MessageId::CmdSessionsDescription => "打开会话选择器",
MessageId::CmdSettingsDescription => "显示持久化设置",
MessageId::CmdSkillDescription => "激活技能,或安装/更新/卸载/信任社区技能",
MessageId::CmdSkillsDescription => "列出本地技能(或使用 --remote 浏览精选注册表)",
MessageId::CmdStatuslineDescription => "配置底栏要显示哪些条目",
MessageId::CmdSubagentsDescription => "列出子代理状态",
MessageId::CmdSystemDescription => "显示当前系统提示词",
MessageId::CmdTaskDescription => "管理后台任务",
MessageId::CmdTokensDescription => "显示本次会话的 token 用量",
MessageId::CmdTrustDescription => {
"管理工作区信任与按路径的白名单(`/trust add <path>`、`/trust list`、`/trust on|off`)"
}
MessageId::CmdUndoDescription => "移除最后一组消息对",
MessageId::CmdYoloDescription => "启用 YOLO 模式(shell + 信任 + 自动批准)",
MessageId::CmdCacheAdvice => {
"第 3 轮起命中率稳定在 ~70% 以上即表示前缀缓存稳定;\n\
#263"
}
MessageId::CmdCacheFootnote => "* 当提供方未单独上报未命中时,由「输入 − 命中」推算。\n",
MessageId::CmdCacheHeader => "缓存遥测 —— 最近 {count} / {total} 轮(模型:{model})\n",
MessageId::CmdCacheNoData => {
"缓存历史:尚未记录任何轮次。\n\n\
DeepSeek V4 API `prompt_cache_hit_tokens` / \
`prompt_cache_miss_tokens` /cache"
}
MessageId::CmdCacheTotals => {
"Σ 输入:{sum_in} Σ 命中:{sum_hit} Σ 未命中:{sum_miss} 平均命中率:{avg}\n"
}
MessageId::CmdCostReport => {
"会话费用:\n\
\n\
${cost}\n\n\
使\n\n\
DeepSeek API \n\
\n\
CLI "
}
MessageId::CmdTokensCacheBoth => "命中 {hit} / 未命中 {miss}",
MessageId::CmdTokensCacheHitOnly => "命中 {hit} / 未命中未上报",
MessageId::CmdTokensCacheMissOnly => "命中未上报 / 未命中 {miss}",
MessageId::CmdTokensContextUnknownWindow => "~{estimated} / 窗口未知",
MessageId::CmdTokensContextWithWindow => "~{used} / {window} ({percent}%)",
MessageId::FooterAgentSingular => "1 个子代理",
MessageId::FooterAgentsPlural => "{count} 个子代理",
MessageId::FooterPressCtrlCAgain => "再次按 Ctrl+C 退出",
MessageId::FooterWorking => "工作中",
MessageId::HelpSectionActions => "操作",
MessageId::HelpSectionClipboard => "剪贴板",
MessageId::HelpSectionEditing => "输入编辑",
MessageId::HelpSectionHelp => "帮助",
MessageId::HelpSectionModes => "模式",
MessageId::HelpSectionNavigation => "导航",
MessageId::HelpSectionSessions => "会话",
MessageId::CmdTokensNotReported => "未上报",
MessageId::CmdTokensReport => {
"令牌用量:\n\
\n\
{active}\n\
API {input}\n\
API {output}\n\
/ {cache}/\n\
{total}\n\
${cost}\n\
API {api_messages}\n\
{chat_messages}\n\
{model}"
}
})
}
@@ -517,6 +1005,139 @@ fn portuguese_brazil(id: MessageId) -> Option<&'static str> {
MessageId::HelpFooterMove => " Up/Down move ",
MessageId::HelpFooterJump => " PgUp/PgDn salta ",
MessageId::HelpFooterClose => " Esc fecha ",
MessageId::CmdAgentDescription => "Mudar para o modo agent",
MessageId::CmdAttachDescription => {
"Anexar imagem ou vídeo; use @path para arquivos de texto ou diretórios"
}
MessageId::CmdCacheDescription => {
"Exibir estatísticas de hit/miss do cache de prefixo DeepSeek nas últimas N rodadas"
}
MessageId::CmdClearDescription => "Limpar o histórico da conversa",
MessageId::CmdCompactDescription => {
"Compactar o contexto para liberar espaço (legado; a v0.6.6 prefere o reinício de ciclo)"
}
MessageId::CmdConfigDescription => "Abrir o editor interativo de configuração",
MessageId::CmdContextDescription => "Abrir o inspetor compacto de contexto da sessão",
MessageId::CmdCostDescription => "Exibir o detalhamento de custo da sessão",
MessageId::CmdCycleDescription => {
"Exibir o briefing de continuidade de um ciclo específico"
}
MessageId::CmdCyclesDescription => {
"Listar as transferências dos ciclos checkpoint-restart desta sessão"
}
MessageId::CmdExitDescription => "Sair do aplicativo",
MessageId::CmdExportDescription => "Exportar a conversa para markdown",
MessageId::CmdHelpDescription => "Exibir informações de ajuda",
MessageId::CmdHomeDescription => "Exibir o painel inicial com estatísticas e ações rápidas",
MessageId::CmdInitDescription => "Gerar AGENTS.md para o projeto",
MessageId::CmdJobsDescription => "Inspecionar e controlar jobs de shell em segundo plano",
MessageId::CmdLinksDescription => "Exibir links do painel e da documentação do DeepSeek",
MessageId::CmdLoadDescription => "Carregar a sessão de um arquivo",
MessageId::CmdLogoutDescription => "Limpar a chave de API e voltar à configuração",
MessageId::CmdMcpDescription => "Abrir ou gerenciar servidores MCP",
MessageId::CmdModelDescription => "Trocar ou exibir o modelo atual",
MessageId::CmdModelsDescription => "Listar os modelos disponíveis pela API",
MessageId::CmdNoteDescription => {
"Adicionar nota ao arquivo persistente (.deepseek/notes.md)"
}
MessageId::CmdPlanDescription => {
"Mudar para o modo plan e revisar os passos de implementação sugeridos"
}
MessageId::CmdProviderDescription => {
"Trocar ou exibir o backend LLM ativo (deepseek | nvidia-nim)"
}
MessageId::CmdQueueDescription => "Ver ou editar mensagens enfileiradas",
MessageId::CmdRecallDescription => {
"Buscar arquivos de ciclos anteriores (BM25 sobre o texto das mensagens)"
}
MessageId::CmdRestoreDescription => {
"Reverter o workspace a um snapshot pré/pós-turno anterior. Sem argumento, lista os snapshots recentes."
}
MessageId::CmdRetryDescription => "Repetir a última requisição",
MessageId::CmdReviewDescription => {
"Executar uma revisão de código estruturada em um arquivo, diff ou PR"
}
MessageId::CmdRlmDescription => {
"Turno do Recursive Language Model (RLM) — guarda o prompt em um REPL Python e deixa o modelo escrever o código que o processa; use `llm_query()` / `sub_rlm()` para chamadas a sub-LLMs."
}
MessageId::CmdSaveDescription => "Salvar a sessão em arquivo",
MessageId::CmdSessionsDescription => "Abrir o seletor de sessões",
MessageId::CmdSettingsDescription => "Exibir as configurações persistidas",
MessageId::CmdSkillDescription => {
"Ativar uma skill, ou instalar/atualizar/desinstalar/confiar em uma skill da comunidade"
}
MessageId::CmdSkillsDescription => {
"Listar skills locais (ou --remote para navegar pelo registro curado)"
}
MessageId::CmdStatuslineDescription => "Configurar quais itens aparecem no rodapé",
MessageId::CmdSubagentsDescription => "Listar o status dos sub-agentes",
MessageId::CmdSystemDescription => "Exibir o prompt de sistema atual",
MessageId::CmdTaskDescription => "Gerenciar tarefas em segundo plano",
MessageId::CmdTokensDescription => "Exibir o uso de tokens da sessão",
MessageId::CmdTrustDescription => {
"Gerenciar a confiança do workspace e a allowlist por caminho (`/trust add <path>`, `/trust list`, `/trust on|off`)"
}
MessageId::CmdUndoDescription => "Remover o último par de mensagens",
MessageId::CmdYoloDescription => {
"Ativar o modo YOLO (shell + confiança + aprovação automática)"
}
MessageId::CmdCacheAdvice => {
"Taxas de hit/miss acima de ~70% a partir do terceiro turno indicam um prefixo de cache estável;\n\
valores menores em sessões longas sugerem instabilidade no prefixo, vale investigar (#263)."
}
MessageId::CmdCacheFootnote => {
"* miss inferido a partir de entrada − hit quando o provedor não o reporta separadamente.\n"
}
MessageId::CmdCacheHeader => {
"Telemetria do cache — últimos {count} de {total} turno(s) (modelo: {model})\n"
}
MessageId::CmdCacheNoData => {
"Histórico do cache: nenhum turno registrado ainda.\n\n\
O DeepSeek expõe `prompt_cache_hit_tokens` / `prompt_cache_miss_tokens` em cada turno \
da API onde o modelo suporta (família V4). Execute um turno e tente /cache de novo."
}
MessageId::CmdCacheTotals => {
"Σ entrada: {sum_in} Σ hit: {sum_hit} Σ miss: {sum_miss} taxa média de hit: {avg}\n"
}
MessageId::CmdCostReport => {
"Custo da sessão:\n\
\n\
Total aproximado: ${cost}\n\n\
Estimativas de custo são aproximadas e usam a telemetria de uso do provedor quando disponível.\n\n\
Preços da API DeepSeek:\n\
\n\
Os detalhes de preço não estão configurados nesta CLI."
}
MessageId::CmdTokensCacheBoth => "{hit} hit / {miss} miss",
MessageId::CmdTokensCacheHitOnly => "{hit} hit / miss não reportado",
MessageId::CmdTokensCacheMissOnly => "hit não reportado / {miss} miss",
MessageId::CmdTokensContextUnknownWindow => "~{estimated} / janela desconhecida",
MessageId::CmdTokensContextWithWindow => "~{used} / {window} ({percent}%)",
MessageId::FooterAgentSingular => "1 sub-agente",
MessageId::FooterAgentsPlural => "{count} sub-agentes",
MessageId::FooterPressCtrlCAgain => "Pressione Ctrl+C novamente para sair",
MessageId::FooterWorking => "trabalhando",
MessageId::HelpSectionActions => "Ações",
MessageId::HelpSectionClipboard => "Área de transferência",
MessageId::HelpSectionEditing => "Edição de entrada",
MessageId::HelpSectionHelp => "Ajuda",
MessageId::HelpSectionModes => "Modos",
MessageId::HelpSectionNavigation => "Navegação",
MessageId::HelpSectionSessions => "Sessões",
MessageId::CmdTokensNotReported => "não reportado",
MessageId::CmdTokensReport => {
"Uso de tokens:\n\
\n\
Contexto ativo: {active}\n\
Última entrada da API: {input} (telemetria por turno; pode contar o mesmo prefixo várias vezes em rodadas com ferramentas)\n\
Última saída da API: {output}\n\
Hit/miss do cache: {cache} (apenas para telemetria/custo)\n\
Tokens acumulados: {total} (telemetria de uso da sessão)\n\
Custo aproximado: ${cost}\n\
Mensagens da API: {api_messages}\n\
Mensagens do chat: {chat_messages}\n\
Modelo: {model}"
}
})
}
+18 -4
@@ -14,7 +14,7 @@ struct ModelPricing {
}
fn v4_pro_discount_ends_at() -> DateTime<Utc> {
Utc.with_ymd_and_hms(2026, 5, 5, 15, 59, 0)
Utc.with_ymd_and_hms(2026, 5, 31, 15, 59, 0)
.single()
.expect("valid DeepSeek V4 Pro discount end timestamp")
}
@@ -37,7 +37,7 @@ fn pricing_for_model_at(model: &str, now: DateTime<Utc>) -> Option<ModelPricing>
if lower.contains("v4-pro") || lower.contains("v4pro") {
if now <= v4_pro_discount_ends_at() {
// DeepSeek lists these as a limited-time 75% discount through
// 2026-05-05 15:59 UTC.
// 2026-05-31 15:59 UTC.
return Some(ModelPricing {
input_cache_hit_per_million: 0.003625,
input_cache_miss_per_million: 0.435,
@@ -131,7 +131,7 @@ mod tests {
#[test]
fn v4_pro_uses_limited_time_discount_before_expiry() {
let before_expiry = Utc
.with_ymd_and_hms(2026, 5, 5, 15, 58, 59)
.with_ymd_and_hms(2026, 5, 31, 15, 58, 59)
.single()
.unwrap();
let pricing = pricing_for_model_at("deepseek-v4-pro", before_expiry).unwrap();
@@ -143,7 +143,10 @@ mod tests {
#[test]
fn v4_pro_returns_to_base_rates_after_discount_expiry() {
let after_expiry = Utc.with_ymd_and_hms(2026, 5, 5, 16, 0, 0).single().unwrap();
let after_expiry = Utc
.with_ymd_and_hms(2026, 5, 31, 16, 0, 0)
.single()
.unwrap();
let pricing = pricing_for_model_at("deepseek-v4-pro", after_expiry).unwrap();
assert_eq!(pricing.input_cache_hit_per_million, 0.0145);
@@ -151,6 +154,17 @@ mod tests {
assert_eq!(pricing.output_per_million, 3.48);
}
#[test]
fn v4_pro_discount_still_applies_after_old_may5_expiry() {
// Regression for #267: extension to 2026-05-31 15:59 UTC.
let after_old_expiry = Utc.with_ymd_and_hms(2026, 5, 6, 0, 0, 0).single().unwrap();
let pricing = pricing_for_model_at("deepseek-v4-pro", after_old_expiry).unwrap();
assert_eq!(pricing.input_cache_hit_per_million, 0.003625);
assert_eq!(pricing.input_cache_miss_per_million, 0.435);
assert_eq!(pricing.output_per_million, 0.87);
}
#[test]
fn v4_flash_keeps_current_published_rates() {
let now = Utc.with_ymd_and_hms(2026, 4, 25, 0, 0, 0).single().unwrap();
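The expiry check in this file is inclusive (`now <= v4_pro_discount_ends_at()`), so the discount still applies at exactly 15:59:00 UTC and lapses one second later. A minimal std-only sketch of that window logic follows, assuming plain Unix seconds instead of `chrono`. The `DISCOUNT_ENDS_AT` constant and the `output_rate_at` helper are hypothetical names used for illustration, not the crate's API; the two rates are the ones asserted in the tests above.

```rust
/// Discount cutoff as Unix seconds: 2026-05-31 15:59:00 UTC.
/// 20_604 is the number of days from 1970-01-01 to 2026-05-31.
const DISCOUNT_ENDS_AT: u64 = 20_604 * 86_400 + 15 * 3_600 + 59 * 60;

/// Output price per million tokens at `now` (hypothetical helper).
/// The comparison is inclusive, matching `now <= v4_pro_discount_ends_at()`.
fn output_rate_at(now: u64) -> f64 {
    if now <= DISCOUNT_ENDS_AT {
        0.87 // discounted rate (75% off the base rate)
    } else {
        3.48 // base rate after the promotion ends
    }
}
```

Keeping the cutoff in one place means a future extension only touches the constant and the tests that pin it.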
+202 -14
@@ -177,6 +177,22 @@ pub fn system_prompt_for_mode_with_context(
}
/// Get the system prompt for a specific mode with project and skills context.
///
/// **Volatile-content-last invariant.** Blocks are appended in order from
/// most-static to most-volatile so DeepSeek's KV prefix cache hits the
/// longest possible byte prefix turn-over-turn:
///
/// 1. mode prompt (compile-time constant)
/// 2. project context / fallback (workspace-static)
/// 3. skills block (skills-dir-static)
/// 4. `## Context Management` (compile-time constant, Agent/Yolo only)
/// 5. compaction handoff template (compile-time constant)
/// 6. handoff block — file-backed; rewritten by `/compact` and on exit
/// 7. working-set summary — drifts when a new path is observed
///
/// Anything appended after a volatile block forfeits the cache for the rest
/// of the request. New blocks belong above the handoff/working-set boundary
/// unless they themselves are turn-volatile.
pub fn system_prompt_for_mode_with_context_and_skills(
mode: AppMode,
workspace: &Path,
@@ -188,7 +204,7 @@ pub fn system_prompt_for_mode_with_context_and_skills(
// Load project context from workspace
let project_context = load_project_context_with_parents(workspace);
// Combine base prompt with project context
// 1-2. Mode prompt + project context (or fallback automap).
let mut full_prompt = if let Some(project_block) = project_context.as_system_block() {
format!("{}\n\n{}", mode_prompt, project_block)
} else {
@@ -201,22 +217,13 @@ pub fn system_prompt_for_mode_with_context_and_skills(
)
};
if let Some(summary) = working_set_summary
&& !summary.trim().is_empty()
{
full_prompt = format!("{full_prompt}\n\n{summary}");
}
// 3. Skills block.
if let Some(skills_block) = skills_dir.and_then(crate::skills::render_available_skills_context)
{
full_prompt = format!("{full_prompt}\n\n{skills_block}");
}
if let Some(handoff_block) = load_handoff_block(workspace) {
full_prompt = format!("{full_prompt}\n\n{handoff_block}");
}
// Add compaction instruction for agent modes
// 4. Context Management (Agent / Yolo only).
if matches!(mode, AppMode::Agent | AppMode::Yolo) {
full_prompt.push_str(
"\n\n## Context Management\n\n\
@@ -228,11 +235,27 @@ pub fn system_prompt_for_mode_with_context_and_skills(
);
}
// Append the compaction handoff template so the model knows the format
// to use when writing `.deepseek/handoff.md` on exit / `/compact`.
// 5. Compaction handoff template so the model knows the format to use
// when writing `.deepseek/handoff.md` on exit / `/compact`.
full_prompt.push_str("\n\n");
full_prompt.push_str(COMPACT_TEMPLATE);
// ── Volatile-content boundary ─────────────────────────────────────────
// Everything below drifts mid-session and busts the prefix cache for
// bytes that follow. Keep new static blocks above this comment.
// 6. Previous-session handoff (file-backed, rewritten by `/compact`).
if let Some(handoff_block) = load_handoff_block(workspace) {
full_prompt = format!("{full_prompt}\n\n{handoff_block}");
}
// 7. Working-set summary (drifts when a new path is observed).
if let Some(summary) = working_set_summary
&& !summary.trim().is_empty()
{
full_prompt = format!("{full_prompt}\n\n{summary}");
}
SystemPrompt::Text(full_prompt)
}
@@ -425,4 +448,169 @@ mod tests {
assert!(!YOLO_PROMPT.is_empty());
assert!(!PLAN_PROMPT.is_empty());
}
// ── Cache-prefix stability harness (#263 step 2) ───────────────────────
//
// These tests pin the byte-stability invariant required for DeepSeek's
// KV prefix cache to hit: any prompt-construction surface that ends up
// in the cached prefix must produce identical bytes given identical
// inputs across calls.
use crate::test_support::assert_byte_identical;
#[test]
fn compose_prompt_is_byte_stable_across_calls() {
// Suspect #4 from #263: mode prompt churn within a single mode.
// Two calls with identical (mode, personality) inputs must produce
// identical bytes — anything else is a cache buster.
for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
for personality in [Personality::Calm, Personality::Playful] {
let a = compose_prompt(mode, personality);
let b = compose_prompt(mode, personality);
assert_byte_identical(
&format!("compose_prompt(mode={mode:?}, personality={personality:?})"),
&a,
&b,
);
}
}
}
#[test]
fn system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace() {
// Same workspace, no working_set / skills churn between calls →
// identical bytes. This pins the most representative production
// surface (engine.rs builds the system prompt via this fn or
// its sibling _and_skills variant on every turn).
let tmp = tempdir().expect("tempdir");
let workspace = tmp.path();
for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
let a = match system_prompt_for_mode_with_context(mode, workspace, None) {
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
let b = match system_prompt_for_mode_with_context(mode, workspace, None) {
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
assert_byte_identical(
&format!("system_prompt_for_mode_with_context(mode={mode:?}) on empty workspace"),
&a,
&b,
);
}
}
#[test]
fn system_prompt_with_working_set_summary_is_byte_stable_for_constant_summary() {
// The `working_set_summary` argument is the volatile surface (suspect
// #1 in #263). Independently verifying THIS surface needs a separate
// test in working_set.rs; here we just pin that the surrounding
// prompt construction faithfully embeds whatever summary it's given
// without injecting any non-determinism on its own.
let tmp = tempdir().expect("tempdir");
let workspace = tmp.path();
let summary = "## Repo Working Set\nWorkspace: /tmp/x\n";
let a = match system_prompt_for_mode_with_context(AppMode::Agent, workspace, Some(summary))
{
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
let b = match system_prompt_for_mode_with_context(AppMode::Agent, workspace, Some(summary))
{
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
assert_byte_identical(
"system_prompt_for_mode_with_context with constant working_set summary",
&a,
&b,
);
assert!(a.contains(summary), "summary must be embedded as-is");
}
#[test]
fn system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged() {
// Companion to the working-set stability test: if `.deepseek/handoff.md`
// hasn't moved between two builds, the rendered prompt must produce
// identical bytes. The handoff block is the second volatile surface
// (the first is the working-set summary) — both land below the static
// boundary in `system_prompt_for_mode_with_context_and_skills`.
let tmp = tempdir().expect("tempdir");
let workspace = tmp.path();
let handoff_dir = workspace.join(".deepseek");
std::fs::create_dir_all(&handoff_dir).unwrap();
std::fs::write(
handoff_dir.join("handoff.md"),
"# Session handoff\n\n## Active task\nFinish #280.\n\n## Open blockers\n- [ ] none\n",
)
.unwrap();
let a = match system_prompt_for_mode_with_context(AppMode::Agent, workspace, None) {
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
let b = match system_prompt_for_mode_with_context(AppMode::Agent, workspace, None) {
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
assert_byte_identical(
"system_prompt_for_mode_with_context with constant handoff file",
&a,
&b,
);
assert!(a.contains(HANDOFF_BLOCK_MARKER), "handoff must be embedded");
assert!(a.contains("Finish #280."), "handoff body must be present");
}
#[test]
fn handoff_and_working_set_appear_after_static_blocks() {
// Cache-prefix invariant: the volatile blocks (handoff, working_set)
// must come *after* the static `## Context Management` and the
// compaction handoff template (`## Compaction Handoff`) so a churn
// in either volatile section doesn't drag the static blocks out of
// the cached prefix. Pre-fix ordering placed handoff between the
// skills block and `## Context Management`, which busted the cache
// every time `/compact` rewrote the file.
let tmp = tempdir().expect("tempdir");
let workspace = tmp.path();
let handoff_dir = workspace.join(".deepseek");
std::fs::create_dir_all(&handoff_dir).unwrap();
std::fs::write(handoff_dir.join("handoff.md"), "# handoff body\n").unwrap();
let summary = "## Repo Working Set\nWorkspace: /tmp/x\n";
let prompt =
match system_prompt_for_mode_with_context(AppMode::Agent, workspace, Some(summary)) {
SystemPrompt::Text(text) => text,
SystemPrompt::Blocks(_) => panic!("expected text system prompt"),
};
let context_pos = prompt
.find("## Context Management")
.expect("Context Management section present in Agent mode");
let compact_pos = prompt
.find("## Compaction Handoff")
.expect("compaction handoff template present");
let handoff_pos = prompt
.find(HANDOFF_BLOCK_MARKER)
.expect("handoff block present when fixture file exists");
let working_set_pos = prompt
.find("## Repo Working Set")
.expect("working-set summary present when supplied");
assert!(
context_pos < handoff_pos,
"## Context Management must precede the handoff block"
);
assert!(
compact_pos < handoff_pos,
"## Compaction Handoff must precede the handoff block"
);
assert!(
handoff_pos < working_set_pos,
"handoff block must precede the working-set summary (most-volatile last)"
);
}
}
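The volatile-content-last invariant pinned by these tests can be illustrated with a small, self-contained sketch. It assumes prompts are plain strings joined by blank lines; `assemble` and `shared_prefix_len` are hypothetical helpers, not functions from this crate. The shared byte prefix between two turns is an upper bound on what a prefix cache could reuse.

```rust
/// Join prompt blocks with blank lines, static blocks first and volatile
/// blocks last (hypothetical helper for illustration).
fn assemble(static_blocks: &[&str], volatile_blocks: &[&str]) -> String {
    static_blocks
        .iter()
        .chain(volatile_blocks.iter())
        .copied()
        .collect::<Vec<_>>()
        .join("\n\n")
}

/// Number of leading bytes two prompts have in common.
fn shared_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}
```

With the volatile block last, a rewritten handoff leaves every static byte inside the shared prefix. With the volatile block first, the shared prefix ends inside the handoff text and the static blocks that follow are lost to the cache.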
-3
@@ -1,9 +1,6 @@
// TODO(integrate): Wire sandbox into shell tool — tracked as future security feature
#![allow(dead_code)]
//! Sandbox module for secure command execution.
//! NOTE: Not yet integrated into shell tool - planned security feature.
//!
//! This module provides sandboxing capabilities for shell commands executed by
//! DeepSeek TUI. Sandboxing restricts what system resources a command can access,
-1
@@ -1,4 +1,3 @@
// TODO(integrate): Wire sandbox policy into shell tool — tracked as future work
#![allow(dead_code)]
//! Sandbox policy definitions for command execution restrictions.
+75 -1
@@ -82,7 +82,9 @@ impl InstallSource {
/// Parse a user-supplied spec. Empty / whitespace-only input is rejected.
///
/// * `github:owner/repo` → [`InstallSource::GitHubRepo`]
/// * `http://` or `https://` prefix → [`InstallSource::DirectUrl`]
/// * `https://github.com/owner/repo[.git]` (no path past the repo) →
/// [`InstallSource::GitHubRepo`]
/// * any other `http://` or `https://` prefix → [`InstallSource::DirectUrl`]
/// * anything else → [`InstallSource::Registry`]
pub fn parse(spec: &str) -> Result<Self> {
let trimmed = spec.trim();
@@ -107,12 +109,43 @@ impl InstallSource {
return Ok(Self::GitHubRepo(format!("{owner}/{repo}")));
}
if trimmed.starts_with("https://") || trimmed.starts_with("http://") {
if let Some(repo) = parse_github_browser_url(trimmed) {
return Ok(Self::GitHubRepo(repo));
}
return Ok(Self::DirectUrl(trimmed.to_string()));
}
Ok(Self::Registry(trimmed.to_string()))
}
}
/// Detect bare `https://github.com/<owner>/<repo>` URLs (with or without a
/// trailing `.git`) and return `owner/repo`. Returns `None` for any URL that
/// already points at a specific archive / blob / tree path — those are real
/// direct URLs and the caller fetches them as-is.
fn parse_github_browser_url(url: &str) -> Option<String> {
let after_scheme = url
.strip_prefix("https://")
.or_else(|| url.strip_prefix("http://"))?;
let (host, rest) = after_scheme.split_once('/')?;
if !host.eq_ignore_ascii_case("github.com") && !host.eq_ignore_ascii_case("www.github.com") {
return None;
}
let trimmed = rest.trim_end_matches('/');
let mut parts = trimmed.splitn(3, '/');
let owner = parts.next()?.trim();
let repo = parts.next()?.trim().trim_end_matches(".git");
if owner.is_empty() || repo.is_empty() {
return None;
}
// If there is a third segment, the URL points at a sub-resource
// (`/archive/...`, `/blob/...`, `/tree/...`). Treat that as a real direct
// URL — the user explicitly wants whatever lives at that path.
if parts.next().is_some() {
return None;
}
Some(format!("{owner}/{repo}"))
}
// ─────────────────────────────────────────────────────────────────────────────
// Outcome / result types
// ─────────────────────────────────────────────────────────────────────────────
@@ -1034,6 +1067,47 @@ mod tests {
);
}
#[test]
fn parse_github_browser_url_routes_to_github_repo() {
// Regression for #269: `https://github.com/<owner>/<repo>` was being
// parsed as a DirectUrl, so the installer downloaded the HTML repo
// page and tried to gzip-decode HTML ("invalid gzip header").
for spec in [
"https://github.com/obra/superpowers",
"https://github.com/obra/superpowers/",
"https://github.com/obra/superpowers.git",
"https://github.com/obra/superpowers.git/",
"https://www.github.com/obra/superpowers",
"http://github.com/obra/superpowers",
" https://github.com/obra/superpowers ",
] {
let parsed = InstallSource::parse(spec)
.unwrap_or_else(|err| panic!("parse({spec}) failed: {err}"));
assert_eq!(
parsed,
InstallSource::GitHubRepo("obra/superpowers".to_string()),
"spec {spec} must route to GitHubRepo",
);
}
}
#[test]
fn parse_github_archive_url_stays_direct() {
// URLs that point at a specific subresource (archive tarball, blob,
// tree) are real direct URLs — the user picked that exact path.
for spec in [
"https://github.com/obra/superpowers/archive/refs/heads/main.tar.gz",
"https://github.com/obra/superpowers/blob/main/README.md",
"https://github.com/obra/superpowers/tree/main",
] {
let parsed = InstallSource::parse(spec).unwrap();
assert!(
matches!(parsed, InstallSource::DirectUrl(_)),
"spec {spec} must stay DirectUrl, got {parsed:?}",
);
}
}
#[test]
fn parse_registry_source() {
let s = InstallSource::parse("my-skill").unwrap();
+42
@@ -17,3 +17,45 @@ pub(crate) fn lock_test_env() -> MutexGuard<'static, ()> {
Err(poisoned) => poisoned.into_inner(),
}
}
/// Find the byte position of the first divergence between two strings,
/// returning a windowed view (`±32 bytes` around the divergence) so failures
/// in cache-prefix-stability tests show *which* bytes drifted, not just that
/// they did. When one string is a strict prefix of the other, the position is
/// the shorter length and the two views report each side's length instead.
/// Returns `None` when the strings are byte-identical.
pub(crate) fn first_divergence(a: &str, b: &str) -> Option<(usize, String, String)> {
let a_bytes = a.as_bytes();
let b_bytes = b.as_bytes();
let max = a_bytes.len().min(b_bytes.len());
for i in 0..max {
if a_bytes[i] != b_bytes[i] {
let lo = i.saturating_sub(32);
let a_hi = (i + 32).min(a_bytes.len());
let b_hi = (i + 32).min(b_bytes.len());
let a_ctx = String::from_utf8_lossy(&a_bytes[lo..a_hi]).into_owned();
let b_ctx = String::from_utf8_lossy(&b_bytes[lo..b_hi]).into_owned();
return Some((i, a_ctx, b_ctx));
}
}
if a_bytes.len() != b_bytes.len() {
return Some((
max,
format!("(len={})", a_bytes.len()),
format!("(len={})", b_bytes.len()),
));
}
None
}
/// Assert two strings are byte-identical, panicking with a windowed diff
/// around the first divergence when they aren't. Used by the prefix-cache
/// stability harness (#263, #280) to pin construction surfaces that land in
/// DeepSeek's KV cache prefix.
#[track_caller]
pub(crate) fn assert_byte_identical(label: &str, a: &str, b: &str) {
if let Some((pos, a_ctx, b_ctx)) = first_divergence(a, b) {
panic!(
"{label}: prompt construction is non-deterministic — first diff at byte {pos}\n\
side A (±32B) \n{a_ctx:?}\n side B (±32B) \n{b_ctx:?}",
);
}
}
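For readers who only need the offset and not the context windows, the same comparison can be written with iterator adapters. This is an alternative sketch, not the helper used by the harness; `first_diff_offset` is a hypothetical name.

```rust
/// Byte offset of the first difference between `a` and `b`, or `None` when
/// they are byte-identical. A length mismatch counts as a divergence at the
/// end of the shorter input.
fn first_diff_offset(a: &str, b: &str) -> Option<usize> {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(i) => Some(i),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}
```

Comparing bytes rather than `char`s matches what the prefix cache sees, at the cost that an offset can land in the middle of a multi-byte UTF-8 sequence.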
+211 -4
@@ -7,7 +7,7 @@
//! - Filtering by capability
use std::collections::HashMap;
use std::sync::Arc;
use std::sync::{Arc, OnceLock};
use serde_json::Value;
@@ -24,6 +24,11 @@ use super::spec::{
pub struct ToolRegistry {
tools: HashMap<String, Arc<dyn ToolSpec>>,
context: ToolContext,
/// Memoised serialised tool catalog. Rebuilt lazily on first
/// `to_api_tools` call after a mutation; pinned across reads so the
/// description and schema bytes stay byte-stable for DeepSeek's KV
/// prefix cache. Invalidated on `register` / `remove` / `clear`.
api_cache: OnceLock<Vec<Tool>>,
}
impl ToolRegistry {
@@ -33,6 +38,7 @@ impl ToolRegistry {
Self {
tools: HashMap::new(),
context,
api_cache: OnceLock::new(),
}
}
@@ -42,6 +48,7 @@ impl ToolRegistry {
if self.tools.insert(name.clone(), tool).is_some() {
tracing::warn!("Overwriting existing tool: {}", name);
}
self.invalidate_api_cache();
}
/// Register multiple tools at once.
@@ -133,10 +140,32 @@ impl ToolRegistry {
}
/// Convert all tools to API Tool format for sending to the model.
///
/// Output is sorted by tool name for **prefix-cache stability** (#263).
/// Rust's `HashMap` uses a randomly-seeded hasher per process, so a raw
/// `self.tools.values()` iteration emits tools in a different order on
/// every `deepseek` launch, invalidating DeepSeek's KV prefix cache for
/// every cross-session resume. Sorting here matches the way Claude Code
/// stabilises its tool array (`assembleToolPool` in their reference).
///
/// The serialised catalog is memoised on first call and pinned across
/// reads so each tool's `description()` and `input_schema()` are sampled
/// exactly once per registration. MCP adapters whose upstream description
/// drifts on reconnect would otherwise rewrite the catalog mid-session
/// and bust the prefix cache. The cache is invalidated on `register`,
/// `remove`, and `clear`.
#[must_use]
pub fn to_api_tools(&self) -> Vec<Tool> {
self.tools
.values()
self.api_cache
.get_or_init(|| self.build_api_tools())
.clone()
}
fn build_api_tools(&self) -> Vec<Tool> {
let mut tools: Vec<&Arc<dyn ToolSpec>> = self.tools.values().collect();
tools.sort_by(|a, b| a.name().cmp(b.name()));
tools
.into_iter()
.map(|tool| Tool {
tool_type: None,
name: tool.name().to_string(),
@@ -151,6 +180,10 @@ impl ToolRegistry {
.collect()
}
fn invalidate_api_cache(&mut self) {
self.api_cache = OnceLock::new();
}
/// Convert tools to API Tool format with optional cache control on the last tool.
#[must_use]
#[allow(dead_code)]
@@ -230,13 +263,18 @@ impl ToolRegistry {
#[must_use]
#[allow(dead_code)]
pub fn remove(&mut self, name: &str) -> Option<Arc<dyn ToolSpec>> {
self.tools.remove(name)
let removed = self.tools.remove(name);
if removed.is_some() {
self.invalidate_api_cache();
}
removed
}
/// Clear all tools from the registry.
#[allow(dead_code)]
pub fn clear(&mut self) {
self.tools.clear();
self.invalidate_api_cache();
}
}
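The memoise-then-invalidate pattern used by `to_api_tools` can be shown in isolation. The sketch below is a toy stand-in, assuming tools are just name/description string pairs; `Catalog`, `register`, and `snapshot` are hypothetical names and do not mirror the real `ToolRegistry` signatures.

```rust
use std::collections::HashMap;
use std::sync::OnceLock;

/// Toy registry mapping a tool name to its description.
struct Catalog {
    tools: HashMap<String, String>,
    cache: OnceLock<Vec<(String, String)>>,
}

impl Catalog {
    fn new() -> Self {
        Self { tools: HashMap::new(), cache: OnceLock::new() }
    }

    /// Mutations replace the lock so the next read rebuilds the snapshot.
    fn register(&mut self, name: &str, description: &str) {
        self.tools.insert(name.to_string(), description.to_string());
        self.cache = OnceLock::new();
    }

    /// Name-sorted snapshot, built on first read and reused until the
    /// next mutation, so repeated reads return identical bytes.
    fn snapshot(&self) -> Vec<(String, String)> {
        self.cache
            .get_or_init(|| {
                let mut entries: Vec<_> = self
                    .tools
                    .iter()
                    .map(|(name, desc)| (name.clone(), desc.clone()))
                    .collect();
                entries.sort_by(|a, b| a.0.cmp(&b.0));
                entries
            })
            .clone()
    }
}
```

Because invalidation needs `&mut self`, swapping in a fresh `OnceLock` is enough; no interior locking is required beyond what `OnceLock` already provides for the read path.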
@@ -832,6 +870,175 @@ mod tests {
assert_eq!(api_tools[0].description, "A test tool");
}
/// Tool whose `description()` advances through a script of pre-built
/// strings, one per call. Used to demonstrate that the api-tools cache
/// pins the description bytes on first read instead of re-sampling them
/// each turn (#263 follow-up; mirrors reference-cc's `getToolSchemaCache`).
struct VaryingDescriptionTool {
name: String,
descriptions: Vec<String>,
next: std::sync::atomic::AtomicUsize,
}
impl VaryingDescriptionTool {
fn new(name: &str, descriptions: &[&str]) -> Self {
Self {
name: name.to_string(),
descriptions: descriptions.iter().map(|s| (*s).to_string()).collect(),
next: std::sync::atomic::AtomicUsize::new(0),
}
}
}
#[async_trait::async_trait]
impl ToolSpec for VaryingDescriptionTool {
fn name(&self) -> &str {
&self.name
}
fn description(&self) -> &str {
let idx = self
.next
.fetch_add(1, std::sync::atomic::Ordering::SeqCst)
.min(self.descriptions.len() - 1);
&self.descriptions[idx]
}
fn input_schema(&self) -> Value {
json!({"type": "object", "properties": {}, "required": []})
}
fn capabilities(&self) -> Vec<ToolCapability> {
vec![ToolCapability::ReadOnly]
}
async fn execute(
&self,
_input: Value,
_context: &ToolContext,
) -> Result<ToolResult, ToolError> {
Ok(ToolResult::success("ok".to_string()))
}
}
#[test]
fn to_api_tools_pins_description_bytes_across_calls() {
// Regression for the cache-stability follow-up: an MCP adapter that
// returns a different `description()` on reconnect (or any other
// tool whose description isn't a `&'static str`) would otherwise
// rewrite the catalog bytes mid-session and miss the prefix cache.
// The registry pins the first call's value until it's mutated.
let tmp = tempdir().expect("tempdir");
let ctx = ToolContext::new(tmp.path().to_path_buf());
let mut registry = ToolRegistry::new(ctx);
registry.register(Arc::new(VaryingDescriptionTool::new(
"varying",
&["first description", "second description"],
)));
let first = registry.to_api_tools();
let second = registry.to_api_tools();
assert_eq!(first.len(), 1);
assert_eq!(first[0].description, "first description");
assert_eq!(
first, second,
"api-tools catalog must be byte-identical across reads with no mutation in between"
);
}
#[test]
fn register_invalidates_api_tools_cache() {
// Counter-test: when a real change happens (a new tool registers,
// an existing one is removed, or `clear` is called), the cache must
// be discarded so the next read reflects the live registry.
let tmp = tempdir().expect("tempdir");
let ctx = ToolContext::new(tmp.path().to_path_buf());
let mut registry = ToolRegistry::new(ctx);
registry.register(Arc::new(VaryingDescriptionTool::new(
"varying",
&["first description", "second description"],
)));
let before = registry.to_api_tools();
assert_eq!(before.len(), 1);
registry.register(make_test_tool("late_arrival"));
let after = registry.to_api_tools();
assert_eq!(after.len(), 2, "cache must rebuild after register");
assert!(after.iter().any(|t| t.name == "varying"));
assert!(after.iter().any(|t| t.name == "late_arrival"));
// The varying tool's description advances on cache rebuild — the
// first read above sampled `first description`; this rebuild samples
// `second description`. The point is just that the bytes *can*
// change after a real mutation, not that they always do.
let varying_after = after
.iter()
.find(|t| t.name == "varying")
.expect("varying tool present");
assert_eq!(varying_after.description, "second description");
}
#[test]
fn remove_and_clear_invalidate_api_tools_cache() {
let tmp = tempdir().expect("tempdir");
let ctx = ToolContext::new(tmp.path().to_path_buf());
let mut registry = ToolRegistry::new(ctx);
registry.register(make_test_tool("alpha"));
registry.register(make_test_tool("beta"));
let before = registry.to_api_tools();
assert_eq!(before.len(), 2);
let _ = registry.remove("alpha");
let after_remove = registry.to_api_tools();
assert_eq!(after_remove.len(), 1);
assert_eq!(after_remove[0].name, "beta");
registry.clear();
let after_clear = registry.to_api_tools();
assert!(after_clear.is_empty(), "cache must clear with the registry");
}
#[test]
fn to_api_tools_emits_alphabetical_order_regardless_of_registration_order() {
// Regression for #263: HashMap iteration is non-deterministic across
// process launches, which busts DeepSeek's KV prefix cache for every
// cross-session resume. `to_api_tools` must emit by name regardless
// of registration order so two consecutive calls (and two distinct
// launches) produce byte-identical output.
let tmp = tempdir().expect("tempdir");
let ctx = ToolContext::new(tmp.path().to_path_buf());
let order_a = {
let mut registry = ToolRegistry::new(ctx.clone());
registry.register(make_test_tool("zebra"));
registry.register(make_test_tool("alpha"));
registry.register(make_test_tool("mango"));
registry
.to_api_tools()
.iter()
.map(|t| t.name.clone())
.collect::<Vec<_>>()
};
let order_b = {
let mut registry = ToolRegistry::new(ctx.clone());
registry.register(make_test_tool("alpha"));
registry.register(make_test_tool("mango"));
registry.register(make_test_tool("zebra"));
registry
.to_api_tools()
.iter()
.map(|t| t.name.clone())
.collect::<Vec<_>>()
};
assert_eq!(order_a, vec!["alpha", "mango", "zebra"]);
assert_eq!(order_a, order_b);
}
#[test]
fn test_registry_remove() {
let tmp = tempdir().expect("tempdir");
+43
@@ -59,6 +59,32 @@ pub enum AppMode {
Plan,
}
/// One row in the per-turn cache-telemetry ring (`/cache` debug surface, #263).
#[derive(Debug, Clone)]
pub struct TurnCacheRecord {
/// Provider-reported total input tokens for the turn (cache-hit +
/// cache-miss + uncategorized). Useful for sanity-checking that hits +
/// misses sum back to roughly the prompt size.
pub input_tokens: u32,
/// Provider-reported output tokens.
pub output_tokens: u32,
/// `prompt_cache_hit_tokens` from DeepSeek's usage payload. `None` when
/// the model in use does not report cache telemetry (see
/// `Capabilities::cache_telemetry_supported`).
pub cache_hit_tokens: Option<u32>,
/// `prompt_cache_miss_tokens`. `None` when the provider did not report it
/// — in that case the `/cache` formatter infers the miss as
/// `input_tokens - cache_hit_tokens`.
pub cache_miss_tokens: Option<u32>,
/// Approximate tokens spent re-sending prior `reasoning_content` on
/// V4-thinking tool-calling turns (chars/3 heuristic). Helps separate
/// cache misses caused by reasoning-replay churn from misses caused by
/// real prefix instability.
pub reasoning_replay_tokens: Option<u32>,
/// Local timestamp the turn telemetry was recorded.
pub recorded_at: Instant,
}
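// A sketch of the fallback the `cache_miss_tokens` doc comment describes:
// when the provider omits the miss count, infer it as input tokens minus
// cache-hit tokens. `effective_miss_tokens` is a hypothetical helper name; the
// real `/cache` formatter is not shown in this diff and may differ.

```rust
/// Hypothetical helper: miss tokens for one turn, reported or inferred.
fn effective_miss_tokens(
    input_tokens: u32,
    cache_hit_tokens: Option<u32>,
    cache_miss_tokens: Option<u32>,
) -> Option<u32> {
    match (cache_miss_tokens, cache_hit_tokens) {
        // Prefer the provider-reported value when it exists.
        (Some(miss), _) => Some(miss),
        // Otherwise infer it; saturate so odd payloads cannot underflow.
        (None, Some(hit)) => Some(input_tokens.saturating_sub(hit)),
        // No cache telemetry at all for this model.
        (None, None) => None,
    }
}
```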
/// DeepSeek reasoning-effort tier, mirrored on ChatGPT/Claude effort pickers.
///
/// The config file accepts all five string values for forward-compat with
@@ -666,6 +692,9 @@ pub struct App {
pub last_prompt_cache_hit_tokens: Option<u32>,
/// DeepSeek context-cache miss tokens from the last API call. Telemetry only.
pub last_prompt_cache_miss_tokens: Option<u32>,
/// Per-turn cache telemetry ring (`/cache` debug surface, #263). Newest
/// turn at the back. Capped at [`Self::TURN_CACHE_HISTORY_CAP`].
pub turn_cache_history: VecDeque<TurnCacheRecord>,
/// Approximate input tokens spent re-sending prior `reasoning_content` on
/// the last thinking-mode tool-calling turn (V4 §5.1.1 "Interleaved
/// Thinking"). Computed client-side at ~4 chars/token.
@@ -790,6 +819,19 @@ pub enum ApiKeyError {
// === App State ===
impl App {
/// Cap on [`Self::turn_cache_history`]. Holds enough turns to debug a long
/// session without being so large the on-screen `/cache` table wraps.
pub const TURN_CACHE_HISTORY_CAP: usize = 50;
/// Append a per-turn cache-telemetry record, trimming the oldest entry once
/// the ring exceeds [`Self::TURN_CACHE_HISTORY_CAP`].
pub fn push_turn_cache_record(&mut self, record: TurnCacheRecord) {
self.turn_cache_history.push_back(record);
while self.turn_cache_history.len() > Self::TURN_CACHE_HISTORY_CAP {
self.turn_cache_history.pop_front();
}
}
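// A standalone sketch of the same capped-ring behaviour, assuming a
// hypothetical cap of 3 and plain `u32` values in place of `TurnCacheRecord`.

```rust
use std::collections::VecDeque;

/// Hypothetical small cap, chosen only to keep the example short.
const SKETCH_CAP: usize = 3;

/// Push to the back, then trim from the front until the cap holds again.
fn push_capped(ring: &mut VecDeque<u32>, value: u32) {
    ring.push_back(value);
    while ring.len() > SKETCH_CAP {
        ring.pop_front();
    }
}
```

// The `while` loop (rather than a single `if`) also restores the invariant
// if the ring was already over the cap before the push.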
pub fn tr(&self, id: MessageId) -> &'static str {
tr(self.ui_locale, id)
}
@@ -1039,6 +1081,7 @@ impl App {
last_completion_tokens: None,
last_prompt_cache_hit_tokens: None,
last_prompt_cache_miss_tokens: None,
turn_cache_history: VecDeque::new(),
last_reasoning_replay_tokens: None,
workspace_context: None,
workspace_context_refreshed_at: None,
+19 -3
@@ -14,6 +14,7 @@ use ratatui::{
use unicode_width::UnicodeWidthStr;
use crate::commands;
use crate::localization::Locale;
use crate::palette;
use crate::skills::SkillRegistry;
use crate::tools::spec::ApprovalRequirement;
@@ -46,6 +47,7 @@ pub struct CommandPaletteView {
}
pub fn build_entries(
locale: Locale,
skills_dir: &Path,
workspace: &Path,
mcp_config_path: &Path,
@@ -54,7 +56,7 @@ pub fn build_entries(
let mut entries = Vec::new();
for command in commands::COMMANDS {
let mut description = command.palette_description();
let mut description = command.palette_description_for(locale);
if command.requires_argument() {
description.push_str(" ");
description.push_str(command.usage);
@@ -919,7 +921,13 @@ mod tests {
#[test]
fn command_palette_command_entries_include_links_and_config_but_not_removed_commands() {
let entries = build_entries(Path::new("."), Path::new("."), Path::new("mcp.json"), None);
let entries = build_entries(
Locale::En,
Path::new("."),
Path::new("."),
Path::new("mcp.json"),
None,
);
let command_labels = entries
.iter()
.filter(|entry| entry.section == PaletteSection::Command)
@@ -934,7 +942,13 @@ mod tests {
#[test]
fn command_palette_inserts_model_command_for_argument_entry() {
let entries = build_entries(Path::new("."), Path::new("."), Path::new("mcp.json"), None);
let entries = build_entries(
Locale::En,
Path::new("."),
Path::new("."),
Path::new("mcp.json"),
None,
);
let model = entries
.iter()
.find(|entry| entry.section == PaletteSection::Command && entry.label == "/model")
@@ -991,6 +1005,7 @@ mod tests {
],
};
let entries = build_entries(
Locale::En,
Path::new("."),
Path::new("."),
Path::new("mcp.json"),
@@ -1044,6 +1059,7 @@ mod tests {
}],
};
let entries = build_entries(
Locale::En,
Path::new("."),
Path::new("."),
Path::new("mcp.json"),
+12 -10
@@ -33,16 +33,18 @@ pub enum KeybindingSection {
}
impl KeybindingSection {
pub fn label(self) -> &'static str {
match self {
Self::Navigation => "Navigation",
Self::Editing => "Input editing",
Self::Submission => "Actions",
Self::Modes => "Modes",
Self::Sessions => "Sessions",
Self::Clipboard => "Clipboard",
Self::Help => "Help",
}
pub fn label(self, locale: crate::localization::Locale) -> &'static str {
use crate::localization::{MessageId, tr};
let id = match self {
Self::Navigation => MessageId::HelpSectionNavigation,
Self::Editing => MessageId::HelpSectionEditing,
Self::Submission => MessageId::HelpSectionActions,
Self::Modes => MessageId::HelpSectionModes,
Self::Sessions => MessageId::HelpSectionSessions,
Self::Clipboard => MessageId::HelpSectionClipboard,
Self::Help => MessageId::HelpSectionHelp,
};
tr(locale, id)
}
/// Stable ordering for help rendering — matches the variant declaration
-1
@@ -1,4 +1,3 @@
// TODO(integrate): Wire streaming collector into TUI rendering pipeline
#![allow(dead_code)]
//! Markdown stream collector for newline-gated rendering.
+19 -4
@@ -716,6 +716,14 @@ async fn run_event_loop(
app.last_prompt_cache_hit_tokens = usage.prompt_cache_hit_tokens;
app.last_prompt_cache_miss_tokens = usage.prompt_cache_miss_tokens;
app.last_reasoning_replay_tokens = usage.reasoning_replay_tokens;
app.push_turn_cache_record(crate::tui::app::TurnCacheRecord {
input_tokens: usage.input_tokens,
output_tokens: usage.output_tokens,
cache_hit_tokens: usage.prompt_cache_hit_tokens,
cache_miss_tokens: usage.prompt_cache_miss_tokens,
reasoning_replay_tokens: usage.reasoning_replay_tokens,
recorded_at: Instant::now(),
});
if let Some(error) = error {
app.status_message = Some(format!("Turn failed: {error}"));
}
@@ -1422,6 +1430,7 @@ async fn run_event_loop(
}
app.view_stack
.push(CommandPaletteView::new(build_command_palette_entries(
app.ui_locale,
&app.skills_dir,
&app.workspace,
&app.mcp_config_path,
@@ -4575,7 +4584,11 @@ fn render_footer(f: &mut Frame, area: Rect, app: &mut App) {
// within ~2s. Mirrors codex-rs's `FooterMode::QuitShortcutReminder`.
let quit_prompt = if app.quit_is_armed() {
Some(FooterToast {
text: "Press Ctrl+C again to quit".to_string(),
text: crate::localization::tr(
app.ui_locale,
crate::localization::MessageId::FooterPressCtrlCAgain,
)
.to_string(),
color: palette::STATUS_WARNING,
})
} else {
@@ -4614,7 +4627,7 @@ fn render_footer(f: &mut Frame, area: Rect, app: &mut App) {
// non-tool work falls back to the existing dot-pulse label.
props.state_label = active_subagent_status_label(app)
.or_else(|| active_tool_status_label(app))
.unwrap_or_else(|| crate::tui::widgets::footer_working_label(dot_frame));
.unwrap_or_else(|| crate::tui::widgets::footer_working_label(dot_frame, app.ui_locale));
props.state_color = palette::DEEPSEEK_SKY;
// Spout drift: only animate when low_motion is off. The textual
@@ -4942,7 +4955,7 @@ fn render_footer_from(
Vec::new()
};
let agents = if has(S::Agents) {
crate::tui::widgets::footer_agents_chip(running_agent_count(app))
crate::tui::widgets::footer_agents_chip(running_agent_count(app), app.ui_locale)
} else {
Vec::new()
};
@@ -5048,7 +5061,8 @@ fn footer_auxiliary_spans(app: &App, max_width: usize) -> Vec<Span<'static>> {
// coherence, in-flight sub-agents, reasoning replay tokens, cache hit
// rate, and session cost.
let coherence_spans = footer_coherence_spans(app);
let agents_spans = crate::tui::widgets::footer_agents_chip(running_agent_count(app));
let agents_spans =
crate::tui::widgets::footer_agents_chip(running_agent_count(app), app.ui_locale);
let replay_spans = footer_reasoning_replay_spans(app);
let cache_spans = footer_cache_spans(app);
let displayed_cost = app.displayed_session_cost();
@@ -5713,6 +5727,7 @@ fn handle_context_menu_action(app: &mut App, action: ContextMenuAction) {
ContextMenuAction::OpenCommandPalette => {
app.view_stack
.push(CommandPaletteView::new(build_command_palette_entries(
app.ui_locale,
&app.skills_dir,
&app.workspace,
&app.mcp_config_path,
+10 -5
@@ -88,7 +88,7 @@ impl HelpView {
}
pub fn new_for_locale(locale: Locale) -> Self {
let entries = build_entries();
let entries = build_entries(locale);
let mut view = Self {
locale,
entries,
@@ -144,17 +144,18 @@ impl HelpView {
}
}
fn build_entries() -> Vec<HelpEntry> {
fn build_entries(locale: Locale) -> Vec<HelpEntry> {
let mut entries = Vec::new();
for command in commands::COMMANDS {
let label = format!("/{}", command.name);
let localized = command.description_for(locale);
let description = if command.aliases.is_empty() {
command.description.to_string()
localized.to_string()
} else {
format!(
"{} (aliases: {})",
command.description,
localized,
command
.aliases
.iter()
@@ -182,7 +183,11 @@ fn build_entries() -> Vec<HelpEntry> {
for binding in KEYBINDINGS {
let label = binding.chord.to_string();
let description = format!("[{}] {}", binding.section.label(), binding.description);
let description = format!(
"[{}] {}",
binding.section.label(locale),
binding.description
);
let haystack = format!(
"{} {}",
label.to_ascii_lowercase(),
+25 -19
@@ -15,6 +15,7 @@ use ratatui::{
};
use unicode_width::UnicodeWidthStr;
use crate::localization::{Locale, MessageId, tr};
use crate::palette;
use crate::tui::app::{App, AppMode};
@@ -123,16 +124,18 @@ pub fn footer_working_strip_string(width: usize, frame: u64) -> String {
out
}
/// Pulse `working` through `working`, `working.`, `working..`, `working...`
/// Pulse the localized "working" label through 0–3 trailing ASCII dots
/// keyed off `frame`. The cycle period is 4 frames (matching the four
/// states), so adjacent ticks visibly differ. Returns a static-friendly
/// states), so adjacent ticks visibly differ. Dots stay ASCII regardless
/// of locale so the animation reads identically across scripts. Returns a
/// `String` so callers can drop it into a `Span::styled` without lifetime
/// gymnastics.
#[must_use]
pub fn footer_working_label(frame: u64) -> String {
pub fn footer_working_label(frame: u64, locale: Locale) -> String {
let dots = (frame % 4) as usize;
let mut out = String::with_capacity(7 + dots);
out.push_str("working");
let base = tr(locale, MessageId::FooterWorking);
let mut out = String::with_capacity(base.len() + dots);
out.push_str(base);
for _ in 0..dots {
out.push('.');
}
@@ -141,16 +144,18 @@ pub fn footer_working_label(frame: u64) -> String {
/// Build a "N agents" chip span list when there are sub-agents in flight.
/// Empty list when N == 0 hides the chip entirely. Singular for N == 1
/// reads naturally; plural otherwise.
/// reads naturally; plural otherwise. The pluralization template lives in
/// the locale registry so CJK locales can render the count without the
/// English plural-`s` artefact.
#[must_use]
pub fn footer_agents_chip(running: usize) -> Vec<Span<'static>> {
pub fn footer_agents_chip(running: usize, locale: Locale) -> Vec<Span<'static>> {
if running == 0 {
return Vec::new();
}
let text = if running == 1 {
"1 agent".to_string()
tr(locale, MessageId::FooterAgentSingular).to_string()
} else {
format!("{running} agents")
tr(locale, MessageId::FooterAgentsPlural).replace("{count}", &running.to_string())
};
vec![Span::styled(
text,
@@ -508,6 +513,7 @@ fn truncate_to_width(text: &str, max_width: usize) -> String {
mod tests {
use super::{FooterProps, FooterWidget, Renderable};
use crate::config::Config;
use crate::localization::Locale;
use crate::palette;
use crate::tui::app::{App, AppMode, TuiOptions};
use ratatui::{
@@ -597,20 +603,20 @@ mod tests {
// ---- agents chip wording ----
#[test]
fn footer_agents_chip_is_empty_when_no_agents_running() {
let chip = super::footer_agents_chip(0);
let chip = super::footer_agents_chip(0, Locale::En);
assert!(chip.is_empty(), "0 agents in flight → no chip");
}
#[test]
fn footer_agents_chip_uses_singular_for_one() {
let chip = super::footer_agents_chip(1);
let chip = super::footer_agents_chip(1, Locale::En);
assert_eq!(chip.len(), 1);
assert_eq!(chip[0].content.as_ref(), "1 agent");
}
#[test]
fn footer_agents_chip_uses_plural_for_many() {
let chip = super::footer_agents_chip(3);
let chip = super::footer_agents_chip(3, Locale::En);
assert_eq!(chip.len(), 1);
assert_eq!(chip[0].content.as_ref(), "3 agents");
}
@@ -618,7 +624,7 @@ mod tests {
#[test]
fn footer_agents_chip_renders_into_widget() {
let app = make_app();
let agents = super::footer_agents_chip(2);
let agents = super::footer_agents_chip(2, Locale::En);
let props = FooterProps::from_app(
&app,
None,
@@ -779,16 +785,16 @@ mod tests {
// The label sequence `working` → `working.` → `working..` →
// `working...` then wraps back. Each frame is a discrete tick;
// the cycle is exactly 4 frames so adjacent ticks visibly differ.
assert_eq!(super::footer_working_label(0), "working");
assert_eq!(super::footer_working_label(1), "working.");
assert_eq!(super::footer_working_label(2), "working..");
assert_eq!(super::footer_working_label(3), "working...");
assert_eq!(super::footer_working_label(0, Locale::En), "working");
assert_eq!(super::footer_working_label(1, Locale::En), "working.");
assert_eq!(super::footer_working_label(2, Locale::En), "working..");
assert_eq!(super::footer_working_label(3, Locale::En), "working...");
assert_eq!(
super::footer_working_label(4),
super::footer_working_label(4, Locale::En),
"working",
"wraps back at frame 4",
);
assert_eq!(super::footer_working_label(7), "working...");
assert_eq!(super::footer_working_label(7, Locale::En), "working...");
}
/// Render the footer at `width` and return the visible single-line text.
+131 -18
@@ -1,7 +1,7 @@
//! Utility helpers shared across the `DeepSeek` CLI.
use std::fs;
use std::path::Path;
use std::path::{Path, PathBuf};
use crate::models::{ContentBlock, Message};
use anyhow::{Context, Result};
@@ -38,6 +38,13 @@ pub fn is_key_file(path: &Path) -> bool {
}
/// Generate a high-level summary of the project based on key files.
///
/// Output is byte-stable across calls: `WalkBuilder` doesn't sort siblings
/// (the OS readdir order leaks through), so the joined `key_files` list
/// would otherwise reorder run-to-run on filesystems that don't pre-sort.
/// Only matters when the workspace has no `AGENTS.md` / `CLAUDE.md`, since
/// the system prompt routes through `ProjectContext::as_system_block` first
/// and only falls back here when no project-context document exists.
#[must_use]
pub fn summarize_project(root: &Path) -> String {
let mut key_files = Vec::new();
@@ -58,6 +65,8 @@ pub fn summarize_project(root: &Path) -> String {
}
}
key_files.sort();
if key_files.is_empty() {
return "Unknown project type".to_string();
}
@@ -90,38 +99,43 @@ pub fn summarize_project(root: &Path) -> String {
}
/// Generate a tree-like view of the project structure.
///
/// Sibling order is fixed by sorting collected paths — the underlying
/// `WalkBuilder` follows the OS readdir order, which is non-deterministic
/// across filesystems. Sorting by full path preserves the tree shape (a
/// directory still precedes its children because `"src" < "src/lib.rs"`)
/// while making the rendered output byte-stable across runs.
#[must_use]
pub fn project_tree(root: &Path, max_depth: usize) -> String {
let mut tree_lines = Vec::new();
let mut entries: Vec<(PathBuf, bool)> = Vec::new();
let mut builder = WalkBuilder::new(root);
builder
.hidden(false)
.follow_links(true)
.max_depth(Some(max_depth + 1));
let walker = builder.build();
for entry in walker {
let entry = match entry {
Ok(entry) => entry,
Err(_) => continue,
};
let path = entry.path();
for entry in builder.build().flatten() {
let depth = entry.depth();
if depth == 0 || depth > max_depth {
continue;
}
let rel_path = entry
.path()
.strip_prefix(root)
.unwrap_or(entry.path())
.to_path_buf();
let is_dir = entry.file_type().is_some_and(|ft| ft.is_dir());
entries.push((rel_path, is_dir));
}
let rel_path = path.strip_prefix(root).unwrap_or(path);
let indent = " ".repeat(depth - 1);
let prefix = if entry.file_type().is_some_and(|ft| ft.is_dir()) {
"DIR: "
} else {
"FILE: "
};
entries.sort_by(|a, b| a.0.cmp(&b.0));
let mut tree_lines = Vec::with_capacity(entries.len());
for (rel_path, is_dir) in entries {
let depth = rel_path.components().count();
let indent = " ".repeat(depth.saturating_sub(1));
let prefix = if is_dir { "DIR: " } else { "FILE: " };
tree_lines.push(format!(
"{}{}{}",
indent,
@@ -310,3 +324,102 @@ mod tests {
});
}
}
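// A sketch of why sorting collected paths keeps the tree shape, assuming a
// hypothetical `render_sorted` helper, two-space indentation, and file-name
// labels (the real `project_tree` formatting is truncated in this diff).

```rust
use std::path::PathBuf;

/// Sort `(relative path, is_dir)` pairs, then render one line per entry.
fn render_sorted(mut entries: Vec<(PathBuf, bool)>) -> Vec<String> {
    // `PathBuf` orders component-wise, so "src" sorts before "src/lib.rs".
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
        .into_iter()
        .map(|(path, is_dir)| {
            let depth = path.components().count();
            let indent = "  ".repeat(depth.saturating_sub(1));
            let prefix = if is_dir { "DIR: " } else { "FILE: " };
            let name = path
                .file_name()
                .map(|n| n.to_string_lossy().into_owned())
                .unwrap_or_default();
            format!("{indent}{prefix}{name}")
        })
        .collect()
}
```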
#[cfg(test)]
mod project_mapping_tests {
use super::{project_tree, summarize_project};
use std::fs;
use tempfile::tempdir;
#[test]
fn project_tree_sorts_siblings_alphabetically() {
// Cross-platform readdir doesn't guarantee alphabetical order — on
// ext4 with htree it's hash order, on APFS it's roughly insertion
// order, on ZFS it's storage-class dependent. The system prompt
// embeds this string in the cached prefix when a workspace has no
// AGENTS.md / CLAUDE.md, so the function has to be byte-stable
// across runs regardless of host filesystem.
let tmp = tempdir().expect("tempdir");
let root = tmp.path();
// Create files in a deliberately scrambled order to make the
// hosting filesystem's pre-sort (if any) less likely to mask a
// missing sort in our code.
fs::write(root.join("zebra.txt"), "z").expect("write zebra");
fs::write(root.join("apple.txt"), "a").expect("write apple");
fs::write(root.join("mango.txt"), "m").expect("write mango");
let tree = project_tree(root, 1);
let lines: Vec<&str> = tree.lines().collect();
let apple_pos = lines
.iter()
.position(|l| l.contains("apple.txt"))
.expect("apple line");
let mango_pos = lines
.iter()
.position(|l| l.contains("mango.txt"))
.expect("mango line");
let zebra_pos = lines
.iter()
.position(|l| l.contains("zebra.txt"))
.expect("zebra line");
assert!(apple_pos < mango_pos);
assert!(mango_pos < zebra_pos);
}
#[test]
fn project_tree_keeps_directory_before_its_children() {
// Sorting siblings by full path is enough to preserve tree shape:
// `"src" < "src/lib.rs"` because `Path` ordering is component-wise and
// a strict prefix compares less.
let tmp = tempdir().expect("tempdir");
let root = tmp.path();
let src = root.join("src");
fs::create_dir_all(&src).expect("mkdir src");
fs::write(src.join("lib.rs"), "lib").expect("write lib");
fs::write(src.join("main.rs"), "main").expect("write main");
let tree = project_tree(root, 2);
let src_pos = tree.find("DIR: src").expect("src dir line");
let lib_pos = tree.find("FILE: lib.rs").expect("lib file line");
let main_pos = tree.find("FILE: main.rs").expect("main file line");
assert!(src_pos < lib_pos, "directory must precede its children");
assert!(lib_pos < main_pos, "siblings sorted by name");
}
#[test]
fn project_tree_is_byte_stable_across_calls() {
let tmp = tempdir().expect("tempdir");
let root = tmp.path();
fs::write(root.join("z.txt"), "z").expect("write");
fs::write(root.join("a.txt"), "a").expect("write");
assert_eq!(project_tree(root, 1), project_tree(root, 1));
}
#[test]
fn summarize_project_sorts_key_files_in_fallback() {
// When `summarize_project` can't classify a project type it falls
// back to listing the discovered key files. That joined list must
// be deterministic so the system prompt that embeds it doesn't
// drift between runs on filesystems that emit readdir in a
// non-alphabetical order.
let tmp = tempdir().expect("tempdir");
let root = tmp.path();
// Use key files that don't trigger any of the type detectors
// (Cargo.toml / package.json / requirements.txt) so the function
// hits the `Project with key files: …` branch.
fs::write(root.join("Makefile"), "all:").expect("write makefile");
fs::write(root.join("README.md"), "# x").expect("write readme");
let summary = summarize_project(root);
assert!(
summary.starts_with("Project with key files: "),
"expected fallback branch; got: {summary}"
);
let suffix = summary
.strip_prefix("Project with key files: ")
.expect("prefix");
assert_eq!(suffix, "Makefile, README.md");
}
}
+79 -7
@@ -380,9 +380,17 @@ impl WorkingSet {
}
/// Render a compact working-set block for the system prompt.
///
/// Byte-stable across `next_turn()` calls when no new paths are observed
/// (#280): the rendered lines drop the turn-relative `touches` and
/// `last seen N turn(s) ago` fields, and the order is taken from
/// `sorted_for_prompt` (turn-agnostic) instead of `sorted_entries`.
/// The block lands in the system prompt before the historical
/// conversation; any byte that drifts here cache-misses everything that
/// follows in DeepSeek's KV prefix cache.
pub fn summary_block(&self, workspace: &Path) -> Option<String> {
let entries = self.sorted_entries();
let prompt_entries: Vec<&WorkingSetEntry> = entries
let prompt_entries: Vec<&WorkingSetEntry> = self
.sorted_for_prompt()
.into_iter()
.take(self.config.max_prompt_entries)
.collect();
@@ -404,12 +412,8 @@ impl WorkingSet {
if !prompt_entries.is_empty() {
lines.push("Active paths (prioritize these):".to_string());
for entry in prompt_entries {
let age = self.turn.saturating_sub(entry.last_turn);
let kind = if entry.is_dir { "dir" } else { "file" };
lines.push(format!(
"- {} ({kind}, touches: {}, last seen: {} turn(s) ago)",
entry.path, entry.touches, age
));
lines.push(format!("- {} ({kind})", entry.path));
}
}
@@ -531,6 +535,18 @@ impl WorkingSet {
});
entries
}
/// Turn-agnostic ordering used when rendering the prompt summary block.
/// `sorted_entries` mixes in a recency bonus from `self.turn`, so its
/// output reorders as turns advance even when no new paths are touched —
/// that movement would cross `max_prompt_entries` boundaries and bust the
/// KV prefix cache (#280). Compaction pinning still uses the recency-aware
/// `sorted_entries`; only the prompt-facing surface is stabilised here.
fn sorted_for_prompt(&self) -> Vec<&WorkingSetEntry> {
let mut entries: Vec<&WorkingSetEntry> = self.entries.values().collect();
entries.sort_by(|a, b| b.touches.cmp(&a.touches).then_with(|| a.path.cmp(&b.path)));
entries
}
}
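// A sketch of the turn-agnostic ordering and rendering described above.
// `EntrySketch` and `render_lines` are hypothetical, trimmed-down stand-ins
// for `WorkingSetEntry` and `summary_block`; only the sort key and the
// `- path (kind)` line shape are taken from the diff.

```rust
/// Hypothetical, trimmed-down stand-in for `WorkingSetEntry`.
#[allow(dead_code)]
struct EntrySketch {
    path: String,
    is_dir: bool,
    touches: u32,
    /// Deliberately ignored by the prompt-facing code below.
    last_turn: u64,
}

/// Order by touch count (descending), then by path; never by turn.
fn sorted_for_prompt(entries: &[EntrySketch]) -> Vec<&EntrySketch> {
    let mut out: Vec<&EntrySketch> = entries.iter().collect();
    out.sort_by(|a, b| b.touches.cmp(&a.touches).then_with(|| a.path.cmp(&b.path)));
    out
}

/// Render only turn-independent fields so the bytes stay stable.
fn render_lines(entries: &[EntrySketch]) -> Vec<String> {
    sorted_for_prompt(entries)
        .into_iter()
        .map(|entry| {
            let kind = if entry.is_dir { "dir" } else { "file" };
            format!("- {} ({kind})", entry.path)
        })
        .collect()
}
```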
fn score_entry(entry: &WorkingSetEntry, current_turn: u64) -> i64 {
@@ -986,6 +1002,62 @@ mod tests {
assert!(block.contains("src/lib.rs"));
}
/// #280 regression: `summary_block` must produce byte-identical output
/// across `next_turn()` advances when no new paths are touched. Prior to
/// the fix, the rendered lines interpolated `entry.touches` and
/// `self.turn - entry.last_turn`, both of which drift turn-over-turn even
/// when the path set is unchanged. The drift busted DeepSeek's KV prefix
/// cache on every user message because the working-set block lands in the
/// system prompt before the historical conversation.
#[test]
fn summary_block_is_byte_stable_across_next_turn_when_no_new_paths_observed() {
use crate::test_support::assert_byte_identical;
let tmp = TempDir::new().expect("tempdir");
fs::write(tmp.path().join("Cargo.toml"), "[package]\nname = \"x\"").expect("write");
let src = tmp.path().join("src");
fs::create_dir_all(&src).expect("mkdir");
fs::write(src.join("a.rs"), "a").expect("write");
fs::write(src.join("b.rs"), "b").expect("write");
let mut ws = WorkingSet::default();
ws.observe_user_message("Edit src/a.rs and src/b.rs", tmp.path());
let before = ws.summary_block(tmp.path()).expect("block before");
ws.next_turn();
let after = ws.summary_block(tmp.path()).expect("block after");
assert_byte_identical(
"summary_block must be stable across next_turn when no new paths touched",
&before,
&after,
);
}
/// Companion to the byte-stability test: a fresh path *should* invalidate
/// the block (the KV cache is allowed to miss when there's genuinely new
/// signal), so the model still sees newly touched paths after the block
/// stabilises across no-op turns.
#[test]
fn summary_block_changes_when_a_new_path_is_observed() {
let tmp = TempDir::new().expect("tempdir");
fs::write(tmp.path().join("Cargo.toml"), "[package]\nname = \"x\"").expect("write");
let src = tmp.path().join("src");
fs::create_dir_all(&src).expect("mkdir");
fs::write(src.join("a.rs"), "a").expect("write");
fs::write(src.join("c.rs"), "c").expect("write");
let mut ws = WorkingSet::default();
ws.observe_user_message("src/a.rs", tmp.path());
let before = ws.summary_block(tmp.path()).expect("block before");
ws.observe_user_message("src/c.rs", tmp.path());
let after = ws.summary_block(tmp.path()).expect("block after");
assert_ne!(before, after, "new path must update the rendered summary");
assert!(after.contains("src/c.rs"));
}
#[test]
fn extract_paths_from_message_picks_up_tool_results() {
let msg = Message {
+2 -2
@@ -1,7 +1,7 @@
{
"name": "deepseek-tui",
"version": "0.8.3",
"deepseekBinaryVersion": "0.8.3",
"version": "0.8.4",
"deepseekBinaryVersion": "0.8.4",
"description": "Install and run deepseek and deepseek-tui binaries from GitHub release artifacts.",
"author": "Hmbown",
"license": "MIT",