Commit Graph

295 Commits

Author SHA1 Message Date
Hunter Bown bef1895bed Merge pull request #518 from Hmbown/feat/489-memory-mvp
feat(memory): user-memory MVP — persistent notes, `# ` quick-add, /memory, remember tool (#489–#493)
2026-05-03 08:18:47 -05:00
Hunter Bown 7321165933 Merge pull request #517 from Hmbown/feat/505-inline-diff-rendering
feat(tools): inline unified-diff in edit_file / write_file results (#505)
2026-05-03 08:18:44 -05:00
Hunter Bown 7547d168a4 feat(memory): user-memory MVP — persistent notes, # quick-add, /memory, remember tool (#489–#493)
Adds a small, opt-in user-memory layer so the model has access to durable
preferences and conventions across sessions, and the user can dump quick
notes without leaving the TUI.

### What ships

- **Hierarchy loader** (#490): on every prompt assembly the engine reads
  `Config::memory_path()` (defaults to `~/.deepseek/memory.md`, override via
  `memory_path` in config or `DEEPSEEK_MEMORY_PATH`) and injects the file
  as a `<user_memory>` block alongside the existing `<project_instructions>`
  block. Goes above the volatile-content boundary so prefix-cache stays warm.
  Oversize files (>100 KiB) are truncated with a marker.
- **`# foo` composer quick-add** (#492): typing a single line that starts
  with `#` (but not `##` / `#!`) appends a timestamped bullet to the memory
  file and consumes the input — no turn fires. The composer status line
  surfaces the path that was written. Multi-`#` prefixes deliberately fall
  through so users can paste Markdown headings.
- **`/memory` slash command** (#491): `/memory` (or `/memory show`) prints
  the resolved path and contents inline; `/memory path`, `/memory clear`,
  and `/memory edit` (prints `${VISUAL:-${EDITOR:-vi}} <path>`) cover the
  rest of the manual-curation surface.
- **`remember` tool** (auto-update): model-callable tool that takes a
  `note` string and appends it as a bullet — the same persistence path as
  `# foo`. Auto-approved (writes only the user's own memory file). Only
  registered when memory is enabled, so it doesn't pollute the catalog when
  the feature is off.
- **Opt-in toggle** (#493): default behaviour is off. Enable with
  `[memory] enabled = true` in `config.toml` or `DEEPSEEK_MEMORY=on` in
  the environment.
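
The quick-add rule above can be sketched as a small classifier. This is a minimal illustration, not the shipped code: `parse_quick_add` and `format_bullet` are hypothetical names, and the bullet layout is an assumption, since the commit only says the bullet is timestamped.

```rust
/// Hypothetical helper: returns the note text when `input` is a memory
/// quick-add line, or `None` when it should fall through to a normal turn.
pub fn parse_quick_add(input: &str) -> Option<&str> {
    // Only single-line input qualifies.
    if input.contains('\n') {
        return None;
    }
    // Must start with `#`.
    let rest = input.strip_prefix('#')?;
    // `##` (Markdown heading) and `#!` (shebang) deliberately fall through.
    if rest.starts_with('#') || rest.starts_with('!') {
        return None;
    }
    // Reject input that is empty once the marker and whitespace are stripped.
    let note = rest.trim();
    if note.is_empty() {
        None
    } else {
        Some(note)
    }
}

/// Hypothetical bullet layout (assumed): the caller supplies the timestamp
/// so this sketch needs no clock or date dependency.
pub fn format_bullet(timestamp: &str, note: &str) -> String {
    format!("- ({timestamp}) {note}\n")
}
```

Returning `None` for `##` and `#!` keeps pasted Markdown headings and shebang lines on the regular turn-submission path.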

### What's wired

- New `crates/tui/src/memory.rs` module (`load`, `as_system_block`,
  `compose_block`, `append_entry`).
- New `crates/tui/src/tools/remember.rs` (`RememberTool` + 3 tests).
- New `crates/tui/src/commands/memory.rs` (`memory(app, arg)` handler).
- `EngineConfig` gains `memory_enabled: bool` + `memory_path: PathBuf`.
- `ToolContext` gains `memory_path: Option<PathBuf>`.
- `App` exposes `memory_path` + `use_memory` from `AppOptions` (previously
  destructured-and-dropped); `main.rs` populates `use_memory` from
  `config.memory_enabled()`.
- `system_prompt_for_mode_with_context_and_skills` accepts an optional
  `user_memory_block` parameter; the engine computes it via
  `memory::compose_block(...)` and threads it through.
- Composer Enter handler intercepts `# foo` only when
  `config.memory_enabled()` is true; otherwise falls through to existing
  turn-submission path.
- `MemoryConfig` table (`[memory] enabled`) added to `Config`, surfaced
  in `config.example.toml`, plumbed through `merge_config`.

### Tests

- 8 unit tests in `memory::tests` covering `load` (missing / whitespace /
  real), `as_system_block` (xml shape, empty input, oversize truncation),
  and `append_entry` (creation, repeated append, empty-after-strip rejection).
- 3 unit tests in `tools::remember::tests` covering disabled-state error,
  successful append, and missing-`note`-field validation.

### Verification

- `cargo fmt --all -- --check` ✓
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓
- `cargo test --workspace --all-features --locked` ✓ (1821 + supporting; was 1809 on main)

Closes #490 #491 #492 #493
Refines #489 (EPIC parent — phase-1 MVP delivered; phase-2 items
#494–#497 stay on the v0.9.0 board)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:51:17 -05:00
Hunter Bown f2cf3843ec feat(tools): inline unified-diff in edit_file / write_file results (#505)
`edit_file` and `write_file` now capture the file contents before and
after the mutation, generate a unified diff with `similar`, and emit it
at the head of the `ToolResult` body. The TUI's existing
`output_looks_like_diff` detector (history.rs:1335) sees the `@@`
header in the first 5 lines and routes the payload through
`diff_render::render_diff`, which already renders unified diffs with
line numbers and coloured `+`/`-` gutters.

The model also benefits — it sees exactly which lines changed instead
of just `Replaced N occurrence(s)` or `Wrote N bytes`. Identical
content produces an empty diff, in which case the body falls back to
`<summary>\n(no changes)`.

### What's wired

- New `crates/tui/src/tools/diff_format.rs` exposes
  `make_unified_diff(path, old, new) -> String` using
  `similar::TextDiff::from_lines(...).unified_diff().context_radius(3)`.
- `WriteFileTool::execute` snapshots prior contents (or empty for new
  files), writes, then emits `<diff>\n<summary>` where summary is
  `Wrote N bytes to PATH` for existing files and
  `Created PATH (N bytes)` for new ones.
- `EditFileTool::execute` snapshots, replaces, writes, emits
  `<diff>\nReplaced N occurrence(s) in PATH`.
- `similar = "2"` added to `crates/tui/Cargo.toml`. Pure-Rust, no
  C deps; v2.7.0 in Cargo.lock.
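
The body layout and the detector contract described in this commit can be sketched without the `similar` dependency. Both function names below are hypothetical stand-ins (the real detector is `output_looks_like_diff`), and the diff text is passed in rather than generated.

```rust
/// Hypothetical stand-in for the TUI detector: the payload counts as a
/// diff when an `@@` hunk header appears within the first 5 lines.
pub fn looks_like_unified_diff(output: &str) -> bool {
    output.lines().take(5).any(|line| line.starts_with("@@"))
}

/// Assembles a tool result body: diff first, summary last. An empty diff
/// (identical content) falls back to `<summary>\n(no changes)`.
pub fn tool_result_body(diff: &str, summary: &str) -> String {
    if diff.is_empty() {
        format!("{summary}\n(no changes)")
    } else {
        format!("{diff}\n{summary}")
    }
}
```

Putting the diff at the head of the body is what lets a first-lines check like this one trip, since the summary line no longer precedes the hunk header.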

### Tests

- 4 unit tests in `diff_format::tests` covering identical inputs,
  replacement, new-file (against empty), and presence of the `@@`
  header in the first 5 lines (so the TUI detector trips).
- Existing `test_write_file_tool` / `test_edit_file_tool` updated to
  assert both the summary line and the unified-diff body
  (`--- a/`, `-old`, `+new`).

### Verification

- `cargo fmt --all -- --check` ✓
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓
- `cargo test --workspace --all-features --locked` ✓ (1824 + supporting; was 1820)

Closes #505

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:30:44 -05:00
Hunter Bown b0f878bad6 Merge branch 'chore/v0.8.8-stabilization' into feat/498-osc8-hyperlinks 2026-05-03 02:18:00 -05:00
Hunter Bown dd3882e1bf fix(utils): display_path normalizes home-relative suffix to platform separator
`display_path_with_home` joined the `~` prefix with `MAIN_SEPARATOR_STR`
but called `rest.display()` for the suffix, which preserves whatever
separators the input carried. On Windows that produced mixed-separator
output like `~\projects/foo` for any path that came in with forward
slashes — visible in the tests that #506 added to lock down the
contract (the tests passed locally on Unix but failed on the
windows-latest CI runner).

Walk `rest.components()` and join each `Normal` component with
`MAIN_SEPARATOR_STR`. Pure-Rust, no extra deps, behavior is
byte-identical on Unix because the input separator was already `/`.
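
A minimal sketch of the component walk, assuming the home prefix has already been stripped from the path. `display_home_relative` is a hypothetical name; the real function is `display_path_with_home`, whose full signature is not shown in this commit.

```rust
use std::path::{Component, Path, MAIN_SEPARATOR_STR};

/// Renders `rest` (the part of a path below the home directory) as
/// `~<sep>a<sep>b`, always using the platform separator regardless of
/// which separators the input carried.
pub fn display_home_relative(rest: &Path) -> String {
    let mut out = String::from("~");
    for component in rest.components() {
        // Only plain path segments are kept; prefixes, root markers,
        // `.` and `..` are skipped in this sketch.
        if let Component::Normal(part) = component {
            out.push_str(MAIN_SEPARATOR_STR);
            out.push_str(&part.to_string_lossy());
        }
    }
    out
}
```

Because `components()` splits on every separator the platform accepts, a Windows input such as `projects/foo` is re-joined with backslashes instead of being echoed verbatim.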

Verified locally:
- `cargo test -p deepseek-tui --locked display_path` ✓ (5 passed)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:17:50 -05:00
Hunter Bown 3013a54c78 feat(tui): emit OSC 8 hyperlinks so URLs are Cmd+click-openable (#498)
Modern terminals (iTerm2, Terminal.app 13+, Ghostty, Kitty, WezTerm,
Alacritty, recent gnome-terminal/konsole) make a URL clickable when it's
wrapped in:

    \x1b]8;;TARGET\x1b\\LABEL\x1b]8;;\x1b\\

Terminals that don't understand the sequence simply render the visible
LABEL and ignore the escape, so emitting OSC 8 is a strict UX upgrade
for supporting terminals and a no-op for the rest.
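
The escape layout above can be sketched as follows. `wrap_link` mirrors the name used in this commit, while `strip_osc8` is a hypothetical simplification of the `strip_into` helper: it returns a new `String` and drops any unterminated sequence, which is an assumption about edge-case handling.

```rust
/// Wraps `label` in an OSC 8 hyperlink pointing at `target`, using the
/// ESC-backslash string terminator.
pub fn wrap_link(target: &str, label: &str) -> String {
    format!("\x1b]8;;{target}\x1b\\{label}\x1b]8;;\x1b\\")
}

/// Removes OSC 8 open/close sequences and keeps the visible label text.
/// Accepts both terminators: ESC-backslash and BEL.
pub fn strip_osc8(s: &str) -> String {
    const OPEN: &str = "\x1b]8;";
    let mut out = String::with_capacity(s.len());
    let mut rest = s;
    while let Some(start) = rest.find(OPEN) {
        // Keep everything before the escape sequence.
        out.push_str(&rest[..start]);
        let after = &rest[start + OPEN.len()..];
        // Find whichever terminator comes first, with its byte length.
        let st = after.find("\x1b\\").map(|i| (i, 2));
        let bel = after.find('\x07').map(|i| (i, 1));
        let end = match (st, bel) {
            (Some(a), Some(b)) => Some(if a.0 < b.0 { a } else { b }),
            (a, b) => a.or(b),
        };
        match end {
            Some((idx, len)) => rest = &after[idx + len..],
            // Unterminated sequence: drop the remainder (assumed policy).
            None => rest = "",
        }
    }
    out.push_str(rest);
    out
}
```

Stripping before copying to the clipboard means a selected link yields only its visible label, which in this commit is the bare URL.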

### What's wired

- New `crates/tui/src/tui/osc8.rs` module with `wrap_link(target, label)`,
  `strip_into(s, &mut out)`, and a process-wide `ENABLED` AtomicBool that
  defaults to `true`.
- `markdown_render::render_line_with_links` now wraps recognized URLs
  (`http(s)://…`) in OSC 8 when the runtime flag is on. Display width is
  computed from the bare URL — the escapes are zero-width on supporting
  terminals.
- `ui_text::line_to_string` and `line_to_plain` strip OSC 8 wrappers when
  the span content contains an escape, so selection / clipboard output
  carries clean URLs and not the raw escape codes.
- `[tui] osc8_links: bool` config (default `true`) added to `TuiConfig`,
  documented in `docs/CONFIGURATION.md`, and surfaced in
  `config.example.toml`. `run_tui` applies it at startup.

### Tests

- 7 unit tests in `osc8::tests` covering wrap, strip-with-ESC-terminator,
  strip-with-BEL-terminator, plain passthrough, mixed escapes, default
  state, and round-trip set/unset.
- 2 markdown_render tests proving URLs in paragraph blocks emit the OSC 8
  wrapper when enabled and emit plain text when disabled.
- 2 ui_text tests proving `line_to_plain` strips OSC 8 wrappers from spans
  and passes plain spans through unchanged.

Tests that touch the global ENABLED flag serialize through a static
Mutex inside the test module so tests run by cargo's parallel runner
can't observe each other's changes to the flag.
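
One way to serialize tests around a process-wide flag, as a sketch. The `ENABLED` name comes from the commit; `set_enabled`, `enabled`, `TEST_LOCK`, and `lock_flag_for_test` are hypothetical names, and the memory ordering is an assumption.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Mutex, MutexGuard};

/// Process-wide switch, on by default.
static ENABLED: AtomicBool = AtomicBool::new(true);

/// Held by any test that reads or flips `ENABLED`.
static TEST_LOCK: Mutex<()> = Mutex::new(());

pub fn set_enabled(on: bool) {
    ENABLED.store(on, Ordering::Relaxed);
}

pub fn enabled() -> bool {
    ENABLED.load(Ordering::Relaxed)
}

/// Acquires the test lock. A poisoned lock (an earlier test panicked while
/// holding it) is recovered so one failure doesn't cascade into others.
pub fn lock_flag_for_test() -> MutexGuard<'static, ()> {
    TEST_LOCK.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}
```

Each test takes the guard first, flips the flag, asserts, and restores the default before the guard drops, so a concurrent test never sees an intermediate value.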

### Verification

- `cargo fmt --all -- --check` ✓
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓
- `cargo test --workspace --all-features --locked` ✓ (1820 + supporting; was 1809)

Closes #498

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:13:14 -05:00
Hunter Bown ad8064b143 chore(v0.8.8): stabilization batch — sub-agent caps, mutex contention, RLM polish, CI cleanup
Bundles the v0.8.8 stabilization fixes that were already implemented in the
working tree, plus the workflow/doc reconciliation called out in #507.

### Sub-agent runtime fixes
- **#509** Default sub-agent cap raised to 10 (configurable via
  `[subagents].max_concurrent` in `config.toml`, hard ceiling 20). The
  running-count calculation now ignores non-running, no-handle, and finished
  handles so completed agents stop counting against the cap.
- **#510** `SharedSubAgentManager` is now `Arc<RwLock<...>>`; the read paths
  that previously held a `Mutex` for inspection now take a read lock,
  eliminating the multi-agent fan-out UI freeze.
- **#511** `compact_tool_result_for_context` summarizes `agent_result` /
  `agent_wait` payloads before they are folded into the parent context.
- **#512** RLM tool cards map to `ToolFamily::Rlm` and render `rlm`, not
  `swarm`. Stale "swarm" wording cleaned in docs/comments/tests.
- **#513** (foreground stopgap only) Foreground RLM work is visible in the
  Agents sidebar projection. Full async RLM lifecycle remains v0.8.9 — the
  issue stays open with a refined scope.
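
The cap and running-count rules from #509 and the read-lock path from #510 can be sketched like this. All type and function names are hypothetical; only the default of 10, the ceiling of 20, and the "ignore non-running, no-handle, and finished handles" rule come from the commit.

```rust
use std::sync::RwLock;

pub const DEFAULT_MAX_CONCURRENT: usize = 10;
pub const HARD_CEILING: usize = 20;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AgentStatus {
    Running,
    Completed,
}

/// Hypothetical per-agent bookkeeping record.
#[derive(Debug, Clone)]
pub struct AgentEntry {
    pub status: AgentStatus,
    pub has_handle: bool,
    pub handle_finished: bool,
}

/// Applies the default when unset and never exceeds the hard ceiling.
pub fn effective_cap(configured: Option<usize>) -> usize {
    configured.unwrap_or(DEFAULT_MAX_CONCURRENT).min(HARD_CEILING)
}

/// Counts only agents that are running, have a handle, and whose handle
/// has not finished, so completed agents stop counting against the cap.
pub fn running_count(agents: &[AgentEntry]) -> usize {
    agents
        .iter()
        .filter(|a| a.status == AgentStatus::Running && a.has_handle && !a.handle_finished)
        .count()
}

/// Inspection path: takes a read lock only, so several readers (such as
/// UI refreshes) can proceed at once.
pub fn can_spawn(manager: &RwLock<Vec<AgentEntry>>, configured: Option<usize>) -> bool {
    let agents = manager.read().unwrap_or_else(|e| e.into_inner());
    running_count(&agents) < effective_cap(configured)
}
```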

### TUI / UX fixes
- **#487** Offline composer queue is now session-scoped; legacy unscoped
  queues fail closed.
- **#488** Composer Option+Backspace deletes by word; cross-platform key
  routing helpers added.
- **#443/#444** Keyboard enhancement flags pop on normal AND panic exit; the
  raw-mode startup probe is now bounded by a configurable timeout.
- **#449** Production footer reads statusline colors from `app.ui_theme`
  rather than the bespoke palette.
- **#506** `display_path_with_home` no longer mutates `HOME` in tests; the
  flake on shared-env CI is gone.

### Self-update / packaging
- **#503** `update.rs` arch mapping uses release-asset naming (`arm64`/`x64`)
  instead of the raw Rust constants. The platform-asset selector also rejects
  `.sha256` siblings as primary binaries. Tests now live alongside the source
  in `mod tests` (the `#[path]`-based integration test was removed because it
  duplicated test runs and forced a `pub(crate)` helper that no real caller
  used).
- **`Max 5 in flight` wording updated** in `agent_spawn` description,
  `prompts/base.md`, and `docs/TOOL_SURFACE.md` so the model sees the real
  default cap (10) and the configuration knob name.
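
For #503, the arch mapping and the `.sha256` rejection might look roughly like this. The function names and the asset file names in the usage are illustrative assumptions; only the `arm64`/`x64` naming and the checksum-sibling rule come from the commit.

```rust
/// Maps a Rust architecture constant (as in `std::env::consts::ARCH`) to
/// the name used in release assets. Unknown architectures yield `None`.
pub fn release_arch(rust_arch: &str) -> Option<&'static str> {
    match rust_arch {
        "aarch64" => Some("arm64"),
        "x86_64" => Some("x64"),
        _ => None,
    }
}

/// Picks the primary binary for `os`/`arch` from a list of asset names,
/// never selecting a `.sha256` checksum sibling.
pub fn select_primary_asset<'a>(assets: &[&'a str], os: &str, arch: &str) -> Option<&'a str> {
    assets
        .iter()
        .copied()
        .find(|name| name.contains(os) && name.contains(arch) && !name.ends_with(".sha256"))
}
```

Without the suffix check, a checksum file listed ahead of the binary would match the same `os` and `arch` substrings and be picked as the download.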

### CI / release docs (#507)
- Pruned three duplicated/dead workflows: `crates-publish.yml`, `parity.yml`,
  `publish-npm.yml`. Their gates already run in `ci.yml` for every push/PR.
- `release.yml` build job now allows `parity` to be skipped (it only runs on
  tag push), unblocking `workflow_dispatch` reruns. The job still fails
  closed on a real parity failure.
- `RELEASE_RUNBOOK.md` reconciled: crate publishing is documented as the
  manual `scripts/release/publish-crates.sh` flow (no automated workflow);
  references to the deleted workflows removed.
- `CLAUDE.md` notes the `RELEASE_TAG_PAT` requirement for the auto-tag →
  release.yml chain (without it, the tag is created but `release.yml` does
  not fire) and documents the `workflow_dispatch` parity-skip behavior.

### Docs
- `docs/COMPETITIVE_ANALYSIS.md` added — capability matrix vs OpenCode and
  Codex CLI, gap analysis, and recommended implementation order.

### Verification (this branch)
- `cargo fmt --all -- --check` ✓
- `cargo check --workspace --all-targets --locked` ✓
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓
- `cargo test --workspace --all-features --locked` ✓ (1809 + supporting)
- Parity gates ✓ (snapshot, parity_protocol, parity_state)
- `cargo build --release --locked -p deepseek-tui-cli -p deepseek-tui` ✓
- Lockfile drift guard ✓
- `deepseek doctor --json` clean
- `deepseek eval` (offline harness) success=true, 0 tool errors

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 01:57:37 -05:00
Hunter Bown 15da881e1d chore(release): bump version to 0.8.7
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:56:37 -05:00
Hunter Bown a4102ccad4 fix(tui): allow selection across all transcript cell types
The selection-tightening from 7125172f restricted copy/select to user
and assistant cell bodies only, which made it impossible to copy text
from system notes, thinking blocks, or tool output. Drop the
body-start gate so the rendered transcript block is selectable in full.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 21:54:57 -05:00
Hunter Bown 7125172f67 fix(tui): tighten selection and live task panels 2026-05-02 21:05:15 -05:00
Hunter Bown 5bfc1feb62 v0.8.6: survivability, UX polish, and release hardening
Merge the v0.8.6 feature batch and release hardening.

Includes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes.

Remote PR checks passed on the final head before merge.
2026-05-02 20:11:33 -05:00
Hunter Bown 2d61513a9e v0.8.5: config test fixes + default_model session-apply bugfix (#381)
* feat: add config UI support for TUI and web modes

- Introduced a new `config_ui.rs` module to handle configuration UI for TUI and web.
- Updated `TuiOptions` and `App` structures to include `config_path` and `config_profile`.
- Implemented functions to build and apply configuration documents.
- Added tests to ensure the new configuration UI behaves as expected.
- Integrated web configuration session handling into the event loop.
- Updated various modules to accommodate the new configuration options and UI.

* refactor(tui): remove local path reference for schemaui dependency

Remove the local file system path reference for schemaui in favor of
using the published crate from the registry. This change updates the
Cargo.toml to use only the version specification and adds the source
and checksum information to Cargo.lock.

* fix: add AGENTS.md guide and improve config error handling

- Add comprehensive AGENTS.md file with project instructions for AI
  assistants, including build commands, dependencies, and GitHub
  operations guidance
- Introduce is_error field to CommandResult struct for better error
  tracking
- Refactor config application logic to properly handle errors using
  the new is_error flag
- Add test utilities for WebConfigSession to support testing
- Optimize web config event polling by extracting drain logic into
  separate function
- Add unit tests for session-only config application and engine sync
  requirements

* fix(security): add SSRF protection to fetch_url (#261)

Block private, link-local, and cloud metadata IPs in fetch_url HTTP requests.

Co-authored-by: JasonOA888
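
A minimal sketch of the address check such a guard needs, using only the standard library. `is_blocked_ip` and its exact block list are assumptions, not the shipped `fetch_url` logic. The common metadata address 169.254.169.254 falls inside the link-local range, so it is covered by that check.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Returns true when a resolved address should not be fetched.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        // IPv4-mapped IPv6 addresses are checked as their IPv4 form so
        // `::ffff:10.0.0.1` cannot slip past the IPv4 rules.
        IpAddr::V6(v6) => match v6.to_ipv4_mapped() {
            Some(v4) => is_blocked_v4(v4),
            None => is_blocked_v6(v6),
        },
    }
}

fn is_blocked_v4(ip: Ipv4Addr) -> bool {
    ip.is_private()
        || ip.is_loopback()
        || ip.is_link_local() // 169.254.0.0/16, includes 169.254.169.254
        || ip.is_unspecified()
        || ip.is_broadcast()
}

fn is_blocked_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    ip.is_loopback()
        || ip.is_unspecified()
        || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
        || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
}
```

A check like this has to run on the address the connection will actually use, after DNS resolution, because a public hostname can resolve to a private address.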

* test(portability): inject paths instead of mutating HOME (Windows fix)

CI's `Test (windows-latest)` job failed because both my new tests
(composer_history and the spawn_supervised crash-dump test) mutated
HOME to redirect `dirs::home_dir()`. That works on macOS / Linux but
not on Windows, where dirs::home_dir() reads USERPROFILE / queries
SHGetKnownFolderPath rather than HOME.

Fix: refactor both modules to expose path-injecting helpers so tests
never need to touch the env var:

- composer_history: split load_history / append_history into thin
  wrappers around load_history_from(&Path) / append_history_to(&Path).
  Tests use the *_to / *_from form with a tempdir path.
- utils::write_panic_dump: same pattern — write_panic_dump_to(&Path)
  takes the crash dir directly. The spawn_supervised end-to-end test
  splits into two: one verifies panic-doesn't-propagate (no on-disk
  side effect needed), one verifies write_panic_dump_to writes the
  expected log format.

Production callers continue to use the env-driven default (`HOME`/
`USERPROFILE` via `dirs::home_dir()`) so no behavior change. Tests
work identically on every platform now.
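
The path-injection pattern can be sketched as below. The `*_to` / `*_from` names follow the commit; the bodies are assumptions. The sketch also resolves the home directory from environment variables to stay dependency-free, whereas the commit says production code goes through `dirs::home_dir()`.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Core logic takes the file path directly, so tests can pass a temp path.
pub fn append_history_to(path: &Path, entry: &str) -> io::Result<()> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent)?;
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{entry}")
}

/// A missing or unreadable file is treated as an empty history.
pub fn load_history_from(path: &Path) -> Vec<String> {
    match fs::read_to_string(path) {
        Ok(text) => text.lines().map(str::to_owned).collect(),
        Err(_) => Vec::new(),
    }
}

/// Stand-in for the production default location (assumption: env lookup
/// replaces `dirs::home_dir()` in this sketch).
pub fn default_history_path() -> Option<PathBuf> {
    let home = std::env::var_os("HOME").or_else(|| std::env::var_os("USERPROFILE"))?;
    Some(PathBuf::from(home).join(".deepseek").join("composer_history.txt"))
}

/// Thin production wrapper: resolve the default path, then delegate.
pub fn append_history(entry: &str) -> io::Result<()> {
    match default_history_path() {
        Some(path) => append_history_to(&path, entry),
        None => Ok(()),
    }
}
```

Tests call only the `*_to` / `*_from` forms with a temporary path, so no test mutates process-wide environment state.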

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tui): clear chat area each frame so stale cells don't bleed into sidebar

ChatWidget's render path was `Paragraph::new(lines).render(content_area, buf)`
with no Block and no Clear — ratatui's Paragraph only writes cells that
contain text, leaving any cell the current frame's paragraph doesn't
touch holding the *previous* frame's contents. With wide tool output
(`gh pr list`, `git log`) emitting ISO-8601 timestamps like
`2026-05-02T07:29:24Z`, then a subsequent shorter-paragraph frame, the
old timestamp tails (`:24Z`, `7:29:24Z`, etc.) persisted on the right
edge of the chat area, visually colliding with the section headers in
the sidebar (`Plan` rendering as `:24Zan`, `Agents` as `:24Zents`).

Fix: render `Clear` over the full content_area before drawing the
Paragraph. Cheap (one buffer-fill per frame) and guarantees stale cells
can never persist into the next frame's render.

Reported in v0.8.5 testing right after install. The other v0.8.5
bordered widgets (composer, sidebar sections, footer) already render
into a Block with a solid background style, so they were never
affected — only the chat area used a bare Paragraph.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(theme): vendor + theme schemaui to deepseek navy palette (config UI)

The schemaui-0.12.0 crate the contributor brought in via #365 ships
hardcoded Color::Gray / Color::DarkGray / Color::White / Color::Yellow
references across its rendering components. Visually it clashed with
the rest of deepseek-tui — the editor area read as gray-on-black on a
TUI that's otherwise navy ink + sky accents. Two ship-day options
weren't acceptable: defaulting back to the legacy modal lost the new
editor's UX, and living with gray was off-brand.

This commit forks schemaui at 0.12.0 into vendor/schemaui-0.12.0 and
themes the rendering layer to match deepseek-tui's palette. The patch
is wired in via a workspace-level [patch.crates-io] override so the
deepseek-tui Cargo.toml continues to depend on `schemaui = "0.12.0"`
and would automatically resolve back to crates.io if we ever drop the
override (e.g. once upstream lands a ColorTheme API).

Changes inside the vendored fork:

- New `src/deepseek_palette.rs` with the brand RGB values:
  SURFACE_INK / SURFACE_RAISED for backgrounds, BORDER_DIM /
  BORDER_ACTIVE for chrome, TEXT_PRIMARY / TEXT_MUTED / TEXT_DIM,
  ACCENT_SKY / ACCENT_BLUE / ACCENT_PURPLE, and STATUS_OK / WARN /
  ERROR. Values mirror crates/tui/src/palette.rs in the workspace.
- `src/lib.rs` exposes the palette module under `cfg(feature = "tui")`.
- `src/tui/view/frame.rs::draw` paints a navy backdrop across the
  full frame area before any child widget renders, so any cell that
  doesn't get explicitly written reads as ink instead of the terminal
  default.
- `tabstrip.rs`, `overlay.rs`, `popup.rs`, `body.rs`, `sections.rs`,
  `footer.rs`, `help.rs`, `fields.rs`: every Color::Gray / DarkGray /
  White / Yellow / Cyan / Blue / Magenta / Red / Green / LightBlue
  swapped out for a deepseek_palette token, plus explicit `bg(...)`
  fills on the top-level Block styles and Paragraph wrappers.
- `Cargo.toml` adds an empty `[workspace]` so the vendored crate
  builds standalone (its dev-deps don't drift into ours).

Workspace-level changes:

- `Cargo.toml` adds `[patch.crates-io] schemaui = { path =
  "vendor/schemaui-0.12.0" }`. Production deepseek-tui builds pick up
  the themed fork transparently.
- `.gitignore` excludes `vendor/.../web/ui/node_modules/` (15 MB of
  npm artefacts the Rust build doesn't need) and the vendored
  Cargo.lock (regenerated locally per build).

Verification:
- cargo build --workspace --all-features: clean
- cargo clippy --workspace --all-targets --all-features --locked: clean
- cargo test --workspace: 1777 passed, 0 failed
- /config inside `deepseek` now opens a navy-themed editor matching
  the rest of the TUI; tabs, body panel, footer, popup, and help
  overlay all read on brand.

Future work tracked separately: upstream a `with_theme(ColorTheme)`
builder API to schemaui so we can drop the fork. Until then, sync the
fork against new schemaui releases when we want their fixes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Revert "feat(theme): vendor + theme schemaui to deepseek navy palette"

This reverts ed597ccc — vendoring 28,913 lines of schemaui to recolor
a config editor was the wrong tradeoff. Maintenance cost for a
cosmetic match wasn't worth it, and the recolor wasn't even fully
working (terminal-default bg kept bleeding through Style::default()
calls in the form fields).

The simpler path: keep the schemaui-driven editor available as
`/config tui` for users who want the form-style UX, but make bare
`/config` open the legacy native modal that already matches the
deepseek-tui navy chrome by inheritance. No fork, no vendored copy,
no ongoing sync burden.

Changes:
- `git rm -r vendor/schemaui-0.12.0/` (28,913 lines gone)
- Drop `[patch.crates-io]` from workspace Cargo.toml — schemaui
  resolves back to crates.io v0.12.0 unmodified.
- Drop the corresponding `.gitignore` exclusions (no more vendor dir
  to filter).
- `config_ui::parse_mode` default mode flipped from `Tui` to `Native`.
  Bare `/config` → legacy navy modal. Explicit `/config tui` → the
  contributor's schemaui editor (still available, gray-on-default
  chrome, but opt-in). `/config web` and `/config <key>` /
  `/config <key> <value>` unchanged.
- Help text updated to list `[native|tui|web]` in that order.

Verified: cargo build / clippy --workspace --all-features --locked
with -D warnings: clean.

The contributor's work (#365) ships and gets credit; users discover
the alternate editor via the help text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tui): paint chat area with explicit navy ink instead of Clear

The Clear-instead-of-fill in 0ae2cead reset cells to the terminal's
default background, which read as a brown-gray on most user setups
even though the rest of the TUI chrome is navy. Replace the Clear
with an explicit Block fill at palette::DEEPSEEK_INK, and pass the
same bg through to the Paragraph itself so streamed text cells
inherit ink rather than bouncing back to terminal default.

Net effect: the chat area visually unifies with the sidebar /
composer / footer instead of showing as a contrasting brown-gray
panel in the middle of an otherwise navy frame.

Stale-cell guarantee from #372-followup is preserved — the Block
fills every cell in the area on each frame, so wide tool output
(`gh pr list` ISO timestamps, etc.) still can't bleed past the
current frame's actual text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(config): update tests for Native default + fix default_model override in session-only apply

- Update test_show_config_defaults_to_native and
  execute_config_opens_config_view_action to expect
  OpenConfigView (Native) instead of OpenConfigEditor(Tui),
  matching the parse_mode default change from ce98f054.

- Fix apply_document bug where default_model was processed
  in the main key-value loop after model, causing
  set_config_value('default_model') to overwrite the
  runtime model. default_model is now only applied when
  persist=true, preventing session-only edits from being
  silently reverted.

* style: cargo fmt

* chore: remove end-of-night report (session artifact)

---------

Co-authored-by: unic <yuniqueunic@gmail.com>
Co-authored-by: Jason <jason@aveoresearchlabs.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: YuniqueUnic <YuniqueUnic@users.noreply.github.com>
2026-05-02 16:25:03 -05:00
Hunter Bown 07be656412 style: cargo fmt --all sweep
Applies the workspace formatter to the v0.8.5 commits — local builds
ran without `cargo fmt --check` so a few format inconsistencies
slipped through and CI's `parity` job (which runs fmt --check) failed.
Mechanical reflow only; no functional changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:50:59 -05:00
Hunter Bown 216f6be349 chore(release): bump version to 0.8.5
Workspace + npm wrapper + every internal crate path-dep pin moved from
0.8.4 → 0.8.5. scripts/release/check-versions.sh confirms parity across
the three sources. cargo build / clippy / test all clean.

Pushing this commit to main is the trigger for auto-tag.yml to create
the v0.8.5 tag, which fires release.yml to build the cross-platform
matrix and draft the GitHub Release. The npm publish remains a manual
follow-up (2FA on every publish, no automation token provisioned).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 14:48:18 -05:00
Hunter Bown b678058ff0 test: serialise composer_history env mutation + update rlm test for #358
Two test fixes uncovered by the full-suite run:

1. composer_history tests were using a module-local mutex to serialize
   their HOME env mutation, but other tests in the workspace (config,
   commands::restore, etc.) ALSO mutate HOME without that lock. Switch
   to the crate-wide `test_support::lock_test_env()` so all HOME-
   mutating tests share one mutex.

2. The `prompts::tests::rlm_first_class_guidance_present` test was
   pinning the OLD "RLM Is First-Class" framing that #358 deliberately
   reframed as "RLM Is a Specialty Tool". Renamed the test to
   `rlm_specialty_tool_guidance_present` and updated the assertions to
   guard the new framing — so a future encouraging-language regression
   lights up CI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:40:57 -05:00
Hunter Bown 735287774f fix(approval): cache denials per session — ESC on dangerous command stops re-prompting (#360)
When the user pressed ESC (or Deny / Abort) on an approval prompt, the
TUI correctly told the engine to deny the call. But the model would
often retry the same command — same name, same args, same approval
fingerprint — and the user would see the dialog again, frustrating in
the same way the equivalent yes-yes-yes loop would be.

Symmetric to the existing `approval_session_approved` "always approve"
cache: add `approval_session_denied: HashSet<String>` populated when
the user denies (not when the timeout fired — a timeout might mean
the user stepped away rather than refused). Subsequent ApprovalRequired
events whose approval_key or tool_name match the cache auto-deny via
`engine.deny_tool_call(...)` without re-showing the dialog. Logged via
`tool.approval.auto_deny_session` so the audit log captures the silent
denial.
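
The session denial cache can be sketched as follows. `ApprovalCache`, `PromptOutcome`, and the method names are hypothetical; the commit only names the `approval_session_denied: HashSet<String>` field and the rule that timeouts are not cached.

```rust
use std::collections::HashSet;

/// How an approval prompt ended.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PromptOutcome {
    Approved,
    Denied,
    TimedOut,
}

/// Hypothetical per-session cache of explicit denials.
#[derive(Debug, Default)]
pub struct ApprovalCache {
    denied: HashSet<String>,
}

impl ApprovalCache {
    /// Records the outcome. Only an explicit denial is cached: a timeout
    /// may just mean the user stepped away.
    pub fn record(&mut self, approval_key: &str, outcome: PromptOutcome) {
        if outcome == PromptOutcome::Denied {
            self.denied.insert(approval_key.to_owned());
        }
    }

    /// A later request is auto-denied when either its approval key or its
    /// tool name is in the cache.
    pub fn should_auto_deny(&self, approval_key: &str, tool_name: &str) -> bool {
        self.denied.contains(approval_key) || self.denied.contains(tool_name)
    }
}
```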

Closes #360.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:35:25 -05:00
Hunter Bown 162e2e027c feat(composer): cross-session input history persistence (#366)
Pressing Up-arrow at the composer now recalls submissions from previous
sessions, not just the current one. Implementation:

- New `crates/tui/src/composer_history.rs` module with `load_history()`
  + `append_history()`. Persists to `~/.deepseek/composer_history.txt`
  (one entry per line, oldest first). Capped at 1000 entries — entries
  older than the cap are pruned at append time so the file never grows
  unboundedly.
- `App::new` now seeds `input_history` from the persisted file at
  startup, so Up-arrow at first launch shows yesterday's prompts.
- `App::submit_message` mirrors each non-slash submission to the
  persisted history. Slash commands and empty/whitespace submissions
  are skipped — those don't help recall and would pollute the stream.
- Consecutive-duplicate dedup so re-submitting the same prompt doesn't
  bloat the file.
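
The filtering, dedup, and cap rules listed above can be sketched on an in-memory list. `should_persist` and `push_entry` are hypothetical names; the cap of 1000 and the skip rules come from the commit, and file I/O is left out.

```rust
pub const MAX_ENTRIES: usize = 1000;

/// Slash commands and empty or whitespace-only submissions are skipped.
pub fn should_persist(submission: &str) -> bool {
    let trimmed = submission.trim();
    !trimmed.is_empty() && !trimmed.starts_with('/')
}

/// Appends `submission` to `history` (oldest first). Returns whether the
/// entry was stored.
pub fn push_entry(history: &mut Vec<String>, submission: &str, cap: usize) -> bool {
    if !should_persist(submission) {
        return false;
    }
    // Consecutive-duplicate dedup: compare against the newest entry only.
    if history.last().map(String::as_str) == Some(submission) {
        return false;
    }
    history.push(submission.to_owned());
    // Prune the oldest entries once the cap is exceeded.
    if history.len() > cap {
        let excess = history.len() - cap;
        history.drain(..excess);
    }
    true
}
```

In real use the cap argument would be `MAX_ENTRIES`; taking it as a parameter keeps the pruning rule easy to exercise with a small value.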

The persisted history is global (not per-workspace) — matches the
arrow-up recall pattern users expect from shells and Claude Code. Per-
workspace scoping is a follow-up if multi-project users find it noisy.

Tests: 6 unit tests cover round-trip, slash-skip, empty-skip,
consecutive-duplicate dedup, cap-pruning, and missing-file safety. The
test module uses an internal Mutex to serialize HOME env mutations so
tests can still run in parallel without stomping each other.

Closes #366.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:32:45 -05:00
Hunter Bown 40f7037d8e docs(prompts): tell the model to skip markdown tables in terminal output (#372)
Markdown tables don't render correctly in a terminal — monospace fonts
plus variable-width content (especially CJK characters) can't reliably
align column borders. Adds an "Output formatting" section to both
base.md and base.txt instructing the model to prefer plain prose,
bulleted/numbered lists, code blocks, or `- **Label**: value` pairs
over tables. If column-aligned data is genuinely necessary, the
guidance asks for narrow, ASCII-only, 2–3 column tables.

Closes #372.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:27:54 -05:00
Hunter Bown 1986a15dd5 feat(persistence): schema migration framework — forward upgrade path for ~/.deepseek/ records (#350)
Every persistence layer in crates/tui/src/ already gates `schema_version
> CURRENT_*` to reject newer-than-supported records (good — prevents
silent truncation when an older binary tries to load a v3 file with v4
fields). What was missing: the **forward upgrade path** for older
records. When we bump CURRENT_SESSION_SCHEMA_VERSION from 3 to 4 to
add a field, every v3 session on disk would silently load with the
new field's serde default — which is OK for additions but breaks
catastrophically for renames or shape changes.

This commit lays down the framework:

**`crates/tui/src/schema_migration.rs`** — new module:

- `SchemaMigration` trait. Each persistence domain implements it once
  with `CURRENT_VERSION`, `DOMAIN`, and an ordered `MIGRATIONS` list
  of `fn(&mut serde_json::Value) -> Result<(), MigrationError>` steps.
  Index `i` migrates from version `i+1` to `i+2`.

- `SchemaMigration::migrate(value, from_version)` — runs every required
  step, stamping `value["schema_version"]` after each step so a partial
  failure leaves a known-state record rather than mixed.

- `MigrationError` — typed error with from/to versions + reason.

- `backup_before_migrate(path, domain)` — creates a `.bak` copy of the
  source file before mutation. Errors are warn-logged and ignored
  (continues because `write_atomic` is itself crash-safe). The `.bak`
  is left on disk as a manual recovery artifact — no automatic GC.
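
A minimal std-only sketch of the stepping rule described above, not the module's actual code. It assumes a flat string map as a stand-in for `serde_json::Value`; the free function `migrate` and the two steps are hypothetical names used for illustration.

```rust
use std::collections::BTreeMap;

// Stand-in for `serde_json::Value` so the sketch needs only std (assumption).
type Record = BTreeMap<String, String>;
type Step = fn(&mut Record) -> Result<(), String>;

#[derive(Debug, PartialEq)]
struct MigrationError {
    from: u32,
    to: u32,
    reason: String,
}

// Runs every step needed to bring `record` from `from` up to `current`.
// steps[i] migrates version i+1 to i+2, so version v uses steps[v - 1].
fn migrate(record: &mut Record, from: u32, current: u32, steps: &[Step]) -> Result<(), MigrationError> {
    if from == 0 || from > current {
        return Err(MigrationError { from, to: current, reason: "unsupported version".into() });
    }
    for version in from..current {
        let fail = |reason: String| MigrationError { from: version, to: version + 1, reason };
        let step = steps.get((version - 1) as usize).ok_or_else(|| fail("missing step".into()))?;
        step(record).map_err(fail)?;
        // Stamp after each step so a later failure leaves a known version on the record.
        record.insert("schema_version".into(), (version + 1).to_string());
    }
    Ok(())
}

// Hypothetical steps, for illustration only.
fn rename_title_to_name(r: &mut Record) -> Result<(), String> {
    let title = r.remove("title").ok_or("missing title")?;
    r.insert("name".into(), title);
    Ok(())
}

fn always_fails(_: &mut Record) -> Result<(), String> {
    Err("shape change not possible".into())
}
```

If the second of two steps fails, the record keeps the stamp from the first step, which is the "known-state" property the commit calls out.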

**`schema_migration::registry`** — submodule that registers every
existing persistence domain (session, offline_queue, runtime, task,
automation, automation_run) at its current version with an empty
MIGRATIONS list. No domain has shipped a schema bump yet, so today's
behavior is a no-op. The next bump is now a 4-step recipe:

1. Write the `migrate_<domain>_v<N>_to_v<N+1>` step in this module.
2. Append it to `MIGRATIONS` and bump `CURRENT_VERSION`.
3. Wire `<Domain>Migration::migrate(...)` into the load function in
   the owning module.
4. Add a fixture-based integration test.

Tests: 6 unit tests covering no-op, all-steps, partial migration,
newer-than-current rejection, backup creation, and backup-failure
robustness.

Wiring into individual load sites (session_manager, runtime_threads,
task_manager, automation_manager) is intentionally deferred until the
first actual schema bump needs it — wiring without migrations would
add code paths nothing exercises, and the framework is the part that
needs to land before the next bump can ship safely.

Closes #350.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:23:33 -05:00
Hunter Bown 4a282e767b chore(deps): remove 8 unused dependencies flagged by cargo-machete (#341)
cargo-machete found 8 direct dependencies that are declared but never
used in the source tree. Removing them tightens the dependency graph
and shrinks Cargo.lock by 40 lines (transitive crate removals where
nothing else pulled them in).

Removed:
- deepseek-core: tokio (the core scaffold doesn't drive any tasks itself)
- deepseek-config: serde_json (TOML-only crate; no JSON serialization)
- deepseek-mcp: deepseek-protocol (proxy boundary doesn't consume protocol types)
- deepseek-app-server: tracing (no tracing! macros in the transport layer)
- deepseek-tui: bytes, csv, deepseek-tui-cli, tokio-stream
  - bytes: no Bytes-typed I/O paths in the TUI
  - csv: agent_swarm/spawn_agents_on_csv removed in #336/#357
  - deepseek-tui-cli: TUI is the runtime, not the dispatcher; no facade calls
  - tokio-stream: futures-util::StreamExt is sufficient for our SSE / mpsc paths

Verified by grep across each crate's `src/` — no `use` of the dep, no
fully-qualified path references. cargo build, cargo clippy -D warnings,
and cargo test continue to pass with the slimmed graph.

Closes #341.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:18:31 -05:00
Hunter Bown f0e1a6c63a docs(prompts): tighten /rlm guidance — specialty tool, not first-class (#358)
The previous rlm prompt guidance ("Treat rlm as a normal reasoning
tool, not a last-resort escape hatch") encouraged the model to reach
for rlm in cases where a direct read_file or focused agent_spawn would
do better. The "RLM Is First-Class" framing was too encouraging given
that rlm is genuinely a specialty tool: it pays off ONLY when the input
can't fit in the model's context window.

Three audit items from #358 addressed:

1. **Reaching for rlm too often.** Reframed as "specialty tool" with
   explicit do-not-use-when guidance front-loaded. The decomposition
   workflow now says "ONLY when an input genuinely doesn't fit" with
   a concrete size threshold (~50K tokens / a whole file / a long
   transcript / a multi-document corpus).

2. **Tool description encourages overuse.** The rlm tool's description()
   now leads with "DO NOT use this tool when..." (input fits, grep
   suffices, short classification, interactive exploration), and only
   then describes the legitimate use cases. Adds explicit cost/speed
   caveat.

3. **Helpers documented as if they were tools.** Both the rlm tool
   description and base.md/base.txt now state plainly: `llm_query`,
   `llm_query_batched`, `rlm_query`, `rlm_query_batched` live INSIDE
   the Python REPL. They are functions the sub-agent uses, NOT
   separately-callable tools the model invokes.

Closes #358.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:16:31 -05:00
Hunter Bown bc13dbfee7 fix(tui): show live submit disposition in composer hint (#345)
When the user has typed something into the composer and hits Enter, the
message goes to one of four fates depending on engine state:

- Immediate (idle + online) — most common, sends right away
- Steer (busy + tool execution) — forwards mid-turn
- QueueFollowUp (busy + streaming text) — parks for after TurnComplete
- Queue (offline) — parks on offline queue

Previously the user had no way to tell which would fire BEFORE pressing
Enter. The disposition flips with fast-changing internal state (whether
the model is currently streaming text vs. running a tool, whether
network connectivity has just dropped) and only the post-submit status
toast hinted at the result — which is too late if you wanted a different
behaviour.

Fix: extend the composer's bottom hint line so when the composer has
non-empty content, it shows what Enter will do RIGHT NOW. The hint flips
live with engine state, so the user sees the real behaviour before
pressing Enter:

  ↵ steer into current turn        (sky blue, busy + tool execution)
  ↵ queue for next turn            (muted, busy + streaming)
  ↵ offline queue (no engine)      (warning yellow, offline)

The Immediate case stays unhinted — that's the default and surfacing it
would be noise.
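
The mapping from engine state to hint can be sketched roughly as follows. The `Engine` enum and both function names are hypothetical, and the sketch assumes that being offline takes priority over being busy; only the four dispositions and the hint strings come from the description above.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Engine {
    Idle,
    RunningTool,
    StreamingText,
}

#[derive(Debug, PartialEq)]
enum Disposition {
    Immediate,
    Steer,
    QueueFollowUp,
    Queue,
}

// Decide what Enter would do right now.
fn disposition(online: bool, engine: Engine) -> Disposition {
    match (online, engine) {
        (false, _) => Disposition::Queue, // assumption: offline wins over busy
        (true, Engine::Idle) => Disposition::Immediate,
        (true, Engine::RunningTool) => Disposition::Steer,
        (true, Engine::StreamingText) => Disposition::QueueFollowUp,
    }
}

// Hint text for the composer's bottom line; `None` means no hint is shown.
fn composer_hint(composer_text: &str, online: bool, engine: Engine) -> Option<&'static str> {
    if composer_text.is_empty() {
        return None;
    }
    match disposition(online, engine) {
        Disposition::Immediate => None, // the default case stays unhinted
        Disposition::Steer => Some("↵ steer into current turn"),
        Disposition::QueueFollowUp => Some("↵ queue for next turn"),
        Disposition::Queue => Some("↵ offline queue (no engine)"),
    }
}
```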

Closes #345.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:13:45 -05:00
Hunter Bown c88b980d52 perf(session): metadata-only path for list_sessions — string-extract metadata block (#337)
`SessionManager::list_sessions` previously called
`serde_json::from_reader` to extract just the `metadata` field, which
forced serde to scan every JSON token in the file just to validate
structure — including the entire `messages` and `tool_log` arrays we
were about to discard. For a user with hundreds of long sessions, a
single startup `list_sessions()` was reading and parsing tens of MB of
JSON.

Optimization: read at most 64 KB up front and string-extract the
top-level `metadata` object with a brace-balanced, string-aware scanner.
Real metadata blocks are < 1 KB and always appear before the large
`messages` payload, so the prefix read covers every realistic case.
Falls back to a full-file read only if the metadata block isn't
extractable from the prefix (legacy or oddly-formatted file).

Net: typical session metadata load goes from O(file size) to O(1 KB)
regardless of conversation length, and the disk read is bounded.
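
A rough sketch of what a brace-balanced, string-aware scanner can look like, written against std only. The helper names `skip_string` and `match_brace` are hypothetical, and the real function's signature may differ; returning `None` here stands in for "fall back to a full-file read".

```rust
// Index of the closing quote for the string that opens at `open`.
fn skip_string(bytes: &[u8], open: usize) -> Option<usize> {
    let mut i = open + 1;
    while i < bytes.len() {
        match bytes[i] {
            b'\\' => i += 2, // skip the escaped byte
            b'"' => return Some(i),
            _ => i += 1,
        }
    }
    None
}

// Index of the `}` matching the `{` at `open`, ignoring braces inside strings.
fn match_brace(bytes: &[u8], open: usize) -> Option<usize> {
    let (mut depth, mut i) = (0usize, open);
    while i < bytes.len() {
        match bytes[i] {
            b'"' => i = skip_string(bytes, i)?,
            b'{' => depth += 1,
            b'}' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i);
                }
            }
            _ => {}
        }
        i += 1;
    }
    None
}

fn skip_ws(bytes: &[u8], mut j: usize) -> usize {
    while bytes.get(j).is_some_and(|b| b.is_ascii_whitespace()) {
        j += 1;
    }
    j
}

// Text of the top-level `"metadata"` object, if all of it is inside `prefix`.
fn extract_top_level_metadata(prefix: &str) -> Option<&str> {
    let bytes = prefix.as_bytes();
    let (mut depth, mut i) = (0usize, 0usize);
    while i < bytes.len() {
        match bytes[i] {
            b'"' => {
                let close = skip_string(bytes, i)?;
                // Only a key directly inside the root object counts.
                let is_key = depth == 1 && bytes[i + 1..close] == b"metadata"[..];
                let colon = skip_ws(bytes, close + 1);
                if is_key && bytes.get(colon) == Some(&b':') {
                    let open = skip_ws(bytes, colon + 1);
                    if bytes.get(open) == Some(&b'{') {
                        let end = match_brace(bytes, open)?;
                        return Some(&prefix[open..=end]);
                    }
                }
                i = close;
            }
            b'{' | b'[' => depth += 1,
            b'}' | b']' => depth = depth.saturating_sub(1),
            _ => {}
        }
        i += 1;
    }
    None
}
```

Tracking depth is what keeps a literal `"metadata"` inside a message from matching, and skipping whole strings is what keeps stray braces in string values from moving the depth counter.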

Tests:
- extract_top_level_metadata_skips_huge_messages_array — verifies the
  scanner correctly extracts metadata from a session whose `messages`
  array contains the literal string `"metadata"` in a user message.
- extract_top_level_metadata_handles_braces_inside_strings — verifies
  brace-in-string handling so `{` / `}` inside JSON string values
  don't throw off the depth counter.

Closes #337.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:11:37 -05:00
Hunter Bown c5627ebb14 fix(config): /logout + new key now uses the new key (#343)
After running /logout and entering a new API key, subsequent requests
could still be sent with the old key because the resolution path checked
the OS keyring before the in-memory override. The keyring still held
the old credential, so it shadowed the freshly-typed one.

Three changes:

1. **`Config::deepseek_api_key()` — explicit override is now path 0.**
   When `self.api_key` is explicitly set (non-empty, non-sentinel), it
   wins over keyring/env/provider-config. This is what the user just
   typed, so it should be authoritative. Existing keyring-based flows
   are unaffected: users who store their key via `auth set` have
   `self.api_key = None`, so path 1 (keyring) still wins for them.

2. **`clear_api_key()` now wipes the keyring + provider-scoped keys.**
   Previously only the legacy root `api_key = ...` line was stripped
   from config.toml. Now every known provider slot in the OS keyring
   (deepseek, nvidia-nim, openrouter, novita, fireworks, sglang) is
   deleted, and every `api_key` line nested in a `[providers.<name>]`
   table is also stripped.

3. **`/logout` clears the in-memory `Config` too.** The dispatcher
   handler in ui.rs::execute_command_input wipes `config.api_key` and
   every `config.providers.*.api_key` so a future clone of the
   long-lived Config doesn't leak the stale value. The companion
   onboarding flow in ui.rs also stamps the new key onto `config`
   itself rather than only on a one-shot clone, so subsequent
   /provider switches see the new credential.
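
The precedence flip in change 1 can be illustrated with a small sketch. The function name `resolve_api_key`, the sentinel's literal value, and the keyring-before-env ordering are assumptions; the commit only states that a usable explicit override now wins over the other sources.

```rust
// Placeholder value is an assumption; only the constant's name appears in the commit.
const KEYRING_SENTINEL: &str = "__KEYRING__";

// Path 0: explicit in-memory key, unless it is empty or the sentinel.
// Path 1: keyring. Then env (ordering after the keyring is assumed).
fn resolve_api_key(explicit: Option<&str>, keyring: Option<&str>, env: Option<&str>) -> Option<String> {
    let usable = |k: &&str| !k.trim().is_empty() && *k != KEYRING_SENTINEL;
    explicit
        .filter(usable)
        .or(keyring)
        .or(env)
        .map(str::to_string)
}
```

With this ordering a freshly typed key shadows a stale keyring entry, while users whose in-memory key is unset still resolve through the keyring.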

Test coverage:
- `clear_api_key_strips_root_and_provider_scoped_keys` — verifies all
  three credential locations get wiped from a fixture config.toml.
- `deepseek_api_key_prefers_explicit_in_memory_override` — guards the
  precedence flip.
- `deepseek_api_key_ignores_sentinel_placeholder` — confirms the
  legacy `KEYRING_SENTINEL` placeholder still falls through.

Closes #343.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:08:52 -05:00
Hunter Bown 899c703d81 fix(tui): convert remaining tokio::spawn sites to spawn_supervised + restore terminal on panic (#346)
Completes the panic-safety work #346 started in a8be33b3. Converts every
trivial production tokio::spawn site to spawn_supervised so a panicking
task writes a crash dump to ~/.deepseek/crashes/ and the parent process
stays alive.

Sites converted:
- tools/rlm.rs:190 — RLM progress drain
- tools/subagent/mod.rs:888 — run_subagent_task spawn
- tools/subagent/mod.rs:988 — run_subagent_task resume
- core/engine.rs:744 — sub-agent mailbox drainer
- core/engine.rs:1601 — engine event-loop spawn
- lsp/client.rs:127 — LSP writer
- lsp/client.rs:129 — LSP reader
- lsp/client.rs:135 — LSP dispatcher
- rlm/bridge.rs:188 — bridge progress drain
- task_manager.rs:790 — task worker loop
- automation_manager.rs:822 — automation scheduler

Sites left as-is (already panic-safe with their own catch_unwind):
- runtime_threads.rs:1242, 1462 — custom AssertUnwindSafe + catch_unwind
- mcp.rs:322 — MCP SSE loop with custom catch_unwind

Sites that don't need conversion:
- runtime_api.rs:287 — axum::serve runs in the parent task, not spawned
- runtime_api.rs:1583+ — test-helper spawn_test_server inside #[cfg(test)]
- All other spawn calls are in #[cfg(test)] modules where panics are
  expected to propagate.

Also:
- main.rs panic hook now restores the terminal (LeaveAlternateScreen +
  disable_raw_mode) before invoking the original hook, so a panicked TUI
  doesn't leave the user's shell stuck in alt-screen mode.
- Adds spawn_supervised_tests::panicking_task_writes_crash_dump_and_does_not_kill_parent
  that proves a panicking task produces a dated crash log under
  ~/.deepseek/crashes/<task>.log and the parent task completes Ok.

Closes #346.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 10:02:28 -05:00
Hunter Bown 0c55c732a2 chore(tools): full SwarmOutcome cascade — delete swarm.rs + event variant + UI handlers (#357)
Completes the v0.8.5 cleanup #336 started: with the model-callable swarm
surface gone, the supporting event/UI/state plumbing has no consumers.

- Delete crates/tui/src/tools/swarm.rs (2215 lines, parked under
  #![allow(dead_code)] since #336)
- Drop pub mod swarm from tools/mod.rs
- Remove Event::SwarmProgress variant + handler in tui/ui.rs
- Remove app.rs swarm fields: pending_swarm_task_count, swarm_jobs,
  last_swarm_id, swarm_card_index (and SwarmOutcome import + retain)
- Remove subagent_routing.rs swarm helpers: seed_fanout_card_from_tool_call,
  sync_fanout_card_from_tool_result, sync_fanout_card_from_swarm_outcome,
  worker_slot_from_swarm_task, status_to_lifecycle, swarm_task_status_to_lifecycle
- Simplify active_fanout_counts to read directly from the active FanoutCard
- Simplify handle_subagent_mailbox is_fanout to only "rlm" dispatches
- Strip dead "agent_swarm" / "spawn_agents_on_csv" string match arms in
  ui.rs (tool dispatch, task panel refresh, ListSubAgents trigger,
  active-cell skip), tool_card.rs (ToolFamily::Fanout), and tool_routing.rs
  (extract_fanout_prompts function deleted entirely)
- Trim WorkerSlot to id/agent_id/status (label/model/nickname were only
  populated by worker_slot_from_swarm_task); remove unused with_agent ctor
- Remove unused SubAgentManager::max_agents and ::available_slots methods
  (only swarm.rs called them)
- Update widgets/agent_card.rs doc comments to point at rlm + future
  multi-child dispatch instead of agent_swarm

FanoutCard decision: kept. It remains the visual primitive for rlm and
for any future multi-child dispatch the parent agent makes via repeated
agent_spawn calls.

Net: 2698 lines removed, 90 added.

Closes #357.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-02 09:56:33 -05:00
Hunter Bown 42eea19066 fix(tui): route bracketed paste to provider picker key entry instead of composer (#342)
Add handle_paste(text) -> ViewAction method to the ModalView trait with a
default no-op. ProviderPickerView overrides it in KeyEntry stage to
sanitize and append pasted text to api_key_input (rejecting whitespace
in the same way as the Char handler).

Wire into the Event::Paste handler in ui.rs: before falling through to
app.insert_paste_text(), check view_stack.handle_paste(). If the top
modal consumes the paste, skip the composer entirely. If a modal is
open but does NOT consume the paste, also skip the composer — any
modal that receives paste while focused should handle it, not leak
into the chat input.
2026-05-02 02:33:26 -05:00
Hunter Bown 47bb91a9b7 fix(commands): wire /config <key> <value> to setter — args no longer silently ignored (#338)
Add config_command(app, arg) that dispatches three paths:
  /config (no args)       -> opens interactive editor (existing behavior)
  /config <key>           -> shows current value of a single setting
  /config <key> <value>   -> sets value via existing set_config_value

Keys like model, approval_mode, locale, auto_compact, calm_mode,
show_thinking, mode, max_history, sidebar_width, sidebar_focus,
composer_density, composer_border, transcript_spacing are all read
live from App state for the /config <key> display path.

Unknown keys show a helpful error referencing /help config.
2026-05-02 02:28:25 -05:00
Hunter Bown e5f56dee82 feat(config): add ApiProvider::DeepseekCN variant for China endpoint (#361)
Add DeepseekCN as a first-class provider variant with:
- Enum variant + parse/as_str/display_name/all methods
- DEFAULT_DEEPSEEKCN_BASE_URL (https://api.deepseeki.com)
- Auto-detection when base_url contains api.deepseeki.com
- Locale-based auto-suggest: if no provider is configured and
  system locale (LC_ALL/LC_MESSAGES/LANG) starts with 'zh-*',
  the TUI defaults to DeepseekCN at startup
- ProvidersConfig.deepseek_cn for provider-scoped credentials
- All match arms updated across config.rs, client.rs,
  provider_picker.rs, main.rs, and ui.rs
- provider_picker tests updated for the 7th provider entry
2026-05-02 02:10:57 -05:00
Hunter Bown a8be33b35b fix(tui): panic safety foundations — spawn_supervised wrapper + process panic hook (#346)
Add spawn_supervised(name, location, future) to utils.rs that wraps
futures in AssertUnwindSafe + catch_unwind, logs panics via tracing::error!,
and writes crash dumps to ~/.deepseek/crashes/.
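
For readers unfamiliar with the pattern, here is a synchronous, std-only analogue of the `AssertUnwindSafe` + `catch_unwind` wrapping. It is a sketch, not the `spawn_supervised` implementation: `run_supervised` is a hypothetical name, it runs a closure rather than a tokio future, and it returns the crash text instead of writing a dump file.

```rust
use std::panic::{catch_unwind, AssertUnwindSafe};

// Runs `task`, converting a panic into an error string instead of unwinding
// into the caller.
fn run_supervised<T>(name: &str, task: impl FnOnce() -> T) -> Result<T, String> {
    catch_unwind(AssertUnwindSafe(task)).map_err(|payload| {
        // Panic payloads are usually `&'static str` or `String`.
        let reason = payload
            .downcast_ref::<&str>()
            .map(|s| s.to_string())
            .or_else(|| payload.downcast_ref::<String>().cloned())
            .unwrap_or_else(|| "non-string panic payload".to_string());
        format!("task `{}` panicked: {}", name, reason)
    })
}
```

The caller keeps running after a panicking task; in the real wrapper the error branch is where the `tracing::error!` call and the crash-dump write would go.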

Add process-level panic hook to main.rs that writes crash dumps before
the default hook fires.

Convert persistence_actor::spawn_persistence_actor as the first
spawn_supervised caller to prove the wiring. Remaining 34 tokio::spawn
sites marked as follow-up for a focused PR.

Also fix save_mcp_config in main.rs to use write_atomic (missed in #355).
2026-05-02 01:53:50 -05:00
Hunter Bown 5bd63c779a fix(tui): atomic file writes for ~/.deepseek/ persistence (#355)
Add write_atomic helper (NamedTempFile + fsync + rename) in utils.rs.
Convert all non-append fs::write sites:
  - session_manager.rs: save_session/save_checkpoint/save_offline_queue_state
  - workspace_trust.rs: write_trust_file_at
  - task_manager.rs: write_json_atomic → delegates to write_atomic
  - runtime_threads.rs: write_json_atomic → delegates to write_atomic
  - mcp.rs: save_config/init_config/save_legacy
  - audit.rs: buffered append with flush_and_sync after each event
  - runtime_threads append_event: add sync_all after flush
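
The temp-file + fsync + rename sequence can be sketched with std alone. This is an approximation under stated assumptions: the real helper uses `NamedTempFile`, whereas this version builds a sibling temp name from the process id and does not clean up the temp file if a step fails.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

// Write `contents` to a sibling temp file, flush it to disk, then rename it
// over `path` so readers see either the old file or the new one.
fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    // Hypothetical naming scheme; a real helper would use a unique temp name.
    tmp_name.push(format!(".tmp.{}", std::process::id()));
    let tmp_path = path.with_file_name(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    tmp.sync_all()?; // fsync before the rename
    drop(tmp);
    fs::rename(&tmp_path, path)
}
```

Keeping the temp file in the same directory as the target matters because a rename is only atomic within one filesystem.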
2026-05-02 01:50:21 -05:00
Hunter Bown aa23182674 chore(tools): remove /swarm command + agent_swarm/spawn_agents_on_csv tool surface; park swarm.rs pending #357 cascade (#336)
Surface removed: /swarm slash command, agent_swarm, spawn_agents_on_csv, swarm_status, swarm_result, swarm_cancel tools, report_agent_job_result. Prompts/docs/tests updated. swarm.rs parked with #![allow(dead_code)] pending the full cascade in #357. RLM prompt audit tracked in #358.
2026-05-02 01:30:23 -05:00
Hunter Bown 0ca0570a86 fix(tui): preserve composer draft when navigating input history (#283) 2026-05-02 00:52:08 -05:00
Hunter Bown f1f601c28b perf(tui): persistence actor for session save/checkpoint I/O
Replaces synchronous disk writes on the UI thread with a dedicated
persistence actor task. The UI now try_sends a PersistRequest and
returns immediately — keyboard input is never gated on write completion.

Changes:
- New persistence_actor module with bounded-coalescing actor
- Actor spawns at TUI startup; global singleton so no App struct change
- All persist_checkpoint/persist_session_snapshot/clear_checkpoint calls
  replaced with persistence_actor::persist(PersistRequest::...)
- Dropped redundant TurnStarted persist (nothing changed between
  SendMessage's checkpoint and TurnStarted)
- Fixed collapsible_if clippy lint

This is the P0 fix for the post-send terminal freeze caused by
serialising 500KB+ sessions to disk on the UI thread.
2026-05-02 00:13:45 -05:00
Hunter Bown 87f42656a7 feat(swarm): add /swarm command with sequential|mixture|distill|deliberate modes (Phase A foundation, #303) 2026-05-01 23:48:24 -05:00
Hunter Bown 359c27437b feat(i18n): Phase 1c-extra — keybinding descriptions, /home, /settings, /help labels
Closes the gate the maintainer set for v0.8.5: every / command, /help,
and /settings should look perfect in both English and Chinese before
multi-agent work begins. v0.8.4 shipped Phase 1a/b/c (88 MessageIds)
but four mixed-language gaps remained:

1. **Keybinding descriptions** (41 entries) — the help overlay showed
   translated section labels (Phase 1c) over English description text.
   `KeybindingEntry` now carries `description_id: MessageId` instead
   of a raw `&'static str`; all 41 descriptions translated to
   en/ja/zh-Hans/pt-BR.

2. **Settings: header** — `Settings::display` now takes a `Locale`
   and resolves the title via `MessageId::SettingsTitle`. The
   field-name keys (auto_compact, calm_mode, etc.) intentionally stay
   English — they are the literal TOML keys users edit.

3. **/home dashboard** — entirely English before. ~25 lines of section
   headers, mode tips, and quick-action hints translated. Path
   interpolations route through `display_path` (privacy invariant).

4. **/help <topic>** text command — the inline labels `Usage:` and
   `Aliases:` plus the `Unknown command:` fallback all use tr().

Also adds three buffer-render tests confirming the help overlay /
settings / home dashboard render in zh-Hans without missing markers
or English bleed-through.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 23:38:27 -05:00
Hunter Bown 3d3ff0c5cf Release v0.8.4: Phase 1 i18n + cache-prefix stability
* fix(pricing): extend V4 Pro 75% discount expiry to 2026-05-31 15:59 UTC

DeepSeek extended the promotional discount past the original 2026-05-05
cutoff. Without this update the TUI would have started showing 4× the
actual billed cost on May 6.

Source: https://api-docs.deepseek.com/quick_start/pricing — "extended
until 2026/05/31 15:59 UTC".

Adds a regression test pinning the new active window so a future revert
to the May 5 date trips the suite immediately.

Closes #267

* chore: remove stale TODO(integrate) markers from already-integrated modules

Five `// TODO(integrate)` comments and one matching "Not yet integrated"
note were misleading anyone grepping for integration work. Each module
is in fact wired up:

- execpolicy/mod.rs       → tools/shell.rs:1322 (load_default_policy)
- sandbox/mod.rs          → tools/shell.rs:28, main.rs:2647, tui/approval.rs:30
- sandbox/policy.rs       → main.rs:2752, tui/approval.rs:30 (SandboxPolicy)
- command_safety.rs       → tools/shell.rs:1321, tools/tasks.rs:13,
                            tools/approval_cache.rs:26
- tui/streaming/mod.rs    → tui/app.rs:38 (StreamingState)

The remaining TODO at mcp.rs:1771 covers a separate "wire legacy sync API
into CLI subcommands or remove" decision and is left in place.

Closes #266

* docs(release): add install + dual-binary template to GitHub Release page

Closes #265.

The Release page used the auto-generated commit-title body. New users
hitting the Release page from Twitter / npm-search had no on-page
guidance that the dispatcher (`deepseek`) and the TUI runtime
(`deepseek-tui`) ship as two binaries that must coexist; in #258 an
external user spent 11 minutes figuring this out, and #272 was the
follow-on confusion.

The new body covers:
- npm wrapper as the recommended install
- `cargo install deepseek-tui-cli deepseek-tui --locked` (both crates)
- Manual download with a per-platform table showing both artifacts
- sha256 verify using the existing `deepseek-artifacts-sha256.txt`
- Changelog link

* feat(debug): add /cache command surfacing per-turn DeepSeek cache hit/miss

Step 1 of #263. Without per-turn telemetry the prefix-cache audit is
unfounded speculation; the rest of the issue's investigation steps
depend on this surface.

The DeepSeek API already returns `prompt_cache_hit_tokens` and
`prompt_cache_miss_tokens` per turn, and we already store the *latest*
on App. This adds a 50-turn ring (`turn_cache_history`) populated at
the same site as `last_prompt_cache_*_tokens`, plus a `/cache [count]`
slash command that renders a fixed-width table of the last N turns
with per-turn ratios and a session aggregate. Default count is 10;
larger values clamp to the ring size.

Edge cases the formatter handles:

- No telemetry yet → friendly "no turns recorded" message
- `cache_hit_tokens = None` (provider didn't report) → row renders all
  em-dashes and is excluded from session aggregates so one missing-
  telemetry turn can't make the average ratio look broken.
- `cache_hit_tokens = Some, cache_miss_tokens = None` → infer miss as
  `input − hit` and mark the cell with `*`. Footer documents the
  asterisk.
- Ring at cap (50) → push evicts oldest.

Tests cover all four paths plus the cap.
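
The ring and the edge cases above might look roughly like this. Type and function names are hypothetical, and the ratio definition `hit / (hit + miss)` is an assumption; the cap of 50, the skip rule for missing telemetry, and the `input − hit` inference come from the description.

```rust
use std::collections::VecDeque;

const RING_CAP: usize = 50;

#[derive(Clone, Copy)]
struct TurnCache {
    input_tokens: u64,
    hit: Option<u64>,
    miss: Option<u64>,
}

// Push a turn, evicting the oldest once the ring is at cap.
fn push_turn(ring: &mut VecDeque<TurnCache>, turn: TurnCache) {
    if ring.len() == RING_CAP {
        ring.pop_front();
    }
    ring.push_back(turn);
}

// (hit, miss, inferred). `None` when the provider reported no hit count.
fn resolve(turn: &TurnCache) -> Option<(u64, u64, bool)> {
    let hit = turn.hit?;
    Some(match turn.miss {
        Some(miss) => (hit, miss, false),
        // Infer miss as input minus hit; the table marks this cell with `*`.
        None => (hit, turn.input_tokens.saturating_sub(hit), true),
    })
}

// Session hit ratio over turns that carry telemetry; other turns are skipped.
fn session_hit_ratio(ring: &VecDeque<TurnCache>) -> Option<f64> {
    let (hit, miss) = ring
        .iter()
        .filter_map(resolve)
        .fold((0u64, 0u64), |(h, m), (th, tm, _)| (h + th, m + tm));
    if hit + miss == 0 {
        None
    } else {
        Some(hit as f64 / (hit + miss) as f64)
    }
}
```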

* test(prompts): add cache-prefix stability harness for #263 step 2

The DeepSeek prefix-cache only hits while the byte prefix of each
request matches the prior call. Anything in the cached prefix that
varies turn-to-turn for unchanged inputs is a cache buster.

Adds a focused harness next to the production surface so the property
is regression-guarded:

1. `first_divergence(a, b)` helper that returns the first divergent
   byte position with a `±32 byte` window of context, used by the
   custom assertion `assert_byte_identical`. Future suspect tests can
   reuse this to surface "where" rather than just "fail".

2. `compose_prompt_is_byte_stable_across_calls` — sweeps every
   (mode, personality) pair and pins that two consecutive calls
   produce identical bytes. Rules out suspect #4 (mode-prompt churn).

3. `system_prompt_for_mode_with_context_is_byte_stable_for_unchanged_workspace`
   — the call site `engine.rs::build_tool_context` actually invokes,
   pinned for an empty workspace across all three modes.

4. `system_prompt_with_working_set_summary_is_byte_stable_for_constant_summary`
   — pins that the surrounding prompt construction faithfully embeds
   the working_set summary it's given without injecting extra
   non-determinism. (The actual working_set summary stability lives
   in `working_set.rs` and is the next investigation target — see
   issue note in PR description.)
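
A minimal sketch of a divergence helper in the spirit of item 1. The signatures are assumptions (the real helper may return a richer type), and `context_window` is a hypothetical companion that cuts out the bytes around the reported position.

```rust
// Index of the first byte where `a` and `b` differ, or `None` if identical.
// A length mismatch diverges at the end of the shorter input.
fn first_divergence(a: &[u8], b: &[u8]) -> Option<usize> {
    let shared = a.iter().zip(b).take_while(|(x, y)| x == y).count();
    if shared == a.len() && shared == b.len() {
        None
    } else {
        Some(shared)
    }
}

// Up to `radius` bytes on each side of `at`, lossily decoded for display.
fn context_window(bytes: &[u8], at: usize, radius: usize) -> String {
    let start = at.saturating_sub(radius);
    let end = (at + radius).min(bytes.len());
    String::from_utf8_lossy(&bytes[start..end]).into_owned()
}
```

An assertion built on these two can print the position and the surrounding bytes from both prompts, which is what lets a failure say "where" rather than just "fail".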

Foundation for the suspect-by-suspect bisection in the rest of #263.

* fix(secrets): never overwrite the secrets file when load_unlocked errors

`FileKeyringStore::set` and `delete` did
`self.load_unlocked().unwrap_or_default()`, which wiped every existing
secret if the read failed for any reason other than \"file is missing\":

- file mode != 0600 (`InsecurePermissions`) — easy on headless / CI
  environments where a permissive umask got applied
- corrupt JSON
- transient I/O error

In all of those, the next `store_unlocked` overwrote the file with an
empty-or-single-entry blob and reset perms to 0600, silently losing
every other provider's key.

Switch both call sites to `?`. `load_unlocked` already returns
`Ok(default)` for a missing file, so the first-write-creates-the-file
ergonomic is preserved (covered by the new
`file_store_set_still_creates_file_when_missing` test).

Adds four regression tests:

- set: insecure perms surface InsecurePermissions and leave the file
  byte-identical.
- delete: same.
- set: corrupt JSON surfaces the parse error and leaves the file
  byte-identical.
- set: missing file path still works (idempotence guard).

Closes #281

* fix(cache): make tool catalog byte-stable across calls and sessions

DeepSeek's KV prefix cache hits on the longest matching byte prefix of
the request. Two places in the tool-array path were silently introducing
divergence:

1. `ToolRegistry::to_api_tools()` iterated `self.tools.values()` directly.
   Rust's default `HashMap` is seeded with `RandomState` per process, so
   every `deepseek` launch produced a different tool order — the cross-
   session resume case (the one with the biggest cache wins) never hit.

2. `active_tool_list_from_catalog()` filtered the catalog `Vec` by the
   active set in catalog order. When ToolSearch activated a previously-
   deferred tool mid-conversation, the new tool appeared at its catalog
   index, shifting every later tool's byte offset and busting the cached
   prefix from there onwards.

Fixes:

- `to_api_tools()` now sorts by tool name before emitting the API tool
  array. Stable across calls AND across launches.
- `build_model_tool_catalog()` sorts each partition (built-ins first,
  contiguous; MCP tools after, also alphabetical). Mirrors Claude Code's
  `assembleToolPool` strategy where they explicitly call out cache
  stability as the reason: "a flat sort would interleave MCP tools into
  built-ins and invalidate all downstream cache keys whenever an MCP
  tool sorts between existing built-ins."
- `active_tool_list_from_catalog()` puts always-loaded tools in catalog
  order at the head and deferred-but-now-active tools at the tail. A
  deferred-tool activation during ToolSearch no longer shifts earlier
  tools' positions.
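
The ordering rules in the three fixes can be sketched as below. `ToolEntry`, `tool`, and both function names are hypothetical, and the sketch assumes the tail keeps activation order, which the commit does not spell out.

```rust
#[derive(Clone, Debug, PartialEq)]
struct ToolEntry {
    name: String,
    is_mcp: bool,
    deferred: bool,
}

fn tool(name: &str, is_mcp: bool, deferred: bool) -> ToolEntry {
    ToolEntry { name: name.to_string(), is_mcp, deferred }
}

// Built-ins first, MCP tools after, each partition sorted by name.
fn build_catalog(mut tools: Vec<ToolEntry>) -> Vec<ToolEntry> {
    // `false < true`, so comparing on `is_mcp` first keeps built-ins contiguous.
    tools.sort_by(|a, b| (a.is_mcp, &a.name).cmp(&(b.is_mcp, &b.name)));
    tools
}

// Always-loaded tools in catalog order at the head; deferred tools that have
// been activated are appended at the tail (assumed: in activation order).
fn active_tool_names(catalog: &[ToolEntry], activated: &[&str]) -> Vec<String> {
    let mut out: Vec<String> = catalog
        .iter()
        .filter(|t| !t.deferred)
        .map(|t| t.name.clone())
        .collect();
    for name in activated {
        let known = catalog.iter().any(|t| t.deferred && t.name.as_str() == *name);
        if known && !out.iter().any(|n| n.as_str() == *name) {
            out.push(name.to_string());
        }
    }
    out
}
```

Because activations only ever append, the bytes for every tool that was already in the list stay at the same offset.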

Adds three regression tests:

- `to_api_tools_emits_alphabetical_order_regardless_of_registration_order`
- `model_tool_catalog_sorts_each_partition_for_prefix_cache_stability`
- `active_tool_list_pushes_deferred_activations_to_the_tail`

Refs #263. Findings produced by reading reference Claude Code source
side-by-side with our request-building flow; full delta analysis in
the PR description.

* fix(sandbox): elevate Agent-mode shell sandbox to allow network access

The seatbelt-default policy is `WorkspaceWrite { network_access: false }`,
which on macOS emits `(deny default)` with no `(allow network-outbound)` /
`(allow system-socket)`. Every outbound socket call from a sandboxed
shell command — including `getaddrinfo` for DNS — gets denied by the
kernel. Symptom: "DNS resolution failed" for any URL the model tries to
reach via curl, yt-dlp, package managers, etc.

Engine.build_tool_context only elevated the policy in Yolo mode, leaving
Agent mode (the default) stuck on the strict default. That's tighter
than competitors (Claude Code, Codex) without buying any safety the
application-level NetworkPolicy or the approval flow doesn't already
provide.

Switch the elevation to a `match` so:

- Plan       → no elevation (read-only investigation; shell tool not registered)
- Agent      → WorkspaceWrite { network_access: true, … }
- Yolo       → WorkspaceWrite { network_access: true, … } (unchanged)

Adds `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network` so a
future revert to the no-network default trips CI immediately.
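
In outline, the per-mode `match` has the following shape. The types are simplified stand-ins: the real `WorkspaceWrite` variant carries further fields (elided as `…` above) that are omitted here, and `elevate` is a hypothetical function name.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Mode {
    Plan,
    Agent,
    Yolo,
}

// Simplified stand-in: other fields of the real variant are omitted.
#[derive(Debug, PartialEq, Clone, Copy)]
enum SandboxPolicy {
    WorkspaceWrite { network_access: bool },
}

// Pick the shell sandbox policy for a mode, starting from the strict default.
fn elevate(mode: Mode, default: SandboxPolicy) -> SandboxPolicy {
    match mode {
        // Read-only investigation: keep whatever the default is.
        Mode::Plan => default,
        Mode::Agent | Mode::Yolo => SandboxPolicy::WorkspaceWrite { network_access: true },
    }
}
```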

Closes #273

* fix(skills): treat bare github.com/<owner>/<repo> URLs as GitHubRepo

Closes #269.

`/skill install https://github.com/obra/superpowers` failed on every
platform with `invalid gzip header`. Root cause: `InstallSource::parse`
matched any `https://`-prefixed spec as `DirectUrl`, so the installer
downloaded the HTML repo page (200 OK, `text/html`) and tried to
gzip-decode HTML. The user reported it from Win11 + PowerShell but the
parse path is platform-independent.

Recognize bare GitHub repo URLs in `InstallSource::parse`:

- `https://github.com/<owner>/<repo>`
- `https://github.com/<owner>/<repo>/`
- `https://github.com/<owner>/<repo>.git`
- `https://github.com/<owner>/<repo>.git/`
- `https://www.github.com/<owner>/<repo>`
- `http://github.com/<owner>/<repo>` (legacy)

…all route to the existing `GitHubRepo` source, which already produces
`https://github.com/<owner>/<repo>/archive/refs/heads/{main,master}.tar.gz`
candidates with proper fallback. URLs with a third path segment
(`/archive/...`, `/blob/...`, `/tree/...`) keep going through
`DirectUrl` because the user picked that exact path.

Adds two regression tests: one asserting the seven recognised forms
all canonicalize to `github:obra/superpowers`, and one pinning the
sub-resource paths to `DirectUrl`.
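
A simplified sketch of the recognition rule, assuming a two-variant enum and a free `parse` function; the real `InstallSource` likely has more variants and a different canonical form.

```rust
#[derive(Debug, PartialEq)]
enum InstallSource {
    GitHubRepo(String), // "<owner>/<repo>"
    DirectUrl(String),
}

fn parse(spec: &str) -> InstallSource {
    let prefixes = [
        "https://github.com/",
        "https://www.github.com/",
        "http://github.com/",
    ];
    if let Some(rest) = prefixes.iter().find_map(|p| spec.strip_prefix(*p)) {
        // Tolerate one trailing slash and an optional `.git` suffix.
        let rest = rest.strip_suffix('/').unwrap_or(rest);
        let rest = rest.strip_suffix(".git").unwrap_or(rest);
        let mut parts = rest.split('/');
        // Exactly two non-empty segments means a bare repo URL.
        if let (Some(owner), Some(repo), None) = (parts.next(), parts.next(), parts.next()) {
            if !owner.is_empty() && !repo.is_empty() {
                return InstallSource::GitHubRepo(format!("{}/{}", owner, repo));
            }
        }
    }
    // Anything with a third path segment keeps the exact URL the user gave.
    InstallSource::DirectUrl(spec.to_string())
}
```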

* fix(cache): drop volatile fields from working_set summary block (#280) (#287)

The working-set summary lands inside the system prompt before the
historical conversation, so any byte that drifts there cache-misses
everything that follows in DeepSeek's KV prefix cache. Two sources of
turn-over-turn drift are removed:

1. The rendered line is now `- {path} ({kind})`. The previous form
   interpolated `entry.touches` and `self.turn - entry.last_turn`,
   both of which advance on every user message even when no new
   paths are observed.

2. A new `sorted_for_prompt` helper sorts by (touches DESC, path ASC)
   instead of the turn-aware `sorted_entries`. The recency bonus in
   `score_entry` crosses bucket boundaries as turns advance, so even
   without rendering `last seen` the order — and which entries cross
   the `max_prompt_entries` cutoff — drifted. Compaction pinning
   still uses `sorted_entries` because it genuinely wants recency.

Adds a regression test that observes a fixed message set, calls
`summary_block` before and after `next_turn()`, and asserts the two
outputs are byte-identical. The shared `first_divergence` /
`assert_byte_identical` helpers (from #279) move from `prompts::tests`
into `test_support` so working_set tests can reuse them.

Closes #280.

* fix(cache): memoise tool catalog so descriptions stay byte-stable (#289)

`to_api_tools` previously re-sampled `tool.description()` and
`tool.input_schema()` on every call. Native tools return `&'static str`
and a `json!` literal, so the bytes were stable in practice — but the
`McpToolAdapter` returns `self.tool.description.as_deref()`, which can
drift when the upstream MCP server reconnects with a different
description string. Any drift mid-session rewrites the tool catalog
that lands in the cached prefix and busts every byte that follows.

Adds an `api_cache: OnceLock<Vec<Tool>>` field on `ToolRegistry`. The
first `to_api_tools` call materialises the catalog; subsequent calls
return a clone of the cached vector. Mutations (`register`, `remove`,
`clear`) reset the field so the next read rebuilds. Mirrors
reference-cc's `getToolSchemaCache` (`utils/api.ts:119–208`).
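
The memoise-and-invalidate pattern can be sketched roughly as follows. This is an illustration under assumptions, not the real registry: the `ToolSpec` trait, the two-field `Tool` record, and the `ScriptedTool` test double are hypothetical, while `api_cache`, `to_api_tools`, `register`, `remove`, and `clear` are the names used in the commit message.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical API-facing record; the real `Tool` carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Tool {
    pub name: String,
    pub description: String,
}

/// Hypothetical trait standing in for the registry's tool interface.
pub trait ToolSpec {
    fn name(&self) -> String;
    fn description(&self) -> String;
}

#[derive(Default)]
pub struct ToolRegistry {
    tools: Vec<Box<dyn ToolSpec>>,
    api_cache: OnceLock<Vec<Tool>>,
}

impl ToolRegistry {
    pub fn register(&mut self, tool: Box<dyn ToolSpec>) {
        self.tools.push(tool);
        self.api_cache = OnceLock::new(); // invalidate: next read rebuilds
    }

    pub fn remove(&mut self, name: &str) {
        self.tools.retain(|t| t.name() != name);
        self.api_cache = OnceLock::new();
    }

    pub fn clear(&mut self) {
        self.tools.clear();
        self.api_cache = OnceLock::new();
    }

    /// The first call samples every description; later calls clone the cache.
    pub fn to_api_tools(&self) -> Vec<Tool> {
        self.api_cache
            .get_or_init(|| {
                self.tools
                    .iter()
                    .map(|t| Tool { name: t.name(), description: t.description() })
                    .collect()
            })
            .clone()
    }
}

/// Test double whose description changes on every call, loosely imitating
/// an MCP server that reconnects with different text.
pub struct ScriptedTool {
    calls: AtomicUsize,
}

impl ScriptedTool {
    pub fn new() -> Self {
        Self { calls: AtomicUsize::new(0) }
    }
}

impl ToolSpec for ScriptedTool {
    fn name(&self) -> String {
        "scripted".to_string()
    }

    fn description(&self) -> String {
        let n = self.calls.fetch_add(1, Ordering::SeqCst);
        format!("description v{n}")
    }
}
```

Replacing the `OnceLock` wholesale in the `&mut self` mutators keeps reads lock-free and needs no interior mutability beyond the cell itself; whether the real code resets the field this way or via `take()` is not stated in the message.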

Tests:
- `to_api_tools_pins_description_bytes_across_calls` registers a tool
  whose `description()` advances through a script of pre-built strings
  on each call. After the cache is populated, the second `to_api_tools`
  read returns the original description because `description()` is no
  longer invoked. Without the cache the second read would return the
  next script entry.
- `register_invalidates_api_tools_cache` registers a tool, snapshots,
  registers another, snapshots again, and asserts the second snapshot
  reflects both tools (cache rebuilt) and that the varying tool's
  description advanced (proving the rebuild actually re-sampled).
- `remove_and_clear_invalidate_api_tools_cache` covers the other two
  invalidation paths.

* fix(cache): sort project_tree and summarize_project output (#290)

Both helpers walked the workspace via `ignore::WalkBuilder::build()`
and emitted entries in the OS readdir order — non-deterministic across
filesystems (htree-hash on ext4, insertion-order on APFS, etc.). Their
output lands in the fallback branch of the system prompt's project
context (when the workspace has no AGENTS.md / CLAUDE.md) and inside
the `project_map` tool surface, both of which feed the cached prefix.

`summarize_project` now sorts the collected key-files list before the
type-detection logic and the fallback `Project with key files: …` join.

`project_tree` collects `(rel_path, is_dir)` tuples, sorts by full
path, and only then formats the indented tree. Sorting by full path
preserves the visual tree shape (`"src" < "src/lib.rs"` because a
string compares less than any longer string it prefixes) while making
siblings deterministic.
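
The collect, sort, then format approach might look like the sketch below. `format_tree` is a hypothetical helper name, `/`-separated relative paths are assumed, and the indentation style is invented for the example; the real `project_tree` output format is not shown in the commit message.

```rust
/// Sorts `(rel_path, is_dir)` tuples by full path, then formats an
/// indented tree. Paths are assumed to be relative and `/`-separated.
pub fn format_tree(mut entries: Vec<(String, bool)>) -> String {
    // Byte-wise string order: a parent is a prefix of its children,
    // so it always sorts before them.
    entries.sort_by(|a, b| a.0.cmp(&b.0));

    let mut out = String::new();
    for (path, is_dir) in &entries {
        let depth = path.matches('/').count();
        let name = path.rsplit('/').next().unwrap_or(path.as_str());
        out.push_str(&"  ".repeat(depth));
        out.push_str(name);
        if *is_dir {
            out.push('/');
        }
        out.push('\n');
    }
    out
}
```

One caveat of plain string ordering: a sibling such as `src-old` sorts between `src` and `src/lib.rs`, because `-` is less than `/` in byte order. The commit message does not say whether the real helper accounts for this.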

Tests cover sibling order, parent-before-children invariant, byte
stability across two consecutive calls, and the fallback `Project
with key files:` branch (the only branch where the joined order
escapes into output without further sorting downstream).

* fix(client): unique fallback id for parallel streaming tool calls (#291)

When a streamed tool_call delta omits the `id` field, the chat-completion
decoder used to fall back to the literal string `"tool_call"` for every
call. With the V4 API's native parallel tool calls (multiple tool_calls
in one delta), every parallel call ended up with the same fallback id —
downstream tool-result routing then matched the first call's result
twice and the second call hung waiting for an answer that never arrived.

The fallback now indexes by the assigned `content_block` position,
producing `"call_0"`, `"call_1"`, … within a single response. Upstream-
supplied ids are still forwarded verbatim; only the fallback path
changes.
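
As a rough sketch of the fallback rule (the function name `tool_call_id` and its signature are hypothetical, and the real decoder tracks more state than this):

```rust
/// Picks the id for a streamed tool call. `upstream_id` is whatever the
/// delta carried; `block_index` is the content-block position the decoder
/// assigned to this call within the current response.
pub fn tool_call_id(upstream_id: Option<&str>, block_index: usize) -> String {
    upstream_id
        .map(str::to_string) // upstream ids are forwarded verbatim
        .unwrap_or_else(|| format!("call_{block_index}")) // unique per block
}
```

Two parallel calls without ids at positions 0 and 1 therefore receive `call_0` and `call_1`, so tool-result routing can tell them apart.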

Tests pin both invariants:
- `decoder_assigns_unique_fallback_ids_to_parallel_tool_calls_missing_id`
  feeds two tool calls without `id` in one delta and asserts they get
  distinct ids.
- `decoder_preserves_upstream_tool_call_id_when_present` keeps the
  forward-as-is path honest.

* fix(cache): place handoff and working_set after static prompt blocks (#292)

* fix(cache): place handoff and working_set after static prompt blocks

`system_prompt_for_mode_with_context_and_skills` previously interleaved
volatile content into the static prefix:

  1. mode prompt           static
  2. project context       static
  3. working_set_summary   ← volatile
  4. skills_block          static
  5. handoff_block         ← volatile
  6. ## Context Management static
  7. COMPACT_TEMPLATE      static

Anything past block (3) cache-missed every time the working-set drifted
or `/compact` rewrote `.deepseek/handoff.md` — including the static
`## Context Management` and `## Compaction Handoff` blocks behind them.

New order keeps every static block in the cached prefix and pushes the
two volatile blocks to the end:

  1. mode prompt
  2. project context (or fallback automap)
  3. skills block
  4. ## Context Management (Agent / Yolo only)
  5. COMPACT_TEMPLATE
  ── volatile boundary ──
  6. handoff block
  7. working-set summary

Adds a doc comment on the function describing the volatile-content-last
invariant so future contributors don't reintroduce churn into the
prefix. Adds two regression tests:

- `system_prompt_with_handoff_file_is_byte_stable_when_file_is_unchanged`
  pins the handoff path with a fixture file.
- `handoff_and_working_set_appear_after_static_blocks` asserts the
  ordering invariant directly so a future reorder fails loudly.
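
The volatile-content-last invariant can be illustrated with a small sketch. `PromptParts`, `assemble`, and `common_prefix_len` are hypothetical names, and the blank-line separator is an assumption; the real builder is `system_prompt_for_mode_with_context_and_skills`, whose signature is not shown here.

```rust
/// Hypothetical bundle of prompt blocks, in the order the commit describes.
#[derive(Clone, Copy)]
pub struct PromptParts<'a> {
    pub mode_prompt: &'a str,
    pub project_context: &'a str,
    pub skills_block: &'a str,
    pub context_management: Option<&'a str>, // Agent / Yolo only
    pub compact_template: &'a str,
    // volatile boundary: everything below may change between turns
    pub handoff_block: Option<&'a str>,
    pub working_set_summary: Option<&'a str>,
}

/// Joins static blocks first and volatile blocks last.
pub fn assemble(parts: &PromptParts<'_>) -> String {
    let statics = [
        Some(parts.mode_prompt),
        Some(parts.project_context),
        Some(parts.skills_block),
        parts.context_management,
        Some(parts.compact_template),
    ];
    let volatile = [parts.handoff_block, parts.working_set_summary];
    statics
        .iter()
        .chain(volatile.iter())
        .flatten()
        .copied()
        .collect::<Vec<_>>()
        .join("\n\n")
}

/// Length in bytes of the shared prefix of two prompts, i.e. the part a
/// prefix cache could reuse.
pub fn common_prefix_len(a: &str, b: &str) -> usize {
    a.bytes().zip(b.bytes()).take_while(|(x, y)| x == y).count()
}
```

With this layout, two prompts that differ only in the working-set summary share every static block as a common prefix.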

Reference: Claude Code's own prompt builder marks this same boundary
with a `SYSTEM_PROMPT_DYNAMIC_BOUNDARY` constant; we don't introduce
the abstraction yet but match the principle.

* feat(i18n): localize slash command help (Phase 1a, #285) (#294)

Adds 44 new MessageIds, one per slash command, and translations to all
four shipped locales (en/ja/zh-Hans/pt-BR). Refactors CommandInfo so the
English description now lives in localization.rs (single source of
truth) instead of being duplicated on the struct, and threads the
active Locale through the three render surfaces:

- crates/tui/src/tui/views/help.rs (the ?/F1/Ctrl+/ help overlay)
- crates/tui/src/tui/command_palette.rs (Ctrl+K palette)
- crates/tui/src/commands/core.rs (the /help text command)

Usage strings (e.g. /cache [count]) stay English by design — they're
placeholder syntax, not natural language.

The existing locale-coverage test
(`shipped_first_pack_has_no_missing_core_messages`) already iterates
ALL_MESSAGE_IDS across Locale::shipped(), so the 44 new IDs are
automatically required to be present in all four locale arms or CI
fails.

This is the first of several incremental Phase 1 PRs. Phase 1b covers
the debug commands (/tokens /cost /cache), 1c the footer hints, and
1d doctor output. Phases 2–3 cover onboarding and error surfaces.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): localize /tokens /cost /cache debug output (Phase 1b, #285) (#295)

Adds 13 new MessageIds covering the report templates and the
sub-strings shared across them, with translations for all four
shipped locales (en/ja/zh-Hans/pt-BR):

- CmdTokensReport, CmdTokensContextWithWindow, CmdTokensContextUnknownWindow
- CmdTokensCacheBoth, CmdTokensCacheHitOnly, CmdTokensCacheMissOnly
- CmdTokensNotReported
- CmdCostReport
- CmdCacheNoData, CmdCacheHeader, CmdCacheTotals, CmdCacheFootnote, CmdCacheAdvice

Each template uses {placeholder} substitution via String::replace
rather than format!, since format! requires a literal — the
locale-resolved &'static str isn't one. The placeholder convention
({active}, {hit}, {miss}, …) means a translator can re-order or
restructure a sentence freely without changing the call site.
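
A minimal sketch of the placeholder approach, assuming a hypothetical two-variant `Locale` and made-up template strings (these are not the shipped templates or translations):

```rust
/// Hypothetical locale enum; `Other` stands in for a translation that
/// puts the miss count first.
#[derive(Clone, Copy)]
pub enum Locale {
    En,
    Other,
}

/// Hypothetical lookup returning a locale-resolved `&'static str`.
/// Because the result is not a string literal, it cannot be passed to
/// `format!` as the format string.
pub fn cache_both_template(locale: Locale) -> &'static str {
    match locale {
        Locale::En => "cache hit {hit} / miss {miss}",
        Locale::Other => "miss {miss} / cache hit {hit}",
    }
}

/// Fills named placeholders with `String::replace`; the call site is the
/// same regardless of the order the template uses.
pub fn render_cache_both(locale: Locale, hit: u64, miss: u64) -> String {
    cache_both_template(locale)
        .replace("{hit}", &hit.to_string())
        .replace("{miss}", &miss.to_string())
}
```

The trade-off is that a misspelled placeholder is not caught at compile time, so tests that byte-match the rendered output carry more weight with this design.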

Helpers `token_count`, `active_context_summary`, `cache_summary`, and
`format_cache_history` now take `Locale` so each resolves its
templates from the same source of truth.

The English templates byte-match the previous hardcoded format strings
so the existing 16 debug-command tests pass unchanged.

Column headers in the cache table (`turn   in    out   hit   miss …`)
are intentionally NOT localized — the body rows are formatted with
fixed column widths and translating the header words would break
alignment. Numbers, ratios, and the model id stay in English form.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(i18n): localize footer state + help section labels (Phase 1c, #285) (#296)

Adds 11 new MessageIds covering visible footer chrome and the help-overlay
section headings, with translations for all four shipped locales:

Footer:
- FooterWorking — animated `working` / `working.` / … pulse
- FooterAgentSingular / FooterAgentsPlural — the sub-agent count chip
- FooterPressCtrlCAgain — the quit-confirmation toast

Help overlay sections (`?` / `F1` / `Ctrl+/`):
- HelpSectionNavigation, HelpSectionEditing, HelpSectionActions,
  HelpSectionModes, HelpSectionSessions, HelpSectionClipboard,
  HelpSectionHelp

`KeybindingSection::label` now takes Locale and returns tr(locale, …).
`footer_working_label` and `footer_agents_chip` likewise take Locale; the
two production callsites in tui/ui.rs pass `app.ui_locale`.

The mode chip itself (agent / yolo / plan) intentionally stays English —
those are brand/acronym labels, and translating them would mean explaining
to maintainers what `代理` means in a bug report.

The keybinding catalog DESCRIPTIONS (41 entries) are not translated in this
PR — those are technical prose that would dwarf the rest of i18n work and
can ship in v0.8.5. Section labels are translated so the help overlay
groups read as expected in any locale.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(commands): smoke-test that every / command dispatches to a handler (#299)

Adds two parallel-safe smoke tests in `crates/tui/src/commands/mod.rs`
that iterate the COMMANDS registry and verify every command — and every
declared alias — dispatches to a real handler. A dispatch miss surfaces
as the fall-through `Unknown command:` error message in `execute`,
which used to be invisible until a user typed the command and saw the
"did you mean" suggestion fire on a registered command.

The tests build a workspace-isolated app via `tempfile::TempDir` so
side-effecting handlers (`/init` writing AGENTS.md, `/save` and
`/export` writing files) do not pollute `crates/tui/` when CI runs from
there. `/save` and `/export` get an explicit tempdir-relative path
because their no-arg defaults still resolve relative to `cwd`.
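
The registry walk behind these smoke tests could be sketched as below. The two-entry `COMMANDS` table, the simplified `CommandInfo`, and the `execute` body are stand-ins invented for the example; only the `Unknown command:` fall-through text is taken from the description above.

```rust
/// Simplified, hypothetical registry entry.
pub struct CommandInfo {
    pub name: &'static str,
    pub aliases: &'static [&'static str],
}

/// Tiny stand-in registry; the real one lists every slash command.
pub const COMMANDS: &[CommandInfo] = &[
    CommandInfo { name: "help", aliases: &["?"] },
    CommandInfo { name: "clear", aliases: &[] },
];

/// Stand-in dispatcher with the same fall-through error shape.
pub fn execute(input: &str) -> Result<String, String> {
    match input.trim_start_matches('/') {
        "help" | "?" => Ok("help text".to_string()),
        "clear" => Ok("cleared".to_string()),
        other => Err(format!("Unknown command: {other}")),
    }
}

/// Returns every registered name or alias that falls through to the
/// unknown-command branch. An empty result means full dispatch coverage.
pub fn undispatched() -> Vec<String> {
    COMMANDS
        .iter()
        .flat_map(|c| std::iter::once(c.name).chain(c.aliases.iter().copied()))
        .filter(|n| {
            matches!(execute(&format!("/{n}")), Err(e) if e.starts_with("Unknown command:"))
        })
        .map(str::to_string)
        .collect()
}
```

Checking for the fall-through message rather than for `Ok` lets handlers that legitimately fail for other reasons (for example, a missing argument) still count as dispatched.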

`/restore` is skipped — it shells out to git for the snapshot repo and
its own dedicated tests in `commands/restore.rs` already serialize on
the global env mutex via `scoped_home`. The existing coverage there is
sufficient.

Closes a gap surfaced when verifying that the v0.8.4 i18n refactor
(#294, #295, #296) did not silently break any slash-command dispatch.
All 44 commands and their aliases pass (16 aliases on top of the
44 names; `/restore` is the only skip).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(release): bump version to 0.8.4 (#297)

CHANGELOG entry covers the v0.8.4 work landed since 0.8.3:

- Localization Phase 1 (#285) — slash command help (#294), debug command
  output (#295), footer state and help-overlay section labels (#296).
  Adds 68 new MessageIds across all four shipped locales (en/ja/zh-Hans/pt-BR).

- Cache-prefix stability (#263) — five companion fixes (#287, #288→#292,
  #289, #290, #291) that keep the DeepSeek prefix cache stable across turns.

- Plus the items already in [Unreleased]: agent-mode network exec (#272),
  /skill GitHub URL parsing (#269), and the V4 Pro discount expiry extension
  (#267).

Bumps:
- Cargo.toml workspace version 0.8.3 → 0.8.4
- npm/deepseek-tui/package.json version + deepseekBinaryVersion 0.8.3 → 0.8.4
- Cargo.lock regenerated from the new workspace version.

Phase 1d (doctor output), Phase 2 (onboarding/init/missing-companion),
and Phase 3 (tool errors / sandbox denials / approvals) deferred to v0.8.5.
The shipped Phase 1 surfaces (slash commands, debug telemetry, footer
chrome) cover the highest-traffic UI paths Chinese users see first.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(release): bump internal path-dep versions + repair doc link (#301)

CI on PR #300 (release feat/v0.8.4 → main) flagged two regressions
introduced by the 0.8.4 version bump:

1. Version drift — path-dependency `version = "0.8.3"` references
   inside the workspace crates (10 crates: agent, app-server, cli,
   config, core, execpolicy, hooks, mcp, tools, tui) did not move with
   the workspace `[workspace.package] version = "0.8.4"`. The CI guard
   `scripts/release/check-versions.sh` requires they match.

2. Broken intra-doc-link `[crate::localization::english]` in the
   CommandInfo doc comment — `english` is private. Replaced with a
   reference to the public `description_for` accessor and the public
   `tr()` function.

Verified with:
- scripts/release/check-versions.sh — Version state OK.
- RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — green.
- cargo fmt + clippy + test all green.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 23:02:38 -05:00
Hunter Bown 1042e37fbd refactor(cli): centralize feature command output 2026-05-01 11:06:42 -05:00
Hunter Bown e620e75f99 chore: release v0.8.3
Bumps workspace, all internal path-deps, and npm wrapper (version +
deepseekBinaryVersion) from 0.8.2 → 0.8.3. Lockfile re-locked offline.
CHANGELOG entry summarizing the 0.8.3 lane: skills path bug fix,
privacy contraction, helpful missing-companion error (#258), engine
decomposition (#227), bridge/persistence/palette test gap closures,
crates.io badge, and 10 issue closures.

Local v0.8.3 verified at /tmp/deepseek-0.8.3-test/ before publish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:46:21 -05:00
Hunter Bown a7e629ae4d test(parity): scan engine submodules after decomposition refactor
The protocol-recovery contract tests `include_str!`-ed `engine.rs` and
asserted the fake-wrapper markers (`[TOOL_CALL]`, `<function_calls>`,
…) appeared as string literals in that file. The recent engine
decomposition refactor (commits f0fad7aa..a64bc9bb) split engine.rs
into `engine/streaming.rs`, `engine/turn_loop.rs`, `engine/dispatch.rs`,
`engine/tool_setup.rs`, `engine/tool_execution.rs`,
`engine/tool_catalog.rs`, `engine/context.rs`, `engine/approval.rs`,
`engine/capacity_flow.rs`, and `engine/lsp_hooks.rs`. The marker
literals followed the code into those files, so the original
single-file `include_str!` no longer saw them and 4 protocol-recovery
tests went red.

Switch to an `ENGINE_SOURCES: &[&str]` array of `include_str!`s across
engine.rs + every submodule, with a small `any_engine_source_contains`
helper. Test bodies are otherwise unchanged. The file-size sanity
check on `engine.rs` (>10_000 bytes) still passes — engine.rs is still
~65k bytes after the refactor.
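
A sketch of the multi-file scan, with string literals standing in for the `include_str!` calls so the example compiles on its own. Which marker lives in which submodule is invented here; only `ENGINE_SOURCES`, `any_engine_source_contains`, and the two marker strings come from the message.

```rust
/// In the real test each element would be an `include_str!` of an engine
/// source file. The literals below are stand-ins for illustration.
pub const ENGINE_SOURCES: &[&str] = &[
    "// engine.rs stand-in\nfn run() {}",
    "// engine/streaming.rs stand-in\nconst A: &str = \"[TOOL_CALL]\";",
    "// engine/dispatch.rs stand-in\nconst B: &str = \"<function_calls>\";",
];

/// True if any engine source file contains `needle` as a substring.
pub fn any_engine_source_contains(needle: &str) -> bool {
    ENGINE_SOURCES.iter().any(|src| src.contains(needle))
}
```

Test bodies can then swap a single-file `contains` check for this helper and remain valid if the markers move between submodules again.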

Same regression coverage as before; just survives the new file layout.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:34:11 -05:00
Hunter Bown bb88ab9129 fix(cli): make missing-companion-binary error actually helpful (#258)
@whereiszebra (issue #258) downloaded just `deepseek-macos-arm64` from
the GitHub Release, ran it, hit:

  error: deepseek-tui binary not found at /path/to/deepseek-tui.
  Build workspace default members to install it, or set DEEPSEEK_TUI_BIN
  to its absolute path.

…spent 11 minutes figuring out they also needed `deepseek-tui-macos-arm64`
sitting next to it, and self-closed with: "Release page does not document
that both deepseek-macos-arm64 and deepseek-tui-macos-arm64 must be
downloaded together."

The dispatcher's error was the wrong message for the population that hits
it most often — direct GitHub Release downloaders. "Build workspace default
members" is meaningless if you didn't clone the repo. `DEEPSEEK_TUI_BIN`
is also not what they need.

New message lists the three concrete install paths that actually work for a
fresh user — npm, cargo, or grab BOTH binaries from the same release
page — and keeps the env var override as a final fallback for power
users. No logic change; just better text. Existing
`locate_sibling_tui_binary_honours_env_override` test still passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 09:28:28 -05:00
Hunter Bown a64bc9bbe5 refactor(engine): isolate streaming state helpers 2026-05-01 09:09:09 -05:00
Hunter Bown 8379230ef1 refactor(engine): split tool execution helpers 2026-05-01 08:07:22 -05:00
Hunter Bown 8dd5ed38d7 refactor(engine): extract context helpers 2026-05-01 07:09:30 -05:00
Hunter Bown f0fad7aa2e refactor(engine): modularize turn tool setup 2026-05-01 06:07:59 -05:00
Hunter Bown 0887a88465 refactor(engine): extract tool catalog helpers 2026-05-01 05:09:39 -05:00
Hunter Bown d2c007833f test(rlm): make bridge client seam mockable 2026-05-01 04:10:59 -05:00
Hunter Bown 84da3b7fc6 test(rlm): cover bridge batch and depth guard 2026-05-01 03:09:05 -05:00
Hunter Bown df53a22113 test(utils): gate display_path tests to cfg(unix)
The tests set $HOME to drive `dirs::home_dir()`. On Unix that's the
contract dirs uses; on Windows dirs reads %USERPROFILE% first, so
setting HOME has no effect and the tests fail.

The `display_path` function itself is platform-identical — it
delegates to `dirs::home_dir()` for the home prefix and uses
`std::path::MAIN_SEPARATOR` for the separator after the tilde. The
contraction logic is exercised on macOS/Linux which is sufficient
coverage for an abstraction whose platform detail is delegated.

If we want Windows-specific assertion coverage in the future, it should
either set USERPROFILE alongside HOME or accept an injected home dir.
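
The injected-home option might be sketched as follows. `display_path_with_home` is a hypothetical name, and the contraction rules shown are an assumption about what `display_path` does, not a copy of it; the real function would presumably pass `dirs::home_dir()` as the second argument.

```rust
use std::path::{Path, MAIN_SEPARATOR};

/// Contracts a leading home directory to `~`. The home directory is a
/// parameter, so tests need not touch HOME or USERPROFILE.
pub fn display_path_with_home(path: &Path, home: Option<&Path>) -> String {
    if let Some(home) = home {
        if let Ok(rest) = path.strip_prefix(home) {
            if rest.as_os_str().is_empty() {
                // The path is the home directory itself.
                return "~".to_string();
            }
            return format!("~{}{}", MAIN_SEPARATOR, rest.display());
        }
    }
    // No home known, or the path lies outside it: show it unchanged.
    path.display().to_string()
}
```

Since nothing here reads the environment, the same assertions can run on Unix and Windows without a global env mutex.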

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 02:59:57 -05:00