Commit Graph

636 Commits

Author SHA1 Message Date
Hunter Bown b31bc4104b feat(config): separate tui.toml for theme and keybinds (#657) 2026-05-05 00:12:29 -05:00
Hunter Bown 1e968a3ec4 feat(tui): vim modal editing in composer (#659) 2026-05-05 00:12:28 -05:00
Hunter Bown caf1ac2a89 feat(skills): remote registry sync with /skills sync command (#654) 2026-05-05 00:12:17 -05:00
Hunter Bown 2928d20e09 feat(tools): FIM edit tool for V4 /beta endpoint (#668) 2026-05-05 00:12:17 -05:00
Hunter Bown 84ff982086 feat(lsp): auto-attach diagnostics to edit results (#656) 2026-05-05 00:12:09 -05:00
Hunter Bown fb2e61e123 feat(execpolicy): bash arity dictionary for command-prefix allow rules (#655) 2026-05-05 00:12:09 -05:00
Hunter Bown 95e92ef1fc feat(commands): unified slash-command namespace with template substitution (#661) 2026-05-05 00:12:08 -05:00
Hunter Bown f8f8160532 feat(engine): reasoning_effort auto mode (#669) 2026-05-05 00:11:56 -05:00
Hunter Bown c5c90da985 fix(notifications): actually call MessageBeep on Windows (#646) 2026-05-05 00:11:14 -05:00
Hunter Bown 7feadba939 fix(fork): optimize truncate_id to avoid unnecessary allocation (#649) 2026-05-05 00:11:08 -05:00
Hunter Bown 0370e45a97 fix(tui): replace hardcoded colors with Color::Reset + add use_terminal_colors config (#671) 2026-05-05 00:11:02 -05:00
Hunter Bown b750cef1b5 fix(theme): complete Color::Reset migration across all UI widgets (#651) 2026-05-05 00:10:24 -05:00
Hunter Bown 848725e65d docs: document zh-Hans locale activation (#652) 2026-05-05 00:08:39 -05:00
Hunter Bown faeda74d66 docs: add CODE_OF_CONDUCT.md (#686) 2026-05-04 23:58:49 -05:00
Hunter Bown cb2699de08 docs: add SECURITY.md with vulnerability reporting policy (#648) 2026-05-04 23:58:24 -05:00
Hunter Bown abc8751d81 docs(release): add v0.8.11 fix entries to CHANGELOG + update README What's New
- CHANGELOG: add YOLO sandbox, scroll lock, and capacity controller fixes
- README: replace v0.8.10 highlights with v0.8.11 cache-maxing + fixes
- Date corrected to 2026-05-04
2026-05-04 23:50:15 -05:00
Hunter Bown 4c783be52d fix(yolo): drop sandbox to DangerFullAccess — no guardrails as advertised
User report: YOLO mode was still routing shell commands through the
WorkspaceWrite sandbox, which intercepted legitimate outside-workspace
writes (package installs, sub-agent workspaces, package-manager state
under ~/.cache, brew, npm install -g, pipx, …) and forced approval
round-trips. That contradicts the YOLO contract — the user opted into
"no guardrails" and instead got a guardrail.

YOLO already auto-approves all tools and enables trust mode. The
sandbox was the last residual restriction. Drop it.

Change in `Engine::build_tool_context`: split the previously-merged
`AppMode::Agent | AppMode::Yolo` arm into two:

* **Agent** keeps `WorkspaceWrite { writable_roots, network_access:
  true, … }` — interactive mode with explicit per-tool approval, so
  the sandbox plus the approval flow form a defense-in-depth layer.
* **Yolo** uses `DangerFullAccess` — no sandbox. The user has
  opted into auto-approval + trust mode + no sandbox as one
  consistent posture.

Plan mode unchanged (read-only, no shell tool registered).

Updated `agent_and_yolo_modes_elevate_shell_sandbox_to_allow_network`
to pin the new YOLO contract: `DangerFullAccess` specifically, not
just "has network access."
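
For illustration only, the split arm can be sketched as below. The
enum shapes and the `shell_sandbox_for` helper are simplified,
hypothetical stand-ins (the real `WorkspaceWrite` carries more fields
than shown here); only the mode-to-policy mapping is taken from the
description above.

```rust
use std::path::PathBuf;

// Hypothetical, simplified stand-ins for the engine's real types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum AppMode {
    Plan,
    Agent,
    Yolo,
}

#[derive(Debug, Clone, PartialEq)]
enum SandboxPolicy {
    WorkspaceWrite {
        writable_roots: Vec<PathBuf>,
        network_access: bool,
    },
    DangerFullAccess,
}

/// Maps an app mode to the shell sandbox it should run under.
/// `None` means no shell tool is registered for that mode.
fn shell_sandbox_for(mode: AppMode, writable_roots: Vec<PathBuf>) -> Option<SandboxPolicy> {
    match mode {
        // Plan mode is read-only and registers no shell tool.
        AppMode::Plan => None,
        // Agent keeps the sandbox as a second layer next to approvals.
        AppMode::Agent => Some(SandboxPolicy::WorkspaceWrite {
            writable_roots,
            network_access: true,
        }),
        // Yolo drops the sandbox entirely.
        AppMode::Yolo => Some(SandboxPolicy::DangerFullAccess),
    }
}
```

Keeping the two arms separate means a later change to the Agent
sandbox cannot silently reintroduce a restriction in YOLO.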

Verified locally:

* `cargo fmt --all -- --check` clean.
* `cargo clippy --workspace --all-targets --all-features --locked
  -- -D warnings` clean.
* `cargo test --workspace --all-features --locked` — green
  (the snapshot::repo flake still flakes in batch but passes in
  isolation; unrelated).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:37:20 -05:00
Hunter Bown 5c72e5f463 fix(scroll): preserve user_scrolled_during_stream lock across resolve
Live repro: in a session producing content rapidly (sub-agent
running, multiple tool calls), the user scrolls up to read earlier
output. Their scroll position briefly takes effect, then snaps back
to the live tail when the next stream chunk arrives. Symptom is
"scrolling is broken / takes over instead of the transcript".

Root cause in `crates/tui/src/tui/widgets/mod.rs:188-210`:

* The user's mouse-scroll-up sets `transcript_scroll = at_line(N)`
  and `user_scrolled_during_stream = true`.
* During render, `resolve_top` clamps the state against
  `max_start = total_lines.saturating_sub(visible_lines)`. If
  `max_start < N` (transcript shrank between scroll and render —
  e.g., a sub-agent in-progress card collapsed into a smaller
  finished card, or the content briefly fits in one screen),
  `resolve_top` returns `Self::to_bottom()` (TAIL_SENTINEL).
* `is_at_tail()` on the post-resolve state returns `true`.
* The auto-clear at line 208 fires →
  `user_scrolled_during_stream = false`.
* Next `add_message` / sub-agent envelope sees `is_at_tail() &&
  !user_scrolled_during_stream` and calls `scroll_to_bottom()`. The
  user is yanked off their position mid-read.

`scrolled_by` has the same trapdoor: when
`total_lines <= visible_lines` it returns `to_bottom()` regardless of
scroll direction (lines 145-148 in scrolling.rs). A user scroll-up
while content fits in one screen produces `to_bottom()` →
`is_at_tail()` true → auto-clear → next chunk yanks.

The fix
=======

Snapshot whether the user's PRIOR state was deliberately tail
(`is_at_tail()` BEFORE `resolve_top`), and only clear the lock
when:

1. Prior state was already TAIL_SENTINEL (deliberate, set by
   `scrolled_by` reaching `max_start` while scrolling DOWN, or by
   `scroll_to_bottom()`).
2. AND `total_lines > visible_lines` (so "tail" is meaningful —
   if the whole transcript fits, "is_at_tail" is trivially true
   and clearing the lock would yank the user back to bottom on
   the next chunk despite their explicit scroll-up).

This preserves all the legitimate clear paths:
* `TurnComplete` event clears the lock at the per-turn boundary
  (`ui.rs:879`).
* User invokes `scroll_to_bottom()` explicitly via key/menu
  (`app.rs:2459`).
* User scrolls down enough that `scrolled_by` reaches `max_start`
  in a transcript with real scroll room — state goes through
  `to_bottom()` BEFORE resolve, so `was_explicit_tail = true` and
  the lock clears.

What it stops:
* Render-time resolve clamping `at_line(N)` to tail when content
  shrank no longer quietly revokes the user's intent.
* `scrolled_by` collapsing a scroll-up to `to_bottom()` when
  content briefly fits in one screen no longer triggers the
  auto-clear (the prior state wasn't tail).
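
A minimal sketch of the clear condition, assuming it can be reduced
to a pure function. The function name and signature are hypothetical;
`was_explicit_tail` stands for the `is_at_tail()` snapshot taken
BEFORE `resolve_top`.

```rust
/// Returns true when the render pass may clear
/// `user_scrolled_during_stream`.
///
/// `was_explicit_tail` is the pre-resolve `is_at_tail()` snapshot, so
/// a tail produced only by render-time clamping does not count.
fn may_clear_scroll_lock(
    was_explicit_tail: bool,
    total_lines: usize,
    visible_lines: usize,
) -> bool {
    // "Tail" is only meaningful when there is room to scroll at all.
    let has_scroll_room = total_lines > visible_lines;
    was_explicit_tail && has_scroll_room
}
```

Both conditions must hold: a clamped `at_line(N)` fails the first,
and a transcript that fits in one screen fails the second.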

Verified locally:

* `cargo fmt --all -- --check` clean.
* `cargo clippy --workspace --all-targets --all-features --locked
  -- -D warnings` clean.
* `cargo test --workspace --all-features --locked` — 2038 passed,
  2 ignored, 0 failed (a snapshot::repo flake unrelated to scroll;
  passes in isolation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:35:39 -05:00
Hunter Bown 1131e7a7b0 fix(capacity): disable controller by default (silent transcript wipe)
User-facing repro:

* In YOLO mode at low context utilisation (~5%), the engine briefly
  showed `resetting plan` in the footer and the transcript area went
  mostly black. Tools kept running (Plan panel + sidebar still
  rendered), but the chat history above the latest turn was gone.

Root cause: the capacity controller's `VerifyAndReplan` action
(`crates/tui/src/core/engine/capacity_flow.rs::apply_verify_and_replan`)
runs `self.session.messages.clear()` and rebuilds from the canonical
state. The capacity controller fires this when its slack-based
`p_fail` calculation crosses the high-severe band — independently of
the `auto_compact` setting, independently of token utilisation.

The user opted out of auto-compaction in v0.8.11 (default
`auto_compact = false`, #665), explicitly trusting the model with
the full 1M-token V4 window. Auto-managing the prefix on their
behalf via the capacity controller contradicts that posture and
silently destroys both the user-visible transcript and V4's prefix
cache.

The fix
=======

Flip `CapacityControllerConfig::default().enabled` from `true` to
`false`. The controller's `observe_*` and `decide` methods already
short-circuit when `enabled` is false (`capacity.rs:255`,
`capacity.rs:396`), so the existing wiring becomes a no-op for the
default config — no need for defensive gating in
`capacity_flow.rs`.

Power users who want the controller can opt in via
`capacity.enabled = true` in `~/.deepseek/config.toml`. The slack
heuristics, model priors, cooldowns, and intervention paths all
remain in the codebase, ready to re-engage on opt-in. Nothing
deleted.
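
The effect of the flipped default can be sketched as follows. This is
a simplified model, not the code in `capacity.rs`: the `Decision`
enum, the `p_fail` parameter, and the `0.8` band are hypothetical and
only illustrate the short-circuit on `enabled`.

```rust
#[derive(Debug, Clone)]
struct CapacityControllerConfig {
    enabled: bool,
}

impl Default for CapacityControllerConfig {
    fn default() -> Self {
        // New default: the controller is opt-in.
        Self { enabled: false }
    }
}

// Hypothetical decision type for illustration.
#[derive(Debug, PartialEq)]
enum Decision {
    VerifyAndReplan,
}

struct CapacityController {
    config: CapacityControllerConfig,
}

impl CapacityController {
    /// Short-circuits when disabled, so downstream wiring is a no-op.
    fn observe_pre_turn(&self, p_fail: f64) -> Option<Decision> {
        if !self.config.enabled {
            return None;
        }
        // Hypothetical severity band; the real heuristics are richer.
        if p_fail >= 0.8 {
            Some(Decision::VerifyAndReplan)
        } else {
            None
        }
    }
}
```

Because the gate sits in the controller itself, callers such as the
capacity flow need no extra checks of their own.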

Tests
=====

* `default_controller_is_disabled_and_skips_observations` — pins
  the new default; `observe_pre_turn` returns `None`.
* `opt_in_controller_observes_and_decides` — confirms `enabled =
  true` rearms the controller end-to-end.
* `app_config_without_capacity_uses_default_disabled` — pins that
  loading a config with no `[capacity]` section produces
  `enabled = false`.
* `capacity_disabled_by_default_keeps_messages_intact` — direct
  regression for the user-reported symptom: with default config,
  even a forced error-escalation checkpoint cannot trigger
  `messages.clear()`. Asserts the transcript length is preserved.

Verified locally:

* `cargo fmt --all -- --check` clean.
* `cargo clippy --workspace --all-targets --all-features --locked
  -- -D warnings` clean.
* `cargo test --workspace --all-features --locked` — 2039 passed,
  2 ignored, 0 failed (one flake on `snapshot::repo::tests::
  restore_removes_files_added_after_target_snapshot` was filesystem-
  timing-dependent, passes on isolation re-run; unrelated to this
  change).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:24:24 -05:00
Hunter Bown d4e5ee4eff chore(release): date v0.8.11 CHANGELOG section 2026-05-04 23:07:57 -05:00
Hunter Bown 54ed8c1d2f Merge pull request #678 from Hmbown/chore/v0.8.11-release
chore(release): bump version to 0.8.11 + CHANGELOG
2026-05-04 23:07:44 -05:00
Hunter Bown 9366a7c5f3 chore(release): bump version to 0.8.11
Final step in the v0.8.11 patch release. Bumps the workspace
`Cargo.toml`, all 9 internal path-dep version pins, and
`npm/deepseek-tui/package.json` to **0.8.11**. `Cargo.lock`
regenerated alongside.

The v0.8.11 CHANGELOG entry already landed on `main` via the
cache-maxing overhaul PR (#684). This commit only stamps the
version. Together they ship:

* **Cache-maxing for V4 1M context** — engine no longer rebuilds the
  system prompt on every turn (#684's `Session::last_system_prompt_hash`),
  the volatile working-set summary moved out of the system prompt
  into per-turn `<turn_meta>` on the latest user message, the tool
  array is anchored with `cache_control: ephemeral`, and the
  `messages_with_turn_metadata` injection skips tool-result
  messages so the assistant→tool_result invariant stays intact.
* **500K compaction floor** — automatic compaction refuses below
  500K tokens via `MINIMUM_AUTO_COMPACTION_TOKENS`. Manual
  `/compact` bypasses (explicit user agency).
* **Token-only compaction trigger** — dropped
  `CompactionConfig::message_threshold` and the message-count
  branch in `should_compact`; that 128K-era heuristic only fired
  on long sessions of small messages, exactly the case where
  rewriting the V4 prefix cache is most wasteful.
* **Legacy 128K naming** — `DEFAULT_CONTEXT_WINDOW_TOKENS` →
  `LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS`.
* **`npm install` resilience** — `install.js` now retries with
  exponential backoff, enforces per-attempt timeout + 30 s stall
  detector, honors `HTTPS_PROXY` / `HTTP_PROXY` / `NO_PROXY` (pure
  Node, no new dependencies), and prints download progress to
  stderr. Driven by a community report that `npm install` took 18
  minutes through a CN npm mirror; the GitHub Releases binary
  fetch was the bottleneck and CN mirrors don't proxy GitHub.

Verified locally:
* cargo fmt --all -- --check                            ✓
* cargo clippy --workspace --all-targets --all-features
                --locked -- -D warnings                 ✓
* cargo test --workspace --all-features --locked        ✓
* parity gates (snapshot, parity_protocol, parity_state) ✓
* bash scripts/release/check-versions.sh                ✓
  (workspace=0.8.11, npm=0.8.11, lockfile in sync)
* node scripts/release/npm-wrapper-smoke.js             ✓

Reminder for the maintainer at release time: the npm publish is
manual and requires 2FA OTP on every publish. After this PR
merges and the GitHub Release is fully drafted by `auto-tag.yml`,
publish from a developer machine:

    cd npm/deepseek-tui
    npm publish --access public

The `prepublishOnly` hook checks all eight binaries plus the
SHA256 manifest are present on the GitHub Release before letting
`npm publish` proceed, so this must happen *after* the GitHub
Release is finalized.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 23:01:16 -05:00
Hunter Bown 8dfecfb5d9 Merge pull request #685 from Agent-Skill-007/main
Revise README for clarity and structure
2026-05-04 23:00:15 -05:00
Hunter Bown e9b2c2ca5d Merge pull request #684 from Hmbown/feat/v0.8.11-compaction-v4-overhaul
feat(v0.8.11): cache-maxing overhaul — per-turn rebuild gate + working_set extraction + tool anchor
2026-05-04 22:51:29 -05:00
Leon.C f387037f51 docs: add CODE_OF_CONDUCT.md
Add Contributor Covenant Code of Conduct v2.1 to establish community guidelines and set clear expectations for participant behavior.
2026-05-05 11:50:02 +08:00
Hunter Bown 9e67e04e4a fix(install,tests): fmt nit + downloadText flowing-mode bug
CI on PR #684 caught two real issues that local checks missed:

**Lint failure (cargo fmt).** A regression test landed with a multi-line
`let ContentBlock::Text { text, .. } = real_user.content...` pattern
that local rustfmt accepted but CI's pinned toolchain collapsed onto a
single line. Reformatted to match.

**npm wrapper smoke failure ("Checksum manifest is missing
deepseek-<platform>").** Subtle Node.js streams interaction in
`install.js` introduced by the network-resilience cluster:

* `httpRequest` attaches a `data` event listener on the response to
  re-arm the stall timer.
* Attaching a `data` listener on a `Readable` puts the stream into
  flowing mode immediately.
* `downloadText` then ran `for await (const chunk of response)` to
  collect the body — the async iterator expects paused-mode and
  silently misses chunks that flow before / between iteration ticks.
* For small bodies (the ~100-byte SHA256 manifest), the entire
  response could flow through the stall listener before the async
  iterator's `read()` calls landed, leaving the joined body empty.
* Result: `parseChecksumManifest("")` returned an empty Map →
  `verifyChecksum` saw no entries → "manifest is missing X" after
  the actual binary download succeeded.

Binary downloads were unaffected because `download()` uses
`response.pipe(sink)` plus a `data` listener for progress — both
consume chunks via `data` events, no async iterator involved.

Fix: collect the response body in `downloadText` via direct `data`/
`end` event subscription. `data` listeners stack — both the stall
re-arm and the body collector fire on every chunk, no flowing-vs-
paused conflict. Stall detection still works.

Verified locally: `node scripts/release/npm-wrapper-smoke.js`
"npm wrapper smoke passed with local assets from <url>".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:37:23 -05:00
Hunter Bown 76dd924c7f fix(engine): turn_meta must skip tool-result messages (HTTP 400 fix)
Live-test repro: typing a single user message in the TUI triggered a
tool call (read_file Cargo.toml), and the *next* request to DeepSeek's
API returned HTTP 400:

    "An assistant message with 'tool_calls' must be followed by tool
    messages responding to each 'tool_call_id'. (insufficient tool
    messages following tool_calls message)"

Root cause: `messages_with_turn_metadata` walked the message list from
the tail and prepended a `<turn_meta>` Text block to the *last* message
with role="user". But tool-result messages also use role="user"
internally (they serialize to role="tool" on the wire). Inserting a
Text content block at index 0 of a tool-result message changed the
shape from `[ToolResult(...)]` to `[Text("turn_meta..."), ToolResult(...)]`,
which on the wire becomes a role="user" message with text instead of
the role="tool" message the API needs to satisfy the assistant's
prior tool_call. Hence the 400.

The fix:

* Restrict the injection target to messages that have at least one
  Text content block AND no ToolResult blocks. This identifies actual
  user-typed messages and skips tool-result envelopes.
* When the trailing slice has no eligible user message (e.g. mid-turn
  when a tool result is the most recent message), skip injection
  entirely. The working_set will surface again on the next genuine
  user prompt; we don't retroactively prepend onto an earlier user
  message because that would also confuse the API's tool-call
  continuity checks.
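
A sketch of the eligibility rule, using simplified hypothetical types
(`Role`, `ContentBlock`, `Message`) rather than the engine's own. How
the trailing slice is chosen is left to the caller; the sketch only
shows which message inside a slice qualifies.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
enum ContentBlock {
    Text(String),
    ToolResult(String),
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: Vec<ContentBlock>,
}

/// A real user-typed message: user role, at least one Text block,
/// and no ToolResult blocks.
fn is_injection_target(m: &Message) -> bool {
    m.role == Role::User
        && m.content.iter().any(|b| matches!(b, ContentBlock::Text(_)))
        && !m.content.iter().any(|b| matches!(b, ContentBlock::ToolResult(_)))
}

/// Index of the last eligible message in `slice`, or `None` when the
/// slice holds no genuine user prompt (injection is then skipped).
fn find_injection_target(slice: &[Message]) -> Option<usize> {
    slice.iter().rposition(is_injection_target)
}
```

Returning `None` instead of falling back to some other message keeps
tool-result envelopes as single-block `ToolResult` messages.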

Two regression tests pin the contract:

* `turn_metadata_skips_tool_result_messages` — assistant tool_call +
  tool_result + earlier user message: only the user message gets the
  prefix, the tool_result message stays a single-block ToolResult.
* `turn_metadata_skips_when_only_tool_results_trail` — the corner
  case where the trailing user-role message is solely a tool result
  (no real user message in the slice): no injection happens, the
  message returns unchanged.

Verified locally:

* 2038 tests passed in TUI bin (2 ignored, was 2036 — these are the
  +2 new regression tests).
* `cargo fmt`, `cargo clippy --locked -D warnings`, parity gates all
  clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:27:11 -05:00
Hunter Bown a08c891542 docs(changelog): cover message_threshold removal + install.js resilience
Folds two follow-on changes into the 0.8.11 entry:

* The token-only compaction trigger (drops the 128K-era message-count
  heuristic).
* The npm `install.js` network-resilience cluster (retry + timeout +
  proxy + progress).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:20:07 -05:00
Hunter Bown 229f02ea2c feat(npm): install.js network resilience for slow / firewalled networks
A community user from China reported `npm install deepseek-tui`
took 18 minutes through a CN npm mirror. The bottleneck is the
GitHub Releases binary fetch (~46 MB across two binaries), not the
npm tarball (which is 6.9 kB). The CN mirror does NOT proxy GitHub
release downloads, so any user behind a slow or lossy connection
is hitting the GitHub fetch directly with no resilience.

Four behaviors added to `npm/deepseek-tui/scripts/install.js`:

1. **Retry with exponential backoff.** Up to 5 attempts on network
   errors (ECONNRESET, ECONNREFUSED, ETIMEDOUT, EAI_AGAIN,
   network/host unreachable, EPIPE, ECONNABORTED) and 5xx upstream
   responses. Backoff `1s, 2s, 4s, 8s, 16s` with ±20% jitter. 4xx
   and checksum-mismatch are flagged non-retryable so we don't
   thrash on permanent failures. Final error includes the underlying
   message and the attempt count.

2. **Per-attempt total timeout + stall detector.** Total timeout
   defaults to 5 minutes per attempt (`DEEPSEEK_TUI_DOWNLOAD_TIMEOUT_MS`,
   alias `DEEPSEEK_DOWNLOAD_TIMEOUT_MS`). A stall detector aborts
   the request when no bytes arrive for 30 s
   (`DEEPSEEK_TUI_DOWNLOAD_STALL_MS`, alias
   `DEEPSEEK_DOWNLOAD_STALL_MS`) so a hung connection doesn't waste
   the whole timeout. Both budgets are surfaced in the error so the
   user can dial them up if they're on a slow pipe.

3. **HTTPS_PROXY / HTTP_PROXY support — pure Node, no new
   dependencies.** Detects `HTTPS_PROXY` / `HTTP_PROXY` (and the
   lowercase variants) and routes through the proxy via CONNECT
   tunneling. `NO_PROXY` exclusion list honored, with `*` and dotted-
   suffix matching. Proxy auth via standard `user:pass@` URL form is
   passed through as `Proxy-Authorization: Basic ...`. Pure-Node
   implementation using `net` + `tls` + `http` + `https` builtins —
   no `https-proxy-agent` dependency added.

4. **Download progress indicator.** Writes to stderr every ~1 MB
   or every 2 s in TTY mode using `\r` to overwrite a single line.
   Non-TTY mode (CI, piped) emits one line per 5 MB so logs stay
   reasonable. Suppressed when `DEEPSEEK_TUI_QUIET_INSTALL=1` or
   when `npm_config_loglevel` is `silent` or `error`. Falls back to
   `N MB downloaded` when the response has no `Content-Length`.
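
The backoff schedule from item 1 can be sketched as below. The real
logic lives in `install.js`; this is a Rust rendering for
illustration, and the constant names, function names, and the
caller-supplied `unit_random` parameter are all assumptions.

```rust
use std::time::Duration;

// Assumed names; the values come from the description above.
const MAX_ATTEMPTS: u32 = 5;
const BASE_DELAY_MS: u64 = 1_000;
const JITTER_FRACTION: f64 = 0.20;

/// Delay for a zero-based `attempt`: 1s, 2s, 4s, ... scaled by a
/// jitter factor in [0.8, 1.2). `unit_random` must be in [0, 1) and
/// is passed in so the function stays deterministic.
fn backoff_delay(attempt: u32, unit_random: f64) -> Duration {
    let base_ms = BASE_DELAY_MS << attempt;
    let factor = 1.0 + JITTER_FRACTION * (2.0 * unit_random - 1.0);
    Duration::from_millis((base_ms as f64 * factor).round() as u64)
}

/// The nominal schedule for all attempts with a fixed random value.
fn schedule(unit_random: f64) -> Vec<Duration> {
    (0..MAX_ATTEMPTS).map(|a| backoff_delay(a, unit_random)).collect()
}

/// 5xx responses are retried; 4xx responses are permanent failures.
fn is_retryable_status(status: u16) -> bool {
    (500..600).contains(&status)
}
```

Passing `unit_random = 0.5` yields the unjittered delays, which makes
the schedule easy to check against the list above.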

Public API unchanged: existing callers of `getBinaryPath` and `run`
keep working identically when no new env vars are set. The escape
hatch `DEEPSEEK_TUI_DISABLE_INSTALL=1` still exits cleanly.

Verified locally:

* `node -c install.js` and module-load syntax checks.
* `DEEPSEEK_TUI_FORCE_DOWNLOAD=1 DEEPSEEK_TUI_VERSION=0.8.10 node
  install.js` — real GitHub Releases download succeeded with
  visible progress, both binaries landed.
* `HTTPS_PROXY=http://invalid.proxy.local:9999 ... node install.js`
  — proxy path exercised, fails cleanly with the bad host named
  in the error message after retries exhausted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:19:34 -05:00
Hunter Bown d0e95f23b0 refactor(compaction): drop message_threshold, token-only triggering
The `message_threshold` field on `CompactionConfig` was a 128K-era
heuristic that fired compaction on long sessions of small messages
— exactly the case where rewriting V4's prefix cache is most wasteful.
Token budget is the only signal that maps to actual model context
pressure; counting messages adds nothing.

Changes:

* Remove `CompactionConfig::message_threshold` field.
* Remove the message-count branch in `should_compact` — token
  threshold + 500K floor is now the sole compaction trigger.
* Remove `compaction_message_threshold_for_model`,
  `DEFAULT_COMPACTION_MESSAGE_THRESHOLD`,
  `COMPACTION_MESSAGE_DIVISOR`, `MAX_COMPACTION_MESSAGE_THRESHOLD`
  from `models.rs`.
* Drop the `forced_config.message_threshold` tweak in the engine's
  capacity-guardrail forced-compaction path; that path now also
  bypasses the floor (`auto_floor_tokens = 0`) because we're at a
  hard ceiling and have to free budget regardless of cache cost.
* Update production constructors (`main.rs`, `runtime_threads.rs`,
  `app.rs::compaction_config`) to drop the field.
* Update tests: keep the floor + token-threshold assertions, delete
  the two tests that specifically validated message-count
  triggering, replace `should_compact_respects_message_threshold`
  with `message_count_no_longer_triggers_compaction` pinning the
  new contract.
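
A sketch of the resulting trigger, assuming a reduced config. Only
`auto_floor_tokens`, `should_compact`, and
`MINIMUM_AUTO_COMPACTION_TOKENS` are names from this change; the
`enabled` and `token_threshold` fields and the function signature are
hypothetical simplifications.

```rust
const MINIMUM_AUTO_COMPACTION_TOKENS: u64 = 500_000;

// Hypothetical, reduced config for illustration.
struct CompactionConfig {
    enabled: bool,
    token_threshold: u64,
    /// Floor below which compaction refuses; 0 disables the floor.
    auto_floor_tokens: u64,
}

/// Token budget is the only signal. `_message_count` is accepted
/// here solely to show that it no longer influences the decision.
fn should_compact(
    config: &CompactionConfig,
    estimated_tokens: u64,
    _message_count: usize,
) -> bool {
    config.enabled
        && estimated_tokens >= config.auto_floor_tokens
        && estimated_tokens >= config.token_threshold
}
```

The forced capacity-guardrail path corresponds to a config with
`auto_floor_tokens = 0`, so only its token threshold applies.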

Verified locally:

* `cargo fmt --all -- --check` clean.
* `cargo clippy --workspace --all-targets --all-features --locked
  -- -D warnings` clean.
* `cargo test --workspace --all-features --locked` — 2036 passed
  in TUI bin (2 ignored), all other crates green.
* parity gates: snapshot, parity_protocol, parity_state — all pass.
* `git diff --exit-code -- Cargo.lock` — clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 22:19:09 -05:00
Agent007 c2a992ece0 Revise README for clarity and structure
Updated README.md for clarity and formatting improvements.
2026-05-05 11:16:44 +08:00
Hunter Bown e98efcf31d fix(engine): drop dead working set prompt marker 2026-05-04 22:08:07 -05:00
Hunter Bown 991ae41571 docs(changelog): describe v0.8.11 cache overhaul 2026-05-04 22:07:04 -05:00
Hunter Bown 1a6589c55a perf(tools): anchor tool array with cache control 2026-05-04 22:06:58 -05:00
Hunter Bown b48b68f078 perf(engine): stabilize system prompt and move working set metadata 2026-05-04 22:06:55 -05:00
Hunter Bown a14227edf8 refactor(models): rename legacy DeepSeek context window 2026-05-04 22:06:16 -05:00
Hunter Bown a4dee56fcc fix(compaction): 500K hard floor plus V4 default 2026-05-04 22:06:07 -05:00
Hunter Bown 43bb055dfe Merge pull request #670 from Hmbown/fix/v0.8.11-view-stack-tracing
diag(tui): trace view_stack push/pop for post-mortem black-screen repro
2026-05-04 20:24:39 -05:00
wangfeng b6a6c88327 fix(tui): replace hardcoded colors with Color::Reset for terminal compatibility (closes #666) 2026-05-04 18:21:15 -07:00
Hunter Bown cba5e829fc diag(tui): trace view_stack push/pop for post-mortem black-screen repro
Maintainer-reported (handoff): after spawning a sub-agent in YOLO, the
transcript renders solid black and scroll keys go dead, but footer +
sidebar still render fine. The shape (black + dead input together)
strongly suggests a `View` is on the stack that returns empty layout
AND intercepts key events at the top level. A fix needs a tighter
repro than we have today.

Add `tracing::debug!` to every push / push_boxed / pop on `ViewStack`
and to the implicit pops in `apply_action` (Close + EmitAndClose).
Each line carries the `ModalKind` and post-action depth, so a future
`RUST_LOG=deepseek_tui::view_stack=debug` capture will show exactly
which view stayed pushed when the symptom recurred.

No behavior change. The handoff explicitly suggested this as the
first-look diagnostic step; we ship the diagnostic now so the next
report comes with evidence.

Refs the unresolved sub-agent black-transcript symptom captured in
session-3 handoff. Will surface to a tracking issue once we have a
concrete repro from the maintainer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 20:17:43 -05:00
wangfeng e5b862540f feat(engine): reasoning_effort auto mode (closes #663) 2026-05-04 18:14:50 -07:00
wangfeng fd8c9fdb20 feat(tools): FIM edit tool + strict tool mode for V4 endpoints (closes #662) 2026-05-04 18:11:56 -07:00
Hunter Bown f6e4f634d4 Merge pull request #665 from Hmbown/fix/v0.8.11-stop-compacting-so-much
fix(compaction): default off + raise unknown-model floor to 80% (#664, v0.8.11)
2026-05-04 20:05:57 -05:00
Hunter Bown 2640d8c091 Merge pull request #594 from Hmbown/fix/593-keyring-shadow
fix(auth): dual-write API key to keyring + config to stop stale-keyring shadow (#593)
2026-05-04 20:05:52 -05:00
wangfeng 3edcc6dacb feat(tools): FIM edit tool stub for V4 /beta endpoint (closes #662) 2026-05-04 18:05:48 -07:00
Hunter Bown 68f6d6995d Merge pull request #592 from Hmbown/feat/584-compaction-telemetry
feat(compaction): debug telemetry on summary calls + document framing fork (#584)
2026-05-04 20:05:46 -05:00
Hunter Bown f1764704d8 Merge pull request #590 from Hmbown/fix/588-mirror-user-language
fix(prompts): mirror user's language in reasoning + reply (#588)
2026-05-04 20:05:40 -05:00
Hunter Bown d586ff05a8 Merge pull request #591 from Hmbown/fix/583-windows-bel-default-off
fix(notifications): default Windows Auto fallback to Off, not BEL (#583)
2026-05-04 20:05:34 -05:00
Hunter Bown a48f50d0db Merge pull request #599 from Hmbown/fix/chinese-input-regression-tests
test(tui): pin Chinese / IME character input contract for the composer
2026-05-04 20:05:24 -05:00
Hunter Bown fc4f1e6564 fix(compaction): default to off + raise unknown-model floor to 80% (#664)
Two coordinated changes that stop the engine from routinely rewriting the
prompt prefix and burning DeepSeek V4's prefix-cache discount:

1. `Settings::default().auto_compact` flips from `true` to `false`. The
   `auto_compact = on` opt-in and the explicit `/compact` slash command
   stay available for users / agents that decide their workload benefits
   from compaction more than from cache stability. With V4's 1M-token
   window the user has plenty of headroom to run long sessions without
   auto-trimming, and aggressive compaction has been the dominant
   cost-spike vector in long sessions (the rewritten prefix invalidates
   ~90% of the cache discount on every compaction event).

2. `DEFAULT_COMPACTION_TOKEN_THRESHOLD` raised from `50_000` to
   `102_400` (80% of `DEFAULT_CONTEXT_WINDOW_TOKENS = 128_000`). This is
   the last-resort threshold used when `context_window_for_model` returns
   `None` — i.e. an unrecognised model id. Pre-v0.8.11 the fallback
   compacted at ~5% of a V4 window when model detection silently fell
   through. Now the fallback inherits the same late-trigger discipline as
   the V4 path, so model-detection drift doesn't quietly burn cache.

Together: the two changes mean compaction never fires automatically by
default, and even when explicitly opted in (or when the runtime-thread /
capacity-flow paths invoke compaction with their own `enabled = true`
config), the threshold is anchored at 80% of the model's context window
(or 80% of the 128K default if the model is unknown), never below.
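
The 80% anchoring can be sketched as below. The constants and the two
function names match this change, but the lookup body and the
`example-1m-model` id are made up, and the real V4 path may apply
additional soft caps not modelled here.

```rust
const DEFAULT_CONTEXT_WINDOW_TOKENS: u64 = 128_000;
// 80% of the 128K default window.
const DEFAULT_COMPACTION_TOKEN_THRESHOLD: u64 = 102_400;

/// Hypothetical lookup with a single made-up entry; unknown ids
/// return `None`.
fn context_window_for_model(model: &str) -> Option<u64> {
    match model {
        "example-1m-model" => Some(1_000_000),
        _ => None,
    }
}

/// 80% of the model's window, or the last-resort default when the
/// model id is not recognised.
fn compaction_threshold_for_model(model: &str) -> u64 {
    match context_window_for_model(model) {
        Some(window) => window * 80 / 100,
        None => DEFAULT_COMPACTION_TOKEN_THRESHOLD,
    }
}
```

With the old `50_000` fallback an unrecognised id would have
compacted at about 5% of a 1M-token window; the new fallback keeps
the same 80% rule as known models.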

Tests
=====

- `default_settings_disable_auto_compact_to_protect_v4_prefix_cache` —
  pins the new default and explains the rationale inline.
- `auto_compact_remains_explicitly_configurable` — unchanged; still
  asserts the `set("auto_compact", "on" | "off")` round-trip works.
- `compaction_threshold_scales_with_context_window` — updated to assert
  `compaction_threshold_for_model("unknown-model") == 102_400`.
- `v4_soft_caps_only_apply_to_v4_models` — updated to assert the
  unknown-model + reasoning-effort path also lands on the new floor.

Verification
============

- `cargo fmt --all -- --check` clean.
- `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features
  --locked -- -D warnings` clean.
- `cargo test -p deepseek-tui --bin deepseek-tui --locked` →
  2028 passed, 2 ignored.

Refs #664 (handoff-instead-of-compact pattern, full implementation
deferred). Behaviour-only change for v0.8.11; the larger
agent-aware-handoff mechanism is its own design surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 19:28:02 -05:00