Commit Graph

502 Commits

Author SHA1 Message Date
Hunter Bown f6c7a36076 feat(execpolicy): heredoc body parsing in normalize_command (#419)
`normalize_command` now strips heredoc bodies before shlex tokenization
so a user's `auto_allow = ["cat > file.txt"]` pattern matches the
heredoc form `cat <<EOF > file.txt\nbody\nEOF` cleanly. Recognises the
common forms (`<<DELIM`, `<<-DELIM`, `<<'DELIM'`, `<<"DELIM"`) while
leaving the here-string operator (`<<<`) untouched.

Six unit tests cover: simple body strip, dash form, quoted delimiter,
non-heredoc passthrough, here-string preservation, and the end-to-end
pattern-match path.
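
A minimal sketch of the body-stripping step, not the actual `normalize_command` internals. It assumes the opener line is kept and only the body and terminator lines are dropped; `strip_heredoc_bodies` and `heredoc_delimiter` are hypothetical names, and `<<` inside quoted strings is not handled.

```rust
/// Returns the delimiter and whether the `<<-` form was used, if `line` opens a heredoc.
fn heredoc_delimiter(line: &str) -> Option<(String, bool)> {
    let bytes = line.as_bytes();
    let mut i = 0;
    while i + 1 < bytes.len() {
        if bytes[i] == b'<' && bytes[i + 1] == b'<' {
            // `<<<` is a here-string, not a heredoc: skip the whole run of `<`.
            if i + 2 < bytes.len() && bytes[i + 2] == b'<' {
                while i < bytes.len() && bytes[i] == b'<' {
                    i += 1;
                }
                continue;
            }
            let mut rest = &line[i + 2..];
            let strip_tabs = rest.starts_with('-');
            if strip_tabs {
                rest = &rest[1..];
            }
            let rest = rest.trim_start();
            let delim: String = match rest.chars().next() {
                // Quoted delimiter: take everything up to the matching quote.
                Some(q @ ('\'' | '"')) => rest[1..].chars().take_while(|c| *c != q).collect(),
                // Bare delimiter: take the leading word.
                _ => rest
                    .chars()
                    .take_while(|c| c.is_alphanumeric() || *c == '_')
                    .collect(),
            };
            if !delim.is_empty() {
                return Some((delim, strip_tabs));
            }
            i += 2;
            continue;
        }
        i += 1;
    }
    None
}

/// Drops heredoc bodies (and their terminator lines) from a multi-line command.
pub fn strip_heredoc_bodies(command: &str) -> String {
    let mut out: Vec<&str> = Vec::new();
    let mut lines = command.lines();
    while let Some(line) = lines.next() {
        out.push(line);
        if let Some((delim, strip_tabs)) = heredoc_delimiter(line) {
            // Consume body lines up to and including the terminator.
            for body in lines.by_ref() {
                let candidate = if strip_tabs {
                    body.trim_start_matches('\t')
                } else {
                    body
                };
                if candidate == delim {
                    break;
                }
            }
        }
    }
    out.join("\n")
}
```

Under these assumptions, `cat <<EOF > file.txt` followed by a body and `EOF` collapses to the single opener line, while a `<<<` here-string passes through unchanged.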
2026-05-03 07:44:43 -05:00
Hunter Bown 604edc9f83 feat(tls): honor SSL_CERT_FILE for corporate-CA / MITM proxies (#418)
Corporate users behind TLS-inspecting proxies (Zscaler, Netskope,
Palo Alto, in-house mitmproxy fleets) need to add the proxy's
intermediate CA to the trusted-roots set so the deepseek client
doesn't fail with `unable to get local issuer certificate`.

The reqwest builder already trusts the platform's system store
via native-tls. This adds opt-in support for the conventional
`SSL_CERT_FILE` env var so users can point at their own bundle:

* New `add_extra_root_certs(builder, path)` helper reads the
  file, tries `Certificate::from_pem_bundle` (covers single-cert
  files too), falls back to `from_der` for binary cert files.
* Wired into `build_http_client` when `SSL_CERT_FILE` is set
  and non-empty. Failures log a warning via the existing
  `logging::warn` channel and return the builder unchanged —
  the existing system trust still applies, so a malformed env
  var degrades gracefully instead of bricking the launch.
* Each successful load logs `info` with the cert count so
  operators can confirm their bundle was picked up.

Documented in `docs/CONFIGURATION.md`'s environment-variables
list alongside the existing TLS-related notes.

No new dependency — reqwest's `native-tls` feature already
exposes `Certificate::from_pem_bundle` / `from_der`.
2026-05-03 07:35:23 -05:00
Hunter Bown 6566a59097 feat(security): deny loosest approval/sandbox values at project scope (#417)
Continues #417 by closing the value-level escalation case for
the two pure-loosening values:

* `approval_policy = "auto"` would auto-approve every tool
  call that the user's stricter setting (`suggest`, `never`,
  etc.) was prompting on. Pure escalation; project should
  never be able to set this.
* `sandbox_mode = "danger-full-access"` exits the workspace
  sandbox entirely. Pure escalation; project should never be
  able to set this.

Both denies are unconditional at project scope — the user's
prior value (or absence) doesn't matter. The denied value
emits a stderr warning so users see the deny.

Sub-tightening comparisons (e.g. user `"never"` → project
`"on-request"` is allowed even though it loosens) remain a
v0.8.9 follow-up because they need a richer ordering check
across all `approval_policy` / `sandbox_mode` values.

Tests:

* `project_overlay_denies_approval_auto_and_sandbox_danger_values`
  exercises both escalation values in the same merge and
  confirms a non-escalation field on the same project file
  still applies.
* `project_overlay_preserves_user_strict_value_when_project_tries_to_loosen`
  exercises the belt-and-suspenders case: user has
  `approval_policy = "never"`, project tries `"auto"`, the
  user's strict value survives.
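
A hedged sketch of the value-level deny, assuming config values are plain strings. `is_denied_project_value` and `effective_value` are hypothetical names, and the warning text is illustrative only.

```rust
/// Returns true for the two values the commit describes as pure escalation.
pub fn is_denied_project_value(key: &str, value: &str) -> bool {
    matches!(
        (key, value),
        ("approval_policy", "auto") | ("sandbox_mode", "danger-full-access")
    )
}

/// Picks the effective value for `key`: the project value wins unless it is denied.
pub fn effective_value(key: &str, user: Option<&str>, project: Option<&str>) -> Option<String> {
    match project {
        Some(value) if is_denied_project_value(key, value) => {
            // The deny is unconditional: the user's prior value (or absence) is kept.
            eprintln!("warning: project-scope value `{key} = \"{value}\"` is ignored");
            user.map(str::to_string)
        }
        Some(value) => Some(value.to_string()),
        None => user.map(str::to_string),
    }
}
```

With a user value of `"never"` and a project value of `"auto"`, this returns `"never"`; a project value of `"on-request"` still overrides, matching the follow-up gap noted above.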
2026-05-03 07:32:08 -05:00
Hunter Bown 926ffcb4f4 feat(security): deny dangerous keys at project-config scope (#417)
A malicious `<workspace>/.deepseek/config.toml` could escalate
privileges via the per-project overlay shipped in #485:

* `api_key` / `base_url` / `provider` — exfiltrate prompts to
  an attacker-controlled endpoint by swapping the user's
  credentials and target host.
* `mcp_config_path` — point the MCP loader at a config that
  spawns arbitrary stdio servers under the user's identity.

Adds a `DENY_AT_PROJECT_SCOPE` allowlist-by-omission to
`merge_project_config`. The four credential / redirect keys
are dropped from the overlay; a stderr warning fires
when one is present so a user who *did* expect the override
sees the deny instead of a silent discard:

  warning: project-scope config key `api_key` is ignored —
  set it in `~/.deepseek/config.toml` instead.

The remaining override surface (model, approval_policy,
sandbox_mode, notes_path, reasoning_effort, max_subagents,
allow_shell, instructions array) is unchanged. Note that this
slice does NOT yet block escalation via value comparison — a
project setting `approval_policy = "auto"` still wins over a
user's stricter `"never"`. That richer check is filed as a
v0.8.9 follow-up.

Tests:

* `project_overlay_overrides_model_but_denies_provider`
  replaces the previous test that asserted provider WOULD
  override (now reversed).
* New `project_overlay_denies_dangerous_credentials_and_redirects`
  models the attacker scenario directly: project sets all four
  denied keys, asserts the user's pre-existing values survive
  and the project's are discarded.

CHANGELOG documents the deny-list rationale and lists which
fields remain overridable.
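
The key-level deny can be sketched as below, assuming both config layers are flat string maps (the real config types are not shown in this log). The function signature and the shortened warning text are assumptions for illustration.

```rust
use std::collections::BTreeMap;

/// Keys a project-level config may not set (the four named in this commit).
const DENY_AT_PROJECT_SCOPE: [&str; 4] = ["api_key", "base_url", "provider", "mcp_config_path"];

/// Overlays `project` onto `user`, skipping denied keys and collecting one
/// warning per skipped key so the caller can print them to stderr.
pub fn merge_project_config(
    user: &BTreeMap<String, String>,
    project: &BTreeMap<String, String>,
) -> (BTreeMap<String, String>, Vec<String>) {
    let mut merged = user.clone();
    let mut warnings = Vec::new();
    for (key, value) in project {
        if DENY_AT_PROJECT_SCOPE.contains(&key.as_str()) {
            warnings.push(format!("project-scope config key `{key}` is ignored"));
            continue;
        }
        merged.insert(key.clone(), value.clone());
    }
    (merged, warnings)
}
```

Returning the warnings instead of printing them keeps the merge easy to test; a caller would write each one to stderr.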
2026-05-03 07:27:44 -05:00
Hunter Bown c20edc43d6 test(spillover): pin _prior wrap path for non-object metadata (#500 follow-up)
`apply_spillover` has a defensive branch that handles a tool
whose `result.metadata` is something other than a JSON object
(rare — most use the `json!({})` pattern — but legal per
`serde_json::Value`). The branch wraps the prior payload under
a `_prior` key so callers that introspect can recover the
original data, then attaches `spillover_path` to the new
object.

That branch had no test coverage. Adds
`apply_spillover_wraps_non_object_metadata_under_prior_key`
which:

* Constructs a `ToolResult` with array-shaped metadata
  (`json!(["unexpected", "array", "payload"])`).
* Triggers spillover with a 200 KiB body.
* Asserts the prior array round-trips under `_prior`.
* Asserts `spillover_path` lands alongside.

Pure additive coverage; no production change. Defends the
recovery path against a future refactor that might assume
metadata is always an object.
2026-05-03 07:23:42 -05:00
Hunter Bown 8d2ffa108d fix(docs): correct two broken intra-doc links
The CI runs `cargo doc --workspace --no-deps` with
`RUSTDOCFLAGS=-Dwarnings`. Two doc-comment links broke the
build:

* `commands/session.rs::prune` referenced
  ``[`SessionManager::prune_sessions_older_than`]`` which
  rustdoc tries to resolve as an item in scope. Without
  importing `SessionManager` into the doc-comment scope, the
  link was unresolvable. Fix by qualifying with the full
  module path: ``[`crate::session_manager::SessionManager::…`]``.
* `config.rs::max_subagents` had a free-form `[subagents]`
  reference that rustdoc parsed as an intra-doc link. Wrap it
  in backticks so it renders as inline code instead.

No code change. Pure rustdoc hygiene; CI gate passes again.
2026-05-03 07:20:48 -05:00
Hunter Bown c244760b67 feat(stash): /stash pop reports remaining count (#440 polish)
After popping, the user wants to know whether to keep popping
or move on. Currently the message just shows the restored
preview — silent on stash depth. Adds a parenthetical:

  Restored stashed draft: <preview> (3 more parked)
  Restored stashed draft: <preview> (1 more parked)
  Restored stashed draft: <preview> (stash now empty)

Mirrors the queue-edit confirmation pattern so users get
consistent depth feedback whether they're popping a draft or
editing a queued message.
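
The three message shapes above can be produced by a small formatter like this sketch; `pop_message` is a hypothetical name, and how the preview itself is built is out of scope here.

```rust
/// Formats the `/stash pop` confirmation with the remaining stash depth.
pub fn pop_message(preview: &str, remaining: usize) -> String {
    let depth = if remaining == 0 {
        "stash now empty".to_string()
    } else {
        format!("{remaining} more parked")
    };
    format!("Restored stashed draft: {preview} ({depth})")
}
```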
2026-05-03 07:18:02 -05:00
Hunter Bown 0fe05b682a test(session): pin offline-queue session_id stamping (#487 follow-up)
The #487 fix relies on `save_offline_queue_state` correctly
stamping the session id so the load path's mismatch check has
something to compare against. The existing
`test_offline_queue_round_trip_and_clear` covers
serialization + clear but doesn't pin the session_id stamping
behavior.

Adds `test_offline_queue_stamps_session_id_on_save` which
exercises three cases:

* `save(state, Some("session-A"))` → loaded session_id is
  `Some("session-A")`. The stamp made it to disk.
* `save(state, Some("session-B"))` → re-saving replaces the
  stamp; loaded session_id is `Some("session-B")`. No stale
  ID lingers.
* `save(state, None)` → loaded session_id is `None`. The UI's
  load path treats this as legacy-unscoped and refuses to
  restore (fail-closed), which is what protects users from
  pre-#487 queues leaking into new chats.

Pure additive coverage. The 2 existing offline-queue tests
pass unchanged.
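
For context, the fail-closed check on the load side might look like this sketch. `should_restore_queue` is a hypothetical name; the actual load path lives in the UI code and is not shown in this log.

```rust
/// Decides whether a persisted offline queue may be restored into the
/// current session. A missing stamp is treated as legacy-unscoped and
/// refused (fail-closed), as is any mismatching session id.
pub fn should_restore_queue(saved_session_id: Option<&str>, current_session_id: &str) -> bool {
    match saved_session_id {
        Some(saved) => saved == current_session_id,
        None => false,
    }
}
```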
2026-05-03 07:16:11 -05:00
Hunter Bown a4c8cb2514 feat(prompts): structured Markdown compaction template (#429)
Replaces the legacy compaction template with the spec'd
Goal / Constraints / Progress (Done / In Progress / Blocked) /
Key Decisions / Next step structure.

The richer Progress sub-bullets help long resumed sessions
distinguish "what's verified done" from "what's mid-flight" —
useful when the model writes `.deepseek/handoff.md` before a
long break. The previous Active-task / Files-touched /
Key-decisions / Open-blockers / Next-step framing collapsed
"in progress" and "blocked" into a single "open blockers"
heading, which lost the lineage of "I started X, hit Y,
then…" trails.

Backwards compat: existing `.deepseek/handoff.md` files
continue to render fine because the loader
(`prompts.rs::load_handoff_block`) injects them as plain
markdown — the template only guides what NEW handoffs look
like.

The "pinned-tool-output configurability" half of #429's spec
remains a v0.8.9 follow-up because it requires changes to
`cycle_manager.rs` compaction logic itself; the template
restructure is independently shippable and is the bigger UX
delta in practice.

Tests: existing `compact_template_is_included_in_full_prompt`
updated to assert the new section headings and the nested
Progress sub-bullets. All 24 prompt tests pass.
2026-05-03 07:12:45 -05:00
Hunter Bown 8a679bf662 chore(hooks): tracing::warn on hook failures (#455 follow-up)
Hook failures were silent — the executor returned a `HookResult`
with `success=false`, but every call site discards it with
`let _ = ...`. Operators tailing `deepseek` had no visibility
into hook errors short of running each hook command by hand.

Centralizes the logging inside `HookExecutor::execute` so every
fire site benefits without sprinkling instrumentation. Logs
through `tracing::warn!` with structured fields (`hook`,
`event`, `exit_code`, `duration_ms`, `error`, `stderr_head`)
so operators can `RUST_LOG=warn deepseek` and immediately see
which hooks are misbehaving.

Successful runs log nothing — `tool_call_before` /
`tool_call_after` fire on every tool dispatch, so per-call
success logging would be unreadably noisy.

No behavioral change for users with no hooks (the function
fast-paths out before reaching this branch). No behavioral
change for users with passing hooks. Failed hooks still
respect `continue_on_error` and the surrounding loop is
unchanged.
2026-05-03 07:10:19 -05:00
Hunter Bown c0b6c2a1e5 perf(hooks): fast-path skip when no hooks configured (#455 follow-up)
Now that `tool_call_before` / `tool_call_after` fire on every
tool dispatch, the cost of constructing a `HookContext` (which
allocates for `workspace`, `model`, `session_id`, …) shows up
on the hot path even when the user has zero hooks configured —
the common case.

Adds `HookExecutor::has_hooks_for_event(event)` as a cheap
boolean gate that callers consult before building the context.
The pre-check returns false when:

* `config.enabled == false` (globally disabled).
* No hook in the config has the given `event`.

Wired through every fire site:

* `tool_routing.rs::handle_tool_call_started` —
  `ToolCallBefore`.
* `tool_routing.rs::handle_tool_call_complete` —
  `ToolCallAfter`. Also skips the `result.content.clone()`
  that the `with_tool_result` builder demands.
* `ui.rs::dispatch_user_message` — `MessageSubmit`.
* `ui.rs::apply_engine_error_to_app` — `OnError`.

Inside `HookExecutor::execute` itself, also short-circuit
before calling `context.to_env_vars()` when no hooks match the
event — defends against a caller that builds the context but
forgets to gate.

Tests:
  3 new tests cover empty-config / globally-disabled /
  per-event filtering. The existing 18 hook tests pass
  unchanged.

No behavioral change for users with hooks configured; pure
allocation-free fast path otherwise.
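
A minimal sketch of the gate, assuming a simplified config shape. The `Hook` and `HooksConfig` fields shown here are illustrative stand-ins, not the real definitions.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum HookEvent {
    SessionStart,
    SessionEnd,
    MessageSubmit,
    ToolCallBefore,
    ToolCallAfter,
    ModeChange,
    OnError,
}

/// Simplified hook entry (hypothetical fields).
pub struct Hook {
    pub event: HookEvent,
    pub command: String,
}

/// Simplified hooks configuration (hypothetical fields).
pub struct HooksConfig {
    pub enabled: bool,
    pub hooks: Vec<Hook>,
}

pub struct HookExecutor {
    config: HooksConfig,
}

impl HookExecutor {
    pub fn new(config: HooksConfig) -> Self {
        Self { config }
    }

    /// Cheap boolean gate: false when hooks are globally disabled or when
    /// no configured hook listens for `event`. Performs no allocation.
    pub fn has_hooks_for_event(&self, event: HookEvent) -> bool {
        self.config.enabled && self.config.hooks.iter().any(|hook| hook.event == event)
    }
}
```

A fire site would consult `has_hooks_for_event` first and only build the hook context when it returns true.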
2026-05-03 07:07:11 -05:00
Hunter Bown e569f2ca99 feat(hooks): fire message_submit + on_error too (#455 observer-only)
Completes the observer-only slice of #455 by wiring the two
remaining `HookEvent` variants that were defined but never
fired:

* `MessageSubmit` fires from `dispatch_user_message` before
  the message is handed to the engine. Hook context carries
  `message` so observers can log every prompt the user
  submits, redact for compliance audit, or page on
  `/wipe-database`-style content. Read-only.
* `OnError` fires from `apply_engine_error_to_app` before the
  error cell reaches the transcript. Hook context carries
  `error`. Useful for paging on auth / billing / invalid-
  request failures without tailing the audit log.

Combined with the prior `tool_call_before` / `tool_call_after`
wiring, every `HookEvent` variant now has a live producer:
`SessionStart`, `SessionEnd`, `MessageSubmit`, `ToolCallBefore`,
`ToolCallAfter`, `ModeChange`, `OnError`. The `/hooks events`
listing already enumerates them with their on-fire semantics.

Hooks remain read-only observers in this slice. Mutation is
v0.8.9 follow-up because it needs a synchronous-gate contract
that would change semantics for every hook surface — including
the lifecycle events that have shipped for many releases.
2026-05-03 07:01:52 -05:00
Hunter Bown 4310202645 feat(hooks): fire tool_call_before / tool_call_after (#455 observer-only)
The `HookEvent::ToolCallBefore` and `HookEvent::ToolCallAfter`
enum variants were defined but never fired from production code,
so `[[hooks.hooks]]` entries with those events sat dormant.

Wires the fires from `tui/tool_routing.rs`:

* `handle_tool_call_started` fires `ToolCallBefore` with the
  hook context populated with `tool_name` and `tool_args`. The
  fire happens before any UI bookkeeping so observers see the
  call as early as possible.
* `handle_tool_call_complete` fires `ToolCallAfter` after the
  cell finalization with the result content (or stringified
  error) + success flag. Stays last in the function so any UI
  state the hook might want to observe via shell-out is
  already settled.

Hooks remain read-only observers in this slice. Mutation
(modifying tool args before execution, or the result before it
reaches the model) is a v0.8.9 follow-up that needs a
synchronous-gate contract; the existing executor is fire-and-
forget and adding mutation would change semantics for every
existing hook surface (session_start, mode_change, etc.).

Operators can wire `tool_call_before` / `tool_call_after`
hooks in `~/.deepseek/config.toml` immediately to log every
tool call, page on long shell exec, or audit risky operations.
The `/hooks events` listing already enumerates them.

No new tests — `tool_routing.rs` has no existing test surface,
and the hook execution path is already covered via
`hooks::tests::*`. The wiring is mechanically minimal.
2026-05-03 06:59:26 -05:00
Hunter Bown a2c7c94f5d test(pr): pin is_command_available contract (#451 follow-up)
Adds a tiny test that exercises both branches of the helper used
by `deepseek pr <N>` to detect `gh`'s presence:

* Positive case — `sh` (POSIX baseline) is reported present.
  Gated on `cfg(unix)` because Windows runners aren't
  guaranteed to have `sh.exe` outside git-bash.
* Negative case — a deliberately-implausible
  `this-command-cannot-exist-…ENOENT-marker` returns `false`
  rather than panicking from the `Command::new` exec failure.

Pure additive coverage; no production change.
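
One way such a helper could be written, shown as a sketch only: the real `is_command_available` may probe differently. This version assumes that a successful spawn is enough evidence of presence and that passing `--version` is harmless for the probed command.

```rust
use std::process::{Command, Stdio};

/// Reports whether `name` can be spawned from PATH. A spawn failure
/// (for example ENOENT) yields `false` instead of a panic.
pub fn is_command_available(name: &str) -> bool {
    Command::new(name)
        .arg("--version")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .is_ok()
}
```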
2026-05-03 06:54:05 -05:00
Hunter Bown 8ed1cb4e68 feat(hooks): /hooks events subcommand for discovery (#460 polish)
The shipped `/hooks list` told users WHAT was configured but
not WHAT they could configure. Without this, the only way to
learn the supported `HookEvent` values is to grep source — not
ideal when most users just want to wire up a notification on
session_end.

Adds `/hooks events` (aliases `event` / `list-events`) which
prints every `HookEvent` variant alongside a short descriptive
blurb (when it fires, current observability-vs-mutation status).
Ordered lifecycle → per-tool → situational so the listing reads
naturally and stays stable across releases.

Updates `CommandInfo::usage` to `/hooks [list|events]` so the
fuzzy autocomplete shows the new subcommand.

Tests:
  1 new test (`events_subcommand_lists_every_event_variant_in_documented_order`)
  pins the order, the per-event descriptive blurb format, and
  exhaustive variant coverage. The existing 6 hooks tests pass
  unchanged.
2026-05-03 06:51:27 -05:00
Hunter Bown 14931566b5 test(audit): pin emit_tool_audit contract (#500 follow-up)
The `tool.spillover` audit emission shipped in 0fa042 added a
new caller to `emit_tool_audit` but the function itself had no
unit tests pinning its contract — operators relying on
`DEEPSEEK_TOOL_AUDIT_LOG` deserve regression coverage on the
JSONL writer.

Adds 3 tests:

* `emit_tool_audit_writes_jsonl_line_when_env_var_set` —
  verifies each call appends a parseable JSON line, with the
  expected `event` and `tool_id` keys reaching disk.
* `emit_tool_audit_is_noop_when_env_var_unset` — pins the
  early-return when the env var is missing (no panic, no file
  side effects).
* `emit_tool_audit_creates_parent_directory` — confirms the
  `create_dir_all(parent)` step works for previously-missing
  paths so operators can point the env var at a fresh path
  without a chicken-and-egg setup step.

All three serialise through a static Mutex because they mutate
process-global `DEEPSEEK_TOOL_AUDIT_LOG`. Cleanup happens on
each test under the same guard.
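
The contract these tests pin can be sketched as follows. `append_audit_line` is a hypothetical stand-in that takes the resolved path as a parameter rather than reading `DEEPSEEK_TOOL_AUDIT_LOG` itself, which avoids mutating process-global state in the example.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Appends one JSON line to the audit log. A `None` path (env var unset)
/// is a no-op; missing parent directories are created on demand.
pub fn append_audit_line(log_path: Option<&Path>, json_line: &str) -> io::Result<()> {
    let path = match log_path {
        Some(path) => path,
        None => return Ok(()),
    };
    if let Some(parent) = path.parent() {
        if !parent.as_os_str().is_empty() {
            fs::create_dir_all(parent)?;
        }
    }
    let mut file = OpenOptions::new().create(true).append(true).open(path)?;
    writeln!(file, "{json_line}")
}
```

A caller would resolve the path from the environment (for instance via `std::env::var_os`) and pass it in.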
2026-05-03 06:48:59 -05:00
Hunter Bown a8e0693958 feat(doctor): report spillover dir + composer stash file (#422/#440 polish)
The v0.8.8 polish stack added two on-disk surfaces operators
might want to inspect — `~/.deepseek/tool_outputs/` for spilled
tool output (#422 / #500), and `~/.deepseek/composer_stash.jsonl`
for parked composer drafts (#440). Neither showed up in
`deepseek doctor`, so users couldn't see at a glance "do I have
parked drafts?" or "how much disk has spillover claimed?"

Adds a `Storage:` section to the human-readable doctor and a
`storage` object to the JSON doctor:

* Spillover slot reports the dir's existence and entry count.
  Pre-creation state ("not yet created") is shown explicitly
  rather than as a missing dir — the dir is created lazily on
  first spill, not at boot.
* Stash slot reports the file's existence and parked-draft
  count by re-reading via `composer_stash::load_stash`. Empty /
  missing stash shows the Ctrl+S hint so the user knows how to
  use the feature.

The JSON schema always emits both nested slots regardless of
state (so dashboard schemas stay stable across hosts); the
human-readable hides the "not yet created" line for spillover
when the dir is missing to keep the report scannable.
2026-05-03 06:46:20 -05:00
Hunter Bown b1c6e6b173 feat(doctor): report .opencode + .claude skill dirs (#432 follow-up)
The cross-tool skill discovery shipped in d7017b7 walks
`.opencode/skills/` and `.claude/skills/` alongside the
`.agents/skills/` and `skills/` workspace folders, but the
`deepseek doctor` output still only listed the original three
slots. Operators staring at "where are my Claude-style skills?"
had no way to confirm whether the new dirs were even being
checked.

Updates both surfaces:

* Human-readable doctor — adds two conditionally-printed lines
  for `.opencode skills dir` and `.claude skills dir`. Empty
  dirs are omitted to keep the report scannable; the dirs
  exist on most workspaces only when the user has installed
  another AI tool's skill catalog there.
* JSON doctor (`deepseek doctor --json`) — adds `opencode` and
  `claude` slots to the `skills` object alongside the existing
  `global`, `agents`, `local`. Each carries `path`, `present`,
  and `count`. JSON consumers see all five keys regardless of
  presence so dashboard schemas stay stable across hosts.

The `selected_skills_dir` field still reflects the legacy
"highest-precedence single dir" — workspace-aware discovery is
done at runtime by `discover_in_workspace`, but `selected` is a
useful "where do I install a NEW skill" hint and stays
unchanged for backwards compatibility with existing diagnostic
tooling.
2026-05-03 06:43:47 -05:00
Hunter Bown 5627d6535b docs: document NO_ANIMATIONS, instructions array, /hooks, /stash
Catches up `docs/CONFIGURATION.md` with the v0.8.8 polish stack so
operators have one source of truth for the new surfaces:

* `NO_ANIMATIONS` env override (#450) joins the existing
  environment-variable list, with a cross-reference to
  `docs/ACCESSIBILITY.md`.
* New `### Instruction sources` section documents the
  `instructions = [...]` config field (#454): expansion rules,
  100 KiB per-file cap with `[…elided]` marker, missing-file
  warning behavior, and the project-wholesale-replaces-user
  override semantics.
* New `### /hooks listing` section documents the read-only
  slash command (#460 MVP) so users know how to introspect
  configured lifecycle hooks without `cat`-ing config.toml.
* New `### Composer stash` section documents Ctrl+S +
  `/stash list|pop|clear` (#440) including the 200-entry cap
  and multiline preservation.

Pure documentation; no code changes. Existing prompt-stability
and config-loading tests are unaffected.
2026-05-03 06:39:29 -05:00
Hunter Bown a368dc53b8 feat(commands): /hooks read-only lifecycle hook listing (#460 MVP)
Slash command enumerates configured lifecycle hooks from the
user's `[hooks]` table, grouped by event. The full picker /
persisted enable-disable surface in #460 is still M-sized work;
this MVP gives users a no-typing view of what's actually loaded
— the most-asked question once hooks start firing.

Implementation:

* `crates/tui/src/commands/hooks.rs` formats the hook list with
  per-event headings, hook name (or `(unnamed)`), background
  marker, timeout, condition summary, and a 60-char shell
  command preview.
* `condition_summary` covers every `HookCondition` variant
  (Always/ToolName/ToolCategory/Mode/ExitCode/All/Any) so the
  listing stays informative for compound conditions too.
* `event_label` maps each `HookEvent` to its config-file string
  so the listing matches what the user wrote in TOML.
* New `HookExecutor::config()` accessor exposes the underlying
  `HooksConfig` for read-only callers; doesn't open the door
  to mutation, which still belongs to the broader #460 work.
* Registered in `commands::COMMANDS` with `aliases: &["hook"]`,
  usage `/hooks [list]`, and `MessageId::CmdHooksDescription`
  localized in en, ja, zh-Hans, pt-BR.
* Wired into `command_palette::command_runs_directly` so
  pressing Enter from Ctrl+K runs `/hooks list` straight.

Tests:
  6 unit tests covering preview-cap truncation, newline
  stripping, condition-summary variants, event-label
  exhaustiveness, and BTreeMap-grouping ordering.
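
A sketch of the preview helper, assuming that newline stripping means collapsing all whitespace runs to single spaces and that truncation appends an ellipsis within the cap. `command_preview` is a hypothetical name and both behaviors are assumptions.

```rust
/// Flattens a shell command to one line and caps it at `max_chars`
/// characters, ending with an ellipsis when it had to be cut.
pub fn command_preview(command: &str, max_chars: usize) -> String {
    let flat = command.split_whitespace().collect::<Vec<_>>().join(" ");
    if flat.chars().count() <= max_chars {
        return flat;
    }
    // Reserve one character for the ellipsis so the cap is respected.
    let mut out: String = flat.chars().take(max_chars.saturating_sub(1)).collect();
    out.push('…');
    out
}
```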
2026-05-03 06:36:37 -05:00
Hunter Bown 15127046e8 feat(stash): /stash clear subcommand to wipe the stash file (#440 polish)
Pairs with `/stash list` and `/stash pop` so the user can fully
manage the stash from inside the TUI without reaching for `rm`.

* New `composer_stash::clear_stash()` returns the number of
  entries dropped so the slash command can report it.
  Atomic-write replaces the file with empty content; missing /
  empty files return `Ok(0)` without erroring.
* `clear` / `wipe` / `drop` are accepted as subcommand
  aliases. The "unknown subcommand" hint now lists the three live
  subcommands explicitly.
* CommandInfo usage updated to `/stash [list|pop|clear]` so
  `/help` and the autocomplete reflect the new option.
* 3 new tests in `composer_stash`: returns-0 when file absent,
  returns-0 when file is empty, drops entries and reports count
  on a populated stash.

No new dependency; reuses `crate::utils::write_atomic` for the
truncate-and-rewrite.
2026-05-03 06:28:18 -05:00
Hunter Bown ba871c56f6 feat(cli): deepseek pr <N> — pre-seed TUI with PR context (#451)
`deepseek pr 1234` fetches the PR's title, body, base/head, URL,
and full diff via `gh`, then launches the interactive TUI with a
review prompt already typed in the composer. The user can edit
before sending or hit Enter to fire as-is. Falls back gracefully
with an actionable error when `gh` is not on PATH.

Implementation:

* `Commands::Pr { number, repo, checkout }` subcommand. Optional
  `--repo <owner/name>` mirrors `gh pr view`'s flag. Optional
  `--checkout` opt-in for `gh pr checkout`; default is to leave
  the working tree alone since `gh pr checkout` errors out on
  dirty trees.
* `run_pr` helper drives three best-effort gh shell-outs
  (`pr view --json`, `pr diff`, optional `pr checkout`) and
  formats a structured prompt: PR header → URL → branches →
  description → fenced ```diff block.
* `format_pr_prompt` caps the diff at 200 KiB with codepoint-
  safe truncation so a massive PR doesn't blow the model's
  context window before the user even hits Enter.
* New `TuiOptions::initial_input: Option<String>` plumbs the
  pre-typed text into `App::new` (which now branches its
  composer-state init around the option). Cursor lands at the
  end of the seed text. Future callers (welcome screens, share-
  link landing pages, etc.) can reuse the same channel.
* `run_interactive` gains an `initial_input: Option<String>`
  parameter; existing callers pass `None`.

Tests:
  3 new tests in `pr_prompt_tests` cover the happy path
  (title/url/branches/body/diff render correctly), empty-input
  fallbacks (placeholder for missing title/body/branches/url),
  and codepoint-safe truncation when the diff exceeds the
  200 KiB cap.

Bulk update: every other `TuiOptions { ... }` test-builder
across the workspace (~21 sites) gains `initial_input: None`
so the new field doesn't break the existing test suite.
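
The codepoint-safe truncation mentioned above can be sketched with the standard library alone. `truncate_to_byte_cap` is a hypothetical name; the real `format_pr_prompt` may also append a truncation marker, which is omitted here.

```rust
/// Returns the longest prefix of `text` that fits in `max_bytes` without
/// splitting a UTF-8 code point.
pub fn truncate_to_byte_cap(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```

For the 200 KiB cap, a caller would pass `200 * 1024` as `max_bytes`.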
2026-05-03 06:23:54 -05:00
Hunter Bown a9222f4b8c feat(stash): make /stash run directly from the command palette (#440 polish)
`/stash` defaults to `list` when invoked without an argument, so
in the Ctrl+K command palette it should execute on Enter rather
than insert `/stash ` and wait for the user to type `list`. The
identical pattern already applies to `/queue`, which has the same
optional-arg shape.

Adds `"stash"` to the `command_runs_directly` allowlist alongside
`queue`. The fuzzy-search rank, label match, and section grouping
already pick up `/stash` automatically because they iterate over
`commands::COMMANDS` (which gained the entry in 2db4843).

No behavior change on type-then-Enter — only on the
hit-Enter-from-the-palette path. The existing 8 command-palette
tests pass unchanged.
2026-05-03 06:14:34 -05:00
Hunter Bown 2db48435e8 feat(stash): register /stash in /help and autocomplete (#440 polish)
The slash command landed in 6fb87 but only via the dispatch
match arm — `/help` and the fuzzy autocomplete consult
`COMMANDS: &[CommandInfo]` to enumerate available commands, and
without a `CommandInfo` entry the new `/stash` was effectively
hidden from discovery.

Adds a `CommandInfo` row with `aliases: &["park"]`, a
`/stash [list|pop]` usage hint, and a new
`MessageId::CmdStashDescription` localized in en, ja, zh-Hans,
pt-BR. The description reminds users that Ctrl+S is the
matching push entry point — both surfaces should reinforce each
other in the help overlay.

No behavior change on the dispatch path; this is pure
discoverability.
2026-05-03 06:13:06 -05:00
Hunter Bown 6fb8739feb feat(composer): prompt stash — Ctrl+S parks, /stash list+pop (#440)
A stash is a side-channel from history: it holds drafts the user
parked deliberately instead of submissions made in the past
(which live in `composer_history.rs`).

* `crates/tui/src/composer_stash.rs` — JSONL-backed store at
  `~/.deepseek/composer_stash.jsonl`. One JSON object per line
  with `ts` (RFC 3339) and `text`. Self-healing parser drops
  malformed lines instead of poisoning the file. Multi-line
  drafts round-trip intact via JSON's newline escaping. Capped
  at 200 entries; oldest pruned at push time. Empty /
  whitespace-only text is silently dropped.
* `crates/tui/src/commands/stash.rs` — `/stash list` renders the
  stash with one-line previews and timestamps; `/stash pop`
  restores the most recently parked draft into the composer
  (LIFO) and rewrites the file. `/park` aliases `/stash`.
* Composer Ctrl+S handler in `tui/ui.rs` — pushes the current
  draft onto the stash, clears the composer, and surfaces a
  toast confirming the action so the no-op-feel doesn't fool
  users into thinking nothing happened. Empty composers are a
  no-op so a stray Ctrl+S can't pollute the file.
* New `KbStashDraft` keybinding entry registered in the help
  overlay; localized in en, ja, zh-Hans, pt-BR.

Tests:
  7 unit tests in `composer_stash.rs` cover round-trip, LIFO pop,
  empty-on-pop, drop-empty-text, multi-line preservation,
  malformed-line resilience, and cap pruning. 4 unit tests in
  `commands/stash.rs` cover the preview helper's truncation,
  multi-line first-line behavior, and empty-input handling.
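
The push and pop rules can be illustrated with an in-memory sketch that leaves out the JSONL file and timestamps. `push_draft` and `pop_draft` are hypothetical names, not the `composer_stash` API.

```rust
/// Maximum number of parked drafts, per the commit description.
pub const STASH_CAP: usize = 200;

/// Parks a draft. Empty or whitespace-only text is dropped; when the cap
/// is exceeded the oldest entries are pruned. Returns whether it was stored.
pub fn push_draft(stash: &mut Vec<String>, text: &str) -> bool {
    if text.trim().is_empty() {
        return false;
    }
    stash.push(text.to_string());
    if stash.len() > STASH_CAP {
        let excess = stash.len() - STASH_CAP;
        stash.drain(..excess);
    }
    true
}

/// Restores the most recently parked draft (LIFO), if any.
pub fn pop_draft(stash: &mut Vec<String>) -> Option<String> {
    stash.pop()
}
```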
2026-05-03 06:09:35 -05:00
Hunter Bown 99223b148c docs(prompt): list load_skill in the model's toolbox reference (#434)
The new `load_skill` tool was registered into the agent and plan
mode tool sets in 0c1699 but the prompt's `## Toolbox`
quick-reference still listed only the legacy progressive-
disclosure pattern (system prompt → read_file). The model has to
read the tool description to know `load_skill` exists, but
without a hint in the toolbox it's easy to miss when scanning.

Adds a `**Skills**` line that points at `load_skill` and explains
when to prefer it over `read_file` + `list_dir`. Pulls from the
existing `## Skills` section above for context, so the model
sees one short cross-reference instead of duplicate setup
instructions.

No code change; prompt-only doc edit. Existing prompt-stability
tests pass unchanged because they don't compare prose.
2026-05-03 06:01:15 -05:00
Hunter Bown 0fa042dc99 feat(audit): emit tool.spillover events when output is spilled (#500 polish)
The existing `tool.result` audit event records that a tool
finished but says nothing about spillover — operators tailing
`~/.deepseek/audit.log` couldn't see when 200 KiB of stdout
landed under `~/.deepseek/tool_outputs/`.

Adds a discrete `tool.spillover` event keyed off
`apply_spillover`'s return value, fired in both the sequential
and parallel tool paths so the log entry exists regardless of
how the tool was scheduled. Each event carries:

  {"event": "tool.spillover", "tool_id": "...",
   "tool_name": "exec_shell", "path": "/.../call-abc.txt"}

This is a pure observability addition. The model still receives
the same truncated head + footer; the UI still renders the
inline `full output: <path>` annotation; the spillover writer
contract is unchanged. No new tests — `apply_spillover` already
has unit-level coverage and the engine paths are exercised by
integration runs.
2026-05-03 05:58:02 -05:00
Hunter Bown 6b0a60883a test(skill): integration tests for the load_skill execute path (#434/#432)
The five existing tests cover the helpers (`format_skill_body`,
`collect_companion_files`) directly. Adds two integration tests
that drive the full `LoadSkillTool::execute` async path:

* `execute_finds_skills_in_opencode_dir_via_workspace_discovery` —
  installs a skill under `<workspace>/.opencode/skills/` and
  verifies the tool finds it via `discover_in_workspace`,
  returns the body, and stamps `metadata.skill_path` pointing
  at the .opencode dir. Pins #432's multi-dir wiring through
  the actual tool entry point, not just the unit-level helper.
* `execute_returns_helpful_error_for_unknown_skill` — verifies
  the "skill not found" error includes both the missing name
  and the available skill list so the model can recover
  without a separate discovery call.

Both use `#[tokio::test]` because `ToolSpec::execute` is async.
ToolContext is constructed via the existing `ToolContext::new`
helper so the test stays hermetic across hosts.
2026-05-03 05:56:29 -05:00
Hunter Bown d7017b7829 feat(skills): walk workspace .opencode + .claude skill dirs (#432)
The skills catalogue and `load_skill` tool now scan every
candidate directory in the workspace plus the global default,
not just the first one that exists:

  <workspace>/.agents/skills    (deepseek-native convention)
  <workspace>/skills            (flat, project-local)
  <workspace>/.opencode/skills  (OpenCode interop)
  <workspace>/.claude/skills    (Claude Code interop)
  ~/.deepseek/skills            (global, user-installed)

Skills installed for any AI-tool convention land in the same
catalogue without the user having to symlink or duplicate
files. Name conflicts resolve first-match-wins per the
precedence list above, so workspace-local skills shadow
user/global ones — that's the right shadowing for "this repo
overrides my defaults".

Implementation:

* `skills::skills_directories(workspace)` returns the existing
  candidate dirs in precedence order (host-dependent for the
  global default).
* `skills::discover_in_workspace(workspace)` walks each, merges
  the discovered skills, and accumulates warnings.
* `render_available_skills_context_for_workspace(workspace)`
  wraps `discover_in_workspace` for `prompts.rs`. The legacy
  single-dir `render_available_skills_context(skills_dir)` is
  retained as a fallback so callers that don't have a workspace
  view (e.g. mcp_server.rs) still work.
* `LoadSkillTool` (#434) routes through `discover_in_workspace`
  so its lookup matches what the system-prompt catalogue
  advertises. The "skill not found" error message now lists the
  searched dirs to help the user debug missing installs.
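
The precedence list and first-match-wins merge can be sketched roughly as below. This is an illustration only, not the code in `skills/mod.rs`: `candidate_skill_dirs` and `merge_first_wins` are hypothetical names, the home directory is passed in explicitly, and the existence filtering that `skills_directories` performs is left out so the sketch stays free of filesystem access.

```rust
use std::collections::BTreeMap;
use std::path::{Path, PathBuf};

/// Candidate skill directories in precedence order (hypothetical helper).
/// Unlike the real `skills_directories`, this does not check which ones exist.
pub fn candidate_skill_dirs(workspace: &Path, home: &Path) -> Vec<PathBuf> {
    vec![
        workspace.join(".agents/skills"),
        workspace.join("skills"),
        workspace.join(".opencode/skills"),
        workspace.join(".claude/skills"),
        home.join(".deepseek/skills"),
    ]
}

/// Merge per-directory skill names so the first directory that defines a
/// name wins. `found` is assumed to be in precedence order already.
pub fn merge_first_wins(found: &[(PathBuf, Vec<String>)]) -> BTreeMap<String, PathBuf> {
    let mut merged = BTreeMap::new();
    for (dir, names) in found {
        for name in names {
            // An earlier (higher-precedence) entry is left untouched.
            merged.entry(name.clone()).or_insert_with(|| dir.join(name));
        }
    }
    merged
}
```

With this shape, a `lint` skill present in both `.agents/skills` and `.claude/skills` resolves to the `.agents` copy, which matches the shadowing described above.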

Tests:
  4 new tests in `skills/mod.rs`: precedence-order resolution,
  first-wins merge across .agents and .claude, .opencode
  discovery, system-prompt rendering for cross-tool dirs. The
  existing 6 single-dir tests pass unchanged.
2026-05-03 05:52:28 -05:00
Hunter Bown 8290b136e1 feat(tui): push DISAMBIGUATE_ESCAPE_CODES on startup (#442)
Opt into the Kitty keyboard protocol's escape-code disambiguation
so terminals that support it (Kitty, Ghostty, Alacritty 0.13+,
WezTerm, recent Konsole / xterm) report unambiguous events for
Option/Alt-modified keys, plain Esc, and multi-byte sequences.

Push happens after `enable_raw_mode` and the alt-screen /
mouse-capture / bracketed-paste setup so the order matches
shutdown's reverse-order pop. Only the disambiguation tier is
pushed — `REPORT_EVENT_TYPES` and the higher tiers emit release
events that the existing key handlers would mis-route as
duplicate presses.

Pop on exit was already wired in main.rs (panic) and ui.rs
(normal shutdown) per #443; the recent #443 follow-up extended
that to the suspend paths so editor / shell-suspend children
inherit a clean keyboard mode. The push + the four pops form
a complete pair.

Failure to push is logged at debug level and ignored — a quirky
terminal can't block startup. On terminals without protocol
support the escape sequence is silently discarded and behaviour
is identical to today (iTerm2, Terminal.app, Windows 10 conhost).

No new dependency; everything runs through crossterm's existing
`PushKeyboardEnhancementFlags` command.
2026-05-03 05:45:52 -05:00
Hunter Bown e8af3cd37d feat(tools): load_skill model-callable tool (#434)
Adds a `load_skill` tool that takes a skill id and returns the
SKILL.md body plus the sibling companion-file list in one tool
call. The existing progressive-disclosure pattern (system prompt
lists skills → model `read_file <path>`) still works; this tool
is the higher-level affordance for skills that ship with multiple
resource files.

Implementation:

* `LoadSkillTool` lives in `crates/tui/src/tools/skill.rs`. Read-
  only, auto-approved, parallel-safe.
* On call, resolves the active skills directory via the new
  `skills::resolve_skills_dir` helper, which mirrors
  `App::new`'s hierarchy: `<workspace>/.agents/skills` →
  `<workspace>/skills` → `~/.deepseek/skills`. No new plumbing
  through ToolContext — the workspace is already there.
* Returns the skill body wrapped in a self-contained block:
  description quote, source path, the SKILL.md verbatim, and a
  `## Companion files` section listing siblings (sorted lex,
  deterministic for tests). Solo skills skip the companions
  section entirely so the tool result stays tight.
* Errors with a helpful hint when the name is unknown — the
  hint includes the catalogue ("Available: foo, bar, baz") so
  the model can recover without an extra discovery call.
* Wired into `ToolRegistryBuilder::with_skill_tools` and pulled
  into both Agent and Plan tool-setup paths. Plan mode benefits
  because skills are read-only references that planners often
  need.
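
A minimal sketch of the result layout described in the list above, assuming a plain-string interface. `format_skill_block` and `unknown_skill_error` are hypothetical names and the exact wording of the real tool output is not shown in this commit, so treat the strings as placeholders.

```rust
/// Render a skill as one self-contained block (hypothetical layout):
/// description quote, source path, SKILL.md body, then companions if any.
pub fn format_skill_block(
    description: &str,
    source_path: &str,
    skill_md: &str,
    companions: &[&str],
) -> String {
    let mut out = format!("> {description}\n\nSource: {source_path}\n\n{skill_md}\n");
    // Solo skills skip the companions section entirely.
    if !companions.is_empty() {
        let mut sorted: Vec<&str> = companions.to_vec();
        sorted.sort_unstable(); // lexicographic, so output is deterministic
        out.push_str("\n## Companion files\n");
        for name in sorted {
            out.push_str(&format!("- {name}\n"));
        }
    }
    out
}

/// Error text for an unknown skill that also lists the catalogue.
pub fn unknown_skill_error(name: &str, available: &[&str]) -> String {
    format!("skill not found: {name}. Available: {}", available.join(", "))
}
```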

Tests:
  5 unit tests covering: description-headed body, companion
  enumeration excluding SKILL.md and nested dirs, empty result
  for solo skills, and the conditional `## Companion files`
  section.
2026-05-03 05:43:18 -05:00
Hunter Bown 20913b2f17 test(config): pin instructions-array merge semantics (#454 follow-up)
Adds four tests that pin the documented contract for the new
`instructions = [...]` field added in ac0c16996e (#454):

* Project array replaces the user array wholesale (the typical
  "merge" pattern is for users who want both — they list
  ~/global.md inside the project array).
* Explicit `instructions = []` clears the user list — a project
  signalling "this repo doesn't want any of those globals".
* Absent project field leaves the user list intact (nothing
  in the project file → user wins by default).
* Empty / whitespace-only entries are filtered out — the user
  shouldn't get a "could not read instructions file" warning
  for a stray `""` in the array.
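
The four pinned behaviours can be summarised in one small function. This is a sketch under assumptions: `merge_instructions` is a hypothetical free function, whereas the source splits the work between `merge_project_config` and `Config::instructions_paths()`, and path expansion is omitted.

```rust
/// Resolve the effective `instructions` list (hypothetical helper).
/// `None` means the field is absent from that config file.
pub fn merge_instructions(
    user: Option<Vec<String>>,
    project: Option<Vec<String>>,
) -> Vec<String> {
    // A present project array replaces the user array wholesale,
    // even when it is empty; an absent one leaves the user list intact.
    let chosen = project.or(user).unwrap_or_default();
    // Empty and whitespace-only entries are dropped silently.
    chosen
        .into_iter()
        .filter(|entry| !entry.trim().is_empty())
        .collect()
}
```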

These were the semantics promised in the original #454 commit
and the `config.example.toml` doc; pinning them with tests
prevents regressions.
2026-05-03 05:33:09 -05:00
Hunter Bown 5deaf97253 fix(tui): pop keyboard flags on suspend paths too (#443 follow-up)
`main.rs` (process panic) and the normal TUI shutdown both pop
keyboard enhancement flags before handing the terminal back to
the child shell. The two suspend paths — `pause_terminal`
(Ctrl+Z and shell-suspend) and
`external_editor::spawn_editor_for_input` (composer `$EDITOR`
launch) — were missing the same defensive pop.

Today this is dormant: the TUI doesn't push keyboard
enhancement flags explicitly, so there's nothing to pop. The
fix is defence-in-depth: the day a future code path enables
the flags (kitty keyboard protocol for sub-second-precision
modifier reporting, say), the suspend handlers won't leak the
half-configured input mode to Vim / less / a shell child.

Aligns the four terminal-handoff sites (shutdown, panic,
suspend, editor) so they all do the same thing.
2026-05-03 05:29:11 -05:00
Hunter Bown ac0c16996e feat(config): instructions array merged into system prompt (#454)
Adds a new optional `instructions = ["./AGENTS.md", "~/.deepseek/global.md"]`
config field that's loaded at startup and concatenated into the
system prompt, in declared order, above the skills block.

* `Config::instructions: Option<Vec<String>>` — raw paths from
  `~/.deepseek/config.toml` or the per-project overlay.
* `Config::instructions_paths()` — `expand_path` each entry,
  drop empties, return the resolved `Vec<PathBuf>`.
* `merge_project_config` — project's array replaces the
  user-level array wholesale (including `instructions = []` to
  clear the user list for the current repo). The typical "merge"
  pattern is for users who want both — they list `~/global.md`
  inside the project array.
* `EngineConfig::instructions: Vec<PathBuf>` — threaded from
  config through both engine entry points (`Engine::new` for
  Default and `refresh_system_prompt` for runtime swaps).
* `prompts::render_instructions_block(paths)` — loads each file
  in order, caps each at 100 KiB with a `[…elided]` marker on
  overflow, skips missing files with a tracing warning. Returns
  `None` when nothing renders so the caller appends nothing.
* `system_prompt_for_mode_with_context_and_skills` gains an
  `instructions: Option<&[PathBuf]>` parameter. Block lives
  between the project-context block and the skills block so it
  benefits from KV prefix caching and per-project overrides
  apply consistently turn-over-turn.

Documentation:
* `config.example.toml` documents the field, the wholesale-
  override semantics, and the size cap.

Tests:
* 5 new tests in `prompts.rs`: no-op for empty input, skip
  missing files, declared-order concatenation, skip empty files,
  truncate oversize files, plus an end-to-end test that the
  block appears in the assembled system prompt when configured.
2026-05-03 05:25:31 -05:00
Hunter Bown 5e83f073b1 feat(footer): cumulative session-elapsed indicator (#448)
Adds `App::session_started_at: Instant` (set at construction) and
a low-priority `worked Nh Mm` chip in the footer's right cluster
that surfaces session age once it crosses 60s.

* `footer_worked_chip(elapsed)` returns empty spans for the first
  minute of a session so a fresh launch doesn't render a noisy
  ticker. Above the threshold it reuses the multi-day
  `humanize_duration` helper (#447) so the band promotion stays
  consistent: `1m`, `3h 12m`, `2d 5h`, `1w 2d`.
* The chip slots in last in `auxiliary_spans`, which means under
  narrow widths it's the first thing the priority-drop loop
  removes — the existing chips (coherence / agents / replay /
  cache / mcp) keep their slots.
* `FooterProps` carries a captured `worked: Vec<Span<'static>>`
  built at props-build time (matches the existing `retry`
  capture pattern). Render stays pure, tests can pin a known
  state without relying on wall-clock.

Tests:
  3 new tests in `tui/widgets/footer.rs` — chip hidden under 60s,
  chip rendered with humanized labels at 60s / 3h 12m / 2d 5h
  bands. The existing `from_app_idle_state` test gains a
  `worked.is_empty()` assertion (the test app is freshly
  constructed, well under the 60s threshold).
2026-05-03 05:17:01 -05:00
Hunter Bown 6dfb10f321 feat(a11y): NO_ANIMATIONS env override + accessibility docs (#450)
`fancy_animations: false` and `low_motion: true` already exist on
the settings struct, but the flags were undocumented and the only
ways to opt in were the `/settings` slash command or hand-editing
`~/.config/deepseek/settings.toml`; there was no environment-level
signal that platform a11y tooling could carry forward.

* `NO_ANIMATIONS=1` env var now forces `low_motion = true` and
  `fancy_animations = false` at startup, regardless of what's on
  disk. Recognises `1`, `true`, `yes`, `on` (case-insensitive);
  any other value is treated as unset.
* `Settings::apply_env_overrides()` is now called at the end of
  `Settings::load()`, so every consumer (App::new, /config, the
  doctor surface) sees the override applied uniformly. The
  override is a startup-time overlay — changing the env var
  mid-session has no effect.
* New `docs/ACCESSIBILITY.md` documents the existing `low_motion`,
  `fancy_animations`, `calm_mode`, `show_thinking`, and
  `show_tool_details` toggles plus the `NO_ANIMATIONS` startup
  override. Includes guidance for screen-reader users and a link
  back to this issue for follow-up motion regressions.
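
A sketch of the truthy-spelling check and the startup overlay, assuming a reduced settings struct. `MotionSettings`, `is_truthy`, and `apply_no_animations` are hypothetical names; the real logic lives in `Settings::apply_env_overrides()` and may differ in detail (for example in whether surrounding whitespace is trimmed).

```rust
/// True when `value` is one of the recognised spellings
/// (`1`, `true`, `yes`, `on`, case-insensitive). Anything else counts as unset.
pub fn is_truthy(value: Option<&str>) -> bool {
    match value {
        Some(v) => matches!(
            v.to_ascii_lowercase().as_str(),
            "1" | "true" | "yes" | "on"
        ),
        None => false,
    }
}

/// Reduced stand-in for the two motion-related settings fields.
#[derive(Debug, Clone, PartialEq)]
pub struct MotionSettings {
    pub low_motion: bool,
    pub fancy_animations: bool,
}

/// Overlay the env override on whatever was loaded from disk.
pub fn apply_no_animations(mut settings: MotionSettings, env_value: Option<&str>) -> MotionSettings {
    if is_truthy(env_value) {
        settings.low_motion = true;
        settings.fancy_animations = false;
    }
    settings
}
```

A caller would pass `std::env::var("NO_ANIMATIONS").ok().as_deref()` as `env_value`; keeping the environment read outside the function lets tests avoid mutating process-global state.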

Tests:
  3 new tests in `settings.rs` (force-low-motion-on, override-
  user-opt-in, truthy-spelling-recognition). All three serialise
  through a static Mutex so the cargo parallel runner doesn't
  observe interleaved env mutations.
2026-05-03 05:09:17 -05:00
Hunter Bown 3625b887fa feat(ui): humanize_duration handles hours, days, and weeks (#447)
Long-running sessions (multi-hour cycles, multi-day automations)
were rendering with the seconds/minutes-only formatter, so a
two-day session showed as `2880m 0s` and `/goal` status used
Rust's Debug Duration form (`188415.234s`).

`humanize_duration` now walks through w/d/h/m/s and caps the
output at two units so it stays compact in headers and
notifications:

* `45s`, `1m 12s`, `59m 59s` (existing seconds/minutes path)
* `1h`, `2h 2m`, `3h 12m` (was `192m 30s`)
* `1d`, `1d 1h`, `2d 5h` (the multi-day case from the issue)
* `1w`, `1w 1d`, `3w 2d` (long-running automation)

The two-tier rule drops sub-minute precision once you're past
the hour boundary; the goal is "is this a couple of hours or
days," not stopwatch precision.
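
One way to express the two-unit rule is sketched below. It is an illustration of the banding described above, not the formatter in `notifications.rs`; in particular, the choice to drop the second unit when it is zero and to print `0s` for sub-second durations is an assumption inferred from the listed examples.

```rust
use std::time::Duration;

/// Two-unit duration label: the largest non-zero unit plus the next
/// smaller one (omitted when it is zero). Sketch only.
pub fn humanize_duration(d: Duration) -> String {
    // (seconds per unit, suffix), largest first.
    const UNITS: [(u64, &str); 5] = [
        (604_800, "w"),
        (86_400, "d"),
        (3_600, "h"),
        (60, "m"),
        (1, "s"),
    ];
    let total = d.as_secs();
    // Find the largest unit that fits at least once.
    let Some(idx) = UNITS.iter().position(|&(size, _)| total >= size) else {
        return "0s".to_string();
    };
    let (size, label) = UNITS[idx];
    let mut out = format!("{}{}", total / size, label);
    // Only the immediately smaller unit is considered for the second slot.
    if let Some(&(next_size, next_label)) = UNITS.get(idx + 1) {
        let minor = (total % size) / next_size;
        if minor > 0 {
            out.push_str(&format!(" {}{}", minor, next_label));
        }
    }
    out
}
```

Under these assumptions `Duration::from_secs(11_550)` (192m 30s) formats as `3h 12m` and `Duration::from_secs(190_800)` as `2d 5h`.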

`/goal` status now wires through this formatter so multi-day
goal-elapsed times read as `2d 4h` instead of the previous
`188415.234s` Debug form. The notification system was the
existing caller and picks up the new format automatically.

Tests:
  4 test functions in `notifications.rs` covering the four
  formatting bands (s/m, h/m, d/h, w/d) plus the boundary cases
  on each unit.
2026-05-03 05:05:30 -05:00
Hunter Bown 0b99ad1f25 feat(engine): wire tool-output spillover into the engine and pager (#500)
The spillover writer (#422) and inline cell annotation (#423) were
already in place; this commit makes the pipeline actually fire and
gives the user a way to see the elided tail.

* `apply_spillover` lives in `tools/truncate.rs` and mutates a
  `ToolResult` in place: writes the full content to
  `~/.deepseek/tool_outputs/<id>.txt`, replaces the inline content
  with a 32 KiB head plus a footer pointing at the file, and stamps
  `metadata.spillover_path` so downstream renderers can find it.
  Skips error results so the model still sees the failure verbatim.
  Preserves prior metadata when present.

* `core/engine/turn_loop.rs` calls `apply_spillover` immediately
  after `execute_tool_with_lock` returns, before the result fans
  out to the model context (`ContentBlock::ToolResult`) and the UI
  (`Event::ToolCallComplete`). Both the parallel and sequential
  tool paths get the same hook so the model and the UI always see
  the same truncated content.

* `tui/ui.rs::open_details_pager_for_cell` now folds the full
  spillover-file body into the tool-details pager when the focused
  cell has a `spillover_path`. Truncated head stays at the top
  (so the user can see what the model received) followed by a
  `── Full output (spillover) ──` separator and the file body.
  Missing files render an inline notice instead of silently
  truncating.

* The model's footer ("Use `read_file path=…` if you need the
  elided tail") teaches the agent how to recover the rest of the
  payload on its next turn, so spilled output is not lost — just
  not paid for in context tokens unless the agent decides it
  actually needs the tail.
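
The in-place mutation can be pictured with the sketch below. Everything here is an assumption made for illustration: the `ToolResult` struct is a reduced stand-in, the footer wording is invented, the spillover root is passed in instead of resolved from `~/.deepseek/tool_outputs/`, and a plain `fs::write` replaces the atomic writer.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Reduced stand-in for the engine's tool result type.
#[derive(Debug, Clone, Default)]
pub struct ToolResult {
    pub content: String,
    pub is_error: bool,
    pub metadata: BTreeMap<String, String>,
}

/// Spill oversized content to `<root>/<id>.txt` and keep only a head inline.
/// Returns the spillover path when a spill happened.
pub fn apply_spillover(
    result: &mut ToolResult,
    id: &str,
    root: &Path,
    threshold: usize,
    head_bytes: usize,
) -> io::Result<Option<PathBuf>> {
    // Error results stay verbatim; small results are left alone.
    if result.is_error || result.content.len() <= threshold {
        return Ok(None);
    }
    fs::create_dir_all(root)?;
    let path = root.join(format!("{id}.txt"));
    fs::write(&path, &result.content)?;

    // Walk back to a UTF-8 char boundary so the head is valid `str`.
    let mut end = head_bytes.min(result.content.len());
    while !result.content.is_char_boundary(end) {
        end -= 1;
    }
    let truncated = format!(
        "{}\n[truncated; full output: {}]",
        &result.content[..end],
        path.display()
    );
    result.content = truncated;
    // Inserting one key leaves any prior metadata in place.
    result
        .metadata
        .insert("spillover_path".to_string(), path.display().to_string());
    Ok(Some(path))
}
```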

Tests:
  4 new unit tests in `tools/truncate.rs` (no-op below threshold,
  no-op for errors, truncate + stamp above threshold, preserve
  prior metadata). 3 new tests in `tui/ui/tests.rs` for the pager
  helper (no-op without spillover_path, file-load happy path,
  graceful notice when the file is missing).
2026-05-03 05:02:11 -05:00
Hunter Bown 637d0f088f fix(agents): list Implementer/Verifier in agent_spawn + agent_assign schemas (#404)
The SubAgentType enum gained `Implementer` and `Verifier` variants in #404,
but the JSON-schema `description` strings on AgentSpawnTool::input_schema
and DelegateToAgentTool::input_schema still listed the pre-#404 set
(general/explore/plan/review/custom). The model only sees those
descriptions, so the new roles were effectively hidden behind a
docs lookup.

Updates both descriptions to the post-#404 surface and references
docs/SUBAGENTS.md for posture. Also adds the long-form aliases
(builder/validator/tester) to the agent_assign hint so it matches
the canonical alias map. Pure copy change — no behaviour delta.
2026-05-03 04:50:51 -05:00
Hunter Bown 482fcdee7c docs(changelog): collapse #422 + #423 spillover entry
Both halves now shipped; combined entry reads more clearly than two
separate ones split across Added.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:45:03 -05:00
Hunter Bown de4085304d feat(tui): inline spillover-path annotation in tool cells (#423)
PR #422 (sister commit on this branch) shipped the storage half:
`crates/tui/src/tools/truncate.rs` writes large tool outputs to
`~/.deepseek/tool_outputs/<id>.txt` and the boot prune drops files
older than 7 days. This commit ships the UI half — the inline
annotation that surfaces the spilled path in the tool cell so the
user (and the model) can find the elided tail.

### What's wired

- New `spillover_path: Option<PathBuf>` field on `GenericToolCell`.
  Threaded through every construction site (production +
  test fixtures = 28 sites; bulk-updated via a Python regex that
  preserves indentation per site).
- `tool_routing::push_orphan_tool_completion` now reads
  `ToolResult.metadata.spillover_path` and stamps it on the cell.
  When tools start writing the metadata field (#500's wiring step),
  the annotation lights up automatically.
- `GenericToolCell::lines_with_mode` emits a one-line muted
  annotation in `RenderMode::Live` only:

      full output: /Users/you/.deepseek/tool_outputs/call-abc12.txt

  Transcript-mode replay omits the annotation because the full
  output is already inline.
- `render_spillover_annotation` truncates the path to fit narrow
  widths (40-col sidebar friendly) using the existing
  `truncate_text` helper.

### Why no OSC 8 hyperlink yet

The OSC 8 wrap-link helper lives on PR #515's branch (also stacked
on `chore/v0.8.8-stabilization`); both PRs land independently to
`main`. Once both are in, a follow-up commit can wrap the path
in `osc8::wrap_link` so supporting terminals make it
Cmd+click-openable. The plain-text path works in every terminal
today, so there's no functional regression.

### Tests

4 new tests in `tui::history::tests`:
- `render_spillover_annotation_shows_path` — full path appears in
  the live-mode render
- `render_spillover_annotation_omitted_in_transcript_mode` —
  transcript replay leaves the annotation off
- `render_spillover_annotation_omitted_when_no_path_set` — the
  common case (most tool results don't trigger spillover) is
  unaffected
- `render_spillover_annotation_truncates_to_width` — narrow
  widths don't overflow the line

### Verification

cargo fmt --all -- --check                                                      ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings   ✓
cargo test --workspace --all-features --locked                                  ✓ 1877 + supporting (was 1873)

Closes #423. #500 (preview pane) now has both halves of its
prerequisites in place — the bytes are on disk (#422) and the path
is surfaced in the cell (#423).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:44:43 -05:00
Hunter Bown cea4617fb4 docs(changelog): record #422 spillover writer in v0.8.8 entry
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:38:17 -05:00
Hunter Bown cf616e03bd feat(tools): spillover-file writer + 7-day boot prune (#422)
#500 (tool-output spillover preview pane) explicitly depends on
#422 (the storage writer) and #423 (the UI annotation). This ships
the storage half so the other two unblock cleanly.

### What's wired

New module `crates/tui/src/tools/truncate.rs`:

- `spillover_root()` — resolves `~/.deepseek/tool_outputs/`.
- `spillover_path(id)` — sanitises the tool call id (ASCII
  alphanumerics + `-`/`_`, drops `.` so `..` can't escape) and
  returns `<root>/<id>.txt`.
- `write_spillover(id, content)` — atomic write via the existing
  `utils::write_atomic` helper. Creates parent directory on demand.
- `prune_older_than(max_age)` — drops files older than `max_age`
  by mtime. Returns the count pruned. Per-file errors are logged
  and skipped, never propagated.
- `maybe_spillover(id, content, threshold, head_bytes)` —
  convenience for the "too long? spill it." pattern. Walks back to
  the previous UTF-8 char boundary so the head slice is always
  valid `str`.
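
Two of the helpers above are small enough to sketch directly. The bodies below are an assumption about how the id sanitising and the char-boundary walk-back could be written with the standard library; `head_at_char_boundary` is a hypothetical name for the slicing step inside `maybe_spillover`, and the `Option` return for empty ids is likewise a guess at the interface.

```rust
/// Keep only ASCII alphanumerics, `-`, and `_`. Dropping `.` and `/`
/// means an id like `..` or `/etc/passwd` cannot escape the root.
/// Returns `None` when nothing usable is left.
pub fn sanitise_id(id: &str) -> Option<String> {
    let cleaned: String = id
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
        .collect();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned)
    }
}

/// Return at most `head_bytes` bytes of `content`, walking back to the
/// previous UTF-8 char boundary so the slice is always valid `str`.
pub fn head_at_char_boundary(content: &str, head_bytes: usize) -> &str {
    if head_bytes >= content.len() {
        return content;
    }
    let mut end = head_bytes;
    // Index 0 is always a boundary, so this loop terminates.
    while !content.is_char_boundary(end) {
        end -= 1;
    }
    &content[..end]
}
```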

Constants:
- `SPILLOVER_DIR_NAME = "tool_outputs"`
- `SPILLOVER_THRESHOLD_BYTES = 100 KiB` (mirrors
  `MAX_MEMORY_SIZE` for cross-feature consistency)
- `SPILLOVER_MAX_AGE = 7 days` (mirrors workspace snapshot prune)

Boot wiring in `run_interactive` calls `prune_older_than`
unconditionally; non-fatal — any error is logged at WARN and the
TUI starts anyway.

### Module-level `#![allow(dead_code)]`

The boot-prune is the only live caller today. The storage helpers
(`write_spillover`, `maybe_spillover`, `spillover_path`) are
intentionally unused outside the module's own tests until #423 / #500
land — those follow-ups need the bytes to exist, and the contracts
are pinned by tests so they can't drift before then. Module-level
`#![allow(dead_code)]` documents the deferral with a comment
pointing at the follow-up issues.

### Tests

8 unit tests in `tools::truncate::tests`:
- `sanitise_id` keeps safe chars, drops dangerous ones (`..`,
  `/etc/passwd` traversal attempts).
- `write_spillover` creates the directory and writes content.
- `write_spillover` rejects empty / fully-invalid ids.
- `maybe_spillover` returns `None` below threshold.
- `maybe_spillover` writes + returns the head slice above
  threshold.
- `maybe_spillover` walks back to a char boundary so the head
  string is never mid-codepoint (regression test using 4-byte
  whale emojis).
- `prune_older_than` is a no-op when the root doesn't exist.
- `prune_older_than` keeps fresh files and drops stale ones via a
  Unix `utimensat` backdating helper.

Tests serialize through a static `Mutex` because they share
process-global `$HOME`; the `with_test_home` helper documents
the SAFETY contract for the env-var override.

### Verification

cargo fmt --all -- --check                                                      ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings   ✓
cargo test --workspace --all-features --locked                                  ✓ 1873 + supporting (was 1865)

Closes #422 (storage half). #423 and #500 remain open with the
bytes now reachable on disk for them to cite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:38:00 -05:00
Hunter Bown 01fa11b96f docs(changelog): note /sessions prune slash command in #406 entry
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:31:18 -05:00
Hunter Bown 89500e4ebe feat(commands): /sessions prune <days> slash command (#406 phase-1.5)
The previous commit shipped `SessionManager::prune_sessions_older_than`
as a bare helper marked `#[allow(dead_code)]` pending phase-2 wiring.
This commit wires it into a user-callable slash command so users can
clean up stale sessions today, and removes the dead-code allow.

### Surface

```
/sessions                      → open the picker (existing)
/sessions show | list | picker → alias for the picker
/sessions prune <days>         → drop sessions older than N days
```

`/sessions prune 30` returns "pruned N sessions older than 30d" or
"no sessions older than 30d to prune". Errors:
- missing arg → usage hint
- non-positive / non-integer arg → typed error
- unknown subcommand → typed error with usage
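
The dispatch and validation rules above could be sketched as a pure parser. The names `SessionsCommand` and `parse_sessions_arg` are hypothetical, the error strings are placeholders, and the real handler in `commands::session` returns its own result type instead of `Result<_, String>`.

```rust
use std::time::Duration;

/// What the `/sessions` argument resolved to (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum SessionsCommand {
    Picker,
    Prune(Duration),
}

/// Parse the text after `/sessions`. `None` or an empty string opens the picker.
pub fn parse_sessions_arg(arg: Option<&str>) -> Result<SessionsCommand, String> {
    const USAGE: &str = "usage: /sessions [show|prune <days>]";
    let mut words = arg.unwrap_or("").split_whitespace();
    match words.next() {
        None | Some("show") | Some("list") | Some("picker") => Ok(SessionsCommand::Picker),
        Some("prune") => {
            // Missing argument: answer with the usage hint.
            let raw = words.next().ok_or_else(|| USAGE.to_string())?;
            // `u64` parsing rejects negative, decimal, and non-numeric input.
            let days: u64 = raw
                .parse()
                .map_err(|_| format!("invalid day count `{}`; {}", raw, USAGE))?;
            if days == 0 {
                return Err(format!("day count must be positive; {}", USAGE));
            }
            Ok(SessionsCommand::Prune(Duration::from_secs(
                days.saturating_mul(24 * 60 * 60),
            )))
        }
        Some(other) => Err(format!("unknown subcommand `{}`; {}", other, USAGE)),
    }
}
```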

The prune handler builds a fresh `SessionManager` from
`default_location` so it reads the same `~/.deepseek/sessions/`
directory the persistence layer writes; doesn't take a lock since
it's a one-shot CLI-style operation that runs to completion.

### What changed

- `commands::session::sessions` now takes `arg: Option<&str>`
  and dispatches `show` / `prune` / unknown.
- New `prune` private fn parses the days argument, opens
  `SessionManager::default_location`, calls
  `prune_sessions_older_than` with the corresponding `Duration`.
- `commands::COMMANDS` table updated: usage now reads
  `/sessions [show|prune <days>]`.
- `commands::mod` dispatch arm passes `arg` through.
- `SessionManager::prune_sessions_older_than` doc comment updated
  to reflect the live wiring; `#[allow(dead_code)]` removed.

### Tests

5 tests in `commands::session::tests` (4 new, 1 updated):
- existing `test_sessions_pushes_picker_view` updated to the new
  signature
- `test_sessions_show_subcommand_pushes_picker_view` —
  `/sessions show` is an explicit alias for the picker
- `test_sessions_prune_requires_days_argument` — missing arg
  produces usage hint
- `test_sessions_prune_rejects_non_positive_days` — `0`,
  negative, non-numeric, and decimal inputs are all rejected
- `test_sessions_unknown_subcommand_errors` — typo path errors
  with subcommand list

### Verification

cargo fmt --all -- --check                                                      ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings   ✓
cargo test --workspace --all-features --locked                                  ✓ 1865 + supporting

Refines #406 — phase 1.5 (user-callable surface) shipped on top of
phase 1 (helper). Phase 2 (boot-prune + retention policy) stays open
for v0.8.9 once the policy is decided.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:31:00 -05:00
Hunter Bown 2fa23c1d74 docs(changelog): record session-prune helper + doctor memory block
Two items added in this stabilization pass that weren't yet in the
changelog:
- SessionManager::prune_sessions_older_than (#406 phase-1)
- doctor --json memory block (#489 follow-up)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:27:12 -05:00
Hunter Bown 220f1b30c5 feat(sessions): SessionManager::prune_sessions_older_than helper (#406 phase-1)
#406 asks for an auto-archive system for old session state. The full
design needs prior-art survey + retention-policy decisions that are
explicitly out of scope for v0.8.8. This commit ships the **building
block** — a tested public method that removes session files older
than a given Duration — so phase 2 can wire it into a config-knob
boot prune without re-litigating the implementation.

```rust
pub fn prune_sessions_older_than(
    &self,
    max_age: std::time::Duration,
) -> std::io::Result<usize> { ... }
```

Behaviour:
- Compares against the metadata's `updated_at` (not filesystem
  mtime — the user may have rsynced `~/.deepseek`; fs mtimes can
  lie about real session age).
- Returns the count pruned; failures on individual files are
  logged at WARN and skipped, not propagated, so one bad record
  doesn't block the rest.
- Skips the checkpoint subdirectory entirely. Top-level
  `<session_id>.json` files are the only candidates;
  `checkpoints/latest.json` and friends are owned by the
  checkpoint subsystem and live with stricter durability rules.
- Marked `#[allow(dead_code)]` with a comment pointing at #406
  phase 2 — the helper exists today, the production wiring lands
  next.
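
The age comparison itself can be illustrated without touching the filesystem. This is a sketch under assumptions: `SessionMeta` and `stale_session_ids` are hypothetical names, records are already loaded in memory, and a record whose `updated_at` lies in the future is treated as fresh, which the commit does not specify.

```rust
use std::time::{Duration, SystemTime};

/// Reduced stand-in for a session's stored metadata.
pub struct SessionMeta {
    pub id: String,
    pub updated_at: SystemTime,
}

/// Ids of sessions whose `updated_at` is more than `max_age` before `now`.
pub fn stale_session_ids(
    records: &[SessionMeta],
    now: SystemTime,
    max_age: Duration,
) -> Vec<String> {
    records
        .iter()
        .filter(|record| match now.duration_since(record.updated_at) {
            Ok(age) => age > max_age,
            // `updated_at` is later than `now`: keep the record.
            Err(_) => false,
        })
        .map(|record| record.id.clone())
        .collect()
}
```

Passing `now` in explicitly keeps the selection deterministic, so a test can pin fresh, stale, and mixed directories without relying on the wall clock.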

### Tests

5 new tests in `session_manager::tests`:
- empty directory returns zero
- all-fresh records survive
- all-stale records get removed
- mixed directory removes only the stale ones
- checkpoint subdirectory is left alone (file untouched, count is
  still 1 for the top-level stale record)

### Verification

cargo fmt --all -- --check                                                      ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings   ✓
cargo test --workspace --all-features --locked                                  ✓ 1861 + supporting

Refines #406 — phase 1 (helper + tests) shipped. Issue stays open
for the v0.8.9 phase-2 work that decides the retention policy and
boot-prune wiring.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:26:35 -05:00
Hunter Bown 8e7664bc70 docs(changelog): populate [Unreleased] with v0.8.8 stabilization entries
Catalogues the 24 v0.8.8 issues shipped across PRs #514 / #515 /
#517 / #518 / #519 in the standard Keep-a-Changelog format,
organized into Added / Changed / Fixed buckets with issue
cross-references.

Captured:
- Added (10): memory MVP + remember tool, inline diff, OSC 8,
  retry banner, MCP chip, project config overlay,
  Implementer/Verifier roles, two doc files, competitive analysis
- Changed (8): sub-agent cap, RwLock, output summarization,
  agent_list session boundary, concise todos, compact agent_spawn,
  Plan panel, RLM family
- Fixed (8): self-update arch, Option+Backspace word delete,
  offline queue scope, display_path Windows, footer theme color,
  panic-exit keyboard flags, CI workflow cleanup, plus the v0.8.8
  release-base fix

Plus a Releases callout reminding maintainers that the npm wrapper
publish stays manual and the GitHub release automation depends on
the `RELEASE_TAG_PAT` secret.

The dated section header lands when the actual version-bump
commit fires `auto-tag.yml`. This commit just populates the
[Unreleased] body so contributors get a clean summary while the
PRs are still in review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:19:09 -05:00
Hunter Bown 1b7939e680 feat(doctor): surface memory feature state in --json output (#489)
Operators ask "is memory on?" and "where does it live?" without
wanting to boot the TUI. Adds a `memory` block to the JSON doctor
report:

```json
"memory": {
  "enabled": false,                                  // honours DEEPSEEK_MEMORY env
  "path": "/Users/you/.deepseek/memory.md",          // expanded path
  "file_present": false                              // does the file exist on disk?
}
```

The `enabled` field reads `DEEPSEEK_MEMORY` directly so it stays
correct on this stabilization branch even though the dedicated
`Config::memory_enabled()` accessor lives on the memory-MVP branch
(#518). When both PRs land, the duplicated env-parse can collapse to
a single method call — TODO comment marks the spot.

Verified:
- `deepseek doctor --json` shows `enabled: false` by default
- `DEEPSEEK_MEMORY=on deepseek doctor --json` shows `enabled: true`
- All gates green (1856 main + supporting)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:16:01 -05:00
Hunter Bown 8071bce319 docs: MEMORY.md — user-facing memory documentation (#489)
The memory MVP shipped in PR #518 added three surfaces (`# ` quick-add,
`/memory` slash command, `remember` model tool) plus the opt-in
toggle, but the only user-facing reference today is the one-line
mention of `memory_path` in CONFIGURATION.md and the `#489`
cross-reference in SUBAGENTS.md. This commit adds a dedicated user-facing
doc covering the whole feature.

Coverage:

- Why opt-in by default
- How to enable (env var + config.toml)
- What the system prompt block looks like
- Three ways to add to memory:
  1. `# foo` composer prefix (#492)
  2. `/memory` slash command (#491) — show / path / clear / edit
  3. `remember` tool (#489) — model-callable, auto-approved
- File format — timestamped Markdown bullets, hand-editable
- What stays out of memory — secrets / transient state / long
  instructions / conversation snippets
- Privacy and scope — per-user, never uploaded, provider-agnostic
- Configuration reference — settings table with defaults and overrides

Cross-link added in CONFIGURATION.md so the existing `memory_path`
mention now points at the full feature doc.

No Rust code changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:10:55 -05:00