Commit Graph

233 Commits

Author SHA1 Message Date
Hunter Bown 3e8da4b99b chore: bump version to 0.7.9
Includes:
- Post-turn freeze fix (reorder maybe_advance_cycle before TurnComplete)
- Enter/steering fix (QueueFollowUp when model is streaming)
- Esc fanout hardening (idempotent finalize methods)
- cargo fmt pass on new code
- CHANGELOG, README, and version bump across workspace + npm
2026-04-30 20:53:10 -05:00
Hunter Bown 3c92753a44 v0.7.9: post-turn freeze fix, Enter/steering fix, Esc fanout hardening
- fix(#234): reorder cycle advancement before TurnComplete so the engine
  loop doesn't block the terminal after the turn signal. User sees the
  '↻ context refreshing...' status chip during briefing generation
  instead of a frozen terminal with no feedback.
- fix(#250): Enter during streaming queues a follow-up (visible queued
  text) instead of claiming to 'steer' a message that never reaches the
  model. During tool-execution phases Enter correctly steers the active
  turn via rx_steer, which is already drained before each API call.
- fix(#243): hardened Esc during sub-agent fanout — added idempotency
  test proving finalize_active_cell_as_interrupted is safe when
  TurnComplete arrives after Esc.
- close(#249): unicode search panic fix confirmed in v0.7.8, closing.
- feat: both fixes implemented by live sub-agents (agent_spawn) —
  proving the sub-agent system works end-to-end.
2026-04-30 20:15:32 -05:00
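
The Enter behavior described for #250 in 3c92753a44 can be sketched as a small routing function. This is only an illustration: `TurnPhase`, `EnterAction`, and `route_enter` are hypothetical names, and the `Idle` arm is an assumption (the commit only describes the streaming and tool-execution cases).

```rust
/// Hypothetical turn phases, for illustration only.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TurnPhase {
    Idle,
    Streaming,
    ToolExecution,
}

/// What pressing Enter should do with the composer text.
#[derive(Debug, PartialEq, Eq)]
enum EnterAction {
    /// Assumed behavior when no turn is running.
    Submit(String),
    /// Show the text as queued; it is sent once the current stream ends.
    QueueFollowUp(String),
    /// Hand the text to the active turn (the commit mentions `rx_steer`).
    Steer(String),
}

fn route_enter(phase: TurnPhase, text: &str) -> EnterAction {
    let text = text.to_string();
    match phase {
        TurnPhase::Idle => EnterAction::Submit(text),
        // While the model is streaming, a steer would never reach it,
        // so the message is queued visibly instead.
        TurnPhase::Streaming => EnterAction::QueueFollowUp(text),
        // Steering input is drained before each API call, so it is
        // picked up between tool executions.
        TurnPhase::ToolExecution => EnterAction::Steer(text),
    }
}
```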
Hunter Bown d25783fe5b fix(v0.7.8): reconcile swarm state and unicode search 2026-04-30 19:50:01 -05:00
Hunter Bown 66a3aed528 feat(#248): foreground shell backgrounding, cancel tool, and Ctrl+B TUI shell control
- Add exec_shell_cancel tool to cancel one or all running background shell tasks
- Add foreground-to-background detach path via ShellManager request_foreground_background()
- Add wait_for_shell_delta_cancellable() so exec_shell_wait observes turn cancellation
- Add ShellControlView (Ctrl+B) with Background/Cancel options for active foreground commands
- Add 'Ctrl+B opens shell controls' hint in transcript for running exec cells
- Register exec_shell_cancel in ToolRegistryBuilder::with_shell_tools()
- Cancel-token checks in ShellInteractTool poll loop
- Update keybinding registry and OPERATIONS_RUNBOOK with Ctrl+B documentation
- Update TOOL_SURFACE.md with exec_shell_cancel entry
- Update prompts (rlm first-class guidance, AGENTS.md issue-closure policy)
- Tests: foreground_background, wait_cancel_leaves_running, cancel_tool_single, cancel_tool_all

Closes #248
2026-04-30 18:12:15 -05:00
Hunter Bown 7f2f47edf8 v0.7.7: stabilize sub-agent / swarm / fanout lifecycle, Windows install, and TUI polish (#246)
* wip(v0.7.7): handoff baseline of partial sub-agent stabilization

Captures uncommitted work-in-progress on the v0.7.7 stabilization lane
so subsequent fixes have a stable starting point. Subsequent commits
finish the canonical SubAgentJob/SwarmJob model, fix sidebar/transcript/
footer agreement, copy/paste/cancel contract, checklist rendering, shell
summary preservation, monotonic spend, and version provenance.

Refs #235 #236 #237 #238 #239 #240 #241 #242 #243 #244 #245

* release: bump workspace version to 0.7.7 (#245)

Refs #245

* fix(v0.7.7): canonical swarm card binding, monotonic spend, checklist + shell summary

- Add `swarm_card_index: HashMap<swarm_id, history_index>` so overlapping
  fanouts each project to their own FanoutCard. Eliminates the screenshot
  contradiction where a stale background swarm's progress clobbered a
  newer card (#236, #238).
- Suppress fanout-class tools (`agent_swarm`, `spawn_agents_on_csv`,
  `rlm`, `agent_spawn`) from `active_tool_status_label` so the footer no
  longer reports "tool agent_swarm · 1 active" while sidebar+card show
  the actual worker counts (#236, #238).
- Add `App::displayed_session_cost` + `displayed_cost_high_water` so the
  visible session+sub-agent total is monotonic across reconciliation
  events (cache discounts, provisional → final). New tests: monotonicity
  under negative reconciliation; duplicate dedup keeps display steady (#244).
- Preserve high-signal summary lines from the truncated tail of shell
  output: `test result:`, `failures:`, `error[E…]`, `Finished`,
  `Compiling`, panic markers. Stops the agent re-running cargo gates
  just to see pass/fail under truncation (#242).
- Render `checklist_write` / `todo_*` results as a purpose-built
  checklist card with completed/total + percent header, per-item status
  markers, and a collapsing affordance for long lists. Plumbed through
  the existing `GenericToolCell` so no new variant threading is needed (#241).

Refs #236 #238 #241 #242 #244
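
The monotonic spend display (#244) amounts to a high-water mark over the reconciled total. A minimal sketch, assuming a plain `f64` cost; `CostDisplay` and its method are hypothetical stand-ins for `App::displayed_session_cost` and `displayed_cost_high_water`, whose real signatures are not shown in this log.

```rust
/// Hypothetical holder for the displayed-cost high-water mark.
#[derive(Debug, Default)]
struct CostDisplay {
    high_water: f64,
}

impl CostDisplay {
    /// Returns the value to show in the UI. The reconciled total may go
    /// down (cache discounts, provisional to final), but the displayed
    /// value never does.
    fn displayed(&mut self, reconciled_total: f64) -> f64 {
        if reconciled_total > self.high_water {
            self.high_water = reconciled_total;
        }
        self.high_water
    }
}
```

Feeding the same total twice leaves the display unchanged, which mirrors the "duplicate dedup keeps display steady" test named above.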

* fix(v0.7.7): Esc clears active tool entries optimistically (#243)

When Esc cancels the foreground turn we now finalize the active cell
immediately rather than waiting for the engine's TurnComplete echo to
drain. This stops the footer "tool ... · X active" chip from briefly
contradicting the cancelled state, and frees the composer for the next
message.

Background `block:false` swarms are intentionally NOT killed here — they
remain durable, tracked through `swarm_jobs` and `swarm_card_index` so
their FanoutCard updates as workers land. Subsequent `swarm_status` /
`swarm_result` / `swarm_cancel` tool calls see the canonical store.

New focused test verifies: after Esc, `active_cell` is None, the
background swarm record is preserved, and `is_loading` is cleared so
the composer can submit immediately.

Refs #243
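
The Esc contract above (finalize immediately, stay safe when `TurnComplete` arrives later, leave background swarms alone) can be sketched as follows. `UiState` and its fields are simplified, hypothetical stand-ins for the real `App`; only the method name comes from the commit messages.

```rust
use std::collections::HashMap;

/// Simplified, hypothetical slice of the UI state.
#[derive(Debug, Default)]
struct UiState {
    active_cell: Option<String>,
    is_loading: bool,
    history: Vec<String>,
    /// Background swarm records keyed by swarm id; never touched by Esc.
    swarm_jobs: HashMap<String, String>,
}

impl UiState {
    /// Idempotent: the first call moves the active cell into history,
    /// later calls (for example from a late TurnComplete) are no-ops.
    /// Returns true if a cell was actually finalized.
    fn finalize_active_cell_as_interrupted(&mut self) -> bool {
        self.is_loading = false;
        match self.active_cell.take() {
            Some(cell) => {
                self.history.push(format!("{cell} (interrupted)"));
                true
            }
            None => false,
        }
    }
}
```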

* fix(v0.7.7): Windows .exe lookup + post-turn snapshot detach (#247, #234)

#247 — npm-distributed Windows package failed at runtime because the
Rust dispatcher's `delegate_to_tui` / `delegate_simple_tui` looked for a
sibling named exactly "deepseek-tui", while the actual file shipped by
`scripts/install.js` is `deepseek-tui.exe`. Replace both lookups with
`locate_sibling_tui_binary`, which:

- Honours `DEEPSEEK_TUI_BIN` for explicit overrides
- Tries `deepseek-tui{EXE_SUFFIX}` first (`.exe` on Windows, "" elsewhere)
- Falls back to suffix-less `deepseek-tui` on Windows so users who
  applied the issue's manual workaround still launch successfully
- Emits a platform-correct error path in the bail message

Tests: `sibling_tui_candidate_picks_platform_correct_name`,
`sibling_tui_candidate_windows_falls_back_to_suffixless` (windows-only),
`locate_sibling_tui_binary_honours_env_override`.
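
The lookup order listed above can be sketched like this. It is not the project's code: `candidate_names` and `locate_sibling` are hypothetical helpers, the override is passed in as a parameter (a caller would read `DEEPSEEK_TUI_BIN` from the environment), and returning the override without checking that it exists is an assumption.

```rust
use std::env::consts::EXE_SUFFIX;
use std::path::{Path, PathBuf};

/// Candidate file names in priority order: platform-correct name first,
/// then the suffix-less name as a Windows-only fallback.
fn candidate_names(exe_suffix: &str, is_windows: bool) -> Vec<String> {
    let mut names = vec![format!("deepseek-tui{exe_suffix}")];
    if is_windows && !exe_suffix.is_empty() {
        names.push("deepseek-tui".to_string());
    }
    names
}

/// Resolve the sibling binary next to the dispatcher in `dir`.
fn locate_sibling(dir: &Path, env_override: Option<&str>) -> Option<PathBuf> {
    // An explicit, non-empty override wins over any sibling lookup.
    if let Some(path) = env_override {
        if !path.is_empty() {
            return Some(PathBuf::from(path));
        }
    }
    candidate_names(EXE_SUFFIX, cfg!(windows))
        .into_iter()
        .map(|name| dir.join(name))
        .find(|path| path.is_file())
}
```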

#234 — Detach the post-turn workspace snapshot so `git add -A && git
commit` no longer pins the engine loop after `Event::TurnComplete`.
The snapshot still runs on `tokio::task::spawn_blocking`, but the
engine no longer awaits its `JoinHandle`, so the UI accepts input
(text, copy, paste, selection) without waiting for the bookkeeping to
finish. Cycle advance and pre-turn snapshot remain awaited — they are
correctness-sensitive and the cycle path already emits a status chip
("↻ context refreshing…") so the user has visible feedback.

Refs #234 #247

* chore(v0.7.7): bump npm package version 0.7.6 → 0.7.7

Required by `scripts/release/check-versions.sh` ("Version drift" CI
gate); the workspace was bumped to 0.7.7 but `npm/deepseek-tui/package.json`
still reported 0.7.6, blocking PR #246 from going green.

Refs #245
2026-04-30 07:26:26 -05:00
Hunter Bown 8ba8600155 release: v0.7.6
- Bump workspace version to 0.7.6 (Cargo.toml + all crate internal dep pins)
- Bump npm wrapper version and deepseekBinaryVersion to 0.7.6
- Add v0.7.6 changelog entry: localization, paste burst, history search,
  pending input preview, grouped /config editor, searchable help overlay,
  Alt+↑ edit-last-queued, composer attachment management
- Update README with v0.7.6 features (localization, paste, history search)
- Archive v0.7.5 implementation plan to docs/archive/
- Update Cargo.lock
2026-04-29 17:00:36 -05:00
Hunter Bown 0568203ec2 Fix agent swarm cancellation and status UI 2026-04-29 16:43:28 -05:00
Hunter Bown a47b28e5d5 Complete v0.7.6 TUI polish and localization lane (#222)
Squash-merge PR #222 after green CI and review cleanup.

Closes #198, #199, #206, #207, #208, #209, #210, #212, #213, #214, #215, #216.
2026-04-29 13:06:51 -05:00
Hunter Bown c2b2c284f6 release: v0.7.5 — token-basis fixes, shell timeout recovery, context/cache policy
Issues #202, #203, #204, #205:

- Cycle/seam triggers use active request input size + response
  headroom reserve, not lifetime cumulative API usage.
- V4 hard-cycle headroom calibrated around fixed TURN_MAX_OUTPUT_TOKENS
  plus CONTEXT_HEADROOM_TOKENS safety buffer.
- /tokens, /cost, footer/header labels, and docs now separate
  active context, turn telemetry, cumulative usage, cache hit/miss,
  context percent, and cost.
- Foreground exec_shell timeout output tells the model the process
  was killed and suggests task_shell_start or background exec_shell
  plus poll/wait.
- Added regression tests for active-token basis, V4 headroom,
  seam trigger basis, footer label behavior, and shell timeout
  recovery metadata.
- Preserved #200/#201 policy: V4 default is append-only,
  prefix-cache preserving; replacement compaction, Flash seams,
  and capacity intervention remain opt-in.
2026-04-29 10:13:27 -05:00
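
The token-basis change in c2b2c284f6 can be illustrated with a small predicate: the trigger looks at the size of the active request plus a reserve for the response, not at lifetime API usage. The constant names come from the commit message, but every numeric value below is a placeholder chosen for the example, and `should_hard_cycle` is a hypothetical function.

```rust
// Placeholder values for illustration; the real numbers are not in this log.
const CONTEXT_WINDOW_TOKENS: u64 = 1_000_000;
const TURN_MAX_OUTPUT_TOKENS: u64 = 64_000;
const CONTEXT_HEADROOM_TOKENS: u64 = 16_000;

/// True when the next request would no longer leave room for a full
/// response plus the safety buffer.
fn should_hard_cycle(active_input_tokens: u64) -> bool {
    let reserve = TURN_MAX_OUTPUT_TOKENS + CONTEXT_HEADROOM_TOKENS;
    active_input_tokens.saturating_add(reserve) >= CONTEXT_WINDOW_TOKENS
}
```

Because the input is the active request size, a long session that has consumed many cumulative tokens does not trip the trigger as long as the current context stays small.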
Hunter Bown 0578eb701e Add shell jobs and MCP manager to the TUI 2026-04-29 09:38:04 -05:00
Hunter Bown 41e8f2b5b2 Disable default compaction and opt in context seams 2026-04-29 09:12:20 -05:00
Hunter Bown 00c92e1c2a Implement v0.7.4 long-running agent tools 2026-04-29 00:50:43 -05:00
Hunter Bown 6d8ab4c2b8 fix: close v0.7.2 issue cleanup 2026-04-28 23:09:19 -05:00
Hunter Bown 0f8c363012 fix: clean v0.7.2 release prep 2026-04-28 21:54:43 -05:00
Hunter Bown 35db361a87 v0.7.2: clippy clean, cost counter wiring, layered context fixup
#167: Fix all 7 clippy warnings — annotated SeamMetadata dead fields,
removed unused should_cycle calls, collapsed nested ifs, fixed
useless_format and nonminimal_bool.

#168: Wire TokenUsage mailbox drain to subagent_cost accumulator.
handle_subagent_mailbox now intercepts TokenUsage before routing to
cards, computes cost via calculate_turn_cost, and increments
app.subagent_cost in real time. Footer reflects live sub-agent spend.

Restored ArchivedContext variant to HistoryCell (corrupted by prior
apply_patch). Version bump to 0.7.2.

Refs: #166, #167, #168
2026-04-28 21:46:25 -05:00
Hunter Bown 12b1ae42c4 feat: SeamManager module + context config for v0.7.2 layered context management
Adds the core SeamManager struct (#159) that uses V4 Flash to produce
append-only <archived_context> XML blocks at 192K/384K/576K thresholds.
No messages are deleted — soft seams are navigational summaries that
preserve the V4 prefix cache.

- seam_manager.rs: Flash-driven soft seam production, recompaction,
  and cycle briefing replacement
- config.rs: [context] table with L1/L2/L3/cycle thresholds,
  verbatim window, seam model, and per-model overrides
- compaction.rs: pub exports for plan_compaction, KEEP_RECENT_MESSAGES,
  and CompactionPlan fields so SeamManager can reuse pinning heuristics
- cycle_manager.rs: pub CYCLE_HANDOFF_TEMPLATE for Flash briefing use
- main.rs: mod seam_manager registration

All 1,570 tests pass. Engine wiring follows in a subsequent commit.
2026-04-28 20:15:22 -05:00
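
The 192K/384K/576K thresholds from 12b1ae42c4 suggest a simple level computation. A rough sketch, assuming "K" means thousands of tokens and that a threshold counts as crossed when it is reached; `seam_level` and `seam_due` are hypothetical helpers, not the `SeamManager` API.

```rust
/// Soft-seam thresholds in tokens (assumed to be 192K, 384K, 576K).
const SEAM_THRESHOLDS: [u64; 3] = [192_000, 384_000, 576_000];

/// Number of thresholds the active context has reached (0 to 3).
fn seam_level(active_tokens: u64) -> usize {
    SEAM_THRESHOLDS
        .iter()
        .filter(|&&threshold| active_tokens >= threshold)
        .count()
}

/// A new append-only seam is due when the context has reached a level
/// for which no seam has been written yet.
fn seam_due(active_tokens: u64, seams_written: usize) -> bool {
    seam_level(active_tokens) > seams_written
}
```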
Hunter Bown 64d1698bde Release 0.7.1 (#156) 2026-04-28 18:38:44 -05:00
Hunter Bown 096aa91823 Rank file picker by working set relevance (#155) 2026-04-28 18:26:07 -05:00
Hunter Bown 0e96928f35 Add compact context inspector metadata (#154) 2026-04-28 18:14:29 -05:00
Hunter Bown 6396bffcd4 Make tool details selected-card aware (#153) 2026-04-28 17:58:45 -05:00
Hunter Bown 78b272e56b Group active tool cards and live status (#152) 2026-04-28 17:46:55 -05:00
Hunter Bown d7b033d59e Polish tool cards and context previews (#151) 2026-04-28 17:36:00 -05:00
Hunter Bown a02907b89d chore(release): v0.7.0 2026-04-28 16:08:33 -05:00
Hunter Bown 49d2be9e5c refactor(tools): share tool result primitives from crate 2026-04-28 01:23:21 -05:00
Hunter Bown 4fb8372c1c refactor(engine): split turn loop and capacity flow 2026-04-28 01:12:25 -05:00
Hunter Bown 27527699db Merge branch 'feat/v070-lsp' (#136 LSP diagnostics)
# Conflicts:
#	config.example.toml
#	crates/config/src/lib.rs
#	crates/tui/src/config.rs
#	crates/tui/src/core/engine.rs
#	crates/tui/src/main.rs
#	crates/tui/src/runtime_threads.rs
#	crates/tui/src/tui/ui.rs
2026-04-28 01:03:36 -05:00
Hunter Bown 41daab3ca0 Merge branch 'feat/v070-snapshots' (#137 side-git snapshots)
# Conflicts:
#	crates/config/src/lib.rs
#	crates/tui/src/config.rs
2026-04-28 00:58:16 -05:00
Hunter Bown 626bfbc65e Merge branch 'feat/v070-backtrack' (#133 Esc-Esc backtrack) 2026-04-28 00:50:21 -05:00
Hunter Bown 3bc54b0bc0 fix(snapshot): harden side-git restore wiring 2026-04-28 00:46:24 -05:00
Hunter Bown 0781b7c203 feat(session): #137 prune stale workspace snapshots at session boot
`run_interactive` now calls `session_manager::prune_workspace_snapshots_at_boot`
right after the system-skills installer, dropping any snapshot in the
side-git repo older than 7 days (default; configurable via the new
`[snapshots]` section in `config.example.toml`). The helper is
non-fatal: a missing `git` binary, read-only home, or absent snapshot
dir all log a single WARN (or DEBUG for the count of pruned commits)
and return, so the TUI keeps starting even when retention can't run.

Also document the snapshot subsystem in `config.example.toml` —
disk-footprint expectations, where the side repo lives, and how
`/restore` / `revert_turn` consume it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:31:57 -05:00
Hunter Bown fb12e331ab feat(snapshot): #137 add /restore command and revert_turn tool
Two user-facing entry points to the snapshot side-repo:

- `/restore [N]` (slash command) — `/restore` with no arg lists the
  10 most recent snapshots so the user can see what's available.
  `/restore N` restores the N-th most recent snapshot. Outside YOLO
  or `/trust on`, the command refuses to mutate files and tells the
  user how to opt in (no in-flow modal-confirm path inside slash
  commands today; trust mode is the explicit gate).

- `revert_turn` (agent-callable tool) — `turn_offset` (default 1)
  counts in `pre-turn:*` snapshots, so the model can say "undo my
  last edit" without having to enumerate the history. Approval-gated
  (`ApprovalRequirement::Required`) since it mutates the workspace,
  and registered through `with_full_agent_surface` so children
  inherit it just like every other agent-mode tool.

Tests for both surfaces use the process-wide env mutex
(`crate::test_support::lock_test_env`) plus an RAII `HOME` guard so
tempdir-based snapshot resolution stays inside the per-test sandbox
even when the runner threads multiple tests in parallel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:31:47 -05:00
Hunter Bown 8ff4f66b95 feat(core): #137 add pre/post-turn snapshot hooks
Wire `pre_turn_snapshot` and `post_turn_snapshot` helpers into
`core::turn`, then call them from `Engine::handle_send_message` —
pre-turn fires right after `turn_counter` is incremented, post-turn
fires right after `Event::TurnComplete` is emitted.

Both hooks are dispatched via `tokio::task::spawn_blocking` so the
agent loop never waits on the side-git commit, and helper failures are
swallowed at WARN log level so a busted disk or missing `git` binary
can never derail a turn (per the snapshot module's documented
non-fatal contract).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:31:31 -05:00
Hunter Bown 3dc116b9fc feat(snapshot): #137 add workspace snapshot side-repo module
Introduce `crate::snapshot` — a per-workspace side-git repo that lives
under `~/.deepseek/snapshots/<project_hash>/<worktree_hash>/.git` and
captures the workspace into commits via `git add -A` + `git commit
--allow-empty`. The user's own `.git` is never touched: every git
invocation passes both `--git-dir` (side repo) and `--work-tree`
(workspace) together, which is the load-bearing safety invariant.

Module layout:
- `paths.rs` — resolves the side-repo dir; strips `.worktrees/<name>`
  so worktrees of the same checkout share a project_hash but get
  distinct worktree_hashes.
- `repo.rs` — `SnapshotRepo::open_or_init / snapshot / restore / list /
  prune_older_than`. Shells out to system `git` (avoids `git2` LGPL
  surface). Honors workspace `.gitignore` automatically.
- `prune.rs` — boot-time helper used by session_manager (next commit).
  Default retention is 7 days.

Tests (real `git` invocations on tempdirs, env-mutating tests serialised
through the existing `crate::test_support::lock_test_env` mutex) cover:
snapshot creates a commit in the side repo only, restore reverts files,
list respects limit, prune drops aged commits, gitignore is honored,
and re-init is idempotent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:31:21 -05:00
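
The safety invariant in 3dc116b9fc (every invocation carries both `--git-dir` and `--work-tree`) is easiest to keep if all commands go through one constructor. A minimal sketch using only `std::process::Command`; `side_git` is a hypothetical helper name and the paths in the usage note are examples.

```rust
use std::ffi::OsString;
use std::path::Path;
use std::process::Command;

/// Build a `git` command that always targets the side repo and the
/// workspace together, so the user's own `.git` is never used.
fn side_git(git_dir: &Path, work_tree: &Path, args: &[&str]) -> Command {
    let mut git_dir_arg = OsString::from("--git-dir=");
    git_dir_arg.push(git_dir);
    let mut work_tree_arg = OsString::from("--work-tree=");
    work_tree_arg.push(work_tree);

    let mut cmd = Command::new("git");
    cmd.arg(git_dir_arg).arg(work_tree_arg).args(args);
    cmd
}
```

A snapshot would then be two calls, for example `side_git(side, ws, &["add", "-A"])` followed by `side_git(side, ws, &["commit", "--allow-empty", "-m", "pre-turn:1"])`, each run with `.status()`; the commit message shown here is made up.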
Hunter Bown de9a3b8b74 feat(skills): #140 wire /skill install/update/uninstall/trust + [skills] config
Slash-command surface for the community-skill installer:

- `/skill install <github:owner/repo|https://...|<registry-name>>` parses
  the spec via `InstallSource::parse`, calls `install_with_registry`, and
  surfaces `NeedsApproval`/`NetworkDenied` with actionable messages
  pointing at `[network]` config (we deliberately don't dispatch a modal
  from the sync slash-command path; the underlying installer returns the
  outcome so a future approval wiring can reuse it).
- `/skill update <name>` re-fetches and prints "no upstream change" when
  the checksum matches.
- `/skill uninstall <name>` and `/skill trust <name>` both refuse to
  touch system skills (no `.installed-from` marker).
- `/skills --remote` (or `/skills remote`) fetches the curated registry
  through the same network gate and prints `name — description (source)`.

Internals:
- Sub-command dispatch happens in `run_skill` before activation lookup,
  so a user can't accidentally activate a skill literally named
  `install`. Async install/update/uninstall plumbed through
  `tokio::task::block_in_place` + `Handle::current().block_on`, matching
  the existing pattern in `commands/cycle.rs`.
- `installer_settings` loads `Config` on demand — `App` doesn't carry a
  `Config` reference, and the cost of a single TOML parse is negligible
  next to the network round-trip the install will make.

Config:
- New `[skills]` section in both `crates/tui/src/config.rs::Config` and
  the workspace `crates/config/src/lib.rs::ConfigToml` with
  `registry_url` (default: bundled raw GitHub index) and
  `max_install_size_bytes` (default: 5 MiB).
- `merge_config` propagates the new field, default impls cover the
  unset case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:30:05 -05:00
Hunter Bown ad78466ba0 feat(skills): #140 community-skill installer module
Add `crates/tui/src/skills/install.rs` — async installer that pulls
user-authored skills from GitHub repos, raw tarball URLs, or a curated
`index.json` registry. The whole pipeline is gated by the per-domain
`NetworkPolicy` (#135), validated against path-traversal / size / symlink
attacks before any bytes hit the destination, and atomic-renamed into place
so a half-installed skill cannot survive a failure mid-extract.

Public surface:
- `InstallSource::{GitHubRepo,DirectUrl,Registry}` with `parse(spec)`.
- `install` / `install_with_registry` returning
  `InstallOutcome::{Installed,NeedsApproval,NetworkDenied}`.
- `update` / `update_with_registry` returning
  `UpdateResult::{NoChange,Updated,NeedsApproval,NetworkDenied}` — uses a
  SHA-256 over the downloaded tarball to short-circuit no-op fetches.
- `uninstall` / `trust` — both refuse to touch directories without an
  `.installed-from` marker, so the bundled `skill-creator` system skill is
  protected.
- `fetch_registry` — typed loader for the curated `index.json`.

Validation hard rules (each covered by an integration test):
- `..` segments and absolute paths in tar entries are rejected.
- Symlinks / hardlinks in tar entries are rejected outright.
- Uncompressed total size is bounded by `max_size` (default 5 MiB).
- SKILL.md must exist at the archive root or under `skills/<name>/`.
- Frontmatter must carry both `name` and `description`.
- `install` with an existing destination requires `update = true`.
- `update` re-fetches and only replaces the on-disk install when the
  checksum changes; no-change paths skip the rename entirely.

Adds `tar`, `flate2`, and `sha2` to `crates/tui/Cargo.toml` and propagates
the resulting lockfile drift to `Cargo.lock`.

Tests: 11 colocated unit tests in `install.rs` + 11 integration tests in
`crates/tui/tests/skill_install.rs` driving a `tiny_http`-based server so
the network gate, download cap, validation pipeline, and atomic rename
all run end-to-end.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:29:48 -05:00
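
The first validation rule in ad78466ba0 (reject `..` segments and absolute paths in tar entries) can be expressed with `std::path::Component`. This is a sketch of that one rule only; `is_safe_entry_path` is a hypothetical name, and the symlink, hardlink, and size checks would need the archive entry metadata, which is not modeled here.

```rust
use std::path::{Component, Path};

/// True only for non-empty relative paths made of normal segments
/// (and `.`), so an entry can never escape the extraction directory.
fn is_safe_entry_path(path: &Path) -> bool {
    if path.as_os_str().is_empty() {
        return false;
    }
    path.components().all(|component| {
        // ParentDir (`..`), RootDir (`/`), and Windows prefixes are rejected.
        matches!(component, Component::Normal(_) | Component::CurDir)
    })
}
```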
Hunter Bown 05a1032e00 feat(lsp): #136 post-edit diagnostics injection
Inject LSP diagnostics as a synthetic user message after every successful
file edit (`edit_file`, `apply_patch`, `write_file`) so the agent sees
compile breaks before its next reasoning step. Largest agent-quality
lever in v0.7.0.

Pieces:
- `crates/tui/src/lsp/`: thin JSON-RPC stdio client (no `tower-lsp`),
  per-language registry, diagnostics renderer producing the
  `<diagnostics file="…">` block format. `LspManager` owns lazily
  spawned per-language transports keyed by `Language`.
- `core/engine.rs`: hook on the success branch of the tool-result loop
  derives the edited file path(s) per tool, queries the LspManager
  with a 5 s timeout, and collects rendered blocks into
  `pending_lsp_blocks`. The queue is flushed as a `text` content
  block on the next request iteration so the model sees the
  diagnostics before it streams its next turn.
- `[lsp]` config schema (`enabled`, `poll_after_edit_ms`,
  `max_diagnostics_per_file`, `include_warnings`, optional
  `servers` override) with built-in defaults for rust-analyzer,
  gopls, pyright, typescript-language-server, and clangd.
- Failure modes are non-blocking by design: a missing LSP binary
  logs a one-time warning and skips the hook; a crashed server or
  poll timeout simply drops that turn's diagnostics. The agent's
  work is never blocked.

Tests: 24 unit tests cover language detection, registry overrides,
filter/sort/truncate behavior, and the rendered block format. Three
engine-level tokio tests exercise the full path through a fake
transport (no real LSP server is ever spawned in CI).

Acceptance criteria (per #136):
- Edit introducing a type error -> next request body contains
  `<diagnostics file="…">` block at the right line/col.
- `[lsp] enabled = false` -> no diagnostics injected.
- Snapshot test exercises full path with mock transport.
- LSP binary not on PATH -> one-time warning, agent proceeds.
- 5 s timeout, errors-only by default.
- Transports spawn lazily on first edit per language.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:29:28 -05:00
Hunter Bown 18b797b593 feat(tui): #133 wire Esc-Esc backtrack into UI and live transcript
Connects the new BacktrackState to the live UI:

- App: holds a `backtrack: BacktrackState` and a new
  `truncate_history_to(new_len)` helper that keeps `tool_cells`,
  `tool_details_by_cell`, and the sub-agent card index consistent.
- live_transcript: gains a `Mode::BacktrackPreview { selected_idx }`
  that highlights the Nth-from-tail HistoryCell::User with a `▶` marker
  and reverse-video styling. Cache stays valid across mode flips —
  decoration is applied post-wrap. Left/Right/Enter/Esc emit new
  `ViewEvent::Backtrack{Step,Confirm,Cancel}` events.
- ui.rs: routes Esc through `BacktrackState::handle_esc` only when
  no popup is open and not streaming, opens the preview overlay on
  the second Esc, and on confirm trims `app.history` /
  `app.api_messages` and refills the composer with the dropped user
  input. Streaming and existing popup paths preserve their original
  Esc behaviour.
- keybindings: documents the `Esc Esc` chord in the help catalog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:26:21 -05:00
Hunter Bown e3dc49c526 feat(tui): #133 add Esc-Esc backtrack state machine module
Introduces `tui::backtrack::BacktrackState` — a small Inactive/Primed/
Selecting state machine for the two-step Esc chord. The module owns
nothing beyond its phase enum; transcript snapshots, popup detection,
and fork side-effects all stay in the UI layer so the state machine is
trivially unit-testable.

`handle_esc(total_user_messages)` returns one of `None | Prime |
Cancel | OpenOverlay`, `step(Direction)` walks the selection in
`Selecting`, and `confirm()` yields the depth-from-tail and resets to
`Inactive`. 15 unit tests cover every transition including bounds
clamping, empty-transcript short-circuit, and defensive Esc routing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:26:09 -05:00
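
The state machine described in e3dc49c526 might look roughly like the sketch below. The phase and action names follow the commit message; the exact transitions, the `Direction` variants, and the choice to store the message count inside `Selecting` are assumptions made for the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Phase {
    Inactive,
    Primed,
    /// `depth` counts user messages from the tail; `total` bounds it.
    Selecting { depth: usize, total: usize },
}

#[derive(Debug, PartialEq, Eq)]
enum EscAction {
    None,
    Prime,
    Cancel,
    OpenOverlay,
}

#[derive(Debug, Clone, Copy)]
enum Direction {
    Older,
    Newer,
}

#[derive(Debug)]
struct BacktrackState {
    phase: Phase,
}

impl BacktrackState {
    fn new() -> Self {
        BacktrackState { phase: Phase::Inactive }
    }

    fn handle_esc(&mut self, total_user_messages: usize) -> EscAction {
        // Empty transcript: nothing to rewind to, so Esc does nothing here.
        if total_user_messages == 0 {
            self.phase = Phase::Inactive;
            return EscAction::None;
        }
        match self.phase {
            Phase::Inactive => {
                self.phase = Phase::Primed;
                EscAction::Prime
            }
            Phase::Primed => {
                self.phase = Phase::Selecting { depth: 0, total: total_user_messages };
                EscAction::OpenOverlay
            }
            Phase::Selecting { .. } => {
                self.phase = Phase::Inactive;
                EscAction::Cancel
            }
        }
    }

    /// Move the selection, clamped to the available user messages.
    fn step(&mut self, direction: Direction) {
        if let Phase::Selecting { depth, total } = self.phase {
            let next = match direction {
                Direction::Older => (depth + 1).min(total - 1),
                Direction::Newer => depth.saturating_sub(1),
            };
            self.phase = Phase::Selecting { depth: next, total };
        }
    }

    /// Yield the selected depth-from-tail and reset to `Inactive`.
    fn confirm(&mut self) -> Option<usize> {
        match self.phase {
            Phase::Selecting { depth, .. } => {
                self.phase = Phase::Inactive;
                Some(depth)
            }
            _ => None,
        }
    }
}
```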
Hunter Bown 26b79312f9 feat(runtime): #133 add fork_at_user_message for backtrack rewind
Adds `RuntimeThreadManager::fork_at_user_message(id, depth_from_tail)` —
a sibling of the existing `fork_thread` that drops every turn from the
Nth-from-tail user message onward and returns the dropped user input so
the caller can pre-populate the composer.

The existing `fork_thread` is left untouched. The new helper mirrors its
copy loop but stops short of the cutoff turn, emitting a
`thread.forked` event with backtrack provenance fields. Includes unit
tests covering depth=0, depth=1, out-of-range error, and source-thread
non-mutation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:26:00 -05:00
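
The rewind in 26b79312f9 can be sketched over a plain message list. This is a simplified model, not `RuntimeThreadManager`: `Message` is a hypothetical type, the function works on a slice instead of a thread id, and treating depth 0 as the most recent user message is an assumption.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum Message {
    User(String),
    Assistant(String),
}

/// Copy everything before the Nth-from-tail user message into a new
/// history and return the dropped user input alongside it. The source
/// slice is never mutated.
fn fork_at_user_message(
    source: &[Message],
    depth_from_tail: usize,
) -> Result<(Vec<Message>, String), String> {
    let (cutoff, dropped_input) = source
        .iter()
        .enumerate()
        .rev()
        .filter_map(|(index, message)| match message {
            Message::User(text) => Some((index, text.clone())),
            Message::Assistant(_) => None,
        })
        .nth(depth_from_tail)
        .ok_or_else(|| format!("depth {depth_from_tail} is out of range"))?;

    Ok((source[..cutoff].to_vec(), dropped_input))
}
```

The returned string is what a caller would place back into the composer.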
Hunter Bown 87ac2e9bdc Merge branch 'feat/v070-network' (#135 egress network policy + audit)
- crates/tui/src/network_policy.rs (NEW, ~300 LOC) — NetworkPolicy/NetworkPolicyDecider/NetworkAuditor
- crates/tui/src/tools/{fetch_url,web_search,spec}.rs + mcp.rs — gate before egress
- crates/tui/src/core/engine.rs + runtime_threads.rs + tui/ui.rs — surface NetworkDenied
- crates/config/src/lib.rs + tui/src/config.rs + config.example.toml — [network] schema
- Subdomain-prefix matching with deny-wins precedence
- Audit format: <RFC3339> network <host> <tool> <decision>

# Conflicts:
#	crates/tui/src/config.rs
2026-04-28 00:07:42 -05:00
Hunter Bown a054789f79 Merge branch 'feat/v070-osc9' (#132 OSC 9 desktop notification on long-turn completion)
- crates/tui/src/tui/notifications.rs (NEW) — Method enum {Auto/Osc9/Bel/Off}, notify_done()
- crates/tui/src/tui/mod.rs + ui.rs — register module + hook EngineEvent::TurnComplete
- crates/tui/src/config.rs — NotificationsConfig (method/threshold_secs/include_summary)
- config.example.toml — [notifications] section
- 9 unit tests including tmux DCS passthrough wrapping
2026-04-28 00:06:26 -05:00
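
For context on the OSC 9 merge above: an OSC 9 notification is `ESC ] 9 ; <message> BEL`, and inside tmux it has to be wrapped in a DCS passthrough with every ESC doubled. A minimal sketch; `osc9` and `wrap_for_tmux` are hypothetical helper names rather than the functions in `notifications.rs`, and stripping control characters from the message is an added precaution.

```rust
/// Build an OSC 9 notification sequence for terminals that support it.
fn osc9(message: &str) -> String {
    // Drop control characters so the message cannot end the sequence early.
    let clean: String = message.chars().filter(|c| !c.is_control()).collect();
    format!("\x1b]9;{clean}\x07")
}

/// Wrap a sequence in tmux DCS passthrough: `ESC P tmux;` + payload with
/// each ESC doubled + `ESC \`.
fn wrap_for_tmux(sequence: &str) -> String {
    format!("\x1bPtmux;{}\x1b\\", sequence.replace('\x1b', "\x1b\x1b"))
}
```

A caller would print the wrapped form only when running under tmux (for example when the `TMUX` environment variable is set) and the plain form otherwise.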
Hunter Bown ff21d2268d Merge branch 'feat/v070-arity' (#131 bash-command arity for safer auto_allow)
- crates/tui/src/command_safety.rs — port opencode arity dictionary (160+ entries) + classify_command()
- crates/tui/src/tools/approval_cache.rs — cache key uses canonical prefix
- config.example.toml — auto_allow prefix-semantics docblock
- 41 colocated unit tests
2026-04-28 00:06:21 -05:00
Hunter Bown 7b6993fecb Merge branch 'feat/v070-altv' (#138 Alt+V chord for tool-details pager)
- crates/tui/src/tui/ui.rs — new Alt+V/Alt+Shift+V arm next to Alt+A/Y/P family, no empty-input gate
- crates/tui/src/tui/history.rs — 5 hint strings + 4 test assertions updated to "Alt+V for details"
- crates/tui/src/tui/keybindings.rs — entry under Submission so ? overlay lists it
- Bare 'v' handler unchanged (legacy muscle memory)
2026-04-28 00:06:16 -05:00
Hunter Bown d3dd8a590f Merge branch 'feat/v070-keyring' (#134 OS keyring credential store)
- crates/secrets/ (NEW crate) — KeyringStore trait + Default/InMemory/File backends
- crates/config/src/lib.rs — api_key resolution via Secrets::auto_detect (CLI → keyring → env → config-file)
- crates/cli/src/main.rs — auth set/get/clear/migrate/list subcommands
- crates/tui/src/config.rs + main.rs — wire keyring resolver, doctor reports backend
- Lockfile updated for keyring 3.6 (apple/windows/linux native features)
- One-time deprecation warning when api_key is read from config.toml
2026-04-28 00:06:09 -05:00
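
The resolution order named in the keyring merge above (CLI, then keyring, then environment, then config file) is a first-match search. A minimal sketch, assuming each source has already been read into an `Option<&str>`; `KeySource` and `resolve_api_key` are hypothetical names, and skipping blank values is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum KeySource {
    Cli,
    Keyring,
    Env,
    ConfigFile,
}

/// Return the first non-blank key together with where it came from.
fn resolve_api_key(
    cli: Option<&str>,
    keyring: Option<&str>,
    env: Option<&str>,
    config_file: Option<&str>,
) -> Option<(KeySource, String)> {
    let candidates = [
        (KeySource::Cli, cli),
        (KeySource::Keyring, keyring),
        (KeySource::Env, env),
        (KeySource::ConfigFile, config_file),
    ];
    for (source, value) in candidates.iter().copied() {
        if let Some(key) = value {
            if !key.trim().is_empty() {
                return Some((source, key.to_string()));
            }
        }
    }
    None
}
```

Returning the source lets the caller emit the one-time deprecation warning when the result is `KeySource::ConfigFile`.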
Hunter Bown f36e891117 Merge branch 'feat/v070-skill-creator' (#139 bundled skill-creator system skill)
- crates/tui/assets/skills/skill-creator/SKILL.md (NEW) — MIT-attributed port from codex
- crates/tui/src/skills.rs → skills/mod.rs + skills/system.rs (NEW) — install_system_skills() with version-marker idempotence
- crates/tui/src/main.rs — startup auto-install hook (non-fatal on error)
- crates/tui/src/commands/skills.rs — /skill new alias for /skill skill-creator
- 7 unit tests covering fresh install, idempotence, user-deleted-dir respect, version-bump reinstall
2026-04-28 00:06:03 -05:00
Hunter Bown 9db841fc62 test(tui): #69 integration tests for mock LLM client + record fixtures
Adds `integration_mock_llm.rs` covering the LlmClient trait surface:

- streaming turn loop (text deltas + finish reason)
- reasoning-content replay across tool-call rounds (V4 §5.1.1, the
  HTTP 400 path that broke v0.4.9-v0.5.1)
- tool-call round-trip with chunked input JSON
- multiple tool calls in one turn preserve event ordering
- compaction-style non-streaming `create_message`
- sub-agent style independent parent/child mocks
- capacity-gate observation of a captured request

Four full-engine tests are `#[ignore]`-marked as BLOCKED on the engine
refactor from concrete `Option<DeepSeekClient>` to `Arc<dyn LlmClient>`.
Once that wiring lands the ignored tests light up with no mock changes.

Adds:
- `tests/support/llm_client.rs` mirrors the trait so the mock can be
  brought into the integration test via `#[path]` without dragging in
  the rest of the binary's module tree
- `tests/fixtures/.gitkeep` so the `eval --record` output directory
  rides the repo
- `tests/README.md` documents both the trait-level mocking strategy
  and the `--record` fixture flow
- `record_flag_writes_one_jsonl_line_per_step` in `eval_harness.rs`
  exercises the new `--record` flag end-to-end

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:03:18 -05:00
Hunter Bown 0626bc80c0 feat(test): #69 mock LLM client + eval --record fixture flag
Adds a queue-driven `MockLlmClient` that implements the `LlmClient` trait
by replaying canned per-turn `StreamEvent` vectors and capturing every
outgoing `MessageRequest`. The mock lives at the trait boundary so it
stays decoupled from the concrete reqwest plumbing inside `DeepSeekClient`,
and surfaces builders (`canned::*`) for the common event shapes (text
delta, thinking delta, tool_use start, input JSON delta, message delta).

Wires a new `--record <DIR>` flag into `deepseek eval` that appends one
JSON Lines fixture line per step to `<DIR>/<scenario>.jsonl`. The format
is documented at the top of `eval.rs` and is the storage shape the mock
will replay from.

`crates/tui/src/llm_client.rs` becomes `crates/tui/src/llm_client/mod.rs`
to host the new submodule cleanly. The trait shape is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 00:03:00 -05:00
Hunter Bown f82f162e7f feat(network): #135 gate fetch_url, web_search, MCP HTTP via policy
Threads the optional `NetworkPolicyDecider` from `EngineConfig` through to
`ToolContext.network_policy` and `McpPool::with_network_policy`. Each gate
point follows the same pattern: extract the host, call `decider.evaluate`,
then `Allow` proceeds, `Deny` returns a structured permission-denied error,
and `Prompt` falls through to the same denial with a hint pointing to
`/network allow <host>` (full modal flow lands in a follow-up).

* `fetch_url` — gates on the parsed URL host.
* `web_search` — gates DuckDuckGo (`html.duckduckgo.com`) and the Bing
  fallback (`www.bing.com`) independently so a deny on one engine doesn't
  silently let the other through.
* MCP — only the HTTP/SSE transport is gated; STDIO MCP servers are
  unaffected. `McpConnection::connect_with_policy` replaces the old
  `connect` (no external callers existed).

The session cache short-circuits `evaluate` once a host is approved, so
the existing `approve_session` hook is enough to wire the prompt-once
flow when the approval modal lands.

`NetworkPolicyDecider::with_default_audit` materializes the auditor at
`~/.deepseek/audit.log` when the config has `audit = true`.

Includes one tool-level test asserting `fetch_url` denies a blocked host
through the policy gate.
2026-04-28 00:02:56 -05:00
Hunter Bown abbb86cdd2 feat(network): #135 add [network] config schema for policy
Adds the `[network]` table to both the workspace config crate (`ConfigToml`)
and the live tui config (`Config`), plus a documented example block in
`config.example.toml`. Schema:

```toml
[network]
default = "prompt"      # allow | deny | prompt
allow = ["api.deepseek.com", "github.com"]
deny = []
audit = true
```

`NetworkPolicyToml::into_runtime()` builds a runtime `NetworkPolicy` so the
engine can construct a `NetworkPolicyDecider` without reaching across crate
boundaries. Defaults preserve pre-v0.7.0 behavior: when the section is
absent, no policy is enforced.
2026-04-28 00:02:34 -05:00
Hunter Bown 45727a09f7 feat(network): #135 add per-domain network policy module
Introduces `network_policy::{Decision, NetworkPolicy, NetworkPolicyDecider,
NetworkAuditor, NetworkSessionCache, NetworkDenied}` for gating outbound
network calls.

Deny-wins precedence: a host listed in both `allow` and `deny` is denied.
Subdomain wildcard via leading-dot entries (`.example.com` matches
`api.example.com` but not the apex). Audit log writes one plaintext line
per terminal decision to `~/.deepseek/audit.log` in the format
`<RFC3339> network <host> <tool> <Allow|Deny|Prompt-Approved|Prompt-Denied>`.

Approve-once-for-session caching is implemented in `NetworkSessionCache`;
`approve_persistent` mutates the policy's allow list so callers can write
back to config later.

19 unit tests cover deny-wins precedence, subdomain matching, audit
logging, session-cache short-circuit, and `NetworkDenied` shape.
2026-04-28 00:02:27 -05:00
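
The matching rules in 45727a09f7 (deny wins, leading-dot entries match subdomains but not the apex) can be sketched as follows. The type and variant names echo the commit message, but the fields, the `evaluate` signature, case-insensitive comparison, and exact matching for entries without a leading dot are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Decision {
    Allow,
    Deny,
    Prompt,
}

struct NetworkPolicy {
    default: Decision,
    allow: Vec<String>,
    deny: Vec<String>,
}

/// `.example.com` matches any subdomain of example.com but not the apex;
/// an entry without a leading dot matches that exact host only.
fn entry_matches(entry: &str, host: &str) -> bool {
    let entry = entry.to_ascii_lowercase();
    let host = host.to_ascii_lowercase();
    if entry.starts_with('.') {
        host.len() > entry.len() && host.ends_with(&entry)
    } else {
        host == entry
    }
}

impl NetworkPolicy {
    fn evaluate(&self, host: &str) -> Decision {
        // Deny is checked first, so a host in both lists is denied.
        if self.deny.iter().any(|entry| entry_matches(entry, host)) {
            Decision::Deny
        } else if self.allow.iter().any(|entry| entry_matches(entry, host)) {
            Decision::Allow
        } else {
            self.default
        }
    }
}
```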