The role taxonomy expansion in #404 added Implementer + Verifier as
distinct postures alongside General / Explore / Plan / Review /
Custom. The issue body explicitly lists \`docs/AGENTS.md or
docs/SUBAGENTS.md\` as a target file; this commit creates that file.
Coverage:
- Role taxonomy table — stance, write/shell access, typical use per
role.
- "When to pick which role" — narrative guidance the model can read
if the role choice isn't obvious.
- Alias map — every accepted spelling routed to a canonical role,
matching what \`SubAgentType::from_str\` accepts.
- Concurrency cap — the 10-by-default value, the
\`[subagents].max_concurrent\` knob, and the running-only
semantics (#509).
- Lifecycle — Pending → Running → terminal states, plus
\`Interrupted\` after a process restart.
- Session boundaries (#405) — \`session_boot_id\` mechanics,
default current-session filter, \`include_archived=true\` escape
hatch, pre-#405 record handling.
- Output contract — the SUMMARY/CHANGES/EVIDENCE/RISKS/BLOCKERS
format every sub-agent must produce.
- Memory + \`remember\` integration (#489) — sub-agents inherit the
parent's memory file when memory is enabled and can append durable
notes.
- Implementation notes — source path, persisted state file,
is_running semantics, RwLock pattern.
Cross-link added in \`docs/TOOL_SURFACE.md\` so the sub-agent section
points to this doc.
No Rust code changed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The project-config overlay (`<workspace>/.deepseek/config.toml` merged
on top of the user's global `~/.deepseek/config.toml`) was already
wired but only carried four string fields: model, api_key, base_url,
reasoning_effort. The use cases users actually file under #485 — "this
repo wants a different sandbox / approval policy / MCP server set / hard
sub-agent cap" — weren't covered.
### What ships
Adds the following keys to the project overlay, all merged with
identical "non-empty wins" semantics for strings:
- `provider` — pick a different backend per repo (e.g. `nvidia-nim` for
an enterprise repo, `deepseek-cn` for a CN-team repo).
- `approval_policy` — `never` / `on-request` / `untrusted` for repos
with strict policies.
- `sandbox_mode` — `read-only` / `workspace-write` / `danger-full-access`.
- `mcp_config_path` — per-repo MCP server set without touching the
user's global file.
- `notes_path` — keep notes in-repo for projects where the notes tool
is part of the dev workflow.
Plus two non-string fields:
- `max_subagents` (positive integer; clamped to `1..=MAX_SUBAGENTS=20`).
- `allow_shell` (bool).
### What stays user-global
`skills_dir`, `hooks`, `[capacity]`, `[retry]`, `[memory]`, etc. — those
are user-shaped settings, not repo-shaped. If a future use case
demands per-project values for any of them, a follow-up PR can extend
the overlay rather than letting the boundary blur.
### Tests
- 8 new tests in `project_config_tests` covering: provider+model,
approval+sandbox, max_subagents+allow_shell, max_subagents
clamping, negative-max_subagents rejection, missing config file
pass-through, malformed TOML pass-through, and empty-string
no-op.
### Docs
- New "Per-project overlay (#485)" section in `docs/CONFIGURATION.md`
with a table of supported keys and the rationale for which fields
stay user-global.
### Verification
cargo fmt --all -- --check ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓
cargo test --workspace --all-features --locked ✓ 1828 + supporting (was 1820)
Closes#485
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bundles the v0.8.8 stabilization fixes that were already implemented in the
working tree, plus the workflow/doc reconciliation called out in #507.
### Sub-agent runtime fixes
- **#509** Default sub-agent cap raised to 10 (configurable via
`[subagents].max_concurrent` in `config.toml`, hard ceiling 20). The
running-count calculation now ignores non-running, no-handle, and finished
handles so completed agents stop counting against the cap.
- **#510** `SharedSubAgentManager` is now `Arc<RwLock<...>>`; the read paths
that previously held a `Mutex` for inspection now take a read lock,
eliminating the multi-agent fan-out UI freeze.
- **#511** `compact_tool_result_for_context` summarizes `agent_result` /
`agent_wait` payloads before they are folded into the parent context.
- **#512** RLM tool cards map to `ToolFamily::Rlm` and render `rlm`, not
`swarm`. Stale "swarm" wording cleaned in docs/comments/tests.
- **#513** (foreground stopgap only) Foreground RLM work is visible in the
Agents sidebar projection. Full async RLM lifecycle remains v0.8.9 — the
issue stays open with a refined scope.
### TUI / UX fixes
- **#487** Offline composer queue is now session-scoped; legacy unscoped
queues fail closed.
- **#488** Composer Option+Backspace deletes by word; cross-platform key
routing helpers added.
- **#443/#444** Keyboard enhancement flags pop on normal AND panic exit; the
raw-mode startup probe is now bounded by a configurable timeout.
- **#449** Production footer reads statusline colors from `app.ui_theme`
rather than the bespoke palette.
- **#506** `display_path_with_home` no longer mutates `HOME` in tests; the
flake on shared-env CI is gone.
### Self-update / packaging
- **#503** `update.rs` arch mapping uses release-asset naming (`arm64`/`x64`)
instead of the raw Rust constants. The platform-asset selector also rejects
`.sha256` siblings as primary binaries. Tests now live alongside the source
in `mod tests` (the `#[path]`-based integration test was removed because it
duplicated test runs and forced a `pub(crate)` helper that no real caller
used).
- **`Max 5 in flight` wording updated** in `agent_spawn` description,
`prompts/base.md`, and `docs/TOOL_SURFACE.md` so the model sees the real
default cap (10) and the configuration knob name.
### CI / release docs (#507)
- Pruned three duplicated/dead workflows: `crates-publish.yml`, `parity.yml`,
`publish-npm.yml`. Their gates already run in `ci.yml` for every push/PR.
- `release.yml` build job now allows `parity` to be skipped (it only runs on
tag push), unblocking `workflow_dispatch` reruns. The job still fails
closed on a real parity failure.
- `RELEASE_RUNBOOK.md` reconciled: crate publishing is documented as the
manual `scripts/release/publish-crates.sh` flow (no automated workflow);
references to the deleted workflows removed.
- `CLAUDE.md` notes the `RELEASE_TAG_PAT` requirement for the auto-tag →
release.yml chain (without it, the tag is created but `release.yml` does
not fire) and documents the `workflow_dispatch` parity-skip behavior.
### Docs
- `docs/COMPETITIVE_ANALYSIS.md` added — capability matrix vs OpenCode and
Codex CLI, gap analysis, and recommended implementation order.
### Verification (this branch)
- `cargo fmt --all -- --check` ✓
- `cargo check --workspace --all-targets --locked` ✓
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓
- `cargo test --workspace --all-features --locked` ✓ (1809 + supporting)
- Parity gates ✓ (snapshot, parity_protocol, parity_state)
- `cargo build --release --locked -p deepseek-tui-cli -p deepseek-tui` ✓
- Lockfile drift guard ✓
- `deepseek doctor --json` clean
- `deepseek eval` (offline harness) success=true, 0 tool errors
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merge the v0.8.6 feature batch and release hardening.\n\nIncludes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes.\n\nRemote PR checks passed on the final head before merge.
* wip(v0.7.7): handoff baseline of partial sub-agent stabilization
Captures uncommitted work-in-progress on the v0.7.7 stabilization lane
so subsequent fixes have a stable starting point. Subsequent commits
finish the canonical SubAgentJob/SwarmJob model, fix sidebar/transcript/
footer agreement, copy/paste/cancel contract, checklist rendering, shell
summary preservation, monotonic spend, and version provenance.
Refs #235#236#237#238#239#240#241#242#243#244#245
* release: bump workspace version to 0.7.7 (#245)
Refs #245
* fix(v0.7.7): canonical swarm card binding, monotonic spend, checklist + shell summary
- Add `swarm_card_index: HashMap<swarm_id, history_index>` so overlapping
fanouts each project to their own FanoutCard. Eliminates the screenshot
contradiction where a stale background swarm's progress clobbered a
newer card (#236, #238).
- Suppress fanout-class tools (`agent_swarm`, `spawn_agents_on_csv`,
`rlm`, `agent_spawn`) from `active_tool_status_label` so the footer no
longer reports "tool agent_swarm · 1 active" while sidebar+card show
the actual worker counts (#236, #238).
- Add `App::displayed_session_cost` + `displayed_cost_high_water` so the
visible session+sub-agent total is monotonic across reconciliation
events (cache discounts, provisional → final). New tests: monotonicity
under negative reconciliation; duplicate dedup keeps display steady (#244).
- Preserve high-signal summary lines from the truncated tail of shell
output: `test result:`, `failures:`, `error[E…]`, `Finished`,
`Compiling`, panic markers. Stops the agent re-running cargo gates
just to see pass/fail under truncation (#242).
- Render `checklist_write` / `todo_*` results as a purpose-built
checklist card with completed/total + percent header, per-item status
markers, and a collapsing affordance for long lists. Plumbed through
the existing `GenericToolCell` so no new variant threading is needed (#241).
Refs #236#238#241#242#244
* fix(v0.7.7): Esc clears active tool entries optimistically (#243)
When Esc cancels the foreground turn we now finalize the active cell
immediately rather than waiting for the engine's TurnComplete echo to
drain. This stops the footer "tool ... · X active" chip from briefly
contradicting the cancelled state, and frees the composer for the next
message.
Background `block:false` swarms are intentionally NOT killed here — they
remain durable, tracked through `swarm_jobs` and `swarm_card_index` so
their FanoutCard updates as workers land. Subsequent `swarm_status` /
`swarm_result` / `swarm_cancel` tool calls see the canonical store.
New focused test verifies: after Esc, `active_cell` is None, the
background swarm record is preserved, and `is_loading` is cleared so
the composer can submit immediately.
Refs #243
* fix(v0.7.7): Windows .exe lookup + post-turn snapshot detach (#247, #234)
#247 — npm-distributed Windows package failed at runtime because the
Rust dispatcher's `delegate_to_tui` / `delegate_simple_tui` looked for a
sibling named exactly "deepseek-tui", while the actual file shipped by
`scripts/install.js` is `deepseek-tui.exe`. Replace both lookups with
`locate_sibling_tui_binary`, which:
- Honours `DEEPSEEK_TUI_BIN` for explicit overrides
- Tries `deepseek-tui{EXE_SUFFIX}` first (`.exe` on Windows, "" elsewhere)
- Falls back to suffix-less `deepseek-tui` on Windows so users who
applied the issue's manual workaround still launch successfully
- Emits a platform-correct error path in the bail message
Tests: `sibling_tui_candidate_picks_platform_correct_name`,
`sibling_tui_candidate_windows_falls_back_to_suffixless` (windows-only),
`locate_sibling_tui_binary_honours_env_override`.
#234 — Detach the post-turn workspace snapshot so `git add -A && git
commit` no longer pins the engine loop after `Event::TurnComplete`.
The snapshot still runs on `tokio::task::spawn_blocking`, but the
engine no longer awaits its `JoinHandle`, so the UI accepts input
(text, copy, paste, selection) without waiting for the bookkeeping to
finish. Cycle advance and pre-turn snapshot remain awaited — they are
correctness-sensitive and the cycle path already emits a status chip
("↻ context refreshing…") so the user has visible feedback.
Refs #234#247
* chore(v0.7.7): bump npm package version 0.7.6 → 0.7.7
Required by `scripts/release/check-versions.sh` ("Version drift" CI
gate); the workspace was bumped to 0.7.7 but `npm/deepseek-tui/package.json`
still reported 0.7.6, blocking PR #246 from going green.
Refs #245
- Bump workspace version to 0.7.6 (Cargo.toml + all crate internal dep pins)
- Bump npm wrapper version and deepseekBinaryVersion to 0.7.6
- Add v0.7.6 changelog entry: localization, paste burst, history search,
pending input preview, grouped /config editor, searchable help overlay,
Alt+↑ edit-last-queued, composer attachment management
- Update README with v0.7.6 features (localization, paste, history search)
- Archive v0.7.5 implementation plan to docs/archive/
- Update Cargo.lock
Issues #202, #203, #204, #205:
- Cycle/seam triggers use active request input size + response
headroom reserve, not lifetime cumulative API usage.
- V4 hard-cycle headroom calibrated around fixed TURN_MAX_OUTPUT_TOKENS
plus CONTEXT_HEADROOM_TOKENS safety buffer.
- /tokens, /cost, footer/header labels, and docs now separate
active context, turn telemetry, cumulative usage, cache hit/miss,
context percent, and cost.
- Foreground exec_shell timeout output tells the model the process
was killed and suggests task_shell_start or background exec_shell
plus poll/wait.
- Added regression tests for active-token basis, V4 headroom,
seam trigger basis, footer label behavior, and shell timeout
recovery metadata.
- Preserved #200/#201 policy: V4 default is append-only,
prefix-cache preserving; replacement compaction, Flash seams,
and capacity intervention remain opt-in.
- Remove the `publish-npm` job from `release.yml`. It has been failing on
every release with `npm error code EOTP` because the configured `NPM_TOKEN`
doesn't bypass 2FA. Manual publish from a developer machine is the actual
ship path; codify that.
- Update `docs/RELEASE_RUNBOOK.md` "npm Wrapper Release" to describe the
manual flow (`npm publish --access public` + OTP) and explain why the auto
path is gone, with a recovery note for future Trusted-Publishing migration.
- Refresh stale cross-reference comment in `publish-npm.yml` (the workflow
remains as inert plumbing for an eventual Trusted Publishing setup).
- Stop tracking `docs/DeepSeek_V4.pdf` (4.4 MB). It was never referenced
outside test fixture filenames; the tests synthesize their own fake PDF.
Add to `.gitignore` so a local copy can sit there without nagging.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `wrap_with_deprecation_notice` helper in the subagent module that
merges a `_deprecation` block into a ToolResult's metadata. Applied
exclusively on alias invocations:
- `spawn_agent` → use `agent_spawn` (removed in v0.8.0)
- `delegate_to_agent` → use `agent_spawn` (removed in v0.8.0)
- `close_agent` → use `agent_cancel` (removed in v0.8.0)
- `send_input` → use `agent_send_input` (removed in v0.8.0)
Canonical names are unaffected. Each alias invocation also emits a
`tracing::warn` so the deprecation appears in audit logs. Documents
the deprecation schedule in `docs/TOOL_SURFACE.md`. Four unit tests
verify the notice shape and that canonical tools stay clean.
Refs #72
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removed:
- `.claude/next-agent-prompt.md` (111 lines) — v0.4.6-era session prompt
describing slices A/B/C that have all shipped. Successive sessions use
fresh prompts (e.g. .deepseek/v0.6.0-overnight-review.md); this one is
pure history.
- `docs/archive/workspace_migration_status.md` (92 lines) — explicitly
archived (April 11), describes a one-time migration that's complete.
Removed enclosing `docs/archive/` directory too (was the only file).
CHANGELOG entry from v0.4.x still narrates the archival as history.
- `docs/parity_release_and_ci.md` (38 lines) — duplicates what
`.github/workflows/parity.yml` and CONTRIBUTING.md already say
authoritatively. Single source of truth wins.
- `AI_HANDOFF.md` + `todo.md` (untracked, no commit needed) — `todo.md`
was a 7-line pointer to AI_HANDOFF.md, which itself was an April 11
snapshot listing "remaining work" that's mostly delivered. CLAUDE.md
is the live developer guide now.
1004/1004 tests still green; no doc/code references broken.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a structured rlm_query tool for parallel/batched LLM fan-out.
The model calls it with one prompt or up to 16 concurrent prompts;
children dispatch via tokio::join_all against the existing DeepSeek
client. Default child model is deepseek-v4-flash; override per-call
via the model field. Available in Plan / Agent / YOLO. Cost folds
into the session's running total automatically.
Fixes scroll-stuck regression (#56): TranscriptScroll::resolve_top
and scrolled_by now use a three-level fallback chain (same line →
same cell line 0 → nearest cell at-or-before) instead of teleporting
to ToBottom when an anchor cell vanishes.
Loosens command-safety chains (#57): cargo build && cargo test and
similar chains of known-safe commands now escalate to RequiresApproval
instead of being hard-blocked as Dangerous. Chains containing unknown
commands still block.
Suppresses the GettingCrowded footer chip — context-percent header
already covers conversation pressure.
Refactors:
- Extracts file_mention parsing/completion/expansion (~450 LOC) from
the 5,500-line ui.rs into crates/tui/src/tui/file_mention.rs.
- Deletes truly unused helpers (write_bytes, timestamped_filename,
extension_from_url, output_path, has_project_doc, primary_doc_path).
Tests: 853 pass. cargo clippy --workspace -D warnings clean.
cargo fmt --all -- --check clean.
Closes#46#47#48#49#50#53#54#55#56#57#58.
Captures the full RLM-fundamental story across the design doc, MODES.md,
and the Hetun prompt. Tracking issues are now #46–#55 (helper layer
filed as #53, Hetun as #54, vendoring as #55).
What this nails down:
- **Hetun mode** is added at the END of the Tab cycle (Plan → Agent →
YOLO → Hetun → Plan), not as a Plan replacement. Default landing mode
is unchanged so people don't accidentally start there. Plan stays as
it is.
- **Mission-level approval, not block-level.** Hetun runs a research
phase, presents one mission card, and only executes after explicit
user approval. Inside the execution turn the repl block runs straight
through with no per-block prompts — that's the whole point of the
mode.
- **The user's configured model is left alone on enter/exit.** Pro/max
users stay on Pro/max. The flash-as-coordinator behaviour is internal
to the runtime (ZIGRLM_RLM_CMD always points to flash regardless of
mode). No global model swap.
- **No /hetun slash command.** Tab cycles into the mode; /plan keeps
switching to Plan as today.
- **The helper layer (#53) is fundamental, not aleph-derived.** A
curated ~20-function ctx-helper module + AST-validated Python sandbox
baked into the repl runtime so a single block can load → slice → fan
out flash queries → aggregate without crossing tool boundaries.
Inspired by aleph's pattern but our own native primitive — not a port.
- **Hetun research methodology adopts Sakana's Fugu patterns.** The
research phase is recursive novelty sampling + hierarchical narrative
tree synthesis + multi-detector cross-verification (flash for
breadth, Pro for depth) + hypothesis-verification loop. Not "fan out
8 fixed queries". This is what makes "Plan + Recursive Agents"
meaningful versus a flash-coordinator wrapper.
- **No version-number framing anywhere.** The plan ships as one cohesive
RLM landing across #46/#48/#49/#50/#53/#54/#55 — order is dependency,
not release schedule. We keep shipping.
- **Auto-compaction stays automatic.** Removed a manual /compact nag
from the Hetun prompt; the existing coherence + capacity system
already handles this.
Files:
docs/rlm-design.md new — full design doc with Hetun details
docs/research-react-vs-rlm.md new — supporting research treatment
docs/MODES.md 4-mode cycle, Hetun added at end, Plan kept
crates/tui/src/prompts/hetun.txt prompt teaching the recursive-novelty
+ hierarchical-synthesis + verification-loop
rhythm, mission-card structure, two-step gate
.gitignore ignore .claude/scheduled_tasks.lock runtime
Closes nothing yet — implementation lands across the tracking issues.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tool-surface audit pass:
- FILE OPERATIONS rewritten so each line states the niche, not just the
verb. read_file mentions PDF auto-extraction + `pages` slicing.
- New SEARCH section consolidates grep_files / file_search / web_search /
fetch_url so the model sees them next to each other and picks the
right one. fetch_url (#33) added; previously absent from the prompt.
- request_user_input pulled out of FILE OPERATIONS into its own USER
section — it never belonged there.
- SUB-AGENTS list shrinks by 3: drops `spawn_agent` (use `agent_spawn`),
`close_agent` (use `agent_cancel`), and the `agent_assign /
assign_agent` dual-name. The underlying dispatchers still resolve those
names, so existing sessions don't break — they just no longer
pollute the model's tool list.
Adds `docs/TOOL_SURFACE.md` with the rationale, the v0.5.1 final
surface, and the dropped aliases. Calls out that grep_files is pure-Rust
(no rg/grep shell-out, so the "fall back to grep" AC from #35 is
vacuously satisfied — the tool has no shell dependency to fall back from).
Closes#35.
Adds scripts/release/check-versions.sh and a `versions` CI job that runs
on every push/PR. Verifies:
- no per-crate Cargo.toml carries a literal version (must inherit the
workspace version)
- npm/deepseek-tui/package.json matches the workspace version
- Cargo.lock is in sync with the manifests
Closes#31.
Adds a compact `setup --status` view, a `setup --clean` for regenerable
session checkpoints, and `--tools`/`--plugins` scaffolding for
~/.deepseek/{tools,plugins} so the extension model has a documented home
that doctor can count. `doctor --json` lands as a CI-safe alternative to
the human-readable doctor (skips the live API probe).
Also locks down the engine's hostility to fake tool-call wrappers:
filter_tool_call_delta and the marker constants are now testable, the
streaming loop emits one compact status notice per turn when it strips
a wrapper, and a new protocol_recovery integration test asserts that
the legacy text parser never turns <function_calls> into a real tool
call. Adds 23 unit tests + 14 integration tests covering both slices.
Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.
Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.
Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
Add Yahoo Finance quote tool with chart fallback, redesign header widget
with proportional truncation and context bar, refactor footer status strip,
expand test suite to 680+ tests, and fix blocking issues (usize underflow
in header, tempdir leak in finance tests, per-call HTTP client creation).
Keep legacy /normal and settings fallback behavior mapped to Agent, align docs around the three visible modes, and include the current TUI and onboarding refinements in this worktree.
- Move src/* into crates/tui/src/ to create a proper workspace structure
- Add .claude/ and .trimtab/ directories for Trimtab closed-loop workflow
- Add DEPENDENCY_GRAPH.md and update documentation
- Update Cargo.toml files to reflect new crate dependencies
- Update CI workflows and npm package scripts
- All tests pass, release build works
- Convert root to Cargo workspace with crates/ layout
- Add deepseek-* crates mirroring Codex architecture
- Add parity CI workflow with snapshot/protocol/state tests
- Update release workflow to build both deepseek and deepseek-tui binaries
- Bump version to 0.3.28