codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
Hunter Bown	d129ab4150	docs: SUBAGENTS.md — role taxonomy, lifecycle, output contract (#404 ) The role taxonomy expansion in #404 added Implementer + Verifier as distinct postures alongside General / Explore / Plan / Review / Custom. The issue body explicitly lists \`docs/AGENTS.md or docs/SUBAGENTS.md\` as a target file; this commit creates that file. Coverage: - Role taxonomy table — stance, write/shell access, typical use per role. - "When to pick which role" — narrative guidance the model can read if the role choice isn't obvious. - Alias map — every accepted spelling routed to a canonical role, matching what \`SubAgentType::from_str\` accepts. - Concurrency cap — the 10-by-default value, the \`[subagents].max_concurrent\` knob, and the running-only semantics (#509). - Lifecycle — Pending → Running → terminal states, plus \`Interrupted\` after a process restart. - Session boundaries (#405) — \`session_boot_id\` mechanics, default current-session filter, \`include_archived=true\` escape hatch, pre-#405 record handling. - Output contract — the SUMMARY/CHANGES/EVIDENCE/RISKS/BLOCKERS format every sub-agent must produce. - Memory + \`remember\` integration (#489) — sub-agents inherit the parent's memory file when memory is enabled and can append durable notes. - Implementation notes — source path, persisted state file, is_running semantics, RwLock pattern. Cross-link added in \`docs/TOOL_SURFACE.md\` so the sub-agent section points to this doc. No Rust code changed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 04:09:27 -05:00
Hunter Bown	4d4a9b424c	feat(config): expand per-project overlay to cover provider, sandbox, approval, mcp_path, max_subagents, allow_shell (#485 ) The project-config overlay (`<workspace>/.deepseek/config.toml` merged on top of the user's global `~/.deepseek/config.toml`) was already wired but only carried four string fields: model, api_key, base_url, reasoning_effort. The use cases users actually file under #485 — "this repo wants a different sandbox / approval policy / MCP server set / hard sub-agent cap" — weren't covered. ### What ships Adds the following keys to the project overlay, all merged with identical "non-empty wins" semantics for strings: - `provider` — pick a different backend per repo (e.g. `nvidia-nim` for an enterprise repo, `deepseek-cn` for a CN-team repo). - `approval_policy` — `never` / `on-request` / `untrusted` for repos with strict policies. - `sandbox_mode` — `read-only` / `workspace-write` / `danger-full-access`. - `mcp_config_path` — per-repo MCP server set without touching the user's global file. - `notes_path` — keep notes in-repo for projects where the notes tool is part of the dev workflow. Plus two non-string fields: - `max_subagents` (positive integer; clamped to `1..=MAX_SUBAGENTS=20`). - `allow_shell` (bool). ### What stays user-global `skills_dir`, `hooks`, `[capacity]`, `[retry]`, `[memory]`, etc. — those are user-shaped settings, not repo-shaped. If a future use case demands per-project values for any of them, a follow-up PR can extend the overlay rather than letting the boundary blur. ### Tests - 8 new tests in `project_config_tests` covering: provider+model, approval+sandbox, max_subagents+allow_shell, max_subagents clamping, negative-max_subagents rejection, missing config file pass-through, malformed TOML pass-through, and empty-string no-op. ### Docs - New "Per-project overlay (#485)" section in `docs/CONFIGURATION.md` with a table of supported keys and the rationale for which fields stay user-global. ### Verification cargo fmt --all -- --check ✓ cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓ cargo test --workspace --all-features --locked ✓ 1828 + supporting (was 1820) Closes #485 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 03:25:43 -05:00
Hunter Bown	ad8064b143	chore(v0.8.8): stabilization batch — sub-agent caps, mutex contention, RLM polish, CI cleanup Bundles the v0.8.8 stabilization fixes that were already implemented in the working tree, plus the workflow/doc reconciliation called out in #507. ### Sub-agent runtime fixes - #509 Default sub-agent cap raised to 10 (configurable via `[subagents].max_concurrent` in `config.toml`, hard ceiling 20). The running-count calculation now ignores non-running, no-handle, and finished handles so completed agents stop counting against the cap. - #510 `SharedSubAgentManager` is now `Arc<RwLock<...>>`; the read paths that previously held a `Mutex` for inspection now take a read lock, eliminating the multi-agent fan-out UI freeze. - #511 `compact_tool_result_for_context` summarizes `agent_result` / `agent_wait` payloads before they are folded into the parent context. - #512 RLM tool cards map to `ToolFamily::Rlm` and render `rlm`, not `swarm`. Stale "swarm" wording cleaned in docs/comments/tests. - #513 (foreground stopgap only) Foreground RLM work is visible in the Agents sidebar projection. Full async RLM lifecycle remains v0.8.9 — the issue stays open with a refined scope. ### TUI / UX fixes - #487 Offline composer queue is now session-scoped; legacy unscoped queues fail closed. - #488 Composer Option+Backspace deletes by word; cross-platform key routing helpers added. - #443/#444 Keyboard enhancement flags pop on normal AND panic exit; the raw-mode startup probe is now bounded by a configurable timeout. - #449 Production footer reads statusline colors from `app.ui_theme` rather than the bespoke palette. - #506 `display_path_with_home` no longer mutates `HOME` in tests; the flake on shared-env CI is gone. ### Self-update / packaging - #503 `update.rs` arch mapping uses release-asset naming (`arm64`/`x64`) instead of the raw Rust constants. The platform-asset selector also rejects `.sha256` siblings as primary binaries. Tests now live alongside the source in `mod tests` (the `#[path]`-based integration test was removed because it duplicated test runs and forced a `pub(crate)` helper that no real caller used). - `Max 5 in flight` wording updated in `agent_spawn` description, `prompts/base.md`, and `docs/TOOL_SURFACE.md` so the model sees the real default cap (10) and the configuration knob name. ### CI / release docs (#507) - Pruned three duplicated/dead workflows: `crates-publish.yml`, `parity.yml`, `publish-npm.yml`. Their gates already run in `ci.yml` for every push/PR. - `release.yml` build job now allows `parity` to be skipped (it only runs on tag push), unblocking `workflow_dispatch` reruns. The job still fails closed on a real parity failure. - `RELEASE_RUNBOOK.md` reconciled: crate publishing is documented as the manual `scripts/release/publish-crates.sh` flow (no automated workflow); references to the deleted workflows removed. - `CLAUDE.md` notes the `RELEASE_TAG_PAT` requirement for the auto-tag → release.yml chain (without it, the tag is created but `release.yml` does not fire) and documents the `workflow_dispatch` parity-skip behavior. ### Docs - `docs/COMPETITIVE_ANALYSIS.md` added — capability matrix vs OpenCode and Codex CLI, gap analysis, and recommended implementation order. ### Verification (this branch) - `cargo fmt --all -- --check` ✓ - `cargo check --workspace --all-targets --locked` ✓ - `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓ - `cargo test --workspace --all-features --locked` ✓ (1809 + supporting) - Parity gates ✓ (snapshot, parity_protocol, parity_state) - `cargo build --release --locked -p deepseek-tui-cli -p deepseek-tui` ✓ - Lockfile drift guard ✓ - `deepseek doctor --json` clean - `deepseek eval` (offline harness) success=true, 0 tool errors Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 01:57:37 -05:00
Hunter Bown	e0b6a1b967	feat(v0.8.8): Don't auto-approve git -C ... (fixes #416 )	2026-05-02 22:32:13 -05:00
Hunter Bown	7125172f67	fix(tui): tighten selection and live task panels	2026-05-02 21:05:15 -05:00
Hunter Bown	5bfc1feb62	v0.8.6: survivability, UX polish, and release hardening Merge the v0.8.6 feature batch and release hardening.\n\nIncludes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes.\n\nRemote PR checks passed on the final head before merge.	2026-05-02 20:11:33 -05:00
Hunter Bown	aa23182674	chore(tools): remove /swarm command + agent_swarm/spawn_agents_on_csv tool surface; park swarm.rs pending #357 cascade (#336 ) Surface removed: /swarm slash command, agent_swarm, spawn_agents_on_csv, swarm_status, swarm_result, swarm_cancel tools, report_agent_job_result. Prompts/docs/tests updated. swarm.rs parked with #![allow(dead_code)] pending the full cascade in #357. RLM prompt audit tracked in #358.	2026-05-02 01:30:23 -05:00
Hunter Bown	997c7f4bcd	chore(release): verify dual registry publish state	2026-05-01 11:06:45 -05:00
Hunter Bown	f00bae3bfb	ci: smoke npm wrapper on windows	2026-05-01 02:37:37 -05:00
Hunter Bown	66a3aed528	feat(#248 ): foreground shell backgrounding, cancel tool, and Ctrl+B TUI shell control - Add exec_shell_cancel tool to cancel one or all running background shell tasks - Add foreground-to-background detach path via ShellManager request_foreground_background() - Add wait_for_shell_delta_cancellable() so exec_shell_wait observes turn cancellation - Add ShellControlView (Ctrl+B) with Background/Cancel options for active foreground commands - Add 'Ctrl+B opens shell controls' hint in transcript for running exec cells - Register exec_shell_cancel in ToolRegistryBuilder::with_shell_tools() - Cancel-token checks in ShellInteractTool poll loop - Update keybinding registry and OPERATIONS_RUNBOOK with Ctrl+B documentation - Update TOOL_SURFACE.md with exec_shell_cancel entry - Update prompts (rlm first-class guidance, AGENTS.md issue-closure policy) - Tests: foreground_background, wait_cancel_leaves_running, cancel_tool_single, cancel_tool_all Closes #248	2026-04-30 18:12:15 -05:00
Hunter Bown	7f2f47edf8	v0.7.7: stabilize sub-agent / swarm / fanout lifecycle, Windows install, and TUI polish (#246 ) * wip(v0.7.7): handoff baseline of partial sub-agent stabilization Captures uncommitted work-in-progress on the v0.7.7 stabilization lane so subsequent fixes have a stable starting point. Subsequent commits finish the canonical SubAgentJob/SwarmJob model, fix sidebar/transcript/ footer agreement, copy/paste/cancel contract, checklist rendering, shell summary preservation, monotonic spend, and version provenance. Refs #235 #236 #237 #238 #239 #240 #241 #242 #243 #244 #245 * release: bump workspace version to 0.7.7 (#245) Refs #245 * fix(v0.7.7): canonical swarm card binding, monotonic spend, checklist + shell summary - Add `swarm_card_index: HashMap<swarm_id, history_index>` so overlapping fanouts each project to their own FanoutCard. Eliminates the screenshot contradiction where a stale background swarm's progress clobbered a newer card (#236, #238). - Suppress fanout-class tools (`agent_swarm`, `spawn_agents_on_csv`, `rlm`, `agent_spawn`) from `active_tool_status_label` so the footer no longer reports "tool agent_swarm · 1 active" while sidebar+card show the actual worker counts (#236, #238). - Add `App::displayed_session_cost` + `displayed_cost_high_water` so the visible session+sub-agent total is monotonic across reconciliation events (cache discounts, provisional → final). New tests: monotonicity under negative reconciliation; duplicate dedup keeps display steady (#244). - Preserve high-signal summary lines from the truncated tail of shell output: `test result:`, `failures:`, `error[E…]`, `Finished`, `Compiling`, panic markers. Stops the agent re-running cargo gates just to see pass/fail under truncation (#242). - Render `checklist_write` / `todo_` results as a purpose-built checklist card with completed/total + percent header, per-item status markers, and a collapsing affordance for long lists. Plumbed through the existing `GenericToolCell` so no new variant threading is needed (#241). Refs #236 #238 #241 #242 #244 fix(v0.7.7): Esc clears active tool entries optimistically (#243) When Esc cancels the foreground turn we now finalize the active cell immediately rather than waiting for the engine's TurnComplete echo to drain. This stops the footer "tool ... · X active" chip from briefly contradicting the cancelled state, and frees the composer for the next message. Background `block:false` swarms are intentionally NOT killed here — they remain durable, tracked through `swarm_jobs` and `swarm_card_index` so their FanoutCard updates as workers land. Subsequent `swarm_status` / `swarm_result` / `swarm_cancel` tool calls see the canonical store. New focused test verifies: after Esc, `active_cell` is None, the background swarm record is preserved, and `is_loading` is cleared so the composer can submit immediately. Refs #243 * fix(v0.7.7): Windows .exe lookup + post-turn snapshot detach (#247, #234) #247 — npm-distributed Windows package failed at runtime because the Rust dispatcher's `delegate_to_tui` / `delegate_simple_tui` looked for a sibling named exactly "deepseek-tui", while the actual file shipped by `scripts/install.js` is `deepseek-tui.exe`. Replace both lookups with `locate_sibling_tui_binary`, which: - Honours `DEEPSEEK_TUI_BIN` for explicit overrides - Tries `deepseek-tui{EXE_SUFFIX}` first (`.exe` on Windows, "" elsewhere) - Falls back to suffix-less `deepseek-tui` on Windows so users who applied the issue's manual workaround still launch successfully - Emits a platform-correct error path in the bail message Tests: `sibling_tui_candidate_picks_platform_correct_name`, `sibling_tui_candidate_windows_falls_back_to_suffixless` (windows-only), `locate_sibling_tui_binary_honours_env_override`. #234 — Detach the post-turn workspace snapshot so `git add -A && git commit` no longer pins the engine loop after `Event::TurnComplete`. The snapshot still runs on `tokio::task::spawn_blocking`, but the engine no longer awaits its `JoinHandle`, so the UI accepts input (text, copy, paste, selection) without waiting for the bookkeeping to finish. Cycle advance and pre-turn snapshot remain awaited — they are correctness-sensitive and the cycle path already emits a status chip ("↻ context refreshing…") so the user has visible feedback. Refs #234 #247 * chore(v0.7.7): bump npm package version 0.7.6 → 0.7.7 Required by `scripts/release/check-versions.sh` ("Version drift" CI gate); the workspace was bumped to 0.7.7 but `npm/deepseek-tui/package.json` still reported 0.7.6, blocking PR #246 from going green. Refs #245	2026-04-30 07:26:26 -05:00
Hunter Bown	8ba8600155	release: v0.7.6 - Bump workspace version to 0.7.6 (Cargo.toml + all crate internal dep pins) - Bump npm wrapper version and deepseekBinaryVersion to 0.7.6 - Add v0.7.6 changelog entry: localization, paste burst, history search, pending input preview, grouped /config editor, searchable help overlay, Alt+↑ edit-last-queued, composer attachment management - Update README with v0.7.6 features (localization, paste, history search) - Archive v0.7.5 implementation plan to docs/archive/ - Update Cargo.lock	2026-04-29 17:00:36 -05:00
Hunter Bown	a47b28e5d5	Complete v0.7.6 TUI polish and localization lane (#222 ) Squash-merge PR #222 after green CI and review cleanup.\n\nCloses #198, #199, #206, #207, #208, #209, #210, #212, #213, #214, #215, #216.	2026-04-29 13:06:51 -05:00
Hunter Bown	c2b2c284f6	release: v0.7.5 — token-basis fixes, shell timeout recovery, context/cache policy Issues #202, #203, #204, #205: - Cycle/seam triggers use active request input size + response headroom reserve, not lifetime cumulative API usage. - V4 hard-cycle headroom calibrated around fixed TURN_MAX_OUTPUT_TOKENS plus CONTEXT_HEADROOM_TOKENS safety buffer. - /tokens, /cost, footer/header labels, and docs now separate active context, turn telemetry, cumulative usage, cache hit/miss, context percent, and cost. - Foreground exec_shell timeout output tells the model the process was killed and suggests task_shell_start or background exec_shell plus poll/wait. - Added regression tests for active-token basis, V4 headroom, seam trigger basis, footer label behavior, and shell timeout recovery metadata. - Preserved #200/#201 policy: V4 default is append-only, prefix-cache preserving; replacement compaction, Flash seams, and capacity intervention remain opt-in.	2026-04-29 10:13:27 -05:00
Hunter Bown	0578eb701e	Add shell jobs and MCP manager to the TUI	2026-04-29 09:38:04 -05:00
Hunter Bown	41e8f2b5b2	Disable default compaction and opt in context seams	2026-04-29 09:12:20 -05:00
Hunter Bown	00c92e1c2a	Implement v0.7.4 long-running agent tools	2026-04-29 00:50:43 -05:00
Hunter Bown	6d8ab4c2b8	fix: close v0.7.2 issue cleanup	2026-04-28 23:09:19 -05:00
Hunter Bown	97846cd63a	release: include secrets crate in publish order	2026-04-28 16:39:22 -05:00
Hunter Bown	49d2be9e5c	refactor(tools): share tool result primitives from crate	2026-04-28 01:23:21 -05:00
Hunter Bown	4fb8372c1c	refactor(engine): split turn loop and capacity flow	2026-04-28 01:12:25 -05:00
Hunter Bown	3bc54b0bc0	fix(snapshot): harden side-git restore wiring	2026-04-28 00:46:24 -05:00
Hunter Bown	1f00ac6311	chore(release): drop auto npm publish, document manual flow, trim research PDF - Remove the `publish-npm` job from `release.yml`. It has been failing on every release with `npm error code EOTP` because the configured `NPM_TOKEN` doesn't bypass 2FA. Manual publish from a developer machine is the actual ship path; codify that. - Update `docs/RELEASE_RUNBOOK.md` "npm Wrapper Release" to describe the manual flow (`npm publish --access public` + OTP) and explain why the auto path is gone, with a recovery note for future Trusted-Publishing migration. - Refresh stale cross-reference comment in `publish-npm.yml` (the workflow remains as inert plumbing for an eventual Trusted Publishing setup). - Stop tracking `docs/DeepSeek_V4.pdf` (4.4 MB). It was never referenced outside test fixture filenames; the tests synthesize their own fake PDF. Add to `.gitignore` so a local copy can sit there without nagging. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 16:26:10 -05:00
Hunter Bown	feb3cf1e0c	feat: explain parallel fan-out caps in tool descriptions and error messages (fixes #81 )	2026-04-26 13:16:12 -05:00
Hunter Bown	c58d10ded1	feat(tools): mark alias tools with deprecation metadata Add `wrap_with_deprecation_notice` helper in the subagent module that merges a `_deprecation` block into a ToolResult's metadata. Applied exclusively on alias invocations: - `spawn_agent` → use `agent_spawn` (removed in v0.8.0) - `delegate_to_agent` → use `agent_spawn` (removed in v0.8.0) - `close_agent` → use `agent_cancel` (removed in v0.8.0) - `send_input` → use `agent_send_input` (removed in v0.8.0) Canonical names are unaffected. Each alias invocation also emits a `tracing::warn` so the deprecation appears in audit logs. Documents the deprecation schedule in `docs/TOOL_SURFACE.md`. Four unit tests verify the notice shape and that canonical tools stay clean. Refs #72 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 12:32:26 -05:00
Hunter Bown	ca7ca9f75f	docs: drop stale handoff/migration/parity docs Removed: - `.claude/next-agent-prompt.md` (111 lines) — v0.4.6-era session prompt describing slices A/B/C that have all shipped. Successive sessions use fresh prompts (e.g. .deepseek/v0.6.0-overnight-review.md); this one is pure history. - `docs/archive/workspace_migration_status.md` (92 lines) — explicitly archived (April 11), describes a one-time migration that's complete. Removed enclosing `docs/archive/` directory too (was the only file). CHANGELOG entry from v0.4.x still narrates the archival as history. - `docs/parity_release_and_ci.md` (38 lines) — duplicates what `.github/workflows/parity.yml` and CONTRIBUTING.md already say authoritatively. Single source of truth wins. - `AI_HANDOFF.md` + `todo.md` (untracked, no commit needed) — `todo.md` was a 7-line pointer to AI_HANDOFF.md, which itself was an April 11 snapshot listing "remaining work" that's mostly delivered. CLAUDE.md is the live developer guide now. 1004/1004 tests still green; no doc/code references broken. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 00:27:58 -05:00
Hunter Bown	5f223adea6	v0.6.0: native rlm_query tool + scroll fix + cleanup Adds a structured rlm_query tool for parallel/batched LLM fan-out. The model calls it with one prompt or up to 16 concurrent prompts; children dispatch via tokio::join_all against the existing DeepSeek client. Default child model is deepseek-v4-flash; override per-call via the model field. Available in Plan / Agent / YOLO. Cost folds into the session's running total automatically. Fixes scroll-stuck regression (#56): TranscriptScroll::resolve_top and scrolled_by now use a three-level fallback chain (same line → same cell line 0 → nearest cell at-or-before) instead of teleporting to ToBottom when an anchor cell vanishes. Loosens command-safety chains (#57): cargo build && cargo test and similar chains of known-safe commands now escalate to RequiresApproval instead of being hard-blocked as Dangerous. Chains containing unknown commands still block. Suppresses the GettingCrowded footer chip — context-percent header already covers conversation pressure. Refactors: - Extracts file_mention parsing/completion/expansion (~450 LOC) from the 5,500-line ui.rs into crates/tui/src/tui/file_mention.rs. - Deletes truly unused helpers (write_bytes, timestamped_filename, extension_from_url, output_path, has_project_doc, primary_doc_path). Tests: 853 pass. cargo clippy --workspace -D warnings clean. cargo fmt --all -- --check clean. Closes #46 #47 #48 #49 #50 #53 #54 #55 #56 #57 #58.	2026-04-25 21:48:17 -05:00
Hunter Bown	027d6d19b6	docs(rlm): land Hetun design + helper layer + Sakana research methodology Captures the full RLM-fundamental story across the design doc, MODES.md, and the Hetun prompt. Tracking issues are now #46–#55 (helper layer filed as #53, Hetun as #54, vendoring as #55). What this nails down: - Hetun mode is added at the END of the Tab cycle (Plan → Agent → YOLO → Hetun → Plan), not as a Plan replacement. Default landing mode is unchanged so people don't accidentally start there. Plan stays as it is. - Mission-level approval, not block-level. Hetun runs a research phase, presents one mission card, and only executes after explicit user approval. Inside the execution turn the repl block runs straight through with no per-block prompts — that's the whole point of the mode. - The user's configured model is left alone on enter/exit. Pro/max users stay on Pro/max. The flash-as-coordinator behaviour is internal to the runtime (ZIGRLM_RLM_CMD always points to flash regardless of mode). No global model swap. - No /hetun slash command. Tab cycles into the mode; /plan keeps switching to Plan as today. - The helper layer (#53) is fundamental, not aleph-derived. A curated ~20-function ctx-helper module + AST-validated Python sandbox baked into the repl runtime so a single block can load → slice → fan out flash queries → aggregate without crossing tool boundaries. Inspired by aleph's pattern but our own native primitive — not a port. - Hetun research methodology adopts Sakana's Fugu patterns. The research phase is recursive novelty sampling + hierarchical narrative tree synthesis + multi-detector cross-verification (flash for breadth, Pro for depth) + hypothesis-verification loop. Not "fan out 8 fixed queries". This is what makes "Plan + Recursive Agents" meaningful versus a flash-coordinator wrapper. - No version-number framing anywhere. The plan ships as one cohesive RLM landing across #46/#48/#49/#50/#53/#54/#55 — order is dependency, not release schedule. We keep shipping. - Auto-compaction stays automatic. Removed a manual /compact nag from the Hetun prompt; the existing coherence + capacity system already handles this. Files: docs/rlm-design.md new — full design doc with Hetun details docs/research-react-vs-rlm.md new — supporting research treatment docs/MODES.md 4-mode cycle, Hetun added at end, Plan kept crates/tui/src/prompts/hetun.txt prompt teaching the recursive-novelty + hierarchical-synthesis + verification-loop rhythm, mission-card structure, two-step gate .gitignore ignore .claude/scheduled_tasks.lock runtime Closes nothing yet — implementation lands across the tracking issues. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 15:37:25 -05:00
Hunter Bown	82e4a564aa	refactor(#35 ): tighten agent prompt tool descriptions, drop alias dupes Tool-surface audit pass: - FILE OPERATIONS rewritten so each line states the niche, not just the verb. read_file mentions PDF auto-extraction + `pages` slicing. - New SEARCH section consolidates grep_files / file_search / web_search / fetch_url so the model sees them next to each other and picks the right one. fetch_url (#33) added; previously absent from the prompt. - request_user_input pulled out of FILE OPERATIONS into its own USER section — it never belonged there. - SUB-AGENTS list shrinks by 3: drops `spawn_agent` (use `agent_spawn`), `close_agent` (use `agent_cancel`), and the `agent_assign / assign_agent` dual-name. The underlying dispatchers still resolve those names, so existing sessions don't break — they just no longer pollute the model's tool list. Adds `docs/TOOL_SURFACE.md` with the rationale, the v0.5.1 final surface, and the dropped aliases. Calls out that grep_files is pure-Rust (no rg/grep shell-out, so the "fall back to grep" AC from #35 is vacuously satisfied — the tool has no shell dependency to fall back from). Closes #35.	2026-04-25 13:44:43 -05:00
Hunter Bown	0a394e1587	fix(#31 ): catch version drift in CI, not at release time Adds scripts/release/check-versions.sh and a `versions` CI job that runs on every push/PR. Verifies: - no per-crate Cargo.toml carries a literal version (must inherit the workspace version) - npm/deepseek-tui/package.json matches the workspace version - Cargo.lock is in sync with the manifests Closes #31.	2026-04-25 13:25:55 -05:00
Hunter Bown	29141bc89b	Add NIM env support and .env.example template	2026-04-25 07:21:43 -05:00
Hunter Bown	298f5c6c51	Merge branch 'claude/improve-deepseek-v4-harness-NxBpS' into main	2026-04-25 01:55:44 -05:00
Hunter Bown	853a39138c	feat: setup status/clean/dirs and protocol-recovery hardening Adds a compact `setup --status` view, a `setup --clean` for regenerable session checkpoints, and `--tools`/`--plugins` scaffolding for ~/.deepseek/{tools,plugins} so the extension model has a documented home that doctor can count. `doctor --json` lands as a CI-safe alternative to the human-readable doctor (skips the live API probe). Also locks down the engine's hostility to fake tool-call wrappers: filter_tool_call_delta and the marker constants are now testable, the streaming loop emits one compact status notice per turn when it strips a wrapper, and a new protocol_recovery integration test asserts that the legacy text parser never turns <function_calls> into a real tool call. Adds 23 unit tests + 14 integration tests covering both slices.	2026-04-25 06:26:07 +00:00
Hunter Bown	6ad3727fa0	Add provider switch and file mention attachments	2026-04-24 23:09:48 -05:00
Hunter Bown	16f62f7abf	Fix reasoning replay and context accounting for NIM	2026-04-24 18:42:18 -05:00
Hunter Bown	d0dc26ce25	Add NVIDIA NIM provider support for DeepSeek	2026-04-24 18:29:19 -05:00
Hunter Bown	f377b0f9d2	ci: add npm publish retry path	2026-04-24 16:15:48 -05:00
Hunter Bown	f3df8f5f26	ci: publish npm with trusted publishing	2026-04-24 16:13:59 -05:00
Hunter Bown	8323bedfb7	fix: restore default tui mouse scrolling	2026-04-24 11:41:31 -05:00
Hunter Bown	c4f9078712	release: deepseek tui 0.4.3	2026-04-24 10:48:35 -05:00
Hunter Bown	35595f8edc	fix: normalize legacy DeepSeek aliases to V4 flash	2026-04-23 23:08:44 -05:00
Hunter Bown	b7bd02d814	feat: DeepSeek V4 support with reasoning-effort control (0.4.0) Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling. Key fixes: - Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`. - Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement. - Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths. - Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored. Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.	2026-04-23 22:53:20 -05:00
Hunter Bown	dc8e94d705	docs: update documentation and cleanup for v0.3.33	2026-04-22 22:36:45 -05:00
Hunter Bown	cbcf35c1bd	release: prepare v0.3.32 — finance tool, header redesign, expanded tests Add Yahoo Finance quote tool with chart fallback, redesign header widget with proportional truncation and context bar, refactor footer status strip, expand test suite to 680+ tests, and fix blocking issues (usize underflow in header, tempdir leak in finance tests, per-call HTTP client creation).	2026-04-11 18:58:56 -05:00
Hunter Bown	b172b8d306	feat: remove Normal mode and consolidate to Agent (#4 ) Keep legacy /normal and settings fallback behavior mapped to Agent, align docs around the three visible modes, and include the current TUI and onboarding refinements in this worktree.	2026-03-12 11:32:25 -05:00
Hunter Bown	7b91169017	refactor: move source files into workspace crates - Move src/* into crates/tui/src/ to create a proper workspace structure - Add .claude/ and .trimtab/ directories for Trimtab closed-loop workflow - Add DEPENDENCY_GRAPH.md and update documentation - Update Cargo.toml files to reflect new crate dependencies - Update CI workflows and npm package scripts - All tests pass, release build works	2026-03-11 20:00:38 -05:00
Hunter Bown	37186c3d95	Workspace migration: split into modular crates, parity CI, release updates - Convert root to Cargo workspace with crates/ layout - Add deepseek-* crates mirroring Codex architecture - Add parity CI workflow with snapshot/protocol/state tests - Update release workflow to build both deepseek and deepseek-tui binaries - Bump version to 0.3.28	2026-03-02 17:52:46 -06:00
Hunter Bown	8d904129b5	Release 0.3.23	2026-02-24 10:31:54 -06:00
Hunter Bown	b88ce88a42	Improve TUI coherence, small-screen layout, and contrast guardrails	2026-02-19 10:09:41 -06:00
Hunter Bown	8b5f1bc83f	fix: harden runtime app pre-release issues	2026-02-18 11:12:40 -06:00

1 2

60 Commits