A user couldn't find an `NVIDIA_API_KEY` block in `~/.deepseek/config.toml`
because the example file only mentioned NIM as commented-out alternates
to the top-level keys. Two fixes:
- `config.example.toml` now has explicit `[providers.deepseek]` and
`[providers.nvidia_nim]` sections (placed after all top-level keys so
the TOML still parses cleanly), each documenting `api_key` /
`base_url` / `model` plus the env vars that override them. Both
providers can be stored at once and toggled via `/provider` or
`--provider` without re-entering keys.
- `setup --status` "missing api_key" message is now provider-aware: on
`nvidia-nim` it points at `NVIDIA_API_KEY` + `[providers.nvidia_nim]`
+ `deepseek auth set --provider nvidia-nim`, instead of the
DeepSeek-only hint.
Audit verified: the v0.5.0 multi-turn replay fix path
(`should_replay_reasoning_content` → `requires_reasoning_content` in
`crates/tui/src/client.rs:1796`) keys off the model name (matches
`deepseek-v4`), not the provider, so NIM-hosted V4 models get the
replay automatically. No NIM-specific 400-class regression there.
Closes#37 (docs/UX); the live multi-turn-against-NIM verification
remains a manual smoke step listed in the issue (no NIM creds in CI).
Minimum-viable version of the handoff artifact described in #32:
- New `HANDOFF_RELATIVE_PATH = ".deepseek/handoff.md"` convention.
- `system_prompt_for_mode_with_context` now reads that path on every
prompt rebuild and prepends a `## Previous Session Handoff` block to
the system prompt when the file is non-empty. A fresh agent gets the
prior session's blockers/decisions/files-touched in turn-1 context
with zero discovery cost.
- Agent prompt updated to make the convention explicit: "if the block
appears, read it first; before exit/`/compact`, write or update it
via `write_file`."
- `.deepseek/` is already gitignored, so the handoff travels with the
workspace but doesn't pollute commits unless the user opts in.
Tests cover: present-and-non-empty (block injected with file content),
missing file (no block), empty/whitespace-only file (no block). A
unique marker in the injected block (`"left a handoff at .deepseek/..."`)
discriminates the actual block from the agent prompt's own description
of the convention.
Out of scope for v0.5.1: a `/handoff` slash command, a startup banner
toast, automatic write on exit, and the diff-against-HEAD-on-resume
mechanism. The agent can already write the file via `write_file` when
the user types `write a session handoff`.
Closes#32.
Tool-surface audit pass:
- FILE OPERATIONS rewritten so each line states the niche, not just the
verb. read_file mentions PDF auto-extraction + `pages` slicing.
- New SEARCH section consolidates grep_files / file_search / web_search /
fetch_url so the model sees them next to each other and picks the
right one. fetch_url (#33) added; previously absent from the prompt.
- request_user_input pulled out of FILE OPERATIONS into its own USER
section — it never belonged there.
- SUB-AGENTS list shrinks by 3: drops `spawn_agent` (use `agent_spawn`),
`close_agent` (use `agent_cancel`), and the `agent_assign /
assign_agent` dual-name. The underlying dispatchers still resolve those
names, so existing sessions don't break — they just no longer
pollute the model's tool list.
Adds `docs/TOOL_SURFACE.md` with the rationale, the v0.5.1 final
surface, and the dropped aliases. Calls out that grep_files is pure-Rust
(no rg/grep shell-out, so the "fall back to grep" AC from #35 is
vacuously satisfied — the tool has no shell dependency to fall back from).
Closes#35.
The reasoning-effort tier (`max` chip + whale icon) and the live/context
indicators were the only signals on the right of the header. Switching
to nvidia-nim left the right-hand side identical to a DeepSeek session,
so it wasn't obvious at a glance that requests were going to a different
backend.
Now: when `app.api_provider != Deepseek`, the header surfaces a bold
`NIM` chip on the right, leftmost in the chip cluster (so it survives
the narrow-width fallback variants in `right_spans`). Default-DeepSeek
sessions are unchanged — `provider_label = None` short-circuits the
chip.
Closes#38.
`read_file` now detects PDFs by extension or `%PDF-` magic bytes and
shells out to `pdftotext -layout` (poppler) to return plain text
directly to the model. New optional `pages` arg accepts `N` or `N-M`
slices so big papers can be read in pieces without burning context.
When `pdftotext` isn't on `$PATH`, the tool returns a structured
`{type: "binary_unavailable", kind: "pdf", reason, hint}` payload with
install hints (`brew install poppler` / `apt install poppler-utils`)
instead of crashing or returning UTF-8 garbage from a binary file.
Tests cover extension detection (case-insensitive), magic-byte sniffing
on extension-less files, the negative case for plain text, the pages
arg parser (single, range, whitespace, invalid forms), and the
binary_unavailable branch when `pdftotext` is absent.
.docx / .epub / .html stripping deferred — same dispatch can take more
extractors later.
Closes#34.
Complements `web_search` for cases where the URL is already known —
GitHub repos, blog posts, spec pages — and a search-engine round trip is
overkill or actively unhelpful (which #25 had been making worse).
Surface:
- `fetch_url(url, format?, max_bytes?, timeout_ms?)`
- `format`: `markdown` (default), `text`, `raw`
- HTTPS preferred, http:// allowed; non-http schemes rejected up front
- Follows up to 5 redirects; 1 MB default cap (10 MB hard ceiling); 15 s
default timeout (60 s ceiling)
- HTML responses are stripped to readable text via the same regex
pattern used by `web_search` (script/style strip → tag strip → entity
decode → whitespace collapse)
- 4xx / 5xx responses still return the body (with `success: false`) so
the caller can read JSON error envelopes
Capabilities: `ReadOnly + Network`. Approval: `Auto` (matches
`web_search`). Registered in `with_web_tools` so it's available wherever
`web_search` is.
Tests cover: format parsing aliases, scheme rejection, missing/empty
url validation, html-to-text stripping. The over-the-wire cases
(redirect chains, oversized truncation) are exercised by integration
tests once the test suite is wired to a local mock HTTP server —
deferring that since the unit tests already lock in the input
validation and HTML processing.
Closes#33.
The thinking-mode sanitizer now sums the byte size of every replayed
`reasoning_content` field in the outgoing chat-completions body and
emits an `info`-level log line:
Reasoning-content replay: 7 assistant message(s), ~3.2K input tokens (12,884 chars) being re-sent in this request
This is visible under `RUST_LOG=deepseek_tui=info` (or higher). It's the
first step toward the footer/status-line indicator described in #30 —
the model's input-side reasoning replay is now observable per turn,
even before it gets a dedicated UI surface.
Tests cover both branches: bodies that already have reasoning_content
(count is summed across all assistant turns) and bodies where the
sanitizer had to inject the `(reasoning omitted)` placeholder (the
placeholder bytes are included in the count since they ship over the
wire).
Footer integration deferred — that needs a new event from client → engine
→ TUI to surface the count alongside `cache N%` / `$X.XX`. Part of #30
remains open.
Adds scripts/release/check-versions.sh and a `versions` CI job that runs
on every push/PR. Verifies:
- no per-crate Cargo.toml carries a literal version (must inherit the
workspace version)
- npm/deepseek-tui/package.json matches the workspace version
- Cargo.lock is in sync with the manifests
Closes#31.
The #27 per-mode context budget commit (1be18e69) replaced calls to
compaction_threshold_for_model with compaction_threshold_for_model_and_effort
but left the old name in the import list, which fails under -Dwarnings on
Build, Test, and the npm wrapper smoke job. Also re-runs cargo fmt over
the four files the lint job flagged.
- README: drop the `cargo install deepseek-tui` / `deepseek-tui-cli`
block and the crates.io badge — those packages have been lagging the
workspace version for several releases. Source install (`cargo install
--path crates/tui`) remains documented for hackers.
- README: add a "What's new in v0.5.0" section pointing at the
thinking-mode tool-call 400 fix and the #25 web.run cleanup.
- CHANGELOG: add the [0.5.0] - 2026-04-25 entry covering the per-message
reasoning_content rule, the wire-payload sanitizer, and #25.
Add compaction_threshold_for_model_and_effort() with mode-aware soft
caps based on DeepSeek V4 paper Figure 9 recall-quality data:
Plan / off -> 64K (paper eval: 8K-128K)
Agent / high -> 192K (paper eval: 128K)
YOLO / max -> 384K (paper eval: 384K-512K)
Previously, the 80%-of-window rule gave 800K for V4's 1M window,
which is well past the point where MRCR MMR collapses (0.49 at 1M).
Non-V4 models keep the legacy 80% rule. None/unknown effort defaults
to agent-tier (192K).
Replace all web.run mentions with web_search in prompt files (base,
agent, yolo, plan, normal) and update web_search.rs description.
Model was trying web.run which doesn't exist, wasting turns on
validation errors. Also remove [cite:ref_id] citation format which
required web.run's ref_id system.
Partial fix for #25 — web_search reliability improvements (real
search provider) still needed.
### Fixed
- DeepSeek thinking-mode tool-call rounds now always replay reasoning_content
in all subsequent requests (including across new user turns), matching the
documented API contract that assistant tool-call messages must retain their
reasoning content forever. Previously, reasoning_content was cleared after
the current user turn completed, which could cause HTTP 400 errors.
- Missing reasoning_content on a tool-call assistant message now substitutes
a safe placeholder ("(reasoning omitted)") instead of dropping the tool
calls and their matching tool results, preventing orphaned conversation
chains and API 400 rejections.
- Session checkpoint now persists a Thinking-block placeholder for tool-call
turns that produced no streamed reasoning text, keeping on-disk sessions
structurally correct for subsequent requests.
- Token estimation for compaction now counts thinking tokens across ALL
tool-call rounds (not just the current user turn), aligning with the
updated reasoning_content replay rule.
### Changed
- Internal crate dependency pins bumped 0.4.5 → 0.4.9 to match workspace.
- npm wrapper version and deepseekBinaryVersion bumped to 0.4.9.
- README fully rewritten: clearer feature highlights, V4 model focus,
keyboard shortcut table, improved docs index, and more engaging layout.
- CHANGELOG entry for 0.4.9 with comparison URLs.
Bump workspace version 0.4.6 -> 0.4.7 and ship the bug fix flagged by Devin
on PR #18: an uncommented `DEEPSEEK_API_KEY=` line in .env.example caused
`cp .env.example .env` to load an empty key, which `apply_env_overrides`
then propagated into the config and `Config::validate()` rejected on
startup with "api_key cannot be empty string".
- .env.example: comment out the empty `DEEPSEEK_API_KEY=` placeholder.
- crates/tui/src/config.rs: skip empty `DEEPSEEK_API_KEY` env values in
`apply_env_overrides`, matching the facade's empty-string filter.
- Add `apply_env_overrides_ignores_empty_api_key` regression test.
- Bump Cargo.toml workspace version, npm wrapper version + binary version,
and Cargo.lock.
Bundles new work since v0.4.4: NIM provider support, file mention
attachments, provider switch UX, setup --status/--clean/--tools/--plugins,
doctor --json, and protocol-recovery hardening that strips fake
tool-call wrappers from streaming model output.
Local main's unpushed commits had removed DEFAULT_TEXT_MODEL from the
crate::config import in main.rs, but the merged branch's new code at
two call sites still uses it. Textual three-way merge took the local
import line and the branch's call sites, producing a build break.
Re-add the symbol to the import.
Adds a compact `setup --status` view, a `setup --clean` for regenerable
session checkpoints, and `--tools`/`--plugins` scaffolding for
~/.deepseek/{tools,plugins} so the extension model has a documented home
that doctor can count. `doctor --json` lands as a CI-safe alternative to
the human-readable doctor (skips the live API probe).
Also locks down the engine's hostility to fake tool-call wrappers:
filter_tool_call_delta and the marker constants are now testable, the
streaming loop emits one compact status notice per turn when it strips
a wrapper, and a new protocol_recovery integration test asserts that
the legacy text parser never turns <function_calls> into a real tool
call. Adds 23 unit tests + 14 integration tests covering both slices.
Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.
Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.
Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
- Remove FOOTER_HINT color constant from palette
- Drop footer clock label and related synchronization logic
- Simplify footer status line layout and narrow-terminal handling
- Update tests to align with simplified footer logic
- Remove empty state placeholder text for cleaner UI
- Bump version to 0.3.33
Add Yahoo Finance quote tool with chart fallback, redesign header widget
with proportional truncation and context bar, refactor footer status strip,
expand test suite to 680+ tests, and fix blocking issues (usize underflow
in header, tempdir leak in finance tests, per-call HTTP client creation).
* fix: address PR #4 follow-ups
Honor low_motion in the default tool transcript path and align composer cursor padding with the rendered placeholder. Add focused regression tests for both behaviors.
* lint: remove redundant imports in empty_state test, reuse create_test_app
The test had inner `use` statements for Config, App, TuiOptions, and
PathBuf that duplicated the module-level test imports. It also manually
constructed App instead of calling the existing create_test_app() helper.
* fix: replace useless format!("{text}") with text.to_string() in details_affordance_line
* test: pin composer_density in cursor test to avoid sensitivity to loaded settings
Settings::load() may return a non-default composer_density on some CI
environments. Explicitly set ComposerDensity::Comfortable so the
expected cursor position is deterministic across all platforms.
* fix: make tool low_motion test robust against coarse Windows timers
Use a 2× cycle offset so the animated frame index is 2 (maximally
distant from 0), giving 1800 ms of headroom before the animation could
wrap back to index 0. The previous 1× offset left only ~15 ms of
margin, causing flaky failures on Windows where Instant resolution is
approximately 15.6 ms.
* fix: correct headroom comment in tool animation test (3600ms, not 1800ms)
* fix: resolve lint, parity, and Windows test failures
- Fix rustfmt line-length issue in history.rs tool animation test
- Settings::path() now respects DEEPSEEK_CONFIG_PATH for Windows test compat
- doctor_check_mcp_server recognizes Unix-style absolute paths on Windows
- Use checked_sub for Instant arithmetic in web_run tests to prevent
underflow on freshly-booted Windows CI runners
* fix: expand ~ in DEEPSEEK_CONFIG_PATH when resolving settings path
---------
Keep legacy /normal and settings fallback behavior mapped to Agent, align docs around the three visible modes, and include the current TUI and onboarding refinements in this worktree.