A small cleanup pass to catch brand mentions that the R5 sweep missed
because they hid in:
- HTTP User-Agent format strings (`Mozilla/5.0 (compatible; deepseek-tui/`
in `client.rs` and `fetch_url.rs`).
- Multi-line error messages whose phrase boundary straddled a line break
("…restart\n deepseek-tui." in `js_execution.rs`,
`tool_catalog.rs`, `repl/runtime.rs`).
- Doc comments mentioning `deepseek-tui` as a binary (`config/src/lib.rs`,
`core/capacity.rs`, `tui/streaming/chunking.rs`, `features.rs`).
- Skill descriptions shipped in `crates/tui/assets/skills/*/SKILL.md`.
- Test fixtures with placeholder paths / git emails
(`tui/external_editor.rs`, `snapshot/repo.rs`).
- `task_manager.rs`'s `cargo test -p deepseek-tui --lib` example.
- `scripts/tencent-lighthouse/doctor.sh` info-line prefix.
The remaining `deepseek-tui` mentions in the codebase are intentional
(the legacy `[[bin]]` entry in `crates/tui/Cargo.toml`, the legacy
`npm/deepseek-tui/` deprecation shim package, the CNB mirror namespace,
the security email, the legacy bin's shim source file, and historical
CHANGELOG entries) and were preserved per the rebrand anti-scope.
Local gates green: `cargo check --workspace --all-targets --locked`,
`cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets
--all-features --locked -- -D warnings`, `cargo test --workspace
--all-features --locked` (3226+ pass, 0 fail).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address gemini-code-assist review on PR #1743:
- HIGH: should_replay_reasoning_content_for_provider was made model-aware
in the previous commit, but handle_chat_completion_stream still computed
is_reasoning_model = requires_reasoning_content(model) &&
provider_accepts_reasoning_content(provider). On the openai provider +
a DeepSeek model that was false during SSE parsing, so reasoning tokens
were stored as content (not reasoning_content) and the next request
still 400'd -- the fix was incomplete. Extract is_reasoning_model_for_stream()
and route the stream call site through it; add an equivalence test
locking it to the replay predicate so the two paths can't drift.
- MEDIUM: rename generic_openai_provider_drops_deepseek_reasoning_content
-> generic_openai_provider_drops_reasoning_content_for_non_deepseek_models
(now uses gpt-4o; old name was misleading).
Non-DeepSeek models on any provider are unaffected (#1542 not regressed).
Refs #1739, #1694, #1542.
should_replay_reasoning_content_for_provider() returned false whenever
provider_accepts_reasoning_content(provider) was false (true for
ApiProvider::Openai) without checking the model. This single gate feeds
both build_for_provider (include_reasoning) and
sanitize_thinking_mode_messages, so a DeepSeek reasoning model on the
generic openai provider (DeepSeek-compatible endpoint) had all
reasoning_content stripped -> the DeepSeek thinking-mode API 400s
('reasoning_content in the thinking mode must be passed back'). This is
the over-aggressive half of ac01b225 (fix#1542).
Gate the early return on the model too:
!provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model).
Known DeepSeek reasoning models replay regardless of provider; genuine
non-DeepSeek models on openai still strip (effort=off still wins). #1542
not regressed (provider_accepts_reasoning_content untouched).
Two pre-existing client.rs tests asserted the buggy case (deepseek-v4-pro
on Openai -> dropped); retargeted to gpt-4o to preserve their #1542
intent without encoding the bug. New positive/negative coverage in
chat.rs.
Refs #1739, #1694, #1542, #1736.
`apply_reasoning_effort`'s vLLM branch was injecting
`thinking: {type: "disabled"}` at the top of the request body to
turn off model reasoning. But vLLM speaks OpenAI's
chat-completions protocol, not Anthropic-native extension fields,
and silently ignored that directive — the model emitted a full
hidden reasoning trace into the non-OpenAI-standard `reasoning`
field (which this client does not surface), so users saw a
~13-second perceived freeze before the first content token
arrived.
The vLLM branch now emits the OpenAI extension
`chat_template_kwargs.enable_thinking` — the canonical way to
toggle Qwen3's `<think>` mode, DeepSeek-R1's reasoning trace, and
any other reasoning-capable model served via vLLM. End-to-end
measurement against vLLM hosting Qwen3.6-35B-A3B-FP8:
- TTFT: 13039ms → 274ms
- Total LLM call: 13s → 5.7s
- Output rate: 3 ch/s → 46 ch/s
The `high` / `max` reasoning levels likewise route through
`chat_template_kwargs` so the toggle is consistent across effort
levels. No change for any non-vLLM provider (NVIDIA NIM continues
to use the NVIDIA-specific `chat_template_kwargs.thinking` key;
Anthropic-native providers keep the Anthropic-native field).
Resolved a 3-way merge conflict against the v0.8.32 AtlasCloud
harvest (PR #1436) so AtlasCloud stays in the no-op match arm
alongside OpenAI / Ollama while the new vLLM arm gets its own
branch. Note for future Sglang / Fireworks / Novita work: those
servers likely have the same bug but each has its own
chat_template_kwargs schema; this PR is intentionally minimal
to the verified-fix scope.
Harvested from PR #1480 by @h3c-hexin
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two-layer design for users whose UI locale is not English:
1. **System-prompt directive (primary)**: when the user enables
translation via `/translate`, a `## Language Output Requirement`
block is appended to the system prompt instructing the model to
reply in the resolved session locale (Simplified Chinese,
Traditional Chinese, Japanese, or Brazilian Portuguese). Code
identifiers, technical terms without an established translation,
and code blocks the user explicitly requests in English are
exempt. The block is gated on `PromptSessionContext.translation_
enabled`, so it adds zero tokens for installs that don't opt in.
2. **Post-hoc heuristic (fallback)**: a lightweight detector in
`tui::translation` compares Latin-letter count against weighted
CJK characters (CJK chars carry ~3× the information per glyph,
so the ratio comparison stays fair across mixed code+prose).
When a reply still surfaces English despite the directive, the
detector flags it and a focused per-message `client.translate()`
call renders the localised version before display. The dedicated
translation request runs without conversation history, tool
calls, or streaming — the only role is translate-and-return.
Adds the `/translate` slash command, locale strings for the new UI
states, the post-hoc fallback module, the per-message
`TranslationStatus`, and threading through `core::ops`,
`core::engine`, `runtime_threads`, and the TUI app/UI surface.
Trust-boundary check: opt-in only — `translation_enabled` defaults
to false everywhere, so English-locale installs see zero behaviour
change. The system prompt addition is conditional on the runtime
flag, not the contributor's earlier always-on form. Threaded the
new `Locale::ZhHant` arm through the v0.8.32 `/change` slash
command match to keep the pattern exhaustiveness check passing.
Harvested from PR #1462 by @YaYII
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AtlasCloud (https://atlascloud.ai) hosts the V4 family on its own
DeepSeek-compatible endpoint at `https://api.atlascloud.ai/v1`, and
several contributors had been running it through the
OpenAI-compatible passthrough with manual `base_url` / model
overrides. Selecting `provider = "atlascloud"` in
`~/.deepseek/config.toml` (or via `DEEPSEEK_PROVIDER=atlascloud`)
now wires up:
- documented `DEFAULT_ATLASCLOUD_BASE_URL` /
`DEFAULT_ATLASCLOUD_MODEL` defaults so a fresh install needs
only the api_key
- a `[providers.atlascloud]` config block with the same fields
every other named provider exposes (api_key / base_url / model
/ http_headers)
- `ATLASCLOUD_API_KEY` env var path, including the secrets test
cleanup loop so per-test env hygiene continues to work
- the provider-picker / `/provider` slash command entries so the
provider is reachable from the runtime UI, not just config
- the env-driven `*_BASE_URL` override branch so users who pin a
proxy can still flip it without editing config.toml
Trust-boundary pins held: AtlasCloud is opt-in (default remains
DeepSeek), no API keys are hardcoded, the api_key resolution flows
through the same `secrets` crate path every other provider uses,
and the provider-config base_url stays settable per environment.
Resolved 3-way merge conflicts in `crates/secrets/src/lib.rs` (env
cleanup loop) and `crates/tui/src/config.rs` (per-provider
base_url match arm + `provider_passes_model_through` predicate)
so the contributor's AtlasCloud branch coexists with the v0.8.x
provider expansion already on `main`. Added the missing match arm
in `validate_provider_base_url` so the non-exhaustive-pattern
check passes after the new variant lands.
Harvested from PR #1436 by @lucaszhu-hue
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous one-line error told users to set
DEEPSEEK_ALLOW_INSECURE_HTTP=1 but the env var name is easy to typo
when you're staring at it in a terminal (sam43b in #1303 wrote
"DEEPSEEKALLOWINSECURE_HTTP"). Reformat the message to:
- Note that loopback hosts are auto-allowed (no env var needed)
- Show the env var with underscores explicit and prominent
- Include a one-line copy-pasteable example
No behavior change; same `validate_base_url_security` decisions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(client): stabilize reasoning_content replay for prompt cache
- stop gating assistant reasoning_content on whether a later user turn
exists; the field now depends only on the stored message itself
- preserve historical message bytes across turns so DeepSeek's prefix
cache stays warm on every text-reply follow-up
- add a byte-stability regression test and update the prior-non-tool
reasoning test to assert the new contract
* style(client): rustfmt single-line let binding
- collapse a two-line `let mut has_reasoning = ...` into a single line
so `cargo fmt --all -- --check` passes
Merge of PR #1196 by wplll. Adds:
Cache-aware prompt layering:
- PromptBuilder struct separates prompt construction from inspection
- System prompt split into named layers with stability classification
- Layers classified as static/history/dynamic for cache debugging
/cache inspect command:
- SHA-256 hashes of each rendered prompt layer
- Base static prefix hash vs full request prefix hash
- Static prefix stability status across turns
- First-divergence tracking from previous request
Wire payload optimization:
- Tool result budget: large outputs compacted before API request
- Tool result dedup: repeated outputs replaced by compact refs
- Turn metadata dedup: repeated <turn_meta> blocks deduplicated
- Wire-only: local session messages remain unchanged
Project context pack:
- Deterministic workspace summary injected into stable prefix
- Configurable via [context] project_pack = false
Cache warmup and improved footer cache display.
Thanks to wplll for the contribution.
* test: add reproducer for /models 404 on beta base URL
* fix: route non-beta paths to /v1 when base URL ends with /beta
---------
Co-authored-by: Hanmiao Li <894876246@qq.com>
Integrates the useful custom HTTP header support from #881 onto current main.
- support root, provider-specific, and DEEPSEEK_HTTP_HEADERS overrides
- apply validated extra headers to model API requests while preserving protected Authorization and Content-Type defaults
- document the config shape in README, config.example.toml, and docs/CONFIGURATION.md
Co-authored-by: Desheng <8596814+dst1213@users.noreply.github.com>
Deletes crates/tui/src/responses_api_proxy/ (443 LOC), client/responses.rs
(406 LOC), and removes the ResponsesApiProxy CLI command, the
EXPERIMENTAL_RESPONSES_API_ENV env var plumbing, chat_fallback_counter,
use_chat_completions, RESPONSES_RECOVERY_INTERVAL, and the
RequestPayloadMode::ResponsesApi variant. The experimental Responses API
path was never instantiated and had no documented users; removing it
simplifies the client surface for the upcoming --anthropic-wire flag.
Closes#723
`normalize_model_name` now passes v-series snapshots through unchanged
(deepseek-v4-flash-20260423 stays pinned, future v5-* matches via
regex). Removes ~245 LOC of legacy alias machinery: deepseek_legacy_aliases,
the chat/reasoner/r1/v3/v3.2 fold-arm, is_current_deepseek_v4_alias,
v4 fallback branch, alias capacity test seeds, alias config test block.
The migration from V3 → V4 is over; users on legacy names route their
own request to DeepSeek and see the server actual response (404 if
deprecated, success if still served). No more silent renaming.
Closes#717
47 fmt drifts had accumulated from the squash-merged community PRs on
this branch (#653, #654, #655, #645, #658, #668, #659, #661, #660,
#667, #656). Pure formatting — no behavioural changes — applied via
`cargo fmt --all` to satisfy CI's `cargo fmt --all -- --check` gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resolves the post-#514/#517/#518 conflicts:
- CHANGELOG.md: kept both polish-stack and Linux ARM64 entries under
[Unreleased]; reordered so the ARM64/install-message Changed/Docs
sections precede the Releases footer.
- config.example.toml: kept both the `instructions = [...]` example
and the `[memory]` opt-in stanza in sequence.
- crates/tui/src/config.rs: kept both `instructions_paths()` (#454)
and `memory_enabled()` (#489) on the Config impl.
- crates/tui/src/prompts.rs: extended
`system_prompt_for_mode_with_context_and_skills` to take BOTH
`instructions: Option<&[PathBuf]>` and `user_memory_block:
Option<&str>`. Section 2.5a renders instructions; 2.5b renders the
memory block — both above the skills block so KV prefix caching
still wins.
- crates/tui/src/core/engine.rs: thread both args through the two
call sites.
- crates/tui/src/prompts.rs: update the `system_prompt_for_mode_with_context`
forwarder and the test caller to pass `None` for the new arg.
- .gitignore: ignore `.claude/*.local.md` and `*.local.json` so
local ralph / Claude-Code notes can't leak into commits.
Folds in two valid suggestions from the gemini-code-assist review on #519:
- `client.rs`: collapse the duplicated `LlmError → label` match and the
`human_retry_reason` body into a single
`retry_reason_label_and_human(err) -> (&'static str, String)` helper.
- `widgets/footer.rs::retry_banner_spans`: merge the two separate
`match &props.retry` blocks into one that returns both `(label, color)`.
Behavior is unchanged; refactor is a pure DRY win.
Corporate users behind TLS-inspecting proxies (Zscaler, Netskope,
Palo Alto, in-house mitmproxy fleets) need to add the proxy's
intermediate CA to the trusted-roots set so the deepseek client
doesn't fail with `unable to get local issuer certificate`.
The reqwest builder already trusts the platform's system store
via native-tls. This adds opt-in support for the conventional
`SSL_CERT_FILE` env var so users can point at their own bundle:
* New `add_extra_root_certs(builder, path)` helper reads the
file, tries `Certificate::from_pem_bundle` (covers single-cert
files too), falls back to `from_der` for binary cert files.
* Wired into `build_http_client` when `SSL_CERT_FILE` is set
and non-empty. Failures log a warning via the existing
`logging::warn` channel and return the builder unchanged —
the existing system trust still applies, so a malformed env
var degrades gracefully instead of bricking the launch.
* Each successful load logs `info` with the cert count so
operators can confirm their bundle was picked up.
Documented in `docs/CONFIGURATION.md`'s environment-variables
list alongside the existing TLS-related notes.
No new dependency — reqwest's `native-tls` feature already
exposes `Certificate::from_pem_bundle` / `from_der`.
When the API client retries a 429 / 5xx / network failure, the TUI
previously went silent during the backoff sleep. The user saw "thinking"
or "ready" with no signal anything was wrong, until the request finally
either succeeded or raised. PR adds a foreground retry banner so the
user sees what's happening and how long until the next attempt.
### What ships
- New `crates/tui/src/retry_status.rs` module exposing a process-wide
`RetryState` (`Idle | Active(banner) | Failed { reason }`) with
`start`, `succeeded`, `failed`, and `clear` helpers. The state is
process-global because the user-facing TUI is one engine per process;
sub-agent retries deliberately don't light up the foreground banner.
- `client::send_with_retry` now flips the state in its retry callback
(`start(attempt+1, delay, reason)`) and on the final outcome
(`succeeded()` on Ok, `failed(reason)` on Err with retries-exhausted,
`clear()` on Err with attempts==1 so non-retryable errors don't pin
the failure row).
- `human_retry_reason` translates the structured `LlmError` into a
short label: rate-limit reasons include the `Retry-After` header
when the upstream provided one ("rate limited (Retry-After 30s)").
- Footer's `render` checks `retry_status::snapshot()` first; when
`Active` it renders `⟳ retry N in Ms — <reason>` in the warning
color; when `Failed` it renders `× failed: <reason>` in the error
color. Banner takes precedence over the toast and the regular
status line.
- `Engine::handle_user_message` calls `retry_status::clear()` right
after emitting `TurnStarted` so the previous turn's failure row
doesn't bleed into a new turn.
### Tests
- 4 unit tests in `retry_status::tests` covering idle default, the
active → succeeded round-trip, the failed-state pin, and a
past-deadline saturation.
- 2 footer rendering tests asserting the banner / failure-row text
appears in the rendered buffer.
- All tests touching the global retry surface serialize through
`retry_status::test_guard()` so cargo's parallel runner can't observe
a torn read.
### Verification
cargo fmt --all -- --check ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓
cargo test --workspace --all-features --locked ✓ 1815 + supporting (was 1809 on stabilization base)
cargo test -p deepseek-tui --bin deepseek-tui --locked retry ✓ 28 passed
Closes#499
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add DeepseekCN as a first-class provider variant with:
- Enum variant + parse/as_str/display_name/all methods
- DEFAULT_DEEPSEEKCN_BASE_URL (https://api.deepseeki.com)
- Auto-detection when base_url contains api.deepseeki.com
- Locale-based auto-suggest: if no provider is configured and
system locale (LC_ALL/LC_MESSAGES/LANG) starts with 'zh-*',
the TUI defaults to DeepseekCN at startup
- ProvidersConfig.deepseek_cn for provider-scoped credentials
- All match arms updated across config.rs, client.rs,
provider_picker.rs, main.rs, and ui.rs
- provider_picker tests updated for the 7th provider entry
#167: Fix all 7 clippy warnings — annotated SeamMetadata dead fields,
removed unused should_cycle calls, collapsed nested ifs, fixed
useless_format and nonminimal_bool.
#168: Wire TokenUsage mailbox drain to subagent_cost accumulator.
handle_subagent_mailbox now intercepts TokenUsage before routing to
cards, computes cost via calculate_turn_cost, and increments
app.subagent_cost in real time. Footer reflects live sub-agent spend.
Restored ArchivedContext variant to HistoryCell (corrupted by prior
apply_patch). Version bump to 0.7.2.
Refs: #166, #167, #168
Phase 1: log full reqwest error chain + headers + bytes-received at decode site
Phase 2: HTTP/2 keepalive settings + tcp keepalive on the reqwest builder
Phase 3: engine transparently retries when stream errors before any content;
surface error on mid-stream failure (no double-bill); stream_errors
threshold relaxed 3 -> 5 with the new keepalive
Phase 4: unit tests for the four classes of stream failure
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two fixes for the persistent "Stream read error: error decoding response
body" we saw mid-turn during long V4-pro thinking sessions.
1) HTTP transport tuning (`crates/tui/src/client.rs`):
- Drop the blanket 300s request timeout. Long V4 thinking turns
legitimately exceed the wall-clock window; per-chunk and per-stream
guards in `engine.rs` already bound how long we wait without progress.
- Add `tcp_keepalive(30s)` so dead-peer detection happens at the TCP
layer instead of waiting for the application to notice.
- Add `http2_keep_alive_interval(15s)` + `http2_keep_alive_timeout(20s)`
so HTTP/2 connections to DeepSeek's edge don't go silent and get
killed by an upstream proxy mid-thinking.
2) Stream-error diagnostics (`crates/tui/src/client/chat.rs`):
- Walk reqwest's `std::error::Error::source()` chain when a chunk read
errors, so the underlying hyper / h2 / io error is logged. Without
this the outer "error decoding response body" message tells us
nothing about WHY the stream died.
- Track elapsed wall time, bytes received so far, and ms since the
last successful event; log them alongside the error chain. Lets us
tell HTTP/2 RST_STREAM mid-idle from chunk-decode-failure on a
short stream from gzip-corruption mid-burst.
Phase 3 (transparent retry with `prefix` continuation) is intentionally
NOT in this PR. The retry-flag plumbing on MessageRequest + chat.rs prefix
wire format + engine.rs retry loop is a meaningful surface that deserves
its own review pass; this PR ships the diagnostic-and-resilience floor so
we can land the harder retry work knowing the underlying network state is
better.
Refs #103.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Split client.rs into client/mod.rs (public API + helpers), client/chat.rs
(chat-completions streaming), and client/responses.rs (responses API
helpers). Internal helpers promoted to pub(super) for intra-module
visibility; the public DeepSeekClient API is unchanged.
While here, redesign all five system prompts around decomposition-first
philosophy inspired by the mismanaged-geniuses hypothesis (Zhang et al.,
2026). The model is now instructed to todo_write / update_plan before
acting, fan out sub-agents for parallel work, and keep the sidebar
populated so the user always sees what's happening. Mode prompts updated:
- agent.txt: 'Before requesting approval, lay out work with todo_write'
- plan.txt: 'Use update_plan for strategy, todo_write for tactics'
- yolo.txt: 'Even with auto-approval, create a todo_write first'
- normal.txt: same pattern for legacy compatibility
Update CHANGELOG [Unreleased] and README modes section accordingly.
Two related TUI affordances bundled because they share ui.rs and the
ui/tests.rs file.
#30 — Reasoning-content replay telemetry, end-to-end:
* models.rs — Usage gains reasoning_replay_tokens: Option<u32>.
* client.rs — sanitize_thinking_mode_messages now returns the
approximate replay-token count (~4 chars/token); the streaming
pipeline overlays it onto the parsed MessageDelta usage so the
server-reported and client-estimated numbers reach the engine
together.
* app.rs — App stores last_reasoning_replay_tokens.
* ui.rs — TurnComplete handler copies the value into the App; new
footer_reasoning_replay_spans renders an `rsn N.Nk` chip in the
footer next to the cache hit-rate, warning-coloured when replay
tokens exceed 50% of the input budget.
* ui/tests.rs — covers chip-on, chip-hidden-when-zero, and the
sanitizer's None-on-non-thinking-model path.
#28 — Tab-complete @file mentions against the workspace:
* ui.rs — adds partial_file_mention_at_cursor (with a guard against
`user@example.com`-style false positives) and
try_autocomplete_file_mention. Walks the workspace via the
existing ignore::WalkBuilder, ranks prefix matches above
substring matches, applies the unique match outright, extends to
the longest common prefix when multiple match, and surfaces
ambiguous candidates via the status line. Wired into the existing
Tab handler after the slash-command branch.
* ui/tests.rs — covers cursor-inside-mention extraction, email
guard, prefix vs substring ranking, single-match application,
common-prefix extension, no-match status, and the
no-mention-no-op path.
The mention-expansion path that ships file contents to the model is
unchanged — this is purely a discovery aid for typing the path.
Inline-contents and a fuzzy popup picker are queued for v0.5.2.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The thinking-mode sanitizer now sums the byte size of every replayed
`reasoning_content` field in the outgoing chat-completions body and
emits an `info`-level log line:
Reasoning-content replay: 7 assistant message(s), ~3.2K input tokens (12,884 chars) being re-sent in this request
This is visible under `RUST_LOG=deepseek_tui=info` (or higher). It's the
first step toward the footer/status-line indicator described in #30 —
the model's input-side reasoning replay is now observable per turn,
even before it gets a dedicated UI surface.
Tests cover both branches: bodies that already have reasoning_content
(count is summed across all assistant turns) and bodies where the
sanitizer had to inject the `(reasoning omitted)` placeholder (the
placeholder bytes are included in the count since they ship over the
wire).
Footer integration deferred — that needs a new event from client → engine
→ TUI to surface the count alongside `cache N%` / `$X.XX`. Part of #30
remains open.
### Fixed
- DeepSeek thinking-mode tool-call rounds now always replay reasoning_content
in all subsequent requests (including across new user turns), matching the
documented API contract that assistant tool-call messages must retain their
reasoning content forever. Previously, reasoning_content was cleared after
the current user turn completed, which could cause HTTP 400 errors.
- Missing reasoning_content on a tool-call assistant message now substitutes
a safe placeholder ("(reasoning omitted)") instead of dropping the tool
calls and their matching tool results, preventing orphaned conversation
chains and API 400 rejections.
- Session checkpoint now persists a Thinking-block placeholder for tool-call
turns that produced no streamed reasoning text, keeping on-disk sessions
structurally correct for subsequent requests.
- Token estimation for compaction now counts thinking tokens across ALL
tool-call rounds (not just the current user turn), aligning with the
updated reasoning_content replay rule.
### Changed
- Internal crate dependency pins bumped 0.4.5 → 0.4.9 to match workspace.
- npm wrapper version and deepseekBinaryVersion bumped to 0.4.9.
- README fully rewritten: clearer feature highlights, V4 model focus,
keyboard shortcut table, improved docs index, and more engaging layout.
- CHANGELOG entry for 0.4.9 with comparison URLs.