codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
Hunter Bown	5fa24733e9	chore(rebrand): update repository links for CodeWhale	2026-05-23 14:07:36 -05:00
Hunter Bown	32ce14d6b2	test(rebrand): residual brand-string cleanup across source and assets A small cleanup pass to catch brand mentions that the R5 sweep missed because they hid in: - HTTP User-Agent format strings (`Mozilla/5.0 (compatible; deepseek-tui/` in `client.rs` and `fetch_url.rs`). - Multi-line error messages whose phrase boundary straddled a line break ("…restart\n deepseek-tui." in `js_execution.rs`, `tool_catalog.rs`, `repl/runtime.rs`). - Doc comments mentioning `deepseek-tui` as a binary (`config/src/lib.rs`, `core/capacity.rs`, `tui/streaming/chunking.rs`, `features.rs`). - Skill descriptions shipped in `crates/tui/assets/skills/*/SKILL.md`. - Test fixtures with placeholder paths / git emails (`tui/external_editor.rs`, `snapshot/repo.rs`). - `task_manager.rs`'s `cargo test -p deepseek-tui --lib` example. - `scripts/tencent-lighthouse/doctor.sh` info-line prefix. The remaining `deepseek-tui` mentions in the codebase are intentional (the legacy `[[bin]]` entry in `crates/tui/Cargo.toml`, the legacy `npm/deepseek-tui/` deprecation shim package, the CNB mirror namespace, the security email, the legacy bin's shim source file, and historical CHANGELOG entries) and were preserved per the rebrand anti-scope. Local gates green: `cargo check --workspace --all-targets --locked`, `cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings`, `cargo test --workspace --all-features --locked` (3226+ pass, 0 fail). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 11:58:34 -05:00
Hunter Bown	8597afc076	feat(provider): add Wanjie Ark support	2026-05-21 00:02:02 +08:00
Hunter Bown	912da38cca	chore(release): prepare v0.8.40	2026-05-18 23:29:20 +08:00
Zhongyue Lin	323f43df60	fix(client): align stream reasoning classification with replay (review #1743) Address gemini-code-assist review on PR #1743: - HIGH: should_replay_reasoning_content_for_provider was made model-aware in the previous commit, but handle_chat_completion_stream still computed is_reasoning_model = requires_reasoning_content(model) && provider_accepts_reasoning_content(provider). On the openai provider + a DeepSeek model that was false during SSE parsing, so reasoning tokens were stored as content (not reasoning_content) and the next request still 400'd -- the fix was incomplete. Extract is_reasoning_model_for_stream() and route the stream call site through it; add an equivalence test locking it to the replay predicate so the two paths can't drift. - MEDIUM: rename generic_openai_provider_drops_deepseek_reasoning_content -> generic_openai_provider_drops_reasoning_content_for_non_deepseek_models (now uses gpt-4o; old name was misleading). Non-DeepSeek models on any provider are unaffected (#1542 not regressed). Refs #1739, #1694, #1542.	2026-05-18 23:06:52 +08:00
Zhongyue Lin	1a5ee2f67d	fix(client): replay reasoning_content for DeepSeek models on openai provider (#1739) should_replay_reasoning_content_for_provider() returned false whenever provider_accepts_reasoning_content(provider) was false (as it is for ApiProvider::Openai) without checking the model. This single gate feeds both build_for_provider (include_reasoning) and sanitize_thinking_mode_messages, so a DeepSeek reasoning model on the generic openai provider (DeepSeek-compatible endpoint) had all reasoning_content stripped -> the DeepSeek thinking-mode API 400s ('reasoning_content in the thinking mode must be passed back'). This is the over-aggressive half of `ac01b225` (fix #1542). Gate the early return on the model too: !provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model). Known DeepSeek reasoning models replay regardless of provider; genuine non-DeepSeek models on openai still strip (effort=off still wins). #1542 not regressed (provider_accepts_reasoning_content untouched). Two pre-existing client.rs tests asserted the buggy case (deepseek-v4-pro on Openai -> dropped); retargeted to gpt-4o to preserve their #1542 intent without encoding the bug. New positive/negative coverage in chat.rs. Refs #1739, #1694, #1542, #1736.	2026-05-18 23:06:52 +08:00
Hunter Bown	ae9e4b4b24	fix(client): omit strict OpenAI-incompatible fields	2026-05-14 15:25:40 -05:00
Hunter Bown	ac01b225e7	fix(client): strip reasoning content for generic providers Fixes #1542.	2026-05-13 00:17:14 -05:00
Hunter Bown	99c6b22e83	chore(release): v0.8.33 — sub-agent and RLM renovation with persistent sessions - Persistent RLM sessions (rlm_open/rlm_eval/rlm_close) with bounded REPL helpers - Fork-aware sub-agent sessions (agent_open/agent_eval/agent_close) with handle_read - Shared handle_read storage with slice/range/count/JSONPath projections - Slash-command routing: /rlm, /agent, /relay (/接力) for handoff prompts - Sidebar renamed to "Work" tab, consistent across Plan/Agent/YOLO modes - Tool papercuts: file_search excludes, grep_files strings, fetch_url JSON, edit_file fuzz, exec_shell merged stdout/stderr, revert_turn no-op reject - CLI reasoning-effort honoured on non-auto exec routes (#1511 @h3c-hexin) - Edit-file replacement boundaries clarified (#1516) - Pandoc output validated before probing (#1523) - Running turns steerable/repaintable (#1533, #1537) - Tasks/Activity Detail calmer under load - npm retry timeout hint (#1538 @reidliu41) - Issue templates improved (#1525 @reidliu41) - Shell: kill process group to prevent UI freeze (#828 @CrepuscularIRIS) - TUI: ignore leaked SGR mouse reports in composer (#1421 @reidliu41) - Footer: keep chips within available width (#1417 @Wenjunyun123) - Session picker: scope Ctrl+R to current workspace (#1395 @LinQ) - Removed stale competitive-analysis doc - Prompts/docs teach only new tool names	2026-05-12 19:54:08 -05:00
Hunter Bown	dcc2c448eb	fix(client): vLLM uses chat_template_kwargs to toggle reasoning, not the Anthropic field `apply_reasoning_effort`'s vLLM branch was injecting `thinking: {type: "disabled"}` at the top of the request body to turn off model reasoning. But vLLM speaks OpenAI's chat-completions protocol, not Anthropic-native extension fields, and silently ignored that directive — the model emitted a full hidden reasoning trace into the non-OpenAI-standard `reasoning` field (which this client does not surface), so users saw a ~13-second perceived freeze before the first content token arrived. The vLLM branch now emits the OpenAI extension `chat_template_kwargs.enable_thinking` — the canonical way to toggle Qwen3's `<think>` mode, DeepSeek-R1's reasoning trace, and any other reasoning-capable model served via vLLM. End-to-end measurement against vLLM hosting Qwen3.6-35B-A3B-FP8: - TTFT: 13039ms → 274ms - Total LLM call: 13s → 5.7s - Output rate: 3 ch/s → 46 ch/s The `high` / `max` reasoning levels likewise route through `chat_template_kwargs` so the toggle is consistent across effort levels. No change for any non-vLLM provider (NVIDIA NIM continues to use the NVIDIA-specific `chat_template_kwargs.thinking` key; Anthropic-native providers keep the Anthropic-native field). Resolved a 3-way merge conflict against the v0.8.32 AtlasCloud harvest (PR #1436) so AtlasCloud stays in the no-op match arm alongside OpenAI / Ollama while the new vLLM arm gets its own branch. Note for future Sglang / Fireworks / Novita work: those servers likely have the same bug but each has its own chat_template_kwargs schema; this PR is intentionally minimal to the verified-fix scope. Harvested from PR #1480 by @h3c-hexin Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 01:25:16 -05:00
Hunter Bown	d62facafac	feat(translate): opt-in /translate command localises model replies Two-layer design for users whose UI locale is not English: 1. System-prompt directive (primary): when the user enables translation via `/translate`, a `## Language Output Requirement` block is appended to the system prompt instructing the model to reply in the resolved session locale (Simplified Chinese, Traditional Chinese, Japanese, or Brazilian Portuguese). Code identifiers, technical terms without an established translation, and code blocks the user explicitly requests in English are exempt. The block is gated on `PromptSessionContext.translation_enabled`, so it adds zero tokens for installs that don't opt in. 2. Post-hoc heuristic (fallback): a lightweight detector in `tui::translation` compares Latin-letter count against weighted CJK characters (CJK chars carry ~3× the information per glyph, so the ratio comparison stays fair across mixed code+prose). When a reply still surfaces English despite the directive, the detector flags it and a focused per-message `client.translate()` call renders the localised version before display. The dedicated translation request runs without conversation history, tool calls, or streaming — the only role is translate-and-return. Adds the `/translate` slash command, locale strings for the new UI states, the post-hoc fallback module, the per-message `TranslationStatus`, and threading through `core::ops`, `core::engine`, `runtime_threads`, and the TUI app/UI surface. Trust-boundary check: opt-in only — `translation_enabled` defaults to false everywhere, so English-locale installs see zero behaviour change. The system prompt addition is conditional on the runtime flag, not the contributor's earlier always-on form. Threaded the new `Locale::ZhHant` arm through the v0.8.32 `/change` slash command match to keep the pattern exhaustiveness check passing. Harvested from PR #1462 by @YaYII Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 00:46:50 -05:00
Hunter Bown	8f33e4bd48	feat(providers): add AtlasCloud as a first-class provider AtlasCloud (https://atlascloud.ai) hosts the V4 family on its own DeepSeek-compatible endpoint at `https://api.atlascloud.ai/v1`, and several contributors had been running it through the OpenAI-compatible passthrough with manual `base_url` / model overrides. Selecting `provider = "atlascloud"` in `~/.deepseek/config.toml` (or via `DEEPSEEK_PROVIDER=atlascloud`) now wires up: - documented `DEFAULT_ATLASCLOUD_BASE_URL` / `DEFAULT_ATLASCLOUD_MODEL` defaults so a fresh install needs only the api_key - a `[providers.atlascloud]` config block with the same fields every other named provider exposes (api_key / base_url / model / http_headers) - `ATLASCLOUD_API_KEY` env var path, including the secrets test cleanup loop so per-test env hygiene continues to work - the provider-picker / `/provider` slash command entries so the provider is reachable from the runtime UI, not just config - the env-driven `*_BASE_URL` override branch so users who pin a proxy can still flip it without editing config.toml Trust-boundary pins held: AtlasCloud is opt-in (default remains DeepSeek), no API keys are hardcoded, the api_key resolution flows through the same `secrets` crate path every other provider uses, and the provider-config base_url stays settable per environment. Resolved 3-way merge conflicts in `crates/secrets/src/lib.rs` (env cleanup loop) and `crates/tui/src/config.rs` (per-provider base_url match arm + `provider_passes_model_through` predicate) so the contributor's AtlasCloud branch coexists with the v0.8.x provider expansion already on `main`. Added the missing match arm in `validate_provider_base_url` so the non-exhaustive-pattern check passes after the new variant lands. Harvested from PR #1436 by @lucaszhu-hue Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-12 00:40:43 -05:00
Hunter Bown	6c25a18b42	chore(release): bump to v0.8.27, add CHANGELOG	2026-05-10 08:41:04 -05:00
imkingjh999	9fae156182	fix: address PR review feedback - Use CARGO_PKG_VERSION for User-Agent instead of hardcoded version - Restore default_model fallback for backward compatibility	2026-05-10 08:37:25 -05:00
dst1213	682e915857	fix: retry quota errors returned as HTTP 400	2026-05-10 08:15:19 -05:00
Hunter Bown	829270a8e7	fix(client): clearer insecure-base-URL rejection (#1303) The previous one-line error told users to set DEEPSEEK_ALLOW_INSECURE_HTTP=1 but the env var name is easy to typo when you're staring at it in a terminal (sam43b in #1303 wrote "DEEPSEEKALLOWINSECURE_HTTP"). Reformat the message to: - Note that loopback hosts are auto-allowed (no env var needed) - Show the env var with underscores explicit and prominent - Include a one-line copy-pasteable example No behavior change; same `validate_base_url_security` decisions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-10 00:27:56 -05:00
Duducoco	e10e53d396	fix(client): stabilize reasoning_content replay for prompt cache (#1297) * fix(client): stabilize reasoning_content replay for prompt cache - stop gating assistant reasoning_content on whether a later user turn exists; the field now depends only on the stored message itself - preserve historical message bytes across turns so DeepSeek's prefix cache stays warm on every text-reply follow-up - add a byte-stability regression test and update the prior-non-tool reasoning test to assert the new contract * style(client): rustfmt single-line let binding - collapse a two-line `let mut has_reasoning = ...` into a single line so `cargo fmt --all -- --check` passes	2026-05-09 12:34:36 -05:00
Hunter Bown	54ca5718d2	feat(cache): cache-aware prompt diagnostics and wire payload optimization (#1196) Merge of PR #1196 by wplll. Adds: Cache-aware prompt layering: - PromptBuilder struct separates prompt construction from inspection - System prompt split into named layers with stability classification - Layers classified as static/history/dynamic for cache debugging /cache inspect command: - SHA-256 hashes of each rendered prompt layer - Base static prefix hash vs full request prefix hash - Static prefix stability status across turns - First-divergence tracking from previous request Wire payload optimization: - Tool result budget: large outputs compacted before API request - Tool result dedup: repeated outputs replaced by compact refs - Turn metadata dedup: repeated <turn_meta> blocks deduplicated - Wire-only: local session messages remain unchanged Project context pack: - Deterministic workspace summary injected into stable prefix - Configurable via [context] project_pack = false Cache warmup and improved footer cache display. Thanks to wplll for the contribution.	2026-05-09 00:20:13 -05:00
Hunter Bown	f183501fbd	fix(client): route non-beta paths from beta base to v1 (#1174) * test: add reproducer for /models 404 on beta base URL * fix: route non-beta paths to /v1 when base URL ends with /beta --------- Co-authored-by: Hanmiao Li <894876246@qq.com>	2026-05-08 02:59:56 -05:00
Hunter Bown	f97604c3f0	fix(provider): enable OpenAI-compatible TUI runtime (#1017)	2026-05-07 05:32:15 -05:00
Hunter Bown	3e2c832933	fix(api): narrow reasoning replay policy (#1009)	2026-05-07 04:45:55 -05:00
Hunter Bown	323598e764	fix(api): harden strict tool schemas (#1005)	2026-05-07 04:12:22 -05:00
Hunter Bown	c7ed05a07c	feat(api): default DeepSeek to beta endpoint Closes #941. Refs #938, #939, #940.	2026-05-06 21:24:59 -05:00
Reid	78c415f40c	feat(provider): add Ollama provider support (#921) Source PR: #921 by @reidliu41. Closes #908. Local verification: - cargo test --workspace --all-features ollama - cargo fmt --all -- --check - cargo build Co-authored-by: reidliu41 <reid201711@gmail.com>	2026-05-06 20:16:46 -05:00
Hunter Bown	633092167c	feat(config): support custom HTTP headers (#914) Integrates the useful custom HTTP header support from #881 onto current main. - support root, provider-specific, and DEEPSEEK_HTTP_HEADERS overrides - apply validated extra headers to model API requests while preserving protected Authorization and Content-Type defaults - document the config shape in README, config.example.toml, and docs/CONFIGURATION.md Co-authored-by: Desheng <8596814+dst1213@users.noreply.github.com>	2026-05-06 18:13:18 -05:00
Agent007	a335ff5e4c	feat(provider): add vLLM provider support (#737) Add vLLM as a first-class OpenAI-compatible self-hosted provider with VLLM_BASE_URL, VLLM_API_KEY, and VLLM_MODEL wiring.	2026-05-05 21:22:24 -05:00
Hunter Bown	ab59ef8ff2	fix(cost): count V4 reasoning tokens in usage output (#762)	2026-05-05 19:57:25 -05:00
Hunter Bown	c4cbd7c19f	chore(release): finalize v0.8.13 stabilization	2026-05-05 13:06:09 -05:00
Hunter Bown	6b0a01d054	feat(client): remove dead responses_api_proxy module and EXPERIMENTAL_RESPONSES_API_ENV plumbing Deletes crates/tui/src/responses_api_proxy/ (443 LOC), client/responses.rs (406 LOC), and removes the ResponsesApiProxy CLI command, the EXPERIMENTAL_RESPONSES_API_ENV env var plumbing, chat_fallback_counter, use_chat_completions, RESPONSES_RECOVERY_INTERVAL, and the RequestPayloadMode::ResponsesApi variant. The experimental Responses API path was never instantiated and had no documented users; removing it simplifies the client surface for the upcoming --anthropic-wire flag. Closes #723	2026-05-05 04:22:35 -05:00
Hunter Bown	310c975597	feat(models): preserve dated variant suffixes; remove legacy alias machinery `normalize_model_name` now passes v-series snapshots through unchanged (deepseek-v4-flash-20260423 stays pinned, future v5-* matches via regex). Removes ~245 LOC of legacy alias machinery: deepseek_legacy_aliases, the chat/reasoner/r1/v3/v3.2 fold-arm, is_current_deepseek_v4_alias, v4 fallback branch, alias capacity test seeds, alias config test block. The migration from V3 → V4 is over; users on legacy names route their own request to DeepSeek and see the server's actual response (404 if deprecated, success if still served). No more silent renaming. Closes #717	2026-05-05 03:49:20 -05:00
Hunter Bown	a9dcf2b6e6	style: cargo fmt sweep across community PRs 47 fmt drifts had accumulated from the squash-merged community PRs on this branch (#653, #654, #655, #645, #658, #668, #659, #661, #660, #667, #656). Pure formatting — no behavioural changes — applied via `cargo fmt --all` to satisfy CI's `cargo fmt --all -- --check` gate. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 01:57:25 -05:00
wangfeng	fd8c9fdb20	feat(tools): FIM edit tool + strict tool mode for V4 endpoints (closes #662)	2026-05-04 18:11:56 -07:00
wangfeng	3edcc6dacb	feat(tools): FIM edit tool stub for V4 /beta endpoint (closes #662)	2026-05-04 18:05:48 -07:00
Hunter Bown	bda30b0fd6	Merge main into feat/v0.8.8-tui-polish + gemini-code-assist feedback Resolves the post-#514/#517/#518 conflicts: - CHANGELOG.md: kept both polish-stack and Linux ARM64 entries under [Unreleased]; reordered so the ARM64/install-message Changed/Docs sections precede the Releases footer. - config.example.toml: kept both the `instructions = [...]` example and the `[memory]` opt-in stanza in sequence. - crates/tui/src/config.rs: kept both `instructions_paths()` (#454) and `memory_enabled()` (#489) on the Config impl. - crates/tui/src/prompts.rs: extended `system_prompt_for_mode_with_context_and_skills` to take BOTH `instructions: Option<&[PathBuf]>` and `user_memory_block: Option<&str>`. Section 2.5a renders instructions; 2.5b renders the memory block — both above the skills block so KV prefix caching still wins. - crates/tui/src/core/engine.rs: thread both args through the two call sites. - crates/tui/src/prompts.rs: update the `system_prompt_for_mode_with_context` forwarder and the test caller to pass `None` for the new arg. - .gitignore: ignore `.claude/.local.md` and `.local.json` so local ralph / Claude-Code notes can't leak into commits. Folds in two valid suggestions from the gemini-code-assist review on #519: - `client.rs`: collapse the duplicated `LlmError → label` match and the `human_retry_reason` body into a single `retry_reason_label_and_human(err) -> (&'static str, String)` helper. - `widgets/footer.rs::retry_banner_spans`: merge the two separate `match &props.retry` blocks into one that returns both `(label, color)`. Behavior is unchanged; refactor is a pure DRY win.	2026-05-03 08:29:59 -05:00
Hunter Bown	604edc9f83	feat(tls): honor SSL_CERT_FILE for corporate-CA / MITM proxies (#418) Corporate users behind TLS-inspecting proxies (Zscaler, Netskope, Palo Alto, in-house mitmproxy fleets) need to add the proxy's intermediate CA to the trusted-roots set so the deepseek client doesn't fail with `unable to get local issuer certificate`. The reqwest builder already trusts the platform's system store via native-tls. This adds opt-in support for the conventional `SSL_CERT_FILE` env var so users can point at their own bundle: * New `add_extra_root_certs(builder, path)` helper reads the file, tries `Certificate::from_pem_bundle` (covers single-cert files too), falls back to `from_der` for binary cert files. * Wired into `build_http_client` when `SSL_CERT_FILE` is set and non-empty. Failures log a warning via the existing `logging::warn` channel and return the builder unchanged — the existing system trust still applies, so a malformed env var degrades gracefully instead of bricking the launch. * Each successful load logs `info` with the cert count so operators can confirm their bundle was picked up. Documented in `docs/CONFIGURATION.md`'s environment-variables list alongside the existing TLS-related notes. No new dependency — reqwest's `native-tls` feature already exposes `Certificate::from_pem_bundle` / `from_der`.	2026-05-03 07:35:23 -05:00
Hunter Bown	8680a43298	feat(tui): visual retry/backoff countdown in footer (#499) When the API client retries a 429 / 5xx / network failure, the TUI previously went silent during the backoff sleep. The user saw "thinking" or "ready" with no signal anything was wrong, until the request finally either succeeded or raised. PR adds a foreground retry banner so the user sees what's happening and how long until the next attempt. ### What ships - New `crates/tui/src/retry_status.rs` module exposing a process-wide `RetryState` (`Idle | Active(banner) | Failed { reason }`) with `start`, `succeeded`, `failed`, and `clear` helpers. The state is process-global because the user-facing TUI is one engine per process; sub-agent retries deliberately don't light up the foreground banner. - `client::send_with_retry` now flips the state in its retry callback (`start(attempt+1, delay, reason)`) and on the final outcome (`succeeded()` on Ok, `failed(reason)` on Err with retries-exhausted, `clear()` on Err with attempts==1 so non-retryable errors don't pin the failure row). - `human_retry_reason` translates the structured `LlmError` into a short label: rate-limit reasons include the `Retry-After` header when the upstream provided one ("rate limited (Retry-After 30s)"). - Footer's `render` checks `retry_status::snapshot()` first; when `Active` it renders `⟳ retry N in Ms — <reason>` in the warning color; when `Failed` it renders `× failed: <reason>` in the error color. Banner takes precedence over the toast and the regular status line. - `Engine::handle_user_message` calls `retry_status::clear()` right after emitting `TurnStarted` so the previous turn's failure row doesn't bleed into a new turn. ### Tests - 4 unit tests in `retry_status::tests` covering idle default, the active → succeeded round-trip, the failed-state pin, and a past-deadline saturation. - 2 footer rendering tests asserting the banner / failure-row text appears in the rendered buffer. - All tests touching the global retry surface serialize through `retry_status::test_guard()` so cargo's parallel runner can't observe a torn read. ### Verification cargo fmt --all -- --check ✓ cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓ cargo test --workspace --all-features --locked ✓ 1815 + supporting (was 1809 on stabilization base) cargo test -p deepseek-tui --bin deepseek-tui --locked retry ✓ 28 passed Closes #499 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 03:08:53 -05:00
Hunter Bown	e5f56dee82	feat(config): add ApiProvider::DeepseekCN variant for China endpoint (#361) Add DeepseekCN as a first-class provider variant with: - Enum variant + parse/as_str/display_name/all methods - DEFAULT_DEEPSEEKCN_BASE_URL (https://api.deepseeki.com) - Auto-detection when base_url contains api.deepseeki.com - Locale-based auto-suggest: if no provider is configured and system locale (LC_ALL/LC_MESSAGES/LANG) starts with 'zh-*', the TUI defaults to DeepseekCN at startup - ProvidersConfig.deepseek_cn for provider-scoped credentials - All match arms updated across config.rs, client.rs, provider_picker.rs, main.rs, and ui.rs - provider_picker tests updated for the 7th provider entry	2026-05-02 02:10:57 -05:00
Hunter Bown	6d8ab4c2b8	fix: close v0.7.2 issue cleanup	2026-04-28 23:09:19 -05:00
Hunter Bown	0f8c363012	fix: clean v0.7.2 release prep	2026-04-28 21:54:43 -05:00
Hunter Bown	35db361a87	v0.7.2: clippy clean, cost counter wiring, layered context fixup #167: Fix all 7 clippy warnings — annotated SeamMetadata dead fields, removed unused should_cycle calls, collapsed nested ifs, fixed useless_format and nonminimal_bool. #168: Wire TokenUsage mailbox drain to subagent_cost accumulator. handle_subagent_mailbox now intercepts TokenUsage before routing to cards, computes cost via calculate_turn_cost, and increments app.subagent_cost in real time. Footer reflects live sub-agent spend. Restored ArchivedContext variant to HistoryCell (corrupted by prior apply_patch). Version bump to 0.7.2. Refs: #166, #167, #168	2026-04-28 21:46:25 -05:00
Hunter Bown	48c30473da	Merge branch 'feat/v067-providers' (#52 OpenRouter + Novita providers)	2026-04-27 22:17:27 -05:00
Hunter Bown	f118db8201	feat(providers): #52 OpenRouter + Novita as first-class providers ProviderKind gains Openrouter + Novita variants; ModelRegistry registers deepseek/deepseek-v4-{pro,flash} against both. /provider opens a picker modal with inline API-key prompt for un-configured providers. Env fallbacks: OPENROUTER_API_KEY, NOVITA_API_KEY. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 21:58:51 -05:00
Hunter Bown	36320c5bea	fix(client): #103 stream-error diagnostics + transparent retry on early decode failure Phase 1: log full reqwest error chain + headers + bytes-received at decode site Phase 2: HTTP/2 keepalive settings + tcp keepalive on the reqwest builder Phase 3: engine transparently retries when stream errors before any content; surface error on mid-stream failure (no double-bill); stream_errors threshold relaxed 3 -> 5 with the new keepalive Phase 4: unit tests for the four classes of stream failure Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 21:57:13 -05:00
Hunter Bown	bbdfb26f3c	fix(client): TCP/HTTP2 keepalives + stream-error diagnostics (#103 Phase 1+2) Two fixes for the persistent "Stream read error: error decoding response body" we saw mid-turn during long V4-pro thinking sessions. 1) HTTP transport tuning (`crates/tui/src/client.rs`): - Drop the blanket 300s request timeout. Long V4 thinking turns legitimately exceed the wall-clock window; per-chunk and per-stream guards in `engine.rs` already bound how long we wait without progress. - Add `tcp_keepalive(30s)` so dead-peer detection happens at the TCP layer instead of waiting for the application to notice. - Add `http2_keep_alive_interval(15s)` + `http2_keep_alive_timeout(20s)` so HTTP/2 connections to DeepSeek's edge don't go silent and get killed by an upstream proxy mid-thinking. 2) Stream-error diagnostics (`crates/tui/src/client/chat.rs`): - Walk reqwest's `std::error::Error::source()` chain when a chunk read errors, so the underlying hyper / h2 / io error is logged. Without this the outer "error decoding response body" message tells us nothing about WHY the stream died. - Track elapsed wall time, bytes received so far, and ms since the last successful event; log them alongside the error chain. Lets us tell HTTP/2 RST_STREAM mid-idle from chunk-decode-failure on a short stream from gzip-corruption mid-burst. Phase 3 (transparent retry with `prefix` continuation) is intentionally NOT in this PR. The retry-flag plumbing on MessageRequest + chat.rs prefix wire format + engine.rs retry loop is a meaningful surface that deserves its own review pass; this PR ships the diagnostic-and-resilience floor so we can land the harder retry work knowing the underlying network state is better. Refs #103. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 17:19:42 -05:00
Hunter Bown	c5a584d5c3	refactor(client): extract chat + responses into folder module (P1.1) Split client.rs into client/mod.rs (public API + helpers), client/chat.rs (chat-completions streaming), and client/responses.rs (responses API helpers). Internal helpers promoted to pub(super) for intra-module visibility; the public DeepSeekClient API is unchanged. While here, redesign all five system prompts around decomposition-first philosophy inspired by the mismanaged-geniuses hypothesis (Zhang et al., 2026). The model is now instructed to todo_write / update_plan before acting, fan out sub-agents for parallel work, and keep the sidebar populated so the user always sees what's happening. Mode prompts updated: - agent.txt: 'Before requesting approval, lay out work with todo_write' - plan.txt: 'Use update_plan for strategy, todo_write for tactics' - yolo.txt: 'Even with auto-approval, create a todo_write first' - normal.txt: same pattern for legacy compatibility Update CHANGELOG [Unreleased] and README modes section accordingly.	2026-04-26 11:39:44 -05:00
Hunter Bown	d79178a926	feat(#28,#30): @file Tab-completion + reasoning replay footer chip Two related TUI affordances bundled because they share ui.rs and the ui/tests.rs file. #30 — Reasoning-content replay telemetry, end-to-end: * models.rs — Usage gains reasoning_replay_tokens: Option<u32>. * client.rs — sanitize_thinking_mode_messages now returns the approximate replay-token count (~4 chars/token); the streaming pipeline overlays it onto the parsed MessageDelta usage so the server-reported and client-estimated numbers reach the engine together. * app.rs — App stores last_reasoning_replay_tokens. * ui.rs — TurnComplete handler copies the value into the App; new footer_reasoning_replay_spans renders an `rsn N.Nk` chip in the footer next to the cache hit-rate, warning-coloured when replay tokens exceed 50% of the input budget. * ui/tests.rs — covers chip-on, chip-hidden-when-zero, and the sanitizer's None-on-non-thinking-model path. #28 — Tab-complete @file mentions against the workspace: * ui.rs — adds partial_file_mention_at_cursor (with a guard against `user@example.com`-style false positives) and try_autocomplete_file_mention. Walks the workspace via the existing ignore::WalkBuilder, ranks prefix matches above substring matches, applies the unique match outright, extends to the longest common prefix when multiple match, and surfaces ambiguous candidates via the status line. Wired into the existing Tab handler after the slash-command branch. * ui/tests.rs — covers cursor-inside-mention extraction, email guard, prefix vs substring ranking, single-match application, common-prefix extension, no-match status, and the no-mention-no-op path. The mention-expansion path that ships file contents to the model is unchanged — this is purely a discovery aid for typing the path. Inline-contents and a fuzzy popup picker are queued for v0.5.2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 14:39:59 -05:00
Hunter Bown	017ac97d0d	feat(#30): debug-log reasoning_content replay size per request The thinking-mode sanitizer now sums the byte size of every replayed `reasoning_content` field in the outgoing chat-completions body and emits an `info`-level log line: Reasoning-content replay: 7 assistant message(s), ~3.2K input tokens (12,884 chars) being re-sent in this request This is visible under `RUST_LOG=deepseek_tui=info` (or higher). It's the first step toward the footer/status-line indicator described in #30 — the model's input-side reasoning replay is now observable per turn, even before it gets a dedicated UI surface. Tests cover both branches: bodies that already have reasoning_content (count is summed across all assistant turns) and bodies where the sanitizer had to inject the `(reasoning omitted)` placeholder (the placeholder bytes are included in the count since they ship over the wire). Footer integration deferred — that needs a new event from client → engine → TUI to surface the count alongside `cache N%` / `$X.XX`. Part of #30 remains open.	2026-04-25 13:28:44 -05:00
Hunter Bown	19f8d83d3b	release: v0.5.0 — fix multi-turn tool call 400 error (missing reasoning_content on assistant messages with tool_calls)	2026-04-25 12:27:53 -05:00
Hunter Bown	67b232b063	Release v0.4.9: thinking-mode reasoning_content fix + README refresh ### Fixed - DeepSeek thinking-mode tool-call rounds now always replay reasoning_content in all subsequent requests (including across new user turns), matching the documented API contract that assistant tool-call messages must retain their reasoning content forever. Previously, reasoning_content was cleared after the current user turn completed, which could cause HTTP 400 errors. - Missing reasoning_content on a tool-call assistant message now substitutes a safe placeholder ("(reasoning omitted)") instead of dropping the tool calls and their matching tool results, preventing orphaned conversation chains and API 400 rejections. - Session checkpoint now persists a Thinking-block placeholder for tool-call turns that produced no streamed reasoning text, keeping on-disk sessions structurally correct for subsequent requests. - Token estimation for compaction now counts thinking tokens across ALL tool-call rounds (not just the current user turn), aligning with the updated reasoning_content replay rule. ### Changed - Internal crate dependency pins bumped 0.4.5 → 0.4.9 to match workspace. - npm wrapper version and deepseekBinaryVersion bumped to 0.4.9. - README fully rewritten: clearer feature highlights, V4 model focus, keyboard shortcut table, improved docs index, and more engaging layout. - CHANGELOG entry for 0.4.9 with comparison URLs.	2026-04-25 12:00:08 -05:00
Hunter Bown	16f62f7abf	Fix reasoning replay and context accounting for NIM	2026-04-24 18:42:18 -05:00
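
The replay accounting described in commits 017ac97d0d and d79178a926 can be illustrated with a rough sketch: sum the size of every replayed `reasoning_content` string, divide by about 4 characters per token, and format the result the way the sample log line does. The function names `estimate_replay_tokens` and `format_thousands` are hypothetical, and treating byte length as the character count is an assumption; the actual sanitizer in the repository may differ.

```rust
/// Hypothetical helper (not the repository's function): sums the byte
/// length of every replayed `reasoning_content` string and converts it to
/// an approximate token count using the ~4 chars/token ratio from the
/// commit message. Returns `(chars, tokens)`.
pub fn estimate_replay_tokens(reasoning_fields: &[&str]) -> (usize, usize) {
    // Assumption: byte length stands in for the character count.
    let chars: usize = reasoning_fields.iter().map(|s| s.len()).sum();
    // Integer division, so the estimate rounds down.
    let tokens = chars / 4;
    (chars, tokens)
}

/// Hypothetical formatter: renders a token count as `~N.NK`,
/// matching the shape of the sample log line.
pub fn format_thousands(tokens: usize) -> String {
    format!("~{:.1}K", tokens as f64 / 1000.0)
}
```

With 12,884 characters of replayed reasoning, as in the sample log line, this yields 3,221 tokens, which formats as `~3.2K`.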

55 Commits
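
The replay gate discussed in commits 1a5ee2f67d and 323f43df60 can be sketched as follows. Only the predicate shape, `!provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model)`, comes from the commit messages. The two-variant enum, the bodies of the two helper predicates (in particular the "name starts with `deepseek`" rule), and the simplified function name are assumptions made for illustration, and the `effort=off` override is left out.

```rust
/// Hypothetical stand-in for the client's provider enum; only two
/// variants are sketched here.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ApiProvider {
    Deepseek,
    Openai,
}

/// Assumption: only the native DeepSeek provider accepts `reasoning_content`.
pub fn provider_accepts_reasoning_content(provider: ApiProvider) -> bool {
    matches!(provider, ApiProvider::Deepseek)
}

/// Assumption: a model needs replay when its name starts with "deepseek".
/// The real matching rule is not given in the commit log.
pub fn requires_reasoning_content(model: &str) -> bool {
    model.to_ascii_lowercase().starts_with("deepseek")
}

/// Replay gate: strip reasoning only when neither the provider nor the
/// model calls for it, as the commit message describes.
pub fn should_replay_reasoning_content(provider: ApiProvider, model: &str) -> bool {
    if !provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model) {
        return false;
    }
    true
}

/// Stream-side classification delegates to the same predicate so the
/// two code paths cannot drift apart.
pub fn is_reasoning_model_for_stream(provider: ApiProvider, model: &str) -> bool {
    should_replay_reasoning_content(provider, model)
}
```

Under these assumptions, a DeepSeek model on the generic `Openai` provider replays its reasoning, while `gpt-4o` on the same provider has it stripped.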