55 Commits

Author SHA1 Message Date
Hunter Bown 5fa24733e9 chore(rebrand): update repository links for CodeWhale 2026-05-23 14:07:36 -05:00
Hunter Bown 32ce14d6b2 test(rebrand): residual brand-string cleanup across source and assets
A small cleanup pass to catch brand mentions that the R5 sweep missed
because they hid in:

- HTTP User-Agent format strings (`Mozilla/5.0 (compatible; deepseek-tui/`
  in `client.rs` and `fetch_url.rs`).
- Multi-line error messages whose phrase boundary straddled a line break
  ("…restart\n             deepseek-tui." in `js_execution.rs`,
  `tool_catalog.rs`, `repl/runtime.rs`).
- Doc comments mentioning `deepseek-tui` as a binary (`config/src/lib.rs`,
  `core/capacity.rs`, `tui/streaming/chunking.rs`, `features.rs`).
- Skill descriptions shipped in `crates/tui/assets/skills/*/SKILL.md`.
- Test fixtures with placeholder paths / git emails
  (`tui/external_editor.rs`, `snapshot/repo.rs`).
- `task_manager.rs`'s `cargo test -p deepseek-tui --lib` example.
- `scripts/tencent-lighthouse/doctor.sh` info-line prefix.

The remaining `deepseek-tui` mentions in the codebase are intentional
(the legacy `[[bin]]` entry in `crates/tui/Cargo.toml`, the legacy
`npm/deepseek-tui/` deprecation shim package, the CNB mirror namespace,
the security email, the legacy bin's shim source file, and historical
CHANGELOG entries) and were preserved per the rebrand anti-scope.

Local gates green: `cargo check --workspace --all-targets --locked`,
`cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets
--all-features --locked -- -D warnings`, `cargo test --workspace
--all-features --locked` (3226+ pass, 0 fail).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 11:58:34 -05:00
Hunter Bown 8597afc076 feat(provider): add Wanjie Ark support 2026-05-21 00:02:02 +08:00
Hunter Bown 912da38cca chore(release): prepare v0.8.40 2026-05-18 23:29:20 +08:00
Zhongyue Lin 323f43df60 fix(client): align stream reasoning classification with replay (review #1743)
Address gemini-code-assist review on PR #1743:

- HIGH: should_replay_reasoning_content_for_provider was made model-aware
  in the previous commit, but handle_chat_completion_stream still computed
  is_reasoning_model = requires_reasoning_content(model) &&
  provider_accepts_reasoning_content(provider). With the openai provider and
  a DeepSeek model, that was false during SSE parsing, so reasoning tokens
  were stored as content (not reasoning_content) and the next request
  still 400'd -- the fix was incomplete. Extract is_reasoning_model_for_stream()
  and route the stream call site through it; add an equivalence test
  locking it to the replay predicate so the two paths can't drift.
- MEDIUM: rename generic_openai_provider_drops_deepseek_reasoning_content
  -> generic_openai_provider_drops_reasoning_content_for_non_deepseek_models
  (now uses gpt-4o; old name was misleading).

Non-DeepSeek models on any provider are unaffected (#1542 not regressed).

Refs #1739, #1694, #1542.
2026-05-18 23:06:52 +08:00
Zhongyue Lin 1a5ee2f67d fix(client): replay reasoning_content for DeepSeek models on openai provider (#1739)
should_replay_reasoning_content_for_provider() returned false whenever
provider_accepts_reasoning_content(provider) was false (as it is for
ApiProvider::Openai), without checking the model. This single gate feeds
both build_for_provider (include_reasoning) and
sanitize_thinking_mode_messages, so a DeepSeek reasoning model on the
generic openai provider (DeepSeek-compatible endpoint) had all
reasoning_content stripped -> the DeepSeek thinking-mode API 400s
('reasoning_content in the thinking mode must be passed back'). This is
the over-aggressive half of ac01b225 (fix #1542).

Gate the early return on the model too:
!provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model).
Known DeepSeek reasoning models replay regardless of provider; genuine
non-DeepSeek models on openai still strip (effort=off still wins). #1542
not regressed (provider_accepts_reasoning_content untouched).
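
A minimal sketch of the gate described above, for illustration only. The two-variant enum, the prefix-based `requires_reasoning_content` check, and the omission of the effort=off override are assumptions, not the actual client code:

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum ApiProvider {
    Deepseek,
    Openai,
}

// Assumption: only the generic openai provider rejects reasoning_content.
fn provider_accepts_reasoning_content(provider: ApiProvider) -> bool {
    provider != ApiProvider::Openai
}

// Assumption: a simple name-prefix test stands in for the real model check.
fn requires_reasoning_content(model: &str) -> bool {
    model.starts_with("deepseek-")
}

fn should_replay_reasoning_content_for_provider(provider: ApiProvider, model: &str) -> bool {
    // Strip only when BOTH the provider and the model have no use for the field.
    if !provider_accepts_reasoning_content(provider) && !requires_reasoning_content(model) {
        return false;
    }
    true
}
```

With this shape, `deepseek-v4-pro` on `Openai` replays while `gpt-4o` on `Openai` still strips.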

Two pre-existing client.rs tests asserted the buggy case (deepseek-v4-pro
on Openai -> dropped); retargeted to gpt-4o to preserve their #1542
intent without encoding the bug. New positive/negative coverage in
chat.rs.

Refs #1739, #1694, #1542, #1736.
2026-05-18 23:06:52 +08:00
Hunter Bown ae9e4b4b24 fix(client): omit strict OpenAI-incompatible fields 2026-05-14 15:25:40 -05:00
Hunter Bown ac01b225e7 fix(client): strip reasoning content for generic providers
Fixes #1542.
2026-05-13 00:17:14 -05:00
Hunter Bown 99c6b22e83 chore(release): v0.8.33 — sub-agent and RLM renovation with persistent sessions
- Persistent RLM sessions (rlm_open/rlm_eval/rlm_close) with bounded REPL helpers
- Fork-aware sub-agent sessions (agent_open/agent_eval/agent_close) with handle_read
- Shared handle_read storage with slice/range/count/JSONPath projections
- Slash-command routing: /rlm, /agent, /relay (/接力) for handoff prompts
- Sidebar renamed to "Work" tab, consistent across Plan/Agent/YOLO modes
- Tool papercuts: file_search excludes, grep_files strings, fetch_url JSON,
  edit_file fuzz, exec_shell merged stdout/stderr, revert_turn no-op reject
- CLI reasoning-effort honoured on non-auto exec routes (#1511 @h3c-hexin)
- Edit-file replacement boundaries clarified (#1516)
- Pandoc output validated before probing (#1523)
- Running turns steerable/repaintable (#1533, #1537)
- Tasks/Activity Detail calmer under load
- npm retry timeout hint (#1538 @reidliu41)
- Issue templates improved (#1525 @reidliu41)
- Shell: kill process group to prevent UI freeze (#828 @CrepuscularIRIS)
- TUI: ignore leaked SGR mouse reports in composer (#1421 @reidliu41)
- Footer: keep chips within available width (#1417 @Wenjunyun123)
- Session picker: scope Ctrl+R to current workspace (#1395 @LinQ)
- Removed stale competitive-analysis doc
- Prompts/docs teach only new tool names
2026-05-12 19:54:08 -05:00
Hunter Bown dcc2c448eb fix(client): vLLM uses chat_template_kwargs to toggle reasoning, not the Anthropic field
`apply_reasoning_effort`'s vLLM branch was injecting
`thinking: {type: "disabled"}` at the top of the request body to
turn off model reasoning. But vLLM speaks OpenAI's
chat-completions protocol, not Anthropic-native extension fields,
and silently ignored that directive — the model emitted a full
hidden reasoning trace into the non-OpenAI-standard `reasoning`
field (which this client does not surface), so users saw a
~13-second perceived freeze before the first content token
arrived.

The vLLM branch now emits the OpenAI extension
`chat_template_kwargs.enable_thinking` — the canonical way to
toggle Qwen3's `<think>` mode, DeepSeek-R1's reasoning trace, and
any other reasoning-capable model served via vLLM. End-to-end
measurement against vLLM hosting Qwen3.6-35B-A3B-FP8:

  - TTFT:           13039ms → 274ms
  - Total LLM call: 13s     → 5.7s
  - Output rate:    3 ch/s  → 46 ch/s

The `high` / `max` reasoning levels likewise route through
`chat_template_kwargs` so the toggle is consistent across effort
levels. No change for any non-vLLM provider (NVIDIA NIM continues
to use the NVIDIA-specific `chat_template_kwargs.thinking` key;
Anthropic-native providers keep the Anthropic-native field).
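
For illustration, a sketch of the request shape the vLLM branch is described as emitting. The helper name `vllm_request_body` is hypothetical, the body is reduced to two fields, and the model name is assumed to need no JSON escaping:

```rust
/// Builds a reduced chat-completions body that toggles reasoning through
/// `chat_template_kwargs.enable_thinking` (hypothetical helper).
fn vllm_request_body(model: &str, enable_thinking: bool) -> String {
    // Doubled braces are literal braces inside `format!`.
    format!(
        r#"{{"model":"{model}","chat_template_kwargs":{{"enable_thinking":{enable_thinking}}}}}"#
    )
}
```

A real request would also carry `messages` and the other chat-completions fields; they are left out here to keep the toggle visible.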

Resolved a 3-way merge conflict against the v0.8.32 AtlasCloud
harvest (PR #1436) so AtlasCloud stays in the no-op match arm
alongside OpenAI / Ollama while the new vLLM arm gets its own
branch. Note for future Sglang / Fireworks / Novita work: those
servers likely have the same bug but each has its own
chat_template_kwargs schema; this PR is intentionally minimal
to the verified-fix scope.

Harvested from PR #1480 by @h3c-hexin

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 01:25:16 -05:00
Hunter Bown d62facafac feat(translate): opt-in /translate command localises model replies
Two-layer design for users whose UI locale is not English:

1. **System-prompt directive (primary)**: when the user enables
   translation via `/translate`, a `## Language Output Requirement`
   block is appended to the system prompt instructing the model to
   reply in the resolved session locale (Simplified Chinese,
   Traditional Chinese, Japanese, or Brazilian Portuguese). Code
   identifiers, technical terms without an established translation,
   and code blocks the user explicitly requests in English are
   exempt. The block is gated on `PromptSessionContext.translation_
   enabled`, so it adds zero tokens for installs that don't opt in.

2. **Post-hoc heuristic (fallback)**: a lightweight detector in
   `tui::translation` compares Latin-letter count against weighted
   CJK characters (CJK chars carry ~3× the information per glyph,
   so the ratio comparison stays fair across mixed code+prose).
   When a reply still surfaces English despite the directive, the
   detector flags it and a focused per-message `client.translate()`
   call renders the localised version before display. The dedicated
   translation request runs without conversation history, tool
   calls, or streaming — the only role is translate-and-return.
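
The fallback detector in item 2 could be sketched roughly as follows. The Unicode ranges, the ASCII-only definition of "Latin letter", and the function names are assumptions; only the 3x weighting comes from the description above:

```rust
// Assumption: CJK ideographs plus Japanese kana count as "CJK characters".
fn is_cjk(c: char) -> bool {
    matches!(c as u32, 0x3040..=0x30FF | 0x4E00..=0x9FFF)
}

// Each CJK glyph is weighted as roughly three Latin letters.
const CJK_WEIGHT: usize = 3;

/// Returns true when the reply still reads as English, meaning the
/// Latin-letter count exceeds the weighted CJK count.
fn looks_english(text: &str) -> bool {
    let latin = text.chars().filter(|c| c.is_ascii_alphabetic()).count();
    let cjk = text.chars().filter(|c| is_cjk(*c)).count();
    latin > cjk * CJK_WEIGHT
}
```

The weighting keeps a localised reply that quotes a few identifiers (for example `cargo`) from being flagged as untranslated.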

Adds the `/translate` slash command, locale strings for the new UI
states, the post-hoc fallback module, the per-message
`TranslationStatus`, and threading through `core::ops`,
`core::engine`, `runtime_threads`, and the TUI app/UI surface.

Trust-boundary check: opt-in only — `translation_enabled` defaults
to false everywhere, so English-locale installs see zero behaviour
change. The system prompt addition is conditional on the runtime
flag, not the contributor's earlier always-on form. Threaded the
new `Locale::ZhHant` arm through the v0.8.32 `/change` slash
command match to keep the pattern exhaustiveness check passing.

Harvested from PR #1462 by @YaYII

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:46:50 -05:00
Hunter Bown 8f33e4bd48 feat(providers): add AtlasCloud as a first-class provider
AtlasCloud (https://atlascloud.ai) hosts the V4 family on its own
DeepSeek-compatible endpoint at `https://api.atlascloud.ai/v1`, and
several contributors had been running it through the
OpenAI-compatible passthrough with manual `base_url` / model
overrides. Selecting `provider = "atlascloud"` in
`~/.deepseek/config.toml` (or via `DEEPSEEK_PROVIDER=atlascloud`)
now wires up:

- documented `DEFAULT_ATLASCLOUD_BASE_URL` /
  `DEFAULT_ATLASCLOUD_MODEL` defaults so a fresh install needs
  only the api_key
- a `[providers.atlascloud]` config block with the same fields
  every other named provider exposes (api_key / base_url / model
  / http_headers)
- `ATLASCLOUD_API_KEY` env var path, including the secrets test
  cleanup loop so per-test env hygiene continues to work
- the provider-picker / `/provider` slash command entries so the
  provider is reachable from the runtime UI, not just config
- the env-driven `*_BASE_URL` override branch so users who pin a
  proxy can still flip it without editing config.toml

Trust-boundary pins held: AtlasCloud is opt-in (default remains
DeepSeek), no API keys are hardcoded, the api_key resolution flows
through the same `secrets` crate path every other provider uses,
and the provider-config base_url stays settable per environment.

Resolved 3-way merge conflicts in `crates/secrets/src/lib.rs` (env
cleanup loop) and `crates/tui/src/config.rs` (per-provider
base_url match arm + `provider_passes_model_through` predicate)
so the contributor's AtlasCloud branch coexists with the v0.8.x
provider expansion already on `main`. Added the missing match arm
in `validate_provider_base_url` so the non-exhaustive-pattern
check passes after the new variant lands.

Harvested from PR #1436 by @lucaszhu-hue

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 00:40:43 -05:00
Hunter Bown 6c25a18b42 chore(release): bump to v0.8.27, add CHANGELOG 2026-05-10 08:41:04 -05:00
imkingjh999 9fae156182 fix: address PR review feedback
- Use CARGO_PKG_VERSION for User-Agent instead of hardcoded version
- Restore default_model fallback for backward compatibility
2026-05-10 08:37:25 -05:00
dst1213 682e915857 fix: retry quota errors returned as HTTP 400 2026-05-10 08:15:19 -05:00
Hunter Bown 829270a8e7 fix(client): clearer insecure-base-URL rejection (#1303)
The previous one-line error told users to set
DEEPSEEK_ALLOW_INSECURE_HTTP=1 but the env var name is easy to typo
when you're staring at it in a terminal (sam43b in #1303 wrote
"DEEPSEEKALLOWINSECURE_HTTP"). Reformat the message to:

- Note that loopback hosts are auto-allowed (no env var needed)
- Show the env var with underscores explicit and prominent
- Include a one-line copy-pasteable example

No behavior change; same `validate_base_url_security` decisions.
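
A rough sketch of the kind of decision the message explains, assuming a simplified URL shape (no userinfo) and a hypothetical `check_base_url` helper rather than the real `validate_base_url_security`:

```rust
use std::net::IpAddr;

// Extracts the host from the part of the URL that follows "http://".
fn host_of(after_scheme: &str) -> &str {
    let authority = after_scheme.split('/').next().unwrap_or("");
    match authority.strip_prefix('[') {
        // Bracketed IPv6 literal such as [::1]:8000
        Some(v6) => v6.split(']').next().unwrap_or(""),
        None => authority.split(':').next().unwrap_or(""),
    }
}

fn is_loopback_host(host: &str) -> bool {
    host.eq_ignore_ascii_case("localhost")
        || host.parse::<IpAddr>().map(|ip| ip.is_loopback()).unwrap_or(false)
}

/// `allow_insecure_http` stands in for DEEPSEEK_ALLOW_INSECURE_HTTP=1.
fn check_base_url(url: &str, allow_insecure_http: bool) -> Result<(), String> {
    if url.starts_with("https://") {
        return Ok(());
    }
    let Some(rest) = url.strip_prefix("http://") else {
        return Err(format!("unsupported base URL scheme: {url}"));
    };
    if is_loopback_host(host_of(rest)) || allow_insecure_http {
        return Ok(());
    }
    Err(format!(
        "refusing insecure base URL {url}; loopback hosts are allowed automatically, \
         otherwise set DEEPSEEK_ALLOW_INSECURE_HTTP=1"
    ))
}
```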

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-10 00:27:56 -05:00
Duducoco e10e53d396 fix(client): stabilize reasoning_content replay for prompt cache (#1297)
* fix(client): stabilize reasoning_content replay for prompt cache

- stop gating assistant reasoning_content on whether a later user turn
  exists; the field now depends only on the stored message itself
- preserve historical message bytes across turns so DeepSeek's prefix
  cache stays warm on every text-reply follow-up
- add a byte-stability regression test and update the prior-non-tool
  reasoning test to assert the new contract

* style(client): rustfmt single-line let binding

- collapse a two-line `let mut has_reasoning = ...` into a single line
  so `cargo fmt --all -- --check` passes
2026-05-09 12:34:36 -05:00
Hunter Bown 54ca5718d2 feat(cache): cache-aware prompt diagnostics and wire payload optimization (#1196)
Merge of PR #1196 by wplll. Adds:

Cache-aware prompt layering:
  - PromptBuilder struct separates prompt construction from inspection
  - System prompt split into named layers with stability classification
  - Layers classified as static/history/dynamic for cache debugging

/cache inspect command:
  - SHA-256 hashes of each rendered prompt layer
  - Base static prefix hash vs full request prefix hash
  - Static prefix stability status across turns
  - First-divergence tracking from previous request

Wire payload optimization:
  - Tool result budget: large outputs compacted before API request
  - Tool result dedup: repeated outputs replaced by compact refs
  - Turn metadata dedup: repeated <turn_meta> blocks deduplicated
  - Wire-only: local session messages remain unchanged
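
One way the tool-result dedup could look, as a sketch only. The reference text format, the 0-based index, and the function name are assumptions; the input slice is left untouched to mirror the wire-only rule:

```rust
use std::collections::HashMap;

/// Returns the wire copy of the tool results, replacing any repeated
/// output with a compact reference to its first occurrence.
fn dedup_tool_results(results: &[&str]) -> Vec<String> {
    let mut first_seen: HashMap<&str, usize> = HashMap::new();
    let mut wire = Vec::with_capacity(results.len());
    for (index, output) in results.iter().copied().enumerate() {
        match first_seen.get(output).copied() {
            // Repeat: send a short reference instead of the full output.
            Some(first) => wire.push(format!("[same output as tool result #{first}]")),
            // First occurrence: remember it and send it verbatim.
            None => {
                first_seen.insert(output, index);
                wire.push(output.to_string());
            }
        }
    }
    wire
}
```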

Project context pack:
  - Deterministic workspace summary injected into stable prefix
  - Configurable via [context] project_pack = false

Cache warmup and improved footer cache display.
Thanks to wplll for the contribution.
2026-05-09 00:20:13 -05:00
Hunter Bown f183501fbd fix(client): route non-beta paths from beta base to v1 (#1174)
* test: add reproducer for /models 404 on beta base URL

* fix: route non-beta paths to /v1 when base URL ends with /beta
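
A sketch of the routing rule in the fix above. The helper name and the `is_beta_path` flag are hypothetical; which paths count as beta is not stated here, so the caller decides:

```rust
/// Joins `path` onto `base_url`, swapping a trailing `/beta` for `/v1`
/// when the path is not a beta-only endpoint.
fn endpoint_url(base_url: &str, path: &str, is_beta_path: bool) -> String {
    let base = base_url.trim_end_matches('/');
    let root = match base.strip_suffix("/beta") {
        Some(prefix) if !is_beta_path => format!("{prefix}/v1"),
        _ => base.to_string(),
    };
    format!("{root}/{}", path.trim_start_matches('/'))
}
```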

---------

Co-authored-by: Hanmiao Li <894876246@qq.com>
2026-05-08 02:59:56 -05:00
Hunter Bown f97604c3f0 fix(provider): enable OpenAI-compatible TUI runtime (#1017) 2026-05-07 05:32:15 -05:00
Hunter Bown 3e2c832933 fix(api): narrow reasoning replay policy (#1009) 2026-05-07 04:45:55 -05:00
Hunter Bown 323598e764 fix(api): harden strict tool schemas (#1005) 2026-05-07 04:12:22 -05:00
Hunter Bown c7ed05a07c feat(api): default DeepSeek to beta endpoint
Closes #941.

Refs #938, #939, #940.
2026-05-06 21:24:59 -05:00
Reid 78c415f40c feat(provider): add Ollama provider support (#921)
Source PR: #921 by @reidliu41.
Closes #908.

Local verification:
- cargo test --workspace --all-features ollama
- cargo fmt --all -- --check
- cargo build

Co-authored-by: reidliu41 <reid201711@gmail.com>
2026-05-06 20:16:46 -05:00
Hunter Bown 633092167c feat(config): support custom HTTP headers (#914)
Integrates the useful custom HTTP header support from #881 onto current main.

- support root, provider-specific, and DEEPSEEK_HTTP_HEADERS overrides
- apply validated extra headers to model API requests while preserving protected Authorization and Content-Type defaults
- document the config shape in README, config.example.toml, and docs/CONFIGURATION.md
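
The protected-header rule above might be sketched like this. The function name and the tuple-based header representation are assumptions made for the example:

```rust
// Header names that user-supplied extras must never override.
const PROTECTED_HEADERS: [&str; 2] = ["authorization", "content-type"];

/// Appends extra headers to the defaults, silently skipping any that
/// would replace a protected default (matched case-insensitively).
fn apply_extra_headers(
    defaults: &[(&str, &str)],
    extra: &[(&str, &str)],
) -> Vec<(String, String)> {
    let mut merged: Vec<(String, String)> = defaults
        .iter()
        .map(|(name, value)| (name.to_string(), value.to_string()))
        .collect();
    for (name, value) in extra {
        if PROTECTED_HEADERS.iter().any(|p| p.eq_ignore_ascii_case(name)) {
            continue;
        }
        merged.push((name.to_string(), value.to_string()));
    }
    merged
}
```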

Co-authored-by: Desheng <8596814+dst1213@users.noreply.github.com>
2026-05-06 18:13:18 -05:00
Agent007 a335ff5e4c feat(provider): add vLLM provider support (#737)
Add vLLM as a first-class OpenAI-compatible self-hosted provider with VLLM_BASE_URL, VLLM_API_KEY, and VLLM_MODEL wiring.
2026-05-05 21:22:24 -05:00
Hunter Bown ab59ef8ff2 fix(cost): count V4 reasoning tokens in usage output (#762) 2026-05-05 19:57:25 -05:00
Hunter Bown c4cbd7c19f chore(release): finalize v0.8.13 stabilization 2026-05-05 13:06:09 -05:00
Hunter Bown 6b0a01d054 feat(client): remove dead responses_api_proxy module and EXPERIMENTAL_RESPONSES_API_ENV plumbing
Deletes crates/tui/src/responses_api_proxy/ (443 LOC), client/responses.rs
(406 LOC), and removes the ResponsesApiProxy CLI command, the
EXPERIMENTAL_RESPONSES_API_ENV env var plumbing, chat_fallback_counter,
use_chat_completions, RESPONSES_RECOVERY_INTERVAL, and the
RequestPayloadMode::ResponsesApi variant. The experimental Responses API
path was never instantiated and had no documented users; removing it
simplifies the client surface for the upcoming --anthropic-wire flag.

Closes #723
2026-05-05 04:22:35 -05:00
Hunter Bown 310c975597 feat(models): preserve dated variant suffixes; remove legacy alias machinery
`normalize_model_name` now passes v-series snapshots through unchanged
(deepseek-v4-flash-20260423 stays pinned, future v5-* matches via
regex). Removes ~245 LOC of legacy alias machinery: deepseek_legacy_aliases,
the chat/reasoner/r1/v3/v3.2 fold-arm, is_current_deepseek_v4_alias,
v4 fallback branch, alias capacity test seeds, alias config test block.

The migration from V3 → V4 is over; users on legacy names send their
own request to DeepSeek and see the server's actual response (404 if
deprecated, success if still served). No more silent renaming.

Closes #717
2026-05-05 03:49:20 -05:00
Hunter Bown a9dcf2b6e6 style: cargo fmt sweep across community PRs
47 fmt drifts had accumulated from the squash-merged community PRs on
this branch (#653, #654, #655, #645, #658, #668, #659, #661, #660,
#667, #656). Pure formatting — no behavioural changes — applied via
`cargo fmt --all` to satisfy CI's `cargo fmt --all -- --check` gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 01:57:25 -05:00
wangfeng fd8c9fdb20 feat(tools): FIM edit tool + strict tool mode for V4 endpoints (closes #662) 2026-05-04 18:11:56 -07:00
wangfeng 3edcc6dacb feat(tools): FIM edit tool stub for V4 /beta endpoint (closes #662) 2026-05-04 18:05:48 -07:00
Hunter Bown bda30b0fd6 Merge main into feat/v0.8.8-tui-polish + gemini-code-assist feedback
Resolves the post-#514/#517/#518 conflicts:

- CHANGELOG.md: kept both polish-stack and Linux ARM64 entries under
  [Unreleased]; reordered so the ARM64/install-message Changed/Docs
  sections precede the Releases footer.
- config.example.toml: kept both the `instructions = [...]` example
  and the `[memory]` opt-in stanza in sequence.
- crates/tui/src/config.rs: kept both `instructions_paths()` (#454)
  and `memory_enabled()` (#489) on the Config impl.
- crates/tui/src/prompts.rs: extended
  `system_prompt_for_mode_with_context_and_skills` to take BOTH
  `instructions: Option<&[PathBuf]>` and `user_memory_block:
  Option<&str>`. Section 2.5a renders instructions; 2.5b renders the
  memory block — both above the skills block so KV prefix caching
  still wins.
- crates/tui/src/core/engine.rs: thread both args through the two
  call sites.
- crates/tui/src/prompts.rs: update the `system_prompt_for_mode_with_context`
  forwarder and the test caller to pass `None` for the new arg.
- .gitignore: ignore `.claude/*.local.md` and `*.local.json` so
  local ralph / Claude-Code notes can't leak into commits.

Folds in two valid suggestions from the gemini-code-assist review on #519:

- `client.rs`: collapse the duplicated `LlmError → label` match and the
  `human_retry_reason` body into a single
  `retry_reason_label_and_human(err) -> (&'static str, String)` helper.
- `widgets/footer.rs::retry_banner_spans`: merge the two separate
  `match &props.retry` blocks into one that returns both `(label, color)`.

Behavior is unchanged; refactor is a pure DRY win.
2026-05-03 08:29:59 -05:00
Hunter Bown 604edc9f83 feat(tls): honor SSL_CERT_FILE for corporate-CA / MITM proxies (#418)
Corporate users behind TLS-inspecting proxies (Zscaler, Netskope,
Palo Alto, in-house mitmproxy fleets) need to add the proxy's
intermediate CA to the trusted-roots set so the deepseek client
doesn't fail with `unable to get local issuer certificate`.

The reqwest builder already trusts the platform's system store
via native-tls. This adds opt-in support for the conventional
`SSL_CERT_FILE` env var so users can point at their own bundle:

* New `add_extra_root_certs(builder, path)` helper reads the
  file, tries `Certificate::from_pem_bundle` (covers single-cert
  files too), falls back to `from_der` for binary cert files.
* Wired into `build_http_client` when `SSL_CERT_FILE` is set
  and non-empty. Failures log a warning via the existing
  `logging::warn` channel and return the builder unchanged —
  the existing system trust still applies, so a malformed env
  var degrades gracefully instead of bricking the launch.
* Each successful load logs `info` with the cert count so
  operators can confirm their bundle was picked up.

Documented in `docs/CONFIGURATION.md`'s environment-variables
list alongside the existing TLS-related notes.

No new dependency — reqwest's `native-tls` feature already
exposes `Certificate::from_pem_bundle` / `from_der`.
2026-05-03 07:35:23 -05:00
Hunter Bown 8680a43298 feat(tui): visual retry/backoff countdown in footer (#499)
When the API client retries a 429 / 5xx / network failure, the TUI
previously went silent during the backoff sleep. The user saw "thinking"
or "ready" with no signal anything was wrong, until the request finally
either succeeded or failed. This PR adds a foreground retry banner so the
user sees what's happening and how long until the next attempt.

### What ships

- New `crates/tui/src/retry_status.rs` module exposing a process-wide
  `RetryState` (`Idle | Active(banner) | Failed { reason }`) with
  `start`, `succeeded`, `failed`, and `clear` helpers. The state is
  process-global because the user-facing TUI is one engine per process;
  sub-agent retries deliberately don't light up the foreground banner.
- `client::send_with_retry` now flips the state in its retry callback
  (`start(attempt+1, delay, reason)`) and on the final outcome
  (`succeeded()` on Ok, `failed(reason)` on Err with retries-exhausted,
  `clear()` on Err with attempts==1 so non-retryable errors don't pin
  the failure row).
- `human_retry_reason` translates the structured `LlmError` into a
  short label: rate-limit reasons include the `Retry-After` header
  when the upstream provided one ("rate limited (Retry-After 30s)").
- Footer's `render` checks `retry_status::snapshot()` first; when
  `Active` it renders `⟳ retry N in Ms — <reason>` in the warning
  color; when `Failed` it renders `× failed: <reason>` in the error
  color. Banner takes precedence over the toast and the regular
  status line.
- `Engine::handle_user_message` calls `retry_status::clear()` right
  after emitting `TurnStarted` so the previous turn's failure row
  doesn't bleed into a new turn.
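
As an illustration of the `human_retry_reason` bullet, a sketch with a stand-in error type. The `LlmError` variants and the non-rate-limit labels are assumptions; only the rate-limit wording comes from the description above:

```rust
use std::time::Duration;

// Stand-in for the structured client error; the variants are hypothetical.
enum LlmError {
    RateLimited { retry_after: Option<Duration> },
    Server { status: u16 },
    Network,
}

/// Maps a structured error to the short label shown in the footer.
fn human_retry_reason(err: &LlmError) -> String {
    match err {
        LlmError::RateLimited { retry_after: Some(delay) } => {
            format!("rate limited (Retry-After {}s)", delay.as_secs())
        }
        LlmError::RateLimited { retry_after: None } => "rate limited".to_string(),
        LlmError::Server { status } => format!("server error {status}"),
        LlmError::Network => "network error".to_string(),
    }
}
```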

### Tests

- 4 unit tests in `retry_status::tests` covering idle default, the
  active → succeeded round-trip, the failed-state pin, and a
  past-deadline saturation.
- 2 footer rendering tests asserting the banner / failure-row text
  appears in the rendered buffer.
- All tests touching the global retry surface serialize through
  `retry_status::test_guard()` so cargo's parallel runner can't observe
  a torn read.

### Verification

cargo fmt --all -- --check                                          ✓
cargo clippy --workspace --all-targets --all-features --locked --   -D warnings   ✓
cargo test --workspace --all-features --locked                      ✓ 1815 + supporting (was 1809 on stabilization base)
cargo test -p deepseek-tui --bin deepseek-tui --locked retry        ✓ 28 passed

Closes #499

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 03:08:53 -05:00
Hunter Bown e5f56dee82 feat(config): add ApiProvider::DeepseekCN variant for China endpoint (#361)
Add DeepseekCN as a first-class provider variant with:
- Enum variant + parse/as_str/display_name/all methods
- DEFAULT_DEEPSEEKCN_BASE_URL (https://api.deepseeki.com)
- Auto-detection when base_url contains api.deepseeki.com
- Locale-based auto-suggest: if no provider is configured and
  system locale (LC_ALL/LC_MESSAGES/LANG) starts with 'zh-*',
  the TUI defaults to DeepseekCN at startup
- ProvidersConfig.deepseek_cn for provider-scoped credentials
- All match arms updated across config.rs, client.rs,
  provider_picker.rs, main.rs, and ui.rs
- provider_picker tests updated for the 7th provider entry
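
The locale-based auto-suggest could be sketched as below. Taking the variables as parameters, using the first non-empty value in LC_ALL, LC_MESSAGES, LANG order, and matching on a bare `zh` prefix are assumptions for the example:

```rust
/// Returns true when the effective locale looks Chinese, for example
/// `zh_CN.UTF-8` or `zh-TW`. The first non-empty variable wins.
fn locale_suggests_deepseek_cn(
    lc_all: Option<&str>,
    lc_messages: Option<&str>,
    lang: Option<&str>,
) -> bool {
    [lc_all, lc_messages, lang]
        .into_iter()
        .flatten()
        .find(|value| !value.is_empty())
        .map(|value| value.to_ascii_lowercase().starts_with("zh"))
        .unwrap_or(false)
}
```

Passing the values in explicitly, rather than reading the environment inside the function, keeps the check easy to test.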
2026-05-02 02:10:57 -05:00
Hunter Bown 6d8ab4c2b8 fix: close v0.7.2 issue cleanup 2026-04-28 23:09:19 -05:00
Hunter Bown 0f8c363012 fix: clean v0.7.2 release prep 2026-04-28 21:54:43 -05:00
Hunter Bown 35db361a87 v0.7.2: clippy clean, cost counter wiring, layered context fixup
#167: Fix all 7 clippy warnings — annotated SeamMetadata dead fields,
removed unused should_cycle calls, collapsed nested ifs, fixed
useless_format and nonminimal_bool.

#168: Wire TokenUsage mailbox drain to subagent_cost accumulator.
handle_subagent_mailbox now intercepts TokenUsage before routing to
cards, computes cost via calculate_turn_cost, and increments
app.subagent_cost in real time. Footer reflects live sub-agent spend.

Restored ArchivedContext variant to HistoryCell (corrupted by prior
apply_patch). Version bump to 0.7.2.

Refs: #166, #167, #168
2026-04-28 21:46:25 -05:00
Hunter Bown 48c30473da Merge branch 'feat/v067-providers' (#52 OpenRouter + Novita providers) 2026-04-27 22:17:27 -05:00
Hunter Bown f118db8201 feat(providers): #52 OpenRouter + Novita as first-class providers
ProviderKind gains Openrouter + Novita variants; ModelRegistry registers
deepseek/deepseek-v4-{pro,flash} against both. /provider opens a picker
modal with inline API-key prompt for un-configured providers. Env
fallbacks: OPENROUTER_API_KEY, NOVITA_API_KEY.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:58:51 -05:00
Hunter Bown 36320c5bea fix(client): #103 stream-error diagnostics + transparent retry on early decode failure
Phase 1: log full reqwest error chain + headers + bytes-received at decode site
Phase 2: HTTP/2 keepalive settings + tcp keepalive on the reqwest builder
Phase 3: engine transparently retries when stream errors before any content;
         surface error on mid-stream failure (no double-bill); stream_errors
         threshold relaxed 3 -> 5 with the new keepalive
Phase 4: unit tests for the four classes of stream failure
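
The Phase 3 rule can be sketched as a small decision function. The type, the function name, and the explicit attempt cap are assumptions for illustration:

```rust
#[derive(Debug, PartialEq)]
enum StreamErrorAction {
    /// Nothing was delivered yet, so a retry cannot double-bill or duplicate output.
    Retry,
    /// Content already reached the user, so report the error instead.
    Surface,
}

fn on_stream_error(
    content_received: bool,
    attempts_so_far: u32,
    max_attempts: u32,
) -> StreamErrorAction {
    if !content_received && attempts_so_far < max_attempts {
        StreamErrorAction::Retry
    } else {
        StreamErrorAction::Surface
    }
}
```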

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:57:13 -05:00
Hunter Bown bbdfb26f3c fix(client): TCP/HTTP2 keepalives + stream-error diagnostics (#103 Phase 1+2)
Two fixes for the persistent "Stream read error: error decoding response
body" we saw mid-turn during long V4-pro thinking sessions.

1) HTTP transport tuning (`crates/tui/src/client.rs`):
   - Drop the blanket 300s request timeout. Long V4 thinking turns
     legitimately exceed the wall-clock window; per-chunk and per-stream
     guards in `engine.rs` already bound how long we wait without progress.
   - Add `tcp_keepalive(30s)` so dead-peer detection happens at the TCP
     layer instead of waiting for the application to notice.
   - Add `http2_keep_alive_interval(15s)` + `http2_keep_alive_timeout(20s)`
     so HTTP/2 connections to DeepSeek's edge don't go silent and get
     killed by an upstream proxy mid-thinking.

2) Stream-error diagnostics (`crates/tui/src/client/chat.rs`):
   - Walk reqwest's `std::error::Error::source()` chain when a chunk read
     errors, so the underlying hyper / h2 / io error is logged. Without
     this the outer "error decoding response body" message tells us
     nothing about WHY the stream died.
   - Track elapsed wall time, bytes received so far, and ms since the
     last successful event; log them alongside the error chain. Lets us
     tell HTTP/2 RST_STREAM mid-idle from chunk-decode-failure on a
     short stream from gzip-corruption mid-burst.
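
Walking the `source()` chain needs only the standard library. A minimal sketch, with a hypothetical `Layered` error type standing in for the reqwest / hyper / io layers:

```rust
use std::error::Error;
use std::fmt;

// Demo error with an optional underlying cause (hypothetical type).
#[derive(Debug)]
struct Layered {
    message: &'static str,
    source: Option<Box<Layered>>,
}

impl fmt::Display for Layered {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(self.message)
    }
}

impl Error for Layered {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        self.source.as_deref().map(|e| e as &(dyn Error + 'static))
    }
}

/// Collects the outer error message followed by every underlying cause.
fn error_chain(err: &dyn Error) -> Vec<String> {
    let mut chain = vec![err.to_string()];
    let mut current = err.source();
    while let Some(cause) = current {
        chain.push(cause.to_string());
        current = cause.source();
    }
    chain
}
```

Logging the whole chain is what turns a bare "error decoding response body" into something that names the underlying cause.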

Phase 3 (transparent retry with `prefix` continuation) is intentionally
NOT in this PR. The retry-flag plumbing on MessageRequest + chat.rs prefix
wire format + engine.rs retry loop is a meaningful surface that deserves
its own review pass; this PR ships the diagnostic-and-resilience floor so
we can land the harder retry work knowing the underlying network state is
better.

Refs #103.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:19:42 -05:00
Hunter Bown c5a584d5c3 refactor(client): extract chat + responses into folder module (P1.1)
Split client.rs into client/mod.rs (public API + helpers), client/chat.rs
(chat-completions streaming), and client/responses.rs (responses API
helpers).  Internal helpers promoted to pub(super) for intra-module
visibility; the public DeepSeekClient API is unchanged.

While here, redesign all five system prompts around a decomposition-first
philosophy inspired by the mismanaged-geniuses hypothesis (Zhang et al.,
2026).  The model is now instructed to todo_write / update_plan before
acting, fan out sub-agents for parallel work, and keep the sidebar
populated so the user always sees what's happening.  Mode prompts updated:
- agent.txt: 'Before requesting approval, lay out work with todo_write'
- plan.txt: 'Use update_plan for strategy, todo_write for tactics'
- yolo.txt:  'Even with auto-approval, create a todo_write first'
- normal.txt: same pattern for legacy compatibility

Update CHANGELOG [Unreleased] and README modes section accordingly.
2026-04-26 11:39:44 -05:00
Hunter Bown d79178a926 feat(#28,#30): @file Tab-completion + reasoning replay footer chip
Two related TUI affordances bundled because they share ui.rs and the
ui/tests.rs file.

#30 — Reasoning-content replay telemetry, end-to-end:
 * models.rs — Usage gains reasoning_replay_tokens: Option<u32>.
 * client.rs — sanitize_thinking_mode_messages now returns the
   approximate replay-token count (~4 chars/token); the streaming
   pipeline overlays it onto the parsed MessageDelta usage so the
   server-reported and client-estimated numbers reach the engine
   together.
 * app.rs — App stores last_reasoning_replay_tokens.
 * ui.rs — TurnComplete handler copies the value into the App; new
   footer_reasoning_replay_spans renders an `rsn N.Nk` chip in the
   footer next to the cache hit-rate, warning-coloured when replay
   tokens exceed 50% of the input budget.
 * ui/tests.rs — covers chip-on, chip-hidden-when-zero, and the
   sanitizer's None-on-non-thinking-model path.
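
A sketch of the arithmetic behind the chip, assuming integer division for the ~4 chars/token estimate and hypothetical helper names:

```rust
/// Rough token estimate at about four characters per token.
fn approx_replay_tokens(replayed_chars: usize) -> usize {
    replayed_chars / 4
}

/// Footer chip text in the `rsn N.Nk` form.
fn replay_chip(tokens: usize) -> String {
    format!("rsn {:.1}k", tokens as f64 / 1000.0)
}

/// True when replayed reasoning exceeds half of the input budget.
fn replay_is_heavy(tokens: usize, input_budget: usize) -> bool {
    tokens * 2 > input_budget
}
```

For example, 12,884 replayed characters estimate to 3,221 tokens, which renders as `rsn 3.2k`.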

#28 — Tab-complete @file mentions against the workspace:
 * ui.rs — adds partial_file_mention_at_cursor (with a guard against
   `user@example.com`-style false positives) and
   try_autocomplete_file_mention. Walks the workspace via the
   existing ignore::WalkBuilder, ranks prefix matches above
   substring matches, applies the unique match outright, extends to
   the longest common prefix when multiple match, and surfaces
   ambiguous candidates via the status line. Wired into the existing
   Tab handler after the slash-command branch.
 * ui/tests.rs — covers cursor-inside-mention extraction, email
   guard, prefix vs substring ranking, single-match application,
   common-prefix extension, no-match status, and the
   no-mention-no-op path.
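
The longest-common-prefix extension used when several candidates match can be sketched as follows. The function name is hypothetical, and comparison is done per character so the result always ends on a valid UTF-8 boundary:

```rust
/// Returns the longest prefix shared by every candidate path.
fn longest_common_prefix(candidates: &[&str]) -> String {
    let Some((first, rest)) = candidates.split_first() else {
        return String::new();
    };
    let mut prefix_len = first.len();
    for other in rest {
        // Byte length of the run of characters shared by `first` and `other`.
        let shared = first
            .char_indices()
            .zip(other.chars())
            .take_while(|((_, a), b)| a == b)
            .last()
            .map(|((index, ch), _)| index + ch.len_utf8())
            .unwrap_or(0);
        prefix_len = prefix_len.min(shared);
    }
    first[..prefix_len].to_string()
}
```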

The mention-expansion path that ships file contents to the model is
unchanged — this is purely a discovery aid for typing the path.
Inline-contents and a fuzzy popup picker are queued for v0.5.2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 14:39:59 -05:00
Hunter Bown 017ac97d0d feat(#30): debug-log reasoning_content replay size per request
The thinking-mode sanitizer now sums the byte size of every replayed
`reasoning_content` field in the outgoing chat-completions body and
emits an `info`-level log line:

  Reasoning-content replay: 7 assistant message(s), ~3.2K input tokens (12,884 chars) being re-sent in this request

This is visible under `RUST_LOG=deepseek_tui=info` (or higher). It's the
first step toward the footer/status-line indicator described in #30 —
the model's input-side reasoning replay is now observable per turn,
even before it gets a dedicated UI surface.

Tests cover both branches: bodies that already have reasoning_content
(count is summed across all assistant turns) and bodies where the
sanitizer had to inject the `(reasoning omitted)` placeholder (the
placeholder bytes are included in the count since they ship over the
wire).

Footer integration deferred — that needs a new event from client → engine
→ TUI to surface the count alongside `cache N%` / `$X.XX`. Part of #30
remains open.
2026-04-25 13:28:44 -05:00
Hunter Bown 19f8d83d3b release: v0.5.0 — fix multi-turn tool call 400 error (missing reasoning_content on assistant messages with tool_calls) 2026-04-25 12:27:53 -05:00
Hunter Bown 67b232b063 Release v0.4.9: thinking-mode reasoning_content fix + README refresh
### Fixed
- DeepSeek thinking-mode tool-call rounds now always replay reasoning_content
  in all subsequent requests (including across new user turns), matching the
  documented API contract that assistant tool-call messages must retain their
  reasoning content forever. Previously, reasoning_content was cleared after
  the current user turn completed, which could cause HTTP 400 errors.
- Missing reasoning_content on a tool-call assistant message now substitutes
  a safe placeholder ("(reasoning omitted)") instead of dropping the tool
  calls and their matching tool results, preventing orphaned conversation
  chains and API 400 rejections.
- Session checkpoint now persists a Thinking-block placeholder for tool-call
  turns that produced no streamed reasoning text, keeping on-disk sessions
  structurally correct for subsequent requests.
- Token estimation for compaction now counts thinking tokens across ALL
  tool-call rounds (not just the current user turn), aligning with the
  updated reasoning_content replay rule.
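
The placeholder substitution from the second bullet might look like this in outline. The struct, the function name, and treating an empty string as missing are assumptions; the placeholder text is the one quoted above:

```rust
const REASONING_PLACEHOLDER: &str = "(reasoning omitted)";

// Reduced stand-in for an assistant message (hypothetical shape).
#[derive(Debug, PartialEq)]
struct AssistantMessage {
    has_tool_calls: bool,
    reasoning_content: Option<String>,
}

/// Fills in the placeholder so a tool-call message is never sent
/// without reasoning_content, instead of dropping the tool calls.
fn ensure_reasoning_content(message: &mut AssistantMessage) {
    let missing = message
        .reasoning_content
        .as_deref()
        .map_or(true, str::is_empty);
    if message.has_tool_calls && missing {
        message.reasoning_content = Some(REASONING_PLACEHOLDER.to_string());
    }
}
```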

### Changed
- Internal crate dependency pins bumped 0.4.5 → 0.4.9 to match workspace.
- npm wrapper version and deepseekBinaryVersion bumped to 0.4.9.
- README fully rewritten: clearer feature highlights, V4 model focus,
  keyboard shortcut table, improved docs index, and more engaging layout.
- CHANGELOG entry for 0.4.9 with comparison URLs.
2026-04-25 12:00:08 -05:00
Hunter Bown 16f62f7abf Fix reasoning replay and context accounting for NIM 2026-04-24 18:42:18 -05:00