feat: DeepSeek V4 support with reasoning-effort control (0.4.0)

Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.

Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.

Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
2026-04-23 22:53:20 -05:00
parent dc8e94d705
commit b7bd02d814
53 changed files with 1695 additions and 299 deletions
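
The reasoning-effort control named in the title, together with the top-level `thinking` fix from the commit message, can be sketched roughly as follows. This is a minimal std-only illustration, not the code from this commit: `Effort`, `ThinkingFields`, `parse_effort`, `thinking_fields`, and `top_level_json` are hypothetical names. The changed docs only spell out the `off` and `max` cases and the normalization of `low` and `medium` to `high`; sending `reasoning_effort = "high"` for the `high` tier is an assumption made here.

```rust
/// Hypothetical tiers mirroring the documented `reasoning_effort` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Effort {
    Off,
    Low,
    Medium,
    High,
    Max,
}

/// Top-level request members derived from a tier (illustrative only).
#[derive(Debug, PartialEq, Eq)]
struct ThinkingFields {
    /// `Some("disabled")` means: send `"thinking": {"type": "disabled"}`.
    thinking_type: Option<&'static str>,
    /// Value for the top-level `reasoning_effort` member, if any.
    reasoning_effort: Option<&'static str>,
}

/// Parse a config string such as "off" or "max"; unknown values yield `None`.
fn parse_effort(s: &str) -> Option<Effort> {
    match s.trim().to_ascii_lowercase().as_str() {
        "off" => Some(Effort::Off),
        "low" => Some(Effort::Low),
        "medium" => Some(Effort::Medium),
        "high" => Some(Effort::High),
        "max" => Some(Effort::Max),
        _ => None,
    }
}

/// Map a tier to request fields, following the documented normalization.
fn thinking_fields(effort: Effort) -> ThinkingFields {
    match effort {
        Effort::Off => ThinkingFields {
            thinking_type: Some("disabled"),
            reasoning_effort: None,
        },
        // `low` and `medium` are normalized to `high`.
        // Assumption: `high` is sent as `reasoning_effort = "high"`.
        Effort::Low | Effort::Medium | Effort::High => ThinkingFields {
            thinking_type: None,
            reasoning_effort: Some("high"),
        },
        Effort::Max => ThinkingFields {
            thinking_type: None,
            reasoning_effort: Some("max"),
        },
    }
}

/// Render the members at the top level of the body, as siblings of `model`,
/// rather than nesting them under an SDK-only `extra_body` key.
/// Assumes `model` contains no characters that need JSON escaping.
fn top_level_json(model: &str, fields: &ThinkingFields) -> String {
    let mut parts = vec![format!("\"model\":\"{}\"", model)];
    if let Some(kind) = fields.thinking_type {
        parts.push(format!("\"thinking\":{{\"type\":\"{}\"}}", kind));
    }
    if let Some(level) = fields.reasoning_effort {
        parts.push(format!("\"reasoning_effort\":\"{}\"", level));
    }
    format!("{{{}}}", parts.join(","))
}
```

Under these assumptions, `top_level_json("deepseek-v4-pro", &thinking_fields(Effort::Off))` produces a fragment in which `thinking` sits next to `model`, which is the request shape the fix describes.
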
@@ -81,18 +81,21 @@ Current boundary note:

 ### LLM Integration

-- **`client.rs`** - HTTP client for DeepSeek's OpenAI-compatible Responses API (with chat fallback)
+- **`client.rs`** - HTTP client for DeepSeek's documented OpenAI-compatible Chat Completions API
 - **`llm_client.rs`** - Abstract LLM client trait with retry logic
 - **`models.rs`** - Data structures for API requests/responses

 #### DeepSeek API Endpoints

 DeepSeek exposes OpenAI-compatible endpoints. The CLI uses:
-- `https://api.deepseek.com/v1/responses` - preferred Responses API
-- `https://api.deepseek.com/v1/chat/completions` - fallback if Responses is unavailable
+- `https://api.deepseek.com/v1/chat/completions` - normal and streaming model turns
+- `https://api.deepseek.com/v1/models` - live model discovery and health checks

-The engine uses `handle_deepseek_turn()` to drive the agent loop against the
-Responses API (with automatic fallback if needed).
+`https://api.deepseek.com/v1` is accepted for OpenAI SDK compatibility, and
+`https://api.deepseek.com/beta` can be configured for beta-only features such as
+strict tool mode, chat prefix completion, and FIM completion. The public
+DeepSeek docs do not document a Responses API path for this workflow; the engine
+drives turns through Chat Completions.
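
One way to picture how a configured base URL could map onto the endpoints listed above is the sketch below. It is an assumption-laden illustration, not the logic in `client.rs`: `endpoint` is a hypothetical helper, and the rule that a base already ending in `/v1` or `/beta` is used as-is while a bare host gets `/v1` appended is a guess consistent with the URLs shown here.

```rust
/// Join a configured base URL with an API path such as "chat/completions".
/// Hypothetical helper; the real client may normalize URLs differently.
fn endpoint(base_url: &str, path: &str) -> String {
    // Tolerate trailing and leading slashes so the join never doubles them.
    let base = base_url.trim_end_matches('/');
    let path = path.trim_start_matches('/');

    // Assumption: a base that already names a version segment is kept as-is,
    // and a bare host defaults to the `/v1` prefix.
    if base.ends_with("/v1") || base.ends_with("/beta") {
        format!("{}/{}", base, path)
    } else {
        format!("{}/v1/{}", base, path)
    }
}
```

With this rule, both `https://api.deepseek.com` and `https://api.deepseek.com/v1` resolve to the same `/v1/chat/completions` and `/v1/models` URLs, while a `/beta` base keeps its own prefix.
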

 ### Tool System

@@ -15,6 +15,11 @@ Overrides:

 If both are set, `--config` wins. Environment variable overrides are applied after the file is loaded.

+The `deepseek` facade and `deepseek-tui` binary share the same config file for
+DeepSeek auth and model defaults. `deepseek login --api-key ...` writes the
+root `api_key` field that `deepseek-tui` reads directly, and `deepseek --model
+deepseek-chat` is forwarded to the TUI as `DEEPSEEK_MODEL`.
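
The layering described here (config file first, environment overrides applied afterwards, `--model` forwarded as `DEEPSEEK_MODEL`) might look like the following sketch. `resolve_model` and `DEFAULT_MODEL` are hypothetical names, and checking `DEEPSEEK_MODEL` before `DEEPSEEK_DEFAULT_TEXT_MODEL` is an assumption; the docs list both variables without stating which wins. The environment lookup is passed in as a closure so the function can be exercised without touching the real process environment.

```rust
/// Documented default for `default_text_model`.
const DEFAULT_MODEL: &str = "deepseek-v4-pro";

/// Pick the effective model: environment override, then config file value,
/// then the built-in default. Blank values are treated as unset.
fn resolve_model<F>(file_value: Option<&str>, env: F) -> String
where
    F: Fn(&str) -> Option<String>,
{
    // Assumption: DEEPSEEK_MODEL takes priority over DEEPSEEK_DEFAULT_TEXT_MODEL.
    for key in ["DEEPSEEK_MODEL", "DEEPSEEK_DEFAULT_TEXT_MODEL"] {
        if let Some(value) = env(key) {
            let value = value.trim();
            if !value.is_empty() {
                return value.to_string();
            }
        }
    }

    // No usable override: fall back to the file, then to the default.
    file_value
        .map(str::trim)
        .filter(|value| !value.is_empty())
        .unwrap_or(DEFAULT_MODEL)
        .to_string()
}
```

In a real process the lookup could be `|key| std::env::var(key).ok()`, so `deepseek --model deepseek-chat` would reach this function as a `DEEPSEEK_MODEL` value and override whatever the shared config file says.
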
+
 To bootstrap MCP and skills directories at their resolved paths, run `deepseek-tui setup`.
 To only scaffold MCP, run `deepseek-tui mcp init`.

@@ -30,7 +35,7 @@ You can define multiple profiles in the same file:

 ```toml
 api_key = "PERSONAL_KEY"
-default_text_model = "deepseek-reasoner"
+default_text_model = "deepseek-v4-pro"

 [profiles.work]
 api_key = "WORK_KEY"
@@ -50,6 +55,7 @@ These override config values:

 - `DEEPSEEK_API_KEY`
 - `DEEPSEEK_BASE_URL`
+- `DEEPSEEK_MODEL` or `DEEPSEEK_DEFAULT_TEXT_MODEL`
 - `DEEPSEEK_SKILLS_DIR`
 - `DEEPSEEK_MCP_CONFIG`
 - `DEEPSEEK_NOTES_PATH`
@@ -123,8 +129,9 @@ If you are upgrading from older releases:
 ### Core keys (used by the TUI/engine)

 - `api_key` (string, required): must be non-empty (or set `DEEPSEEK_API_KEY`).
-- `base_url` (string, optional): defaults to `https://api.deepseek.com` (OpenAI-compatible Responses API).
-- `default_text_model` (string, optional): defaults to `deepseek-reasoner`. Any valid DeepSeek model ID is accepted (common IDs: `deepseek-reasoner`, `deepseek-chat`). Use `/models` to discover live IDs from your configured endpoint.
+- `base_url` (string, optional): defaults to `https://api.deepseek.com` for DeepSeek's OpenAI-compatible Chat Completions API. `https://api.deepseek.com/v1` is also accepted for SDK compatibility; use `https://api.deepseek.com/beta` only for DeepSeek beta features such as strict tool mode, chat prefix completion, and FIM completion.
+- `default_text_model` (string, optional): defaults to `deepseek-v4-pro`. Current public DeepSeek IDs are `deepseek-v4-pro` and `deepseek-v4-flash`, both with 1M context windows and thinking mode enabled by default. Legacy `deepseek-chat` and `deepseek-reasoner` remain compatibility aliases for `deepseek-v4-flash`. Use `/models` or `deepseek models` to discover live IDs from your configured endpoint. `DEEPSEEK_MODEL` overrides this for a single process.
+- `reasoning_effort` (string, optional): `off`, `low`, `medium`, `high`, or `max`; defaults to the configured UI tier. `off` sends `thinking = {"type": "disabled"}`. `low` and `medium` are normalized to `high`; `max` sends `reasoning_effort = "max"`.
 - `allow_shell` (bool, optional): defaults to `true` (sandboxed).
 - `approval_policy` (string, optional): `on-request`, `untrusted`, or `never`. Runtime `approval_mode` editing in `/config` also accepts `on-request` and `untrusted` aliases.
 - `sandbox_mode` (string, optional): `read-only`, `workspace-write`, `danger-full-access`, `external-sandbox`.
@@ -73,6 +73,7 @@ See `MCP.md`.
 Run `deepseek --help` for the canonical list. Common flags:

 - `-p, --prompt <TEXT>`: one-shot prompt mode (prints and exits)
+- `--model <MODEL>`: when using the `deepseek` facade, forward a DeepSeek model override to the TUI
 - `--workspace <DIR>`: workspace root for file tools
 - `--yolo`: start in YOLO mode
 - `-r, --resume <ID|PREFIX|latest>`: resume a saved session