feat: DeepSeek V4 support with reasoning-effort control (0.4.0)

Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.

Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.

Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.
Hunter Bown
2026-04-23 22:53:20 -05:00
parent dc8e94d705
commit b7bd02d814
53 changed files with 1695 additions and 299 deletions
+8 -5
@@ -81,18 +81,21 @@ Current boundary note:
### LLM Integration
-- **`client.rs`** - HTTP client for DeepSeek's OpenAI-compatible Responses API (with chat fallback)
+- **`client.rs`** - HTTP client for DeepSeek's documented OpenAI-compatible Chat Completions API
- **`llm_client.rs`** - Abstract LLM client trait with retry logic
- **`models.rs`** - Data structures for API requests/responses
#### DeepSeek API Endpoints
DeepSeek exposes OpenAI-compatible endpoints. The CLI uses:
-- `https://api.deepseek.com/v1/responses` - preferred Responses API
-- `https://api.deepseek.com/v1/chat/completions` - fallback if Responses is unavailable
+- `https://api.deepseek.com/v1/chat/completions` - normal and streaming model turns
- `https://api.deepseek.com/v1/models` - live model discovery and health checks
-The engine uses `handle_deepseek_turn()` to drive the agent loop against the
-Responses API (with automatic fallback if needed).
+`https://api.deepseek.com/v1` is accepted for OpenAI SDK compatibility, and
+`https://api.deepseek.com/beta` can be configured for beta-only features such as
+strict tool mode, chat prefix completion, and FIM completion. The public
+DeepSeek docs do not document a Responses API path for this workflow; the engine
+drives turns through Chat Completions.
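As a rough illustration of how the base URLs above could map onto the listed endpoints, here is a minimal sketch. The `endpoint` helper is hypothetical, and the rule that a bare host gets `/v1` appended while `/v1` and `/beta` bases are used as-is is an assumption made for this example, not something the diff confirms about `client.rs`.

```rust
/// Hypothetical helper (not taken from `client.rs`): join a configured base URL
/// with an API path such as `chat/completions` or `models`.
fn endpoint(base_url: &str, path: &str) -> String {
    let base = base_url.trim_end_matches('/');
    // Assumption: a base already ending in `/v1` or `/beta` is used unchanged;
    // any other base gets `/v1` appended.
    let versioned = if base.ends_with("/v1") || base.ends_with("/beta") {
        base.to_string()
    } else {
        format!("{base}/v1")
    };
    format!("{}/{}", versioned, path.trim_start_matches('/'))
}

fn main() {
    let bases = [
        "https://api.deepseek.com",
        "https://api.deepseek.com/v1",
        "https://api.deepseek.com/beta",
    ];
    for base in bases {
        println!("{}", endpoint(base, "chat/completions"));
    }
    println!("{}", endpoint("https://api.deepseek.com", "models"));
}
```

Under these assumptions, the default base and the `/v1` base resolve to the same Chat Completions URL, and only an explicitly configured `/beta` base reaches the beta path.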
### Tool System
+10 -3
@@ -15,6 +15,11 @@ Overrides:
If both are set, `--config` wins. Environment variable overrides are applied after the file is loaded.
+The `deepseek` facade and `deepseek-tui` binary share the same config file for
+DeepSeek auth and model defaults. `deepseek login --api-key ...` writes the
+root `api_key` field that `deepseek-tui` reads directly, and `deepseek --model
+deepseek-chat` is forwarded to the TUI as `DEEPSEEK_MODEL`.
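The forwarding described above could look roughly like the following sketch, which builds (but does not spawn) a child command with `DEEPSEEK_MODEL` set. The helper names `tui_command` and `forwarded_model` are hypothetical, and the facade's real process handling may differ.

```rust
use std::ffi::OsStr;
use std::process::Command;

/// Hypothetical helper: prepare the `deepseek-tui` command, forwarding an
/// optional `--model` override as the `DEEPSEEK_MODEL` environment variable.
fn tui_command(model_override: Option<&str>) -> Command {
    let mut cmd = Command::new("deepseek-tui");
    if let Some(model) = model_override {
        cmd.env("DEEPSEEK_MODEL", model);
    }
    cmd
}

/// Read back the explicitly forwarded model, if any, without running the command.
fn forwarded_model(cmd: &Command) -> Option<String> {
    cmd.get_envs()
        .find(|(key, _)| *key == OsStr::new("DEEPSEEK_MODEL"))
        .and_then(|(_, value)| value)
        .map(|value| value.to_string_lossy().into_owned())
}

fn main() {
    let cmd = tui_command(Some("deepseek-chat"));
    println!("forwarded model: {:?}", forwarded_model(&cmd));
}
```

Passing the override through the environment means the TUI needs no extra flag parsing: it sees the value through the same `DEEPSEEK_MODEL` path it already reads.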
To bootstrap MCP and skills directories at their resolved paths, run `deepseek-tui setup`.
To only scaffold MCP, run `deepseek-tui mcp init`.
@@ -30,7 +35,7 @@ You can define multiple profiles in the same file:
```toml
api_key = "PERSONAL_KEY"
-default_text_model = "deepseek-reasoner"
+default_text_model = "deepseek-v4-pro"
[profiles.work]
api_key = "WORK_KEY"
@@ -50,6 +55,7 @@ These override config values:
- `DEEPSEEK_API_KEY`
- `DEEPSEEK_BASE_URL`
+- `DEEPSEEK_MODEL` or `DEEPSEEK_DEFAULT_TEXT_MODEL`
- `DEEPSEEK_SKILLS_DIR`
- `DEEPSEEK_MCP_CONFIG`
- `DEEPSEEK_NOTES_PATH`
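To make the override order concrete, here is a minimal sketch of resolving the effective model. The `resolve_model` function is hypothetical. Checking `DEEPSEEK_MODEL` before `DEEPSEEK_DEFAULT_TEXT_MODEL` and ignoring empty values are both assumptions, since the docs state only that either variable overrides the config value.

```rust
/// Default model named in this commit.
const DEFAULT_MODEL: &str = "deepseek-v4-pro";

/// Treat unset and blank values the same way (an assumption for this sketch).
fn non_empty(value: Option<String>) -> Option<String> {
    value.filter(|s| !s.trim().is_empty())
}

/// Hypothetical resolver: environment first, then the config file, then the default.
/// `env` is injected so the lookup can be exercised without touching the real environment.
fn resolve_model(env: impl Fn(&str) -> Option<String>, config_model: Option<&str>) -> String {
    non_empty(env("DEEPSEEK_MODEL"))
        // Assumption: `DEEPSEEK_MODEL` wins when both variables are set.
        .or_else(|| non_empty(env("DEEPSEEK_DEFAULT_TEXT_MODEL")))
        .or_else(|| config_model.map(str::to_owned))
        .unwrap_or_else(|| DEFAULT_MODEL.to_owned())
}

fn main() {
    // In real use, the lookup would read the process environment.
    let model = resolve_model(|key| std::env::var(key).ok(), Some("deepseek-v4-flash"));
    println!("{model}");
}
```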
@@ -123,8 +129,9 @@ If you are upgrading from older releases:
### Core keys (used by the TUI/engine)
- `api_key` (string, required): must be non-empty (or set `DEEPSEEK_API_KEY`).
-- `base_url` (string, optional): defaults to `https://api.deepseek.com` (OpenAI-compatible Responses API).
-- `default_text_model` (string, optional): defaults to `deepseek-reasoner`. Any valid DeepSeek model ID is accepted (common IDs: `deepseek-reasoner`, `deepseek-chat`). Use `/models` to discover live IDs from your configured endpoint.
+- `base_url` (string, optional): defaults to `https://api.deepseek.com` for DeepSeek's OpenAI-compatible Chat Completions API. `https://api.deepseek.com/v1` is also accepted for SDK compatibility; use `https://api.deepseek.com/beta` only for DeepSeek beta features such as strict tool mode, chat prefix completion, and FIM completion.
+- `default_text_model` (string, optional): defaults to `deepseek-v4-pro`. Current public DeepSeek IDs are `deepseek-v4-pro` and `deepseek-v4-flash`, both with 1M context windows and thinking mode enabled by default. Legacy `deepseek-chat` and `deepseek-reasoner` remain compatibility aliases for `deepseek-v4-flash`. Use `/models` or `deepseek models` to discover live IDs from your configured endpoint. `DEEPSEEK_MODEL` overrides this for a single process.
+- `reasoning_effort` (string, optional): `off`, `low`, `medium`, `high`, or `max`; defaults to the configured UI tier. `off` sends `thinking = {"type": "disabled"}`. `low` and `medium` are normalized to `high`; `max` sends `reasoning_effort = "max"`.
- `allow_shell` (bool, optional): defaults to `true` (sandboxed).
- `approval_policy` (string, optional): `on-request`, `untrusted`, or `never`. Runtime `approval_mode` editing in `/config` also accepts `on-request` and `untrusted` aliases.
- `sandbox_mode` (string, optional): `read-only`, `workspace-write`, `danger-full-access`, `external-sandbox`.
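The `reasoning_effort` normalization described in the list above can be sketched as a small mapping. The `Thinking` enum and both functions are hypothetical names. That `high` is sent as `reasoning_effort = "high"` is an assumption (the docs spell out only the `off` and `max` payloads), and the JSON fragments are illustrative rather than the client's actual serialization.

```rust
/// What the request should carry for a given setting (hypothetical type).
#[derive(Debug, PartialEq)]
enum Thinking {
    /// Sent as the top-level field `thinking = {"type": "disabled"}`.
    Disabled,
    /// Sent as `reasoning_effort = "<value>"`.
    Effort(&'static str),
}

/// Map a configured `reasoning_effort` value to the request behaviour.
/// Returns `None` for values outside the documented set.
fn normalize_effort(setting: &str) -> Option<Thinking> {
    match setting {
        "off" => Some(Thinking::Disabled),
        // `low` and `medium` are normalized to `high`.
        "low" | "medium" | "high" => Some(Thinking::Effort("high")),
        "max" => Some(Thinking::Effort("max")),
        _ => None,
    }
}

/// Render the top-level JSON fragment for the request body (illustrative only).
fn request_fragment(thinking: &Thinking) -> String {
    match thinking {
        Thinking::Disabled => r#""thinking":{"type":"disabled"}"#.to_string(),
        Thinking::Effort(level) => format!(r#""reasoning_effort":"{level}""#),
    }
}

fn main() {
    for setting in ["off", "low", "medium", "high", "max"] {
        if let Some(thinking) = normalize_effort(setting) {
            println!("{setting} -> {}", request_fragment(&thinking));
        }
    }
}
```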
+1
@@ -73,6 +73,7 @@ See `MCP.md`.
Run `deepseek --help` for the canonical list. Common flags:
- `-p, --prompt <TEXT>`: one-shot prompt mode (prints and exits)
+- `--model <MODEL>`: when using the `deepseek` facade, forward a DeepSeek model override to the TUI
- `--workspace <DIR>`: workspace root for file tools
- `--yolo`: start in YOLO mode
- `-r, --resume <ID|PREFIX|latest>`: resume a saved session