feat: DeepSeek V4 support with reasoning-effort control (0.4.0)

Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.

Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.
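
The `reasoning_content` replay fix above can be illustrated with a minimal sketch. The `ChatMessage` type, its `has_tool_calls` flag, and `build_replay` are hypothetical names made up for this example, not the types in `models.rs`. Dropping reasoning from non-tool-call turns is an assumption of the sketch, not something this commit states.

```rust
/// Hypothetical, simplified chat message; NOT the real types in `models.rs`.
#[derive(Clone, Debug, PartialEq)]
struct ChatMessage {
    role: String,
    content: String,
    /// Reasoning text returned by the model in thinking mode, if any.
    reasoning_content: Option<String>,
    /// True when this assistant turn issued tool calls.
    has_tool_calls: bool,
}

/// Build the message list to replay on the next request.
///
/// Assumption: reasoning is kept on every prior assistant tool-call turn
/// (not only the most recent one) and dropped from all other messages.
fn build_replay(history: &[ChatMessage]) -> Vec<ChatMessage> {
    history
        .iter()
        .map(|m| {
            let keep = m.role == "assistant" && m.has_tool_calls;
            ChatMessage {
                reasoning_content: if keep { m.reasoning_content.clone() } else { None },
                ..m.clone()
            }
        })
        .collect()
}
```

The point of the sketch is that the filter applies to the whole history, so a request built after several tool-call rounds still carries the reasoning for each of them.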

Verified with the full local Rust gate, a live DeepSeek V4 smoke test, an npm wrapper temp-install smoke test, and green PR CI across Linux, macOS, and Windows.
Hunter Bown
2026-04-23 22:53:20 -05:00
parent dc8e94d705
commit b7bd02d814
53 changed files with 1695 additions and 299 deletions
+8 -5
@@ -81,18 +81,21 @@ Current boundary note:
 ### LLM Integration
-- **`client.rs`** - HTTP client for DeepSeek's OpenAI-compatible Responses API (with chat fallback)
+- **`client.rs`** - HTTP client for DeepSeek's documented OpenAI-compatible Chat Completions API
 - **`llm_client.rs`** - Abstract LLM client trait with retry logic
 - **`models.rs`** - Data structures for API requests/responses
 #### DeepSeek API Endpoints
 DeepSeek exposes OpenAI-compatible endpoints. The CLI uses:
-- `https://api.deepseek.com/v1/responses` - preferred Responses API
-- `https://api.deepseek.com/v1/chat/completions` - fallback if Responses is unavailable
+- `https://api.deepseek.com/v1/chat/completions` - normal and streaming model turns
+- `https://api.deepseek.com/v1/models` - live model discovery and health checks
-The engine uses `handle_deepseek_turn()` to drive the agent loop against the
-Responses API (with automatic fallback if needed).
+`https://api.deepseek.com/v1` is accepted for OpenAI SDK compatibility, and
+`https://api.deepseek.com/beta` can be configured for beta-only features such as
+strict tool mode, chat prefix completion, and FIM completion. The public
+DeepSeek docs do not document a Responses API path for this workflow; the engine
+drives turns through Chat Completions.
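
As a rough illustration of how a configurable base URL maps onto the endpoints listed above, here is a minimal sketch. The helper name `endpoint_url` is hypothetical and is not taken from `client.rs`; it only shows joining a base such as `https://api.deepseek.com/v1` or `https://api.deepseek.com/beta` with a path, tolerating stray slashes on either side.

```rust
/// Hypothetical helper (not from `client.rs`): join a configured base URL
/// and an endpoint path with exactly one slash between them.
fn endpoint_url(base_url: &str, path: &str) -> String {
    format!(
        "{}/{}",
        base_url.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}
```

For example, `endpoint_url("https://api.deepseek.com/v1/", "chat/completions")` yields `https://api.deepseek.com/v1/chat/completions`, and swapping the base for the `/beta` URL changes only the prefix.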
### Tool System