dgf1988/codewhale

Hunter Bown b7bd02d814 feat: DeepSeek V4 support with reasoning-effort control (0.4.0)

Adds first-class DeepSeek V4 Pro and Flash support, updates the default model to deepseek-v4-pro, aligns legacy aliases with the current V4 1M context behavior, and fixes thinking-mode request handling.

Key fixes:
- Send DeepSeek's raw Chat Completions `thinking` parameter at the top level instead of SDK-only `extra_body`.
- Preserve assistant `reasoning_content` for all prior thinking-mode tool-call turns so subsequent requests satisfy DeepSeek V4's replay requirement.
- Fix npm wrapper concurrent first-run downloads by using per-process temporary download paths.
- Add `.mailmap` so historical bot-attributed commits aggregate under Hunter Bown where mailmap is honored.

Verified with the full local Rust gate, live DeepSeek V4 smoke, npm wrapper temp-install smoke, and green PR CI across Linux, macOS, and Windows.

2026-04-23 22:53:20 -05:00


DeepSeek TUI Architecture

This document provides an overview of the DeepSeek TUI architecture for developers and contributors.

Current boundary note:

crates/tui is still the live end-user runtime for the TUI, runtime API, task manager, and tool execution loop.
Other workspace crates are being split out incrementally, but they are not yet the sole runtime source of truth.

High-Level Overview

┌─────────────────────────────────────────────────────────────────┐
│                         User Interface                          │
│  ┌─────────────────┐  ┌─────────────────┐  ┌────────────────┐  │
│  │   TUI (ratatui) │  │  One-shot Mode  │  │  Config/CLI    │  │
│  └────────┬────────┘  └────────┬────────┘  └────────┬───────┘  │
└───────────┼─────────────────────┼────────────────────┼──────────┘
            │                     │                    │
            ▼                     ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                        Core Engine                              │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │                    Agent Loop (core/engine.rs)           │   │
│  │  ┌─────────┐  ┌─────────────┐  ┌──────────────────────┐ │   │
│  │  │ Session │  │ Turn Mgmt   │  │ Tool Orchestration   │ │   │
│  │  └─────────┘  └─────────────┘  └──────────────────────┘ │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
            │                     │                    │
            ▼                     ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                     Tool & Extension Layer                      │
│  ┌──────────┐  ┌──────────┐  ┌─────────┐  ┌────────────────┐   │
│  │  Tools   │  │  Skills  │  │  Hooks  │  │  MCP Servers   │   │
│  │ (shell,  │  │ (plugins)│  │ (pre/   │  │  (external)    │   │
│  │  file)   │  │          │  │  post)  │  │                │   │
│  └──────────┘  └──────────┘  └─────────┘  └────────────────┘   │
└─────────────────────────────────────────────────────────────────┘
            │                     │                    │
            ▼                     ▼                    ▼
┌─────────────────────────────────────────────────────────────────┐
│                  Runtime API + Task Management                  │
│  ┌─────────────────────────────┐  ┌──────────────────────────┐  │
│  │ HTTP/SSE Runtime API        │  │ Persistent Task Manager  │  │
│  │ (runtime_api.rs)            │  │ (task_manager.rs)        │  │
│  └─────────────────────────────┘  └──────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘
            │                     │
            ▼                     ▼
┌─────────────────────────────────────────────────────────────────┐
│                        LLM Layer                                │
│  ┌──────────────────────────────────────────────────────────┐  │
│  │              LLM Client Abstraction (llm_client.rs)       │  │
│  │  ┌─────────────────┐  ┌─────────────────────────────┐    │  │
│  │  │  DeepSeek Client │  │  Compatible Client (DeepSeek)│    │  │
│  │  │   (client.rs)   │  │       (client.rs)           │    │  │
│  │  └─────────────────┘  └─────────────────────────────┘    │  │
│  └──────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

Module Organization

Entry Point

main.rs - CLI argument parsing (clap), configuration loading, entry point routing

Core Components

core/ - Main engine components
- engine.rs - Agent loop, message processing, tool execution orchestration
- session.rs - Session state management
- turn.rs - Turn-based conversation handling
- events.rs - Event system for UI updates
- ops.rs - Core operations

Configuration

config.rs - Configuration loading, profiles, environment variables
settings.rs - Runtime settings management

LLM Integration

client.rs - HTTP client for DeepSeek's documented OpenAI-compatible Chat Completions API
llm_client.rs - Abstract LLM client trait with retry logic
models.rs - Data structures for API requests/responses
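The retry behaviour attributed to llm_client.rs can be pictured with a minimal sketch. Everything named here (`LlmClient`, `LlmError`, `complete_with_retry`, `FlakyClient`) is hypothetical and not taken from the crate; the sketch assumes only transient failures are retried and that the delay doubles between attempts.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical error type: only `Transient` failures are retried.
#[derive(Debug, PartialEq)]
pub enum LlmError {
    Transient(String),
    Fatal(String),
}

/// Hypothetical client trait; the real trait in llm_client.rs may differ.
pub trait LlmClient {
    fn complete(&mut self, prompt: &str) -> Result<String, LlmError>;
}

/// Calls `client.complete` up to `max_attempts` times, doubling the delay
/// after each transient failure. Fatal errors are returned immediately.
pub fn complete_with_retry<C: LlmClient>(
    client: &mut C,
    prompt: &str,
    max_attempts: u32,
    base_delay: Duration,
) -> Result<String, LlmError> {
    let mut delay = base_delay;
    let mut last_err = LlmError::Fatal("max_attempts must be at least 1".into());
    for attempt in 1..=max_attempts {
        match client.complete(prompt) {
            Ok(text) => return Ok(text),
            Err(LlmError::Fatal(msg)) => return Err(LlmError::Fatal(msg)),
            Err(err) => {
                last_err = err;
                // Sleep only if another attempt will follow.
                if attempt < max_attempts {
                    sleep(delay);
                    delay = delay.saturating_mul(2);
                }
            }
        }
    }
    Err(last_err)
}

/// Test double that fails a fixed number of times before succeeding.
pub struct FlakyClient {
    pub failures_left: u32,
    pub calls: u32,
}

impl LlmClient for FlakyClient {
    fn complete(&mut self, prompt: &str) -> Result<String, LlmError> {
        self.calls += 1;
        if self.failures_left > 0 {
            self.failures_left -= 1;
            Err(LlmError::Transient("503".into()))
        } else {
            Ok(format!("echo: {prompt}"))
        }
    }
}
```

Keeping the retry policy in a wrapper around the trait, rather than inside each client, is one way to let every backend share the same behaviour.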

DeepSeek API Endpoints

DeepSeek exposes OpenAI-compatible endpoints. The CLI uses:

https://api.deepseek.com/v1/chat/completions - normal and streaming model turns
https://api.deepseek.com/v1/models - live model discovery and health checks

The base URL https://api.deepseek.com/v1 is accepted for OpenAI SDK compatibility, and https://api.deepseek.com/beta can be configured for beta-only features such as strict tool mode, chat prefix completion, and FIM completion. The public DeepSeek docs do not document a Responses API path for this workflow; the engine drives turns through Chat Completions.
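As a rough illustration of how the two endpoints could be derived from a configurable base URL, the sketch below joins a base and a path with exactly one slash. The helper names (`endpoint_url`, `chat_completions_url`, `models_url`) are hypothetical, and the assumption that the beta base takes the same relative paths is mine, not something stated above.

```rust
/// Joins a configured base URL and an endpoint path with exactly one slash,
/// regardless of whether either side already carries one.
pub fn endpoint_url(base_url: &str, path: &str) -> String {
    format!(
        "{}/{}",
        base_url.trim_end_matches('/'),
        path.trim_start_matches('/')
    )
}

/// URL used for normal and streaming model turns.
pub fn chat_completions_url(base_url: &str) -> String {
    endpoint_url(base_url, "chat/completions")
}

/// URL used for live model discovery and health checks.
pub fn models_url(base_url: &str) -> String {
    endpoint_url(base_url, "models")
}
```

With `https://api.deepseek.com/v1` as the base, these helpers produce the two URLs listed above.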

Tool System

tools/ - Built-in tool implementations
- mod.rs - Tool registry and common types
- shell.rs - Shell command execution
- file.rs - File read/write operations
- todo.rs - Todo list management
- plan.rs - Planning tools
- subagent.rs - Sub-agent spawning
- spec.rs - Tool specifications

Extension Systems

mcp.rs - Model Context Protocol client for external tool servers
skills.rs - Plugin/skill loading and execution
hooks.rs - Pre/post execution hooks with conditions

User Interface

tui/ - Terminal UI components (ratatui-based)
- app.rs - Application state and message handling
- ui.rs - Event handling, streaming state, and rendering logic
- approval.rs - Tool approval dialog
- clipboard.rs - Clipboard handling
- streaming.rs - Streaming text collector
ui.rs - Legacy/simple UI utilities

Security

sandbox/ - macOS sandboxing support
- mod.rs - Sandbox type definitions
- policy.rs - Sandbox policy configuration
- seatbelt.rs - macOS Seatbelt profile generation

Utilities

utils.rs - Common utilities
logging.rs - Logging infrastructure
compaction.rs - Context compaction for long conversations
pricing.rs - Cost estimation
prompts.rs - System prompt templates
project_doc.rs - Project documentation handling
session.rs - Session serialization
runtime_api.rs - HTTP/SSE runtime API (deepseek serve --http)
runtime_threads.rs - Durable thread/turn/item store + replayable event timeline
task_manager.rs - Durable queue, worker pool, task timelines and artifacts

Data Flow

Interactive Session

1. User input received in TUI
2. Input processed by core/engine.rs
3. Message sent to LLM via llm_client.rs
4. Response streamed back, parsed in client.rs
5. Tool calls extracted and executed via tools/
6. Hooks triggered before/after tool execution
7. Results aggregated and sent back to LLM
8. Final response rendered in TUI
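The interactive flow above amounts to a loop that alternates between model calls and tool runs until the model returns a final answer. The following is a simplified, synchronous sketch of that loop; `ModelReply`, `ToolCall`, `Message`, and `run_turn` are hypothetical names, the `max_steps` cap is an assumption, and the real engine streams responses rather than returning them whole.

```rust
/// What a (hypothetical) model call can produce.
#[derive(Debug, Clone, PartialEq)]
pub enum ModelReply {
    /// The model wants tools run before it answers.
    ToolCalls(Vec<ToolCall>),
    /// The model produced its final answer.
    Final(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub name: String,
    pub input: String,
}

#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
    ToolResult { name: String, output: String },
}

/// Runs one user turn: call the model, execute any requested tools, feed the
/// results back, and repeat until a final answer arrives or `max_steps` model
/// calls have been made (in which case `None` is returned).
pub fn run_turn(
    history: &mut Vec<Message>,
    user_input: &str,
    mut model: impl FnMut(&[Message]) -> ModelReply,
    mut run_tool: impl FnMut(&ToolCall) -> String,
    max_steps: usize,
) -> Option<String> {
    history.push(Message::User(user_input.to_string()));
    for _ in 0..max_steps {
        match model(history.as_slice()) {
            ModelReply::Final(text) => {
                history.push(Message::Assistant(text.clone()));
                return Some(text);
            }
            ModelReply::ToolCalls(calls) => {
                // Tool results are appended so the next model call sees them.
                for call in &calls {
                    let output = run_tool(call);
                    history.push(Message::ToolResult {
                        name: call.name.clone(),
                        output,
                    });
                }
            }
        }
    }
    None
}
```

Passing the model and tool runner as closures keeps the loop testable without a network connection.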

Crash Recovery + Offline Queue

Before sending user input, the TUI writes a checkpoint snapshot to ~/.deepseek/sessions/checkpoints/latest.json
Startup begins a fresh session by default; prior sessions are resumed only on request via --resume/--continue (or Ctrl+R in the TUI)
While degraded/offline, new prompts are queued in-memory and mirrored to ~/.deepseek/sessions/checkpoints/offline_queue.json
Queue edits (/queue ...) are persisted continuously so drafts and queued prompts survive restarts
Successful turn completion clears the active checkpoint and writes a durable session snapshot
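The checkpoint lifecycle described above (write before sending, clear on success, find a leftover after a crash) can be sketched with plain file operations. `CheckpointStore` and its methods are hypothetical; only the `latest.json` file name comes from the text, and the snapshot is treated as an opaque JSON string.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;

/// Hypothetical store rooted at a checkpoints directory.
pub struct CheckpointStore {
    dir: PathBuf,
}

impl CheckpointStore {
    pub fn new(dir: impl Into<PathBuf>) -> Self {
        Self { dir: dir.into() }
    }

    fn latest_path(&self) -> PathBuf {
        self.dir.join("latest.json")
    }

    /// Persists the snapshot before the prompt is sent.
    pub fn write(&self, snapshot_json: &str) -> io::Result<()> {
        fs::create_dir_all(&self.dir)?;
        fs::write(self.latest_path(), snapshot_json)
    }

    /// Returns the snapshot left behind by an interrupted turn, if any.
    pub fn load(&self) -> io::Result<Option<String>> {
        match fs::read_to_string(self.latest_path()) {
            Ok(s) => Ok(Some(s)),
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(e),
        }
    }

    /// Removes the checkpoint after a turn completes successfully.
    /// A missing file is not treated as an error.
    pub fn clear(&self) -> io::Result<()> {
        match fs::remove_file(self.latest_path()) {
            Ok(()) => Ok(()),
            Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(()),
            Err(e) => Err(e),
        }
    }
}
```

A checkpoint that still exists at startup indicates the previous turn did not finish, which is the signal an explicit resume could act on.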

Tool Execution

1. LLM requests tool via tool_use content block
2. Tool registry looks up handler
3. Pre-execution hooks run
4. Approval requested if needed (non-yolo mode)
5. Tool executed (possibly sandboxed on macOS)
6. Post-execution hooks run
7. Result returned to agent loop
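The ordering of the tool execution steps can be made concrete with a small sketch. `ToolPipeline`, `ToolOutcome`, and the `log` field standing in for hooks are all hypothetical; the sketch also assumes that a denied approval skips the post-execution hooks, which the description above does not specify, and it leaves sandboxing out.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum ToolOutcome {
    Completed(String),
    Denied,
    UnknownTool,
}

struct Entry {
    handler: Box<dyn Fn(&str) -> String>,
    needs_approval: bool,
}

/// Hypothetical pipeline; `log` records hook invocations for illustration.
pub struct ToolPipeline {
    handlers: HashMap<String, Entry>,
    pub yolo: bool,
    pub log: Vec<String>,
}

impl ToolPipeline {
    pub fn new(yolo: bool) -> Self {
        Self { handlers: HashMap::new(), yolo, log: Vec::new() }
    }

    pub fn register(
        &mut self,
        name: &str,
        needs_approval: bool,
        handler: impl Fn(&str) -> String + 'static,
    ) {
        let entry = Entry { handler: Box::new(handler), needs_approval };
        self.handlers.insert(name.to_string(), entry);
    }

    pub fn execute(
        &mut self,
        name: &str,
        input: &str,
        approve: impl Fn(&str) -> bool,
    ) -> ToolOutcome {
        // Registry lookup.
        let Some(entry) = self.handlers.get(name) else {
            return ToolOutcome::UnknownTool;
        };
        // Pre-execution hooks.
        self.log.push(format!("pre:{name}"));
        // Approval is requested only outside yolo mode and only when needed.
        if entry.needs_approval && !self.yolo && !approve(name) {
            return ToolOutcome::Denied;
        }
        // Tool execution.
        let output = (entry.handler)(input);
        // Post-execution hooks.
        self.log.push(format!("post:{name}"));
        // Result goes back to the agent loop.
        ToolOutcome::Completed(output)
    }
}
```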

Background Tasks

1. Client enqueues task (/task add ... or POST /v1/tasks)
2. task_manager.rs persists task + queue entry under ~/.deepseek/tasks
3. Worker picks queued task (bounded pool), transitions to running
4. Task creates/uses a runtime thread and starts a runtime turn
5. runtime_threads.rs persists thread/turn/item records + monotonic event sequence
6. Timeline/tool summaries/artifact references are persisted incrementally
7. Final state (completed|failed|canceled) is durable and queryable via TUI/API
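The task lifecycle above reads naturally as a small state machine. The sketch below models only the states named in the text plus a queued state; the type and function names are hypothetical, and allowing a queued task to be canceled directly is an assumption. Persistence is omitted.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TaskState {
    Queued,
    Running,
    Completed,
    Failed,
    Canceled,
}

impl TaskState {
    /// Terminal states are the durable final states.
    pub fn is_terminal(self) -> bool {
        matches!(
            self,
            TaskState::Completed | TaskState::Failed | TaskState::Canceled
        )
    }

    /// Legal transitions in this sketch (Queued -> Canceled is an assumption).
    pub fn can_transition_to(self, next: TaskState) -> bool {
        use TaskState::*;
        matches!(
            (self, next),
            (Queued, Running)
                | (Queued, Canceled)
                | (Running, Completed)
                | (Running, Failed)
                | (Running, Canceled)
        )
    }
}

pub struct Task {
    pub id: u64,
    pub state: TaskState,
    /// Every state the task has been in, oldest first.
    pub timeline: Vec<TaskState>,
}

impl Task {
    pub fn new(id: u64) -> Self {
        Self { id, state: TaskState::Queued, timeline: vec![TaskState::Queued] }
    }

    pub fn transition(&mut self, next: TaskState) -> Result<(), String> {
        if !self.state.can_transition_to(next) {
            return Err(format!(
                "task {}: illegal transition {:?} -> {:?}",
                self.id, self.state, next
            ));
        }
        self.state = next;
        self.timeline.push(next);
        Ok(())
    }
}

/// Bounded pool: moves at most `free_workers` queued tasks to Running,
/// in slice order, and returns their ids.
pub fn claim_queued(tasks: &mut [Task], free_workers: usize) -> Vec<u64> {
    let mut claimed = Vec::new();
    let queued = tasks.iter_mut().filter(|t| t.state == TaskState::Queued);
    for task in queued.take(free_workers) {
        if task.transition(TaskState::Running).is_ok() {
            claimed.push(task.id);
        }
    }
    claimed
}
```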

Runtime Thread/Turn Timeline

API/TUI creates or resumes a thread (/v1/threads*)
Turn starts on the thread (/v1/threads/{id}/turns)
Engine events are mapped to item lifecycle events (item.started|item.delta|item.completed)
Interrupt/steer operations apply to the active turn only
Compaction (auto/manual) is emitted as context_compaction item lifecycle
Clients replay history and resume with /v1/threads/{id}/events?since_seq=<n>
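Replay via `since_seq` relies on every event carrying a monotonically increasing sequence number. A minimal in-memory sketch follows; `EventLog` and `Event` are hypothetical names, and treating `since_seq` as exclusive (only events with a larger sequence number are returned) is an assumption rather than documented behaviour.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub seq: u64,
    /// For example "item.started", "item.delta", or "item.completed".
    pub kind: String,
    pub item_id: String,
}

/// Append-only log; sequence numbers start at 1 and never repeat.
#[derive(Default)]
pub struct EventLog {
    events: Vec<Event>,
    next_seq: u64,
}

impl EventLog {
    /// Appends an event and returns the sequence number assigned to it.
    pub fn append(&mut self, kind: &str, item_id: &str) -> u64 {
        self.next_seq += 1;
        let seq = self.next_seq;
        self.events.push(Event {
            seq,
            kind: kind.to_string(),
            item_id: item_id.to_string(),
        });
        seq
    }

    /// Returns all events with `seq > since_seq`. Because the log is sorted
    /// by `seq`, the start of the tail can be found with a binary search.
    pub fn since(&self, since_seq: u64) -> &[Event] {
        let start = self.events.partition_point(|e| e.seq <= since_seq);
        &self.events[start..]
    }
}
```

A client that remembers the last sequence number it processed can reconnect and request only the tail, without seeing duplicates.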

Durable Schema Gates

session_manager.rs, runtime_threads.rs, and task_manager.rs embed schema_version on persisted records.
On load, newer schema versions are rejected with explicit errors instead of silently truncating/overwriting data.
This allows safe forward migrations and prevents corruption when binaries and stored state are out of sync.
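A schema gate of this kind can be sketched as a single comparison performed before any record is deserialized. The constant value, `SchemaError`, `LoadPlan`, and `check_schema` are hypothetical; the assumption that older versions are migrated rather than rejected follows from the mention of forward migrations but is not spelled out above.

```rust
use std::cmp::Ordering;

/// Hypothetical version understood by the running binary.
pub const CURRENT_SCHEMA_VERSION: u32 = 2;

#[derive(Debug, PartialEq)]
pub enum SchemaError {
    /// The record was written by a newer binary; refuse to touch it.
    TooNew { found: u32, supported: u32 },
}

#[derive(Debug, PartialEq)]
pub enum LoadPlan {
    /// The record matches the current schema.
    UseAsIs,
    /// The record is older and should be migrated forward before use.
    Migrate { from: u32 },
}

/// Decides what to do with a persisted record based on its schema_version.
pub fn check_schema(found: u32) -> Result<LoadPlan, SchemaError> {
    match found.cmp(&CURRENT_SCHEMA_VERSION) {
        Ordering::Greater => Err(SchemaError::TooNew {
            found,
            supported: CURRENT_SCHEMA_VERSION,
        }),
        Ordering::Equal => Ok(LoadPlan::UseAsIs),
        Ordering::Less => Ok(LoadPlan::Migrate { from: found }),
    }
}
```

Returning an explicit error for newer versions means an older binary never rewrites a file it cannot fully understand.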

Extension Points

Adding a New Tool

Create handler in tools/
Register in tools/registry.rs
Add tool specification (name, description, input schema)
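To give a feel for the three steps above, here is a self-contained sketch of a handler, a registry, and a specification. The `Tool` trait, `ToolSpec`, `ToolRegistry`, and the `WordCount` example tool are all hypothetical and are not the crate's actual types; the input schema is kept as a raw JSON string to avoid extra dependencies.

```rust
use std::collections::BTreeMap;

/// Hypothetical specification: name, description, and input schema.
pub struct ToolSpec {
    pub name: &'static str,
    pub description: &'static str,
    pub input_schema: &'static str,
}

/// Hypothetical handler trait.
pub trait Tool {
    fn spec(&self) -> ToolSpec;
    fn run(&self, input: &str) -> Result<String, String>;
}

/// Example handler: counts whitespace-separated words in its input.
pub struct WordCount;

impl Tool for WordCount {
    fn spec(&self) -> ToolSpec {
        ToolSpec {
            name: "word_count",
            description: "Counts whitespace-separated words in the input text",
            input_schema: r#"{"type":"object","properties":{"text":{"type":"string"}}}"#,
        }
    }

    fn run(&self, input: &str) -> Result<String, String> {
        Ok(input.split_whitespace().count().to_string())
    }
}

#[derive(Default)]
pub struct ToolRegistry {
    tools: BTreeMap<&'static str, Box<dyn Tool>>,
}

impl ToolRegistry {
    /// Registers a tool under the name from its spec; duplicates are rejected.
    pub fn register(&mut self, tool: Box<dyn Tool>) -> Result<(), String> {
        let name = tool.spec().name;
        if self.tools.contains_key(name) {
            return Err(format!("tool already registered: {name}"));
        }
        self.tools.insert(name, tool);
        Ok(())
    }

    /// Names of all registered tools in sorted order.
    pub fn names(&self) -> Vec<&'static str> {
        self.tools.keys().copied().collect()
    }

    /// Looks up a tool by name and runs it.
    pub fn run(&self, name: &str, input: &str) -> Result<String, String> {
        match self.tools.get(name) {
            Some(tool) => tool.run(input),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}
```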

Adding an MCP Server

Configure in ~/.deepseek/mcp.json
Server auto-discovered at startup
Tools exposed to LLM automatically

Creating a Skill

Create skill directory with SKILL.md
Define skill prompt and optional scripts
Place in ~/.deepseek/skills/

Adding Hooks

Configure in ~/.deepseek/config.toml:

[[hooks]]
event = "tool_call_before"
command = 'echo "Running tool: $TOOL_NAME"'

Key Design Decisions

Streaming-first: All LLM responses stream for responsiveness
Tool safety: Non-YOLO mode requires approval for destructive operations, including side-effectful MCP tools
Extensibility: MCP, skills, and hooks allow customization without code changes
Cross-platform: Core works on Linux/macOS/Windows; sandboxing is macOS-only
Minimal dependencies: Careful dependency selection for build speed
Local-first runtime API: HTTP/SSE endpoints are intended for trusted localhost access and are served by the crates/tui runtime today

Configuration Files

~/.deepseek/config.toml - Main configuration
/etc/deepseek/managed_config.toml - Optional managed defaults layer (Unix)
/etc/deepseek/requirements.toml - Optional allowed-policy constraints (Unix)
~/.deepseek/mcp.json - MCP server configuration
~/.deepseek/skills/ - User skills directory
~/.deepseek/sessions/ - Session history
~/.deepseek/sessions/checkpoints/ - Crash checkpoint + offline queue persistence
~/.deepseek/tasks/ - Background task records, queue, timelines, artifacts
~/.deepseek/audit.log - Append-only audit events for credential + approval/elevation actions
