feat: runtime API, task manager, and extensive improvements (v0.3.16)

Major Features:
- Runtime API for external integrations and turn management
- Task manager with persistence and recovery
- Shell output streaming and improved tool execution
- Error taxonomy and audit logging
- Command palette and UI enhancements

Documentation:
- Runtime API documentation
- Operations runbook
- Architecture updates

Fixes:
- Auto-compaction threshold and triggering logic
- Doctor command API key validation
- Clippy and formatting compliance
Hunter Bown
2026-02-16 10:51:39 -06:00
parent fa3905041d
commit ab2c708ca7
57 changed files with 12030 additions and 836 deletions
+52 -1
@@ -35,6 +35,15 @@ This document provides an overview of the DeepSeek CLI architecture for develope
│ │ │
▼ ▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ Runtime API + Task Management │
│ ┌─────────────────────────────┐ ┌──────────────────────────┐ │
│ │ HTTP/SSE Runtime API │ │ Persistent Task Manager │ │
│ │ (runtime_api.rs) │ │ (task_manager.rs) │ │
│ └─────────────────────────────┘ └──────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌─────────────────────────────────────────────────────────────────┐
│ LLM Layer │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ LLM Client Abstraction (llm_client.rs) │ │
@@ -125,6 +134,9 @@ Responses API (with automatic fallback if needed).
- **`prompts.rs`** - System prompt templates
- **`project_doc.rs`** - Project documentation handling
- **`session.rs`** - Session serialization
- **`runtime_api.rs`** - HTTP/SSE runtime API (`deepseek serve --http`)
- **`runtime_threads.rs`** - Durable thread/turn/item store + replayable event timeline
- **`task_manager.rs`** - Durable queue, worker pool, task timelines and artifacts
## Data Flow
@@ -139,6 +151,14 @@ Responses API (with automatic fallback if needed).
7. Results aggregated and sent back to LLM
8. Final response rendered in TUI
### Crash Recovery + Offline Queue
1. Before sending user input, the TUI writes a checkpoint snapshot to `~/.deepseek/sessions/checkpoints/latest.json`
2. If the process crashes mid-turn, the next startup restores that checkpoint automatically (unless an explicit `--resume` target is supplied)
3. While degraded/offline, new prompts are queued in-memory and mirrored to `~/.deepseek/sessions/checkpoints/offline_queue.json`
4. Queue edits (`/queue ...`) are persisted continuously so drafts and queued prompts survive restarts
5. Successful turn completion clears the active checkpoint and writes a durable session snapshot
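The checkpoint steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the CLI's actual code: the function names, the snapshot fields, and the write-then-rename strategy are hypothetical; only the `latest.json` file name and the "skip restore when `--resume` is explicit" rule come from the text.

```python
import json
import os
import tempfile
from pathlib import Path


def write_checkpoint(checkpoint_dir, snapshot):
    """Persist a snapshot as latest.json using a temp file plus rename.

    Assumption: a rename-based write is used so a crash never leaves a
    half-written checkpoint behind.
    """
    checkpoint_dir = Path(checkpoint_dir)
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    target = checkpoint_dir / "latest.json"
    fd, tmp_name = tempfile.mkstemp(dir=checkpoint_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(snapshot, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_name, target)  # replaces the old checkpoint in one step
    return target


def restore_checkpoint(checkpoint_dir, resume_target=None):
    """Return the stored snapshot, or None if there is nothing to restore."""
    if resume_target is not None:
        return None  # an explicit resume target takes precedence
    target = Path(checkpoint_dir) / "latest.json"
    if not target.exists():
        return None
    return json.loads(target.read_text(encoding="utf-8"))


def clear_checkpoint(checkpoint_dir):
    """Drop the active checkpoint after a turn completes successfully."""
    (Path(checkpoint_dir) / "latest.json").unlink(missing_ok=True)
```

A caller would invoke `write_checkpoint` before sending user input, `restore_checkpoint` at startup, and `clear_checkpoint` once the turn finishes.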
### Tool Execution
1. LLM requests tool via `tool_use` content block
@@ -149,6 +169,31 @@ Responses API (with automatic fallback if needed).
6. Post-execution hooks run
7. Result returned to agent loop
### Background Tasks
1. Client enqueues task (`/task add ...` or `POST /v1/tasks`)
2. `task_manager.rs` persists task + queue entry under `~/.deepseek/tasks`
3. Worker picks queued task (bounded pool), transitions to `running`
4. Task creates/uses a runtime thread and starts a runtime turn
5. `runtime_threads.rs` persists thread/turn/item records + monotonic event sequence
6. Timeline/tool summaries/artifact references are persisted incrementally
7. Final state (`completed|failed|canceled`) is durable and queryable via TUI/API
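The task lifecycle above can be read as a small state machine. The sketch below is an assumption-laden illustration: the state names come from the steps above, but the exact transition table (for example, whether a queued task can be canceled before it runs) and the helper names are hypothetical.

```python
# Hypothetical transition table; only the state names appear in the docs.
ALLOWED_TRANSITIONS = {
    "queued": {"running", "canceled"},
    "running": {"completed", "failed", "canceled"},
    "completed": set(),
    "failed": set(),
    "canceled": set(),
}


def transition(current, new):
    """Return the new state, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal task transition: {current} -> {new}")
    return new


def is_terminal(state):
    """Terminal states have no outgoing transitions."""
    return not ALLOWED_TRANSITIONS[state]
```

Keeping terminal states free of outgoing transitions is one way to make the final state durable: once a record says `completed`, no later write can move it back to `running`.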
### Runtime Thread/Turn Timeline
1. API/TUI creates or resumes a thread (`/v1/threads*`)
2. Turn starts on the thread (`/v1/threads/{id}/turns`)
3. Engine events are mapped to item lifecycle events (`item.started|item.delta|item.completed`)
4. Interrupt/steer operations apply to the active turn only
5. Compaction (auto/manual) is emitted as `context_compaction` item lifecycle
6. Clients replay history and resume with `/v1/threads/{id}/events?since_seq=<n>`
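Replay with `since_seq` can be pictured as a filter over the append-only event log. The sketch below assumes that `since_seq` is exclusive (events with `seq` strictly greater than the given value are returned); the docs do not state this, and the function name and sample log are hypothetical.

```python
def replay_events(event_log, since_seq=0):
    """Return events newer than since_seq, preserving log order.

    Assumption: since_seq is the last sequence number the client has
    already seen, so matching events are those with a larger seq.
    """
    return [event for event in event_log if event["seq"] > since_seq]


# Hypothetical log for one thread; seq values only ever increase.
log = [
    {"seq": 1, "event": "turn.started"},
    {"seq": 2, "event": "item.started"},
    {"seq": 3, "event": "item.delta"},
    {"seq": 4, "event": "item.completed"},
]

missed = replay_events(log, since_seq=2)
print([event["event"] for event in missed])
```

A client that disconnects after processing `seq` 2 would reconnect with `since_seq=2` and receive only the two later events.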
### Durable Schema Gates
- `session_manager.rs`, `runtime_threads.rs`, and `task_manager.rs` embed `schema_version` on persisted records.
- On load, newer schema versions are rejected with explicit errors instead of silently truncating/overwriting data.
- This allows safe forward migrations and prevents corruption when binaries and stored state are out of sync.
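A schema gate of this kind can be sketched as a version check at load time. The supported version number, the exception type, and the treatment of records without a `schema_version` field are assumptions made for illustration; the error wording mirrors the `schema vX is newer than supported vY` message mentioned in the runbook.

```python
import json

SUPPORTED_SCHEMA_VERSION = 2  # hypothetical value for this sketch


class SchemaTooNewError(Exception):
    """Raised when stored state was written by a newer binary."""


def load_record(raw):
    """Parse a persisted record, rejecting schema versions we do not know."""
    record = json.loads(raw)
    # Assumption: records without a version predate versioning (v1).
    found = record.get("schema_version", 1)
    if found > SUPPORTED_SCHEMA_VERSION:
        raise SchemaTooNewError(
            f"schema v{found} is newer than supported v{SUPPORTED_SCHEMA_VERSION}"
        )
    return record
```

Failing loudly here is the point of the gate: an older binary never rewrites a record it only partially understands.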
## Extension Points
### Adding a New Tool
@@ -182,14 +227,20 @@ command = "echo 'Running tool: $TOOL_NAME'"
## Key Design Decisions
1. **Streaming-first**: All LLM responses stream for responsiveness
-2. **Tool safety**: Non-yolo mode requires approval for destructive operations
+2. **Tool safety**: Non-yolo mode requires approval for destructive operations, including side-effectful MCP tools
3. **Extensibility**: MCP, skills, and hooks allow customization without code changes
4. **Cross-platform**: Core works on Linux/macOS/Windows, sandboxing macOS-only
5. **Minimal dependencies**: Careful dependency selection for build speed
6. **Local-first runtime API**: HTTP/SSE endpoints are intended for trusted localhost access
## Configuration Files
- `~/.deepseek/config.toml` - Main configuration
- `/etc/deepseek/managed_config.toml` - Optional managed defaults layer (Unix)
- `/etc/deepseek/requirements.toml` - Optional allowed-policy constraints (Unix)
- `~/.deepseek/mcp.json` - MCP server configuration
- `~/.deepseek/skills/` - User skills directory
- `~/.deepseek/sessions/` - Session history
- `~/.deepseek/sessions/checkpoints/` - Crash checkpoint + offline queue persistence
- `~/.deepseek/tasks/` - Background task records, queue, timelines, artifacts
- `~/.deepseek/audit.log` - Append-only audit events for credential + approval/elevation actions
+33 -2
@@ -24,7 +24,7 @@ You can define multiple profiles in the same file:
```toml
api_key = "PERSONAL_KEY"
-default_text_model = "deepseek-reasoner"
+default_text_model = "deepseek-v3.2"
[profiles.work]
api_key = "WORK_KEY"
@@ -49,7 +49,13 @@ These override config values:
- `DEEPSEEK_NOTES_PATH`
- `DEEPSEEK_MEMORY_PATH`
- `DEEPSEEK_ALLOW_SHELL` (`1`/`true` enables)
- `DEEPSEEK_APPROVAL_POLICY` (`on-request|untrusted|never`)
- `DEEPSEEK_SANDBOX_MODE` (`read-only|workspace-write|danger-full-access|external-sandbox`)
- `DEEPSEEK_MANAGED_CONFIG_PATH`
- `DEEPSEEK_REQUIREMENTS_PATH`
- `DEEPSEEK_MAX_SUBAGENTS` (clamped to `1..=20`)
- `DEEPSEEK_TASKS_DIR` (runtime task queue/artifact storage, default `~/.deepseek/tasks`)
- `DEEPSEEK_ALLOW_INSECURE_HTTP` (`1`/`true` allows non-local `http://` base URLs; default is reject)
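As an illustration of how an environment override combined with clamping might resolve, here is a small sketch for `DEEPSEEK_MAX_SUBAGENTS`. The function name and the decision to let a non-numeric value raise are assumptions; the `1..=20` range and the default of `5` are taken from the configuration docs.

```python
def resolve_max_subagents(env, config_value=None, default=5):
    """Pick env over config over default, then clamp to the range 1..=20."""
    raw = env.get("DEEPSEEK_MAX_SUBAGENTS")
    if raw is not None:
        value = int(raw)  # assumption: a non-numeric value is an error
    elif config_value is not None:
        value = config_value
    else:
        value = default
    return max(1, min(20, value))
```

In practice `env` would be `os.environ`; passing a plain dict keeps the sketch easy to test.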
## Settings File (Persistent UI Preferences)
@@ -77,8 +83,12 @@ Common settings keys:
- `api_key` (string, required): must be non-empty (or set `DEEPSEEK_API_KEY`).
- `base_url` (string, optional): defaults to `https://api.deepseek.com` (OpenAI-compatible Responses API).
-- `default_text_model` (string, optional): defaults to `deepseek-reasoner`. Other available models include `deepseek-chat`, `deepseek-r1`, `deepseek-v3`, `deepseek-v3.2`. Check the DeepSeek API for the latest model list.
+- `default_text_model` (string, optional): defaults to `deepseek-v3.2`. Any valid DeepSeek model ID is accepted; common IDs include `deepseek-chat`, `deepseek-reasoner`, `deepseek-r1`, `deepseek-v3`, and `deepseek-v3.2`. Check the DeepSeek API for the latest model list.
- `allow_shell` (bool, optional): defaults to `false`.
- `approval_policy` (string, optional): `on-request`, `untrusted`, or `never`. Runtime `/set approval_mode` also accepts `on-request` and `untrusted` aliases.
- `sandbox_mode` (string, optional): `read-only`, `workspace-write`, `danger-full-access`, `external-sandbox`.
- `managed_config_path` (string, optional): managed config file loaded after user/env config.
- `requirements_path` (string, optional): requirements file used to enforce allowed approval/sandbox values.
- `max_subagents` (int, optional): defaults to `5` and is clamped to `1..=20`.
- `skills_dir` (string, optional): defaults to `~/.deepseek/skills` (each skill is a directory containing `SKILL.md`). Workspace-local `.agents/skills` or `./skills` are preferred when present.
- `mcp_config_path` (string, optional): defaults to `~/.deepseek/mcp.json`.
@@ -123,6 +133,27 @@ You can also override features for a single run:
Use `deepseek features list` to inspect known flags and their effective state.
## Managed Configuration and Requirements
DeepSeek CLI supports a policy layering model:
1. user config + profile + env overrides
2. managed config (if present)
3. requirements validation (if present)
By default on Unix:
- managed config: `/etc/deepseek/managed_config.toml`
- requirements: `/etc/deepseek/requirements.toml`
Requirements file shape:
```toml
allowed_approval_policies = ["on-request", "untrusted", "never"]
allowed_sandbox_modes = ["read-only", "workspace-write"]
```
If configured values violate requirements, startup fails with a descriptive error.
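The requirements check can be sketched as a membership test of the effective values against the allowed lists. The function name, the error wording, and the rule that a missing list imposes no constraint are assumptions; the key names match the requirements file shape shown above.

```python
def validate_requirements(effective, requirements):
    """Return human-readable violations; an empty list means compliant."""
    checks = (
        ("approval_policy", "allowed_approval_policies"),
        ("sandbox_mode", "allowed_sandbox_modes"),
    )
    errors = []
    for key, allowed_key in checks:
        allowed = requirements.get(allowed_key)
        value = effective.get(key)
        # Assumption: an absent list or unset value means "no constraint".
        if allowed is not None and value is not None and value not in allowed:
            errors.append(f"{key}={value!r} is not permitted (allowed: {allowed})")
    return errors


requirements = {
    "allowed_approval_policies": ["on-request", "untrusted", "never"],
    "allowed_sandbox_modes": ["read-only", "workspace-write"],
}
effective = {"approval_policy": "never", "sandbox_mode": "danger-full-access"}

for problem in validate_requirements(effective, requirements):
    print(problem)
```

With the sample values, only `sandbox_mode` is reported, because `danger-full-access` is missing from `allowed_sandbox_modes`.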
## Notes On `deepseek doctor`
`deepseek doctor` now follows the same config resolution rules as the rest of the CLI.
+25 -2
@@ -2,6 +2,10 @@
DeepSeek CLI can load additional tools via MCP (Model Context Protocol). MCP servers are local processes that the CLI starts and communicates with over stdio.
Server mode note:
- `deepseek serve --mcp` runs the MCP stdio server.
- `deepseek serve --http` runs the runtime HTTP/SSE API (separate mode).
## Bootstrap MCP Config
Create a starter MCP config at your resolved MCP path:
@@ -12,6 +16,19 @@ deepseek mcp init
`deepseek setup --mcp` performs the same MCP bootstrap alongside skills setup.
Common management commands:
```bash
deepseek mcp list
deepseek mcp tools [server]
deepseek mcp add <name> --command "<cmd>" --arg "<arg>"
deepseek mcp add <name> --url "http://localhost:3000/mcp"
deepseek mcp enable <name>
deepseek mcp disable <name>
deepseek mcp remove <name>
deepseek mcp validate
```
## Config File Location
Default path:
@@ -75,10 +92,16 @@ Per-server settings:
- `env` (object, optional)
- `connect_timeout`, `execute_timeout`, `read_timeout` (seconds, optional)
- `disabled` (bool, optional)
- `enabled` (bool, optional, default `true`)
- `required` (bool, optional): startup/connect validation fails if this server cannot initialize.
- `enabled_tools` (array, optional): allowlist of tool names for this server.
- `disabled_tools` (array, optional): denylist applied after `enabled_tools`.
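The interaction between `enabled_tools` and `disabled_tools` can be sketched as an allowlist pass followed by a denylist pass. The function name and the rule that an absent `enabled_tools` allows every tool are assumptions; the ordering (denylist applied after the allowlist) follows the setting descriptions above.

```python
def visible_tools(all_tools, enabled_tools=None, disabled_tools=None):
    """Apply the allowlist first, then remove anything on the denylist."""
    names = list(all_tools)
    if enabled_tools is not None:
        # Assumption: no allowlist configured means every tool is eligible.
        allowed = set(enabled_tools)
        names = [name for name in names if name in allowed]
    denied = set(disabled_tools or [])
    return [name for name in names if name not in denied]
```

Because the denylist runs last, a tool listed in both settings ends up hidden.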
-## Safety Caveat (Important)
+## Safety Notes
-MCP tools currently execute without TUI approval prompts. Only configure MCP servers you trust, and treat MCP server configuration as equivalent to running code on your machine.
+MCP tools now flow through the same tool-approval framework as built-in tools. Read-only MCP helpers (resource/prompt listing and reads) can run without prompts in suggestive approval modes, while side-effectful MCP tools require approval.
+You should still only configure MCP servers you trust, and treat MCP server configuration as equivalent to running code on your machine.
## Troubleshooting
+3 -3
@@ -7,7 +7,7 @@ DeepSeek CLI has two related concepts:
## TUI Modes
-Press `Tab` to cycle: **Normal → Plan → Agent → YOLO → Normal**.
+Press `Tab` to cycle: **Plan → Agent → YOLO → Plan**.
- **Normal**: chat-first. Approvals for file writes, shell, and paid tools.
- **Plan**: design-first prompting. Approvals match Normal.
@@ -38,9 +38,9 @@ By default, file tools are restricted to the `--workspace` directory. Enable tru
YOLO mode enables trust mode automatically.
-## MCP Caveat (Important)
+## MCP Behavior
-MCP tools are exposed as `mcp_<server>_<tool>` and currently execute without TUI approval prompts. Only configure MCP servers you trust.
+MCP tools are exposed as `mcp_<server>_<tool>` and use the same approval flow as built-in tools. Read-only MCP helpers may auto-run in suggestive approval modes; MCP tools with possible side effects require approval.
See `MCP.md`.
+95
@@ -0,0 +1,95 @@
# DeepSeek CLI Operations Runbook
This runbook covers practical debugging and incident response for the local CLI/TUI runtime.
## Quick Triage
1. Confirm binary + config:
- `cargo run -- --version`
- `cat ~/.deepseek/config.toml` (or inspect configured profile)
2. Enable verbose logs:
- `RUST_LOG=deepseek_cli=debug cargo run`
- For HTTP retries/reconnects: `RUST_LOG=deepseek_cli::client=debug cargo run`
3. Capture current state:
- `ls ~/.deepseek/sessions`
- `ls ~/.deepseek/sessions/checkpoints`
- `ls ~/.deepseek/tasks`
## Incident: Turn Hangs or Stream Stops
Symptoms:
- TUI remains in loading state
- partial assistant output with no completion
Checks:
1. Inspect retry/health logs (`deepseek_cli::client`)
2. Verify endpoint connectivity:
- `curl -sS https://api.deepseek.com/v1/models -H "Authorization: Bearer $DEEPSEEK_API_KEY"`
3. Confirm that tool output shows no local sandbox or permission deadlock
Actions:
1. Cancel current turn (`Esc` in TUI)
2. Retry prompt; if still failing, restart TUI
3. On restart, verify crash checkpoint recovery message appears
## Incident: Network Outage / Offline Behavior
Expected behavior:
- New prompts are queued while offline mode is active
- Queue state persists to `~/.deepseek/sessions/checkpoints/offline_queue.json`
Checks:
1. Open queue in TUI: `/queue list`
2. Confirm the persisted queue file exists and that its modification time updates when the queue changes
Actions:
1. Restore connectivity
2. Re-send queued entries (via `/queue edit <n>` followed by Enter, or through the normal input flow)
3. Ensure queue file clears when queue is empty
## Incident: Crash Recovery Needed
Expected behavior:
- Checkpoint stored at `~/.deepseek/sessions/checkpoints/latest.json`
- Startup auto-restores checkpoint when no explicit `--resume` target is supplied
Actions:
1. Start TUI normally and verify "Recovered checkpoint session" status
2. If automatic recovery fails, inspect checkpoint JSON for schema mismatch
3. If schema is newer than binary supports, upgrade binary or remove stale checkpoint
## Incident: Persistent State Schema Errors
Symptoms:
- Errors like `schema vX is newer than supported vY`
Affected stores:
- sessions (`~/.deepseek/sessions/*.json`)
- runtime thread/turn/item records
- tasks (`~/.deepseek/tasks/tasks/*.json`)
Actions:
1. Confirm binary version and migration expectations
2. Back up the state directory before editing
3. Either:
- run with a newer compatible binary, or
- archive incompatible records and regenerate state
## Incident: MCP/Tool Execution Failures
Checks:
1. Validate `~/.deepseek/mcp.json` schema and server command paths
2. Confirm server process can start manually
3. Check sandbox denials in TUI history / logs
Actions:
1. Retry with required approvals (or YOLO only when appropriate)
2. Temporarily disable failing MCP server and isolate issue
3. Re-enable after verification with `/mcp` diagnostics
## Post-Incident Checklist
1. Preserve logs and relevant state files
2. Record trigger, impact, and mitigation
3. Add or update regression tests (retry/recovery/schema)
4. Update this runbook and architecture docs if behavior changed
+181
@@ -0,0 +1,181 @@
# Runtime API (HTTP/SSE)
DeepSeek CLI can expose a local runtime API for external clients:
```bash
deepseek serve --http --host 127.0.0.1 --port 7878 --workers 2
```
Defaults:
- bind: `127.0.0.1:7878`
- workers: `2` (clamped to `1..8`)
## Security Model (Local-First)
- The server is designed for trusted local use.
- There is no built-in auth, user isolation, or TLS termination.
- Do not expose this API directly to untrusted networks.
- If remote access is required, place it behind your own authenticated reverse proxy/VPN.
## Runtime Data Model
The runtime uses a durable Thread/Turn/Item lifecycle.
- `ThreadRecord`
- `id`, `created_at`, `updated_at`
- `model`, `workspace`, `mode`
- `latest_turn_id`, `latest_response_bookmark`, `archived`
- `TurnRecord`
- `id`, `thread_id`
- `status`: `queued|in_progress|completed|failed|interrupted|canceled`
- timestamps, duration, usage, error summary
- `TurnItemRecord`
- `id`, `turn_id`
- `kind`: `user_message|agent_message|tool_call|file_change|command_execution|context_compaction|status|error`
- lifecycle `status`: `queued|in_progress|completed|failed|interrupted|canceled`
The event log is append-only with global monotonic `seq` for replay/resume.
## Endpoints
### Health and Session
- `GET /health`
- `GET /v1/sessions?limit=50&search=<substring>`
### Compatibility Stream (Single Turn)
- `POST /v1/stream`
Backwards-compatible one-shot SSE wrapper. Internally creates an archived runtime thread+turn.
Request body:
```json
{
"prompt": "Summarize recent commits",
"model": "deepseek-v3.2",
"mode": "agent",
"workspace": ".",
"allow_shell": false,
"trust_mode": false,
"auto_approve": true
}
```
Typical SSE events:
- `turn.started`
- `message.delta`
- `tool.started`
- `tool.progress`
- `tool.completed`
- `approval.required`
- `sandbox.denied`
- `status`
- `error`
- `turn.completed`
- `done`
### Thread Lifecycle
- `POST /v1/threads`
- `GET /v1/threads?limit=50&include_archived=false`
- `GET /v1/threads/{id}`
- `POST /v1/threads/{id}/resume`
- `POST /v1/threads/{id}/fork`
Create thread request example:
```json
{
"model": "deepseek-v3.2",
"workspace": ".",
"mode": "agent",
"allow_shell": false,
"trust_mode": false,
"auto_approve": true,
"archived": false
}
```
### Turn Lifecycle
- `POST /v1/threads/{id}/turns`
- `POST /v1/threads/{id}/turns/{turn_id}/steer`
- `POST /v1/threads/{id}/turns/{turn_id}/interrupt`
- `POST /v1/threads/{id}/compact`
Notes:
- Only one active turn is allowed per thread (`409 Conflict` on overlap).
- `interrupt` returns quickly and marks `turn.interrupt_requested`.
- Terminal turn status becomes `interrupted` only after cleanup completes.
- Manual compaction is exposed as a turn with `context_compaction` item lifecycle events.
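The single-active-turn rule and the two-phase interrupt described in these notes can be sketched with a tiny in-memory model. Everything here is hypothetical (class names, which statuses count as active, the `finish_cleanup` hook); the sketch only shows a conflict on overlap and a status that flips to `interrupted` after cleanup rather than at request time.

```python
ACTIVE_STATUSES = {"queued", "in_progress"}  # assumption: both block new turns


class TurnConflict(Exception):
    """Would map to an HTTP 409 Conflict response."""

    status_code = 409


class Thread:
    def __init__(self):
        self.turns = {}  # turn_id -> status
        self.interrupt_requested = set()

    def start_turn(self, turn_id):
        if any(status in ACTIVE_STATUSES for status in self.turns.values()):
            raise TurnConflict("thread already has an active turn")
        self.turns[turn_id] = "in_progress"

    def request_interrupt(self, turn_id):
        # Returns immediately; the turn stays in_progress until cleanup ends.
        self.interrupt_requested.add(turn_id)

    def finish_cleanup(self, turn_id):
        if turn_id in self.interrupt_requested:
            self.turns[turn_id] = "interrupted"
            self.interrupt_requested.discard(turn_id)
```

Separating the request from the terminal status lets a client treat `interrupt` as acknowledged quickly while still waiting for the final `interrupted` state before starting another turn.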
### Replayable Events
- `GET /v1/threads/{id}/events?since_seq=<u64>`
Returns SSE replay backlog, then live events for that thread.
SSE payload shape:
```json
{
"seq": 42,
"timestamp": "2026-02-11T20:18:49.123Z",
"thread_id": "thr_1234abcd",
"turn_id": "turn_5678efgh",
"item_id": "item_90ab12cd",
"event": "item.delta",
"payload": {
"delta": "partial output",
"kind": "agent_message"
}
}
```
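A client consuming this stream needs to decode each frame and remember the highest `seq` it has seen so it can resume later. The sketch below assumes the JSON object is carried in standard SSE `data:` lines; the helper names are hypothetical and the transport (HTTP client, reconnect policy) is left out.

```python
import json


def parse_sse_frame(frame):
    """Decode one SSE frame into a dict, or None if it carries no data."""
    data_lines = []
    for line in frame.splitlines():
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE allows one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
    if not data_lines:
        return None  # comment or keep-alive frame
    return json.loads("\n".join(data_lines))


def next_since_seq(current, event):
    """Track the highest seq seen, for use as since_seq on reconnect."""
    return max(current, event["seq"])
```

After a dropped connection, the client would pass the tracked value back as `since_seq` on `/v1/threads/{id}/events`.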
Common event names:
- `thread.started`
- `thread.forked`
- `turn.started`
- `turn.lifecycle`
- `turn.steered`
- `turn.interrupt_requested`
- `turn.completed`
- `item.started`
- `item.delta`
- `item.completed`
- `item.failed`
- `item.interrupted`
- `approval.required`
- `sandbox.denied`
Compaction visibility:
- auto compaction emits `item.started`/`item.completed` with item kind `context_compaction` and `auto=true`
- manual compaction emits the same with `auto=false`
### Background Tasks
- `GET /v1/tasks`
- `POST /v1/tasks`
- `GET /v1/tasks/{id}`
- `POST /v1/tasks/{id}/cancel`
Tasks execute through the same runtime thread/turn pipeline and include:
- linked `thread_id` / `turn_id`
- runtime event count
- timeline + tool summaries + artifact references
## Persistence
Runtime store (by default, under the task data root):
- `runtime/threads/*.json`
- `runtime/turns/*.json`
- `runtime/items/*.json`
- `runtime/events/{thread_id}.jsonl`
- `runtime/state.json` (monotonic sequence)
Task store:
- default `~/.deepseek/tasks` (override with `DEEPSEEK_TASKS_DIR`)
Both runtime and task state are restart-safe.
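To make the layout above concrete, here is a rough sketch of appending to a per-thread JSONL log while keeping a global sequence counter in `state.json`. The `next_seq` field name, the function name, and the lack of locking are assumptions for illustration; a real store would need to guard against concurrent writers.

```python
import json
from pathlib import Path


def append_event(runtime_root, thread_id, event_name, payload):
    """Append one event line and advance the global sequence counter."""
    root = Path(runtime_root)
    state_path = root / "state.json"
    events_path = root / "events" / f"{thread_id}.jsonl"
    events_path.parent.mkdir(parents=True, exist_ok=True)

    # Hypothetical state shape: {"next_seq": <int>}.
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        state = {"next_seq": 1}
    seq = state["next_seq"]

    record = {"seq": seq, "thread_id": thread_id, "event": event_name, "payload": payload}
    with events_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")

    state_path.write_text(json.dumps({"next_seq": seq + 1}), encoding="utf-8")
    return seq
```

Because the counter lives outside the per-thread files, sequence numbers stay globally ordered across threads and survive a restart.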