feat: runtime API, task manager, and extensive improvements (v0.3.16)

Major Features:
- Runtime API for external integrations and turn management
- Task manager with persistence and recovery
- Shell output streaming and improved tool execution
- Error taxonomy and audit logging
- Command palette and UI enhancements

Documentation:
- Runtime API documentation
- Operations runbook
- Architecture updates

Fixes:
- Auto-compaction threshold and triggering logic
- Doctor command API key validation
- Clippy and formatting compliance
2026-02-16 10:51:39 -06:00
parent fa3905041d
commit ab2c708ca7
57 changed files with 12030 additions and 836 deletions
@@ -35,6 +35,15 @@ This document provides an overview of the DeepSeek CLI architecture for develope
            │                     │                    │
            ▼                     ▼                    ▼
 ┌─────────────────────────────────────────────────────────────────┐
+│                  Runtime API + Task Management                  │
+│  ┌─────────────────────────────┐  ┌──────────────────────────┐  │
+│  │ HTTP/SSE Runtime API        │  │ Persistent Task Manager  │  │
+│  │ (runtime_api.rs)            │  │ (task_manager.rs)        │  │
+│  └─────────────────────────────┘  └──────────────────────────┘  │
+└─────────────────────────────────────────────────────────────────┘
+            │                     │
+            ▼                     ▼
+┌─────────────────────────────────────────────────────────────────┐
 │                        LLM Layer                                │
 │  ┌──────────────────────────────────────────────────────────┐  │
 │  │              LLM Client Abstraction (llm_client.rs)       │  │
@@ -125,6 +134,9 @@ Responses API (with automatic fallback if needed).
 - **`prompts.rs`** - System prompt templates
 - **`project_doc.rs`** - Project documentation handling
 - **`session.rs`** - Session serialization
+- **`runtime_api.rs`** - HTTP/SSE runtime API (`deepseek serve --http`)
+- **`runtime_threads.rs`** - Durable thread/turn/item store + replayable event timeline
+- **`task_manager.rs`** - Durable queue, worker pool, task timelines and artifacts

 ## Data Flow

@@ -139,6 +151,14 @@ Responses API (with automatic fallback if needed).
 7. Results aggregated and sent back to LLM
 8. Final response rendered in TUI

+### Crash Recovery + Offline Queue
+
+1. Before sending user input, the TUI writes a checkpoint snapshot to `~/.deepseek/sessions/checkpoints/latest.json`
+2. If the process crashes mid-turn, the next startup restores that checkpoint automatically, unless an explicit `--resume` target is supplied
+3. While degraded/offline, new prompts are queued in-memory and mirrored to `~/.deepseek/sessions/checkpoints/offline_queue.json`
+4. Queue edits (`/queue ...`) are persisted continuously so drafts and queued prompts survive restarts
+5. Successful turn completion clears the active checkpoint and writes a durable session snapshot
+
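The checkpoint steps above can be sketched in a few lines of Python. This is only an illustration, not the CLI's implementation: the helper names are hypothetical, and writing through a temporary file followed by a rename is an assumption about how a half-written checkpoint could be avoided.

```python
import json
import tempfile
from pathlib import Path


def write_checkpoint(checkpoint_dir: Path, snapshot: dict) -> Path:
    """Persist the in-flight turn before the prompt is sent (step 1)."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    target = checkpoint_dir / "latest.json"
    tmp = checkpoint_dir / "latest.json.tmp"
    tmp.write_text(json.dumps(snapshot))
    # Assumption: rename-into-place keeps a crash from leaving partial JSON.
    tmp.replace(target)
    return target


def restore_checkpoint(checkpoint_dir: Path, explicit_resume: bool):
    """Return the saved snapshot on startup, or None if nothing applies (step 2)."""
    target = checkpoint_dir / "latest.json"
    if explicit_resume or not target.exists():
        return None
    return json.loads(target.read_text())


def clear_checkpoint(checkpoint_dir: Path) -> None:
    """Drop the active checkpoint after a successful turn (step 5)."""
    (checkpoint_dir / "latest.json").unlink(missing_ok=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        directory = Path(scratch) / "checkpoints"
        write_checkpoint(directory, {"prompt": "refactor the parser"})
        print(restore_checkpoint(directory, explicit_resume=False))
```

The demo uses a scratch directory rather than `~/.deepseek/sessions/checkpoints/` so it never touches real session state.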
 ### Tool Execution

 1. LLM requests tool via `tool_use` content block
@@ -149,6 +169,31 @@ Responses API (with automatic fallback if needed).
 6. Post-execution hooks run
 7. Result returned to agent loop

+### Background Tasks
+
+1. Client enqueues task (`/task add ...` or `POST /v1/tasks`)
+2. `task_manager.rs` persists task + queue entry under `~/.deepseek/tasks`
+3. Worker picks queued task (bounded pool), transitions to `running`
+4. Task creates/uses a runtime thread and starts a runtime turn
+5. `runtime_threads.rs` persists thread/turn/item records + monotonic event sequence
+6. Timeline/tool summaries/artifact references are persisted incrementally
+7. Final state (`completed|failed|canceled`) is durable and queryable via TUI/API
+
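The task lifecycle above behaves like a small state machine. The sketch below models it in Python under stated assumptions: the state names come from the steps above, but the transition table (for example, that a queued task can be canceled before it runs) is a guess, and `transition` and `is_terminal` are hypothetical helpers.

```python
from enum import Enum


class TaskState(Enum):
    QUEUED = "queued"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumed transition table; the source names the states but not every legal edge.
ALLOWED = {
    TaskState.QUEUED: {TaskState.RUNNING, TaskState.CANCELED},
    TaskState.RUNNING: {TaskState.COMPLETED, TaskState.FAILED, TaskState.CANCELED},
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the move is not in the table."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


def is_terminal(state: TaskState) -> bool:
    """Terminal states have no outgoing transitions."""
    return state not in ALLOWED


if __name__ == "__main__":
    state = transition(TaskState.QUEUED, TaskState.RUNNING)
    state = transition(state, TaskState.COMPLETED)
    print(state.value, is_terminal(state))
```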
+### Runtime Thread/Turn Timeline
+
+1. API/TUI creates or resumes a thread (`/v1/threads*`)
+2. Turn starts on the thread (`/v1/threads/{id}/turns`)
+3. Engine events are mapped to item lifecycle events (`item.started|item.delta|item.completed`)
+4. Interrupt/steer operations apply to the active turn only
+5. Compaction (auto/manual) is emitted as `context_compaction` item lifecycle
+6. Clients replay history and resume with `/v1/threads/{id}/events?since_seq=<n>`
+
+### Durable Schema Gates
+
+- `session_manager.rs`, `runtime_threads.rs`, and `task_manager.rs` embed `schema_version` on persisted records.
+- On load, newer schema versions are rejected with explicit errors instead of silently truncating/overwriting data.
+- This allows safe forward migrations and prevents corruption when binaries and stored state are out of sync.
+
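A schema gate of this kind can be sketched as a version check at load time. The Python below is illustrative only: `SUPPORTED_SCHEMA_VERSION`, `SchemaTooNewError`, and `load_record` are hypothetical names, and treating a missing `schema_version` as version 1 is an assumption.

```python
SUPPORTED_SCHEMA_VERSION = 2  # hypothetical value, chosen for illustration


class SchemaTooNewError(Exception):
    """Raised when a stored record was written by a newer binary."""


def load_record(record: dict, supported: int = SUPPORTED_SCHEMA_VERSION) -> dict:
    """Accept records at or below the supported version; reject newer ones."""
    # Assumption: records without the field predate versioning.
    version = record.get("schema_version", 1)
    if version > supported:
        # Fail loudly instead of truncating or overwriting fields we do not know.
        raise SchemaTooNewError(
            f"schema v{version} is newer than supported v{supported}"
        )
    return record


if __name__ == "__main__":
    print(load_record({"schema_version": 2, "id": "thr_1234abcd"}))
```

Rejecting on load, rather than on write, means an older binary never gets the chance to rewrite a record it only partially understands.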
 ## Extension Points

 ### Adding a New Tool
@@ -182,14 +227,20 @@ command = "echo 'Running tool: $TOOL_NAME'"
 ## Key Design Decisions

 1. **Streaming-first**: All LLM responses stream for responsiveness
-2. **Tool safety**: Non-yolo mode requires approval for destructive operations
+2. **Tool safety**: Non-yolo mode requires approval for destructive operations, including side-effectful MCP tools
 3. **Extensibility**: MCP, skills, and hooks allow customization without code changes
 4. **Cross-platform**: Core works on Linux/macOS/Windows, sandboxing macOS-only
 5. **Minimal dependencies**: Careful dependency selection for build speed
+6. **Local-first runtime API**: HTTP/SSE endpoints are intended for trusted localhost access

 ## Configuration Files

 - `~/.deepseek/config.toml` - Main configuration
+- `/etc/deepseek/managed_config.toml` - Optional managed defaults layer (Unix)
+- `/etc/deepseek/requirements.toml` - Optional allowed-policy constraints (Unix)
 - `~/.deepseek/mcp.json` - MCP server configuration
 - `~/.deepseek/skills/` - User skills directory
 - `~/.deepseek/sessions/` - Session history
+- `~/.deepseek/sessions/checkpoints/` - Crash checkpoint + offline queue persistence
+- `~/.deepseek/tasks/` - Background task records, queue, timelines, artifacts
+- `~/.deepseek/audit.log` - Append-only audit events for credential + approval/elevation actions
@@ -24,7 +24,7 @@ You can define multiple profiles in the same file:

 ```toml
 api_key = "PERSONAL_KEY"
-default_text_model = "deepseek-reasoner"
+default_text_model = "deepseek-v3.2"

 [profiles.work]
 api_key = "WORK_KEY"
@@ -49,7 +49,13 @@ These override config values:
 - `DEEPSEEK_NOTES_PATH`
 - `DEEPSEEK_MEMORY_PATH`
 - `DEEPSEEK_ALLOW_SHELL` (`1`/`true` enables)
+- `DEEPSEEK_APPROVAL_POLICY` (`on-request|untrusted|never`)
+- `DEEPSEEK_SANDBOX_MODE` (`read-only|workspace-write|danger-full-access|external-sandbox`)
+- `DEEPSEEK_MANAGED_CONFIG_PATH`
+- `DEEPSEEK_REQUIREMENTS_PATH`
 - `DEEPSEEK_MAX_SUBAGENTS` (clamped to `1..=20`)
+- `DEEPSEEK_TASKS_DIR` (runtime task queue/artifact storage, default `~/.deepseek/tasks`)
+- `DEEPSEEK_ALLOW_INSECURE_HTTP` (`1`/`true` allows non-local `http://` base URLs; default is reject)
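
The `DEEPSEEK_ALLOW_INSECURE_HTTP` rule can be pictured with a short Python sketch. It is not the CLI's code: the function names are hypothetical, the set of hosts treated as local is an assumption, and so is the case-insensitive parsing of `1`/`true`.

```python
from urllib.parse import urlparse

# Assumption: these hosts count as "local"; the real check may differ.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def env_flag(value) -> bool:
    """Interpret `1` or `true` as enabled; anything else as disabled."""
    return (value or "").strip().lower() in {"1", "true"}


def base_url_allowed(base_url: str, allow_insecure_http: bool = False) -> bool:
    """Allow https anywhere, http on loopback, other http only when opted in."""
    parsed = urlparse(base_url)
    if parsed.scheme == "https":
        return True
    if parsed.scheme != "http":
        return False
    if parsed.hostname in LOOPBACK_HOSTS:
        return True
    return allow_insecure_http


if __name__ == "__main__":
    print(base_url_allowed("http://example.internal:8080"))
    print(base_url_allowed("http://example.internal:8080", env_flag("true")))
```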

 ## Settings File (Persistent UI Preferences)

@@ -77,8 +83,12 @@ Common settings keys:

 - `api_key` (string, required): must be non-empty (or set `DEEPSEEK_API_KEY`).
 - `base_url` (string, optional): defaults to `https://api.deepseek.com` (OpenAI-compatible Responses API).
-- `default_text_model` (string, optional): defaults to `deepseek-reasoner`. Other available models include `deepseek-chat`, `deepseek-r1`, `deepseek-v3`, `deepseek-v3.2`. Check the DeepSeek API for the latest model list.
+- `default_text_model` (string, optional): defaults to `deepseek-v3.2`. Any valid DeepSeek model ID is accepted; common IDs include `deepseek-chat`, `deepseek-reasoner`, `deepseek-r1`, `deepseek-v3`, and `deepseek-v3.2`. Check the DeepSeek API for the latest model list.
 - `allow_shell` (bool, optional): defaults to `false`.
+- `approval_policy` (string, optional): `on-request`, `untrusted`, or `never`. Runtime `/set approval_mode` also accepts `on-request` and `untrusted` as aliases.
+- `sandbox_mode` (string, optional): `read-only`, `workspace-write`, `danger-full-access`, `external-sandbox`.
+- `managed_config_path` (string, optional): managed config file loaded after user/env config.
+- `requirements_path` (string, optional): requirements file used to enforce allowed approval/sandbox values.
 - `max_subagents` (int, optional): defaults to `5` and is clamped to `1..=20`.
 - `skills_dir` (string, optional): defaults to `~/.deepseek/skills` (each skill is a directory containing `SKILL.md`). Workspace-local `.agents/skills` or `./skills` are preferred when present.
 - `mcp_config_path` (string, optional): defaults to `~/.deepseek/mcp.json`.
@@ -123,6 +133,27 @@ You can also override features for a single run:

 Use `deepseek features list` to inspect known flags and their effective state.

+## Managed Configuration and Requirements
+
+DeepSeek CLI supports a policy layering model:
+
+1. user config + profile + env overrides
+2. managed config (if present)
+3. requirements validation (if present)
+
+By default on Unix:
+- managed config: `/etc/deepseek/managed_config.toml`
+- requirements: `/etc/deepseek/requirements.toml`
+
+Requirements file shape:
+
+```toml
+allowed_approval_policies = ["on-request", "untrusted", "never"]
+allowed_sandbox_modes = ["read-only", "workspace-write"]
+```
+
+If configured values violate requirements, startup fails with a descriptive error.
+
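The requirements check can be sketched as a lookup of each configured value in its allowed list. The Python below is a minimal illustration under assumptions: `check_requirements` is a hypothetical helper, and treating an unset value or an absent list as unconstrained is a guess rather than documented behavior.

```python
def check_requirements(config: dict, requirements: dict) -> list:
    """Return human-readable violations; an empty list means the config passes."""
    checks = [
        ("approval_policy", "allowed_approval_policies"),
        ("sandbox_mode", "allowed_sandbox_modes"),
    ]
    violations = []
    for key, allowed_key in checks:
        value = config.get(key)
        allowed = requirements.get(allowed_key)
        if value is None or allowed is None:
            continue  # assumption: unset values and absent lists are unconstrained
        if value not in allowed:
            violations.append(f"{key}={value!r} is not in {allowed_key}")
    return violations


if __name__ == "__main__":
    requirements = {
        "allowed_approval_policies": ["on-request", "untrusted", "never"],
        "allowed_sandbox_modes": ["read-only", "workspace-write"],
    }
    config = {"approval_policy": "never", "sandbox_mode": "danger-full-access"}
    for problem in check_requirements(config, requirements):
        print(problem)
```

With the requirements file shown above, a `sandbox_mode` of `danger-full-access` would be reported as a violation, which corresponds to the startup failure described in the text.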
 ## Notes On `deepseek doctor`

 `deepseek doctor` now follows the same config resolution rules as the rest of the CLI.
@@ -2,6 +2,10 @@

 DeepSeek CLI can load additional tools via MCP (Model Context Protocol). MCP servers are local processes that the CLI starts and communicates with over stdio.

+Server mode note:
+- `deepseek serve --mcp` runs the MCP stdio server.
+- `deepseek serve --http` runs the runtime HTTP/SSE API (separate mode).
+
 ## Bootstrap MCP Config

 Create a starter MCP config at your resolved MCP path:
@@ -12,6 +16,19 @@ deepseek mcp init

 `deepseek setup --mcp` performs the same MCP bootstrap alongside skills setup.

+Common management commands:
+
+```bash
+deepseek mcp list
+deepseek mcp tools [server]
+deepseek mcp add <name> --command "<cmd>" --arg "<arg>"
+deepseek mcp add <name> --url "http://localhost:3000/mcp"
+deepseek mcp enable <name>
+deepseek mcp disable <name>
+deepseek mcp remove <name>
+deepseek mcp validate
+```
+
 ## Config File Location

 Default path:
@@ -75,10 +92,16 @@ Per-server settings:
 - `env` (object, optional)
 - `connect_timeout`, `execute_timeout`, `read_timeout` (seconds, optional)
 - `disabled` (bool, optional)
+- `enabled` (bool, optional, default `true`)
+- `required` (bool, optional): startup/connect validation fails if this server cannot initialize.
+- `enabled_tools` (array, optional): allowlist of tool names for this server.
+- `disabled_tools` (array, optional): denylist applied after `enabled_tools`.
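
How `enabled_tools` and `disabled_tools` combine can be sketched as an allowlist pass followed by a denylist pass. This Python is illustrative only: `visible_tools` is a hypothetical helper, and treating a missing `enabled_tools` as "allow everything" is an assumption.

```python
def visible_tools(all_tools, enabled_tools=None, disabled_tools=None):
    """Apply the allowlist first, then remove anything on the denylist."""
    if enabled_tools is None:
        allowed = list(all_tools)  # assumption: no allowlist means no restriction
    else:
        wanted = set(enabled_tools)
        allowed = [tool for tool in all_tools if tool in wanted]
    denied = set(disabled_tools or [])
    return [tool for tool in allowed if tool not in denied]


if __name__ == "__main__":
    tools = ["read_file", "write_file", "delete_file"]
    print(visible_tools(tools, ["read_file", "write_file"], ["write_file"]))
```

Because the denylist runs last, a tool named in both lists ends up hidden.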

-## Safety Caveat (Important)
+## Safety Notes

-MCP tools currently execute without TUI approval prompts. Only configure MCP servers you trust, and treat MCP server configuration as equivalent to running code on your machine.
+MCP tools now flow through the same tool-approval framework as built-in tools. Read-only MCP helpers (resource/prompt listing and reads) can run without prompts in suggestive approval modes, while side-effectful MCP tools require approval.
+
+You should still only configure MCP servers you trust, and treat MCP server configuration as equivalent to running code on your machine.

 ## Troubleshooting

@@ -7,7 +7,7 @@ DeepSeek CLI has two related concepts:

 ## TUI Modes

-Press `Tab` to cycle: **Normal → Plan → Agent → YOLO → Normal**.
+Press `Tab` to cycle: **Plan → Agent → YOLO → Plan**.

 - **Normal**: chat-first. Approvals for file writes, shell, and paid tools.
 - **Plan**: design-first prompting. Approvals match Normal.
@@ -38,9 +38,9 @@ By default, file tools are restricted to the `--workspace` directory. Enable tru

 YOLO mode enables trust mode automatically.

-## MCP Caveat (Important)
+## MCP Behavior

-MCP tools are exposed as `mcp_<server>_<tool>` and currently execute without TUI approval prompts. Only configure MCP servers you trust.
+MCP tools are exposed as `mcp_<server>_<tool>` and use the same approval flow as built-in tools. Read-only MCP helpers may auto-run in suggestive approval modes; MCP tools with possible side effects require approval.

 See `MCP.md`.

@@ -0,0 +1,95 @@
+# DeepSeek CLI Operations Runbook
+
+This runbook covers practical debugging and incident response for the local CLI/TUI runtime.
+
+## Quick Triage
+
+1. Confirm binary + config:
+   - `cargo run -- --version`
+   - `cat ~/.deepseek/config.toml` (or inspect configured profile)
+2. Enable verbose logs:
+   - `RUST_LOG=deepseek_cli=debug cargo run`
+   - For HTTP retries/reconnects: `RUST_LOG=deepseek_cli::client=debug cargo run`
+3. Capture current state:
+   - `ls ~/.deepseek/sessions`
+   - `ls ~/.deepseek/sessions/checkpoints`
+   - `ls ~/.deepseek/tasks`
+
+## Incident: Turn Hangs or Stream Stops
+
+Symptoms:
+- TUI remains in loading state
+- partial assistant output with no completion
+
+Checks:
+1. Inspect retry/health logs (`deepseek_cli::client`)
+2. Verify endpoint connectivity:
+   - `curl -sS https://api.deepseek.com/v1/models -H "Authorization: Bearer $DEEPSEEK_API_KEY"`
+3. Confirm no local sandbox/permission deadlock in tool output
+
+Actions:
+1. Cancel current turn (`Esc` in TUI)
+2. Retry prompt; if still failing, restart TUI
+3. On restart, verify crash checkpoint recovery message appears
+
+## Incident: Network Outage / Offline Behavior
+
+Expected behavior:
+- New prompts are queued while offline mode is active
+- Queue state persists to `~/.deepseek/sessions/checkpoints/offline_queue.json`
+
+Checks:
+1. Open queue in TUI: `/queue list`
+2. Confirm the persisted queue file exists and that its modification timestamp updates
+
+Actions:
+1. Restore connectivity
+2. Re-send queued entries (via `/queue edit <n>` followed by Enter, or through the normal input flow)
+3. Ensure queue file clears when queue is empty
+
+## Incident: Crash Recovery Needed
+
+Expected behavior:
+- Checkpoint stored at `~/.deepseek/sessions/checkpoints/latest.json`
+- Startup auto-restores checkpoint when no explicit `--resume` target is supplied
+
+Actions:
+1. Start TUI normally and verify "Recovered checkpoint session" status
+2. If automatic recovery fails, inspect checkpoint JSON for schema mismatch
+3. If schema is newer than binary supports, upgrade binary or remove stale checkpoint
+
+## Incident: Persistent State Schema Errors
+
+Symptoms:
+- Errors like `schema vX is newer than supported vY`
+
+Affected stores:
+- sessions (`~/.deepseek/sessions/*.json`)
+- runtime thread/turn/item records
+- tasks (`~/.deepseek/tasks/tasks/*.json`)
+
+Actions:
+1. Confirm binary version and migration expectations
+2. Back up the state directory before editing
+3. Either:
+   - run with a newer compatible binary, or
+   - archive incompatible records and regenerate state
+
+## Incident: MCP/Tool Execution Failures
+
+Checks:
+1. Validate `~/.deepseek/mcp.json` schema and server command paths
+2. Confirm server process can start manually
+3. Check sandbox denials in TUI history / logs
+
+Actions:
+1. Retry with required approvals (or YOLO only when appropriate)
+2. Temporarily disable failing MCP server and isolate issue
+3. Re-enable after verification with `/mcp` diagnostics
+
+## Post-Incident Checklist
+
+1. Preserve logs and relevant state files
+2. Record trigger, impact, and mitigation
+3. Add or update regression tests (retry/recovery/schema)
+4. Update this runbook and architecture docs if behavior changed
@@ -0,0 +1,181 @@
+# Runtime API (HTTP/SSE)
+
+DeepSeek CLI can expose a local runtime API for external clients:
+
+```bash
+deepseek serve --http --host 127.0.0.1 --port 7878 --workers 2
+```
+
+Defaults:
+- bind: `127.0.0.1:7878`
+- workers: `2` (clamped to `1..8`)
+
+## Security Model (Local-First)
+
+- The server is designed for trusted local use.
+- There is no built-in auth, user isolation, or TLS termination.
+- Do not expose this API directly to untrusted networks.
+- If remote access is required, place it behind your own authenticated reverse proxy/VPN.
+
+## Runtime Data Model
+
+The runtime uses a durable Thread/Turn/Item lifecycle.
+
+- `ThreadRecord`
+  - `id`, `created_at`, `updated_at`
+  - `model`, `workspace`, `mode`
+  - `latest_turn_id`, `latest_response_bookmark`, `archived`
+- `TurnRecord`
+  - `id`, `thread_id`
+  - `status`: `queued|in_progress|completed|failed|interrupted|canceled`
+  - timestamps, duration, usage, error summary
+- `TurnItemRecord`
+  - `id`, `turn_id`
+  - `kind`: `user_message|agent_message|tool_call|file_change|command_execution|context_compaction|status|error`
+  - lifecycle `status`: `queued|in_progress|completed|failed|interrupted|canceled`
+
+The event log is append-only, with a globally monotonic `seq` used for replay and resume.
+
+## Endpoints
+
+### Health and Session
+
+- `GET /health`
+- `GET /v1/sessions?limit=50&search=<substring>`
+
+### Compatibility Stream (Single Turn)
+
+- `POST /v1/stream`
+
+Backwards-compatible one-shot SSE wrapper. Internally creates an archived runtime thread+turn.
+
+Request body:
+
+```json
+{
+  "prompt": "Summarize recent commits",
+  "model": "deepseek-v3.2",
+  "mode": "agent",
+  "workspace": ".",
+  "allow_shell": false,
+  "trust_mode": false,
+  "auto_approve": true
+}
+```
+
+Typical SSE events:
+- `turn.started`
+- `message.delta`
+- `tool.started`
+- `tool.progress`
+- `tool.completed`
+- `approval.required`
+- `sandbox.denied`
+- `status`
+- `error`
+- `turn.completed`
+- `done`
+
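A client could assemble the `POST /v1/stream` call with the Python standard library along these lines. This is a sketch under assumptions: `build_stream_request` is a hypothetical helper, the base URL is the documented default bind, and the `Content-Type` and `Accept` headers are a guess at what the server expects. The request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_stream_request(prompt, base="http://127.0.0.1:7878", **options):
    """Build (but do not send) a one-shot stream request."""
    body = {"prompt": prompt, **options}
    return Request(
        f"{base}/v1/stream",
        data=json.dumps(body).encode("utf-8"),
        # Assumption: JSON in, server-sent events out.
        headers={"Content-Type": "application/json", "Accept": "text/event-stream"},
        method="POST",
    )


if __name__ == "__main__":
    request = build_stream_request(
        "Summarize recent commits",
        model="deepseek-v3.2",
        mode="agent",
        allow_shell=False,
    )
    print(request.get_method(), request.full_url)
```

Passing the result to `urllib.request.urlopen` would issue the call against a running `deepseek serve --http` instance; the response body can then be read line by line as SSE.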
+### Thread Lifecycle
+
+- `POST /v1/threads`
+- `GET /v1/threads?limit=50&include_archived=false`
+- `GET /v1/threads/{id}`
+- `POST /v1/threads/{id}/resume`
+- `POST /v1/threads/{id}/fork`
+
+Create thread request example:
+
+```json
+{
+  "model": "deepseek-v3.2",
+  "workspace": ".",
+  "mode": "agent",
+  "allow_shell": false,
+  "trust_mode": false,
+  "auto_approve": true,
+  "archived": false
+}
+```
+
+### Turn Lifecycle
+
+- `POST /v1/threads/{id}/turns`
+- `POST /v1/threads/{id}/turns/{turn_id}/steer`
+- `POST /v1/threads/{id}/turns/{turn_id}/interrupt`
+- `POST /v1/threads/{id}/compact`
+
+Notes:
+- Only one active turn is allowed per thread (`409 Conflict` on overlap).
+- `interrupt` returns quickly and marks `turn.interrupt_requested`.
+- Terminal turn status becomes `interrupted` only after cleanup completes.
+- Manual compaction is exposed as a turn with `context_compaction` item lifecycle events.
+
+### Replayable Events
+
+- `GET /v1/threads/{id}/events?since_seq=<u64>`
+
+Returns SSE replay backlog, then live events for that thread.
+
+SSE payload shape:
+
+```json
+{
+  "seq": 42,
+  "timestamp": "2026-02-11T20:18:49.123Z",
+  "thread_id": "thr_1234abcd",
+  "turn_id": "turn_5678efgh",
+  "item_id": "item_90ab12cd",
+  "event": "item.delta",
+  "payload": {
+    "delta": "partial output",
+    "kind": "agent_message"
+  }
+}
+```
+
+Common event names:
+- `thread.started`
+- `thread.forked`
+- `turn.started`
+- `turn.lifecycle`
+- `turn.steered`
+- `turn.interrupt_requested`
+- `turn.completed`
+- `item.started`
+- `item.delta`
+- `item.completed`
+- `item.failed`
+- `item.interrupted`
+- `approval.required`
+- `sandbox.denied`
+
+Compaction visibility:
+- auto compaction emits `item.started`/`item.completed` with item kind `context_compaction` and `auto=true`
+- manual compaction emits the same with `auto=false`
+
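The payload shape and `since_seq` replay described above suggest a simple client loop: decode each event, remember the highest `seq`, and reconnect from there. The Python sketch below assumes the JSON arrives in standard SSE `data:` lines terminated by a blank line; `parse_sse_data` and `next_since_seq` are hypothetical helpers, and whether `since_seq` is inclusive or exclusive is not specified here.

```python
import json


def parse_sse_data(lines):
    """Yield decoded JSON payloads from raw SSE lines; a blank line ends an event."""
    buffer = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("data:"):
            data = line[5:]
            if data.startswith(" "):
                data = data[1:]  # SSE allows one optional space after the colon
            buffer.append(data)
        elif line == "" and buffer:
            yield json.loads("\n".join(buffer))
            buffer = []


def next_since_seq(events, last_seen=0):
    """Track the highest sequence number observed, for use when reconnecting."""
    for event in events:
        last_seen = max(last_seen, event["seq"])
    return last_seen


if __name__ == "__main__":
    raw = [
        "event: item.delta",
        'data: {"seq": 42, "event": "item.delta", "payload": {"delta": "partial"}}',
        "",
    ]
    events = list(parse_sse_data(raw))
    print(next_since_seq(events))
```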
+### Background Tasks
+
+- `GET /v1/tasks`
+- `POST /v1/tasks`
+- `GET /v1/tasks/{id}`
+- `POST /v1/tasks/{id}/cancel`
+
+Tasks execute through the same runtime thread/turn pipeline and include:
+- linked `thread_id` / `turn_id`
+- runtime event count
+- timeline + tool summaries + artifact references
+
+## Persistence
+
+Runtime store (default under task data root):
+- `runtime/threads/*.json`
+- `runtime/turns/*.json`
+- `runtime/items/*.json`
+- `runtime/events/{thread_id}.jsonl`
+- `runtime/state.json` (monotonic sequence)
+
+Task store:
+- default `~/.deepseek/tasks` (override with `DEEPSEEK_TASKS_DIR`)
+
+Both runtime and task state are restart-safe.