Files
codewhale/docs/RUNTIME_API.md
T
Hunter Bown 0047b3225b feat(runtime-api): daemon API quartet for whalescale (#561 #562 #563 #564) (#567)
Bridge work to unblock whalescale-desktop's Settings/Composer/Archived-chats
flows without requiring a daemon recompile per dev-port or client-side
aggregation.

#561 / whalescale#255 — CORS allow-list configurable
* Add `[runtime_api] cors_origins` config field, `--cors-origin URL`
  (repeatable) flag on `deepseek serve --http`, and `DEEPSEEK_CORS_ORIGINS`
  env var. User entries stack on top of the built-in defaults
  (localhost:3000, localhost:1420, tauri://localhost). Resolution preserves
  first-seen order and drops empty/duplicate values; invalid HeaderValues
  log a warning and are skipped.
* Refactor `cors_layer()` to read merged origins from `RuntimeApiState`.

#562 / whalescale#256 — `PATCH /v1/threads/{id}` accepts the full editable
field set
* Extend `UpdateThreadRequest` with `allow_shell`, `trust_mode`,
  `auto_approve`, `model`, `mode`, `title`, `system_prompt`. Each is
  optional; missing means no change. Empty-string clears `title`/
  `system_prompt`. Empty `model`/`mode` rejected with 400.
* Add `title: Option<String>` to `ThreadRecord` (additive, no schema bump
  per documented criteria — old readers ignore the field without
  misinterpretation). `list_threads_summary` now returns the user-set title
  when present, falling back to the derived input-summary title.
* `thread.updated` event payload now carries a `changes` map with only the
  fields that actually changed.

#563 / whalescale#260 — list-archived-only filter
* New `archived_only=true` query param on `GET /v1/threads` and
  `GET /v1/threads/summary`. Backed by a new `ThreadListFilter` enum
  (`ActiveOnly` | `IncludeArchived` | `ArchivedOnly`). `archived_only`
  takes precedence over `include_archived`. Default behavior unchanged.

#564 / whalescale#261 — `GET /v1/usage` aggregation
* New `RuntimeThreadManager::aggregate_usage` walks all threads/turns,
  filters by inclusive `since`/`until` RFC 3339 bounds, accumulates token
  totals + cost (via `pricing::calculate_turn_cost_from_usage`), and
  groups by `day` (default), `model`, `provider`, or `thread`.
* New `GET /v1/usage` route. `since`/`until`/`group_by` query params,
  `since > until` and unknown `group_by` rejected with 400. Empty time
  ranges yield empty `buckets` (never 404).

5 new tests cover preflight Allow-Origin echoing for both default and
extra origins, the extended PATCH field set + clear-by-empty + 400 paths,
the archived_only filter on list + summary endpoints, and the
/v1/usage envelope + validation errors. Existing 13 runtime_api tests
continue to pass; the parity gates and full workspace test suite are clean.

`docs/RUNTIME_API.md` and `config.example.toml` updated to document the
new params, body shape, endpoint, and CORS knob.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 02:18:19 -05:00

11 KiB
Raw Blame History

Runtime API & Integration Contract

DeepSeek TUI exposes a local runtime API through deepseek serve --http and machine-readable health via deepseek doctor --json. This document is the stable integration contract for native macOS workbench applications (and other local supervisors) that embed the DeepSeek engine without screen-scraping terminal output.

Architecture

macOS workbench (or any local supervisor)
        │
        ├─ deepseek doctor --json   → machine-readable health & capability
        ├─ deepseek serve --http    → HTTP/SSE runtime API
        ├─ deepseek serve --mcp     → MCP stdio server
        └─ deepseek [args]          → interactive TUI session

The engine runs as a local-only process. All APIs bind to localhost by default. No hosted relay, no provider-token custody, no secret leakage.

Capability endpoint: deepseek doctor --json

Returns a JSON object describing the current installation's readiness state. Suitable for health-check polling from a macOS workbench.

deepseek doctor --json

Response schema (key fields)

Field Type Description
version string Installed version (e.g. "0.8.9")
config_path string Resolved config file path
config_present bool Whether the config file exists
workspace string Default workspace directory
api_key.source string env, config, or missing
base_url string API base URL
default_text_model string Default model
memory.enabled bool Whether the memory feature is on
memory.path string Path to memory file
memory.file_present bool Whether memory file exists
mcp.config_path string MCP config file path
mcp.present bool Whether MCP config exists
mcp.servers array Per-server health: {name, enabled, status, detail}
skills.selected string Resolved skills directory
skills.global.path / .present / .count Global skills dir
skills.agents.path / .present / .count .agents/skills/ dir
skills.local.path / .present / .count skills/ dir
skills.opencode.path / .present / .count .opencode/skills/ dir
skills.claude.path / .present / .count .claude/skills/ dir
tools.path / .present / .count Global tools directory
plugins.path / .present / .count Global plugins directory
sandbox.available bool Whether sandbox is supported on this OS
sandbox.kind string or null Sandbox kind (e.g. "macos_seatbelt")
storage.spillover.path / .present / .count Tool output spillover dir
storage.stash.path / .present / .count Composer stash

Example

{
  "version": "0.8.9",
  "config_path": "/Users/you/.deepseek/config.toml",
  "config_present": true,
  "workspace": "/Users/you/projects/deepseek-tui",
  "api_key": {
    "source": "env"
  },
  "base_url": "https://api.deepseek.com",
  "default_text_model": "deepseek-v4-pro",
  "memory": {
    "enabled": false,
    "path": "/Users/you/.deepseek/memory.md",
    "file_present": true
  },
  "mcp": {
    "config_path": "/Users/you/.deepseek/mcp.json",
    "present": true,
    "servers": [
      {"name": "filesystem", "enabled": true, "status": "ok", "detail": "ready"}
    ]
  },
  "sandbox": {
    "available": true,
    "kind": "macos_seatbelt"
  }
}

HTTP/SSE runtime API: deepseek serve --http

deepseek serve --http [--host 127.0.0.1] [--port 7878] [--workers 2]

Defaults: host 127.0.0.1, port 7878, 2 workers (clamped 18).

The server binds to localhost by default. Configuration is via CLI flags — there is no [app_server] config section.

Endpoints

Health

  • GET /health

Sessions (legacy session manager)

  • GET /v1/sessions?limit=50&search=<substring>
  • GET /v1/sessions/{id}
  • DELETE /v1/sessions/{id}
  • POST /v1/sessions/{id}/resume-thread

Threads (durable runtime data model)

  • GET /v1/threads?limit=50&include_archived=false&archived_only=false
  • GET /v1/threads/summary?limit=50&search=<optional>&include_archived=false&archived_only=false
  • POST /v1/threads
  • GET /v1/threads/{id}
  • PATCH /v1/threads/{id} (see body shape below)
  • POST /v1/threads/{id}/resume
  • POST /v1/threads/{id}/fork

archived_only=true returns archived threads only (mutually overrides include_archived). Default behavior is unchanged: include_archived=false and archived_only=false returns active threads. Added in v0.8.10 (#563).

PATCH /v1/threads/{id} body — every field is optional, missing means "no change". At least one field must be present. title and system_prompt accept an empty string to clear a previously-set value. Added in v0.8.10 (#562):

{
  "archived": true,
  "allow_shell": false,
  "trust_mode": false,
  "auto_approve": false,
  "model": "deepseek-v4-pro",
  "mode": "agent",
  "title": "User-set thread title",
  "system_prompt": "You are a useful assistant."
}

Turns (within a thread)

  • POST /v1/threads/{id}/turns
  • POST /v1/threads/{id}/turns/{turn_id}/steer
  • POST /v1/threads/{id}/turns/{turn_id}/interrupt
  • POST /v1/threads/{id}/compact (manual compaction)

Events (SSE replay + live stream)

  • GET /v1/threads/{id}/events?since_seq=<u64>

Compatibility stream (one-shot, backwards-compatible)

  • POST /v1/stream

Tasks (durable background work)

  • GET /v1/tasks
  • POST /v1/tasks
  • GET /v1/tasks/{id}
  • POST /v1/tasks/{id}/cancel

Automations (scheduled recurring work)

  • GET /v1/automations
  • POST /v1/automations
  • GET /v1/automations/{id}
  • PATCH /v1/automations/{id}
  • DELETE /v1/automations/{id}
  • POST /v1/automations/{id}/run
  • POST /v1/automations/{id}/pause
  • POST /v1/automations/{id}/resume
  • GET /v1/automations/{id}/runs?limit=20

Introspection

  • GET /v1/workspace/status
  • GET /v1/skills
  • GET /v1/apps/mcp/servers
  • GET /v1/apps/mcp/tools?server=<optional>

Usage (token/cost aggregation across threads)

  • GET /v1/usage?since=<rfc3339>&until=<rfc3339>&group_by=<day|model|provider|thread>

since / until are inclusive RFC 3339 timestamps and may be omitted (no bound). group_by defaults to day. Buckets are sorted by ascending key. Empty time ranges produce empty buckets (never a 404). Cost is computed via the model→pricing map; turns whose model has no pricing entry contribute tokens but 0.0 cost. Added in v0.8.10 (#564).

{
  "since": "2026-04-01T00:00:00Z",
  "until": "2026-04-30T23:59:59Z",
  "group_by": "day",
  "totals": {
    "input_tokens": 12345,
    "output_tokens": 6789,
    "cached_tokens": 0,
    "reasoning_tokens": 0,
    "cost_usd": 0.012,
    "turns": 42
  },
  "buckets": [
    {
      "key": "2026-04-30",
      "input_tokens": 1234,
      "output_tokens": 678,
      "cached_tokens": 0,
      "reasoning_tokens": 0,
      "cost_usd": 0.001,
      "turns": 3
    }
  ]
}

Runtime data model

The runtime uses a durable Thread/Turn/Item lifecycle.

  • ThreadRecordid, created_at, updated_at, model, workspace, mode, task_id, coherence_state, system_prompt, latest_turn_id, latest_response_bookmark, archived
  • TurnRecordid, thread_id, status (queued|in_progress|completed| failed|interrupted|canceled), timestamps, duration, usage, error summary
  • TurnItemRecordid, turn_id, kind (user_message|agent_message| tool_call|file_change|command_execution|context_compaction|status|error), lifecycle status, metadata

Events are append-only with a global monotonic seq for replay/resume.

Restart semantics

  • If the process restarts while a turn or item is queued or in_progress, the recovered record is marked interrupted with an "Interrupted by process restart" error.
  • Task execution performs its own recovery on top of the same persisted thread/turn store.

Approval model

  • The auto_approve flag applies to the runtime approval bridge and engine tool context. When enabled for a thread/turn/task, approval-required tools are auto-approved in the non-interactive runtime path, shell safety checks run in auto-approved mode, and spawned sub-agents inherit that setting.
  • When omitted, auto_approve defaults to false.

SSE event stream

The SSE event payload shape:

{
  "seq": 42,
  "timestamp": "2026-02-11T20:18:49.123Z",
  "thread_id": "thr_1234abcd",
  "turn_id": "turn_5678efgh",
  "item_id": "item_90ab12cd",
  "event": "item.delta",
  "payload": {
    "delta": "partial output",
    "kind": "agent_message"
  }
}

Common event names: thread.started, thread.forked, turn.started, turn.lifecycle, turn.steered, turn.interrupt_requested, turn.completed, item.started, item.delta, item.completed, item.failed, item.interrupted, approval.required, sandbox.denied, coherence.state.

Security boundary

  • Localhost only. The server binds to 127.0.0.1 by default. Set --host 0.0.0.0 only when you have a reverse-proxy / VPN that authenticates — there is no built-in auth, user isolation, or TLS.
  • No provider-token custody. The server never returns the API key. The api_key.source capability field reports env, config, or missing — never the key itself.
  • No hosted relay. The app-server is a local process under the user's control. There is no cloud component.
  • Capability responses never leak secrets, file contents, or session message bodies. They report metadata: presence, counts, status flags.

CORS allow-list

The runtime API ships with a built-in dev-origin allow-list: http://localhost:3000, http://127.0.0.1:3000, http://localhost:1420, http://127.0.0.1:1420, tauri://localhost. To add additional origins (e.g. when developing a UI on Vite's default :5173), use any of:

  • CLI flag (repeatable): deepseek serve --http --cors-origin http://localhost:5173
  • Env var (comma-separated): DEEPSEEK_CORS_ORIGINS="http://localhost:5173,http://localhost:8080"
  • Config (~/.deepseek/config.toml):
    [runtime_api]
    cors_origins = ["http://localhost:5173"]
    

User-supplied origins stack on top of the built-in defaults; they do not replace them. Wildcard origins are not supported — the explicit allow-list model is preserved. Added in v0.8.10 (#561).

Session lifecycle (native UI supervision)

Operation Endpoint
List sessions GET /v1/sessions
Get session GET /v1/sessions/{id}
Delete session DELETE /v1/sessions/{id}
Resume into thread POST /v1/sessions/{id}/resume-thread
Create thread POST /v1/threads
List threads GET /v1/threads
Attach to events GET /v1/threads/{id}/events?since_seq=0
Send message POST /v1/threads/{id}/turns
Steer POST /v1/threads/{id}/turns/{turn_id}/steer
Interrupt POST /v1/threads/{id}/turns/{turn_id}/interrupt
Compact POST /v1/threads/{id}/compact

Compatibility tests

Contract snapshots live in crates/protocol/tests/. Run:

cargo test -p deepseek-protocol --test parity_protocol --locked

This validates that the app-server's event schema hasn't drifted from the documented contract. CI runs this on every push to main and on release tags.