Hunter Bown 44f3b2cae5 fix(engine): transparent retry on stream death with no content (#103 Phase 3)
When the chunked-transfer connection to DeepSeek dies mid-stream — the
"Stream read error: error decoding response body" symptom — the engine
previously surfaced the error to the user and ended the turn as Failed,
even when no useful content had been received. The user's only recourse
was to manually re-send the same message.

Phase 3 closes that loop. After the inner stream-consumption loop ends,
detect "stream died with nothing actionable":

  - stream_errors > 0 (the stream errored at some point)
  - tool_uses.is_empty() (no tool call landed)
  - current_text_visible is empty/whitespace
  - current_thinking is empty/whitespace
  - !pending_message_complete

If all hold AND stream_retry_attempts < MAX_STREAM_RETRIES (3), silently
re-issue the SAME outer-loop iteration: rebuild the request from
self.session.messages, call create_message_stream again, and start a
fresh inner loop. Surface a "Connection interrupted; retrying (N/3)"
status to the user so they know something's happening, but don't trip
the engine-level Error event so we don't double-display the failure as
a History cell.

Healthy rounds (stream_errors == 0) reset the retry budget so a single
proxy hiccup doesn't poison subsequent rounds in the same turn.

Crucially: if we got partial output (any tool call, any visible text,
or any thinking), we DON'T retry. Re-running the request would
double-bill the user; ship the partial state to the rest of the turn
pipeline (existing tool execution, content_blocks finalization) and
let the agent loop continue.
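
The decision described above can be sketched as a small pure function. This is a minimal illustration, not the engine's actual code: the field names follow the commit message, while the RoundState struct, the NextStep enum, and the next_step function are hypothetical names introduced here.

```rust
/// Retry budget named in the commit message.
const MAX_STREAM_RETRIES: u32 = 3;

/// Hypothetical snapshot of one streaming round. Field names follow the
/// commit message; the struct itself is an assumption of this sketch.
#[derive(Default)]
struct RoundState {
    stream_errors: u32,
    tool_uses: Vec<String>,
    current_text_visible: String,
    current_thinking: String,
    pending_message_complete: bool,
}

impl RoundState {
    /// True when the stream errored and nothing actionable was received.
    fn died_with_nothing_actionable(&self) -> bool {
        self.stream_errors > 0
            && self.tool_uses.is_empty()
            && self.current_text_visible.trim().is_empty()
            && self.current_thinking.trim().is_empty()
            && !self.pending_message_complete
    }
}

/// What the outer loop does once the inner stream loop has ended.
#[derive(Debug, PartialEq)]
enum NextStep {
    /// Re-issue the same request; `attempt` feeds the "retrying (N/3)" status.
    Retry { attempt: u32 },
    /// Hand whatever was received to the rest of the turn pipeline.
    Continue,
    /// Retry budget exhausted; surface the failure.
    Fail,
}

fn next_step(round: &RoundState, stream_retry_attempts: &mut u32) -> NextStep {
    if round.stream_errors == 0 {
        // Healthy round: reset the budget for later rounds in this turn.
        *stream_retry_attempts = 0;
        return NextStep::Continue;
    }
    if !round.died_with_nothing_actionable() {
        // Partial output exists: retrying would bill the user twice.
        return NextStep::Continue;
    }
    if *stream_retry_attempts < MAX_STREAM_RETRIES {
        *stream_retry_attempts += 1;
        NextStep::Retry { attempt: *stream_retry_attempts }
    } else {
        NextStep::Fail
    }
}
```

Keeping the predicate separate from the budget check makes the "partial output means no retry" rule testable on its own.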

Combined with #103 Phase 1+2 (TCP/HTTP2 keepalives + diagnostic logging
in client.rs), this should turn the user-visible "Turn failed: Stream
read error" into either a fully-recovered turn OR a clearly-labeled
3-attempts-exhausted failure.

Refs #103.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 17:37:48 -05:00

DeepSeek TUI

A terminal-native coding agent for DeepSeek V4 models — with 1M-token context, thinking-mode reasoning, and full tool-use.

npm i -g deepseek-tui


What is it?

DeepSeek TUI is a coding agent that runs entirely in your terminal. It gives DeepSeek's frontier models direct access to your workspace — reading and editing files, running shell commands, searching the web, managing git, and orchestrating sub-agents — all through a fast, keyboard-driven TUI.

Built for DeepSeek V4 (deepseek-v4-pro / deepseek-v4-flash) with 1M-token context windows and native thinking-mode (chain-of-thought) streaming. See the model's reasoning unfold in real time as it works through your tasks.

Key Features

  • Native RLM (rlm_query tool) — fans out 1–16 cheap deepseek-v4-flash children in parallel against the existing DeepSeek client for batched analysis, decomposition, or parallel reasoning
  • Thinking-mode streaming — shows DeepSeek's chain-of-thought as it reasons about your code
  • Full tool suite — file ops, shell execution, git, web search/browse, apply-patch, sub-agents, MCP servers
  • 1M-token context — automatic intelligent compaction when context fills up
  • Three interaction modes — Plan (read-only explore), Agent (interactive with approval), YOLO (auto-approved). Decomposition-first system prompts teach the model to todo_write, update_plan, and spawn sub-agents before acting
  • Reasoning-effort tiers — cycle through off → high → max with Shift+Tab
  • Session save/resume — checkpoint and resume long sessions
  • HTTP/SSE runtime API: deepseek serve --http for headless agent workflows
  • MCP protocol — connect to Model Context Protocol servers for extended tooling
  • Live cost tracking — per-turn and session-level token usage and cost estimates
  • Dark theme — DeepSeek-blue palette

Quickstart

npm install -g deepseek-tui
deepseek

On first launch you'll be prompted for your DeepSeek API key. You can also set it ahead of time:

# via CLI
deepseek login --api-key "YOUR_DEEPSEEK_API_KEY"

# via env var
export DEEPSEEK_API_KEY="YOUR_DEEPSEEK_API_KEY"
deepseek

Using NVIDIA NIM

deepseek auth set --provider nvidia-nim --api-key "YOUR_NVIDIA_API_KEY"
deepseek --provider nvidia-nim

# or per-process:
DEEPSEEK_PROVIDER=nvidia-nim NVIDIA_API_KEY="..." deepseek

Install from source

git clone https://github.com/Hmbown/DeepSeek-TUI.git
cd DeepSeek-TUI
cargo install --path crates/tui --bin deepseek-tui --locked   # requires Rust 1.85+
cargo install --path crates/cli --bin deepseek --locked

What's new in v0.6.0

🌊 rlm_query — recursive language models as a first-class tool

The model now has direct access to a native recursive-LLM primitive. Inspired by Alex Zhang's RLM work and Sakana AI's published research on novelty search, but trimmed to what an agent loop actually needs: one tool, structured args, no DSL.

// Single child:
rlm_query({ "prompt": "Summarise this 4k-line log: ..." })

// 8 parallel children, indexed result:
rlm_query({
  "prompts": [
    "Review src/foo.rs for race conditions: ...",
    "Review src/foo.rs for input validation: ...",
    "Review src/foo.rs for error-handling gaps: ...",
    "..."
  ]
})

// Promote one call to Pro:
rlm_query({ "prompt": "Hard reasoning here", "model": "deepseek-v4-pro" })

Children run concurrently against the existing DeepSeek client via tokio — no external binary, no Python sandbox, no fenced-block DSL. Returns a single string for one prompt or [i] ... indexed blocks for many. Available in Plan / Agent / YOLO. The cost is folded into the session's running total automatically.
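
The fan-out and the two return shapes can be illustrated with a short sketch. Everything here is an assumption made for illustration: it uses std threads as a stand-in for the tokio tasks the tool is described as using, the `child` closure stands in for a model call, the cap of 16 is hypothetical, and the exact "[i] ..." layout (zero-based index, blank line between blocks) is a guess at the format.

```rust
use std::thread;

/// Upper bound on parallel children; a hypothetical cap for this sketch.
const MAX_CHILDREN: usize = 16;

/// Runs `child` once per prompt, each on its own thread, and returns the
/// answers in prompt order. `child` stands in for a call to the model.
fn fan_out<F>(prompts: &[String], child: F) -> Result<Vec<String>, String>
where
    F: Fn(&str) -> String + Sync,
{
    if prompts.is_empty() || prompts.len() > MAX_CHILDREN {
        return Err(format!(
            "expected 1 to {} prompts, got {}",
            MAX_CHILDREN,
            prompts.len()
        ));
    }
    let child = &child;
    let answers: Vec<String> = thread::scope(|s| {
        // Spawn every child first so they all run concurrently.
        let handles: Vec<_> = prompts
            .iter()
            .map(|p| s.spawn(move || child(p.as_str())))
            .collect();
        // Join in spawn order so answer i belongs to prompt i.
        handles
            .into_iter()
            .map(|h| h.join().expect("child thread panicked"))
            .collect()
    });
    Ok(answers)
}

/// One answer is returned bare; several are rendered as "[i] ..." blocks.
fn render(answers: &[String]) -> String {
    match answers {
        [only] => only.clone(),
        many => many
            .iter()
            .enumerate()
            .map(|(i, a)| format!("[{i}] {a}"))
            .collect::<Vec<_>>()
            .join("\n\n"),
    }
}
```

Joining handles in spawn order keeps the indexed output stable regardless of which child finishes first.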

Other changes

  • Scroll position survives content rewrites — anchor fallback now clamps to the nearest surviving cell instead of teleporting to the bottom (#56)
  • Looser command-safety chains: cargo build && cargo test is no longer blocked outright; chains of known-safe commands escalate to RequiresApproval instead of Dangerous (#57)
  • Multi-turn tool calls no longer 400 on thinking-mode models: reasoning_content is replayed across user-message boundaries with a safe placeholder when the round produced none
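
The command-safety change can be pictured with a small classifier. This is a sketch under stated assumptions, not the project's implementation: only the level names RequiresApproval and Dangerous come from the changelog entry, while the Safe level, the KNOWN_SAFE allowlist, and the rule that anything unrecognised is Dangerous are simplifications introduced here.

```rust
#[derive(Debug, PartialEq)]
enum Safety {
    Safe,
    RequiresApproval,
    Dangerous,
}

/// Hypothetical allowlist; the real list is not shown in this README.
const KNOWN_SAFE: &[&str] = &["cargo build", "cargo test", "cargo check", "git status"];

/// A segment is known-safe when it is an allowlisted command, optionally
/// followed by arguments, and contains no other shell metacharacters.
fn is_known_safe(segment: &str) -> bool {
    let segment = segment.trim();
    let has_meta = segment.chars().any(|c| ";|&<>`$".contains(c));
    !has_meta
        && KNOWN_SAFE.iter().any(|safe| {
            segment == *safe
                || segment
                    .strip_prefix(*safe)
                    .map_or(false, |rest| rest.starts_with(' '))
        })
}

/// Splits on `&&` and classifies the whole chain.
fn classify(command: &str) -> Safety {
    let segments: Vec<&str> = command.split("&&").collect();
    let all_safe = segments.iter().all(|s| is_known_safe(s));
    match (all_safe, segments.len()) {
        (true, 1) => Safety::Safe,
        // A chain of known-safe commands asks for approval instead of
        // being blocked outright.
        (true, _) => Safety::RequiresApproval,
        (false, _) => Safety::Dangerous,
    }
}
```

Under these assumptions, classify("cargo build && cargo test") yields RequiresApproval, while a chain containing any unrecognised segment stays Dangerous.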

Full history: CHANGELOG.md.


Models & Pricing

DeepSeek TUI targets DeepSeek V4 models with 1M-token context windows by default.

Model              Context  Input (cache hit)  Input (cache miss)  Output
deepseek-v4-pro    1M       $0.003625 / 1M*    $0.435 / 1M*        $0.87 / 1M*
deepseek-v4-flash  1M       $0.0028 / 1M       $0.14 / 1M          $0.28 / 1M

Legacy aliases deepseek-chat and deepseek-reasoner silently map to deepseek-v4-flash.

NVIDIA NIM hosted variants (deepseek-ai/deepseek-v4-pro, deepseek-ai/deepseek-v4-flash) use your NVIDIA account terms — no DeepSeek platform billing.

*DeepSeek lists the Pro rates above as a limited-time 75% discount valid until 2026-05-05 15:59 UTC; the TUI cost estimator falls back to base Pro rates after that timestamp.
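
To make the pricing concrete, here is a sketch of how a per-turn cost could be computed from the listed rates. The rates are copied from the table above; the Rates and Usage types, their field names, and the undiscounted helper are hypothetical. The helper also assumes that "75% discount" means the listed Pro rate is 25% of the base rate.

```rust
/// USD per 1M tokens, as listed in the pricing table.
#[derive(Clone, Copy)]
struct Rates {
    input_cache_hit: f64,
    input_cache_miss: f64,
    output: f64,
}

const FLASH: Rates = Rates { input_cache_hit: 0.0028, input_cache_miss: 0.14, output: 0.28 };
const PRO_DISCOUNTED: Rates = Rates { input_cache_hit: 0.003625, input_cache_miss: 0.435, output: 0.87 };

/// Token counts for one turn (hypothetical field names).
struct Usage {
    cache_hit_tokens: u64,
    cache_miss_tokens: u64,
    output_tokens: u64,
}

/// Cost in USD: each token count times its rate, divided by one million.
fn cost_usd(rates: Rates, usage: &Usage) -> f64 {
    const TOKENS_PER_UNIT: f64 = 1_000_000.0;
    (usage.cache_hit_tokens as f64 * rates.input_cache_hit
        + usage.cache_miss_tokens as f64 * rates.input_cache_miss
        + usage.output_tokens as f64 * rates.output)
        / TOKENS_PER_UNIT
}

/// Recovers base rates from discounted ones, where `discount` is the
/// fraction taken off (0.75 for "75% off"). Assumes a uniform discount.
fn undiscounted(rates: Rates, discount: f64) -> Rates {
    let keep = 1.0 - discount;
    Rates {
        input_cache_hit: rates.input_cache_hit / keep,
        input_cache_miss: rates.input_cache_miss / keep,
        output: rates.output / keep,
    }
}
```

For a turn with 200,000 cache-hit input tokens, 50,000 cache-miss input tokens, and 10,000 output tokens, this gives about $0.0104 on deepseek-v4-flash and about $0.0312 on deepseek-v4-pro at the discounted rates. Under the 25%-of-base assumption, the same turn would cost about $0.1247 on Pro once the discount ends.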


Usage

deepseek                                      # interactive TUI
deepseek "explain this function"              # one-shot prompt
deepseek --model deepseek-v4-flash "summarize" # model override
deepseek --yolo                               # YOLO mode (auto-approve tools)
deepseek login --api-key "..."                # save API key
deepseek doctor                               # check setup & connectivity
deepseek models                               # list live API models
deepseek sessions                             # list saved sessions
deepseek resume --last                        # resume latest session
deepseek serve --http                         # HTTP/SSE API server

Keyboard shortcuts

Key             Action
Tab             Cycle mode: Plan → Agent → YOLO
Shift+Tab       Cycle reasoning-effort: off → high → max
F1              Help
Esc             Back / dismiss
Ctrl+K          Command palette
@path           Attach file/directory context in composer
/attach <path>  Attach image/video media references

Modes

Mode      Behavior
Plan 🔍   Read-only investigation — model explores and proposes a decomposition plan (update_plan + todo_write) before making changes
Agent 🤖  Default interactive mode — multi-step tool use with approval gates; model outlines work via todo_write before requesting writes
YOLO      Auto-approve all tools in a trusted workspace; model still creates todo_write/update_plan to keep work visible and trackable
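
The three modes amount to a small state machine plus an approval policy. The following is a sketch with assumed details: the Mode and Decision types and the decide function are hypothetical, wrapping from YOLO back to Plan on Tab is inferred from "cycle", and letting Agent mode run read-only tools without a prompt is a simplification that the table does not confirm.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Plan,
    Agent,
    Yolo,
}

#[derive(Debug, PartialEq)]
enum Decision {
    Run,
    AskUser,
    Refuse,
}

impl Mode {
    /// Tab cycles Plan → Agent → YOLO; wrapping to Plan is an assumption.
    fn next(self) -> Mode {
        match self {
            Mode::Plan => Mode::Agent,
            Mode::Agent => Mode::Yolo,
            Mode::Yolo => Mode::Plan,
        }
    }

    /// Decides what happens to a tool call, given whether it writes.
    fn decide(self, tool_writes: bool) -> Decision {
        match (self, tool_writes) {
            // Plan is read-only: writes never run.
            (Mode::Plan, true) => Decision::Refuse,
            // Agent gates writes behind user approval.
            (Mode::Agent, true) => Decision::AskUser,
            // YOLO auto-approves everything; reads run in every mode
            // (the latter is an assumption of this sketch).
            (Mode::Yolo, _) | (_, false) => Decision::Run,
        }
    }
}
```

Modelling the policy as a pure function of mode and tool kind keeps the approval rules in one place.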

Configuration

~/.deepseek/config.toml — see config.example.toml for every option.

Key environment overrides:

Variable           Purpose
DEEPSEEK_API_KEY   API key
DEEPSEEK_BASE_URL  API base URL
DEEPSEEK_MODEL     Default model
DEEPSEEK_PROVIDER  Provider: deepseek (default) or nvidia-nim
DEEPSEEK_PROFILE   Config profile name
NVIDIA_API_KEY     NVIDIA NIM API key
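
As an illustration of how these overrides might be resolved, the sketch below picks a provider from DEEPSEEK_PROVIDER and names the variable that would carry its API key. The variable names and the deepseek default come from the table; the Provider enum, the function names, and rejecting unknown values with an error are assumptions of this sketch.

```rust
use std::env;

#[derive(Debug, PartialEq)]
enum Provider {
    DeepSeek,
    NvidiaNim,
}

/// Resolves the provider through a lookup function so the logic can be
/// exercised without touching the real process environment.
fn resolve_provider(lookup: impl Fn(&str) -> Option<String>) -> Result<Provider, String> {
    match lookup("DEEPSEEK_PROVIDER").as_deref() {
        // Unset falls back to the documented default.
        None | Some("deepseek") => Ok(Provider::DeepSeek),
        Some("nvidia-nim") => Ok(Provider::NvidiaNim),
        Some(other) => Err(format!("unknown provider: {other}")),
    }
}

/// Name of the environment variable holding the key for a provider.
fn api_key_var(provider: &Provider) -> &'static str {
    match provider {
        Provider::DeepSeek => "DEEPSEEK_API_KEY",
        Provider::NvidiaNim => "NVIDIA_API_KEY",
    }
}

/// Convenience wrapper that reads the real environment.
fn provider_from_env() -> Result<Provider, String> {
    resolve_provider(|name| env::var(name).ok())
}
```

Passing the lookup as a closure is a design choice for testability; production code would call provider_from_env.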

Quick diagnostics:

deepseek-tui setup --status    # read-only status check (API key, MCP, sandbox, .env)
deepseek-tui doctor --json     # machine-readable doctor output for CI
deepseek-tui setup --tools --plugins  # scaffold tools/ and plugins/ directories

DeepSeek context caching is automatic — when the API returns cache hit/miss token fields, the TUI includes them in usage and cost tracking.

Full reference: docs/CONFIGURATION.md


Documentation

Doc                    Topic
ARCHITECTURE.md        Codebase internals
CONFIGURATION.md       Full config reference
MODES.md               Plan / Agent / YOLO modes
MCP.md                 Model Context Protocol integration
RUNTIME_API.md         HTTP/SSE API server
RELEASE_RUNBOOK.md     Release process
OPERATIONS_RUNBOOK.md  Ops & recovery

Contributing

See CONTRIBUTING.md. Pull requests welcome!

Not affiliated with DeepSeek Inc.

License

MIT
