feat(tools): probe Python interpreter at catalog-build time; doctor surfaces missing deps

Closes the "code_execution: program not found" failure reported by a Windows contributor and adds the surrounding plumbing so future external-binary dependencies can be added without each one repeating the same probe-and-skip logic. New module `crates/tui/src/dependencies.rs`: * `PYTHON_CANDIDATES = ["python3", "python", "py -3"]` — try the Unix-style name first, then the bare name (common on Windows and modern macOS), then the Windows launcher as a last resort. * `probe_executable(spec)` — splits the spec on whitespace, spawns `<program> [args...] --version`, returns true on success. stdout and stderr are routed to /dev/null so the probe doesn't print to the TUI's first frame. * `resolve_python_interpreter()` — OnceLock-cached resolver that returns the first candidate that probed successfully, or None. * `resolve_pdftotext()` — same shape, for doctor's PDF-path diagnostic. * `split_interpreter_spec("py -3") → ("py", ["-3"])` — needed so `tokio::process::Command::new(program).args(args).arg(script)` runs the Windows launcher correctly. Wiring in `core/engine/tool_catalog.rs::ensure_advanced_tooling`: * The hardcoded `catalog.push(Tool { name: "code_execution", … })` is now gated on `resolve_python_interpreter().is_some()`. On a machine without Python the tool is not advertised — the model never sees a tool it can't actually run, which removes the "Failed to execute tool: program not found" class of failure entirely. On a machine WITH Python the behavior is unchanged from the user's perspective. Wiring in `execute_code_execution_tool`: * Resolves the interpreter at call time (still cached) and splits on whitespace so `py -3` runs as `py -3 /tmp/.../code.py`. * Writes the user code to a tempfile under tempdir() and runs the file rather than passing it through `python3 -c "..."`. Fixes argv length limits on Windows, multiline code with quote nesting, and traceback line numbers (real filename instead of `<string>`). Folds in the contributor's Part 2 proposal. * Returns a clear actionable error if the interpreter somehow disappears between startup and the call (uninstall mid-session, PATH manipulation, etc.). Doctor (`crates/tui/src/main.rs::run_doctor`) gains two new sections: * "Tool Dependencies": ✓/✗ for Python and pdftotext with a platform-specific install hint when missing. The Python miss explicitly tells users that `code_execution` is not advertised. * "Terminal Quirks": shows which env-driven auto-overrides are currently active (VS Code / Ghostty low-motion, Termius/SSH low-motion per #1433, Ptyxis sync-output-off per v0.8.31). Answers the "why is my flicker / motion behaving this way" question without needing the user to read the source. The contributor's Part 1 proposal (share the sandbox/approval substrate with task_shell_start) is deferred to v0.8.32 — that touches `sandbox/` which CLAUDE.md flags as "stop and ask before changing" and needs an explicit design pass on approval semantics. Six new unit tests in `dependencies.rs` covering probe behavior, spec splitting, and cache stability. Existing tool_catalog tests still pass.
2026-05-11 22:54:52 -05:00
parent 80fc0046e1
commit 2b54b7e79d
4 changed files with 388 additions and 4 deletions
@@ -9,6 +9,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Fixed

+- **`code_execution` no longer fails with "program not found" on
+  Windows** (and any other host without `python3` on `PATH`). Before
+  v0.8.31 the tool hardcoded `python3` and was unconditionally
+  advertised in Agent / YOLO modes — so the model would call it,
+  spawn would fail, and the error surfaced as a generic tool failure
+  with no upstream hint. The fix probes for a Python interpreter
+  (`python3` → `python` → `py -3`) at catalog-build time, caches the
+  resolved interpreter, and only advertises `code_execution` when one
+  resolves. On hosts with no Python the tool is not registered at all
+  — the model never sees a tool it can't actually run. Reported by a
+  Windows contributor; resolver lives at
+  `crates/tui/src/dependencies.rs` and is also surfaced by
+  `deepseek doctor`. Folds in the contributor's "write code to a
+  tempfile and run the file" suggestion at the same time, so multiline
+  code with quote nesting no longer round-trips through `python3 -c`.
 - **Termius and every SSH session auto-enable low-motion**
  (#1433, harvested from PR #1479 by **@CrepuscularIRIS / autoghclaw**).
  Termius desktop sets `TERM_PROGRAM=Termius`; sshd exports
@@ -39,6 +54,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Added

+- **`deepseek doctor` now reports tool-dependency status.** A new
+  "Tool Dependencies" section lists which external binaries the
+  registered tools rely on, with ✓ when present and ✗ + an
+  install hint when missing. Today this covers the Python
+  interpreter (`code_execution`) and `pdftotext` (`read_file` PDF
+  path). A separate "Terminal Quirks" section shows which env-driven
+  auto-overrides (VS Code / Ghostty / Termius / SSH / Ptyxis) are
+  currently active so users can see at a glance why a particular
+  rendering compromise is in effect. Foundation for surfacing future
+  tool dependencies as the toolset grows.
 - **New `synchronized_output` setting** controls whether the renderer
  wraps each frame in DEC mode 2026 synchronized output. Accepts
  `auto` (default; respect the Ptyxis env opt-out), `on` (always emit
@@ -120,7 +120,17 @@ pub(super) fn build_model_tool_catalog(
 }

 pub(super) fn ensure_advanced_tooling(catalog: &mut Vec<Tool>, mode: AppMode) {
-    if mode != AppMode::Plan && !catalog.iter().any(|t| t.name == CODE_EXECUTION_TOOL_NAME) {
+    // code_execution depends on a locally-installed Python interpreter
+    // (python3 / python / py -3). Before v0.8.31, the tool was always
+    // advertised and would fail at execution time on Windows where
+    // `python3` isn't on PATH — the model treated the tool as reliable
+    // once it appeared in the catalog. We now probe at catalog-build
+    // time and only advertise when an interpreter resolves. See
+    // `crate::dependencies::resolve_python_interpreter` for the probe.
+    if mode != AppMode::Plan
+        && !catalog.iter().any(|t| t.name == CODE_EXECUTION_TOOL_NAME)
+        && crate::dependencies::resolve_python_interpreter().is_some()
+    {
        catalog.push(Tool {
            tool_type: Some(CODE_EXECUTION_TOOL_TYPE.to_string()),
            name: CODE_EXECUTION_TOOL_NAME.to_string(),
@@ -586,9 +596,45 @@ pub(super) async fn execute_code_execution_tool(
    workspace: &Path,
 ) -> Result<ToolResult, ToolError> {
    let code = required_str(input, "code")?;
-    let mut cmd = tokio::process::Command::new("python3");
-    cmd.arg("-c");
-    cmd.arg(code);
+
+    // Resolve the locally-installed Python interpreter we cached at
+    // catalog-build time. If it's absent now (somehow registered but
+    // disappeared between startup and this call — concurrent uninstall,
+    // PATH change, etc.) we fail fast with a clear message rather than
+    // dropping into `tokio::process::Command::new("python3")` and
+    // surfacing the cryptic "program not found" the contributor
+    // originally hit on Windows.
+    let interpreter = crate::dependencies::resolve_python_interpreter().ok_or_else(|| {
+        ToolError::execution_failed(format!(
+            "code_execution: no Python interpreter found on PATH (tried {:?}). \
+             Install Python 3 and ensure one of these is on PATH, then restart \
+             deepseek-tui.",
+            crate::dependencies::PYTHON_CANDIDATES,
+        ))
+    })?;
+    let (program, args) = crate::dependencies::split_interpreter_spec(&interpreter);
+
+    // Write the code to a temp file and execute it as a script rather
+    // than passing it via `-c "<code>"`. Reasons:
+    //   * `-c` has length limits (argv) on Windows.
+    //   * Multiline code with quote nesting is brittle through `-c`.
+    //   * Tracebacks reference a real filename instead of `<string>`,
+    //     so the model can interpret line numbers correctly.
+    // Tempfile lives only for the duration of this execution; Drop
+    // removes it. We use `.py` so any shebang / encoding-sniffer
+    // logic in the interpreter behaves normally.
+    let temp_dir = tempfile::tempdir()
+        .map_err(|e| ToolError::execution_failed(format!("tempdir failed: {e}")))?;
+    let script_path = temp_dir.path().join("code_execution.py");
+    tokio::fs::write(&script_path, code)
+        .await
+        .map_err(|e| ToolError::execution_failed(format!("tempfile write failed: {e}")))?;
+
+    let mut cmd = tokio::process::Command::new(&program);
+    for arg in &args {
+        cmd.arg(arg);
+    }
+    cmd.arg(&script_path);
    cmd.current_dir(workspace);

    let output = tokio::time::timeout(Duration::from_secs(120), cmd.output())
@@ -0,0 +1,210 @@
+//! External-binary dependency resolution for tools that shell out to
+//! locally-installed programs (Python for `code_execution`, `pdftotext`
+//! for PDF reading in `read_file`, future tools as added).
+//!
+//! Before v0.8.31, tools that called external binaries hardcoded the
+//! command name and failed at execution time when the binary wasn't on
+//! `PATH`. The most-cited example was `code_execution`, which spawned
+//! `python3` directly — Windows users (where the launcher is `py` or
+//! `python`, not `python3`) saw `Failed to execute tool: program not
+//! found` with no upstream hint of what was wrong.
+//!
+//! This module centralises the probe-then-decide pattern. The two
+//! supported callers today are:
+//!
+//! - Tool catalog construction (`core::engine::tool_catalog`): for
+//!   tools that should be advertised to the model only when the
+//!   required runtime is present.
+//! - Doctor command (`run_doctor` in `main.rs`): for surfacing the
+//!   resolved state to the user so missing dependencies aren't an
+//!   invisible failure.
+//!
+//! Results are cached for the process lifetime via [`std::sync::OnceLock`]
+//! — probing a binary involves a `Command::output` per candidate and
+//! we'd rather not pay that on every model turn.
+
+use std::process::Command;
+use std::sync::OnceLock;
+
+/// Candidate executable names for the Python interpreter, in the
+/// order we try them. On Windows the launcher convention is `py -3`,
+/// so we add it as a third option; the resolver splits on whitespace
+/// at execution time so `py -3 /tmp/code.py` runs correctly.
+///
+/// Order matters: `python3` first because it's the unambiguous v3
+/// binary on Unix and rules out Python 2 leftovers. `python` second
+/// covers Windows installations that drop the version suffix and
+/// modern macOS where Homebrew installs both. `py -3` last as a
+/// Windows-launcher fallback.
+pub const PYTHON_CANDIDATES: &[&str] = &["python3", "python", "py -3"];
+
+/// Probe a single executable. Returns `true` when the candidate
+/// responds to `--version` with a successful exit. Splits on
+/// whitespace so `"py -3"` works as a candidate.
+///
+/// We deliberately use `--version` rather than `which` so the probe
+/// is portable across Unix, Windows (no `which` by default), and
+/// containers. The downside is that we spawn a subprocess per
+/// candidate; the resolver caches the result so this only fires
+/// once per process.
+#[must_use]
+pub fn probe_executable(spec: &str) -> bool {
+    let mut parts = spec.split_whitespace();
+    let Some(program) = parts.next() else {
+        return false;
+    };
+    let mut cmd = Command::new(program);
+    for arg in parts {
+        cmd.arg(arg);
+    }
+    cmd.arg("--version");
+
+    // Silence the subprocess's stdout/stderr — `--version` would
+    // otherwise print to our terminal during startup, which is
+    // confusing on the TUI's first frame.
+    cmd.stdout(std::process::Stdio::null());
+    cmd.stderr(std::process::Stdio::null());
+
+    matches!(cmd.status(), Ok(status) if status.success())
+}
+
+/// Resolve the Python interpreter once per process. Returns the
+/// candidate spec (e.g. `"python3"` or `"py -3"`) that succeeded,
+/// or `None` when every candidate failed.
+///
+/// Callers that need to spawn the interpreter should split this
+/// string on whitespace — see [`split_interpreter_spec`].
+pub fn resolve_python_interpreter() -> Option<String> {
+    static CACHE: OnceLock<Option<String>> = OnceLock::new();
+    CACHE
+        .get_or_init(|| {
+            for candidate in PYTHON_CANDIDATES {
+                if probe_executable(candidate) {
+                    tracing::info!(
+                        target: "tool_dependencies",
+                        candidate = candidate,
+                        "Resolved Python interpreter for code_execution",
+                    );
+                    return Some((*candidate).to_string());
+                }
+            }
+            tracing::warn!(
+                target: "tool_dependencies",
+                tried = ?PYTHON_CANDIDATES,
+                "No Python interpreter found; code_execution tool will not be advertised",
+            );
+            None
+        })
+        .clone()
+}
+
+/// Resolve `pdftotext` (from Poppler) once per process. Used by
+/// `read_file`'s PDF path for graceful fallback messaging. Unlike
+/// the Python case, `read_file` itself still works for text files
+/// when `pdftotext` is missing — this resolver exists so the doctor
+/// command can surface the miss explicitly rather than the user
+/// hitting "PDF unsupported" on a read attempt.
+pub fn resolve_pdftotext() -> Option<String> {
+    static CACHE: OnceLock<Option<String>> = OnceLock::new();
+    CACHE
+        .get_or_init(|| {
+            if probe_executable("pdftotext") {
+                Some("pdftotext".to_string())
+            } else {
+                None
+            }
+        })
+        .clone()
+}
+
+/// Split an interpreter spec like `"py -3"` into the program name
+/// and any initial arguments. Returns `("py", vec!["-3"])` for the
+/// example; returns `("python3", vec![])` for a bare name.
+///
+/// Callers spawn `Command::new(program).args(args).arg(script_path)`.
+#[must_use]
+pub fn split_interpreter_spec(spec: &str) -> (String, Vec<String>) {
+    let mut parts = spec.split_whitespace();
+    let program = parts.next().unwrap_or("").to_string();
+    let args = parts.map(str::to_string).collect();
+    (program, args)
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[test]
+    fn probe_executable_returns_false_for_unknown_binary() {
+        // Pick a name we're confident isn't on any developer's PATH.
+        // If this ever starts failing locally, rename it.
+        assert!(!probe_executable("deepseek-tui-imaginary-binary-xyz123"));
+    }
+
+    #[test]
+    fn probe_executable_handles_multi_word_specs() {
+        // `py -3` should split correctly. The probe will fail on
+        // most non-Windows machines (no `py` launcher), which is
+        // fine — we're checking that the *split* doesn't crash.
+        let _ = probe_executable("py -3");
+    }
+
+    #[test]
+    fn split_interpreter_spec_strips_args() {
+        assert_eq!(
+            split_interpreter_spec("python3"),
+            ("python3".to_string(), Vec::<String>::new())
+        );
+        assert_eq!(
+            split_interpreter_spec("py -3"),
+            ("py".to_string(), vec!["-3".to_string()])
+        );
+        assert_eq!(
+            split_interpreter_spec("  python3  "),
+            ("python3".to_string(), Vec::<String>::new()),
+            "leading/trailing whitespace must be tolerated"
+        );
+    }
+
+    #[test]
+    fn split_interpreter_spec_handles_empty_string() {
+        assert_eq!(
+            split_interpreter_spec(""),
+            (String::new(), Vec::<String>::new())
+        );
+    }
+
+    #[test]
+    fn python_resolver_is_cached_across_calls() {
+        // Whatever the first call returns, subsequent calls return
+        // the same value (cached). If this test ever flakes, the
+        // OnceLock semantics changed and we need to rethink the
+        // resolver.
+        let first = resolve_python_interpreter();
+        let second = resolve_python_interpreter();
+        assert_eq!(first, second);
+    }
+
+    #[test]
+    fn python_resolver_returns_some_on_developer_machines() {
+        // CI hosts have Python; developer machines have Python.
+        // The one environment where this returns None is bare-bones
+        // Windows / minimal CI containers — fine, those just don't
+        // get code_execution registered, which is the whole point.
+        // We don't assert Some() because we don't want this test
+        // to fail in those environments. Instead we just confirm
+        // the resolver doesn't panic and returns a stable value.
+        let resolved = resolve_python_interpreter();
+        if let Some(name) = resolved {
+            assert!(
+                !name.is_empty(),
+                "resolved interpreter name must be non-empty"
+            );
+            // The resolved name must be one of our candidates.
+            assert!(
+                PYTHON_CANDIDATES.contains(&name.as_str()),
+                "resolved {name:?} is not in PYTHON_CANDIDATES {PYTHON_CANDIDATES:?}"
+            );
+        }
+    }
+}
@@ -30,6 +30,7 @@ mod core;
 mod cost_status;
 mod cycle_manager;
 mod deepseek_theme;
+mod dependencies;
 mod error_taxonomy;
 mod eval;
 mod execpolicy;
@@ -2090,6 +2091,108 @@ async fn run_doctor(config: &Config, workspace: &Path, config_path_override: Opt
        }
    }

+    // Tool dependencies — probe external binaries that individual
+    // tools rely on (Python for code_execution, pdftotext for PDF
+    // reading) so users see explicit ✓/✗ rather than the tool failing
+    // at execution time with "program not found". New in v0.8.31.
+    println!();
+    println!("{}", "Tool Dependencies:".bold());
+
+    match crate::dependencies::resolve_python_interpreter() {
+        Some(name) => println!(
+            "  {} Python: {} → code_execution tool registered",
+            "✓".truecolor(aqua_r, aqua_g, aqua_b),
+            name
+        ),
+        None => {
+            println!(
+                "  {} Python: not found (tried {:?})",
+                "✗".truecolor(red_r, red_g, red_b),
+                crate::dependencies::PYTHON_CANDIDATES,
+            );
+            println!("    code_execution tool is NOT advertised to the model on this install.");
+            println!("    Install Python 3 and ensure one of those names is on PATH:");
+            match std::env::consts::OS {
+                "macos" => {
+                    println!("      brew install python@3.12   (or download from python.org)")
+                }
+                "linux" => println!(
+                    "      sudo apt install python3    (Debian/Ubuntu) — or your distro's equivalent"
+                ),
+                "windows" => {
+                    println!("      winget install Python.Python.3   (or download from python.org)")
+                }
+                other => println!("      install Python 3 for {other} from python.org"),
+            }
+        }
+    }
+
+    match crate::dependencies::resolve_pdftotext() {
+        Some(_) => println!(
+            "  {} pdftotext: available → read_file extracts PDF text",
+            "✓".truecolor(aqua_r, aqua_g, aqua_b),
+        ),
+        None => {
+            println!(
+                "  {} pdftotext: not found → read_file falls back to a not-supported error for .pdf files",
+                "!".truecolor(sky_r, sky_g, sky_b),
+            );
+            println!("    (Text files still work without pdftotext; this only affects PDF reads.)");
+            match std::env::consts::OS {
+                "macos" => println!("    Install via: brew install poppler"),
+                "linux" => {
+                    println!("    Install via: sudo apt install poppler-utils   (Debian/Ubuntu)")
+                }
+                "windows" => println!(
+                    "    Install Poppler for Windows from https://blog.alivate.com.au/poppler-windows/"
+                ),
+                _ => {}
+            }
+        }
+    }
+
+    // Terminal-quirk overrides currently active. Mirrors the env
+    // signals checked by `Settings::apply_env_overrides` so users
+    // can see at a glance which a11y/compat overrides fired.
+    println!();
+    println!("{}", "Terminal Quirks:".bold());
+    let term_program = std::env::var("TERM_PROGRAM").unwrap_or_default();
+    let term_program_lc = term_program.to_ascii_lowercase();
+    let mut any_quirk = false;
+    if matches!(term_program.as_str(), "vscode" | "ghostty") {
+        println!(
+            "  {} TERM_PROGRAM={} → low_motion + fancy_animations=false (auto)",
+            "•".truecolor(sky_r, sky_g, sky_b),
+            term_program
+        );
+        any_quirk = true;
+    }
+    if term_program == "Termius"
+        || std::env::var_os("SSH_CLIENT").is_some_and(|v| !v.is_empty())
+        || std::env::var_os("SSH_TTY").is_some_and(|v| !v.is_empty())
+    {
+        println!(
+            "  {} SSH/Termius session → low_motion + fancy_animations=false (auto, #1433)",
+            "•".truecolor(sky_r, sky_g, sky_b)
+        );
+        any_quirk = true;
+    }
+    if term_program_lc.contains("ptyxis")
+        || std::env::var_os("PTYXIS_VERSION").is_some_and(|v| !v.is_empty())
+    {
+        println!(
+            "  {} Ptyxis detected → synchronized_output=off (auto, v0.8.31)",
+            "•".truecolor(sky_r, sky_g, sky_b)
+        );
+        any_quirk = true;
+    }
+    if !any_quirk {
+        println!(
+            "  {} no env-driven terminal-quirk overrides active",
+            "·".dimmed()
+        );
+    }
+
    // Platform and sandbox checks
    println!();
    println!("{}", "Platform:".bold());