feat(tools): spillover-file writer + 7-day boot prune (#422)

#500 (tool-output spillover preview pane) explicitly depends on #422 (the storage writer) and #423 (the UI annotation). This ships the storage half so the other two unblock cleanly. ### What's wired New module \`crates/tui/src/tools/truncate.rs\`: - \`spillover_root()\` — resolves \`~/.deepseek/tool_outputs/\`. - \`spillover_path(id)\` — sanitises the tool call id (ASCII alphanumerics + \`-\`/\`_\`, drops \`.\` so \`..\` can't escape) and returns \`<root>/<id>.txt\`. - \`write_spillover(id, content)\` — atomic write via the existing \`utils::write_atomic\` helper. Creates parent directory on demand. - \`prune_older_than(max_age)\` — drops files older than \`max_age\` by mtime. Returns the count pruned. Per-file errors are logged and skipped, never propagated. - \`maybe_spillover(id, content, threshold, head_bytes)\` — convenience for the "too long? spill it." pattern. Walks back to the previous UTF-8 char boundary so the head slice is always valid \`str\`. Constants: - \`SPILLOVER_DIR_NAME = "tool_outputs"\` - \`SPILLOVER_THRESHOLD_BYTES = 100 KiB\` (mirrors \`MAX_MEMORY_SIZE\` for cross-feature consistency) - \`SPILLOVER_MAX_AGE = 7 days\` (mirrors workspace snapshot prune) Boot wiring in \`run_interactive\` calls \`prune_older_than\` unconditionally; non-fatal — any error is logged at WARN and the TUI starts anyway. ### Module-level \`#[allow(dead_code)]\` The boot-prune is the only live caller today. The storage helpers (\`write_spillover\`, \`maybe_spillover\`, \`spillover_path\`) are intentionally unused outside the module's own tests until #423 / #500 land — those follow-ups need the bytes to exist, and the contracts are pinned by tests so they can't drift before then. Module-level \`#![allow(dead_code)]\` documents the deferral with a comment pointing at the follow-up issues. ### Tests 8 unit tests in \`tools::truncate::tests\`: - \`sanitise_id\` keeps safe chars, drops dangerous ones (\`..\`, \`/etc/passwd\` traversal attempts). - \`write_spillover\` creates the directory and writes content. - \`write_spillover\` rejects empty / fully-invalid ids. - \`maybe_spillover\` returns \`None\` below threshold. - \`maybe_spillover\` writes + returns the head slice above threshold. - \`maybe_spillover\` walks back to a char boundary so the head string is never mid-codepoint (regression test using 4-byte whale emojis). - \`prune_older_than\` is a no-op when the root doesn't exist. - \`prune_older_than\` keeps fresh files and drops stale ones via a Unix \`utimensat\` backdating helper. Tests serialize through a static \`Mutex\` because they share process-global \`$HOME\`; the \`with_test_home\` helper documents the SAFETY contract for the env-var override. ### Verification cargo fmt --all -- --check ✓ cargo clippy --workspace --all-targets --all-features --locked -- -D warnings ✓ cargo test --workspace --all-features --locked ✓ 1873 + supporting (was 1865) Closes #422 (storage half). #423 and #500 remain open with the bytes now reachable on disk for them to cite. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 04:38:00 -05:00
parent 01fa11b96f
commit cf616e03bd
3 changed files with 407 additions and 0 deletions
@@ -3045,6 +3045,24 @@ async fn run_interactive(
        session_manager::prune_workspace_snapshots(&workspace, snapshots.max_age());
    }

+    // Prune stale tool-output spillover files (#422). Non-fatal: home
+    // missing or directory unreadable just means nothing got pruned;
+    // we never block startup. Runs unconditionally because the
+    // spillover store is created lazily on first write — there's no
+    // user-facing setting to gate.
+    match crate::tools::truncate::prune_older_than(crate::tools::truncate::SPILLOVER_MAX_AGE) {
+        Ok(0) => {}
+        Ok(n) => tracing::debug!(
+            target: "spillover",
+            "boot prune removed {n} spillover file(s)"
+        ),
+        Err(err) => tracing::warn!(
+            target: "spillover",
+            ?err,
+            "spillover prune skipped on boot"
+        ),
+    }
+
    tui::run_tui(
        config,
        tui::TuiOptions {
@@ -28,6 +28,7 @@ pub mod subagent;
 pub mod tasks;
 pub mod test_runner;
 pub mod todo;
+pub mod truncate;
 pub mod user_input;
 pub mod validate_data;
 pub mod web_run;
@@ -0,0 +1,388 @@
+//! Tool-output spillover writer (#422).
+//!
+//! When a tool produces output that's too large to land in the model's
+//! context budget, we want two things at once:
+//!
+//! 1. The transcript / tool-cell renders a bounded preview so the UI
+//!    stays scannable.
+//! 2. The full original output is preserved on disk so the model can
+//!    `read_file` it back if it later needs the elided tail, and so
+//!    the user can open it in `$EDITOR`.
+//!
+//! This module owns the disk side. Files land in
+//! `~/.deepseek/tool_outputs/<sanitised-id>.txt`. The id is the tool
+//! call id the engine assigns; we sanitise it conservatively (ASCII
+//! alphanumeric + `-`/`_`) so a hostile id can't escape the directory
+//! via `..` or absolute-path tricks.
+//!
+//! Boot prune drops files whose mtime is older than [`SPILLOVER_MAX_AGE`]
+//! (7 days). Prune failures are logged and never fatal — the user
+//! shouldn't see startup wedge because of a stale tool-output file.
+//!
+//! ## What's NOT here
+//!
+//! Wiring `maybe_spillover` into the actual tool-execution path is
+//! tracked by **#423** (UI annotation) and **#500** (preview pane);
+//! both want the spillover bytes to exist. This module ships the
+//! plumbing so those follow-ups land cleanly without re-litigating
+//! the storage decisions.
+//!
+//! Today the only live caller is the boot prune in `main.rs`. The
+//! storage helpers (`write_spillover`, `maybe_spillover`,
+//! `spillover_path`) are unused outside of this module's own tests
+//! and the `#[allow(dead_code)]` markers below mark them deferred —
+//! they get callers when #423 / #500 land.
+
+#![allow(dead_code)] // storage surface used by #423/#500 follow-ups; tests pin the contract
+
+use std::fs;
+use std::io;
+use std::path::PathBuf;
+use std::time::{Duration, SystemTime};
+
+// `Path` is only referenced from helpers gated to test builds.
+#[cfg(test)]
+use std::path::Path;
+
+/// Name of the spillover directory under `~/.deepseek/`.
+pub const SPILLOVER_DIR_NAME: &str = "tool_outputs";
+
+/// Default threshold above which a tool result is a candidate for
+/// spillover. Mirrors the `MAX_MEMORY_SIZE` ceiling we use elsewhere
+/// for "too large to inline" so the rules feel consistent. Wired
+/// callers can pass a different value if a tool family has different
+/// economics.
+pub const SPILLOVER_THRESHOLD_BYTES: usize = 100 * 1024; // 100 KiB
+
+/// Default boot-prune age. Older spillover files are deleted on
+/// startup to keep `~/.deepseek/tool_outputs/` from growing without
+/// bound. Mirrors the workspace-snapshot 7-day default.
+pub const SPILLOVER_MAX_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);
+
+/// Resolve `~/.deepseek/tool_outputs/`. Returns `None` if the home
+/// directory can't be determined (CI containers occasionally hit
+/// this). Callers should treat `None` as "spillover unavailable" and
+/// degrade gracefully rather than fail the tool call.
+#[must_use]
+pub fn spillover_root() -> Option<PathBuf> {
+    Some(dirs::home_dir()?.join(".deepseek").join(SPILLOVER_DIR_NAME))
+}
+
+/// Resolve the spillover-file path for a tool call id. Sanitises the
+/// id so that a hostile value can't escape the storage directory.
+/// Returns `None` for empty / fully-invalid ids; the caller should
+/// treat that as "spillover unavailable" and skip the write.
+#[must_use]
+pub fn spillover_path(id: &str) -> Option<PathBuf> {
+    let sanitised = sanitise_id(id)?;
+    Some(spillover_root()?.join(format!("{sanitised}.txt")))
+}
+
+/// Write `content` to the spillover file for `id`. Creates the
+/// parent directory if needed. Returns the resolved path on success.
+///
+/// Atomic via `write` + filesystem rename guarantees from the
+/// underlying OS — the file is created at a temp name first and
+/// then renamed into place. Failures bubble up as `io::Error` so the
+/// caller can decide whether to surface them.
+pub fn write_spillover(id: &str, content: &str) -> io::Result<PathBuf> {
+    let path = spillover_path(id).ok_or_else(|| {
+        io::Error::new(
+            io::ErrorKind::InvalidInput,
+            "could not resolve spillover path (empty/invalid id or missing home directory)",
+        )
+    })?;
+    if let Some(parent) = path.parent() {
+        fs::create_dir_all(parent)?;
+    }
+    crate::utils::write_atomic(&path, content.as_bytes())?;
+    Ok(path)
+}
+
+/// Drop spillover files older than `max_age`. Returns the number of
+/// files removed. Non-fatal: directory-missing returns 0; per-file
+/// errors are logged and skipped. Mirrors
+/// [`crate::session_manager::prune_workspace_snapshots`].
+pub fn prune_older_than(max_age: Duration) -> io::Result<usize> {
+    let Some(root) = spillover_root() else {
+        return Ok(0);
+    };
+    if !root.exists() {
+        return Ok(0);
+    }
+    let cutoff = SystemTime::now()
+        .checked_sub(max_age)
+        .unwrap_or(SystemTime::UNIX_EPOCH);
+    let mut pruned = 0usize;
+    for entry in fs::read_dir(&root)? {
+        let entry = match entry {
+            Ok(e) => e,
+            Err(err) => {
+                tracing::warn!(target: "spillover", ?err, "skipping unreadable dir entry");
+                continue;
+            }
+        };
+        let path = entry.path();
+        if !path.is_file() {
+            continue;
+        }
+        let modified = match entry.metadata().and_then(|m| m.modified()) {
+            Ok(t) => t,
+            Err(err) => {
+                tracing::warn!(target: "spillover", ?err, ?path, "skipping unreadable mtime");
+                continue;
+            }
+        };
+        if modified < cutoff {
+            if let Err(err) = fs::remove_file(&path) {
+                tracing::warn!(target: "spillover", ?err, ?path, "spillover prune skipped a file");
+                continue;
+            }
+            pruned += 1;
+        }
+    }
+    Ok(pruned)
+}
+
+/// Convenience for the common "too long? spill it." pattern. If
+/// `content` is at or below `threshold` bytes, returns `None` and the
+/// caller keeps the inline content. Above the threshold, writes the
+/// full content to the spillover file and returns
+/// `Some((head, path))` where `head` is the leading slice the caller
+/// can show inline. The trailing tail isn't returned — `path` is the
+/// canonical reference.
+///
+/// `head_bytes` controls how much inline content the caller wants to
+/// keep. Pass `threshold` for "preserve as much as fits inline" or
+/// a smaller value (e.g. `4 * 1024`) for "show a peek".
+pub fn maybe_spillover(
+    id: &str,
+    content: &str,
+    threshold: usize,
+    head_bytes: usize,
+) -> io::Result<Option<(String, PathBuf)>> {
+    if content.len() <= threshold {
+        return Ok(None);
+    }
+    let path = write_spillover(id, content)?;
+    // Don't slice mid-utf8: walk back to a char boundary if needed.
+    let cut = head_bytes.min(content.len());
+    let cut = (0..=cut)
+        .rev()
+        .find(|&i| content.is_char_boundary(i))
+        .unwrap_or(0);
+    Ok(Some((content[..cut].to_string(), path)))
+}
+
+/// Sanitise a tool call id for use as a filename. Keeps ASCII
+/// alphanumerics, `-`, and `_`; rejects `.` to keep `..` traversal
+/// out, rejects empty results. Returns `None` if the input contains
+/// no acceptable characters.
+fn sanitise_id(id: &str) -> Option<String> {
+    let cleaned: String = id
+        .chars()
+        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
+        .collect();
+    if cleaned.is_empty() {
+        None
+    } else {
+        Some(cleaned)
+    }
+}
+
+/// Override the spillover root for tests so they don't pollute the
+/// user's real `~/.deepseek/` directory. Wraps the body with a
+/// temporary `HOME` override that gets restored on drop.
+#[cfg(test)]
+fn with_test_home<F, R>(home: &Path, f: F) -> R
+where
+    F: FnOnce() -> R,
+{
+    // SAFETY: tests in this module serialize through `TEST_GUARD`
+    // because they share process-wide `$HOME`. Without the guard,
+    // parallel tests could observe each other's overrides.
+    let prior = std::env::var_os("HOME");
+    // SAFETY: caller holds the test guard.
+    unsafe {
+        std::env::set_var("HOME", home);
+    }
+    let out = f();
+    // SAFETY: caller holds the test guard.
+    unsafe {
+        if let Some(p) = prior {
+            std::env::set_var("HOME", p);
+        } else {
+            std::env::remove_var("HOME");
+        }
+    }
+    out
+}
+
+#[cfg(test)]
+mod tests {
+    use super::*;
+    use std::sync::Mutex;
+    use tempfile::tempdir;
+
+    /// Tests in this module serialize through this guard because
+    /// they mutate process-global `$HOME`. Without it, cargo's
+    /// parallel runner would observe interleaved overrides.
+    static TEST_GUARD: Mutex<()> = Mutex::new(());
+
+    fn setup() -> std::sync::MutexGuard<'static, ()> {
+        TEST_GUARD.lock().unwrap_or_else(|e| e.into_inner())
+    }
+
+    #[test]
+    fn sanitise_id_keeps_safe_chars_and_drops_dangerous() {
+        assert_eq!(super::sanitise_id("abc-123_x"), Some("abc-123_x".into()));
+        // `.` is dropped to keep `..` out of the path.
+        assert_eq!(super::sanitise_id("../etc"), Some("etc".into()));
+        assert_eq!(super::sanitise_id("/etc/passwd"), Some("etcpasswd".into()));
+        // Empty-after-sanitise → None.
+        assert!(super::sanitise_id("...").is_none());
+        assert!(super::sanitise_id("").is_none());
+    }
+
+    #[test]
+    fn write_spillover_creates_directory_and_writes_file() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            let path = write_spillover("call-abc", "hello world").expect("write");
+            assert!(path.exists(), "{path:?} missing");
+            let body = fs::read_to_string(&path).unwrap();
+            assert_eq!(body, "hello world");
+            // Directory landed under `<HOME>/.deepseek/tool_outputs/`.
+            assert!(path.to_string_lossy().contains(".deepseek/tool_outputs"));
+        });
+    }
+
+    #[test]
+    fn write_spillover_rejects_empty_id() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            let err = write_spillover("...", "x").unwrap_err();
+            assert_eq!(err.kind(), io::ErrorKind::InvalidInput);
+        });
+    }
+
+    #[test]
+    fn maybe_spillover_returns_none_below_threshold() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            let out = maybe_spillover("call-1", "tiny content", 100 * 1024, 4 * 1024).expect("ok");
+            assert!(out.is_none());
+        });
+    }
+
+    #[test]
+    fn maybe_spillover_writes_and_returns_head_above_threshold() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            // Content larger than the threshold.
+            let big = "A".repeat(2_000);
+            let (head, path) = maybe_spillover("call-2", &big, 1_000, 256)
+                .expect("ok")
+                .expect("should have spilled");
+            // Head is bounded.
+            assert_eq!(head.len(), 256);
+            // Full content on disk.
+            let body = fs::read_to_string(&path).unwrap();
+            assert_eq!(body.len(), 2_000);
+        });
+    }
+
+    #[test]
+    fn maybe_spillover_does_not_split_inside_a_codepoint() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            // 4 byte chars; ask for 3 bytes of head → walks back to
+            // the previous char boundary (0).
+            let s = "🐳🐳🐳🐳"; // 4 × 4-byte codepoints
+            assert_eq!(s.len(), 16);
+            let (head, _) = maybe_spillover("call-3", s, 1, 3)
+                .expect("ok")
+                .expect("spilled");
+            // 3 isn't a char boundary in this string; walk back → 0.
+            assert_eq!(head, "");
+            // Asking for 4 bytes lands on the first char boundary.
+            let (head, _) = maybe_spillover("call-3b", s, 1, 4)
+                .expect("ok")
+                .expect("spilled");
+            assert_eq!(head, "🐳");
+        });
+    }
+
+    #[test]
+    fn prune_older_than_handles_missing_root() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            // Nothing has ever written; root doesn't exist; that's fine.
+            let count = prune_older_than(SPILLOVER_MAX_AGE).expect("ok");
+            assert_eq!(count, 0);
+        });
+    }
+
+    #[test]
+    fn prune_older_than_keeps_fresh_files_drops_stale_ones() {
+        let _g = setup();
+        let tmp = tempdir().unwrap();
+        with_test_home(tmp.path(), || {
+            let fresh = write_spillover("fresh", "x").unwrap();
+            let stale = write_spillover("stale", "y").unwrap();
+
+            // Backdate `stale` to 30 days ago.
+            let thirty_days = SystemTime::now() - Duration::from_secs(30 * 24 * 60 * 60);
+            filetime_set_modified(&stale, thirty_days);
+
+            let pruned = prune_older_than(SPILLOVER_MAX_AGE).unwrap();
+            assert_eq!(pruned, 1);
+            assert!(fresh.exists());
+            assert!(!stale.exists());
+        });
+    }
+
+    /// Set the mtime on a file. The workspace doesn't pull the
+    /// `filetime` crate, so we reach for `utimensat` directly on
+    /// Unix. Windows is a no-op — the prune semantics are the same
+    /// and the per-cycle stress test lives on the Unix path.
+    #[cfg(unix)]
+    fn filetime_set_modified(path: &Path, when: SystemTime) {
+        let secs = when
+            .duration_since(SystemTime::UNIX_EPOCH)
+            .unwrap_or_default()
+            .as_secs() as libc::time_t;
+        let times = [
+            libc::timespec {
+                tv_sec: secs,
+                tv_nsec: 0,
+            },
+            libc::timespec {
+                tv_sec: secs,
+                tv_nsec: 0,
+            },
+        ];
+        let path_c = std::ffi::CString::new(path.as_os_str().as_encoded_bytes()).unwrap();
+        // SAFETY: path_c is a valid CString; times is a 2-element array
+        // matching utimensat's signature.
+        let rc = unsafe { libc::utimensat(libc::AT_FDCWD, path_c.as_ptr(), times.as_ptr(), 0) };
+        assert_eq!(
+            rc,
+            0,
+            "utimensat failed: {}",
+            std::io::Error::last_os_error()
+        );
+    }
+
+    #[cfg(not(unix))]
+    fn filetime_set_modified(_path: &Path, _when: SystemTime) {
+        // Not exercised in CI on Windows; prune semantics are the same
+        // and the per-cycle stress test lives on the Unix path.
+    }
+}