feat(tools): spillover-file writer + 7-day boot prune (#422)

#500 (tool-output spillover preview pane) explicitly depends on
#422 (the storage writer) and #423 (the UI annotation). This ships
the storage half so the other two unblock cleanly.

### What's wired

New module `crates/tui/src/tools/truncate.rs`:

- `spillover_root()`: resolves `~/.deepseek/tool_outputs/`.
- `spillover_path(id)`: sanitises the tool call id (keeps ASCII
  alphanumerics plus `-`/`_`; drops `.` so `..` can't escape) and
  returns `<root>/<id>.txt`.
- `write_spillover(id, content)`: atomic write via the existing
  `utils::write_atomic` helper. Creates the parent directory on demand.
- `prune_older_than(max_age)`: drops files older than `max_age`
  by mtime. Returns the count pruned. Per-file errors are logged
  and skipped, never propagated.
- `maybe_spillover(id, content, threshold, head_bytes)`:
  convenience for the "too long? spill it." pattern. Walks back to
  the previous UTF-8 char boundary so the head slice is always
  a valid `str`.
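
The char-boundary walk-back that `maybe_spillover` performs can be sketched on its own. This is a minimal illustration, not the module's code: the name `head_at_char_boundary` is hypothetical, and it returns a borrowed slice rather than the owned `String` the real helper hands back.

```rust
/// Hypothetical helper (not part of the module): return the longest prefix
/// of `content` that is at most `max_bytes` long and ends on a UTF-8
/// character boundary.
fn head_at_char_boundary(content: &str, max_bytes: usize) -> &str {
    let mut cut = max_bytes.min(content.len());
    // Index 0 is always a char boundary, so this loop terminates.
    while !content.is_char_boundary(cut) {
        cut -= 1;
    }
    &content[..cut]
}
```

With a string of 4-byte emojis, a 3-byte request yields an empty head and a 4-byte request yields exactly one emoji, which matches the regression test described under Tests.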

Constants:
- `SPILLOVER_DIR_NAME = "tool_outputs"`
- `SPILLOVER_THRESHOLD_BYTES = 100 KiB` (mirrors
  `MAX_MEMORY_SIZE` for cross-feature consistency)
- `SPILLOVER_MAX_AGE = 7 days` (mirrors workspace snapshot prune)

Boot wiring in `run_interactive` calls `prune_older_than`
unconditionally. The call is non-fatal: any error is logged at WARN
and the TUI starts anyway.
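
For readers who want to try the mtime-based prune without touching `$HOME`, here is a sketch that takes the directory as a parameter. The name `prune_dir` and the explicit `root` argument are assumptions for illustration; the real `prune_older_than` resolves the root itself and logs skipped entries through `tracing`, which this sketch omits.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::{Duration, SystemTime};

/// Hypothetical variant of the prune that takes the directory explicitly.
/// Removes regular files whose mtime is older than `max_age` and returns
/// how many were removed. Per-file failures are skipped, not propagated.
fn prune_dir(root: &Path, max_age: Duration) -> io::Result<usize> {
    if !root.exists() {
        return Ok(0); // nothing has been written yet
    }
    // Fall back to the epoch if `now - max_age` would underflow.
    let cutoff = SystemTime::now()
        .checked_sub(max_age)
        .unwrap_or(SystemTime::UNIX_EPOCH);
    let mut pruned = 0usize;
    for entry in fs::read_dir(root)? {
        let Ok(entry) = entry else { continue };
        let path = entry.path();
        if !path.is_file() {
            continue;
        }
        let Ok(modified) = entry.metadata().and_then(|m| m.modified()) else {
            continue;
        };
        if modified < cutoff && fs::remove_file(&path).is_ok() {
            pruned += 1;
        }
    }
    Ok(pruned)
}
```

Only a failing `read_dir` on the root surfaces as an `Err`; that is the case the boot wiring logs at WARN before carrying on.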

### Module-level `#![allow(dead_code)]`

The boot prune is the only live caller today. The storage helpers
(`write_spillover`, `maybe_spillover`, `spillover_path`) are
intentionally unused outside the module's own tests until #423 / #500
land. Those follow-ups need the bytes to exist, and the contracts
are pinned by tests so they can't drift before then. The module-level
`#![allow(dead_code)]` documents the deferral with a comment
pointing at the follow-up issues.

### Tests

8 unit tests in `tools::truncate::tests`:
- `sanitise_id` keeps safe chars, drops dangerous ones (`..`,
  `/etc/passwd` traversal attempts).
- `write_spillover` creates the directory and writes content.
- `write_spillover` rejects empty / fully-invalid ids.
- `maybe_spillover` returns `None` below threshold.
- `maybe_spillover` writes + returns the head slice above
  threshold.
- `maybe_spillover` walks back to a char boundary so the head
  string is never mid-codepoint (regression test using 4-byte
  whale emojis).
- `prune_older_than` is a no-op when the root doesn't exist.
- `prune_older_than` keeps fresh files and drops stale ones via a
  Unix `utimensat` backdating helper.
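
The sanitisation cases above can be reproduced in isolation. The sketch below mirrors the filter in the diff; `spillover_path_in` and its explicit `root` parameter are hypothetical, added only so the containment property can be checked without resolving a home directory.

```rust
use std::path::{Path, PathBuf};

/// Keep ASCII alphanumerics, `-`, and `_`; drop everything else
/// (including `.` and `/`). Returns `None` if nothing survives.
fn sanitise_id(id: &str) -> Option<String> {
    let cleaned: String = id
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
        .collect();
    if cleaned.is_empty() {
        None
    } else {
        Some(cleaned)
    }
}

/// Hypothetical helper: build `<root>/<sanitised-id>.txt`. Because the
/// sanitised id contains no separators or dots, the result is always a
/// direct child of `root`.
fn spillover_path_in(root: &Path, id: &str) -> Option<PathBuf> {
    Some(root.join(format!("{}.txt", sanitise_id(id)?)))
}
```

A hostile id such as `../../etc/passwd` collapses to `etcpasswd`, so the resulting path's parent is still `root`.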

Tests serialize through a static `Mutex` because they share
process-global `$HOME`; the `with_test_home` helper documents
the SAFETY contract for the env-var override.
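
One possible alternative shape for the env-var override is an RAII guard that holds the lock and restores the prior value in `Drop`, so the variable is put back even if a test body panics. This is a sketch under assumptions, not what the commit ships: `EnvOverride` and `ENV_GUARD` are hypothetical names, and `with_test_home` in the diff uses a closure rather than a guard.

```rust
use std::ffi::{OsStr, OsString};
use std::sync::{Mutex, MutexGuard};

/// Hypothetical process-wide lock; every test that touches the
/// environment must go through it.
static ENV_GUARD: Mutex<()> = Mutex::new(());

/// Hypothetical guard: sets `key` on construction, restores the prior
/// value on drop, and holds the lock for its whole lifetime.
struct EnvOverride {
    key: &'static str,
    prior: Option<OsString>,
    _lock: MutexGuard<'static, ()>,
}

impl EnvOverride {
    #[allow(unused_unsafe)] // `set_var` is only `unsafe` from edition 2024
    fn set(key: &'static str, value: &OsStr) -> Self {
        // A poisoned lock only means an earlier test panicked; keep going.
        let lock = ENV_GUARD.lock().unwrap_or_else(|e| e.into_inner());
        let prior = std::env::var_os(key);
        // SAFETY: the lock serialises every override made via this guard.
        unsafe { std::env::set_var(key, value) };
        EnvOverride { key, prior, _lock: lock }
    }
}

impl Drop for EnvOverride {
    #[allow(unused_unsafe)]
    fn drop(&mut self) {
        // SAFETY: `_lock` is still held here; fields drop after this body.
        unsafe {
            match self.prior.take() {
                Some(p) => std::env::set_var(self.key, p),
                None => std::env::remove_var(self.key),
            }
        }
    }
}
```

The guard only protects against other code that takes the same lock; anything reading the environment concurrently without it is still racing.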

### Verification

cargo fmt --all -- --check                                          ✓
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings   ✓
cargo test --workspace --all-features --locked                      ✓ 1873 + supporting (was 1865)

Closes #422 (storage half). #423 and #500 remain open with the
bytes now reachable on disk for them to cite.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Hunter Bown
2026-05-03 04:38:00 -05:00
parent 01fa11b96f
commit cf616e03bd
3 changed files with 407 additions and 0 deletions
@@ -3045,6 +3045,24 @@ async fn run_interactive(
session_manager::prune_workspace_snapshots(&workspace, snapshots.max_age());
}
// Prune stale tool-output spillover files (#422). Non-fatal: home
// missing or directory unreadable just means nothing got pruned;
// we never block startup. Runs unconditionally because the
// spillover store is created lazily on first write — there's no
// user-facing setting to gate.
match crate::tools::truncate::prune_older_than(crate::tools::truncate::SPILLOVER_MAX_AGE) {
Ok(0) => {}
Ok(n) => tracing::debug!(
target: "spillover",
"boot prune removed {n} spillover file(s)"
),
Err(err) => tracing::warn!(
target: "spillover",
?err,
"spillover prune skipped on boot"
),
}
tui::run_tui(
config,
tui::TuiOptions {
@@ -28,6 +28,7 @@ pub mod subagent;
pub mod tasks;
pub mod test_runner;
pub mod todo;
pub mod truncate;
pub mod user_input;
pub mod validate_data;
pub mod web_run;
@@ -0,0 +1,388 @@
//! Tool-output spillover writer (#422).
//!
//! When a tool produces output that's too large to land in the model's
//! context budget, we want two things at once:
//!
//! 1. The transcript / tool-cell renders a bounded preview so the UI
//! stays scannable.
//! 2. The full original output is preserved on disk so the model can
//! `read_file` it back if it later needs the elided tail, and so
//! the user can open it in `$EDITOR`.
//!
//! This module owns the disk side. Files land in
//! `~/.deepseek/tool_outputs/<sanitised-id>.txt`. The id is the tool
//! call id the engine assigns; we sanitise it conservatively (ASCII
//! alphanumeric + `-`/`_`) so a hostile id can't escape the directory
//! via `..` or absolute-path tricks.
//!
//! Boot prune drops files whose mtime is older than [`SPILLOVER_MAX_AGE`]
//! (7 days). Prune failures are logged and never fatal — the user
//! shouldn't see startup wedge because of a stale tool-output file.
//!
//! ## What's NOT here
//!
//! Wiring `maybe_spillover` into the actual tool-execution path is
//! tracked by **#423** (UI annotation) and **#500** (preview pane);
//! both want the spillover bytes to exist. This module ships the
//! plumbing so those follow-ups land cleanly without re-litigating
//! the storage decisions.
//!
//! Today the only live caller is the boot prune in `main.rs`. The
//! storage helpers (`write_spillover`, `maybe_spillover`,
//! `spillover_path`) are unused outside of this module's own tests
//! and the module-level `#![allow(dead_code)]` below marks them as
//! deferred; they get callers when #423 / #500 land.
#![allow(dead_code)] // storage surface used by #423/#500 follow-ups; tests pin the contract
use std::fs;
use std::io;
use std::path::PathBuf;
use std::time::{Duration, SystemTime};
// `Path` is only referenced from helpers gated to test builds.
#[cfg(test)]
use std::path::Path;
/// Name of the spillover directory under `~/.deepseek/`.
pub const SPILLOVER_DIR_NAME: &str = "tool_outputs";
/// Default threshold above which a tool result is a candidate for
/// spillover. Mirrors the `MAX_MEMORY_SIZE` ceiling we use elsewhere
/// for "too large to inline" so the rules feel consistent. Wired
/// callers can pass a different value if a tool family has different
/// economics.
pub const SPILLOVER_THRESHOLD_BYTES: usize = 100 * 1024; // 100 KiB
/// Default boot-prune age. Older spillover files are deleted on
/// startup to keep `~/.deepseek/tool_outputs/` from growing without
/// bound. Mirrors the workspace-snapshot 7-day default.
pub const SPILLOVER_MAX_AGE: Duration = Duration::from_secs(7 * 24 * 60 * 60);
/// Resolve `~/.deepseek/tool_outputs/`. Returns `None` if the home
/// directory can't be determined (CI containers occasionally hit
/// this). Callers should treat `None` as "spillover unavailable" and
/// degrade gracefully rather than fail the tool call.
#[must_use]
pub fn spillover_root() -> Option<PathBuf> {
Some(dirs::home_dir()?.join(".deepseek").join(SPILLOVER_DIR_NAME))
}
/// Resolve the spillover-file path for a tool call id. Sanitises the
/// id so that a hostile value can't escape the storage directory.
/// Returns `None` for empty / fully-invalid ids; the caller should
/// treat that as "spillover unavailable" and skip the write.
#[must_use]
pub fn spillover_path(id: &str) -> Option<PathBuf> {
let sanitised = sanitise_id(id)?;
Some(spillover_root()?.join(format!("{sanitised}.txt")))
}
/// Write `content` to the spillover file for `id`. Creates the
/// parent directory if needed. Returns the resolved path on success.
///
/// Atomic via `crate::utils::write_atomic`, which relies on the OS's
/// rename guarantees: the file is created at a temp name first and
/// then renamed into place. Failures bubble up as `io::Error` so the
/// caller can decide whether to surface them.
pub fn write_spillover(id: &str, content: &str) -> io::Result<PathBuf> {
let path = spillover_path(id).ok_or_else(|| {
io::Error::new(
io::ErrorKind::InvalidInput,
"could not resolve spillover path (empty/invalid id or missing home directory)",
)
})?;
if let Some(parent) = path.parent() {
fs::create_dir_all(parent)?;
}
crate::utils::write_atomic(&path, content.as_bytes())?;
Ok(path)
}
/// Drop spillover files older than `max_age`. Returns the number of
/// files removed. Non-fatal: directory-missing returns 0; per-file
/// errors are logged and skipped. Mirrors
/// [`crate::session_manager::prune_workspace_snapshots`].
pub fn prune_older_than(max_age: Duration) -> io::Result<usize> {
let Some(root) = spillover_root() else {
return Ok(0);
};
if !root.exists() {
return Ok(0);
}
let cutoff = SystemTime::now()
.checked_sub(max_age)
.unwrap_or(SystemTime::UNIX_EPOCH);
let mut pruned = 0usize;
for entry in fs::read_dir(&root)? {
let entry = match entry {
Ok(e) => e,
Err(err) => {
tracing::warn!(target: "spillover", ?err, "skipping unreadable dir entry");
continue;
}
};
let path = entry.path();
if !path.is_file() {
continue;
}
let modified = match entry.metadata().and_then(|m| m.modified()) {
Ok(t) => t,
Err(err) => {
tracing::warn!(target: "spillover", ?err, ?path, "skipping unreadable mtime");
continue;
}
};
if modified < cutoff {
if let Err(err) = fs::remove_file(&path) {
tracing::warn!(target: "spillover", ?err, ?path, "spillover prune skipped a file");
continue;
}
pruned += 1;
}
}
Ok(pruned)
}
/// Convenience for the common "too long? spill it." pattern. If
/// `content` is at or below `threshold` bytes, returns `None` and the
/// caller keeps the inline content. Above the threshold, writes the
/// full content to the spillover file and returns
/// `Some((head, path))` where `head` is the leading slice the caller
/// can show inline. The trailing tail isn't returned — `path` is the
/// canonical reference.
///
/// `head_bytes` controls how much inline content the caller wants to
/// keep. Pass `threshold` for "preserve as much as fits inline" or
/// a smaller value (e.g. `4 * 1024`) for "show a peek".
pub fn maybe_spillover(
id: &str,
content: &str,
threshold: usize,
head_bytes: usize,
) -> io::Result<Option<(String, PathBuf)>> {
if content.len() <= threshold {
return Ok(None);
}
let path = write_spillover(id, content)?;
// Don't slice mid-utf8: walk back to a char boundary if needed.
let cut = head_bytes.min(content.len());
let cut = (0..=cut)
.rev()
.find(|&i| content.is_char_boundary(i))
.unwrap_or(0);
Ok(Some((content[..cut].to_string(), path)))
}
/// Sanitise a tool call id for use as a filename. Keeps ASCII
/// alphanumerics, `-`, and `_`; rejects `.` to keep `..` traversal
/// out, rejects empty results. Returns `None` if the input contains
/// no acceptable characters.
fn sanitise_id(id: &str) -> Option<String> {
let cleaned: String = id
.chars()
.filter(|c| c.is_ascii_alphanumeric() || *c == '-' || *c == '_')
.collect();
if cleaned.is_empty() {
None
} else {
Some(cleaned)
}
}
/// Override the spillover root for tests so they don't pollute the
/// user's real `~/.deepseek/` directory. Runs `f` under a temporary
/// `HOME` override and restores the prior value once `f` returns
/// (not if `f` panics; there is no drop guard).
#[cfg(test)]
fn with_test_home<F, R>(home: &Path, f: F) -> R
where
F: FnOnce() -> R,
{
// SAFETY: tests in this module serialize through `TEST_GUARD`
// because they share process-wide `$HOME`. Without the guard,
// parallel tests could observe each other's overrides.
let prior = std::env::var_os("HOME");
// SAFETY: caller holds the test guard.
unsafe {
std::env::set_var("HOME", home);
}
let out = f();
// SAFETY: caller holds the test guard.
unsafe {
if let Some(p) = prior {
std::env::set_var("HOME", p);
} else {
std::env::remove_var("HOME");
}
}
out
}
#[cfg(test)]
mod tests {
use super::*;
use std::sync::Mutex;
use tempfile::tempdir;
/// Tests in this module serialize through this guard because
/// they mutate process-global `$HOME`. Without it, cargo's
/// parallel runner would observe interleaved overrides.
static TEST_GUARD: Mutex<()> = Mutex::new(());
fn setup() -> std::sync::MutexGuard<'static, ()> {
TEST_GUARD.lock().unwrap_or_else(|e| e.into_inner())
}
#[test]
fn sanitise_id_keeps_safe_chars_and_drops_dangerous() {
assert_eq!(super::sanitise_id("abc-123_x"), Some("abc-123_x".into()));
// `.` is dropped to keep `..` out of the path.
assert_eq!(super::sanitise_id("../etc"), Some("etc".into()));
assert_eq!(super::sanitise_id("/etc/passwd"), Some("etcpasswd".into()));
// Empty-after-sanitise → None.
assert!(super::sanitise_id("...").is_none());
assert!(super::sanitise_id("").is_none());
}
#[test]
fn write_spillover_creates_directory_and_writes_file() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
let path = write_spillover("call-abc", "hello world").expect("write");
assert!(path.exists(), "{path:?} missing");
let body = fs::read_to_string(&path).unwrap();
assert_eq!(body, "hello world");
// Directory landed under `<HOME>/.deepseek/tool_outputs/`.
assert!(path.to_string_lossy().contains(".deepseek/tool_outputs"));
});
}
#[test]
fn write_spillover_rejects_empty_id() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
let err = write_spillover("...", "x").unwrap_err();
assert_eq!(err.kind(), io::ErrorKind::InvalidInput);
});
}
#[test]
fn maybe_spillover_returns_none_below_threshold() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
let out = maybe_spillover("call-1", "tiny content", 100 * 1024, 4 * 1024).expect("ok");
assert!(out.is_none());
});
}
#[test]
fn maybe_spillover_writes_and_returns_head_above_threshold() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
// Content larger than the threshold.
let big = "A".repeat(2_000);
let (head, path) = maybe_spillover("call-2", &big, 1_000, 256)
.expect("ok")
.expect("should have spilled");
// Head is bounded.
assert_eq!(head.len(), 256);
// Full content on disk.
let body = fs::read_to_string(&path).unwrap();
assert_eq!(body.len(), 2_000);
});
}
#[test]
fn maybe_spillover_does_not_split_inside_a_codepoint() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
// 4 byte chars; ask for 3 bytes of head → walks back to
// the previous char boundary (0).
let s = "🐳🐳🐳🐳"; // 4 × 4-byte codepoints
assert_eq!(s.len(), 16);
let (head, _) = maybe_spillover("call-3", s, 1, 3)
.expect("ok")
.expect("spilled");
// 3 isn't a char boundary in this string; walk back → 0.
assert_eq!(head, "");
// Asking for 4 bytes lands on the first char boundary.
let (head, _) = maybe_spillover("call-3b", s, 1, 4)
.expect("ok")
.expect("spilled");
assert_eq!(head, "🐳");
});
}
#[test]
fn prune_older_than_handles_missing_root() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
// Nothing has ever written; root doesn't exist; that's fine.
let count = prune_older_than(SPILLOVER_MAX_AGE).expect("ok");
assert_eq!(count, 0);
});
}
#[cfg(unix)] // the backdating helper is a no-op elsewhere, so `stale` would never age
#[test]
fn prune_older_than_keeps_fresh_files_drops_stale_ones() {
let _g = setup();
let tmp = tempdir().unwrap();
with_test_home(tmp.path(), || {
let fresh = write_spillover("fresh", "x").unwrap();
let stale = write_spillover("stale", "y").unwrap();
// Backdate `stale` to 30 days ago.
let thirty_days = SystemTime::now() - Duration::from_secs(30 * 24 * 60 * 60);
filetime_set_modified(&stale, thirty_days);
let pruned = prune_older_than(SPILLOVER_MAX_AGE).unwrap();
assert_eq!(pruned, 1);
assert!(fresh.exists());
assert!(!stale.exists());
});
}
/// Set the mtime on a file. The workspace doesn't pull the
/// `filetime` crate, so we reach for `utimensat` directly on
/// Unix. Windows is a no-op — the prune semantics are the same
/// and the per-cycle stress test lives on the Unix path.
#[cfg(unix)]
fn filetime_set_modified(path: &Path, when: SystemTime) {
let secs = when
.duration_since(SystemTime::UNIX_EPOCH)
.unwrap_or_default()
.as_secs() as libc::time_t;
let times = [
libc::timespec {
tv_sec: secs,
tv_nsec: 0,
},
libc::timespec {
tv_sec: secs,
tv_nsec: 0,
},
];
let path_c = std::ffi::CString::new(path.as_os_str().as_encoded_bytes()).unwrap();
// SAFETY: path_c is a valid CString; times is a 2-element array
// matching utimensat's signature.
let rc = unsafe { libc::utimensat(libc::AT_FDCWD, path_c.as_ptr(), times.as_ptr(), 0) };
assert_eq!(
rc,
0,
"utimensat failed: {}",
std::io::Error::last_os_error()
);
}
#[cfg(not(unix))]
fn filetime_set_modified(_path: &Path, _when: SystemTime) {
// Not exercised in CI on Windows; prune semantics are the same
// and the per-cycle stress test lives on the Unix path.
}
}