Tool-surface audit pass:

- FILE OPERATIONS rewritten so each line states the niche, not just the verb. read_file mentions PDF auto-extraction + `pages` slicing.
- New SEARCH section consolidates grep_files / file_search / web_search / fetch_url so the model sees them next to each other and picks the right one. fetch_url (#33) added; previously absent from the prompt.
- request_user_input pulled out of FILE OPERATIONS into its own USER section; it never belonged there.
- SUB-AGENTS list shrinks by 3: drops `spawn_agent` (use `agent_spawn`), `close_agent` (use `agent_cancel`), and the `agent_assign / assign_agent` dual-name. The underlying dispatchers still resolve those names, so existing sessions don't break; they just no longer pollute the model's tool list.

Adds `docs/TOOL_SURFACE.md` with the rationale, the v0.5.1 final surface, and the dropped aliases. Calls out that grep_files is pure-Rust (no rg/grep shell-out), so the "fall back to grep" AC from #35 is vacuously satisfied: the tool has no shell dependency to fall back from.

Closes #35.
Tool surface
Why these specific tools, in these groupings, and how each one is meant to be
chosen over the available shell equivalent. Companion to `crates/tui/src/prompts/agent.txt`.
Design stance
- Dedicated tools over `exec_shell` whenever the dedicated tool returns structured output. Bash escaping is error-prone and platform behavior varies (GNU vs BSD `grep`, `rg` is not always installed). Structured output also frees the model from re-parsing free-form text.
- `exec_shell` for everything else. Build, test, format, lint, ad-hoc commands, anything platform-specific. We don't try to wrap the long tail.
- Drop tools that don't beat their shell equivalent. Two-tool aliases for the same backing operation are a model trap: the LLM will alternate between them and the cache hit rate suffers.
Final surface (v0.5.1)
File operations
| Tool | Niche |
|---|---|
| `read_file` | Read a UTF-8 file. PDFs auto-extracted via `pdftotext` (poppler) when available; `pages: "1-5"` slices large docs. |
| `list_dir` | Structured, gitignore-aware listing. Preferred over `exec_shell("ls")`. |
| `write_file` | Create or overwrite a file. |
| `edit_file` | Search-and-replace inside a single file. Cheaper than a full rewrite. |
| `apply_patch` | Apply a unified diff. The right tool for multi-hunk edits. |
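To make the `pages: "1-5"` slicing concrete, here is a minimal sketch of how such a spec could be parsed and handed to `pdftotext`. The names `PageRange`, `parse_pages`, and `pdftotext_args` are hypothetical, and the accepted syntax (a single `N` or `N-M`, 1-based and inclusive) is an assumption; the actual `read_file` rules may differ. The `-f` / `-l` flags are poppler's first-page / last-page options, but whether `read_file` uses them is also an assumption.

```rust
/// Inclusive, 1-based page range (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub struct PageRange {
    pub first: u32,
    pub last: u32,
}

/// Parse "N" or "N-M" into an inclusive range.
/// Assumption: this is one plausible reading of the `pages` parameter,
/// not the documented behaviour of `read_file`.
pub fn parse_pages(spec: &str) -> Result<PageRange, String> {
    let spec = spec.trim();
    let (first_str, last_str) = match spec.split_once('-') {
        Some((a, b)) => (a.trim(), b.trim()),
        None => (spec, spec), // a single page is a range of length one
    };
    let first: u32 = first_str
        .parse()
        .map_err(|_| format!("invalid first page: {:?}", first_str))?;
    let last: u32 = last_str
        .parse()
        .map_err(|_| format!("invalid last page: {:?}", last_str))?;
    if first == 0 || last < first {
        return Err(format!("invalid page range: {:?}", spec));
    }
    Ok(PageRange { first, last })
}

/// Build a pdftotext argument list that extracts only the requested pages
/// and writes the text to stdout ("-").
pub fn pdftotext_args(range: PageRange, pdf_path: &str) -> Vec<String> {
    vec![
        "-f".to_string(),
        range.first.to_string(),
        "-l".to_string(),
        range.last.to_string(),
        pdf_path.to_string(),
        "-".to_string(),
    ]
}
```

Rejecting `0` and reversed ranges up front keeps a malformed request from turning into a confusing extractor error.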
Search
| Tool | Niche |
|---|---|
| `grep_files` | Regex search file contents within the workspace; structured matches + context lines. Pure-Rust (`regex` crate), no `rg`/`grep` shell-out. |
| `file_search` | Fuzzy-match filenames (not contents). Use when you know roughly the name. |
| `web_search` | DuckDuckGo (with Bing fallback); ranked snippets + `ref_id` for citation. |
| `fetch_url` | Direct HTTP GET on a known URL. Faster than `web_search` when the link is already known. HTML stripped to text by default. |
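As an illustration of what "structured matches + context lines" can mean in practice, the sketch below searches a string and returns typed hits instead of `grep`-style text. It is not the `grep_files` implementation: it uses plain substring matching rather than the `regex` crate so that it stays std-only, and the `Match` struct and its field names are hypothetical.

```rust
/// One search hit with its surrounding lines (hypothetical shape).
#[derive(Debug, PartialEq, Eq)]
pub struct Match {
    /// 1-based line number of the hit.
    pub line_number: usize,
    pub line: String,
    /// Up to `context` lines immediately before the hit.
    pub before: Vec<String>,
    /// Up to `context` lines immediately after the hit.
    pub after: Vec<String>,
}

/// Find every line containing `needle`, with `context` lines on each side.
/// Substring matching stands in for the regex search the real tool performs.
pub fn search_text(text: &str, needle: &str, context: usize) -> Vec<Match> {
    let lines: Vec<&str> = text.lines().collect();
    let mut hits = Vec::new();
    for (idx, line) in lines.iter().enumerate() {
        if !line.contains(needle) {
            continue;
        }
        // Clamp the context window to the bounds of the file.
        let start = idx.saturating_sub(context);
        let end = (idx + context + 1).min(lines.len());
        hits.push(Match {
            line_number: idx + 1,
            line: (*line).to_string(),
            before: lines[start..idx].iter().map(|s| (*s).to_string()).collect(),
            after: lines[idx + 1..end].iter().map(|s| (*s).to_string()).collect(),
        });
    }
    hits
}
```

Because each hit already carries its line number and context as separate fields, the caller never has to split on `:` or guess where one match ends and the next begins.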
Shell
| Tool | Niche |
|---|---|
| `exec_shell` | Run a shell command. Foreground or background (`background: true` returns a `task_id`). |
| `exec_shell_wait` | Poll a background task for incremental output. |
| `exec_shell_interact` | Send stdin to a running background task and read incremental output. |
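The "incremental output" semantics can be pictured as a buffer plus a cursor: each poll returns only what was written since the previous poll. The sketch below is a toy in-memory model under that assumption; `BackgroundTask`, `append_output`, and `poll` are hypothetical names, and no real process is spawned.

```rust
/// Toy model of a background task's captured output (hypothetical).
pub struct BackgroundTask {
    pub task_id: u64,
    /// Everything the task has written so far.
    output: String,
    /// Byte offset of the output already handed back to the caller.
    cursor: usize,
}

impl BackgroundTask {
    pub fn new(task_id: u64) -> Self {
        BackgroundTask {
            task_id,
            output: String::new(),
            cursor: 0,
        }
    }

    /// Stand-in for the child process writing to stdout.
    pub fn append_output(&mut self, chunk: &str) {
        self.output.push_str(chunk);
    }

    /// Return only the output produced since the previous poll.
    pub fn poll(&mut self) -> String {
        // `cursor` always sits at the end of a previously appended chunk,
        // so slicing here cannot split a UTF-8 character.
        let fresh = self.output[self.cursor..].to_string();
        self.cursor = self.output.len();
        fresh
    }
}
```

Returning deltas rather than the full buffer keeps repeated polls of a long-running build from re-sending the same log lines into the transcript.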
Git / diagnostics / testing
| Tool | Niche |
|---|---|
| `git_status` | Inspect repo status without running shell. |
| `git_diff` | Inspect working-tree or staged diffs. |
| `diagnostics` | Workspace, git, sandbox, and toolchain info in one call. |
| `run_tests` | `cargo test` with optional args. |
Task management
| Tool | Niche |
|---|---|
| `todo_write` | Granular per-item progress. |
| `update_plan` | Structured checklist for complex multi-step work. |
| `note` | One-off important fact for later. |
Sub-agents
agent_spawn, agent_swarm, spawn_agents_on_csv, plus the supporting
tools (agent_result / swarm_result / wait / send_input /
agent_assign / agent_cancel / resume_agent / agent_list /
report_agent_job_result / swarm_status). See agent.txt for the
delegation protocol.
Recently consolidated (v0.5.1)
Removed from the prompt as duplicates of equivalent tools (the underlying dispatchers still resolve them, so existing sessions don't break — they just no longer pollute the model's tool list):
- `spawn_agent` → use `agent_spawn`.
- `close_agent` → use `agent_cancel`.
- `assign_agent` → use `agent_assign`.
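The split between "still resolves" and "no longer advertised" can be sketched as two separate lookups: the dispatcher accepts canonical names and legacy aliases, while the prompt list is built from canonical names only. This is a minimal illustration, not the dispatcher code; `ADVERTISED`, `LEGACY_ALIASES`, `resolve_tool_name`, and `prompt_tool_list` are hypothetical names, and only the three affected sub-agent tools are included.

```rust
/// Canonical names shown to the model (subset, for illustration).
pub const ADVERTISED: &[&str] = &["agent_spawn", "agent_cancel", "agent_assign"];

/// Legacy alias -> canonical name. Accepted by the dispatcher, hidden from the prompt.
pub const LEGACY_ALIASES: &[(&str, &str)] = &[
    ("spawn_agent", "agent_spawn"),
    ("close_agent", "agent_cancel"),
    ("assign_agent", "agent_assign"),
];

/// Map an incoming tool name to its canonical form.
/// Returns `None` for names that are neither canonical nor a known alias.
pub fn resolve_tool_name(name: &str) -> Option<&'static str> {
    if let Some(canonical) = ADVERTISED.iter().copied().find(|c| *c == name) {
        return Some(canonical);
    }
    LEGACY_ALIASES
        .iter()
        .find(|(alias, _)| *alias == name)
        .map(|(_, canonical)| *canonical)
}

/// Names rendered into the prompt: canonical only, so each operation
/// appears exactly once in the model's tool list.
pub fn prompt_tool_list() -> Vec<&'static str> {
    ADVERTISED.to_vec()
}
```

A session that still emits `spawn_agent` resolves to `agent_spawn`, while a fresh prompt never mentions the old name.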
Why we don't ship a single bash tool
Single-bash agents (Claude Code's design) are powerful but hand the model
all the foot-guns of shell scripting: quoting, platform divergence,
side-effects from misread cwd, cd not persisting between calls, etc. Our
file tools are also significantly cheaper to render in the transcript
(structured JSON-shaped output collapses better than ls -la walls of text).
The model can always fall back to exec_shell when something is missing.
The dedicated tools just take the common 80% off the shell escape-hatch.