Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover (no catalog code in this cycle). Grounded in a verified inventory of the actual tool registry. - docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states (active / deferred / hidden-compatibility / deprecated / removed) modeled as const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec field), so registration stays untouched and old transcripts always replay. Includes the deprecation manifest (exec_wait/exec_interact/tts → hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are already non-visible dead code → cleanup + guardrail), per-mode/per-provider active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated. - docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 + symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; an explainable tool contract returning reasons[]; and a real CodeWhale query eval set. Complements grep_files/file_search, never replaces. - docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation syntax (the token IS the skill name), namespaced resolution, ambiguity- suggests-not-guesses, visible activation line, and a smallest-viable slice. - docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered context-memory stack, tool repair/autoload, the evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; also records the deferred-not-done diet items. Targets codex/v0.8.53.
12 KiB
Skill Invocation Design — the $<skill-name> inline syntax
Status: DESIGN ONLY (v0.8.53 cycle). No catalog/parser code ships in this cycle; the implementation target is 0.9.0. This document describes what will be built and the contracts it must honor against today's code.
Related design docs: TOOL_LIFECYCLE.md (tool lifecycle states + per-skill tool
restriction), command-surface taxonomy notes for /memory, /context,
/rules, /workflow (/whaleflow), /overlay. Open PRs on codex/v0.8.53:
#2684 (subagent role vocab / lifecycle signals / eval ergonomics) and #2685
(git history active + RLM/field errors). Nothing here contradicts those.
1. Problem
Skill activation has no single, model-legible entry point, and the candidate surfaces all compete with each other:
- A
/skillslash command, aload_skill-style tool, plugin/namespace naming (superpowers:systematic-debugging,github:gh-fix-ci), and the long-running workflow commands (/workflow//whaleflow) all could be "the way you start a skill." None of them is canonical. - Slash commands are already overloaded.
/memory,/context,/rules,/config,/provider,/workflow,/overlayeach map to one subsystem; jamming skill invocation into/-space forces a weaker model to disambiguate "is this a command or a skill?" on every keystroke. - Weaker / smaller models (the cheaper providers CodeWhale targets) do not reliably pick the right mechanism. They will free-text "let me use systematic debugging" instead of actually loading the skill body, so the guidance never enters the context window.
- Today there is no parser that activates an inline skill mention on submit.
slash_menu.rs:86(partial_inline_skill_mention_at_cursor) recognizes an inline/<skill>token under the cursor for popup purposes only; the submit path inui.rs:4721(build_queued_message) does not resolve or activate any inline mention. There is also no activation-mode concept (always-on / glob / model-decision / manual) and skills cannot restrict tools yet.
We need one prefix that means exactly "invoke this skill," is visually distinct from commands, and is cheap for a small model to emit correctly.
2. Proposal
Adopt $ as the skill-invocation prefix, where the token is the skill
name — not a literal command called $skill.
$systematic-debugging figure out why MiMo auth fails
$test-driven-development add coverage before fixing
$github:gh-fix-ci inspect the failing checks
$aleph search the planning doc
The leading $ is the marker; everything from $ up to the next whitespace is
the skill id. The rest of the line is the user's request, passed through to
the model with the skill body already loaded as active guidance.
This is deliberately a reference / macro sigil, like a shell variable
expansion or an @mention: $skill-id resolves to "the contents and tool
policy of that skill," then the surrounding prose is the task.
$ works in three places (see §4): the user composer, the command-palette
input, and model-facing planning text — so the model itself can write
$systematic-debugging in its plan and have it resolve.
3. Resolution rules
Given a token $<id> (id captured up to the next whitespace):
-
Exact name first. Look the id up directly:
discover_in_workspace(workspace).get(id)—skills/mod.rs:553builds the registry;SkillRegistry::get(skills/mod.rs:421) matches ons.name == idexactly. Skill names come from frontmattername:(or the first# Headingfallback) parsed atskills/mod.rs:382-417. An exact hit wins unconditionally. -
Namespaced
$ns:skill. If the id contains a:, treat the part before the colon as a source/plugin namespace and the part after as the skill name:$github:gh-fix-ci,$superpowers:systematic-debugging. Namespaced ids are the disambiguation handle a user is told to type when a bare id is ambiguous. (Glob/wildcard namespacing —$github:*— is explicitly deferred, see §6.) -
Fuzzy match suggests, never silently chooses. If there is no exact (or namespaced-exact) hit, run a case-insensitive substring / prefix match over
SkillRegistry::list()(skills/mod.rs:426). If exactly one skill matches, surface it as a suggestion ("did you mean$systematic-debugging?") but do not auto-activate it. If more than one matches, list them and require the user/model to re-issue with a disambiguated id (§7). Ambiguity never resolves to a silent pick. -
Respect enable-state. A resolved skill is only activated if
SkillStateStore::is_enabled(id)is true (skill_state.rs:73:!self.disabled.contains(skill_name)). A disabled skill that resolves by name produces a clear "skill is disabled; enable it with/skill enable <id>" message rather than silently activating or silently doing nothing.
Resolution order is therefore: exact → namespaced-exact → enabled-check → fuzzy-suggest (never auto-pick).
4. Behavior
When a $<id> mention resolves and is enabled:
- Visible activation line. The transcript shows
Using skill: <name>so the user can see which skill body entered context. (Mirrors the existing skill UX vocabulary; one line per activated skill.) - Body loaded as active guidance. The skill's
body(skills/mod.rsSkill.body) is injected into the turn as authoritative guidance, the same content a/skill-style activation would load. The user's trailing prose is the task the guidance applies to. - Tool-surface narrowing (when declared). If the skill declares a set of
allowed tools, the active tool surface narrows to that set for the duration of
the skill's influence. Per-skill tool restriction is net-new — skills
cannot restrict tools today; the mechanism, and how narrowing interacts with
the catalog-head byte-stability invariant (
tool_catalog.rs:169-196), is specified inTOOL_LIFECYCLE.md. Until that lands, a declared tool list is parsed and shown but not enforced. - Multiple
$mentionscompose explicitly, or prompt. Until formal composition rules exist, two or more$mentionsin one message either compose only when the rule is unambiguous (e.g. one guidance skill + one tool-scoping skill) or return a "choose one" prompt listing the mentioned skills. We never silently activate multiple complex skills at once (see §7 and Non-goals). - Three input surfaces. Resolution runs for: (a) user prompts in the
composer, (b) command-palette input, and (c) model-facing planning text, so a
model that writes
$test-driven-developmentin its plan triggers the same activation path a human would. - Slash commands remain supported.
/skill ...and the rest of the slash surface keep working unchanged.$is the preferred path for models because it is one token and unambiguous, but it is additive, not a replacement (§7 Non-goals).
5. Why $
- Visually distinct from
/commands. A glance separates "run a subsystem command" (/memory,/context,/workflow) from "load a skill" ($aleph). Weaker models stop confusing the two surfaces. - Reads like a reference / macro.
$namealready means "expand this named thing" to anyone who has touched a shell or a templating language. Skill invocation is an expansion:$skill-id→ that skill's guidance + tool policy. - Avoids overloading the slash namespace.
/workflow,/memory,/config,/provider,/rules,/overlay,/contexteach already own one meaning in the command-surface taxonomy. Skills get their own sigil instead of a crowded/skill <name>subcommand competing with all of them. - Easy to type and remember. Single leading character, then the literal
skill name. Nothing to memorize beyond the skill ids the user already sees in
/skill list.
6. Implementation plan (smallest viable 0.8.53-ready slice → 0.9.0)
The 0.8.53 cycle is docs only. The plan below is the build order once code is unblocked; the first slice is intentionally the minimum that proves the path.
Slice 1 — token scanner at submit (the minimum viable feature).
- Add a
$<skill-id>token scanner invoked on submit, beforebuild_queued_messageruns (ui.rs:4721). The scanner finds leading-$tokens, captures the id up to the next whitespace, and hands each id to the resolver. The scanner must skip$occurrences inside code fences and inline command strings (see Non-goals) so shell$VARreferences are never treated as skill mentions. - Resolve via
discover_in_workspace(workspace).get(id)(skills/mod.rs:553/:421), gate onSkillStateStore::is_enabled(skill_state.rs:73), and emit theUsing skill: <name>line plus the loaded body.
Slice 2 — inline-mention popup.
- Extend the inline-mention popup machinery in
slash_menu.rs:86(partial_inline_skill_mention_at_cursor) to recognize a$-prefixed token under the cursor and offer skill-name completions fromSkillRegistry::list(), the same way the slash popup offers commands. This is a UX accelerator on top of Slice 1, not a precondition for it.
Slice 3 — ambiguity diagnostics.
- When resolution is ambiguous, emit actionable diagnostics, e.g.
"$debugging matched 3 skills: systematic-debugging, root-cause-debugging, superpowers:systematic-debugging — use $superpowers:systematic-debugging". Diagnostics name the disambiguated id the user should type next.
Deferred to 0.9.0+ (explicitly out of the first slices):
$ns:skillglobs / wildcards ($github:*). Plain namespaced-exact ($github:gh-fix-ci) ships in Slice 1; globbing does not.- Per-skill tool restriction enforcement. Parsing/display can land early;
enforcement and its catalog-head-stability handling are owned by
TOOL_LIFECYCLE.md. - Multi-skill composition rules. Until defined, fall back to the "choose one" prompt (§4, §7).
7. Ambiguity / error UX, tests, and non-goals
Error / ambiguity UX examples
| Input | Outcome |
|---|---|
$systematic-debugging fix the auth bug |
Exact hit. Using skill: systematic-debugging, body loaded, task = "fix the auth bug". |
$github:gh-fix-ci inspect failing checks |
Namespaced-exact hit. Using skill: github:gh-fix-ci, body loaded. |
$nope do a thing |
No match. "No skill named 'nope'. Run /skill list to see available skills." No activation; the line is sent as ordinary text. |
$debugging ... (3 candidates) |
"$debugging matched 3 skills: systematic-debugging, root-cause-debugging, superpowers:systematic-debugging — use $superpowers:systematic-debugging." No auto-pick. |
$systematic-debug ... (1 fuzzy candidate) |
Suggest only: "No exact skill 'systematic-debug'. Did you mean $systematic-debugging?" No silent activation. |
$aleph ... but aleph disabled |
"Skill 'aleph' is disabled. Enable it with /skill enable aleph." No activation. |
$tdd $systematic-debugging ... (2 mentions) |
"Choose one skill to lead this turn: $test-driven-development or $systematic-debugging." (until composition rules exist). |
echo $PATH inside a code fence / command string |
Not a mention. Scanner skips $ inside code/command contexts. |
Tests (planned)
- Exact:
$systematic-debuggingresolves viaget(id), activates, loads body. - Namespaced:
$github:gh-fix-ciresolves on thens:skillform. - Missing:
$nope→ no-match message, no activation, line passed as text. - Ambiguous:
$debugging(≥2 candidates) → "matched N skills … use $ns:skill", asserts no auto-activation occurred. - Disabled: a skill with
is_enabled == false→ disabled message, no activation. - Guardrail —
$in code:$VARinside a fenced block or command string is not treated as a mention.
Non-goals
- Do not remove slash commands.
/skilland the whole/surface stay;$is preferred for models but additive. - Do not auto-run arbitrary scripts. A
$mentionloads guidance (and, later, a declared tool policy) — it never executes shell or skill-attached scripts on its own. - Do not silently activate multiple complex skills. Multi-mention falls back to a "choose one" prompt until composition rules are specified.
- Do not let
$collide with shell variables.$inside code fences and command strings is never parsed as a skill mention.