8cb4f94f30
Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover (no catalog code in this cycle). Grounded in a verified inventory of the actual tool registry. - docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states (active / deferred / hidden-compatibility / deprecated / removed) modeled as const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec field), so registration stays untouched and old transcripts always replay. Includes the deprecation manifest (exec_wait/exec_interact/tts → hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are already non-visible dead code → cleanup + guardrail), per-mode/per-provider active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated. - docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 + symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; an explainable tool contract returning reasons[]; and a real CodeWhale query eval set. Complements grep_files/file_search, never replaces. - docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation syntax (the token IS the skill name), namespaced resolution, ambiguity- suggests-not-guesses, visible activation line, and a smallest-viable slice. - docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered context-memory stack, tool repair/autoload, the evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; also records the deferred-not-done diet items. Targets codex/v0.8.53.
234 lines
12 KiB
Markdown
234 lines
12 KiB
Markdown
# Skill Invocation Design — the `$<skill-name>` inline syntax
|
|
|
|
Status: **DESIGN ONLY** (v0.8.53 cycle). No catalog/parser code ships in this
|
|
cycle; the implementation target is **0.9.0**. This document describes what
|
|
*will* be built and the contracts it must honor against today's code.
|
|
|
|
Related design docs: `TOOL_LIFECYCLE.md` (tool lifecycle states + per-skill tool
|
|
restriction), command-surface taxonomy notes for `/memory`, `/context`,
|
|
`/rules`, `/workflow` (`/whaleflow`), `/overlay`. Open PRs on `codex/v0.8.53`:
|
|
#2684 (subagent role vocab / lifecycle signals / eval ergonomics) and #2685
|
|
(git history active + RLM/field errors). Nothing here contradicts those.
|
|
|
|
---
|
|
|
|
## 1. Problem
|
|
|
|
Skill activation has no single, model-legible entry point, and the candidate
|
|
surfaces all compete with each other:
|
|
|
|
- A `/skill` slash command, a `load_skill`-style tool, plugin/namespace naming
|
|
(`superpowers:systematic-debugging`, `github:gh-fix-ci`), and the long-running
|
|
workflow commands (`/workflow` / `/whaleflow`) all *could* be "the way you
|
|
start a skill." None of them is canonical.
|
|
- Slash commands are already overloaded. `/memory`, `/context`, `/rules`,
|
|
`/config`, `/provider`, `/workflow`, `/overlay` each map to one subsystem;
|
|
jamming skill invocation into `/`-space forces a weaker model to disambiguate
|
|
"is this a command or a skill?" on every keystroke.
|
|
- Weaker / smaller models (the cheaper providers CodeWhale targets) do not
|
|
reliably pick the right mechanism. They will free-text "let me use systematic
|
|
debugging" instead of actually loading the skill body, so the guidance never
|
|
enters the context window.
|
|
- Today there is **no parser that activates an inline skill mention on submit.**
|
|
`slash_menu.rs:86` (`partial_inline_skill_mention_at_cursor`) recognizes an
|
|
inline `/<skill>` token *under the cursor for popup purposes only*; the submit
|
|
path in `ui.rs:4721` (`build_queued_message`) does not resolve or activate any
|
|
inline mention. There is also no activation-mode concept (always-on / glob /
|
|
model-decision / manual) and skills cannot restrict tools yet.
|
|
|
|
We need one prefix that means exactly "invoke this skill," is visually distinct
|
|
from commands, and is cheap for a small model to emit correctly.
|
|
|
|
---
|
|
|
|
## 2. Proposal
|
|
|
|
Adopt **`$` as the skill-invocation prefix**, where **the token *is* the skill
|
|
name** — not a literal command called `$skill`.
|
|
|
|
```
|
|
$systematic-debugging figure out why MiMo auth fails
|
|
$test-driven-development add coverage before fixing
|
|
$github:gh-fix-ci inspect the failing checks
|
|
$aleph search the planning doc
|
|
```
|
|
|
|
The leading `$` is the marker; everything from `$` up to the next whitespace is
|
|
the **skill id**. The rest of the line is the user's request, passed through to
|
|
the model with the skill body already loaded as active guidance.
|
|
|
|
This is deliberately a *reference / macro* sigil, like a shell variable
|
|
expansion or an `@mention`: `$skill-id` resolves to "the contents and tool
|
|
policy of that skill," then the surrounding prose is the task.
|
|
|
|
`$` works in three places (see §4): the user composer, the command-palette
|
|
input, and **model-facing planning text** — so the model itself can write
|
|
`$systematic-debugging` in its plan and have it resolve.
|
|
|
|
---
|
|
|
|
## 3. Resolution rules
|
|
|
|
Given a token `$<id>` (id captured up to the next whitespace):
|
|
|
|
1. **Exact name first.** Look the id up directly:
|
|
`discover_in_workspace(workspace).get(id)` — `skills/mod.rs:553` builds the
|
|
registry; `SkillRegistry::get` (`skills/mod.rs:421`) matches on `s.name == id`
|
|
exactly. Skill names come from frontmatter `name:` (or the first `# Heading`
|
|
fallback) parsed at `skills/mod.rs:382-417`. An exact hit wins unconditionally.
|
|
|
|
2. **Namespaced `$ns:skill`.** If the id contains a `:`, treat the part before
|
|
the colon as a source/plugin namespace and the part after as the skill name:
|
|
`$github:gh-fix-ci`, `$superpowers:systematic-debugging`. Namespaced ids are
|
|
the disambiguation handle a user is told to type when a bare id is ambiguous.
|
|
(Glob/wildcard namespacing — `$github:*` — is explicitly deferred, see §6.)
|
|
|
|
3. **Fuzzy match *suggests*, never silently chooses.** If there is no exact (or
|
|
namespaced-exact) hit, run a case-insensitive substring / prefix match over
|
|
`SkillRegistry::list()` (`skills/mod.rs:426`). If exactly one skill matches,
|
|
surface it as a suggestion ("did you mean `$systematic-debugging`?") but do
|
|
**not** auto-activate it. If more than one matches, list them and require the
|
|
user/model to re-issue with a disambiguated id (§7). Ambiguity never resolves
|
|
to a silent pick.
|
|
|
|
4. **Respect enable-state.** A resolved skill is only activated if
|
|
`SkillStateStore::is_enabled(id)` is true (`skill_state.rs:73`:
|
|
`!self.disabled.contains(skill_name)`). A disabled skill that resolves by
|
|
name produces a clear "skill is disabled; enable it with `/skill enable <id>`"
|
|
message rather than silently activating or silently doing nothing.
|
|
|
|
Resolution order is therefore: **exact → namespaced-exact → enabled-check →
|
|
fuzzy-suggest (never auto-pick).**
|
|
|
|
---
|
|
|
|
## 4. Behavior
|
|
|
|
When a `$<id>` mention resolves and is enabled:
|
|
|
|
- **Visible activation line.** The transcript shows `Using skill: <name>` so the
|
|
user can see which skill body entered context. (Mirrors the existing skill UX
|
|
vocabulary; one line per activated skill.)
|
|
- **Body loaded as active guidance.** The skill's `body`
|
|
(`skills/mod.rs` `Skill.body`) is injected into the turn as authoritative
|
|
guidance, the same content a `/skill`-style activation would load. The user's
|
|
trailing prose is the task the guidance applies to.
|
|
- **Tool-surface narrowing (when declared).** If the skill declares a set of
|
|
allowed tools, the active tool surface narrows to that set for the duration of
|
|
the skill's influence. **Per-skill tool restriction is net-new** — skills
|
|
cannot restrict tools today; the mechanism, and how narrowing interacts with
|
|
the catalog-head byte-stability invariant (`tool_catalog.rs:169-196`), is
|
|
specified in `TOOL_LIFECYCLE.md`. Until that lands, a declared tool list is
|
|
parsed and shown but not enforced.
|
|
- **Multiple `$mentions` compose explicitly, or prompt.** Until formal
|
|
composition rules exist, two or more `$mentions` in one message either compose
|
|
only when the rule is unambiguous (e.g. one guidance skill + one tool-scoping
|
|
skill) or return a **"choose one"** prompt listing the mentioned skills. We
|
|
never silently activate multiple complex skills at once (see §7 and Non-goals).
|
|
- **Three input surfaces.** Resolution runs for: (a) user prompts in the
|
|
composer, (b) command-palette input, and (c) model-facing planning text, so a
|
|
model that writes `$test-driven-development` in its plan triggers the same
|
|
activation path a human would.
|
|
- **Slash commands remain supported.** `/skill ...` and the rest of the slash
|
|
surface keep working unchanged. `$` is the *preferred* path for models because
|
|
it is one token and unambiguous, but it is additive, not a replacement (§7
|
|
Non-goals).
|
|
|
|
---
|
|
|
|
## 5. Why `$`
|
|
|
|
- **Visually distinct from `/commands`.** A glance separates "run a subsystem
|
|
command" (`/memory`, `/context`, `/workflow`) from "load a skill" (`$aleph`).
|
|
Weaker models stop confusing the two surfaces.
|
|
- **Reads like a reference / macro.** `$name` already means "expand this named
|
|
thing" to anyone who has touched a shell or a templating language. Skill
|
|
invocation *is* an expansion: `$skill-id` → that skill's guidance + tool policy.
|
|
- **Avoids overloading the slash namespace.** `/workflow`, `/memory`, `/config`,
|
|
`/provider`, `/rules`, `/overlay`, `/context` each already own one meaning in
|
|
the command-surface taxonomy. Skills get their own sigil instead of a crowded
|
|
`/skill <name>` subcommand competing with all of them.
|
|
- **Easy to type and remember.** Single leading character, then the literal
|
|
skill name. Nothing to memorize beyond the skill ids the user already sees in
|
|
`/skill list`.
|
|
|
|
---
|
|
|
|
## 6. Implementation plan (smallest viable 0.8.53-ready slice → 0.9.0)
|
|
|
|
The 0.8.53 cycle is **docs only**. The plan below is the build order once code
|
|
is unblocked; the first slice is intentionally the minimum that proves the path.
|
|
|
|
**Slice 1 — token scanner at submit (the minimum viable feature).**
|
|
- Add a `$<skill-id>` token scanner invoked on submit, **before**
|
|
`build_queued_message` runs (`ui.rs:4721`). The scanner finds leading-`$`
|
|
tokens, captures the id up to the next whitespace, and hands each id to the
|
|
resolver. The scanner must skip `$` occurrences inside code fences and inline
|
|
command strings (see Non-goals) so shell `$VAR` references are never treated as
|
|
skill mentions.
|
|
- Resolve via `discover_in_workspace(workspace).get(id)` (`skills/mod.rs:553` /
|
|
`:421`), gate on `SkillStateStore::is_enabled` (`skill_state.rs:73`), and emit
|
|
the `Using skill: <name>` line plus the loaded body.
|
|
|
|
**Slice 2 — inline-mention popup.**
|
|
- Extend the inline-mention popup machinery in `slash_menu.rs:86`
|
|
(`partial_inline_skill_mention_at_cursor`) to recognize a `$`-prefixed token
|
|
under the cursor and offer skill-name completions from `SkillRegistry::list()`,
|
|
the same way the slash popup offers commands. This is a UX accelerator on top
|
|
of Slice 1, not a precondition for it.
|
|
|
|
**Slice 3 — ambiguity diagnostics.**
|
|
- When resolution is ambiguous, emit actionable diagnostics, e.g.
|
|
`"$debugging matched 3 skills: systematic-debugging, root-cause-debugging,
|
|
superpowers:systematic-debugging — use $superpowers:systematic-debugging"`.
|
|
Diagnostics name the disambiguated id the user should type next.
|
|
|
|
**Deferred to 0.9.0+ (explicitly out of the first slices):**
|
|
- `$ns:skill` **globs / wildcards** (`$github:*`). Plain namespaced-exact
|
|
(`$github:gh-fix-ci`) ships in Slice 1; globbing does not.
|
|
- **Per-skill tool restriction enforcement.** Parsing/display can land early;
|
|
enforcement and its catalog-head-stability handling are owned by
|
|
`TOOL_LIFECYCLE.md`.
|
|
- **Multi-skill composition rules.** Until defined, fall back to the "choose one"
|
|
prompt (§4, §7).
|
|
|
|
---
|
|
|
|
## 7. Ambiguity / error UX, tests, and non-goals
|
|
|
|
### Error / ambiguity UX examples
|
|
|
|
| Input | Outcome |
|
|
|---|---|
|
|
| `$systematic-debugging fix the auth bug` | Exact hit. `Using skill: systematic-debugging`, body loaded, task = "fix the auth bug". |
|
|
| `$github:gh-fix-ci inspect failing checks` | Namespaced-exact hit. `Using skill: github:gh-fix-ci`, body loaded. |
|
|
| `$nope do a thing` | No match. `"No skill named 'nope'. Run /skill list to see available skills."` No activation; the line is sent as ordinary text. |
|
|
| `$debugging ...` (3 candidates) | `"$debugging matched 3 skills: systematic-debugging, root-cause-debugging, superpowers:systematic-debugging — use $superpowers:systematic-debugging."` No auto-pick. |
|
|
| `$systematic-debug ...` (1 fuzzy candidate) | Suggest only: `"No exact skill 'systematic-debug'. Did you mean $systematic-debugging?"` No silent activation. |
|
|
| `$aleph ...` but aleph disabled | `"Skill 'aleph' is disabled. Enable it with /skill enable aleph."` No activation. |
|
|
| `$tdd $systematic-debugging ...` (2 mentions) | `"Choose one skill to lead this turn: $test-driven-development or $systematic-debugging."` (until composition rules exist). |
|
|
| `echo $PATH` inside a code fence / command string | Not a mention. Scanner skips `$` inside code/command contexts. |
|
|
|
|
### Tests (planned)
|
|
|
|
- **Exact:** `$systematic-debugging` resolves via `get(id)`, activates, loads body.
|
|
- **Namespaced:** `$github:gh-fix-ci` resolves on the `ns:skill` form.
|
|
- **Missing:** `$nope` → no-match message, no activation, line passed as text.
|
|
- **Ambiguous:** `$debugging` (≥2 candidates) → "matched N skills … use $ns:skill",
|
|
asserts **no** auto-activation occurred.
|
|
- **Disabled:** a skill with `is_enabled == false` → disabled message, no activation.
|
|
- **Guardrail — `$` in code:** `$VAR` inside a fenced block or command string is
|
|
not treated as a mention.
|
|
|
|
### Non-goals
|
|
|
|
- **Do not remove slash commands.** `/skill` and the whole `/` surface stay; `$`
|
|
is preferred for models but additive.
|
|
- **Do not auto-run arbitrary scripts.** A `$mention` loads guidance (and, later,
|
|
a declared tool policy) — it never executes shell or skill-attached scripts on
|
|
its own.
|
|
- **Do not silently activate multiple complex skills.** Multi-mention falls back
|
|
to a "choose one" prompt until composition rules are specified.
|
|
- **Do not let `$` collide with shell variables.** `$` inside code fences and
|
|
command strings is never parsed as a skill mention.
|