From cef3b92964e9005a5983a82eda9f99b73a922739 Mon Sep 17 00:00:00 2001
From: Hunter Bown
Date: Wed, 10 Jun 2026 16:19:07 -0700
Subject: [PATCH 1/2] feat(docs): agent-task issue template, labels, and runner protocol (#3021)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the distributed intelligence infrastructure so remote agents can
autonomously execute v0.8.58 milestone issues:

- .github/ISSUE_TEMPLATE/agent-task.yml — GitHub issue form with six
  required sections (Goal, Scope, Key files, Acceptance criteria,
  Verification, Out of scope). Auto-labels as agent-ready.

- docs/AGENT_RUNNER.md — pick → claim → worktree → exec → verify → PR
  loop with safety rules, label semantics, and the issue body format.

Labels agent-ready, agent-in-progress, needs-human already exist
(created during milestone setup).
---
 .github/ISSUE_TEMPLATE/agent-task.yml |  91 ++++++++++++++++++
 docs/AGENT_RUNNER.md                  | 133 ++++++++++++++++++++++++++
 2 files changed, 224 insertions(+)
 create mode 100644 .github/ISSUE_TEMPLATE/agent-task.yml
 create mode 100644 docs/AGENT_RUNNER.md

diff --git a/.github/ISSUE_TEMPLATE/agent-task.yml b/.github/ISSUE_TEMPLATE/agent-task.yml
new file mode 100644
index 00000000..26311028
--- /dev/null
+++ b/.github/ISSUE_TEMPLATE/agent-task.yml
@@ -0,0 +1,91 @@
+name: Agent task
+description: Create a self-contained task that a headless agent (DeepSeek V4, remote droplet) can execute end-to-end without human context.
+title: "v0.8.58: "
+labels: ["agent-ready", "v0.8.58"]
+body:
+  - type: markdown
+    attributes:
+      value: |
+        ## Instructions for authors
+
+        This issue will be executed by an autonomous agent running `codewhale exec --auto`
+        on a headless VM. The body must be **self-sufficient** — every file path, command,
+        and acceptance criterion must be explicit. The agent has:
+
+        - A fresh clone of `Hmbown/CodeWhale` at `main`
+        - Shell, read, write, and git tools with auto-approvals
+        - No conversation context — this issue body is all it knows
+
+        Fill every section. Sections marked * are required.
+
+  - type: textarea
+    id: goal
+    attributes:
+      label: "Goal / Why"
+      description: "What problem does this fix, and why now? (2-4 sentences)"
+      placeholder: |
+        e.g. "The TUI freezes when 4+ sub-agents run concurrently because
+        AgentProgress events trigger a full redraw each. This blocks v0.8.58's
+        recommended sub-agent fanout."
+    validations:
+      required: true
+
+  - type: textarea
+    id: scope
+    attributes:
+      label: "Scope / Plan"
+      description: "Numbered steps with file paths. Each step is one concrete action."
+      placeholder: |
+        1. crates/tui/src/tui/ui.rs — add throttle in AgentProgress handler (line ~2308)
+        2. crates/tui/src/tui/app.rs — add `last_agent_progress_redraw` field
+        3. cargo test -p codewhale-tui — verify no regressions
+    validations:
+      required: true
+
+  - type: textarea
+    id: key-files
+    attributes:
+      label: "Key files"
+      description: "One file path per line. The agent will read these first."
+      placeholder: |
+        crates/tui/src/tui/ui.rs
+        crates/tui/src/tui/sidebar.rs
+        crates/tui/src/tui/app.rs
+    validations:
+      required: true
+
+  - type: textarea
+    id: acceptance-criteria
+    attributes:
+      label: "Acceptance criteria"
+      description: "Behavior-level checkboxes. Every item must be testable."
+      placeholder: |
+        - [ ] 4 concurrent sub-agents do not freeze TUI input
+        - [ ] Ctrl+C works during sub-agent activity
+        - [ ] Sidebar updates throttle under load
+    validations:
+      required: true
+
+  - type: textarea
+    id: verification
+    attributes:
+      label: "Verification"
+      description: "Exact shell commands the agent must run to prove the fix works."
+      placeholder: |
+        cargo check -p codewhale-tui
+        cargo test -p codewhale-tui -- subagent
+        cargo clippy -p codewhale-tui -- -D warnings
+    validations:
+      required: true
+
+  - type: textarea
+    id: out-of-scope
+    attributes:
+      label: "Out of scope"
+      description: "What this issue does NOT change. Prevents scope creep."
+      placeholder: |
+        - Changing the sub-agent execution model
+        - Reducing the recommended fanout count
+        - Network-level optimizations
+    validations:
+      required: true
diff --git a/docs/AGENT_RUNNER.md b/docs/AGENT_RUNNER.md
new file mode 100644
index 00000000..b2035d02
--- /dev/null
+++ b/docs/AGENT_RUNNER.md
@@ -0,0 +1,133 @@
+# Agent Runner Protocol
+
+How a headless agent (DeepSeek V4 on a DigitalOcean droplet, or any codewhale exec caller) picks up, implements, verifies, and delivers a milestone issue — fully autonomously.
+
+## Prerequisites
+
+- `gh` CLI authenticated with a fine-grained PAT scoped to `Hmbown/CodeWhale` (Contents RW, Issues RW, PRs RW, Metadata R)
+- `codewhale` binary on `$PATH` (v0.8.57+)
+- `DEEPSEEK_API_KEY` (or equivalent provider key) exported in the agent user's shell
+- A `git worktree` per issue (never commit directly to `main`)
+
+---
+
+## The loop
+
+### 1. Pick
+
+```bash
+gh issue list \
+  --repo Hmbown/CodeWhale \
+  --milestone v0.8.58 \
+  --label agent-ready \
+  --state open \
+  --json number,title,url
+```
+
+Choose an issue. Prefer `release-blocker` → `bug` → `enhancement` order.
+Do not pick an issue already labeled `agent-in-progress`.
+
+### 2. Claim
+
+```bash
+gh issue edit --add-label agent-in-progress --remove-label agent-ready
+```
+
+This prevents other agents from picking the same issue.
+
+### 3. Isolate
+
+```bash
+cd /opt/whalebro/codewhale
+git fetch origin
+git worktree add ../worktrees/issue- -b agent/- origin/main
+cd ../worktrees/issue-
+```
+
+Every issue gets its own branch and worktree. The branch name convention is `agent/-`.
+
+### 4. Execute
+
+```bash
+gh issue view --json body -q .body | \
+  codewhale exec --auto --output-format stream-json "$(cat)"
+```
+
+The agent reads the issue body and implements the fix. Use a tmux session per issue so the run survives SSH disconnects:
+
+```bash
+tmux new-session -d -s "issue-" \
+  "gh issue view --json body -q .body | \
+   codewhale exec --auto --output-format stream-json \"\$(cat)\" 2>&1 | tee /tmp/issue-.log"
+```
+
+For resuming an interrupted run:
+
+```bash
+codewhale exec --auto --output-format stream-json --resume latest "..."
+```
+
+### 5. Verify
+
+Run the exact commands from the issue's **Verification** section. If they pass, proceed. If they fail, loop back to step 4 with the error output as context, or label `needs-human`.
+
+### 6. Deliver
+
+```bash
+gh pr create \
+  --repo Hmbown/CodeWhale \
+  --base main \
+  --title "" \
+  --body "Closes #" \
+  --label v0.8.58
+```
+
+All delivery is via PR — never push to `main` directly. Human review is required before merge.
+
+### 7. On blockage
+
+```bash
+gh issue edit --add-label needs-human --remove-label agent-in-progress
+gh issue comment --body "Blocked: . Human decision needed."
+```
+
+Common blockers: missing credentials, ambiguous scope, test environment unavailable, network outage.
+
+---
+
+## Label semantics
+
+| Label | Meaning | Auto-applied? |
+|---|---|---|
+| `agent-ready` | Body has all six template sections; a remote agent may claim it | Yes (template) |
+| `agent-in-progress` | Claimed by an agent run; do not double-pick | Manual (step 2) |
+| `needs-human` | Agent blocked; requires human decision or credentials | Manual (step 7) |
+| `autonomous-ready` | Legacy nightly-loop label; distinct from `agent-ready` | No |
+
+The `autonomous-ready` label is for the legacy nightly loop (external automation).
+New work uses `agent-ready`.
+
+---
+
+## Safety rules
+
+1. **PR-only delivery.** Never commit to `main`. Every change is a branch + PR.
+2. **No force-push.** `git push --force` is forbidden.
+3. **Secrets never in argv, history, or logs.** API keys, PATs, and credentials live in `/etc/codewhale/*.env` and are sourced into the agent user's shell. The runtime API listens on `127.0.0.1:7878` only. Telegram bridge chats are allowlisted.
+4. **Human reviews every PR.** The droplet loop delivers PRs; a human on the laptop reviews and merges.
+5. **One issue per worktree.** No cross-contamination between concurrent agent runs.
+
+---
+
+## Issue body format
+
+Every `agent-ready` issue must have these six sections (enforced by `.github/ISSUE_TEMPLATE/agent-task.yml`):
+
+1. **Goal / Why** — what problem, why now
+2. **Scope / Plan** — numbered steps with file paths
+3. **Key files** — paths to read first
+4. **Acceptance criteria** — behavior-level checkboxes
+5. **Verification** — exact shell commands
+6. **Out of scope** — explicit non-goals
+
+The body must be self-sufficient: a fresh clone agent with no conversation context must be able to execute it.
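
The six-section requirement at the end of `docs/AGENT_RUNNER.md` could be checked mechanically before an agent claims an issue. The following is a minimal sketch, assuming GitHub renders each issue-form field as a `### <label>` heading in the issue body. The function name `check_sections` is hypothetical and is not part of this patch.

```bash
#!/usr/bin/env bash
# Hypothetical pre-claim check (not part of the patch): confirm that an
# issue body contains all six section headings from agent-task.yml.
# Assumption: each form field appears as a "### <label>" line in the body.

check_sections() {
  # Strip carriage returns so CRLF bodies still match whole lines.
  local body="${1//$'\r'/}"
  local missing=0 heading
  local -a required=(
    "Goal / Why"
    "Scope / Plan"
    "Key files"
    "Acceptance criteria"
    "Verification"
    "Out of scope"
  )
  for heading in "${required[@]}"; do
    # -x matches the whole line, -F treats the heading as a literal string.
    if ! grep -qxF "### ${heading}" <<<"$body"; then
      echo "missing section: ${heading}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A runner might call it as `check_sections "$(gh issue view "$issue" --repo Hmbown/CodeWhale --json body -q .body)"`, where `$issue` is a placeholder for the issue number, and skip the claim step when it returns non-zero. The check only confirms that the headings exist; it says nothing about whether each section is filled in well enough to execute.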
From e4ea208d538c7ea2325279e1eb3c8cf58cf28d81 Mon Sep 17 00:00:00 2001
From: Claude
Date: Thu, 11 Jun 2026 02:31:00 +0000
Subject: [PATCH 2/2] =?UTF-8?q?docs(runner):=20fix=20resume=20example=20?=
 =?UTF-8?q?=E2=80=94=20exec=20has=20no=20'latest'=20session=20alias;=20use?=
 =?UTF-8?q?=20--continue=20(#3021)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
---
 docs/AGENT_RUNNER.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/docs/AGENT_RUNNER.md b/docs/AGENT_RUNNER.md
index b2035d02..8e26dd40 100644
--- a/docs/AGENT_RUNNER.md
+++ b/docs/AGENT_RUNNER.md
@@ -61,10 +61,12 @@ tmux new-session -d -s "issue-" \
    codewhale exec --auto --output-format stream-json \"\$(cat)\" 2>&1 | tee /tmp/issue-.log"
 ```
 
-For resuming an interrupted run:
+For resuming an interrupted run (`--continue` picks up the most recent
+session for this workspace; `--resume latest` only exists in the interactive
+TUI):
 
 ```bash
-codewhale exec --auto --output-format stream-json --resume latest "..."
+codewhale exec --auto --output-format stream-json --continue "..."
 ```
 
 ### 5. Verify