test(fleet): CI-safe multi-worker dogfood smoke with injected failure (#3166)
Adds a verifiable dogfood smoke that drives several concurrent exec-style workers (three healthy plus one injected failure that emits an error event and exits non-zero) through the real host adapter and asserts distinct terminal pass/fail outcomes. It needs no external services, no model calls, and no codewhale binary.

Documents the automated CI smoke versus the manual `codewhale fleet run` path in the dogfood spec, and notes that the manager run-loop cutover to drive real FleetExecutor workers is still in progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -1,16 +1,27 @@
 # Agent Fleet dogfood smoke spec (#3166)
 #
 # This spec exercises the fleet end-to-end: create a run with two local
-# workers, run a lint task and a review task, verify the ledger records
-# receipts, and confirm the status surfaces work.
+# workers, run a workspace-check task and a protocol-review task, verify the
+# ledger records receipts, and confirm the status surfaces work. Each worker is
+# a headless `codewhale exec` run (see docs/AGENT_RUNTIME.md).
 #
-# Run:
+# Automated CI-safe smoke (no external services, no model calls):
+# cargo test -p codewhale-tui --bins fleet::executor
+# It drives several concurrent exec-style workers (with one injected failure)
+# through the real host adapter and asserts terminal pass/fail outcomes.
+#
+# Manual run (drives real `codewhale exec` workers; needs provider creds):
 # codewhale fleet run docs/examples/fleet-dogfood.toml --max-workers 2 --once
 #
 # Then check:
 # codewhale fleet status
 # codewhale fleet inspect <worker-id-from-status>
 # codewhale fleet logs <worker-id-from-status>
+#
+# NOTE: wiring the manager run loop to drive FleetExecutor for real workers is
+# the in-progress cutover; until then the manual run path uses the local
+# simulation harness. The automated smoke above already proves the real
+# exec-subprocess -> ledger-event path.
 
 name = "dogfood smoke"
 labels = { milestone = "v0.8.60", class = "smoke" }
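
For readers who want a concrete picture of the pattern the commit message describes (several concurrent subprocess workers, one of which is made to fail on purpose), here is a minimal standalone sketch in plain Rust. It is not the codewhale test and does not use the real host adapter or FleetExecutor. The `Outcome` type, the `run_worker` and `run_smoke` functions, the worker ids, and the `{"type":"error"}` event shape are all hypothetical, and the stand-in workers are `sh -c` one-liners, so a POSIX shell is assumed.

```rust
use std::process::{Command, Stdio};
use std::thread;

/// Terminal outcome for one worker (hypothetical type, not codewhale's).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Passed,
    Failed { exit_code: Option<i32>, saw_error_event: bool },
}

/// Run one stand-in worker as a real subprocess and classify how it ended.
fn run_worker(script: &str) -> std::io::Result<Outcome> {
    let output = Command::new("sh")
        .arg("-c")
        .arg(script)
        .stdin(Stdio::null())
        .output()?; // waits for exit and captures stdout/stderr

    // Assumed event format: one JSON object per stdout line.
    let stdout = String::from_utf8_lossy(&output.stdout);
    let saw_error_event = stdout
        .lines()
        .any(|line| line.contains("\"type\":\"error\""));

    if output.status.success() && !saw_error_event {
        Ok(Outcome::Passed)
    } else {
        Ok(Outcome::Failed {
            exit_code: output.status.code(),
            saw_error_event,
        })
    }
}

/// Launch three healthy workers and one injected failure concurrently,
/// returning (worker id, outcome) pairs in launch order.
pub fn run_smoke() -> Vec<(String, Outcome)> {
    let healthy = r#"echo '{"type":"done"}'"#;
    let failing = r#"echo '{"type":"error","message":"injected"}'; exit 3"#;
    let plan = vec![
        ("w1", healthy),
        ("w2", healthy),
        ("w3", healthy),
        ("w4-fail", failing),
    ];

    // One thread per worker so the subprocesses overlap in time.
    let handles: Vec<_> = plan
        .into_iter()
        .map(|(id, script)| {
            thread::spawn(move || {
                (id.to_string(), run_worker(script).expect("failed to spawn sh"))
            })
        })
        .collect();

    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

A test built on this sketch would call `run_smoke()` and assert that exactly three workers report `Outcome::Passed` while the injected one reports `Outcome::Failed` with a non-zero exit code and an observed error event. Checking both signals keeps a worker that exits non-zero silently distinguishable from one that reports its failure.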