Three targeted error-message improvements extracted from community
PR #2933 (author cy2311), with additional model-not-found annotation:
1. dispatch.rs format_tool_error: pass through self-explanatory messages
that already name the cause (mode switch, allow_shell, feature flag,
denied by user) instead of appending a conflicting generic suffix.
Fixes the Plan-mode double-message (#2657).
2. subagent/mod.rs session-name conflict: include elapsed time
(started Ns ago / NmNs ago) so the parent can distinguish a live
worker from a stale/failed earlier spawn (#2656).
3. subagent/mod.rs annotate_child_model_error: catch model-not-found
patterns (Model Not Exist, does not exist, no such model, etc.) in
the raw error text even when the taxonomy classifies them as
Internal rather than Authorization/State (#2653).
Closes#2653, #2656, #2657.
Credit: cy2311 for the dispatch.rs and subagent conflict hunks from #2933.
Co-authored-by: cy2311 <29836092+cy2311@users.noreply.github.com>
- setup-vm.sh: bump RELEASE_TAG default to v0.8.57, add gh CLI install
step (official APT repo) and 4G swapfile creation (idempotent)
- agent-session.sh: new sourceable helper that exports the provider key
from /etc/codewhale/runtime.env for interactive agent sessions
- README.md: update version refs, add agent-session.sh to layout, add
Autonomous agent loop section with full pick->PR commands
The droplet ops (binary upgrade, PAT setup, first end-to-end issue run)
are documented as the next steps for the operator.
Headless exec hardening for benchmark/CI/droplet use:
- New CLI flags: --allowed-tools, --disallowed-tools, --max-turns, --append-system-prompt
- Add disallowed_tools to EngineConfig + command_denies_tool() helper
- run_exec_agent threads all four flags into EngineConfig and Op::SendMessage
- needs_engine now includes flag presence for standalone exec use
Three changes to replace raw UUIDs/hex-ids with stable user-facing labels:
1. Turn label: Add turn_counter to App, display "Turn N" instead of the
raw runtime_turn_id UUID prefix. Full UUID preserved in hover text.
2. Agent labels: Add agent_counter + agent_label_map to App. Populated
on AgentSpawned; sidebar rows use "Agent 1", "Agent 2" etc. instead
of agent_<hex>. Nicknames and user-assigned names still take priority.
3. Step counter: Add format_step_counter() helper. When max_steps is
u32::MAX (the unbounded sentinel), renders "step 16" instead of the
meaningless "step 16/4294967295". Concrete step budgets still show
the denominator.
When 4+ sub-agents run concurrently, each AgentProgress event triggers
a full terminal redraw via received_engine_event → needs_redraw. The
render loop saturates, sidebar recomputation dominates the frame budget,
and terminal input events (including Ctrl+C) are starved.
Limit progress-driven redraws to at most one per 100ms per agent. The
status-animation timer (80ms cadence) still guarantees sidebar updates.
Agent state is recorded immediately; the sidebar picks it up on the next
permitted redraw.
Adds last_agent_progress_redraw field to App to track throttle state.
SavedSession.messages is Vec<Message>, not AppendLog — .clone() already
returns Vec<Message>, so .into() was a no-op conversion that triggered
clippy::useless_conversion in CI lint.
- Wire AppendLog as the backing store for Session.messages
- Add Deref, From impls, and explicit mutation methods to AppendLog
- Narrow API: remove DerefMut, add push_batch/truncate_to/trim_front/clear/last_mut
- Update all direct message assignments to use .into() conversions
- Update tests to deref through AppendLog for comparisons
Rebased onto upstream/main (v0.8.57) to resolve merge conflicts.
- prompt_persist tests wrote through the developer's real
~/.codewhale/prompt_cache and raced each other (the eviction test could
delete another test's entry). cache_dir() now honors
CODEWHALE_PROMPT_CACHE_DIR and the tests run serialized against private
tempdirs.
- The raw TCP mock servers in mcp tests answered one request per socket
but never advertised 'Connection: close', so reqwest pooled dead
keep-alive connections and retried POSTs failed under parallel-suite
load with 'connection closed before message completed' (~50% failure
rate on the full suite).
The workflow hardcoded install boilerplate plus a contributor list that
had already drifted (v0.8.56's release thanked people 'for shaping
v0.9.0'). The body now comes from scripts/release/generate-release-body.sh:
static install/verify sections plus the tagged version's changelog section,
which already carries the per-release credits.
- scripts/release/prepare-release.sh bumps workspace + crate pins + npm
wrapper + README install tags, refreshes Cargo.lock, regenerates the
TUI changelog slice and web facts, then runs check-versions.sh
- check-versions.sh now also gates web/lib/facts.generated.ts and the
README install-tag examples (both drifted silently before)
- .cnb.yml validates the pushed tag against Cargo.toml before generating
mirror release notes
- RELEASE_CHECKLIST/RUNBOOK updated accordingly (v0.8.56 needed 9 fix
commits for exactly these sync points)
When the host sleeps while a model response is streaming, the connection
dies on wake with 'Stream read error: error decoding response body' and
the turn was lost. The engine now stamps every stream chunk with both
monotonic and wall-clock time; Instant pauses across a suspend while
SystemTime does not, so a >10s divergence on a stream error identifies a
sleep/wake cycle. In that case the partial output is discarded and the
identical request is re-issued (sharing the existing MAX_STREAM_RETRIES=3
budget) instead of failing the turn. Ordinary network flakes keep the
deliberate no-retry-after-content policy from #103.
- system prompt environment key deepseek_version -> codewhale_version
- drop legacy .deepseek/instructions.md from the Local Law prompt tier
(the engine still reads it for back-compat)
- instructions-file truncation marker now states how many bytes were
omitted so the model knows what it is missing
- CODEWHALE_CHANGELOG const + user-facing /change strings
- codewhale metrics doc headers
The packaged changelog is now a recent-releases slice produced by
scripts/sync-changelog.sh (which gains a --check mode); also restore the
SECURITY.md contact line the version gate guards, and finish the stale
binary-name sweep (--bin codewhale examples, qa harness doc).
The /change command embeds this file into every binary via include_str!;
it now carries only recent releases (regenerated by scripts/sync-changelog.sh,
wired into the release checklist). The explicit-version test derives its
fixture versions from the embedded slice instead of hardcoding old ones.