Three targeted error-message improvements extracted from community
PR #2933 (author cy2311), with additional model-not-found annotation:
1. dispatch.rs format_tool_error: pass through self-explanatory messages
that already name the cause (mode switch, allow_shell, feature flag,
denied by user) instead of appending a conflicting generic suffix.
Fixes the Plan-mode double-message (#2657).
2. subagent/mod.rs session-name conflict: include elapsed time
(started Ns ago / NmNs ago) so the parent can distinguish a live
worker from a stale/failed earlier spawn (#2656).
3. subagent/mod.rs annotate_child_model_error: catch model-not-found
patterns (Model Not Exist, does not exist, no such model, etc.) in
the raw error text even when the taxonomy classifies them as
Internal rather than Authorization/State (#2653).
Closes#2653, #2656, #2657.
Credit: cy2311 for the dispatch.rs and subagent conflict hunks from #2933.
Co-authored-by: cy2311 <29836092+cy2311@users.noreply.github.com>
- setup-vm.sh: bump RELEASE_TAG default to v0.8.57, add gh CLI install
step (official APT repo) and 4G swapfile creation (idempotent)
- agent-session.sh: new sourceable helper that exports the provider key
from /etc/codewhale/runtime.env for interactive agent sessions
- README.md: update version refs, add agent-session.sh to layout, add
Autonomous agent loop section with full pick->PR commands
The droplet ops (binary upgrade, PAT setup, first end-to-end issue run)
are documented as the next steps for the operator.
Headless exec hardening for benchmark/CI/droplet use:
- New CLI flags: --allowed-tools, --disallowed-tools, --max-turns, --append-system-prompt
- Add disallowed_tools to EngineConfig + command_denies_tool() helper
- run_exec_agent threads all four flags into EngineConfig and Op::SendMessage
- needs_engine now includes flag presence for standalone exec use
SavedSession.messages is Vec<Message>, not AppendLog — .clone() already
returns Vec<Message>, so .into() was a no-op conversion that triggered
clippy::useless_conversion in CI lint.
- Wire AppendLog as the backing store for Session.messages
- Add Deref, From impls, and explicit mutation methods to AppendLog
- Narrow API: remove DerefMut, add push_batch/truncate_to/trim_front/clear/last_mut
- Update all direct message assignments to use .into() conversions
- Update tests to deref through AppendLog for comparisons
Rebased onto upstream/main (v0.8.57) to resolve merge conflicts.
- prompt_persist tests wrote through the developer's real
~/.codewhale/prompt_cache and raced each other (the eviction test could
delete another test's entry). cache_dir() now honors
CODEWHALE_PROMPT_CACHE_DIR and the tests run serialized against private
tempdirs.
- The raw TCP mock servers in mcp tests answered one request per socket
but never advertised 'Connection: close', so reqwest pooled dead
keep-alive connections and retried POSTs failed under parallel-suite
load with 'connection closed before message completed' (~50% failure
rate on the full suite).
The workflow hardcoded install boilerplate plus a contributor list that
had already drifted (v0.8.56's release thanked people 'for shaping
v0.9.0'). The body now comes from scripts/release/generate-release-body.sh:
static install/verify sections plus the tagged version's changelog section,
which already carries the per-release credits.
- scripts/release/prepare-release.sh bumps workspace + crate pins + npm
wrapper + README install tags, refreshes Cargo.lock, regenerates the
TUI changelog slice and web facts, then runs check-versions.sh
- check-versions.sh now also gates web/lib/facts.generated.ts and the
README install-tag examples (both drifted silently before)
- .cnb.yml validates the pushed tag against Cargo.toml before generating
mirror release notes
- RELEASE_CHECKLIST/RUNBOOK updated accordingly (v0.8.56 needed 9 fix
commits for exactly these sync points)
When the host sleeps while a model response is streaming, the connection
dies on wake with 'Stream read error: error decoding response body' and
the turn was lost. The engine now stamps every stream chunk with both
monotonic and wall-clock time; Instant pauses across a suspend while
SystemTime does not, so a >10s divergence on a stream error identifies a
sleep/wake cycle. In that case the partial output is discarded and the
identical request is re-issued (sharing the existing MAX_STREAM_RETRIES=3
budget) instead of failing the turn. Ordinary network flakes keep the
deliberate no-retry-after-content policy from #103.
- system prompt environment key deepseek_version -> codewhale_version
- drop legacy .deepseek/instructions.md from the Local Law prompt tier
(the engine still reads it for back-compat)
- instructions-file truncation marker now states how many bytes were
omitted so the model knows what it is missing
- CODEWHALE_CHANGELOG const + user-facing /change strings
- codewhale metrics doc headers
The packaged changelog is now a recent-releases slice produced by
scripts/sync-changelog.sh (which gains a --check mode); also restore the
SECURITY.md contact line the version gate guards, and finish the stale
binary-name sweep (--bin codewhale examples, qa harness doc).
The /change command embeds this file into every binary via include_str!;
it now carries only recent releases (regenerated by scripts/sync-changelog.sh,
wired into the release checklist). The explicit-version test derives its
fixture versions from the embedded slice instead of hardcoding old ones.
Loads resolved hotbar slot bindings into app state and renders the Hotbar panel at the bottom of the sidebar (render layer only; key dispatch is a follow-up). Part of #2061, ref #2065.
Prompt, exec_shell description, and background schema now direct work expected to take >5 seconds (builds, test suites, servers, polling, sleeps) to task_shell_start/background=true. Addresses #2939.