Three targeted changes to reduce low-value detail in the default
compact/Live transcript view:
1. ExecCell: suppress "(no output)" line in Live mode. The success
header already conveys the outcome; Transcript mode keeps it for
exports/clipboard/pager.
2. ExecCell: suppress sub-second timing in Live mode. Calls under 1s
show no timing line; Transcript mode always shows exact timing.
3. render_preserved_output_mode: suppress "(no output)" for empty output
in Live mode. Same rationale — the header carries the signal.
Full command text, complete output, and exact timing remain available
in Transcript mode (pager, clipboard export, Alt+V detail view).
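
A minimal sketch of the Live/Transcript split described above. The names `RenderMode`, `timing_line`, and `output_lines`, and the "took N s" wording, are hypothetical and do not reflect the actual ExecCell code; only the suppression rules come from this change.

```rust
use std::time::Duration;

/// Hypothetical render modes mirroring the Live/Transcript split.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum RenderMode {
    Live,
    Transcript,
}

/// Timing line for a finished call, or `None` when the mode suppresses it.
/// Live mode hides sub-second timings; Transcript mode always shows them.
fn timing_line(elapsed: Duration, mode: RenderMode) -> Option<String> {
    match mode {
        RenderMode::Live if elapsed < Duration::from_secs(1) => None,
        _ => Some(format!("took {:.3}s", elapsed.as_secs_f64())),
    }
}

/// Output lines for a call. Empty output renders "(no output)" only in
/// Transcript mode; in Live mode the success header carries the signal.
fn output_lines(output: &str, mode: RenderMode) -> Vec<String> {
    if output.is_empty() {
        return match mode {
            RenderMode::Live => Vec::new(),
            RenderMode::Transcript => vec!["(no output)".to_string()],
        };
    }
    output.lines().map(str::to_string).collect()
}
```

Keeping the decision in small pure functions like these makes both modes easy to cover with unit tests without rendering a full cell.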
- prompt_persist tests wrote through the developer's real
~/.codewhale/prompt_cache and raced each other (the eviction test could
delete another test's entry). cache_dir() now honors
CODEWHALE_PROMPT_CACHE_DIR and the tests run serialized against private
tempdirs.
- The raw TCP mock servers in mcp tests answered one request per socket
but never advertised 'Connection: close', so reqwest pooled dead
keep-alive connections and retried POSTs failed under parallel-suite
load with 'connection closed before message completed' (~50% failure
rate on the full suite).
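
For the mock-server bullet above, a one-request-per-socket response would presumably need to look something like the sketch below. The helper name `one_shot_response` and the JSON content type are assumptions for illustration; the relevant part is the `Connection: close` header, which tells the client not to return the socket to its keep-alive pool.

```rust
/// Build a minimal HTTP/1.1 response for a mock server that answers exactly
/// one request per socket. (Hypothetical helper, not the actual test code.)
fn one_shot_response(status_line: &str, body: &str) -> String {
    format!(
        "HTTP/1.1 {status_line}\r\n\
         Content-Type: application/json\r\n\
         Content-Length: {}\r\n\
         Connection: close\r\n\
         \r\n\
         {body}",
        body.len()
    )
}
```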
Previously, when the host slept while a model response was streaming, the
connection died on wake with 'Stream read error: error decoding response
body' and the turn was lost. The engine now stamps every stream chunk with both
monotonic and wall-clock time; Instant pauses across a suspend while
SystemTime does not, so a >10s divergence on a stream error identifies a
sleep/wake cycle. In that case the partial output is discarded and the
identical request is re-issued (sharing the existing MAX_STREAM_RETRIES=3
budget) instead of failing the turn. Ordinary network flakes keep the
deliberate no-retry-after-content policy from #103.
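
The divergence check can be sketched roughly as follows. `ChunkStamp`, `diverged`, and `suspended_since` are hypothetical names, and the sketch relies on the same assumption as the text above: the monotonic clock does not advance during a suspend on the target platform.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Wall-clock time must run ahead of monotonic time by more than this
/// before a stream error is attributed to a sleep/wake cycle.
const SUSPEND_DIVERGENCE: Duration = Duration::from_secs(10);

/// Timestamp pair recorded for every stream chunk (hypothetical type).
#[derive(Clone, Copy)]
struct ChunkStamp {
    mono: Instant,
    wall: SystemTime,
}

impl ChunkStamp {
    fn now() -> Self {
        Self { mono: Instant::now(), wall: SystemTime::now() }
    }
}

/// True when the wall clock advanced more than the threshold beyond the
/// monotonic clock over the same interval.
fn diverged(mono_elapsed: Duration, wall_elapsed: Duration) -> bool {
    wall_elapsed.saturating_sub(mono_elapsed) > SUSPEND_DIVERGENCE
}

/// Check the time since the last chunk; intended to run on a stream error.
fn suspended_since(last: ChunkStamp) -> bool {
    let mono_elapsed = last.mono.elapsed();
    // `duration_since` errors if the wall clock moved backwards; treat that
    // as zero elapsed wall time, which can never look like a suspend.
    let wall_elapsed = SystemTime::now()
        .duration_since(last.wall)
        .unwrap_or(Duration::ZERO);
    diverged(mono_elapsed, wall_elapsed)
}
```

Using `saturating_sub` means a wall clock that was stepped backwards (for example by an NTP adjustment) yields zero divergence instead of panicking or triggering a retry.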
- system prompt environment key deepseek_version -> codewhale_version
- drop legacy .deepseek/instructions.md from the Local Law prompt tier
(the engine still reads it for back-compat)
- instructions-file truncation marker now states how many bytes were
omitted so the model knows what it is missing
- CODEWHALE_CHANGELOG const + user-facing /change strings
- codewhale metrics doc headers
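
A minimal sketch of the truncation-marker change in the list above. The function name and the exact marker wording are assumptions; the point is that the marker reports how many bytes were dropped.

```rust
/// Truncate `text` to at most `max_bytes` and append a marker stating how
/// many bytes were omitted. (Hypothetical helper and marker wording.)
fn truncate_with_marker(text: &str, max_bytes: usize) -> String {
    if text.len() <= max_bytes {
        return text.to_string();
    }
    // Back up to a char boundary so a multi-byte character is never split.
    // Index 0 is always a boundary, so the loop terminates.
    let mut cut = max_bytes;
    while !text.is_char_boundary(cut) {
        cut -= 1;
    }
    let omitted = text.len() - cut;
    format!("{}\n[truncated: {omitted} bytes omitted]", &text[..cut])
}
```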
The packaged changelog is now a recent-releases slice produced by
scripts/sync-changelog.sh (which gains a --check mode); also restore the
SECURITY.md contact line the version gate guards, and finish the stale
binary-name sweep (--bin codewhale examples, qa harness doc).
The /change command embeds this file into every binary via include_str!;
it now carries only recent releases (regenerated by scripts/sync-changelog.sh,
wired into the release checklist). The explicit-version test derives its
fixture versions from the embedded slice instead of hardcoding old ones.
Loads resolved hotbar slot bindings into app state and renders the Hotbar panel at the bottom of the sidebar (render layer only; key dispatch is a follow-up). Part of #2061, ref #2065.
Prompt, exec_shell description, and background schema now direct work expected to take >5 seconds (builds, test suites, servers, polling, sleeps) to task_shell_start/background=true. Addresses #2939.
allow_shell now rides the per-turn <runtime_prompt> tag alongside mode/approval; message[0] stays byte-stable across shell toggles and mode switches, preserving the DeepSeek prefix cache. Removes the shell-guidance line-filtering machinery; adds a static Shell Policy reference section.
When the CLI dispatcher forwards --api-key with DEEPSEEK_API_KEY_SOURCE=cli, that explicit override now wins over the saved root key for DeepSeek providers.
Event-driven task panel refresh on exec_shell_cancel/exec_shell_wait/task_cancel plus an immediate refresh after ShellJob actions, so the Tasks sidebar reflects cancellations without waiting for the periodic refresh. Addresses #2937.
The <runtime_prompt> tag includes a visibility="internal" attribute that
was listed in the tag format but never explained. Models sometimes
interpreted this as an instruction to announce or restate the current
mode to the user, leading to repetitive YOLO-mode confirmations before
every tool call (#2922).
Add a one-sentence explanation: the attribute marks this tag as a
runtime instruction for the model (not user input), and the model should
apply the referenced rules silently without announcing the mode.
Closes #2922
Move allow_shell from message[0] (static system prompt) to the per-turn
<runtime_prompt> tag alongside mode and approval. This preserves the
DeepSeek prefix cache across shell-access toggles and mode switches,
which previously forced YOLO entry/exit to rebuild the entire prompt.
Changes:
- Delete remove_shell_tool_guidance and 3 other dead functions (~60 lines)
- Remove allow_shell field from PromptSessionContext and StaticPromptCtx
- Remove shell_tools_available dead parameter from compose functions
- Add Shell Policy section to Runtime Policy Reference (static text)
- Extend <runtime_prompt> tag with allow_shell="true|false" attribute
- Update tests: omits→always_keeps, 83/83 prompts tests pass
- Drop dead compose_mode_prompt_with_approval_and_model
Net: message[0] bytes are now stable regardless of shell-access state.
Mode/approval/shell flags all use the same per-turn tag pattern.
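
To illustrate the per-turn tag pattern, here is a rough sketch. The attribute order, the self-closing form, the example mode and approval values, and the names `STATIC_SYSTEM_PROMPT` and `runtime_prompt_tag` are all assumptions; only the attribute names come from this change.

```rust
/// Stand-in for message[0]; it never depends on shell, mode, or approval
/// state, so its bytes stay stable across toggles. (Placeholder text.)
const STATIC_SYSTEM_PROMPT: &str = "...static prompt, including the Shell Policy reference...";

/// Build the per-turn tag that carries the volatile flags.
fn runtime_prompt_tag(mode: &str, approval: &str, allow_shell: bool) -> String {
    format!(
        r#"<runtime_prompt visibility="internal" mode="{mode}" approval="{approval}" allow_shell="{allow_shell}" />"#
    )
}

/// Messages for one turn: the stable prefix plus the per-turn tag.
fn turn_prefix(mode: &str, approval: &str, allow_shell: bool) -> (&'static str, String) {
    (STATIC_SYSTEM_PROMPT, runtime_prompt_tag(mode, approval, allow_shell))
}
```

Toggling shell access changes only the second element of the pair, so a prefix cache keyed on the first message would remain valid.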
Render configured/default hotbar slots at the bottom of the sidebar.
Load resolved hotbar bindings into app state, display them as compact sidebar rows, highlight active slots, and preserve unknown actions visibly.
Keep narrow sidebar rows within the available width so slots do not wrap or disappear.
Add focused sidebar hotbar render and layout tests.
Background shell task cancellation was unreliable because the Tasks
sidebar panel was not refreshed immediately after cancel actions.
Root cause:
- ShellJobAction::Cancel/CancelAll killed the process in ShellManager
but did not trigger a task_panel refresh, leaving stale "running"
entries until the next 2.5 s periodic poll.
- The tool-name refresh list at line 1734 missed exec_shell_cancel,
exec_shell_wait, and task_cancel.
Fix:
- Add refresh_active_task_panel() call after ShellJobAction dispatch.
- Add exec_shell_cancel, exec_shell_wait, task_cancel to the
immediate-refresh tool name list.
Tests:
- shell_manager_cancel_transitions_task_to_not_running
- task_panel_entry_roundtrips_status
Qwen 3.6 Plus already had full catalog/resolver/picker support. Add
dedicated provider-hinted resolution tests to close the remaining gap.
- Add qwen3_6_plus_resolves_to_canonical_on_openrouter test
- Add qwen3_6_plus_alias_qwen_dash_resolves test
- Both verify /model qwen3.6-plus resolves to qwen/qwen3.6-plus on OpenRouter
When the CLI dispatcher launches the interactive TUI with an explicit
`--api-key` argument (e.g. for a DeepSeek-compatible subscription
endpoint), the environment variable `DEEPSEEK_API_KEY` carries the
intended key with `DEEPSEEK_API_KEY_SOURCE=cli`. Previously the
saved root `api_key` in config.toml always won over this env override
for the DeepSeek provider, blocking users from running:
    codewhale --provider deepseek \
        --api-key ark-... --base-url https://... --model auto
This change gives the dispatcher-supplied env key priority when the
source marker is `cli` and keeps full backward compatibility for the
normal config-file and keyring paths. It also cleans up a `***`
literal in an unrelated test.
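
The precedence rule can be sketched as a small pure function. `resolve_api_key` and its parameters are hypothetical; the arguments stand in for `DEEPSEEK_API_KEY`, `DEEPSEEK_API_KEY_SOURCE`, and the saved root `api_key`. The env fallback when no saved key exists is an assumption about the pre-existing behavior.

```rust
/// Pick the API key for the DeepSeek provider. (Hypothetical helper.)
fn resolve_api_key<'a>(
    env_key: Option<&'a str>,
    env_source: Option<&str>,
    saved_root_key: Option<&'a str>,
) -> Option<&'a str> {
    let explicit_cli = env_source == Some("cli");
    match (env_key, saved_root_key) {
        // An explicit --api-key forwarded by the dispatcher wins outright.
        (Some(key), _) if explicit_cli => Some(key),
        // Otherwise the saved root key keeps its previous priority.
        (_, Some(saved)) => Some(saved),
        // With no saved key, fall back to whatever the environment holds.
        (env, None) => env,
    }
}
```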
Tightens the experimental OpenAI Codex (ChatGPT) provider so the v0.8.55
gate is green.
- clippy: collapse 5 nested if/if-let blocks flagged by
clippy::collapsible_if into let-chains (oauth.rs env-override
resolution, responses.rs SSE delta handling). cargo clippy --workspace
--all-targets -- -D warnings is now clean.
- fmt: cargo fmt --all over the Codex/Together changes (the gate's
--check was failing, incl. a mangled "| ApiProvider::Ollama").
- default model: Config::default_model() now resolves to the Codex
default (gpt-5.5) for the Codex provider instead of leaking a DeepSeek
default_text_model the Responses backend rejects. The carve-out sits
after the explicit provider-scoped model block (so
[providers.openai_codex] model still wins) and before the
DeepSeek-validating path, which is unchanged. Adds a behavior test.
Verified live against the ChatGPT Codex backend (real codex login):
`exec --model gpt-5.5` through the openai-codex provider returns a correct
completion. Fixes found while getting there:
- Route the non-streaming path too. create_message only dispatched chat
completions; for OpenAI Codex it now drives the Responses stream and folds it
into a MessageResponse (handle_responses_message), so `exec` and other
non-streaming callers use the same wire path as the interactive stream.
- Present a non-browser User-Agent on the Codex path. The ChatGPT backend sits
behind Cloudflare, which served a JS challenge (HTTP 403) to our browser-like
"Mozilla/5.0 (compatible; codewhale/...)" UA. A codex_cli_rs UA passes.
- Always send `instructions` (Responses rejects empty instructions); fall back
to a minimal system prompt.
- Map reasoning effort onto the Codex-allowed set (none/minimal/low/medium/
high/xhigh); CodeWhale's "auto" has no equivalent and maps to medium.
- Send `Accept: text/event-stream`.
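
The reasoning-effort mapping from the list above might look like the sketch below. The enum, its variant names, and `map_effort` are hypothetical, and sending unrecognized values to medium is an added assumption; the source only states that "auto" maps to medium.

```rust
/// Effort levels accepted on the Codex path (hypothetical enum).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CodexEffort {
    Disabled, // serialized as "none"
    Minimal,
    Low,
    Medium,
    High,
    XHigh,
}

impl CodexEffort {
    /// String sent on the wire for this level.
    fn as_wire(self) -> &'static str {
        match self {
            CodexEffort::Disabled => "none",
            CodexEffort::Minimal => "minimal",
            CodexEffort::Low => "low",
            CodexEffort::Medium => "medium",
            CodexEffort::High => "high",
            CodexEffort::XHigh => "xhigh",
        }
    }
}

/// Map a configured effort string onto the allowed set.
fn map_effort(setting: &str) -> CodexEffort {
    match setting.trim().to_ascii_lowercase().as_str() {
        "none" => CodexEffort::Disabled,
        "minimal" => CodexEffort::Minimal,
        "low" => CodexEffort::Low,
        "high" => CodexEffort::High,
        "xhigh" => CodexEffort::XHigh,
        // "medium", "auto", and (by assumption) anything unrecognized.
        _ => CodexEffort::Medium,
    }
}
```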
Antislop pass on the changeset:
- Inline the one-caller codex_access_token wrapper (config calls get_credentials
directly) and drop the one-caller credentials_present helper; both presence
checks now use auth_file_path().exists() consistently with the Kimi path.
- Remove dead stream-parser state (ToolCallState fields, unused response_id /
current_item_type / output_text / thinking_text accumulators).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Completes the in-progress OpenAI Codex provider and bumps the workspace to
0.8.55. Builds on the committed Together AI provider + model catalog work.
OpenAI Codex (ChatGPT) provider — experimental:
- Wire the previously-dead OAuth module into credential resolution. The TUI
config now resolves the access token via the Codex CLI login in
~/.codex/auth.json (env overrides OPENAI_CODEX_ACCESS_TOKEN/CODEX_ACCESS_TOKEN),
refreshing expired tokens synchronously via the OpenAI token endpoint —
mirroring the existing Kimi OAuth flow rather than introducing a new pattern.
- Send the ChatGPT backend's required headers from the Responses client
(chatgpt-account-id, OpenAI-Beta: responses=experimental, originator) and
stop duplicating the Authorization header already installed on the client.
- Fix the cli crate's non-exhaustive ProviderKind matches (compile blocker).
Consistency / de-slop pass (so the provider fits the whole app, not one path):
- has_api_key_for / active_provider_has_config_api_key now detect the Codex
OAuth login on disk, the same way they detect Kimi OAuth — a `codex login`
user is no longer reported as unauthenticated.
- Replace the bogus OPENAI_CODEX_API_KEY hint (which exists nowhere else) with
the real OPENAI_CODEX_ACCESS_TOKEN/CODEX_ACCESS_TOKEN in the auth-error and
picker surfaces.
- Drop dead state in the Responses stream parser (unused ToolCallState fields /
imports); tool-call data is streamed live.
- Update docs/PROVIDERS.md, config.example.toml, and the provider-metadata wire
test for the Responses wire format.
Release:
- Bump workspace + crates + npm package to 0.8.55; update CHANGELOG.md and
crates/tui/CHANGELOG.md.
Note: the live Responses round-trip has not been exercised against the
production ChatGPT backend in this environment; the provider ships as preview.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`pdf_extract::extract_text` uses an internal codepath that can hang on
certain PDF cross-reference tables or font encodings. The per-page
`extract_text_by_pages` path does not trigger this hang and produces
identical output when joined.
When `pages` is not specified, route through `extract_text_by_pages`
and join all pages instead of calling `extract_text`.
Fixes #2641.