Commit Graph

2529 Commits

Author SHA1 Message Date
Claude 2bbe25b7e4 fix: #3034 audit residuals — correct the constitution renderer doc-comment (renderer NOT reconciled with committed MD, do not regenerate); pin the collapsed-thinking header assertion; align auth list's openai-codex credential source with auth status
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 02:45:25 +00:00
Hunter Bown 83ef677355 style: cargo fmt
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-10 16:40:14 -07:00
Hunter Bown 940ea28756 fix(codex): land function_call_output, schema sanitization, and reasoning effort mapping
- responses.rs: add ToolResult→function_call_output conversion for user-role
  blocks (previously silently dropped, breaking multi-turn tool calling)
- responses.rs: extract codex_responses_reasoning_effort helper — maps
  'maximum'→xhigh, off/none→omit, unknown→medium
- responses.rs: tool_to_responses_function now clones and sanitizes schema
  via schema_sanitize::sanitize_for_responses before sending parameters
- schema_sanitize.rs: add sanitize_for_responses — forces root type:object,
  merges root composition properties (oneOf/anyOf/allOf), removes root
  enum/not, ensures properties object, prunes dangling required
- schema_sanitize.rs: add merge_root_composition_properties helper
- schema_sanitize.rs: add 3 responses_sanitize_* tests

Refs: #2984 #2955
2026-06-10 15:30:47 -07:00
Hunter Bown 5bea28e4e2 feat(tui): split sidebar 'Model reasoning' from 'Background commands'; provider-aware effort labels
- Add TaskPanelEntryKind enum (Background, ModelReasoning) to app.rs
- Add ReasoningEffort::display_label_for_provider() — maps Max→xhigh for Codex
- Sidebar: new 'Model reasoning' section above 'Background commands'
- Transcript: rename 'thinking' → 'reasoning' label
- Model picker: effort rows use provider-aware labels
- Subagent routing: tag entries with kind::Background
- Tests: kind assertions in task entry + effort display label tests

Refs: #2984 (sidebar+reasoning surface for Codex and all-models theme)
2026-06-10 15:30:41 -07:00
Hunter Bown 4d4cbd36a2 v0.8.58: constitution refactor + rebrand fixes + provider error improvement
- Constitution: new preamble (A / Rule Number 6), personality tier removed
  (8 tiers instead of 9), elevated constitutional prose with shall/shall-not
- YAML constitution (constitution.yaml) as structured source of truth with
  indentation encoding tier precedence
- Python renderer (render_constitution.py) for YAML -> markdown conversion
- prompts.rs: load constitution.md instead of base.md + calm.md overlay
- #2664: state.db default path prefers .codewhale/ over .deepseek/ with
  legacy fallback so existing installs keep session history
- #2644/#2664: update stale doc comments referencing deepseek paths
- #3007: provider rejection error now shows source (CLI flag vs config),
  lists supported providers, and gives specific fix instructions
2026-06-10 10:02:49 -07:00
Hunter B b23067bacd release: v0.8.57 — sleep-resume turns, docker fix, one-command release prep, changelog diet 2026-06-10 00:02:51 -07:00
Hunter B f9c9764265 fix(tests): deflake prompt cache and MCP SSE mock-server tests
- prompt_persist tests wrote through the developer's real
  ~/.codewhale/prompt_cache and raced each other (the eviction test could
  delete another test's entry). cache_dir() now honors
  CODEWHALE_PROMPT_CACHE_DIR and the tests run serialized against private
  tempdirs.
- The raw TCP mock servers in mcp tests answered one request per socket
  but never advertised 'Connection: close', so reqwest pooled dead
  keep-alive connections and retried POSTs failed under parallel-suite
  load with 'connection closed before message completed' (~50% failure
  rate on the full suite).
2026-06-10 00:00:54 -07:00
Hunter B ddd5df4b9b chore: drop stale allow(dead_code) on AgentOpenTool (registered since v0.8.33) 2026-06-09 23:45:58 -07:00
Hunter B c58ef8ddff feat(release): generate the GitHub Release body from the CHANGELOG entry
The workflow hardcoded install boilerplate plus a contributor list that
had already drifted (v0.8.56's release thanked people 'for shaping
v0.9.0'). The body now comes from scripts/release/generate-release-body.sh:
static install/verify sections plus the tagged version's changelog section,
which already carries the per-release credits.
2026-06-09 23:44:57 -07:00
Hunter B 4465459b69 feat(release): one-command version bump via prepare-release.sh; close version-drift gaps
- scripts/release/prepare-release.sh bumps workspace + crate pins + npm
  wrapper + README install tags, refreshes Cargo.lock, regenerates the
  TUI changelog slice and web facts, then runs check-versions.sh
- check-versions.sh now also gates web/lib/facts.generated.ts and the
  README install-tag examples (both drifted silently before)
- .cnb.yml validates the pushed tag against Cargo.toml before generating
  mirror release notes
- RELEASE_CHECKLIST/RUNBOOK updated accordingly (v0.8.56 needed 9 fix
  commits for exactly these sync points)
2026-06-09 23:43:15 -07:00
Hunter B 717d728163 feat: survive system sleep mid-turn — detect the suspend gap and retry the request (#2990)
When the host sleeps while a model response is streaming, the connection
dies on wake with 'Stream read error: error decoding response body' and
the turn was lost. The engine now stamps every stream chunk with both
monotonic and wall-clock time; Instant pauses across a suspend while
SystemTime does not, so a >10s divergence on a stream error identifies a
sleep/wake cycle. In that case the partial output is discarded and the
identical request is re-issued (sharing the existing MAX_STREAM_RETRIES=3
budget) instead of failing the turn. Ordinary network flakes keep the
deliberate no-retry-after-content policy from #103.
2026-06-09 23:40:42 -07:00
Hunter B fad04a016f polish: finish the rebrand in agent-facing surfaces
- system prompt environment key deepseek_version -> codewhale_version
- drop legacy .deepseek/instructions.md from the Local Law prompt tier
  (the engine still reads it for back-compat)
- instructions-file truncation marker now states how many bytes were
  omitted so the model knows what it is missing
- CODEWHALE_CHANGELOG const + user-facing /change strings
- codewhale metrics doc headers
2026-06-09 23:36:32 -07:00
Hunter B 6dcdc19077 chore: drop unused deps (tracing-appender, zeroize, rustls in release), orphaned vendor lockfile and one-off verify_task.sh 2026-06-09 23:33:47 -07:00
Hunter B 44c13eb63f fix(release): check-versions validates the generated TUI changelog slice, not byte equality
The packaged changelog is now a recent-releases slice produced by
scripts/sync-changelog.sh (which gains a --check mode); also restore the
SECURITY.md contact line the version gate guards, and finish the stale
binary-name sweep (--bin codewhale examples, qa harness doc).
2026-06-09 23:32:40 -07:00
Hunter B 626032ad6b fix(devcontainer): use codewhale user/name/mount instead of pre-rebrand deepseek 2026-06-09 23:30:53 -07:00
Hunter B 26947bd407 fix(docker): stop copying legacy deepseek binaries that no longer exist
The release docker job failed on 'cp: cannot stat .../release/deepseek'.
Legacy entrypoints survive as symlinks to the codewhale binaries.
2026-06-09 23:30:45 -07:00
Hunter B 68aee8409f chore: move HarmonyOS clang wrappers to scripts/ohos/ 2026-06-09 23:24:13 -07:00
Hunter B 4a4ea63820 chore: add CODEOWNERS and dependabot config 2026-06-09 23:23:50 -07:00
Hunter B a2bf9f806a chore: gitignore benchmark results/ and __pycache__ under scripts/ 2026-06-09 23:23:37 -07:00
Hunter B 6551106e79 docs: move internal design docs into docs/rfcs/ 2026-06-09 23:23:25 -07:00
Hunter B 854274de1d docs: remove internal US VM setup notes 2026-06-09 23:23:03 -07:00
Hunter B bf2e5504a2 docs: remove stale internal docs (handoffs, old audits, orphaned roadmaps) 2026-06-09 23:22:32 -07:00
Hunter B b4edb4e1ef build: generate crates/tui/CHANGELOG.md as a 15-release slice instead of a full manual copy
The /change command embeds this file into every binary via include_str!;
it now carries only recent releases (regenerated by scripts/sync-changelog.sh,
wired into the release checklist). The explicit-version test derives its
fixture versions from the embedded slice instead of hardcoding old ones.
2026-06-09 23:22:00 -07:00
Hunter B 258d75376c docs: archive changelog entries v0.8.39 and older to docs/CHANGELOG_ARCHIVE.md 2026-06-09 23:19:11 -07:00
Hunter B bcaf655316 fix: bump npm wrapper version, add v0.8.56 compare link, resync TUI CHANGELOG 2026-06-09 22:35:50 -07:00
Hunter B 6a2ec32c00 fix: bump all internal crate dependency versions to 0.8.56 2026-06-09 22:30:27 -07:00
Hunter B 710b094c8a fix: sync crates/tui/CHANGELOG.md with workspace root for v0.8.56 test gate 2026-06-09 22:26:24 -07:00
Hunter B 5783b05b0a fix: update Cargo.lock after version bump to 0.8.56 2026-06-09 22:14:45 -07:00
Hunter B ac56bd85b9 polish: drop deferred npm language from INSTALL.md, bump to v0.8.56 2026-06-09 22:12:41 -07:00
Hunter B f1c73bda9b polish: bump CNB install tags, remove deferred npm language, regen web facts to 0.8.56 2026-06-09 22:12:00 -07:00
Hunter B a6feb79e50 chore: bump workspace version to 0.8.56 2026-06-09 22:06:05 -07:00
Hunter B 3218bb00da release: v0.8.56 CHANGELOG, contributor credits, web version bumps
14 merged PRs across localization, providers, search, shell safety,
prefix-cache stability, PDF fixes, paste handling, and bug fixes.

Contributors: gordonlu, hongchen1993 (new), cyq1017, idling11, LeoAlex0,
h3c-hexin, sximelon, HUQIANTAO, reidliu41.

Reverted hotbar #2945 (render-only, no key dispatch wired up).
2026-06-09 22:05:02 -07:00
Hunter B 44d5aa496d Revert " feat(tui): render hotbar in sidebar"
This reverts commit c90049dd31.
2026-06-09 22:04:14 -07:00
Hunter Bown 3c97ba0799 Merge PR #2945 from reidliu41: render hotbar panel in sidebar
Loads resolved hotbar slot bindings into app state and renders the Hotbar panel at the bottom of the sidebar (render layer only; key dispatch is a follow-up). Part of #2061, ref #2065.
2026-06-09 21:17:13 -07:00
Hunter Bown 40f9a8bf17 Merge PR #2947 from cyq1017: guide >5s shell work to background tasks
Prompt, exec_shell description, and background schema now direct work expected to take >5 seconds (builds, test suites, servers, polling, sleeps) to task_shell_start/background=true. Addresses #2939.
2026-06-09 21:17:00 -07:00
Hunter Bown 45a9bff750 Merge PR #2905 from cyq1017: name the allow_shell blocker in shell tool errors
Missing-shell-tool errors now state that allow_shell = false is the cause, with the existing /config remediation hint. Related #2657.
2026-06-09 21:16:47 -07:00
Hunter Bown f12863f826 Merge PR #2949 from LeoAlex0: decouple allow_shell from static system-prompt prefix
allow_shell now rides the per-turn <runtime_prompt> tag alongside mode/approval; message[0] stays byte-stable across shell toggles and mode switches, preserving the DeepSeek prefix cache. Removes the shell-guidance line-filtering machinery; adds a static Shell Policy reference section.
2026-06-09 21:08:23 -07:00
Hunter Bown f118dff940 Merge PR #2896 from gordonlu: localize status picker surface
Part of the i18n localization batch. Locale strings + MessageId wiring; no logic changes.
2026-06-09 20:09:38 -07:00
Hunter Bown f4ab469bd2 Merge PR #2891 from gordonlu: localize approval dialog surface across 7 locales
Part of the i18n localization batch. Locale strings + MessageId wiring; no logic changes.
2026-06-09 20:09:16 -07:00
Hunter Bown ddae5cfe36 Merge PR #2923 from hongchen1993: allow Volcengine provider in TUI dispatcher
Adds Volcengine to the TUI-supported provider gate and verifies Ark key forwarding.
2026-06-09 20:08:02 -07:00
Hunter Bown dbff899f16 Merge PR #2928 from hongchen1993: prefer dispatcher-provided API key over saved root key
When the CLI dispatcher forwards --api-key with DEEPSEEK_API_KEY_SOURCE=cli, that explicit override now wins over the saved root key for DeepSeek providers.
2026-06-09 20:07:59 -07:00
Hunter Bown 082ba162c3 Merge PR #2946 from h3c-hexin: update Bocha web search response handling
Correct /v1/web-search endpoint, parse webPages.value with legacy pages fallback, surface business error codes; unit tests added.
2026-06-09 20:06:58 -07:00
Hunter Bown 0475c33c1d Merge PR #2951 from LeoAlex0: explain visibility="internal" in Runtime Policy Reference
Clarifies that the runtime_prompt tag is internal instruction, not user input, and that modes should not be announced. Related #2922.
2026-06-09 20:06:46 -07:00
Hunter Bown 607ee1796d Merge PR #2920 from sximelon: write oversized pastes to .codewhale/pastes/
Moves large-paste files from the legacy .deepseek/pastes to .codewhale/pastes, tests updated. Fixes #2914.
2026-06-09 20:06:33 -07:00
Hunter Bown a9a91053fa Merge PR #2930 from idling11: Qwen 3.6 Plus resolution tests
Tests-only: cover qwen3.6-plus and qwen-3.6-plus alias resolution on OpenRouter. Related #2908.
2026-06-09 20:06:20 -07:00
Hunter Bown edadd1e6db Merge PR #2941 from idling11: sync task panel after background shell cancel
Event-driven task panel refresh on exec_shell_cancel/exec_shell_wait/task_cancel plus an immediate refresh after ShellJob actions, so the Tasks sidebar reflects cancellations without waiting for the periodic refresh. Addresses #2937.
2026-06-09 20:05:20 -07:00
Hunter Bown 0e3c4a3dd6 Merge PR #2898 from idling11: avoid PDF read hang via extract_text_by_pages
Fixes the pdf-extract full-document hang on certain xref tables/font encodings by using the per-page extraction path. Closes #2641.
2026-06-09 20:04:15 -07:00
Hunter Bown fb341a9460 Merge PR #2880 from HUQIANTAO: fix 9 bugs across tools, client, and commands
UTF-8 boundary panic in clean_pdf_text, SSE ContentBlockStop index, skills TOCTOU double-call, token-count truncation, strict 6-digit hex decode in from_api_tool_name, response-body read errors surfaced, context_lines clamp, u64→usize try_from in read_file, SiliconflowCn/Volcengine reasoning-content support.
2026-06-09 20:03:45 -07:00
zLeoAlex 88ce926525 fix(prompts): explain visibility="internal" attribute in Runtime Policy Reference
The <runtime_prompt> tag includes a visibility="internal" attribute that
was listed in the tag format but never explained.  Models sometimes
interpreted this as an instruction to announce or restate the current
mode to the user, leading to repetitive YOLO-mode confirmations before
every tool call (#2922).

Add a one-sentence explanation: the attribute marks this tag as a
runtime instruction for the model (not user input), and the model should
apply the referenced rules silently without announcing the mode.

Closes #2922
2026-06-10 01:48:48 +08:00
zLeoAlex 9206eb9d73 refactor(prompts): decouple allow_shell from static system-prompt prefix
Move allow_shell from message[0] (static system prompt) to the per-turn
<runtime_prompt> tag alongside mode and approval.  This preserves the
DeepSeek prefix cache across shell-access toggles and mode switches,
which previously forced YOLO entry/exit to rebuild the entire prompt.

Changes:
- Delete remove_shell_tool_guidance and 3 other dead functions (~60 lines)
- Remove allow_shell field from PromptSessionContext and StaticPromptCtx
- Remove shell_tools_available dead parameter from compose functions
- Add Shell Policy section to Runtime Policy Reference (static text)
- Extend <runtime_prompt> tag with allow_shell="true|false" attribute
- Update tests: omits→always_keeps, 83/83 prompts tests pass
- Drop dead compose_mode_prompt_with_approval_and_model

Net: message[0] bytes are now stable regardless of shell-access state.
Mode/approval/shell flags all use the same per-turn tag pattern.
2026-06-10 01:44:34 +08:00