dgf1988/codewhale

Author	SHA1	Message	Date
Hunter B	fb86737a8c	test(settings): assert migrated settings display canonical path Extend the #2730 settings migration harvest with the missing platform-config fallback display assertion from review, and keep the v0.9 execution map/changelog credit current. Validation: cargo fmt --all -- --check; git diff --check; cargo test -p codewhale-tui --bin codewhale-tui --locked settings_ -- --nocapture; cargo test -p codewhale-tui --bin codewhale-tui --locked display_localizes_header_and_config_file_label -- --nocapture. Harvested from PR #2730 by @xyuai. Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>	2026-06-03 21:02:46 -07:00
Hunter B	23c9481af1	feat: add HarmonyOS OpenHarmony support Harvest the HarmonyOS/OpenHarmony port from PR #2634 and make it publish-safe by target-gating unsupported host dependencies out of the OHOS TUI graph. Self-update is disabled on OHOS, PTY shell mode reports unsupported, and Starlark execpolicy parsing returns an explicit unsupported-platform error until upstream starlark/rustyline/nix support catches up. Add OHOS SDK setup docs and launcher scripts, install the rustls ring provider for rustls-no-provider entrypoints, and keep the packaged codewhale-tui OHOS graph free of starlark, rustyline, nix@0.28, portable-pty, and arboard. Validation: cargo fmt --all -- --check; git diff --check; git diff --cached --check; cargo check -p codewhale-cli --locked; cargo check -p codewhale-app-server --locked; cargo check -p codewhale-tui --locked; cargo test -p codewhale-cli --locked update::tests::; cargo test -p codewhale-release --locked; cargo test -p codewhale-tui --locked background_tty_command_has_controlling_terminal; cargo test -p codewhale-tui --locked clipboard; cargo package -p codewhale-tui --allow-dirty --no-verify --locked; packaged OHOS cargo tree checks. OHOS target check still requires a loaded OpenHarmony SDK/sysroot and currently stops in ring with missing assert.h when CC/CFLAGS/linker are unset. Harvested from PR #2634 by @shenjackyuanjie. Co-authored-by: shenjackyuanjie <54507071+shenjackyuanjie@users.noreply.github.com>	2026-06-03 21:02:46 -07:00
HUQIANTAO	98edba3683	refactor(engine): append turn metadata after user text Place user text before volatile turn metadata in outgoing user-message content arrays so provider prefix caches can continue matching the stable user-input prefix across date, model-route, and working-set changes. Also adds wire-level coverage proving tail-positioned turn metadata serializes after user text while preserving turn-meta deduplication. Harvested from PR #2517 by @HUQIANTAO Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>	2026-06-03 21:02:45 -07:00
Hunter B	60f8e7d62e	refactor(web_run): split cache locks for page reads Harvested from PR #2502 by @HUQIANTAO Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>	2026-06-03 21:02:45 -07:00
Hunter B	311eb4002b	feat(tui): add bounded restore snapshot listing Harvested from PR #2513 by @cyq1017. Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>	2026-06-03 21:02:45 -07:00
Hunter B	8502784218	fix(xiaomi-mimo): use token-plan api-key auth Harvests the MiMo Token Plan auth-header behavior from #2627 while keeping Xiaomi env-key precedence unchanged so standard endpoints do not accidentally receive a Token Plan key. Harvested from PR #2627 by @xyuai. Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>	2026-06-03 21:02:45 -07:00
Hunter B	159f509dd6	fix(tui): invalidate fanout card rows on sibling starts	2026-06-03 21:01:39 -07:00
jrcjrcc	7b2a7e513d	fix: Windows sub-agent completion halves TUI render width Root cause: AgentComplete unconditionally calls resume_terminal() even when the terminal was never paused, causing a secondary EnterAlternateScreen on Windows that creates a new buffer whose width may differ from the window width. Additionally, ColorCompatBackend had no terminal_size cache, so size() fell through to crossterm::terminal::size() which on Windows returns the WinAPI buffer width rather than the window width. Changes: - AgentComplete: add event_broker.is_paused() guard - resume_terminal(): cache real terminal size before reset_viewport - Resize handler: also set terminal_size alongside forced_size - subagent_routing: 3x mark_history_updated -> bump_history_cell(idx) - color_compat: add terminal_size field, set_terminal_size(), fix size() fallback priority (forced_size > terminal_size) - tests: 3 unit tests for size() fallback chain Review feedback addressed: - forced_size now takes priority over terminal_size (gemini-code-assist) - Redundant map lookups removed in subagent_routing (both bots) - set_terminal_size moved before reset_terminal_viewport (greptile-apps) (cherry picked from commit 4463c46644a6e485e7e20dc2b19c29c2e8eb3c5c)	2026-06-03 21:01:39 -07:00
Hunter B	4401f7a2e5	feat(tools): hide legacy subagent and shell aliases from model catalog (#2683) Subagent aliases: - Legacy names (agent_spawn, agent_result, agent_cancel, resume_agent, agent_list, agent_send_input, agent_assign, agent_wait, delegate_to_agent) are already NOT registered — they exist as dead code with #[allow(dead_code)] since v0.8.33 - Add test verifying model catalog only advertises canonical subagent tools: agent_open, agent_eval, agent_close, tool_agent Shell aliases: - Hide exec_wait from model catalog (legacy alias for exec_shell_wait) - Hide exec_interact from model catalog (legacy alias for exec_shell_interact) - Both remain callable for saved transcript replay - Add test verifying shell aliases are hidden but callable Verification: cargo test -p codewhale-tui --locked (4040 passed), cargo clippy -D warnings	2026-06-03 21:01:38 -07:00
Implementist	88422f3ad3	fix(plan_prompt): pre-wrap CJK+Latin mixed text to avoid forced line-breaks at script boundaries Wrap plan steps via wrap_text() before rendering, breaking only on display-width overflow, not on Latin/CJK Unicode word boundaries. Switch main render path from Wrap { trim: true } to Wrap { trim: false } since all content is pre-wrapped. Replace wrapped_line_count() with lines.len() for accurate scroll bounds. Keep confirm-exit dialog on Wrap { trim: true } (English-only, no risk).	2026-06-03 21:01:38 -07:00
Implementist	966b5cf1fb	refactor(plan_prompt): use display-width in wrap_text, skip wasted render work - wrap_text: replace chars().count() with UnicodeWidthStr::width() so CJK text is wrapped by display columns, consistent with wrapped_line_count and ratatui's Paragraph::wrap. Also fix the hard-split loop to use exclusive byte ranges (..end) instead of inclusive (..=i) so multi-byte UTF-8 prefixes are always valid. - render: hoist the confirming_exit branch to an early return so the plan-content construction (lines, scroll bounds, footer) is skipped entirely when the confirmation dialog is visible.	2026-06-03 21:01:38 -07:00
Implementist	e3a52555eb	fix(plan_prompt): clear pending_g on Esc, deduplicate render_modal_chrome - Clear pending_g when Esc triggers the exit-confirmation prompt so a stray 'g' press does not leak into and survive the confirmation dialog. - Move render_modal_chrome into the else branch so only one call fires per render pass, eliminating a shadow artifact when confirming_exit is active.	2026-06-03 21:01:38 -07:00
Implementist	47c071a0d5	chore: apply cargo fmt fix to plan_prompt.rs	2026-06-03 21:01:38 -07:00
Implementist	6d79d55b6c	fix(plan_prompt): use display-width for leading spaces and de-hardcode wrap width - wrapped_line_count: compute leading-space width via UnicodeWidthStr instead of byte length, so non-ASCII leading whitespace is measured correctly. - render: hoist popup_area / content_width computation above plan rendering so wrap_text can share the same content_width derived from the actual popup geometry instead of a magic 68.	2026-06-03 21:01:38 -07:00
Implementist	11c448d66e	fix(plan_prompt): remove step truncation to allow content overflow into scroll region	2026-06-03 21:01:38 -07:00
Implementist	537a8bccf3	fix(tui): replace manual div_ceil with usize::div_ceil to satisfy clippy lint	2026-06-03 21:01:38 -07:00
Implementist	14db9b2466	fix(tui): avoid spurious exit-confirmation on short plan after scroll key Use clamped (effective) scroll instead of raw `self.scroll` in the Esc handler so a short plan that fits entirely (max_scroll == 0) never triggers the "exit without implementing?" dialog when the user pressed a scroll key (PgDn/Ctrl-D/G/End) beforehand.	2026-06-03 21:01:38 -07:00
Implementist	1669e3c12e	fix(tui): address code review feedback on plan prompt modal - Use word-wrapping-aware line count to prevent underestimating scroll range (gemini-code-assist / greptile-apps) - Merge PLAN_OPTIONS, PLAN_SHORTCUTS, PLAN_SHORT_LABELS into PlanOption struct (gemini-code-assist) - Remove dead Esc code in handle_key (greptile-apps) - Guard gg/G with modifier checks (gemini-code-assist) - Increase PgUp/PgDn scroll amount from 6 to 12 (greptile-apps) - Use u16::try_from for scroll value to avoid silent truncation (greptile-apps) - Update related unit tests for new scroll values	2026-06-03 21:01:38 -07:00
Implementist	68784cff52	fix(tui): add scroll support to plan prompt modal - Add scroll state field to PlanPromptView with PgUp/PgDn, Ctrl+U/D/F/B, Home/End, gg/G vim-style keybindings - Show scroll indicator footer when content overflows the popup - Add confirming_exit state: Esc while scrolled asks for confirmation before discarding, preventing accidental exits on long plans - Clamp scroll in render() so overscroll doesn't hide bottom options - Use wrapped_line_count() with UnicodeWidthStr for accurate overflow detection with CJK characters - Add 11 unit tests covering scroll, keybindings, and exit confirmation	2026-06-03 21:01:38 -07:00
HUQIANTAO	863f55cc68	perf(history): simplify output_rows cache API and switch to FNV-1a Three follow-ups to the previous perf commit: 1. Drop the rows_hash field on CacheEntry. The field was computed and stored but never read on the hot path; tests exercised it only to assert the cache returned a stable hash. After this change get_or_compute_rows returns just Vec<OutputRow>, halving the tuple-return ABI and removing one DefaultHasher::write pass on every cache miss. 2. Replace DefaultHasher (SipHash) with a hand-rolled FNV-1a 64-bit hash. SipHash is per-process-keyed and ~5-10x slower than FNV on the small-to-medium tool output strings we see at 120 FPS. FNV-1a has no per-process key, fits in 20 lines of pure-Rust, and a 64-bit collision space is more than wide enough for the per-process LRU's expected <= a few hundred entries. The cache is a correctness optimization, not a security boundary; collisions only cause a false miss, never wrong data. 3. Caller in tui::history::render_preserved_output_mode updated to the new Vec<OutputRow>-only signature. Two new tests cover the FNV-1a properties (length-suffix sensitivity, empty-input stability).	2026-06-03 21:01:38 -07:00
HUQIANTAO	3b0ef3f63c	perf(history): cache output_rows and selected_output_indices per cell output_rows (in tui::history) walks the raw tool output, ANSI-strips each line, classifies path/URL-like rows, and wraps the rest to the current viewport width. selected_output_indices then computes the head/tail/importance subset that the compact Live view shows. Both functions are pure, but they are called on every render frame for every visible tool cell. For a 4 KB tool output on a 120 FPS render loop that is 2-6 redundant walks per frame, per cell, and the function is called from a non-trivial number of cells across exec, tool, command, and review history. Add tui::output_rows_cache, a thread-local, content-addressed cache keyed on (content_hash, width) for the rows and (content_hash, width, line_limit) for the indices. The cache stores the wrapped Vec<OutputRow> plus a per-line-limit map of selected indices on a single entry, so a single key lookup satisfies both render steps. render_preserved_output_mode now consults the cache for both the rows and the indices; on a hit, neither the per-line ANSI strip nor the importance-ranking pass runs. The cache is bounded (default capacity 256) with insertion-order eviction. The OutputRow struct gains PartialEq + Eq + pub fields so the cache module can store and hash it without exposing private internals. Tests: 6 new unit tests cover the hit/miss path, width invalidation, content invalidation, indices per-line_limit caching, capacity eviction, and hash stability. The wider tui::history test suite (68 tests) still passes.	2026-06-03 21:01:38 -07:00
HUQIANTAO	c0b36824c2	perf(capacity): let scan_canonical_inputs early-exit without verified-user lookup The build_canonical_state path never reads CanonicalStateScan::latest_verified_user_idx, but the previous patch required is_complete() to find a verified user message before it would short-circuit. On a long history with no verification replay — the common case — the scan walked the entire message list looking for a match that could not exist. Add a find_verified: bool parameter to scan_canonical_inputs and CanonicalStateScan::is_complete. build_canonical_state now passes false, so the loop stops as soon as the goal and CANONICAL_SCAN_MAX_FACTS facts are found. The replan path (apply_verify_and_replan) keeps the existing true behavior so it still locates the latest verified user message. Test calls are updated to match; no behavior change for any test.	2026-06-03 21:01:38 -07:00
HUQIANTAO	837a6f8c54	perf(capacity): collapse build_canonical_state's reverse scans to one pass build_canonical_state previously did two independent reverse walks of session.messages — one to extract the most recent user goal, and one to collect up to four confirmed-fact snippets. apply_verify_and_replan then added a third and fourth reverse scan to locate the latest user message and the latest [verification replay] user message for the re-plan path. All four reverse scans collect disjoint facts about the same most- recent-first view of the conversation. This PR folds them into a single helper, scan_canonical_inputs, that walks messages once in reverse, fills a CanonicalStateScan, and short-circuits as soon as every collector is satisfied. The helper exposes the latest-message indices so apply_verify_and_replan can clone the full Message values after the scan (eliminating the two independent find().cloned() walks). The output CanonicalState is byte-identical to the prior implementation: same goal, same confirmed facts (newest first, errors filtered), same fallback string when no user text exists. The re-plan path's keep-messages set is identical: latest user + latest verified. Tests: 6 new unit tests cover the goal lookup, fact cap, error-result filter, verified-marker scan, empty input, and the early-exit condition. The full engine test suite (153 tests) still passes.	2026-06-03 21:01:38 -07:00
HUQIANTAO	e3adc98baf	perf(prefix-cache): fold tool.strict into identity hash, share cache with PrefixFingerprint::compute Three follow-ups to the previous perf commit: 1. Correctness: tool.strict participates in the wire format emitted by tool_to_api_json, so it MUST participate in the cache identity. Two catalogs that differ only in strict would otherwise collide and serve a stale SHA-256, silently busting prefix-cache stability on the wire. 2. Allocation: replace the per-tool serde_json::to_string in tool_set_identity with a hash_json_value helper that walks the JSON tree directly. For a 60-tool catalog this drops ~25-40 KB of transient allocation per cache miss. 3. Dead code: the previous patch introduced PrefixFingerprint::compute, CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a thread-local cache helper that were not used outside tests. With -D warnings in CI all four triggered dead-code errors. The compute helper is now only built in cfg(test); the rest are marked #[allow(dead_code)] with comments explaining their observability and test-only use.	2026-06-03 21:01:37 -07:00
HUQIANTAO	baef5ba95d	perf(prefix-cache): cache tool-catalog JSON serialization across checks PrefixFingerprint::compute is called once per turn by the turn loop prefix-stability check. The tool-side work serializes every tool to the chat-API JSON shape, sorts the resulting strings, joins with newlines, and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of allocation plus a sort, all of which produces a byte-identical output once the tool set is stable across turns (the common case after the first turn of a session). Introduce a process-local ToolCatalogCache that stores the joined+sorted catalog under a content-derived u64 identity (length + per-tool name + description + serialized input_schema). On a hit, the per-tool JSON serialization, sort, and join are skipped entirely — the pre-computed SHA-256 hex digest is returned directly. The cache lives on PrefixStabilityManager (per-session ownership) and backs a new PrefixFingerprint::compute_with_tool_cache entry point. check_and_update, PrefixStabilityManager::new, and pin() all use the cached path. The original compute() is kept as a fallback for callers that do not have a cache in hand (e.g. CLI tools that build a one-shot fingerprint). The cache is bounded (default capacity = 8) and uses insertion-order eviction, matching the eviction strategy already in transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload and MCP attach paths. Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on hit), identity collisions, schema change detection, capacity eviction, invalidate, empty slice, and the equivalence between cached and uncached fingerprints. The full 30-test prefix_cache suite passes; the wider prefix-cache contract tests in settings, prompts, and core::engine::tests continue to pass.	2026-06-03 21:01:37 -07:00
HUQIANTAO	3de07a99ed	perf(engine): memoize estimated_input_tokens via content-keyed cache The token estimator walks the full session.messages and the active system prompt. Five call sites per turn in the engine (capacity pre/post tool checkpoints, error escalation, the seam manager, the trim budget check) plus four TUI/command consumers (footer, /status, /debug, context inspector) all re-walked the same data independently. On a 200-message history with 5 KB of tool results that is roughly 2 ms per call, or ~20 ms of pure waste on a single turn. Introduce a process-local TokenEstimateCache keyed on (session.messages_revision, system_prompt_fingerprint). Repeated calls with the same inputs return the cached value without re-walking the message list. The cache invalidates as soon as either input changes: * session.messages_revision is a monotonic counter bumped in Session::add_message, Session::replace_messages, the new Session::bump_messages_revision helper, and at every direct session.messages mutation site in core/engine.rs and core/engine/capacity_flow.rs. * system_prompt_fingerprint is a stable 64-bit hash of the SystemPrompt::Text or SystemPrompt::Blocks payload. Also restructures layered_context_checkpoint to compute the estimated token count before taking a long-lived &SeamManager borrow, and re-routes the capacity pre/post tool checkpoints to compute the observation into a local before calling capacity_controller.observe_*. Both refactors are required to satisfy the borrow checker once estimated_input_tokens requires &mut self. Tests: 10 new unit tests cover the miss/hit path, revision bumps, system-prompt changes, audit-ring capacity, and downward-revision no-ops. The full 157-test engine suite still passes.	2026-06-03 21:01:37 -07:00
xyuai	9e15805f64	fix(settings): tighten legacy path migration coverage	2026-06-03 19:24:37 -07:00
Hunter B	f7a602cd20	feat(tools): hide todo_* aliases from model catalog, add deprecation metadata (#2682) - Add model_visible() hook to ToolSpec trait (default true) - Override model_visible() -> false on todo_write, todo_add, todo_update, todo_list - Checklist variants remain model-visible as the canonical surface - Legacy todo_* calls still work for saved transcript replay - Return _deprecation metadata with use_instead and removed_in=0.9.0 - Update prompts to recommend checklist_* only - Update TOOL_SURFACE.md with v0.9.0 deprecation notes - Add tests for hidden catalog, compat alias behavior, and metadata Verification: cargo test -p codewhale-tui -- todo, cargo clippy -D warnings	2026-06-03 19:20:23 -07:00
Hunter Bown	8dff2f7525	fix(tui): guard xiaomi mimo defaults test against CI env vars	2026-06-03 16:25:04 -07:00
Hunter Bown	772ec46c98	chore(release): v0.8.53 — Arcee support, telegram bridge, provider fixes - Fix Rust syntax/clippy fallout in client.rs, cli/src/lib.rs, web_search.rs - Fix 0.8.53 release metadata: changelog links, TUI changelog, npm wrapper - Update visible help copy for multi-provider support - Add telegram-bridge integration with deploy configs - Add US remote VM quickstart doc - Update Tencent Cloud deploy scripts and docs - Bump npm wrapper to 0.8.53	2026-06-03 16:12:38 -07:00
RefuseOdd	8b0e1cc3c0	Limit path suffix to chat completions	2026-06-03 15:34:24 -07:00
RefuseOdd	d2999bb402	Add path_suffix to ProviderConfigToml and ProviderConfig Adds an optional path_suffix field that lets users override the API path for OpenAI-compatible endpoints. When set, the suffix replaces the default /v1/<path> pattern, enabling use with endpoints that don't accept /v1/ prefixes (e.g. /chat/completions instead of /v1/chat/completions). Changes: - ProviderConfigToml (config crate): path_suffix field - ProviderConfig (tui crate): path_suffix field - merge_provider_config: propagates path_suffix - merge_project_provider_config: propagates path_suffix - api_url: delegates to new api_url_with_suffix function - api_url_with_suffix: uses suffix when present, skips /v1 versioning - DeepSeekClient: reads path_suffix from config, passes to URL builder - config.example.toml: documents the new option - Tests for the new URL building behavior Closes #2089	2026-06-03 15:34:24 -07:00
cyq	45562822f0	feat(agent): classify model families	2026-06-03 15:34:12 -07:00
reidliu41	195dd6b9ab	fix(tui): hide shell prompt guidance when shell is disabled Thread allow_shell into system prompt composition and remove shell-only guidance when shell tools are not available. This keeps the prompt aligned with the runtime tool catalog and prevents the model from trying exec_shell or task_shell_* after allow_shell = false.	2026-06-03 15:28:29 -07:00
xyuai	dba332e8d5	fix(tui): persist provider switches to config	2026-06-03 15:28:17 -07:00
Hunter Bown	260ee737b0	style: cargo fmt	2026-06-03 15:18:19 -07:00
Hunter Bown	be7a3e7e69	fix(tui): provider picker r shortcut with modifier guard - add r/R shortcut to re-enter API key for any provider in picker - guard against Ctrl/Alt/Meta modifiers (only plain r triggers) - dynamic footer: 'apply' when key exists, 'set key' otherwise - add 'R edit key' hint to picker footer - add route/model to scoped auth status output - add tests for r shortcut, ctrl-r guard, footer text, and route/model Ports #2717 with review fix. Fixes #2662.	2026-06-03 15:14:39 -07:00
Hunter Bown	5719301d1e	fix(auth): all-provider auth status and scoped logout - auth status shows every known provider with config/keyring/env status - auth status --provider <id> shows detailed single-provider info - auth list now probes keyring for all providers (was only active) - /logout clears only the active provider's key (was clearing all) - add clear_active_provider_api_key for scoped TOML key removal - add Huggingface to ProviderArg enum - add auth status tests for all-provider and scoped views Fixes #2716	2026-06-03 15:08:28 -07:00
Hunter Bown	d9ca5fbbff	docs(tui): mirror v0.8.53 changelog	2026-06-03 14:43:08 -07:00
Hunter Bown	28a0f19c13	fix(provider): polish v0.8.53 routing and shell gating	2026-06-03 14:40:25 -07:00
Hunter Bown	5786584767	chore(release): bump workspace to 0.8.53	2026-06-03 12:39:01 -07:00
Hunter Bown	ed4ec3f799	Merge branch 'codex/v0.8.53-deprecate-whale-md' into codex/v0.8.53	2026-06-03 12:38:00 -07:00
Hunter Bown	8bc994e492	Merge branch 'codex/v0.8.53-tool-deferred-ux' into codex/v0.8.53	2026-06-03 12:37:53 -07:00
Hunter Bown	a10e17a62a	fix(context): prefer global AGENTS over WHALE	2026-06-03 12:37:39 -07:00
Hunter Bown	f5c8d7e5c5	fix(subagent): align advertised role aliases	2026-06-03 12:37:39 -07:00
Hunter Bown	025089494b	fix(rlm): include session object in source hints	2026-06-03 12:37:39 -07:00
Hunter Bown	fc8ad7b3a8	feat(project): enrich repo constitution (invariants, branch policy, escalation) Per the layered-authority clarification (base myth → global Constitution → repo constitution = local law → task packet → runtime policy), extend .codewhale/constitution.json beyond authority+verification with optional: - protected_invariants — repo invariants the agent must not break - branch_policy — branch/release policy in effect - escalate_when — conditions to stop and escalate to the user All optional; rendered as concise model-facing prose. The global Brother Whale identity anchor and Constitution in prompts/base.md are unchanged (verified untouched on this branch). Dogfood constitution.json filled with CodeWhale's real invariants (prefix-cache byte-stability, transcript replay, stable Rust, cli/tui parity), branch policy (codex/v0.8.53), and escalation rules. Docs note the layered hierarchy. cargo test -p codewhale-tui --bins → 3946 passed; clippy clean.	2026-06-03 12:16:06 -07:00
Hunter Bown	9d9616e898	feat(project): deprecate WHALE.md; add .codewhale/constitution.json authority layer Splits repo-level guidance into two clear artifacts and deprecates the confusing WHALE.md concept (overlapped with AGENTS.md): - AGENTS.md is the canonical cross-agent project-instructions file. - .codewhale/constitution.json is the CodeWhale-specific repo authority / prioritization policy (when local sources conflict, which to trust first; what to verify before claiming done). Rendered into the system prompt as a higher-authority <codewhale_repo_constitution> block; takes precedence over a legacy WHALE.md. WHALE.md migration (compat-preserving): - AGENTS.md now ranks above WHALE.md in both project and global discovery; with both present, AGENTS.md wins. - WHALE.md is still read as a legacy fallback, but now emits a deprecation warning and is never created or recommended (init.rs no longer suggests it). - Discovery/docs updated; the global CodeWhale Constitution in prompts/base.md is unaffected (different thing). constitution.json: - New RepoConstitution (serde, all fields optional, unknown fields ignored, schema_version checked). Discovered at .codewhale/constitution.json in the workspace or any parent up to the git root. Malformed JSON warns, never panics. - Loaded after the auto-generate fallback so it can't be clobbered. .gitignore: ignore .codewhale/ contents at any depth EXCEPT the committed constitution.json (a directory exclude can't be negated, so */.codewhale/ + negation). init.rs writes the same pattern for new repos. Dogfood: this repo's .codewhale/constitution.json added. find_git_root made pub(crate) and reused (no duplicate loader). Tests: AGENTS-over-WHALE precedence, WHALE legacy-read-with-warning, constitution render + system-block surfacing, malformed-constitution warning, gitignore-keeps-constitution. cargo test -p codewhale-tui --bins → 3946 passed; clippy clean. Targets codex/v0.8.53.	2026-06-03 12:12:34 -07:00
Hunter Bown	7bbc6b78e4	fix(tools): activate read-only git history + actionable RLM/field errors v0.8.53 tool/deferred/error UX (PR group 4), low-risk subset: - #2654: add git_log and git_show to DEFAULT_ACTIVE_NATIVE_TOOLS so read-only git history joins git_diff/git_status in the active partition (kept alphabetical → prefix-cache head stays sorted/byte-stable). git_blame and other history tools remain deferred. - #2655: rlm_open's source-count error now echoes common misnamed fields with a "did you mean file_path/content/url" hint; rlm_eval's missing-`code` error explains it runs raw Python and shows an example. Schema descriptions for rlm_eval name/code sharpened. - #2659: likely_field_corrections gains RLM source-field rename hints (the role/type vocabulary change itself lives in the WS3 PR #2684 to avoid a double-edit of normalize_role_alias). Deferred to the medium-risk batch: #2648 (render deferred-tool hydration distinctly from "done") — needs a ToolStatus/cell-build change with wider render blast radius than this low-risk PR. Verification: cargo test -p codewhale-tui --bins → 3944 passed, 0 failed (incl. prefix-cache sort invariant); cargo clippy clean. Targets codex/v0.8.53.	2026-06-03 11:31:33 -07:00
Hunter Bown	725abeb603	fix(subagent): clearer role vocab, lifecycle signals, and eval ergonomics Make the sub-agent surface easier for less-capable models to drive: - Unify role/type vocabulary (#2649): normalize_role_alias now accepts the full set SubAgentType::from_str accepts (reviewer/implementer/verifier/...), and SubAgentType::from_str learns `planner`, so the dual-validation pass no longer rejects natural roles with a stale four-value hint. Error strings and schema descriptions now enumerate the real accepted aliases. - agent_eval/agent_close always active (#2605) so a first call executes instead of hydrating its schema and forcing a double-invoke; both accept an `agent_name` session alias (#2650). - Self-diagnosing name conflicts (#2656): the duplicate-name error names the conflicting agent_id and its status. - Self-describing completion sentinels (#2658): subagent.done now carries result_clipped / summary_complete / next_action so the parent knows whether to trust the previous-line summary or call agent_eval. - Actionable child-model-unavailable diagnostics (#2653): a provider 403/404 is annotated with the model id and recovery path instead of a bare error. Tests: role vocabulary acceptance + error wording, agent_name resolution, duplicate-name diagnostics, clipped-result sentinel, child-model annotation, agent_eval/agent_close default-active. Full tui suite green (3948), clippy clean. Targets codex/v0.8.53 (v0.8.53 stabilization).	2026-06-03 11:22:56 -07:00
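
Note on commit 863f55cc68: the message describes replacing DefaultHasher with a hand-rolled FNV-1a 64-bit hash for the output_rows cache key. The repository's actual function is not shown in this listing, so the following is only a minimal sketch of standard FNV-1a over a byte slice. The name `fnv1a_64` and the constant names are hypothetical, and any length suffix or width mixing the real cache key may apply is not reproduced here.

```rust
// Standard 64-bit FNV-1a parameters (from the published FNV specification).
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Hypothetical helper: hash a byte slice with 64-bit FNV-1a.
/// FNV-1a XORs each byte into the state first, then multiplies by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        // wrapping_mul keeps the arithmetic modulo 2^64 without overflow panics.
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}
```

Because the hash has no per-process key, the same input yields the same value on every run; for example, `fnv1a_64(b"")` returns the offset basis unchanged. As the commit notes, this suits a cache key but not a security boundary.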

1671 Commits
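
Note on commit d2999bb402: the message says that an optional path_suffix replaces the default /v1/<path> pattern for OpenAI-compatible endpoints. A rough sketch of that URL-building rule might look like the following. The function name `api_url_with_suffix` appears in the commit message, but the signature, parameter names, and slash handling below are assumptions for illustration, not the repository's code.

```rust
/// Hypothetical signature: build a request URL from a base URL, a default
/// path, and an optional user-supplied suffix.
fn api_url_with_suffix(base_url: &str, path: &str, path_suffix: Option<&str>) -> String {
    // Normalize slashes so the joined URL has exactly one separator.
    let base = base_url.trim_end_matches('/');
    match path_suffix {
        // Suffix present: use it as-is and skip the /v1 version segment.
        Some(suffix) => format!("{}/{}", base, suffix.trim_start_matches('/')),
        // No suffix: fall back to the default /v1/<path> pattern.
        None => format!("{}/v1/{}", base, path.trim_start_matches('/')),
    }
}
```

Under these assumptions, a provider configured with a suffix of `/chat/completions` would be called at `<base>/chat/completions`, while a provider without one keeps `<base>/v1/chat/completions`.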