dgf1988/codewhale

Author	SHA1	Message	Date
hongchen1993	c84292235f	feat(config): prefer dispatcher-provided API key over saved DeepSeek key when source is cli When the CLI dispatcher launches the interactive TUI with an explicit `--api-key` argument (e.g. for a DeepSeek-compatible subscription endpoint), the environment variable `DEEPSEEK_API_KEY` carries the intended key with `DEEPSEEK_API_KEY_SOURCE=cli`. Previously the saved root `api_key` in config.toml always won over this env override for the DeepSeek provider, blocking users from running: codewhale --provider deepseek \ --api-key ark-... --base-url https://... --model auto This change gives the dispatcher-supplied env key priority when the source marker is `cli`, keeping full backward compatibility for normal config-file or keyring paths, and also cleaning up a `***` literal in an unrelated test.	2026-06-09 11:48:59 +08:00
Hunter B	9463266cb1	ci(web): make Cloudflare deploy manual	2026-06-08 08:33:26 -07:00
Hunter B	0854425dc6	ci(web): deploy public site from main	2026-06-08 08:30:00 -07:00
Hunter B	7344b88eac	fix(web): sync frontend lockfile for CI	2026-06-08 08:28:10 -07:00
Hunter B	3d503a0a24	docs: bring public surface and npm-deferred install copy	2026-06-08 08:01:18 -07:00
Hunter B	c4ff9e5345	fix(release): allow asset publication despite docker failure	2026-06-08 07:47:48 -07:00
Hunter B	533b0f5766	fix(release): regenerate Cargo.lock for 0.8.54 workspace versions	2026-06-08 07:00:50 -07:00
Hunter B	78ae354fa4	chore(release): merge v0.9.0-stewardship into v0.8.54 Includes Paulo's command parity and Gherkin E2E harnesses, HUQIANTAO's concurrency/security fixes, LeoAlex0's runtime_prompt slim, reidliu41's hotbar persistence, HarmonyOS scaffolding, Whaleflow foundation crate, and all v0.9.0 stabilization work.	2026-06-08 06:54:09 -07:00
Hunter B	edd28066e1	chore(release): v0.8.54 — benchmark harness runners, MiMo routing	2026-06-08 06:47:21 -07:00
Hunter B	f88528a5a3	test(subagent): de-flake touch_refreshes_stale_running_agent_heartbeat The 1ms heartbeat timeout raced the synchronous touch()->cleanup() gap on loaded CI runners (Windows scheduler can deschedule >1ms), intermittently reaping the just-touched agent so cleanup() returned 1. Widen the timeout to 50ms and the staleness sleep to 150ms to keep the logic exercised without the timing race. Addresses CI flakiness under the v0.9.0 stabilization gate (#2721). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 10:49:36 -07:00
greyfreedom	17dbed13c7	feat(execpolicy): wire permissions.toml ask-rules into runtime Harvested from PR #2885 by @greyfreedom. Wires ask-rules into the app-server and core ExecPolicyEngine (previously inert). Removes the original PR's NeedsApproval arm that incorrectly allow-listed the working directory as a network host. Co-Authored-By: greyfreedom <11493871+greyfreedom@users.noreply.github.com>	2026-06-07 10:49:36 -07:00
Hunter B	4e3184eae9	fix(client): consume probe response body to return connection to pool Harvested from PR #2884 by @ousamabenyounes. Drops the orphan desktop tray.rs module (dead code, never wired) from that PR. Co-Authored-By: Ben Younes <2910651+ousamabenyounes@users.noreply.github.com>	2026-06-07 10:49:36 -07:00
Hunter B	e2b7d5e197	fix: harvest safe bug fixes from PR #2880 Harvests 7 safe fixes from PR #2880 by @HUQIANTAO: tool-name hex-digit guard, token-usage u32 clamp, read-file line usize::try_from, grep context-lines cap, UTF-8 PDF trim, run_skill dedup, and Volcengine/SiliconflowCn reasoning_content support. Excludes the DeepSeek stream-stop change and the unwired prompt_persist module (deferred for separate review). Co-Authored-By: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>	2026-06-07 10:49:36 -07:00
Hunter B	ab65495b0e	Merge PR #2781 from punkcanyang: opt-in ghost-text follow-up suggestion	2026-06-07 10:21:01 -07:00
Hunter B	8d329a434c	Merge PR #2869 from ousamabenyounes: list saved models from all providers in /model picker	2026-06-07 10:21:00 -07:00
Hunter B	b39e00e72b	Merge PR #2883 from HUQIANTAO: concurrency hardening (mutex recovery, join handles)	2026-06-07 10:21:00 -07:00
Hunter B	1a9549babd	Merge PR #2881 from HUQIANTAO: log instead of swallowing errors	2026-06-07 10:21:00 -07:00
Hunter B	4caa28772b	Merge PR #2882 from HUQIANTAO: security fixes in execution policy and approval mapping	2026-06-07 10:21:00 -07:00
Hunter B	face4dc27a	Merge PR #2877 from LeoAlex0: cache_inspect test spillover root	2026-06-07 10:21:00 -07:00
Hunter B	a54d08f28d	chore(fmt): rustfmt engine tests from PR #2874 Mechanical rustfmt of the runtime_prompt tests rewritten in PR #2874 (LeoAlex0). No logic change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 10:10:11 -07:00
Hunter B	3619962507	Merge PR #2874 from LeoAlex0: slim runtime_prompt to minimal tag	2026-06-07 10:09:21 -07:00
Hunter B	a42e9115b1	Merge PR #2873 from reidliu41: hotbar slot persistence	2026-06-07 10:09:21 -07:00
Hunter B	2c56f7761e	Merge PR #2887 from aboimpinto: Gherkin acceptance E2E harness	2026-06-07 10:04:12 -07:00
Hunter B	b0d9c3196b	Merge PR #2878 from aboimpinto: Layer 2 command parity harness	2026-06-07 10:04:08 -07:00
Ousama Ben Younes	97f6e0b2e5	fix(tui): use sort_by_key to satisfy clippy::unnecessary_sort_by	2026-06-07 15:17:13 +00:00
Paulo Aboim Pinto	c25f7af219	Address acceptance harness review feedback	2026-06-07 16:29:40 +02:00
Paulo Aboim Pinto	d90031f06f	Add Gherkin acceptance E2E harness example	2026-06-07 16:12:12 +02:00
huqiantao	bdf7b15bd7	revert: use std::thread::spawn for fire-and-forget hooks tokio::task::spawn_blocking requires a running tokio runtime, which breaks tests that call hook functions outside a tokio context. Since hooks are fire-and-forget (no JoinHandle needed), std::thread::spawn is the correct choice.	2026-06-07 19:59:17 +08:00
huqiantao	5cab1517a4	fix: add tracing dependency to app-server crate Required for tracing::error! in persist_config error handling.	2026-06-07 19:56:06 +08:00
huqiantao	5e761a616c	fix: collapse nested if-let to satisfy clippy::collapsible_if lint	2026-06-07 19:55:38 +08:00
huqiantao	3c197d707b	fix: add sse_task field to SseTransport test initializer The test at line 4768 was missing the new sse_task field added to SseTransport. Add a dummy tokio::spawn task for the test.	2026-06-07 19:48:09 +08:00
huqiantao	9aa71e24c0	chore: update Cargo.lock for tracing dependency in core crate	2026-06-07 19:47:36 +08:00
huqiantao	4dd0a47c05	style: apply cargo fmt formatting	2026-06-07 19:46:24 +08:00
huqiantao	265b8ee142	fix: add tracing dependency to core crate and apply cargo fmt - Add tracing.workspace = true to crates/core/Cargo.toml (required for tracing::warn! in lib.rs:752) - Apply cargo fmt formatting to engine.rs, mcp.rs, tool_execution.rs, config/lib.rs	2026-06-07 19:46:02 +08:00
huqiantao	27ca87251e	fix: use Box<dyn Write + Send> for cross-platform tracing writer Replace platform-specific std::os::unix::io::FromRawFd with Box<dyn std::io::Write + Send> return type. This compiles on Windows, macOS, and Linux without unsafe code. The closure now returns a boxed writer that is either: - The cloned file handle (success case) - A reopened file handle (clone failed) - stderr (last resort, prevents panic)	2026-06-07 19:35:59 +08:00
huqiantao	75593a0eac	fix: address security review comments 1. Fix whitespace bypass in normalize_command (execpolicy/lib.rs:446) - Collapse internal whitespace to prevent 'git  status' (extra spaces) bypassing a rule for 'git status' - split_whitespace().join(' ') normalizes all whitespace 2. Fix 'never'/'deny' approval mapping (app-server/lib.rs:287) - Map to AskForApproval::Never instead of OnRequest - 'never'/'deny' should forbid commands, not prompt for approval 3. Optimize prefix matching (execpolicy/lib.rs:355, bash_arity.rs:375) - Avoid format! allocation on every check - Use byte comparison for space boundary check	2026-06-07 19:35:20 +08:00
huqiantao	eb3a989eeb	fix: address review comments on engine.rs 1. Replace let-else with if-let-Some to avoid compilation error - let-else with return would return from the entire function - if-let-Some correctly assigns to tool_registry and continues 2. Preserve original goal_objective_for_prompt behavior - Return None (not fallback) when objective exists but goal is inactive - Use state.is_active().then() to match original semantics	2026-06-07 19:33:52 +08:00
huqiantao	4304c89d65	fix: concurrency bugs - mutex handling, thread spawning, and resource management 1. Fix Mutex lock().unwrap() in MCP server (mcp_server.rs:384,434) - Use unwrap_or_else(|e| e.into_inner()) to recover from poisoned locks - Previously, a single panic while holding the lock would cascade to all threads 2. Fix std::thread::spawn in async code (hooks.rs:1055) - Replace std::thread::spawn with tokio::task::spawn_blocking - Respects tokio's thread pool limits instead of creating unbounded OS threads - Fire-and-forget hook execution now properly managed by tokio runtime 3. Fix dropped JoinHandle in SSE loop (mcp.rs:647) - Store the JoinHandle in SseTransport struct - Enables detection of SSE loop termination - Prevents silent connection loss without structured error reporting 4. Fix std::sync::Mutex poison handling in cost_status (cost_status.rs:28-58) - Use unwrap_or_else(|e| e.into_inner()) to recover from poisoned locks - Previously, a panic while holding the lock silently lost all subsequent cost data - Cost tracking now survives mutex poisoning 5. Fix .expect() in tracing writer (runtime_log.rs:162) - Replace expect() with fallback chain: try_clone -> reopen file -> stderr - Prevents panicking inside tracing subscriber on fd exhaustion - Previously, EMFILE during logging would crash the application	2026-06-07 19:18:19 +08:00
huqiantao	27fac5d704	fix: security bugs in execpolicy, app-server, and tools 1. Fix deny rule prefix matching without word boundary (execpolicy/lib.rs:351-353) - Deny rule 'rm' now blocks 'rm -rf /' but NOT 'rmdir' or 'rmview' - Previously used bare starts_with which matched any command starting with 'rm' - Add word-boundary check: command must equal rule or start with rule+space 2. Fix fallback prefix match clarity (execpolicy/bash_arity.rs:362-374) - Improve comment to clarify word-boundary matching behavior - The trailing space in starts_with already provides word boundary 3. Fix hardcoded AskForApproval::OnRequest in HTTP API (app-server/lib.rs:283) - Read approval_policy from config instead of hardcoding OnRequest - Users with 'auto'/'yolo' policy now get UnlessTrusted for API calls - Previously ignored user's configured security posture 4. Fix fuzzy indentation search destroying preceding text (tools/file.rs:714-735) - When match starts mid-line after whitespace stripping, use exact position - Previously always expanded to line start, destroying preceding content - Now only expands to line start when match is at a line boundary 5. Fix potential underflow in apply_hunk start index (tools/apply_patch.rs:1110-1115) - Use checked_add_signed to safely handle negative cumulative_offset - Prevents isize overflow on adversarial patch input - Clamp to lines.len() instead of relying on .max(0) cast	2026-06-07 19:13:43 +08:00
huqiantao	ef4dc5ca61	fix: error handling bugs - log instead of silently swallowing errors 1. Fix swallowed persist_config errors (app-server/lib.rs:882,896) - Log errors when config persistence fails after set/unset - Users previously got success response even when disk write failed 2. Fix swallowed job store load error (core/lib.rs:751) - Add warning log when job store fails to load at startup - Previously silently started with empty job list on corruption 3. Fix silent config parse failures (config/lib.rs:1590) - Log warning when project config TOML is malformed - Previously returned None indistinguishable from 'no config file' 4. Fix MCP connect_all errors swallowed (mcp.rs:2151,2189) - Log warnings for each server that fails to connect - Previously returned incomplete resource list with no indication 5. Fix error context stripped in engine status (core/engine.rs:2223) - Use {err:#} format to include full error chain - Was inconsistent with line 2234 which already used {err:#} 6. Fix tool audit log failures silently dropped (tool_execution.rs:122-136) - Log each failure: serialization, directory creation, file open, write - Previously silently dropped all errors for security audit trail 7. Fix Err(_) arms discarding error info (runtime_log.rs:179, runtime_threads.rs:828) - Log stderr redirect failures on Windows - Log poisoned mutex in pending_approvals 8. Fix env var parsing errors silently ignored (config/lib.rs:2519-2530) - Warn when DEEPSEEK_TELEMETRY, DEEPSEEK_YOLO, DEEPSEEK_HTTP_HEADERS have invalid values instead of silently treating as unset 9. Fix MCP config reload errors swallowed (mcp.rs:2011) - Log config reload errors instead of complete silence 10. Fix .expect() on sub-agent runtime (core/engine.rs:1715) - Gracefully fall back to basic tool set when API client missing - Previously panicked if subagents enabled but no client configured 11. Fix .expect() on goal objective (core/engine.rs:2543) - Use safe if-let pattern instead of check+expect - Prevents panic if refactoring changes control flow	2026-06-07 19:04:47 +08:00
zLeoAlex	55d7499408	test: add runtime_policy_reference composition test, strengthen ChangeMode tests, fix outdated comments - Add runtime_policy_reference_is_included_in_full_prompt test to verify that render_runtime_policy_reference() output lands in the composed system prompt. Guards against silent breakage if the push_str() call is accidentally removed (all existing tests would still pass). - Strengthen change_mode_op_updates_current_mode_and_emits_status: destructure SessionUpdated to assert that session messages do NOT contain <runtime_prompt> tags after mode change — verifying the core invariant that Op::ChangeMode does not write session history. - Extend current_mode_field_assignment_takes_effect_synchronously: now also verifies that messages_with_turn_metadata() produces the correct runtime tag (mode="yolo" approval="auto") after a mode switch, covering the tag-generation mechanism end-to-end. - Fix outdated comments in composed_prompt_no_longer_inlines_tool_taxonomy and plan_prompt_taxonomy_omits_run_tests: replace stale references to deleted <mode_prompt> metadata with accurate descriptions of the ## Runtime Policy Reference section.	2026-06-07 18:31:36 +08:00
Paulo Aboim Pinto	acaae1c2e5	test(tui): address command harness review	2026-06-07 12:24:13 +02:00
Paulo Aboim Pinto	96bff65797	test(tui): add command parity harness	2026-06-07 11:43:57 +02:00
zLeoAlex	256f34c621	fix(cache): set temp spillover root in cache_inspect test to survive nix sandbox The test cache_inspect_displays_tool_result_budget_metadata relied on a writable $HOME/.codewhale/tool_outputs/ for tool-result wire-dedup persistence. nix build sandboxes have a read-only home tree, so the first tool-result SHA spillover write failed, the dedup hash table was never populated, and the second identical tool result was not marked deduplicated — causing the expect("repeat tool-result sighting should report dedup metadata") assertion to fail. Set TEST_SPILLOVER_ROOT to a tempdir inside the test (matching the with_tool_result_sha_spillover_root pattern in chat.rs), so the wire-dedup path works in any environment without depending on $HOME.	2026-06-07 16:06:38 +08:00
zLeoAlex	7b900b8699	test(cache): rename misleading test — does not exercise Op::ChangeMode dispatch - Rename mode_change_op_updates_current_mode_and_emits_session_updated to current_mode_field_assignment_takes_effect_synchronously. - The test directly mutates engine.current_mode, not through Op::ChangeMode. The dispatch path is separately covered by change_mode_op_updates_current_mode_and_emits_status.	2026-06-07 15:26:54 +08:00
zLeoAlex	c6c3d2cc4d	refactor(cache): inline single-call helpers, remove dead code - Inline mode_prompt_marker_value and approval_prompt_marker_value into runtime_prompt_text (each called exactly once). - Remove default_approval_mode_for_mode — zero callers.	2026-06-07 15:22:53 +08:00
zLeoAlex	039abb2ae6	refactor(cache): remove render_core_tool_taxonomy_block, inline to body variant - Replace the 2 remaining test callers with render_core_tool_taxonomy_body (neither test depends on the ## heading — they check content only). - Delete render_core_tool_taxonomy_block — zero production callers after the previous refactor.	2026-06-07 15:20:51 +08:00
zLeoAlex	12167b39c3	refactor(cache): replace taxonomy_body strip hack with source-level render_core_tool_taxonomy_body - Add render_core_tool_taxonomy_body(mode) that generates the tool taxonomy text without the ## Core Tool Taxonomy heading. - Refactor render_core_tool_taxonomy_block to use the body function internally (DRY). - Delete taxonomy_body() — a downstream strip_prefix hack that worked around the source format instead of fixing it. - Also removes the now-unnecessary debug_assert! (over-defensive, since the two functions are co-located in the same file).	2026-06-07 15:19:27 +08:00
zLeoAlex	0b5d574e63	fix(cache): address CR feedback — blank lines, heading hierarchy, debug_assert - Add proper blank lines (\n\n) before mode headings in render_runtime_policy_reference (CommonMark/GFM compliance). - Demote subheadings in agent.md from ##### to ###### so they nest correctly under the demoted main heading. - Add debug_assert! in taxonomy_body() to loudly fail when render_core_tool_taxonomy_block format changes, preventing silent heading-hierarchy breakage.	2026-06-07 15:15:12 +08:00
zLeoAlex	427bd5d52f	feat(cache): slim runtime_prompt to minimal tag, move policy descriptions to system prompt - Add render_runtime_policy_reference() in prompts.rs containing all mode and approval policy descriptions in the frozen system-prompt prefix (sent once per session, cache-hit thereafter). - Simplify runtime_prompt_text() from ~500-token XML block to a ~16-token self-closing tag (<runtime_prompt visibility="internal" mode="..." approval="..."/>). - Fix markdown heading hierarchy in all prompts/modes/.md and prompts/approvals/.md (## → #####) to nest correctly under ####. - Remove now-unused legacy functions: mode_prompt(), approval_prompt_for_mode(), mode_change_runtime_message(). - Simplify Op::ChangeMode: no longer persists a mode_change event (next turn tag carries the current mode). - Update and rename affected tests. Builds on #2801. Reduces per-request runtime prompt overhead by 97% (~471 tokens saved per API call). System prompt grows by ~1325 tokens in the frozen prefix (one-time miss cost); break-even at 3 API calls.	2026-06-07 15:03:43 +08:00
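
Note on 427bd5d52f: the "break-even at 3 API calls" figure follows from the two token counts quoted in the commit message (~1325 tokens added once to the frozen prefix, ~471 tokens saved per call). A minimal sketch of that arithmetic, assuming a hypothetical helper `break_even_calls` that is not part of the repository:

```rust
/// Number of API calls after which a one-time token cost is repaid by a
/// fixed per-call saving. Hypothetical helper, for illustration only.
/// Returns None when there is no saving, since the cost is then never repaid.
fn break_even_calls(one_time_cost: u64, saving_per_call: u64) -> Option<u64> {
    if saving_per_call == 0 {
        return None;
    }
    // Ceiling division: a partially repaid call still counts as a full call.
    Some((one_time_cost + saving_per_call - 1) / saving_per_call)
}
```

With the commit's approximate figures, `break_even_calls(1325, 471)` yields `Some(3)`: two calls save 942 tokens, which is still below 1325, while three calls save 1413.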

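
Note on 27fac5d704 and 75593a0eac: both commits describe deny-rule matching that requires a word boundary and normalizes whitespace before comparing. The sketch below illustrates that behaviour under stated assumptions. It is not the code from execpolicy/lib.rs; `matches_rule` is a hypothetical name, and `normalize_command` here is a simplified stand-in for the function of the same name mentioned in the commit message.

```rust
/// Collapse every run of whitespace into a single space and trim the ends,
/// so "git   status" and "git status" compare equal.
fn normalize_command(command: &str) -> String {
    command.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// True when `command` equals `rule`, or starts with `rule` followed by a
/// space. A bare `starts_with` would also match "rmdir" against rule "rm".
fn matches_rule(rule: &str, command: &str) -> bool {
    let rule = normalize_command(rule);
    let command = normalize_command(command);
    if rule.is_empty() {
        return false;
    }
    match command.strip_prefix(rule.as_str()) {
        // Byte comparison for the boundary check avoids building "rule " with format!.
        Some(rest) => rest.is_empty() || rest.as_bytes().first() == Some(&b' '),
        None => false,
    }
}
```

Under this sketch a rule `rm` matches `rm -rf /` and plain `rm`, but not `rmdir foo`, and a rule `git status` still matches a command written with extra spaces between the two words.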

2456 Commits
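
Note on 4304c89d65: the commit replaces `lock().unwrap()` with `unwrap_or_else(|e| e.into_inner())` so that a poisoned `std::sync::Mutex` no longer cascades panics to every later caller. A minimal sketch of the pattern, assuming a hypothetical cost counter (`add_cost` and `poison` are illustrative names, not functions from cost_status.rs):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Add to a shared total, recovering the guard even if an earlier holder
/// panicked and poisoned the mutex. Returns the updated total.
fn add_cost(total: &Mutex<u64>, amount: u64) -> u64 {
    let mut guard = total
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner());
    *guard += amount;
    *guard
}

/// Poison the mutex on purpose by panicking in another thread while the
/// lock is held. Used only to demonstrate the recovery path.
fn poison(total: &Arc<Mutex<u64>>) {
    let shared = Arc::clone(total);
    let _ = thread::spawn(move || {
        let _guard = shared.lock().unwrap();
        panic!("simulated failure while holding the lock");
    })
    .join();
}
```

After `poison` runs, `is_poisoned()` reports true, yet `add_cost` keeps working because `PoisonError::into_inner` hands back the guard. The trade-off is that the protected data may reflect a half-finished update from the panicking thread, which is acceptable for a simple counter but worth reviewing for richer state.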