Commit Graph

2424 Commits

Author SHA1 Message Date
Hunter B 1a9549babd Merge PR #2881 from HUQIANTAO: log instead of swallowing errors 2026-06-07 10:21:00 -07:00
Hunter B 4caa28772b Merge PR #2882 from HUQIANTAO: security fixes in execution policy and approval mapping 2026-06-07 10:21:00 -07:00
Hunter B face4dc27a Merge PR #2877 from LeoAlex0: cache_inspect test spillover root 2026-06-07 10:21:00 -07:00
Hunter B a54d08f28d chore(fmt): rustfmt engine tests from PR #2874
Mechanical rustfmt of the runtime_prompt tests rewritten in PR #2874
(LeoAlex0). No logic change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 10:10:11 -07:00
Hunter B 3619962507 Merge PR #2874 from LeoAlex0: slim runtime_prompt to minimal tag 2026-06-07 10:09:21 -07:00
Hunter B a42e9115b1 Merge PR #2873 from reidliu41: hotbar slot persistence 2026-06-07 10:09:21 -07:00
Hunter B 2c56f7761e Merge PR #2887 from aboimpinto: Gherkin acceptance E2E harness 2026-06-07 10:04:12 -07:00
Hunter B b0d9c3196b Merge PR #2878 from aboimpinto: Layer 2 command parity harness 2026-06-07 10:04:08 -07:00
Paulo Aboim Pinto c25f7af219 Address acceptance harness review feedback 2026-06-07 16:29:40 +02:00
Paulo Aboim Pinto d90031f06f Add Gherkin acceptance E2E harness example 2026-06-07 16:12:12 +02:00
huqiantao 5cab1517a4 fix: add tracing dependency to app-server crate
Required for tracing::error! in persist_config error handling.
2026-06-07 19:56:06 +08:00
huqiantao 5e761a616c fix: collapse nested if-let to satisfy clippy::collapsible_if lint 2026-06-07 19:55:38 +08:00
huqiantao 9aa71e24c0 chore: update Cargo.lock for tracing dependency in core crate 2026-06-07 19:47:36 +08:00
huqiantao 4dd0a47c05 style: apply cargo fmt formatting 2026-06-07 19:46:24 +08:00
huqiantao 265b8ee142 fix: add tracing dependency to core crate and apply cargo fmt
- Add tracing.workspace = true to crates/core/Cargo.toml
  (required for tracing::warn! in lib.rs:752)
- Apply cargo fmt formatting to engine.rs, mcp.rs, tool_execution.rs, config/lib.rs
2026-06-07 19:46:02 +08:00
huqiantao 75593a0eac fix: address security review comments
1. Fix whitespace bypass in normalize_command (execpolicy/lib.rs:446)
   - Collapse internal whitespace to prevent 'git  status' bypassing 'git status'
   - split_whitespace().join(' ') normalizes all whitespace

2. Fix 'never'/'deny' approval mapping (app-server/lib.rs:287)
   - Map to AskForApproval::Never instead of OnRequest
   - 'never'/'deny' should forbid commands, not prompt for approval

3. Optimize prefix matching (execpolicy/lib.rs:355, bash_arity.rs:375)
   - Avoid format! allocation on every check
   - Use byte comparison for space boundary check
2026-06-07 19:35:20 +08:00
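The whitespace normalization and the allocation-free boundary check described in 75593a0eac can be sketched as follows. This is a minimal illustration, not the code in execpolicy/lib.rs: the name `matches_rule` and both signatures are assumptions. Note that std's `split_whitespace()` has no `join` of its own, so the sketch collects into a `Vec` first.

```rust
/// Collapse every run of whitespace to a single space and trim both ends,
/// so that "git  status" and "git status" normalize to the same string.
/// (Hypothetical signature; the real normalize_command may differ.)
fn normalize_command(command: &str) -> String {
    command.split_whitespace().collect::<Vec<_>>().join(" ")
}

/// True when `command` equals `rule`, or starts with `rule` followed by a
/// space. The boundary is checked with a single byte comparison instead of
/// building `format!("{rule} ")` on every call.
/// (Hypothetical helper name, for illustration only.)
fn matches_rule(command: &str, rule: &str) -> bool {
    command.starts_with(rule)
        && (command.len() == rule.len() || command.as_bytes()[rule.len()] == b' ')
}
```

The index into `as_bytes()` cannot go out of bounds here: `starts_with` has already established that `command` is at least as long as `rule`, and the equal-length case short-circuits before the index is evaluated.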
huqiantao eb3a989eeb fix: address review comments on engine.rs
1. Replace let-else with if-let-Some to avoid compilation error
   - let-else with return would return from the entire function
   - if-let-Some correctly assigns to tool_registry and continues

2. Preserve original goal_objective_for_prompt behavior
   - Return None (not fallback) when objective exists but goal is inactive
   - Use state.is_active().then() to match original semantics
2026-06-07 19:33:52 +08:00
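The second point of eb3a989eeb (return None when an objective exists but the goal is inactive) can be illustrated with `bool::then`. The `Goal` struct and its fields below are hypothetical stand-ins; only the function name and the `is_active().then()` shape come from the commit message.

```rust
/// Hypothetical goal state, for illustration only.
struct Goal {
    objective: Option<String>,
    active: bool,
}

impl Goal {
    fn is_active(&self) -> bool {
        self.active
    }
}

/// Returns the objective only while the goal is active. An inactive goal
/// yields None even if an objective is stored; there is no fallback value.
fn goal_objective_for_prompt(goal: &Goal) -> Option<&str> {
    let objective = goal.objective.as_deref()?;
    goal.is_active().then(|| objective)
}
```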
huqiantao 27fac5d704 fix: security bugs in execpolicy, app-server, and tools
1. Fix deny rule prefix matching without word boundary (execpolicy/lib.rs:351-353)
   - Deny rule 'rm' now blocks 'rm -rf /' but NOT 'rmdir' or 'rmview'
   - Previously used bare starts_with which matched any command starting with 'rm'
   - Add word-boundary check: command must equal rule or start with rule+space

2. Fix fallback prefix match clarity (execpolicy/bash_arity.rs:362-374)
   - Improve comment to clarify word-boundary matching behavior
   - The trailing space in starts_with already provides word boundary

3. Fix hardcoded AskForApproval::OnRequest in HTTP API (app-server/lib.rs:283)
   - Read approval_policy from config instead of hardcoding OnRequest
   - Users with 'auto'/'yolo' policy now get UnlessTrusted for API calls
   - Previously ignored user's configured security posture

4. Fix fuzzy indentation search destroying preceding text (tools/file.rs:714-735)
   - When match starts mid-line after whitespace stripping, use exact position
   - Previously always expanded to line start, destroying preceding content
   - Now only expands to line start when match is at a line boundary

5. Fix potential underflow in apply_hunk start index (tools/apply_patch.rs:1110-1115)
   - Use checked_add_signed to safely handle negative cumulative_offset
   - Prevents isize overflow on adversarial patch input
   - Clamp to lines.len() instead of relying on .max(0) cast
2026-06-07 19:13:43 +08:00
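Item 5 of 27fac5d704 relies on `usize::checked_add_signed`, which returns None instead of wrapping when a negative offset would take the index below zero. A minimal sketch of that idea follows; the function name, its parameters, and the choice to surface the failure as `None` are assumptions, not the actual apply_hunk code.

```rust
/// Compute where a hunk starts after earlier hunks shifted the file by
/// `cumulative_offset` lines. Returns None when the offset would move the
/// index below zero or past usize::MAX; otherwise clamps to `line_count`.
/// (Hypothetical helper, for illustration only.)
fn hunk_start_index(old_start: usize, cumulative_offset: isize, line_count: usize) -> Option<usize> {
    old_start
        .checked_add_signed(cumulative_offset)
        .map(|start| start.min(line_count))
}
```

Compared with casting to `isize`, adding, and applying `.max(0)`, this form never performs a lossy cast, so an adversarial offset shows up as an explicit `None` that the caller has to handle.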
huqiantao ef4dc5ca61 fix: error handling bugs - log instead of silently swallowing errors
1. Fix swallowed persist_config errors (app-server/lib.rs:882,896)
   - Log errors when config persistence fails after set/unset
   - Users previously got success response even when disk write failed

2. Fix swallowed job store load error (core/lib.rs:751)
   - Add warning log when job store fails to load at startup
   - Previously silently started with empty job list on corruption

3. Fix silent config parse failures (config/lib.rs:1590)
   - Log warning when project config TOML is malformed
   - Previously returned None indistinguishable from 'no config file'

4. Fix MCP connect_all errors swallowed (mcp.rs:2151,2189)
   - Log warnings for each server that fails to connect
   - Previously returned incomplete resource list with no indication

5. Fix error context stripped in engine status (core/engine.rs:2223)
   - Use {err:#} format to include full error chain
   - Was inconsistent with line 2234 which already used {err:#}

6. Fix tool audit log failures silently dropped (tool_execution.rs:122-136)
   - Log each failure: serialization, directory creation, file open, write
   - Previously silently dropped all errors for security audit trail

7. Fix Err(_) arms discarding error info (runtime_log.rs:179, runtime_threads.rs:828)
   - Log stderr redirect failures on Windows
   - Log poisoned mutex in pending_approvals

8. Fix env var parsing errors silently ignored (config/lib.rs:2519-2530)
   - Warn when DEEPSEEK_TELEMETRY, DEEPSEEK_YOLO, DEEPSEEK_HTTP_HEADERS
     have invalid values instead of silently treating as unset

9. Fix MCP config reload errors swallowed (mcp.rs:2011)
   - Log config reload errors instead of complete silence

10. Fix .expect() on sub-agent runtime (core/engine.rs:1715)
    - Gracefully fall back to basic tool set when API client missing
    - Previously panicked if subagents enabled but no client configured

11. Fix .expect() on goal objective (core/engine.rs:2543)
    - Use safe if-let pattern instead of check+expect
    - Prevents panic if refactoring changes control flow
2026-06-07 19:04:47 +08:00
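The common thread in ef4dc5ca61 is replacing silent `Err(_)` arms with a logged warning. The sketch below shows that pattern with the standard library only. The helper `ok_or_warn` is hypothetical, and the `warn` callback stands in for the `tracing::warn!` / `tracing::error!` macros that the follow-up commits add as dependencies.

```rust
use std::fmt::Display;

/// Convert a Result into an Option, reporting the error through `warn`
/// instead of discarding it. `context` says which operation failed.
/// (Hypothetical helper; the real code logs via the tracing crate.)
fn ok_or_warn<T, E: Display>(
    result: Result<T, E>,
    context: &str,
    mut warn: impl FnMut(String),
) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(err) => {
            // The error text is kept, so a malformed input is no longer
            // indistinguishable from a missing one in the logs.
            warn(format!("{context}: {err}"));
            None
        }
    }
}
```

A caller might pass `|msg| eprintln!("warning: {msg}")` as the callback. The return type is still `Option<T>`, so call sites that previously treated failure as "unset" keep their behavior and only gain the log line.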
zLeoAlex 55d7499408 test: add runtime_policy_reference composition test, strengthen ChangeMode tests, fix outdated comments
- Add runtime_policy_reference_is_included_in_full_prompt test to verify
  that render_runtime_policy_reference() output lands in the composed
  system prompt. Guards against silent breakage if the push_str() call
  is accidentally removed (all existing tests would still pass).

- Strengthen change_mode_op_updates_current_mode_and_emits_status:
  destructure SessionUpdated to assert that session messages do NOT
  contain <runtime_prompt> tags after mode change — verifying the core
  invariant that Op::ChangeMode does not write session history.

- Extend current_mode_field_assignment_takes_effect_synchronously:
  now also verifies that messages_with_turn_metadata() produces the
  correct runtime tag (mode="yolo" approval="auto") after a mode
  switch, covering the tag-generation mechanism end-to-end.

- Fix outdated comments in composed_prompt_no_longer_inlines_tool_taxonomy
  and plan_prompt_taxonomy_omits_run_tests: replace stale references to
  deleted <mode_prompt> metadata with accurate descriptions of the
  ## Runtime Policy Reference section.
2026-06-07 18:31:36 +08:00
Paulo Aboim Pinto acaae1c2e5 test(tui): address command harness review 2026-06-07 12:24:13 +02:00
Paulo Aboim Pinto 96bff65797 test(tui): add command parity harness 2026-06-07 11:43:57 +02:00
zLeoAlex 256f34c621 fix(cache): set temp spillover root in cache_inspect test to survive nix sandbox
The test cache_inspect_displays_tool_result_budget_metadata relied on a
writable $HOME/.codewhale/tool_outputs/ for tool-result wire-dedup
persistence.  nix build sandboxes have a read-only home tree, so the
first tool-result SHA spillover write failed, the dedup hash table was
never populated, and the second identical tool result was not marked
deduplicated — causing the expect("repeat tool-result sighting should
report dedup metadata") assertion to fail.

Set TEST_SPILLOVER_ROOT to a tempdir inside the test (matching the
with_tool_result_sha_spillover_root pattern in chat.rs), so the
wire-dedup path works in any environment without depending on $HOME.
2026-06-07 16:06:38 +08:00
zLeoAlex 7b900b8699 test(cache): rename misleading test — does not exercise Op::ChangeMode dispatch
- Rename mode_change_op_updates_current_mode_and_emits_session_updated
  to current_mode_field_assignment_takes_effect_synchronously.
- The test directly mutates engine.current_mode, not through Op::ChangeMode.
  The dispatch path is separately covered by
  change_mode_op_updates_current_mode_and_emits_status.
2026-06-07 15:26:54 +08:00
zLeoAlex c6c3d2cc4d refactor(cache): inline single-call helpers, remove dead code
- Inline mode_prompt_marker_value and approval_prompt_marker_value into
  runtime_prompt_text (each called exactly once).
- Remove default_approval_mode_for_mode — zero callers.
2026-06-07 15:22:53 +08:00
zLeoAlex 039abb2ae6 refactor(cache): remove render_core_tool_taxonomy_block, inline to body variant
- Replace the 2 remaining test callers with render_core_tool_taxonomy_body
  (neither test depends on the ## heading — they check content only).
- Delete render_core_tool_taxonomy_block — zero production callers after
  the previous refactor.
2026-06-07 15:20:51 +08:00
zLeoAlex 12167b39c3 refactor(cache): replace taxonomy_body strip hack with source-level render_core_tool_taxonomy_body
- Add render_core_tool_taxonomy_body(mode) that generates the tool
  taxonomy text without the ## Core Tool Taxonomy heading.
- Refactor render_core_tool_taxonomy_block to use the body function
  internally (DRY).
- Delete taxonomy_body() — a downstream strip_prefix hack that
  worked around the source format instead of fixing it.
- Also removes the now-unnecessary debug_assert! (over-defensive,
  since the two functions are co-located in the same file).
2026-06-07 15:19:27 +08:00
zLeoAlex 0b5d574e63 fix(cache): address CR feedback — blank lines, heading hierarchy, debug_assert
- Add proper blank lines (\n\n) before mode headings in
  render_runtime_policy_reference (CommonMark/GFM compliance).
- Demote subheadings in agent.md from ##### to ###### so they
  nest correctly under the demoted main heading.
- Add debug_assert! in taxonomy_body() to loudly fail when
  render_core_tool_taxonomy_block format changes, preventing
  silent heading-hierarchy breakage.
2026-06-07 15:15:12 +08:00
zLeoAlex 427bd5d52f feat(cache): slim runtime_prompt to minimal tag, move policy descriptions to system prompt
- Add render_runtime_policy_reference() in prompts.rs containing all
  mode and approval policy descriptions in the frozen system-prompt
  prefix (sent once per session, cache-hit thereafter).
- Simplify runtime_prompt_text() from ~500-token XML block to a ~16-token
  self-closing tag (<runtime_prompt visibility="internal" mode="..." approval="..."/>).
- Fix markdown heading hierarchy in all prompts/modes/*.md and
  prompts/approvals/*.md (## → #####) to nest correctly under ####.
- Remove now-unused legacy functions: mode_prompt(),
  approval_prompt_for_mode(), mode_change_runtime_message().
- Simplify Op::ChangeMode: no longer persists a mode_change event
  (next turn tag carries the current mode).
- Update and rename affected tests.

Builds on #2801. Reduces per-request runtime prompt overhead by 97%
(~471 tokens saved per API call). System prompt grows by ~1325 tokens
in the frozen prefix (one-time miss cost); break-even at 3 API calls.
2026-06-07 15:03:43 +08:00
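The slimmed tag and the break-even figure quoted in 427bd5d52f can be reproduced with a short sketch. The tag layout and the token counts are taken from the commit message; the function signatures are assumptions, and the arithmetic treats cache-hit prefix tokens as free, which is a simplification.

```rust
/// Render the self-closing runtime tag described in the commit message.
/// (Hypothetical signature; the real runtime_prompt_text may take other inputs.)
fn runtime_prompt_text(mode: &str, approval: &str) -> String {
    format!(r#"<runtime_prompt visibility="internal" mode="{mode}" approval="{approval}"/>"#)
}

/// Number of API calls after which a one-time cost is repaid by a fixed
/// per-call saving, rounded up. Panics if `saved_per_call` is zero.
fn break_even_calls(one_time_cost: u32, saved_per_call: u32) -> u32 {
    (one_time_cost + saved_per_call - 1) / saved_per_call
}
```

With the numbers from the commit, `break_even_calls(1325, 471)` evaluates to 3, because 1325 / 471 is roughly 2.81 and the saving is only realized in whole calls.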
reidliu41 00407b5bf8 feat(config): add hotbar slot persistence
Add durable [[hotbar]] config bindings for slots 1-8, including default
bindings when no hotbar config is present.

Validate bindings without panicking: skip out-of-range slots, use the last
duplicate slot, and preserve unknown actions so future UI layers can show
disabled placeholders.
2026-06-07 14:42:52 +08:00
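The three validation rules in 00407b5bf8 (skip out-of-range slots, last duplicate wins, keep unknown actions) can be sketched as a single pass over the bindings. All type and function names below are hypothetical, and default bindings for an absent config are left out of the sketch.

```rust
use std::collections::BTreeMap;

/// A binding is either an action the registry knows, or an unknown name
/// preserved verbatim so a UI could render it as a disabled placeholder.
#[derive(Debug, Clone, PartialEq)]
enum HotbarAction {
    Known(String),
    Unknown(String),
}

/// Resolve raw (slot, action) pairs without panicking.
/// (Hypothetical helper, for illustration only.)
fn resolve_hotbar(bindings: &[(u8, &str)], known: &[&str]) -> BTreeMap<u8, HotbarAction> {
    let mut slots = BTreeMap::new();
    for &(slot, action) in bindings {
        // Rule 1: slots outside 1-8 are skipped rather than rejected.
        if !(1..=8).contains(&slot) {
            continue;
        }
        // Rule 3: unknown action names are kept, not dropped.
        let resolved = if known.contains(&action) {
            HotbarAction::Known(action.to_string())
        } else {
            HotbarAction::Unknown(action.to_string())
        };
        // Rule 2: a later binding for the same slot overwrites the earlier one.
        slots.insert(slot, resolved);
    }
    slots
}
```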
Hunter B 3d676c2509 chore(tui): harden exec harness signals 2026-06-06 22:55:23 -07:00
Hunter B fde931ee89 chore(release): allow trusted v0.9 contributors 2026-06-06 19:56:11 -07:00
Hunter B f2159b7827 docs(release): honor v0.9 contributor credits 2026-06-06 19:45:28 -07:00
Hunter B 9b500a7b91 Prepare v0.9.0 release build 2026-06-06 19:39:02 -07:00
Hunter Bown 59d12f3b6a Merge pull request #2871 from aboimpinto/feat/2791-command-parity-harness
Layer 1: clean command support boundaries
2026-06-06 19:24:31 -07:00
Paulo Aboim Pinto fefd63f30b fix: address command layer review feedback 2026-06-07 03:19:45 +02:00
Paulo Aboim Pinto 18df8db056 refactor: extract neutral command support 2026-06-07 02:44:29 +02:00
Paulo Aboim Pinto 8e8b45a20e test: make command-adjacent tests hermetic 2026-06-07 02:44:15 +02:00
Paulo Aboim Pinto 5300dc484e chore: enforce lf for rust sources 2026-06-07 02:44:08 +02:00
Hunter Bown ad70739b6a Merge pull request #2868 from Hmbown/codex/v090-vscode-git-meta
feat(vscode): show thread git metadata
2026-06-06 10:51:43 -07:00
Hunter B ce17f06db5 feat(vscode): show thread git metadata 2026-06-06 10:50:48 -07:00
Hunter B 6b1de930af chore(release): credit direct v0.9 community merges 2026-06-06 10:49:25 -07:00
Hunter Bown 0b96e8923a Merge pull request #2864 from ljm3790865/feat/tab-core-narrow
feat(tui): add multi-tab system core (manager + persistence)
2026-06-06 10:41:50 -07:00
Hunter Bown a7c1c034ab Merge pull request #2866 from reidliu41/feat/hotbar-action-registry
feat(tui): add hotbar action registry foundation
2026-06-06 10:40:23 -07:00
Hunter Bown 461c22f327 Merge pull request #2867 from ousamabenyounes/fix/azerty-altgr-at-key-conflict
fix(tui): prevent AltGr from swallowing @/#/$/!/%/ characters in composer
2026-06-06 10:37:48 -07:00
Hunter B ffaf110957 fix(tui): advance tab restore counters 2026-06-06 10:34:49 -07:00
Hunter B c9ce6c920b fix(tui): harden hotbar action dispatch 2026-06-06 10:32:18 -07:00
Hunter B 700a36edf1 style(tui): format AltGr sidebar shortcut guards 2026-06-06 10:30:17 -07:00
Ousama Ben Younes da6b8141ad fix(tui): prevent AltGr from swallowing @/#/$/!/%/ characters in composer
On Windows, AltGr is delivered as Ctrl+Alt by crossterm. European keyboard
layouts (French AZERTY, German QWERTZ, etc.) use AltGr to type characters
like @ (AltGr+0), # (AltGr+3), etc. The sidebar-focus shortcuts for
Alt+@/Alt+!/Alt+#/Alt+$/Alt+% were matching on "contains ALT" alone,
swallowing these AltGr-typed characters instead of inserting them into
the composer.

Exclude the Ctrl modifier from these sidebar-focus shortcut guards so
AltGr-typed glyphs fall through to the catch-all and
are inserted as text. This is consistent with the has_ctrl_or_alt /
is_altgr philosophy in key_hint.rs, which already treats Ctrl+Alt as
AltGr to preserve European keyboard input.

Closes #2863
2026-06-06 15:48:30 +00:00
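The guard change in da6b8141ad comes down to one extra condition: a sidebar-focus shortcut must have Alt held without Ctrl, because Ctrl+Alt is how AltGr arrives on Windows. The sketch below uses a plain struct instead of crossterm's modifier flags so that it stays dependency-free; `Modifiers` and `is_sidebar_focus_shortcut` are hypothetical names, while `is_altgr` echoes the helper the commit mentions in key_hint.rs without claiming to match its signature.

```rust
/// Simplified stand-in for a key event's modifier state.
#[derive(Clone, Copy)]
struct Modifiers {
    ctrl: bool,
    alt: bool,
}

/// Ctrl+Alt together are treated as AltGr.
fn is_altgr(mods: Modifiers) -> bool {
    mods.ctrl && mods.alt
}

/// Alt plus one of the shortcut glyphs focuses the sidebar, but only when
/// Ctrl is not held. An AltGr-typed '@' therefore falls through to text input.
fn is_sidebar_focus_shortcut(mods: Modifiers, ch: char) -> bool {
    mods.alt && !mods.ctrl && matches!(ch, '@' | '!' | '#' | '$' | '%')
}
```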
reidliu41 1f99fcbd97 feat(tui): add hotbar action registry foundation
Introduce the hotbar action trait and registry, and register the built-in app
actions needed by the first hotbar slice.
2026-06-06 23:23:48 +08:00