Commit Graph

2147 Commits

Author SHA1 Message Date
Hunter Bown dd26114697 feat(tui): send /attach images as multimodal content (#2584, #2587) (#2607)
Adds OpenAI-compatible image_url content blocks to the chat message
model, wiring attached images through build_chat_messages_with_reasoning
as multimodal user-content arrays. When images are present, user
messages emit a content array of text + image_url parts instead of a
plain string, matching the OpenAI vision API shape.

- models.rs: new ImageUrlContent struct, ContentBlock::ImageUrl variant
- client/chat.rs: image_parts collection, multimodal wire format for
  user messages, image-aware message inspection, stream-event no-op
- Exhaustiveness arms added across 10 files (compaction, seam_manager,
  capacity_flow, purge, notifications, session_picker, utils,
  working_set, rlm/session, runtime_api)
- Test: request_builder_emits_openai_image_url_parts_for_user_images

Credit: @xyuai (PR #2587 — root cause + initial implementation)
Closes: #2584

Co-authored-by: xyuai <xyuai@users.noreply.github.com>
2026-06-02 21:27:31 -07:00
AresNing 8981d5c5fd feat: add subagent lifecycle hooks
Add subagent lifecycle hooks for better control over subagent initialization and teardown.
2026-06-02 20:48:09 -07:00
Gordon b4691bc082 feat(i18n): localize context-inspector surface across 7 locales
Localize the context-inspector surface across 7 locales for improved internationalization support.
2026-06-02 20:47:53 -07:00
Justin Gao 29acb87a9d feat(engine): inject mode-change runtime message and include mode in turn metadata
Inject a mode-change runtime message into the engine and include mode information in turn metadata for better tracking.
2026-06-02 20:47:36 -07:00
Hanmiao Li 1781312c7a feat(tui): add drag-to-resize sidebar width
Add drag-to-resize functionality for the TUI sidebar width, allowing users to interactively resize the sidebar.
2026-06-02 20:47:18 -07:00
Hunter B 2721b2a077 Merge branch 'codex/v0.8.51-arcee-provider' — v0.8.51 release
v0.8.51: Arcee provider, cycle removal, compaction improvements,
TUI fixes, model persistence, and community harvest.
2026-06-02 20:38:10 -07:00
Hunter B 541926eb38 docs(changelog): backfill v0.8.51 Fixed entries + Community credits
Adds 8 missing Fixed entries for commits that landed after the
release-prep commit (06612495f): DEC CSI fragment fix, engine panic
recovery, nested file-picker, command-palette scroll, .NET/Windows
env, config key warnings, diff-render whitespace, and model
persistence. Adds Community credits for contributors whose work
landed or shaped this release cycle.
2026-06-02 20:37:52 -07:00
Hunter B c000bd7e60 harvest(v0.8.51): diff-render whitespace fix + schema dead_code + model persistence + prompt updates
- fix(diff-render): preserve leading whitespace in patch content lines
  Credit: @zlh124 (PR #2591), with extra-space bug fixed.
- fix(tui): allow unused schema migration registry
  Credit: @reidliu41 (PR #2601).
- feat(tui): persist per-provider model selection from /model command
- docs(prompts): prefer gh --json CLI for GitHub triage in agent instructions
2026-06-02 20:30:31 -07:00
Claude f886f28acf test(tui): update walk-depth test for new default depth (#2488)
`workspace_completions_honor_configured_walk_depth` placed its probe file at
component depth 9 and asserted the *default* walk excludes it — true at the
old default (6) but not the new one (10). Move the probe to depth 12 so it
stays past the default while remaining within the explicit deeper walk (16)
and the unlimited (0) cases the test also exercises.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:41:13 +00:00
Claude 79d78878b8 test(mcp): make legacy SSE reconnect test deterministic under load (#2597)
`legacy_sse_closed_stream_reconnects_and_retries_tool_call` passed in
isolation but flaked under parallel load. The mock server dropped the
tool-call response whenever `active_sse` was momentarily `None` — which
happens when the retry POST is scheduled ahead of the reconnecting GET /sse
that re-stores the SSE sender. The client then hung until timeout and the
test failed.

Make the server wait briefly (bounded, 5s) for the SSE channel before
sending, so response delivery no longer depends on the order in which the
two server tasks are scheduled.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:24:14 +00:00
Claude a7d482067b fix(clippy): clear -D warnings across changed code (#2599)
Resolves the workspace clippy warnings so the release gate
(`cargo clippy --workspace --all-targets -- -D warnings`) is clean:

- chat.rs: elide needless lifetime on `next_arcee_waf_trigger` (returned
  strs are `'static`).
- llm_client/mod.rs: use `enumerate()` instead of a manual loop counter in
  `truncate_for_error` (explicit_counter_loop).
- ui.rs: `#[allow(clippy::too_many_arguments)]` on `apply_model_picker_choice`
  with rationale (8 distinct handles/states; a struct would only obscure it).
- file_picker.rs: gate the test-only `WALK_DEPTH` const and `new_with_relevance`
  convenience ctor behind `#[cfg(test)]` (the #2488 change moved production
  callers to `new_with_relevance_and_depth`).

The 6 schema_migration registry structs the issue noted are no longer flagged
(their trait impls keep them live). Also normalizes rustfmt formatting on the
touched lines.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:24:14 +00:00
Claude 7aa73fad50 fix(config): warn when shell/sandbox keys are nested under unknown sections (#2589)
Shell tools (`exec_shell`, `task_shell_start`, …) only register when
`allow_shell` is true, but `allow_shell` and `sandbox_mode` are top-level
keys. Placing them under a `[general]` or `[sandbox]` table — neither of
which CodeWhale defines — makes TOML silently drop them, so `allow_shell`
stays false and the tools vanish from the catalog with no explanation.

Following the existing `warn_on_misplaced_root_base_url` precedent, emit a
startup warning naming the misplaced keys and telling the user to move them
to the top of the file. With the keys correctly placed, shell tools register
on Windows too (no sandbox required for danger-full-access).

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:16:19 +00:00
Claude cccc5ed55f fix(shell): pass .NET/NuGet + Windows app-data env to exec_shell (#1857)
`exec_shell` runs with `env_clear()` plus a strict allowlist. On Windows
there is no sandbox, so commands run directly — but the allowlist dropped
`APPDATA`, `LOCALAPPDATA`, `ProgramData`, `ProgramFiles*`, and the `DOTNET_*`
/ `NUGET_*` variables that `dotnet restore` and NuGet rely on to locate
their package cache, HTTP cache, and config. Restore therefore failed
through the tool while working in the user's own shell, where the full
environment is present.

Add the .NET/NuGet and Windows app-data path variables to the shell
allowlist (`DOTNET_*` via prefix, like `LC_*`). NuGet credential vars
(`NuGetPackageSourceCredentials_*`) still fall outside the allowlist and are
not exported. Also benefits npm/pip on Windows, which use the same paths.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:16:19 +00:00
Claude e95f759cd8 fix(tui): keep command-palette selection visible while scrolling (#2590)
The palette sized its scroll window by entry count (`popup_height - 7`)
while the same fixed-height popup also rendered the 9-line header plus
per-section labels and separators. Those uncounted rows pushed the selected
entry past the bottom clip line, so pressing Down made the cursor vanish and
the list appear frozen until the index finally exceeded the oversized budget.

Size the window against the real rendered cost: subtract the actual header
height and account for section labels/separators when choosing the visible
range, guaranteeing the selection stays on screen and the list scrolls.
Adds unit tests for the window helper.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:08:53 +00:00
Claude 478bae451a fix(tui): find files in deeply nested directories via @ and Ctrl+P (#2488)
`ignore`'s `max_depth(Some(6))` excludes files inside a 6-level-deep
directory (they sit at component depth 7), so @-mention completion and the
Ctrl+P picker silently missed them. Raise the default walk depth to 10
(covers conventionally nested Java/.NET/web trees) and make the Ctrl+P
picker honor the configurable `mention_walk_depth` — including `0` for an
unlimited walk — so it matches @-mention behavior and the existing
"set mention_walk_depth 0 to search deeper" guidance.

The walk stays bounded by `.gitignore` and `MAX_CANDIDATES`. Adds a
regression test covering depth-6 miss, default reach, and unlimited.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:08:42 +00:00
Claude 5249723e18 fix(engine): recover from turn panics instead of killing the loop (#2583, #1269)
A panic inside `handle_deepseek_turn` unwound through `engine.run()` and was
caught by `spawn_supervised("engine-event-loop")`, which wrote a crash dump
and let the whole engine task exit. The UI never received `TurnComplete`, so
it sat on "working" forever and every subsequent turn was dead too — exactly
the "the engine have stopped" / stuck-on-working reports.

Wrap the turn call in `catch_unwind` so a panic now surfaces as a failed
`TurnComplete` (with a clear, actionable message) and the engine keeps
running. The crash dump is still written via a new `record_caught_panic`
helper so maintainers retain the `~/.codewhale/crashes/` diagnostics.

Also dedupes the panic-message extraction in `spawn_supervised` /
`spawn_blocking_supervised` into a shared `panic_message` helper.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:08:30 +00:00
Claude fd69f4c806 fix(tui): strip DEC private mode CSI fragments from composer (#2592)
Terminal mode set/reset chatter (bracketed paste `[?2004h/l`, mouse
capture `[?1000h`, focus reporting `[?1004h`, synchronized output
`[?2026h`) ends in `h`/`l`, but the composer's CSI fragment filter only
treated `u` (Kitty keyboard) as a terminator. During dense streaming the
leading `[` leaked into the input and corrupted editing — a regression of
the #1915 control-sequence filter.

Accept `h`/`l` as terminators too, but only after a numeric parameter so
ordinary prose like `[?help]` is preserved (real mode sequences are always
`[?<number><letter>`; Kitty's parameter-less `[?u` still matches). Adds
regression tests for the mode sequences and the false-positive guard.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:08:19 +00:00
Hunter Bown 06612495fc chore(release): prep v0.8.51 — Arcee provider, cycle removal, UI fixes
Release-preparation checkpoint for v0.8.51 (workspace + npm bumped to 0.8.51).

Added:
- Arcee AI direct provider: [providers.arcee], ARCEE_API_KEY/BASE_URL/MODEL,
  CLI auth, provider + model picker, registry. Default direct-API model is
  trinity-large-thinking (reasoning, 262K ctx/out); preview + mini selectable.
  Cloudflare-WAF-safe opening turn (benign read-only tool surface, system-prompt
  payload splitting) and reasoning_content replay on tool-call turns.
- Expanded model catalog (qwen3.6 flash/plus/max-preview, Xiaomi MiMo v2.5
  chat/ASR/TTS); provider-aware model picker with per-provider saved models.

Changed:
- Auto-compaction is percentage- and model-aware
  (compaction_threshold_for_model_at_percent; default 80%; auto-enable for
  <=256K windows, opt-in for 1M models).
- Provider/gateway HTTP errors sanitized (HTML/WAF interstitials collapsed,
  401/403 split into authentication vs authorization).

Removed:
- The session cycle / checkpoint-restart system: /cycles, /cycle, /recall,
  recall_archive tool, cycle_manager, cycle-handoff prompt, sidebar cycle lines,
  EngineConfig.cycle / Event::CycleAdvanced / seam cycle thresholds.

Fixed:
- Orphaned assistant 'blue dot' role glyph on whitespace-only turns.
- Sidebar mouse-wheel scroll leaking into the transcript.
- Sidebar hover tooltip overlap + warning-orange styling.
- README Constitution description corrected to match prompts/base.md.
- Repaired release-blocking unit/integration tests after the refactors.

Preflight: cargo fmt clean, workspace builds, 3903 tui tests pass (1 known
flaky MCP SSE test under parallel load, passes in isolation).
2026-06-02 17:36:18 -07:00
Hunter B 8eca757633 test(tui): cover arcee provider picker entry 2026-06-02 09:03:19 -07:00
Hunter B 99da87ca1d fix(cli): wire arcee provider auth 2026-06-02 08:58:01 -07:00
Hunter B e54a0a500b feat(provider): add direct arcee support 2026-06-02 08:51:39 -07:00
Hunter Bown 0072209d12 v0.8.50: community harvest release (40 PRs, 14 contributors)
Full changelog in CHANGELOG.md. macOS failure is pre-existing flaky MCP network test; Windows was a CI timeout during compilation. Both unrelated to harvested changes.
2026-06-01 21:40:45 -07:00
Hunter B f2df1d54fb docs(changelog): note paste-first context menu fix 2026-06-01 21:35:11 -07:00
Hunter B 9a3c545572 fix(tui): move Paste to first position in right-click context menu
Users copying text from the output area and right-clicking in the
composer expect Paste to be the first, most accessible action.
Previously Paste appeared after all cell-specific actions (Open
Details, Copy Message, etc.), requiring extra mouse travel or
keyboard navigation.

Reported by a WeChat/Chinese UX user during the v0.8.50 triage pass.
2026-06-01 21:34:56 -07:00
Hunter B 471a58ff08 chore(release): sync tui CHANGELOG after i18n/CI harvests 2026-06-01 21:31:04 -07:00
Hunter B b8194238d5 docs(changelog): credit i18n and CI workflow harvests (#2568, #2566, #2565) 2026-06-01 21:30:28 -07:00
Nightt ea7fc474a9 fix: paginate pending allowlist PR lookup 2026-06-01 21:27:39 -07:00
Nightt dfe1884702 fix: add contribution gate dry run mode 2026-06-01 21:27:39 -07:00
Nightt c8c20e0931 fix: remove dead issue gate guard 2026-06-01 21:27:39 -07:00
Nightt 50590761ac fix: read contribution allowlist from default branch 2026-06-01 21:27:39 -07:00
Nightt dcf8350ff8 fix: harden contribution gate bypasses 2026-06-01 21:27:39 -07:00
Nightt 97c615ca9c chore: add contribution gate workflows 2026-06-01 21:27:39 -07:00
gordonlu f48e398ba5 Revert "fix: restore two-line draft header layout"
This reverts commit 6bc5e629c89aa98762b3f181ade29fefaa77380f.
2026-06-01 21:27:34 -07:00
gordonlu cc60129f3a feat(i18n): add FanoutCounts MessageId, wire into FanoutCard stats line 2026-06-01 21:27:17 -07:00
gordonlu 25017091e1 fix: restore two-line draft header layout 2026-06-01 21:27:17 -07:00
gordonlu 478d45f795 fmt: cargo fmt 2026-06-01 21:27:17 -07:00
gordonlu 19d55799a5 fix: avoid Instant overflow in turn_liveness tests on Windows 2026-06-01 21:27:17 -07:00
gordonlu 6ab77eaba3 feat(i18n): localize all queue command messages across 7 locales 2026-06-01 21:27:17 -07:00
Hunter B ddae7584f8 fix: resolve clippy warnings in harvested PRs (needless-borrow, is_multiple_of, dead unwrap) 2026-06-01 21:24:38 -07:00
Hunter B e99ee5e7b1 chore(release): sync tui crate CHANGELOG for version drift gate 2026-06-01 21:21:56 -07:00
Hunter B e763b44e1e docs(changelog): credit new harvests for v0.8.50 (#2514, #2519, #2503, #2560) 2026-06-01 21:20:44 -07:00
xyuai 5f497e00c4 fix: harden Xiaomi MiMo speech flow 2026-06-01 21:18:28 -07:00
xyuai 8532dcc49e feat: add Xiaomi MiMo speech support 2026-06-01 21:18:28 -07:00
Hu Qiantao 139b542d3f test(ci): add Cache Guard CI test for prefix-cache stability
Add a CI guard test that verifies prefix-cache stability across
multi-turn conversations.

The test runs 8 test cases × 14-24 turns each:
- plain-dialogue (14 turns, with/without reasoning)
- long-dialogue (18 turns)
- mixed-message-sizes (20 turns)
- tool-loop (14 turns, with/without reasoning)
- long-tool-loop (24 turns, with/without reasoning)
- compaction-must-cause-at-least-one-miss (30 turns)

Environment variables:
- CODEWHALE_CACHE_GUARD=1: Enable the guard (default: disabled)
- CODEWHALE_CACHE_GUARD_THRESHOLD=40: Hit rate threshold (0-100)
- CODEWHALE_CACHE_GUARD_STRICT=1: Fail on threshold violation

Usage:
  CODEWHALE_CACHE_GUARD=1 cargo test --test cache_guard
  CODEWHALE_CACHE_GUARD=1 CODEWHALE_CACHE_GUARD_STRICT=1 cargo test --test cache_guard

The mock simulates DeepSeek's server-side prefix cache behavior
using byte-prefix matching. The default threshold (40%) is calibrated
for the mock; real CI should use CODEWHALE_CACHE_GUARD_THRESHOLD=90
for production-quality validation.

9 tests covering:
- 8 multi-turn conversation scenarios
- 1 compaction behavior verification
2026-06-01 21:15:12 -07:00
Hu Qiantao d58613ab23 test(client): add plan mode toggle byte-stability invariant test
Add test plan_mode_toggle_preserves_catalog_byte_stability that verifies
three invariants critical for DeepSeek's KV prefix cache:

1. Building the tool catalog twice for the same mode produces identical
   JSON bytes. This catches any non-determinism in catalog construction
   (e.g., HashMap iteration order, timestamp-dependent logic).

2. Non-deferred tools common to Plan and Agent modes appear in the same
   order. Plan mode excludes execution tools, but the tools that are
   present in both modes must have stable byte positions so that toggling
   between modes doesn't shift byte offsets of shared tools.

3. Activating a deferred tool mid-session appends to the tail without
   reordering the catalog head. This is the existing invariant from #263,
   now covered by a dedicated byte-level assertion.

Also add a doc comment to build_model_tool_catalog documenting the
catalog-head stability invariant.
2026-06-01 21:15:12 -07:00
Justin Gao c9e4c8b2ce fix: clarify comment, avoid per-turn tool clone on happy path
- Comment: remove 'never auto-re-pins' (it does auto-re-freeze),
  describe accurately as 'auto-re-freeze on drift'
- Perf: use as_deref().unwrap_or_default() to borrow &[Tool] for
  verify(), only to_vec() when constructing PinnedPrefix
2026-06-01 21:15:12 -07:00
Justin Gao b122b58c92 refs(#2264): Phase 2 — wire FrozenPrefix::verify() into turn_loop
Adds a three-zone diagnostic layer alongside the existing
PrefixStabilityManager::check_and_update(). On the first turn,
freeze the PinnedPrefix baseline; on subsequent turns, verify
the current system+tool state against the frozen baseline and
log drift via tracing::debug!. Phase 2 is warn-only — no
request refusal — auto-re-freezes on drift to keep subsequent
turn comparisons meaningful.

- Session: add frozen_prefix: Option<FrozenPrefix> field
- turn_loop: import PinnedPrefix, insert verify block after
  check_and_update, before MessageRequest construction
2026-06-01 21:15:12 -07:00
Hunter B 4a09197433 fix(tui): bound foreground shell reader drains
Refs #2571

Harvests the core idea from PR #2573 by @idling11, with local cleanup for normal-exit inherited pipe handles and a foreground orphan-pipe regression.

Co-authored-by: Hanmiao Li <894876246@qq.com>
2026-06-01 20:49:26 -07:00
Hunter B eedeb5290b fix(agent): pass through explicit AtlasCloud model ids
Refs #2569

Harvests the safe part of PR #2569 by allowing AtlasCloud provider-hinted namespaced model IDs to route exactly as requested, without freezing a volatile provider model catalog in the static registry.

Co-authored-by: lucaszhu-hue <lucas.zhu@atlascloud.ai>
2026-06-01 20:06:26 -07:00
Hunter B 88f34fc9dd fix(tui): protect multiline drafts on arrow navigation 2026-06-01 19:58:39 -07:00