Commit Graph

1696 Commits

Author SHA1 Message Date
Hunter B 1e54be0620 fix(tui): refresh branch status after shell changes 2026-06-04 19:26:26 -07:00
Hunter B 47577d59e9 fix(tui): #2760 correct sessions resume footer
Harvested from PR #2760 by @sximelon

Fixes #2758

Show the canonical 'codewhale resume <session-id>' subcommand in the sessions footer instead of the invalid dispatcher form, and add a parser/footer regression test tying the hint to the actual Resume command.

Verification: cargo fmt --all -- --check; git diff --check; ./scripts/release/check-versions.sh; cargo test -p codewhale-tui --bin codewhale-tui --locked sessions_footer_points_to_resume_subcommand -- --nocapture; cargo clippy -p codewhale-tui --bin codewhale-tui --locked -- -D warnings.

Co-authored-by: sximelon <15710511+sximelon@users.noreply.github.com>
2026-06-04 18:56:25 -07:00
Hunter B de86cc1860 fix(tui): install rustls provider before HTTP clients
Install the ring rustls provider through a shared TUI helper and route reqwest client construction through it so no-provider TLS builds do not panic in engine, runtime API, tool, MCP, config, and test paths.

Keep the skill-installer integration include compatible with a local helper, and pin prompt byte-stability tests to an isolated home/skills environment under the shared env lock.

Verification: cargo fmt --all -- --check; git diff --check; ./scripts/release/check-versions.sh; cargo clippy --workspace --all-features --locked -- -D warnings; cargo test --workspace --all-features --locked; focused skill_install, finance, goal-tool, and MCP reruns.
2026-06-04 18:50:20 -07:00
Hunter B 70adeeeae6 docs(v0.9): record #2746/#2747, #2750, #2756, #2757 harvests and #2742/#2751/#2755 dispositions
Log the new community-PR harvests in CHANGELOG.md and crates/tui/CHANGELOG.md
(MCP underscore server names, Xiaomi MiMo pricing, hydrated deferred-tool
render, Token Plan region docs) with contributor credit, and update
docs/V0_9_0_EXECUTION_MAP.md with evidence-backed dispositions for the
newly-reviewed PRs, including the deferred #2742 and forwarded #2751/#2755.
2026-06-04 18:28:40 -07:00
Hunter B 74b3268521 fix(tui): render hydrated deferred-tool results as loaded, not run done
When a deferred tool is used for the first time, the engine returns a
schema-hydration result (event tool.schema_hydrated, executed=false,
retry_required=true) instead of executing the tool. The transcript and sidebar
previously rendered this as a completed run ("run done"), which was
indistinguishable from a real successful execution and misled both the user and
the model. Hydrated results now render as "tool loaded - retry required" via a
dedicated ToolStatus::Hydrated, threaded through tool routing, history, sidebar,
and theme. A successful real execution still renders as run done, and a failed
tool with hydration metadata stays Failed.

Local correction on top of the PR: a hydrated row ranks with active work
(ToolStatus::Running) in the sidebar rather than alongside completed successes,
matching the "not run done" intent. The contributor's hydration detection and
missing-metadata handling are kept as-is (the sole emitter always sets
executed=false, consistent with the engine's own check).
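
A minimal sketch of the status decision and sidebar ranking described above. The `HydrationMeta` struct, the `classify`, `label`, and `sidebar_rank` helpers, the numeric ranks, and the "running"/"failed" labels are illustrative assumptions; only `ToolStatus::Hydrated`, `ToolStatus::Running`, the `executed`/`retry_required` fields, and the two quoted labels come from the commit message.

```rust
/// Hypothetical mirror of the hydration metadata named in the commit message.
#[derive(Clone, Copy)]
struct HydrationMeta {
    executed: bool,
    retry_required: bool,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToolStatus {
    Running,
    Success,
    Failed,
    Hydrated,
}

/// A failure always stays Failed. A successful result counts as Hydrated only
/// when the metadata says the tool was loaded but not actually executed.
fn classify(success: bool, meta: Option<HydrationMeta>) -> ToolStatus {
    if !success {
        return ToolStatus::Failed;
    }
    match meta {
        Some(m) if !m.executed && m.retry_required => ToolStatus::Hydrated,
        _ => ToolStatus::Success,
    }
}

/// Transcript label; the Running and Failed strings are placeholders.
fn label(status: ToolStatus) -> &'static str {
    match status {
        ToolStatus::Running => "running",
        ToolStatus::Success => "run done",
        ToolStatus::Failed => "failed",
        ToolStatus::Hydrated => "tool loaded - retry required",
    }
}

/// Sidebar rank (lower sorts first): Hydrated ranks with active work.
fn sidebar_rank(status: ToolStatus) -> u8 {
    match status {
        ToolStatus::Running | ToolStatus::Hydrated => 0,
        ToolStatus::Failed => 1,
        ToolStatus::Success => 2,
    }
}

fn main() {
    let hydrated = HydrationMeta { executed: false, retry_required: true };
    let statuses = vec![
        ToolStatus::Running,
        classify(true, Some(hydrated)),
        classify(true, None),
        classify(false, Some(hydrated)),
    ];
    for status in statuses {
        println!("{:?}: {} (rank {})", status, label(status), sidebar_rank(status));
    }
}
```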

Harvested from PR #2757 by @mvanhorn. Fixes #2648.

Co-authored-by: mvanhorn <455140+mvanhorn@users.noreply.github.com>
2026-06-04 18:26:37 -07:00
Hunter B 9d13960603 feat(pricing): price Xiaomi MiMo primary models at matching DeepSeek V4 rates
Add cost-estimate pricing for the Xiaomi MiMo chat models that were previously
unpriced: mimo-v2.5-pro / xiaomi/mimo-v2.5-pro reuse the DeepSeek V4-Pro rate
table, and mimo-v2.5 / xiaomi/mimo-v2.5 reuse the DeepSeek V4-Flash rates. The
DeepSeek V4 pro/flash rate tables are extracted into deepseek_v4_pro_pricing()
and deepseek_v4_flash_pricing() helpers so the MiMo aliases stay aligned with
DeepSeek. Existing DeepSeek pricing behavior is unchanged (deepseek + non-v4pro
still maps to Flash), and unrelated models still return None.

Harvested from PR #2750 by @cyq1017. Fixes #2731.

Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>
2026-06-04 18:23:49 -07:00
Hunter B 9e29c221b9 fix(mcp): preserve underscored MCP server names in tool routing
parse_prefixed_name now matches the qualified mcp_<server>_<tool> name against
the set of registered server names (connections + configured servers) and
prefers the longest matching server name, instead of naively splitting on the
first underscore. Tools on servers whose names contain underscores (e.g.
"my_db") are now reachable, and an overlapping pair like "my" and "my_db"
routes to the correct server. Falls back to the legacy first-underscore split
when no registered server matches, preserving backward compatibility.
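
The longest-match routing can be sketched roughly as follows. This is not the project's code: the signature, the plain slice of server names, and the return type are assumptions made for illustration, and the real `parse_prefixed_name` draws its names from connections and configured servers.

```rust
/// Split a qualified `mcp_<server>_<tool>` name into (server, tool).
/// `servers` stands in for the set of registered server names.
fn parse_prefixed_name<'a>(qualified: &'a str, servers: &[&str]) -> Option<(&'a str, &'a str)> {
    let rest = qualified.strip_prefix("mcp_")?;

    // Prefer the longest registered server name that is followed by '_'
    // and a non-empty tool name.
    let best = servers
        .iter()
        .filter(|s| {
            rest.strip_prefix(**s)
                .and_then(|tail| tail.strip_prefix('_'))
                .is_some_and(|tool| !tool.is_empty())
        })
        .max_by_key(|s| s.len());

    if let Some(server) = best {
        let (srv, tail) = rest.split_at(server.len());
        // `tail` starts with the one-byte '_' separator checked above.
        return Some((srv, &tail[1..]));
    }

    // Legacy fallback: split on the first underscore.
    rest.split_once('_')
}

fn main() {
    let servers = ["my", "my_db"];
    println!("{:?}", parse_prefixed_name("mcp_my_db_query", &servers));
    println!("{:?}", parse_prefixed_name("mcp_my_list", &servers));
    println!("{:?}", parse_prefixed_name("mcp_other_tool_x", &servers));
}
```

Matching against known names first is what lets the overlapping pair "my" and "my_db" route correctly; the fallback keeps unregistered names behaving as before.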

Harvested from PR #2747 by @cyq1017; supersedes the equivalent fix in PR #2746
by @puneetdixit200. Both contributors diagnosed and fixed issue #2744; #2747
landed for its longest-match tie-break test coverage. Fixes #2744.

Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>
Co-authored-by: puneetdixit200 <236133619+puneetdixit200@users.noreply.github.com>
2026-06-04 18:23:02 -07:00
Hunter B 56f8044cf3 feat(tui): show focused approval details
Harvested the narrow approval-detail and shell-preview slice from #1991/#2269 by @tdccccc. Approval cards now show prominent command, dir, file, path, or target rows before falling back to raw params, and shell approvals preserve long command tails while splitting common shell chains for review.

The maintained path keeps the existing #2381 intent-summary block visible and does not take the broader diff-preview/pager rewrite from #2269. Live shell companion tools are classified as shell so their approval cards use the same review posture.

Co-authored-by: tdccccc <79492752+tdccccc@users.noreply.github.com>
2026-06-04 00:17:39 -07:00
Hunter B 0d66ef34d1 feat(hooks): add turn_end observer hook
Harvested the narrow Rust/docs slice of PR #2578 by @AresNing for #1364. The event uses the maintained structured observer path: JSON stdin, stdout ignored, warn-only failures, and no ability to block or mutate the turn.

The hook fires after post-turn app state, usage totals, cost, notification, receipt, and queue-recovery state are updated, before queued follow-up dispatch. Docs, RFC notes, /hooks discovery, and v0.9 tracking now describe the observer-only contract.

Co-authored-by: AresNing <49557311+AresNing@users.noreply.github.com>
2026-06-04 00:09:01 -07:00
Hunter B 586640a437 feat(config): add typed harness posture profiles
Harvested from PR #2741 by @idling11 for #2693, with review fixes folded in: typed compaction/tool/safety enums, no silent unknown-kind fallback, unknown profile keys rejected, and whole-struct equality for future reload/runtime checks.

Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
2026-06-04 00:00:22 -07:00
Hunter B 13cabac077 docs(config): clarify provider path suffix support
Records that #2506/#2508 are superseded by the safer #2558 path_suffix implementation, credits the original #1874 report and follow-up PR review trail, and documents that suffix overrides only affect chat completions while model and beta paths keep built-in routing.
2026-06-03 23:56:40 -07:00
Hunter B e18f072a5a perf(context): cache project context with content signatures
Harvested from PR #2636 by @HUQIANTAO with widened cache invalidation for constitution files, generated context, trust state, canonical paths, and same-length overwrites.

Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
2026-06-03 23:49:08 -07:00
Hunter B 6a7063c912 ci(ohos): guard unsupported target dependencies 2026-06-03 23:41:21 -07:00
Hunter B 8d9cd44078 fix(tui): make queued follow-up edits recoverable 2026-06-03 23:36:42 -07:00
Hunter B 5b3ee9db67 fix(tasks): fail stale running tasks after restart
Refs #1786.

Reported by @bevis-wong. This lands the durable restart-safety slice: persisted running tasks and running tool rows are marked failed with a recovery note instead of being requeued as live work after a prior process exits.
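
A rough sketch of the restart-safety rule, assuming a simplified task model. `TaskState`, `Task`, `recover_after_restart`, and the note text are hypothetical names chosen for illustration.

```rust
#[derive(Debug, PartialEq, Eq)]
enum TaskState {
    Queued,
    Running,
    Completed,
    Failed { note: String },
}

#[derive(Debug)]
struct Task {
    id: u32,
    state: TaskState,
}

/// Called once after loading persisted tasks: anything still marked Running
/// belonged to a process that has exited, so fail it instead of requeueing.
/// Returns how many tasks were marked failed.
fn recover_after_restart(tasks: &mut [Task]) -> usize {
    let mut recovered = 0;
    for task in tasks.iter_mut() {
        if task.state == TaskState::Running {
            task.state = TaskState::Failed {
                note: "interrupted by restart".to_string(),
            };
            recovered += 1;
        }
    }
    recovered
}

fn main() {
    let mut tasks = vec![
        Task { id: 1, state: TaskState::Queued },
        Task { id: 2, state: TaskState::Running },
        Task { id: 3, state: TaskState::Completed },
    ];
    let recovered = recover_after_restart(&mut tasks);
    println!("recovered {} task(s)", recovered);
    for task in &tasks {
        println!("task {}: {:?}", task.id, task.state);
    }
}
```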
2026-06-03 23:30:16 -07:00
Hunter B ad3d61936b feat(subagent): preserve checkpoints for timeout continuation
Refs #2029.

Reported by @qiyuanlicn. This lands live per-step API-timeout checkpoint continuation and preserves checkpoint metadata through projections, transcripts, and persistence; cold-restart child-task rehydration remains out of scope.
2026-06-03 23:27:20 -07:00
Hunter B 3cb49233ee feat(sidebar): show full details for truncated rows
Harvested from PR #2734 by @idling11 with reviewer fixes for row-source fidelity, row-authoritative hit testing, and display-width popover sizing.

Refs #2694.

Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
2026-06-03 23:26:08 -07:00
Hunter B 8d4eb0c2c9 fix(context): bound auto-generated project context
Refs #697 and #1827.

Reported by @NASLXTO and @wuxixing. Prior context-cap and startup-diagnosis work by @linzhiqin2003 and @merchloubna70-dot shaped this fallback.
2026-06-03 23:10:50 -07:00
Hunter B e14fc4712c fix(tui): label pending input delivery modes
Harvested from PR #2532 by @cyq1017.

Pending input rows now distinguish steer-pending, rejected-steer, and queued-follow-up states, with continuation rows aligned under the delivery label.

Refs #2054; leaves the broader cancel/edit affordance work open.

Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>
2026-06-03 22:19:22 -07:00
Hunter B 333275162f feat(runtime): save completed threads as sessions
Harvested from PR #2639 by @gaord.

Adds POST /v1/sessions for runtime clients to persist completed threads as managed saved sessions, with a 409 guard for queued or active turn/item state and focused session endpoint coverage.

Also makes MCP HTTP tests install the rustls ring provider before constructing reqwest clients so filtered no-provider test runs are deterministic.

Co-authored-by: gaord <9567937+gaord@users.noreply.github.com>
2026-06-03 22:16:02 -07:00
Hunter B c76ec47526 feat(transcript): collapse dense tool runs
Harvested from PR #2738 by @idling11.

Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
2026-06-03 22:00:46 -07:00
Hunter B 55024a16d8 fix(subagent): inherit tool-agent model route
Harvested from PR #2736 by @h3c-hexin.

Co-authored-by: h3c-hexin <13790929+h3c-hexin@users.noreply.github.com>
2026-06-03 21:43:18 -07:00
Hunter B 9719b45cd3 fix(skills): merge configured and workspace skill dirs
Harvested from PR #2737 by @h3c-hexin.

Co-authored-by: h3c-hexin <13790929+h3c-hexin@users.noreply.github.com>
2026-06-03 21:39:15 -07:00
Hunter B 7ac8063b6b feat(plan): preserve rich PlanArtifact context
Harvested from PR #2733 by @idling11.

Adds richer update_plan artifact fields for grounded Plan-mode review, renders them in the transcript and Plan confirmation prompt, and carries them through /relay, fork-state, and saved-session replay.

Verification: cargo test -p codewhale-tui --bin codewhale-tui --locked plan_ -- --nocapture

Verification: cargo test -p codewhale-tui --bin codewhale-tui --locked relay_slash_command_routes_to_session_relay_instruction -- --nocapture

Verification: cargo clippy -p codewhale-tui --locked -- -D warnings

Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
2026-06-03 21:31:09 -07:00
Hunter B 66c88ddfae feat(runtime): allow thread workspace updates
Harvest the UpdateThreadRequest workspace field from PR #2640 while keeping the engine-cache correctness fix: PATCH /v1/threads/{id} now persists workspace changes, emits the workspace change in thread.updated, rejects empty paths, rejects workspace changes while a turn is active, and evicts idle cached engines so the next turn starts in the new workspace.
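
The validation order described above might look roughly like this. `Thread`, `UpdateError`, and `update_workspace` are hypothetical names, treating a whitespace-only path as empty is an assumption, and the HTTP status codes and `thread.updated` event are left out.

```rust
#[derive(Debug, PartialEq, Eq)]
enum UpdateError {
    EmptyWorkspace,
    TurnActive,
}

#[derive(Debug)]
struct Thread {
    workspace: String,
    turn_active: bool,
    /// Stands in for "an idle engine is cached for this thread".
    engine_cached: bool,
}

/// Apply a workspace change: reject empty paths, reject changes while a turn
/// is active, then persist the path and evict the idle cached engine so the
/// next turn starts in the new workspace.
fn update_workspace(thread: &mut Thread, new_workspace: &str) -> Result<(), UpdateError> {
    if new_workspace.trim().is_empty() {
        return Err(UpdateError::EmptyWorkspace);
    }
    if thread.turn_active {
        return Err(UpdateError::TurnActive);
    }
    thread.workspace = new_workspace.to_string();
    thread.engine_cached = false;
    Ok(())
}

fn main() {
    let mut thread = Thread {
        workspace: "/old".to_string(),
        turn_active: false,
        engine_cached: true,
    };
    println!("{:?}", update_workspace(&mut thread, ""));
    println!("{:?}", update_workspace(&mut thread, "/new"));
    println!("{:?}", thread);
}
```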

Validation: cargo fmt --all -- --check; git diff --check; cargo test -p codewhale-tui --bin codewhale-tui --locked update_thread_workspace -- --nocapture; cargo clippy -p codewhale-tui --locked -- -D warnings; python3 scripts/check-coauthor-trailers.py --author-map .github/AUTHOR_MAP --range origin/main..HEAD --check-authors.

Harvested from PR #2640 by @gaord.

Co-authored-by: gaord <9567937+gaord@users.noreply.github.com>
2026-06-03 21:11:55 -07:00
Hunter B fb86737a8c test(settings): assert migrated settings display canonical path
Extend the #2730 settings migration harvest with the missing platform-config fallback display assertion from review, and keep the v0.9 execution map/changelog credit current.

Validation: cargo fmt --all -- --check; git diff --check; cargo test -p codewhale-tui --bin codewhale-tui --locked settings_ -- --nocapture; cargo test -p codewhale-tui --bin codewhale-tui --locked display_localizes_header_and_config_file_label -- --nocapture.

Harvested from PR #2730 by @xyuai.

Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>
2026-06-03 21:02:46 -07:00
Hunter B 23c9481af1 feat: add HarmonyOS OpenHarmony support
Harvest the HarmonyOS/OpenHarmony port from PR #2634 and make it publish-safe by target-gating unsupported host dependencies out of the OHOS TUI graph. Self-update is disabled on OHOS, PTY shell mode reports unsupported, and Starlark execpolicy parsing returns an explicit unsupported-platform error until upstream starlark/rustyline/nix support catches up.

Add OHOS SDK setup docs and launcher scripts, install the rustls ring provider for rustls-no-provider entrypoints, and keep the packaged codewhale-tui OHOS graph free of starlark, rustyline, nix@0.28, portable-pty, and arboard.

Validation: cargo fmt --all -- --check; git diff --check; git diff --cached --check; cargo check -p codewhale-cli --locked; cargo check -p codewhale-app-server --locked; cargo check -p codewhale-tui --locked; cargo test -p codewhale-cli --locked update::tests::; cargo test -p codewhale-release --locked; cargo test -p codewhale-tui --locked background_tty_command_has_controlling_terminal; cargo test -p codewhale-tui --locked clipboard; cargo package -p codewhale-tui --allow-dirty --no-verify --locked; packaged OHOS cargo tree checks. OHOS target check still requires a loaded OpenHarmony SDK/sysroot and currently stops in ring with missing assert.h when CC/CFLAGS/linker are unset.

Harvested from PR #2634 by @shenjackyuanjie.

Co-authored-by: shenjackyuanjie <54507071+shenjackyuanjie@users.noreply.github.com>
2026-06-03 21:02:46 -07:00
HUQIANTAO 98edba3683 refactor(engine): append turn metadata after user text
Place user text before volatile turn metadata in outgoing user-message content arrays so provider prefix caches can continue matching the stable user-input prefix across date, model-route, and working-set changes.

Also adds wire-level coverage proving tail-positioned turn metadata serializes after user text while preserving turn-meta deduplication.
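
To illustrate why the ordering matters for prefix caches, here is a small sketch with a hypothetical `ContentPart` type and `build_user_content` helper; the real wire format is not shown in this log.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
enum ContentPart {
    Text(String),
    TurnMeta(String),
}

/// Build the content array with stable user text first and volatile turn
/// metadata last.
fn build_user_content(user_text: &str, turn_meta: Option<&str>) -> Vec<ContentPart> {
    let mut parts = vec![ContentPart::Text(user_text.to_string())];
    if let Some(meta) = turn_meta {
        parts.push(ContentPart::TurnMeta(meta.to_string()));
    }
    parts
}

/// Number of leading parts two content arrays share.
fn common_prefix_len(a: &[ContentPart], b: &[ContentPart]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

fn main() {
    let monday = build_user_content("fix the build", Some("date=mon"));
    let tuesday = build_user_content("fix the build", Some("date=tue"));
    // The user text still matches even though the metadata changed.
    println!("shared prefix parts: {}", common_prefix_len(&monday, &tuesday));
}
```

With metadata placed first, the same two messages would share no leading parts at all, which is the cache miss the commit avoids.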

Harvested from PR #2517 by @HUQIANTAO

Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 60f8e7d62e refactor(web_run): split cache locks for page reads
Harvested from PR #2502 by @HUQIANTAO

Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 311eb4002b feat(tui): add bounded restore snapshot listing
Harvested from PR #2513 by @cyq1017.

Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 8502784218 fix(xiaomi-mimo): use token-plan api-key auth
Harvests the MiMo Token Plan auth-header behavior from #2627 while keeping Xiaomi env-key precedence unchanged so standard endpoints do not accidentally receive a Token Plan key.

Harvested from PR #2627 by @xyuai.

Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 159f509dd6 fix(tui): invalidate fanout card rows on sibling starts 2026-06-03 21:01:39 -07:00
jrcjrcc 7b2a7e513d fix: Windows sub-agent completion halves TUI render width
Root cause: AgentComplete unconditionally calls resume_terminal()
even when the terminal was never paused, causing a secondary
EnterAlternateScreen on Windows that creates a new buffer whose
width may differ from the window width. Additionally,
ColorCompatBackend had no terminal_size cache, so size() fell
through to crossterm::terminal::size() which on Windows returns
the WinAPI buffer width rather than the window width.

Changes:
- AgentComplete: add event_broker.is_paused() guard
- resume_terminal(): cache real terminal size before reset_viewport
- Resize handler: also set terminal_size alongside forced_size
- subagent_routing: 3x mark_history_updated -> bump_history_cell(idx)
- color_compat: add terminal_size field, set_terminal_size(), fix
  size() fallback priority (forced_size > terminal_size)
- tests: 3 unit tests for size() fallback chain

Review feedback addressed:
- forced_size now takes priority over terminal_size (gemini-code-assist)
- Redundant map lookups removed in subagent_routing (both bots)
- set_terminal_size moved before reset_terminal_viewport (greptile-apps)
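
The size() fallback chain (forced_size, then the cached terminal_size, then a live query) can be sketched as below. The `Backend` struct and the closure standing in for the crossterm query are assumptions made so the example runs without a terminal.

```rust
#[derive(Default)]
struct Backend {
    forced_size: Option<(u16, u16)>,
    terminal_size: Option<(u16, u16)>,
}

impl Backend {
    fn set_terminal_size(&mut self, size: (u16, u16)) {
        self.terminal_size = Some(size);
    }

    /// forced_size wins, then the cached terminal_size; the live query runs
    /// only when neither is set.
    fn size(&self, query: impl FnOnce() -> (u16, u16)) -> (u16, u16) {
        self.forced_size
            .or(self.terminal_size)
            .unwrap_or_else(query)
    }
}

fn main() {
    let mut backend = Backend::default();
    // Pretend the OS reports a buffer width that differs from the window.
    println!("{:?}", backend.size(|| (300, 50)));
    backend.set_terminal_size((120, 40));
    println!("{:?}", backend.size(|| (300, 50)));
    backend.forced_size = Some((100, 30));
    println!("{:?}", backend.size(|| (300, 50)));
}
```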

(cherry picked from commit 4463c46644a6e485e7e20dc2b19c29c2e8eb3c5c)
2026-06-03 21:01:39 -07:00
Hunter B 4401f7a2e5 feat(tools): hide legacy subagent and shell aliases from model catalog (#2683)
Subagent aliases:
- Legacy names (agent_spawn, agent_result, agent_cancel, resume_agent,
  agent_list, agent_send_input, agent_assign, agent_wait,
  delegate_to_agent) are already NOT registered — they exist as dead code
  with #[allow(dead_code)] since v0.8.33
- Add test verifying model catalog only advertises canonical subagent
  tools: agent_open, agent_eval, agent_close, tool_agent

Shell aliases:
- Hide exec_wait from model catalog (legacy alias for exec_shell_wait)
- Hide exec_interact from model catalog (legacy alias for
  exec_shell_interact)
- Both remain callable for saved transcript replay
- Add test verifying shell aliases are hidden but callable

Verification: cargo test -p codewhale-tui --locked (4040 passed),
cargo clippy -D warnings
2026-06-03 21:01:38 -07:00
Implementist 88422f3ad3 fix(plan_prompt): pre-wrap CJK+Latin mixed text to avoid forced line-breaks at script boundaries
Wrap plan steps via wrap_text() before rendering, breaking only on display-width overflow, not on Latin/CJK Unicode word boundaries. Switch main render path from Wrap { trim: true } to Wrap { trim: false } since all content is pre-wrapped. Replace wrapped_line_count() with lines.len() for accurate scroll bounds. Keep confirm-exit dialog on Wrap { trim: true } (English-only, no risk).
2026-06-03 21:01:38 -07:00
Implementist 966b5cf1fb refactor(plan_prompt): use display-width in wrap_text, skip wasted render work
- wrap_text: replace chars().count() with UnicodeWidthStr::width() so
  CJK text is wrapped by display columns, consistent with
  wrapped_line_count and ratatui's Paragraph::wrap.  Also fix the
  hard-split loop to use exclusive byte ranges (..end) instead of
  inclusive (..=i) so multi-byte UTF-8 prefixes are always valid.
- render: hoist the confirming_exit branch to an early return so the
  plan-content construction (lines, scroll bounds, footer) is skipped
  entirely when the confirmation dialog is visible.
2026-06-03 21:01:38 -07:00
Implementist e3a52555eb fix(plan_prompt): clear pending_g on Esc, deduplicate render_modal_chrome
- Clear pending_g when Esc triggers the exit-confirmation prompt so a
  stray 'g' press does not leak into and survive the confirmation dialog.
- Move render_modal_chrome into the else branch so only one call fires
  per render pass, eliminating a shadow artifact when confirming_exit
  is active.
2026-06-03 21:01:38 -07:00
Implementist 47c071a0d5 chore: apply cargo fmt fix to plan_prompt.rs 2026-06-03 21:01:38 -07:00
Implementist 6d79d55b6c fix(plan_prompt): use display-width for leading spaces and de-hardcode wrap width
- wrapped_line_count: compute leading-space width via UnicodeWidthStr
  instead of byte length, so non-ASCII leading whitespace is measured
  correctly.
- render: hoist popup_area / content_width computation above plan
  rendering so wrap_text can share the same content_width derived
  from the actual popup geometry instead of a magic 68.
2026-06-03 21:01:38 -07:00
Implementist 11c448d66e fix(plan_prompt): remove step truncation to allow content overflow into scroll region 2026-06-03 21:01:38 -07:00
Implementist 537a8bccf3 fix(tui): replace manual div_ceil with usize::div_ceil to satisfy clippy lint 2026-06-03 21:01:38 -07:00
Implementist 14db9b2466 fix(tui): avoid spurious exit-confirmation on short plan after scroll key
Use clamped (effective) scroll instead of raw `self.scroll` in the Esc
handler so a short plan that fits entirely (max_scroll == 0) never
triggers the "exit without implementing?" dialog when the user pressed
a scroll key (PgDn/Ctrl-D/G/End) beforehand.
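
The clamping idea, sketched with hypothetical helper names and a simplified line-count model of the popup:

```rust
/// Largest useful scroll offset: zero when the whole plan fits.
fn max_scroll(content_lines: usize, visible_lines: usize) -> usize {
    content_lines.saturating_sub(visible_lines)
}

/// Esc asks for confirmation only when the effective (clamped) scroll is
/// non-zero, so a stale raw offset on a short plan is ignored.
fn esc_needs_confirmation(raw_scroll: usize, content_lines: usize, visible_lines: usize) -> bool {
    let effective = raw_scroll.min(max_scroll(content_lines, visible_lines));
    effective > 0
}

fn main() {
    // Short plan, user pressed End earlier: raw scroll is huge, effective is 0.
    println!("{}", esc_needs_confirmation(usize::MAX, 5, 20));
    // Long plan, actually scrolled.
    println!("{}", esc_needs_confirmation(3, 40, 20));
}
```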
2026-06-03 21:01:38 -07:00
Implementist 1669e3c12e fix(tui): address code review feedback on plan prompt modal
- Use word-wrapping-aware line count to prevent underestimating scroll range
  (gemini-code-assist / greptile-apps)
- Merge PLAN_OPTIONS, PLAN_SHORTCUTS, PLAN_SHORT_LABELS into PlanOption struct
  (gemini-code-assist)
- Remove dead Esc code in handle_key (greptile-apps)
- Guard gg/G with modifier checks (gemini-code-assist)
- Increase PgUp/PgDn scroll amount from 6 to 12 (greptile-apps)
- Use u16::try_from for scroll value to avoid silent truncation (greptile-apps)
- Update related unit tests for new scroll values
2026-06-03 21:01:38 -07:00
Implementist 68784cff52 fix(tui): add scroll support to plan prompt modal
- Add scroll state field to PlanPromptView with PgUp/PgDn, Ctrl+U/D/F/B,
  Home/End, gg/G vim-style keybindings
- Show scroll indicator footer when content overflows the popup
- Add confirming_exit state: Esc while scrolled asks for confirmation
  before discarding, preventing accidental exits on long plans
- Clamp scroll in render() so overscroll doesn't hide bottom options
- Use wrapped_line_count() with UnicodeWidthStr for accurate overflow
  detection with CJK characters
- Add 11 unit tests covering scroll, keybindings, and exit confirmation
2026-06-03 21:01:38 -07:00
HUQIANTAO 863f55cc68 perf(history): simplify output_rows cache API and switch to FNV-1a
Three follow-ups to the previous perf commit:

1. Drop the rows_hash field on CacheEntry. The field was computed and
   stored but never read on the hot path; tests exercised it only to
   assert the cache returned a stable hash. After this change
   get_or_compute_rows returns just Vec<OutputRow>, halving the
   tuple-return ABI and removing one DefaultHasher::write pass on
   every cache miss.

2. Replace DefaultHasher (SipHash) with a hand-rolled FNV-1a 64-bit
   hash. SipHash is per-process-keyed and ~5-10x slower than FNV on
   the small-to-medium tool output strings we see at 120 FPS. FNV-1a
   has no per-process key, fits in 20 lines of pure Rust, and a 64-bit
   collision space is more than wide enough for the per-process LRU's
   expected <= a few hundred entries. The cache is a performance
   optimization, not a security boundary; collisions only cause a
   false miss, never wrong data.

3. Caller in tui::history::render_preserved_output_mode updated to
   the new Vec<OutputRow>-only signature. Two new tests cover the
   FNV-1a properties (length-suffix sensitivity, empty-input
   stability).
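
For reference, a plain 64-bit FNV-1a looks like the sketch below. It uses the standard offset basis and prime; any length mixing or other tweaks in the project's hand-rolled version are not shown in this log and are not reproduced here.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Standard FNV-1a: XOR each byte into the hash, then multiply by the prime.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}

fn main() {
    println!("{:016x}", fnv1a_64(b""));
    println!("{:016x}", fnv1a_64(b"a"));
    println!("{:016x}", fnv1a_64(b"tool output line"));
}
```

Because there is no per-process key, the same input hashes to the same value on every run, unlike the randomly seeded hashing behind `HashMap`.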
2026-06-03 21:01:38 -07:00
HUQIANTAO 3b0ef3f63c perf(history): cache output_rows and selected_output_indices per cell
output_rows (in tui::history) walks the raw tool output, ANSI-strips
each line, classifies path/URL-like rows, and wraps the rest to the
current viewport width. selected_output_indices then computes the
head/tail/importance subset that the compact Live view shows. Both
functions are pure, but they are called on every render frame for
every visible tool cell. For a 4 KB tool output on a 120 FPS render
loop that is 2-6 redundant walks per frame, per cell, and the
function is called from a non-trivial number of cells across
exec, tool, command, and review history.

Add tui::output_rows_cache, a thread-local, content-addressed cache
keyed on (content_hash, width) for the rows and (content_hash, width,
line_limit) for the indices. The cache stores the wrapped
Vec<OutputRow> plus a per-line-limit map of selected indices on a
single entry, so a single key lookup satisfies both render steps.

render_preserved_output_mode now consults the cache for both the
rows and the indices; on a hit, neither the per-line ANSI strip nor
the importance-ranking pass runs. The cache is bounded (default
capacity 256) with insertion-order eviction. The OutputRow struct
gains PartialEq + Eq + pub fields so the cache module can store and
hash it without exposing private internals.
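
A minimal sketch of a bounded, insertion-order cache keyed on (content_hash, width). `RowsCache` and `get_or_compute` are hypothetical names, rows are simplified to `String`, and the thread-local storage and per-line-limit index map from the commit are omitted.

```rust
use std::collections::{HashMap, VecDeque};

struct RowsCache {
    capacity: usize,
    entries: HashMap<(u64, u16), Vec<String>>,
    /// Keys in insertion order; the front is evicted first.
    order: VecDeque<(u64, u16)>,
}

impl RowsCache {
    fn new(capacity: usize) -> Self {
        RowsCache {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return cached rows for `key`, or run `compute` and remember the result.
    fn get_or_compute(
        &mut self,
        key: (u64, u16),
        compute: impl FnOnce() -> Vec<String>,
    ) -> Vec<String> {
        if let Some(rows) = self.entries.get(&key) {
            return rows.clone();
        }
        let rows = compute();
        if self.capacity == 0 {
            return rows;
        }
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.order.push_back(key);
        self.entries.insert(key, rows.clone());
        rows
    }
}

fn main() {
    let mut cache = RowsCache::new(2);
    let mut computed = 0;
    for key in vec![(1u64, 80u16), (1, 80), (1, 100), (2, 80), (1, 80)] {
        cache.get_or_compute(key, || {
            computed += 1;
            vec![format!("rows for {:?}", key)]
        });
    }
    println!("computed {} of 5 lookups", computed);
}
```

Insertion-order eviction is simpler than true LRU because hits do not need to reorder anything; the cost is that a hot entry can still be evicted once enough newer keys arrive.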

Tests: 6 new unit tests cover the hit/miss path, width invalidation,
content invalidation, indices per-line_limit caching, capacity
eviction, and hash stability. The wider tui::history test suite (68
tests) still passes.
2026-06-03 21:01:38 -07:00
HUQIANTAO c0b36824c2 perf(capacity): let scan_canonical_inputs early-exit without verified-user lookup
The build_canonical_state path never reads
CanonicalStateScan::latest_verified_user_idx, but the previous patch
required is_complete() to find a verified user message before it would
short-circuit. On a long history with no verification replay — the
common case — the scan walked the entire message list looking for a
match that could not exist.

Add a find_verified: bool parameter to scan_canonical_inputs and
CanonicalStateScan::is_complete. build_canonical_state now passes
false, so the loop stops as soon as the goal and CANONICAL_SCAN_MAX_FACTS
facts are found. The replan path (apply_verify_and_replan) keeps the
existing true behavior so it still locates the latest verified user
message.
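
The effect of the find_verified flag on the early exit can be sketched as follows. The `Message` enum, the `Scan` struct, the `visited` counter, and `MAX_FACTS` are simplified stand-ins for the real session types, `CanonicalStateScan`, and `CANONICAL_SCAN_MAX_FACTS`.

```rust
const MAX_FACTS: usize = 4;

#[derive(Debug, Clone, PartialEq, Eq)]
enum Message {
    User(String),
    VerificationReplay,
    Fact(String),
}

#[derive(Debug, Default)]
struct Scan {
    goal: Option<String>,
    facts: Vec<String>,
    latest_verified_idx: Option<usize>,
    /// How many messages the scan looked at (for demonstration only).
    visited: usize,
}

impl Scan {
    fn is_complete(&self, find_verified: bool) -> bool {
        self.goal.is_some()
            && self.facts.len() >= MAX_FACTS
            && (!find_verified || self.latest_verified_idx.is_some())
    }
}

/// Walk messages once, newest first, and stop as soon as every collector
/// the caller asked for is satisfied.
fn scan_inputs(messages: &[Message], find_verified: bool) -> Scan {
    let mut scan = Scan::default();
    for (idx, msg) in messages.iter().enumerate().rev() {
        scan.visited += 1;
        match msg {
            Message::User(text) if scan.goal.is_none() => scan.goal = Some(text.clone()),
            Message::VerificationReplay if find_verified && scan.latest_verified_idx.is_none() => {
                scan.latest_verified_idx = Some(idx)
            }
            Message::Fact(text) if scan.facts.len() < MAX_FACTS => scan.facts.push(text.clone()),
            _ => {}
        }
        if scan.is_complete(find_verified) {
            break;
        }
    }
    scan
}

fn main() {
    let mut messages = vec![Message::VerificationReplay];
    for i in 0..50 {
        messages.push(Message::Fact(format!("old {}", i)));
    }
    messages.push(Message::User("goal".to_string()));
    for i in 0..4 {
        messages.push(Message::Fact(format!("fact {}", i)));
    }
    println!("without verified lookup: visited {}", scan_inputs(&messages, false).visited);
    println!("with verified lookup: visited {}", scan_inputs(&messages, true).visited);
}
```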

Test calls are updated to match; no behavior change for any test.
2026-06-03 21:01:38 -07:00
HUQIANTAO 837a6f8c54 perf(capacity): collapse build_canonical_state's reverse scans to one pass
build_canonical_state previously did two independent reverse walks of
session.messages — one to extract the most recent user goal, and one
to collect up to four confirmed-fact snippets. apply_verify_and_replan
then added a third and fourth reverse scan to locate the latest user
message and the latest [verification replay] user message for the
re-plan path.

All four reverse scans collect disjoint facts about the same most-
recent-first view of the conversation. This PR folds them into a
single helper, scan_canonical_inputs, that walks messages once in
reverse, fills a CanonicalStateScan, and short-circuits as soon as
every collector is satisfied. The helper exposes the latest-message
indices so apply_verify_and_replan can clone the full Message values
after the scan (eliminating the two independent find().cloned() walks).

The output CanonicalState is byte-identical to the prior
implementation: same goal, same confirmed facts (newest first, errors
filtered), same fallback string when no user text exists. The re-plan
path's keep-messages set is identical: latest user + latest verified.

Tests: 6 new unit tests cover the goal lookup, fact cap, error-result
filter, verified-marker scan, empty input, and the early-exit
condition. The full engine test suite (153 tests) still passes.
2026-06-03 21:01:38 -07:00
HUQIANTAO e3adc98baf perf(prefix-cache): fold tool.strict into identity hash, share cache with PrefixFingerprint::compute
Three follow-ups to the previous perf commit:

1. Correctness: tool.strict participates in the wire format emitted by
   tool_to_api_json, so it MUST participate in the cache identity. Two
   catalogs that differ only in strict would otherwise collide and serve
   a stale SHA-256, silently busting prefix-cache stability on the wire.

2. Allocation: replace the per-tool serde_json::to_string in
   tool_set_identity with a hash_json_value helper that walks the JSON
   tree directly. For a 60-tool catalog this drops ~25-40 KB of
   transient allocation per cache miss.

3. Dead code: the previous patch introduced PrefixFingerprint::compute,
   CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a
   thread-local cache helper that were not used outside tests. With
   -D warnings in CI all four triggered dead-code errors. The compute
   helper is now only built in cfg(test); the rest are marked
   #[allow(dead_code)] with comments explaining their observability and
   test-only use.
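
Point 1 can be illustrated with a small identity hash that includes the strict flag. The `Tool` struct, the `Option<bool>` type for `strict`, and the use of `DefaultHasher` are assumptions made for this sketch; the actual `tool_set_identity` walks JSON values through `hash_json_value`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified tool description.
struct Tool {
    name: String,
    description: String,
    input_schema: String,
    strict: Option<bool>,
}

/// Every field that reaches the wire format must feed the identity, otherwise
/// two different catalogs can share one cache entry.
fn tool_set_identity(tools: &[Tool]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tools.len().hash(&mut hasher);
    for tool in tools {
        tool.name.hash(&mut hasher);
        tool.description.hash(&mut hasher);
        tool.input_schema.hash(&mut hasher);
        tool.strict.hash(&mut hasher);
    }
    hasher.finish()
}

/// Convenience constructor for the example.
fn tool(name: &str, strict: Option<bool>) -> Tool {
    Tool {
        name: name.to_string(),
        description: "reads a file".to_string(),
        input_schema: "{\"type\":\"object\"}".to_string(),
        strict,
    }
}

fn main() {
    let loose = [tool("read_file", None)];
    let strict = [tool("read_file", Some(true))];
    println!("loose:  {:016x}", tool_set_identity(&loose));
    println!("strict: {:016x}", tool_set_identity(&strict));
}
```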
2026-06-03 21:01:37 -07:00
HUQIANTAO baef5ba95d perf(prefix-cache): cache tool-catalog JSON serialization across checks
PrefixFingerprint::compute is called once per turn by the turn loop
prefix-stability check. The tool-side work serializes every tool to the
chat-API JSON shape, sorts the resulting strings, joins with newlines,
and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of
allocation plus a sort, all of which produces a byte-identical output
once the tool set is stable across turns (the common case after the
first turn of a session).

Introduce a process-local ToolCatalogCache that stores the joined+sorted
catalog under a content-derived u64 identity (length + per-tool name +
description + serialized input_schema). On a hit, the per-tool JSON
serialization, sort, and join are skipped entirely — the pre-computed
SHA-256 hex digest is returned directly.

The cache lives on PrefixStabilityManager (per-session ownership) and
backs a new PrefixFingerprint::compute_with_tool_cache entry point.
check_and_update, PrefixStabilityManager::new, and pin() all use the
cached path. The original compute() is kept as a fallback for callers
that do not have a cache in hand (e.g. CLI tools that build a one-shot
fingerprint).

The cache is bounded (default capacity = 8) and uses insertion-order
eviction, matching the eviction strategy already in
transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload
and MCP attach paths.

Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on
hit), identity collisions, schema change detection, capacity eviction,
invalidate, empty slice, and the equivalence between cached and uncached
fingerprints. The full 30-test prefix_cache suite passes; the wider
prefix-cache contract tests in settings, prompts, and
core::engine::tests continue to pass.
2026-06-03 21:01:37 -07:00