Commit Graph

2210 Commits

Author SHA1 Message Date
Hunter B 111a805eb8 docs: mark mention depth hint harvested 2026-06-03 21:02:45 -07:00
Hunter B 5dc1a63cd4 docs: harvest provider fallback chain RFC
Harvested from PR #2581 by @idling11.

Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 44ceabd606 docs: refresh README and v0.9 execution map 2026-06-03 21:02:45 -07:00
Hunter B 8502784218 fix(xiaomi-mimo): use token-plan api-key auth
Harvests the MiMo Token Plan auth-header behavior from #2627 while keeping Xiaomi env-key precedence unchanged so standard endpoints do not accidentally receive a Token Plan key.

Harvested from PR #2627 by @xyuai.

Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>
2026-06-03 21:02:45 -07:00
Hunter B 159f509dd6 fix(tui): invalidate fanout card rows on sibling starts 2026-06-03 21:01:39 -07:00
jrcjrcc 7b2a7e513d fix: Windows sub-agent completion halves TUI render width
Root cause: AgentComplete unconditionally calls resume_terminal()
even when the terminal was never paused, causing a secondary
EnterAlternateScreen on Windows that creates a new buffer whose
width may differ from the window width. Additionally,
ColorCompatBackend had no terminal_size cache, so size() fell
through to crossterm::terminal::size() which on Windows returns
the WinAPI buffer width rather than the window width.

Changes:
- AgentComplete: add event_broker.is_paused() guard
- resume_terminal(): cache real terminal size before reset_viewport
- Resize handler: also set terminal_size alongside forced_size
- subagent_routing: 3x mark_history_updated -> bump_history_cell(idx)
- color_compat: add terminal_size field, set_terminal_size(), fix
  size() fallback priority (forced_size > terminal_size)
- tests: 3 unit tests for size() fallback chain

Review feedback addressed:
- forced_size now takes priority over terminal_size (gemini-code-assist)
- Redundant map lookups removed in subagent_routing (both bots)
- set_terminal_size moved before reset_terminal_viewport (greptile-apps)

(cherry picked from commit 4463c46644a6e485e7e20dc2b19c29c2e8eb3c5c)
2026-06-03 21:01:39 -07:00
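The size() fallback priority described in this commit can be sketched as below. This is a minimal illustration, not the project's code: the struct name `SizeSource` and the `query_os` closure (a stand-in for asking the terminal library) are assumptions; only the field names `forced_size` and `terminal_size` come from the commit message.

```rust
/// Hypothetical stand-in for the backend described above. Only the two
/// field names are taken from the commit message.
#[derive(Default)]
struct SizeSource {
    forced_size: Option<(u16, u16)>,
    terminal_size: Option<(u16, u16)>,
}

impl SizeSource {
    /// Resolve the size in priority order: `forced_size`, then the cached
    /// `terminal_size`, and only then the caller-supplied OS query.
    fn size(&self, query_os: impl FnOnce() -> (u16, u16)) -> (u16, u16) {
        self.forced_size
            .or(self.terminal_size)
            .unwrap_or_else(query_os)
    }
}
```

With a cached `terminal_size` in place, the OS query (which on Windows may report the buffer width) is never reached.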
Hunter B 27db89c25d docs: update TOOL_SURFACE.md with v0.9.0 hidden-alias table (#2682, #2683) 2026-06-03 21:01:38 -07:00
Hunter B 4401f7a2e5 feat(tools): hide legacy subagent and shell aliases from model catalog (#2683)
Subagent aliases:
- Legacy names (agent_spawn, agent_result, agent_cancel, resume_agent,
  agent_list, agent_send_input, agent_assign, agent_wait,
  delegate_to_agent) are already NOT registered — they exist as dead code
  with #[allow(dead_code)] since v0.8.33
- Add test verifying model catalog only advertises canonical subagent
  tools: agent_open, agent_eval, agent_close, tool_agent

Shell aliases:
- Hide exec_wait from model catalog (legacy alias for exec_shell_wait)
- Hide exec_interact from model catalog (legacy alias for
  exec_shell_interact)
- Both remain callable for saved transcript replay
- Add test verifying shell aliases are hidden but callable

Verification: cargo test -p codewhale-tui --locked (4040 passed),
cargo clippy -D warnings
2026-06-03 21:01:38 -07:00
Implementist 88422f3ad3 fix(plan_prompt): pre-wrap CJK+Latin mixed text to avoid forced line-breaks at script boundaries
Wrap plan steps via wrap_text() before rendering, breaking only on display-width overflow, not on Latin/CJK Unicode word boundaries. Switch main render path from Wrap { trim: true } to Wrap { trim: false } since all content is pre-wrapped. Replace wrapped_line_count() with lines.len() for accurate scroll bounds. Keep confirm-exit dialog on Wrap { trim: true } (English-only, no risk).
2026-06-03 21:01:38 -07:00
Implementist 966b5cf1fb refactor(plan_prompt): use display-width in wrap_text, skip wasted render work
- wrap_text: replace chars().count() with UnicodeWidthStr::width() so
  CJK text is wrapped by display columns, consistent with
  wrapped_line_count and ratatui's Paragraph::wrap.  Also fix the
  hard-split loop to use exclusive byte ranges (..end) instead of
  inclusive (..=i) so multi-byte UTF-8 prefixes are always valid.
- render: hoist the confirming_exit branch to an early return so the
  plan-content construction (lines, scroll bounds, footer) is skipped
  entirely when the confirmation dialog is visible.
2026-06-03 21:01:38 -07:00
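A rough sketch of the hard-split idea from this commit: measure by display columns and cut with exclusive byte ranges so every slice ends on a character boundary. The commit uses UnicodeWidthStr::width(); to stay dependency-free, this sketch substitutes a hypothetical `approx_width` helper that only approximates wide ranges, and `hard_split` is an illustrative name.

```rust
/// Rough stand-in for a display-width lookup (an assumption, not the
/// unicode-width crate): common CJK and full-width ranges count as two
/// columns, everything else as one.
fn approx_width(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F
        | 0x2E80..=0xA4CF
        | 0xAC00..=0xD7A3
        | 0xF900..=0xFAFF
        | 0xFE30..=0xFE4F
        | 0xFF00..=0xFF60
        | 0xFFE0..=0xFFE6 => 2,
        _ => 1,
    }
}

/// Split `text` into chunks of at most `max_cols` display columns.
/// A single character wider than `max_cols` still gets its own chunk.
fn hard_split(text: &str, max_cols: usize) -> Vec<&str> {
    let mut out = Vec::new();
    let mut start = 0; // byte offset where the current chunk begins
    let mut cols = 0; // display columns used by the current chunk
    for (i, c) in text.char_indices() {
        let w = approx_width(c);
        if cols + w > max_cols && i > start {
            // Exclusive end: `i` is the first byte of the next character,
            // so the slice never cuts through a multi-byte sequence.
            out.push(&text[start..i]);
            start = i;
            cols = 0;
        }
        cols += w;
    }
    if start < text.len() {
        out.push(&text[start..]);
    }
    out
}
```

Counting chars instead of columns would let four CJK characters through a four-column limit even though they occupy eight columns.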
Implementist e3a52555eb fix(plan_prompt): clear pending_g on Esc, deduplicate render_modal_chrome
- Clear pending_g when Esc triggers the exit-confirmation prompt so a
  stray 'g' press does not leak into and survive the confirmation dialog.
- Move render_modal_chrome into the else branch so only one call fires
  per render pass, eliminating a shadow artifact when confirming_exit
  is active.
2026-06-03 21:01:38 -07:00
Implementist 47c071a0d5 chore: apply cargo fmt fix to plan_prompt.rs 2026-06-03 21:01:38 -07:00
Implementist 6d79d55b6c fix(plan_prompt): use display-width for leading spaces and de-hardcode wrap width
- wrapped_line_count: compute leading-space width via UnicodeWidthStr
  instead of byte length, so non-ASCII leading whitespace is measured
  correctly.
- render: hoist popup_area / content_width computation above plan
  rendering so wrap_text can share the same content_width derived
  from the actual popup geometry instead of a magic 68.
2026-06-03 21:01:38 -07:00
Implementist 11c448d66e fix(plan_prompt): remove step truncation to allow content overflow into scroll region 2026-06-03 21:01:38 -07:00
Implementist 537a8bccf3 fix(tui): replace manual div_ceil with usize::div_ceil to satisfy clippy lint 2026-06-03 21:01:38 -07:00
Implementist 14db9b2466 fix(tui): avoid spurious exit-confirmation on short plan after scroll key
Use clamped (effective) scroll instead of raw `self.scroll` in the Esc
handler so a short plan that fits entirely (max_scroll == 0) never
triggers the "exit without implementing?" dialog when the user pressed
a scroll key (PgDn/Ctrl-D/G/End) beforehand.
2026-06-03 21:01:38 -07:00
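The clamped-scroll check from this commit might look like the sketch below. The function and enum names are hypothetical, and the modal state is reduced to three numbers for illustration.

```rust
/// Hypothetical reduction of the modal's Esc decision.
#[derive(Debug, PartialEq)]
enum EscAction {
    Close,
    ConfirmExit,
}

fn on_esc(raw_scroll: usize, content_lines: usize, viewport_lines: usize) -> EscAction {
    // How far the content can actually scroll; 0 when everything fits.
    let max_scroll = content_lines.saturating_sub(viewport_lines);
    // A scroll key may have pushed the raw value past the end, so clamp
    // before deciding whether the user is "scrolled".
    let effective = raw_scroll.min(max_scroll);
    if effective > 0 {
        EscAction::ConfirmExit
    } else {
        EscAction::Close
    }
}
```

A short plan (5 lines in a 20-line viewport) closes directly even if PgDn left the raw scroll at 12, because the effective scroll is 0.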
Implementist 1669e3c12e fix(tui): address code review feedback on plan prompt modal
- Use word-wrapping-aware line count to prevent underestimating scroll range
  (gemini-code-assist / greptile-apps)
- Merge PLAN_OPTIONS, PLAN_SHORTCUTS, PLAN_SHORT_LABELS into PlanOption struct
  (gemini-code-assist)
- Remove dead Esc code in handle_key (greptile-apps)
- Guard gg/G with modifier checks (gemini-code-assist)
- Increase PgUp/PgDn scroll amount from 6 to 12 (greptile-apps)
- Use u16::try_from for scroll value to avoid silent truncation (greptile-apps)
- Update related unit tests for new scroll values
2026-06-03 21:01:38 -07:00
Implementist 68784cff52 fix(tui): add scroll support to plan prompt modal
- Add scroll state field to PlanPromptView with PgUp/PgDn, Ctrl+U/D/F/B,
  Home/End, gg/G vim-style keybindings
- Show scroll indicator footer when content overflows the popup
- Add confirming_exit state: Esc while scrolled asks for confirmation
  before discarding, preventing accidental exits on long plans
- Clamp scroll in render() so overscroll doesn't hide bottom options
- Use wrapped_line_count() with UnicodeWidthStr for accurate overflow
  detection with CJK characters
- Add 11 unit tests covering scroll, keybindings, and exit confirmation
2026-06-03 21:01:38 -07:00
HUQIANTAO 863f55cc68 perf(history): simplify output_rows cache API and switch to FNV-1a
Three follow-ups to the previous perf commit:

1. Drop the rows_hash field on CacheEntry. The field was computed and
   stored but never read on the hot path; tests exercised it only to
   assert the cache returned a stable hash. After this change
   get_or_compute_rows returns just Vec<OutputRow>, halving the
   tuple-return ABI and removing one DefaultHasher::write pass on
   every cache miss.

2. Replace DefaultHasher (SipHash) with a hand-rolled FNV-1a 64-bit
   hash. SipHash is per-process-keyed and ~5-10x slower than FNV on
   the small-to-medium tool output strings we see at 120 FPS. FNV-1a
   has no per-process key, fits in 20 lines of pure-Rust, and a 64-bit
   collision space is more than wide enough for the per-process LRU's
   expected <= a few hundred entries. The cache is a performance
   optimization, not a security boundary; collisions only cause a
   false miss, never wrong data.

3. Caller in tui::history::render_preserved_output_mode updated to
   the new Vec<OutputRow>-only signature. Two new tests cover the
   FNV-1a properties (length-suffix sensitivity, empty-input
   stability).
2026-06-03 21:01:38 -07:00
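For reference, a minimal 64-bit FNV-1a looks like the sketch below. It uses the standard published offset basis and prime; the function name is illustrative, and the project's version may differ in details the commit message does not show.

```rust
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
/// There is no per-process key, so equal input always hashes equally.
fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}
```

Empty input returns the offset basis unchanged, which is one way to read the "empty-input stability" property the commit's tests mention.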
HUQIANTAO 3b0ef3f63c perf(history): cache output_rows and selected_output_indices per cell
output_rows (in tui::history) walks the raw tool output, ANSI-strips
each line, classifies path/URL-like rows, and wraps the rest to the
current viewport width. selected_output_indices then computes the
head/tail/importance subset that the compact Live view shows. Both
functions are pure, but they are called on every render frame for
every visible tool cell. For a 4 KB tool output on a 120 FPS render
loop that is 2-6 redundant walks per frame, per cell, and the
function is called from a non-trivial number of cells across
exec, tool, command, and review history.

Add tui::output_rows_cache, a thread-local, content-addressed cache
keyed on (content_hash, width) for the rows and (content_hash, width,
line_limit) for the indices. The cache stores the wrapped
Vec<OutputRow> plus a per-line-limit map of selected indices on a
single entry, so a single key lookup satisfies both render steps.

render_preserved_output_mode now consults the cache for both the
rows and the indices; on a hit, neither the per-line ANSI strip nor
the importance-ranking pass runs. The cache is bounded (default
capacity 256) with insertion-order eviction. The OutputRow struct
gains PartialEq + Eq + pub fields so the cache module can store and
hash it without exposing private internals.

Tests: 6 new unit tests cover the hit/miss path, width invalidation,
content invalidation, indices per-line_limit caching, capacity
eviction, and hash stability. The wider tui::history test suite (68
tests) still passes.
2026-06-03 21:01:38 -07:00
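A bounded cache with insertion-order eviction, keyed on (content_hash, width) as this commit describes, can be sketched as follows. `RowsCache` and `get_or_compute` are hypothetical names, rows are simplified to `String`, and the per-line-limit index map is left out.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical bounded cache keyed on (content_hash, width).
struct RowsCache {
    capacity: usize,
    entries: HashMap<(u64, u16), Vec<String>>,
    order: VecDeque<(u64, u16)>, // oldest inserted key at the front
}

impl RowsCache {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return the cached rows for `key`, computing and storing them on a miss.
    fn get_or_compute(
        &mut self,
        key: (u64, u16),
        compute: impl FnOnce() -> Vec<String>,
    ) -> Vec<String> {
        if let Some(rows) = self.entries.get(&key) {
            return rows.clone(); // hit: skip the expensive walk
        }
        let rows = compute();
        if self.capacity == 0 {
            return rows; // a zero-capacity cache stores nothing
        }
        if self.entries.len() >= self.capacity {
            // Evict by insertion order, not by recency of use.
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        self.order.push_back(key);
        self.entries.insert(key, rows.clone());
        rows
    }
}
```

Because the width is part of the key, a terminal resize naturally misses and recomputes the wrapped rows.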
HUQIANTAO c0b36824c2 perf(capacity): let scan_canonical_inputs early-exit without verified-user lookup
The build_canonical_state path never reads
CanonicalStateScan::latest_verified_user_idx, but the previous patch
required is_complete() to find a verified user message before it would
short-circuit. On a long history with no verification replay — the
common case — the scan walked the entire message list looking for a
match that could not exist.

Add a find_verified: bool parameter to scan_canonical_inputs and
CanonicalStateScan::is_complete. build_canonical_state now passes
false, so the loop stops as soon as the goal and CANONICAL_SCAN_MAX_FACTS
facts are found. The replan path (apply_verify_and_replan) keeps the
existing true behavior so it still locates the latest verified user
message.

Test calls are updated to match; no behavior change for any test.
2026-06-03 21:01:38 -07:00
HUQIANTAO 837a6f8c54 perf(capacity): collapse build_canonical_state's reverse scans to one pass
build_canonical_state previously did two independent reverse walks of
session.messages — one to extract the most recent user goal, and one
to collect up to four confirmed-fact snippets. apply_verify_and_replan
then added a third and fourth reverse scan to locate the latest user
message and the latest [verification replay] user message for the
re-plan path.

All four reverse scans collect disjoint facts about the same most-
recent-first view of the conversation. This PR folds them into a
single helper, scan_canonical_inputs, that walks messages once in
reverse, fills a CanonicalStateScan, and short-circuits as soon as
every collector is satisfied. The helper exposes the latest-message
indices so apply_verify_and_replan can clone the full Message values
after the scan (eliminating the two independent find().cloned() walks).

The output CanonicalState is byte-identical to the prior
implementation: same goal, same confirmed facts (newest first, errors
filtered), same fallback string when no user text exists. The re-plan
path's keep-messages set is identical: latest user + latest verified.

Tests: 6 new unit tests cover the goal lookup, fact cap, error-result
filter, verified-marker scan, empty input, and the early-exit
condition. The full engine test suite (153 tests) still passes.
2026-06-03 21:01:38 -07:00
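The single reverse pass with an early exit might be sketched like this. The message model, `Scan`, and `scan_once` are simplified stand-ins (the real Message and CanonicalStateScan types are not shown in the log); the cap of four facts comes from the commit text, and the `visited` counter exists only to make the short-circuit observable.

```rust
/// Simplified message model; an assumption for illustration.
enum Msg {
    User(String),
    Fact(String),
    Other,
}

const MAX_FACTS: usize = 4;

#[derive(Default)]
struct Scan {
    goal: Option<String>,
    facts: Vec<String>, // newest first
    visited: usize,     // messages inspected before stopping
}

impl Scan {
    fn is_complete(&self) -> bool {
        self.goal.is_some() && self.facts.len() >= MAX_FACTS
    }
}

/// Walk the history once, newest first, filling every collector and
/// stopping as soon as all of them are satisfied.
fn scan_once(messages: &[Msg]) -> Scan {
    let mut scan = Scan::default();
    for msg in messages.iter().rev() {
        if scan.is_complete() {
            break;
        }
        scan.visited += 1;
        match msg {
            Msg::User(text) if scan.goal.is_none() => scan.goal = Some(text.clone()),
            Msg::Fact(text) if scan.facts.len() < MAX_FACTS => scan.facts.push(text.clone()),
            _ => {}
        }
    }
    scan
}
```

On a long history whose tail already contains the goal and four facts, the loop touches only those five messages.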
HUQIANTAO e3adc98baf perf(prefix-cache): fold tool.strict into identity hash, share cache with PrefixFingerprint::compute
Three follow-ups to the previous perf commit:

1. Correctness: tool.strict participates in the wire format emitted by
   tool_to_api_json, so it MUST participate in the cache identity. Two
   catalogs that differ only in strict would otherwise collide and serve
   a stale SHA-256, silently busting prefix-cache stability on the wire.

2. Allocation: replace the per-tool serde_json::to_string in
   tool_set_identity with a hash_json_value helper that walks the JSON
   tree directly. For a 60-tool catalog this drops ~25-40 KB of
   transient allocation per cache miss.

3. Dead code: the previous patch introduced PrefixFingerprint::compute,
   CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a
   thread-local cache helper that were not used outside tests. With
   -D warnings in CI all four triggered dead-code errors. The compute
   helper is now only built in cfg(test); the rest are marked
   #[allow(dead_code)] with comments explaining their observability and
   test-only use.
2026-06-03 21:01:37 -07:00
HUQIANTAO baef5ba95d perf(prefix-cache): cache tool-catalog JSON serialization across checks
PrefixFingerprint::compute is called once per turn by the turn loop
prefix-stability check. The tool-side work serializes every tool to the
chat-API JSON shape, sorts the resulting strings, joins with newlines,
and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of
allocation plus a sort, all of which produces a byte-identical output
once the tool set is stable across turns (the common case after the
first turn of a session).

Introduce a process-local ToolCatalogCache that stores the joined+sorted
catalog under a content-derived u64 identity (length + per-tool name +
description + serialized input_schema). On a hit, the per-tool JSON
serialization, sort, and join are skipped entirely — the pre-computed
SHA-256 hex digest is returned directly.

The cache lives on PrefixStabilityManager (per-session ownership) and
backs a new PrefixFingerprint::compute_with_tool_cache entry point.
check_and_update, PrefixStabilityManager::new, and pin() all use the
cached path. The original compute() is kept as a fallback for callers
that do not have a cache in hand (e.g. CLI tools that build a one-shot
fingerprint).

The cache is bounded (default capacity = 8) and uses insertion-order
eviction, matching the eviction strategy already in
transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload
and MCP attach paths.

Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on
hit), identity collisions, schema change detection, capacity eviction,
invalidate, empty slice, and the equivalence between cached and uncached
fingerprints. The full 30-test prefix_cache suite passes; the wider
prefix-cache contract tests in settings, prompts, and
core::engine::tests continue to pass.
2026-06-03 21:01:37 -07:00
HUQIANTAO 3de07a99ed perf(engine): memoize estimated_input_tokens via content-keyed cache
The token estimator walks the full session.messages and the active system
prompt. Five call sites per turn in the engine (capacity pre/post tool
checkpoints, error escalation, the seam manager, the trim budget check)
plus four TUI/command consumers (footer, /status, /debug, context
inspector) all re-walked the same data independently. On a 200-message
history with 5 KB of tool results that is roughly 2 ms per call, or
~20 ms of pure waste on a single turn.

Introduce a process-local TokenEstimateCache keyed on
(session.messages_revision, system_prompt_fingerprint). Repeated calls
with the same inputs return the cached value without re-walking the
message list. The cache invalidates as soon as either input changes:
  * session.messages_revision is a monotonic counter bumped in
    Session::add_message, Session::replace_messages, the new
    Session::bump_messages_revision helper, and at every direct
    session.messages mutation site in core/engine.rs and
    core/engine/capacity_flow.rs.
  * system_prompt_fingerprint is a stable 64-bit hash of the
    SystemPrompt::Text or SystemPrompt::Blocks payload.

Also restructures layered_context_checkpoint to compute the estimated
token count before taking a long-lived &SeamManager borrow, and
re-routes the capacity pre/post tool checkpoints to compute the
observation into a local before calling
capacity_controller.observe_*. Both refactors are required to satisfy
the borrow checker once estimated_input_tokens requires &mut self.

Tests: 10 new unit tests cover the miss/hit path, revision bumps,
system-prompt changes, audit-ring capacity, and downward-revision
no-ops. The full 157-test engine suite still passes.
2026-06-03 21:01:37 -07:00
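The revision-keyed memoization can be illustrated with a single-entry sketch. Everything here is an assumption made for illustration: the cache holds one entry, `Session` is reduced to two fields, and `rough_tokens` uses a bytes/4 placeholder rather than the project's estimator. Only the key shape (messages_revision, system_prompt_fingerprint) and the counter bump in add_message follow the commit message.

```rust
/// Hypothetical single-entry memo keyed on (messages_revision, prompt_fingerprint).
#[derive(Default)]
struct TokenEstimateCache {
    last: Option<((u64, u64), usize)>,
}

impl TokenEstimateCache {
    fn get_or_compute(
        &mut self,
        revision: u64,
        prompt_fp: u64,
        estimate: impl FnOnce() -> usize,
    ) -> usize {
        let key = (revision, prompt_fp);
        if let Some((cached_key, value)) = self.last {
            if cached_key == key {
                return value; // same inputs: skip the walk
            }
        }
        let value = estimate();
        self.last = Some((key, value));
        value
    }
}

#[derive(Default)]
struct Session {
    messages: Vec<String>,
    messages_revision: u64, // monotonic; bumped on every mutation
}

impl Session {
    fn add_message(&mut self, m: String) {
        self.messages.push(m);
        self.messages_revision += 1;
    }

    /// Placeholder estimator (bytes / 4 per message), not the real one.
    fn rough_tokens(&self) -> usize {
        self.messages.iter().map(|m| m.len() / 4).sum()
    }
}
```

The scheme is only sound if every mutation path bumps the counter, which is why the commit lists each bump site explicitly.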
Hunter B 05950d1d59 fix release crate publish checks 2026-06-03 19:24:42 -07:00
xyuai 9e15805f64 fix(settings): tighten legacy path migration coverage 2026-06-03 19:24:37 -07:00
Hunter B f7a602cd20 feat(tools): hide todo_* aliases from model catalog, add deprecation metadata (#2682)
- Add model_visible() hook to ToolSpec trait (default true)
- Override model_visible() -> false on todo_write, todo_add, todo_update, todo_list
- Checklist variants remain model-visible as the canonical surface
- Legacy todo_* calls still work for saved transcript replay
- Return _deprecation metadata with use_instead and removed_in=0.9.0
- Update prompts to recommend checklist_* only
- Update TOOL_SURFACE.md with v0.9.0 deprecation notes
- Add tests for hidden catalog, compat alias behavior, and metadata

Verification: cargo test -p codewhale-tui -- todo, cargo clippy -D warnings
2026-06-03 19:20:23 -07:00
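A default-true visibility hook on a trait, as this commit describes, could be sketched as follows. The trait here is a cut-down stand-in for ToolSpec, and `checklist_write`, `model_catalog`, and `is_callable` are hypothetical names chosen for the example.

```rust
/// Cut-down stand-in for the tool trait; only `model_visible` is from the commit.
trait ToolSpec {
    fn name(&self) -> &'static str;

    /// Whether the tool is advertised to the model. Defaults to visible,
    /// so existing tools need no changes.
    fn model_visible(&self) -> bool {
        true
    }
}

struct ChecklistWrite; // assumed canonical tool name
struct TodoWrite; // legacy alias

impl ToolSpec for ChecklistWrite {
    fn name(&self) -> &'static str {
        "checklist_write"
    }
}

impl ToolSpec for TodoWrite {
    fn name(&self) -> &'static str {
        "todo_write"
    }
    // Hidden from the catalog, but still registered.
    fn model_visible(&self) -> bool {
        false
    }
}

/// Names advertised to the model.
fn model_catalog(tools: &[Box<dyn ToolSpec>]) -> Vec<&'static str> {
    tools
        .iter()
        .filter(|t| t.model_visible())
        .map(|t| t.name())
        .collect()
}

/// Dispatch ignores visibility, so saved transcripts can still replay.
fn is_callable(tools: &[Box<dyn ToolSpec>], name: &str) -> bool {
    tools.iter().any(|t| t.name() == name)
}
```

Keeping visibility separate from registration is what lets the legacy names disappear from the catalog without breaking replay.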
Hunter Bown 8dff2f7525 fix(tui): guard xiaomi mimo defaults test against CI env vars 2026-06-03 16:25:04 -07:00
Hunter Bown 772ec46c98 chore(release): v0.8.53 — Arcee support, telegram bridge, provider fixes
- Fix Rust syntax/clippy fallout in client.rs, cli/src/lib.rs, web_search.rs
- Fix 0.8.53 release metadata: changelog links, TUI changelog, npm wrapper
- Update visible help copy for multi-provider support
- Add telegram-bridge integration with deploy configs
- Add US remote VM quickstart doc
- Update Tencent Cloud deploy scripts and docs
- Bump npm wrapper to 0.8.53
2026-06-03 16:12:38 -07:00
Hunter Bown f884ceb6af docs(readme): credit xyuai and RefuseOdd for v0.8.53 contributions 2026-06-03 15:43:05 -07:00
RefuseOdd 8b0e1cc3c0 Limit path suffix to chat completions 2026-06-03 15:34:24 -07:00
RefuseOdd d2999bb402 Add path_suffix to ProviderConfigToml and ProviderConfig
Adds an optional path_suffix field that lets users override the API path
for OpenAI-compatible endpoints. When set, the suffix replaces the default
/v1/<path> pattern, enabling use with endpoints that don't accept /v1/
prefixes (e.g. /chat/completions instead of /v1/chat/completions).

Changes:
- ProviderConfigToml (config crate): path_suffix field
- ProviderConfig (tui crate): path_suffix field
- merge_provider_config: propagates path_suffix
- merge_project_provider_config: propagates path_suffix
- api_url: delegates to new api_url_with_suffix function
- api_url_with_suffix: uses suffix when present, skips /v1 versioning
- DeepSeekClient: reads path_suffix from config, passes to URL builder
- config.example.toml: documents the new option
- Tests for the new URL building behavior

Closes #2089
2026-06-03 15:34:24 -07:00
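The URL-building rule from this commit might be sketched as below. The signature and slash handling are assumptions; the commit only states that a configured suffix replaces the default /v1/<path> pattern.

```rust
/// Build an endpoint URL. With a `path_suffix`, the suffix is appended to
/// the base as-is and the /v1 prefix is skipped; otherwise the default
/// /v1/<path> pattern is used. Signature is hypothetical.
fn api_url_with_suffix(base_url: &str, path: &str, path_suffix: Option<&str>) -> String {
    let base = base_url.trim_end_matches('/');
    match path_suffix {
        Some(suffix) => format!("{}/{}", base, suffix.trim_start_matches('/')),
        None => format!("{}/v1/{}", base, path.trim_start_matches('/')),
    }
}
```

For a placeholder base of `https://example.invalid`, a suffix of `/chat/completions` yields `https://example.invalid/chat/completions` instead of the versioned path.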
cyq 45562822f0 feat(agent): classify model families 2026-06-03 15:34:12 -07:00
reidliu41 195dd6b9ab fix(tui): hide shell prompt guidance when shell is disabled
Thread allow_shell into system prompt composition and remove shell-only guidance
when shell tools are not available.

This keeps the prompt aligned with the runtime tool catalog and prevents the
model from trying exec_shell or task_shell_* after allow_shell = false.
2026-06-03 15:28:29 -07:00
xyuai dba332e8d5 fix(tui): persist provider switches to config 2026-06-03 15:28:17 -07:00
Hunter Bown 260ee737b0 style: cargo fmt 2026-06-03 15:18:19 -07:00
Hunter Bown be7a3e7e69 fix(tui): provider picker r shortcut with modifier guard
- add r/R shortcut to re-enter API key for any provider in picker
- guard against Ctrl/Alt/Meta modifiers (only plain r triggers)
- dynamic footer: 'apply' when key exists, 'set key' otherwise
- add 'R edit key' hint to picker footer
- add route/model to scoped auth status output
- add tests for r shortcut, ctrl-r guard, footer text, and route/model

Ports #2717 with review fix. Fixes #2662.
2026-06-03 15:14:39 -07:00
Hunter Bown 3f8e02d6cf docs(readme): add Hugging Face provider to all localized READMEs 2026-06-03 15:11:31 -07:00
Hunter Bown 4b990e190c docs(rfc): file decomposition plan for v0.9.0 2026-06-03 15:08:31 -07:00
Hunter Bown 5719301d1e fix(auth): all-provider auth status and scoped logout
- auth status shows every known provider with config/keyring/env status
- auth status --provider <id> shows detailed single-provider info
- auth list now probes keyring for all providers (was only active)
- /logout clears only the active provider's key (was clearing all)
- add clear_active_provider_api_key for scoped TOML key removal
- add Huggingface to ProviderArg enum
- add auth status tests for all-provider and scoped views

Fixes #2716
2026-06-03 15:08:28 -07:00
Hunter Bown d9ca5fbbff docs(tui): mirror v0.8.53 changelog 2026-06-03 14:43:08 -07:00
Hunter Bown 28a0f19c13 fix(provider): polish v0.8.53 routing and shell gating 2026-06-03 14:40:25 -07:00
Hunter Bown 5786584767 chore(release): bump workspace to 0.8.53 2026-06-03 12:39:01 -07:00
Hunter Bown ed4ec3f799 Merge branch 'codex/v0.8.53-deprecate-whale-md' into codex/v0.8.53 2026-06-03 12:38:00 -07:00
Hunter Bown d5c6856754 Merge branch 'codex/v0.8.53-toolsurface-design-docs' into codex/v0.8.53 2026-06-03 12:37:57 -07:00
Hunter Bown 8bc994e492 Merge branch 'codex/v0.8.53-tool-deferred-ux' into codex/v0.8.53 2026-06-03 12:37:53 -07:00
Hunter Bown a10e17a62a fix(context): prefer global AGENTS over WHALE 2026-06-03 12:37:39 -07:00
Hunter Bown aa4c734602 docs: align v0.8.53 tool surface notes 2026-06-03 12:37:39 -07:00
Hunter Bown f5c8d7e5c5 fix(subagent): align advertised role aliases 2026-06-03 12:37:39 -07:00