Harvested the narrow Rust/docs slice of PR #2578 by @AresNing for #1364. The event uses the maintained structured observer path: JSON stdin, stdout ignored, warn-only failures, and no ability to block or mutate the turn.
The hook fires after post-turn app state, usage totals, cost, notification, receipt, and queue-recovery state are updated, before queued follow-up dispatch. Docs, RFC notes, /hooks discovery, and v0.9 tracking now describe the observer-only contract.
Co-authored-by: AresNing <49557311+AresNing@users.noreply.github.com>
Harvested from PR #2741 by @idling11 for #2693, with review fixes folded in: typed compaction/tool/safety enums, no silent unknown-kind fallback, unknown profile keys rejected, and whole-struct equality for future reload/runtime checks.
Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
Records that #2506/#2508 are superseded by the safer #2558 path_suffix implementation, credits the original #1874 report and follow-up PR review trail, and documents that suffix overrides only affect chat completions while model and beta paths keep built-in routing.
Harvested from PR #2636 by @HUQIANTAO with widened cache invalidation for constitution files, generated context, trust state, canonical paths, and same-length overwrites.
Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
Refs #1786.
Reported by @bevis-wong. This lands the durable restart-safety slice: persisted running tasks and running tool rows are marked failed with a recovery note instead of being requeued as live work after a prior process exits.
Refs #2029.
Reported by @qiyuanlicn. This lands live per-step API-timeout checkpoint continuation and preserves checkpoint metadata through projections, transcripts, and persistence; cold-restart child-task rehydration remains out of scope.
Harvested from PR #2734 by @idling11 with reviewer fixes for row-source fidelity, row-authoritative hit testing, and display-width popover sizing.
Refs #2694.
Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
Refs #697 and #1827.
Reported by @NASLXTO and @wuxixing. Prior context-cap and startup-diagnosis work by @linzhiqin2003 and @merchloubna70-dot shaped this fallback.
Harvested from PR #2532 by @cyq1017.
Pending input rows now distinguish steer-pending, rejected-steer, and queued-follow-up states, with continuation rows aligned under the delivery label.
Refs #2054; leaves the broader cancel/edit affordance work open.
Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>
Harvested from PR #2639 by @gaord.
Adds POST /v1/sessions for runtime clients to persist completed threads as managed saved sessions, with a 409 guard for queued or active turn/item state and focused session endpoint coverage.
Also makes MCP HTTP tests install the rustls ring provider before constructing reqwest clients so filtered no-provider test runs are deterministic.
Co-authored-by: gaord <9567937+gaord@users.noreply.github.com>
Harvested from PR #2733 by @idling11.
Adds richer update_plan artifact fields for grounded Plan-mode review, renders them in the transcript and Plan confirmation prompt, and carries them through /relay, fork-state, and saved-session replay.
Verification: cargo test -p codewhale-tui --bin codewhale-tui --locked plan_ -- --nocapture
Verification: cargo test -p codewhale-tui --bin codewhale-tui --locked relay_slash_command_routes_to_session_relay_instruction -- --nocapture
Verification: cargo clippy -p codewhale-tui --locked -- -D warnings
Co-authored-by: idling11 <8055620+idling11@users.noreply.github.com>
Harvest the UpdateThreadRequest workspace field from PR #2640 while keeping the engine-cache correctness fix: PATCH /v1/threads/{id} now persists workspace changes, emits the workspace change in thread.updated, rejects empty paths, rejects workspace changes while a turn is active, and evicts idle cached engines so the next turn starts in the new workspace.
Validation: cargo fmt --all -- --check; git diff --check; cargo test -p codewhale-tui --bin codewhale-tui --locked update_thread_workspace -- --nocapture; cargo clippy -p codewhale-tui --locked -- -D warnings; python3 scripts/check-coauthor-trailers.py --author-map .github/AUTHOR_MAP --range origin/main..HEAD --check-authors.
Harvested from PR #2640 by @gaord.
Co-authored-by: gaord <9567937+gaord@users.noreply.github.com>
Harvest the HarmonyOS/OpenHarmony port from PR #2634 and make it publish-safe by target-gating unsupported host dependencies out of the OHOS TUI graph. Self-update is disabled on OHOS, PTY shell mode reports unsupported, and Starlark execpolicy parsing returns an explicit unsupported-platform error until upstream starlark/rustyline/nix support catches up.
Add OHOS SDK setup docs and launcher scripts, install the rustls ring provider for rustls-no-provider entrypoints, and keep the packaged codewhale-tui OHOS graph free of starlark, rustyline, nix@0.28, portable-pty, and arboard.
Validation: cargo fmt --all -- --check; git diff --check; git diff --cached --check; cargo check -p codewhale-cli --locked; cargo check -p codewhale-app-server --locked; cargo check -p codewhale-tui --locked; cargo test -p codewhale-cli --locked update::tests::; cargo test -p codewhale-release --locked; cargo test -p codewhale-tui --locked background_tty_command_has_controlling_terminal; cargo test -p codewhale-tui --locked clipboard; cargo package -p codewhale-tui --allow-dirty --no-verify --locked; packaged OHOS cargo tree checks. OHOS target check still requires a loaded OpenHarmony SDK/sysroot and currently stops in ring with missing assert.h when CC/CFLAGS/linker are unset.
Harvested from PR #2634 by @shenjackyuanjie.
Co-authored-by: shenjackyuanjie <54507071+shenjackyuanjie@users.noreply.github.com>
Place user text before volatile turn metadata in outgoing user-message content arrays so provider prefix caches can continue matching the stable user-input prefix across date, model-route, and working-set changes.
Also adds wire-level coverage proving tail-positioned turn metadata serializes after user text while preserving turn-meta deduplication.
Harvested from PR #2517 by @HUQIANTAO.
Co-authored-by: HUQIANTAO <58421104+HUQIANTAO@users.noreply.github.com>
Harvests the MiMo Token Plan auth-header behavior from #2627 while keeping Xiaomi env-key precedence unchanged so standard endpoints do not accidentally receive a Token Plan key.
Harvested from PR #2627 by @xyuai.
Co-authored-by: xyuai <281015099+xyuai@users.noreply.github.com>
Root cause: AgentComplete unconditionally calls resume_terminal()
even when the terminal was never paused, causing a secondary
EnterAlternateScreen on Windows that creates a new buffer whose
width may differ from the window width. Additionally,
ColorCompatBackend had no terminal_size cache, so size() fell
through to crossterm::terminal::size() which on Windows returns
the WinAPI buffer width rather than the window width.
Changes:
- AgentComplete: add event_broker.is_paused() guard
- resume_terminal(): cache real terminal size before reset_viewport
- Resize handler: also set terminal_size alongside forced_size
- subagent_routing: 3x mark_history_updated -> bump_history_cell(idx)
- color_compat: add terminal_size field, set_terminal_size(), fix
size() fallback priority (forced_size > terminal_size)
- tests: 3 unit tests for size() fallback chain
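A minimal sketch of the size() fallback priority listed above, assuming a simplified stand-in for the backend. SizeCache and the query_live callback are hypothetical names for illustration only; they are not the ColorCompatBackend API.

```rust
/// Hypothetical stand-in for the backend's cached sizes as (columns, rows).
#[derive(Default)]
pub struct SizeCache {
    pub forced_size: Option<(u16, u16)>,
    pub terminal_size: Option<(u16, u16)>,
}

impl SizeCache {
    /// Resolve the size: forced_size first, then the cached terminal_size,
    /// and only then the live query supplied by the caller.
    pub fn size(&self, query_live: impl FnOnce() -> (u16, u16)) -> (u16, u16) {
        self.forced_size
            .or(self.terminal_size)
            .unwrap_or_else(query_live)
    }
}
```

With this ordering the live query (the call that may report the buffer width on Windows) is reached only when neither cached value is set.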
Review feedback addressed:
- forced_size now takes priority over terminal_size (gemini-code-assist)
- Redundant map lookups removed in subagent_routing (both bots)
- set_terminal_size moved before reset_terminal_viewport (greptile-apps)
(cherry picked from commit 4463c46644a6e485e7e20dc2b19c29c2e8eb3c5c)
Subagent aliases:
- Legacy names (agent_spawn, agent_result, agent_cancel, resume_agent,
agent_list, agent_send_input, agent_assign, agent_wait,
delegate_to_agent) are already NOT registered — they exist as dead code
with #[allow(dead_code)] since v0.8.33
- Add test verifying model catalog only advertises canonical subagent
tools: agent_open, agent_eval, agent_close, tool_agent
Shell aliases:
- Hide exec_wait from model catalog (legacy alias for exec_shell_wait)
- Hide exec_interact from model catalog (legacy alias for
exec_shell_interact)
- Both remain callable for saved transcript replay
- Add test verifying shell aliases are hidden but callable
Verification: cargo test -p codewhale-tui --locked (4040 passed),
cargo clippy -D warnings
Wrap plan steps via wrap_text() before rendering, breaking only on display-width overflow, not on Latin/CJK Unicode word boundaries. Switch main render path from Wrap { trim: true } to Wrap { trim: false } since all content is pre-wrapped. Replace wrapped_line_count() with lines.len() for accurate scroll bounds. Keep confirm-exit dialog on Wrap { trim: true } (English-only, no risk).
- wrap_text: replace chars().count() with UnicodeWidthStr::width() so
CJK text is wrapped by display columns, consistent with
wrapped_line_count and ratatui's Paragraph::wrap. Also fix the
hard-split loop to use exclusive byte ranges (..end) instead of
inclusive (..=i) so multi-byte UTF-8 prefixes are always valid.
- render: hoist the confirming_exit branch to an early return so the
plan-content construction (lines, scroll bounds, footer) is skipped
entirely when the confirmation dialog is visible.
- Clear pending_g when Esc triggers the exit-confirmation prompt so a
stray 'g' press does not carry over into the confirmation dialog.
- Move render_modal_chrome into the else branch so only one call fires
per render pass, eliminating a shadow artifact when confirming_exit
is active.
- wrapped_line_count: compute leading-space width via UnicodeWidthStr
instead of byte length, so non-ASCII leading whitespace is measured
correctly.
- render: hoist popup_area / content_width computation above plan
rendering so wrap_text can share the same content_width derived
from the actual popup geometry instead of a magic 68.
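A rough sketch of the hard-split technique from the wrap_text item above: split by display columns and slice with exclusive byte ranges taken from char_indices, so every chunk ends on a char boundary. The commit uses UnicodeWidthStr::width(); approx_char_width below is a hypothetical, approximate stand-in so the example needs only the standard library, and hard_split is an illustrative name.

```rust
/// Rough display width of one char: 2 columns for common CJK and fullwidth
/// ranges, 1 otherwise. ASSUMPTION: an approximation, not unicode-width.
fn approx_char_width(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F
        | 0x2E80..=0xA4CF
        | 0xAC00..=0xD7A3
        | 0xF900..=0xFAFF
        | 0xFF00..=0xFF60
        | 0xFFE0..=0xFFE6 => 2,
        _ => 1,
    }
}

/// Hard-split `text` into chunks of at most `max_cols` display columns.
/// A single char wider than `max_cols` still gets its own chunk.
pub fn hard_split(text: &str, max_cols: usize) -> Vec<&str> {
    let mut chunks = Vec::new();
    let mut start = 0; // byte offset where the current chunk begins
    let mut cols = 0; // display columns used by the current chunk
    for (idx, ch) in text.char_indices() {
        let width = approx_char_width(ch);
        if cols + width > max_cols && idx > start {
            // Exclusive end: `idx` is the byte offset of `ch`, always a boundary.
            chunks.push(&text[start..idx]);
            start = idx;
            cols = 0;
        }
        cols += width;
    }
    if start < text.len() {
        chunks.push(&text[start..]);
    }
    chunks
}
```

An inclusive range such as ..=i would end in the middle of a multi-byte char and panic when slicing; ending at the next char's offset avoids that.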
Use clamped (effective) scroll instead of raw `self.scroll` in the Esc
handler so a short plan that fits entirely (max_scroll == 0) never
triggers the "exit without implementing?" dialog when the user pressed
a scroll key (PgDn/Ctrl-D/G/End) beforehand.
- Use word-wrapping-aware line count to prevent underestimating scroll range
(gemini-code-assist / greptile-apps)
- Merge PLAN_OPTIONS, PLAN_SHORTCUTS, PLAN_SHORT_LABELS into PlanOption struct
(gemini-code-assist)
- Remove dead Esc code in handle_key (greptile-apps)
- Guard gg/G with modifier checks (gemini-code-assist)
- Increase PgUp/PgDn scroll amount from 6 to 12 (greptile-apps)
- Use u16::try_from for scroll value to avoid silent truncation (greptile-apps)
- Update related unit tests for new scroll values
- Add scroll state field to PlanPromptView with PgUp/PgDn, Ctrl+U/D/F/B,
Home/End, gg/G vim-style keybindings
- Show scroll indicator footer when content overflows the popup
- Add confirming_exit state: Esc while scrolled asks for confirmation
before discarding, preventing accidental exits on long plans
- Clamp scroll in render() so overscroll doesn't hide bottom options
- Use wrapped_line_count() with UnicodeWidthStr for accurate overflow
detection with CJK characters
- Add 11 unit tests covering scroll, keybindings, and exit confirmation
Three follow-ups to the previous perf commit:
1. Drop the rows_hash field on CacheEntry. The field was computed and
stored but never read on the hot path; tests exercised it only to
assert the cache returned a stable hash. After this change
get_or_compute_rows returns just Vec<OutputRow>, halving the
tuple-return ABI and removing one DefaultHasher::write pass on
every cache miss.
2. Replace DefaultHasher (SipHash) with a hand-rolled FNV-1a 64-bit
hash. SipHash is per-process-keyed and ~5-10x slower than FNV on
the small-to-medium tool output strings we see at 120 FPS. FNV-1a
has no per-process key, fits in 20 lines of pure-Rust, and a 64-bit
collision space is more than wide enough for the per-process LRU's
expected <= a few hundred entries. The cache is a performance
optimization, not a security boundary; collisions only cause a
false miss, never wrong data.
3. Caller in tui::history::render_preserved_output_mode updated to
the new Vec<OutputRow>-only signature. Two new tests cover the
FNV-1a properties (length-suffix sensitivity, empty-input
stability).
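For reference, a minimal FNV-1a 64-bit sketch of the kind item 2 describes. The constants are the standard published FNV offset basis and prime; the function name fnv1a_64 is illustrative and may not match the helper in the commit.

```rust
/// Standard FNV-1a 64-bit offset basis and prime.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Hash a byte slice with FNV-1a: XOR each byte in, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    let mut hash = FNV_OFFSET_BASIS;
    for &byte in bytes {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(FNV_PRIME);
    }
    hash
}
```

The hash has no key, so the same input yields the same value in every process, and the empty input hashes to the offset basis.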
output_rows (in tui::history) walks the raw tool output, ANSI-strips
each line, classifies path/URL-like rows, and wraps the rest to the
current viewport width. selected_output_indices then computes the
head/tail/importance subset that the compact Live view shows. Both
functions are pure, but they are called on every render frame for
every visible tool cell. For a 4 KB tool output on a 120 FPS render
loop that is 2-6 redundant walks per frame, per cell, and the
function is called from a non-trivial number of cells across
exec, tool, command, and review history.
Add tui::output_rows_cache, a thread-local, content-addressed cache
keyed on (content_hash, width) for the rows and (content_hash, width,
line_limit) for the indices. The cache stores the wrapped
Vec<OutputRow> plus a per-line-limit map of selected indices on a
single entry, so a single key lookup satisfies both render steps.
render_preserved_output_mode now consults the cache for both the
rows and the indices; on a hit, neither the per-line ANSI strip nor
the importance-ranking pass runs. The cache is bounded (default
capacity 256) with insertion-order eviction. The OutputRow struct
gains PartialEq + Eq + pub fields so the cache module can store and
hash it without exposing private internals.
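A simplified sketch of a bounded cache with insertion-order eviction, keyed on (content_hash, width) as described above. RowsCache, RowsKey, and the Vec<String> row type are hypothetical simplifications; the per-line-limit index map and thread-local storage are omitted.

```rust
use std::collections::{HashMap, VecDeque};

/// Cache key: (content_hash, viewport width). Layout assumed for illustration.
type RowsKey = (u64, u16);

/// Bounded cache that evicts the oldest inserted entry when full.
pub struct RowsCache {
    capacity: usize,
    entries: HashMap<RowsKey, Vec<String>>,
    order: VecDeque<RowsKey>, // insertion order, oldest at the front
}

impl RowsCache {
    pub fn new(capacity: usize) -> Self {
        Self {
            capacity: capacity.max(1),
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Return the cached rows for `key`, computing and storing them on a miss.
    pub fn get_or_compute(
        &mut self,
        key: RowsKey,
        compute: impl FnOnce() -> Vec<String>,
    ) -> &Vec<String> {
        if !self.entries.contains_key(&key) {
            if self.entries.len() >= self.capacity {
                // Evict by insertion order, not by recency of use.
                if let Some(oldest) = self.order.pop_front() {
                    self.entries.remove(&oldest);
                }
            }
            self.order.push_back(key);
            self.entries.insert(key, compute());
        }
        &self.entries[&key]
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Because width is part of the key, a resize produces a new entry instead of reusing rows wrapped for the old width.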
Tests: 6 new unit tests cover the hit/miss path, width invalidation,
content invalidation, indices per-line_limit caching, capacity
eviction, and hash stability. The wider tui::history test suite (68
tests) still passes.
The build_canonical_state path never reads
CanonicalStateScan::latest_verified_user_idx, but the previous patch
required is_complete() to find a verified user message before it would
short-circuit. On a long history with no verification replay — the
common case — the scan walked the entire message list looking for a
match that could not exist.
Add a find_verified: bool parameter to scan_canonical_inputs and
CanonicalStateScan::is_complete. build_canonical_state now passes
false, so the loop stops as soon as the goal and CANONICAL_SCAN_MAX_FACTS
facts are found. The replan path (apply_verify_and_replan) keeps the
existing true behavior so it still locates the latest verified user
message.
Test calls are updated to match; no behavior change for any test.
build_canonical_state previously did two independent reverse walks of
session.messages — one to extract the most recent user goal, and one
to collect up to four confirmed-fact snippets. apply_verify_and_replan
then added a third and fourth reverse scan to locate the latest user
message and the latest [verification replay] user message for the
re-plan path.
All four reverse scans collect disjoint facts about the same most-
recent-first view of the conversation. This PR folds them into a
single helper, scan_canonical_inputs, that walks messages once in
reverse, fills a CanonicalStateScan, and short-circuits as soon as
every collector is satisfied. The helper exposes the latest-message
indices so apply_verify_and_replan can clone the full Message values
after the scan (eliminating the two independent find().cloned() walks).
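The single-pass idea can be sketched as follows, under simplifying assumptions: Msg, Scan, and scan_once are hypothetical reductions of Message, CanonicalStateScan, and scan_canonical_inputs, every non-user message counts as a fact, and the verified-marker collector is left out.

```rust
/// Simplified message: a role flag plus text. The real Message type is richer.
pub struct Msg {
    pub is_user: bool,
    pub text: String,
}

/// Collectors filled during the walk.
#[derive(Default)]
pub struct Scan {
    pub goal: Option<String>,
    pub facts: Vec<String>,
    pub latest_user_idx: Option<usize>,
}

/// Fact cap; the commit text mentions up to four snippets.
const MAX_FACTS: usize = 4;

impl Scan {
    fn is_complete(&self) -> bool {
        self.goal.is_some() && self.facts.len() >= MAX_FACTS
    }
}

/// Walk messages once, newest first, filling every collector and stopping
/// as soon as all are satisfied. Also returns how many messages were visited.
pub fn scan_once(messages: &[Msg]) -> (Scan, usize) {
    let mut scan = Scan::default();
    let mut visited = 0;
    for (idx, msg) in messages.iter().enumerate().rev() {
        visited += 1;
        if msg.is_user {
            if scan.goal.is_none() {
                scan.goal = Some(msg.text.clone());
                scan.latest_user_idx = Some(idx);
            }
        } else if scan.facts.len() < MAX_FACTS {
            scan.facts.push(msg.text.clone());
        }
        if scan.is_complete() {
            break;
        }
    }
    (scan, visited)
}
```

Recording the index rather than cloning during the walk lets the caller clone only the messages it actually keeps.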
The output CanonicalState is byte-identical to the prior
implementation: same goal, same confirmed facts (newest first, errors
filtered), same fallback string when no user text exists. The re-plan
path's keep-messages set is identical: latest user + latest verified.
Tests: 6 new unit tests cover the goal lookup, fact cap, error-result
filter, verified-marker scan, empty input, and the early-exit
condition. The full engine test suite (153 tests) still passes.
Three follow-ups to the previous perf commit:
1. Correctness: tool.strict participates in the wire format emitted by
tool_to_api_json, so it MUST participate in the cache identity. Two
catalogs that differ only in strict would otherwise collide and serve
a stale SHA-256, silently busting prefix-cache stability on the wire.
2. Allocation: replace the per-tool serde_json::to_string in
tool_set_identity with a hash_json_value helper that walks the JSON
tree directly. For a 60-tool catalog this drops ~25-40 KB of
transient allocation per cache miss.
3. Dead code: the previous patch introduced PrefixFingerprint::compute,
CachedCatalog::joined, ToolCatalogCache::{invalidate,is_empty}, and a
thread-local cache helper that were not used outside tests. With
-D warnings in CI all four triggered dead-code errors. The compute
helper is now only built in cfg(test); the rest are marked
#[allow(dead_code)] with comments explaining their observability and
test-only use.
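To illustrate item 1, a sketch of a cache identity that includes strict. The Tool struct here is a hypothetical simplification (input_schema held as a string instead of a JSON value), and std's DefaultHasher stands in for whatever hasher the real tool_set_identity uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified tool description. Field set assumed for illustration.
pub struct Tool {
    pub name: String,
    pub description: String,
    pub input_schema: String,
    pub strict: Option<bool>,
}

/// Identity over every field that reaches the wire format, strict included.
pub fn tool_set_identity(tools: &[Tool]) -> u64 {
    let mut hasher = DefaultHasher::new();
    tools.len().hash(&mut hasher);
    for tool in tools {
        tool.name.hash(&mut hasher);
        tool.description.hash(&mut hasher);
        tool.input_schema.hash(&mut hasher);
        tool.strict.hash(&mut hasher);
    }
    hasher.finish()
}
```

If strict were left out, two catalogs differing only in that flag would share an identity and the cache would return the digest computed for the other one.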
PrefixFingerprint::compute is called once per turn by the turn loop
prefix-stability check. The tool-side work serializes every tool to the
chat-API JSON shape, sorts the resulting strings, joins with newlines,
and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of
allocation plus a sort, all of which produces a byte-identical output
once the tool set is stable across turns (the common case after the
first turn of a session).
Introduce a process-local ToolCatalogCache that stores the joined+sorted
catalog under a content-derived u64 identity (length + per-tool name +
description + serialized input_schema). On a hit, the per-tool JSON
serialization, sort, and join are skipped entirely — the pre-computed
SHA-256 hex digest is returned directly.
The cache lives on PrefixStabilityManager (per-session ownership) and
backs a new PrefixFingerprint::compute_with_tool_cache entry point.
check_and_update, PrefixStabilityManager::new, and pin() all use the
cached path. The original compute() is kept as a fallback for callers
that do not have a cache in hand (e.g. CLI tools that build a one-shot
fingerprint).
The cache is bounded (default capacity = 8) and uses insertion-order
eviction, matching the eviction strategy already in
transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload
and MCP attach paths.
Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on
hit), identity collisions, schema change detection, capacity eviction,
invalidate, empty slice, and the equivalence between cached and uncached
fingerprints. The full 30-test prefix_cache suite passes; the wider
prefix-cache contract tests in settings, prompts, and
core::engine::tests continue to pass.
The token estimator walks the full session.messages and the active system
prompt. Five call sites per turn in the engine (capacity pre/post tool
checkpoints, error escalation, the seam manager, the trim budget check)
plus four TUI/command consumers (footer, /status, /debug, context
inspector) all re-walked the same data independently. On a 200-message
history with 5 KB of tool results that is roughly 2 ms per call, or
~20 ms of pure waste on a single turn.
Introduce a process-local TokenEstimateCache keyed on
(session.messages_revision, system_prompt_fingerprint). Repeated calls
with the same inputs return the cached value without re-walking the
message list. The cache invalidates as soon as either input changes:
* session.messages_revision is a monotonic counter bumped in
Session::add_message, Session::replace_messages, the new
Session::bump_messages_revision helper, and at every direct
session.messages mutation site in core/engine.rs and
core/engine/capacity_flow.rs.
* system_prompt_fingerprint is a stable 64-bit hash of the
SystemPrompt::Text or SystemPrompt::Blocks payload.
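The invalidation rule above can be sketched with a single-entry cache. This is a simplification under stated assumptions: EstimateKey, the get_or_compute signature, and the single-slot design are hypothetical, and the audit ring mentioned in the tests is not modeled.

```rust
/// Cache key: (messages_revision, system_prompt_fingerprint).
type EstimateKey = (u64, u64);

/// Remembers the most recent estimate together with the inputs it was built from.
#[derive(Default)]
pub struct TokenEstimateCache {
    cached: Option<(EstimateKey, usize)>,
}

impl TokenEstimateCache {
    /// Return the cached estimate when both inputs are unchanged; otherwise
    /// run `estimate` (the expensive walk) and remember its result.
    pub fn get_or_compute(
        &mut self,
        messages_revision: u64,
        prompt_fingerprint: u64,
        estimate: impl FnOnce() -> usize,
    ) -> usize {
        let key = (messages_revision, prompt_fingerprint);
        match self.cached {
            Some((cached_key, value)) if cached_key == key => value,
            _ => {
                let value = estimate();
                self.cached = Some((key, value));
                value
            }
        }
    }
}
```

The scheme is only sound if every mutation of the message list bumps the revision, which is why the commit touches each direct mutation site.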
Also restructures layered_context_checkpoint to compute the estimated
token count before taking a long-lived &SeamManager borrow, and
re-routes the capacity pre/post tool checkpoints to compute the
observation into a local before calling
capacity_controller.observe_*. Both refactors are required to satisfy
the borrow checker once estimated_input_tokens requires &mut self.
Tests: 10 new unit tests cover the miss/hit path, revision bumps,
system-prompt changes, audit-ring capacity, and downward-revision
no-ops. The full 157-test engine suite still passes.
- Add model_visible() hook to ToolSpec trait (default true)
- Override model_visible() -> false on todo_write, todo_add, todo_update, todo_list
- Checklist variants remain model-visible as the canonical surface
- Legacy todo_* calls still work for saved transcript replay
- Return _deprecation metadata with use_instead and removed_in=0.9.0
- Update prompts to recommend checklist_* only
- Update TOOL_SURFACE.md with v0.9.0 deprecation notes
- Add tests for hidden catalog, compat alias behavior, and metadata
Verification: cargo test -p codewhale-tui -- todo, cargo clippy -D warnings
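A minimal sketch of the model_visible() hook described above, assuming a reduced ToolSpec trait with only a name. The struct names, the checklist_write tool name, and the model_catalog and is_callable helpers are hypothetical; _deprecation metadata is not modeled.

```rust
/// Reduced trait: the real ToolSpec has more methods.
pub trait ToolSpec {
    fn name(&self) -> &'static str;

    /// Whether the tool is advertised to the model. Defaults to visible.
    fn model_visible(&self) -> bool {
        true
    }
}

pub struct ChecklistWrite;
pub struct TodoWrite;

impl ToolSpec for ChecklistWrite {
    fn name(&self) -> &'static str {
        "checklist_write"
    }
}

impl ToolSpec for TodoWrite {
    fn name(&self) -> &'static str {
        "todo_write"
    }

    /// Legacy alias: hidden from the catalog but still registered.
    fn model_visible(&self) -> bool {
        false
    }
}

/// Names advertised in the model catalog: model-visible tools only.
pub fn model_catalog(tools: &[Box<dyn ToolSpec>]) -> Vec<&'static str> {
    tools
        .iter()
        .filter(|tool| tool.model_visible())
        .map(|tool| tool.name())
        .collect()
}

/// Call-time lookup ignores visibility, so saved transcripts can replay.
pub fn is_callable(tools: &[Box<dyn ToolSpec>], name: &str) -> bool {
    tools.iter().any(|tool| tool.name() == name)
}
```

Keeping visibility separate from registration is what lets a hidden tool stay callable for replay while the model is steered toward the canonical surface.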
Adds an optional path_suffix field that lets users override the API path
for OpenAI-compatible endpoints. When set, the suffix replaces the default
/v1/<path> pattern, enabling use with endpoints that don't accept /v1/
prefixes (e.g. /chat/completions instead of /v1/chat/completions).
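A hedged sketch of the URL-building rule just described. The function name matches the one listed in this commit, but the signature, the slash trimming, and the treatment of a blank suffix as unset are assumptions made for illustration.

```rust
/// Build a request URL. With a non-empty suffix, append it to the base and
/// skip /v1 versioning; otherwise use the default /v1/<path> pattern.
pub fn api_url_with_suffix(base_url: &str, path: &str, path_suffix: Option<&str>) -> String {
    let base = base_url.trim_end_matches('/');
    match path_suffix.map(str::trim).filter(|suffix| !suffix.is_empty()) {
        Some(suffix) => format!("{}/{}", base, suffix.trim_start_matches('/')),
        None => format!("{}/v1/{}", base, path.trim_start_matches('/')),
    }
}
```

For example, a base of https://api.example.com with the suffix /chat/completions yields https://api.example.com/chat/completions, while leaving the suffix unset keeps the /v1/chat/completions form.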
Changes:
- ProviderConfigToml (config crate): path_suffix field
- ProviderConfig (tui crate): path_suffix field
- merge_provider_config: propagates path_suffix
- merge_project_provider_config: propagates path_suffix
- api_url: delegates to new api_url_with_suffix function
- api_url_with_suffix: uses suffix when present, skips /v1 versioning
- DeepSeekClient: reads path_suffix from config, passes to URL builder
- config.example.toml: documents the new option
- Tests for the new URL building behavior
Closes #2089