Rapid resizes left stale glyphs in the right column (sidebar fragments,
mid-character title truncation, duplicated transcript timestamps); v0.6.1's
light-theme removal made the problem more visible. Three small fixes:
- Coalesce queued `Event::Resize` events, run a single `terminal.clear()`,
and immediately draw the new frame instead of waiting for the next event
loop iteration. Previously the cleared screen could sit blank between
the resize handler's `continue` and the next draw, so any other event
arriving in that window would be processed before the repaint.
- `truncate_line_to_width` for budgets `<= 3` was counting codepoints
instead of display widths, overrunning the cell budget for any
double-width grapheme. Fix by accumulating display widths consistently.
- Add a `tracing::debug!` log to the resize handler so users hitting this
in the wild can confirm whether crossterm is delivering the event.
Adds two regression tests in `tui/widgets` (resize cycle + cache
invalidation on width change) and one in `tui/ui` (truncate semantics).
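The width-accounting fix above can be sketched as follows. This is a minimal illustration, not the project's `truncate_line_to_width`: `display_width` is a hypothetical, rough stand-in for a real width table (such as the `unicode-width` crate), and the sketch walks `char`s rather than grapheme clusters.

```rust
/// Hypothetical, rough width table: a few common East Asian ranges count as
/// two cells, everything else as one. A real implementation would use a
/// proper Unicode width table.
fn display_width(c: char) -> usize {
    match c as u32 {
        0x1100..=0x115F | 0x2E80..=0xA4CF | 0xAC00..=0xD7A3 | 0xF900..=0xFAFF | 0xFF00..=0xFF60 => 2,
        _ => 1,
    }
}

/// Keep leading characters while the accumulated display width still fits
/// inside `budget` terminal cells (not `budget` codepoints).
fn truncate_to_width_sketch(line: &str, budget: usize) -> String {
    let mut used = 0;
    let mut out = String::new();
    for c in line.chars() {
        let w = display_width(c);
        if used + w > budget {
            break;
        }
        used += w;
        out.push(c);
    }
    out
}
```

With a budget of 3, a string of double-width characters keeps only one of them, where counting codepoints would have kept three and overrun the budget.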
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce `RlmChildClient` — a dyn-compatible `#[async_trait]` wrapper around
the single create_message operation — so tests can inject a `MockRlmClient`
without a live API key. This replaces the direct `Arc<DeepSeekClient>` field
with `Arc<dyn RlmChildClient>`, wired transparently via `RlmQueryTool::new`.
Concurrency regression test (`rlm_parallel_fanout_overlaps_not_serialized`):
fires N=4 children each sleeping 50 ms through `join_all`. Asserts total
elapsed < 4×50 ms (serial bound) and that all start timestamps fall within
50 ms of each other. First run: total_elapsed=54 ms, start_spread=141 µs.
Fan-out was already correct; no serialization fix needed.
UI wiring tests (`rlm_query_tool_cell_wired_with_prompts_on_start` etc.) verify
that `handle_tool_call_started` with `rlm_query` populates `GenericToolCell.prompts`
from the `prompts` (array) and `prompt` (singular) input shapes, and that
non-fan-out tools leave `prompts: None`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Implement `deepseek metrics` as a dispatcher-handled subcommand (no TUI
binary roundtrip) that reads ~/.deepseek/audit.log, session JSON files,
and tasks runtime JSONL event streams, then prints a human-readable
usage rollup aggregated by tool name, compaction events, sub-agent
spawns, and capacity-controller interventions.
Flags: --json (machine-readable) and --since DURATION (e.g. 7d, 24h,
30m, now-2h, 2h30m). Empty/missing audit log exits 0 with an empty
rollup; malformed lines are skipped silently via tracing::trace!.
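A minimal sketch of how the `--since` forms listed above could be parsed, assuming only the `d`, `h`, and `m` units and an optional `now-` prefix. `parse_since_sketch` is a hypothetical name; the shipped parser may accept more forms.

```rust
use std::time::Duration;

/// Parse "7d", "24h", "30m", "2h30m", or "now-2h" into a Duration.
/// Returns None for empty input, an unknown unit, or a number with no unit.
fn parse_since_sketch(input: &str) -> Option<Duration> {
    let body = input.trim();
    let body = body.strip_prefix("now-").unwrap_or(body);
    if body.is_empty() {
        return None;
    }
    let mut total_secs: u64 = 0;
    let mut digits = String::new();
    for c in body.chars() {
        if c.is_ascii_digit() {
            digits.push(c);
            continue;
        }
        let unit_secs: u64 = match c {
            'd' => 86_400,
            'h' => 3_600,
            'm' => 60,
            _ => return None,
        };
        // Fails (returns None) when a unit appears without a preceding number.
        let n: u64 = digits.parse().ok()?;
        total_secs = total_secs.checked_add(n.checked_mul(unit_secs)?)?;
        digits.clear();
    }
    // Leftover digits mean the input ended without a unit, e.g. "30".
    if digits.is_empty() {
        Some(Duration::from_secs(total_secs))
    } else {
        None
    }
}
```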
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the single-spout bounce animation with two independent `╭───╮`
arcs sweeping at different speeds across a calm `─` water surface. Add
`footer_working_label` to pulse `working` → `working...` at 400 ms
cadence while a turn is live. The dot-pulse fires even in low-motion
mode; the arc strip is gated behind `!app.low_motion`. Frame math is
purely deterministic so the test suite can pin specific frames.
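The deterministic frame math for the dot pulse might look like the sketch below. It assumes a four-frame cycle (zero to three dots) keyed only on elapsed milliseconds; `working_label_at` is a hypothetical name, not the real `footer_working_label` signature.

```rust
/// One frame every 400 ms, cycling through 0, 1, 2, 3 trailing dots.
/// Pure function of elapsed time, so a test can pin any frame.
fn working_label_at(elapsed_ms: u64) -> String {
    let dots = ((elapsed_ms / 400) % 4) as usize;
    format!("working{}", ".".repeat(dots))
}
```

Because the label depends only on its argument, tests can assert specific frames without sleeping or mocking a clock.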
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #77, refs #75.
#77: bracketed paste was unconditionally enabled at terminal init. Add a
`bracketed_paste` field to Settings (default true) and propagate it through
TuiOptions → App → run_tui / pause_terminal / resume_terminal so users on
the rare terminal that mishandles `\e[?2004h` can disable it via
`/set bracketed_paste off` or `bracketed_paste = false` in
`~/.config/deepseek/settings.toml`. Modern terminals continue to work as
before. All TuiOptions construction sites updated in one pass.
#75: added an ignored-test microbench for `compute_profile` in
`crates/tui/src/core/capacity.rs`. Run with:
cargo test -p deepseek-tui --release bench_compute_profile -- --ignored --nocapture
Baseline (release, M1):
window= 16 per-call= 48ns
window= 64 per-call= 126ns
window= 256 per-call= 385ns
window=1024 per-call=1438ns
Sub-µs at typical window sizes — no optimization shipped, bench locks in
the regression contract. No new dev-deps (uses std::time::Instant +
black_box, gated as #[ignore]).
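The shape of such a dependency-free microbench, sketched with a hypothetical workload (`sum_window` stands in for `compute_profile`, whose signature is not shown here):

```rust
use std::hint::black_box;
use std::time::Instant;

/// Hypothetical workload standing in for the function under test.
fn sum_window(window: &[u64]) -> u64 {
    window.iter().copied().sum()
}

/// Average nanoseconds per call over `iters` calls. `black_box` keeps the
/// optimizer from hoisting or deleting the call.
fn per_call_ns(window: &[u64], iters: u32) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        black_box(sum_window(black_box(window)));
    }
    start.elapsed().as_nanos() as f64 / f64::from(iters)
}

/// Skipped by default; run with `-- --ignored --nocapture` to see timings.
#[test]
#[ignore]
fn bench_sum_window() {
    for size in [16usize, 64, 256, 1024] {
        let window: Vec<u64> = (0..size as u64).collect();
        println!("window={size:4} per-call={:.0}ns", per_call_ns(&window, 100_000));
    }
}
```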
The user-facing entry point for every flow is the `deepseek` dispatcher
(crates/cli), not `deepseek-tui`. Future agent sessions and example
commands should default to `deepseek` / `cargo run --bin deepseek`.
Mirror the same directive in the local CLAUDE.md (gitignored).
Highlights:
- fix(client): SSE idle-timeout so a stalled stream surfaces a clear error
instead of hanging the active cell (#76)
- fix(tui): sidebar Agents panel reads live engine progress, not just the
cached snapshot — matches the footer chip in real time (#63)
- fix(tui): generic tool result preview preserves newlines for diff stats
/ file lists / todo snapshots (#80)
- fix(tui): slash-menu scroll viewport now exercises center-tracking past
the first 6 entries (#64)
- feat(mcp): connect-failure errors include URL, status, body excerpt,
transport — credentials masked (#71)
- feat(tools): mark alias tools (spawn_agent, close_agent, send_input,
delegate_to_agent) with _deprecation metadata; removal slated 0.8.0 (#72)
- feat(capacity): V4 model priors (deepseek-v4-pro/flash) + key
normalization, plus DEEPSEEK_CAPACITY_PRIOR_V4_* env overrides (#73)
- feat(tools): explain parallel fan-out caps in agent_spawn vs rlm_query
descriptions and error messages — cost-class table in TOOL_SURFACE.md (#81)
- chore(errors): partial wiring of the error taxonomy — classify_error_message
helper used in capacity controller, audit log fields pending (#66)
- chore(providers): scaffold OpenRouter and Novita variants end-to-end
(env keys, default base URLs, model normalization). Modal /provider
picker UI still pending (#52)
Build hygiene:
- cargo fmt clean, cargo clippy --workspace -- -D warnings clean
- cargo test --workspace passes (979+ tests across crates)
- pre-existing dead-code warnings gated per-item with TODO refs to #61/#66
Cleanup pass after the issue fixes (#64, #71, #80, #63):
Simplifications:
- sidebar.rs: extract `push_agent_row` closure to remove the duplicated
two-line agent rendering (cached + progress-only paths used the same
shape with different summary text).
- engine.rs: replace `error_categories.iter().any(|c| c == X)` with
`.contains(&X)` (clippy::manual_contains).
- widgets/mod.rs: replace `for idx in menu_top..menu_bottom` index loop
with `.iter().enumerate().take(menu_bottom).skip(menu_top)`
(clippy::needless_range_loop).
Build hygiene (CI runs `cargo clippy ... -- -D warnings`):
- error_taxonomy.rs: per-item `#[allow(dead_code)]` on `ErrorSeverity`,
`ErrorEnvelope`, and `ErrorEnvelope::new` with TODO notes referencing
#66. Keeps deepseek's removal of the file-wide allow but stops the
scaffold from breaking the build until #66 follows up.
- app.rs: per-field `#[allow(dead_code)]` on `fancy_animations` (pending
#61 footer animation consumer).
- config/lib.rs: complete the OpenRouter/Novita variant scaffolding so
`match ProviderKind { ... }` is exhaustive — add api_key/base_url env
loading (`OPENROUTER_API_KEY`, `NOVITA_API_KEY`, optional `*_BASE_URL`
overrides), wire `api_key_for` / `base_url_for` arms with the documented
defaults, and extend `normalize_model_for_provider` so generic V4 model
names map to each provider's catalog ID. Full /provider picker UI still
pending #52.
Verified: cargo fmt clean, cargo clippy --workspace --all-targets
--all-features --locked -- -D warnings clean, full test suite passes
(979 + adjacent crate tests).
Repro: spawn 5 sub-agents. The footer chip correctly shows "5 agents" because
running_agent_count() unions app.agent_progress (live engine events) with
app.subagent_cache (settled snapshot from Op::ListSubAgents). The sidebar's
Agents panel only read app.subagent_cache and so showed "No agents" while
the footer said 5 — same data flow bug the user screenshotted in #63.
Mirror the footer's union here:
- Live progress-only IDs (in agent_progress, not yet in subagent_cache) get a
one-line "starting" row with the latest progress message — surfaces the
freshest signal first.
- Cached entries get the full status row (steps taken, role, objective).
- Header shows "{live_running} running / {total}" with both counts unified.
The Agents panel now stays in sync with the footer chip and never lies
about whether agents are in flight. Todos panel was already wired correctly
to app.todos (the SharedTodoList lock); only the agents path was racing.
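The union described above, sketched with plain maps. The types and names here (`AgentRow`, `agent_rows`, string-valued maps) are illustrative assumptions, not the real `App` fields.

```rust
use std::collections::{BTreeMap, BTreeSet};

#[derive(Debug, PartialEq)]
enum AgentRow {
    /// Seen only in live progress events: one-line "starting" row.
    Starting { id: String, message: String },
    /// Present in the settled snapshot: full status row.
    Cached { id: String, status: String },
}

/// Progress-only IDs come first, then every cached entry.
fn agent_rows(
    progress: &BTreeMap<String, String>,
    cache: &BTreeMap<String, String>,
) -> Vec<AgentRow> {
    let starting = progress
        .iter()
        .filter(|(id, _)| !cache.contains_key(*id))
        .map(|(id, message)| AgentRow::Starting { id: id.clone(), message: message.clone() });
    let cached = cache
        .iter()
        .map(|(id, status)| AgentRow::Cached { id: id.clone(), status: status.clone() });
    starting.chain(cached).collect()
}

/// Total distinct agents across both sources, as used for the header count.
fn total_agents(progress: &BTreeMap<String, String>, cache: &BTreeMap<String, String>) -> usize {
    progress.keys().chain(cache.keys()).collect::<BTreeSet<_>>().len()
}
```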
Refs #63
Before, GenericToolCell rendered its `output` through `render_compact_kv`, which
treated the entire string as one logical line and let the wrapper handle
overflow. Multi-line output (git diff --stat, todo snapshots, file lists)
ended up squashed into a single hard-wrapped blob — the screenshot in the
issue showed "Cargo.lock | 1 + crates/cli/Cargo.toml | 1 + crates/cli/src/main.rs"
all on one row.
Switch the result rendering to `render_tool_output_mode` (already used by
ExecCell) which:
- splits on `\n` first, then wraps each line independently;
- caps live view at TOOL_OUTPUT_LINE_LIMIT (= 6) rows with a "+N more lines;
press v for details" affordance;
- emits the full body in transcript view.
Threaded `RenderMode` through `ToolCell::Generic(...)` dispatch and renamed
`GenericToolCell::lines_with_motion` → `lines_with_mode(mode)` (sole caller).
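A rough sketch of the split-then-wrap-then-cap behaviour described above. It assumes a naive character-count hard wrap and that the affordance row is appended after the six kept rows; `render_output_sketch` and `Mode` are hypothetical names, not the real `render_tool_output_mode` API.

```rust
const LINE_LIMIT: usize = 6;

#[derive(Clone, Copy, PartialEq)]
enum Mode {
    Live,
    Transcript,
}

/// Naive hard wrap by character count; an empty line stays one empty row.
fn wrap_line(line: &str, width: usize) -> Vec<String> {
    let chars: Vec<char> = line.chars().collect();
    if chars.is_empty() {
        return vec![String::new()];
    }
    chars.chunks(width.max(1)).map(|c| c.iter().collect()).collect()
}

/// Split on '\n' first, wrap each line on its own, then cap only in live mode.
fn render_output_sketch(output: &str, width: usize, mode: Mode) -> Vec<String> {
    let mut rows: Vec<String> = output
        .split('\n')
        .flat_map(|line| wrap_line(line, width))
        .collect();
    if mode == Mode::Live && rows.len() > LINE_LIMIT {
        let hidden = rows.len() - LINE_LIMIT;
        rows.truncate(LINE_LIMIT);
        rows.push(format!("+{hidden} more lines; press v for details"));
    }
    rows
}
```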
Tests:
- `generic_tool_cell_preserves_multi_line_output_in_transcript` asserts each
diff-stat file lands on its own row.
- `generic_tool_cell_caps_multi_line_output_in_live_with_affordance` pins the
live cap + affordance + transcript-includes-everything contract.
Fixes #80
Before: a failed MCP server connection just said "Failed to connect to SSE: 401" or
"Failed to spawn MCP server 'foo'" — devs had to enable RUST_LOG=debug to see
what actually went wrong.
Now:
- SSE failures show "MCP SSE rejected (transport=http url=... status=401):
<body excerpt up to 200 bytes>", with userinfo + bearer tokens + api_key
query params masked.
- stdio spawn failures show "MCP stdio spawn failed (transport=stdio
server=foo cmd="..." args=[...] env_keys=[...])". Env values stay private;
only the key names are shown.
Helpers `mask_url_secrets`, `redact_body_preview`, `bounded_body_excerpt` are
covered by 4 unit tests.
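For illustration, URL masking along these lines could be done with plain string splitting. This sketch is an assumption-laden stand-in for `mask_url_secrets`: it handles only userinfo and an `api_key` query parameter, and does not touch bearer tokens or response bodies.

```rust
/// Mask userinfo and any `api_key` query value in a URL-like string.
fn mask_url_sketch(url: &str) -> String {
    let (scheme, rest) = match url.split_once("://") {
        Some((s, r)) => (format!("{s}://"), r),
        None => (String::new(), url),
    };
    // The authority ends at the first path, query, or fragment delimiter.
    let auth_end = rest.find(&['/', '?', '#'][..]).unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(auth_end);
    let authority = match authority.rsplit_once('@') {
        Some((_, host)) => format!("***@{host}"),
        None => authority.to_string(),
    };
    // Peel the fragment off first so a '?' inside it is not treated as a query.
    let (before_frag, frag) = match tail.split_once('#') {
        Some((b, f)) => (b, format!("#{f}")),
        None => (tail, String::new()),
    };
    let masked_tail = match before_frag.split_once('?') {
        Some((path, query)) => {
            let pairs: Vec<String> = query
                .split('&')
                .map(|pair| match pair.split_once('=') {
                    Some((k, _)) if k.eq_ignore_ascii_case("api_key") => format!("{k}=***"),
                    _ => pair.to_string(),
                })
                .collect();
            format!("{path}?{}", pairs.join("&"))
        }
        None => before_frag.to_string(),
    };
    format!("{scheme}{authority}{masked_tail}{frag}")
}
```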
Fixes #71
The composer's render path already paginates with center-tracking, but the
source list was hard-capped at 6 entries — so pressing Down arrow past
index 5 had no entries to land on. Repro: with ~37 slash commands, hitting
Down repeatedly stuck at the last visible row.
Bumping the source cap to 128 lets the existing viewport scroll logic
exercise the full filtered command list. No render-path change needed.
Fixes #64
Add `wrap_with_deprecation_notice` helper in the subagent module that
merges a `_deprecation` block into a ToolResult's metadata. Applied
exclusively on alias invocations:
- `spawn_agent` → use `agent_spawn` (removed in v0.8.0)
- `delegate_to_agent` → use `agent_spawn` (removed in v0.8.0)
- `close_agent` → use `agent_cancel` (removed in v0.8.0)
- `send_input` → use `agent_send_input` (removed in v0.8.0)
Canonical names are unaffected. Each alias invocation also emits a
`tracing::warn` so the deprecation appears in audit logs. Documents
the deprecation schedule in `docs/TOOL_SURFACE.md`. Four unit tests
verify the notice shape and that canonical tools stay clean.
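The alias table above reduces to a small lookup. A sketch, with `deprecation_for` as a hypothetical helper name (the real `wrap_with_deprecation_notice` operates on a ToolResult's metadata, which is not modelled here):

```rust
/// Returns (canonical replacement, removal version) for a deprecated alias,
/// or None when the tool name is already canonical or unknown.
fn deprecation_for(tool: &str) -> Option<(&'static str, &'static str)> {
    let replacement = match tool {
        "spawn_agent" | "delegate_to_agent" => "agent_spawn",
        "close_agent" => "agent_cancel",
        "send_input" => "agent_send_input",
        _ => return None,
    };
    Some((replacement, "v0.8.0"))
}
```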
Refs #72
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add deepseek_v4_pro (3.5) and deepseek_v4_flash (4.2) priors to
CapacityControllerConfig::default() so V4 models are no longer silently
mapped to the generic 3.8 fallback.
Extend normalize_model_prior_key to match v4-pro, v4_pro, v4-flash,
v4_flash, and deepseek-ai/-prefixed NIM identifiers before the V3/
reasoner branches to prevent cross-matches. V3 and reasoner fallbacks
are unchanged.
Add deepseek_v4_pro_prior / deepseek_v4_flash_prior fields to
CapacityConfig (config.toml) and DEEPSEEK_CAPACITY_PRIOR_V4_PRO /
DEEPSEEK_CAPACITY_PRIOR_V4_FLASH env-var overrides, matching the
existing V3 pattern.
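A simplified sketch of the key normalization and prior lookup, using the three values quoted above (3.5, 4.2, and the 3.8 fallback). The V3 and reasoner branches are deliberately left out because their values are not stated here, and `prior_for_model_sketch` is a hypothetical name.

```rust
/// Map a model identifier to its capacity prior. V4 spellings are checked
/// before anything else; every other model gets the generic fallback.
fn prior_for_model_sketch(model: &str) -> f64 {
    let key = model.trim().to_ascii_lowercase();
    // Drop the NIM-style vendor prefix and unify `_` / `-` spellings.
    let key = key
        .strip_prefix("deepseek-ai/")
        .unwrap_or(key.as_str())
        .replace('_', "-");
    if key.contains("v4-pro") {
        3.5
    } else if key.contains("v4-flash") {
        4.2
    } else {
        3.8
    }
}
```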
Refs #73
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The OIDC Trusted Publisher path for npm has 404'd on PUT for v0.5.1,
v0.5.2, and v0.6.1, even with valid OIDC tokens. Switch publish-npm and
publish-npm-manual to a classic NPM_TOKEN automation token (set the
NPM_TOKEN repo secret to a granular access token scoped to deepseek-tui
with publish permission) so future releases ship reliably.
Also add .github/workflows/auto-tag.yml: when the workspace version on
main changes, push the matching v$VERSION tag automatically so release.yml
fires without a manual tag push. Requires a RELEASE_TAG_PAT secret to
trigger downstream workflows (GITHUB_TOKEN tag pushes don't trigger
on: push: tags by design).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- V4 cache-hit input prices cut to 1/10th per DeepSeek pricing update:
Pro promo 0.03625→0.003625, Pro base 0.145→0.0145, Flash 0.028→0.0028
- Remove the 'light' theme variant (Variant::Light, Theme::light(), test)
- Remove the theme setting entirely — hardcode UI_THEME to whale/dark,
drop the theme field from Settings, ConfigView, and config command
- Bump workspace version 0.6.0 → 0.6.1 (Cargo.toml, npm pkg, CHANGELOG)
- De-cringe the README: drop emojis, marketing fluff, unverified claims
Split client.rs into client/mod.rs (public API + helpers), client/chat.rs
(chat-completions streaming), and client/responses.rs (responses API
helpers). Internal helpers promoted to pub(super) for intra-module
visibility; the public DeepSeekClient API is unchanged.
While here, redesign all five system prompts around a decomposition-first
philosophy inspired by the mismanaged-geniuses hypothesis (Zhang et al.,
2026). The model is now instructed to todo_write / update_plan before
acting, fan out sub-agents for parallel work, and keep the sidebar
populated so the user always sees what's happening. Mode prompts updated:
- agent.txt: 'Before requesting approval, lay out work with todo_write'
- plan.txt: 'Use update_plan for strategy, todo_write for tactics'
- yolo.txt: 'Even with auto-approval, create a todo_write first'
- normal.txt: same pattern for legacy compatibility
Update CHANGELOG [Unreleased] and README modes section accordingly.
Splits `core/engine.rs` (4670 → 4314 lines) into a small folder module:
- `engine/approval.rs` (~125 lines) — `ApprovalDecision`,
`UserInputDecision`, `ApprovalResult`, plus the two handshake
methods `Engine::await_tool_approval` and `Engine::await_user_input`.
- `engine/dispatch.rs` (~300 lines) — tool-input parsing
(`final_tool_input`, `parse_tool_input`, fenced/JSON segment
helpers), `multi_tool_use.parallel` payload parser, dispatch policy
predicates (`should_parallelize_tool_batch`,
`should_force_update_plan_first`, `should_stop_after_plan_tool`,
the read-only MCP tool helpers), and the
`ToolExecutionPlan`/`ToolExecOutcome`/`ParallelToolResult*`/
`ToolExecGuard` types the batch driver passes around.
The public engine surface (`EngineConfig`, `EngineHandle`,
`spawn_engine`, `MockEngineHandle`, `mock_engine_handle`,
`compact_tool_result_for_context`, `TOOL_CALL_*_MARKERS`,
`FAKE_WRAPPER_NOTICE`) stays in `engine.rs` — every external user
imports unchanged.
Not split this round: the 1268-line `handle_deepseek_turn` method.
Carving its inline parallel/sequential dispatch and approval handshake
arms requires extracting two new methods from a borrow-heavy turn
loop; flagged in the v0.6.0 audit doc as future work.
Workspace tests: 1011/1011 still green. No clippy regressions.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts `visible_slash_menu_entries`, `apply_slash_menu_selection`, and
`try_autocomplete_slash_command` from `tui/ui.rs` into a sibling
module. Drops the now-unused `slash_completion_hints` import from
`ui.rs` (the new module imports it directly).
Kept separate from `tui::file_mention` per the audit doc — the two
popups have distinct trigger characters, ranking, and post-selection
behaviour even though they share UI scaffolding.
`ui.rs`: ~5070 → ~4990 lines.
Workspace tests: 1011/1011 still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lifts `handle_paste_burst_key`, `handle_paste_burst_decision`,
`apply_paste_burst_retro_capture`, and the local `in_command_context`
helper out of `tui/ui.rs` into a sibling module. The state machine
(`PasteBurst`) and its tests stay in `paste_burst.rs`; only the keymap-
side wiring moves. Drops the now-unused `CharDecision` import from
`ui.rs`.
Workspace tests: 1011/1011 still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Moves the four sidebar panels (Plan, Todos, Tasks, Agents) plus the
shared `render_sidebar_section` wrapper out of `tui/ui.rs` into a new
sibling module. `truncate_line_to_width` becomes `pub(crate)` so the
new module can reuse it. Drops six imports from `ui.rs` that the
sidebar took with it.
`ui.rs`: 5450 → ~5070 lines.
Workspace tests: 1011/1011 still green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Routes Thinking content through `active_cell` so a turn that emits
Thinking → Tool → Tool renders as one logical "Working…" block until
the next assistant prose chunk flushes the group into history.
- `ActiveCell::push_thinking` parallels `push_tool` for non-tool entries.
- `mark_in_progress_as_interrupted` now also stops streaming Thinking
spinners on cancellation, matching tool cell behaviour.
- New `streaming_thinking_active_entry` field on `App` tracks the
in-flight thinking entry index so deltas can mutate it in place.
- `flush_active_cell` finalizes any unclosed thinking spinner before
draining the group into history (defensive guard).
- Removed the dead `StreamingCellKind::Thinking` variant and tightened
`append_streaming_text` to Assistant only.
Tests cover: push_thinking, group ordering, drain order, interrupt-
clears-spinner, the full Thinking → Tool → Tool → flush flow,
defensive flush of an unclosed thinking block, and a second thinking
block appending inside the same active cell.
Workspace tests: 1004/1004 → 1011/1011.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Removed:
- `.claude/next-agent-prompt.md` (111 lines) — v0.4.6-era session prompt
describing slices A/B/C that have all shipped. Successive sessions use
fresh prompts (e.g. .deepseek/v0.6.0-overnight-review.md); this one is
pure history.
- `docs/archive/workspace_migration_status.md` (92 lines) — explicitly
archived (April 11), describes a one-time migration that's complete.
Removed enclosing `docs/archive/` directory too (was the only file).
CHANGELOG entry from v0.4.x still narrates the archival as history.
- `docs/parity_release_and_ci.md` (38 lines) — duplicates what
`.github/workflows/parity.yml` and CONTRIBUTING.md already say
authoritatively. Single source of truth wins.
- `AI_HANDOFF.md` + `todo.md` (untracked, no commit needed) — `todo.md`
was a 7-line pointer to AI_HANDOFF.md, which itself was an April 11
snapshot listing "remaining work" that's mostly delivered. CLAUDE.md
is the live developer guide now.
1004/1004 tests still green; no doc/code references broken.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Not referenced from README.md, docs/, npm/, or any Cargo metadata.
README uses assets/screenshot.png. Reduces repo size by 226 KB.
Also cleaned up working-directory cruft (untracked, no commit needed):
apps/ (empty), python/ (empty after egg-info removed),
counterpoint.copilot.db, firebase-debug.log, excalidraw.log, .DS_Store.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Auto-format pass after the tool-call rendering work, footer chip,
mention popup, subagent split, and parse-counter de-flake.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Promote `tools/subagent.rs` (4206 lines) to a folder module:
tools/subagent/
mod.rs — runtime types, manager, tool implementations (~3577 lines)
tests.rs — extracted test module (~631 lines)
This is the safe first step. The audit doc proposed a 4-way split
(mod / spec / executor / tests). I tried the 3-way (mod / tools / tests)
and the runtime <-> tool-impl coupling produces unresolved-symbol errors
because shared helpers (`SubAgentTask`, `run_subagent_task`,
`build_allowed_tools`, `normalize_role_alias`, `parse_spawn_request`,
the agent prompt constants) are referenced from both layers. Doing that
split right needs a small API design pass to decide which helpers
graduate to the manager API and which stay tool-private — out of scope
for a structural reorg. Pulled the test module out as the cleanest
no-API-change win and left a path open for the bigger split later.
Public API unchanged — `pub mod subagent;` still exports the same items
because `mod.rs` is a drop-in replacement for `subagent.rs`.
954 → 954 tests, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The audit doc claimed the wiring was "in place" but only the App state
fields existed (`mention_menu_selected`, `mention_menu_hidden`) — no
helpers, no widget rendering, no key handling. Building it out fully so
the popup actually shows when the user types `@` in the composer and
Up/Down/Enter/Tab/Esc behave the way the slash menu does.
What's new:
1. `file_mention::visible_mention_menu_entries(app, limit)` — the entries
source. Returns `Vec<String>` from the workspace walk, gated on the
`mention_menu_hidden` flag and on the cursor being inside an `@token`.
2. `file_mention::apply_mention_menu_selection(app, entries)` — splices
the selected entry into the input via the existing `replace_file_mention`,
resets `mention_menu_hidden`, surfaces a status confirmation.
3. `ComposerWidget::new(app, max_height, slash_entries, mention_entries)`
— second menu slot. The widget renders whichever slice is non-empty,
addressed by the matching selected index. Mention entries get an `@`
prefix so the popup row reads like the actual mention being composed.
Mention takes precedence (positional check is stricter than slash's
"starts-with-/").
4. ui.rs key handler:
- Up/Down navigate `mention_menu_selected` when the popup is open.
- Enter applies `apply_mention_menu_selection` instead of submitting.
- Tab applies the selection (then falls through to the existing slash /
command-completion / file-mention chain).
- Esc hides the popup until the next input edit (`insert_str` already
resets `mention_menu_hidden`, so typing re-opens it).
6 new tests in `ui/tests.rs`:
- mention_popup_is_empty_when_cursor_is_not_in_a_mention
- mention_popup_lists_workspace_matches_for_cursor_partial
- mention_popup_respects_hidden_flag
- apply_mention_menu_selection_splices_selected_entry
- apply_mention_menu_selection_is_noop_outside_a_mention
- apply_mention_menu_selection_with_no_entries_is_noop
Also fixes a stray duplicate `#[cfg(...)]` and an unused-doc-comment
warning that landed when the parse-counter went thread-local — back to
baseline 7 clippy warnings.
948 → 954 tests, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Audit pass found the auto-scroll paths are already gated correctly:
- `mark_history_updated` only bumps history_version + needs_redraw — does
NOT scroll.
- All tool-cell handlers (`handle_tool_call_started`,
`handle_tool_call_complete`, `push_active_tool_cell`,
`register_tool_cell`) call `mark_history_updated` only — none of them
call `scroll_to_bottom`.
- `add_message` and `flush_active_cell` gate their auto-scroll on
`user_scrolled_during_stream`.
- The per-stream lock clears at TurnComplete (ui.rs ~557) and when the
user scrolls back to the live tail (widgets/mod.rs ~126).
- Explicit user actions (vim G, End, session resume, message submit) call
`scroll_to_bottom` directly — that's correct.
5 new regression tests in ui/tests.rs lock the contract so a future
contributor adding `app.scroll_to_bottom()` to a tool-cell handler hits a
red CI immediately:
- add_message_does_not_scroll_when_user_scrolled_away
- add_message_pins_to_tail_when_user_was_following
- tool_call_started_does_not_scroll_when_user_scrolled_away
- tool_call_complete_does_not_scroll_when_user_scrolled_away
- mark_history_updated_does_not_call_scroll_to_bottom
948 → 948 (no changes; tests were already passing — they just weren't
written yet).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`parse_invocations_increment` and `render_parsed_does_not_call_parse` both
read the global PARSE_INVOCATIONS atomic. They were racing whenever any
other test in the suite called `parse()` in parallel — the global counter
would tick once for each unrelated call and the assertion (== 2 / == 0)
would mismatch.
Switching to `thread_local!<Cell<u64>>` gives each test thread its own
counter, so concurrent callers from other tests can't pollute the result.
Tested across 8 sequential full-suite runs: 8/8 green (was ~40% green).
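The pattern in miniature: a per-thread counter that calls on other threads cannot disturb. `parse_sketch` and `PARSE_CALLS` are illustrative names, not the project's actual items.

```rust
use std::cell::Cell;

thread_local! {
    /// Each thread gets its own counter, starting at zero.
    static PARSE_CALLS: Cell<u64> = Cell::new(0);
}

/// Stand-in for the instrumented function: bumps the calling thread's counter.
fn parse_sketch(input: &str) -> usize {
    PARSE_CALLS.with(|c| c.set(c.get() + 1));
    input.split_whitespace().count()
}

/// Number of `parse_sketch` calls made on the current thread.
fn parse_calls() -> u64 {
    PARSE_CALLS.with(|c| c.get())
}
```

Since the Rust test harness runs each test on its own thread, an exact-count assertion only sees the calls made by that test.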
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
FooterProps gains an `agents` chip slot, populated by `footer_agents_chip`
which mirrors the rest of the footer chips: empty `Vec<Span>` when
`running_agent_count == 0` (chip hides), "1 agent" / "N agents" otherwise,
DeepSeek-sky color matching the model badge.
The widget's `auxiliary_spans` includes it in the same drop-from-end
fit-to-width chain as the existing chips, so on narrow terminals the cost
chip drops first as before.
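The chip's label rule is small enough to sketch directly. This hypothetical helper returns `None` where the real `footer_agents_chip` returns an empty `Vec<Span>`, and it leaves out styling.

```rust
/// Hidden at zero, singular at one, plural otherwise.
fn agents_chip_label(running: usize) -> Option<String> {
    match running {
        0 => None,
        1 => Some("1 agent".to_string()),
        n => Some(format!("{n} agents")),
    }
}
```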
The "0 running" wording the audit doc called out wasn't actually in
FooterProps — that wording is in the agent sidebar (ui.rs ~2960) and was
already fixed there to swap to "N done" once nothing is in flight. So the
P2.5 work here is the additive footer surface, not a wording fix.
4 new tests in widgets/footer.rs:
- footer_agents_chip_is_empty_when_no_agents_running
- footer_agents_chip_uses_singular_for_one
- footer_agents_chip_uses_plural_for_many
- footer_agents_chip_renders_into_widget
939 → 943 tests, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The engine used to fire `Event::ToolCallStarted` from `ContentBlockStart::ToolUse`
with `input: json!({})` — before any `Delta::InputJsonDelta` had streamed in.
The UI's `handle_tool_call_started` baked the placeholder into the cell at
creation time and never refreshed, so users saw `<command>` and `<file>`
literals while the args finished streaming.
Fix relocates the emission to `ContentBlockStop` (where the input is finalized
already) and routes it through a new `final_tool_input(state)` helper that
prefers the parsed buffer over a stale empty initial input. Three regression
tests in `engine/tests.rs` pin the contract.
Also bundled (same theme — make in-flight tool cells read right):
- Progressive labels via `exploring_label`: "Read foo.rs" → "Reading foo.rs",
"List X" → "Listing X", "Search pattern" → "Searching for `pattern`",
"List files" → "Listing files". 5 tests in `ui/tests.rs`.
- `running_status_label_with_elapsed` in `history.rs`: from 3 s onward the
status segment becomes `running (Ns)` and ticks every second, driven by
the existing CX#3 status-animation tick. Below 3 s no badge — quick
reads/greps stay quiet. Wired through `render_tool_header`. 2 tests.
- Spinner cadence sped up: `TOOL_STATUS_SYMBOL_MS` 1800 → 720 ms per glyph,
so the 4-glyph "heartbeat" is ~2.88 s instead of ~7.2 s.
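The elapsed badge and the heartbeat arithmetic above, as a sketch. `running_label_sketch` is a hypothetical simplification of `running_status_label_with_elapsed`; it returns `None` below the 3 s threshold, where the real function's quiet-state output is not shown here.

```rust
const TOOL_STATUS_SYMBOL_MS: u64 = 720;
const HEARTBEAT_GLYPHS: u64 = 4;

/// No badge under 3 s; from 3 s onward, whole elapsed seconds.
fn running_label_sketch(elapsed_ms: u64) -> Option<String> {
    let secs = elapsed_ms / 1000;
    if secs >= 3 {
        Some(format!("running ({secs}s)"))
    } else {
        None
    }
}

/// Full spinner cycle: 4 glyphs x 720 ms = 2880 ms (was 4 x 1800 = 7200 ms).
fn heartbeat_ms() -> u64 {
    TOOL_STATUS_SYMBOL_MS * HEARTBEAT_GLYPHS
}
```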
929 → 939 tests, 0 failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces "tool start pushes new cell" with a single ActiveCell that
collects parallel/serial tool entries at the transcript tail and
flushes as a contiguous block on first assistant text or turn complete.
Stops the bounce when many tools fire concurrently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codex pattern — instead of appending a new ToolCall history cell for each
parallel tool invocation, keep one Exploring/Searching/Reading active cell at
the tail of the transcript and mutate its contents in place as new tool calls
fire. Drops cell churn and keeps the visual anchor stable while multiple tools
stream concurrently.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- HistoryCell::Thinking — live shows first ~4 lines + Ctrl+O affordance;
transcript_lines() returns full content with all paragraphs.
- ExecCell — live caps with head/tail + omission marker; transcript
emits all wrapped lines without truncation.
- Tool/Patch/Mcp/Review cells — live caps + affordance; transcript
uncapped.
- User/Assistant/System/Plan/Diff/etc — display == transcript.
- Pager (Ctrl+O / Ctrl+T) flows through transcript_lines via
history_cell_to_text — opening the pager on a thinking or capped tool
cell shows the full body.
Updated affordance assertion to match the post-CX#9 wording
(press Ctrl+O for full text).
911/911 tests pass; clippy -D warnings clean; fmt clean.
CX#5 (newline-boundary streaming gate):
- New crates/tui/src/tui/streaming/line_buffer.rs — LineBuffer holds
text after the last \n until the next \n arrives, so partial code
fences never become visible state.
- Wired into BlockState in streaming/mod.rs. Assistant text gates;
thinking deltas bypass (reasoning stays live).
- 9 unit tests including the partial-fence regression case.
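A minimal sketch of the newline gate, assuming a push/flush interface. `LineBufferSketch` is illustrative; the real `LineBuffer` API may differ.

```rust
/// Holds back everything after the last '\n' until another '\n' arrives.
#[derive(Default)]
struct LineBufferSketch {
    pending: String,
}

impl LineBufferSketch {
    /// Append a streamed delta. Returns the text up to and including the
    /// last newline, or None when no complete line is available yet.
    fn push(&mut self, delta: &str) -> Option<String> {
        self.pending.push_str(delta);
        let cut = self.pending.rfind('\n')? + 1;
        let rest = self.pending.split_off(cut);
        Some(std::mem::replace(&mut self.pending, rest))
    }

    /// End of stream: release whatever partial line is still held.
    fn flush(&mut self) -> Option<String> {
        if self.pending.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.pending))
        }
    }
}
```

A code fence that arrives split across two deltas is released only once its line is complete, so the renderer never sees a partial fence.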
CX#11 (pure-render footer):
- New crates/tui/src/tui/widgets/footer.rs — FooterProps / FooterToast
/ FooterWidget. Pure render of pre-computed props.
- ui.rs::render_footer rewritten to build props once and delegate to
FooterWidget. Visual output identical; existing 10 footer tests
pass unchanged. 5 new from_app tests for the props builder.
908/908 tests pass; cargo clippy --workspace -D warnings clean;
cargo fmt clean.
Adds a structured rlm_query tool for parallel/batched LLM fan-out.
The model calls it with one prompt or up to 16 concurrent prompts;
children dispatch concurrently via join_all against the existing DeepSeek
client. Default child model is deepseek-v4-flash; override per-call
via the model field. Available in Plan / Agent / YOLO. Cost folds
into the session's running total automatically.
Fixes scroll-stuck regression (#56): TranscriptScroll::resolve_top
and scrolled_by now use a three-level fallback chain (same line →
same cell line 0 → nearest cell at-or-before) instead of teleporting
to ToBottom when an anchor cell vanishes.
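The three-level fallback can be sketched over a flattened row list. The data model here (rows as `(cell, line)` pairs sorted by cell, an `Anchor` struct) is an assumption for illustration and not the real `TranscriptScroll` representation.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
struct Anchor {
    cell: usize,
    line: usize,
}

/// `rows` lists (cell index, line within cell) in display order.
/// Returns the row index to pin to the top, or None if nothing at or
/// before the anchor survives (the caller would then pick its own default).
fn resolve_top_sketch(rows: &[(usize, usize)], anchor: Anchor) -> Option<usize> {
    // 1. The exact same line still exists.
    if let Some(i) = rows.iter().position(|&(c, l)| c == anchor.cell && l == anchor.line) {
        return Some(i);
    }
    // 2. The cell exists but is shorter now: use its first line.
    if let Some(i) = rows.iter().position(|&(c, l)| c == anchor.cell && l == 0) {
        return Some(i);
    }
    // 3. The cell vanished: use the start of the nearest cell at or before it.
    rows.iter().rposition(|&(c, l)| c <= anchor.cell && l == 0)
}
```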
Loosens command-safety chains (#57): cargo build && cargo test and
similar chains of known-safe commands now escalate to RequiresApproval
instead of being hard-blocked as Dangerous. Chains containing unknown
commands still block.
Suppresses the GettingCrowded footer chip — context-percent header
already covers conversation pressure.
Refactors:
- Extracts file_mention parsing/completion/expansion (~450 LOC) from
the 5,500-line ui.rs into crates/tui/src/tui/file_mention.rs.
- Deletes truly unused helpers (write_bytes, timestamped_filename,
extension_from_url, output_path, has_project_doc, primary_doc_path).
Tests: 853 pass. cargo clippy --workspace -D warnings clean.
cargo fmt --all -- --check clean.
Closes #46, #47, #48, #49, #50, #53, #54, #55, #56, #57, #58.