codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
HUQIANTAO	3de07a99ed	perf(engine): memoize estimated_input_tokens via content-keyed cache The token estimator walks the full session.messages and the active system prompt. Five call sites per turn in the engine (capacity pre/post tool checkpoints, error escalation, the seam manager, the trim budget check) plus four TUI/command consumers (footer, /status, /debug, context inspector) all re-walked the same data independently. On a 200-message history with 5 KB of tool results that is roughly 2 ms per call, or ~20 ms of pure waste on a single turn. Introduce a process-local TokenEstimateCache keyed on (session.messages_revision, system_prompt_fingerprint). Repeated calls with the same inputs return the cached value without re-walking the message list. The cache invalidates as soon as either input changes: * session.messages_revision is a monotonic counter bumped in Session::add_message, Session::replace_messages, the new Session::bump_messages_revision helper, and at every direct session.messages mutation site in core/engine.rs and core/engine/capacity_flow.rs. * system_prompt_fingerprint is a stable 64-bit hash of the SystemPrompt::Text or SystemPrompt::Blocks payload. Also restructures layered_context_checkpoint to compute the estimated token count before taking a long-lived &SeamManager borrow, and re-routes the capacity pre/post tool checkpoints to compute the observation into a local before calling capacity_controller.observe_*. Both refactors are required to satisfy the borrow checker once estimated_input_tokens requires &mut self. Tests: 10 new unit tests cover the miss/hit path, revision bumps, system-prompt changes, audit-ring capacity, and downward-revision no-ops. The full 157-test engine suite still passes.	2026-06-03 21:01:37 -07:00
Hunter Bown	06612495fc	chore(release): prep v0.8.51 — Arcee provider, cycle removal, UI fixes Release-preparation checkpoint for v0.8.51 (workspace + npm bumped to 0.8.51). Added: - Arcee AI direct provider: [providers.arcee], ARCEE_API_KEY/BASE_URL/MODEL, CLI auth, provider + model picker, registry. Default direct-API model is trinity-large-thinking (reasoning, 262K ctx/out); preview + mini selectable. Cloudflare-WAF-safe opening turn (benign read-only tool surface, system-prompt payload splitting) and reasoning_content replay on tool-call turns. - Expanded model catalog (qwen3.6 flash/plus/max-preview, Xiaomi MiMo v2.5 chat/ASR/TTS); provider-aware model picker with per-provider saved models. Changed: - Auto-compaction is percentage- and model-aware (compaction_threshold_for_model_at_percent; default 80%; auto-enable for <=256K windows, opt-in for 1M models). - Provider/gateway HTTP errors sanitized (HTML/WAF interstitials collapsed, 401/403 split into authentication vs authorization). Removed: - The session cycle / checkpoint-restart system: /cycles, /cycle, /recall, recall_archive tool, cycle_manager, cycle-handoff prompt, sidebar cycle lines, EngineConfig.cycle / Event::CycleAdvanced / seam cycle thresholds. Fixed: - Orphaned assistant 'blue dot' role glyph on whitespace-only turns. - Sidebar mouse-wheel scroll leaking into the transcript. - Sidebar hover tooltip overlap + warning-orange styling. - README Constitution description corrected to match prompts/base.md. - Repaired release-blocking unit/integration tests after the refactors. Preflight: cargo fmt clean, workspace builds, 3903 tui tests pass (1 known flaky MCP SSE test under parallel load, passes in isolation).	2026-06-02 17:36:18 -07:00
Hunter Bown	57c10c78d6	fix(tui): compact tool-call UI and context	2026-06-01 16:55:10 -07:00
hexin	4f3a0c3cfc	feat(engine): allow DEEPSEEK_MAX_OUTPUT_TOKENS env override for tight-context providers (#2147 ) The `effective_max_output_tokens` heuristic defaults to 64K for any model not in the known-context-window table. This is fine for DeepSeek's hosted API (1M context) but causes immediate HTTP 400s on self-hosted providers with tight `max-model-len`. Example: vLLM serving Qwen3.6 with `--max-model-len 65536` rejects requests because 64000 (output) + ~1500 (input) exceeds the limit by 1 token. This change lets the operator set `DEEPSEEK_MAX_OUTPUT_TOKENS=16384` (or whatever fits their deployment) to override the heuristic. The env var takes precedence over the model-table lookup when set to a positive integer; otherwise the existing behavior is preserved. No new config struct field — env-only override keeps the public API unchanged. Useful for embedded users (e.g. pinvou3) who need to control output budget without forking the engine config schema. Co-authored-by: hexin <he.xin@h3c.com>	2026-05-26 10:31:26 -05:00
hexin	9c8e482607	fix(engine): keep auto-compaction working on sub-500K self-hosted windows Harvested from PR #2060 by @h3c-hexin. Co-authored-by: hexin <he.xin@h3c.com> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-25 18:25:58 -05:00
Hunter Bown	faf5e07adb	feat(tui): add Fin tool-agent and screenshot OCR	2026-05-21 10:47:14 +08:00
Hunter Bown	99c6b22e83	chore(release): v0.8.33 — sub-agent and RLM renovation with persistent sessions - Persistent RLM sessions (rlm_open/rlm_eval/rlm_close) with bounded REPL helpers - Fork-aware sub-agent sessions (agent_open/agent_eval/agent_close) with handle_read - Shared handle_read storage with slice/range/count/JSONPath projections - Slash-command routing: /rlm, /agent, /relay (/接力) for handoff prompts - Sidebar renamed to "Work" tab, consistent across Plan/Agent/YOLO modes - Tool papercuts: file_search excludes, grep_files strings, fetch_url JSON, edit_file fuzz, exec_shell merged stdout/stderr, revert_turn no-op reject - CLI reasoning-effort honoured on non-auto exec routes (#1511 @h3c-hexin) - Edit-file replacement boundaries clarified (#1516) - Pandoc output validated before probing (#1523) - Running turns steerable/repaintable (#1533, #1537) - Tasks/Activity Detail calmer under load - npm retry timeout hint (#1538 @reidliu41) - Issue templates improved (#1525 @reidliu41) - Shell: kill process group to prevent UI freeze (#828 @CrepuscularIRIS) - TUI: ignore leaked SGR mouse reports in composer (#1421 @reidliu41) - Footer: keep chips within available width (#1417 @Wenjunyun123) - Session picker: scope Ctrl+R to current workspace (#1395 @LinQ) - Removed stale competitive-analysis doc - Prompts/docs teach only new tool names	2026-05-12 19:54:08 -05:00
THINKER_ONLY	250953ad35	feat(tools/agent_spawn): teach parent that subagent results are self-reports	2026-05-10 08:15:19 -05:00
junskyeed	d4185ede5a	fix(engine): cap API request max_tokens fix(engine): cap API request max_tokens without affecting internal context budget	2026-05-06 09:59:22 -05:00
Hunter Bown	e98efcf31d	fix(engine): drop dead working set prompt marker	2026-05-04 22:08:07 -05:00
Hunter Bown	b48b68f078	perf(engine): stabilize system prompt and move working set metadata	2026-05-04 22:06:55 -05:00
Hunter Bown	ad8064b143	chore(v0.8.8): stabilization batch — sub-agent caps, mutex contention, RLM polish, CI cleanup Bundles the v0.8.8 stabilization fixes that were already implemented in the working tree, plus the workflow/doc reconciliation called out in #507. ### Sub-agent runtime fixes - #509 Default sub-agent cap raised to 10 (configurable via `[subagents].max_concurrent` in `config.toml`, hard ceiling 20). The running-count calculation now ignores non-running, no-handle, and finished handles so completed agents stop counting against the cap. - #510 `SharedSubAgentManager` is now `Arc<RwLock<...>>`; the read paths that previously held a `Mutex` for inspection now take a read lock, eliminating the multi-agent fan-out UI freeze. - #511 `compact_tool_result_for_context` summarizes `agent_result` / `agent_wait` payloads before they are folded into the parent context. - #512 RLM tool cards map to `ToolFamily::Rlm` and render `rlm`, not `swarm`. Stale "swarm" wording cleaned in docs/comments/tests. - #513 (foreground stopgap only) Foreground RLM work is visible in the Agents sidebar projection. Full async RLM lifecycle remains v0.8.9 — the issue stays open with a refined scope. ### TUI / UX fixes - #487 Offline composer queue is now session-scoped; legacy unscoped queues fail closed. - #488 Composer Option+Backspace deletes by word; cross-platform key routing helpers added. - #443/#444 Keyboard enhancement flags pop on normal AND panic exit; the raw-mode startup probe is now bounded by a configurable timeout. - #449 Production footer reads statusline colors from `app.ui_theme` rather than the bespoke palette. - #506 `display_path_with_home` no longer mutates `HOME` in tests; the flake on shared-env CI is gone. ### Self-update / packaging - #503 `update.rs` arch mapping uses release-asset naming (`arm64`/`x64`) instead of the raw Rust constants. The platform-asset selector also rejects `.sha256` siblings as primary binaries. Tests now live alongside the source in `mod tests` (the `#[path]`-based integration test was removed because it duplicated test runs and forced a `pub(crate)` helper that no real caller used). - `Max 5 in flight` wording updated in `agent_spawn` description, `prompts/base.md`, and `docs/TOOL_SURFACE.md` so the model sees the real default cap (10) and the configuration knob name. ### CI / release docs (#507) - Pruned three duplicated/dead workflows: `crates-publish.yml`, `parity.yml`, `publish-npm.yml`. Their gates already run in `ci.yml` for every push/PR. - `release.yml` build job now allows `parity` to be skipped (it only runs on tag push), unblocking `workflow_dispatch` reruns. The job still fails closed on a real parity failure. - `RELEASE_RUNBOOK.md` reconciled: crate publishing is documented as the manual `scripts/release/publish-crates.sh` flow (no automated workflow); references to the deleted workflows removed. - `CLAUDE.md` notes the `RELEASE_TAG_PAT` requirement for the auto-tag → release.yml chain (without it, the tag is created but `release.yml` does not fire) and documents the `workflow_dispatch` parity-skip behavior. ### Docs - `docs/COMPETITIVE_ANALYSIS.md` added — capability matrix vs OpenCode and Codex CLI, gap analysis, and recommended implementation order. ### Verification (this branch) - `cargo fmt --all -- --check` ✓ - `cargo check --workspace --all-targets --locked` ✓ - `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings` ✓ - `cargo test --workspace --all-features --locked` ✓ (1809 + supporting) - Parity gates ✓ (snapshot, parity_protocol, parity_state) - `cargo build --release --locked -p deepseek-tui-cli -p deepseek-tui` ✓ - Lockfile drift guard ✓ - `deepseek doctor --json` clean - `deepseek eval` (offline harness) success=true, 0 tool errors Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 01:57:37 -05:00
Hunter Bown	8dd5ed38d7	refactor(engine): extract context helpers	2026-05-01 07:09:30 -05:00

13 Commits