Commit Graph

2187 Commits

Author SHA1 Message Date
HUQIANTAO baef5ba95d perf(prefix-cache): cache tool-catalog JSON serialization across checks
PrefixFingerprint::compute is called once per turn by the turn loop's
prefix-stability check. The tool-side work serializes every tool to the
chat-API JSON shape, sorts the resulting strings, joins with newlines,
and SHA-256s the result. For a 60-tool catalog that is ~25-40 KB of
allocation plus a sort, all of which produces a byte-identical output
once the tool set is stable across turns (the common case after the
first turn of a session).

Introduce a process-local ToolCatalogCache that stores the joined+sorted
catalog under a content-derived u64 identity (length + per-tool name +
description + serialized input_schema). On a hit, the per-tool JSON
serialization, sort, and join are skipped entirely — the pre-computed
SHA-256 hex digest is returned directly.

The cache lives on PrefixStabilityManager (per-session ownership) and
backs a new PrefixFingerprint::compute_with_tool_cache entry point.
check_and_update, PrefixStabilityManager::new, and pin() all use the
cached path. The original compute() is kept as a fallback for callers
that do not have a cache in hand (e.g. CLI tools that build a one-shot
fingerprint).

The cache is bounded (default capacity = 8) and uses insertion-order
eviction, matching the eviction strategy already in
transcript_cache.rs. invalidate() is exposed for tool-registry hot-reload
and MCP attach paths.
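
A minimal sketch of the cache shape described above, not the committed code. The `Tool` struct, the `get_or_compute` signature, and the use of `DefaultHasher` for the u64 identity are assumptions made for illustration, and the cached value is whatever string the caller computes (the real path stores a SHA-256 hex digest, which is left out here to stay within the standard library). "Length" is read as the tool count.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, VecDeque};
use std::hash::{Hash, Hasher};
use std::sync::Arc;

/// Hypothetical stand-in for a tool definition.
pub struct Tool {
    pub name: String,
    pub description: String,
    pub input_schema: String, // already-serialized schema (assumption)
}

/// Content-derived identity: tool count plus each tool's name,
/// description, and serialized schema.
pub fn catalog_identity(tools: &[Tool]) -> u64 {
    let mut h = DefaultHasher::new();
    tools.len().hash(&mut h);
    for t in tools {
        t.name.hash(&mut h);
        t.description.hash(&mut h);
        t.input_schema.hash(&mut h);
    }
    h.finish()
}

/// Bounded cache with insertion-order eviction.
pub struct ToolCatalogCache {
    capacity: usize,
    entries: HashMap<u64, Arc<str>>,
    order: VecDeque<u64>, // oldest key at the front
}

impl ToolCatalogCache {
    pub fn new(capacity: usize) -> Self {
        Self { capacity: capacity.max(1), entries: HashMap::new(), order: VecDeque::new() }
    }

    /// Returns the cached value on a hit; otherwise runs `compute`,
    /// evicting the oldest entry first if the cache is full.
    pub fn get_or_compute(
        &mut self,
        tools: &[Tool],
        compute: impl FnOnce(&[Tool]) -> String,
    ) -> Arc<str> {
        let key = catalog_identity(tools);
        if let Some(hit) = self.entries.get(&key) {
            return Arc::clone(hit);
        }
        if self.entries.len() >= self.capacity {
            if let Some(oldest) = self.order.pop_front() {
                self.entries.remove(&oldest);
            }
        }
        let value: Arc<str> = Arc::from(compute(tools));
        self.entries.insert(key, Arc::clone(&value));
        self.order.push_back(key);
        value
    }

    /// Drops every entry, e.g. after a tool-registry hot-reload.
    pub fn invalidate(&mut self) {
        self.entries.clear();
        self.order.clear();
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Returning an `Arc<str>` makes a hit a reference-count bump, which is how a test can assert pointer equality between two lookups of the same catalog.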

Tests: 8 new unit tests cover the miss/hit path (pointer-equal Arc on
hit), identity collisions, schema change detection, capacity eviction,
invalidate, empty slice, and the equivalence between cached and uncached
fingerprints. The full 30-test prefix_cache suite passes; the wider
prefix-cache contract tests in settings, prompts, and
core::engine::tests continue to pass.
2026-06-03 21:01:37 -07:00
HUQIANTAO 3de07a99ed perf(engine): memoize estimated_input_tokens via content-keyed cache
The token estimator walks the full session.messages and the active system
prompt. Five call sites per turn in the engine (capacity pre/post tool
checkpoints, error escalation, the seam manager, the trim budget check)
plus four TUI/command consumers (footer, /status, /debug, context
inspector) all re-walked the same data independently. On a 200-message
history with 5 KB of tool results that is roughly 2 ms per call, or
~20 ms of pure waste on a single turn.

Introduce a process-local TokenEstimateCache keyed on
(session.messages_revision, system_prompt_fingerprint). Repeated calls
with the same inputs return the cached value without re-walking the
message list. The cache invalidates as soon as either input changes:
  * session.messages_revision is a monotonic counter bumped in
    Session::add_message, Session::replace_messages, the new
    Session::bump_messages_revision helper, and at every direct
    session.messages mutation site in core/engine.rs and
    core/engine/capacity_flow.rs.
  * system_prompt_fingerprint is a stable 64-bit hash of the
    SystemPrompt::Text or SystemPrompt::Blocks payload.
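
The keying scheme above could look roughly like the following. This is a sketch under assumptions: a single-slot cache, a closure standing in for the real message walk, and a `get_or_compute` method name that does not come from the commit.

```rust
/// Hypothetical single-slot cache keyed on
/// (messages_revision, system_prompt_fingerprint).
#[derive(Default)]
pub struct TokenEstimateCache {
    slot: Option<((u64, u64), usize)>,
}

impl TokenEstimateCache {
    /// Returns the cached estimate when both key parts match;
    /// otherwise runs `estimate` (the expensive walk) and stores the result.
    pub fn get_or_compute(
        &mut self,
        messages_revision: u64,
        system_prompt_fingerprint: u64,
        estimate: impl FnOnce() -> usize,
    ) -> usize {
        let key = (messages_revision, system_prompt_fingerprint);
        if let Some((cached_key, value)) = self.slot {
            if cached_key == key {
                return value;
            }
        }
        let value = estimate();
        self.slot = Some((key, value));
        value
    }
}
```

Because a lookup may write to the slot, the method takes `&mut self`, which is the kind of signature change that forces callers to finish computing the estimate before taking other long-lived borrows.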

Also restructures layered_context_checkpoint to compute the estimated
token count before taking a long-lived &SeamManager borrow, and
re-routes the capacity pre/post tool checkpoints to compute the
observation into a local before calling
capacity_controller.observe_*. Both refactors are required to satisfy
the borrow checker once estimated_input_tokens requires &mut self.

Tests: 10 new unit tests cover the miss/hit path, revision bumps,
system-prompt changes, audit-ring capacity, and downward-revision
no-ops. The full 157-test engine suite still passes.
2026-06-03 21:01:37 -07:00
Hunter B 05950d1d59 fix release crate publish checks 2026-06-03 19:24:42 -07:00
xyuai 9e15805f64 fix(settings): tighten legacy path migration coverage 2026-06-03 19:24:37 -07:00
Hunter B f7a602cd20 feat(tools): hide todo_* aliases from model catalog, add deprecation metadata (#2682)
- Add model_visible() hook to ToolSpec trait (default true)
- Override model_visible() -> false on todo_write, todo_add, todo_update, todo_list
- Checklist variants remain model-visible as the canonical surface
- Legacy todo_* calls still work for saved transcript replay
- Return _deprecation metadata with use_instead and removed_in=0.9.0
- Update prompts to recommend checklist_* only
- Update TOOL_SURFACE.md with v0.9.0 deprecation notes
- Add tests for hidden catalog, compat alias behavior, and metadata
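
One way to picture the hook is a trait method with a default body. The sketch below is illustrative only: the trait is reduced to two methods, `checklist_write` is a hypothetical name for one checklist variant, and `model_catalog` / `is_callable` are helper names invented here.

```rust
pub trait ToolSpec {
    fn name(&self) -> &str;

    /// Whether the tool is advertised to the model. Defaults to true.
    fn model_visible(&self) -> bool {
        true
    }
}

pub struct ChecklistWrite; // hypothetical canonical tool
pub struct TodoWrite; // legacy alias, hidden from the model

impl ToolSpec for ChecklistWrite {
    fn name(&self) -> &str {
        "checklist_write"
    }
}

impl ToolSpec for TodoWrite {
    fn name(&self) -> &str {
        "todo_write"
    }
    fn model_visible(&self) -> bool {
        false
    }
}

/// Names sent to the model: hidden tools are filtered out.
pub fn model_catalog(tools: &[Box<dyn ToolSpec>]) -> Vec<&str> {
    tools.iter().filter(|t| t.model_visible()).map(|t| t.name()).collect()
}

/// Dispatch ignores visibility, so replayed transcripts can still call hidden tools.
pub fn is_callable(tools: &[Box<dyn ToolSpec>], name: &str) -> bool {
    tools.iter().any(|t| t.name() == name)
}
```

Keeping registration unconditional and filtering only at catalog-build time is what lets a hidden alias stay callable.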

Verification: cargo test -p codewhale-tui -- todo; cargo clippy -- -D warnings
2026-06-03 19:20:23 -07:00
Hunter Bown 8dff2f7525 fix(tui): guard xiaomi mimo defaults test against CI env vars 2026-06-03 16:25:04 -07:00
Hunter Bown 772ec46c98 chore(release): v0.8.53 — Arcee support, telegram bridge, provider fixes
- Fix Rust syntax/clippy fallout in client.rs, cli/src/lib.rs, web_search.rs
- Fix 0.8.53 release metadata: changelog links, TUI changelog, npm wrapper
- Update visible help copy for multi-provider support
- Add telegram-bridge integration with deploy configs
- Add US remote VM quickstart doc
- Update Tencent Cloud deploy scripts and docs
- Bump npm wrapper to 0.8.53
2026-06-03 16:12:38 -07:00
Hunter Bown f884ceb6af docs(readme): credit xyuai and RefuseOdd for v0.8.53 contributions 2026-06-03 15:43:05 -07:00
RefuseOdd 8b0e1cc3c0 Limit path suffix to chat completions 2026-06-03 15:34:24 -07:00
RefuseOdd d2999bb402 Add path_suffix to ProviderConfigToml and ProviderConfig
Adds an optional path_suffix field that lets users override the API path
for OpenAI-compatible endpoints. When set, the suffix replaces the default
/v1/<path> pattern, enabling use with endpoints that don't accept /v1/
prefixes (e.g. /chat/completions instead of /v1/chat/completions).

Changes:
- ProviderConfigToml (config crate): path_suffix field
- ProviderConfig (tui crate): path_suffix field
- merge_provider_config: propagates path_suffix
- merge_project_provider_config: propagates path_suffix
- api_url: delegates to new api_url_with_suffix function
- api_url_with_suffix: uses suffix when present, skips /v1 versioning
- DeepSeekClient: reads path_suffix from config, passes to URL builder
- config.example.toml: documents the new option
- Tests for the new URL building behavior
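
A rough sketch of the URL-building rule listed above. The function names come from the change list, but the signatures and the slash handling are assumptions, and `api.example.com` is a placeholder host.

```rust
/// Builds the request URL. With a non-empty suffix, the suffix replaces the
/// default `/v1/<path>` pattern; otherwise the versioned path is used.
pub fn api_url_with_suffix(base_url: &str, path: &str, path_suffix: Option<&str>) -> String {
    let base = base_url.trim_end_matches('/');
    match path_suffix {
        Some(suffix) if !suffix.is_empty() => {
            format!("{}/{}", base, suffix.trim_start_matches('/'))
        }
        _ => format!("{}/v1/{}", base, path.trim_start_matches('/')),
    }
}

/// Default entry point: no suffix override.
pub fn api_url(base_url: &str, path: &str) -> String {
    api_url_with_suffix(base_url, path, None)
}
```

With `path_suffix = "/chat/completions"`, a base of `https://api.example.com` resolves to `https://api.example.com/chat/completions` rather than the `/v1/`-prefixed form.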

Closes #2089
2026-06-03 15:34:24 -07:00
cyq 45562822f0 feat(agent): classify model families 2026-06-03 15:34:12 -07:00
reidliu41 195dd6b9ab fix(tui): hide shell prompt guidance when shell is disabled
Thread allow_shell into system prompt composition and remove shell-only guidance
when shell tools are not available.

This keeps the prompt aligned with the runtime tool catalog and prevents the
model from trying exec_shell or task_shell_* after allow_shell = false.
2026-06-03 15:28:29 -07:00
xyuai dba332e8d5 fix(tui): persist provider switches to config 2026-06-03 15:28:17 -07:00
Hunter Bown 260ee737b0 style: cargo fmt 2026-06-03 15:18:19 -07:00
Hunter Bown be7a3e7e69 fix(tui): provider picker r shortcut with modifier guard
- add r/R shortcut to re-enter API key for any provider in picker
- guard against Ctrl/Alt/Meta modifiers (only plain r triggers)
- dynamic footer: 'apply' when key exists, 'set key' otherwise
- add 'R edit key' hint to picker footer
- add route/model to scoped auth status output
- add tests for r shortcut, ctrl-r guard, footer text, and route/model

Ports #2717 with review fix. Fixes #2662.
2026-06-03 15:14:39 -07:00
Hunter Bown 3f8e02d6cf docs(readme): add Hugging Face provider to all localized READMEs 2026-06-03 15:11:31 -07:00
Hunter Bown 4b990e190c docs(rfc): file decomposition plan for v0.9.0 2026-06-03 15:08:31 -07:00
Hunter Bown 5719301d1e fix(auth): all-provider auth status and scoped logout
- auth status shows every known provider with config/keyring/env status
- auth status --provider <id> shows detailed single-provider info
- auth list now probes keyring for all providers (was only active)
- /logout clears only the active provider's key (was clearing all)
- add clear_active_provider_api_key for scoped TOML key removal
- add Huggingface to ProviderArg enum
- add auth status tests for all-provider and scoped views

Fixes #2716
2026-06-03 15:08:28 -07:00
Hunter Bown d9ca5fbbff docs(tui): mirror v0.8.53 changelog 2026-06-03 14:43:08 -07:00
Hunter Bown 28a0f19c13 fix(provider): polish v0.8.53 routing and shell gating 2026-06-03 14:40:25 -07:00
Hunter Bown 5786584767 chore(release): bump workspace to 0.8.53 2026-06-03 12:39:01 -07:00
Hunter Bown ed4ec3f799 Merge branch 'codex/v0.8.53-deprecate-whale-md' into codex/v0.8.53 2026-06-03 12:38:00 -07:00
Hunter Bown d5c6856754 Merge branch 'codex/v0.8.53-toolsurface-design-docs' into codex/v0.8.53 2026-06-03 12:37:57 -07:00
Hunter Bown 8bc994e492 Merge branch 'codex/v0.8.53-tool-deferred-ux' into codex/v0.8.53 2026-06-03 12:37:53 -07:00
Hunter Bown a10e17a62a fix(context): prefer global AGENTS over WHALE 2026-06-03 12:37:39 -07:00
Hunter Bown aa4c734602 docs: align v0.8.53 tool surface notes 2026-06-03 12:37:39 -07:00
Hunter Bown f5c8d7e5c5 fix(subagent): align advertised role aliases 2026-06-03 12:37:39 -07:00
Hunter Bown 025089494b fix(rlm): include session object in source hints 2026-06-03 12:37:39 -07:00
Hunter Bown fc8ad7b3a8 feat(project): enrich repo constitution (invariants, branch policy, escalation)
Per the layered-authority clarification (base myth → global Constitution → repo
constitution = local law → task packet → runtime policy), extend
.codewhale/constitution.json beyond authority+verification with optional:

- protected_invariants — repo invariants the agent must not break
- branch_policy — branch/release policy in effect
- escalate_when — conditions to stop and escalate to the user

All optional; rendered as concise model-facing prose. The global Brother Whale
identity anchor and Constitution in prompts/base.md are unchanged (verified
untouched on this branch). Dogfood constitution.json filled with CodeWhale's
real invariants (prefix-cache byte-stability, transcript replay, stable Rust,
cli/tui parity), branch policy (codex/v0.8.53), and escalation rules. Docs note
the layered hierarchy.

cargo test -p codewhale-tui --bins → 3946 passed; clippy clean.
2026-06-03 12:16:06 -07:00
Hunter Bown 9d9616e898 feat(project): deprecate WHALE.md; add .codewhale/constitution.json authority layer
Splits repo-level guidance into two clear artifacts and deprecates the
confusing WHALE.md concept (overlapped with AGENTS.md):

- AGENTS.md is the canonical cross-agent project-instructions file.
- .codewhale/constitution.json is the CodeWhale-specific repo authority /
  prioritization policy (when local sources conflict, which to trust first; what
  to verify before claiming done). Rendered into the system prompt as a
  higher-authority <codewhale_repo_constitution> block; takes precedence over a
  legacy WHALE.md.

WHALE.md migration (compat-preserving):
- AGENTS.md now ranks above WHALE.md in both project and global discovery; with
  both present, AGENTS.md wins.
- WHALE.md is still read as a legacy fallback, but now emits a deprecation
  warning and is never created or recommended (init.rs no longer suggests it).
- Discovery/docs updated; the global CodeWhale Constitution in prompts/base.md
  is unaffected (different thing).
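
The precedence rule above might be sketched as follows for a single directory. `Discovered`, `discover_instructions`, and the warning text are hypothetical names and wording, and the real discovery also covers global locations, which this omits.

```rust
use std::path::{Path, PathBuf};

/// Result of looking for project instructions in one directory.
pub struct Discovered {
    pub path: PathBuf,
    pub deprecation_warning: Option<String>,
}

/// AGENTS.md wins when present; WHALE.md is read only as a legacy
/// fallback and always carries a deprecation warning.
pub fn discover_instructions(dir: &Path) -> Option<Discovered> {
    let agents = dir.join("AGENTS.md");
    if agents.is_file() {
        return Some(Discovered { path: agents, deprecation_warning: None });
    }
    let whale = dir.join("WHALE.md");
    if whale.is_file() {
        return Some(Discovered {
            path: whale,
            deprecation_warning: Some(
                "WHALE.md is deprecated; move its contents to AGENTS.md".to_string(),
            ),
        });
    }
    None
}
```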

constitution.json:
- New RepoConstitution (serde, all fields optional, unknown fields ignored,
  schema_version checked). Discovered at .codewhale/constitution.json in the
  workspace or any parent up to the git root. Malformed JSON warns, never panics.
- Loaded after the auto-generate fallback so it can't be clobbered.

.gitignore: ignore .codewhale/ contents at any depth EXCEPT the committed
constitution.json (a directory exclude can't be negated, so **/.codewhale/* +
negation). init.rs writes the same pattern for new repos. Dogfood: this repo's
.codewhale/constitution.json added.

find_git_root made pub(crate) and reused (no duplicate loader).

Tests: AGENTS-over-WHALE precedence, WHALE legacy-read-with-warning,
constitution render + system-block surfacing, malformed-constitution warning,
gitignore-keeps-constitution. cargo test -p codewhale-tui --bins → 3946 passed;
clippy clean.

Targets codex/v0.8.53.
2026-06-03 12:12:34 -07:00
Hunter Bown 8cb4f94f30 docs: v0.8.53 tool-surface-diet design + north-star direction
Design-only deliverables for the v0.8.53 "tool surface diet / canonical
surfaces" cutover (no catalog code in this cycle). Grounded in a verified
inventory of the actual tool registry.

- docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states
  (active / deferred / hidden-compatibility / deprecated / removed) modeled as
  const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec
  field), so registration stays untouched and old transcripts always replay.
  Includes the deprecation manifest (exec_wait/exec_interact/tts →
  hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are
  already non-visible dead code → cleanup + guardrail), per-mode/per-provider
  active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache
  safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated.
- docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 +
  symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor
  invalidation; an explainable tool contract returning reasons[]; and a real
  CodeWhale query eval set. Complements grep_files/file_search, never replaces.
- docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation
  syntax (the token IS the skill name), namespaced resolution, ambiguity-
  suggests-not-guesses, visible activation line, and a smallest-viable slice.
- docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase
  intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered
  context-memory stack, tool repair/autoload, the evaluation loop, and the
  command-surface taxonomy (/memory small · /context dashboard · /rules ·
  /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not
  committed 0.8.53 work; also records the deferred-not-done diet items.

Targets codex/v0.8.53.
2026-06-03 11:47:29 -07:00
Hunter Bown 7bbc6b78e4 fix(tools): activate read-only git history + actionable RLM/field errors
v0.8.53 tool/deferred/error UX (PR group 4), low-risk subset:

- #2654: add git_log and git_show to DEFAULT_ACTIVE_NATIVE_TOOLS so read-only
  git history joins git_diff/git_status in the active partition (kept
  alphabetical → prefix-cache head stays sorted/byte-stable). git_blame and
  other history tools remain deferred.
- #2655: rlm_open's source-count error now echoes common misnamed fields with a
  "did you mean file_path/content/url" hint; rlm_eval's missing-`code` error
  explains it runs raw Python and shows an example. Schema descriptions for
  rlm_eval name/code sharpened.
- #2659: likely_field_corrections gains RLM source-field rename hints (the
  role/type vocabulary change itself lives in the WS3 PR #2684 to avoid a
  double-edit of normalize_role_alias).
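
The sort invariant mentioned for the active partition can be illustrated with a small check. This sketch lists only the four git tools named above (the real DEFAULT_ACTIVE_NATIVE_TOOLS holds more entries), and `is_sorted_and_unique` is a hypothetical helper name.

```rust
/// Illustrative subset of the active partition: only the tools named in this commit.
pub const DEFAULT_ACTIVE_NATIVE_TOOLS: &[&str] = &["git_diff", "git_log", "git_show", "git_status"];

/// True when every name is strictly greater than the one before it,
/// i.e. the list is alphabetical with no duplicates.
pub fn is_sorted_and_unique(names: &[&str]) -> bool {
    names.windows(2).all(|pair| pair[0] < pair[1])
}
```

A unit test asserting `is_sorted_and_unique(DEFAULT_ACTIVE_NATIVE_TOOLS)` would fail as soon as a new tool is appended out of order, which guards the byte-stable catalog head.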

Deferred to the medium-risk batch: #2648 (render deferred-tool hydration
distinctly from "done") — needs a ToolStatus/cell-build change with wider
render blast radius than this low-risk PR.

Verification: cargo test -p codewhale-tui --bins → 3944 passed, 0 failed
(incl. prefix-cache sort invariant); cargo clippy clean.
Targets codex/v0.8.53.
2026-06-03 11:31:33 -07:00
Hunter Bown 725abeb603 fix(subagent): clearer role vocab, lifecycle signals, and eval ergonomics
Make the sub-agent surface easier for less-capable models to drive:

- Unify role/type vocabulary (#2649): normalize_role_alias now accepts the
  full set SubAgentType::from_str accepts (reviewer/implementer/verifier/...),
  and SubAgentType::from_str learns `planner`, so the dual-validation pass no
  longer rejects natural roles with a stale four-value hint. Error strings and
  schema descriptions now enumerate the real accepted aliases.
- agent_eval/agent_close always active (#2605) so a first call executes instead
  of hydrating its schema and forcing a double-invoke; both accept an
  `agent_name` session alias (#2650).
- Self-diagnosing name conflicts (#2656): the duplicate-name error names the
  conflicting agent_id and its status.
- Self-describing completion sentinels (#2658): subagent.done now carries
  result_clipped / summary_complete / next_action so the parent knows whether
  to trust the previous-line summary or call agent_eval.
- Actionable child-model-unavailable diagnostics (#2653): a provider 403/404
  is annotated with the model id and recovery path instead of a bare error.

Tests: role vocabulary acceptance + error wording, agent_name resolution,
duplicate-name diagnostics, clipped-result sentinel, child-model annotation,
agent_eval/agent_close default-active. Full tui suite green (3948), clippy clean.

Targets codex/v0.8.53 (v0.8.53 stabilization).
2026-06-03 11:22:56 -07:00
Hunter Bown 03d1bba538 Merge pull request #2630 from Hmbown/codex/v0.8.52-home-cost-fixes
fix(release): tighten 0.8.52 home and cost accounting
2026-06-03 03:44:40 -07:00
Hunter B b965d2ecd5 fix(release): tighten 0.8.52 home and cost accounting 2026-06-03 03:35:46 -07:00
Hunter Bown c8ce2b8e92 Merge pull request #2626 from Hmbown/codex/v0.8.52-stabilization
fix(release): stabilize v0.8.52
2026-06-03 03:07:40 -07:00
Hunter B 32e6aa5e17 fix(tui): keep work panel summary during lock misses
Co-authored-by: Hanmiao Li <894876246@qq.com>
2026-06-03 02:59:17 -07:00
Hunter B 14c882be53 fix(provider): expose siliconflow-cn registry coverage 2026-06-03 02:51:42 -07:00
Hunter B 54446e6c07 fix(release): stabilize v0.8.52 2026-06-03 02:39:45 -07:00
Hunter Bown 25340d17a7 feat(provider): add SiliconFlow China region (siliconflow-CN) (#2615)
Adds the SiliconFlow China regional endpoint (api.siliconflow.cn) as a new provider variant.

Credit: @Raid10Without1 (PR #2588)

Co-authored-by: Raid10没有1 <88494433+Raid10Without1@users.noreply.github.com>
2026-06-02 21:27:40 -07:00
Hunter Bown dd26114697 feat(tui): send /attach images as multimodal content (#2584, #2587) (#2607)
Adds OpenAI-compatible image_url content blocks to the chat message
model, wiring attached images through build_chat_messages_with_reasoning
as multimodal user-content arrays. When images are present, user
messages emit a content array of text + image_url parts instead of a
plain string, matching the OpenAI vision API shape.

- models.rs: new ImageUrlContent struct, ContentBlock::ImageUrl variant
- client/chat.rs: image_parts collection, multimodal wire format for
  user messages, image-aware message inspection, stream-event no-op
- Exhaustiveness arms added across 10 files (compaction, seam_manager,
  capacity_flow, purge, notifications, session_picker, utils,
  working_set, rlm/session, runtime_api)
- Test: request_builder_emits_openai_image_url_parts_for_user_images

Credit: @xyuai (PR #2587 — root cause + initial implementation)
Closes: #2584

Co-authored-by: xyuai <xyuai@users.noreply.github.com>
2026-06-02 21:27:31 -07:00
AresNing 8981d5c5fd feat: add subagent lifecycle hooks
Add subagent lifecycle hooks for better control over subagent initialization and teardown.
2026-06-02 20:48:09 -07:00
Gordon b4691bc082 feat(i18n): localize context-inspector surface across 7 locales
Localize the context-inspector surface across 7 locales for improved internationalization support.
2026-06-02 20:47:53 -07:00
Justin Gao 29acb87a9d feat(engine): inject mode-change runtime message and include mode in turn metadata
Inject a mode-change runtime message into the engine and include mode information in turn metadata for better tracking.
2026-06-02 20:47:36 -07:00
Hanmiao Li 1781312c7a feat(tui): add drag-to-resize sidebar width
Add drag-to-resize functionality for the TUI sidebar width, allowing users to interactively resize the sidebar.
2026-06-02 20:47:18 -07:00
Hunter B 2721b2a077 Merge branch 'codex/v0.8.51-arcee-provider' — v0.8.51 release
v0.8.51: Arcee provider, cycle removal, compaction improvements,
TUI fixes, model persistence, and community harvest.
2026-06-02 20:38:10 -07:00
Hunter B 541926eb38 docs(changelog): backfill v0.8.51 Fixed entries + Community credits
Adds 8 missing Fixed entries for commits that landed after the
release-prep commit (06612495f): DEC CSI fragment fix, engine panic
recovery, nested file-picker, command-palette scroll, .NET/Windows
env, config key warnings, diff-render whitespace, and model
persistence. Adds Community credits for contributors whose work
landed or shaped this release cycle.
2026-06-02 20:37:52 -07:00
Hunter B c000bd7e60 harvest(v0.8.51): diff-render whitespace fix + schema dead_code + model persistence + prompt updates
- fix(diff-render): preserve leading whitespace in patch content lines
  Credit: @zlh124 (PR #2591), with extra-space bug fixed.
- fix(tui): allow unused schema migration registry
  Credit: @reidliu41 (PR #2601).
- feat(tui): persist per-provider model selection from /model command
- docs(prompts): prefer gh --json CLI for GitHub triage in agent instructions
2026-06-02 20:30:31 -07:00
Claude f886f28acf test(tui): update walk-depth test for new default depth (#2488)
`workspace_completions_honor_configured_walk_depth` placed its probe file at
component depth 9 and asserted the *default* walk excludes it — true at the
old default (6) but not the new one (10). Move the probe to depth 12 so it
stays past the default while remaining within the explicit deeper walk (16)
and the unlimited (0) cases the test also exercises.

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:41:13 +00:00
Claude 79d78878b8 test(mcp): make legacy SSE reconnect test deterministic under load (#2597)
`legacy_sse_closed_stream_reconnects_and_retries_tool_call` passed in
isolation but flaked under parallel load. The mock server dropped the
tool-call response whenever `active_sse` was momentarily `None` — which
happens when the retry POST is scheduled ahead of the reconnecting GET /sse
that re-stores the SSE sender. The client then hung until timeout and the
test failed.

Make the server wait briefly (bounded, 5s) for the SSE channel before
sending, so response delivery no longer depends on the order in which the
two server tasks are scheduled.
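
The bounded wait can be sketched with standard-library primitives. This swaps in `Mutex` + `Condvar` and an `mpsc` channel for whatever async machinery the mock server actually uses; `SseSlot`, `store`, and `send_when_ready` are hypothetical names.

```rust
use std::sync::mpsc::Sender;
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Holds the current SSE sender, if any, and lets writers wait for one.
pub struct SseSlot {
    sender: Mutex<Option<Sender<String>>>,
    ready: Condvar,
}

impl SseSlot {
    pub fn new() -> Self {
        Self { sender: Mutex::new(None), ready: Condvar::new() }
    }

    /// Called by the GET /sse handler once the stream is (re)connected.
    pub fn store(&self, tx: Sender<String>) {
        *self.sender.lock().unwrap() = Some(tx);
        self.ready.notify_all();
    }

    /// Called by the POST handler: waits up to `timeout` for a sender
    /// instead of dropping the message when none is stored yet.
    pub fn send_when_ready(&self, msg: String, timeout: Duration) -> bool {
        let guard = self.sender.lock().unwrap();
        let (guard, _timed_out) = self
            .ready
            .wait_timeout_while(guard, timeout, |slot| slot.is_none())
            .unwrap();
        match guard.as_ref() {
            Some(tx) => tx.send(msg).is_ok(),
            None => false, // timed out with no SSE channel
        }
    }
}
```

Delivery then succeeds whether `store` runs before or after `send_when_ready`, as long as it happens within the timeout.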

https://claude.ai/code/session_01MQrnh6wHfrEYN5BBdMarC1
2026-06-03 01:24:14 +00:00