codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
Hunter B	a5f27aae3a	feat(benchmarks): default PinchBench to MiMo v2.5 Pro, add direct-mimo routing PinchBench runner now defaults to openrouter/xiaomi/mimo-v2.5-pro instead of deepseek/deepseek-chat. Adds --direct-mimo flag for routing through Xiaomi's API directly (bypasses OpenRouter), with tp-/sk- key type detection and endpoint mismatch warnings. Harbor adapter gains --provider CLI flag for MiMo provider routing. Known issues documented in docs/MIMO_BENCHMARK_ISSUES.md: - PinchBench model validation requires OpenRouter prefix - OPENROUTER_API_KEY needed even for some direct-provider paths - Token Plan vs pay-as-you-go key/endpoint mismatch - PinchBench runs through OpenClaw, not CodeWhale	2026-06-04 19:33:43 -07:00
Hunter B	b329a532f5	feat(benchmarks): add SWE-bench, Terminal-Bench, and PinchBench integration Benchmark harness for evaluating CodeWhale against three external benchmarks: - SWE-bench: batch driver wrapping existing codewhale swebench commands - Terminal-Bench: Harbor adapter (BaseInstalledAgent) for container eval - PinchBench: runner with auto-install for real-world agent tasks Includes docs/BENCHMARKS.md umbrella doc with setup, usage, and reproducibility checklist. Scripts record version/commit/timestamp metadata for each run. Branch: codex/v0.8.53-benchmarks (based on v0.8.53)	2026-06-04 19:22:06 -07:00
Hunter Bown	772ec46c98	chore(release): v0.8.53 — Arcee support, telegram bridge, provider fixes - Fix Rust syntax/clippy fallout in client.rs, cli/src/lib.rs, web_search.rs - Fix 0.8.53 release metadata: changelog links, TUI changelog, npm wrapper - Update visible help copy for multi-provider support - Add telegram-bridge integration with deploy configs - Add US remote VM quickstart doc - Update Tencent Cloud deploy scripts and docs - Bump npm wrapper to 0.8.53	2026-06-03 16:12:38 -07:00
Hunter Bown	4b990e190c	docs(rfc): file decomposition plan for v0.9.0	2026-06-03 15:08:31 -07:00
Hunter Bown	28a0f19c13	fix(provider): polish v0.8.53 routing and shell gating	2026-06-03 14:40:25 -07:00
Hunter Bown	ed4ec3f799	Merge branch 'codex/v0.8.53-deprecate-whale-md' into codex/v0.8.53	2026-06-03 12:38:00 -07:00
Hunter Bown	d5c6856754	Merge branch 'codex/v0.8.53-toolsurface-design-docs' into codex/v0.8.53	2026-06-03 12:37:57 -07:00
Hunter Bown	aa4c734602	docs: align v0.8.53 tool surface notes	2026-06-03 12:37:39 -07:00
Hunter Bown	fc8ad7b3a8	feat(project): enrich repo constitution (invariants, branch policy, escalation) Per the layered-authority clarification (base myth → global Constitution → repo constitution = local law → task packet → runtime policy), extend .codewhale/constitution.json beyond authority+verification with optional: - protected_invariants — repo invariants the agent must not break - branch_policy — branch/release policy in effect - escalate_when — conditions to stop and escalate to the user All optional; rendered as concise model-facing prose. The global Brother Whale identity anchor and Constitution in prompts/base.md are unchanged (verified untouched on this branch). Dogfood constitution.json filled with CodeWhale's real invariants (prefix-cache byte-stability, transcript replay, stable Rust, cli/tui parity), branch policy (codex/v0.8.53), and escalation rules. Docs note the layered hierarchy. cargo test -p codewhale-tui --bins → 3946 passed; clippy clean.	2026-06-03 12:16:06 -07:00
Hunter Bown	9d9616e898	feat(project): deprecate WHALE.md; add .codewhale/constitution.json authority layer Splits repo-level guidance into two clear artifacts and deprecates the confusing WHALE.md concept (overlapped with AGENTS.md): - AGENTS.md is the canonical cross-agent project-instructions file. - .codewhale/constitution.json is the CodeWhale-specific repo authority / prioritization policy (when local sources conflict, which to trust first; what to verify before claiming done). Rendered into the system prompt as a higher-authority <codewhale_repo_constitution> block; takes precedence over a legacy WHALE.md. WHALE.md migration (compat-preserving): - AGENTS.md now ranks above WHALE.md in both project and global discovery; with both present, AGENTS.md wins. - WHALE.md is still read as a legacy fallback, but now emits a deprecation warning and is never created or recommended (init.rs no longer suggests it). - Discovery/docs updated; the global CodeWhale Constitution in prompts/base.md is unaffected (different thing). constitution.json: - New RepoConstitution (serde, all fields optional, unknown fields ignored, schema_version checked). Discovered at .codewhale/constitution.json in the workspace or any parent up to the git root. Malformed JSON warns, never panics. - Loaded after the auto-generate fallback so it can't be clobbered. .gitignore: ignore .codewhale/ contents at any depth EXCEPT the committed constitution.json (a directory exclude can't be negated, so */.codewhale/ + negation). init.rs writes the same pattern for new repos. Dogfood: this repo's .codewhale/constitution.json added. find_git_root made pub(crate) and reused (no duplicate loader). Tests: AGENTS-over-WHALE precedence, WHALE legacy-read-with-warning, constitution render + system-block surfacing, malformed-constitution warning, gitignore-keeps-constitution. cargo test -p codewhale-tui --bins → 3946 passed; clippy clean. Targets codex/v0.8.53.	2026-06-03 12:12:34 -07:00
Hunter Bown	8cb4f94f30	docs: v0.8.53 tool-surface-diet design + north-star direction Design-only deliverables for the v0.8.53 "tool surface diet / canonical surfaces" cutover (no catalog code in this cycle). Grounded in a verified inventory of the actual tool registry. - docs/TOOL_LIFECYCLE.md (#2681): the umbrella policy. Five lifecycle states (active / deferred / hidden-compatibility / deprecated / removed) modeled as const name-sets + an alias table in tool_catalog.rs (not a per-ToolSpec field), so registration stays untouched and old transcripts always replay. Includes the deprecation manifest (exec_wait/exec_interact/tts → hidden-compat; todo_* → checklist_* deprecated; 11 legacy subagent names are already non-visible dead code → cleanup + guardrail), per-mode/per-provider active-catalog budget (incl. Arcee's 8-tool first-turn set), prefix-cache safety rules, and the tool_agent decision: canonical but DeepSeek-V4-gated. - docs/CODEBASE_SEARCH_DESIGN.md (#2680, v0.9.0): local-first FTS5/BM25 + symbol/path ranking + RRF hybrid; rusqlite storage; mtime/branch/vendor invalidation; an explainable tool contract returning reasons[]; and a real CodeWhale query eval set. Complements grep_files/file_search, never replaces. - docs/SKILL_INVOCATION_DESIGN.md (0.9.0): the $<skill-name> inline invocation syntax (the token IS the skill name), namespaced resolution, ambiguity- suggests-not-guesses, visible activation line, and a smallest-viable slice. - docs/VISION_NORTH_STAR.md (0.9.0+): intent router, hybrid codebase intelligence, WhaleFlow typed workflow IR, skills/rules runtime, the layered context-memory stack, tool repair/autoload, the evaluation loop, and the command-surface taxonomy (/memory small · /context dashboard · /rules · /workflow · /overlay · $<skill> · codebase_search). Marked DIRECTION, not committed 0.8.53 work; also records the deferred-not-done diet items. Targets codex/v0.8.53.	2026-06-03 11:47:29 -07:00
Hunter Bown	725abeb603	fix(subagent): clearer role vocab, lifecycle signals, and eval ergonomics Make the sub-agent surface easier for less-capable models to drive: - Unify role/type vocabulary (#2649): normalize_role_alias now accepts the full set SubAgentType::from_str accepts (reviewer/implementer/verifier/...), and SubAgentType::from_str learns `planner`, so the dual-validation pass no longer rejects natural roles with a stale four-value hint. Error strings and schema descriptions now enumerate the real accepted aliases. - agent_eval/agent_close always active (#2605) so a first call executes instead of hydrating its schema and forcing a double-invoke; both accept an `agent_name` session alias (#2650). - Self-diagnosing name conflicts (#2656): the duplicate-name error names the conflicting agent_id and its status. - Self-describing completion sentinels (#2658): subagent.done now carries result_clipped / summary_complete / next_action so the parent knows whether to trust the previous-line summary or call agent_eval. - Actionable child-model-unavailable diagnostics (#2653): a provider 403/404 is annotated with the model id and recovery path instead of a bare error. Tests: role vocabulary acceptance + error wording, agent_name resolution, duplicate-name diagnostics, clipped-result sentinel, child-model annotation, agent_eval/agent_close default-active. Full tui suite green (3948), clippy clean. Targets codex/v0.8.53 (v0.8.53 stabilization).	2026-06-03 11:22:56 -07:00
Hunter B	14c882be53	fix(provider): expose siliconflow-cn registry coverage	2026-06-03 02:51:42 -07:00
Hunter B	54446e6c07	fix(release): stabilize v0.8.52	2026-06-03 02:39:45 -07:00
AresNing	8981d5c5fd	feat: add subagent lifecycle hooks Add subagent lifecycle hooks for better control over subagent initialization and teardown.	2026-06-02 20:48:09 -07:00
Hunter Bown	06612495fc	chore(release): prep v0.8.51 — Arcee provider, cycle removal, UI fixes Release-preparation checkpoint for v0.8.51 (workspace + npm bumped to 0.8.51). Added: - Arcee AI direct provider: [providers.arcee], ARCEE_API_KEY/BASE_URL/MODEL, CLI auth, provider + model picker, registry. Default direct-API model is trinity-large-thinking (reasoning, 262K ctx/out); preview + mini selectable. Cloudflare-WAF-safe opening turn (benign read-only tool surface, system-prompt payload splitting) and reasoning_content replay on tool-call turns. - Expanded model catalog (qwen3.6 flash/plus/max-preview, Xiaomi MiMo v2.5 chat/ASR/TTS); provider-aware model picker with per-provider saved models. Changed: - Auto-compaction is percentage- and model-aware (compaction_threshold_for_model_at_percent; default 80%; auto-enable for <=256K windows, opt-in for 1M models). - Provider/gateway HTTP errors sanitized (HTML/WAF interstitials collapsed, 401/403 split into authentication vs authorization). Removed: - The session cycle / checkpoint-restart system: /cycles, /cycle, /recall, recall_archive tool, cycle_manager, cycle-handoff prompt, sidebar cycle lines, EngineConfig.cycle / Event::CycleAdvanced / seam cycle thresholds. Fixed: - Orphaned assistant 'blue dot' role glyph on whitespace-only turns. - Sidebar mouse-wheel scroll leaking into the transcript. - Sidebar hover tooltip overlap + warning-orange styling. - README Constitution description corrected to match prompts/base.md. - Repaired release-blocking unit/integration tests after the refactors. Preflight: cargo fmt clean, workspace builds, 3903 tui tests pass (1 known flaky MCP SSE test under parallel load, passes in isolation).	2026-06-02 17:36:18 -07:00
Hunter B	e54a0a500b	feat(provider): add direct arcee support	2026-06-02 08:51:39 -07:00
xyuai	8532dcc49e	feat: add Xiaomi MiMo speech support	2026-06-01 21:18:28 -07:00
Hunter B	eedeb5290b	fix(agent): pass through explicit AtlasCloud model ids Refs #2569 Harvests the safe part of PR #2569 by allowing AtlasCloud provider-hinted namespaced model IDs to route exactly as requested, without freezing a volatile provider model catalog in the static registry. Co-authored-by: lucaszhu-hue <lucas.zhu@atlascloud.ai>	2026-06-01 20:06:26 -07:00
Hunter B	63b7c189b8	fix(release): ship NSIS installer artifact	2026-06-01 19:34:25 -07:00
songzhenrui	e6de6f47d5	fix: address Gemini code review feedback - Define UnStrStr macro for uninstaller string functions - Use un.StrStr instead of StrStr in uninstaller context - Rewrite un.RemoveFromPath with correct offset calculations and semicolon handling to prevent PATH corruption - Use dynamic version fetch from GitHub API in CLASSROOM_INSTALL.md	2026-06-01 19:27:30 -07:00
songzhenrui	c2c36cca11	feat: add NSIS installer and classroom admin checklist Closes #1983 - Add scripts/installer/codewhale.nsi: NSIS installer that installs both codewhale.exe and codewhale-tui.exe to %LOCALAPPDATA%\Programs\CodeWhale\bin, adds to current-user PATH, and includes an uninstaller that cleans PATH - Add docs/CLASSROOM_INSTALL.md: step-by-step checklist for IT admins deploying CodeWhale in labs/classrooms, covering silent install, manual fallback, API key provisioning, imaging notes, and troubleshooting - Update docs/INSTALL.md: add Windows NSIS Installer section referencing the new installer and classroom checklist	2026-06-01 19:27:30 -07:00
cyq	c92f3c350b	feat(tui): expose current model in turn metadata	2026-06-01 19:22:49 -07:00
cyq	a41a3825c5	docs(runtime): outline receipt export boundary	2026-06-01 19:22:00 -07:00
cyq	baba81cfb9	fix(tui): show session timestamps in listings	2026-06-01 19:21:53 -07:00
reidliu41	c81cdabc09	feat(tui): add bang shell command shortcut Support `! <command>` and `!command` in the TUI composer to run shell commands through the existing exec_shell path. The shortcut keeps normal approval, sandbox, policy, transcript, and work-panel handling, while avoiding model context pollution from local-only tool results. Refs #1546	2026-06-01 17:42:04 -07:00
cyq	73cd721665	feat(tui): add mention browser completions	2026-06-01 17:33:41 -07:00
Hunter Bown	cbd6239f3d	docs: mark agent shell tools approval-gated	2026-06-01 17:23:23 -07:00
cyq	9e9326990d	docs: polish mode availability table	2026-06-01 17:23:04 -07:00
cyq	fda2141b70	docs: clarify shell tool mode availability	2026-06-01 17:23:04 -07:00
Hunter Bown	ae2000b59a	docs(tui): align statusline customization limits	2026-06-01 17:20:49 -07:00
cyq	9bd08c2f1c	docs(tui): document statusline footer items	2026-06-01 17:20:05 -07:00
cyq	3db7b40458	docs(rebrand): address migration review notes	2026-06-01 17:19:26 -07:00
cyq	5f3cc3c8e8	docs(rebrand): clarify state migration paths	2026-06-01 17:19:26 -07:00
cyq	5a909eea45	docs(providers): clarify local model tool calls	2026-06-01 17:17:20 -07:00
Hunter Bown	c52769e5f5	feat(tools): add parallel verifier ensemble	2026-06-01 16:40:30 -07:00
cyq	3b5727f283	fix(tui): prefer codewhale settings path	2026-06-01 16:17:43 -07:00
Hunter Bown	3d5edfee80	test(config): cover legacy workspace shell opt-in Refs #2523	2026-06-01 15:30:47 -07:00
cyq	7dfec0ed4a	fix(config): honor workspace shell opt-in	2026-06-01 15:30:14 -07:00
Hunter Bown	bb64018a15	feat(tui): add configurable auto-compact threshold Refs #1722 Preserves auto_compact as opt-in, adds the saved threshold setting, keeps the 500K hard floor, and wires Ctrl+L as a manual compaction shortcut for context-pressure recovery. Harvested from PR #1723 by @aboimpinto Co-authored-by: Paulo Aboim Pinto <aboimpinto@gmail.com>	2026-06-01 14:09:49 -07:00
Hunter B	998af56d6a	chore(release): harden deepseek-tui deprecation path	2026-06-01 06:01:03 -07:00
greyfreedom	3df018994f	feat(config): load typed ask permissions file (cherry picked from commit fb77cf1e0946a061376e5e9a8fc9422dddd98419)	2026-06-01 05:43:26 -07:00
Hunter B	ce75b9551a	docs: update TOOL_SURFACE.md for v0.8.49 - Update version header from v0.8.35 to v0.8.49 - Fix PDF extraction backend description (bundled pure-Rust, not Poppler) - Add 'Additional registered tools' section listing ~15 tools present in the registry but not previously documented: web.run, multi_tool_use.parallel, request_user_input, git_show/log/blame, load_skill, revert_turn, pandoc_convert, validate_data, code_execution, review, project_map, remember, image_analyze, image_ocr, finance - Clarify canonical live names version reference	2026-06-01 02:33:40 -07:00
Hunter B	78facf9239	chore: home-polish for v0.8.49 - Add completion_sound to config.example.toml notifications section (off/beep/bell, default 'beep', introduced in v0.8.48) - Add kimi-k2.6 to Default Models comments in config.example.toml (default Moonshot/Kimi model, provider section already added) - Fix stale v0.8.47 reference in docs/PROVIDERS.md Planned section (now says v0.8.48+, reflecting current milestones)	2026-06-01 02:33:40 -07:00
Hunter B	e9eea70445	release: v0.8.48 — liveness watchdog fix, Qwen 3.7 removal, provider/docs sync - Fix false 'Turn stalled' during long active turns with running tools. Add turn_last_activity_at tracking and active-tool awareness to reconcile_turn_liveness(). Three new tests cover the fix. - Remove Qwen 3.7 Max OpenRouter preset from registry, picker, docs, and tests. Qwen 3.7 Max is a hosted model; the preset will return when an open-weight Qwen 3.7 release ships. MiniMax M3 remains as a full 1M-context multimodal route. - Sync root CHANGELOG to crates/tui/CHANGELOG for crates.io packaging. Update docs/CONFIGURATION.md, docs/PROVIDERS.md, and README to reflect the Qwen 3.7 removal. Regenerate web facts timestamp.	2026-05-31 21:45:31 -07:00
Hunter B	2bd77edc15	Fix provider model selection and stream completion	2026-05-31 21:15:40 -07:00
Hunter B	2b69f4e041	chore: polish codewhale home defaults	2026-05-31 19:22:12 -07:00
Hunter B	6f295e9961	docs: sync release credit surfaces	2026-05-31 17:43:46 -07:00
Hunter B	a9a4213d39	fix(tui): make startup update checks configurable	2026-05-31 17:06:20 -07:00
Hunter B	d6baf20721	Migrate file secrets to codewhale home	2026-05-31 15:30:35 -07:00

1 2 3 4 5 ...

268 Commits