401 Commits

Author SHA1 Message Date
Hunter B bb4d1e5744 docs(web,readme): taste pass — fix stale refs, tighten design rhythm
Polish fixes (12):
- INSTALL.md: drop duplicate code block; doc CODEWHALE_VERSION canonical
  env var; fix example mirror URLs /DeepSeek-TUI/ → /CodeWhale/; remove
  stale "Once npm publication resumes" conditional
- footer.tsx: fix Gitee mirror URL deepseek-tui → CodeWhale
- globals.css: update seal CSS shadows from old cream rgba(244,241,232)
  to paper white rgba(255,255,255)
- web/README.md: update palette description cream #FAF6EE → white #FFFFFF
- nav.tsx: fix zh-only date strip (第…期) and API label (在线) — now
  "Edition …" / "API · Online" in en mode
- page.tsx: hardcoded v0.8.56 → dynamic facts.version
- README.md: merge duplicate Contributing sections
- README.ja-JP/vi: parenthesize untranslated Chinese thank-you blurbs

Design rhythm (9):
- ThinkingTrace section: py-16→py-20, bg-paper-deep, hairlines for reveal
- Judgment stack: py-16→py-12; middle cell bg-paper-deep for ABA rhythm
- Maintainer loop: py-16→py-14; step numbers grow 01→04
- How it works: py-16→py-12
- Join in closer: py-16→py-20 (dark section breathes)
- Ticker: LIVE label shrunk to annotation weight
2026-06-13 13:55:51 -07:00
Hunter B 5b3228cc4e docs(readme): renovate website hero + localized READMEs; nest contributors
- Home hero (web/app/[locale]/page.tsx): multi-model, community-first,
  'intent outranks everything' framing; replace DeepSeek-only positioning.
- Add 'See how it decides' section + thinking-trace.tsx: surfaces real
  reasoning traces (Article II/V citations) paired with the decision each
  produced — proof, not assertion, that the constitution operates.
- docs/CONTRIBUTORS.md: chronological collapsible bands (forward-track ->
  recurring/historical -> v0.8.48). Rebuilt by byte-copying verified HEAD
  content; handle diff confirms zero contributor drops.
- Localize README.zh-CN / ja-JP / vi to mirror the new English README
  (multi-model, community-first, constitution in its own section, concise
  Thanks linking to docs/CONTRIBUTORS.md).
2026-06-13 13:40:33 -07:00
Hunter B 5ca618d70a fix(subagents): make eval and queued steering nonblocking
Make agent_eval return a running projection by default so follow-up steering does not wait for child model calls. Keep checkpoint resume blocking by default unless block=false is explicit.

Teach /agent, /swarm, prompts, and docs to poll workers nonblocking and reserve block:true for deliberate terminal waits. Add Ctrl+S as a reliable queued-message send path before falling back to draft stash.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:31:21 -07:00
Hunter B 5815f94f0b docs(runtime): clarify fleet-backed subagents
State the v0.8.60 direction plainly: sub-agent is role and UX vocabulary, while durable detached work should use the fleet-backed worker lifecycle.

Document the current agent_open path as compatibility until retry, receipt, and ledger semantics are unified with Fleet.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:06:28 -07:00
Hunter B a21048a207 docs(readme): focus public overview
Shorten the README around the product path, provider support, install flow, docs index, and community contribution entry points.

Move the long contributor credit ledger into docs/CONTRIBUTORS.md so public credit remains visible without overwhelming the README.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 13:03:58 -07:00
Hunter B 877b44935e fix(skills): reject multi-skill Claude plugin archives
Document the portable SKILL.md compatibility boundary for Claude Code plugin bundles and keep /skill install from silently flattening plugin archives that carry multiple skills plus plugin.json runtime metadata.

Reported by @AiurArtanis in #2743.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 08:20:36 -07:00
Hunter B 38ce04790a feat(whaleflow): add declarative JS workflow authoring
Adds a compile-only JavaScript/TypeScript authoring path that extracts a JSON-compatible workflow({...}) object, lowers it to the existing WorkflowSpec IR, and runs the Rust validation gate before execution.

Includes a branch/reduce .workflow.js example and a short authoring design note comparing YAML/JSON, Starlark, JavaScript, and TypeScript. The compiler rejects effectful JavaScript constructs instead of executing workflow source as a second runtime.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 08:15:14 -07:00
Hunter B f6867e65bd fix(provider): preserve Z.ai GLM thinking traces
Z.ai's current GLM-5.2 coding-agent docs confirm actual thinking efforts high and max, with ultracode mapped to max. Its Chat Completion API documents thinking.type, thinking.clear_thinking, and reasoning_content rather than a raw reasoning_effort scalar, so direct Z.ai requests now use the documented thinking object and preserve reasoning_content for coding-agent continuity.

Also accepts ultracode as a max-effort alias across settings, routing, and Responses mapping.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 08:10:37 -07:00
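The alias handling described in f6867e65bd might look roughly like the sketch below. It models only the two effort names the commit attributes to Z.ai (high and max) plus the ultracode alias. The enum, the function name, and the case-insensitive matching are hypothetical and not taken from the CodeWhale source.

```rust
/// Hypothetical effort tiers; only "high" and "max" are named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThinkingEffort {
    High,
    Max,
}

/// Parse a user-facing effort string, treating "ultracode" as an alias for max.
/// Trimming and case-insensitive matching are assumptions of this sketch.
pub fn parse_zai_effort(raw: &str) -> Option<ThinkingEffort> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "high" => Some(ThinkingEffort::High),
        "max" | "ultracode" => Some(ThinkingEffort::Max),
        // Anything else is left to the caller; the commit does not list other tiers.
        _ => None,
    }
}
```

Resolving the alias at parse time would let settings, routing, and request mapping share one enum value, which is one plausible way to read "across settings, routing, and Responses mapping".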
1Kill2Steal 6ac93e81c7 feat(tui): add Ctrl+P/N slash-menu navigation
Adds Ctrl+P and Ctrl+N as Emacs-style alternatives for slash-command autocomplete navigation while keeping the global Ctrl+P file picker from stealing focus when the slash menu is open.

Harvested from PR #3196 by @1Git2Clone.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 08:07:29 -07:00
Hunter B 5106ecfb2d test(fleet): CI-safe multi-worker dogfood smoke with injected failure (#3166)
Adds a verifiable dogfood smoke that drives several concurrent exec-style
workers (three healthy + one injected-failure that emits an error event and
exits non-zero) through the real host adapter, asserting distinct terminal
pass/fail outcomes — no external services, no model calls, no codewhale binary.
Documents the automated CI smoke vs the manual `codewhale fleet run` path in
the dogfood spec, and is honest that the manager run-loop cutover to drive real
FleetExecutor workers is still in progress.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 01:30:06 -07:00
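The shape of the smoke test in 5106ecfb2d (three healthy workers plus one injected failure, each ending in a distinct terminal outcome) can be illustrated with plain threads. This is a minimal sketch under stated assumptions: it does not use the real host adapter, and `Outcome`, `run_fake_worker`, and `run_smoke` are hypothetical names.

```rust
use std::thread;

/// Hypothetical terminal outcome of one worker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Outcome {
    Passed,
    Failed { exit_code: i32 },
}

/// Stand-in for an exec-style worker: no model calls, no external services.
/// An injected failure ends with a non-zero exit code.
fn run_fake_worker(inject_failure: bool) -> Outcome {
    if inject_failure {
        Outcome::Failed { exit_code: 1 }
    } else {
        Outcome::Passed
    }
}

/// Run one worker per flag concurrently and collect outcomes in input order.
pub fn run_smoke(inject: &[bool]) -> Vec<Outcome> {
    let handles: Vec<_> = inject
        .iter()
        .map(|&fail| thread::spawn(move || run_fake_worker(fail)))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

Calling `run_smoke(&[false, false, false, true])` yields three `Passed` entries followed by one `Failed`, which is the pass/fail split the commit says the smoke asserts.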
Hunter B 22980b8788 docs(direction): unified agent runtime + Claude Code convergence stance (#2972)
Adds docs/AGENT_RUNTIME.md: one headless runtime, three launchers (TUI turn,
`codewhale exec`, Agent Fleet); sub-agent == fleet worker == nested/launched run
of the same runtime; the isolation+events+compact-projection lag fix; the single
recursion axis; and the one event vocabulary. Answers #2972 (how much Claude
Code convergence): converge on shape (headless runtime, isolated sub-agents,
event-driven fanout, skills, receipts), keep CodeWhale branding + multi-provider
+ the local-first fleet distinct, and never fork execution semantics per surface.
Cross-linked from FLEET.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 01:27:59 -07:00
Hunter B b1cc6ecf5d merge: fleet security/trust + headless-worker foundation + depth unification 2026-06-13 01:11:50 -07:00
Hunter B e8b52ac57a feat(fleet): security/trust + headless-worker foundation; unify recursion depth
Lands the Agent Fleet security/trust boundary and the headless-worker bridge on
the v0.8.60 line, and collapses the sub-agent and fleet recursion model into a
single shared axis (Hunter steer: "not two moving targets").

Security & trust (#3165):
- FleetTrustLevel, FleetSecurityPolicy, FleetSecretRef (redacted), FleetWorkerAuth,
  FleetCapabilityGrant, FleetAlertEndpoint (redacted) in protocol.
- secrets: resolve_direct(key, source_hint) — fleet secret resolution, never logged.
- Host adapters refuse secret-bearing env keys; SSH uses SendEnv (no argv secrets).

Roles & delegation (#3167):
- fleet role -> SubAgentType mapping; reviewer/verifier default read-only.

Headless worker bridge (#3096/#3154, partial — still simulation, real spawn next):
- worker_runtime: FleetTaskSpec -> AgentWorkerSpec, status -> ledger events,
  exec hardening (mirrors #3027), parallel-safe read-only tool set (#2983).
- FleetManager carries an optional SharedSubAgentManager + exec config.

Recursion depth — ONE axis:
- codewhale_config now owns DEFAULT_SPAWN_DEPTH (3) + MAX_SPAWN_DEPTH_CEILING (3).
- sub-agent DEFAULT_MAX_SPAWN_DEPTH and the fleet clamp both source these consts.
- fleet default raised 1 -> 3 to match standalone sub-agents; root runs at depth 0,
  budget gates child delegation. End-to-end test proves a depth-0 fleet worker
  reaches 3 nested levels (afford >= 3).

Dogfood scaffolding (#3166, partial): docs/examples/fleet-dogfood.toml.

Tests green: codewhale-config fleet, codewhale-tui fleet (58), subagent max_depth;
cargo fmt + git diff --check clean; cargo check --workspace ok.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 01:10:30 -07:00
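The single recursion axis from e8b52ac57a can be sketched as two shared constants and one clamp. Only the constant names and their values (3 and 3) come from the commit message. The helper functions are hypothetical, and the budget gate the commit mentions is left out.

```rust
// Names and values as stated in the commit message.
pub const DEFAULT_SPAWN_DEPTH: u32 = 3;
pub const MAX_SPAWN_DEPTH_CEILING: u32 = 3;

/// Hypothetical helper: resolve the depth limit for a sub-agent or a fleet worker.
/// An unset value falls back to the shared default; any value is clamped to the ceiling.
pub fn effective_spawn_depth(configured: Option<u32>) -> u32 {
    configured
        .unwrap_or(DEFAULT_SPAWN_DEPTH)
        .min(MAX_SPAWN_DEPTH_CEILING)
}

/// Hypothetical helper: the root runs at depth 0, and a run may delegate
/// only while it is still below the effective limit.
pub fn may_spawn_child(current_depth: u32, max_depth: u32) -> bool {
    current_depth < max_depth
}
```

With the default limit of 3, runs at depths 0, 1, and 2 may each spawn a child, so a depth-0 worker can reach three nested levels, and a run at depth 3 may not delegate further.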
Hunter B 754cff1c84 Refresh GLM and MiniMax provider models 2026-06-12 23:56:01 -07:00
Hunter B 751ff943aa fix(tui): group strategy context under work checklist 2026-06-12 22:55:11 -07:00
Hunter B c0ba6ce5ad feat(subagents): add agent run receipts 2026-06-12 22:45:53 -07:00
Hunter B fb1a27b4f2 feat(fleet): add manager runbook skill 2026-06-12 22:26:35 -07:00
Hunter B 26925ae644 feat(runtime-sdk): add fleet helper client
Refs #3163.

Adds the @codewhale/runtime-sdk workspace with typed fleet Runtime API helpers, protocol-shaped TypeScript declarations, JSON/SSE event fixture handling, and typed RuntimeCapabilityError failures for create/event-stream endpoints that the Rust API has not exposed yet.

Documents the SDK contract in docs/RUNTIME_API.md and wires npm workspace verification through npm test --workspace @codewhale/runtime-sdk.
2026-06-12 22:17:16 -07:00
Hunter B e9f0c1c13b feat(config): add first-party MiniMax provider route
Refs #1310.

Adds the direct minimax provider slot, auth/config/env bindings, model aliases, docs, and picker/status surfaces. MiniMax requests now set reasoning_split and preserve reasoning_details history so thinking stays out of answer text.

Also updates Moonshot/Kimi reasoning classification for the Kimi Code route so reasoning_content streams into Thinking cells instead of inline prose.
2026-06-12 22:05:22 -07:00
Hunter B 6b5ffd5627 feat(tui): expose fleet status surfaces 2026-06-12 19:35:11 -07:00
Hunter B bc61857fad feat(tui): add opt-in fleet alert dry-runs 2026-06-12 19:17:50 -07:00
Hunter B 714276e1c1 feat(tui): add verifiable fleet task specs 2026-06-12 19:01:00 -07:00
Hunter B 9e5688b8c8 feat(tui): add fleet host adapters 2026-06-12 18:31:29 -07:00
CodeWhale Agent 97f8a98b75 fix(config): align Kimi OAuth credentials with Kimi Code 2026-06-12 16:43:25 -07:00
CodeWhale Agent 89a9981bf9 Merge PR #2879: Hugging Face provider docs and tests
Harvested from PR #2879 by @mvanhorn

Co-authored-by: mvanhorn <455140+mvanhorn@users.noreply.github.com>
2026-06-12 13:56:03 -07:00
CodeWhale Agent 81265ec71b feat(config): surface experimental feature flags 2026-06-12 13:36:13 -07:00
CodeWhale Agent 8fc2b904de docs: record full 90-PR equivalence audit and lane B landings
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 10:00:53 -07:00
CodeWhale Agent c08f161e1f docs: add v0.8.59 triage runbook and scratch-branch integration guidance
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
2026-06-12 09:56:29 -07:00
cyq1017 42de833d80 feat(config): add concise verbosity mode
Harvests #3052 onto the v0.8.59 release branch. Noninteractive CLI launches default to concise output discipline unless config, env, or --verbosity overrides it; interactive TUI launches remain normal by default.

Also forwards CODEWHALE_VERBOSITY as the primary env bridge while keeping DEEPSEEK_VERBOSITY for compatibility, documents the setting, and records @cyq1017 credit in both changelogs.
2026-06-12 06:41:01 -07:00
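The verbosity resolution described in 42de833d80 might be sketched as follows. The env var names come from the commit message. The precedence order (flag, then env, then config), the string values "concise" and "normal", and all type and function names are assumptions of this sketch.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verbosity {
    Concise,
    Normal,
}

/// Hypothetical parser; the accepted strings are assumed, not confirmed.
fn parse(value: &str) -> Option<Verbosity> {
    match value.trim().to_ascii_lowercase().as_str() {
        "concise" => Some(Verbosity::Concise),
        "normal" => Some(Verbosity::Normal),
        _ => None,
    }
}

/// Resolve verbosity from an optional CLI flag, an env lookup, and a config value.
/// `env` is injected so the function can be exercised without touching the process environment.
pub fn resolve_verbosity(
    cli_flag: Option<&str>,
    config_value: Option<&str>,
    env: impl Fn(&str) -> Option<String>,
    interactive: bool,
) -> Verbosity {
    // CODEWHALE_VERBOSITY is primary; DEEPSEEK_VERBOSITY is consulted only when it is unset.
    let from_env = env("CODEWHALE_VERBOSITY").or_else(|| env("DEEPSEEK_VERBOSITY"));
    cli_flag
        .and_then(parse)
        .or_else(|| from_env.as_deref().and_then(parse))
        .or_else(|| config_value.and_then(parse))
        // No override: noninteractive launches default to concise, the TUI stays normal.
        .unwrap_or(if interactive {
            Verbosity::Normal
        } else {
            Verbosity::Concise
        })
}
```

In a real binary the lookup could be passed as `|key| std::env::var(key).ok()`.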
Hunter B cf910b7da2 feat(goal): persist thread goals through app server 2026-06-12 06:28:47 -07:00
Hunter B efcf299712 feat(models): add Kimi K2.7 Code defaults 2026-06-12 05:43:38 -07:00
Hunter B d50dfd4827 fix(hooks): require workspace trust for project hooks
Gate .codewhale/hooks.toml behind user-owned workspace trust, mirroring the project-local MCP trust boundary while preserving shell-command hook semantics.

Harvested from PR #3140.

Co-authored-by: Hmbown <101357273+Hmbown@users.noreply.github.com>

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
2026-06-12 02:38:56 -07:00
Hunter B 8c690cb7bf fix(update): guide legacy deepseek users to codewhale
Harvests PR #3013 by @cyq1017 and PR #3053 by @angus-guo. Legacy deepseek/deepseek-tui binaries now return migration steps instead of trying to self-update through a missing codewhale binary, and the README/rebrand docs carry the same upgrade path.

Co-authored-by: cyq1017 <61975706+cyq1017@users.noreply.github.com>

Co-authored-by: gus <217034332+angus-guo@users.noreply.github.com>
2026-06-12 02:22:10 -07:00
Hunter B 10e41b1153 feat(runtime): expose matched approval rule metadata
Harvests the explainability slice from PR #2971 without changing the public HookEvent constructor shape. Runtime API approval.required frames now carry matched_rule metadata when an execpolicy rule caused the prompt.

Co-authored-by: greyfreedom <11493871+greyfreedom@users.noreply.github.com>
2026-06-12 01:46:41 -07:00
idling11 da943be7de fix(config): split SiliconFlow CN provider config
Harvest #2895 for the v0.8.59 release lane. SiliconFlow CN now reads its own providers.siliconflow_cn / providers.siliconflow-CN table and falls back to providers.siliconflow only when api_key, base_url, or model are unset.

Maintainer amendments wire the TUI fallback paths, provider config get/set/unset/redaction surfaces, env override routing, model normalization, tests, and changelog credit.

Fixes #2893.

Reported-by: Artenx <18120598+Artenx@users.noreply.github.com>
2026-06-12 01:21:27 -07:00
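The per-field fallback in da943be7de (SiliconFlow CN reads its own table and falls back to providers.siliconflow only for unset fields) can be sketched with `Option::or_else`. The three field names come from the commit message. The struct and function names are hypothetical.

```rust
/// Hypothetical view of one provider table with the three fields the commit names.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct ProviderTable {
    pub api_key: Option<String>,
    pub base_url: Option<String>,
    pub model: Option<String>,
}

/// Merge the CN table over the shared table: a field set in `cn` wins,
/// and the shared value is used only when the CN field is unset.
pub fn resolve_siliconflow_cn(cn: &ProviderTable, shared: &ProviderTable) -> ProviderTable {
    ProviderTable {
        api_key: cn.api_key.clone().or_else(|| shared.api_key.clone()),
        base_url: cn.base_url.clone().or_else(|| shared.base_url.clone()),
        model: cn.model.clone().or_else(|| shared.model.clone()),
    }
}
```

Resolving each field independently lets a user set only a CN `base_url` while still reusing the `api_key` from the shared table.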
idling11 0cd8bcde1b feat(bench): add CLI comparison harness
Harvest #3009 for the v0.8.59 release lane. Adds a paired Terminal-Bench harness for CodeWhale and Codex, a Codex Harbor adapter, generated-result ignore protection, and benchmark docs.

Maintainer amendments keep explicit zero-valued metrics, regenerate parent task names, write refreshed summaries in regenerate mode, and allow transcript paths outside the repo.

Fixes #2952.
2026-06-12 01:15:00 -07:00
Hunter B f99fff969a release: harden v0.8.59 readiness lane
Integrate the v0.8.59 release-readiness aggregate: command-boundary grouping, Responses schema hardening, Codex reasoning tiers, goal lifecycle/runtime sync, sub-agent stall guards, activity metadata rows, and provider metadata/auth fixes.

Credit surfaces are captured in the changelogs for Paulo, Nightt, yekern, and the Devin/Hunter integration work.

Co-authored-by: aboimpinto <1231687+aboimpinto@users.noreply.github.com>

Co-authored-by: nightt5879 <87569709+nightt5879@users.noreply.github.com>
2026-06-12 01:07:11 -07:00
Claude d940b7821c Merge origin/main into v0.8.58-3014-anthropic-adapter — keep #3024 Moonshot/Ollama dialect arms plus the #3014 Anthropic native-shaping arms in all three reasoning tiers; keep the #3014 Anthropic capability early-return while honoring the #3023 removal of the Openai/Atlascloud/Moonshot one; union claude + bare-id max_output rows 2026-06-11 05:45:51 +00:00
Hunter Bown 186ebbb116 Merge PR #3050 from Hmbown: wire reasoning-effort for Atlascloud, Moonshot, Ollama dialects
fix(reasoning): wire reasoning-effort for Atlascloud, Moonshot, Ollama
2026-06-10 22:30:00 -07:00
Claude 3f2f154ad2 Merge origin/main into v0.8.58-3018-unhardcode-deepseek — append #3030 step-counter tests after the #3018 routing tests 2026-06-11 05:29:25 +00:00
Claude d9dff6809c Merge origin/main into v0.8.58-3026-hooks-v2 — keep both the #3026 hook fold helpers and the #3027 command_denies_tool gate in turn_loop 2026-06-11 05:26:27 +00:00
Hunter Bown 6b01cccc65 Merge PR #3038 from Hmbown: make Ctrl+B directly background the active foreground shell
fix(tui): make Ctrl+B directly background the active foreground shell
2026-06-10 22:20:27 -07:00
Claude 279ad5af90 feat(client): native Anthropic Messages API adapter — /v1/messages dialect with x-api-key + anthropic-version auth, adaptive thinking + output_config.effort shaping, cache_control breakpoints (prefix zones, capped at 4), signed-thinking replay via new signature field + signature_delta, #2961 usage normalization, SSE pass-through with error envelopes; registry/docs sync (#3014)
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 03:54:46 +00:00
Claude fc10d10189 feat(routing): complete #3018 — RouterCandidates + provider_router_candidates; candidate-aware heuristics, router prompts and recommendation parsing; skip the network router when no cheap tier exists; validate spawn-time model against the active provider; tests + SUBAGENTS.md per-role models docs
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 03:03:08 +00:00
Claude 033132a735 fix(tui): #3032 residuals — running-exec hint now says Ctrl+B backgrounds the command; Ctrl+B documented in KEYBINDINGS.md and runbook updated for menu removal; Cannot-background message names the reason (interactive / non-shell tool / nothing running)
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 02:40:07 +00:00
Claude e4ea208d53 docs(runner): fix resume example — exec has no 'latest' session alias; use --continue (#3021)
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 02:31:00 +00:00
Claude 6f5039101a feat(hooks): wire the #3026 decision contract into the engine — JSON deny/ask/updatedInput now steer tool calls (deny>ask>allow, last-writer updatedInput), additionalContext piped into tool results, project .codewhale/hooks.toml loaded at both HookExecutor sites; parser hardening + fold/glob/project tests; docs
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 01:52:22 +00:00
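The fold rule named in 6f5039101a (deny>ask>allow, last-writer updatedInput) can be illustrated as a reduction over hook outputs. This is a sketch of the stated rule only, not the engine code. The types are hypothetical, and `updated_input` is a plain string here where the real contract uses JSON.

```rust
/// Variant order gives the derived ordering Allow < Ask < Deny,
/// so `max` picks the strictest decision.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Decision {
    Allow,
    Ask,
    Deny,
}

/// Hypothetical result of one hook run.
#[derive(Debug, Clone, PartialEq)]
pub struct HookOutput {
    pub decision: Decision,
    pub updated_input: Option<String>,
}

/// Fold hook outputs in execution order: the strictest decision wins,
/// and the most recent hook that supplied an updated input wins.
pub fn fold_hook_outputs(outputs: &[HookOutput]) -> HookOutput {
    let mut folded = HookOutput {
        decision: Decision::Allow,
        updated_input: None,
    };
    for out in outputs {
        folded.decision = folded.decision.max(out.decision);
        if let Some(input) = &out.updated_input {
            folded.updated_input = Some(input.clone());
        }
    }
    folded
}
```

An empty slice folds to `Allow` with no updated input, which is this sketch's assumption for what happens when no hook is configured.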
Claude 29e60cd4b9 fix(reasoning): Atlascloud joins the DeepSeek dialect group — max tier now sends reasoning_effort=max; add Atlascloud/Moonshot/Ollama dialect-shape tests; document per-provider reasoning-effort wire mapping in PROVIDERS.md (#3024)
Co-Authored-By: Claude <noreply@anthropic.com>
https://claude.ai/code/session_018zaP8vUfTAsrE38L6h6fw5
2026-06-11 00:13:18 +00:00
Hunter Bown cef3b92964 feat(docs): agent-task issue template, labels, and runner protocol (#3021)
Adds the distributed intelligence infrastructure so remote agents
can autonomously execute v0.8.58 milestone issues:

- .github/ISSUE_TEMPLATE/agent-task.yml — GitHub issue form with six
  required sections (Goal, Scope, Key files, Acceptance criteria,
  Verification, Out of scope).  Auto-labels as agent-ready.

- docs/AGENT_RUNNER.md — pick → claim → worktree → exec → verify → PR
  loop with safety rules, label semantics, and the issue body format.

Labels agent-ready, agent-in-progress, needs-human already exist
(created during milestone setup).
2026-06-10 16:19:07 -07:00
Hunter B 4465459b69 feat(release): one-command version bump via prepare-release.sh; close version-drift gaps
- scripts/release/prepare-release.sh bumps workspace + crate pins + npm
  wrapper + README install tags, refreshes Cargo.lock, regenerates the
  TUI changelog slice and web facts, then runs check-versions.sh
- check-versions.sh now also gates web/lib/facts.generated.ts and the
  README install-tag examples (both drifted silently before)
- .cnb.yml validates the pushed tag against Cargo.toml before generating
  mirror release notes
- RELEASE_CHECKLIST/RUNBOOK updated accordingly (v0.8.56 needed 9 fix
  commits for exactly these sync points)
2026-06-09 23:43:15 -07:00