dgf1988/codewhale

Author	SHA1	Message	Date
Hunter Bown	abc9cfdcc1	fix(reasoning): wire reasoning-effort for Atlascloud, Moonshot, Ollama (#3024 ) Three providers previously had silent no-ops for all reasoning-effort tiers — the user toggled thinking on/off and nothing changed on the wire. This commit wires them: - Atlascloud: now speaks the DeepSeek dialect (thinking + reasoning_effort fields) since it serves DeepSeek models. All three arms (off, low–high, xhigh/max) updated. - Moonshot/Kimi: emits thinking: {type: disabled/enabled} for off/on. Kimi-k2.6 natively supports this field. - Ollama: emits think: false/true for off/on. Sent through the OpenAI-compatible /v1/chat/completions endpoint. Providers that remain unchanged: Openai, WanjieArk, OpenaiCodex (chat path), Arcee, Huggingface.	2026-06-10 16:33:29 -07:00
Hunter Bown	db69ee42cf	feat(hooks): JSON decision contract, glob matchers, project-local hooks (#3026 ) Three incremental improvements to the hooks control plane: 1. ToolCallBeforeStdout parser: hooks can now emit a JSON decision on stdout — {"decision": "allow"|"deny"|"ask", "reason": "...", "updatedInput": {...}, "additionalContext": "..."}. Non-JSON or empty stdout retains legacy passthrough (allow). Exit code 2 still hard-denies regardless of stdout. 2. Glob matchers for ToolName conditions: `name = "mcp__*"` now matches all MCP tools. Uses regex::escape + `*` → `.*` pattern, same convention as execpolicy/matcher.rs. Exact names keep working. 3. Project-local hooks: `HooksConfig::load_with_project(global, workspace)` reads `.codewhale/hooks.toml` and appends its hooks after global. Malformed file logs a warning and falls back to global-only.	2026-06-10 16:32:11 -07:00
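The glob matching described in db69ee42cf can be illustrated with a small sketch. The commit builds a regex via regex::escape; the version below swaps in a std-only matcher with the same anchored semantics, assuming `*` is the only wildcard and stands for any run of characters. The function name `tool_name_matches` is hypothetical and does not come from the repository.

```rust
/// Hypothetical helper: anchored glob match where `*` matches any run of
/// characters and every other character is literal. This is a std-only
/// stand-in for the regex-based matcher the commit describes.
fn tool_name_matches(pattern: &str, name: &str) -> bool {
    let mut parts = pattern.split('*');
    // `split` always yields at least one item, possibly empty.
    let first = parts.next().unwrap_or("");
    // The text before the first `*` must be a literal prefix.
    let Some(mut rest) = name.strip_prefix(first) else {
        return false;
    };
    let remaining: Vec<&str> = parts.collect();
    let Some((last, middle)) = remaining.split_last() else {
        // No `*` in the pattern: only an exact name matches.
        return rest.is_empty();
    };
    // Middle segments must appear in order; leftmost match is sufficient.
    for &seg in middle {
        match rest.find(seg) {
            Some(idx) => rest = &rest[idx + seg.len()..],
            None => return false,
        }
    }
    // The text after the final `*` must be a literal suffix.
    rest.ends_with(*last)
}
```

With this sketch, `tool_name_matches("mcp__*", "mcp__github_search")` is true, while a pattern without `*` only matches the identical name, which mirrors the "exact names keep working" note.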
Hunter Bown	29edb6d0cf	feat(prompts): parameterize model-specific facts — context window, pricing, thinking (#3025 ) Extends apply_model_template() to substitute model-specific facts from runtime lookups instead of hardcoded V4 claims: - {context_window_note}: resolved from context_window_for_model(); if unknown, emits fallback wording instead of guessing - {subagent_economics}: resolved from input_cost_note() (new public pricing helper); cost-agnostic fallback when pricing is unknown - {model_thinking_note}: gated on model_supports_reasoning(); empty string when the model does not emit thinking tokens The hardcoded "1M-token window", "$0.14/M Flash", and V4 thinking strategy will only reach models whose capability lookups return those values (DeepSeek V4 family). Non-DeepSeek models get accurate facts or honest "unknown" wording. Adds input_cost_note() to pricing.rs — returns a one-line sub-agent cost description, or None when pricing is unavailable.	2026-06-10 16:29:17 -07:00
Hunter Bown	139fe07d71	fix(providers): use model-based lookups for Moonshot/OpenAI/Atlascloud/Ollama capability (#3023 ) Three changes to fix provider capability reporting: 1. Delete the Openai/Atlascloud/Moonshot early-return arm in provider_capability() so these providers use the generic model-based path. Moonshot models now correctly report 262,144 context window and thinking_supported: true (via models.rs lookups). 2. Delete the Ollama hardcoded arm so Ollama also uses model-based lookups. The generic fallback now uses 8192 for Ollama (conservative for small local models) instead of the 128K default. 3. Ollama fallback: when context_window_for_model returns None and the provider is Ollama, default to 8192 instead of LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS (128K).	2026-06-10 16:27:49 -07:00
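The fallback rule in 139fe07d71 can be sketched as follows. This is an illustration under assumptions: the two-variant `ApiProvider` enum, the stub lookup table, the helper name `effective_context_window`, and the exact numeric value behind "128K" are placeholders, since the log only names `context_window_for_model` and `LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS`.

```rust
/// Reduced stand-in for the real provider enum (assumption).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ApiProvider {
    Ollama,
    Other,
}

/// The log says "128K"; the exact token count is assumed here.
const LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS: u32 = 128_000;
/// Conservative default for small local models, per the commit message.
const OLLAMA_FALLBACK_CONTEXT_WINDOW_TOKENS: u32 = 8_192;

/// Stub lookup: the real table lives in models.rs and is not shown in the log.
fn context_window_for_model(model: &str) -> Option<u32> {
    match model {
        "kimi-k2.6" => Some(262_144),
        _ => None,
    }
}

/// Hypothetical helper: prefer the model-based lookup, then fall back
/// to a provider-specific default.
fn effective_context_window(provider: ApiProvider, model: &str) -> u32 {
    context_window_for_model(model).unwrap_or(match provider {
        ApiProvider::Ollama => OLLAMA_FALLBACK_CONTEXT_WINDOW_TOKENS,
        ApiProvider::Other => LEGACY_DEEPSEEK_CONTEXT_WINDOW_TOKENS,
    })
}
```

A known model keeps its looked-up window on any provider; only an unknown model on Ollama drops to 8192.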
Hunter Bown	7c4d359ed7	fix(reasoning): add Moonshot/Kimi to reasoning-content provider and model support (#3016 ) Two targeted fixes to close the Kimi/Moonshot reasoning gap: 1. chat.rs: add ApiProvider::Moonshot to provider_accepts_reasoning_content so Kimi thinking traces stream as Thinking blocks instead of leaking as plain answer text on Moonshot's native endpoint. 2. models.rs: add kimi-* prefix match to model_supports_reasoning so bare Moonshot-native IDs (kimi-k2.6, kimi-*-thinking, etc.) are recognized as reasoning-capable without requiring the OpenRouter-style moonshotai/ prefix. These two changes together ensure is_reasoning_model_for_stream returns true for Kimi models on Moonshot, fixing the RC dialect gap.	2026-06-10 16:26:20 -07:00
Hunter Bown	faeeeef59b	fix(subagent): un-hardcode DeepSeek from model validation — accept any provider id (#3018 ) Two changes to let non-DeepSeek providers use their own model IDs: 1. config.rs: add requested_model_for_provider() — DeepSeek providers use strict normalize_model_name(); all others accept any non-empty string via normalize_custom_model_id(). 2. subagent/mod.rs: normalize_requested_subagent_model() now takes an ApiProvider parameter and delegates to requested_model_for_provider. Error messages name the active provider and list accepted model IDs from model_completion_names_for_provider() instead of hardcoding "Expected a DeepSeek model id". parse_optional_subagent_model() keeps basic trimming-only validation; provider-aware checks are deferred to the spawn path where the runtime is available.	2026-06-10 16:25:03 -07:00
Hunter Bown	5483e1553d	feat(remote-smoke): bump to v0.8.57, add gh CLI, swapfile, agent-session.sh, autonomous loop docs (#3022 ) - setup-vm.sh: bump RELEASE_TAG default to v0.8.57, add gh CLI install step (official APT repo) and 4G swapfile creation (idempotent) - agent-session.sh: new sourceable helper that exports the provider key from /etc/codewhale/runtime.env for interactive agent sessions - README.md: update version refs, add agent-session.sh to layout, add Autonomous agent loop section with full pick->PR commands The droplet ops (binary upgrade, PAT setup, first end-to-end issue run) are documented as the next steps for the operator.	2026-06-10 16:20:57 -07:00
Hunter Bown	cef3b92964	feat(docs): agent-task issue template, labels, and runner protocol (#3021 ) Adds the distributed intelligence infrastructure so remote agents can autonomously execute v0.8.58 milestone issues: - .github/ISSUE_TEMPLATE/agent-task.yml — GitHub issue form with six required sections (Goal, Scope, Key files, Acceptance criteria, Verification, Out of scope). Auto-labels as agent-ready. - docs/AGENT_RUNNER.md — pick → claim → worktree → exec → verify → PR loop with safety rules, label semantics, and the issue body format. Labels agent-ready, agent-in-progress, needs-human already exist (created during milestone setup).	2026-06-10 16:19:07 -07:00
Hunter Bown	dbd9b9670d	feat(exec): add --allowed-tools, --disallowed-tools, --max-turns, --append-system-prompt (#3027 ) Headless exec hardening for benchmark/CI/droplet use: - New CLI flags: --allowed-tools, --disallowed-tools, --max-turns, --append-system-prompt - Add disallowed_tools to EngineConfig + command_denies_tool() helper - run_exec_agent threads all four flags into EngineConfig and Op::SendMessage - needs_engine now includes flag presence for standalone exec use	2026-06-10 16:17:33 -07:00
Hunter Bown	4af5a68591	feat(tui): clickable sidebar rows — click-to-act on Tasks and Agents panels (#3028 ) Adds mouse-click dispatch for sidebar rows: - Add click_action: Option<String> to SidebarHoverRow (app.rs) - Extend sidebar_hover_rows() and render_sidebar_section() to carry row_actions: Vec<Option<String>> - Add task_panel_row_actions(): background task rows get /task show <id> and /task cancel <id> - Add agent_panel_row_actions(): agent rows get /subagents - Add sidebar_click_action() in mouse_ui.rs: resolves mouse position to an action from the sidebar hover state - Wire into MouseEventKind::Down(Left): dispatches ViewEvent::CommandPaletteSelected → existing slash-command pipeline Reuses the existing command dispatch backbone — no new ViewEvent variants. Keyboard parity preserved; all click actions are also reachable via existing slash commands.	2026-06-10 16:10:50 -07:00
Hunter Bown	60e9f706b3	feat(tui): OSC 8 out-of-band hyperlink infrastructure (#3029 ) Adds the foundation for working OSC 8 hyperlinks in the transcript: - LinkRegion struct: (row, col_start, col_end, target) for a contiguous run of linked cells on one terminal row - write_osc8_open/close: emit OSC 8 escapes directly through a Write impl (bypassing ratatui's buffer which strips ESC bytes) - FRAME_LINKS thread-local: passes link regions from the render closure to ColorCompatBackend::draw(), where OSC 8 escapes are emitted out-of-band through the backend's Write impl - ColorCompatBackend integration: draw() reads FRAME_LINKS, emits OSC 8 open/close around linked cells The markdown renderer still uses the inline Span::content approach (known broken); the sentinel-color buffer-scan integration is a follow-up. This PR delivers the emission path and thread-local plumbing so the remaining work is confined to link detection in the render closure.	2026-06-10 16:06:26 -07:00
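For readers unfamiliar with OSC 8, the open/close emission named in 60e9f706b3 might look roughly like this. The function names follow the commit message, but the signatures and the `write_link` wrapper are assumptions; the escape layout (`ESC ] 8 ; params ; URI ST`, with empty params and `ESC \` as the terminator) is the commonly documented OSC 8 form.

```rust
use std::io::{self, Write};

/// Emit the OSC 8 opening sequence for `target` (params left empty).
/// Signature is assumed; the log only gives the function name.
fn write_osc8_open<W: Write>(out: &mut W, target: &str) -> io::Result<()> {
    write!(out, "\x1b]8;;{target}\x1b\\")
}

/// Emit the OSC 8 closing sequence (an empty URI ends the link).
fn write_osc8_close<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_all(b"\x1b]8;;\x1b\\")
}

/// Hypothetical convenience wrapper: linked text between open and close.
fn write_link<W: Write>(out: &mut W, text: &str, target: &str) -> io::Result<()> {
    write_osc8_open(out, target)?;
    out.write_all(text.as_bytes())?;
    write_osc8_close(out)
}
```

Writing through a plain `Write` handle is what lets the escapes bypass a cell buffer that would otherwise strip ESC bytes, which is the motivation the commit gives for the out-of-band path.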
Hunter Bown	502fb04c23	fix(tui): make Ctrl+B directly background the active foreground shell (#3032 ) Previously Ctrl+B opened a two-step ShellControlView menu (Background / Cancel). Now it directly calls request_foreground_shell_background(), backgrounding the running foreground shell in one keystroke. When no foreground shell is running, the existing status message ("No foreground shell command to background") provides the hint. The ShellControlView and open_shell_control() remain available as a programmatic entry point for views/tests.	2026-06-10 15:59:40 -07:00
Hunter Bown	7fef919765	fix(tui): compact tool-call transcript rendering — suppress boilerplate (#3031 ) Three targeted changes to reduce low-value detail in the default compact/Live transcript view: 1. ExecCell: suppress "(no output)" line in Live mode. The success header already conveys the outcome; Transcript mode keeps it for exports/clipboard/pager. 2. ExecCell: suppress sub-second timing in Live mode. Calls under 1s show no timing line; Transcript mode always shows exact timing. 3. render_preserved_output_mode: suppress "(no output)" for empty output in Live mode. Same rationale — the header carries the signal. Full command text, complete output, and exact timing remain available in Transcript mode (pager, clipboard export, Alt+V detail view).	2026-06-10 15:57:08 -07:00
Hunter Bown	ec0789daf4	fix(tui): hide internal IDs from normal UI — stable labels for turns and agents (#3030 ) Three changes to replace raw UUIDs/hex-ids with stable user-facing labels: 1. Turn label: Add turn_counter to App, display "Turn N" instead of the raw runtime_turn_id UUID prefix. Full UUID preserved in hover text. 2. Agent labels: Add agent_counter + agent_label_map to App. Populated on AgentSpawned; sidebar rows use "Agent 1", "Agent 2" etc. instead of agent_<hex>. Nicknames and user-assigned names still take priority. 3. Step counter: Add format_step_counter() helper. When max_steps is u32::MAX (the unbounded sentinel), renders "step 16" instead of the meaningless "step 16/4294967295". Concrete step budgets still show the denominator.	2026-06-10 15:52:34 -07:00
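The step-counter rule from ec0789daf4 is small enough to sketch directly. The helper name comes from the commit message; its parameter list and the constant name are assumptions.

```rust
/// Sentinel meaning "no step budget", as described in the commit message.
const UNBOUNDED_STEPS: u32 = u32::MAX;

/// Assumed signature: render "step N" for unbounded runs and
/// "step N/M" when a concrete budget exists.
fn format_step_counter(step: u32, max_steps: u32) -> String {
    if max_steps == UNBOUNDED_STEPS {
        format!("step {step}")
    } else {
        format!("step {step}/{max_steps}")
    }
}
```

Under this sketch, `format_step_counter(16, u32::MAX)` yields "step 16" and `format_step_counter(3, 20)` yields "step 3/20".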
Hunter Bown	7b1446f7b0	fix(tui): throttle AgentProgress redraws to prevent freeze under subagent load (#3033 ) When 4+ sub-agents run concurrently, each AgentProgress event triggers a full terminal redraw via received_engine_event → needs_redraw. The render loop saturates, sidebar recomputation dominates the frame budget, and terminal input events (including Ctrl+C) are starved. Limit progress-driven redraws to at most one per 100ms per agent. The status-animation timer (80ms cadence) still guarantees sidebar updates. Agent state is recorded immediately; the sidebar picks it up on the next permitted redraw. Adds last_agent_progress_redraw field to App to track throttle state.	2026-06-10 15:47:35 -07:00
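The throttle in 7b1446f7b0 can be sketched as a per-agent timestamp check. This is only an illustration: the commit mentions a `last_agent_progress_redraw` field without showing its type, so the `ProgressThrottle` struct, the per-agent map, and the method name are assumptions.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// At most one progress-driven redraw per agent in this window.
const PROGRESS_REDRAW_INTERVAL: Duration = Duration::from_millis(100);

/// Hypothetical holder for throttle state, keyed by agent id.
#[derive(Default)]
struct ProgressThrottle {
    last_redraw: HashMap<String, Instant>,
}

impl ProgressThrottle {
    /// Returns true when a redraw is permitted for `agent_id` at `now`,
    /// and records `now` as that agent's latest redraw time.
    fn should_redraw(&mut self, agent_id: &str, now: Instant) -> bool {
        let allowed = self
            .last_redraw
            .get(agent_id)
            .map_or(true, |&last| now.duration_since(last) >= PROGRESS_REDRAW_INTERVAL);
        if allowed {
            self.last_redraw.insert(agent_id.to_owned(), now);
        }
        allowed
    }
}
```

Passing `now` in explicitly keeps the check deterministic in tests. Agent state would still be recorded on every event, with only the redraw request gated by this check.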
Hunter Bown	940ea28756	fix(codex): land function_call_output, schema sanitization, and reasoning effort mapping - responses.rs: add ToolResult→function_call_output conversion for user-role blocks (previously silently dropped, breaking multi-turn tool calling) - responses.rs: extract codex_responses_reasoning_effort helper — maps 'maximum'→xhigh, off/none→omit, unknown→medium - responses.rs: tool_to_responses_function now clones and sanitizes schema via schema_sanitize::sanitize_for_responses before sending parameters - schema_sanitize.rs: add sanitize_for_responses — forces root type:object, merges root composition properties (oneOf/anyOf/allOf), removes root enum/not, ensures properties object, prunes dangling required - schema_sanitize.rs: add merge_root_composition_properties helper - schema_sanitize.rs: add 3 responses_sanitize_* tests Refs: #2984 #2955	2026-06-10 15:30:47 -07:00
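The effort mapping listed in 940ea28756 could be read as the following sketch. Only three rules come from the commit message ('maximum' becomes xhigh, off/none is omitted, unknown values become medium); the pass-through of low, medium, high, and xhigh, the case-insensitive comparison, and the signature are assumptions.

```rust
/// Assumed signature: `None` means the field is omitted from the request.
fn codex_responses_reasoning_effort(requested: &str) -> Option<&'static str> {
    match requested.trim().to_ascii_lowercase().as_str() {
        // Stated in the commit: off/none omit the field entirely.
        "off" | "none" => None,
        // Stated in the commit: 'maximum' maps to xhigh.
        "maximum" => Some("xhigh"),
        // Assumed pass-through for tiers the API already understands.
        "low" => Some("low"),
        "medium" => Some("medium"),
        "high" => Some("high"),
        "xhigh" => Some("xhigh"),
        // Stated in the commit: anything unrecognized falls back to medium.
        _ => Some("medium"),
    }
}
```

Returning `Option` makes the "omit" case explicit, so a caller can skip serializing the field rather than sending a placeholder value.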
Hunter Bown	5bea28e4e2	feat(tui): split sidebar 'Model reasoning' from 'Background commands'; provider-aware effort labels - Add TaskPanelEntryKind enum (Background, ModelReasoning) to app.rs - Add ReasoningEffort::display_label_for_provider() — maps Max→xhigh for Codex - Sidebar: new 'Model reasoning' section above 'Background commands' - Transcript: rename 'thinking' → 'reasoning' label - Model picker: effort rows use provider-aware labels - Subagent routing: tag entries with kind::Background - Tests: kind assertions in task entry + effort display label tests Refs: #2984 (sidebar+reasoning surface for Codex and all-models theme)	2026-06-10 15:30:41 -07:00
Hunter Bown	4d4cbd36a2	v0.8.58: constitution refactor + rebrand fixes + provider error improvement - Constitution: new preamble (A / Rule Number 6), personality tier removed (8 tiers instead of 9), elevated constitutional prose with shall/shall-not - YAML constitution (constitution.yaml) as structured source of truth with indentation encoding tier precedence - Python renderer (render_constitution.py) for YAML -> markdown conversion - prompts.rs: load constitution.md instead of base.md + calm.md overlay - #2664: state.db default path prefers .codewhale/ over .deepseek/ with legacy fallback so existing installs keep session history - #2644/#2664: update stale doc comments referencing deepseek paths - #3007: provider rejection error now shows source (CLI flag vs config), lists supported providers, and gives specific fix instructions	2026-06-10 10:02:49 -07:00
Justin Gao	ebe828af27	fix: remove useless .into() on SavedSession.messages clone (#2579 ) SavedSession.messages is Vec<Message>, not AppendLog — .clone() already returns Vec<Message>, so .into() was a no-op conversion that triggered clippy::useless_conversion in CI lint.	2026-06-10 17:19:43 +08:00
Justin Gao	08904fde47	refs(#2264 ): Phase 4 — replace Session.messages: Vec<Message> with AppendLog (#2579 ) - Wire AppendLog as the backing store for Session.messages - Add Deref, From impls, and explicit mutation methods to AppendLog - Narrow API: remove DerefMut, add push_batch/truncate_to/trim_front/clear/last_mut - Update all direct message assignments to use .into() conversions - Update tests to deref through AppendLog for comparisons Rebased onto upstream/main (v0.8.57) to resolve merge conflicts.	2026-06-10 16:55:11 +08:00
Hunter B	b23067bacd	release: v0.8.57 — sleep-resume turns, docker fix, one-command release prep, changelog diet	2026-06-10 00:02:51 -07:00
Hunter B	f9c9764265	fix(tests): deflake prompt cache and MCP SSE mock-server tests - prompt_persist tests wrote through the developer's real ~/.codewhale/prompt_cache and raced each other (the eviction test could delete another test's entry). cache_dir() now honors CODEWHALE_PROMPT_CACHE_DIR and the tests run serialized against private tempdirs. - The raw TCP mock servers in mcp tests answered one request per socket but never advertised 'Connection: close', so reqwest pooled dead keep-alive connections and retried POSTs failed under parallel-suite load with 'connection closed before message completed' (~50% failure rate on the full suite).	2026-06-10 00:00:54 -07:00
Hunter B	ddd5df4b9b	chore: drop stale allow(dead_code) on AgentOpenTool (registered since v0.8.33)	2026-06-09 23:45:58 -07:00
Hunter B	c58ef8ddff	feat(release): generate the GitHub Release body from the CHANGELOG entry The workflow hardcoded install boilerplate plus a contributor list that had already drifted (v0.8.56's release thanked people 'for shaping v0.9.0'). The body now comes from scripts/release/generate-release-body.sh: static install/verify sections plus the tagged version's changelog section, which already carries the per-release credits.	2026-06-09 23:44:57 -07:00
Hunter B	4465459b69	feat(release): one-command version bump via prepare-release.sh; close version-drift gaps - scripts/release/prepare-release.sh bumps workspace + crate pins + npm wrapper + README install tags, refreshes Cargo.lock, regenerates the TUI changelog slice and web facts, then runs check-versions.sh - check-versions.sh now also gates web/lib/facts.generated.ts and the README install-tag examples (both drifted silently before) - .cnb.yml validates the pushed tag against Cargo.toml before generating mirror release notes - RELEASE_CHECKLIST/RUNBOOK updated accordingly (v0.8.56 needed 9 fix commits for exactly these sync points)	2026-06-09 23:43:15 -07:00
Hunter B	717d728163	feat: survive system sleep mid-turn — detect the suspend gap and retry the request (#2990 ) When the host sleeps while a model response is streaming, the connection dies on wake with 'Stream read error: error decoding response body' and the turn was lost. The engine now stamps every stream chunk with both monotonic and wall-clock time; Instant pauses across a suspend while SystemTime does not, so a >10s divergence on a stream error identifies a sleep/wake cycle. In that case the partial output is discarded and the identical request is re-issued (sharing the existing MAX_STREAM_RETRIES=3 budget) instead of failing the turn. Ordinary network flakes keep the deliberate no-retry-after-content policy from #103.	2026-06-09 23:40:42 -07:00
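The suspend detection in 717d728163 rests on comparing two clocks. A minimal sketch, assuming (as the commit does) that `Instant` pauses across a suspend on the host platform while `SystemTime` keeps running; the `ChunkStamp` type and function names are hypothetical.

```rust
use std::time::{Duration, Instant, SystemTime};

/// Divergence above this is treated as a sleep/wake cycle (per the commit).
const SUSPEND_GAP_THRESHOLD: Duration = Duration::from_secs(10);

/// Hypothetical per-chunk stamp carrying both clocks.
#[derive(Clone, Copy)]
struct ChunkStamp {
    monotonic: Instant,
    wall: SystemTime,
}

impl ChunkStamp {
    fn now() -> Self {
        Self { monotonic: Instant::now(), wall: SystemTime::now() }
    }
}

/// Pure comparison: wall-clock time ran ahead of monotonic time
/// by more than the threshold.
fn looks_like_suspend(monotonic_elapsed: Duration, wall_elapsed: Duration) -> bool {
    wall_elapsed.saturating_sub(monotonic_elapsed) > SUSPEND_GAP_THRESHOLD
}

/// Compare the last chunk's stamp with the stamp taken at the stream error.
fn suspended_since(last: ChunkStamp, now: ChunkStamp) -> bool {
    let mono = now.monotonic.duration_since(last.monotonic);
    // A wall clock that moved backwards is treated as no elapsed time.
    let wall = now.wall.duration_since(last.wall).unwrap_or(Duration::ZERO);
    looks_like_suspend(mono, wall)
}
```

A stream error where `suspended_since` returns true would take the retry path the commit describes; any other error would keep the existing no-retry-after-content policy.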
Hunter B	fad04a016f	polish: finish the rebrand in agent-facing surfaces - system prompt environment key deepseek_version -> codewhale_version - drop legacy .deepseek/instructions.md from the Local Law prompt tier (the engine still reads it for back-compat) - instructions-file truncation marker now states how many bytes were omitted so the model knows what it is missing - CODEWHALE_CHANGELOG const + user-facing /change strings - codewhale metrics doc headers	2026-06-09 23:36:32 -07:00
Hunter B	6dcdc19077	chore: drop unused deps (tracing-appender, zeroize, rustls in release), orphaned vendor lockfile and one-off verify_task.sh	2026-06-09 23:33:47 -07:00
Hunter B	44c13eb63f	fix(release): check-versions validates the generated TUI changelog slice, not byte equality The packaged changelog is now a recent-releases slice produced by scripts/sync-changelog.sh (which gains a --check mode); also restore the SECURITY.md contact line the version gate guards, and finish the stale binary-name sweep (--bin codewhale examples, qa harness doc).	2026-06-09 23:32:40 -07:00
Hunter B	626032ad6b	fix(devcontainer): use codewhale user/name/mount instead of pre-rebrand deepseek	2026-06-09 23:30:53 -07:00
Hunter B	26947bd407	fix(docker): stop copying legacy deepseek binaries that no longer exist The release docker job failed on 'cp: cannot stat .../release/deepseek'. Legacy entrypoints survive as symlinks to the codewhale binaries.	2026-06-09 23:30:45 -07:00
dependabot[bot]	ffc734289e	build(deps): bump actions/stale from 9 to 10 Bumps [actions/stale](https://github.com/actions/stale) from 9 to 10. - [Release notes](https://github.com/actions/stale/releases) - [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md) - [Commits](https://github.com/actions/stale/compare/v9...v10) --- updated-dependencies: - dependency-name: actions/stale dependency-version: '10' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com>	2026-06-10 06:27:27 +00:00
Hunter B	68aee8409f	chore: move HarmonyOS clang wrappers to scripts/ohos/	2026-06-09 23:24:13 -07:00
Hunter B	4a4ea63820	chore: add CODEOWNERS and dependabot config	2026-06-09 23:23:50 -07:00
Hunter B	a2bf9f806a	chore: gitignore benchmark results/ and __pycache__ under scripts/	2026-06-09 23:23:37 -07:00
Hunter B	6551106e79	docs: move internal design docs into docs/rfcs/	2026-06-09 23:23:25 -07:00
Hunter B	854274de1d	docs: remove internal US VM setup notes	2026-06-09 23:23:03 -07:00
Hunter B	bf2e5504a2	docs: remove stale internal docs (handoffs, old audits, orphaned roadmaps)	2026-06-09 23:22:32 -07:00
Hunter B	b4edb4e1ef	build: generate crates/tui/CHANGELOG.md as a 15-release slice instead of a full manual copy The /change command embeds this file into every binary via include_str!; it now carries only recent releases (regenerated by scripts/sync-changelog.sh, wired into the release checklist). The explicit-version test derives its fixture versions from the embedded slice instead of hardcoding old ones.	2026-06-09 23:22:00 -07:00
Hunter B	258d75376c	docs: archive changelog entries v0.8.39 and older to docs/CHANGELOG_ARCHIVE.md	2026-06-09 23:19:11 -07:00
Hunter B	bcaf655316	fix: bump npm wrapper version, add v0.8.56 compare link, resync TUI CHANGELOG	2026-06-09 22:35:50 -07:00
Hunter B	6a2ec32c00	fix: bump all internal crate dependency versions to 0.8.56	2026-06-09 22:30:27 -07:00
Hunter B	710b094c8a	fix: sync crates/tui/CHANGELOG.md with workspace root for v0.8.56 test gate	2026-06-09 22:26:24 -07:00
Hunter B	5783b05b0a	fix: update Cargo.lock after version bump to 0.8.56	2026-06-09 22:14:45 -07:00
Hunter B	ac56bd85b9	polish: drop deferred npm language from INSTALL.md, bump to v0.8.56	2026-06-09 22:12:41 -07:00
Hunter B	f1c73bda9b	polish: bump CNB install tags, remove deferred npm language, regen web facts to 0.8.56	2026-06-09 22:12:00 -07:00
gordonlu	19326f940d	fix: address review feedback - cfg(test), \u{26a0}, revert vi fanout, expand tests	2026-06-10 13:10:59 +08:00
gordonlu	f753f09e6a	feat(i18n): localize sandbox elevation dialog across 7 locales	2026-06-10 13:06:55 +08:00
Hunter B	a6feb79e50	chore: bump workspace version to 0.8.56	2026-06-09 22:06:05 -07:00
Hunter B	3218bb00da	release: v0.8.56 CHANGELOG, contributor credits, web version bumps 14 merged PRs across localization, providers, search, shell safety, prefix-cache stability, PDF fixes, paste handling, and bug fixes. Contributors: gordonlu, hongchen1993 (new), cyq1017, idling11, LeoAlex0, h3c-hexin, sximelon, HUQIANTAO, reidliu41. Reverted hotbar #2945 (render-only, no key dispatch wired up).	2026-06-09 22:05:02 -07:00

2803 Commits