dgf1988/codewhale

Files

T

Hunter Bown 06612495fc chore(release): prep v0.8.51 — Arcee provider, cycle removal, UI fixes

Release-preparation checkpoint for v0.8.51 (workspace + npm bumped to 0.8.51).

Added:
- Arcee AI direct provider: [providers.arcee], ARCEE_API_KEY/BASE_URL/MODEL,
  CLI auth, provider + model picker, registry. Default direct-API model is
  trinity-large-thinking (reasoning, 262K ctx/out); preview + mini selectable.
  Cloudflare-WAF-safe opening turn (benign read-only tool surface, system-prompt
  payload splitting) and reasoning_content replay on tool-call turns.
- Expanded model catalog (qwen3.6 flash/plus/max-preview, Xiaomi MiMo v2.5
  chat/ASR/TTS); provider-aware model picker with per-provider saved models.

Changed:
- Auto-compaction is percentage- and model-aware
  (compaction_threshold_for_model_at_percent; default 80%; auto-enable for
  <=256K windows, opt-in for 1M models).
- Provider/gateway HTTP errors sanitized (HTML/WAF interstitials collapsed,
  401/403 split into authentication vs authorization).

Removed:
- The session cycle / checkpoint-restart system: /cycles, /cycle, /recall,
  recall_archive tool, cycle_manager, cycle-handoff prompt, sidebar cycle lines,
  EngineConfig.cycle / Event::CycleAdvanced / seam cycle thresholds.

Fixed:
- Orphaned assistant 'blue dot' role glyph on whitespace-only turns.
- Sidebar mouse-wheel scroll leaking into the transcript.
- Sidebar hover tooltip overlap + warning-orange styling.
- README Constitution description corrected to match prompts/base.md.
- Repaired release-blocking unit/integration tests after the refactors.

Preflight: cargo fmt clean, workspace builds, 3903 tui tests pass (1 known
flaky MCP SSE test under parallel load, passes in isolation).

2026-06-02 17:36:18 -07:00

13 KiB

Raw Blame History

`codewhale remote-setup` — Design & Implementation Plan

Status: design (do not implement against the 0.8.48 release wrap; land on a branch or after 0.8.48 ships). Author handoff doc, mirrors the style of REFACTOR_HANDOFF.md.

Goal

One command — codewhale remote-setup — that guides a user through standing up a remote CodeWhale agent they can talk to from a phone chat app, across:

Cloud target: Tencent Lighthouse or Azure (extensible to GCP/Hetzner/bare).
Chat bridge: Feishu/Lark or Telegram (extensible to Slack/Discord).
Model provider: any entry in the existing PROVIDERS registry (DeepSeek, OpenAI, NVIDIA NIM, Atlascloud, WanjieArk, OpenRouter, Novita, Fireworks, Moonshot, SGLang, vLLM, Ollama, Xiaomi).

Decisions locked with the user:

Form: native Rust subcommand in-binary (touches crates/cli + crates/tui).
Scope: generate the deploy bundle and optionally auto-provision via the cloud CLI (az / CNB), behind a confirmation gate.

Prior art: Hermes Agent (reference only — do not copy)

/Volumes/VIXinSSD/hermesagent (Nous Research's Hermes Agent, Python) solves the same problem and validates this design. Use it for ideas; keep CodeWhale's style (Rust core, zero-dep Node bridges, plain-text replies).

Its gateway/platform_registry.py is exactly the table-driven approach here: a PlatformEntry { name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, setup_fn, source }. That maps 1:1 onto our BridgeSpec/CloudTarget rows, and its per-platform setup_fn + required_env are what our wizard reads to prompt. A single gateway process fans out to many platforms — the model we want.
Its gateway/pairing.py mirrors our allowlist/first-pairing flow.

Telegram hardening checklist (mined from `gateway/platforms/telegram.py`)

That adapter is battle-tested; its method names enumerate edge cases our MVP bridge should handle. Status against integrations/telegram-bridge:

Edge case	In Hermes	In our bridge
409 polling conflict (two `getUpdates`)	`_looks_like_polling_conflict`	done — poll loop backs off 10s + warns
429 `retry_after`	rate-limit handling	done — `telegramApi` honors `parameters.retry_after`
Forum "General topic = 1" send/typing asymmetry	`_message_thread_id_for_send` vs `_for_typing`	done — omit `message_thread_id` when id is 1 on send
"message to be replied not found" after restart	`_send_with_dm_topic_reply_anchor_retry`	sidestepped — we never set `reply_to_message_id`
Network/connect-timeout retry	`_looks_like_network_error`	partial — generic 3s backoff in poll loop
Text batching + progress-edit (edit one msg vs spam)	`test_telegram_text_batching`	deferred — we send a chunk every 15s
MarkdownV2 escaping + table rendering	`_escape_mdv2`, `_wrap_markdown_tables`	deferred — plain text (safe; tables look plain)
Webhook mode as an alternative to long polling	`_webhook_mode`	out of scope — long-poll only (no inbound ports)

Deferred items are deliberate: progress-edit and MarkdownV2 add real UX polish but also complexity and (for MDV2) a whole class of parser-escaping bugs. Revisit after remote-setup lands.

Design principle: table-driven, like `ProviderSpec`

The provider registry (crates/config/src/lib.rs::PROVIDERS) is the model to copy: "adding a provider is one row." Apply the same to clouds and bridges so the matrix grows by data, not by new control flow.

        CloudTarget  ×  BridgeSpec   +  ProviderSpec (existing registry)
        ───────────     ──────────      ────────────────────────────────
        lighthouse      feishu          deepseek / openai / nvidia-nim / …
        azure           telegram        (wizard reads PROVIDERS, prompts for
        (future…)       (future…)        that provider's env_keys[0])

Clean separation that the architecture already implies:

Provider = a runtime.env concern. The runtime resolves the provider from CODEWHALE_PROVIDER and the provider's own key var. The bridge never needs to know which provider is behind the runtime — it only forwards model to /v1/threads. So "multi-provider" only touches runtime.env generation.
Cloud = where it runs + where secrets live.
Bridge = pure transport between a chat app and 127.0.0.1:7878.

Command surface

New variant in crates/cli/src/lib.rs Commands:

/// Provision and configure a remote CodeWhale agent (cloud + chat bridge).
RemoteSetup(RemoteSetupArgs),

RemoteSetupArgs (clap):

Flag	Meaning
`--cloud <azure\|lighthouse>`	Skip the cloud prompt.
`--bridge <telegram\|feishu>`	Skip the bridge prompt.
`--provider <slug>`	Provider slug; validated against `PROVIDERS`.
`--out <dir>`	Bundle output dir (default `./codewhale-deploy/<cloud>-<bridge>`).
`--generate-only`	Emit the bundle, do not provision (default).
`--apply`	Run the cloud CLI to actually provision (the auto-provision path).
`--yes`	Skip the final confirmation gate (CI/non-interactive).
`--non-interactive`	Fail instead of prompting if any required value is missing.

CLI delegates to the TUI binary exactly like Serve/Setup do (delegate_to_tui(&cli, &resolved_runtime, tui_args("remote-setup", args))). The implementation lives next to run_setup in crates/tui/src/.

Code layout

New module crates/tui/src/remote_setup/:

remote_setup/
  mod.rs          # run_remote_setup(): wizard orchestration + dispatch
  registry.rs     # CloudTarget + BridgeSpec tables (the matrix)
  prompt.rs       # thin stdin prompt helpers (reuse existing patterns)
  bundle.rs       # render env files / systemd units / RUNBOOK.md to --out
  provision/
    mod.rs        # Provisioner trait + confirmation gate + dry-run printer
    azure.rs      # az preflight, RG, VM+cloud-init, Key Vault, NSG, start
    lighthouse.rs # cnb.yml + tag_deploy.yml generation, CNB guidance
  templates/      # runtime.env, <bridge>.env, *.service, cloud-init.yaml.tmpl

Registry types

pub struct BridgeSpec {
    pub slug: &'static str,            // "telegram"
    pub display: &'static str,         // "Telegram"
    pub package_dir: &'static str,     // "integrations/telegram-bridge"
    pub service_unit: &'static str,    // "codewhale-telegram-bridge.service"
    pub env_template: &'static str,    // templates/telegram.env
    /// Bridge-specific secret env keys to prompt for (token, etc.).
    pub secret_keys: &'static [&'static str], // ["TELEGRAM_BOT_TOKEN"]
    /// One-liner shown before prompting (e.g. "Create a bot with @BotFather").
    pub setup_hint: &'static str,
}

pub struct CloudTarget {
    pub slug: &'static str,            // "azure"
    pub display: &'static str,         // "Azure VM"
    pub secret_store: SecretStore,     // KeyVault | EnvFile
    pub install: InstallMethod,        // Docker | NativeSystemd
    /// Builds the ordered list of provisioning steps as (description, command).
    /// Commands are returned as data so they can be dry-run printed, gated,
    /// and only then executed.
    pub plan: fn(&DeployInputs) -> Vec<ProvisionStep>,
}

A ProvisionStep { description, program, args, secret_args } is data, never a shell string — so the confirmation gate can print every command, secrets are fed via stdin/temp files (never argv/history), and --apply just executes the already-printed plan.

Wizard flow

Cloud — pick from CLOUD_TARGETS (or --cloud).
Bridge — pick from BRIDGES (or --bridge); print setup_hint.
Provider — list PROVIDERS (canonical names), pick (or --provider). Look up spec.env_keys[0] as the key var to prompt for.
Secrets — prompt for: provider API key, bridge token(s) from secret_keys, allowlist (chat ids). Generate a random CODEWHALE_RUNTIME_TOKEN.
Mode — generate-only vs --apply.
Render bundle to --out (always, even with --apply).
Confirm + provision (only if --apply): print the full ordered command plan, require y (unless --yes), then execute step by step with progress.
Print RUNBOOK.md path and the remaining manual steps.

Generated bundle

Written to ./codewhale-deploy/<cloud>-<bridge>/:

runtime.env — provider config lives here:

CODEWHALE_PROVIDER=openai
OPENAI_API_KEY=…              # the provider's own key var, from registry
CODEWHALE_MODEL=auto
CODEWHALE_RUNTIME_TOKEN=<random>
CODEWHALE_RUNTIME_PORT=7878
CODEWHALE_RUNTIME_WORKERS=2
RUST_LOG=info

<bridge>.env — transport only: CODEWHALE_RUNTIME_URL=http://127.0.0.1:7878, matching CODEWHALE_RUNTIME_TOKEN, allowlist, TELEGRAM_BOT_TOKEN (or Feishu app id/secret), CODEWHALE_WORKSPACE, CODEWHALE_MODEL.
codewhale-runtime.service, codewhale-<bridge>.service.
Cloud artifact: cloud-init.yaml + provision.sh (Azure) or cnb.yml + tag_deploy.yml (Lighthouse).
RUNBOOK.md — the exact remaining commands + first-pairing steps.

Auto-provision

Azure (`--apply --cloud azure`)

Preflight: az account show (fail with "run az login" if absent). Then the plan() emits, in order:

az group create (region prompted; default eastus).
az keyvault create + az keyvault secret set for the provider key and the runtime token (secrets via stdin, not argv).
az vm create with --custom-data cloud-init.yaml and a system-assigned managed identity; cloud-init pulls ghcr.io/hmbown/codewhale:latest, reads the secrets from Key Vault via the identity, writes /etc/codewhale/*.env, installs both systemd units, enable --now.
NSG: SSH (22) only, scoped to the caller's IP; 7878 stays on 127.0.0.1.
Print the SSH tunnel command for /status from a laptop if desired.

Lighthouse (`--apply --cloud lighthouse`)

Reuse the existing deploy/tencent-lighthouse/cnb/*.example pipeline: render cnb.yml + tag_deploy.yml from inputs and walk the user through the CNB trigger (CNB does the VM-side work). Systemd units mirror the existing codewhale-runtime.service.

Safety (matches the harness rules for outward-facing actions):

Every command printed before execution; y gate unless --yes.
Secrets never in argv or shell history.
--generate-only is the default; --apply is explicit.

Namespace migration: `DEEPSEEK_` → `CODEWHALE_`

Follow the convention already in crates/config/src/lib.rs: read CODEWHALE_X first, fall back to DEEPSEEK_X. Nothing breaks for existing deployments.

Touch list:

Bridges (integrations/feishu-bridge, integrations/telegram-bridge): in lib.mjs/index.mjs, read process.env.CODEWHALE_X ?? process.env.DEEPSEEK_X for RUNTIME_URL, RUNTIME_TOKEN, WORKSPACE, MODEL, MODE, ALLOW_SHELL, TRUST_MODE, AUTO_APPROVE, CHAT_ALLOWLIST, ALLOW_UNLISTED, TURN_TIMEOUT_MS. Validators accept either; templates emit CODEWHALE_*.
Deploy units (deploy/tencent-lighthouse/systemd/*, integrations/*/deploy/*): DEEPSEEK_RUNTIME_* → CODEWHALE_RUNTIME_*, env file paths /etc/deepseek/ → /etc/codewhale/ (keep reading the old path if present).
.env.example files + config.example.toml: lead with CODEWHALE_*, document DEEPSEEK_* as legacy aliases.
Drop DeepSeek-shaped defaults in the bridge: no hardcoded DEEPSEEK_MODEL=auto; the provider lives in runtime.env via CODEWHALE_PROVIDER + the registry's key var.

Note: items 1–3 touch tracked files, so they are part of the same "don't ship during 0.8.48" hold. The brand-new (untracked) Telegram bridge can be converted to CODEWHALE_* first as the reference implementation.

Tests

registry.rs: every CloudTarget/BridgeSpec slug is unique; each bridge's package_dir/service_unit/env_template exists.
bundle.rs: rendering a bundle for each cloud×bridge×provider triple produces files with CODEWHALE_* keys, a matching runtime/bridge token, and a non-empty RUNBOOK.
provision: plan() returns the expected ordered steps; commands are built but never executed in tests (assert on program+args, secrets redacted).

Extensibility check

Add GCP: one CloudTarget row + a provision/gcp.rs + a cloud-init reuse.
Add Slack: one BridgeSpec row + integrations/slack-bridge + template. No changes to the wizard control flow — it iterates the registries.

Suggested sequencing (given the 0.8.48 freeze)

Now (safe, untracked): convert the new Telegram bridge to CODEWHALE_* ?? DEEPSEEK_*; finalize this design.
Post-0.8.48, branch: namespace migration on tracked bridges + deploy units.
Then: implement remote-setup (registry → bundle → Azure provisioner → Lighthouse provisioner), generate-only first, --apply second.

13 KiB Raw Blame History Unescape Escape

codewhale remote-setup — Design & Implementation Plan