Release-preparation checkpoint for v0.8.51 (workspace + npm bumped to 0.8.51). Added: - Arcee AI direct provider: [providers.arcee], ARCEE_API_KEY/BASE_URL/MODEL, CLI auth, provider + model picker, registry. Default direct-API model is trinity-large-thinking (reasoning, 262K ctx/out); preview + mini selectable. Cloudflare-WAF-safe opening turn (benign read-only tool surface, system-prompt payload splitting) and reasoning_content replay on tool-call turns. - Expanded model catalog (qwen3.6 flash/plus/max-preview, Xiaomi MiMo v2.5 chat/ASR/TTS); provider-aware model picker with per-provider saved models. Changed: - Auto-compaction is percentage- and model-aware (compaction_threshold_for_model_at_percent; default 80%; auto-enable for <=256K windows, opt-in for 1M models). - Provider/gateway HTTP errors sanitized (HTML/WAF interstitials collapsed, 401/403 split into authentication vs authorization). Removed: - The session cycle / checkpoint-restart system: /cycles, /cycle, /recall, recall_archive tool, cycle_manager, cycle-handoff prompt, sidebar cycle lines, EngineConfig.cycle / Event::CycleAdvanced / seam cycle thresholds. Fixed: - Orphaned assistant 'blue dot' role glyph on whitespace-only turns. - Sidebar mouse-wheel scroll leaking into the transcript. - Sidebar hover tooltip overlap + warning-orange styling. - README Constitution description corrected to match prompts/base.md. - Repaired release-blocking unit/integration tests after the refactors. Preflight: cargo fmt clean, workspace builds, 3903 tui tests pass (1 known flaky MCP SSE test under parallel load, passes in isolation).
13 KiB
codewhale remote-setup — Design & Implementation Plan
Status: design (do not implement against the 0.8.48 release wrap; land on a
branch or after 0.8.48 ships). Author handoff doc, mirrors the style of
REFACTOR_HANDOFF.md.
Goal
One command — codewhale remote-setup — that guides a user through standing up
a remote CodeWhale agent they can talk to from a phone chat app, across:
- Cloud target: Tencent Lighthouse or Azure (extensible to GCP/Hetzner/bare).
- Chat bridge: Feishu/Lark or Telegram (extensible to Slack/Discord).
- Model provider: any entry in the existing
PROVIDERSregistry (DeepSeek, OpenAI, NVIDIA NIM, Atlascloud, WanjieArk, OpenRouter, Novita, Fireworks, Moonshot, SGLang, vLLM, Ollama, Xiaomi).
Decisions locked with the user:
- Form: native Rust subcommand in-binary (touches
crates/cli+crates/tui). - Scope: generate the deploy bundle and optionally auto-provision via the
cloud CLI (
az/ CNB), behind a confirmation gate.
Prior art: Hermes Agent (reference only — do not copy)
/Volumes/VIXinSSD/hermesagent (Nous Research's Hermes Agent, Python) solves the
same problem and validates this design. Use it for ideas; keep CodeWhale's
style (Rust core, zero-dep Node bridges, plain-text replies).
- Its
gateway/platform_registry.pyis exactly the table-driven approach here: aPlatformEntry { name, label, adapter_factory, check_fn, validate_config, required_env, install_hint, setup_fn, source }. That maps 1:1 onto ourBridgeSpec/CloudTargetrows, and its per-platformsetup_fn+required_envare what our wizard reads to prompt. A single gateway process fans out to many platforms — the model we want. - Its
gateway/pairing.pymirrors our allowlist/first-pairing flow.
Telegram hardening checklist (mined from gateway/platforms/telegram.py)
That adapter is battle-tested; its method names enumerate edge cases our MVP
bridge should handle. Status against integrations/telegram-bridge:
| Edge case | In Hermes | In our bridge |
|---|---|---|
409 polling conflict (two getUpdates) |
_looks_like_polling_conflict |
done — poll loop backs off 10s + warns |
429 retry_after |
rate-limit handling | done — telegramApi honors parameters.retry_after |
| Forum "General topic = 1" send/typing asymmetry | _message_thread_id_for_send vs _for_typing |
done — omit message_thread_id when id is 1 on send |
| "message to be replied not found" after restart | _send_with_dm_topic_reply_anchor_retry |
sidestepped — we never set reply_to_message_id |
| Network/connect-timeout retry | _looks_like_network_error |
partial — generic 3s backoff in poll loop |
| Text batching + progress-edit (edit one msg vs spam) | test_telegram_text_batching |
deferred — we send a chunk every 15s |
| MarkdownV2 escaping + table rendering | _escape_mdv2, _wrap_markdown_tables |
deferred — plain text (safe; tables look plain) |
| Webhook mode as an alternative to long polling | _webhook_mode |
out of scope — long-poll only (no inbound ports) |
Deferred items are deliberate: progress-edit and MarkdownV2 add real UX polish
but also complexity and (for MDV2) a whole class of parser-escaping bugs. Revisit
after remote-setup lands.
Design principle: table-driven, like ProviderSpec
The provider registry (crates/config/src/lib.rs::PROVIDERS) is the model to
copy: "adding a provider is one row." Apply the same to clouds and bridges so
the matrix grows by data, not by new control flow.
CloudTarget × BridgeSpec + ProviderSpec (existing registry)
─────────── ────────── ────────────────────────────────
lighthouse feishu deepseek / openai / nvidia-nim / …
azure telegram (wizard reads PROVIDERS, prompts for
(future…) (future…) that provider's env_keys[0])
Clean separation that the architecture already implies:
- Provider = a
runtime.envconcern. The runtime resolves the provider fromCODEWHALE_PROVIDERand the provider's own key var. The bridge never needs to know which provider is behind the runtime — it only forwardsmodelto/v1/threads. So "multi-provider" only touchesruntime.envgeneration. - Cloud = where it runs + where secrets live.
- Bridge = pure transport between a chat app and
127.0.0.1:7878.
Command surface
New variant in crates/cli/src/lib.rs Commands:
/// Provision and configure a remote CodeWhale agent (cloud + chat bridge).
RemoteSetup(RemoteSetupArgs),
RemoteSetupArgs (clap):
| Flag | Meaning |
|---|---|
--cloud <azure|lighthouse> |
Skip the cloud prompt. |
--bridge <telegram|feishu> |
Skip the bridge prompt. |
--provider <slug> |
Provider slug; validated against PROVIDERS. |
--out <dir> |
Bundle output dir (default ./codewhale-deploy/<cloud>-<bridge>). |
--generate-only |
Emit the bundle, do not provision (default). |
--apply |
Run the cloud CLI to actually provision (the auto-provision path). |
--yes |
Skip the final confirmation gate (CI/non-interactive). |
--non-interactive |
Fail instead of prompting if any required value is missing. |
CLI delegates to the TUI binary exactly like Serve/Setup do
(delegate_to_tui(&cli, &resolved_runtime, tui_args("remote-setup", args))).
The implementation lives next to run_setup in crates/tui/src/.
Code layout
New module crates/tui/src/remote_setup/:
remote_setup/
mod.rs # run_remote_setup(): wizard orchestration + dispatch
registry.rs # CloudTarget + BridgeSpec tables (the matrix)
prompt.rs # thin stdin prompt helpers (reuse existing patterns)
bundle.rs # render env files / systemd units / RUNBOOK.md to --out
provision/
mod.rs # Provisioner trait + confirmation gate + dry-run printer
azure.rs # az preflight, RG, VM+cloud-init, Key Vault, NSG, start
lighthouse.rs # cnb.yml + tag_deploy.yml generation, CNB guidance
templates/ # runtime.env, <bridge>.env, *.service, cloud-init.yaml.tmpl
Registry types
pub struct BridgeSpec {
pub slug: &'static str, // "telegram"
pub display: &'static str, // "Telegram"
pub package_dir: &'static str, // "integrations/telegram-bridge"
pub service_unit: &'static str, // "codewhale-telegram-bridge.service"
pub env_template: &'static str, // templates/telegram.env
/// Bridge-specific secret env keys to prompt for (token, etc.).
pub secret_keys: &'static [&'static str], // ["TELEGRAM_BOT_TOKEN"]
/// One-liner shown before prompting (e.g. "Create a bot with @BotFather").
pub setup_hint: &'static str,
}
pub struct CloudTarget {
pub slug: &'static str, // "azure"
pub display: &'static str, // "Azure VM"
pub secret_store: SecretStore, // KeyVault | EnvFile
pub install: InstallMethod, // Docker | NativeSystemd
/// Builds the ordered list of provisioning steps as (description, command).
/// Commands are returned as data so they can be dry-run printed, gated,
/// and only then executed.
pub plan: fn(&DeployInputs) -> Vec<ProvisionStep>,
}
A ProvisionStep { description, program, args, secret_args } is data, never a
shell string — so the confirmation gate can print every command, secrets are fed
via stdin/temp files (never argv/history), and --apply just executes the
already-printed plan.
Wizard flow
- Cloud — pick from
CLOUD_TARGETS(or--cloud). - Bridge — pick from
BRIDGES(or--bridge); printsetup_hint. - Provider — list
PROVIDERS(canonical names), pick (or--provider). Look upspec.env_keys[0]as the key var to prompt for. - Secrets — prompt for: provider API key, bridge token(s) from
secret_keys, allowlist (chat ids). Generate a randomCODEWHALE_RUNTIME_TOKEN. - Mode — generate-only vs
--apply. - Render bundle to
--out(always, even with--apply). - Confirm + provision (only if
--apply): print the full ordered command plan, requirey(unless--yes), then execute step by step with progress. - Print RUNBOOK.md path and the remaining manual steps.
Generated bundle
Written to ./codewhale-deploy/<cloud>-<bridge>/:
runtime.env— provider config lives here:CODEWHALE_PROVIDER=openai OPENAI_API_KEY=… # the provider's own key var, from registry CODEWHALE_MODEL=auto CODEWHALE_RUNTIME_TOKEN=<random> CODEWHALE_RUNTIME_PORT=7878 CODEWHALE_RUNTIME_WORKERS=2 RUST_LOG=info<bridge>.env— transport only:CODEWHALE_RUNTIME_URL=http://127.0.0.1:7878, matchingCODEWHALE_RUNTIME_TOKEN, allowlist,TELEGRAM_BOT_TOKEN(or Feishu app id/secret),CODEWHALE_WORKSPACE,CODEWHALE_MODEL.codewhale-runtime.service,codewhale-<bridge>.service.- Cloud artifact:
cloud-init.yaml+provision.sh(Azure) orcnb.yml+tag_deploy.yml(Lighthouse). RUNBOOK.md— the exact remaining commands + first-pairing steps.
Auto-provision
Azure (--apply --cloud azure)
Preflight: az account show (fail with "run az login" if absent). Then the
plan() emits, in order:
az group create(region prompted; defaulteastus).az keyvault create+az keyvault secret setfor the provider key and the runtime token (secrets via stdin, not argv).az vm createwith--custom-data cloud-init.yamland a system-assigned managed identity; cloud-init pullsghcr.io/hmbown/codewhale:latest, reads the secrets from Key Vault via the identity, writes/etc/codewhale/*.env, installs both systemd units,enable --now.- NSG: SSH (22) only, scoped to the caller's IP; 7878 stays on
127.0.0.1. - Print the SSH tunnel command for
/statusfrom a laptop if desired.
Lighthouse (--apply --cloud lighthouse)
Reuse the existing deploy/tencent-lighthouse/cnb/*.example pipeline: render
cnb.yml + tag_deploy.yml from inputs and walk the user through the CNB
trigger (CNB does the VM-side work). Systemd units mirror the existing
codewhale-runtime.service.
Safety (matches the harness rules for outward-facing actions):
- Every command printed before execution;
ygate unless--yes. - Secrets never in argv or shell history.
--generate-onlyis the default;--applyis explicit.
Namespace migration: DEEPSEEK_* → CODEWHALE_*
Follow the convention already in crates/config/src/lib.rs: read
CODEWHALE_X first, fall back to DEEPSEEK_X. Nothing breaks for existing
deployments.
Touch list:
- Bridges (
integrations/feishu-bridge,integrations/telegram-bridge): inlib.mjs/index.mjs, readprocess.env.CODEWHALE_X ?? process.env.DEEPSEEK_XforRUNTIME_URL,RUNTIME_TOKEN,WORKSPACE,MODEL,MODE,ALLOW_SHELL,TRUST_MODE,AUTO_APPROVE,CHAT_ALLOWLIST,ALLOW_UNLISTED,TURN_TIMEOUT_MS. Validators accept either; templates emitCODEWHALE_*. - Deploy units (
deploy/tencent-lighthouse/systemd/*,integrations/*/deploy/*):DEEPSEEK_RUNTIME_*→CODEWHALE_RUNTIME_*, env file paths/etc/deepseek/→/etc/codewhale/(keep reading the old path if present). .env.examplefiles +config.example.toml: lead withCODEWHALE_*, documentDEEPSEEK_*as legacy aliases.- Drop DeepSeek-shaped defaults in the bridge: no hardcoded
DEEPSEEK_MODEL=auto; the provider lives inruntime.envviaCODEWHALE_PROVIDER+ the registry's key var.
Note: items 1–3 touch tracked files, so they are part of the same
"don't ship during 0.8.48" hold. The brand-new (untracked) Telegram bridge can
be converted to CODEWHALE_* first as the reference implementation.
Tests
registry.rs: everyCloudTarget/BridgeSpecslug is unique; each bridge'spackage_dir/service_unit/env_templateexists.bundle.rs: rendering a bundle for each cloud×bridge×provider triple produces files withCODEWHALE_*keys, a matching runtime/bridge token, and a non-empty RUNBOOK.provision:plan()returns the expected ordered steps; commands are built but never executed in tests (assert on program+args, secrets redacted).
Extensibility check
- Add GCP: one
CloudTargetrow + aprovision/gcp.rs+ a cloud-init reuse. - Add Slack: one
BridgeSpecrow +integrations/slack-bridge+ template. No changes to the wizard control flow — it iterates the registries.
Suggested sequencing (given the 0.8.48 freeze)
- Now (safe, untracked): convert the new Telegram bridge to
CODEWHALE_* ?? DEEPSEEK_*; finalize this design. - Post-0.8.48, branch: namespace migration on tracked bridges + deploy units.
- Then: implement
remote-setup(registry → bundle → Azure provisioner → Lighthouse provisioner), generate-only first,--applysecond.