# `codewhale remote-setup` — Design & Implementation Plan

Status: **design** (do not implement against the 0.8.48 release wrap; land on a
branch or after 0.8.48 ships). Author handoff doc; mirrors the style of
`REFACTOR_HANDOFF.md`.

## Goal

One command — `codewhale remote-setup` — that guides a user through standing up
a remote CodeWhale agent they can talk to from a phone chat app, across:

- **Cloud target:** Tencent Lighthouse **or** Azure (extensible to GCP/Hetzner/bare metal).
- **Chat bridge:** Feishu/Lark **or** Telegram (extensible to Slack/Discord).
- **Model provider:** any entry in the existing `PROVIDERS` registry
  (DeepSeek, OpenAI, NVIDIA NIM, Atlascloud, WanjieArk, OpenRouter, Novita,
  Fireworks, Moonshot, SGLang, vLLM, Ollama, Xiaomi).

Decisions locked with the user:
- **Form:** native Rust subcommand in-binary (touches `crates/cli` + `crates/tui`).
- **Scope:** generate the deploy bundle **and** optionally auto-provision via the
  cloud CLI (`az` / CNB), behind a confirmation gate.

## Prior art: Hermes Agent (reference only — do not copy)

`/Volumes/VIXinSSD/hermesagent` (Nous Research's Hermes Agent, Python) solves the
same problem and **validates this design**. Use it for ideas; keep CodeWhale's
style (Rust core, zero-dep Node bridges, plain-text replies).

- Its `gateway/platform_registry.py` is exactly the table-driven approach here: a
  `PlatformEntry { name, label, adapter_factory, check_fn, validate_config,
  required_env, install_hint, setup_fn, source }`. That maps 1:1 onto our
  `BridgeSpec`/`CloudTarget` rows, and its per-platform `setup_fn` + `required_env`
  are what our wizard reads to prompt. A single gateway process fans out to many
  platforms — the model we want.
- Its `gateway/pairing.py` mirrors our allowlist/first-pairing flow.

### Telegram hardening checklist (mined from `gateway/platforms/telegram.py`)

That adapter is battle-tested; its method names enumerate edge cases our MVP
bridge should handle. Status against `integrations/telegram-bridge`:

| Edge case | In Hermes | In our bridge |
|---|---|---|
| 409 polling conflict (two `getUpdates`) | `_looks_like_polling_conflict` | **done** — poll loop backs off 10s + warns |
| 429 `retry_after` | rate-limit handling | **done** — `telegramApi` honors `parameters.retry_after` |
| Forum "General topic = 1" send/typing asymmetry | `_message_thread_id_for_send` vs `_for_typing` | **done** — omit `message_thread_id` when id is 1 on send |
| "message to be replied not found" after restart | `_send_with_dm_topic_reply_anchor_retry` | **sidestepped** — we never set `reply_to_message_id` |
| Network/connect-timeout retry | `_looks_like_network_error` | partial — generic 3s backoff in poll loop |
| Text batching + progress-edit (edit one msg vs spam) | `test_telegram_text_batching` | **deferred** — we send a chunk every 15s |
| MarkdownV2 escaping + table rendering | `_escape_mdv2`, `_wrap_markdown_tables` | **deferred** — plain text (safe; tables look plain) |
| Webhook mode as an alternative to long polling | `_webhook_mode` | out of scope — long-poll only (no inbound ports) |

Deferred items are deliberate: progress-edit and MarkdownV2 add real UX polish
but also complexity and (for MDV2) a whole class of parser-escaping bugs. Revisit
after `remote-setup` lands.

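The "done" and "partial" rows of the checklist reduce to two small decisions: which thread id to attach on send, and how long to wait after a failed call. The sketch below restates them in Rust only to keep one language across this document; the real bridge is Node, and `ApiFailure`, `thread_id_for_send`, and `backoff_for` are hypothetical names, not functions in `integrations/telegram-bridge`.

```rust
use std::time::Duration;

/// Forum "General" topic id; per the checklist it is omitted on send.
const GENERAL_TOPIC_ID: i64 = 1;

/// Thread id to attach to an outgoing message, or `None` to omit the field.
pub fn thread_id_for_send(thread_id: Option<i64>) -> Option<i64> {
    thread_id.filter(|id| *id != GENERAL_TOPIC_ID)
}

/// Hypothetical classification of a failed Bot API call.
pub enum ApiFailure {
    /// HTTP 409: another `getUpdates` poller is active.
    PollingConflict,
    /// HTTP 429 carrying `parameters.retry_after` (seconds).
    RateLimited { retry_after_secs: u64 },
    /// Connect timeout or other transport error.
    Network,
}

/// Wait time before the next attempt, using the values from the table.
pub fn backoff_for(failure: &ApiFailure) -> Duration {
    match failure {
        ApiFailure::PollingConflict => Duration::from_secs(10),
        ApiFailure::RateLimited { retry_after_secs } => Duration::from_secs(*retry_after_secs),
        ApiFailure::Network => Duration::from_secs(3),
    }
}
```

Keeping the classification separate from the wait time would make it straightforward to replace the generic 3s network backoff later without touching the poll loop.
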
## Design principle: table-driven, like `ProviderSpec`

The provider registry (`crates/config/src/lib.rs::PROVIDERS`) is the model to
copy: "adding a provider is one row." Apply the same to clouds and bridges so
the matrix grows by data, not by new control flow.

```
CloudTarget   ×   BridgeSpec   +   ProviderSpec (existing registry)
───────────       ──────────       ────────────────────────────────
lighthouse        feishu           deepseek / openai / nvidia-nim / …
azure             telegram         (wizard reads PROVIDERS, prompts for
(future…)         (future…)         that provider's env_keys[0])
```

Clean separation that the architecture already implies:
- **Provider = a `runtime.env` concern.** The runtime resolves the provider from
  `CODEWHALE_PROVIDER` and the provider's own key var. The bridge never needs to
  know which provider is behind the runtime — it only forwards `model` to
  `/v1/threads`. So "multi-provider" only touches `runtime.env` generation.
- **Cloud = where it runs + where secrets live.**
- **Bridge = pure transport** between a chat app and `127.0.0.1:7878`.

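A minimal sketch of what "the matrix grows by data" could look like. The row types here are deliberately cut down to a slug and a display name, and `BridgeRow`, `CloudRow`, `find_bridge`, and `matrix` are illustrative names; only the `BRIDGES` and `CLOUD_TARGETS` table names come from this plan.

```rust
/// Cut-down stand-in for `BridgeSpec` (illustrative only).
#[derive(Debug, PartialEq)]
pub struct BridgeRow {
    pub slug: &'static str,
    pub display: &'static str,
}

/// Cut-down stand-in for `CloudTarget` (illustrative only).
#[derive(Debug, PartialEq)]
pub struct CloudRow {
    pub slug: &'static str,
    pub display: &'static str,
}

pub const BRIDGES: &[BridgeRow] = &[
    BridgeRow { slug: "feishu", display: "Feishu/Lark" },
    BridgeRow { slug: "telegram", display: "Telegram" },
];

pub const CLOUD_TARGETS: &[CloudRow] = &[
    CloudRow { slug: "lighthouse", display: "Tencent Lighthouse" },
    CloudRow { slug: "azure", display: "Azure VM" },
];

/// Look a bridge up by slug; unknown slugs are a lookup miss, not a new branch.
pub fn find_bridge(slug: &str) -> Option<&'static BridgeRow> {
    BRIDGES.iter().find(|b| b.slug == slug)
}

/// Every cloud × bridge pair the wizard can offer, derived purely from the tables.
pub fn matrix() -> Vec<(&'static str, &'static str)> {
    CLOUD_TARGETS
        .iter()
        .flat_map(|c| BRIDGES.iter().map(move |b| (c.slug, b.slug)))
        .collect()
}
```

With two rows in each table, `matrix()` yields four pairs; adding a third bridge row would raise that to six without any change to `matrix` or `find_bridge`.
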
## Command surface

New variant in `crates/cli/src/lib.rs` `Commands`:

```rust
/// Provision and configure a remote CodeWhale agent (cloud + chat bridge).
RemoteSetup(RemoteSetupArgs),
```

`RemoteSetupArgs` (clap):

| Flag | Meaning |
|---|---|
| `--cloud <azure\|lighthouse>` | Skip the cloud prompt. |
| `--bridge <telegram\|feishu>` | Skip the bridge prompt. |
| `--provider <slug>` | Provider slug; validated against `PROVIDERS`. |
| `--out <dir>` | Bundle output dir (default `./codewhale-deploy/<cloud>-<bridge>`). |
| `--generate-only` | Emit the bundle, do not provision (default). |
| `--apply` | Run the cloud CLI to actually provision (the auto-provision path). |
| `--yes` | Skip the final confirmation gate (CI/non-interactive). |
| `--non-interactive` | Fail instead of prompting if any required value is missing. |

CLI delegates to the TUI binary exactly like `Serve`/`Setup` do
(`delegate_to_tui(&cli, &resolved_runtime, tui_args("remote-setup", args))`).
The implementation lives next to `run_setup` in `crates/tui/src/`.

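Two of the flag rules above are easy to pin down in code: the default `--out` path and the generate-only/apply choice. The sketch below uses only the standard library rather than clap. `Mode`, `default_out_dir`, and `resolve_mode` are hypothetical names, and treating `--generate-only` together with `--apply` as an error is an assumption this plan does not state.

```rust
use std::path::PathBuf;

/// What the command does after rendering the bundle.
#[derive(Debug, PartialEq)]
pub enum Mode {
    GenerateOnly,
    Apply,
}

/// Default bundle directory: `./codewhale-deploy/<cloud>-<bridge>`.
pub fn default_out_dir(cloud: &str, bridge: &str) -> PathBuf {
    PathBuf::from("./codewhale-deploy").join(format!("{cloud}-{bridge}"))
}

/// Generate-only is the default; `--apply` must be explicit.
/// Assumption: passing both flags is rejected rather than silently resolved.
pub fn resolve_mode(generate_only: bool, apply: bool) -> Result<Mode, String> {
    match (generate_only, apply) {
        (true, true) => Err("--generate-only and --apply are mutually exclusive".to_string()),
        (_, true) => Ok(Mode::Apply),
        _ => Ok(Mode::GenerateOnly),
    }
}
```

In a clap-based implementation the same exclusivity would more likely be declared on the argument definitions than checked by hand.
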
## Code layout

New module `crates/tui/src/remote_setup/`:

```
remote_setup/
  mod.rs            # run_remote_setup(): wizard orchestration + dispatch
  registry.rs       # CloudTarget + BridgeSpec tables (the matrix)
  prompt.rs         # thin stdin prompt helpers (reuse existing patterns)
  bundle.rs         # render env files / systemd units / RUNBOOK.md to --out
  provision/
    mod.rs          # Provisioner trait + confirmation gate + dry-run printer
    azure.rs        # az preflight, RG, VM+cloud-init, Key Vault, NSG, start
    lighthouse.rs   # cnb.yml + tag_deploy.yml generation, CNB guidance
  templates/        # runtime.env, <bridge>.env, *.service, cloud-init.yaml.tmpl
```

### Registry types

```rust
pub struct BridgeSpec {
    pub slug: &'static str,         // "telegram"
    pub display: &'static str,      // "Telegram"
    pub package_dir: &'static str,  // "integrations/telegram-bridge"
    pub service_unit: &'static str, // "codewhale-telegram-bridge.service"
    pub env_template: &'static str, // templates/telegram.env
    /// Bridge-specific secret env keys to prompt for (token, etc.).
    pub secret_keys: &'static [&'static str], // ["TELEGRAM_BOT_TOKEN"]
    /// One-liner shown before prompting (e.g. "Create a bot with @BotFather").
    pub setup_hint: &'static str,
}

pub struct CloudTarget {
    pub slug: &'static str,        // "azure"
    pub display: &'static str,     // "Azure VM"
    pub secret_store: SecretStore, // KeyVault | EnvFile
    pub install: InstallMethod,    // Docker | NativeSystemd
    /// Builds the ordered list of provisioning steps as (description, command).
    /// Commands are returned as data so they can be dry-run printed, gated,
    /// and only then executed.
    pub plan: fn(&DeployInputs) -> Vec<ProvisionStep>,
}
```

A `ProvisionStep { description, program, args, secret_args }` is *data*, never a
shell string — so the confirmation gate can print every command, secrets are fed
via stdin/temp files (never argv/`history`), and `--apply` just executes the
already-printed plan.

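To make the "step as data" idea concrete, here is a minimal sketch of a step that can render itself for the dry-run printer. Only the four field names come from this plan; the field types, the choice to store secret *names* rather than values in `secret_args`, and the `dry_run_line` method are assumptions.

```rust
/// One provisioning command, held as data rather than as a shell string.
pub struct ProvisionStep {
    pub description: String,
    pub program: String,
    pub args: Vec<String>,
    /// Assumption: names of secrets delivered out of band (stdin or a temp
    /// file). The values themselves are never stored in the step.
    pub secret_args: Vec<String>,
}

impl ProvisionStep {
    /// Text shown at the confirmation gate: a comment line, then the command.
    /// Secrets appear only as redacted placeholders.
    pub fn dry_run_line(&self) -> String {
        let mut line = format!("# {}\n{}", self.description, self.program);
        for arg in &self.args {
            line.push(' ');
            line.push_str(arg);
        }
        for name in &self.secret_args {
            line.push_str(&format!(" <{name}: redacted>"));
        }
        line
    }
}
```

Because the printed line is derived from the same struct that `--apply` would execute, what the user confirms and what runs cannot drift apart.
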
## Wizard flow

1. **Cloud** — pick from `CLOUD_TARGETS` (or `--cloud`).
2. **Bridge** — pick from `BRIDGES` (or `--bridge`); print `setup_hint`.
3. **Provider** — list `PROVIDERS` (canonical names), pick (or `--provider`).
   Look up `spec.env_keys[0]` as the key var to prompt for.
4. **Secrets** — prompt for: provider API key, bridge token(s) from
   `secret_keys`, allowlist (chat ids). Generate a random `CODEWHALE_RUNTIME_TOKEN`.
5. **Mode** — generate-only vs `--apply`.
6. **Render bundle** to `--out` (always, even with `--apply`).
7. **Confirm + provision** (only if `--apply`): print the full ordered command
   plan, require `y` (unless `--yes`), then execute step by step with progress.
8. **Print RUNBOOK.md** path and the remaining manual steps.

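Step 7 is the only point where the wizard can cause side effects, so it is worth sketching on its own. The function below prints the plan, then either skips the gate (`--yes`) or reads one line and accepts only `y`. The name `confirm_plan`, the prompt wording, and the generic reader/writer parameters (used here so the gate can be exercised without a terminal) are assumptions.

```rust
use std::io::{self, BufRead, Write};

/// Print every planned command, then ask for confirmation.
/// Returns `Ok(true)` only if the gate is skipped or the user answers `y`.
pub fn confirm_plan<R: BufRead, W: Write>(
    plan: &[String],
    skip_gate: bool, // corresponds to `--yes`
    input: &mut R,
    out: &mut W,
) -> io::Result<bool> {
    // The plan is always printed, even when the gate is skipped.
    for (i, line) in plan.iter().enumerate() {
        writeln!(out, "{}. {}", i + 1, line)?;
    }
    if skip_gate {
        return Ok(true);
    }
    write!(out, "Run these commands? [y/N] ")?;
    out.flush()?;
    let mut answer = String::new();
    input.read_line(&mut answer)?;
    // Anything other than `y`/`Y`, including an empty line or EOF, declines.
    Ok(answer.trim().eq_ignore_ascii_case("y"))
}
```

In the real command the reader and writer would be locked stdin and stdout; defaulting to "no" on empty input matches the generate-only default elsewhere in this plan.
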
## Generated bundle

Written to `./codewhale-deploy/<cloud>-<bridge>/`:

- `runtime.env` — **provider config lives here**:
  ```
  CODEWHALE_PROVIDER=openai
  OPENAI_API_KEY=…            # the provider's own key var, from registry
  CODEWHALE_MODEL=auto
  CODEWHALE_RUNTIME_TOKEN=<random>
  CODEWHALE_RUNTIME_PORT=7878
  CODEWHALE_RUNTIME_WORKERS=2
  RUST_LOG=info
  ```
- `<bridge>.env` — transport only: `CODEWHALE_RUNTIME_URL=http://127.0.0.1:7878`,
  matching `CODEWHALE_RUNTIME_TOKEN`, allowlist, `TELEGRAM_BOT_TOKEN` (or Feishu
  app id/secret), `CODEWHALE_WORKSPACE`, `CODEWHALE_MODEL`.
- `codewhale-runtime.service`, `codewhale-<bridge>.service`.
- Cloud artifact: `cloud-init.yaml` + `provision.sh` (Azure) or `cnb.yml` +
  `tag_deploy.yml` (Lighthouse).
- `RUNBOOK.md` — the exact remaining commands + first-pairing steps.

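A minimal sketch of rendering the two env files from wizard inputs, mainly to show that the provider key var is data and that both files carry the same runtime token. `RuntimeEnv`, `render_runtime_env`, and `render_bridge_env` are hypothetical names, the comma-separated allowlist format is an assumption, and the bridge file is abbreviated (bot token, workspace, and model lines are left out).

```rust
/// Inputs for `runtime.env` (illustrative struct, not the real `DeployInputs`).
pub struct RuntimeEnv<'a> {
    pub provider: &'a str,      // e.g. "openai"
    pub key_var: &'a str,       // the provider's own key var, from the registry
    pub api_key: &'a str,
    pub runtime_token: &'a str, // generated once, shared with the bridge
}

/// Render `runtime.env` with the fixed defaults shown in the bundle listing.
pub fn render_runtime_env(e: &RuntimeEnv) -> String {
    format!(
        "CODEWHALE_PROVIDER={}\n{}={}\nCODEWHALE_MODEL=auto\n\
         CODEWHALE_RUNTIME_TOKEN={}\nCODEWHALE_RUNTIME_PORT=7878\n\
         CODEWHALE_RUNTIME_WORKERS=2\nRUST_LOG=info\n",
        e.provider, e.key_var, e.api_key, e.runtime_token
    )
}

/// Render the transport-only part of `<bridge>.env`.
/// Assumption: the allowlist is written as comma-separated chat ids.
pub fn render_bridge_env(runtime_token: &str, allowlist: &[i64]) -> String {
    let ids: Vec<String> = allowlist.iter().map(|id| id.to_string()).collect();
    format!(
        "CODEWHALE_RUNTIME_URL=http://127.0.0.1:7878\n\
         CODEWHALE_RUNTIME_TOKEN={}\nCODEWHALE_CHAT_ALLOWLIST={}\n",
        runtime_token,
        ids.join(",")
    )
}
```

Passing the same `runtime_token` to both renderers is what the `bundle.rs` test for "a matching runtime/bridge token" would check.
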
## Auto-provision

### Azure (`--apply --cloud azure`)
Preflight: `az account show` (fail with "run `az login`" if absent). Then the
`plan()` emits, in order:
1. `az group create` (region prompted; default `eastus`).
2. `az keyvault create` + `az keyvault secret set` for the provider key and the
   runtime token (secrets via stdin, not argv).
3. `az vm create` with `--custom-data cloud-init.yaml` and a **system-assigned
   managed identity**; cloud-init pulls `ghcr.io/hmbown/codewhale:latest`, reads
   the secrets from Key Vault via the identity, writes `/etc/codewhale/*.env`,
   installs both systemd units, and runs `systemctl enable --now` on them.
4. NSG: SSH (22) only, scoped to the caller's IP; **7878 stays on `127.0.0.1`**.
5. Print the SSH tunnel command for reaching `/status` from a laptop, if desired.

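The first three Azure steps can be sketched as a function that returns the ordered commands without running anything. `Step`, `step`, and `azure_plan` are illustrative names. The secret names (`provider-api-key`, `runtime-token`), the `Ubuntu2204` image alias, and feeding secrets through `--file /dev/stdin` are assumptions made for this sketch, not decisions recorded in this plan. The NSG rule and the tunnel hint are left out.

```rust
/// One `az` invocation held as data (illustrative, cut-down `ProvisionStep`).
pub struct Step {
    pub description: &'static str,
    pub program: &'static str,
    pub args: Vec<String>,
    /// Label of the secret piped on stdin, if any. The value never enters argv.
    pub stdin_secret: Option<&'static str>,
}

fn step(description: &'static str, args: &[&str], stdin_secret: Option<&'static str>) -> Step {
    Step {
        description,
        program: "az",
        args: args.iter().map(|a| a.to_string()).collect(),
        stdin_secret,
    }
}

/// Ordered plan for resource group, Key Vault, secrets, and VM creation.
pub fn azure_plan(rg: &str, region: &str, vault: &str, vm: &str) -> Vec<Step> {
    vec![
        step("Create resource group",
             &["group", "create", "--name", rg, "--location", region], None),
        step("Create Key Vault",
             &["keyvault", "create", "--name", vault, "--resource-group", rg,
               "--location", region], None),
        // Assumption: the secret value is read from stdin via `--file /dev/stdin`.
        step("Store provider key",
             &["keyvault", "secret", "set", "--vault-name", vault,
               "--name", "provider-api-key", "--file", "/dev/stdin"],
             Some("provider API key")),
        step("Store runtime token",
             &["keyvault", "secret", "set", "--vault-name", vault,
               "--name", "runtime-token", "--file", "/dev/stdin"],
             Some("runtime token")),
        // Assumption: image alias and SSH key generation are placeholders.
        step("Create VM with system-assigned identity",
             &["vm", "create", "--resource-group", rg, "--name", vm,
               "--image", "Ubuntu2204", "--custom-data", "cloud-init.yaml",
               "--assign-identity", "[system]", "--generate-ssh-keys"], None),
    ]
}
```

Since the function only builds values, a unit test can assert on program and arguments without an Azure subscription, which is the approach the Tests section calls for.
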
### Lighthouse (`--apply --cloud lighthouse`)
Reuse the existing `deploy/tencent-lighthouse/cnb/*.example` pipeline: render
`cnb.yml` + `tag_deploy.yml` from inputs and walk the user through the CNB
trigger (CNB does the VM-side work). Systemd units mirror the existing
`codewhale-runtime.service`.

Safety (matches the harness rules for outward-facing actions):
- Every command printed before execution; `y` gate unless `--yes`.
- Secrets never in argv or shell history.
- `--generate-only` is the default; `--apply` is explicit.

## Namespace migration: `DEEPSEEK_*` → `CODEWHALE_*`

Follow the convention already in `crates/config/src/lib.rs`: **read
`CODEWHALE_X` first, fall back to `DEEPSEEK_X`.** Nothing breaks for existing
deployments.

Touch list:
1. **Bridges** (`integrations/feishu-bridge`, `integrations/telegram-bridge`):
   in `lib.mjs`/`index.mjs`, read `process.env.CODEWHALE_X ?? process.env.DEEPSEEK_X`
   for `RUNTIME_URL`, `RUNTIME_TOKEN`, `WORKSPACE`, `MODEL`, `MODE`, `ALLOW_SHELL`,
   `TRUST_MODE`, `AUTO_APPROVE`, `CHAT_ALLOWLIST`, `ALLOW_UNLISTED`, `TURN_TIMEOUT_MS`.
   Validators accept either; templates emit `CODEWHALE_*`.
2. **Deploy units** (`deploy/tencent-lighthouse/systemd/*`,
   `integrations/*/deploy/*`): `DEEPSEEK_RUNTIME_*` → `CODEWHALE_RUNTIME_*`,
   env file paths `/etc/deepseek/` → `/etc/codewhale/` (keep reading the old path
   if present).
3. **`.env.example` files + `config.example.toml`**: lead with `CODEWHALE_*`,
   document `DEEPSEEK_*` as legacy aliases.
4. **Drop DeepSeek-shaped defaults** in the bridge: no hardcoded
   `DEEPSEEK_MODEL=auto`; the provider lives in `runtime.env` via
   `CODEWHALE_PROVIDER` + the registry's key var.

Note: items 1–3 touch **tracked** files, so they are part of the same
"don't ship during 0.8.48" hold. The brand-new (untracked) Telegram bridge can
be converted to `CODEWHALE_*` first as the reference implementation.

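The "new name first, legacy name second" rule is small enough to sketch in full. The version below takes the lookup as a closure so it can be tested without touching the process environment; `resolve_env` and `resolve_from_process` are hypothetical names and are not claimed to match the helper in `crates/config/src/lib.rs`.

```rust
/// Resolve `CODEWHALE_<suffix>`, falling back to `DEEPSEEK_<suffix>`.
/// Like `??` in the bridges, the fallback applies only when the new name is
/// unset; a new-name value that is set but empty is returned as-is.
pub fn resolve_env<F>(suffix: &str, lookup: F) -> Option<String>
where
    F: Fn(&str) -> Option<String>,
{
    lookup(&format!("CODEWHALE_{suffix}")).or_else(|| lookup(&format!("DEEPSEEK_{suffix}")))
}

/// Convenience wrapper over the real process environment.
pub fn resolve_from_process(suffix: &str) -> Option<String> {
    resolve_env(suffix, |name| std::env::var(name).ok())
}
```

For example, `resolve_from_process("RUNTIME_URL")` would return the value of `CODEWHALE_RUNTIME_URL` when it is set, and otherwise whatever an existing deployment still has in `DEEPSEEK_RUNTIME_URL`.
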
## Tests

- `registry.rs`: every `CloudTarget`/`BridgeSpec` slug is unique; each bridge's
  `package_dir`/`service_unit`/`env_template` exists.
- `bundle.rs`: rendering a bundle for each cloud×bridge×provider triple produces
  files with `CODEWHALE_*` keys, a matching runtime/bridge token, and a non-empty
  RUNBOOK.
- `provision`: `plan()` returns the expected ordered steps; **commands are built
  but never executed** in tests (assert on program+args, secrets redacted).

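The slug-uniqueness check for `registry.rs` could be built on a helper like the one below, which returns the offending slugs so a failing test names them. `duplicate_slugs` is a hypothetical helper, and the slices in the usage note are stand-ins for slugs collected from the real tables.

```rust
use std::collections::HashSet;

/// Return every slug that appears more than once, in order of repetition.
pub fn duplicate_slugs<'a>(slugs: &[&'a str]) -> Vec<&'a str> {
    let mut seen = HashSet::new();
    slugs
        .iter()
        .copied()
        // `insert` returns false when the slug was already present.
        .filter(|slug| !seen.insert(*slug))
        .collect()
}
```

A registry test would then collect the slugs from `CLOUD_TARGETS` and `BRIDGES` and assert that `duplicate_slugs` returns an empty list for each.
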
## Extensibility check

- Add **GCP**: one `CloudTarget` row + a `provision/gcp.rs`, reusing the existing
  cloud-init template.
- Add **Slack**: one `BridgeSpec` row + `integrations/slack-bridge` + template.
No changes to the wizard control flow — it iterates the registries.

## Suggested sequencing (given the 0.8.48 freeze)

1. **Now (safe, untracked):** convert the new Telegram bridge to
   `CODEWHALE_* ?? DEEPSEEK_*`; finalize this design.
2. **Post-0.8.48, branch:** namespace migration on tracked bridges + deploy units.
3. **Then:** implement `remote-setup` (registry → bundle → Azure provisioner →
   Lighthouse provisioner), generate-only first, `--apply` second.