From bf2e5504a2867a7f30f8bf90e51ec296a50b5ee7 Mon Sep 17 00:00:00 2001
From: Hunter B
Date: Tue, 9 Jun 2026 23:22:32 -0700
Subject: [PATCH] docs: remove stale internal docs (handoffs, old audits, orphaned roadmaps)

---
 HANDOFF_v0.8.51.md                         | 140 ---------------------
 docs/LEGACY_RUST_AUDIT_0_7_6.md            |  37 ------
 docs/MIMO_BENCHMARK_ISSUES.md              |  89 -------------
 docs/REVIEW_PIPELINE.md                    |  84 -------------
 docs/RLM_BRANCHING_ROADMAP.md              |  92 --------------
 docs/archive/V0_7_5_IMPLEMENTATION_PLAN.md |  61 ---------
 6 files changed, 503 deletions(-)
 delete mode 100644 HANDOFF_v0.8.51.md
 delete mode 100644 docs/LEGACY_RUST_AUDIT_0_7_6.md
 delete mode 100644 docs/MIMO_BENCHMARK_ISSUES.md
 delete mode 100644 docs/REVIEW_PIPELINE.md
 delete mode 100644 docs/RLM_BRANCHING_ROADMAP.md
 delete mode 100644 docs/archive/V0_7_5_IMPLEMENTATION_PLAN.md

diff --git a/HANDOFF_v0.8.51.md b/HANDOFF_v0.8.51.md
deleted file mode 100644
index 175485e4..00000000
--- a/HANDOFF_v0.8.51.md
+++ /dev/null
@@ -1,140 +0,0 @@
-# v0.8.51 Release Handoff — 2026-06-02
-
-## Workspace
-
-```
-/Volumes/VIXinSSD/codewhale
-Branch: codex/v0.8.51-arcee-provider (12 commits ahead of origin/main)
-```
-
-## What's Already Landed (committed, 12 commits on branch)
-
-| Commit | What |
-|--------|------|
-| `e54a0a500` | feat(provider): add direct Arcee support |
-| `99da87ca1` | fix(cli): wire arcee provider auth |
-| `8eca75763` | test(tui): cover arcee provider picker entry |
-| `06612495f` | chore(release): prep v0.8.51 — version bump, CHANGELOG |
-| `fd69f4c80` | fix(tui): strip DEC private mode CSI fragments (#2592) |
-| `5249723e1` | fix(engine): recover from turn panics (#2583, #1269) |
-| `478bae451` | fix(tui): find deeply nested files via @/Ctrl+P (#2488) |
-| `e95f759cd` | fix(tui): command-palette scroll visibility (#2590) |
-| `cccc5ed55` | fix(shell): .NET/NuGet + Windows env (#1857) |
-| `7aa73fad5` | fix(config): warn on misplaced shell/sandbox keys (#2589) |
-| `a7d482067` | fix(clippy): clear -D warnings (#2599) |
-| `79d78878b` | test(mcp): deterministic SSE reconnect (#2597) |
-| `f886f28ac` | test(tui): update walk-depth test for new default depth |
-
-**Claude branch `origin/claude/busy-mayer-b66rA` is identical to our HEAD** — same commit `f886f28ac`. Nothing to merge; already up to date.
-
-## What's Applied in Working Tree (NOT committed)
-
-4 files modified (88 insertions, 25 deletions):
-
-| File | Change | Credits |
-|------|--------|---------|
-| `crates/tui/src/tui/diff_render.rs` | `wrap_text` preserves leading whitespace; fixes extra-space bug in PR #2591; adds 2 regression tests | @zlh124 (PR #2591, fix version from working tree) |
-| `crates/tui/src/schema_migration.rs` | `#[allow(dead_code)]` on `pub mod registry` | @reidliu41 (PR #2601) |
-| `crates/tui/src/prompts/base.md` | Tool desc: prefer `gh --json` CLI for GitHub triage | — |
-| `crates/tui/src/prompts/base.txt` | Same prompt update for text variant | — |
-
-## Ready to Apply (patch saved, NOT yet applied)
-
-The model persistence patch is at `/tmp/model_persist.patch` (240 lines, 2 files):
-
-- `crates/tui/src/commands/core.rs`: `/model` command remembers per-provider model selection + persist warning
-- `crates/tui/src/settings.rs`: new `set_provider_model_selection()` and `persist_provider_model_selection()` methods
-
-**Apply it:**
-```bash
-cd /Volumes/VIXinSSD/codewhale
-git apply /tmp/model_persist.patch
-```
-
-This is small, self-contained, and directly improves UX for the new Arcee provider (model choice remembered across restarts). No dependency on the deferred image-attachment work.
-
-## Deferred (in stash, do NOT apply for v0.8.51)
-
-The stash (`stash@{0}`) contains:
-
-- **Image attachment** (#2584/#2587): `ContentBlock::ImageUrl`, multimodal chat requests, base64 encoding, + exhaustive match arms across 15 files. Deferred by Hmbown — changes the request shape, needs multimodal endpoint testing.
-- **GitHub structured route** (`fetch_url.rs`): new feature — routes GitHub issue/PR URLs through `gh` CLI. Too broad for v0.8.51.
-- **Config custom model changes** (`commands/config.rs`): `normalize_custom_model_id` etc. Need review.
-
-To view: `git stash show -p stash@{0}`
-
-## What Remains
-
-```bash
-cd /Volumes/VIXinSSD/codewhale
-
-# 1. Apply the model persistence patch
-git apply /tmp/model_persist.patch
-
-# 2. Commit all working-tree changes as one harvest commit
-git add -A
-git commit -m "harvest(v0.8.51): diff-render whitespace fix + schema dead_code + model persistence + prompt updates
-
-- fix(diff-render): preserve leading whitespace in patch content lines
-  Credit: @zlh124 (PR #2591), with extra-space bug fixed.
-- fix(tui): allow unused schema migration registry
-  Credit: @reidliu41 (PR #2601).
-- feat(tui): persist per-provider model selection from /model command
-- docs(prompts): prefer gh --json CLI for GitHub triage in agent instructions"
-
-# 3. Run release gates
-cargo fmt --all -- --check
-cargo clippy --workspace --all-targets --all-features --locked -- -D warnings
-cargo test -p codewhale-tui -- --test-threads=4
-
-# 4. If gates pass, rebuild
-cargo install --path crates/tui --locked --force
-codewhale --version
-codewhale-tui --version
-
-# 5. Final checks
-git diff origin/main --stat
-gh issue view 2600 --repo Hmbown/CodeWhale # re-read release checklist
-```
-
-## Release Checklist Status (issue #2600)
-
-- [x] Arcee provider landed + tested
-- [x] Cycle/checkpoint-restart system removed
-- [x] Auto-compaction percentage/model-aware
-- [x] Provider/gateway HTTP error sanitization
-- [x] TUI fixes (blue dot, sidebar scroll, tooltip)
-- [x] CHANGELOG + version bump
-- [x] Clippy clean (#2599)
-- [x] MCP SSE test deterministic (#2597)
-- [ ] Full `cargo test --workspace` green — codewhale-tui validated; 1 environment-only Landlock test may fail on macOS
-- [ ] `npm test` in `npm/codewhale`
-- [ ] Harvest commits applied + re-gated
-- [ ] Merge branch to `main`
-- [ ] Tag `v0.8.51`, push tag
-- [ ] Publish GitHub release + `npm publish`
-
-## Contributors to Credit
-
-| Contributor | Contribution | PR/Issue |
-|-------------|-------------|----------|
-| @zlh124 (jayzhu) | diff-render whitespace preservation | #2591 |
-| @reidliu41 (Reid) | schema migration dead_code allow | #2601 |
-| @xyuai | Image attachment root cause + initial PR | #2587, #2584 |
-| @IcedOranges | Image attachment bug report | #2584 |
-| @idling11 (Hanmiao Li) | Sidebar resize feature request | #2602 |
-| @gordonlu (Gordon) | Engine death recovery | #2585 |
-| @cyq1017 | File picker depth fix draft | #2593 |
-
-## Risks / Notes
-
-1. **Working tree was stashed** — the image attachment feature and GitHub structured route are deferred for v0.8.52+. The model persistence patch is the only remaining piece worth landing.
-2. **DeepSeek naming**: The branch and committed code use "CodeWhale" naming throughout. Do not imply DeepSeek is deprecated.
-3. **The `origin/claude/busy-mayer-b66rA` branch is identical to HEAD** — the Claude Code session in #2600 claimed "+8 commits" but those are the same commits already on this branch. Verify with `git rev-parse HEAD origin/claude/busy-mayer-b66rA`.
-4. **Landlock test**: `sandbox::tests::test_parity_linux_landlock_available` will fail on macOS (no Landlock LSM). This is environment-only, not a regression. On CI Linux runners it passes.
-5. **Cross-platform artifacts**: The release workflow builds macOS + Windows + NSIS installer on tag push. Not buildable locally on macOS alone.
-
----
-
-Generated by deepseek-v4-pro in CodeWhale v0.8.51 pre-release triage.
-Next session: read this file, apply `/tmp/model_persist.patch`, run gates, commit, and prepare the merge.
diff --git a/docs/LEGACY_RUST_AUDIT_0_7_6.md b/docs/LEGACY_RUST_AUDIT_0_7_6.md
deleted file mode 100644
index 2010ef67..00000000
--- a/docs/LEGACY_RUST_AUDIT_0_7_6.md
+++ /dev/null
@@ -1,37 +0,0 @@
-# v0.7.6 Legacy Rust Audit
-
-Status date: 2026-04-29
-
-This audit is deliberately non-destructive. No compatibility code is removed in v0.7.6 unless tests prove public CLI, saved-session, tool-schema, and documented command paths no longer depend on it.
-
-## Summary
-
-| Surface | Owner module | Current consumer | Reference check | Compatibility reason | Current warning | Recommended action |
-|---|---|---|---|---|---|---|
-| Legacy MCP sync API (`McpServerInput`, `list`, `add`, `remove`, `call_tool`, `load_legacy`) | `crates/tui/src/mcp.rs` | Not wired into current `/mcp` command path; retained behind `#[allow(dead_code)]` | Direct Rust references and current MCP command path inspected; saved/config JSON compatibility still needs a dedicated smoke | Preserves old JSON shape including `mcpServers` alias and sync call helpers while the async MCP manager is the active path | Code TODO only | Gate behind an explicit legacy module or remove after CLI/runtime parity tests prove no caller uses it. Tracked by #218. |
-| Legacy prompt constants/functions (`AGENT_PROMPT`, `YOLO_PROMPT`, `PLAN_PROMPT`, `base_system_prompt`, `normal_system_prompt`, etc.) | `crates/tui/src/prompts.rs` | Tests and older callers that still import prompt constants directly | Direct Rust references remain; public-crate and older harness imports are not proven absent | Layered prompt API replaced monolithic prompts, but older call sites may still compile against constants | None | Keep for v0.7.6; add deprecation annotations only after internal callers are migrated. Tracked by #219. |
-| `/compact` slash command positioning | `crates/tui/src/commands/mod.rs` | Public slash-command registry and help overlay | Public command registry/docs path still references it | Users may still run `/compact` manually when they want an immediate replacement-style summary | Description is intentionally explicit about manual compaction | Keep as a manual compatibility command; do not remove until context/token issues are resolved. |
-| `todo_*` compatibility tools | `crates/tui/src/tools/todo.rs` | Tool registry/model calls that still use `todo_add`, `todo_update`, `todo_list`, `todo_write` | Tool registry compatibility and saved tool-call risk remain | `checklist_*` is canonical, but old tool names may appear in saved prompts, traces, or model priors | Metadata marks `compat_alias: true`; descriptions say compatibility alias | Add explicit deprecation metadata with target version, then remove only after tool-schema migration evidence. Tracked by #220. |
-| Deprecated sub-agent alias tools (`spawn_agent`, `send_input`, delegate aliases) | `crates/tui/src/tools/subagent/mod.rs` | Tool registry and model/tool-call compatibility | Tool registry compatibility and saved tool-call risk remain | Canonical names are `agent_spawn`, `agent_send_input`, etc.; alias names preserve older tool-call compatibility | `_deprecation` metadata and tracing warn; removal target is `v0.8.0` | Keep through v0.7.x; removal already has metadata. Tracked by #221. |
-| Legacy root/provider TOML `api_key` compatibility | `crates/tui/src/config.rs`, `crates/config/src/lib.rs` | Config resolver; users with existing `api_key` in config files | Public config loading and docs still mention migration behavior | Keyring migration is preferred, but breaking existing configs would block startup/auth | Tracing warnings point to `deepseek auth set` / `deepseek auth migrate` | Keep; warnings are user-actionable. Removal should wait for a migration command and release-note window. |
-| Model alias canonicalization (`deepseek-chat`, `deepseek-reasoner`, older V3/R1 aliases) | `crates/tui/src/config.rs`, `crates/config/src/lib.rs` | Config/env/model picker normalization | Public docs and existing configs may still use aliases | Preserves old documented DeepSeek aliases and maps them to `deepseek-v4-flash` | Silent alias by design | Keep; removing aliases would break configs without meaningful benefit. |
-| Deprecated palette constants and aliases | `crates/tui/src/palette.rs`, `crates/tui/tests/palette_audit.rs` | Existing call sites plus audit tests | Palette audit enforces the remaining allowlist | Semantic aliases are preferred, but old constants exist to prevent broad style churn | Palette audit blocks direct deprecated uses outside allowlist | Keep aliases; continue moving call sites to semantic roles opportunistically. |
-
-## Follow-Up Removal Candidates
-
-These are not safe to remove in v0.7.6:
-
-1. #218 Legacy MCP sync API: requires a call-graph check and explicit CLI/runtime parity tests for `/mcp`, `deepseek mcp`, and MCP server validation flows.
-2. #219 Legacy prompt constants/functions: requires proving no public crate or older test harness imports them.
-3. #220 `todo_*` tool aliases: requires deprecation metadata and a saved-trace/tool-schema migration window.
-4. #221 Deprecated sub-agent alias tools: removal target is already encoded as `v0.8.0`, but the actual removal should be tracked and tested separately.
-
-## Verification Checklist
-
-Before removing any compatibility surface:
-
-1. Search direct Rust references with `rg`.
-2. Search docs and README command examples.
-3. Run workspace tests with all features.
-4. Run a saved-session/tool-call compatibility smoke if the surface affects tool schemas or persisted history.
-5. Keep a release-note entry and, for user-visible config/tool changes, a migration hint for at least one minor release.
diff --git a/docs/MIMO_BENCHMARK_ISSUES.md b/docs/MIMO_BENCHMARK_ISSUES.md
deleted file mode 100644
index 25e155a1..00000000
--- a/docs/MIMO_BENCHMARK_ISSUES.md
+++ /dev/null
@@ -1,89 +0,0 @@
-# MiMo v2.5 Benchmarking — Known Issues
-
-Tracking doc for quirks and workarounds when benchmarking Xiaomi MiMo v2.5
-through CodeWhale's harness integrations.
-
-## PinchBench
-
-### Issue 1: Model validation requires OpenRouter prefix
-
-PinchBench validates models against OpenRouter's `/models` endpoint. If you
-pass `mimo-v2.5-pro` without the `openrouter/xiaomi/` prefix, validation is
-skipped entirely (it assumes it's a non-OpenRouter model). This means you
-won't know if the model ID is wrong until the run fails.
-
-**Workaround:** Always use `openrouter/xiaomi/mimo-v2.5-pro` for OpenRouter
-routing, or use `--direct-mimo` for the Xiaomi API.
-
-### Issue 2: PinchBench requires OPENROUTER_API_KEY
-
-Even when using a direct provider, PinchBench's `lib_agent.py` checks for
-`OPENROUTER_API_KEY` in some code paths. The `--direct-mimo` flag in our
-runner works around this by setting up a custom OpenAI-compatible provider
-entry in OpenClaw's `models.json` and exporting `OPENAI_API_KEY`/`OPENAI_BASE_URL`.
-
-### Issue 3: Token Plan vs Pay-as-you-go key mismatch
-
-Xiaomi MiMo has two API endpoints:
-- **Token Plan** (`tp-` keys): `https://token-plan-sgp.xiaomimimo.com/v1`
-- **Pay-as-you-go** (`sk-` keys): `https://api.xiaomimimo.com/v1`
-
-Using the wrong key type with the wrong endpoint produces auth errors. The
-runner now detects this and warns.
-
-### Issue 4: OpenClaw is the runtime, not CodeWhale
-
-PinchBench runs tasks through OpenClaw, not CodeWhale. This means the
-benchmark measures MiMo v2.5's performance through OpenClaw's agent harness,
-not through CodeWhale's tool system. For CodeWhale-native evaluation,
-Terminal-Bench (via Harbor) is the better fit.
-
-**Future:** Create a CodeWhale-native PinchBench adapter that loads tasks
-from PinchBench's `tasks/` directory and runs them through `codewhale exec`.
-
-## Terminal-Bench (Harbor)
-
-### Issue 1: MiMo provider routing
-
-Harbor passes models as `provider/model` format. For MiMo via OpenRouter,
-use `openrouter/xiaomi/mimo-v2.5-pro`. For direct Xiaomi API, pass
-`--provider xiaomi-mimo` as an extra agent flag.
-
-### Issue 2: Container environment
-
-The Harbor adapter installs codewhale via npm in the container. MiMo API
-keys must be forwarded from the host environment. The adapter checks for
-`XIAOMI_MIMO_API_KEY`, `OPENROUTER_API_KEY`, and `OPENAI_API_KEY`.
-
-## SWE-bench
-
-### Issue 1: MiMo thinking mode
-
-MiMo v2.5 Pro supports extended thinking. For SWE-bench patch generation,
-ensure the thinking level is set appropriately. The `--thinking high` flag
-is passed through the CLI.
-
-### Issue 2: Context window
-
-MiMo v2.5 Pro has a 128K context window. Large SWE-bench instances (e.g.,
-Django, sympy) may benefit from the full window. No special handling needed,
-but worth monitoring token usage.
-
-## Environment Variables Reference
-
-```
-# Xiaomi MiMo direct API
-XIAOMI_MIMO_API_KEY=tp-... # Token Plan key
-XIAOMI_MIMO_API_KEY=sk-... # Pay-as-you-go key
-XIAOMI_MIMO_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
-XIAOMI_MIMO_MODEL=mimo-v2.5-pro
-
-# Aliases also accepted
-XIAOMI_API_KEY=...
-MIMO_API_KEY=...
-MIMO_BASE_URL=...
-MIMO_MODEL=...
-
-# OpenRouter (for MiMo via OpenRouter)
-OPENROUTER_API_KEY=...
-```
diff --git a/docs/REVIEW_PIPELINE.md b/docs/REVIEW_PIPELINE.md
deleted file mode 100644
index 95af4eca..00000000
--- a/docs/REVIEW_PIPELINE.md
+++ /dev/null
@@ -1,84 +0,0 @@
-\# CodeWhale Review Pipeline
-
-
-
-Welcome to CodeWhale! We receive a high volume of community PRs. To ensure a smooth and fast review process, please review our pipeline expectations below.
-
-
-
-\## 1. CI Gates (Pre-Review Checklist)
-
-Before a maintainer reviews your PR, it must pass our continuous integration (CI) checks.
-
-
-
-\*\*Required Checks (Must Pass):\*\*
-
-Please run these locally before pushing your code to avoid CI failures:
-
-\* \*\*Format:\*\* `cargo fmt --all -- --check`
-
-\* \*\*Linting:\*\* `cargo clippy --workspace --all-targets --all-features`
-
-\* \*\*Tests:\*\* `cargo test --workspace --all-features --locked`
-
-
-
-\*\*Informational Checks:\*\*
-
-Checks from \*\*Greptile\*\* and \*\*GitGuardian\*\* are informational. If they flag something, review it, but they do not strictly block a review on their own unless a secret is leaked.
-
-
-
-\## 2. Common Failure Modes \& Local Fixes
-
-If CI fails, it is usually one of these three reasons:
-
-\* \*\*Version Drift (`Cargo.lock` out of date):\*\* Run `cargo update` or `cargo build` locally to update the lockfile and commit the changes.
-
-\* \*\*Lint Failures:\*\* Check the clippy warnings from the command above and fix the specific lines flagged.
-
-\* \*\*Windows Test Flakiness:\*\* Occasionally, tests may time out on Windows runners. If you are confident your code didn't break it, leave a comment asking a maintainer to re-trigger the CI.
-
-
-
-\## 3. PR Etiquette
-
-To help us review your code quickly, please adhere to the following:
-
-\* \*\*One Concern Per PR:\*\* Keep diffs highly focused. Do not mix refactoring with new feature additions.
-
-\* \*\*Link the Issue:\*\* Always include `Closes #N` (replace N with the issue number) in your PR description so GitHub automatically links them.
-
-\* \*\*Rebase:\*\* Always rebase your branch onto the latest `main` branch before requesting a review.
-
-
-
-\## 4. The Review Workflow
-
-Once CI is green, your PR enters the review queue.
-
-\* \*\*Who reviews:\*\* Core maintainers will review the PR.
-
-\* \*\*`autonomous-ready` Label:\*\* If a maintainer applies this label, it means the PR is approved in concept and is queued for our automated integration system.
-
-\* \*\*The Nightly Loop:\*\* We run extensive integration loops overnight. If your PR is approved, it may wait for this nightly loop before final merging to ensure system stability.
-
-
-
-\## 5. Post-Merge Actions
-
-After your code is merged, the following automated actions occur:
-
-\* `CHANGELOG.md` is updated.
-
-\* `npm` wrappers are synced.
-
-\* Binary rebuilds are triggered for all platforms.
-
-\* Website and documentation are synced with your new changes.
-
-
-
-Thank you for contributing to CodeWhale!
-
diff --git a/docs/RLM_BRANCHING_ROADMAP.md b/docs/RLM_BRANCHING_ROADMAP.md
deleted file mode 100644
index 449dae5e..00000000
--- a/docs/RLM_BRANCHING_ROADMAP.md
+++ /dev/null
@@ -1,92 +0,0 @@
-# RLM Branching Roadmap
-
-This note records the v0.8.45 design direction for RLM, DSPy, GEPA, and Model
-Lab without adding runtime dependencies or changing the live agent loop.
-
-## Branching Primitive
-
-CodeWhale uses the same branching primitive at three scales:
-
-1. Release tracks. Each milestone fans into named tracks. A track must stay
-   independently reviewable, mergeable, and slippable. Unfinished work rolls
-   forward instead of blocking the release.
-2. Capability worksets. Model Lab capabilities such as Hugging Face,
-   observability, evals, serving, DSPy, GEPA, and training infrastructure ship
-   as opt-in worksets with their own feature flag, install path, license note,
-   and telemetry posture.
-3. Pareto compile branches. Optimizable modules keep candidate
-   `(instructions, demos, score)` triples. Branches that violate pinned
-   constitution clauses are pruned; branches that win at least one eval remain
-   on the frontier until the maintainer lands or rejects them.
-
-The maintainer chooses the frontier point. CodeWhale should not collapse
-branches prematurely.
-
-## v0.8.45
-
-- Close the current control-plane and workbench issues before the broader
-  fan-out begins: #1982, #2027, #2032, #2016, and #2034.
-- Keep `AGENTS.md` and `CLAUDE.md` maintainer-local. `AGENTS.md` is ignored
-  from this milestone forward.
-- Land the RLM symbolic-object substrate: active prompt, session metadata,
-  transcript, latest user message, and per-message refs are named objects that
-  RLM can open without copying raw prompt/history text into the parent
-  transcript.
-
-## v0.8.46
-
-- Generalize Fin into a structured-feedback verifier substrate.
-- Add first replay-eval definitions harvested from existing trajectories.
-- Scaffold the Repeatability Score footer slot as pending until evals populate
-  it.
-- Add module artifact schema v0 as Rust types only.
-- Draft the "Compiled Word" constitution article.
-
-## v0.8.47
-
-- Promote Hugging Face as a first-class provider through Inference Providers
-  and Router.
-- Add deterministic RLM replay: context snapshot, seed, child model IDs, and
-  temperatures.
-- Route large logs and payloads to RLM workbench sessions instead of the
-  parent transcript.
-- Add sub-query memoization keyed by prompt, context hash, and model.
-- Enforce RLM budgets at the Rust registry layer: depth, calls, wall time, and
-  cost.
-
-## v0.8.48
-
-- Remove the legacy `deepseek` and `deepseek-tui` shim binaries.
-- Finish Docker and Homebrew rename cleanup.
-- Populate Repeatability Score from a small offline eval suite that ships in
-  core.
-
-## v0.9.0
-
-- Emit per-turn `trajectory.jsonl` as the trainset substrate.
-- Add `codewhale replay ` for deterministic replay.
-- Render module artifacts from the `[[ ## field ## ]]` form through a Rust
-  adapter.
-- Land the eval pipeline: suites, replay evals, and measurement substrate.
-- Add a `/compile` command stub that explains the offline loop.
-
-## v0.10.0
-
-- Add opt-in Model Lab workset installers for DSPy and GEPA. The default
-  install keeps zero Python dependencies.
-- Build the first offline compile pipeline: Rust harvests trainsets, a Python
-  sidecar runs the optimizer, and CodeWhale emits a reviewed Module JSON
-  artifact.
-- Add the Compile TUI panel with Pareto frontier, lineage tree, and
-  Land/Reject/Revise actions.
-- Land the first optimized tool-description and agent-prompt artifacts through
-  PRs. Constitution clauses remain pinned outside the optimized region.
-- Add whale-species module passports, for example
-  `Sei: codewhale-agent-prompt.v0.10.0-gepa-1`.
-
-## Trust Boundary
-
-Compilation is offline. Runtime consumes reviewed JSON artifacts. Online
-closed-loop optimization is out of scope because adversarial users could game a
-live coding harness. Any workset can fail independently without dragging the
-release, the core runtime, or other Pareto branches with it.
diff --git a/docs/archive/V0_7_5_IMPLEMENTATION_PLAN.md b/docs/archive/V0_7_5_IMPLEMENTATION_PLAN.md
deleted file mode 100644
index 775a9436..00000000
--- a/docs/archive/V0_7_5_IMPLEMENTATION_PLAN.md
+++ /dev/null
@@ -1,61 +0,0 @@
-# v0.7.5 Implementation Plan
-
-Scope: background shell job UX, in-TUI MCP management/discovery, and V4
-context/cache policy. Do not include provider expansion or Whalescale
-rename/migration work in this release lane.
-
-## Context/cache decision
-
-Default path:
-
-- Keep the transcript append-only and preserve the stable prefix for DeepSeek V4 cache reuse.
-- Disable replacement-style `auto_compact` by default.
-- Keep replacement compaction manual or late: if a user enables `auto_compact`, V4 compacts only near the 80% model-window guard (`800000` tokens for 1M-context models), not at reasoning-effort soft caps.
-- Keep the Flash seam manager (`[context].enabled`) opt-in until issue #200 has repeatable cache-hit/miss evidence.
-- Keep the capacity controller disabled by default. Treat it as telemetry or an experimental guardrail unless `capacity.enabled = true` is set.
-- Use emergency overflow recovery only when the request would otherwise exceed the model input budget.
-
-Rationale: V4's 1M-token window and prefix-cache economics make early
-replacement compaction suspect. The first shippable slice should prevent old
-128K-era heuristics from rewriting context before there is evidence that the
-rewrite is cheaper and more reliable than preserving a hot prefix.
-
-## Shippable slices
-
-### Slice 1: Context policy and docs
-
-- Change default `auto_compact` to off.
-- Keep V4 replacement-compaction thresholds late and independent of reasoning effort.
-- Make `[context].enabled` default to false.
-- Make `docs/CONFIGURATION.md`, `docs/capacity_controller.md`, and `config.example.toml` match code defaults.
-- Add focused tests for defaults and V4 threshold behavior.
-
-### Slice 2: Background shell job center (#195)
-
-- Add a job-center view fed by `ShellManager::list()`.
-- Show command, cwd, linked task id when available, status, elapsed time, exit code, and latest output.
-- Add controls to inspect full output, poll latest output, send stdin for PTY/stdin-capable jobs, kill a background job, and attach completed output as task evidence.
-- Mark restart-stale jobs explicitly rather than presenting them as live.
-- Add lifecycle tests for start, poll, cancel, complete, stale/restart, plus TUI snapshots for running and completed job details.
-
-### Slice 3: MCP manager (#196)
-
-- Add `/mcp` or a command-palette action that opens an MCP manager view.
-- Show resolved config path, server enabled/disabled state, transport, command/url, timeout settings, startup errors, and discovered tool/resource/prompt counts.
-- Wire `mcp_config_path` into the interactive config surface.
-- Support init, add stdio server, add HTTP/SSE server, enable, disable, remove, validate, reconnect, and inspect tools/resources/prompts.
-- Preserve both `servers` and `mcpServers` config shapes.
-
-### Slice 4: MCP discoverability (#197)
-
-- Add an MCP command-palette section backed by the same discovery state as the manager.
-- Group tools/resources/prompts by server.
-- Show disabled/failed servers without blocking palette rendering.
-- Keep model-visible names consistent with `mcp__`.
-
-## Stop rules
-
-- Do not close #159 or #162 unless a verified PR actually resolves them.
-- Do not add provider expansion.
-- Do not rename or migrate anything to Whalescale.
-- Do not broaden the TUI into a large redesign; each slice should remain independently testable and shippable.