docs: remove stale internal docs (handoffs, old audits, orphaned roadmaps)
@@ -1,140 +0,0 @@
# v0.8.51 Release Handoff — 2026-06-02

## Workspace

```
/Volumes/VIXinSSD/codewhale
Branch: codex/v0.8.51-arcee-provider (12 commits ahead of origin/main)
```

## What's Already Landed (committed, 12 commits on branch)

| Commit | What |
|--------|------|
| `e54a0a500` | feat(provider): add direct Arcee support |
| `99da87ca1` | fix(cli): wire arcee provider auth |
| `8eca75763` | test(tui): cover arcee provider picker entry |
| `06612495f` | chore(release): prep v0.8.51 — version bump, CHANGELOG |
| `fd69f4c80` | fix(tui): strip DEC private mode CSI fragments (#2592) |
| `5249723e1` | fix(engine): recover from turn panics (#2583, #1269) |
| `478bae451` | fix(tui): find deeply nested files via @/Ctrl+P (#2488) |
| `e95f759cd` | fix(tui): command-palette scroll visibility (#2590) |
| `cccc5ed55` | fix(shell): .NET/NuGet + Windows env (#1857) |
| `7aa73fad5` | fix(config): warn on misplaced shell/sandbox keys (#2589) |
| `a7d482067` | fix(clippy): clear -D warnings (#2599) |
| `79d78878b` | test(mcp): deterministic SSE reconnect (#2597) |
| `f886f28ac` | test(tui): update walk-depth test for new default depth |

**Claude branch `origin/claude/busy-mayer-b66rA` is identical to our HEAD** — same commit `f886f28ac`. Nothing to merge; already up to date.

## What's Applied in Working Tree (NOT committed)

4 files modified (88 insertions, 25 deletions):

| File | Change | Credits |
|------|--------|---------|
| `crates/tui/src/tui/diff_render.rs` | `wrap_text` preserves leading whitespace; fixes extra-space bug in PR #2591; adds 2 regression tests | @zlh124 (PR #2591, fix version from working tree) |
| `crates/tui/src/schema_migration.rs` | `#[allow(dead_code)]` on `pub mod registry` | @reidliu41 (PR #2601) |
| `crates/tui/src/prompts/base.md` | Tool desc: prefer `gh --json` CLI for GitHub triage | — |
| `crates/tui/src/prompts/base.txt` | Same prompt update for text variant | — |

## Ready to Apply (patch saved, NOT yet applied)

The model persistence patch is at `/tmp/model_persist.patch` (240 lines, 2 files):

- `crates/tui/src/commands/core.rs`: `/model` command remembers per-provider model selection + persist warning
- `crates/tui/src/settings.rs`: new `set_provider_model_selection()` and `persist_provider_model_selection()` methods

**Apply it:**
```bash
cd /Volumes/VIXinSSD/codewhale
git apply /tmp/model_persist.patch
```

This is small, self-contained, and directly improves UX for the new Arcee provider (model choice remembered across restarts). No dependency on the deferred image-attachment work.

## Deferred (in stash, do NOT apply for v0.8.51)

The stash (`stash@{0}`) contains:

- **Image attachment** (#2584/#2587): `ContentBlock::ImageUrl`, multimodal chat requests, base64 encoding, + exhaustive match arms across 15 files. Deferred by Hmbown — changes the request shape, needs multimodal endpoint testing.
- **GitHub structured route** (`fetch_url.rs`): new feature — routes GitHub issue/PR URLs through `gh` CLI. Too broad for v0.8.51.
- **Config custom model changes** (`commands/config.rs`): `normalize_custom_model_id` etc. Need review.

To view: `git stash show -p stash@{0}`

## What Remains

```bash
cd /Volumes/VIXinSSD/codewhale

# 1. Apply the model persistence patch
git apply /tmp/model_persist.patch

# 2. Commit all working-tree changes as one harvest commit
git add -A
git commit -m "harvest(v0.8.51): diff-render whitespace fix + schema dead_code + model persistence + prompt updates

- fix(diff-render): preserve leading whitespace in patch content lines
Credit: @zlh124 (PR #2591), with extra-space bug fixed.
- fix(tui): allow unused schema migration registry
Credit: @reidliu41 (PR #2601).
- feat(tui): persist per-provider model selection from /model command
- docs(prompts): prefer gh --json CLI for GitHub triage in agent instructions"

# 3. Run release gates
cargo fmt --all -- --check
cargo clippy --workspace --all-targets --all-features --locked -- -D warnings
cargo test -p codewhale-tui -- --test-threads=4

# 4. If gates pass, rebuild
cargo install --path crates/tui --locked --force
codewhale --version
codewhale-tui --version

# 5. Final checks
git diff origin/main --stat
gh issue view 2600 --repo Hmbown/CodeWhale  # re-read release checklist
```

## Release Checklist Status (issue #2600)

- [x] Arcee provider landed + tested
- [x] Cycle/checkpoint-restart system removed
- [x] Auto-compaction percentage/model-aware
- [x] Provider/gateway HTTP error sanitization
- [x] TUI fixes (blue dot, sidebar scroll, tooltip)
- [x] CHANGELOG + version bump
- [x] Clippy clean (#2599)
- [x] MCP SSE test deterministic (#2597)
- [ ] Full `cargo test --workspace` green — codewhale-tui validated; 1 environment-only Landlock test may fail on macOS
- [ ] `npm test` in `npm/codewhale`
- [ ] Harvest commits applied + re-gated
- [ ] Merge branch to `main`
- [ ] Tag `v0.8.51`, push tag
- [ ] Publish GitHub release + `npm publish`

## Contributors to Credit

| Contributor | Contribution | PR/Issue |
|-------------|-------------|----------|
| @zlh124 (jayzhu) | diff-render whitespace preservation | #2591 |
| @reidliu41 (Reid) | schema migration dead_code allow | #2601 |
| @xyuai | Image attachment root cause + initial PR | #2587, #2584 |
| @IcedOranges | Image attachment bug report | #2584 |
| @idling11 (Hanmiao Li) | Sidebar resize feature request | #2602 |
| @gordonlu (Gordon) | Engine death recovery | #2585 |
| @cyq1017 | File picker depth fix draft | #2593 |

## Risks / Notes

1. **Working tree was stashed** — the image attachment feature and GitHub structured route are deferred for v0.8.52+. The model persistence patch is the only remaining piece worth landing.
2. **DeepSeek naming**: The branch and committed code use "CodeWhale" naming throughout. Do not imply DeepSeek is deprecated.
3. **The `origin/claude/busy-mayer-b66rA` branch is identical to HEAD** — the Claude Code session in #2600 claimed "+8 commits" but those are the same commits already on this branch. Verify with `git rev-parse HEAD origin/claude/busy-mayer-b66rA`.
4. **Landlock test**: `sandbox::tests::test_parity_linux_landlock_available` will fail on macOS (no Landlock LSM). This is environment-only, not a regression. On CI Linux runners it passes.
5. **Cross-platform artifacts**: The release workflow builds macOS + Windows + NSIS installer on tag push. Not buildable locally on macOS alone.

---

Generated by deepseek-v4-pro in CodeWhale v0.8.51 pre-release triage.
Next session: read this file, apply `/tmp/model_persist.patch`, run gates, commit, and prepare the merge.
@@ -1,37 +0,0 @@
# v0.7.6 Legacy Rust Audit

Status date: 2026-04-29

This audit is deliberately non-destructive. No compatibility code is removed in v0.7.6 unless tests prove public CLI, saved-session, tool-schema, and documented command paths no longer depend on it.

## Summary

| Surface | Owner module | Current consumer | Reference check | Compatibility reason | Current warning | Recommended action |
|---|---|---|---|---|---|---|
| Legacy MCP sync API (`McpServerInput`, `list`, `add`, `remove`, `call_tool`, `load_legacy`) | `crates/tui/src/mcp.rs` | Not wired into current `/mcp` command path; retained behind `#[allow(dead_code)]` | Direct Rust references and current MCP command path inspected; saved/config JSON compatibility still needs a dedicated smoke | Preserves old JSON shape including `mcpServers` alias and sync call helpers while the async MCP manager is the active path | Code TODO only | Gate behind an explicit legacy module or remove after CLI/runtime parity tests prove no caller uses it. Tracked by #218. |
| Legacy prompt constants/functions (`AGENT_PROMPT`, `YOLO_PROMPT`, `PLAN_PROMPT`, `base_system_prompt`, `normal_system_prompt`, etc.) | `crates/tui/src/prompts.rs` | Tests and older callers that still import prompt constants directly | Direct Rust references remain; public-crate and older harness imports are not proven absent | Layered prompt API replaced monolithic prompts, but older call sites may still compile against constants | None | Keep for v0.7.6; add deprecation annotations only after internal callers are migrated. Tracked by #219. |
| `/compact` slash command positioning | `crates/tui/src/commands/mod.rs` | Public slash-command registry and help overlay | Public command registry/docs path still references it | Users may still run `/compact` manually when they want an immediate replacement-style summary | Description is intentionally explicit about manual compaction | Keep as a manual compatibility command; do not remove until context/token issues are resolved. |
| `todo_*` compatibility tools | `crates/tui/src/tools/todo.rs` | Tool registry/model calls that still use `todo_add`, `todo_update`, `todo_list`, `todo_write` | Tool registry compatibility and saved tool-call risk remain | `checklist_*` is canonical, but old tool names may appear in saved prompts, traces, or model priors | Metadata marks `compat_alias: true`; descriptions say compatibility alias | Add explicit deprecation metadata with target version, then remove only after tool-schema migration evidence. Tracked by #220. |
| Deprecated sub-agent alias tools (`spawn_agent`, `send_input`, delegate aliases) | `crates/tui/src/tools/subagent/mod.rs` | Tool registry and model/tool-call compatibility | Tool registry compatibility and saved tool-call risk remain | Canonical names are `agent_spawn`, `agent_send_input`, etc.; alias names preserve older tool-call compatibility | `_deprecation` metadata and tracing warn; removal target is `v0.8.0` | Keep through v0.7.x; removal already has metadata. Tracked by #221. |
| Legacy root/provider TOML `api_key` compatibility | `crates/tui/src/config.rs`, `crates/config/src/lib.rs` | Config resolver; users with existing `api_key` in config files | Public config loading and docs still mention migration behavior | Keyring migration is preferred, but breaking existing configs would block startup/auth | Tracing warnings point to `deepseek auth set` / `deepseek auth migrate` | Keep; warnings are user-actionable. Removal should wait for a migration command and release-note window. |
| Model alias canonicalization (`deepseek-chat`, `deepseek-reasoner`, older V3/R1 aliases) | `crates/tui/src/config.rs`, `crates/config/src/lib.rs` | Config/env/model picker normalization | Public docs and existing configs may still use aliases | Preserves old documented DeepSeek aliases and maps them to `deepseek-v4-flash` | Silent alias by design | Keep; removing aliases would break configs without meaningful benefit. |
| Deprecated palette constants and aliases | `crates/tui/src/palette.rs`, `crates/tui/tests/palette_audit.rs` | Existing call sites plus audit tests | Palette audit enforces the remaining allowlist | Semantic aliases are preferred, but old constants exist to prevent broad style churn | Palette audit blocks direct deprecated uses outside allowlist | Keep aliases; continue moving call sites to semantic roles opportunistically. |
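
The alias rows in the table describe two lookups: deprecated tool names that resolve to canonical ones, and older DeepSeek model aliases that resolve to `deepseek-v4-flash`. A minimal sketch of that idea follows. The function names `canonical_tool_name` and `canonical_model_id` are hypothetical, and only the pairs the table spells out are covered; the actual resolvers in the tool registry and `config.rs` may look quite different.

```rust
/// Map a deprecated sub-agent tool alias to its canonical name.
/// Only the pairs named in the audit table are listed; any other
/// name passes through unchanged. (Hypothetical helper.)
pub fn canonical_tool_name(name: &str) -> &str {
    match name {
        "spawn_agent" => "agent_spawn",
        "send_input" => "agent_send_input",
        other => other,
    }
}

/// Map an older DeepSeek model alias to the canonical model ID.
/// The older V3/R1 aliases are not enumerated in the audit, so they
/// are deliberately left out here. (Hypothetical helper.)
pub fn canonical_model_id(model: &str) -> &str {
    match model {
        "deepseek-chat" | "deepseek-reasoner" => "deepseek-v4-flash",
        other => other,
    }
}
```

Passing unknown names through unchanged keeps the lookup silent, which matches the "silent alias by design" note for model IDs; a tool alias resolver would presumably also emit the tracing warning mentioned in the table.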

## Follow-Up Removal Candidates

These are not safe to remove in v0.7.6:

1. #218 Legacy MCP sync API: requires a call-graph check and explicit CLI/runtime parity tests for `/mcp`, `deepseek mcp`, and MCP server validation flows.
2. #219 Legacy prompt constants/functions: requires proving no public crate or older test harness imports them.
3. #220 `todo_*` tool aliases: requires deprecation metadata and a saved-trace/tool-schema migration window.
4. #221 Deprecated sub-agent alias tools: removal target is already encoded as `v0.8.0`, but the actual removal should be tracked and tested separately.

## Verification Checklist

Before removing any compatibility surface:

1. Search direct Rust references with `rg`.
2. Search docs and README command examples.
3. Run workspace tests with all features.
4. Run a saved-session/tool-call compatibility smoke if the surface affects tool schemas or persisted history.
5. Keep a release-note entry and, for user-visible config/tool changes, a migration hint for at least one minor release.
@@ -1,89 +0,0 @@
# MiMo v2.5 Benchmarking — Known Issues

Tracking doc for quirks and workarounds when benchmarking Xiaomi MiMo v2.5
through CodeWhale's harness integrations.

## PinchBench

### Issue 1: Model validation requires OpenRouter prefix

PinchBench validates models against OpenRouter's `/models` endpoint. If you
pass `mimo-v2.5-pro` without the `openrouter/xiaomi/` prefix, validation is
skipped entirely (it assumes it's a non-OpenRouter model). This means you
won't know if the model ID is wrong until the run fails.

**Workaround:** Always use `openrouter/xiaomi/mimo-v2.5-pro` for OpenRouter
routing, or use `--direct-mimo` for the Xiaomi API.

### Issue 2: PinchBench requires OPENROUTER_API_KEY

Even when using a direct provider, PinchBench's `lib_agent.py` checks for
`OPENROUTER_API_KEY` in some code paths. The `--direct-mimo` flag in our
runner works around this by setting up a custom OpenAI-compatible provider
entry in OpenClaw's `models.json` and exporting `OPENAI_API_KEY`/`OPENAI_BASE_URL`.

### Issue 3: Token Plan vs Pay-as-you-go key mismatch

Xiaomi MiMo has two API endpoints:
- **Token Plan** (`tp-` keys): `https://token-plan-sgp.xiaomimimo.com/v1`
- **Pay-as-you-go** (`sk-` keys): `https://api.xiaomimimo.com/v1`

Using the wrong key type with the wrong endpoint produces auth errors. The
runner now detects this and warns.
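
The key/endpoint pairing can be checked from the key prefix and the base URL alone. The sketch below shows one way such a check might look; the type and function names (`Plan`, `plan_for_key`, `plan_for_base_url`, `mismatch_warning`) are hypothetical and the runner's actual detection logic is not shown in this doc.

```rust
/// The two MiMo billing plans described above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Plan {
    TokenPlan,
    PayAsYouGo,
}

/// Infer the plan from the key prefix (`tp-` or `sk-`).
pub fn plan_for_key(key: &str) -> Option<Plan> {
    if key.starts_with("tp-") {
        Some(Plan::TokenPlan)
    } else if key.starts_with("sk-") {
        Some(Plan::PayAsYouGo)
    } else {
        None
    }
}

/// Infer the plan from the endpoint host.
pub fn plan_for_base_url(url: &str) -> Option<Plan> {
    if url.starts_with("https://token-plan-sgp.xiaomimimo.com") {
        Some(Plan::TokenPlan)
    } else if url.starts_with("https://api.xiaomimimo.com") {
        Some(Plan::PayAsYouGo)
    } else {
        None
    }
}

/// Return a warning only when both sides are recognized and disagree.
/// Unknown keys or URLs produce no warning rather than a false alarm.
pub fn mismatch_warning(key: &str, base_url: &str) -> Option<String> {
    match (plan_for_key(key), plan_for_base_url(base_url)) {
        (Some(k), Some(u)) if k != u => Some(format!(
            "API key looks like a {:?} key but the endpoint is {:?}",
            k, u
        )),
        _ => None,
    }
}
```

Warning instead of failing fits the description above: the request may still be attempted, and the user sees why an auth error is likely.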

### Issue 4: OpenClaw is the runtime, not CodeWhale

PinchBench runs tasks through OpenClaw, not CodeWhale. This means the
benchmark measures MiMo v2.5's performance through OpenClaw's agent harness,
not through CodeWhale's tool system. For CodeWhale-native evaluation,
Terminal-Bench (via Harbor) is the better fit.

**Future:** Create a CodeWhale-native PinchBench adapter that loads tasks
from PinchBench's `tasks/` directory and runs them through `codewhale exec`.

## Terminal-Bench (Harbor)

### Issue 1: MiMo provider routing

Harbor passes models as `provider/model` format. For MiMo via OpenRouter,
use `openrouter/xiaomi/mimo-v2.5-pro`. For direct Xiaomi API, pass
`--provider xiaomi-mimo` as an extra agent flag.
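
To illustrate the `provider/model` format, here is a small sketch that splits a spec on its first slash, so that `openrouter/xiaomi/mimo-v2.5-pro` yields the provider `openrouter` and the model `xiaomi/mimo-v2.5-pro`. Splitting on the first slash is an assumption made for this example, and `split_provider_model` is a hypothetical name; Harbor's own parsing is not documented here.

```rust
/// Split a `provider/model` spec on the first `/`.
/// Returns `None` when there is no slash or either side is empty.
/// (Hypothetical helper; first-slash splitting is an assumption.)
pub fn split_provider_model(spec: &str) -> Option<(&str, &str)> {
    let (provider, model) = spec.split_once('/')?;
    if provider.is_empty() || model.is_empty() {
        return None;
    }
    Some((provider, model))
}
```

A bare `mimo-v2.5-pro` has no provider segment and returns `None`, which is the case where the direct-API flag above would be needed.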

### Issue 2: Container environment

The Harbor adapter installs codewhale via npm in the container. MiMo API
keys must be forwarded from the host environment. The adapter checks for
`XIAOMI_MIMO_API_KEY`, `OPENROUTER_API_KEY`, and `OPENAI_API_KEY`.

## SWE-bench

### Issue 1: MiMo thinking mode

MiMo v2.5 Pro supports extended thinking. For SWE-bench patch generation,
ensure the thinking level is set appropriately. The `--thinking high` flag
is passed through the CLI.

### Issue 2: Context window

MiMo v2.5 Pro has a 128K context window. Large SWE-bench instances (e.g.,
Django, sympy) may benefit from the full window. No special handling needed,
but worth monitoring token usage.

## Environment Variables Reference

```
# Xiaomi MiMo direct API
XIAOMI_MIMO_API_KEY=tp-...        # Token Plan key
XIAOMI_MIMO_API_KEY=sk-...        # Pay-as-you-go key
XIAOMI_MIMO_BASE_URL=https://token-plan-sgp.xiaomimimo.com/v1
XIAOMI_MIMO_MODEL=mimo-v2.5-pro

# Aliases also accepted
XIAOMI_API_KEY=...
MIMO_API_KEY=...
MIMO_BASE_URL=...
MIMO_MODEL=...

# OpenRouter (for MiMo via OpenRouter)
OPENROUTER_API_KEY=...
```
@@ -1,84 +0,0 @@
# CodeWhale Review Pipeline

Welcome to CodeWhale! We receive a high volume of community PRs. To ensure a smooth and fast review process, please review our pipeline expectations below.

## 1. CI Gates (Pre-Review Checklist)

Before a maintainer reviews your PR, it must pass our continuous integration (CI) checks.

**Required Checks (Must Pass):**

Please run these locally before pushing your code to avoid CI failures:

* **Format:** `cargo fmt --all -- --check`
* **Linting:** `cargo clippy --workspace --all-targets --all-features`
* **Tests:** `cargo test --workspace --all-features --locked`

**Informational Checks:**

Checks from **Greptile** and **GitGuardian** are informational. If they flag something, review it, but they do not strictly block a review on their own unless a secret is leaked.

## 2. Common Failure Modes & Local Fixes

If CI fails, it is usually one of these three reasons:

* **Version Drift (`Cargo.lock` out of date):** Run `cargo update` or `cargo build` locally to update the lockfile and commit the changes.
* **Lint Failures:** Check the clippy warnings from the command above and fix the specific lines flagged.
* **Windows Test Flakiness:** Occasionally, tests may time out on Windows runners. If you are confident your code didn't break it, leave a comment asking a maintainer to re-trigger the CI.

## 3. PR Etiquette

To help us review your code quickly, please adhere to the following:

* **One Concern Per PR:** Keep diffs highly focused. Do not mix refactoring with new feature additions.
* **Link the Issue:** Always include `Closes #N` (replace N with the issue number) in your PR description so GitHub automatically links them.
* **Rebase:** Always rebase your branch onto the latest `main` branch before requesting a review.

## 4. The Review Workflow

Once CI is green, your PR enters the review queue.

* **Who reviews:** Core maintainers will review the PR.
* **`autonomous-ready` Label:** If a maintainer applies this label, it means the PR is approved in concept and is queued for our automated integration system.
* **The Nightly Loop:** We run extensive integration loops overnight. If your PR is approved, it may wait for this nightly loop before final merging to ensure system stability.

## 5. Post-Merge Actions

After your code is merged, the following automated actions occur:

* `CHANGELOG.md` is updated.
* `npm` wrappers are synced.
* Binary rebuilds are triggered for all platforms.
* Website and documentation are synced with your new changes.

Thank you for contributing to CodeWhale!
@@ -1,92 +0,0 @@
# RLM Branching Roadmap

This note records the v0.8.45 design direction for RLM, DSPy, GEPA, and Model
Lab without adding runtime dependencies or changing the live agent loop.

## Branching Primitive

CodeWhale uses the same branching primitive at three scales:

1. Release tracks. Each milestone fans into named tracks. A track must stay
   independently reviewable, mergeable, and slippable. Unfinished work rolls
   forward instead of blocking the release.
2. Capability worksets. Model Lab capabilities such as Hugging Face,
   observability, evals, serving, DSPy, GEPA, and training infrastructure ship
   as opt-in worksets with their own feature flag, install path, license note,
   and telemetry posture.
3. Pareto compile branches. Optimizable modules keep candidate
   `(instructions, demos, score)` triples. Branches that violate pinned
   constitution clauses are pruned; branches that win at least one eval remain
   on the frontier until the maintainer lands or rejects them.

The maintainer chooses the frontier point. CodeWhale should not collapse
branches prematurely.
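
The "win at least one eval" rule can be made concrete with a small sketch. It assumes each candidate carries one score per eval (the roadmap only mentions a single `score` field, so the `scores` vector is an assumption), that higher is better, and that ties count as wins. `Candidate` and `frontier` are hypothetical names, and constitution-clause pruning is left out.

```rust
/// A candidate module branch. `scores[e]` is the score on eval `e`
/// (assumed layout; higher is better).
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub instructions: String,
    pub demos: Vec<String>,
    pub scores: Vec<f64>,
}

/// Return the indices of candidates that are best (or tied for best)
/// on at least one eval. Everything else falls off the frontier.
pub fn frontier(candidates: &[Candidate]) -> Vec<usize> {
    // Only compare evals that every candidate has a score for.
    let evals = candidates
        .iter()
        .map(|c| c.scores.len())
        .min()
        .unwrap_or(0);

    let mut keep = Vec::new();
    for (i, c) in candidates.iter().enumerate() {
        let wins_some_eval = (0..evals)
            .any(|e| candidates.iter().all(|o| c.scores[e] >= o.scores[e]));
        if wins_some_eval {
            keep.push(i);
        }
    }
    keep
}
```

With scores `[0.9, 0.1]`, `[0.2, 0.8]`, and `[0.5, 0.5]`, the first two candidates each win one eval and stay, while the balanced third candidate wins none and is dropped. The function only reports the frontier; choosing a point on it stays with the maintainer.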

## v0.8.45

- Close the current control-plane and workbench issues before the broader
  fan-out begins: #1982, #2027, #2032, #2016, and #2034.
- Keep `AGENTS.md` and `CLAUDE.md` maintainer-local. `AGENTS.md` is ignored
  from this milestone forward.
- Land the RLM symbolic-object substrate: active prompt, session metadata,
  transcript, latest user message, and per-message refs are named objects that
  RLM can open without copying raw prompt/history text into the parent
  transcript.

## v0.8.46

- Generalize Fin into a structured-feedback verifier substrate.
- Add first replay-eval definitions harvested from existing trajectories.
- Scaffold the Repeatability Score footer slot as pending until evals populate
  it.
- Add module artifact schema v0 as Rust types only.
- Draft the "Compiled Word" constitution article.

## v0.8.47

- Promote Hugging Face as a first-class provider through Inference Providers
  and Router.
- Add deterministic RLM replay: context snapshot, seed, child model IDs, and
  temperatures.
- Route large logs and payloads to RLM workbench sessions instead of the
  parent transcript.
- Add sub-query memoization keyed by prompt, context hash, and model.
- Enforce RLM budgets at the Rust registry layer: depth, calls, wall time, and
  cost.
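
As an illustration of the memoization item, the sketch below keys a cache on prompt, context hash, and model, and only runs the sub-query on a miss. All names (`MemoKey`, `SubQueryMemo`, `get_or_compute`) are hypothetical, the context hash is assumed to be a precomputed `u64`, and the result is assumed to be a plain string; the roadmap does not specify any of these.

```rust
use std::collections::HashMap;

/// Cache key: the three fields the roadmap names.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct MemoKey {
    pub prompt: String,
    pub context_hash: u64,
    pub model: String,
}

/// In-memory sub-query cache with a miss counter for inspection.
#[derive(Default)]
pub struct SubQueryMemo {
    entries: HashMap<MemoKey, String>,
    pub misses: usize,
}

impl SubQueryMemo {
    /// Return the cached answer for `key`, or run `compute` once,
    /// store its result, and return it.
    pub fn get_or_compute<F>(&mut self, key: MemoKey, compute: F) -> String
    where
        F: FnOnce() -> String,
    {
        if let Some(hit) = self.entries.get(&key) {
            return hit.clone();
        }
        self.misses += 1;
        let value = compute();
        self.entries.insert(key, value.clone());
        value
    }
}
```

Because the model is part of the key, the same prompt sent to a different child model is treated as a separate query, which keeps memoized results consistent with deterministic replay.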

## v0.8.48

- Remove the legacy `deepseek` and `deepseek-tui` shim binaries.
- Finish Docker and Homebrew rename cleanup.
- Populate Repeatability Score from a small offline eval suite that ships in
  core.

## v0.9.0

- Emit per-turn `trajectory.jsonl` as the trainset substrate.
- Add `codewhale replay <turn_id>` for deterministic replay.
- Render module artifacts from the `[[ ## field ## ]]` form through a Rust
  adapter.
- Land the eval pipeline: suites, replay evals, and measurement substrate.
- Add a `/compile` command stub that explains the offline loop.

## v0.10.0

- Add opt-in Model Lab workset installers for DSPy and GEPA. The default
  install keeps zero Python dependencies.
- Build the first offline compile pipeline: Rust harvests trainsets, a Python
  sidecar runs the optimizer, and CodeWhale emits a reviewed Module JSON
  artifact.
- Add the Compile TUI panel with Pareto frontier, lineage tree, and
  Land/Reject/Revise actions.
- Land the first optimized tool-description and agent-prompt artifacts through
  PRs. Constitution clauses remain pinned outside the optimized region.
- Add whale-species module passports, for example
  `Sei: codewhale-agent-prompt.v0.10.0-gepa-1`.

## Trust Boundary

Compilation is offline. Runtime consumes reviewed JSON artifacts. Online
closed-loop optimization is out of scope because adversarial users could game a
live coding harness. Any workset can fail independently without dragging the
release, the core runtime, or other Pareto branches with it.
@@ -1,61 +0,0 @@
# v0.7.5 Implementation Plan

Scope: background shell job UX, in-TUI MCP management/discovery, and V4
context/cache policy. Do not include provider expansion or Whalescale
rename/migration work in this release lane.

## Context/cache decision

Default path:

- Keep the transcript append-only and preserve the stable prefix for DeepSeek V4 cache reuse.
- Disable replacement-style `auto_compact` by default.
- Keep replacement compaction manual or late: if a user enables `auto_compact`, V4 compacts only near the 80% model-window guard (`800000` tokens for 1M-context models), not at reasoning-effort soft caps.
- Keep the Flash seam manager (`[context].enabled`) opt-in until issue #200 has repeatable cache-hit/miss evidence.
- Keep the capacity controller disabled by default. Treat it as telemetry or an experimental guardrail unless `capacity.enabled = true` is set.
- Use emergency overflow recovery only when the request would otherwise exceed the model input budget.
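
The 80% guard in the list above is simple arithmetic: 80% of a 1,000,000-token window is 800,000 tokens. A minimal sketch, assuming hypothetical names (`GUARD_PERCENT`, `compaction_guard_tokens`, `should_auto_compact`) and treating "near the guard" as "at or above it", which is an assumption of this example:

```rust
/// Share of the model window at which replacement compaction may start.
pub const GUARD_PERCENT: u64 = 80;

/// Token count of the guard for a given model window.
pub fn compaction_guard_tokens(model_window: u64) -> u64 {
    model_window.saturating_mul(GUARD_PERCENT) / 100
}

/// Compact only when the user opted in AND usage has reached the guard.
/// Reasoning effort is intentionally not an input.
pub fn should_auto_compact(auto_compact_enabled: bool, used_tokens: u64, model_window: u64) -> bool {
    auto_compact_enabled && used_tokens >= compaction_guard_tokens(model_window)
}
```

With `auto_compact` off (the proposed default) the function never fires regardless of usage, leaving emergency overflow recovery as the only automatic rewrite path.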

Rationale: V4's 1M-token window and prefix-cache economics make early
replacement compaction suspect. The first shippable slice should prevent old
128K-era heuristics from rewriting context before there is evidence that the
rewrite is cheaper and more reliable than preserving a hot prefix.

## Shippable slices

### Slice 1: Context policy and docs

- Change default `auto_compact` to off.
- Keep V4 replacement-compaction thresholds late and independent of reasoning effort.
- Make `[context].enabled` default to false.
- Make `docs/CONFIGURATION.md`, `docs/capacity_controller.md`, and `config.example.toml` match code defaults.
- Add focused tests for defaults and V4 threshold behavior.

### Slice 2: Background shell job center (#195)

- Add a job-center view fed by `ShellManager::list()`.
- Show command, cwd, linked task id when available, status, elapsed time, exit code, and latest output.
- Add controls to inspect full output, poll latest output, send stdin for PTY/stdin-capable jobs, kill a background job, and attach completed output as task evidence.
- Mark restart-stale jobs explicitly rather than presenting them as live.
- Add lifecycle tests for start, poll, cancel, complete, stale/restart, plus TUI snapshots for running and completed job details.
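
The restart-stale rule can be sketched as a status transition applied when persisted jobs are reloaded: a job recorded as running cannot still be attached to a live process after a restart, so it is shown as stale. The `JobStatus` enum and `status_after_restart` function below are hypothetical; the plan does not define the real status type returned by `ShellManager::list()`.

```rust
/// Job states a job-center view might display (assumed set).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum JobStatus {
    Running,
    Completed { exit_code: i32 },
    Killed,
    /// Was running when the previous session ended; not live anymore.
    Stale,
}

/// Map a persisted status to what should be shown after a restart.
/// Only `Running` changes; terminal states are kept as recorded.
pub fn status_after_restart(persisted: JobStatus) -> JobStatus {
    match persisted {
        JobStatus::Running => JobStatus::Stale,
        other => other,
    }
}
```

Keeping `Stale` as its own variant, rather than reusing `Killed`, lets the view say honestly that the outcome is unknown.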

### Slice 3: MCP manager (#196)

- Add `/mcp` or a command-palette action that opens an MCP manager view.
- Show resolved config path, server enabled/disabled state, transport, command/url, timeout settings, startup errors, and discovered tool/resource/prompt counts.
- Wire `mcp_config_path` into the interactive config surface.
- Support init, add stdio server, add HTTP/SSE server, enable, disable, remove, validate, reconnect, and inspect tools/resources/prompts.
- Preserve both `servers` and `mcpServers` config shapes.

### Slice 4: MCP discoverability (#197)

- Add an MCP command-palette section backed by the same discovery state as the manager.
- Group tools/resources/prompts by server.
- Show disabled/failed servers without blocking palette rendering.
- Keep model-visible names consistent with `mcp_<server>_<tool>`.
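
For the naming rule in the last item, a one-line formatter is enough to show the shape of `mcp_<server>_<tool>`. The function name `model_visible_name` is hypothetical, and the sketch does no sanitizing or collision handling, since the plan does not say how server or tool names containing underscores are treated.

```rust
/// Build the model-visible tool name in the `mcp_<server>_<tool>` form.
/// (Hypothetical helper; inputs are used verbatim.)
pub fn model_visible_name(server: &str, tool: &str) -> String {
    format!("mcp_{}_{}", server, tool)
}
```

Using the same helper in both the manager and the palette would be one way to keep the two views from drifting apart.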

## Stop rules

- Do not close #159 or #162 unless a verified PR actually resolves them.
- Do not add provider expansion.
- Do not rename or migrate anything to Whalescale.
- Do not broaden the TUI into a large redesign; each slice should remain independently testable and shippable.