Document the portable SKILL.md compatibility boundary for Claude Code plugin bundles and keep /skill install from silently flattening plugin archives that carry multiple skills plus plugin.json runtime metadata.
Reported by @AiurArtanis in #2743.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The packaged changelog is now a recent-releases slice produced by
scripts/sync-changelog.sh (which gains a --check mode); also restore the
SECURITY.md contact line the version gate guards, and finish the stale
binary-name sweep (--bin codewhale examples, qa harness doc).
* refactor(palette): remove unused backward-compat aliases and add module docs
- Remove DEEPSEEK_AQUA_RGB, DEEPSEEK_NAVY_RGB, DEEPSEEK_AQUA, DEEPSEEK_NAVY
(unused backward-compatible aliases with no references in production code)
- Add module-level doc comment explaining the three-layer palette organization:
RGB tuples, semantic Color constants, and backward-compat aliases
- Note that some constants are kept for design-system completeness
* fix: remove deprecated color audit test (DEEPSEEK_AQUA no longer exists)
* fix: remove unused import in palette_audit test
---------
Co-authored-by: Hu Qiantao <huqiantao@HudeMacBook-Air.local>
Closes#2074
Adds FauxStep::Factory(Box<dyn Fn(&MessageRequest) -> CannedTurn + Send + Sync>)
to MockLlmClient. When a Factory step is dequeued, its closure runs
against the real outgoing MessageRequest before the response stream is
built, so any assert! panic surfaces directly from the client call
instead of later in stream polling.
Internal storage moves from VecDeque<CannedTurn> to VecDeque<FauxStep>,
but every existing public method keeps working:
- MockLlmClient::new(Vec<CannedTurn>) wraps each turn in FauxStep::Canned.
- push_turn(CannedTurn) appends as FauxStep::Canned.
Adds push_factory(closure) for tests that want the Factory branch.
Doc comment on the Factory variant captures the DeepSeek V4
thinking-mode tool-call invariant (the v0.4.9-v0.5.1 reasoning_content
drop that produced HTTP 400 on follow-up turns).
Adds:
- crates/tui/tests/reasoning_content_replayed_after_tool_call.rs — a
regression test whose factory asserts the assistant tool-call turn
carries a Thinking content block after a thinking + tool-call round.
- An additional unit test in mock.rs covering create_message_synthesizes_from_factory_turn.
All 20 tests in the new file pass, and the existing
integration_mock_llm suite (27 tests) is unchanged.
Harvested from PR #2118 by @Hmbown.
Includes Kimi/Moonshot OAuth, v0.8.45 release prep, the Codex/ChatGPT OAuth removal, open-source-first model defaults, and the safe green PR batch merged into main before the release branch refresh.
Rename brand-bearing string literals across the TUI source and the
system-prompt templates that ship inside the binary. The DeepSeek
provider integration is again left intact: the `ApiProvider::Deepseek`
enum variant, the `"deepseek"` provider name string returned by
`ApiProvider`-to-string mappings, model IDs, the `~/.deepseek/` config
directory and `DEEPSEEK_CONFIG_PATH` env var, the OS keyring key
`"deepseek"`, the Ollama `deepseek-coder*` model defaults, the China
preset alias `deepseek-china`, and the various provider list error
messages all keep the legacy spelling.
Touchpoints:
- `crates/tui/src/prompts/*.md` and `*.txt`: brand language flipped to
`codewhale`; the internal `<deepseek:subagent.done>`,
`<deepseek:subagent_context>`, `<deepseek:fork_state>`,
`<deepseek:tool_call>` XML-ish event tags rename in lockstep to
`<codewhale:…>` so the model-facing format stays consistent.
- `crates/tui/src/tools/subagent/mod.rs`: emits the new event tag.
- `crates/tui/src/core/tool_parser.rs`: parses the new event tag.
- `crates/tui/src/tools/subagent/tests.rs`,
`crates/tui/tests/protocol_recovery.rs`,
`crates/tui/src/prompts.rs`: test expectations updated to match the
new tag and the new prompt text.
- Status / display strings flipped to `codewhale`: `acp_server.rs`'s
agent name + title, `config_ui.rs`'s config schema title,
`share.rs`'s export title, `welcome.rs`'s onboarding banner,
`commands/status.rs`, `core/engine*`, `tui/notifications.rs`,
`tui/sidebar.rs`, `tui/widgets/header.rs`, `tui/widgets/mod.rs`,
`tui/ui.rs`'s resume-hint, `main.rs`'s clap header and `Doctor`
prose, `tui/ui/tests.rs` and other test assertions.
- `crates/tui/src/logging.rs` test fixture: `deepseek_cli=debug` ->
`codewhale_cli=debug` so the log-filter test references the renamed
crate.
- Tracing targets that were namespaced under the brand
(`target: "deepseek::config"`) move to `target: "codewhale::config"`.
- Test-fixture tempdir prefixes (`deepseek-tui-…` / `deepseek-…`) rename
for consistency.
Local gates green: `cargo check --workspace --all-targets --locked`,
`cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets
--all-features --locked -- -D warnings`, `cargo test --workspace
--all-features --locked` (3226+ pass, 0 fail).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The qa_pty integration tests booted the canonical TUI via
`Harness::cargo_bin("deepseek-tui")`. After the previous commit
renamed the canonical bin to `codewhale-tui` and made `deepseek-tui`
a tiny deprecation shim that forwards to it via PATH, those tests
launched the shim and hung waiting for a TUI frame because the
sealed PATH in the test sandbox has no `codewhale-tui`.
Point the harness at `cargo_bin("codewhale-tui")` directly. Also
add a `CARGO_BIN_EXE_codewhale-tui` compile-time fallback in the
harness so older Cargo versions still work, alongside the existing
legacy `deepseek-tui` fallback.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lets the model OCR a screenshot, scanned receipt, whiteboard photo,
or image-only PDF the user drops into the workspace, without
bouncing through `exec_shell` (which would mean an approval prompt
plus the model having to remember tesseract's CLI surface). The
tool spawns `tesseract <image> -` and returns the recognised text
inline — no file is written. Capability is ReadOnly + parallel
since OCR is a side-effect-free read.
Registration is gated on `crate::dependencies::resolve_tesseract()`
via the new `ToolRegistryBuilder::with_image_ocr_tools()` builder,
hooked into `with_agent_tools` alongside `pandoc_convert`. When
tesseract is missing the tool isn't advertised — same
probe-then-decide pattern v0.8.31 introduced for Python. The
execute path also late-resolves so a concurrent uninstall surfaces
the install-tesseract hint rather than the raw spawn failure.
`deepseek doctor`'s "Tool Dependencies" section reports tesseract
status next to pandoc / node / python with platform-aware install
hints. For non-default language packs or PSM modes the user can
still drop into `exec_shell` with the full tesseract CLI surface.
Tests check the metadata (ReadOnly + parallel, not WritesFiles),
the missing-path rejection, and the happy-path OCR round-trip
against `crates/tui/tests/fixtures/ocr_hello.png` — a 2 KB
300×100 grayscale PNG generated with ImageMagick rendering
"HELLO OCR" in Helvetica. The happy-path test skips silently on
hosts without tesseract (matching the catalog-build behaviour) and
on hosts where the fixture isn't checked out (sparse / shallow
clones).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Teach /skill install to recognize compatible skill directories such as .claude/skills/<name>/SKILL.md, nested packages/.../skills/<name>/SKILL.md, and single nested skill repos while still extracting only the selected subtree.
Also make /init treat an existing AGENTS.md as an idempotent no-op so the TUI matches the dispatcher behavior instead of surfacing a scary error for an already-initialized project.
Two pieces:
**#1073 fix.** When a paste burst is currently being assembled, or when
the burst's Enter-suppression window is still open after a flush, the
trailing newline of the paste was firing `submit_input()` and the
in-flight burst buffer was getting destroyed by `clear_after_explicit_paste()`.
The PasteBurst module already exposed `newline_should_insert_instead_of_submit`
and `append_newline_if_active` for exactly this case, but no caller had
been wired up. Added `App::handle_composer_enter`, which checks the
suppression state and either appends `\n` to the burst buffer or inserts
it directly into the composer text — no submit. The `KeyCode::Enter`
arm in the composer event loop now dispatches through that helper.
Reproduces the Windows/PowerShell symptom from the report:
multi-line paste ending with `\n` no longer auto-submits AND the text
no longer leaks into the now-empty composer.
Four unit tests cover: active-burst Enter, post-flush window Enter,
normal Enter outside the window, and Enter with paste-burst detection
disabled (suppression must be off).
**PTY QA harness.** New `crates/tui/tests/support/qa_harness/` wraps
`portable-pty` (already a runtime dep) and `vt100` (new dev-dep) into
a small surface for scenarios that need a real PTY: spawn a binary,
send keys/paste/resize, parse the ANSI stream into a frame, assert
on visible text + filesystem state. The harness seals `$HOME` so
scenarios cannot read the developer's real `~/.deepseek/` and points
the base URL at 127.0.0.1:1 so no live request escapes. README under
`support/qa_harness/README.md` documents how to add a scenario.
Initial scenarios in `crates/tui/tests/qa_pty.rs`: smoke boot,
keystroke round-trip, and bracketed/unbracketed paste-with-trailing-
newline regression guards for #1073. The unbracketed scenario does
not deterministically reproduce the bug on macOS (single-syscall
PTY writes keep the burst continuously active), but the unit tests
above cover the path conclusively; the PTY test stands as a
regression guard for the visible-text invariant.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Merge the v0.8.6 feature batch and release hardening.\n\nIncludes the full #373-#380/#382-#402 milestone scope, version bump to 0.8.6, secure /share temp-file handling, Windows-safe self-update replacement, and CI portability fixes.\n\nRemote PR checks passed on the final head before merge.
The protocol-recovery contract tests `include_str!`-ed `engine.rs` and
asserted the fake-wrapper markers (`[TOOL_CALL]`, `<function_calls>`,
…) appeared as string literals in that file. The recent engine
decomposition refactor (commits f0fad7aa..a64bc9bb) split engine.rs
into `engine/streaming.rs`, `engine/turn_loop.rs`, `engine/dispatch.rs`,
`engine/tool_setup.rs`, `engine/tool_execution.rs`,
`engine/tool_catalog.rs`, `engine/context.rs`, `engine/approval.rs`,
`engine/capacity_flow.rs`, and `engine/lsp_hooks.rs`. The marker
literals followed the code into those files, so the original
single-file `include_str!` no longer saw them and 4 protocol-recovery
tests went red.
Switch to an `ENGINE_SOURCES: &[&str]` array of `include_str!`s across
engine.rs + every submodule, with a small `any_engine_source_contains`
helper. Test bodies are otherwise unchanged. The file-size sanity
check on `engine.rs` (>10_000 bytes) still passes — engine.rs is still
~65k bytes after the refactor.
Same regression coverage as before; just survives the new file layout.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add `crates/tui/src/skills/install.rs` — async installer that pulls
user-authored skills from GitHub repos, raw tarball URLs, or a curated
`index.json` registry. The whole pipeline is gated by the per-domain
`NetworkPolicy` (#135), validated against path-traversal / size / symlink
attacks before any bytes hit the destination, and atomic-renamed into place
so a half-installed skill cannot survive a failure mid-extract.
Public surface:
- `InstallSource::{GitHubRepo,DirectUrl,Registry}` with `parse(spec)`.
- `install` / `install_with_registry` returning
`InstallOutcome::{Installed,NeedsApproval,NetworkDenied}`.
- `update` / `update_with_registry` returning
`UpdateResult::{NoChange,Updated,NeedsApproval,NetworkDenied}` — uses a
SHA-256 over the downloaded tarball to short-circuit no-op fetches.
- `uninstall` / `trust` — both refuse to touch directories without an
`.installed-from` marker, so the bundled `skill-creator` system skill is
protected.
- `fetch_registry` — typed loader for the curated `index.json`.
Validation hard rules (each covered by an integration test):
- `..` segments and absolute paths in tar entries are rejected.
- Symlinks / hardlinks in tar entries are rejected outright.
- Uncompressed total size is bounded by `max_size` (default 5 MiB).
- SKILL.md must exist at the archive root or under `skills/<name>/`.
- Frontmatter must carry both `name` and `description`.
- `install` with an existing destination requires `update = true`.
- `update` re-fetches and only replaces the on-disk install when the
checksum changes; no-change paths skip the rename entirely.
Adds `tar`, `flate2`, and `sha2` to `crates/tui/Cargo.toml` and propagates
the resulting lockfile drift to `Cargo.lock`.
Tests: 11 colocated unit tests in `install.rs` + 11 integration tests in
`crates/tui/tests/skill_install.rs` driving a `tiny_http`-based server so
the network gate, download cap, validation pipeline, and atomic rename
all run end-to-end.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds `integration_mock_llm.rs` covering the LlmClient trait surface:
- streaming turn loop (text deltas + finish reason)
- reasoning-content replay across tool-call rounds (V4 §5.1.1, the
HTTP 400 path that broke v0.4.9-v0.5.1)
- tool-call round-trip with chunked input JSON
- multiple tool calls in one turn preserve event ordering
- compaction-style non-streaming `create_message`
- sub-agent style independent parent/child mocks
- capacity-gate observation of a captured request
Four full-engine tests are `#[ignore]`-marked as BLOCKED on the engine
refactor from concrete `Option<DeepSeekClient>` to `Arc<dyn LlmClient>`.
Once that wiring lands the ignored tests light up with no mock changes.
Adds:
- `tests/support/llm_client.rs` mirrors the trait so the mock can be
brought into the integration test via `#[path]` without dragging in
the rest of the binary's module tree
- `tests/fixtures/.gitkeep` so the `eval --record` output directory
rides the repo
- `tests/README.md` documents both the trait-level mocking strategy
and the `--record` fixture flow
- `record_flag_writes_one_jsonl_line_per_step` in `eval_harness.rs`
exercises the new `--record` flag end-to-end
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a compact `setup --status` view, a `setup --clean` for regenerable
session checkpoints, and `--tools`/`--plugins` scaffolding for
~/.deepseek/{tools,plugins} so the extension model has a documented home
that doctor can count. `doctor --json` lands as a CI-safe alternative to
the human-readable doctor (skips the live API probe).
Also locks down the engine's hostility to fake tool-call wrappers:
filter_tool_call_delta and the marker constants are now testable, the
streaming loop emits one compact status notice per turn when it strips
a wrapper, and a new protocol_recovery integration test asserts that
the legacy text parser never turns <function_calls> into a real tool
call. Adds 23 unit tests + 14 integration tests covering both slices.
- Remove FOOTER_HINT color constant from palette
- Drop footer clock label and related synchronization logic
- Simplify footer status line layout and narrow-terminal handling
- Update tests to align with simplified footer logic
- Remove empty state placeholder text for cleaner UI
- Bump version to 0.3.33
- Move src/* into crates/tui/src/ to create a proper workspace structure
- Add .claude/ and .trimtab/ directories for Trimtab closed-loop workflow
- Add DEPENDENCY_GRAPH.md and update documentation
- Update Cargo.toml files to reflect new crate dependencies
- Update CI workflows and npm package scripts
- All tests pass, release build works