dgf1988/codewhale

Hunter Bown bd603a271c feat(tools): add image_ocr tool — extract text from images via tesseract

Lets the model OCR a screenshot, scanned receipt, whiteboard photo,
or image-only PDF the user drops into the workspace, without
bouncing through `exec_shell` (which would mean an approval prompt
plus the model having to remember tesseract's CLI surface). The
tool spawns `tesseract <image> -` and returns the recognised text
inline — no file is written. Capability is ReadOnly + parallel
since OCR is a side-effect-free read.

Registration is gated on `crate::dependencies::resolve_tesseract()`
via the new `ToolRegistryBuilder::with_image_ocr_tools()` builder,
hooked into `with_agent_tools` alongside `pandoc_convert`. When
tesseract is missing the tool isn't advertised — same
probe-then-decide pattern v0.8.31 introduced for Python. The
execute path also late-resolves so a concurrent uninstall surfaces
the install-tesseract hint rather than the raw spawn failure.

`deepseek doctor`'s "Tool Dependencies" section reports tesseract
status next to pandoc / node / python with platform-aware install
hints. For non-default language packs or PSM modes the user can
still drop into `exec_shell` with the full tesseract CLI surface.

Tests check the metadata (ReadOnly + parallel, not WritesFiles),
the missing-path rejection, and the happy-path OCR round-trip
against `crates/tui/tests/fixtures/ocr_hello.png` — a 2 KB
300×100 grayscale PNG generated with ImageMagick rendering
"HELLO OCR" in Helvetica. The happy-path test skips silently on
hosts without tesseract (matching the catalog-build behaviour) and
on hosts where the fixture isn't checked out (sparse / shallow
clones).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-12 00:58:48 -05:00

fixtures

feat(tools): add image_ocr tool — extract text from images via tesseract

2026-05-12 00:58:48 -05:00

support

fix(release): close v0.8.17 gate gaps

2026-05-07 13:27:31 -05:00

eval_harness.rs

test(tui): #69 integration tests for mock LLM client + record fixtures

2026-04-28 00:03:18 -05:00

integration_mock_llm.rs

feat(tools/agent_spawn): teach parent that subagent results are self-reports

2026-05-10 08:15:19 -05:00

palette_audit.rs

…

protocol_recovery.rs

test(parity): scan engine submodules after decomposition refactor

2026-05-01 09:34:11 -05:00

qa_pty.rs

fix(tui): reset terminal viewport before repaint

2026-05-07 15:17:03 -05:00

README.md

test(tui): #69 integration tests for mock LLM client + record fixtures

2026-04-28 00:03:18 -05:00

skill_install.rs

fix(skills): accept workflow pack archive layouts (#1164)

2026-05-08 02:37:21 -05:00

README.md

`crates/tui/tests/`

Integration tests for the TUI binary. Per `CONTRIBUTING.md`, each crate's integration tests live in its own `tests/` directory; the repository-root `tests/` directory is unused.

Mock LLM client (`integration_mock_llm.rs`)

`crates/tui/src/llm_client/mock.rs` provides a `MockLlmClient` that implements the `LlmClient` trait by replaying canned responses from a queue and capturing every outgoing `MessageRequest`. Tests mock at the trait boundary, never at the `reqwest` HTTP layer, because the trait is the durable abstraction the runtime is meant to depend on.
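The queue-and-capture idea can be sketched roughly as follows. This is only an illustration: the `LlmClient` trait, `MessageRequest` struct, and the `push_response` / `captured_requests` helpers shown here are simplified, synchronous stand-ins invented for the example, not the crate's actual (presumably async, streaming) definitions.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the crate's request type.
#[derive(Debug, Clone, PartialEq)]
pub struct MessageRequest {
    pub prompt: String,
}

/// Hypothetical, synchronous stand-in for the `LlmClient` trait.
pub trait LlmClient {
    fn create_message(&self, request: MessageRequest) -> Result<String, String>;
}

/// Replays queued responses in FIFO order and records every request it sees.
#[derive(Default)]
pub struct MockLlmClient {
    responses: Mutex<VecDeque<String>>,
    captured: Mutex<Vec<MessageRequest>>,
}

impl MockLlmClient {
    pub fn new() -> Self {
        Self::default()
    }

    /// Queue one canned response for a later call.
    pub fn push_response(&self, response: &str) {
        self.responses.lock().unwrap().push_back(response.to_string());
    }

    /// Snapshot of every request captured so far, in call order.
    pub fn captured_requests(&self) -> Vec<MessageRequest> {
        self.captured.lock().unwrap().clone()
    }
}

impl LlmClient for MockLlmClient {
    fn create_message(&self, request: MessageRequest) -> Result<String, String> {
        // Capture first, so a test can inspect the request even when the queue is empty.
        self.captured.lock().unwrap().push(request);
        self.responses
            .lock()
            .unwrap()
            .pop_front()
            .ok_or_else(|| "mock response queue is empty".to_string())
    }
}
```

A test would queue its responses up front, drive the code under test through the trait, and then assert on `captured_requests()`. Because nothing below the trait is touched, no HTTP layer needs to be stubbed.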

Coverage today exercises the trait surface end-to-end:

- streaming turn loop
- reasoning-content replay across tool-call rounds (V4 §5.1.1, the bug that broke v0.4.9-v0.5.1)
- tool-call round-trip with chunked input JSON
- multi-tool-call ordering inside a single turn
- compaction-style non-streaming `create_message`
- sub-agent style independent parent/child mocks
- capacity-gate observation of a captured request before stream drain

Four full-engine tests (`engine_full_*`) are marked `#[ignore]`. They can be enabled once `core::engine::Engine` is refactored to take `Arc<dyn LlmClient>` instead of a concrete `Option<DeepSeekClient>`. See the comment block at the bottom of `integration_mock_llm.rs` for the exact refactor surface.
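As a rough picture of what that refactor buys, the sketch below injects the client as a trait object. Everything in it is hypothetical: the one-method `LlmClient` trait, the `Engine::new` / `run_turn` signatures, and the `EchoClient` test double are illustrative only and do not reflect the real engine's API.

```rust
use std::sync::Arc;

/// Hypothetical, minimal stand-in for the `LlmClient` trait.
pub trait LlmClient: Send + Sync {
    fn complete(&self, prompt: &str) -> String;
}

/// Sketch of an engine that depends on the trait, not on a concrete client.
pub struct Engine {
    client: Arc<dyn LlmClient>,
}

impl Engine {
    /// Any implementation can be injected: a real client in production,
    /// a mock in tests.
    pub fn new(client: Arc<dyn LlmClient>) -> Self {
        Self { client }
    }

    /// Forward one user turn to whichever client was injected.
    pub fn run_turn(&self, user_input: &str) -> String {
        self.client.complete(user_input)
    }
}

/// Trivial test double that echoes the prompt back.
pub struct EchoClient;

impl LlmClient for EchoClient {
    fn complete(&self, prompt: &str) -> String {
        format!("echo: {prompt}")
    }
}
```

With this shape, a test can construct `Engine::new(Arc::new(EchoClient))` and exercise the turn loop without any network access.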

`--record` mode for `deepseek eval`

The offline `deepseek eval` harness now accepts `--record <DIR>`. When set, each tool step appends one JSON Lines record to `<DIR>/<scenario>.jsonl` (the default scenario writes `offline-tool-loop.jsonl`). Each line is a self-contained JSON object with the schema:

{ "request":  { "step": "list_dir", "kind": "List" },
  "response_events": [ { "type": "ok", "output": "…" } ] }
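The recording side could look roughly like the sketch below, which formats one record in the shape shown above and appends it as a single line. The function names (`json_string`, `format_record`, `append_record`) are hypothetical, the event type is hard-coded to `"ok"` for brevity, and the JSON is built by hand only to keep the example dependency-free; the real harness may well use a serializer instead.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Encode a string as a JSON string literal (quotes, backslashes, control chars).
fn json_string(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}

/// Build one single-line record with a single "ok" response event.
pub fn format_record(step: &str, kind: &str, output: &str) -> String {
    format!(
        "{{\"request\":{{\"step\":{},\"kind\":{}}},\"response_events\":[{{\"type\":\"ok\",\"output\":{}}}]}}",
        json_string(step),
        json_string(kind),
        json_string(output)
    )
}

/// Append one record line to `<dir>/<scenario>.jsonl`, creating both if needed.
pub fn append_record(dir: &Path, scenario: &str, line: &str) -> io::Result<()> {
    fs::create_dir_all(dir)?;
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open(dir.join(format!("{scenario}.jsonl")))?;
    writeln!(file, "{line}")
}
```

Opening the file in append mode keeps earlier steps intact, so each tool step adds exactly one line to the fixture.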

The mock LLM client (`crate::llm_client::mock`) replays these fixtures by mapping each `response_events` array onto a canned `Vec<StreamEvent>`. Drop generated fixtures into `crates/tui/tests/fixtures/` so they ride the repo and feed the mock in CI.
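One way to picture the replay mapping is the sketch below. It assumes the JSON has already been deserialized into a `RecordedEvent`, and both that struct and the `StreamEvent` variants here (`TextDelta`, `Error`, `MessageStop`) are hypothetical; the crate's real `StreamEvent` and its mapping rules may differ.

```rust
/// Already-deserialized form of one entry in `response_events` (hypothetical).
#[derive(Debug, Clone, PartialEq)]
pub struct RecordedEvent {
    /// The "type" field of the recorded event, e.g. "ok".
    pub kind: String,
    /// The "output" field of the recorded event.
    pub output: String,
}

/// Hypothetical stand-in for the crate's stream event type.
#[derive(Debug, Clone, PartialEq)]
pub enum StreamEvent {
    TextDelta(String),
    Error(String),
    MessageStop,
}

/// Map one recorded `response_events` array onto a canned event sequence.
/// Assumption: "ok" becomes a text delta, any other type becomes an error,
/// and a terminal `MessageStop` is appended so the stream always ends.
pub fn to_stream_events(recorded: &[RecordedEvent]) -> Vec<StreamEvent> {
    let mut events: Vec<StreamEvent> = recorded
        .iter()
        .map(|event| match event.kind.as_str() {
            "ok" => StreamEvent::TextDelta(event.output.clone()),
            other => StreamEvent::Error(format!("{other}: {}", event.output)),
        })
        .collect();
    events.push(StreamEvent::MessageStop);
    events
}
```

The resulting vector is what a queue-driven mock would hand back, one fixture line per model turn.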

Quick example:

cargo run --bin deepseek -- eval --record crates/tui/tests/fixtures
jq . crates/tui/tests/fixtures/offline-tool-loop.jsonl

The scenario name is sanitized to `[A-Za-z0-9_-]` before forming the filename, so unusual scenario strings stay portable across platforms.
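A minimal sketch of such a sanitizer is shown below. The function names are hypothetical, and replacing each disallowed character with `_` is an assumption; the harness might drop those characters or substitute something else.

```rust
use std::path::{Path, PathBuf};

/// Keep ASCII letters, digits, `_` and `-`; replace every other character
/// with `_` (the replacement character is an assumption of this sketch).
pub fn sanitize_scenario_name(name: &str) -> String {
    name.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .collect()
}

/// Form the fixture path `<dir>/<sanitized scenario>.jsonl`.
pub fn fixture_path(dir: &Path, scenario: &str) -> PathBuf {
    dir.join(format!("{}.jsonl", sanitize_scenario_name(scenario)))
}
```

Under these assumptions, a scenario named `my scenario/v2.0` would be written to `my_scenario_v2_0.jsonl`, with no path separators or spaces left in the filename.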
