dgf1988/codewhale

Hunter Bown 4e285595b0 fix(tui): paste-Enter must not auto-submit (#1073) + PTY QA harness

Two pieces:

**#1073 fix.** When a paste burst is currently being assembled, or when
the burst's Enter-suppression window is still open after a flush, the
trailing newline of the paste was firing `submit_input()` and the
in-flight burst buffer was getting destroyed by `clear_after_explicit_paste()`.
The PasteBurst module already exposed `newline_should_insert_instead_of_submit`
and `append_newline_if_active` for exactly this case, but no caller had
been wired up. Added `App::handle_composer_enter`, which checks the
suppression state and either appends `\n` to the burst buffer or inserts
it directly into the composer text — no submit. The `KeyCode::Enter`
arm in the composer event loop now dispatches through that helper.
Reproduces the Windows/PowerShell symptom from the report:
multi-line paste ending with `\n` no longer auto-submits AND the text
no longer leaks into the now-empty composer.

Four unit tests cover: active-burst Enter, post-flush window Enter,
normal Enter outside the window, and Enter with paste-burst detection
disabled (suppression must be off).
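
The Enter dispatch described above can be sketched roughly as follows. This is a simplified model, not the code from the commit: only the names `handle_composer_enter`, `newline_should_insert_instead_of_submit`, and `append_newline_if_active` come from the message, while the struct fields, the `EnterOutcome` enum, and all signatures are assumptions made for illustration.

```rust
/// Simplified stand-in for the paste-burst state. Field names are hypothetical.
#[derive(Default)]
struct PasteBurst {
    /// Whether paste-burst detection is turned on at all.
    enabled: bool,
    /// Text of a burst that is still being assembled, if any.
    active_buffer: Option<String>,
    /// True while the post-flush Enter-suppression window is open.
    suppression_window_open: bool,
}

impl PasteBurst {
    /// Enter should become a newline while a burst is active or its
    /// suppression window is open, but only if detection is enabled.
    fn newline_should_insert_instead_of_submit(&self) -> bool {
        self.enabled && (self.active_buffer.is_some() || self.suppression_window_open)
    }

    /// Appends '\n' to the in-flight burst. Returns false if no burst is active.
    fn append_newline_if_active(&mut self) -> bool {
        if !self.enabled {
            return false;
        }
        if let Some(buf) = self.active_buffer.as_mut() {
            buf.push('\n');
            true
        } else {
            false
        }
    }
}

/// Hypothetical result type so the three paths are easy to tell apart.
#[derive(Debug, PartialEq)]
enum EnterOutcome {
    AppendedToBurst,
    InsertedIntoComposer,
    Submitted(String),
}

#[derive(Default)]
struct App {
    paste_burst: PasteBurst,
    composer: String,
}

impl App {
    fn handle_composer_enter(&mut self) -> EnterOutcome {
        if self.paste_burst.newline_should_insert_instead_of_submit() {
            // Active burst: keep the newline inside the burst buffer.
            if self.paste_burst.append_newline_if_active() {
                return EnterOutcome::AppendedToBurst;
            }
            // Post-flush window: insert the newline into the composer text.
            self.composer.push('\n');
            return EnterOutcome::InsertedIntoComposer;
        }
        // Normal Enter: submit and clear the composer.
        EnterOutcome::Submitted(std::mem::take(&mut self.composer))
    }
}
```

The four outcomes exercised by the unit tests map onto this model directly: active burst, open suppression window, neither, and detection disabled.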

**PTY QA harness.** New `crates/tui/tests/support/qa_harness/` wraps
`portable-pty` (already a runtime dep) and `vt100` (new dev-dep) into
a small surface for scenarios that need a real PTY: spawn a binary,
send keys/paste/resize, parse the ANSI stream into a frame, assert
on visible text + filesystem state. The harness seals `$HOME` so
scenarios cannot read the developer's real `~/.deepseek/` and points
the base URL at 127.0.0.1:1 so no live request escapes. README under
`support/qa_harness/README.md` documents how to add a scenario.
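
As a rough illustration of the environment sealing only (not the PTY or `vt100` parts), a child command could be prepared like this with the standard library. The environment variable name `DEEPSEEK_BASE_URL`, the `http://` scheme, the `USERPROFILE` override, and the function name `sealed_command` are assumptions; the actual harness may do this differently.

```rust
use std::path::Path;
use std::process::Command;

/// Hypothetical name of the variable that overrides the API base URL.
const BASE_URL_VAR: &str = "DEEPSEEK_BASE_URL";

/// Builds a command whose home directory points at a throwaway location and
/// whose base URL targets a local port that is not expected to accept requests.
fn sealed_command(binary: &Path, sealed_home: &Path) -> Command {
    let mut cmd = Command::new(binary);
    cmd.env("HOME", sealed_home)
        // Assumed Windows counterpart of $HOME.
        .env("USERPROFILE", sealed_home)
        .env(BASE_URL_VAR, "http://127.0.0.1:1");
    cmd
}
```

A real scenario would hand an equivalent environment to the PTY spawn call rather than to `std::process::Command`.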

Initial scenarios in `crates/tui/tests/qa_pty.rs`: smoke boot,
keystroke round-trip, and bracketed/unbracketed paste-with-trailing-
newline regression guards for #1073. The unbracketed scenario does
not deterministically reproduce the bug on macOS (single-syscall
PTY writes keep the burst continuously active), but the unit tests
above cover the path conclusively; the PTY test stands as a
regression guard for the visible-text invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-07 12:23:57 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| fixtures | test(tui): #69 integration tests for mock LLM client + record fixtures | 2026-04-28 00:03:18 -05:00 |
| support | fix(tui): paste-Enter must not auto-submit (#1073) + PTY QA harness | 2026-05-07 12:23:57 -05:00 |
| eval_harness.rs | test(tui): #69 integration tests for mock LLM client + record fixtures | 2026-04-28 00:03:18 -05:00 |
| integration_mock_llm.rs | v0.8.6: survivability, UX polish, and release hardening | 2026-05-02 20:11:33 -05:00 |
| palette_audit.rs | Footer polish: remove FOOTER_HINT, simplify footer rendering | 2026-04-11 20:20:18 -05:00 |
| protocol_recovery.rs | test(parity): scan engine submodules after decomposition refactor | 2026-05-01 09:34:11 -05:00 |
| qa_pty.rs | fix(tui): paste-Enter must not auto-submit (#1073) + PTY QA harness | 2026-05-07 12:23:57 -05:00 |
| README.md | test(tui): #69 integration tests for mock LLM client + record fixtures | 2026-04-28 00:03:18 -05:00 |
| skill_install.rs | fix(skills): ignore symlinks outside selected install root (#814) | 2026-05-06 02:36:57 -05:00 |

README.md

`crates/tui/tests/`

Integration tests for the TUI binary. Per `CONTRIBUTING.md`, each crate's integration tests live in its own `tests/` directory; the repository-root `tests/` directory is unused.

Mock LLM client (`integration_mock_llm.rs`)

`crates/tui/src/llm_client/mock.rs` provides a `MockLlmClient` that implements the `LlmClient` trait by replaying queue-driven canned responses and capturing every outgoing `MessageRequest`. Tests mock at the trait boundary, never at the `reqwest` HTTP layer, because the trait is the durable abstraction the runtime is meant to depend on.
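
A minimal sketch of the queue-and-capture pattern is shown here for orientation. It is not the contents of `mock.rs`: the trait is reduced to one synchronous method, and the `MessageRequest` fields, the `StreamEvent` variants, and the helper names `push_response`, `captured_requests`, and `run_turn` are all assumptions.

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Hypothetical, heavily reduced request type.
#[derive(Clone, Debug, PartialEq)]
struct MessageRequest {
    prompt: String,
}

/// Hypothetical, heavily reduced stream event type.
#[derive(Clone, Debug, PartialEq)]
enum StreamEvent {
    TextDelta(String),
    Done,
}

/// Simplified synchronous stand-in for the real trait.
trait LlmClient {
    fn stream_message(&self, request: MessageRequest) -> Vec<StreamEvent>;
}

#[derive(Default)]
struct MockLlmClient {
    /// Canned responses, replayed in FIFO order (one entry per call).
    responses: Mutex<VecDeque<Vec<StreamEvent>>>,
    /// Every request the code under test sent, in call order.
    captured: Mutex<Vec<MessageRequest>>,
}

impl MockLlmClient {
    fn push_response(&self, events: Vec<StreamEvent>) {
        self.responses.lock().unwrap().push_back(events);
    }

    fn captured_requests(&self) -> Vec<MessageRequest> {
        self.captured.lock().unwrap().clone()
    }
}

impl LlmClient for MockLlmClient {
    fn stream_message(&self, request: MessageRequest) -> Vec<StreamEvent> {
        // Record the request first so a test can inspect it afterwards.
        self.captured.lock().unwrap().push(request);
        self.responses
            .lock()
            .unwrap()
            .pop_front()
            .expect("mock response queue is empty")
    }
}

/// Code under test depends only on the trait, never on a concrete client.
fn run_turn(client: &dyn LlmClient, prompt: &str) -> String {
    client
        .stream_message(MessageRequest { prompt: prompt.to_string() })
        .into_iter()
        .filter_map(|event| match event {
            StreamEvent::TextDelta(text) => Some(text),
            StreamEvent::Done => None,
        })
        .collect()
}
```

Because `run_turn` takes `&dyn LlmClient`, the same function can be exercised with the mock in tests and with a real client at runtime.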

Coverage today exercises the trait surface end-to-end:

- streaming turn loop
- reasoning-content replay across tool-call rounds (V4 §5.1.1, the bug that broke v0.4.9-v0.5.1)
- tool-call round-trip with chunked input JSON
- multi-tool-call ordering inside a single turn
- compaction-style non-streaming `create_message`
- sub-agent-style independent parent/child mocks
- capacity-gate observation of a captured request before stream drain

Four full-engine tests (`engine_full_*`) are marked `#[ignore]`. They can be enabled once `core::engine::Engine` is refactored to take `Arc<dyn LlmClient>` instead of a concrete `Option<DeepSeekClient>`. See the comment block at the bottom of `integration_mock_llm.rs` for the exact refactor surface.

`--record` mode for `deepseek eval`

The offline `deepseek eval` harness accepts `--record <DIR>`. When set, each tool step appends one JSON Lines record to `<DIR>/<scenario>.jsonl` (the default scenario writes `offline-tool-loop.jsonl`). Each line is a self-contained JSON object with the schema:

{ "request":  { "step": "list_dir", "kind": "List" },
  "response_events": [ { "type": "ok", "output": "…" } ] }

The mock LLM client (`crate::llm_client::mock`) replays these fixtures by mapping each `response_events` array onto a canned `Vec<StreamEvent>`. Drop generated fixtures into `crates/tui/tests/fixtures/` so they are committed with the repo and feed the mock in CI.

Quick example:

cargo run --bin deepseek -- eval --record crates/tui/tests/fixtures
cat crates/tui/tests/fixtures/offline-tool-loop.jsonl | jq .

The scenario name is sanitized to `[A-Za-z0-9_-]` before forming the filename, so unusual scenario strings stay portable across platforms.
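
One way such sanitization could look is sketched below. The function names are hypothetical, and replacing each disallowed character with `_` is an assumption; the harness might drop those characters or handle them differently.

```rust
use std::path::{Path, PathBuf};

/// Keeps ASCII letters, digits, '_' and '-'; every other character becomes '_'.
/// (Replacement rather than removal is an assumption of this sketch.)
fn sanitize_scenario(name: &str) -> String {
    name.chars()
        .map(|c| {
            if c.is_ascii_alphanumeric() || c == '_' || c == '-' {
                c
            } else {
                '_'
            }
        })
        .collect()
}

/// Forms `<dir>/<sanitized scenario>.jsonl`.
fn fixture_path(dir: &Path, scenario: &str) -> PathBuf {
    dir.join(format!("{}.jsonl", sanitize_scenario(scenario)))
}
```

With this scheme a name such as `a/b c:d` can never introduce a path separator or a drive prefix into the fixture filename.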
