The packaged changelog is now a recent-releases slice produced by scripts/sync-changelog.sh (which gains a --check mode); also restore the SECURITY.md contact line the version gate guards, and finish the stale binary-name sweep (--bin codewhale examples, qa harness doc).
PTY/frame-capture TUI QA harness
Tiny helper for integration tests that need to drive deepseek-tui like a real
user typing in a real terminal — keys, paste, resize, plus assertions over the
parsed terminal frame and the workspace filesystem.
When to use this
Reach for this harness when a bug only shows up in the interactive
terminal: paste behaviour, slash menus, mode switching, viewport rendering,
onboarding flow, resize, mouse capture. Anything where a TestBackend or a
unit test on the underlying state machine is too divorced from what the user
actually sees.
For pure logic tests on App, SkillRegistry, the engine's Op / Event
plumbing, etc., keep using crates/tui/src/.../tests style unit tests. Don't
spin up a PTY just to assert a function returns the right value.
Anatomy
- pty.rs: PtySession. Spawns a binary in a real PTY (via portable-pty), pumps the child's stdout into a buffer on a background thread, exposes write_bytes, resize, drain, shutdown.
- frame.rs: Frame. Wraps vt100::Parser. Feed bytes in, ask questions out: text(), row(y), contains(s), cursor(), debug_dump().
- keys.rs: byte-sequence builders for keys (key::ctrl('c'), key::enter(), key::tab(), …) and for paste (paste::bracketed(s), paste::unbracketed(s)).
- harness.rs: Harness. Composes the two. Has wait_for, wait_for_text, wait_for_idle, plus make_sealed_workspace() for a tempdir HOME.
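To make the keys.rs idea concrete, here is a minimal sketch of what such byte-sequence builders could look like. The function names mirror the ones listed above, but the signatures and return types are assumptions, not the harness's actual API. The byte values themselves are standard terminal conventions: Ctrl+letter is the letter masked with 0x1f, Enter is a carriage return, and bracketed paste wraps the payload in ESC[200~ and ESC[201~.

```rust
// Hypothetical stand-ins for the builders in keys.rs; the real module may
// use different signatures or return types.
pub mod key {
    /// Ctrl+<ASCII letter>: the letter's code masked with 0x1f (Ctrl-C -> 0x03).
    pub fn ctrl(c: char) -> Vec<u8> {
        debug_assert!(c.is_ascii_alphabetic());
        vec![(c.to_ascii_uppercase() as u8) & 0x1f]
    }

    /// Enter as a terminal in raw mode sends it: a carriage return.
    pub fn enter() -> Vec<u8> {
        vec![b'\r']
    }

    /// Tab is a single horizontal-tab byte.
    pub fn tab() -> Vec<u8> {
        vec![b'\t']
    }

    /// A plain character, encoded as UTF-8.
    pub fn ch(c: char) -> Vec<u8> {
        c.to_string().into_bytes()
    }
}

pub mod paste {
    /// Bracketed paste: payload wrapped in ESC[200~ ... ESC[201~.
    pub fn bracketed(s: &str) -> Vec<u8> {
        let mut out = b"\x1b[200~".to_vec();
        out.extend_from_slice(s.as_bytes());
        out.extend_from_slice(b"\x1b[201~");
        out
    }

    /// Unbracketed paste: the raw bytes, as if typed very quickly.
    pub fn unbracketed(s: &str) -> Vec<u8> {
        s.as_bytes().to_vec()
    }
}
```

Keeping these as pure functions that return bytes means a scenario can be read as a sequence of user actions, and the builders themselves can be checked without a PTY.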
Adding a new scenario
- Pick the smallest set of inputs that reproduce the user-visible behaviour. If you can't reproduce it without a real LLM turn, the scenario probably belongs in a unit test (or a wiremock-driven turn test) instead.
- Build a sealed workspace so the scenario doesn't see the developer's real ~/.deepseek/ or API keys:

      let ws = qa_harness::harness::make_sealed_workspace()?;
      std::fs::write(ws.user_skills_dir().join("foo/SKILL.md"), "...")?;

- Spawn:

      let mut h = Harness::builder(Harness::cargo_bin("codewhale-tui"))
          .cwd(ws.workspace())
          .seal_home(ws.home())
          .env("DEEPSEEK_API_KEY", "ci-test-key")
          .args(["--workspace", ws.workspace().to_str().unwrap(),
                 "--no-project-config", "--skip-onboarding"])
          .size(40, 120)
          .spawn()?;

- Drive it:

      h.wait_for_text("Composer", Duration::from_secs(10))?;
      h.send(keys::key::ch('/'))?;
      h.wait_for_text("/skills", Duration::from_secs(2))?;

- Assert:

      let f = h.frame();
      assert!(f.contains("local-skill"), "frame:\n{}", f.debug_dump());

- Always shut down cleanly at the end so the PTY cleanup runs even on a failing assertion:

      let _ = h.shutdown();
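A call placed after the assertions is skipped when one of them panics. One way to get the "even on a failing assertion" behaviour is a drop guard, sketched below. ShutdownGuard is a hypothetical helper, not part of the harness, and it is an assumption that Harness does not already clean up in its own Drop impl; if it does, the guard is unnecessary.

```rust
/// Hypothetical helper: runs the wrapped closure when dropped, which
/// includes the unwind triggered by a failing assert!.
pub struct ShutdownGuard<F: FnOnce()>(Option<F>);

impl<F: FnOnce()> ShutdownGuard<F> {
    pub fn new(cleanup: F) -> Self {
        ShutdownGuard(Some(cleanup))
    }
}

impl<F: FnOnce()> Drop for ShutdownGuard<F> {
    fn drop(&mut self) {
        // take() guarantees the closure runs at most once.
        if let Some(cleanup) = self.0.take() {
            cleanup();
        }
    }
}
```

In a scenario, the guard would be created right after spawning and own whatever handle the cleanup needs, so that assertions below it can fail without leaking the child process. This relies on the default unwinding panic strategy; it does not run under panic = "abort".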
Conventions
- Sealed env always. No scenario should be able to see the real $HOME/.deepseek/ or contact api.deepseek.com. If a scenario has to do a real model turn, route through a local wiremock or tiny_http fake provider and pass DEEPSEEK_BASE_URL=<localhost>.
- Fail noisily. When an assertion fails, print frame.debug_dump() so the CI log shows the rendered screen, not just assertion failed.
- Prefer wait_for_text over sleep. A scenario that sleeps 500ms before asserting will flake under CI load. A scenario that polls with a 10s timeout is robust.
- Expect output to be slow on first launch. The TUI does config probing, skill installation, and snapshot cleanup before showing the composer. Give startup at least 10–15 seconds before timing out.
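The polling convention can be illustrated with a small stdlib-only sketch. wait_until is a hypothetical function written for this example; it is not the harness's wait_for or wait_for_text, whose signatures may differ. It returns as soon as the condition holds, so a generous timeout costs nothing on a fast machine and only matters when CI is slow.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical polling helper: re-checks `cond` every `poll` interval
/// until it returns true or `timeout` has elapsed.
pub fn wait_until<F: FnMut() -> bool>(
    mut cond: F,
    timeout: Duration,
    poll: Duration,
) -> Result<Duration, String> {
    let start = Instant::now();
    loop {
        if cond() {
            // Report how long the wait actually took.
            return Ok(start.elapsed());
        }
        if start.elapsed() >= timeout {
            return Err(format!("condition not met within {:?}", timeout));
        }
        thread::sleep(poll);
    }
}
```

A fixed sleep, by contrast, always pays its full duration and still fails whenever the machine is slower than the guess.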
Platforms
portable-pty works on macOS, Linux, and Windows (ConPTY). Today the
scenarios target Unix only — the test binary is gated with
#![cfg(unix)] until the Windows-specific input plumbing has been audited
under the same harness.