codewhale

dgf1988/codewhale

Author	SHA1	Message	Date
Paulo Aboim Pinto	a3ae974676	feat(tui): add ExternalTool abstraction layer (#2294) Harvested with thanks to @aboimpinto. Includes the ExternalTool abstraction layer plus follow-up fixes for lossy REPL stdout handling and unquoted unicode git diff paths. Validation included full CI and focused local checks for non-UTF8 REPL stdout, git_diff, and external_tool behavior.	2026-05-31 02:24:25 -07:00
Hunter Bown	2947eff9d1	fix(ci): satisfy Rust 1.88 clippy gate	2026-05-24 01:20:19 -05:00
Hunter Bown	b7bc8773f3	fix(tui): fail stuck stream turns and smooth RLM handles	2026-05-23 20:08:57 -05:00
Hunter Bown	32ce14d6b2	test(rebrand): residual brand-string cleanup across source and assets A small cleanup pass to catch brand mentions that the R5 sweep missed because they hid in: - HTTP User-Agent format strings (`Mozilla/5.0 (compatible; deepseek-tui/` in `client.rs` and `fetch_url.rs`). - Multi-line error messages whose phrase boundary straddled a line break ("…restart\n deepseek-tui." in `js_execution.rs`, `tool_catalog.rs`, `repl/runtime.rs`). - Doc comments mentioning `deepseek-tui` as a binary (`config/src/lib.rs`, `core/capacity.rs`, `tui/streaming/chunking.rs`, `features.rs`). - Skill descriptions shipped in `crates/tui/assets/skills/*/SKILL.md`. - Test fixtures with placeholder paths / git emails (`tui/external_editor.rs`, `snapshot/repo.rs`). - `task_manager.rs`'s `cargo test -p deepseek-tui --lib` example. - `scripts/tencent-lighthouse/doctor.sh` info-line prefix. The remaining `deepseek-tui` mentions in the codebase are intentional (the legacy `[[bin]]` entry in `crates/tui/Cargo.toml`, the legacy `npm/deepseek-tui/` deprecation shim package, the CNB mirror namespace, the security email, the legacy bin's shim source file, and historical CHANGELOG entries) and were preserved per the rebrand anti-scope. Local gates green: `cargo check --workspace --all-targets --locked`, `cargo fmt --all -- --check`, `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings`, `cargo test --workspace --all-features --locked` (3226+ pass, 0 fail). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-23 11:58:34 -05:00
Hunter Bown	6821f294e2	fix(rlm): decode stdout lossily	2026-05-21 00:12:59 +08:00
Hunter Bown	d5c45d962d	chore(release): prepare v0.8.36 Squash merge of work/v0.8.36-cache-hygiene into main. All preflight gates passed: version-drift/check/lint/test (3073 pass, 0 fail) / CodeQL / GitGuardian / npm-smoke. Preparing the v0.8.36 release tag.	2026-05-14 00:31:18 -05:00
Hunter Bown	a507885fb8	test(rlm): tolerate windows python line endings	2026-05-12 22:18:59 -05:00
Hunter Bown	485ba7bbd4	chore(release): finish v0.8.33 polish	2026-05-12 22:03:47 -05:00
Hunter Bown	99c6b22e83	chore(release): v0.8.33 — sub-agent and RLM renovation with persistent sessions - Persistent RLM sessions (rlm_open/rlm_eval/rlm_close) with bounded REPL helpers - Fork-aware sub-agent sessions (agent_open/agent_eval/agent_close) with handle_read - Shared handle_read storage with slice/range/count/JSONPath projections - Slash-command routing: /rlm, /agent, /relay (/接力) for handoff prompts - Sidebar renamed to "Work" tab, consistent across Plan/Agent/YOLO modes - Tool papercuts: file_search excludes, grep_files strings, fetch_url JSON, edit_file fuzz, exec_shell merged stdout/stderr, revert_turn no-op reject - CLI reasoning-effort honoured on non-auto exec routes (#1511 @h3c-hexin) - Edit-file replacement boundaries clarified (#1516) - Pandoc output validated before probing (#1523) - Running turns steerable/repaintable (#1533, #1537) - Tasks/Activity Detail calmer under load - npm retry timeout hint (#1538 @reidliu41) - Issue templates improved (#1525 @reidliu41) - Shell: kill process group to prevent UI freeze (#828 @CrepuscularIRIS) - TUI: ignore leaked SGR mouse reports in composer (#1421 @reidliu41) - Footer: keep chips within available width (#1417 @Wenjunyun123) - Session picker: scope Ctrl+R to current workspace (#1395 @LinQ) - Removed stale competitive-analysis doc - Prompts/docs teach only new tool names	2026-05-12 19:54:08 -05:00
Hunter Bown	e6d4eae5d6	fix(security): scrub child process environments	2026-05-08 14:24:07 -05:00
Hunter Bown	4f77c625fd	fix(tui): forward-port v0.8.16 hotfix to main Forward-port the v0.8.16 RLM/sub-agent hotfix onto main after tagging the release branch.	2026-05-07 00:04:31 -05:00
Hunter Bown	03e59c60ce	fix(rlm): pin child calls to flash (#832)	2026-05-06 03:41:47 -05:00
Hunter Bown	3f24759966	release: stabilize shell handles for v0.8.0 Bumps the workspace/npm wrapper to 0.8.0 and fixes completed background shell jobs retaining live process handles, which could cause Too many open files, checkpoint save failures, shell spawn failures, and lag around send/close/Esc. Also includes Windows REPL bootstrap timeout hardening and Cargo/TUNA mirror install docs.	2026-04-30 21:34:00 -05:00
Hunter Bown	2865c9a766	refactor(rlm): drop HTTP sidecar — long-lived Python REPL over stdin/stdout The RLM tool used to spawn a fresh `python3 -c "..."` per round and route sub-LLM calls through a localhost axum sidecar; state persisted only via a JSON file (lossy: imports and non-JSON values were lost). The model could also short-circuit by replying with prose and the loop would ship the prose as if it came from the REPL. This commit replaces that with one long-lived `python3 -u` subprocess per turn driven by a stdin/stdout RPC protocol with UUID-prefixed sentinels. No more HTTP server, no more port allocation, no more JSON state file — variables, imports, and any other Python state persist naturally across rounds. The `RlmBridge` (`crates/tui/src/rlm/bridge.rs`) services `llm_query` / `llm_query_batched` / `rlm_query` / `rlm_query_batched` calls inline, recursing into `run_rlm_turn_inner` for sub-RLMs. The system prompt is tightened: the only legal turn shape is one ` ```repl ` block; calling `FINAL(...)` from prose without ever invoking a sub-LLM is rejected with a strict reminder. The `DirectAnswer` termination is gone, replaced by `NoCode` which only surfaces after multiple consecutive empty rounds. `rlm_process` now returns a per-round trace (code summary, sub-LLM call count, elapsed) so callers can verify the model actually engaged with `context` rather than guessing from the preview. Net: -313 lines. 17 new REPL runtime tests cover variable persistence, import persistence, RPC round-trips, FINAL capture, and error recovery. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:44:53 -05:00
Hunter Bown	5cec1534be	feat(rlm): align with reference impl + add rlm_process tool; bump 0.6.5 The previous /rlm slash command flow had a UI rendering gap (the answer never made it back to the model's view) and required the user to invoke it manually. Pivoting to a tool-call surface and aligning the in-REPL helpers with the canonical reference (alexzhang13/rlm) by the paper authors so the same prompts and decomposition patterns transfer. New tool: rlm_process - crates/tui/src/tools/rlm_process.rs - Inputs: task (small, shown to root LLM each iter as root_prompt) + exactly one of file_path (workspace-relative, preferred) or content (inline, capped at 200k chars). Optional child_model and max_depth. - Loaded across Plan/Agent/YOLO; never deferred via ToolSearch. - Returns the final answer string + metadata (iterations, duration, tokens, termination). REPL surface aligned with reference (alexzhang13/rlm): - Variable name `context` (was PROMPT) - Code fence ```repl (was ```python; python/py kept as fallback) - Helpers: llm_query, llm_query_batched (NEW), rlm_query (was sub_rlm), rlm_query_batched (NEW), SHOW_VARS (NEW), FINAL, FINAL_VAR, repl_get/repl_set - Top-level JSON-serializable user variables auto-persist across rounds (no repl_set ceremony required) - FINAL(...) / FINAL_VAR(...) parseable from the model's raw response text (parse_text_final), in addition to the in-REPL sentinel path. Code-fenced occurrences are correctly ignored to prevent false hits. Sidecar (axum, 127.0.0.1:0): - Added POST /llm_batch and POST /rlm_batch endpoints (parallel fanout, cap 16 prompts per batch). Mirrors the reference's batched semantics. Other: - System prompt rewritten with reference's strategy patterns (PREVIEW → CHUNK+map-reduce via llm_query_batched → RECURSIVE decomposition via rlm_query → programmatic compute + LLM interp). - Strict termination loop unchanged: must emit ```repl or text-level FINAL each round; one fence-less round → reminder, two → DirectAnswer. 
- /rlm slash command remains for manual debug; description points the model toward rlm_process for the in-agent flow. Versions: workspace 0.6.4 → 0.6.5; npm wrapper 0.6.4 → 0.6.5. Gates green: cargo fmt, cargo clippy --all-targets --all-features --locked -D warnings, cargo test --workspace --all-features --locked (all pass), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS= -Dwarnings cargo doc --workspace --no-deps. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 01:17:09 -05:00
Hunter Bown	bd938a559c	fix(rlm): wire real recursive substrate; bump 0.6.4 The v0.6.3 RLM loop had Algorithm 1's outer shape but the substrate was non-functional: `llm_query()` was a Python stub that returned a hardcoded string and `child_model` was bound with an underscore prefix and silently dropped. The recursive sub-LLM call advertised by /rlm never fired. This commit wires the substrate end-to-end per Zhang/Kraska/Khattab (arXiv:2512.24601, Algorithm 1): - New axum HTTP sidecar (`rlm/sidecar.rs`) bound to 127.0.0.1:0 for the duration of one RLM turn. Python's `llm_query()` and `sub_rlm()` are real `urllib.request` POSTs; Rust services them via the existing DeepSeek client. Token usage from sidecar-served calls folds into the parent `RlmTurnResult.usage`. - `child_model` is plumbed through `Op::RlmQuery` → `AppAction::RlmQuery` → `run_rlm_turn` → sidecar handlers; default remains `deepseek-v4-flash`. - New `sub_rlm(prompt)` Python helper runs a full Algorithm-1 turn at depth-1 (paper's `sub_RLM`). Default `max_depth = 2` from `/rlm`. The recursive opaque-future cycle is broken by returning a concrete `Pin<Box<dyn Future + Send>>` from `run_rlm_turn_inner`. - Strict termination: the loop ends only via `FINAL(value)` (or the iteration cap). One fence-less round is tolerated with a reminder appended; two consecutive ones surface the model text as a `RlmTermination::DirectAnswer` exit. New `RlmTermination` enum lets callers tell `Final \| DirectAnswer \| Exhausted \| Error` apart. - Richer `Metadata(state)`: includes paper-required access patterns (`repl_get` / slicing / `splitlines` / `repl_set` / `llm_query` / `sub_rlm` / `FINAL`) and a live list of variable keys currently in the REPL state file. - Unicode-safe `truncate_text` (was mixing bytes with chars), per-turn state-file cleanup, `ROOM_TEMPERATURE` typo → `ROOT_TEMPERATURE`. 
- New end-to-end test `sidecar_url_is_exported_to_python_env` stands up a stand-in axum server, runs `print(llm_query('hello'))` in the real PythonRuntime, and asserts the reply round-trips. Catches future regressions in sidecar URL passthrough. Versions: workspace 0.6.3 → 0.6.4 in Cargo.toml; npm wrapper 0.6.3 → 0.6.4 in npm/deepseek-tui/package.json. Gates: cargo fmt, cargo clippy --all-targets --all-features --locked -D warnings, cargo test --workspace --all-features --locked (1088 passed), parity_protocol/parity_state/snapshot, RUSTDOCFLAGS=-Dwarnings cargo doc --workspace --no-deps — all green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 00:36:59 -05:00
Hunter Bown	42c684367f	feat(rlm): implement true RLM loop per Algorithm 1 (Zhang et al., arXiv:2512.24601) Adds the true Recursive Language Model (RLM) inference paradigm: - rlm/mod.rs — module root with public API - rlm/prompt.rs — RLM system prompt teaching the model to write code - rlm/turn.rs — Algorithm 1 implementation: - P stored as REPL variable (NEVER in LLM context window) - Metadata-only context sent to root LLM (constant-size) - LLM generates Python code, not free text - Code executed in PythonRuntime with llm_query() for recursion - FINAL() detection ends the loop - Op::RlmQuery variant in ops.rs - /rlm command in the command system - AppAction::RlmQuery handler in ui.rs - PythonRuntime::with_state_path made public for RLM integration - 18 new unit tests for code extraction, metadata building, truncation Key differences from previous 'RLM-inspired' approach: ✅ P is external (REPL variable), not in LLM context ✅ Only metadata(state) in LLM context (constant-size) ✅ LLM generates code, not free text + tool calls ✅ sub-LLM recursion via llm_query() inside REPL code ✅ FINAL() mechanism for programmatic termination	2026-04-26 23:34:17 -05:00
Hunter Bown	ac8a882be5	chore: clean v0.6.3 repl build warnings	2026-04-26 23:12:57 -05:00
Hunter Bown	4e46fd06f6	feat(repl): wire PythonRuntime into engine turn loop (Phase 2) After the assistant message is persisted, when tool_uses is empty, check for inline ```repl blocks and execute them via PythonRuntime: - Extract REPL blocks from assistant text - Spawn PythonRuntime and execute each block sequentially - If a round returns FINAL: replace the assistant message text with the final value and break the turn - If no FINAL: append truncated stdout/stderr as user feedback and continue the turn loop for iterative refinement - Emit status events so the user sees 'REPL round N: ...' in the UI All 26 REPL tests + RLM tests pass. Release build verified. Refs: paper-spec RLM (Zhang et al., arXiv:2512.24601) §2	2026-04-26 18:54:46 -05:00
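
Commit 4e46fd06f6 describes extracting inline ```repl blocks from the assistant text before running them. A minimal sketch of that extraction step follows. The function name `extract_repl_blocks` and the exact fence rules (a line that is exactly ```repl opens a block, a bare ``` closes it) are assumptions made for illustration, not the repository's actual code.

```rust
/// Collect the bodies of fenced blocks whose info string is `repl`.
/// Hypothetical helper: the name and matching rules are assumptions.
fn extract_repl_blocks(text: &str) -> Vec<String> {
    let mut blocks = Vec::new();
    let mut in_block = false;
    let mut body: Vec<&str> = Vec::new();

    for line in text.lines() {
        let trimmed = line.trim();
        if in_block {
            if trimmed == "```" {
                // A bare fence closes the current repl block.
                blocks.push(body.join("\n"));
                body.clear();
                in_block = false;
            } else {
                // Keep the original line so Python indentation survives.
                body.push(line);
            }
        } else if trimmed == "```repl" {
            // Only `repl` fences open a block; other fences are ignored.
            in_block = true;
        }
    }

    // An unterminated fence is dropped rather than guessed at.
    blocks
}
```

Calling `extract_repl_blocks` on a response that mixes prose, a ```python fence, and two ```repl fences returns only the two repl bodies, in order. Each body could then be handed to the runtime one at a time, as the commit message describes.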

19 Commits
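
Commit 2865c9a766 describes a stdin/stdout RPC protocol with UUID-prefixed sentinels, and commit 6821f294e2 makes stdout decoding lossy. The sketch below shows one way a host could decode and classify lines read from the child process. The wire format `<token>:<kind>:<payload>`, the type `ReplLine`, and the functions `decode_stdout` and `classify_line` are all assumptions for illustration; the commit messages do not specify the real format.

```rust
/// One line read from the child's stdout, classified by the host.
#[derive(Debug, PartialEq)]
enum ReplLine<'a> {
    /// Ordinary program output, passed through to the caller.
    Output(&'a str),
    /// A control message: `kind` names the request, `payload` is its body.
    Control { kind: &'a str, payload: &'a str },
}

/// Decode raw stdout bytes, replacing invalid UTF-8 with U+FFFD
/// instead of failing the whole round.
fn decode_stdout(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}

/// Classify a line using a per-session token as the sentinel prefix.
/// Assumed (hypothetical) wire format: `<token>:<kind>:<payload>`.
fn classify_line<'a>(token: &str, line: &'a str) -> ReplLine<'a> {
    // Lines without the exact `<token>:` prefix are plain output.
    let rest = match line.strip_prefix(token).and_then(|r| r.strip_prefix(':')) {
        Some(rest) => rest,
        None => return ReplLine::Output(line),
    };
    match rest.split_once(':') {
        Some((kind, payload)) => ReplLine::Control { kind, payload },
        // Token present but no payload separator: the remainder is the kind.
        None => ReplLine::Control { kind: rest, payload: "" },
    }
}
```

Because the token is generated per session, ordinary program output is very unlikely to collide with it, which is the usual reason for choosing a random prefix over a fixed marker string.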