codewhale

dgf1988/codewhale

Hunter B ce46e29e38 fix(benchmarks): fix workspace file copying and add LLM judge grading

Two bugs from the initial run:
1. workspace_files entries use the format [{source, dest}], not
   {path, content}, and the files live in PinchBench's assets/ directory,
   not tasks/. The copy step now checks both tasks/ and assets/.
2. LLM judge tasks (writing, research) scored 0% because the judge
   wasn't implemented. The judge now runs through codewhale exec: it sends
   the rubric and workspace contents, then parses a JSON score response.
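
A minimal sketch of the first fix, not the code from this commit: it assumes each workspace_files entry is a dict with "source" and "dest" keys and that tasks/ and assets/ sit side by side under one benchmark root. The function name copy_workspace_files and its parameters are hypothetical.

```python
import shutil
from pathlib import Path


def copy_workspace_files(entries, bench_root, workspace, search_dirs=("tasks", "assets")):
    """Copy each {source, dest} entry into the workspace.

    Hypothetical helper: looks for `source` under every directory in
    `search_dirs` (in order) and copies the first match to `workspace/dest`.
    """
    bench_root = Path(bench_root)
    workspace = Path(workspace)
    copied = []
    for entry in entries:
        source, dest = entry["source"], entry["dest"]
        for directory in search_dirs:
            candidate = bench_root / directory / source
            if candidate.is_file():
                target = workspace / dest
                # Create nested destination folders before copying.
                target.parent.mkdir(parents=True, exist_ok=True)
                shutil.copyfile(candidate, target)
                copied.append(target)
                break
        else:
            # No search directory contained the file: fail loudly rather
            # than silently running the task with a missing input.
            raise FileNotFoundError(f"{source} not found under {search_dirs}")
    return copied
```

Raising on a missing source is a design choice of this sketch; it surfaces a wrong directory immediately instead of letting the task run without its input files.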

Also strips ANSI escape codes and control characters from judge output
to prevent JSON parse failures.
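
The cleanup step could look roughly like the sketch below. It is an illustration under assumptions, not the commit's code: the function names are hypothetical, only CSI-style ANSI sequences are handled, and the judge is assumed to print a single JSON object somewhere in its output.

```python
import json
import re

# CSI escape sequences such as "\x1b[32m" (colors, cursor movement).
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# Remaining control characters, keeping tab (\x09) and newline (\x0a).
CONTROL = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def clean_judge_output(raw):
    """Remove ANSI escape codes first, then stray control characters."""
    return CONTROL.sub("", ANSI_CSI.sub("", raw))


def parse_judge_response(raw):
    """Extract and parse the JSON object from judge output (hypothetical helper)."""
    text = clean_judge_output(raw)
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in judge output")
    return json.loads(text[start:end + 1])
```

Stripping the escape sequences before the generic control characters matters: removing the lone ESC byte first would leave fragments like "[32m" behind in the text.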

2026-06-05 15:57:06 -07:00

benchmarks

fix(benchmarks): fix workspace file copying and add LLM judge grading

2026-06-05 15:57:06 -07:00

installer

fix(release): ship NSIS installer artifact

2026-06-01 19:34:25 -07:00

release

chore(release): harden deepseek-tui deprecation path

2026-06-01 06:01:03 -07:00

tencent-lighthouse

chore(release): v0.8.53 — Arcee support, telegram bridge, provider fixes