c8fcef7f1e
Runs PinchBench tasks directly through codewhale exec --auto instead of going through OpenClaw. Loads the task markdown, creates a workspace, runs the prompt, and grades the result using PinchBench's embedded automated checks. No external agent framework dependency: just codewhale + pyyaml.
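
A minimal sketch of the load, workspace, and run steps described above. It is not the actual runner. It assumes the task markdown carries YAML front matter between `---` lines and that codewhale accepts the prompt as a positional argument after `--auto`; neither detail is confirmed here. The helper names (`split_front_matter`, `load_task`, `run_task`) are hypothetical, and grading is left out because the format of PinchBench's embedded checks is not shown.

```python
import subprocess
import tempfile
from pathlib import Path


def split_front_matter(text):
    """Split task markdown into (front_matter, body) strings.

    Assumption: metadata sits between a leading '---' line and the next
    '---' line. Without that layout the whole text is treated as the prompt.
    """
    if text.startswith("---\n"):
        head, sep, body = text[4:].partition("\n---\n")
        if sep:
            return head, body.strip()
    return "", text.strip()


def load_task(path):
    """Return (metadata dict, prompt string) for one task file."""
    import yaml  # pyyaml, imported lazily so the other helpers work without it

    head, body = split_front_matter(Path(path).read_text(encoding="utf-8"))
    return yaml.safe_load(head) or {}, body


def run_task(prompt, command=("codewhale", "exec", "--auto"), timeout=600):
    """Run the prompt in a fresh workspace directory.

    Assumption: the prompt is passed as the final positional argument.
    Returns (workspace path, subprocess.CompletedProcess).
    """
    workspace = Path(tempfile.mkdtemp(prefix="pinchbench-"))
    result = subprocess.run(
        [*command, prompt],
        cwd=workspace,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return workspace, result
```

The `command` parameter is injectable so the flow can be exercised without codewhale installed, for example by substituting a stub executable. A grader would then inspect the returned workspace and the captured output.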