dgf1988/codewhale

Hunter Bown 1be18e691b feat(#27): per-mode soft context budget for V4 compaction trigger

Add compaction_threshold_for_model_and_effort() with mode-aware soft
caps based on DeepSeek V4 paper Figure 9 recall-quality data:

  Plan / off   ->  64K (paper eval: 8K-128K)
  Agent / high -> 192K (paper eval: 128K)
  YOLO / max   -> 384K (paper eval: 384K-512K)

Previously, the 80%-of-window rule gave 800K for V4's 1M window,
which is well past the point where MRCR MMR collapses (0.49 at 1M).

Non-V4 models keep the legacy 80% rule. None/unknown effort defaults
to agent-tier (192K).
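
The selection rule described above can be sketched as follows. This is a minimal illustration, not the repository's code: only the function name, the three tiers, the 80% legacy rule, and the agent-tier default come from the commit message. The signature, the `is_v4_model` helper, the effort strings, and the reading of "K" as 1,000 tokens (chosen because the message equates 80% of a 1M window with 800K) are assumptions.

```python
from typing import Optional

K = 1_000  # assumption: "K" = 1,000 tokens, matching "800K for V4's 1M window"

# Soft caps per reasoning-effort tier, as listed in the commit message.
V4_SOFT_CAPS = {
    "off": 64 * K,    # Plan mode
    "high": 192 * K,  # Agent mode
    "max": 384 * K,   # YOLO mode
}
DEFAULT_V4_CAP = V4_SOFT_CAPS["high"]  # None/unknown effort falls back to agent tier


def is_v4_model(model: str) -> bool:
    """Hypothetical check; how the real code identifies V4 models is not shown."""
    return "v4" in model.lower()


def compaction_threshold_for_model_and_effort(
    model: str, context_window: int, effort: Optional[str] = None
) -> int:
    """Return the token count at which compaction should trigger."""
    if not is_v4_model(model):
        # Legacy rule: 80% of the context window (integer arithmetic).
        return context_window * 4 // 5
    # V4: mode-aware soft cap, independent of the 1M window size.
    return V4_SOFT_CAPS.get((effort or "").lower(), DEFAULT_V4_CAP)
```

Under these assumptions, a V4 model with a 1M window and effort `"off"` would compact at 64,000 tokens instead of 800,000, while a non-V4 model with the same window would still compact at 800,000.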

2026-04-25 12:58:35 -05:00

agent

release: v0.5.0 — fix multi-turn tool call 400 error (missing reasoning_content on assistant messages with tool_calls)

2026-04-25 12:27:53 -05:00

app-server

release: v0.5.0 — fix multi-turn tool call 400 error (missing reasoning_content on assistant messages with tool_calls)

2026-04-25 12:27:53 -05:00

cli

release: v0.5.0 — fix multi-turn tool call 400 error (missing reasoning_content on assistant messages with tool_calls)

2026-04-25 12:27:53 -05:00

config

Add NIM env support and .env.example template