fc4f1e6564
Two coordinated changes that stop the engine from routinely rewriting the
prompt prefix and burning DeepSeek V4's prefix-cache discount:

1. `Settings::default().auto_compact` flips from `true` to `false`. The
   `auto_compact = on` opt-in and the explicit `/compact` slash command
   stay available for users and agents that decide their workload benefits
   from compaction more than from cache stability. With V4's 1M-token
   window the user has plenty of headroom to run long sessions without
   auto-trimming, and aggressive compaction has been the dominant
   cost-spike vector in long sessions (the rewritten prefix invalidates
   ~90% of the cache discount on every compaction event).

2. `DEFAULT_COMPACTION_TOKEN_THRESHOLD` raised from `50_000` to
   `102_400` (80% of `DEFAULT_CONTEXT_WINDOW_TOKENS = 128_000`). This is
   the last-resort threshold used when `context_window_for_model` returns
   `None`, i.e. for an unrecognised model id. Pre-v0.8.11 the fallback
   compacted at ~5% of a V4 window when model detection silently fell
   through. Now the fallback inherits the same late-trigger discipline as
   the V4 path, so model-detection drift doesn't quietly burn cache.

Together, the two changes mean compaction never fires automatically by
default, and even when explicitly opted in (or when the runtime-thread /
capacity-flow paths invoke compaction with their own `enabled = true`
config), the threshold is anchored at 80% of the model's context window
(or 80% of the 128K default if the model is unknown), never below.
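
The threshold rule described above can be sketched roughly as follows. This is a minimal illustration, not the engine's actual code: the constant names and the two function names come from this message, but their bodies, the `should_compact` helper, the `deepseek-v4` id prefix, and the `1_000_000` window figure are assumptions made for the example.

```rust
/// Fallback context window for unrecognised model ids (value from this commit message).
pub const DEFAULT_CONTEXT_WINDOW_TOKENS: usize = 128_000;

/// Last-resort compaction threshold: 80% of the fallback window.
pub const DEFAULT_COMPACTION_TOKEN_THRESHOLD: usize = 102_400;

/// Hypothetical model lookup. The real detection logic is not shown here;
/// the "deepseek-v4" prefix and the 1_000_000 window are illustrative assumptions.
pub fn context_window_for_model(model: &str) -> Option<usize> {
    if model.starts_with("deepseek-v4") {
        Some(1_000_000)
    } else {
        None
    }
}

/// 80% of the known window, or the fallback constant for unknown models.
pub fn compaction_threshold_for_model(model: &str) -> usize {
    match context_window_for_model(model) {
        Some(window) => window * 8 / 10,
        None => DEFAULT_COMPACTION_TOKEN_THRESHOLD,
    }
}

/// Hypothetical gate: compaction runs only when enabled and usage has
/// reached the threshold for the given model.
pub fn should_compact(enabled: bool, model: &str, used_tokens: usize) -> bool {
    enabled && used_tokens >= compaction_threshold_for_model(model)
}
```

Under these assumptions an unknown model with compaction enabled starts compacting at 102_400 tokens rather than 50_000, and with `enabled = false` (the new default) it never compacts regardless of usage.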

Tests
=====

- `default_settings_disable_auto_compact_to_protect_v4_prefix_cache`:
  pins the new default and explains the rationale inline.
- `auto_compact_remains_explicitly_configurable`: unchanged; still
  asserts the `set("auto_compact", "on" | "off")` round-trip works.
- `compaction_threshold_scales_with_context_window`: updated to assert
  `compaction_threshold_for_model("unknown-model") == 102_400`.
- `v4_soft_caps_only_apply_to_v4_models`: updated to assert the
  unknown-model + reasoning-effort path also lands on the new floor.
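
For readers without the source at hand, the default and the `set` round-trip that the first two tests pin might look something like this. It is a sketch only: `Settings` here is a stand-in with a single field, and the `set` signature and error type are assumptions, not the real API.

```rust
/// Minimal stand-in for the real `Settings` type; only `auto_compact` is modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct Settings {
    pub auto_compact: bool,
}

impl Default for Settings {
    fn default() -> Self {
        // New default per this commit: automatic compaction is opt-in.
        Settings { auto_compact: false }
    }
}

impl Settings {
    /// Hypothetical setter accepting "on" / "off" for `auto_compact`.
    /// The signature and error type are assumptions for this sketch.
    pub fn set(&mut self, key: &str, value: &str) -> Result<(), String> {
        match (key, value) {
            ("auto_compact", "on") => {
                self.auto_compact = true;
                Ok(())
            }
            ("auto_compact", "off") => {
                self.auto_compact = false;
                Ok(())
            }
            ("auto_compact", other) => {
                Err(format!("invalid value for auto_compact: {other}"))
            }
            (other, _) => Err(format!("unknown setting: {other}")),
        }
    }
}
```

A caller that wants the old behaviour would call `set("auto_compact", "on")` on a default `Settings` value; everything else keeps compaction off.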

Verification
============

- `cargo fmt --all -- --check` clean.
- `cargo clippy -p deepseek-tui --bin deepseek-tui --all-features
  --locked -- -D warnings` clean.
- `cargo test -p deepseek-tui --bin deepseek-tui --locked` →
  2028 passed, 2 ignored.

Refs #664 (handoff-instead-of-compact pattern, full implementation
deferred). Behaviour-only change for v0.8.11; the larger
agent-aware-handoff mechanism is its own design surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
84 lines
1.0 KiB
Plaintext
# Build artifacts
/target
*.pdb
*.exe
*.dll
*.so
*.dylib
*.rlib
*.o

# Development
.env
.env.*
!.env.example
node_modules/
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
Thumbs.db

# Python
__pycache__/
*.py[cod]
*$py.class
.pytest_cache/
venv/
ENV/
env/
.venv/
*.egg-info/
dist/

# Logs
*.log

# Generated
outputs/
tmp/

# Reference papers / large research blobs (keep locally if needed, don't ship)
docs/DeepSeek_V4.pdf
docs/*.pdf

# Note: Cargo.lock is intentionally NOT ignored for reproducible builds

# Local dev scripts and temp files
*.sh
!scripts/**
test.txt
TODO*.md
todo*.md
CLAUDE.md
NEXT_SESSION.md
AI_HANDOFF.md
result.json
count_deps.py
project_overhaul_prompt.md
.codex/
.context/

# Local runtime state
.deepseek/
**/session_*.json
*.db

# Companion app (tracked separately)
apps/

# Claude Code runtime artifacts
.claude/scheduled_tasks.lock
.claude/worktrees/
.worktrees/
.ace-tool/

# Local-only Claude / ralph notes
.claude/*.local.md
.claude/*.local.json

# Maintainer-internal design notes (trade-secret material, never published)
.private/