The `effective_max_output_tokens` heuristic defaults to 64K for any model
not in the known-context-window table. This is fine for DeepSeek's hosted
API (1M context) but causes immediate HTTP 400s on self-hosted providers
with tight `max-model-len`.
Example: vLLM serving Qwen3.6 with `--max-model-len 65536` rejects
requests because the 64K output budget plus ~1500 input tokens exceeds
the limit.
This change lets the operator set `DEEPSEEK_MAX_OUTPUT_TOKENS=16384` (or
whatever fits their deployment) to override the heuristic. The env var
takes precedence over the model-table lookup when set to a positive
integer; otherwise the existing behavior is preserved.
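A minimal sketch of the precedence rule described above. The helper names and signatures are hypothetical (the engine's real function is not shown in this change); only the env var name and the "positive integer wins, otherwise keep the heuristic" rule come from the description.

```rust
/// Hypothetical helper, not the engine's real API: picks the output-token
/// budget. `env_value` is the raw DEEPSEEK_MAX_OUTPUT_TOKENS string, if set;
/// `heuristic` is whatever the existing model-table lookup would return.
pub fn resolve_max_output_tokens(env_value: Option<&str>, heuristic: u32) -> u32 {
    env_value
        // Unparseable values (empty, negative, non-numeric) are ignored.
        .and_then(|raw| raw.trim().parse::<u32>().ok())
        // Zero is not a usable budget, so it also falls through.
        .filter(|&n| n > 0)
        .unwrap_or(heuristic)
}

/// Reads the real environment; kept separate so the rule above stays testable.
pub fn max_output_tokens_from_env(heuristic: u32) -> u32 {
    let raw = std::env::var("DEEPSEEK_MAX_OUTPUT_TOKENS").ok();
    resolve_max_output_tokens(raw.as_deref(), heuristic)
}
```

Splitting the pure rule from the environment read is a design choice of this sketch, so the precedence logic can be checked without mutating process state.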
No new config struct field — env-only override keeps the public API
unchanged. Useful for embedded users (e.g. pinvou3) who need to control
output budget without forking the engine config schema.
Co-authored-by: hexin <he.xin@h3c.com>
* feat: session token breakdown in footer and /status
Add accumulated session token tracking with input / cache-hit / output
breakdown. Rebased from PR #1666 onto post-rebrand main (v0.8.45).
Changes:
- SessionState: new total_input_tokens, total_cache_hit_tokens,
total_cache_miss_tokens, total_output_tokens fields
- Turn outcome handler: accumulate per-turn token breakdown
- StatusItem::Tokens: new footer chip, enabled by default
- Footer chip: "12K in · 8.1K cch · 2.5K out" format
- /status: expanded with session input/cache/output rows
- /clear and /load: reset accumulated breakdown
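The accumulation and reset listed above could look roughly like this. It is a sketch under stated assumptions: `TurnUsage` and the `accumulate` method are hypothetical names, the real `SessionState` has many more fields, and the cache counts are modelled as `Option` on the assumption that some providers omit cache telemetry.

```rust
/// Hypothetical per-turn usage record; cache fields are optional because
/// a provider may not report them.
#[derive(Debug, Clone, Copy, Default)]
pub struct TurnUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
    pub cache_hit_tokens: Option<u64>,
    pub cache_miss_tokens: Option<u64>,
}

/// Only the four accumulated fields named in the change are shown here.
#[derive(Debug, Default, PartialEq)]
pub struct SessionState {
    pub total_input_tokens: u64,
    pub total_cache_hit_tokens: u64,
    pub total_cache_miss_tokens: u64,
    pub total_output_tokens: u64,
}

impl SessionState {
    /// Adds one turn's usage to the session totals.
    pub fn accumulate(&mut self, turn: &TurnUsage) {
        self.total_input_tokens += turn.input_tokens;
        self.total_output_tokens += turn.output_tokens;
        // Missing cache telemetry leaves the cache totals untouched.
        if let Some(hit) = turn.cache_hit_tokens {
            self.total_cache_hit_tokens += hit;
        }
        if let Some(miss) = turn.cache_miss_tokens {
            self.total_cache_miss_tokens += miss;
        }
    }

    /// Zeroes the breakdown, as /clear and /load are described as doing.
    pub fn reset_token_breakdown(&mut self) {
        self.total_input_tokens = 0;
        self.total_cache_hit_tokens = 0;
        self.total_cache_miss_tokens = 0;
        self.total_output_tokens = 0;
    }
}
```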
* fix: address review feedback — current_session_id, cache guarding, DRY helper
- Restore app.current_session_id assignment accidentally dropped in
apply_loaded_session during rebase (P1: breaks startup-resume and
session-sync paths)
- Guard cache-hit/miss accumulation behind is_some() so providers
that omit cache telemetry don't inflate miss totals
- Extract SessionState::reset_token_breakdown() to avoid duplicating
the four-field reset in core/session/ui call sites
- Hide the "cch" segment from the footer token chip when no cache
data has been recorded
- Show "not reported" in /status session-cache row instead of
"0 hit / 0 miss" when no cache telemetry is available
Adds Metaso AI Search as a new SearchProvider option alongside Bing,
DuckDuckGo, Tavily, and Bocha.
Co-authored-by: Zhao Xiaohong <zhaoxiaohong@metasota.ai>
grep_files runs its directory walk and per-file regex synchronously inside
the async execute(). On a large tree this pins the runtime worker for
minutes, so the turn loop can't observe the cancel token and the stop
button stays unresponsive — the same failure file_search fixed in #2035.
Mirror that fix: move the blocking walk onto spawn_blocking, bounded by a
30s hard timeout with a biased select on the cancel token, via a
run_blocking_grep helper that parallels run_blocking_file_search. Output
and the existing per-file/per-line cancel checks are unchanged.
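For illustration only, the shape of "blocking work off the caller, hard timeout, cancel checked first" can be sketched with the standard library. This is not the commit's code: the actual fix is described as using spawn_blocking and a biased select on the cancel token, whereas this stand-in uses a plain thread, a channel, and a polled `AtomicBool`. All names here are hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
pub enum BlockingOutcome<T> {
    Done(T),
    Cancelled,
    TimedOut,
    WorkerFailed,
}

/// Runs `work` on its own thread and waits for it, giving up when the
/// cancel flag is set or `hard_timeout` elapses.
pub fn run_blocking_with_deadline<T, F>(
    work: F,
    cancel: Arc<AtomicBool>,
    hard_timeout: Duration,
) -> BlockingOutcome<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the caller gave up; ignore that.
        let _ = tx.send(work());
    });

    let deadline = Instant::now() + hard_timeout;
    let poll = Duration::from_millis(10);
    loop {
        // Cancellation is checked before anything else on every iteration,
        // mimicking a select that is biased toward the cancel branch.
        if cancel.load(Ordering::Relaxed) {
            return BlockingOutcome::Cancelled;
        }
        let remaining = deadline.saturating_duration_since(Instant::now());
        if remaining.is_zero() {
            return BlockingOutcome::TimedOut;
        }
        match rx.recv_timeout(remaining.min(poll)) {
            Ok(value) => return BlockingOutcome::Done(value),
            Err(mpsc::RecvTimeoutError::Timeout) => continue,
            // The worker dropped its sender without sending, e.g. it panicked.
            Err(mpsc::RecvTimeoutError::Disconnected) => return BlockingOutcome::WorkerFailed,
        }
    }
}
```

As with the described fix, giving up on the wait does not stop the worker by itself; the work still needs its own cancel checks to exit early.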
Co-authored-by: hexin <he.xin@h3c.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(paste): preserve Enter suppression window after non-char keys during paste
* chore(paste): add debug_assert for buffer precondition in deactivate_keep_window
Documents the invariant that buffer must be flushed before calling deactivate_keep_window. The assert fires in debug builds only, catching future misuse without masking upstream bugs.
When the workspace is the user's home directory, the project-scope
config file (~/.deepseek/config.toml) is also the global config file.
Skip the merge to avoid redundant processing and a misleading
"project-scope config key ignored" warning on every launch from ~.
Fixes the home-directory false-positive in the #417 deny-list
check: the deny-list correctly refuses dangerous keys at project
scope, but when cwd == $HOME the project file *is* the global file
so the warning is noise.
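A minimal sketch of the skip condition, assuming a hypothetical `should_merge_project_config` helper; how the real loader obtains the home directory and compares paths is not shown in this change.

```rust
use std::path::Path;

/// Hypothetical helper: returns false when the workspace is the home
/// directory, because the project-scope file is then the global file.
pub fn should_merge_project_config(workspace: &Path, home: Option<&Path>) -> bool {
    match home {
        Some(home) => !same_dir(workspace, home),
        // Without a known home directory there is nothing to collide with.
        None => true,
    }
}

fn same_dir(a: &Path, b: &Path) -> bool {
    // canonicalize resolves symlinks and `..`; if either path cannot be
    // resolved (e.g. it does not exist), fall back to a plain comparison.
    match (a.canonicalize(), b.canonicalize()) {
        (Ok(a), Ok(b)) => a == b,
        _ => a == b,
    }
}
```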
The formula downloaded deepseek-macos-arm64 (the deprecation shim) as the
main binary. After the rebranding, deepseek is just a wrapper that spawns
codewhale, but codewhale was never installed — causing "codewhale not
found on PATH" for every Homebrew user.
Now the formula downloads codewhale-* as the primary binary and installs
all four artifacts: codewhale, codewhale-tui, deepseek (legacy shim), and
deepseek-tui (legacy TUI shim).
Closes #2104
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
When running in a Feishu thread-enabled group (话题群), every bot
response — status messages, approval prompts, streaming progress,
turn results — was sent via the Lark SDK's `create` API which spawns
a new standalone topic. The user sees a cluttered group with orphan
topics for each intermediate bot message.
Root cause: `sendText()` only called `client.im.message.create()`
with a bare `chat_id`, never passing any reply context. The Feishu
`reply` API was completely unused.
Fix (changes in two files):
1. **lib.mjs — incomingIdentity()**: expose `parentId`, `rootId`,
`threadId` from the raw Feishu message event so callers can
determine thread context. (Not consumed directly yet, but
available for future use.)
2. **index.mjs**:
- `handleIncomingMessage()`: store the latest incoming
`messageId` as `replyToMessageId` in the per-chat thread store.
- `sendText()`: look up `replyToMessageId` from the thread store;
when present, call `client.im.message.reply()` instead of
`create()`. This keeps ALL bot responses nested under the
original user message inside the same topic.
No config changes needed. New chats automatically start using the
reply path; existing chats without a `replyToMessageId` in the store
fall back to the old `create` behaviour.
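Summary: fixes bot messages creating standalone topics in Feishu topic
groups. All replies now use the reply API to nest inside the original
topic instead of creating a new topic through the create API.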
Four interrelated feedback issues resolved:
1. Windows desktop notifications (notifications.rs)
- resolve_method() now returns Method::Bel on Windows instead of
Method::Off, so windows_bell() (MessageBeep) fires a system
notification sound when a long turn completes.
2. Tool output summary truncation (history.rs)
- TOOL_TEXT_LIMIT increased from 180 to 300 characters, reducing
the chance that meaningful tool output is cut short in the
one-line summary shown in tool cards.
3. Turn completion status visibility (ui.rs)
- Added a push_status_toast(Info, 10s TTL) call alongside the
existing set_receipt_text() in the TurnComplete handler. The
receipt text is now visible in both the composer border (8s)
and the footer status bar (10s).
4. Footer working-time feedback (footer_ui.rs)
- Footer state label now includes elapsed seconds from
turn_started_at immediately, instead of waiting 30 seconds for
the stall_reason classification to kick in.
- When app.is_loading is true, the label shows elapsed time so
users see ongoing progress during model-loading phases.
Replace the feature-grid format with a narrative that leads with the
core problem the harness solves: conflicting information at scale.
Frame the Constitution as a formal jurisdiction framework (LLM-as-judge),
describe V4's prefix caching as what makes recursive constitutional
reference practical (open-book test, not closed-book), and explain that
the explicit authority structure enables honest failure feedback.
Cut: memory/handoffs (table stakes), /statusline chip (internal),
theme picker, desktop notifications (clutter).
* test(settings): handle TERM_PROGRAM env var in no_animations_env_recognises_truthy_spellings_only
Clear and restore TERM_PROGRAM environment variable during tests
to prevent low_motion being forced in vscode, ghostty and Termius.
* test(settings): handle other environment variables that can independently force low_motion in no_animations_env_recognises_truthy_spellings_only
---------
Co-authored-by: hqt <you@example.com>
* docs: v0.8.46 CHANGELOG — platform archives, palette, sub-agents, sandbox, web install, search fixes
Closes #2188
* feat(v0.8.46): quick fixes — palette, model picker Esc, sub-agent sidebar, shell chip, model name casing, CVE bump (#2212)
* fix: bump qs to >=6.15.2 for CVE-2026-8723
Add qs override in feishu-bridge package.json to force transitive
dependency resolution to >=6.15.2, addressing CVE-2026-8723.
Refs: #2198
* fix: Esc in model picker applies last-highlighted choice
Previously Esc discarded the highlighted choice and reverted to the
initial model. Now Esc always applies the currently highlighted
model and thinking-effort tier, making Esc consistent with Enter.
Also updates the picker footer hint from 'Esc cancel' to 'Esc apply'.
Refs: #2196
* feat: show '⏳ shell running' chip in TUI footer
Adds a footer_shell_chip function that displays a '⏳ shell running'
status chip in the footer's right cluster whenever a foreground shell
command is active via exec_shell. The chip is always visible regardless
of user-configured status items.
Refs: #2194
* feat: auto-collapse finished sub-agents in sidebar
When a sub-agent completes (status = 'done'), its detail lines
(id, steps, duration, progress) are now hidden in the sidebar agents
panel. Only the summary label line is shown, keeping the sidebar
compact. Running agents still show full detail.
Refs: #2195
* feat: refresh Whale dark palette for better contrast
Improve contrast and layer separation in the Whale dark theme:
- Deepen base background for more depth (10,17,32)
- Lighten panel (22,34,56) for clearer distinction from bg
- Lighten elevated surface (36,52,78) for better elevation
- Lighten selection (48,68,100) for clearer selected state
- Boost text hint (138,150,174) and dim (118,130,156) readability
- Brighter border (52,88,145) for better edge definition
- Update tool surface colors for consistency
Refs: #2197
* fix: preserve model name casing in normalize_model_name_for_provider
When the user enters a model name like 'DeepSeek-V4-Flash', the
normalizer was lowercasing it to 'deepseek-v4-flash' via the
canonical_official_deepseek_model_id function. Now the normalizer
preserves the caller's casing when the input already matches a known
model id case-insensitively. Compact aliases like 'deepseek-v4pro'
are still rewritten to 'deepseek-v4-pro'.
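The casing rule can be sketched as follows. The two tables are illustrative subsets, the function name is shortened, and matching aliases case-insensitively is an assumption of this sketch; the real normalizer also takes a provider into account.

```rust
/// Illustrative subset of known ids; the real table lives in the engine.
const KNOWN_MODEL_IDS: &[&str] = &["deepseek-v4-flash", "deepseek-v4-pro"];

/// Illustrative compact alias table: (alias, canonical id).
const COMPACT_ALIASES: &[(&str, &str)] = &[("deepseek-v4pro", "deepseek-v4-pro")];

/// Simplified stand-in for normalize_model_name_for_provider.
pub fn normalize_model_name(input: &str) -> String {
    let trimmed = input.trim();

    // Already a known id, ignoring case: keep the caller's casing.
    if KNOWN_MODEL_IDS
        .iter()
        .any(|id| id.eq_ignore_ascii_case(trimmed))
    {
        return trimmed.to_string();
    }

    // Compact aliases are still rewritten to the canonical id.
    if let Some((_, canonical)) = COMPACT_ALIASES
        .iter()
        .find(|(alias, _)| alias.eq_ignore_ascii_case(trimmed))
    {
        return (*canonical).to_string();
    }

    // Unknown names pass through unchanged.
    trimmed.to_string()
}
```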
Refs: #2109
* feat(web): install download tile with arch detection, SHA256, China mirrors + companion binary fix (#2213)
* fix(web): download both codewhale and codewhale-tui binaries in install snippets
The SNIPPETS map only fetched one binary per platform, causing the
dispatcher to fail with MISSING_COMPANION_BINARY. Every arch now
downloads both codewhale AND codewhale-tui side-by-side.
- macOS/Linux: added second curl + combined chmod/xattr/mv for tui
- Windows: added second Invoke-WebRequest for codewhale-tui.exe
- VERIFY: PowerShell now hashes both binaries; Unix --ignore-missing
covers all present binaries in a single sha256sum pass
* feat(web): add install download tile with arch detection, SHA256, and China mirrors (#2192)
* feat(sandbox/linux): process hardening — PR_SET_DUMPABLE, NO_NEW_PRIVS, RLIMIT_CORE (#2214)
* feat(sandbox/linux): add process hardening module — PR_SET_DUMPABLE, NO_NEW_PRIVS, RLIMIT_CORE (#2183)
* feat(sandbox/linux): seccomp filter + bwrap passthrough
- seccomp: BPF filter whitelisting safe syscalls, denying ptrace/mount/kexec
and other dangerous syscalls. Uses raw BPF instructions via libc prctl to
avoid external dependencies (#2182).
- bwrap: optional bubblewrap passthrough when /usr/bin/bwrap is present
and [sandbox] prefer_bwrap=true in config. Creates read-only rootfs with
write access limited to the working directory (#2184).
- landlock detect_denial extended to recognize seccomp SIGSYS/"Bad system
call" patterns alongside existing Landlock EACCES/EPERM detection.
- SandboxManager gains prefer_bwrap field; set_prefer_bwrap on ShellManager.
- EngineConfig gains prefer_bwrap field, wired through main/ui/runtime_threads.
- Diagnostics now reports bwrap_available and cgroup_version.
- config.example.toml documents the prefer_bwrap key.
Pre-existing clippy fixes picked up in the same build:
- collapsible_if in ui.rs version-check
- cmp_owned in goal.rs test
- consecutive str::replace in normalize_auth_mode
Closes #2182, closes #2184
* docs: add cross-links to issue and PR templates in CONTRIBUTING.md (#2215)
- Link .github/ISSUE_TEMPLATE/bug_report.md and feature_request.md from
the Reporting Issues section
- Link .github/PULL_REQUEST_TEMPLATE.md from the Pull Request Guidelines
section
* feat(release): bundle platform archives with install scripts (#2216)
- Add bundle job to release workflow that creates per-platform archives
(tar.gz for Linux/macOS, .zip for Windows) containing both codewhale
and codewhale-tui binaries plus install scripts
- Create install.bat (Windows) — copies binaries to %USERPROFILE%\bin
- Create install.sh (Unix) — copies binaries to ~/.local/bin
- Windows gets a portable .zip variant without install script
- Release notes updated to promote archives as primary download method
- Individual binaries retained for npm wrapper and scripting
Closes #2193
* fix(web_search): fall back to DuckDuckGo when Bing returns zero results (#2130)
When the configured search provider is Bing and the query returns zero
results (common for technical/compound queries), fall through to the
DuckDuckGo path instead of reporting empty. A provenance message is
surfaced: "Bing returned no results; used DuckDuckGo fallback".
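A sketch of the fall-through, with the two providers abstracted as closures. `SearchOutcome` and `search_with_fallback` are hypothetical names; only the zero-results trigger and the provenance string come from the description above.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct SearchOutcome {
    pub results: Vec<String>,
    /// Set only when the fallback path was taken.
    pub provenance: Option<String>,
}

/// Runs `primary` (standing in for Bing) and, if it yields nothing,
/// `fallback` (standing in for DuckDuckGo).
pub fn search_with_fallback<P, F>(query: &str, primary: P, fallback: F) -> SearchOutcome
where
    P: Fn(&str) -> Vec<String>,
    F: Fn(&str) -> Vec<String>,
{
    let results = primary(query);
    if !results.is_empty() {
        return SearchOutcome { results, provenance: None };
    }
    SearchOutcome {
        results: fallback(query),
        provenance: Some("Bing returned no results; used DuckDuckGo fallback".to_string()),
    }
}
```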
Also adds Security and Code of Conduct cross-links to CONTRIBUTING.md
per the sub-agent renovation (#2203).
* docs: SANDBOX.md threat model + RFCs for persistence and MCP + SandboxExecutor trait
- docs/SANDBOX.md: complete threat model describing each platform's sandbox
(Seatbelt, Landlock, seccomp, process hardening, bwrap, Windows v1).
Covers defense-in-depth layering, config keys, denial detection, limitations.
- docs/rfcs/2189-persistence-sqlite.md: RFC for SQLite migration (drafted by sub-agent)
- docs/rfcs/2190-mcp-modularization.md: RFC for MCP crate split into
protocol/client/server with OAuth support
- crates/tui/src/sandbox/policy.rs: SandboxExecutor trait definition and
SafetyLevel→SandboxPolicyBehavior mapping function with tests
Closes #2180, closes #2186, closes #2189, closes #2190
* feat: sandbox parity tests + remove sub-agent 100-turn cap
- Add sandbox parity tests covering platform detection, denial patterns,
bwrap preference, and policy consistency across modes (#2187)
- Remove arbitrary 100-turn sub-agent cap: DEFAULT_MAX_STEPS changed
from 100 to u32::MAX. Sub-agents now run until they produce a final
text response, are cancelled by the parent, or hit a configured
explicit budget (#2034)
Closes #2187, closes #2034
Replaced the naive version comparison with a parse_semver tuple comparison
so dev builds (e.g. "0.8.46-pre") don't trigger false update hints.
Falls back to string compare when either side is non-semver.
Caught by Gemini Code Assist review on PR #2181.
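A sketch of the comparison, under two assumptions that the commit text does not spell out: the pre-release suffix is dropped before comparing, and the string fallback is a plain inequality check. `update_available` is a hypothetical name.

```rust
/// Parses "MAJOR.MINOR.PATCH", tolerating a leading 'v' and dropping any
/// pre-release or build suffix (assumption: "0.8.46-pre" compares as 0.8.46).
pub fn parse_semver(version: &str) -> Option<(u64, u64, u64)> {
    let trimmed = version.trim();
    let core = trimmed.strip_prefix('v').unwrap_or(trimmed);
    let core = core.split(|c: char| c == '-' || c == '+').next()?;
    let mut parts = core.split('.');
    let major = parts.next()?.parse::<u64>().ok()?;
    let minor = parts.next()?.parse::<u64>().ok()?;
    let patch = parts.next()?.parse::<u64>().ok()?;
    if parts.next().is_some() {
        return None; // more than three components: not treated as semver
    }
    Some((major, minor, patch))
}

/// True when `latest` should trigger an update hint for `current`.
pub fn update_available(current: &str, latest: &str) -> bool {
    match (parse_semver(current), parse_semver(latest)) {
        // Tuples compare lexicographically: major, then minor, then patch.
        (Some(cur), Some(new)) => new > cur,
        // Assumed fallback: plain string inequality when either side
        // is not semver.
        _ => current != latest,
    }
}
```

Comparing tuples also avoids the ordering mistakes a lexical string compare makes, such as ranking "0.8.9" above "0.8.10".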
grep_files default excludes now include bare directory names
(e.g. "node_modules") alongside the "dir/*" variants. The glob
matcher requires a '/' to match "dir/*", so the bare form skips
the directory traversal entirely instead of descending and filtering
each file — fixes 7MB result payloads from node_modules (#2200).
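To illustrate why the bare name matters, here is a deliberately simplified matcher. It is not the tool's real glob implementation and both function names are hypothetical. It only mirrors the behaviour described above, where "dir/*" needs a '/' in the candidate path.

```rust
/// Simplified exclude matching: "dir/*" matches entries directly under
/// `dir`, while a bare pattern matches the entry itself.
pub fn matches_exclude(pattern: &str, relative_path: &str) -> bool {
    match pattern.strip_suffix("/*") {
        Some(dir) => relative_path
            .strip_prefix(dir)
            .and_then(|rest| rest.strip_prefix('/'))
            .map_or(false, |child| !child.is_empty() && !child.contains('/')),
        None => relative_path == pattern,
    }
}

/// Decides whether the walker can skip a directory without descending.
pub fn should_prune_dir(dir_path: &str, excludes: &[&str]) -> bool {
    excludes.iter().any(|pattern| matches_exclude(pattern, dir_path))
}
```

With only "node_modules/*" in the list, the directory entry "node_modules" does not match, so a walker would descend and filter every file; adding the bare "node_modules" lets it prune at the directory.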
Added a background version check that fetches the latest GitHub
release tag once per TUI session. When a newer version is available,
the footer renders a persistent toast:
"vX.Y.Z available — run `codewhale update` and restart"
Silent on network errors (5s timeout).
Covers Openrouter, Novita, NvidiaNim, Fireworks, Sglang, Vllm, Ollama,
Atlascloud, and WanjieArk — the providers that were silently expanded
by the generic provider_env_vars loop but had no test coverage beyond
the existing Moonshot and OpenAI cases.
Co-Authored-By: bot_apk <apk@cognition.ai>
Harvested from PR #2118 by @Hmbown.
Includes Kimi/Moonshot OAuth, v0.8.45 release prep, the Codex/ChatGPT OAuth removal, open-source-first model defaults, and the safe green PR batch merged into main before the release branch refresh.