docs(config): clarify provider path suffix support
Records that #2506/#2508 are superseded by the safer #2558 path_suffix implementation, credits the original #1874 report and follow-up PR review trail, and documents that suffix overrides only affect chat completions while model and beta paths keep built-in routing.
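
The routing rule described above can be sketched as follows. This is a minimal illustration, not the #2558 implementation: `Endpoint`, `request_url`, and `unversioned` are hypothetical names, and the rule that an override is joined to the base URL with a trailing `/v1` removed is an assumption drawn from the config docs' wording ("the unversioned base URL plus `/chat/completions`"). DeepSeek beta routes are left out because this commit does not show their built-in paths.

```rust
/// Hypothetical endpoint kinds; illustrative names, not the project's types.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Endpoint {
    ChatCompletions,
    Models,
}

/// Drop trailing slashes and, if present, one trailing `/v1` segment.
/// Assumption: "unversioned base URL" means the base without that segment.
fn unversioned(base_url: &str) -> &str {
    let trimmed = base_url.trim_end_matches('/');
    trimmed.strip_suffix("/v1").unwrap_or(trimmed)
}

/// Build a request URL. `path_suffix` is consulted only for chat
/// completions; model listing always uses the built-in `/models` path.
pub fn request_url(base_url: &str, path_suffix: Option<&str>, endpoint: Endpoint) -> String {
    let base = base_url.trim_end_matches('/');
    match (endpoint, path_suffix) {
        // Override present: join it to the unversioned base.
        (Endpoint::ChatCompletions, Some(suffix)) => {
            format!("{}/{}", unversioned(base), suffix.trim_start_matches('/'))
        }
        // No override: default chat path under the configured base.
        (Endpoint::ChatCompletions, None) => format!("{}/chat/completions", base),
        // The override is ignored for model listing.
        (Endpoint::Models, _) => format!("{}/models", base),
    }
}
```

Under these assumptions, `request_url("https://your-gateway.example/v1", Some("/chat/completions"), Endpoint::ChatCompletions)` yields `https://your-gateway.example/chat/completions`, while the same call with `Endpoint::Models` still yields `https://your-gateway.example/v1/models`.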
+11
-5
@@ -103,6 +103,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
generated-context targets, trust markers, and trust config paths, and it
stores post-load signatures so auto-generated context deletion/regeneration
stays correct (#2636).
+- Configuration docs now show the provider-local `path_suffix` escape hatch
+for OpenAI-compatible gateways that accept `/chat/completions` but reject
+`/v1/chat/completions`, while making clear that model listing and DeepSeek
+beta routes keep their built-in paths (#1874).

### Community

@@ -122,11 +126,13 @@ dense tool-call transcript collapse/sidebar detail direction (#2738, #2734,
**@h3c-hexin** for the tool-agent model inheritance and configured
`skills_dir` fixes (#2736, #2737). Thanks also to **@qiyuanlicn** for the
checkpoint/resume report that shaped the sub-agent recovery slice (#2029),
-to **@bevis-wong** for the long-running shell/task liveness report (#1786),
-and to **@NASLXTO** and
-**@wuxixing** for the large-workspace startup reports (#697, #1827), and to
-**@linzhiqin2003** and **@merchloubna70-dot** for earlier context-cap and
-startup-diagnosis work that shaped this bounded fallback.
+**@bevis-wong** for the long-running shell/task liveness report (#1786),
+**@shuxiangxuebiancheng** for the third-party OpenAI-compatible path report
+(#1874), **@hongqitai** and **@cyq1017** for the follow-up path-suffix PR
+review trail (#2508, #2506), **@NASLXTO** and **@wuxixing** for the
+large-workspace startup reports (#697, #1827), and **@linzhiqin2003** and
+**@merchloubna70-dot** for earlier context-cap and startup-diagnosis work that
+shaped this bounded fallback.

## [0.8.53] - 2026-06-03

@@ -638,6 +638,11 @@ Current v0.9 track credits:
- **[NASLXTO](https://github.com/NASLXTO)** and
**[wuxixing](https://github.com/wuxixing)** — large-workspace startup
reports that shaped the bounded project-context fallback (#697, #1827)
+- **[shuxiangxuebiancheng](https://github.com/shuxiangxuebiancheng)**,
+**[hongqitai](https://github.com/hongqitai)**, and
+**[cyq1017](https://github.com/cyq1017)** — third-party
+OpenAI-compatible path-suffix report and follow-up review trail (#1874,
+#2508, #2506)

Current and recurring contributors include:

@@ -209,6 +209,19 @@ legacy top-level `base_url`, so the OpenAI-compatible provider receives it.
provider tables in one config, `[providers.openai].model` can be used as the
OpenAI-provider-specific override.

+If the gateway accepts `POST /chat/completions` but rejects
+`/v1/chat/completions`, set a provider-local `path_suffix`:
+
+```toml
+[providers.openai]
+base_url = "https://your-gateway.example/v1"
+path_suffix = "/chat/completions"
+```
+
+The suffix applies only to chat-completion requests. Model listing and
+DeepSeek beta paths keep their built-in routing so a generic gateway override
+does not accidentally rewrite `/models` or `/beta/completions`.
+
Local HTTP endpoints such as Ollama, SGLang, and vLLM are allowed by default
when they use localhost or loopback addresses. For a non-local `http://`
gateway, launch with `DEEPSEEK_ALLOW_INSECURE_HTTP=1` only on a trusted network:
@@ -744,6 +757,7 @@ If you are upgrading from older releases:
- `provider` (string, optional): `deepseek` (default), `nvidia-nim`, `openai`, `atlascloud`, `wanjie-ark`, `volcengine`, `openrouter`, `xiaomi-mimo`, `novita`, `fireworks`, `siliconflow`, `siliconflow-CN`, `arcee`, `moonshot`, `sglang`, `vllm`, or `ollama`. Legacy `deepseek-cn` configs are still accepted as an alias for `deepseek`; DeepSeek uses the same official host [`https://api.deepseek.com`](https://api-docs.deepseek.com/) worldwide. `nvidia-nim` targets NVIDIA's NIM-hosted DeepSeek endpoints through `https://integrate.api.nvidia.com/v1`; `openai` targets a generic OpenAI-compatible endpoint, defaulting to `https://api.openai.com/v1`; `atlascloud` targets AtlasCloud's OpenAI-compatible endpoint at `https://api.atlascloud.ai/v1`; `wanjie-ark` targets Wanjie Ark's OpenAI-compatible endpoint at `https://maas-openapi.wanjiedata.com/api/v1`; `volcengine` targets Volcengine Ark's OpenAI-compatible coding endpoint at `https://ark.cn-beijing.volces.com/api/coding/v3`; `openrouter` targets `https://openrouter.ai/api/v1`; `xiaomi-mimo` targets Xiaomi MiMo's OpenAI-compatible endpoint at `https://api.xiaomimimo.com/v1`; `novita` targets `https://api.novita.ai/v1`; `fireworks` targets `https://api.fireworks.ai/inference/v1`; `siliconflow` targets SiliconFlow, defaulting to `https://api.siliconflow.com/v1`; `siliconflow-CN` targets the SiliconFlow China regional endpoint while sharing `[providers.siliconflow]`; `arcee` targets Arcee AI's OpenAI-compatible endpoint at `https://api.arcee.ai/api/v1`; `moonshot` targets Moonshot/Kimi, defaulting to `https://api.moonshot.ai/v1`; `sglang` targets a self-hosted OpenAI-compatible endpoint, defaulting to `http://localhost:30000/v1`; `vllm` targets a self-hosted vLLM OpenAI-compatible endpoint, defaulting to `http://localhost:8000/v1`; `ollama` targets Ollama's OpenAI-compatible endpoint, defaulting to `http://localhost:11434/v1`.
- `api_key` (string, required for hosted providers): must be non-empty for DeepSeek/hosted providers (or set the provider API key env var). Self-hosted SGLang, vLLM, and Ollama can omit it.
- `base_url` (string, optional): defaults to `https://api.deepseek.com/beta` for DeepSeek's OpenAI-compatible Chat Completions API, including legacy `provider = "deepseek-cn"` configs. Other defaults are `https://integrate.api.nvidia.com/v1` for `nvidia-nim`, `https://api.openai.com/v1` for `openai`, `https://api.atlascloud.ai/v1` for `atlascloud`, `https://maas-openapi.wanjiedata.com/api/v1` for `wanjie-ark`, `https://ark.cn-beijing.volces.com/api/coding/v3` for `volcengine`, `https://openrouter.ai/api/v1` for `openrouter`, `https://api.xiaomimimo.com/v1` for `xiaomi-mimo`, `https://api.novita.ai/v1` for `novita`, `https://api.fireworks.ai/inference/v1` for `fireworks`, `https://api.siliconflow.com/v1` for `siliconflow`, `https://api.siliconflow.cn/v1` for `siliconflow-CN`, `https://api.arcee.ai/api/v1` for `arcee`, `https://api.moonshot.ai/v1` for `moonshot`, `http://localhost:30000/v1` for `sglang`, `http://localhost:8000/v1` for `vllm`, and `http://localhost:11434/v1` for `ollama`. Set `https://api.deepseek.com` or `https://api.deepseek.com/v1` explicitly to opt out of DeepSeek beta features.
+- `path_suffix` (string, optional provider-table key): override the chat-completions path for OpenAI-compatible gateways that do not serve `/v1/chat/completions`. For example, `[providers.openai] path_suffix = "/chat/completions"` sends chat requests to the unversioned base URL plus `/chat/completions`; `models` and `beta/*` requests keep their normal routing.
- `default_text_model` (string, optional): defaults to `deepseek-v4-pro` for DeepSeek and generic OpenAI-compatible endpoints, `deepseek-ai/deepseek-v4-pro` for NVIDIA NIM, `deepseek-ai/deepseek-v4-flash` for AtlasCloud, `deepseek-reasoner` for Wanjie Ark, `DeepSeek-V4-Pro` for Volcengine Ark, `deepseek/deepseek-v4-pro` for OpenRouter and Novita, `mimo-v2.5-pro` for Xiaomi MiMo, `accounts/fireworks/models/deepseek-v4-pro` for Fireworks, `deepseek-ai/DeepSeek-V4-Pro` for SiliconFlow, `trinity-large-thinking` for Arcee AI, `kimi-k2.6` for Moonshot, `deepseek-ai/DeepSeek-V4-Pro` for SGLang/vLLM, and `deepseek-coder:1.3b` for Ollama. Current public DeepSeek IDs are `deepseek-v4-pro` and `deepseek-v4-flash`, both with 1M context windows, 384K max output, and thinking mode enabled by default. Legacy `deepseek-chat` and `deepseek-reasoner` remain compatibility aliases for `deepseek-v4-flash` until July 24, 2026, except SiliconFlow maps `deepseek-reasoner` and `deepseek-r1` to its Pro model while `deepseek-chat` and `deepseek-v3` map to Flash. Provider-specific mappings translate `deepseek-v4-pro` / `deepseek-v4-flash` to each provider's model ID where supported. OpenRouter also recognizes recent large IDs such as `arcee-ai/trinity-large-thinking`, `minimax/minimax-m3`, `xiaomi/mimo-v2.5-pro`, `qwen/qwen3.6-flash`, `qwen/qwen3.6-35b-a3b`, `qwen/qwen3.6-max-preview`, `qwen/qwen3.6-27b`, `qwen/qwen3.6-plus`, `google/gemma-4-31b-it`, and `moonshotai/kimi-k2.6`; direct Arcee uses bare IDs such as `trinity-large-thinking` and `trinity-large-preview`; direct Xiaomi MiMo recognizes chat IDs `mimo-v2.5-pro` and `mimo-v2.5`, while TTS IDs are selected through `codewhale speech` / `tts`. Generic `openai`, `atlascloud`, `wanjie-ark`, `xiaomi-mimo`, `arcee`, and Ollama model IDs are passed through unchanged after known aliases are normalized. 
OpenRouter and SiliconFlow provider configs with a custom `base_url` also preserve explicit model values, which lets OpenAI-compatible gateways accept bare model IDs. Use `/models` or `codewhale models` to discover live IDs from your configured endpoint. `CODEWHALE_MODEL` overrides this for a single process; `DEEPSEEK_MODEL` is the legacy alias.
- `reasoning_effort` (string, optional): `off`, `low`, `medium`, `high`, or `max`; defaults to the configured UI tier. DeepSeek Platform receives top-level `thinking` / `reasoning_effort` fields. NVIDIA NIM receives equivalent settings through `chat_template_kwargs`.
- `allow_shell` (bool, optional): defaults to `false`; shell tools must be explicitly enabled.
@@ -107,9 +107,9 @@ v0.9 branch so the remaining Windows/manual checks are explicit.
| #2501 in-process LLM response cache | Conflicting | Defer; cache key risks noted in prior review. |
| #2502 web_run RwLock split | Mergeable | Manually harvested with panic-safety and shared cached-page reads; close/comment after branch is public. |
| #2505 subagent cap accounting | Draft/conflicting | Compare with current subagent cap tests before harvest. |
-| #2506 provider path suffix overrides | Draft/conflicting | Partly superseded by current provider path-suffix support; verify. |
+| #2506 provider path suffix overrides | Draft/conflicting / superseded | The current branch already contains provider-table `path_suffix` support from #2558 with the safer constrained behavior: only `chat/completions` uses the override, while `models` and DeepSeek `beta/*` keep their built-in routing. `cargo test -p codewhale-tui --bin codewhale-tui --locked api_url_with_suffix -- --nocapture` passed. Credit @cyq1017 for the earlier design/review trail; comment/close after branch is public, keeping #1874 tied to the shipped #2558 implementation/docs. |
| #2507 stream chunk timeout config | Draft/conflicting | Defer unless stabilization needs it. |
-| #2508 configurable path suffix | Conflicting | Likely superseded by #2506/current code; verify linked issue #2089. |
+| #2508 configurable path suffix | Conflicting / superseded | #2089 is already closed. The current implementation covers #1874's third-party gateway need without the broader env/CLI surface from #2508. Docs now show `[providers.openai].path_suffix = "/chat/completions"` and state that model/beta paths are not rewritten. Credit @hongqitai for the follow-up PR and @shuxiangxuebiancheng for the original #1874 report; close/comment after branch is public. |
| #2509 parallel read-only web search | Closed / already merged via #2504 | Already present in `origin/main` as `a09af2024`; closed as harvested/superseded on 2026-06-04. |
| #2510 custom DuckDuckGo endpoint | Draft/mergeable | Low priority; defer unless docs/search lane takes it. |
| #2511 ToolCallBefore hooks | Conflicting | Defer to hook lifecycle lane. |