docs: align Hugging Face provider docs, errors, and tests with shipped route
+21
-12
@@ -7,8 +7,7 @@ those models become discoverable, evaluable, routable, servable, and exportable
 without weakening the current terminal-agent contract: local workspace control,
 explicit provider auth, approval gates, and clear privacy boundaries.
 
-This document is roadmap language. It does not mean every workset below is
-implemented today.
+This document is roadmap language. Some worksets below are roadmap-only.
 
 ## Implemented Today
 
@@ -19,6 +18,10 @@ implemented today.
 OpenAI-compatible endpoints, SGLang, vLLM, and Ollama are supported provider
 paths where their IDs appear in `/provider`, `codewhale --provider`, or
 `codewhale models`.
+- Hugging Face Inference Providers are available through the
+OpenAI-compatible router at `https://router.huggingface.co/v1`. Select the
+route with `huggingface`, `hugging-face`, `hugging_face`, or `hf`; configure
+`HUGGINGFACE_API_KEY` or `HF_TOKEN` for auth.
 - Model auto-routing chooses a concrete DeepSeek model and thinking level per
 turn. It is not a TUI mode.
 - Fin is the fast `deepseek-v4-flash` thinking-off path for routing,
@@ -27,11 +30,10 @@ implemented today.
 - Self-hosted OpenAI-compatible endpoints can be used through SGLang, vLLM,
 Ollama, or the generic `openai` provider configuration.
 
-## Not Implemented Yet
+## Still Planned
 
-- A native Hugging Face provider or Hub browser.
-- Built-in Hugging Face model card, dataset, adapter, safetensors, or Jobs
-workflows.
+- Hugging Face Hub browsing, upload/export, model card, dataset, adapter,
+safetensors, or Jobs workflows.
 - Native Unsloth, NeMo, or Arcee integrations.
 - A dedicated Model Lab UI tab.
 - Built-in benchmark suites, eval leaderboards, hosted observability, or
@@ -57,18 +59,24 @@ describe a model as available before CodeWhale can actually route to it.
 
 ## Hugging Face Workset
 
+Implemented today:
+
+- Hugging Face Inference Providers as an explicit OpenAI-compatible router
+provider, selected with `huggingface`, `hugging-face`, `hugging_face`, or
+`hf`.
+- Model IDs are sent to the router exactly as selected, including
+org-prefixed Hugging Face model IDs.
+
 Planned scope:
 
 - Hub API auth and model discovery.
 - Model cards, licenses, tags, safetensors metadata, adapters, and dataset
 links surfaced in a terminal-friendly way.
-- Inference Providers as explicit provider choices when the user configures
-them.
 - Hugging Face Jobs as an optional remote execution path for user-approved
 experiments.
 
-Non-goal for now: claiming a native Hugging Face provider exists before it is
-implemented in code.
+Non-goal for now: treating the router route as Hub browsing/export, or
+inferring Hub upload/export auth from the inference-provider API key.
 
 ## Unsloth Workset
 
@@ -138,8 +146,9 @@ Planned scope:
 - Local files, prompts, transcripts, traces, model outputs, eval results,
 adapters, datasets, and checkpoints should remain local unless the user
 explicitly chooses a provider or export destination.
-- Provider auth must remain explicit. `DEEPSEEK_*`, OpenRouter, Hugging Face,
-and self-hosted credentials should not be inferred from unrelated config.
+- Provider auth must remain explicit. `DEEPSEEK_*`, OpenRouter,
+`HUGGINGFACE_API_KEY` / `HF_TOKEN`, and self-hosted credentials should not be
+inferred from unrelated config.
 - Exportable artifacts should include provenance: source model, provider,
 route, tool policy, eval inputs, and redaction status.
 - Public sharing, hosted telemetry, sponsorship badges, and external branding
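Reviewer sketch: the alias handling these hunks document can be pictured as a small lookup. This is a minimal illustration under stated assumptions, not CodeWhale's code. `HF_ALIASES` and `canonical_provider` are hypothetical names, and the trimming and case-insensitive matching are assumptions the diff does not confirm.

```python
# Hypothetical sketch; names and matching rules are assumptions, not CodeWhale's code.
HF_ALIASES = frozenset({"huggingface", "hugging-face", "hugging_face", "hf"})


def canonical_provider(name: str) -> str:
    """Map any documented Hugging Face alias to the ID `huggingface`.

    Names that are not Hugging Face aliases are returned normalized but
    otherwise unchanged, so other providers are unaffected.
    """
    key = name.strip().lower()  # assumption: whitespace and case are ignored
    return "huggingface" if key in HF_ALIASES else key
```

All four documented spellings resolve to the same route, while an unrelated ID such as `vllm` passes through untouched.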
+18
-1
@@ -44,6 +44,11 @@ Use any of these surfaces to select a provider:
 as legacy aliases for `deepseek`. They do not select a different official host;
 DeepSeek uses the same official API host worldwide.
 
+`huggingface`, `hugging-face`, `hugging_face`, and `hf` all select the
+Hugging Face Inference Providers route. This is the OpenAI-compatible router
+path for chat/inference, not Hub browsing, model-card inspection, uploads, or
+artifact export.
+
 Fresh shared config writes to `~/.codewhale/config.toml`. Existing
 `~/.deepseek/config.toml` files are still read for compatibility.
 
@@ -128,7 +133,7 @@ endpoint.
 | `sglang` | `[providers.sglang]` | Optional `SGLANG_API_KEY` | `SGLANG_BASE_URL`; default `http://localhost:30000/v1` | `deepseek-ai/DeepSeek-V4-Pro`, `deepseek-ai/DeepSeek-V4-Flash` | Self-hosted OpenAI-compatible route. Localhost deployments commonly omit auth. `SGLANG_MODEL` is accepted. |
 | `vllm` | `[providers.vllm]` | Optional `VLLM_API_KEY` | `VLLM_BASE_URL`; default `http://localhost:8000/v1` | `deepseek-ai/DeepSeek-V4-Pro`, `deepseek-ai/DeepSeek-V4-Flash` | Self-hosted vLLM OpenAI-compatible route. Localhost deployments commonly omit auth. `VLLM_MODEL` is accepted. |
 | `ollama` | `[providers.ollama]` | Optional `OLLAMA_API_KEY` | `OLLAMA_BASE_URL`; default `http://localhost:11434/v1` | `deepseek-coder:1.3b`; provider-hinted custom tags pass through | Self-hosted Ollama OpenAI-compatible route. Localhost deployments commonly omit auth. `OLLAMA_MODEL` is accepted. |
-| `huggingface` | `[providers.huggingface]` | `HUGGINGFACE_API_KEY`, `HF_TOKEN` | `HUGGINGFACE_BASE_URL`; default `https://router.huggingface.co/v1` | `deepseek-ai/DeepSeek-V4-Pro`, `deepseek-ai/DeepSeek-V4-Flash` | Hugging Face Inference Providers OpenAI-compatible route. Org-prefixed model IDs pass through. |
+| `huggingface` | `[providers.huggingface]` | `HUGGINGFACE_API_KEY`, `HF_TOKEN` | `HUGGINGFACE_BASE_URL`, `HF_BASE_URL`; default `https://router.huggingface.co/v1` | `deepseek-ai/DeepSeek-V4-Pro`, `deepseek-ai/DeepSeek-V4-Flash` | Hugging Face Inference Providers OpenAI-compatible router route. Accepted aliases: `huggingface`, `hugging-face`, `hugging_face`, `hf`. Org-prefixed model IDs pass through. `HUGGINGFACE_MODEL` and `HF_MODEL` are accepted. Hub browsing/export are separate future features. |
 
 ### Xiaomi MiMo Notes
 
@@ -223,6 +228,18 @@ the endpoint's ability to accept OpenAI-compatible `tools` payloads. A custom
 OpenAI-compatible or local endpoint can still reject tool calls even if
 CodeWhale can send the schema.
 
+### Hugging Face Inference Providers Notes
+
+The shipped Hugging Face route targets the OpenAI-compatible Inference Providers
+router at `https://router.huggingface.co/v1`. Configure auth with
+`HUGGINGFACE_API_KEY` first, or `HF_TOKEN` as a fallback. Configure the endpoint
+with `HUGGINGFACE_BASE_URL` first, or `HF_BASE_URL` as a fallback; configure the
+model with `HUGGINGFACE_MODEL` first, or `HF_MODEL` as a fallback.
+
+This route does not imply Hub browsing, model-card metadata, dataset access,
+Jobs, uploads, or export. Those remain explicit Model Lab work items so
+provider auth and artifact movement stay separate.
+
 ### When a Local Model Prints Tool JSON
 
 CodeWhale only executes tools when the provider returns Chat Completions
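Reviewer sketch: the "first, or ... as a fallback" ordering in the new Hugging Face notes can be illustrated as below. This is a sketch under stated assumptions, not the shipped resolver. `first_set` and `resolve_hf_settings` are hypothetical helpers, treating empty or whitespace-only values as unset is an assumption, and any precedence between these variables and `[providers.huggingface]` in the config file is left out because the diff does not describe it.

```python
import os
from typing import Mapping, Optional

# Default router endpoint as documented in the provider table.
DEFAULT_BASE_URL = "https://router.huggingface.co/v1"


def first_set(env: Mapping[str, str], *names: str) -> Optional[str]:
    """Return the first non-empty value among `names`, in the order given."""
    for name in names:
        value = env.get(name, "").strip()  # assumption: blank counts as unset
        if value:
            return value
    return None


def resolve_hf_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Hypothetical resolver: primary variable first, short alias as fallback."""
    return {
        "api_key": first_set(env, "HUGGINGFACE_API_KEY", "HF_TOKEN"),
        "base_url": first_set(env, "HUGGINGFACE_BASE_URL", "HF_BASE_URL")
        or DEFAULT_BASE_URL,
        # Org-prefixed model IDs are returned as given, with no rewriting.
        "model": first_set(env, "HUGGINGFACE_MODEL", "HF_MODEL"),
    }
```

Passing a plain dictionary instead of `os.environ` keeps the sketch testable without touching the real environment. A missing key resolves to `None` rather than borrowing a credential from another provider, which matches the "provider auth must remain explicit" rule in the roadmap file.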