docs: align Hugging Face provider docs, errors, and tests with shipped route

This commit is contained in:
Matt Van Horn
2026-06-07 02:32:41 -07:00
parent 8dff2f7525
commit a855b41d91
6 changed files with 370 additions and 20 deletions
+21 -12
View File
@@ -7,8 +7,7 @@ those models become discoverable, evaluable, routable, servable, and exportable
without weakening the current terminal-agent contract: local workspace control,
explicit provider auth, approval gates, and clear privacy boundaries.
This document is roadmap language. It does not mean every workset below is
implemented today.
This document is roadmap language. Some worksets below are roadmap-only.
## Implemented Today
@@ -19,6 +18,10 @@ implemented today.
OpenAI-compatible endpoints, SGLang, vLLM, and Ollama are supported provider
paths where their IDs appear in `/provider`, `codewhale --provider`, or
`codewhale models`.
- Hugging Face Inference Providers are available through the
OpenAI-compatible router at `https://router.huggingface.co/v1`. Select the
route with `huggingface`, `hugging-face`, `hugging_face`, or `hf`; configure
`HUGGINGFACE_API_KEY` or `HF_TOKEN` for auth.
- Model auto-routing chooses a concrete DeepSeek model and thinking level per
turn. It is not a TUI mode.
- Fin is the fast `deepseek-v4-flash` thinking-off path for routing,
@@ -27,11 +30,10 @@ implemented today.
- Self-hosted OpenAI-compatible endpoints can be used through SGLang, vLLM,
Ollama, or the generic `openai` provider configuration.
## Not Implemented Yet
## Still Planned
- A native Hugging Face provider or Hub browser.
- Built-in Hugging Face model card, dataset, adapter, safetensors, or Jobs
workflows.
- Hugging Face Hub browsing, upload/export, model card, dataset, adapter,
safetensors, or Jobs workflows.
- Native Unsloth, NeMo, or Arcee integrations.
- A dedicated Model Lab UI tab.
- Built-in benchmark suites, eval leaderboards, hosted observability, or
@@ -57,18 +59,24 @@ describe a model as available before CodeWhale can actually route to it.
## Hugging Face Workset
Implemented today:
- Hugging Face Inference Providers as an explicit OpenAI-compatible router
provider, selected with `huggingface`, `hugging-face`, `hugging_face`, or
`hf`.
- Model IDs are sent to the router exactly as selected, including
org-prefixed Hugging Face model IDs.
Planned scope:
- Hub API auth and model discovery.
- Model cards, licenses, tags, safetensors metadata, adapters, and dataset
links surfaced in a terminal-friendly way.
- Inference Providers as explicit provider choices when the user configures
them.
- Hugging Face Jobs as an optional remote execution path for user-approved
experiments.
Non-goal for now: claiming a native Hugging Face provider exists before it is
implemented in code.
Non-goal for now: treating the router route as Hub browsing/export, or
inferring Hub upload/export auth from the inference-provider API key.
## Unsloth Workset
@@ -138,8 +146,9 @@ Planned scope:
- Local files, prompts, transcripts, traces, model outputs, eval results,
adapters, datasets, and checkpoints should remain local unless the user
explicitly chooses a provider or export destination.
- Provider auth must remain explicit. `DEEPSEEK_*`, OpenRouter, Hugging Face,
and self-hosted credentials should not be inferred from unrelated config.
- Provider auth must remain explicit. `DEEPSEEK_*`, OpenRouter,
`HUGGINGFACE_API_KEY` / `HF_TOKEN`, and self-hosted credentials should not be
inferred from unrelated config.
- Exportable artifacts should include provenance: source model, provider,
route, tool policy, eval inputs, and redaction status.
- Public sharing, hosted telemetry, sponsorship badges, and external branding