156 lines
5.8 KiB
Markdown
156 lines
5.8 KiB
Markdown
# Model Lab Roadmap
|
|
|
|
Model Lab is the planned open-model workbench for CodeWhale. The north star is
|
|
simple: CodeWhale should become the best terminal coding agent for open-source
|
|
and open-weight models across every provider that offers them. Model Lab is how
|
|
those models become discoverable, evaluable, routable, servable, and exportable
|
|
without weakening the current terminal-agent contract: local workspace control,
|
|
explicit provider auth, approval gates, and clear privacy boundaries.
|
|
|
|
This document is roadmap language. Some worksets below are roadmap-only.
|
|
|
|
## Implemented Today
|
|
|
|
- DeepSeek is the first-class default provider today, with `deepseek-v4-pro`,
|
|
`deepseek-v4-flash`, streaming thinking blocks, Fin routing, `DEEPSEEK_*`
|
|
environment variables, and `~/.deepseek` config compatibility.
|
|
- OpenRouter, Novita, Fireworks, NVIDIA NIM, AtlasCloud, Wanjie Ark, generic
|
|
OpenAI-compatible endpoints, SGLang, vLLM, and Ollama are supported provider
|
|
paths where their IDs appear in `/provider`, `codewhale --provider`, or
|
|
`codewhale models`.
|
|
- Hugging Face Inference Providers are available through the
|
|
OpenAI-compatible router at `https://router.huggingface.co/v1`. Select the
|
|
route with `huggingface`, `hugging-face`, `hugging_face`, or `hf`; configure
|
|
`HUGGINGFACE_API_KEY` or `HF_TOKEN` for auth.
|
|
- Model auto-routing chooses a concrete DeepSeek model and thinking level per
|
|
turn. It is not a TUI mode.
|
|
- Fin is the fast `deepseek-v4-flash` thinking-off path for routing,
|
|
summaries, cheap checks, RLM child calls, wakeup verification, and
|
|
binary-completion checks.
|
|
- Self-hosted OpenAI-compatible endpoints can be used through SGLang, vLLM,
|
|
Ollama, or the generic `openai` provider configuration.
|
|
|
|
## Still Planned
|
|
|
|
- Hugging Face Hub browsing, upload/export, model card, dataset, adapter,
|
|
safetensors, or Jobs workflows.
|
|
- Native Unsloth, NeMo, or Arcee integrations.
|
|
- A dedicated Model Lab UI tab.
|
|
- Built-in benchmark suites, eval leaderboards, hosted observability, or
|
|
training-infrastructure orchestration.
|
|
|
|
Until those land, use the provider paths above, MCP servers, or external
|
|
workflows explicitly configured by the user.
|
|
|
|
## Model Lab Principle
|
|
|
|
Model Lab should help users answer practical questions:
|
|
|
|
- Which model should handle this turn?
|
|
- Which open or open-weight model can I run locally or through a trusted
|
|
provider?
|
|
- Which provider offers this model with the latency, price, context window,
|
|
license, and privacy posture I need?
|
|
- What did this model cost, how did it perform, and what data left my machine?
|
|
- Can I reproduce, export, or self-host the route?
|
|
|
|
It should never hide provider boundaries, silently upload local artifacts, or
|
|
describe a model as available before CodeWhale can actually route to it.
|
|
|
|
## Hugging Face Workset
|
|
|
|
Implemented today:
|
|
|
|
- Hugging Face Inference Providers as an explicit OpenAI-compatible router
|
|
provider, selected with `huggingface`, `hugging-face`, `hugging_face`, or
|
|
`hf`.
|
|
- Model IDs are sent to the router exactly as selected, including
|
|
org-prefixed Hugging Face model IDs.
|
|
|
|
Planned scope:
|
|
|
|
- Hub API auth and model discovery.
|
|
- Model cards, licenses, tags, safetensors metadata, adapters, and dataset
|
|
links surfaced in a terminal-friendly way.
|
|
- Hugging Face Jobs as an optional remote execution path for user-approved
|
|
experiments.
|
|
|
|
Non-goal for now: treating the router route as Hub browsing/export, or
|
|
inferring Hub upload/export auth from the inference-provider API key.
|
|
|
|
## Unsloth Workset
|
|
|
|
Planned scope:
|
|
|
|
- Fine-tuning recipes and adapter workflows for users who already own the data
|
|
and compute path.
|
|
- Export guidance that keeps dataset, adapter, and checkpoint locations explicit.
|
|
- Compatibility notes for models that can return to local serving or a hosted
|
|
OpenAI-compatible endpoint.
|
|
|
|
## NeMo Workset
|
|
|
|
Planned scope:
|
|
|
|
- Training and alignment workflow notes for users operating NVIDIA-centric
|
|
infrastructure.
|
|
- Clear boundaries between NVIDIA NIM inference support that exists today and
|
|
future NeMo training or customization workflows.
|
|
|
|
## Arcee Workset
|
|
|
|
Planned scope:
|
|
|
|
- Small-model routing and specialization experiments.
|
|
- Exportable routes that make it clear when a task is handled by a smaller
|
|
model, Fin, or full DeepSeek reasoning.
|
|
|
|
## Serving Workset
|
|
|
|
Planned scope:
|
|
|
|
- Better local and private serving ergonomics for SGLang, vLLM, Ollama, and
|
|
OpenAI-compatible gateways.
|
|
- Health checks, model listing, context-window metadata, and route validation.
|
|
- No silent network exposure: public endpoints must be configured explicitly.
|
|
|
|
## Eval Workset
|
|
|
|
Planned scope:
|
|
|
|
- Reproducible task suites for coding, review, docs, release checks, and
|
|
long-context workflows.
|
|
- Side-by-side route comparisons where the exact model, provider, thinking
|
|
level, prompt, and tool policy are captured.
|
|
|
|
## Observability Workset
|
|
|
|
Planned scope:
|
|
|
|
- Local-first traces for turn routing, tool calls, approvals, cost, cache
|
|
behavior, and context pressure.
|
|
- Export rules that redact secrets and require explicit user action before data
|
|
leaves the machine.
|
|
|
|
## Training Infra Workset
|
|
|
|
Planned scope:
|
|
|
|
- Recipes for dataset preparation, adapter training, artifact naming, and
|
|
promotion into serving.
|
|
- Separation between local/private artifacts and anything published to a hub or
|
|
registry.
|
|
|
|
## Privacy And Export Rules
|
|
|
|
- Local files, prompts, transcripts, traces, model outputs, eval results,
|
|
adapters, datasets, and checkpoints should remain local unless the user
|
|
explicitly chooses a provider or export destination.
|
|
- Provider auth must remain explicit. `DEEPSEEK_*`, OpenRouter,
|
|
`HUGGINGFACE_API_KEY` / `HF_TOKEN`, and self-hosted credentials should not be
|
|
inferred from unrelated config.
|
|
- Exportable artifacts should include provenance: source model, provider,
|
|
route, tool policy, eval inputs, and redaction status.
|
|
- Public sharing, hosted telemetry, sponsorship badges, and external branding
|
|
require maintainer approval.
|