Adds Nebius Token Factory to the model picker so users can run their TrustClaw agent on open-source models (DeepSeek, Qwen, Llama, GLM, gpt-oss) in addition to Anthropic. The self-hoster picks how Nebius traffic is routed.
Routing modes (`NEBIUS_ROUTING` env)

| Mode | What it does | Setup |
|---|---|---|
| `direct` | Hit Nebius's OpenAI-compatible API with `NEBIUS_API_KEY`. Lowest latency, transparent Nebius bill. | Set `NEBIUS_API_KEY` (+ optional `NEBIUS_BASE_URL` to override the EU endpoint). |
| `gateway` | Route through Vercel AI Gateway as a first-class provider. Unified observability + one Vercel bill, costs an extra hop. | Register a Nebius credential in the Vercel project's AI Gateway settings. |
| unset | Nebius models hidden from the picker entirely. Default. | None. |

Anthropic always routes through AI Gateway regardless — this PR doesn't change that path.
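
For reviewers who want the flag semantics spelled out, here is a minimal sketch of how the three modes could map onto picker visibility. It is not the PR's code: `parseNebiusRouting`, `visibleModels`, and the `CatalogEntry` shape are hypothetical names, and treating an unrecognized value the same as unset is an assumption.

```typescript
// Hypothetical sketch; only the NEBIUS_ROUTING values come from the table above.
type NebiusRouting = "direct" | "gateway";

interface CatalogEntry {
  id: string; // catalog id, e.g. a nebius/-prefixed id or a bare Claude id
  provider: "anthropic" | "nebius";
}

// Returns the active mode, or null when the flag is unset.
// Assumption: unrecognized values are treated like unset.
function parseNebiusRouting(raw: string | undefined): NebiusRouting | null {
  const value = raw?.trim().toLowerCase();
  return value === "direct" || value === "gateway" ? value : null;
}

// Anthropic entries are always listed; Nebius entries only when a mode is active.
function visibleModels(
  catalog: CatalogEntry[],
  mode: NebiusRouting | null,
): CatalogEntry[] {
  return catalog.filter((m) => m.provider === "anthropic" || mode !== null);
}
```

A caller would pass the raw environment value, e.g. `visibleModels(catalog, parseNebiusRouting(process.env.NEBIUS_ROUTING))`, and could return the parsed mode alongside the list so the settings page can label the active path.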
Some self-hosters want the AI Gateway philosophy carried through everywhere (single chokepoint, consolidated observability and billing). Others want the lowest possible latency and a direct Nebius bill they can audit. Forcing either shape would alienate one of the two camps, so the routing is a flag.
Changes:

- `src/server/clients/nebius.ts`: routing-aware client factory. `resolveModel(id)` returns an `@ai-sdk/openai-compatible` `LanguageModel` for Nebius ids in `direct` mode, and `null` otherwise so callers fall through to the existing Gateway string-model path.
- `src/server/api/routers/trustclaw/models.ts`: single model catalog (3 Anthropic + 5 Nebius) replacing the duplicated `MODELS` arrays in onboarding consts and the settings page. Includes a `toGatewayModelId()` helper used by all three LLM call sites.
- `getAvailableModels` tRPC query: gates Nebius on the routing flag and returns the active mode so the settings page can label which path is in use.
- `agent/setup.ts` + the two compaction call sites: call `resolveModel()` first, then fall through to the Gateway string. Anthropic `cacheControl` provider options are gated on `isAnthropicModel(modelId)`, because those options are Anthropic-specific even when Nebius runs through Gateway.
- `models.ts` registry.
- `@ai-sdk/openai-compatible` added as a dep.
- `ALLOWED_ANTHROPIC_MODELS` kept as a deprecated re-export alias so callers that already imported the Anthropic-only name still type-check.

Not changed in this PR:

- Embeddings: `text-embedding-3-large` via AI Gateway. Switching would invalidate existing pgvector rows.
- `instance.anthropicModel`: now a misnomer, but renaming requires a migration for zero semantic gain. Catalog ids carry provider info (`nebius/<org>/<model>` vs bare Claude ids).
- Automatic fallback from `direct` to `gateway` on Nebius credit exhaustion. Doable, but needs verified detection of the credit-error shape and a `LanguageModel` wrapper that doesn't break mid-stream. Worth its own focused PR.

Testing:

- `pnpm tsc --noEmit`: clean
- `pnpm build`: clean (Prisma generate + Next.js build both pass)
- `pnpm lint`: only a pre-existing warning unrelated to this change
- `NEBIUS_ROUTING=direct` + `NEBIUS_API_KEY`, pick DeepSeek V3.2 in settings, send a chat → expect streamed response
- `NEBIUS_ROUTING=gateway` (after registering Nebius in Vercel Gateway), repeat → expect same response, traffic visible in Gateway dashboard
- Unset `NEBIUS_ROUTING`, confirm Nebius models are hidden from the picker

The manual tests aren't possible from this checkout without a real Nebius key and a Vercel deployment with Gateway credentials, so they're listed as TODO for reviewers and for me to do before merge.

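
To make the resolve-then-fall-through pattern at the LLM call sites easier to review, here is a rough sketch of the control flow under stated assumptions. It is not the code in this PR: `DirectModel` stands in for the AI SDK `LanguageModel` object, the routing mode is passed in explicitly rather than read from the environment, the Gateway id mapping is injected because `toGatewayModelId()` is not reproduced here, and `isAnthropicModel` is assumed to mean "no `nebius/` prefix".

```typescript
// Illustrative only; function bodies are assumptions, not the PR's implementation.
type RoutingMode = "direct" | "gateway" | null;

// Stand-in for the LanguageModel object a direct Nebius client would return.
interface DirectModel {
  provider: "nebius-direct";
  modelId: string;
}

const NEBIUS_PREFIX = "nebius/";

function isNebiusId(id: string): boolean {
  return id.startsWith(NEBIUS_PREFIX);
}

// Assumed rule: ids without the nebius/ prefix are bare Claude ids.
function isAnthropicModel(id: string): boolean {
  return !isNebiusId(id);
}

// Direct client only for Nebius ids in direct mode; null means "use the Gateway string path".
function resolveModel(id: string, mode: RoutingMode): DirectModel | null {
  if (mode !== "direct" || !isNebiusId(id)) return null;
  return { provider: "nebius-direct", modelId: id.slice(NEBIUS_PREFIX.length) };
}

// Call-site pattern: try the direct client first, otherwise fall through to a Gateway model string.
function pickModel(
  id: string,
  mode: RoutingMode,
  toGatewayId: (id: string) => string,
): DirectModel | string {
  return resolveModel(id, mode) ?? toGatewayId(id);
}

// Cache-control options are attached only for Anthropic ids, whichever route is used.
function providerOptionsFor(id: string): Record<string, unknown> {
  return isAnthropicModel(id)
    ? { anthropic: { cacheControl: { type: "ephemeral" } } }
    : {};
}
```

Returning `null` rather than throwing keeps the Gateway path as the default at every call site, so Anthropic ids and `gateway`-mode Nebius ids take the same branch.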
cc @sarahsimionescu — happy to iterate on the routing-mode shape (single env vs per-instance setting, naming, default) before merging.
🤖 Generated with Claude Code