PR Custody

feat: add Nebius Token Factory as a model provider (direct or via AI Gateway)

@opencolinchecks n/achecks…feat/nebius-token-factory → main17 files · +564 −106updated 1mo ago

▸Description

Summary

Adds Nebius Token Factory to the model picker so users can run their TrustClaw agent on open-source models (DeepSeek, Qwen, Llama, GLM, gpt-oss) in addition to Anthropic. The self-hoster picks how Nebius traffic is routed.

Routing modes (via `NEBIUS_ROUTING` env)

Mode	What it does	Setup
`direct`	Hit Nebius's OpenAI-compatible API with `NEBIUS_API_KEY`. Lowest latency, transparent Nebius bill.	Set `NEBIUS_API_KEY` (+ optional `NEBIUS_BASE_URL` to override the EU endpoint).
`gateway`	Route through Vercel AI Gateway as a first-class provider. Unified observability + one Vercel bill, costs an extra hop.	Register a Nebius credential in the Vercel project's AI Gateway settings.
unset	Nebius models hidden from the picker entirely. Default.	—

Anthropic always routes through AI Gateway regardless — this PR doesn't change that path.

Motivation for offering both

Some self-hosters want the AI Gateway philosophy carried through everywhere (single chokepoint, consolidated observability and billing). Others want the lowest possible latency and a direct Nebius bill they can audit. Forcing one shape would alienate half of either camp, so the routing is a flag.

What changed

src/server/clients/nebius.ts — routing-aware client factory. resolveModel(id) returns an @ai-sdk/openai-compatible LanguageModel for Nebius ids in direct mode, null otherwise so callers fall through to the existing Gateway string-model path.
src/server/api/routers/trustclaw/models.ts — single model catalog (3 Anthropic + 5 Nebius) replacing the duplicated MODELS arrays in onboarding consts and the settings page. Includes a toGatewayModelId() helper used by all three LLM call sites.
getAvailableModels tRPC query — gates Nebius on the routing flag; returns the active mode so the settings page can label which path is in use.
agent/setup.ts + the two compaction call sites — call resolveModel() first, fall through to the Gateway string. Anthropic cacheControl provider options are gated on isAnthropicModel(modelId) — even when Nebius runs through Gateway, those options are Anthropic-specific.
Onboarding model step + settings page — group models by provider; render an empty-state hint with the env-var setup instructions when Nebius is disabled, or a one-liner indicating the active routing mode when on.
README + CLAUDE.md — new "Model providers" sections documenting the routing modes and the models.ts registry.
@ai-sdk/openai-compatible added as a dep.
ALLOWED_ANTHROPIC_MODELS kept as a deprecated re-export alias so callers that already imported the Anthropic-only name still type-check.

Preserved

Embedding model — still OpenAI text-embedding-3-large via AI Gateway. Switching would invalidate existing pgvector rows.
DB column name instance.anthropicModel — now a misnomer but renaming requires a migration for zero semantic gain. Catalog ids carry provider info (nebius/<org>/<model> vs bare Claude ids).
3-layer context management (pruning / memory flush / compaction) works unchanged across providers since it's all token-counting at the SDK level.

Deliberately deferred (better as follow-ups)

Failover from direct to gateway on Nebius credit exhaustion. Doable, but needs verified detection of the credit-error shape and a LanguageModel wrapper that doesn't break mid-stream. Worth its own focused PR.
Per-instance routing override. Right now it's a deployment-wide knob; could be moved into the per-user settings table if users actually want to choose.

Test plan

pnpm tsc --noEmit — clean
pnpm build — clean (Prisma generate + Next.js build both pass)
pnpm lint — only a pre-existing warning unrelated to this change
Manual smoke: set NEBIUS_ROUTING=direct + NEBIUS_API_KEY, pick DeepSeek V3.2 in settings, send a chat → expect streamed response
Manual smoke: set NEBIUS_ROUTING=gateway (after registering Nebius in Vercel Gateway), repeat → expect same response, traffic visible in Gateway dashboard
Manual smoke: unset NEBIUS_ROUTING, confirm Nebius models are hidden from the picker

The manual tests aren't possible from this checkout without a real Nebius key + a Vercel deployment with Gateway credentials, so they're listed as TODO for reviewers / for me to do before merge.

Branch / fork

cc @sarahsimionescu — happy to iterate on the routing-mode shape (single env vs per-instance setting, naming, default) before merging.

🤖 Generated with Claude Code

loading diff…

Summary

Routing modes (via NEBIUS_ROUTING env)

Motivation for offering both

What changed

Preserved

Deliberately deferred (better as follow-ups)

Test plan

Branch / fork

Summary

Routing modes (via NEBIUS_ROUTING env)

Motivation for offering both

What changed

Preserved

Deliberately deferred (better as follow-ups)

Test plan

Branch / fork

Routing modes (via `NEBIUS_ROUTING` env)

Routing modes (via `NEBIUS_ROUTING` env)