Description
Adds preload.auto_preload_count: N to tool-router session create. When set, the server resolves the top-N most-used tools for the (projectId, userId) pair from past tool-router usage and adds them to session.tools alongside the manual preload.tools list. Behind LaunchDarkly toolrouterAutoPreloadEnabled (default off).
What's in this PR
API surface (snake_case wire field):
- preload.auto_preload_count: number on POST / PATCH session (v3 + v3.1)
- Resolved slugs ride through session.tools / tool_router_tools alongside the manual preload.tools: no new top-level field on the public response, and no internal metadata (computed_at / expires_at / reason) leaks to the wire
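
As a rough illustration of the wire shape, here is a minimal sketch of a session-create body and of how resolved slugs could sit alongside the manual list. Only preload.tools and preload.auto_preload_count come from this PR; the user_id field, the example slugs, and the mergeSessionTools helper (including its manual-first ordering) are assumptions made for the example.

```typescript
// Hypothetical request-body types; only `tools` and `auto_preload_count`
// under `preload` are named in this PR.
interface PreloadConfig {
  tools?: string[];
  auto_preload_count?: number;
}

interface SessionCreateBody {
  user_id: string; // assumed field name
  preload?: PreloadConfig;
}

// Example payload; the slug is illustrative only.
const exampleBody: SessionCreateBody = {
  user_id: "user_123",
  preload: { tools: ["GITHUB_CREATE_ISSUE"], auto_preload_count: 5 },
};

// Assumed merge: manual slugs first, then auto-resolved ones,
// skipping case-insensitive duplicates.
function mergeSessionTools(manual: string[], resolved: string[]): string[] {
  const seen = new Set<string>();
  const merged: string[] = [];
  for (const slug of [...manual, ...resolved]) {
    const key = slug.toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      merged.push(slug);
    }
  }
  return merged;
}
```

The response keeps a single flat tool list, so a client cannot tell which entries were auto-resolved, which matches the "no new top-level field" goal above.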
Architecture:
- Per-(project, user) ranked usage pool lives in a new user_tool_usage_pool Postgres table. One row per pair; full ranked pool in a single JSONB column (we always read/write it together).
- Refreshed from ClickHouse tool_execution_logs once per LD refreshAfterHours (default 24h) under a row-level refreshLockedUntil lock that other workers see and skip.
- Per-session snapshot lives in a new auto_preload_snapshot JSONB column on tool_router_sessions so background TTL refresh never bumps configVersion. PATCH optimistic concurrency stays clean.
- Selector pipeline (pure): dedup against manual preload + helper slugs → drop slugs deprecated since pool computed → re-apply session filters → drop oversize tools (chars/N heuristic) → take top-N by score → sort alphabetically (prompt-cache stability).
- Stale-preservation: if fresh ClickHouse query is empty AND existing pool is non-empty, the row is preserved (computedAt bumped, data untouched). Quiet users keep their preferences. Hard ceiling at maxStalePreservationDays (default 30d).
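
The selector pipeline described above can be sketched as a single pure function. This is an illustration under stated assumptions, not the code in this PR: the PoolEntry, ToolDetails, and SelectorInput shapes are hypothetical, the session filter is reduced to a toolkit allowlist, and estimateToolTokens is given an assumed signature.

```typescript
// Hypothetical shapes for illustration only.
interface PoolEntry { slug: string; score: number; }
interface ToolDetails { toolkit: string; schemaJson: string; }

interface SelectorInput {
  pool: PoolEntry[];
  manualSlugs: string[];
  helperSlugs: string[];
  toolDetails: Map<string, ToolDetails>; // keyed by lowercase slug
  allowedToolkits: Set<string> | null;   // null means no toolkit filter
  requestedCount: number;
  maxTokensPerTool: number;
  charsPerToken: number;
}

// chars/N heuristic: serialized schema length divided by chars per token.
function estimateToolTokens(schemaJson: string, charsPerToken: number): number {
  return Math.ceil(schemaJson.length / charsPerToken);
}

function selectAutoPreload(input: SelectorInput): string[] {
  const excluded = new Set(
    [...input.manualSlugs, ...input.helperSlugs].map((s) => s.toLowerCase()),
  );
  const seen = new Set<string>();
  const candidates = input.pool.filter((entry) => {
    const key = entry.slug.toLowerCase();
    // 1. Dedup against manual preload, helper slugs, and repeats in the pool.
    if (excluded.has(key) || seen.has(key)) return false;
    seen.add(key);
    // 2. Drop slugs with no tool details (deprecated since the pool was computed).
    const details = input.toolDetails.get(key);
    if (!details) return false;
    // 3. Re-apply session filters (here: a toolkit allowlist).
    if (input.allowedToolkits && !input.allowedToolkits.has(details.toolkit)) return false;
    // 4. Drop oversize tools.
    return estimateToolTokens(details.schemaJson, input.charsPerToken) <= input.maxTokensPerTool;
  });
  // 5. Take top-N by score, then 6. sort alphabetically so the tool list
  // (and therefore the prompt prefix) is stable across sessions.
  return candidates
    .sort((a, b) => b.score - a.score)
    .slice(0, input.requestedCount)
    .map((entry) => entry.slug)
    .sort();
}
```

Sorting after the top-N cut means two sessions with the same selected set produce the same ordering even if scores shift slightly between pool refreshes.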
ClickHouse query (filters justified against prod data, 2026-05-12):
- TR-originated, non-meta, successful calls only: tool_router_session_id != '', provider != 'composio', errorRequest = ''
- No sandbox filter: verified that sandbox_id is session-scoped (set after workbench provisioning), not call-scoped, so filtering on it would over-drop legitimate direct MCP calls. Sub-tool rows from COMPOSIO_MULTI_EXECUTE_TOOL carry source = 'mcp' (identical to direct MCP), so there is no way to distinguish them today, and both are agentic signals anyway.
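
The same filters, expressed as an in-memory predicate for readability. The column names come from the list above; the row type (all strings) and the isCountableCall name are assumptions, and the real filtering happens in the ClickHouse query rather than in application code.

```typescript
// Assumed row shape; column names are taken from the filter list above.
interface ToolExecutionLogRow {
  tool_router_session_id: string;
  provider: string;
  errorRequest: string;
  sandbox_id: string;
  source: string;
}

// Hypothetical helper mirroring the WHERE clause.
function isCountableCall(row: ToolExecutionLogRow): boolean {
  return (
    row.tool_router_session_id !== "" && // TR-originated
    row.provider !== "composio" &&       // non-meta
    row.errorRequest === ""              // successful
    // sandbox_id and source are deliberately not inspected.
  );
}
```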
Failure modes are non-blocking:
- ClickHouse timeout / error → session is created without auto-preload (snapshot null), logged.
- Pool empty after filters → snapshot has resolved: [], reason: 'auto_preload_count_zero_after_filters'.
- Pool DB read fails → in-memory empty pool, snapshot null.
- The only Err that propagates is auto_preload_count > maxCount (400 with explicit message).
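
A minimal sketch of how these failure modes could be kept non-blocking. Everything here is hypothetical scaffolding (the Snapshot and ResolveResult types, validateAutoPreloadCount, withTimeout, resolveSnapshot, and the loadRankedSlugs callback); only the 400-on-over-cap rule, the null snapshot on error, and the reason string come from the list above.

```typescript
type Snapshot = { resolved: string[]; reason?: string } | null;
type ResolveResult =
  | { ok: true; snapshot: Snapshot }
  | { ok: false; status: 400; message: string };

// The one validation error that is surfaced to the caller.
function validateAutoPreloadCount(requested: number, maxCount: number): string | null {
  return requested > maxCount
    ? `auto_preload_count ${requested} exceeds the maximum of ${maxCount}`
    : null;
}

// Rejects if `work` does not settle within `ms` milliseconds.
async function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
  });
  try {
    return await Promise.race([work, deadline]);
  } finally {
    clearTimeout(timer);
  }
}

async function resolveSnapshot(
  requested: number,
  maxCount: number,
  timeoutMs: number,
  loadRankedSlugs: () => Promise<string[]>, // assumed to wrap pool read + selector
): Promise<ResolveResult> {
  const message = validateAutoPreloadCount(requested, maxCount);
  if (message !== null) return { ok: false, status: 400, message };
  try {
    const slugs = await withTimeout(loadRankedSlugs(), timeoutMs);
    if (slugs.length === 0) {
      return {
        ok: true,
        snapshot: { resolved: [], reason: "auto_preload_count_zero_after_filters" },
      };
    }
    return { ok: true, snapshot: { resolved: slugs.slice(0, requested) } };
  } catch (err) {
    // Timeouts and backend errors are logged, and the session is still created.
    console.warn("auto-preload skipped:", err);
    return { ok: true, snapshot: null };
  }
}
```

Returning a result object instead of throwing keeps the "session creation always succeeds unless the request itself is invalid" rule visible in the type.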
DB migration (v86):
- New column ToolRouterSession.autoPreloadSnapshot (JSONB nullable)
- New table user_tool_usage_pool with (projectId, userId) unique index + computedAt index
- Liquibase + corresponding Prisma schema additions
LD flags (11 new, all project-scoped):
| Flag | Default | Purpose |
|---|---|---|
| toolrouterAutoPreloadEnabled | false | Master kill switch |
| toolrouterAutoPreloadMaxCount | 25 | Cap on auto_preload_count (rejects 400 over) |
| toolrouterAutoPreloadLookbackDays | 30 | ClickHouse lookback (clamped to ≤30 in code) |
| toolrouterAutoPreloadHalflifeDays | 7 | Decay half-life |
| toolrouterAutoPreloadPoolSize | 50 | Per-user pool depth |
| toolrouterAutoPreloadMinExecutionCount | 2 | Floor to consider a tool |
| toolrouterAutoPreloadRefreshAfterHours | 24 | Pool refresh interval |
| toolrouterAutoPreloadMaxStalePreservationDays | 30 | Drop preserved pools older than this |
| toolrouterAutoPreloadMaxTokensPerTool | 2000 | Per-tool size guard |
| toolrouterAutoPreloadCharsPerTokenEstimate | 4 | Heuristic ratio |
| toolrouterAutoPreloadClickhouseTimeoutMs | 1500 | Pool query deadline |
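
To show how the half-life, minimum-execution, and pool-size flags might interact, here is a sketch of a decayed ranking. The exact scoring formula is not stated in this description, so the exponential half-life weighting below is an assumption, as are the DailyCount shape and the buildPool name.

```typescript
// Hypothetical input: executions of one tool, bucketed by age in days.
interface DailyCount { slug: string; ageDays: number; executions: number; }

interface PoolOptions {
  halflifeDays: number;      // cf. toolrouterAutoPreloadHalflifeDays
  minExecutionCount: number; // cf. toolrouterAutoPreloadMinExecutionCount
  poolSize: number;          // cf. toolrouterAutoPreloadPoolSize
}

function buildPool(rows: DailyCount[], opts: PoolOptions): { slug: string; score: number }[] {
  const totals = new Map<string, { count: number; score: number }>();
  for (const row of rows) {
    // Assumed weighting: a call loses half its weight every halflifeDays.
    const weight = Math.pow(0.5, row.ageDays / opts.halflifeDays);
    const entry = totals.get(row.slug) ?? { count: 0, score: 0 };
    entry.count += row.executions;
    entry.score += row.executions * weight;
    totals.set(row.slug, entry);
  }
  return Array.from(totals.entries())
    // Floor on raw (undecayed) executions before a tool is considered.
    .filter(([, total]) => total.count >= opts.minExecutionCount)
    .map(([slug, total]) => ({ slug, score: total.score }))
    // Highest score first; ties broken by slug for determinism.
    .sort((a, b) => b.score - a.score || a.slug.localeCompare(b.slug))
    .slice(0, opts.poolSize);
}
```

With a 7-day half-life, six calls from a week ago score the same as three calls made today, so recent habits outrank older bursts of similar size.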
Known limitation (follow-up)
PATCH accepts auto_preload_count but does not recompute the snapshot on filter changes. The read path re-fetches per-tool schemas at request time and execution enforces the gate, so this degrades to "agent sees a tool but the call returns a filter error" — not a correctness issue. Full PATCH-triggered recompute is a planned follow-up.
How did I test this PR
- Unit tests (pnpm vitest run src/lib/toolRouterV2/features/auto_preload/ src/lib/toolRouterV2/features/session/util/autoPreloadSnapshot.unit.test.ts src/lib/toolRouterV2/utils/preload.unit.test.ts): 53 tests pass. Coverage:
  - Selector: alphabetical sort for prompt-cache stability, requestedCount cap, dedup against manual + helper slugs (case-insensitive), drops slugs missing tool details (deprecated since pool), respects toolkit allowlist, drops oversized tools, empty-pool clean exit, estimateToolTokens chars/N heuristic
  - Snapshot orchestrator: fresh-pool serves as-is (no ClickHouse hit), stale-pool refresh under lock, contention path serves stale pool, stale-preservation on empty fresh, first-time empty persists empty, ClickHouse error falls through to existing pool, stale-beyond-ceiling drops + refreshes
  - Snapshot serializer: parse/serialize round-trip, malformed JSON → null (graceful degradation), isSnapshotStale boundary
  - Existing 28 preload tests still pass, no regression
- Type check (pnpm check-types): exit 0
- Lint (pnpm lint): 0 errors, 1 pre-existing warning unrelated to this change
- ClickHouse query semantics verified against prod data via METABASE_POST_API_DATASET (1.82M sub-tool rows over 3 days, source distribution confirmed)
🤖 Generated with Claude Code