Gap 10: No Dynamic Model Routing

Problem

Adapter and model selection is static per agent role. AgentConfig.adapter and AgentConfig.model are set in the database and never change based on the actual task being executed.

This means:

  • A blog-writer configured for claude-sonnet-4-6 uses the same model for a 3,000-word technical SEO deep-dive as for a 150-word social caption
  • A fast, cheap task waits in the same queue and uses the same compute as a slow, complex one
  • There is no cost-quality tradeoff optimisation — every task pays for the same model regardless of need

The MRKL architecture (Karpas et al., 2022) pairs a router with a set of expert modules, neural or symbolic, and dispatches each input to the module best suited to it. The same principle applies to choosing between cheap and powerful models.

What to Build

1. Task complexity classifier

Before executing a generation, classify the task complexity with a simple heuristic (no LLM call needed):

// packages/agents/src/lib/model-router.ts
export type TaskComplexity = "trivial" | "simple" | "standard" | "complex" | "expert";

export function classifyTaskComplexity(params: {
  agentRole: string;
  outputLengthTarget?: number; // estimated word count
  hasRevisionHistory: boolean; // rejection re-run = harder
  contextRichness: number;     // 0–1, how much context is available (not yet used by the heuristic)
  requiresReasoning: boolean;  // strategy/analysis vs. creative/execution
}): TaskComplexity {
  const { agentRole, outputLengthTarget, hasRevisionHistory, requiresReasoning } = params;

  // Role-based shortcuts: first-pass short-form copy is cheap, strategy work is not.
  if (agentRole === "social-post-writer" && !hasRevisionHistory) return "simple";
  if (agentRole === "email-writer" && !hasRevisionHistory) return "simple";
  if (agentRole === "strategy-writer") return "expert";
  if (agentRole === "context-file-writer") return "complex";

  // Long output that also needs reasoning goes up a tier.
  if (requiresReasoning && (outputLengthTarget ?? 0) > 1500) return "complex";

  // Long output without reasoning currently lands on the same tier as the default;
  // change the return value here if it should route higher.
  if ((outputLengthTarget ?? 0) > 2500) return "standard";

  // No rule returns "trivial" yet.
  return "standard";
}

2. Model selection table

export const MODEL_SELECTION: Record<TaskComplexity, { adapter: string; model: string; maxTokens: number }> = {
  trivial:  { adapter: "claude_local", model: "claude-haiku-4-5-20251001", maxTokens: 1_024 },
  simple:   { adapter: "claude_local", model: "claude-haiku-4-5-20251001", maxTokens: 2_048 },
  standard: { adapter: "claude_local", model: "claude-sonnet-4-6",         maxTokens: 8_192 },
  complex:  { adapter: "claude_local", model: "claude-sonnet-4-6",         maxTokens: 16_384 },
  expert:   { adapter: "claude_local", model: "claude-opus-4-7",           maxTokens: 32_768 },
};

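AgentConfig.model in the database becomes a floor: the minimum model allowed for that role. Complexity-based routing can upgrade above the floor but never selects a less capable model, so a role only benefits from the cheaper tiers if its floor is set to the cheapest acceptable model. The one exception is the credit-limit fallback in step 3, which drops opus to sonnet when the tenant cannot afford it.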
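
The floor rule depends on a capability ordering between models, which this document does not define. A minimal sketch of isMoreCapable, assuming a hypothetical rank table keyed by model family (haiku < sonnet < opus) and assuming that an unrecognised model on either side should leave the configured model in place:

```typescript
// Hypothetical capability ranking: higher number = more capable.
// Family names are taken from the model IDs in MODEL_SELECTION; the numeric
// ranks themselves are an assumption for illustration.
const MODEL_FAMILY_RANK: Record<string, number> = {
  haiku: 1,
  sonnet: 2,
  opus: 3,
};

// Map a model ID such as "claude-sonnet-4-6" to its family rank.
// Returns undefined when the ID matches no known family.
function modelRank(model: string): number | undefined {
  for (const [family, rank] of Object.entries(MODEL_FAMILY_RANK)) {
    if (model.includes(family)) return rank;
  }
  return undefined;
}

// True only when `candidate` is strictly more capable than `floor`.
// If either model is unknown, return false so the configured floor is kept.
function isMoreCapable(candidate: string, floor: string): boolean {
  const candidateRank = modelRank(candidate);
  const floorRank = modelRank(floor);
  if (candidateRank === undefined || floorRank === undefined) return false;
  return candidateRank > floorRank;
}
```

Comparing by family rather than by full model ID means a version bump within a family does not need a table change, at the cost of treating all versions of a family as equal.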

3. Cost-aware routing with budget check

Before routing to an expensive model, check whether the tenant’s credit balance covers the estimated cost:

// isMoreCapable, estimateCost and checkCreditBalance are helpers that are not defined in this document.
export async function selectModel(
  agentRole: string, // currently unused
  complexity: TaskComplexity,
  tenantId: string,
  agentConfig: AgentConfig
): Promise<{ adapter: string; model: string; maxTokens: number }> {
  const dynamic = MODEL_SELECTION[complexity];

  // Never downgrade below configured floor
  const selectedModel = isMoreCapable(dynamic.model, agentConfig.model)
    ? dynamic
    : { adapter: agentConfig.adapter, model: agentConfig.model, maxTokens: dynamic.maxTokens };

  // Estimate cost and check credits
  const estimatedCost = estimateCost(selectedModel.model, selectedModel.maxTokens);
  const hasCredits = await checkCreditBalance(tenantId, estimatedCost);

  if (!hasCredits && selectedModel.model === "claude-opus-4-7") {
    // Downgrade to sonnet if out of credits for opus.
    // This applies even when the configured floor is opus.
    return MODEL_SELECTION["complex"];
  }

  return selectedModel;
}
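
selectModel relies on estimateCost and checkCreditBalance, neither of which is specified here. One possible shape, assuming cost is a worst-case estimate (the full maxTokens budget times a per-million-token price) and that balances can be looked up by tenant. The price table and the in-memory balance map are placeholders for illustration, not real pricing or the real credit ledger:

```typescript
// Placeholder prices in credits per 1M output tokens.
// These numbers are illustrative assumptions, NOT real provider pricing.
const PLACEHOLDER_PRICES: Record<string, number> = {
  "claude-haiku-4-5-20251001": 1,
  "claude-sonnet-4-6": 4,
  "claude-opus-4-7": 20,
};

// Worst-case estimate: assume the run emits its entire maxTokens budget.
// Unknown models fall back to the most expensive known price.
function estimateCost(model: string, maxTokens: number): number {
  const fallback = Math.max(...Object.values(PLACEHOLDER_PRICES));
  const price = PLACEHOLDER_PRICES[model] ?? fallback;
  return (maxTokens / 1_000_000) * price;
}

// Hypothetical in-memory balance store standing in for the credit ledger.
const tenantBalances = new Map<string, number>([["tenant-a", 0.5]]);

// Resolves to true when the tenant's balance covers the estimated cost.
// Tenants with no recorded balance are treated as having zero credits.
async function checkCreditBalance(tenantId: string, estimatedCost: number): Promise<boolean> {
  const balance = tenantBalances.get(tenantId) ?? 0;
  return balance >= estimatedCost;
}
```

With these placeholder numbers, an expert-tier run (32,768 tokens on opus) is estimated at 0.65536 credits, so tenant-a with 0.5 credits would fall through to the sonnet fallback. Input tokens are ignored in this sketch; a real estimate would likely need to include them.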

4. Log model selection decision in AgentRun

model AgentRun {
  // existing
  adapter String?
  model   String?

  // new
  taskComplexity   String? // trivial | simple | standard | complex | expert
  modelRoutedTo    String? // actual model used (may differ from agentConfig.model)
  modelRouteReason String? // "complexity:complex", "revision_history", "credit_limit"
}
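
The three new columns could be filled by a small helper at routing time. The sketch below is an assumption about how that might look: buildRoutingLog and its parameter names are hypothetical, and the precedence between reasons (credit limit over revision history over plain complexity) is not specified in this document.

```typescript
type TaskComplexity = "trivial" | "simple" | "standard" | "complex" | "expert";

// Mirrors the three new AgentRun columns.
interface RoutingLog {
  taskComplexity: TaskComplexity;
  modelRoutedTo: string;
  modelRouteReason: string;
}

// Hypothetical helper. Assumed precedence: a credit-limit downgrade outranks
// a revision-history escalation, which outranks the plain complexity tier.
function buildRoutingLog(params: {
  complexity: TaskComplexity;
  routedModel: string;
  escalatedForRevision: boolean;
  downgradedForCredits: boolean;
}): RoutingLog {
  let reason = `complexity:${params.complexity}`;
  if (params.escalatedForRevision) reason = "revision_history";
  if (params.downgradedForCredits) reason = "credit_limit";
  return {
    taskComplexity: params.complexity,
    modelRoutedTo: params.routedModel,
    modelRouteReason: reason,
  };
}
```

Recording a single dominant reason keeps the column easy to group by in analytics; if several reasons need to be kept, a delimited string or a separate column would be needed.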

5. Model routing analytics in manage portal

Add a tab to /agents?tab=analytics showing:

  • Model distribution per agent role (how often does each complexity tier get selected?)
  • Cost savings from haiku vs. sonnet vs. opus routing
  • Correlation between model selection and quality scores

This gives visibility into whether the routing is working — if trivial tasks are routing to opus, the heuristics need tuning.
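
The first metric, model distribution per agent role, can be derived directly from logged runs. A minimal sketch, assuming the rows have already been loaded and that each row exposes an agentRole (the schema excerpt above does not show that field, so it may require a join); RunRow and modelDistributionByRole are hypothetical names:

```typescript
interface RunRow {
  agentRole: string;            // assumed to be available on the row or via a join
  modelRoutedTo: string | null; // null for runs logged before routing existed
}

// Returns role -> model -> share of that role's routed runs (0..1).
function modelDistributionByRole(rows: RunRow[]): Record<string, Record<string, number>> {
  // First pass: count runs per role and model.
  const counts: Record<string, Record<string, number>> = {};
  for (const row of rows) {
    if (row.modelRoutedTo === null) continue;
    const byModel = counts[row.agentRole] ?? (counts[row.agentRole] = {});
    byModel[row.modelRoutedTo] = (byModel[row.modelRoutedTo] ?? 0) + 1;
  }

  // Second pass: convert counts to shares within each role.
  const shares: Record<string, Record<string, number>> = {};
  for (const [role, byModel] of Object.entries(counts)) {
    const total = Object.values(byModel).reduce((sum, n) => sum + n, 0);
    const roleShares: Record<string, number> = {};
    for (const [model, n] of Object.entries(byModel)) {
      roleShares[model] = n / total;
    }
    shares[role] = roleShares;
  }
  return shares;
}
```

A role whose cheap tasks show a high opus share in this output is the signal, mentioned above, that the heuristics need tuning.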

Files to Change

  • New file: packages/agents/src/lib/model-router.ts
  • packages/agents/src/workers/blog-writer.worker.ts — call selectModel() before adapter.execute()
  • packages/agents/src/workers/social-post-writer.worker.ts — same (big win: haiku for simple captions)
  • packages/agents/src/workers/setup.worker.ts — same
  • packages/db/prisma/schema.prisma — add routing fields to AgentRun
  • apps/dashboard/src/app/(dashboard)/... manage agent analytics — model breakdown chart

Related Gaps

  • Gap 8: Context window management (smaller models have smaller context windows; budget must account for this)
  • Gap 13: Cost circuit breaker (model routing is the first line of cost control; circuit breaker is the safety net)

© 2026 Leadmetrics — Internal use only